Helping manage the concern of object cloning in Java programs
(COMP 762 Project 2 report)
Eric Bodden
Sable Research Group, McGill University
eric.bodden@mail.mcgill.ca

1. Introduction

In previous work [2] we investigated the concern of tagging entities of various intermediate representations of the Soot [9] bytecode analysis and optimization framework with meta-data. The study uncovered a problem with the way such meta-data tags are currently copied. The latest implementation of Soot copies tags through manual calls to Host.addAllTagsOf(Host). This inevitably leads to an inconsistent implementation over time, as calls to this method may be forgotten. Indeed, in [2] we pointed out places in the code where such calls were added years after the surrounding code was written, indicating that a latent error caused by tags not being copied had been found.

We also suggested that automatic copying of tags could alleviate the problem of inconsistency. Such automatic copying can be implemented by providing consistent implementations of the clone() method on all types which can be tagged. (All such types implement the Host interface.) Unfortunately, it turned out that almost all of the current implementations of clone() are flawed: instead of calling super.clone() they call a copy constructor. As a result, tags are not copied. Furthermore, anybody sub classing a Host loses the sub class type of the instance when calling clone(), because the clone is created as an instance of the super class. This may seriously impede future extensibility of the Soot framework.

2. Solution

We propose a generic solution consisting of two components. Firstly, we expose a set of four heuristic checkers that warn the user whenever cloning in a particular Java class is implemented in a way that might impede software evolution. The heuristics have been implemented as an extension to the Eclipse Integrated Development Environment (IDE)¹. Whenever a heuristic finds a violation of its rule, it attaches a warning to the resource in question. Secondly, the user can then invoke a "quick fix" that resolves the violation by one of two means, depending on the type of warning (see Section 2.2).

¹ Eclipse project: http://www.eclipse.org/

2.1. Heuristics

In the following we describe the four heuristics in detail. Each heuristic is implemented as a visitor that walks the abstract syntax tree of each compilation unit in the change set of an incremental (or full) compilation in Eclipse.

Consistent class annotation. In Java, a class that implements clone() almost certainly does so because it wants to provide the functionality of cloning objects of that class. Consequently, this class should implement the Cloneable marker interface. Conversely, any class that implements Cloneable should provide a non-default implementation of clone(), at least in one of its super classes. Hence, this checker issues a warning whenever a class (a) is concrete (not abstract) and implements Cloneable but only inherits the clone() implementation of Object, or (b) declares clone() itself but does not implement the Cloneable interface.

The warning message reflects the type of problem detected. The available quick fix resolutions are (a) generating an implementation of clone() or (b) making the declaring type implement the Cloneable interface.

Returning null. During our initial investigation, we found that implementations of clone() frequently return null because cloning of those types is not actually supported. As is known from software engineering research, returning null can lead to null-pointer exceptions occurring far from the original error location. The causes of such exceptions are consequently hard to track down and fix, especially if the contract of the defined method (here Object.clone()) implies that a non-null reference is returned.

Hence, this checker looks for occurrences of the statement return null; in the body of each clone() method.
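
To illustrate the pattern this checker looks for, the following minimal sketch contrasts a clone() method that returns null with one that throws a CloneNotSupportedException. The classes Shape, NullReturningShape, ThrowingShape and CloneDemo are hypothetical names chosen for this example; they do not come from Soot or from the Eclipse extension.

```java
// Hypothetical classes for illustration only; none of these names come from Soot.

/** A cloneable base class whose clone() follows the usual contract. */
class Shape implements Cloneable {
    @Override
    public Object clone() throws CloneNotSupportedException {
        return super.clone();
    }
}

/** Signals "cloning unsupported" by returning null: the pattern the checker flags. */
class NullReturningShape extends Shape {
    @Override
    public Object clone() {
        return null; // the statement the heuristic searches for
    }
}

/** Signals "cloning unsupported" by throwing, as the contract of clone() permits. */
class ThrowingShape extends Shape {
    @Override
    public Object clone() throws CloneNotSupportedException {
        throw new CloneNotSupportedException("ThrowingShape cannot be cloned");
    }
}

class CloneDemo {
    /** Clones one instance of each variant and reports what the caller observes. */
    static String describe() {
        StringBuilder report = new StringBuilder();
        Object copy = new NullReturningShape().clone();
        report.append(copy == null ? "null returned; " : "copy returned; ");
        try {
            new ThrowingShape().clone();
            report.append("no exception");
        } catch (CloneNotSupportedException e) {
            report.append("exception at call site");
        }
        return report.toString();
    }
}
```

With the first variant, the caller only notices the problem once the null reference is dereferenced, possibly far from the call to clone(). With the second, the failure surfaces at the call itself.
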
If an occurrence is found, we issue a warning message. The associated quick fix generates a correct implementation of clone(). The warning also suggests that the method, by contract, may throw a CloneNotSupportedException instead. No quick fix for generating such an implementation is provided at this time.

Not calling the super class. As mentioned in the introduction, Soot suffers from the problem of not calling super.clone() to construct the actual clone object. Instead it relies on the correct implementation of copy constructors. This causes multiple problems. (1) Copy constructors have to be created and maintained. (2) A copy constructor has to be implemented on a class C even if C does not add any fields to its super class. (3) If a class C is sub classed by a class S and S calls super.clone(), an instance of C rather than S is returned, which can lead to all sorts of errors and violates the contract of clone().

Hence, the checker searches the body of each clone() method for calls to super.clone(). If none is found, a warning is issued. The associated quick fix generates a correct implementation of clone().

Use of Java 5 co-variant return types. From Java 5 onwards, methods can have co-variant return types. For the method clone() this means that it can return subtypes of Object; in particular, if defined in a class C, it should use return type C to avoid casting on the client side.

Hence, in Java 5-enabled projects, we apply a checker that flags implementations of clone() whose return type differs from the type of their declaring class. The marker we create is of type "info" rather than "warning" because this conversion does not change the behaviour of the program, only its style. The associated quick fix generates an implementation of clone() with a co-variant return type.

2.2. Quick fixes

As mentioned above, each of the four heuristics provides a set of "quick fixes" that resolve the detected problem. In the current implementation we provide two different fixes. One makes a class implement the Cloneable interface, while the other generates an implementation of the clone() method. One could think of further fixes, such as making a clone() method throw a CloneNotSupportedException instead of returning null, but we found that those solutions can easily be coded by hand and that the problems they solve do not occur that often.

While the quick fix for adding the Cloneable interface is straightforward, generating a correct clone() method is not an easy task at all, and hence we discuss it in more detail. (Note that our implementation also allows generating clone() methods when no warning is present, through an item in the context menu of an arbitrary source type.)

Generation of clone(). When designing the user interface for the generation of the clone() method, we closely followed the existing support for generating hashCode() and equals(..) methods in Eclipse, because these methods share quite a few concepts with clone(). In particular, all three implementations depend on the available types of fields and have to call the super class in order to compute their final result. Also, the available implementation for hashCode() and equals(..) provided best practices for proper integration with the Eclipse IDE.

Compared to generating hashCode() or equals(..) methods, a complete solution to the problem of generating a clone() method is hard. The problem is that not every class provides a public clone() method, while for hashCode() and equals(..) this is always the case. Because of this, one cannot assume that clone() can be called on any type of instance field. Also, different kinds of clones may be desirable. Some applications might require a shallow copy, where only field references are copied; others might require deep copies, where the entire contents of all instance fields are copied (recursively).

When it comes to deep copies, a general solution is virtually impossible. Creating a deep copy in general requires that all field types, and all field types of those types, are (again, recursively) cloneable. Furthermore, all those implementations of clone() must consistently create deep copies. Virtually all collection classes of the Java runtime library violate this second property: when clone() is called, they create shallow copies. Other classes in the runtime library, like String or StringBuffer, are not cloneable at all. A general solution would hence have to generate specialized code for cloning collections and for using copy constructors for non-cloneable classes (on a case-by-case basis). A series of articles by Kreft and Langer [4, 5, 6] (in German) gives a very detailed assessment of those problems and possible solution strategies. As they show, a general solution even has to make use of reflection in certain situations.

For the scope of this work, we opted for a simpler strategy which solves most of the problem but may still leave some manual work to the programmer in exceptional cases. When opting to generate a clone() method, our extension presents the user with the dialog shown in Figure 1. At the top, the user can select fields which should be deeply copied. We allow the creation of deep copies of instance fields which are of a reference type (the notion of deep copies makes no sense for primitive types) that implements the Cloneable
interface.²

² Note that here we assume that the field type actually creates a deep copy when clone() is called. This behaviour is not validated.

Figure 1. Options dialog for code generation

Furthermore, the dialog exposes options for automatically generating method comments (as defined in the Eclipse preferences or project properties), using the co-variant return type (shown in Java 5-enabled projects only) and softening the CloneNotSupportedException thrown when calling super.clone(). Through softening, clients of this class do not have to catch this exception again and again. Softening CloneNotSupportedException is common and good practice where it is known to be safe; it can be found in many places inside the Java runtime library, e.g. in the class LinkedList:

    try {
        clone = (LinkedList) super.clone();
    } catch (CloneNotSupportedException e) {
        throw new InternalError();
    }

The option to (not) soften exceptions is shown only if the clone() method of the super type declares this exception in its interface. In cases where CloneNotSupportedException is not declared, it has to be softened in order to adhere to this interface. Consequently, in such situations, the option in the dialog is replaced by a hint that softening will be forced if necessary.

3. Validation

To validate the feasibility of our approach, we first applied the four heuristics to the entire Soot code base (as of revision 2665). Then, based on the warning markers, we refactored Soot to implement cloning consistently, using the code generation features explained above.

3.1. Results of applying the heuristics

Altogether, this Soot revision contained 240 non-abstract declarations of clone() methods which we had to deal with. Of those, the heuristic for not calling the super class reported 237 implementations that did not call super.clone(). The heuristic for consistent class annotation reported that almost all of the types implementing clone() did not implement the Cloneable interface. In 14 other cases, the interface was implemented but not the clone() method. None of the implementations returned null. All implementations used Object as return type, which was consistently reported by our heuristic suggesting co-variant return types. This is because the Soot developers have only recently started to convert Soot to a Java 5 code base.

3.2. Effectiveness of code generation

We manually investigated all of the generated warnings and used the quick fix feature provided by our tool to find a better implementation that would eliminate the warning but not change the behaviour of the program (at least in combination with other changes of the same kind). The largest changes could be made in the packages that resemble nodes of abstract syntax trees for the various intermediate representations in Soot. Formerly, every node class would implement clone() by calling a constructor and cloning the arguments recursively. This requires an implementation of clone() on every single node type. Interestingly, after generating a few standard implementations of clone() far up in the hierarchy, it turned out that most of the implementations of clone() in sub classes could be eliminated. This can always be done when a sub class declares no instance fields.

Altogether, we were able to remove 179 methods that way. Another 37 methods could be replaced by standard implementations generated with the tool. Those were all cases where a clone() method was necessary, either because the type declared instance fields or because the super type of the type was Object, whose clone() method has only protected visibility. In those cases, an implementation of clone() can be used to expose the method to clients. In another 16 cases, we had to replace methods by non-standard implementations. To do so, we first generated a default implementation using our tool and then modified it to our needs. Most modifications boiled down to null checks (e.g. for linked lists) or deep copying of arrays or collections.

In one case, an abstract class for constants, we return this from clone(), although this violates the contract. The reason is that constants are immutable by definition and hence need not be cloned, which saves memory. In three
cases, the clone() method had to be added to interfaces so that it could be called on types implementing that interface. We were happy to see that only a few such additions had to be made. This is because almost all types in Soot implement some relatively generic interface, such as Value. In another eleven cases, abstract definitions of clone() were replaced by concrete, generated implementations. This is because those methods had to be called by sub classes via super.clone(), and Java rejects such calls to abstract methods at compile time.

To complete the implementation of cloning, we had to add another 26 standard and two non-standard implementations. In seven cases, we refined the return types of existing (correct, partially abstract) methods to the type of the declaring class. Eleven times we had to keep implementations of clone() as they were, because they did additional crucial work. In particular, this is the case for types representing method bodies in Soot. When cloning bodies, one has to clone everything but local variables, which then have to be cloned separately and patched up in a second step. In all those cases, we marked the respective methods with a SuppressWarnings annotation. (Currently, our Eclipse plug-in does not yet actually suppress the warning, but this will be solved in future versions.)

4. Related Work

The work most closely related to ours is the automatic generation of equals(..) and hashCode() methods in Eclipse. As mentioned above, in contrast to clone(), those methods can be generated in a complete manner, since all types in Java provide public equals(..) and hashCode() methods. Also, the semantics of those methods is completely defined, while for clone() this is not the case (compare, e.g., the notions of deep and shallow copies).

The question of whether a clone() method that is not actually capable of cloning should return null or throw a CloneNotSupportedException boils down to whether or not to use the so-called "return code idiom". The work of Bruntink et al. [3] analyzes large-scale C programs using this idiom and shows that its use is very error-prone. We take this as a justification for our "return null" heuristic.

The effect of needing fewer implementations of clone() when calling super.clone() than when using copy constructors can be explained by the power of virtual dispatch. If a sub class does not add any instance fields, it can reuse the implementation of clone() from its super class, and virtual dispatch is the most natural form of code reuse in Java. While we thus exploit the natural Java semantics, related work on Traits [8], Mixins [1] and Virtual Classes [7] tries to maximize code reuse by using different forms of dispatch.

5. Conclusion

As we showed in this work, quite simple heuristics can be used to find flaws in the implementation of cloning in Java. Moreover, we were able to provide an Eclipse plug-in that generates a default implementation for clone() methods, which was useful in almost all cases we investigated. Implementing cloning using code generated that way allowed us to safely eliminate more than 50% of all clone() methods, significantly alleviating the problem of code maintenance for Soot. All converted clone() methods use co-variant return types. We recommend the use of co-variant return types, as in one case this even revealed a bug: one of the clone() methods we converted previously did not even return an object of the right type.

In future work we plan to look into extending the code generation to deal with deep copies of the Java collection classes and arrays. Our study revealed that such cases probably occur less often than one might think; however, automated support might be useful for certain applications.

Although our experiment went well and helped to support our claims, we came to the conclusion that real language support for cloning would be very desirable. For example, annotations could be used to state whether an instance field should be deeply cloned or not. The actual cloning could then be left entirely to the virtual machine. The same could hold for equals() and hashCode() methods, which could be parametrized by the same annotations. We believe that aspect-oriented programming could yield such automation; however, with current technologies, only through reflection, which comes at a huge runtime cost.

References

[1] D. Ancona, G. Lagorio, and E. Zucca. Jam - A Smooth Extension of Java with Mixins. In ECOOP '00: Proceedings of the 14th European Conference on Object-Oriented Programming, pages 154-178, London, UK, 2000. Springer-Verlag.
[2] E. Bodden. COMP 762 Project 1 report, February 2007.
[3] M. Bruntink, A. van Deursen, and T. Tourwé. Discovering faults in idiom-based exception handling. In ICSE '06: Proceeding of the 28th international conference on Software engineering, pages 242-251, New York, NY, USA, 2006. ACM Press.
[4] K. Kreft and A. Langer. Das Kopieren von Objekten - Der Sinn und Zweck von clone(). JavaSPEKTRUM, September 2002.
[5] K. Kreft and A. Langer. Das Kopieren von Objekten - Prinzipien einer Implementierung von clone(). JavaSPEKTRUM, November 2002.
[6] K. Kreft and A. Langer. Das Kopieren von Objekten - Die CloneNotSupportedException. JavaSPEKTRUM, January 2003.
[7] K. Ostermann, M. Mezini, and C. Bockisch. Expressive pointcuts for increased modularity. In A. P. Black, editor, ECOOP, volume 3586 of Lecture Notes in Computer Science, pages 214-240. Springer, 2005.
[8] N. Schärli, S. Ducasse, O. Nierstrasz, and A. Black. Traits: Composable units of behavior. In European Conference on Object-Oriented Programming, 2003.
[9] R. Vallée-Rai, P. Co, E. Gagnon, L. Hendren, P. Lam, and V. Sundaresan. Soot - a Java bytecode optimization framework. In CASCON '99: Proceedings of the 1999 conference of the Centre for Advanced Studies on Collaborative research, page 13. IBM Press, 1999.