Information about http://www.sable.mcgill.ca/~ebodde/genclone/genclone.pdf

Created: Thu Apr 12 11:38:50 2007
                Helping manage the concern of object cloning in Java programs
                                    (COMP 762 Project 2 report)

                                                     Eric Bodden
                                       Sable Research Group, McGill University
                                             eric.bodden@mail.mcgill.ca


1. Introduction

In previous work [2] we investigated the concern of tagging entities of various intermediate representations of the Soot [9] bytecode analysis and optimization framework with meta-data. The study uncovered a problem with the way such meta-data tags are currently being copied. The latest implementation of Soot copies tags by manual method calls to Host.addAllTagsOf(Host). This necessarily leads to an inconsistent implementation over time, as calls to this method might be forgotten. Indeed, in [2] we pointed out places in the code where such calls were added years after the surrounding code was written, indicating that a latent error caused by tags not being copied had been found.

We also suggested that automatic copying of tags could alleviate the problem of inconsistency. Such automatic copying can be implemented by providing consistent implementations of the clone() method on all types which can be tagged. (All such types implement the Host interface.) Unfortunately, it turned out that almost all of the current implementations of clone() are flawed: instead of calling super.clone() they call a copy constructor. This leads to tags not being copied. Furthermore, anybody sub classing any Host will lose that sub class instance when calling clone(). This may seriously impede future extensibility of the Soot framework.

2. Solution

We propose a generic solution consisting of two components. Firstly, we expose a set of four heuristic checkers that intend to warn a user whenever cloning in a particular Java class is implemented in a way that might impede software evolution. The heuristics have been implemented as an extension to the Eclipse Integrated Development Environment (IDE)¹. Whenever a heuristic finds a violation of its rule, it attaches a warning to the resource in question. The user can then invoke a "quick fix" to fix the violation by one of two means, depending on the type of warning (see Section 2.2).

¹ Eclipse project: http://www.eclipse.org/

2.1. Heuristics

In the following we describe the four heuristics in detail. Each heuristic is implemented as a visitor that walks the abstract syntax tree of each compilation unit in the change set of an incremental (or full) compilation in Eclipse.

Consistent class annotation. In Java, a class that implements clone() most certainly does so because it wants to provide the functionality of cloning objects of that class. Consequently, this class should implement the Cloneable marker interface. On the other hand, any class that implements Cloneable should provide a non-default implementation of clone(), at least in one of its super classes.

Hence, this checker issues a warning whenever a class (a) is concrete (not abstract) and implements Cloneable but only inherits the clone() implementation of Object, or (b) declares clone() itself but does not implement the Cloneable interface.

The warning message given reflects the type of problem detected. The available quick fix resolutions are (a) generating an implementation of clone() or (b) making the declaring type implement the Cloneable interface.

Returning null. During our initial investigation, we found that implementations of clone() frequently return null because cloning of those types is not actually supported. As is known from software engineering research, returning null can lead to null-pointer exceptions occurring far from the original error location. Causes of such exceptions are consequently hard to track and fix, especially if the contract of the defined method (here Object.clone()) implies that a non-null pointer be returned.

Hence, this checker looks for occurrences of the statement return null; in the body of each clone() method.
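To make the flagged pattern concrete, the following minimal sketch contrasts a clone() that returns null with one that reports the lack of cloning support through the exception that the contract of Object.clone() already permits. The class names ReturnNullDemo, Flagged and Explicit are hypothetical and chosen for illustration only; they are taken neither from Soot nor from the plug-in.

```java
// Illustrative sketch only: all class names here are hypothetical.
public class ReturnNullDemo {

    /** The pattern the checker looks for: "unsupported" is signalled by returning null. */
    static class Flagged implements Cloneable {
        @Override
        public Object clone() {
            // Callers that use the result fail later, possibly far from this
            // location, with a NullPointerException.
            return null;
        }
    }

    /** Alternative allowed by the contract of Object.clone(): fail at the call site. */
    static class Explicit {
        @Override
        public Object clone() throws CloneNotSupportedException {
            throw new CloneNotSupportedException("Explicit does not support cloning");
        }
    }

    public static void main(String[] args) {
        // The null result is silently accepted here and only causes trouble on first use.
        Object copy = new Flagged().clone();
        System.out.println("flagged clone is null: " + (copy == null));

        // The checked exception forces the client to deal with the problem immediately.
        try {
            new Explicit().clone();
        } catch (CloneNotSupportedException e) {
            System.out.println("explicit failure: " + e.getMessage());
        }
    }
}
```

In the second variant the compiler forces every caller to handle the failure where it occurs, which is the behaviour the warning message hints at.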
If an occurrence is found, we issue a warning message. The associated quick fix allows generating a correct implementation of clone(). The warning suggests that the method, by contract, can also throw a CloneNotSupportedException. No quick fix for generating such an implementation is provided at this time.

Not calling the super class. As mentioned in the introduction, Soot suffers from the problem of not calling super.clone() in order to construct the actual clone object. Instead it relies on the correct implementation of copy constructors. This causes multiple problems. (1) Copy constructors have to be created and maintained. (2) A copy constructor has to be implemented on a class C even if C does not add any fields to its super class. (3) If a class C is sub classed by a class S and S calls super.clone(), an instance of C is returned, which can lead to all sorts of errors and violates the contract of clone().

Hence, the checker searches the body of each clone() method for calls to super.clone(). If none is found, a warning is issued. The associated quick fix allows generating a correct implementation of clone().

Use of Java 5 co-variant return types. From Java 5 onwards, methods can have co-variant return types. For the particular example of the method clone(), this means that it can return subtypes of Object and in particular, if defined in a class C, it should use return type C to avoid casting on the client side.

Hence, to Java 5-enabled projects, we apply a checker that flags implementations of clone() which have a return type different from the type of their declaring class. The flag we create is of type "info" instead of "warning" because this conversion does not really change the behaviour of the program, just its style. The associated quick fix allows generating an implementation of clone() with co-variant return type.

2.2. Quick fixes

As mentioned above, each of the four heuristics provides a set of "quick fixes" to quickly fix the detected problem at hand. In the current implementation we provide two different fixes. One makes a class implement the Cloneable interface, while the other one generates an implementation of the clone() method. Generally, one could think of further fixes, like making a clone() method throw a CloneNotSupportedException instead of returning null, but we found that those solutions can easily be coded by hand and also that the problems they solve do not actually occur that often.

While the quick fix for adding the Cloneable interface is straightforward, generation of a correct clone() method is not an easy task at all and hence we wish to discuss it in more detail. (Note that our implementation also allows generating clone() methods with no warning being present, simply by a menu item in the context menu of an arbitrary source type.)

Generation of clone(). When designing the user interface for the generation of the clone() method, we closely followed the existing support for generating hashCode() and equals(..) methods within Eclipse, because these methods share quite a few concepts with clone(). In particular, all three implementations depend on the available types of fields and have to call the super class in order to compute their final result. Also, the available implementation for hashCode() and equals(..) provided best practices for proper integration with the Eclipse IDE.

Compared to generating hashCode() or equals(..) methods, a complete solution to the problem of generating a clone() method is hard. The problem is that not every class implements the clone() method, while for hashCode() and equals(..) this is always the case. Because of this fact, one cannot always assume that clone() can be called on any type of instance field. Also, there might be different kinds of clones desirable. Some applications might require a shallow copy where only field references are copied, others might require deep copies where the entire contents of all instance fields are copied (recursively).

At the latest when it comes to deep copies, a general solution is virtually impossible. Creating a deep copy in general requires that all field types and all field types of those types are (again, recursively) cloneable. Furthermore, all those implementations of clone() must consistently create deep copies. Virtually all collection classes of the Java runtime library violate this second property; when calling clone(), they create shallow copies. Other classes in the runtime library are not cloneable at all, like String or StringBuffer. A general solution would hence have to generate specialized code for cloning collections and for using copy constructors for non-cloneable classes (on a case-by-case basis). A series of articles by Kreft and Langer [4, 5, 6] (in German) gives a very detailed assessment of those problems and possible solution strategies. As they show, a general solution even has to make use of reflection in certain situations.

For the scope of this work, we decided on an easier strategy which solves most of the problem but might still leave some manual work to the programmer for exceptional cases. When opting to generate a clone method, our extension presents the user with the dialog shown in Figure 1. At the top, the user can select fields which should be deeply copied. We allow the creation of deep copies of instance fields which are of a reference type (the notion of deep copies makes no sense for primitive types) that implements the Cloneable

interface.²

Figure 1. Options dialog for code generation

Furthermore, the dialog exposes options for automatically generating method comments (as defined in the Eclipse preferences or project properties), using the co-variant return type (shown in Java 5-enabled projects only) and softening the CloneNotSupportedException thrown when calling super.clone(). Through softening, clients of this class do not have to catch this exception again and again. Softening CloneNotSupportedException is common and good practice if used at places where it is known to be safe, and can for example be found in many places inside the Java runtime library, e.g. the class LinkedList:

    try {
      clone = (LinkedList) super.clone();
    } catch (CloneNotSupportedException e) {
      throw new InternalError();
    }

The option to (not) soften exceptions is shown only if the super type declares this exception in its interface. In cases where CloneNotSupportedException is not declared, it has to be softened in order to adhere to this interface. Consequently, in such situations, the option on the dialog is replaced by an appropriate hint that softening will be forced if necessary.

² Note that here we assume that the field type actually creates a deep copy when clone() is called. This behaviour is not validated.

3. Validation

In order to validate the feasibility of our approach, we first applied the four heuristics to the entire Soot code base (as of revision 2665). Then, based on the warning markers, we refactored Soot to implement cloning consistently, using the code generation features explained above.

3.1. Results of applying the heuristics

Altogether, this Soot revision contained 240 non-abstract declarations of clone() methods which we had to deal with. Out of those, the heuristic for not calling the super class reported 237 implementations not calling super.clone(). The heuristic for consistent class annotation reported that almost all of those types implementing clone() did not implement the Cloneable interface. In 14 other cases, the interface was implemented but not the clone() method. None of the implementations returned null. All implementations used Object as return type, which was consistently reported by our heuristic for suggesting co-variant return types. This is because the Soot developers have only recently started to convert Soot to a Java 5 code base.

3.2. Effectiveness of code generation

We manually investigated all of the generated warnings and used the quick fix feature provided by our tool to find a better implementation that would eliminate the warning but not change the behaviour of the program (at least in combination with other changes of the same kind). The largest changes could be made in the packages that represent nodes of abstract syntax trees for the various intermediate representations in Soot. Formerly, every node class would implement clone() by calling a constructor and cloning the arguments recursively. This requires an implementation of clone() on every single node type. Interestingly, after generating a few standard implementations of clone() far up in the hierarchy, it turned out that most of the implementations of clone() in sub classes could be eliminated. This can always be done when a sub class declares no instance fields.

Altogether, we were able to remove 179 methods that way. Another 37 methods could be replaced by standard implementations we generated with the tool. Those were all cases where a clone() method was necessary, either because the type did declare instance fields or because the super type of the type was Object, whose clone() method has only protected visibility. In those cases, an implementation of clone() can be used to expose the method to clients. In another 16 cases, we had to replace methods by non-standard implementations. In order to do so, we first generated a default implementation using our tool and then modified it to our needs. Most modifications boiled down to checks for possible null pointers (e.g. for linked lists) or deep copying of arrays or collections.

In one case, an abstract class for constants, we return this from clone(), although it violates the contract. This is because constants are immutable by definition and hence need not be cloned, which saves memory. In three

cases, the clone() method had to be added to interfaces so that it could be called on types implementing that interface. We were happy to see that only few such additions had to be made. This was the case because almost all types in Soot implement some relatively generic interface, such as Value. In another eleven cases, abstract definitions of clone() were replaced by concrete, generated implementations. This is because those methods had to be called by sub classes via super.clone() and Java does not type-check such calls to abstract methods.

In order to complete the implementation of cloning, we had to add another 26 standard and two non-standard implementations. In seven cases, we refined the return types of existing (correct, partially abstract) methods to the type of the declaring class. Eleven times we had to keep implementations of clone() as they were, because they did additional crucial work. In particular this is the case for types representing method bodies in Soot. When cloning bodies, one has to clone everything but local variables, which then have to be cloned separately and patched up in a second step. In all those cases, we marked the respective methods with a SuppressWarnings annotation. (Currently, our Eclipse plug-in does not yet manage to actually suppress the warning, but this will be solved in future versions.)

4. Related Work

The work most closely related to ours is the automatic generation of equals(..) and hashCode() methods in Eclipse. As mentioned above, as opposed to the case of clone(), generating those methods is possible in a complete manner, since all types in Java do provide a public equals(..) and hashCode() method. Also, the semantics of those methods is completely defined, while for clone() this is not the case (e.g. compare the notions of a deep or shallow copy).

The issue of whether to return null from a clone() that is not actually capable of cloning, or to throw a CloneNotSupportedException instead, boils down to whether or not to use the so-called "return code idiom". The work of Bruntink et al. [3] analyzes large-scale C programs using this idiom and shows that its use is very error-prone. We take this as a justification for our "return null" heuristic.

The effect of needing fewer implementations of clone() when calling super.clone() than when using copy constructors can be explained by the power of virtual dispatch. If a sub class does not add any instance fields, it can reuse the implementation of clone() from its super class, and virtual dispatch is the most natural form of code reuse in Java. While we exploit the natural Java semantics that way, related work on the topics of Traits [8], Mixins [1] and Virtual Classes [7] tries to maximize code reuse by using different forms of dispatch.

5. Conclusion

As we showed in this work, quite simple heuristics can be used to find flaws in the implementation of cloning in Java. Moreover, we were able to provide an Eclipse plug-in that generates a default implementation for clone() methods which was useful in almost all cases we investigated. Implementing cloning using code generated that way allowed us to safely eliminate more than 50% of all clone() methods, significantly alleviating the problem of code maintenance for Soot. All converted clone() methods use co-variant return types. We recommend the use of co-variant return types, as in one case this even revealed a bug: one of the clone() methods we converted was previously not even returning an object of the right type.

In future work we plan to look into extending the code generation to be able to deal with deep copies of the Java collection classes and arrays. Our study revealed that such cases probably occur less often than one might think; however, automated support might be useful for certain applications.

Despite the fact that our experiment went well and helped to support our claims, we came to the conclusion that real language support for cloning would be very desirable. For example, annotations could be used to state whether an instance field should be deeply cloned or not. The actual cloning could then be left entirely to the virtual machine. The same could hold for equals() and hashCode() methods, which could be parametrized by the same annotations. We believe that the use of aspect-oriented programming could yield such automation, however, using current technologies, only by the use of reflection, which comes at a huge runtime cost.

References

[1] D. Ancona, G. Lagorio, and E. Zucca. Jam - A Smooth Extension of Java with Mixins. In ECOOP '00: Proceedings of the 14th European Conference on Object-Oriented Programming, pages 154-178, London, UK, 2000. Springer-Verlag.
[2] E. Bodden. COMP 762 Project 1 report, February 2007.
[3] M. Bruntink, A. van Deursen, and T. Tourwé. Discovering faults in idiom-based exception handling. In ICSE '06: Proceeding of the 28th international conference on Software engineering, pages 242-251, New York, NY, USA, 2006. ACM Press.
[4] K. Kreft and A. Langer. Das Kopieren von Objekten - Der Sinn und Zweck von clone(). JavaSPEKTRUM, September 2002.
[5] K. Kreft and A. Langer. Das Kopieren von Objekten - Prinzipien einer Implementierung von clone(). JavaSPEKTRUM, November 2002.
[6] K. Kreft and A. Langer. Das Kopieren von Objekten - Die CloneNotSupportedException. JavaSPEKTRUM, January 2003.


[7] K. Ostermann, M. Mezini, and C. Bockisch. Expressive pointcuts for increased modularity. In A. P. Black, editor, ECOOP, volume 3586 of Lecture Notes in Computer Science, pages 214-240. Springer, 2005.
[8] N. Schärli, S. Ducasse, O. Nierstrasz, and A. Black. Traits: Composable units of behavior. In European Conference on Object-Oriented Programming, 2003.
[9] R. Vallée-Rai, P. Co, E. Gagnon, L. Hendren, P. Lam, and V. Sundaresan. Soot - a Java bytecode optimization framework. In CASCON '99: Proceedings of the 1999 conference of the Centre for Advanced Studies on Collaborative research, page 13. IBM Press, 1999.



