Using XML as an Object Interchange Format
G.M. Bierman
Department of Computer Science
University of Warwick
May 17, 2000
Abstract
In the ODMG standard for object databases [1], a specification language is
defined to dump and load the current state of ODMG-compliant databases. In
this paper we propose an alternative language, OIFML, based upon XML.
1 Introduction
The Object Data Management Group (ODMG) has produced a number of specifications
for the persistence of object-oriented programming language objects in
databases. These specifications form an industry standard for object data
management systems (ODMSs), which has been published as a book [1] (hereafter
referred to simply as the Standard).
The Standard has four main components.
1. An object model.
2. Object specification languages.
3. Object query language (OQL).
4. Programming language bindings (currently for Java, C++ and Smalltalk).
The Standard [Chapter 3] defines two object specification languages: Object
Definition Language (ODL) and Object Interchange Format (OIF). ODL is used to
specify object types that conform to the ODMG object model. OIF is a specification
language used to dump and load the current state of an ODMG-compliant ODMS.
In this paper we are especially interested in OIF. XML is fast becoming the
standard for data exchange, particularly on the Internet. Rather than use the
ODMG's language, we shall show in this paper how to use XML as an object
interchange format. We define a new XML document type, OIFML, and show how it
can be used to specify ODMG-objects.
2 A brief introduction to XML
XML is a powerful language to describe documents. Documents typically have
both structure and content, and XML provides a means for separating one from
the other in an electronic document. For example, a memo typically consists of a
number of elements: a "from" element, a "to" element, a "subject" element, and
finally a "body" element. Here's an example of such a memo, written in XML.
<memo>
  <from>W3C</from>
  <to>Gavin</to>
  <subject>Names</subject>
  <body>What about nesting of names in XML?</body>
</memo>
The structure in this document is given by the text between the angle brackets;
these pieces of text are called tags. Notice that tags come in pairs: a start
tag and a matching end tag. If an XML document does not have matching tag
pairs, then it is not well-formed. Tags are sometimes referred to as markup,
and sometimes as metadata. The information between the matching tags is known
as the content of the element.
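The rule about matching tag pairs can be checked mechanically. The following is
a minimal Python sketch using the standard library parser
xml.etree.ElementTree. The four element names follow the memo described above;
the enclosing memo root element and the helper names is_well_formed and
element_contents are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A memo with the four elements named in the text. The enclosing <memo>
# root element is an assumption made for this sketch.
MEMO = """<memo>
  <from>W3C</from>
  <to>Gavin</to>
  <subject>Names</subject>
  <body>What about nesting of names in XML?</body>
</memo>"""

# The same memo with the closing </to> tag removed, so one pair is unmatched.
BROKEN_MEMO = MEMO.replace("</to>", "")


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def element_contents(text: str) -> dict:
    """Map each child tag of the root element to its text content."""
    return {child.tag: child.text for child in ET.fromstring(text)}
```

Here is_well_formed(MEMO) holds, while BROKEN_MEMO is rejected by the parser
because the to element is never closed.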
XML elements are permitted to have additional attributes (which we'll refer to
as XML-attributes when discussing object databases in the next section). Attribute
values are given in the start tag, for example:
Gavin
An element may have attributes but no content. XML provides a shorthand for
such elements as follows.
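To illustrate, the following Python sketch builds one element with an
XML-attribute and content, and one element with an XML-attribute but no
content, and serialises both with xml.etree.ElementTree. The attribute names
priority and level and the element name urgent are hypothetical; they are not
taken from the examples in this paper.

```python
import xml.etree.ElementTree as ET

# An element with an attribute and content. The attribute name "priority"
# is hypothetical.
to = ET.Element("to", {"priority": "high"})
to.text = "Gavin"

# An element with an attribute but no content. The names "urgent" and
# "level" are hypothetical.
urgent = ET.Element("urgent", {"level": "2"})

with_content = ET.tostring(to, encoding="unicode")

# By default an element without content is written in the shorthand form.
empty_shorthand = ET.tostring(urgent, encoding="unicode")

# With short_empty_elements=False the same element is written as an
# explicit start tag and end tag pair.
empty_longhand = ET.tostring(
    urgent, encoding="unicode", short_empty_elements=False
)
```

Both serialisations of the empty element describe the same element; a parser
reading either form recovers the same tag and the same attributes.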
XML documents can also contain a description of their logical structure, which
is called a document type definition (DTD). This is declared at the beginning
(the prolog) of an XML document, either directly or by giving a URL where it
can be found. The intention is that, should a DTD be given, the document is
checked to see that it adheres to the DTD (it is then said to be valid).
We have now covered most of the features of XML necessary to understand the
rest of this paper. Further details on XML can be found on the W3C website
[footnote 1] or, for example, in the book by Goldfarb and Prescod [2].
3 OIFML
In this section, we demonstrate how to use an XML-based language, OIFML, as
a specification language for ODMG-objects. The DTD for OIFML is given in
Appendix A. We assume in places that the reader is familiar with ODL (some
helpful examples are given in [4]).
This section follows exactly the same structure as the Standard [§3.3]: we
consider the same features, give the same examples, and even use the same
headings. We hope this will help readers familiar with the ODMG Standard.
3.1 Basic structure
An OIFML file contains a number of object definitions. Its basic structure is as
follows.
[Footnote 1: http://www.w3.org/TR/REC-xml]
...
The first line of the prolog simply states which version of XML we use (1.0, for
now). The second line states that the contents should adhere to the DTD given at
the particular URL. (This DTD is given in Appendix A.)
3.2 Object definitions
The following is a simple example of an object definition in OIFML.
Person
This defines an instance of the class Person, with the unique identifier Jack.
Notice that its attributes are left undefined.
3.2.1 Physical clustering
It is possible to specify that when an object is loaded in, it be placed physically near
another object. (Obviously the notion of nearness is implementation dependent.)
Such clustering is specified using an (optional) XML-attribute proximity.
For example, the following specifies an instance of the class Engineer, with
identifier Paul, which is to be placed physically near the object with identifier
Jack.
Engineer
3.3 Attribute value initialisation
When specifying an object, an arbitrary subset of its attributes can be
initialised explicitly. A tag contents is used to specify these attributes. An
attribute is represented with the tag attribute, which has a compulsory
XML-attribute name, which gives the name of the attribute. We use the tag value
to specify the associated value.
For example, assume the following ODL definition.
interface Person {
attribute string Name;
attribute unsigned short Age;
};
The following specifies an instance of the Person class, with the value "Sally"
for the attribute Name, and value 11 for the attribute Age.
Person
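The nesting of the contents, attribute and value tags described in this section
can be sketched in Python as follows. This is only an illustration under stated
assumptions: the tags contents, attribute, value and string are named in this
paper, but giving string a val XML-attribute, the tag name ushort for an
unsigned short literal, and the helper names literal and contents are all
hypothetical. The enclosing object definition is deliberately left out.

```python
import xml.etree.ElementTree as ET


def literal(tag: str, val: str) -> ET.Element:
    """Build a literal element that carries its value in a `val` XML-attribute."""
    return ET.Element(tag, {"val": val})


def contents(attributes: dict) -> ET.Element:
    """Build <contents> with one <attribute name=...><value>...</value></attribute>
    per entry, in the order the entries are given."""
    root = ET.Element("contents")
    for name, lit in attributes.items():
        attribute = ET.SubElement(root, "attribute", {"name": name})
        value = ET.SubElement(attribute, "value")
        value.append(lit)
    return root


# "string" is a tag named in the paper, but giving it a `val` XML-attribute
# is an assumption; "ushort" is a hypothetical tag name for unsigned short.
sally = contents({
    "Name": literal("string", "Sally"),
    "Age": literal("ushort", "11"),
})
sally_xml = ET.tostring(sally, encoding="unicode")
```

Because each attribute element names the ODL attribute it initialises, the
entries may appear in any order and any subset may be given.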
3.3.1 Short initialisation format
We are also permitted to simply list the values, and not specify the attributes
[footnote 2].
Here the values are assumed to initialise the attributes in the order they appear in
the ODL definition. For example, here is our earlier example in such a shortened
form.
Person
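The positional rule of the short format can be sketched in Python as follows.
The helper name expand_short_form is hypothetical, and the choice to accept
fewer values than attributes (leaving the remaining attributes uninitialised)
is an assumption of this sketch rather than something stated in the paper.

```python
# Attribute names in the order they appear in the ODL definition of Person.
PERSON_ATTRIBUTES = ["Name", "Age"]


def expand_short_form(attribute_order, values):
    """Pair positional values with attribute names, in declaration order.

    Values beyond the declared attributes are rejected. Attributes without
    a corresponding value are simply absent from the result (an assumption).
    """
    if len(values) > len(attribute_order):
        raise ValueError("more values than declared attributes")
    return dict(zip(attribute_order, values))
```

For the Person example, the value list "Sally", 11 expands to Name = "Sally"
and Age = 11, which is the same initialisation as the explicit form.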
3.3.2 Copy initialisation format
It is often the case that several objects are to be initialised with the same
set of attribute values. A tag shared_value_object is provided for this
purpose. For example, the following specifies an instance of the Company class
which is physically near the object with identifier McPerth, and initialised
with the same attribute values.
Company
[Footnote 2: It is not clear how valuable this facility is in an interchange
language. We have included it for completeness.]
3.3.3 Boolean literals
We can define a boolean literal using the bool tag. This has a (compulsory)
XML-attribute, val, which takes either the value true or the value false.
3.3.4 Character literals
We can define a character literal using the char tag. This value is given using the
(compulsory) XML-attribute, val.
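As a minimal sketch of how the bool and char literals could be written and read
back, the following Python code uses xml.etree.ElementTree. The tag names bool
and char and the XML-attribute val are those given above; the helper names
bool_literal, char_literal and read_bool, and the decision to reject anything
other than a single character, are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET


def bool_literal(flag: bool) -> ET.Element:
    """Build a bool element whose val is the string "true" or "false"."""
    return ET.Element("bool", {"val": "true" if flag else "false"})


def char_literal(ch: str) -> ET.Element:
    """Build a char element; exactly one character is accepted (assumption)."""
    if len(ch) != 1:
        raise ValueError("a character literal holds exactly one character")
    return ET.Element("char", {"val": ch})


def read_bool(element: ET.Element) -> bool:
    """Read a bool element back, rejecting a missing or unknown val."""
    val = element.get("val")
    if element.tag != "bool" or val not in ("true", "false"):
        raise ValueError("not a boolean literal")
    return val == "true"
```

Since neither literal has content, both serialise in the shorthand form for
empty elements.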
3.3.5 Integer literals
We provide a number of different tags, corresponding to the different sorts of
integers in the ODMG object model. Again the value is given using an
XML-attribute, val. Here are some examples.
3.3.6 Float literals
We provide the tags float and double to specify float literals. Here are some
examples.
3.3.7 String literals
We provide the tag string to specify string literals. For example:
3.3.8 Initialising attributes of structured types
We allow attributes of structured types to be initialised in OIFML. For example,
assume the following ODL definition.
struct PhoneNumber{
unsigned short CountryCode;
unsigned short AreaCode;
unsigned short PersonCode;
};
struct Address{
string Street;
string City;
PhoneNumber Phone;
};
interface Person{
attribute string Name;
attribute Address PersonAddress;
};
We provide a tag struct to specify a structured value. The components of the
structured value are specified using the tag field. This tag has a (compulsory)
XML-attribute called name. We use the tag value to specify the associated value.
For example, the following specifies an instance of this class Person, which
initialises some attributes of structured types.
Person
3.3.9 Initialising multidimensional attributes
An attribute is allowed to have a dimension greater than one; such an attribute
is essentially a fixed-size array. For example, assume the following ODL
definition.
interface Engineer{
attribute unsigned short PersonID[3];
};
We provide a tag array to specify attributes with dimensions. This tag has an
XML-attribute size, which is used to specify the size of the dimension. The
elements of the array are specified using the element tag, which has a
(compulsory) XML-attribute index, which is used to specify which element is
being initialised. Any elements which are not specified remain uninitialised.
For example, the following specifies an instance of the class Engineer, where
the first and third elements of the attribute PersonID are initialised (arrays are
assumed to be indexed starting from zero).
Engineer
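The loading behaviour described for fixed-size arrays can be sketched in Python
as follows. The tags array and element and the XML-attributes size and index
are those given above. The ushort literal tag, the values 450 and 270, and the
helper name load_array are assumptions made for illustration, as is the use of
None to stand for an uninitialised element.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: "array", "size", "element" and "index" are named in
# the paper; the "ushort" literal tag and the values 450 and 270 are
# assumptions made for this sketch.
PERSON_ID = """<array size="3">
  <element index="0"><ushort val="450" /></element>
  <element index="2"><ushort val="270" /></element>
</array>"""


def load_array(text: str) -> list:
    """Load a fixed-size array; elements that are not specified stay None."""
    array = ET.fromstring(text)
    size = int(array.get("size"))
    slots = [None] * size
    for element in array.findall("element"):
        index = int(element.get("index"))  # indices start from zero
        if not 0 <= index < size:
            raise IndexError(f"index {index} outside array of size {size}")
        # The first child of <element> is taken to be an integer literal
        # carrying its value in `val` (an assumption).
        slots[index] = int(element[0].get("val"))
    return slots
```

Loading PERSON_ID fills the first and third positions and leaves the second
one uninitialised.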
We permit a shorthand for specifying that a contiguous sequence of array
elements is initialised (again starting from zero). For example, assume the
following ODL definition.
interface Sample{
attribute unsigned short Values[1000];
};
The following specifies an instance of the class Sample, with the first four
elements defined (starting with the element indexed at zero).
Sample
3.3.10 Initialising collections
We provide a tag collection to enable attributes to be initialised with a
collection. This tag has a compulsory XML-attribute, type, which takes the
value set, bag or list, as appropriate. For example, assume the following ODL
definition.
interface Professor:Person{
attribute set<string> Degrees;
};
The following specifies an instance of the class Professor, where the
collection type attribute (a set) is initialised.
Professor
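The difference between the three collection types can be made concrete with a
small Python sketch that loads a collection element into a matching Python
container. The tag collection and its XML-attribute type are those given above.
The assumption that each member is a literal child carrying its value in a val
XML-attribute, the sample degree names, and the helper name load_collection
are hypothetical.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# One Python container per collection type: a set ignores duplicates and
# order, a bag (Counter) keeps duplicate counts, a list keeps order.
BUILDERS = {"set": frozenset, "bag": Counter, "list": list}


def load_collection(text: str):
    """Load <collection type=...> into a frozenset, Counter or list."""
    collection = ET.fromstring(text)
    kind = collection.get("type")
    if kind not in BUILDERS:
        raise ValueError(f"unknown collection type: {kind!r}")
    # Each child is taken to be a literal with a `val` attribute (assumption).
    members = [child.get("val") for child in collection]
    return BUILDERS[kind](members)


def degrees(kind: str) -> str:
    """Hypothetical sample data: three members, one of them repeated."""
    return (
        f'<collection type="{kind}">'
        '<string val="BS" /><string val="MS" /><string val="BS" />'
        "</collection>"
    )
```

The same three members give two distinct values as a set, a count of two for
the repeated value as a bag, and all three values in order as a list.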
We also permit dynamic arrays. For example, assume the following ODL
definition.
struct Point{
float X;
float Y;
};
interface Polygon{
attribute array<Point> RefPoints;
};
Thus the attribute RefPoints contains an array of unspecified size. We can
specify such arrays by simply dropping the XML-attribute, size, used in the
previous section. For example, the following specifies an instance of the
Polygon class where two of the elements of the dynamic array are initialised.
Polygon
It is perfectly acceptable to have fixed-size arrays containing dynamic arrays,
as demonstrated by the following example.
PolygonSet
3.4 Link definitions
The following sections describe the OIFML syntax for specifying relationships.
3.4.1 Cardinality "one" relationships
We provide a tag relationship to initialise attribute relationships. This tag has
a compulsory XML-attribute, name, which takes the name of the relationship. The
link to the object which forms the relationship is given using the link tag.
For example, assume the following ODL definition.
interface Person{
relationship Company Employer
inverse Company::Employees;
};
The following specifies an instance of this class Person, where the relationship
Employer is initialised with the object with identifier McPerth.
Person
3.4.2 Cardinality "many" relationships
We also allow for relationships with cardinality "many". We use the same
relationship tag as earlier, but we provide a new tag links. This takes a list
of references to objects, and also has a compulsory XML-attribute, type, to
specify whether the relationship forms a set, bag or list. For example, assume
the following ODL definition.
interface Company{
relationship set<Person> Employees
inverse Person::Employer;
};
The following specifies an instance of this class Company, and establishes a
relationship, Employees, between this instance and the objects Jack2, Joe, and
Jim.
Company
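The two relationship forms can be sketched in Python as follows. The tags
relationship, link and links and the XML-attributes name and type are those
given in this section. How a link refers to its target object is not shown
here, so the ref XML-attribute is hypothetical, as is the choice to nest link
elements inside links and the helper names one_relationship and
many_relationship.

```python
import xml.etree.ElementTree as ET

COLLECTION_TYPES = ("set", "bag", "list")


def one_relationship(name: str, target: str) -> ET.Element:
    """Cardinality "one": a relationship holding a single link."""
    relationship = ET.Element("relationship", {"name": name})
    # The "ref" XML-attribute is hypothetical.
    ET.SubElement(relationship, "link", {"ref": target})
    return relationship


def many_relationship(name: str, kind: str, targets: list) -> ET.Element:
    """Cardinality "many": a relationship holding a typed group of links."""
    if kind not in COLLECTION_TYPES:
        raise ValueError(f"type must be one of {COLLECTION_TYPES}")
    relationship = ET.Element("relationship", {"name": name})
    links = ET.SubElement(relationship, "links", {"type": kind})
    for target in targets:
        ET.SubElement(links, "link", {"ref": target})
    return relationship


employer = one_relationship("Employer", "McPerth")
employees = many_relationship("Employees", "set", ["Jack2", "Joe", "Jim"])
```

The identifiers McPerth, Jack2, Joe and Jim are the ones used in the examples
of this section.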
4 Conclusions
In this paper we have shown how ODMG-objects can be encoded in a new XML-based
language, OIFML. XML is fast establishing itself as the standard for electronic
data interchange. It seems prudent therefore to respect this standard when
defining means for the interchange of object databases, rather than defining
yet another ad-hoc language.
A nice consequence of using the XML standard is that our language, OIFML, is
immediately supported by a large number of tools. A wealth of parsers, editors
and browsers is already available. For example, the CD-ROM attached to the
XML Handbook [2] contains 125 XML software packages.
As well as providing a means for seeding new object databases, OIFML can
be used to seed special semi-structured databases, such as Lore [3], with the
contents of ODMG-compliant databases.
Acknowledgements
I am grateful to Ken Moody for his comments on an earlier draft.
References
[1] R.G.G. Cattell et al. The Object Data Standard: ODMG 3.0. Morgan Kaufmann,
2000.
[2] C.F. Goldfarb and P. Prescod. The XML Handbook (second edition).
Prentice-Hall International, 2000.
[3] J. McHugh, S. Abiteboul, R. Goldman, D. Quass, and J. Widom. Lore: A
database management system for semistructured data. SIGMOD Record,
26(3):54-66, 1997.
[4] J.D. Ullman and J. Widom. A First Course in Database Systems. Prentice-Hall
International, 1997.
A DTD for OIFML