Tags: abstract searches, aliases, approximate searches, attributes, backend, concurrent operation, database design, database packages, execution, implementation, information references, knowledge service, ldap, performance evaluation, remainder, search space, section 6, server searches, substring, types of queries,
An X.500 and LDAP Database: Design and Implementation
Timothy A Howes
Abstract searches efficiently, including full substring searches,
several kinds of approximate searches, and, with an
This paper describes the design and implementation underlying ordered database (such as a btree), range
of xldbm, an X.500 and stand-alone LDAP backend queries. The database is fully disk-based and makes use
database. The xldbm database supports efficient of caching and threading for good performance and
execution of all queries, data modifications, and search highly concurrent operation. It supports site-specific
pruning using centroids, all using simple underlying index configuration and performance tailoring, and
technology freely available on the Internet. Our centroids indexing for distributed search space pruning.
approach to resolving various kinds of queries is
described, along with a performance evaluation and The remainder of this paper gives an overview of
comparison to other popular database packages. the X.500/LDAP information and query models driving
our database design, followed by an overview of our
approach to the problem. Section 4 gives details on
1 Introduction how we handle specific types of queries. Section 5
X.500 [1] and LDAP [2,3] define similar directory describes our handling of aliases and knowledge
service information and query models. The information references, while section 6 discusses how we use
model is centered around entries, which are composed centroids to provide efficient multi-server searches.
of attributes. The entries are organized into a tree Section 7 shows how modifications affect the database
structure, usually corresponding to a geographical and structure, and Section 8 reports some performance
organizational distribution. The query models allow measurements and summaries various optimizations
searching of portions of the tree based on filter criteria we have implemented to improve performance.
involving attributes (e.g., entries with a surname of Finally, Section 9 discusses limitations of our current
"Jensen"), and returning requested attributes from each design, and what might be done to eliminate them.
matching entry. The model also defines operations for
adding, removing, or changing entries in the directory. 2 Overview of X.500 and LDAP
The X.500/LDAP model poses some interesting X.500 is the OSI directory service. It defines an
challenges in database design. Searches have broad information model, determining the form and character
scope, spanning from one entry to the entire tree, of information in the directory; a namespace, allowing
making efficient index construction difficult. Several the information to be referred to and organized; and a
types of primitive searches are involved, including functional model, determining what operations can be
equality, substring, and approximate matching, and performed on the information.
range queries. Arbitrary boolean combinations of
search filters must be supported, requiring query The information model is centered around entries,
optimization for efficient processing. Searches can which are composed of attributes. Each attribute has a
span multipe servers, following aliases or not. Data type and one or more values. The type determines the
can be arbitrarily distributed among servers by means attribute's syntax, which defines what kind of
of knowledge references. Efficient alias and knowledge information is allowed in the values. Entries are
handling during a search are key to good distributed arranged in a tree structure and divided among servers
performance. by means of knowledge references in a geographical
and organizational distribution. Entries are named
We imposed an additional constraint on the according to their position in this hierarchy. Alias
underlying technology in our system. It had to be entries are allowed, which point to other entries,
simple and freely-available Internet software. We did circumventing the hierarchy. Figure 1 depicts the
not want to require a commercial DBMS package, and relationship between entries, attributes, and values and
we wanted the system to be understandable by users shows how entries are arranged into a tree.
without much effort. Also, our goal was to make
installation and administration of the system as
straightforward as possible.
The database we have designed and implemented is
based on any of several freely available hash or btree
packages, such as GNU dbm [4], or the Berkeley db
package [5]. It handles all types of X.500/LDAP
to support the X.500 list and read operations).
3 Approach
Our approach to the X.500/LDAP database design
problem is a simple one. We wanted high performance,
but not at the expense of complicated management,
Alias Object administration, or recovery procedures. We wanted
reliability, but not the complications introduced by a
entry entry two-phase commit , roll-back or other guaranteed-
Attr, Attr, ... reliable transaction protocol. We felt the system should
be understandable with little effort, and easy to
manage. In short, our goal was to develop a highly-
Figure 1: The X.500 model is centered functional system with good performance that was easy
around entries which arecomposed of to administer and understand, and had reasonable
attributes. Entries are arranged into a tree reliability and recovery capabilities.
structure. Alias entries can circumvent the
hierarchy. We started with the simple two-part index structure,
depicted in Figure 2. First, we assign each entry a
Functionally, X.500 defines operations for uniqne identifier by which it can be referred to
searching, reading, and writing directory information. efficiently. All entries in the database are maintained in
The search operation can span a single entry, an entry's a single index file, keyed by this ID. Entries are stored
children, or an entire subtree of the directory. Alias in this index using a simple text format of the form
entries can be followed automatically during a search, "attribute: value," as shown in Table 1. Non-ASCII
even if they cross server boundaries. There are also values, or values that are too long to fit on a
operations to read a single entry or list the children of reasonably sized line are represented using a base 64
an entry. Operations are provided to add entries, delete encoding, making entries in the index human readable
entries, and modify existing entries. and easy to reconstruct if lost. Given an ID, the
LDAP, the lightweight directory access protocol, corresponding entry can be returned quickly and
was originally developed as a front-end to the X.500 efficiently, for the cost of a single hash table or btree
directory. Naturally, it assumes the same information lookup, depending on the choice of underlying
model and namespace as X.500. LDAP is lightweight technology. If other index files are corrupted or
for three main reasons. First, the functional model is destroyed, they can be regenerated from this one file.
less complicated. The read and list operations are left
out; they are emulated using search. Some of the more id2entry attribute index
esoteric and less-often-used features of other operations
are also not included. id1entry1 value1id1,...,idn
Second, LDAP runs directly over TCP or other ... ...
reliable transport. It avoids the overhead of the OSI idnentryN vlaueNid1,...,idn
session and presentation layers, making connection
setup and packet handling faster and simpler. Figure 2: Xldbm index structure. The
Third, LDAP uses simple string representations for id2entry index stores entries in text form.
most syntaxes. While X.500 encodes data elements as The attribute indexes map values to lists of
highly structured ASN.1 elements, the LDAP approach entry IDs.
encodes them as simple strings. This is a big Second, for each indexed attribute, we generate
performance win in encoding/decoding speed and another file containing a list of IDs of entries
complexity. containing each value of the attribute. (See the detailed
We have made minor extensions to LDAP so that discussion on each query type below for an explanation
it can be used as a stand-alone directory service, not of what each value actually is.) This index is keyed by
just a frontend to X.500. The modifications involve value, making the retrieval of a list of IDs of entries
the addition of referrals to the protocol, and the containing a given value efficient. These indexes are
development of a backend database supporting the not text-based, nor are they meant to be read or
LDAP information model and query semantics. The manipulated by users (except via a low-level database
similar X.500 and LDAP information and query administration program we provide, or indirectly
models make database development for both a similar through an X.500 or LDAP query). Vaues are
task. The same database design can be used to backend normalized before they are added to an index (e.g.,
both protocols, with only minor modifications (e.g., caseIgnoreString values are case-normalized,
2
telephoneNumberSyntax values have spaces and dashes straightforward: the commonName index is consulted
removed). The original values are retained in the for the list of entries corresponding to the value "Babs
id2entry index and will be returned in a search. Jensen." Next, this list of entries is read from the
Additional indexes of this type are constructed for id2entry index. The entries returned are guaranteed to
Distinguished Names, aliases and knowledge match the filter.
references. These specialized indexes are discussed in
detail later. 4 . 2 Approximate Matching
Table 1: Text entry representation in the There are several approaches to approximate
id2entry index. The first line contains the matching. Phonetic algorithms such as soundex and
entry ID. The second line contains the metaphone [] are popular, and there is ongoing research
Distinguished Name. Subsequent lines into spelling or other error-correcting algorithms such
contain attributes. as the one used by glimpse []. Currently, we support
12345 both soundex and metaphone, and have plans to
dn: cn=Babs Jensen, o=Babsco, c=US support glimpse. For both phonetic algorithms we
cn: Babs Jensen chose an approach that makes few assumptions about
cn: Barbara J Jensen the structure of the data (e.g., it does not assume name
sn: Jensen data and attempt to extract a surname for matching
... purposes). This makes our algorithm appropriate for a
wide variety of data, and reduces the risk and
complexity involved in assuming a semantic structure.
Given this index structure, answering a simple
The disadvantage is that we are unable to take
query is straightforward: look up the requested value in
advantage of any knowledge about the type of data to
the appropriate attribute index, returning a list of entry
improve performance.
IDs; look up those entry IDs in the id2entry index,
returning the resulting entries to the user. For reasons We treat the value being matched as a sequence of
discussed below, the entries read from the index files words. When building the index, a phonetic code is
may be candidate entries, that is, there is no guarantee generated for each word in the value. The code is then
that they actualy match the query. Therefore each entry stored in the index, mapping to the ID of the entry
has the filter applied to it directly before it is returned containing the original value. The value given in an
as a match. This also provides an opportunity to apply approximate matching query is similarly broken into
access control, size and time limits, etc. words and then codes, each of which is looked up in
the index.
Using this simple index structure, we are able to
answer virtually any X.500/LDAP query efficiently. An entry is considered to match the query if it has a
The requirements on the underlying database are value containing words corresponding to all the given
minimal: index entries are read and written; entry IDs codes in the proper order. If the query contains multiple
are inserted in and deleted from index entries. Only words, the lists of associated IDs are intersected to
during range queries and, optionally, some forms of produce the final list. The words must appear in the
approximate matching is some ordering on the indexes same order in the value. Since ordering information is
required, implying the btree, rather than hash file, lost in the index, the filter must be applied directly to
backend. each candidate entry to determine if it really matches
the query. Table 2 shows some example approximate
matching queries and values to match against
4 Specific Query Types (including the corresponding phonetic codes generated
This section discusses xldbm's index use in detail by the metaphone algorithm), and a brief explanation
for various types of queries. For all queries, the general of why the value does or does not match the query.
approach is to consult one or more index files to
generate a list of entries. Depending on the query type,
these entries may have the filter applied to them
directly to ensure a match. Individual search primitives
are described first, followed by boolean combinations
of queries.
4 . 1 Simple Equality
An equality search tests for entries that have a
given value for a certain attribute. For example, a
commonName of "Babs Jensen." Satisfying such a
query using the xldbm index structure is
3
Table 2: Example approximate matches interesting and challenging posed by X.500/LDAP.
Both models support arbitrary substring matching on
Query (codes) Value (codes) Match? text attributes. A query may specify a leading
Babs Jensen Babs Johnson Yes - match substring, trailing substring, arbitrary internal
(BBS JNSN) (BBS JNSN) substring, or any combination of these to be matched.
We set out to design a scheme that was fast, efficient
Babs Jensen Jensen Babs No - match, but and flexible. A glimpselike approach, though efficient
(BBS JNSN) (JNSN BBS) wrong order in terms of index space used, was not fast enough and
Jensen Smith No - codes do difficult to update incrementally (e.g., in response to
(JNSN) (SMO) not match modifications). Other approaches involving fast pattern
matching on values suffer from order N performance
Bob Smith Bob A Smith Yes - match where N is the size of the data being searched.
(BB SMO) (BB A SMO)
Our solution is to generate all substring
Bob A Smith Bob Smith No - code for A components of a fixed length for each value and index
(BB A SMO) (BB SMO) is missing those. Additional anchors are added to each value,
There are several options for code generation and marking the beginning and end of the string. When a
matching in the index. The simplest is to generate query is presented, similar substring components are
fixed-length codes of some maximum length. This generated corresponding to it. These components are
makes generation and lookup simple. If the code length looked up in the index and the resulting lists of entry
is too short, it can lead to unexpected matches because IDs are intersected to form the list of candidate entries.
of code truncation (e.g., "Babs" matching These candidates then have the query applied to them
"Babsikowjskvik" - both produce a code with BBS as directly, to ensure they match the query. This last step
the first three characters). If the code length is too is necessary since, as for approximate matching, the
long, it can lead to missed matches that should be ordering of substrings is not retained in the index.
returned (e.g., "Howe" not matching "Howes"). These Figure 3 illustrates this process for the value "Babs."
two problems are in conflict.
Babs BABS ^BABS$
The solution is to adopt a configurable "prefix"
matching scheme in which a code is considered a match ^BA
if it contains the key code as a prefix. This solves the Substring
missed match probolem, but can still lead to BAB
unexpected matches as described above. To combat this Attribute
problem, we add the constraint that the two codes must
ABS
Index
differ in length by at most N characters, where N is an BS$
administrator-defined constant. Setting N to zero
results in strict code matching. Setting N to a large Figure 3: Substring index generation for
number results in strict prefix matching (e.g., "Babs" components of length three. The value is
(BBS) will match "Babsik" (BBSK) and normalized and leading and trailing anchors
"Babsikowjskvik" (BBSKJSKFK)). Setting N are added. Then all possible substrings of
somewhere in between results in more reasonable length three are generated and stored in the
behavior (e.g., "Babs" matches "Babsik", but not corresponding attribute index.
"Babsikowjskvik"). We have found two to be a pretty
good number for N. Our experience shows that a component length of
three is optimal for databases of around 100,000
To support this variable prefix matching requires entries. The optimal length depends on the size of the
the underlying database to support prefix retrieval of data being indexed, the type of data, and a time-space
codes. With a btree or other ordered method, this is trade-off in query performance versus index size.
straightforward. With a hash-based scheme, it is more Longer components result in each substring mapping
difficult. For small values of N and a restricted to fewer entries, but the number of distinct
phonetic code alphabet (e.g., in the soundex scheme), components increases. Shorter components reduce this
it is feasible to generate all possible codes of greater number, but increases the number of entries to which
length (up to N) and look them up. This method each component maps. The two extreme cases provide
clearly does not scale well, and our implementation some insight: A component length of one means that
only implements variable prefix matching with an each letter is a component. For many types of data,
ordered underlying database. this means that close to every entry will be listed for
each component, making the list of candidate entries
4 . 3 Substring Matching long. A very long component length degenerates into
the equality index case, and no advantage is gained.
The substring matching problem is one of the most
4
Note also that the component length sets a lower which this approach works are limited.
bound on the length of substring queries that can be
supported (e.g., with a component length of three, a 4 . 5 Boolean Combinations
query for *A* cannot be answered).
As with any database, one of the most challenging
In practice, this scheme works surprisingly well. problems is the support of arbitrary queries. If the
For most data sets and query types, we have found they query set is restricted and can be predicted ahead of
tend to contain some "power components" that help to time, design is simplified. In the case of X.500/LDAP
reduce the list of candidates quickly. Pathological cases (as with the relational model), there is no limit to the
exist in which most entries are listed for each complexity of queries. The method by which these
component. In such cases the "power" aspect may be queries are built is straightforward, though, making the
contained in the component ordering, rather than the task easier. Boolean combinations of queries include
components themselves, in which case the algorithm conjunction (AND), disjunction (OR), and negation
reduces to a substring search of the entire space, as (NOT). If X and Y are queries, so are "X AND Y," "X
candidates are eliminated. We have found such cases to OR Y," and "NOT X."
be rare in practice.
Conjunctive queries are easy to handle. A list of
4 . 4 Ranges candidates is produced for each conjunction and then
intersected. Note that it is not necessary to evaluate
For attributes supporting some kind of ordering, candidates before the intersection.
the X.500/LDAP models support inequality queries for
Disjunctive queries are similarly straightforward. A
entries containing values greater-than-or-equal-to or
list of candidates is produced for each disjunction and
less-than-or-equal-to a given value. Although often the
then unioned. Again, candidates need not be evaluated
case in practice, there is no requirement that these two
before the union takes place.
operators be used together to form a bounded range
query. With an underlying database supporting ordered Negation queries are more difficult. A simple
retrieval, responding to such queries is easy. With a approach is to produce a list of candidates matching the
hash-based scheme, it presents a problem we have not query and then subtract it from the list of all
yet solved, except in some specific cases. candidates. Unfortunately, since the original list may
be only a list of candidates not guaranteed to match the
With ordered retrieval, a greater-than-or-equal-to
query, blindly applying this approach can lead to lost
query is answered by retrieving the given value (or
matches. The solution is to apply the query (before
"smallest" value greater than it), and then stepping
negation) to the list of candidates before performing the
through subsequent values in the ordering. The
subtraction. This produces correct results, but can be
resulting lists of entry IDs are unioned together to
expensive.
form a single list of candidates. These candidates are
guaranteed to match the filter, so there is no need to For example if the query is for entries not
apply the filter directly to them. A less-than-or-equal-to containing an objectClass of person and the database
query is handled similarly. The first item in the contains a million entries, only one of which is not a
ordering is retrieved, followed by subsequent items person, the method degenerates into a linear search. In
until an item greater than or equal to the given key is this case, it would be more efficient to step through
reached. The resulting entry IDs are unioned to form the values of the objectClass attribute, building
the result. If the query involves range of values (i.e., candidates from the ones matching the query. By
greater than one vaue and less than another), obvious building more knowledge into the database (e.g., how
optimizations can be made. The efficiency of this many distinct values are in an index), NOT
method is proportional to the number of keys in the performance can be improved.
range.
With a hash-based scheme, ordered retrieval is not 5 Aliases and Knowledge References
possible. In some cases, where the ordering can be
approximated by a substring search, a hash-based Aliases and knowledge references provide similar
approach can still provide results. For example, an challenges to database design. Both features create
attribute containing UTC time values has this situations where a search must be continued "outside"
property. A query requesting entries with a time greater of the original search scope, perhaps even outside the
than or equal to 1994 and less than or equal to 1995 original server handling the query. Of the two, aliases
produces the same results as a substring query for a are more problematic because they can point anywhere,
time with a leading substring of "1994" (plus the there is no consistency requirement, and they are user-
simple case of a time equal to 1994). This works creatable.
because UTC time has a concrete string representation A search that does not have the searchAliases flag
that is lexicographically increasing. The situations in set in X.500 or the alias flag set to derefAlways or
5
derefSearching in LDAP is not affected by aliases. If to the client. In LDAP, knowledge references are
one of these flags is set, indicating that aliases returned as referrals (more on knowledge references and
themselves should not be searched, but rather what their relationship to centroids in Section 6). Figure 5
they point to, a new phase is added to the search depicts the structure of the knowledge reference index.
procedure.
The key to handling aliases is to identify those 1
aliases that point outside the scope of the search. If an
alias does not "escape" the scope of the search, the 2
entry it points to will be searched automatically
(because it is contained within the scope, not because
an alias points to it - why it gets searched is
immaterial, as long as it does). Once such aliases are
identified, the search is continued with the entries to
which they point (either the entry itself for a one-level One-level KR index Subtree KR index
search, or the entry and all its descendents for a subtree
search). Base object searches are easy to handle by 12 12
examining the entry directly, and do not require any
Figure 5: Knowledge reference scope index.
special indexing.
A search beginning at entry 1 is continued
To efficiently identify aliases that need searching, at the server identified by the knowledge
two new indexes are maintained, one for one-level reference contained in entry 2.
scopes, one for subtree scopes. For each non-leaf entry,
Maintenance of the alias and knowledge reference
the one-level index contains an entry containing the
indexes is non-trivial but straightforward. When a new
entry IDs of alias children of the entry that do not
alias entry is added to the tree, the one-level alias entry
point to other children (i.e., aliases that escape the one-
for its parent may need updating (if the alias does not
level scope). Similarly, the subtree index contains
point to a sibling entry). The subtree alias entry for
entry IDs of alias descendents of the entry that do not
each ancestor may also need updating. For knowledge
point to other descendents. During a search, the list of
references, the same is true.
candidate entries is generated as before, and then the
appropriate alias-scope index is consulted to determine This approach allows the efficient identification of
if there are entries outside the scope that should be alias and knowledge references at which a search must
searched. Figure 4 illustrates this process for a sampe be continued. Aliases can be particularly troublesome
tree. from a performance standpoint. If many aliases escape
the scope of a search, each one must be searched
individually, causing a significant performance penalty.
It's hard to see a general solution to this problem that
1 2 guarantees good performance, and we feel this is a
design deficiency in the X.500 and LDAP models.
4 6 Centroids
3 5
A weakness of the X.500/LDAP model is its lack
One-level alias index Subtree alias index of support for wide-area searches. The hierarchical
scheme works well for searches whose scope can be
13 13 restricted using the namespace. For searches that do not
have this property, the model degenerates to a search of
Figure 4: Alias scope index. A subtree or the entire tree, contacting every server. Clearly, this
one-level search starting at entry 1 approach does not scale well.
consultss the appropriate index and
determines it needs to continue the search Several solutions have been proposed, including
with the entry pointed to by entry 3. alternate hierarchies; special "yellow pages" portions of
the tree where attributes are organized to facilitate
Knowledge references are handled via a similar alternate searching; "alias" trees that collect pointers to
approach. Indexes are constructed for one-level and information in other servers; and out-of-band
subtree knowledge references. Given a search scope and distributed indexing to help prune the search space. All
the entry ID of the base object, the list of knowledge these schemes have their advantages, but we chose the
references within that scope can be quickly retrieved. In latter approach for our system. It averts many of the
X.500, these knowledge references are either used to maintenance and consistency probems of using aliases,
chain the search or returned as continuation references and does not require the global cooperation necessary to
6
implement an alternate namespace. It also has the smaller, but value-based centroids allow
advantage of presenting a consistent model to clients; more accurate and flexible query resolution.
they see the same tree as always, searches just happen In this example, the word centroid is
more efficiently. smaller by one "Babs" and one "Jensen," but
gives a false positive match for a query for
We chose to use centroids as our distributed "Bjorn Johnson."
indexing framework. Adapted from work by Salten [],
and originally proposed for use on the Internet in the Values Word centroid Value centroid
WHOIS++ system, centroids have the potential to
provide efficient wide-area searching in the Babs Jensen Babs Babs Jensen
X.500/LDAP model. We have adapted centroids to this Babs Johnson Jensen Babs Johnson
model, and included a few extensions that allow us to Babs Jensen Johnson Bjorn Jensen
support the more flexible query language defined by Bjorn Jensen Bjorn
X.500 and LDAP. Incorporating centroids into the X.500/LDAP
The basic centroid model involves generating the model is not difficult conceptually. A search is
list of distinct words in a database. This list is called a initiated at some point in the hierarchy. Normally, a
centroid of the database. If centroids are generated for server would search its own data and chain or refer the
many such databases and given to another server, the search to all servers holding data below it in the tree. If
server can consult the list of centroids to determine a server holds centroids for these servers, it can consult
which low-level databases might hold the answer to a them and only chain or refer the search to those servers
query. An trivial example is shown in Figure 6. possibly able to satisfy the query. Figure 7 illustrates
this process.
Babs A,B fix
Jensen A,B
Figure 7: Search-space pruning using
Johnson A,B centroids. A server holding centroid data for
Bjorn A servers below it in the tree only chains or
Lars B refers the search to those servers possibly
able to answer the query.
server C
In our database design, we introduce the concept of
a centroid entry. Conceptually, the entry holds an
Babs Jensen Babs Jensen entire centroid, along with access information for the
Babs Johnson Lars Johnson server that generated the centroid. (In our
Bjorn Jensen implementation, this information is contained inthe
name of the centroid entry.) The values in the entry are
server A server B added to the indexes just like normal values. During a
search, the indexes are consulted as usual, returning the
Figure 6: Centroid example. A centroid is a ID of the centroid entry. Here the handling of centroid
list of distinct words in a database. A server entries differs from regular entries. Instead of returning
that collects centroids from other servers is the entry to the client, the search is continued using
able to determine which servers are likely to the access information in the centroid entry, or the
be able to answer a given query. A search access information is returned to the client so it may
involving the word "Bjorn" can be directed continue the search.
only to server A. This implementation of centroids has several
We modified the centroid model to include whole advantages. First, it was very easy to implement. Once
values, rather than words. Simiar modifications were we developed the index structure described in the
made by the desighers of SOLO [], who use centroids previous sections, it took literally less than a dozen
for navigation and searching. The use of values in the lines of code to support centroids. Second, the pruning
centroid enables a broader range of searches to be happens through the normal indexing consultation
supported (e.g., substrings), and fewer "false positives" process. Third, the centroid generation and addition
to be returned. There is a trade-off, of course. The process is done using normal DAP or LDAP
centroids produced are larger, not having as attractive operations. Centroids are added using the add operation
collapsing properties as their word-oriented and deleted using the delete operation. Existing
counterparts. Table 3 provides a comparison between centroids are modified using the modify operation (e.g.,
the word and value approaches to centroid generation. in response to incremental changes to the centroid). A
separate process is responsible for using these
Tablle 3: Word versus value centroid operations to generate and apply centroid changes to
comparison. Word-based centroids tend to be and from the database. This makes it easy to add new
7
schemes later, or change the existing scheme. Second, our original design called for index entries
as single blocks of entry IDs. In practice, these blocks
become quite large. For example, in our database there
7 Modifications are close to 100,000 people entries, causing the single
Both X.500 and LDAP support adding, deleting and objectClass index entry for the value person to be
modifying entries in the directory. They assume that about 400,000 bytes (100,000 entries * 4 bytes per
read requests of the directory are far more frequent than entry ID). If an entry is added, this index entry
writes, but modify performance is still an issue. In increases by one entry ID, causing the entire 400K
both models, every set of modifications must either block to be rewritten. This caused a severe performance
succeed or fail as a group. Although we do not provide penalty. To combat the problem, we introduced a
a transaction system with roll-back capability, we do scheme in which index entries are broken up into file
minimize the time during which a serious failure can system-sized block fragments. Small blocks are stored
occur. as before. Large blocks are accessed with a level of
indirection through a "header" block which points to
A modification is implemented as a three-step the fragments. A slight penalty is paid when the block
process. First, an in-memory copy of the entry is is read (several small reads take longer than than one
changed. If this fails, or the entry fails to satisfy the large one), but the performance increase on a modify is
directory schema requirements after the modifications substantial. Generally, only one fragment needs
are applied, the operation is aborted. Second, the updating.
affected indexes are updated. This involves inserting
and deleting entry IDs from index entries. If anything Third, we noticed that some index entries,
goes wrong here, the operation is aborted. Third, the particularly in the substring index, were so large that
id2entry index containing the entry itself is updated. In they contributed little to reducing the number of
case of a catastrophic failure, this is the only file that candidates. Worse, such blocks contribute to poor
is crucial. The other indexes can be reconstructed from performance, both on reads and writes. To reduce the
the id2entry index. So, the only critical section of the effects of such large blocks, we introduced the concept
modify is the update of the id2entry index. Since each of an allIDs block. The allIDs block is a simpe stand-
entry is represented in the index by a contiguous in for a block containing all entry IDs in the database.
sequence of characters, the update of this index can be It contains a single flag-valued entry ID, making it
done with a single write operation, further reducing the very small. The allIDs block acts like the universal set
risk. in index operations. The block size above which a
block is replaced by an allIDs block is configurable.
In practice, we find this level of reliability adequate, The optimum size depends on the database size and
especially when combined with the ability to do content. In practice, we have found a size of between
replication, which we have implemented. A full 5,000 and 10,000 entry IDs to work well for our
discussion of our replication strategy is beyond the 100K-sized environment. Pathological cases exist in
scope of this paper, but changes are logged to a file which this approach leads to a linear search of the
which is then read by a separate replication process, database, but we have found these cases to be rare.
responsible for distributed changes to any number of
replicas. Finally, our biggest performance enhancement
comes from extensive use of caching. The most
expensive part of the search operation is the reading of
8 Performance Enhancements candidates from the id2entry index, and the subsequent
application of the filter. We maintain a configurable
As described, the system works surprisingly well, in-memory cache of entries, keyed by DN and by entry
especially considering its simplicity. We have made ID. Table 4 shows a comparison of some typical
several enhancements that have significantly increased searches, with "cold" and "hot" entry caches. We also
performance, especially on modifies. First, our original maintain a cache of open index files, and the
design called for a separate index file for each attribute underlying hash or btree database keeps a cache of its
and index type (i.e., for each attribute files for the own.
equality index, substring index and approximate index
are created). To reduce the number of indexes open at Table 4: Effect of caching on performance
one time, we have combined these three files into one. for various queries.
Doing so required the addition of a prefix scheme for
Query Cold cache Hot cache
storing the value keys in order to avoid conflicts.
Equality keys are prefixed by "=", substring keys are cn=Babs - -
prefixed by "*", and approximate keys are prefixed by
"~". This reduces by a factor of three the number of cn=*Babs* - -
open indexes necessary, with little impact on cn~=Babs - -
performance.
8
Directory Access Protocol. Request For
Comments (RFC) 1777, March, 1995.
9 Limitations [3] T. Howes, S. Kille, W. Yeong, C. Robbins, The
String Representation of Standard Attribute
Our scheme performs well, but it does have Syntaxes. RFC 1778, March 1995.
limitations. First, the scheme will not scale to more
than a few hundred thousand to a million entries. The
algorithms are sound, but the use of simple file Author Information
system-based hash and btree packages incurs a
Tim Howes is a Senior Systems Research
performance penalty. Replacing these underlying
Programmer for the University of Michigan's
systems with a DBMS system would undoubtedly
Information Technology Division. He received a
improve performance and increase the number of
B.S.E. in Aerospace Engineering, a M.S.E. in
entries the scheme can handle.
Computer Engineering from U-M, and is completing a
Second, we feel modify performance is still Ph.D. in Computer Science. He is currently project
unacceptably poor. The limiting factor here again is director and principal investigator for the NSF-
the underlying database package. A file system-based sponsored WINX project, and in charge of directory
approach, not optimized for modification, nor for our service development and deployment at U-M. He is the
application in particular, simply cannot compete with primary architect and implementor of the U-M LDAP
a commercial DBMS. directory package, the DIXIE system, the GDA X.500
DSA, and a major developer of the QUIPU X.500
Finally, the user-friendy text representation used to implementation. He is author or co-author of several
store entries in the id2entry index slows down the papers and RFCs, including RFC 1777 and RFC 1778
reading of entries. Especially in the X.500 case, there defining the LDAP protocol. He is chair of the IETF
is significant time spent converting from the text Access, Searching, and Indexing of Directories working
format to an internal representation. Storing entries in group, and an active member of the ACM and IEEE.
BER-encoded ASN.1 (the representation used on the
wire) would greatly improve this performance. It would
require developing new BER-aware versions of many
utility routines, including things like strcmp and
friends.
1 0 Summary
This paper describes the design and implementation
of xldbm, a backend database for X.500 and stand-alone
LDAP. The system uses simple freely-available
database technology and provides good performance,
reliability, and recovery. Xldbm handles virtually all
types of X.500/LDAP queries efficiently, including
full substring and approximate matches. Aliases and
knowledge references are handled through clever index
construction. Centroids have been adapted for use as a
search space-pruning device. Several performance
enhancements have been implemented, making the
system perform surprisingly well.
1 1 Acknowledgements
This material is based upon work supported by the
National Science Foundation under Grant No. NCR-
9416667.
References
[1] The Directory: Overview of Concepts, Models and
Service. CCITT Recommendation X.500, 1988.
[2] W. Yeong, T. Howes, S. Kille, Lightweight
9