Information about http://www.jxos.org/publications/jx-sec.pdf

Pages: 16
Language: english
Created: Tue Sep 3 15:00:31 2002
Michael Golm, Meik Felser, Christian Wawersich, Jürgen Kleinöder

A Java Operating System as the Foundation of a Secure Network Operating System

Technical Report TR-I4-02-05 · August 2002

Department of Computer Sciences 4
Distributed Systems and Operating Systems

Friedrich-Alexander-Universität Erlangen-Nürnberg
TECHNISCHE FAKULTÄT (Faculty of Engineering Sciences)
Univ. Erlangen-Nürnberg · Informatik 4 · Martensstr. 1 · 91058 Erlangen · Germany
Phone: +49.9131.85.27277 · Fax: +49.9131.85.28732
E-Mail: i4@informatik.uni-erlangen.de · URL: http://www4.informatik.uni-erlangen.de

                            A Java Operating System as the Foundation of a
                                  Secure Network Operating System

                     Michael Golm, Meik Felser, Christian Wawersich, Jürgen Kleinöder
                                     University of Erlangen-Nuremberg
                  Dept. of Computer Science 4 (Distributed Systems and Operating Systems)
                                  Martensstr. 1, 91058 Erlangen, Germany
                     {golm, felser, wawersich, kleinoeder}@informatik.uni-erlangen.de


Abstract

Errors in the design and implementation of operating system kernels and system programs lead to security problems that very often cause a complete breakdown of all security mechanisms of the system.

We present the architecture of the JX operating system, which avoids two categories of these errors. First, there are implementation errors, such as buffer overflows, dangling pointers, and memory leaks, caused by the use of unsafe languages. We eliminate these errors by using Java, a type-safe language with automatic memory management, for the implementation of the complete operating system. Second, there are architectural errors caused by complex system architectures, poorly understood interdependencies between system components, and minimal modularization. JX addresses these errors by following well-known principles, such as least-privilege and separation-of-privilege, and by using a minimal security kernel, which, for example, excludes the filesystem.

Java security problems, such as the huge trusted class library and reliance on stack inspection, are avoided. Code of different trustworthiness or code that belongs to different principals is separated into isolated domains. These domains represent independent virtual machines. Sharing of information or resources between domains can be completely controlled by the security kernel.

1 Introduction

There are two categories of errors that make current systems so easily vulnerable. The first are implementation errors, such as buffer overflows, dangling pointers, and memory leaks, which are caused by the prevalent use of unsafe languages in current systems. This becomes dangerous when an OS relies on a large number of trusted programs. Of the top ten CERT notes (as of January 2002) with the highest vulnerability potential, six are buffer overflows [4], [5], [6], [7], [8], [9], two relate to errors in checking user-supplied strings that contain commands, thus allowing the user to run arbitrary commands with root privilege [10], [11], one executes commands in emails [12], and one is an integer overflow [13]. The six buffer overflow vulnerabilities could have been avoided by using techniques described by Cowan et al. [15]. However, not all overflow attacks can be detected, and the authors recommend the use of a type-safe language.

An argument that is often raised against type-safe systems and software protection is that the compiler must be trusted. We think that this is not a very strong argument, for the following three reasons. (i) Traditional systems, such as Unix, also use compilers to compile trusted components, like the kernel and system programs. Security in such a system relies on the assumption that the C compiler contains no bugs or trojan horses [61]. (ii) Only the compiler backend that translates the type-safe instruction set into the instruction set of the processor and the verifier that guarantees type-safety must be trusted. (iii) The additional effort that must be put into the verification of two components (the compiler backend and the verifier) pays off with reduced verification effort for many trusted system programs. Most vulnerabilities in current systems are caused not by bugs in the kernel but by bugs in system programs.

The second category of errors, the architectural errors, is more difficult to tackle. The three CERT notes related to the execution of commands in strings and emails are critical because the vulnerable systems violate the principle of least-privilege [52]. Thus, in current mainstream systems the question is not whether the proper security policy is used, but whether security can be enforced at all [39]. Violations of the principle of least-privilege, an uncontrolled accumulation of functionality, many implementation errors, complex system architectures, and poorly understood interrelations between system components make current systems very vulnerable. This is a problem that affects all applications, because applications are built on top of an operating system and can be only as secure as its trusted programs and the underlying kernel.
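
The first category of errors can be illustrated with a small example. The following is a minimal sketch, not code from JX: the class and method names are hypothetical. It copies input into a fixed-size buffer without a manual length check, the pattern behind a classic C buffer overflow, and shows that the Java runtime rejects the out-of-bounds write instead of letting it overwrite adjacent memory.

```java
class BoundsCheckDemo {
    // Hypothetical helper: copies input into an 8-byte buffer without
    // checking the input length first. Returns true if the copy completed.
    static boolean copyUnchecked(byte[] input) {
        byte[] buffer = new byte[8];
        try {
            for (int i = 0; i < input.length; i++) {
                buffer[i] = input[i]; // every array index is checked at run time
            }
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            // The faulty write is refused; memory outside the buffer is untouched.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(copyUnchecked(new byte[4]));  // input fits into the buffer
        System.out.println(copyUnchecked(new byte[16])); // input is too long
    }
}
```

The programming error is still present, but it surfaces as a controlled exception rather than as a corrupted stack or heap, which is the property the type-safety argument relies on.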




As it will never be possible to develop software of moderate complexity that is free of errors, one must assume that every complex application contains security-critical errors. The realization that these errors cannot be avoided in current systems led to the proliferation of firewalls, which are meant to shield potentially vulnerable systems from potentially dangerous traffic. Application developers and deployers react to the restrictions of a firewall by tunneling traffic over open ports, for example the HTTP port 80. The security community reacts by building traffic analyzers that analyze the TCP stream and the protocols above TCP and HTTP. As it becomes more and more expensive to cure the symptoms, it becomes more attractive to fix the deeper underlying causes of the security problems.

It is well understood that the unsafe nature of the languages C and C++ is the reason for many of today's security problems. There are several projects that try to develop a safe dialect of C. One of these projects created a safe dialect of C, called Cyclone-C [34]. Although Cyclone-C looks similar to C, it is not possible to recompile an existing non-trivial C program, such as an OS kernel, without changes. Using Java instead of Cyclone-C makes it more difficult to port C programs, but allows the large number of existing Java programs to run without modification. Furthermore, Cyclone-C programs have a performance overhead similar to that of Java programs.

There is still the problem that basing the protection on type-safety ties the system to a certain language and type system. But this seems to be no problem at all. Although the Java bytecode was not designed as the target instruction set for languages other than Java, there is a large number of languages that can be compiled to Java bytecode. Examples are Python [50], Eiffel [48], Tcl [35], Scheme [42], Prolog [20], Smalltalk [56], ADA95 [26], and Cobol [47].

Java allows developing applications using a modern object-oriented style, emphasizing abstraction and reusability. On the other hand, many security problems have been detected in Java systems in the past [18]. The main contribution of this paper is an architecture for a secure Java operating system that avoids these problems and a discussion of its implementation and performance.

We follow Rushby [51] in his reasoning that a secure system should be structured as if it were a distributed system. With such an architecture a security problem in one part of the system does not automatically lead to a collapse of the whole system's security. Microkernels are well suited as the foundation of such a system. In particular, systems that adhere to the multi-server approach, such as SawMill [28], and mediate communication between the servers [33] are able to limit the effect of security violations.

The JX system combines the advantages of a multi-server structure with the advantages of type-safety. It uses type-safety to provide an efficient communication mechanism that completely isolates the servers with respect to data access and resource usage.

The paper is structured as follows. Section 2 gives an overview of Java security and analyzes some weaknesses of the Java security mechanism. Section 3 describes the architecture of JX with the focus on the security architecture. Section 4 describes the performance of the system as a web server. Section 4 discusses how the system meets the requirements of a security architecture. Section 5 describes related work and Section 6 concludes the paper.

2 Java Security

Java security is based on the concept of a sandbox, which relies on the type-safety of the executed code. Untrusted but verified code can run in the sandbox and cannot leave the sandbox to do any harm. Every sandbox must have a kind of exit or hole; otherwise the code running in the sandbox cannot communicate results or interact with the environment in a suitable way. These holes must be clearly defined and thoroughly controlled. The holes of the Java sandbox are the native methods. To control these holes, the Java runtime first controls which classes are allowed to load a library that contains native code. These classes must be trusted to guard access to their native methods. The native methods of these classes should be non-public, and the public non-native methods are expected to invoke the SecurityManager before invoking a native method. The SecurityManager inspects the runtime call stack and checks whether the caller of the trusted method is trusted.

Java version 1 distinguishes between trusted system classes, which were loaded using the JVM internal class loading mechanism, and untrusted classes, which were loaded using a class loader external to the JVM. Implementations of the SecurityManager can check whether the classes on the call stack (the callers of the method) are trusted or untrusted classes. When the caller is a system class, the operation is usually allowed; otherwise the SecurityManager decides, depending on the kind of operation and its parameters, whether the untrusted class is allowed to invoke the operation¹.

 ¹ The real implementation uses the abstraction of classloader-depth, which is the number of stack frames between the current stack frame and the first stack frame connected to a class that was loaded using a classloader.

Java version 2 also relies on stack inspection but can define more flexible security policies by describing the permissions of classes of a certain origin in external files.

To sum up, Java security relies on the following requirements:
 (1) Code is kept in a sandbox by using an intermediate instruction set. Programs are verified to be type-safe.
 (2) The package-specific and/or class-specific access modifiers must be used to restrict access to the holes of this sandbox: the native methods of trusted classes. As long as the demarcation line between Java code and native code is not crossed, the Java code can do no harm.
 (3) The publicly accessible methods of the trusted classes must invoke the SecurityManager to check whether an operation that would leave the sandbox is allowed.

The SecurityManager is similar to a reference monitor, but has a severe shortcoming: it is not automatically invoked. A trusted class must explicitly invoke the SecurityManager to protect itself. The sheer number of native methods makes it difficult to assure this. We counted 1312 native methods in Sun's JRE 1.3.1_02 for Linux, which is 2.9 percent of all methods. Of these native methods, 34 percent are public, and as many as 16 percent are public static methods in a public class. Such a method can be invoked directly from everywhere without the SecurityManager having a chance to intercept the call. Two of these methods are java.lang.System.currentTimeMillis() and java.lang.Thread.sleep(), which provide an interesting opportunity to create a covert timing channel. The fact that covert channels are not exploited can be attributed to the existence of many overt channels. Public, non-final, static variables in public system classes are only one example (we counted 31 of these fields in Sun's JRE).

A further problem is that the stack inspection mechanism is concerned only with access control. It completely ignores the availability aspect of security. This shortcoming was addressed in JRes [17]. By rewriting bytecodes, JRes creates a layer on top of the JVM. In our opinion, this is the wrong layer for resource control, because resources that are only visible inside the JVM can only be accounted inside the JVM. Examples are CPU time and memory used for the garbage collector (GC) or just-in-time compiler, or memory used for stack frames. Furthermore, rewriting bytecodes is a performance overhead in itself and it creates slower programs. Often, Java is perceived as inherently insecure due to the complexity of its class libraries and runtime system [22]. As will be described in Section 3, JX avoids this problem by not trusting the JDK class library.

3 JX Security Architecture

This section describes the aspects of the JX architecture that are relevant to security.

3.1   JX architecture

JX is a single address space system. All code runs in one physical address space; an MMU is not used. Protection is based on the type-safety of the Java bytecode instruction set. A small microkernel contains low-level hardware initialization code and a minimal Java Virtual Machine (JVM).

The JX system is structured into domains (see Figure 1). Each domain represents the illusion of an independent JVM. A domain has a unique ID, its own heap including its own garbage collector, and its own threads. Thus domains are isolated with respect to CPU and memory consumption. They can be terminated independently of each other, and the memory that is reserved for the heap, the stack, and domain control structures can be released immediately when the domain is terminated.

[Figure 1: Structure of the JX system. Shown are Domain A (components, classes, a heap with objects and portals, threads with Java stacks and thread control blocks), Domain B, and DomainZero, the microkernel (C code, assembler, threads, stacks, thread control blocks).]

All domains execute 100% Java code. The microkernel also presents itself as a domain. Because this domain has the ID 0, it is called DomainZero. DomainZero contains all C and assembler code that is used in the system.

JX does not support native methods and there is no trusted Java code that must be loaded into a domain. There is no trust boundary within a domain, which eases administration and allows a domain complete freedom in what code it runs. Because the domain contains no trusted code, it is a sandbox that is completely closed. We create a new hole by introducing capabilities, called portals.

Portals are proxies [55] for a service that runs in another domain. Portals look like ordinary objects and are located on a domain's heap, but the invocation of a method synchronously transfers control to the service that runs in another domain. Parameters are copied from the client to the server domain.

Portals and services cannot be created explicitly by the programmer. They "magically" appear during portal communication. When a domain wants to provide a service, it can define a portal interface, which must be a subinterface of jx.zero.Portal, and a class that implements this interface. When an instance of such a class is passed to another domain, the portal invocation mechanism creates a service in the source domain and a portal in the destination domain. This architecture has a bootstrap problem: A domain can
obtain new portals solely by using existing portals. Therefore each domain possesses an initial portal: a portal to a naming service. Using this portal the domain can obtain other portals to access more services. When a domain is created, the creating domain can pass the naming portal as a parameter of the domain creation call. When no naming portal is specified in the createDomain² call, the default naming portal of the creating domain is passed to the created domain. The naming service of the microkernel is used only by the initial domain (DomainInit), which implements a naming service in Java and passes this naming service to all domains it creates. Because DomainInit looks up all portals from the microkernel on startup, no interaction with the microkernel naming service by any domain is needed after DomainInit has completed its initialization.

 ² createDomain is a method of the DomainManager service, which runs in DomainZero.

The implementation of the portal mechanism had to fulfil the following requirements:
 · It must not be possible to explicitly create a portal object.
 · It must be possible to terminate a domain and release all its resources independent of its current communication relationships.
 · As services are created by the microkernel, they must also be automatically removed when they are no longer needed. The data structures necessary to control a service must be placed on the domain's heap and a garbage collector must be able to move them.

With the following implementation all these requirements are met. A service is represented by a service control block (SCB) that is stored on the server domain's heap. The SCB has a reference to the object that contains the implementation of the portal methods, a thread that is used to execute the methods, and a queue of waiting senders (Figure 2).

[Figure 2: Portal data structures. Client domain: portal (DomainID, ServiceID). Server domain: service table, service control block, sender queue, thread control block, object.]

A portal contains no direct pointer to the service control block (SCB) because the SCB is stored on the heap and can be moved by the garbage collector. Using direct pointers would require updating all portals to a service during a GC cycle of the service domain. This would require a scan of the heaps of all domains, which does not scale well. Therefore a portal contains the index of the service in a domain-local service table, a pointer to the Domain Control Block (DCB), and a domain ID. DCBs are one of the few global data structures of JX. Because the DCB of a domain is reused when a domain terminates and portals can outlive the domain in which the service is located, the DCB pointer could point to a DCB that describes not the terminated service domain but a newly created domain. Therefore the portal also contains a unique domain ID, which is checked against the ID in the DCB before the DCB is used.

Although the portal is located on the heap of the client domain, the Java code has no way to access its contents. The type of the portal reference is the jx.zero.Portal interface, which, as an interface, has no fields. Thus it is not possible to forge a portal to access an arbitrary service.

Services are removed automatically when no portal to the service exists. To detect this condition the SCB contains a reference counter that counts the number of portals to the service. When a portal is passed to another domain, a portal to the same service is created in the other domain and the reference counter is incremented. When a portal is garbage collected, the finalization cycle decrements the reference counter of the service. When a domain terminates, all portals can be considered garbage and a finalization cycle is performed before the heap memory is released.

3.2   JX as a capability system

Portals are capabilities [19]. A domain can only access other domains when it possesses a portal to a service of the other domain. The operations that can be performed with the portal are listed in the portal interface.

Although the capability concept is very flexible and solves many security problems, such as the confused deputy [30], in a very natural way, it has well-known limitations. The major concern is that a capability can be used to obtain other capabilities, which makes it difficult, if not impossible, to enforce confinement [62]. JX as described up to now cannot enforce confinement. Thus an additional mechanism is needed: a reference monitor that is able to check all portal invocations and the transfer of portals between domains.

3.3   The reference monitor

A reference monitor must be tamper-proof, mediate all accesses, and be small enough to be verified.

A reference monitor for JX must at least control incoming and outgoing portal calls. There are two alternatives for the implementation of such a reference monitor:

Proxy. Initially a domain has access only to the naming portal that is passed during domain creation. To obtain other portals the name service is used. The parent domain can
implement this name service so that it returns not the registered portal but a proxy portal which implements the same interface. This proxy can then invoke a central reference monitor before invoking the original portal.

Microkernel. The portal invocation mechanism inside the microkernel invokes a reference monitor on each portal call and passes sender principal, receiver principal, and call parameters to the reference monitor.

These two implementation alternatives have the following advantages and drawbacks. The proxy solution needs no modification of the microkernel and thus avoids the danger of introducing new bugs. As long as no reference monitoring is needed, the proxy solution does not cause any additional cost. The microkernel solution must check in every portal invocation sequence whether a reference monitor is attached to the domain. Because the domain control block, which contains this information, is already in the cache during the portal invocation, this check is nearly free. On the other hand, the proxy solution requires the name service to create a proxy for each registered portal. During a method invocation at such a portal the whole parameter graph must be traversed, and when a portal is found it must be replaced by a proxy portal.

We rejected the proxy approach, because it requires a rather complex implementation and it is difficult to assure that each portal is "encapsulated" in a proxy portal.

We modified the microkernel to invoke the reference monitor when a portal call invokes a service of the monitored domain (inbound) and when a service of another domain is invoked via a portal (outbound). The internal activity of a domain is not controlled. The same reference monitor must control inbound and outbound calls of a domain, but different domains can use different monitors. A monitor is attached to a domain when the domain is created. When a domain creates a new domain, the reference monitor of the creating domain is asked to attach a reference monitor to the created domain. Usually, it will attach itself to the new domain, but it can, depending on the security policy, attach another reference monitor or no reference monitor at all.

It must be guaranteed that, while the access check is performed, the state to be checked can only be modified by the reference monitor. When this state only includes the param-

to an object of another domain. The reference monitor furthermore gets the Domain portal of the caller domain and the callee domain. To accelerate the operation of the reference monitor, the Domain portal is a portal which can be inlined by the translator. On an x86 it takes only two machine instructions to get the domain ID given the Domain portal.

The main problem is to obtain a consistent view of the system during the check. One way is to freeze the whole system by disabling interrupts during the check. This would work only on a uniprocessor, would interfere with scheduling, and would allow a denial-of-service attack. Therefore, our current implementation copies all parameters from the client domain to the server domain up to a certain per-call quota. These objects are not immediately available to the server domain, but are first checked by the security manager. When the security manager approves the call, the normal portal invocation sequence proceeds.

3.4   Making an access decision

Spencer et al. [58] argue that basing an access decision only on the intercepted IPC between servers forces the security server to duplicate part of the object server's state or functionality. We found two examples of this problem. In UNIX-like systems, access to files in a file system is checked when the file is opened. The security manager must analyze the file name to make the access decision, which is difficult without knowing details of the file system implementation and without information that is only accessible to the file system implementation. The problem is even more obvious in a database system that is accessed using SQL statements. To make an access decision the reference monitor must parse the SQL statement. This is inefficient and duplicates functionality of the database server.

There are three solutions for these problems:
 (1) The reference monitor lets the server proceed and only checks the returned portal (the file portal).
 (2) The server explicitly communicates with the security manager when an access decision is needed.
 (3) Design a different interface that simplifies the access decision.

Approach (1) may be too late, especially in cases where the call modified the state of the server.
eters of the call, these parameters could be copied to a loca-             Approach (2) is the most flexible solution. It is used in
tion that is only accessible by the reference monitor. When            Flask with the intention of separating security policy and
the state includes other properties of the involved domains,           enforcement mechanism [58]. The main problem of this
the activity of these domains must be suspended. For these             solution is, that it pollutes the server implementation with
reasons the access check is performed in a separate domain,            calls to the security manager. The Flask security architec-
not in the caller or callee domain.                                    ture was implemented in SELinux [40]. In SELinux, the list
    The list of parameters is accessed using an array of               of permissions for file and directory objects have a nearly
VMObject portals. VMObject is a portal which allows access             one-to-one correspondence to an interface one would use




                                                                   5
for these objects. This makes approach (3) the most promising approach. Our two example problems would be solved by parsing the path in the client domain. In an analogous manner the SQL parser is located in the client domain and a parsed representation is passed to the server domain and intercepted by the security manager. This has the additional advantage of moving code to an untrusted client, eliminating the need to verify this code. Section 3.11 gives further details about the design of the file server interface.

3.5 Controlling portal propagation

In [36] Lampson envisioned a system in which the client can determine all communication channels that are available to the server before talking to the server. We can do this by enumerating all portals that are owned by a domain. As we cannot force a domain to be memoryless [36], we must also control the future communication behavior of a domain to guarantee the confinement of information passed to the domain.

Several alternative implementations can be used to enumerate the portals of a domain:

(1) A simple approach is to scan the complete heap of the domain for portal objects. Besides the expensive scanning operation, the security manager cannot be sure that the domain will not obtain portals in the future.
(2) An outbound interceptor can be installed to observe all outgoing communication of the domain. Thus a domain is allowed to possess a critical portal but the reference monitor can reject its use. The performance disadvantage is that the complete communication must be checked, even if the security policy allows unrestricted communication with a subset of all domains.
(3) The security manager checks all portals transferred to a domain. This can be achieved by installing an inbound interceptor which inspects all data given to a domain and traverses the parameter object graph to find portals. This could be an expensive operation if a parameter object is the root of a large object graph. During copying of the parameters to the destination domain, the microkernel already traverses the whole object graph. Therefore it is easy to find portals during this copying operation. The kernel can then inform the security manager that a portal is being passed to the domain (method createPortal()). The return value of createPortal() decides whether the portal can be created or not. The security manager must also be informed if the garbage collector destroys a portal (destroyPortal()). This way the reference monitor can keep track of what portals a domain actually possesses.

Confinement can now be guaranteed with two mechanisms that can be used separately or in combination: (i) the control of portal communication and (ii) the control of portal propagation.

Figure 3 shows the complete reference monitor interface. Figure 4 shows the information that is available to the reference monitor.

    public interface DomainBorder {
        boolean outBound(InterceptInfo info);
        boolean inBound(InterceptInfo info);
        boolean createPortal(PortalInfo info);
        void destroyPortal(PortalInfo info);
    }

Figure 3: Reference monitor interface

    public interface InterceptInfo extends Portal {
        Domain getSourceDomain();
        Domain getTargetDomain();
        VMMethod getMethod();
        VMObject getServiceObject();
        VMObject[] getParameters();
    }

    public interface PortalInfo extends Portal {
        Domain getTargetDomain();
        int getServiceID();
    }

Figure 4: Information interfaces

3.6 Principals

A security policy uses the concept of a principal [19] to name the subject that is responsible for an operation. The principal concept is not known to the JX microkernel. It is an abstraction that is implemented by the security system outside the microkernel, while the microkernel only operates with domains. Mapping a domain ID to a principal is the responsibility of the security manager. We implemented a security manager which uses a hash table to map the domain ID to the principal object. We first considered an implementation where the microkernel supports the attachment of a principal object to a domain. The biggest problem of such support would be the placement of the principal object. Should the object live in the domain it is attached to or in the security manager domain? Both approaches have severe problems. As the security manager must access the object it should be placed in the security manager's heap. But this creates domain interdependencies, and the independence of heap management and garbage collection, which is an important property of the JX architecture, would be lost. Thus, a numerical principal ID seemed to be the only solution. But having a principal ID has no advantages over having a domain ID, so finally we concluded that the microkernel should not care about principals at all.

The security manager maps the unique domain ID to a principal object. Once the principal is known, the security manager can use several policies for the access decision, for example based on a simple identity or based on roles [24].

To service a portal call the server thread may itself invoke portals into other domains. To avoid several problems (trojan horse, confused deputy [30]) the server may want to downgrade the rights of these invocations to the rights of the original client. The most elegant solution of these problems is a pure capability architecture. In the JX architecture this would mean that the server uses only portals that were obtained from that particular client. This requirement is difficult to assure in a multi-threaded server domain that processes requests from different clients at the same time. Because the server threads use the same heap, a portal may leak from one server thread to another. A better solution is to allow the reference monitor to downgrade the rights of a call. To allow the reference monitor to enforce downgrading rights to the rights of the invoker, each service thread (a thread that processes a portal call) has the domain ID of the original client attached to it. This information is passed during each portal invocation. The reference monitor has access to this information and can base the access decision on the principal of the original domain, instead of the principal of the immediate client.

3.7 Revocation of memory objects

There is a special kind of portal, called a fast portal. Fast portals can only be created by DomainZero. They are executed in the context of the caller. The semantics of a fast portal is known to the system and its methods can be inlined by the translator. An example of a fast portal is the Memory portal. We solved the confinement problems of capabilities by introducing a reference monitor that is invoked when a portal is used. This is not practical with memory portals for performance reasons, although it could be done. Therefore memory portals support revocation. When the reference monitor detects that a portal is passed between two domains (createPortal()) it could revoke the access right to the memory object for the source domain or reject passing of the memory portal.

3.8 Minimizing the JDK class library

The JVM and the class library of the Java Development Kit (JDK) cannot easily be separated from each other.

In JX the JDK is not part of the trusted computing base (TCB). However, there are some classes whose definition is very tightly integrated with the JVM specification [38][29]. Although these classes (except Object) are implemented outside the runtime system, the runtime system must know about their existence or even know part of their internal structure (fields and methods). These structural requirements are checked by the verifier.

The class Object is the base class of all classes and interfaces. It contains methods to use the object as a condition variable, etc. In JX Object is implemented by the runtime system. The class String is used for strings. Because String is used inside the runtime system, it is required that the String class exists in a domain and that its first field is a character array. The runtime system needs to throw several exceptions, such as ArrayIndexOutOfBoundsException, NullPointerException, OutOfMemoryError, StackOverflowError. It is required that these classes and their superclasses RuntimeException, Exception and Throwable exist in a domain. There are no structural requirements for these classes. Arrays are type compatible with the interfaces Cloneable and Serializable. These interfaces also must exist in a domain.

Classes are represented by the portal jx.zero.VMClass. But because Object contains a method getClass(), it is required that java.lang.Class exists and contains a constructor which has one parameter of type VMClass.

3.9 Structure of the Trusted Computing Base

Figure 5 shows the structure of the trusted computing base (TCB). In the TCB we include all system components that the user trusts to perform a certain operation correctly. The central part of the system is the integrity kernel. Compromising the integrity kernel allows an intruder to crash the whole system. Built on the integrity kernel is the security kernel. The security kernel represents the minimal TCB. In a typical system configuration the TCB will include the window manager and the file system. Users will trust the file system to store their data reliably. Compromising the security kernel or the rest of the TCB leads to security breaches, such as disclosure of confidential data or unauthorized modification of data, but not to an immediate system crash. It may lead to a system crash when a compromised security kernel allows access to the integrity kernel. This design is reminiscent of the protection rings of Multics.

JX is a component-based system. A component consists of a number of classes and a file that describes the component. This file also contains the information on what other components the component depends on. The modularization and explicit dependencies allow unnecessary functionality to be removed with a few configuration changes. For example, in a server system the window manager may not be part of the TCB, while in a thin client system the file system may not be needed. A user may even decide not to trust the file system and store the data in their own database.
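The bookkeeping described in Sections 3.5 and 3.6 can be sketched in a few lines of Java. This is only an illustration, not the JX security manager. It covers the hash table from domain ID to principal, the access decision based on the original client, and the tracking of portals via createPortal() and destroyPortal(). The class SketchReferenceMonitor, its methods, the use of plain int IDs instead of the InterceptInfo and PortalInfo portals, and the "principal may call into these domains" policy are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified reference monitor. Domain IDs and service IDs are
// plain ints; the real JX interfaces pass InterceptInfo/PortalInfo portals.
class SketchReferenceMonitor {
    // Domain ID -> principal name (the hash table mentioned in Section 3.6).
    private final Map<Integer, String> principalOfDomain = new HashMap<>();
    // Principal name -> domain IDs this principal may call into (assumed policy).
    private final Map<String, Set<Integer>> allowedTargets = new HashMap<>();
    // Domain ID -> service IDs of the portals the domain currently holds.
    private final Map<Integer, Set<Integer>> portalsHeld = new HashMap<>();

    void registerDomain(int domainId, String principal) {
        principalOfDomain.put(domainId, principal);
    }

    void allow(String principal, int targetDomainId) {
        allowedTargets.computeIfAbsent(principal, p -> new HashSet<>()).add(targetDomainId);
    }

    // Outbound check. The decision uses the principal of the original client,
    // not of the immediate caller, so a server cannot exceed the rights of the
    // client it is working for.
    boolean outBound(int originalClientId, int targetDomainId) {
        String principal = principalOfDomain.get(originalClientId);
        if (principal == null) {
            return false; // unknown domain: deny by default
        }
        return allowedTargets.getOrDefault(principal, Collections.emptySet())
                             .contains(targetDomainId);
    }

    // Called when a portal is about to be passed to a domain.
    // Returning false vetoes the transfer.
    boolean createPortal(int receivingDomainId, int portalTargetDomainId, int serviceId) {
        if (!outBound(receivingDomainId, portalTargetDomainId)) {
            return false;
        }
        portalsHeld.computeIfAbsent(receivingDomainId, d -> new HashSet<>()).add(serviceId);
        return true;
    }

    // Called when the garbage collector destroys a portal.
    void destroyPortal(int holderDomainId, int serviceId) {
        Set<Integer> held = portalsHeld.get(holderDomainId);
        if (held != null) {
            held.remove(serviceId);
        }
    }

    // Snapshot of the portals a domain currently possesses.
    Set<Integer> portalsOf(int domainId) {
        return new HashSet<>(portalsHeld.getOrDefault(domainId, Collections.emptySet()));
    }
}
```

Keeping the table inside the monitor, keyed by domain ID, mirrors the conclusion of Section 3.6 that the microkernel does not need to know about principals at all.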
[Figure 5: Typical TCB structure. Diagram; labeled elements: User Application (two instances), Window Manager, Keyboard and Mouse Driver, File Server, Authentication System, Central Security Manager (Access/Execute Decision, Principal Management, Audit), Domain Starter, Program Loader, Component Repository, Verifier & Translator, BlockIO, Disk Driver, JX Microkernel, Hardware. Regions: TCB, Security Kernel, Integrity Kernel. Arrows: open window, read file, get permissions, ask user, trusted path, start domain, read/write sector. Legend: domain, portal call, interception.]

It is important that there are no dependencies between the inner kernels and the outer ones. The security manager, for example, must not store its configuration in the file system but use its own simple file system.

Tamper-resistant auditing. The system must assure that all security relevant events are persistently stored on disk and cannot be modified. To be certain that the audit trail is tamper-proof we use a separate disk and write this disk in an append-only mode. We do not use a file system at all but write the messages unbuffered to consecutive disk sectors. The audit call only returns when the block has been written to disk. Writing to consecutive disk sectors avoids long-distance head movements and gives a rate of 630 audit messages per second (see footnote 3). Writing one audit message needs 1582 µseconds. Given that a file access which can be satisfied from the buffer cache takes on the order of tens of µseconds, auditing each file access adds considerable overhead. The size of a typical audit message is between 35 and 40 bytes. The disk is used as a ring buffer: when the last sector is reached we wrap to the first one and overwrite old logs. This avoids a problem often encountered when logging to a file system: when the file system is full, logs get lost. Usually, the most recent logs are the most valuable. With the above mentioned message rate of 630 messages/second and a message size of 40 bytes we have a time window of 110 hours using a 10 GByte disk. Under normal operation the time window is much larger, because the message rate is well below its maximum.

Trusted path. According to the Orange Book [21] a trusted path is the path from the user to the TCB. Depending on the user interface the TCB must include the window manager or the console driver.

Recent literature generalizes the notion of a trusted path to any communication mechanism within the system. To trust a communication path it is essential to identify the communication partner and provide a communication channel that cannot be overheard or modified. Portal communication is such a mechanism.

Usually, the reference monitor limits communication according to a certain security policy. This mechanism works automatically and is transparent to domains. But it is even possible for a domain to explicitly consider portal communication as being performed on a trusted path, because the target domain of a portal can be obtained and this identity cannot be spoofed.

3. The following hardware was used for all measurements in this paper: Intel PIII 500 MHz, 512 KB cache, 640 MB RAM, 440BX chipset, 82371AB PIIX4 IDE, Maxtor 91303D6 disk.
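The ring-buffer behaviour of the audit disk and the quoted time window can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the JX audit code. The class AuditRingLog, the one-message-per-slot layout and the 512-byte slot size are invented for the example. The window calculation reads "10 GBytes" as 10^10 bytes and counts 40 bytes per message, which is the reading that reproduces the figure of about 110 hours.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical in-memory model of the audit disk: messages go to consecutive
// slots and the write position wraps to slot 0 after the last slot.
class AuditRingLog {
    static final int SLOT_SIZE = 512; // assumption: one message per 512-byte slot
    private final byte[][] slots;
    private long written = 0;         // total number of messages ever written

    AuditRingLog(int slotCount) {
        slots = new byte[slotCount][SLOT_SIZE];
    }

    // Appends one message and returns the slot index it was written to.
    int audit(String message) {
        byte[] data = message.getBytes(StandardCharsets.US_ASCII);
        if (data.length > SLOT_SIZE) {
            throw new IllegalArgumentException("message too long");
        }
        int slot = (int) (written % slots.length); // wrap around: oldest entry is overwritten
        Arrays.fill(slots[slot], (byte) 0);
        System.arraycopy(data, 0, slots[slot], 0, data.length);
        written++;
        return slot;
    }

    // Reads back the message stored in a slot (up to the first zero byte).
    String read(int slot) {
        byte[] s = slots[slot];
        int len = 0;
        while (len < s.length && s[len] != 0) {
            len++;
        }
        return new String(s, 0, len, StandardCharsets.US_ASCII);
    }

    // Messages per second that follow from the time needed for one message.
    static double messagesPerSecond(double microsPerMessage) {
        return 1_000_000.0 / microsPerMessage;
    }

    // Hours until the oldest record is overwritten at a sustained message rate.
    static double windowHours(long diskBytes, int bytesPerMessage, double messagesPerSecond) {
        return (double) diskBytes / bytesPerMessage / messagesPerSecond / 3600.0;
    }
}
```

With these helpers, 1582 µs per message corresponds to roughly 632 messages per second, and windowHours(10_000_000_000L, 40, 630.0) evaluates to about 110 hours, consistent with the numbers in the text.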
3.10 Maintaining security in a dynamic system                         classes in terms of our capability-based filesystem interface
                                                                      (Figure 6).
    An operating system is a highly dynamic system. New
                                                                           Client Domain
users log in, services are started and terminate, rights of
users are changing, etc. To maintain security in such a sys-
                                                                                                  Client
tem, the initial system state must be secure and each state
transition must transfer the system into a secure state.                                  java.io.RandomAccessFile
                                                                                                  jdk_fs
    There are two issues to be considered here: the system
                                                                                               jx.fs.File
issue and the security policy issue.
                                                                                    fs_user
    It must be guaranteed that trusted software is not tam-
pered and untrusted software runs in a restricted environ-
ment. The system starts with a secure boot process. Pro-                                                         Reference Monitor
                                                                                                                 and Security Policy
vided that no attacker has physical access to the hardware
                                                                                                                          Security Domain
booting from a tamper-proof device, such as a CD-ROM, is
sufficient and we do not need a secure boot process as in                                       jx.fs.File
AEGIS [2] that checks for hardware modifications. We trust                          fs_user
the initial domain to correctly start the security services and
                                                                                             fs_user_impl
to attach them to the created domains. Each domain is
                                                                                              jx.fs.FileInode
started with a strictly defined set of rights (portals) and no                                                                         Legend:
                                                                                    fs
trusted code. The initial portals always include a naming                                                                                Implementation
                                                                                               fs_javafs                                   Component
portal with which other portals can be obtained. To avoid the
expensive nameserver lookup it is possible to pass a set of                                    jx.bio.BlockIO
                                                                                                                                             Interface
additional portals to a newly created domain. The created                           bio
                                                                                                                                       Interface Component
domain is automatically associated with a principal. When                   Fileserver Domain
a domain obtains new portals or communicates using exist-
ing portals the security system is involved.                                             Figure 6: Filesystem layers
    The policy issue is concerned with secure changes of the              The implementation component jdk_fs contains imple-
access rights, additions of principals, etc. How this is done         mentations for the java.io.* classes and uses portal inter-
depends on the used security policy and is outside the scope          faces from the fs_user interface component to access the file
of this paper.                                                        system. These portals access service objects that are imple-
                                                                      mented in the fs_user_impl component.
3.11 Securing servers                                                     Code that uses the java.io classes can run unmodified on
                                                                      top of our implementation of java.io. But the advantages of
     We use the file system server to illustrate how our secu-
                                                                      a capability-based system are lost: files must be referenced
rity architecture works in a real system. As we discussed in
                                                                      by name and problems similar to the Confused Deputy [30]
Section 3.4 we use the server interface to make access deci-
                                                                      are possible. An application can avoid the problems by
sions. For this to work servers must export securable inter-
                                                                      using the (not JDK-compatible) capability-based file sys-
faces. A securable interface must use simple parameters and
                                                                      tem interface.
provide fine-grained simple operations.
In a multi-level security (MLS) system in which the file system is part of
the TCB, the file system must be verified to work correctly, which may be a
difficult task as file systems employ non-trivial algorithms. We used a
configuration which eliminates the need for file system verification. Our
system creates different instances of the file system for the different
security levels, each file server being able to use a disjoint range of
sectors of the disk. Assuring correct MLS operation can now be reduced to
the problem of verifying that the disk driver works correctly; that is, it
really writes the block to the correct position on the disk. The file
system may run outside the TCB with a security label that is equivalent to
that of the data it stores.

Many servers have a built-in notion of permissions, for example the
user/group/other permissions in a UNIX file system. We call them native
permissions. These permissions can be supplemented or replaced by a set of
foreign permissions, for example access control lists. Because foreign
permissions are not supported by the server, there must be a way to store
them. The SELinux system [40] uses a file hierarchy in the normal file
system to store foreign permissions.

There is some scepticism whether a capability-based
system can be compatible with the JDK (see the discussion of
capabilities in [63]). We proved that this is possib