Information about http://www.auerbach-publications.com/dynamic_data/1951_874_46-40-24.pdf

Pages: 10
Language: english
Created: Wed Feb 16 21:04:57 2000
                                                    46-40-24


                                 ENTERPRISE OPERATIONS MANAGEMENT


              MAINTAINING BACKUP
             SYSTEMS AND DATABASE
              CONSISTENCY CHECKS
                                            Mark B. Desman


                                                     INSIDE

                   Network Input; Server Backups; Database Backup; Available Tools and Solutions



        INTRODUCTION
        For years, contingency planning specialists have preached the gospel of
        backing up information resources. All data files are backed up to remov-
        able media (first tapes, then diskettes, now back to tape) and are moved
        to a remote location. All program files are likewise copied to removable
        media and dutifully carried miles from the site of probable failure. Gener-
        ations are kept close track of and several are kept in the off-site repository.
           Has this now become passé? Hardly. One still needs to back up every-
        thing that may be needed later, but the nature of that media has changed
        dramatically, as have the locations where it is now gathered. Pay partic-
        ular attention to that word "locations," as the reasons for its plurality will
        soon be revealed.
           Enterprise operations managers have moved merrily along in the
        business resumption planning mode, albeit changing the title of this
        chosen profession a number of times -- disaster recovery planning,
        contingency planning, etc. Certainly, they have recognized the
        evolution of the industry and its mechanized components, but have
        fallen short in the recognition and addressing of specific needs that
        have grown out of these changes. In short, contingency planning
        specialists are still applying processes that have become sufficiently
        outdated so as to place the organization at risk if changes are not
        made. One can no longer simply back up files to tape and assume that
        the several generations one has, along with journal or work files, can
        bring all up to date when called upon to do so. Now, one must alter
        present methods of backing up information as well, putting in place
        measures to ensure that much more complex backups are valid, and can
        be used to reconstruct lost production data.

        PAYOFF IDEA
        The enterprise operations manager must consider the nature of the new
        technology used to create backups. High-speed, high-volume tools can
        be applied to automated high-speed backup and storage. This article
        provides solutions to situations where something could go wrong in the
        backup process -- where one believes there is a valid backup, when in
        truth, there is not.

04/00                                          Auerbach Publications
                                                 © 2000 CRC Press LLC

    First, mainframe or centralized computing, while not now (or proba-
bly, ever) outdated, has become less and less a player in our connected
world. Today, servers host a major portion of production information.
Critical data may be housed in any workstation anywhere on the net-
work. One is no longer certain of getting all pertinent files backed up
simply by copying mainframe data files to tape for off-site storage. In ad-
dition, the nature of that production information has changed dramatical-
ly, so much so as to create the need for entirely new approaches to
backing up production.
    Most important is the evolution of data storage technology and the
current dependency thereon. No longer is one concerned with backing
up hundred-megabyte files, but now twenty-gigabyte databases are con-
sidered run-of-the-mill. As all the futurists have warned, information is
being created at a record pace and the means to store it are growing at
the same rate. Not only are these databases exponentially larger than the
flat files they replace, but the data structure within them is so complex as
to be easily damaged (and thus rendered unusable or usable only after
hours of painful reconstruction by a database administrator) should
something go wrong with a back-up process. As the sizes of the backed
up components have grown, so have the opportunities for disruptions to
the backup process.
    Databases are, in today's business environment, online tools. They are
addressed not by a central job stream, but by individual users and pro-
cesses that may number in the thousands for any particular database in
any particular location. As the sources of input or alteration vary, so can
the manner in which access is made or the time of day or date that such
accesses can be made. One thing that has not changed with this evolu-
tion is the potential for dire effect on any backup when the information
being backed up is exposed to any outside activity during backup. Mul-
tiply that potential by the now thousands of potential sources for that ac-
cess and place an exponent based entirely on the now huge size of the
component, and a recipe for disaster looms. Consider these new chal-
lenges on an individual basis.

NETWORK INPUT
Input or changes to stored data no longer take the few forms that were
known in mainframe processing days. Transactions that address the data
in many fashions, from reading and copying through actual manipulation,
may be initiated from terminals anywhere on the network and may arrive
at the server housing the information from potentially any point on it or
any point from which contact might be made (dial-up, Internet, etc.). Ob-
viously, a number of different communications protocols, transmission
speeds, and input types may be encountered here. Each variation has the
capability for throwing a technological monkey wrench into the system.
If a translation or conversion is off by a byte or two, the entire database
may be corrupted.
    The overworked or dedicated employee now has a different role in
this issue as well. Now, input might come via telephone line and modem
hookup at any time of the day or night. With the real-time character of
database technology, there is no overnight shutdown for batch updating
(within certain limitations) or, at least, the window is much smaller.
Again, one is relying on voice-grade phone lines as a transfer medium for
sensitive data. Although in his other role as an information security ad-
ministrator, this author might find ample space for comment here, suffice
it to say that certain potentials for compromise are now on the table that
were lacking in the more controlled environment of the past. In addition,
the quality of the line can create additional areas for damage to the input
data and the potential consequential threat to the database in question.
Such are the complications one deals with today.
    Although this article deals with data center recovery, one must under-
stand that server housing installations are a logical extension of the data
centers that once housed only mainframes. For these reasons, one must
deal with the impact of the network here. As will be demonstrated, the
impact of the network is felt in the server or data center. Therefore, one
must view it as a component of this section.

SERVER BACKUP
Thus far, servers have been identified as repositories for databases and
production information. What has not been addressed is the sensitive infor-
mation that might be created, housed, or input into peripheral workstations
or personal computers attached to the network. The loss of this sensitive in-
formation has the potential to be as damaging to the company as would be
any lost from the servers. How then, does one address that data?
    First, one should reflect on the absolute necessity for solid policies
and procedures in any data processing environment. Without them, dis-
tributed processing and networking are the playground of anarchy. With
no means to oversee the activities of users of these remote (yes, even in
the same building) sites, a code of conduct must be in place. None of the
rest of this article would have any impact without such documentation
on expected behavior and standards for usage.
    Assuming the existence of this documentation, it is important to re-
view the significance of protecting and backing up workstation-based
information.
    One of the positive aspects of network attachment to a server or serv-
ers is the ability to use the server as an extension to the resources of the
attached PC itself. A virtual disk drive can be created on the server that
may be addressed by the PC as its own storage medium. What is needed
is to have the use of that virtual drive for storage of sensitive information
clearly delineated in the policies and procedures document noted above.
That being the case, one has now gathered the sensitive information
from as many as thousands of workstations onto one or more servers,
storage devices not only housed in a secure location (that is a safe as-
sumption, is it not?), but small enough in numbers to represent a far bet-
ter option for mass backups. In addition, far more sophisticated backup
technology can be financially justified and installed where the need ex-
ists and valid security can be implemented.
    As can be seen, one has begun creating a backup and recovery envi-
ronment that must enlist the continuing involvement of database admin-
istrators (DBAs), network management, and applications support teams.
It will become apparent that new responsibilities will be identified that
must be handled in a manner heretofore unneeded.
    What, then, does one do to back up the virtual drives? Actually, this is
as simple as backing up server-based applications in the recent past.
What differs here is the need to partition the backups to the extent that
specific user names -- associated with the users' individual virtual drives
-- can be isolated for recovery of information kept on that drive. This is
a process best left to the server or network administrators.
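    As a minimal sketch of that partitioning, assume each user's virtual
drive is a folder directly beneath a single root on the server. The root
path /vdrives, the function name partition_by_user, and the file names are
hypothetical and are not taken from any particular backup product. The
sketch groups a backup's file list by user name so that one user's files
can be isolated for a restore request:

```python
from collections import defaultdict
from pathlib import PurePosixPath


def partition_by_user(paths, root="/vdrives"):
    """Group backed-up file paths by the owner of the virtual drive.

    Assumes a layout of <root>/<user name>/<files...>; both the layout and
    the default root are illustrative assumptions.
    """
    root_path = PurePosixPath(root)
    manifest = defaultdict(list)
    for raw in paths:
        path = PurePosixPath(raw)
        try:
            relative = path.relative_to(root_path)
        except ValueError:
            continue  # not under the virtual-drive root at all
        if len(relative.parts) < 2:
            continue  # sits directly in the root, so it has no owner folder
        manifest[relative.parts[0]].append(str(path))
    return dict(manifest)


if __name__ == "__main__":
    listing = [
        "/vdrives/alice/plan.doc",
        "/vdrives/bob/budget.xls",
        "/vdrives/alice/q1/notes.txt",
    ]
    for user, files in partition_by_user(listing).items():
        print(user, len(files))
```

A per-user manifest of this kind is what would let a Help Desk request be
answered with one user's files rather than a full server reload.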
    A means to implement such recoveries must also be procedurally in
place for this to function. It appears that often the best way to address
this is through the Help Desk or Support Center (or whatever name an
organization uses for the function) and having a predetermined process
for requesting reloading of backed-up information to the server.
    This process, assuming the carefully delineated responsibility not to
keep sensitive information at the workstation, is the central focus for hav-
ing backups for all critical data. Without it, there is certain to come a time
when something for which no backup exists mysteriously disappears
from a workstation or the usual mortality rate of such equipment claims
that same data.

DATABASE BACKUP
In the contemporary data processing world, database applications have
grown to be one of the central repositories for production information.
To be sure, no one has forgotten the IMS and DB2 of the mainframe
world, but the huge update windows that were used during off-hours for
batch processing have either been diminished to minutes or done away
with altogether. Today's database products live in the server world, and
may be products of entirely different vendors. Oracle, Sybase, and Red
Brick, as examples, owe no allegiance to any particular hardware vendor.
Versions of each exist that run under the aegis of all major hardware
manufacturers. In short, an installation and potential mix of vendor prod-
ucts may not be paralleled in any nearby information center, therefore
making it doubly important that backups be accurate and available.
    One of the niceties about server-based processing is the relatively low
expense of new equipment and its availability. Now, where an organiza-
tion once contracted to get the "next-off-the-line" mainframe equipment,
it can replace most related equipment locally and in a very short time.
The large distributors such as Inacom and CompUSA can often meet rea-
sonable equipment requirements from inventory. Should that not be pos-
sible, networked warehouses and other outlets throughout the region
can be leveraged. As a consequence of newer design, a data center now
can be far less air-conditioned than was its mainframe housing predeces-
sor. Sufficient air conditioning is not difficult to arrange for most server-
sized units and their tolerance ranges are far broader than those of their
larger brethren. Often, by breaking a data center into smaller groupings,
and using the communications connections in place, relatively effective
redeployment can take place. To be sure, special effort will be required
to keep these temporary "satellites" operational, but it can be done while
awaiting a new data center or refurbishment of the old.
    As noted earlier, the flat files of yesterday are small potatoes compared
to today's server-based database applications. The files in these cases are
not compartmentalized units of a few hundred megabytes, but range into
the gigabyte volume. In fact, sizes of 50 or more gigabytes are not at all
uncommon. In addition, applications have burgeoned to where larger or-
ganizations might own hundreds of them. Remember, too, that this num-
ber of different applications requires larger server pools and disk storage
to accommodate them. Backups now range over a number of servers and
applications and a means to manage them must be in place. For exam-
ple, Adstar Data Storage Management (ADSM) allows rapid backup and
moves backed-up data to tapes or cartridges according to a predeter-
mined plan. It is capable of huge backup volume (more often dependent
upon the communications protocol than the capability of the system) and
can be called upon to manage the distribution of backup files and to
quickly provide information for restoration.
    What it or any of the other products in the marketplace cannot do is to
guarantee that the "recovered" information is accurate and uncontaminat-
ed. Events that occur during the backup process or simply programming
the process with the incorrect targets can make what appear to be full and
accurate backups totally worthless. The system performs as advertised,
and no message indicating improperly defined data will be generated.
    In some instances, the system will be impacted by events that occur
within the database during backup. A sequencing error may be caused
by any number of outside events, from electrical spikes to real-time up-
dates being made during backup. The slippage of only a byte or two in
the schema can put the entire database out of sequence, a condition that
must be fixed before any valid restoration can take place. Even when re-
stored, some information will be impacted and the database will not mir-
ror the original.
   A third potential issue is that of the definitions of information to be
backed up. As stated, ADSM and similar products perform their routines
in direct response to predefined processes. They cannot and will not dis-
cern between correct and incorrect instructions. In many cases, such er-
rors in initial setup go unnoticed until such time as the backed-up
information is needed. The result does not bode well for the company in
question and offers precious little to the recovery coordinator's career
goals. Remember, in these circumstances, the number of generations of
backups available makes not one whit of difference. Each ensuing back-
up is doggedly referential to the instruction set and is as wrong as is the
current generation.
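    One inexpensive guard against a wrongly defined backup is to compare
the backup definition with a list of the components a recovery would
actually need. The sketch below is an illustration only: the function name
uncovered_components and the paths are hypothetical, and it assumes that a
backup target given as a directory covers everything beneath it. It is not
a feature of ADSM or of any other product named in this article.

```python
from pathlib import PurePosixPath


def uncovered_components(required, backup_targets):
    """Return the required paths that no backup target covers.

    A required path counts as covered when it is itself a target or when
    one of its parent directories is a target (an assumed convention).
    """
    targets = {PurePosixPath(t) for t in backup_targets}
    missing = []
    for item in required:
        path = PurePosixPath(item)
        covered = path in targets or any(p in targets for p in path.parents)
        if not covered:
            missing.append(item)
    return missing


if __name__ == "__main__":
    needed = ["/db/orders/data.dat", "/db/orders/log.dat", "/db/hr/data.dat"]
    # Only the orders directory is in the backup definition here.
    print(uncovered_components(needed, ["/db/orders"]))
```

Run whenever the backup definition or the application changes, a check of
this kind surfaces an omission before a restore is attempted rather than
during one.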
   Now, one is in a position where the volume of information being pre-
served through backup processing is many times what it was a few years
ago. Indeed, it is so large as to have been impossible to back up with
yesterday's tools. The evolution of equipment has kept pace with the
proliferation of data and of the software that creates and utilizes it, but
our procedures have languished in the past.
   How then does one propose to validate the mountains of information
now being backed up and shipped (albeit electronically) off site? Can
one somehow guarantee its validity? Can one foretell that it will be there
and correct when the hour of need appears? Consider now some of the
ways that this can be done.

TOOLS OF TODAY
Along with the growth of database software, tables, files, and metadata
have come the tools needed to verify their validity. Perhaps the most
common and valuable is the DataBase Consistency Check (DBCC), a
means by which the information in a database is validated against the
search arguments that would allow accessing in the production world.
The tool validates the information line by line, looking for
inconsistencies. Once fully verified in this manner, the chances of
recovering an application from this backup are enhanced manyfold.
    Why, then, is this not simply done with every backup? The answer is
as simple as the concept of validity checking. In many organizations to-
day, hundreds of databases exist, often having a total volume measured
in terabytes, rather than the much smaller gigabyte individual database
size. To run a DBCC against every one of them on a daily basis would be
virtually impossible. Often, the
first time a backup file is checked in this manner is when it is retrieved
for purposes of recovering a downed application. (Have you ever found
your spare tire when you needed it most?) This parallel becomes obvious
the first time contaminated or out-of-sequence information is discovered
at restore time. (You did not check the spare because you did not want
to invest the time against the remote possibility of a flat.) Analogously,
one did not DBCC the database because one had far too much to back
up. (You never planned on the construction truck dropping a box of
nails.) And, no one planned on the power surge that killed both the serv-
er and the application.
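    The arithmetic behind "virtually impossible" can be sketched in a few
lines. The throughput figure used here is purely an assumption for
illustration; real reload-and-check rates depend on the hardware and the
database product and must be measured. At an assumed 20 gigabytes per
hour, 2,000 gigabytes of backups would need 100 hours, far more than fits
in a day.

```python
def hours_to_check(total_gb, gb_per_hour):
    """Hours needed to reload and check total_gb at a given throughput."""
    if gb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return total_gb / gb_per_hour


def fits_in_window(total_gb, gb_per_hour, window_hours):
    """True when the whole volume can be checked inside the window."""
    return hours_to_check(total_gb, gb_per_hour) <= window_hours


if __name__ == "__main__":
    # Assumed figures: 2,000 GB of backups, 20 GB/hour, a 24-hour day.
    print(hours_to_check(2000, 20))
    print(fits_in_window(2000, 20, 24))
```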

Common-Sense Solutions
As with every problem, a number of solutions can be brought to bear.
Usually, the controlling factors are time, resources -- both human and
equipment -- and, of course, cost. As with every probable resolution, all
of these factors must be taken into consideration when deciding upon a
course of action.
    First and foremost, one must look at the environment to which the ap-
plication, applications, or entire server is to be restored. Fortunately, the
server world has provided test environments or the opportunity to pur-
chase additional recovery equipment at a fraction of what mainframe
hardware would cost. There is little need for the heavily conditioned fa-
cility that was required by the mainframes, so an alternate site can often
be found on current premises. Where that is not feasible, usable space is
far easier to find than was a similar facility to house mainframes.
    Assume for a moment that one has a test box available and that it is
housed near enough to be practical and far enough away to be safe from
a failed site. (Note: "site" here may well refer to the same room in today's
world, so as not to be affected by whatever took the original down [fire,
electrical outage, gremlins, etc.].) Assume also that practical planning has
made this server a node on the affected network or is interfaced in some
manner (gateway, etc.) to the required net. The problems are over. One
can take backups and load them down onto the new server and all will
be well in the kingdom.
    It is at this point that the validity of the backups comes into play. One
does not need to play out the earlier scenario about corrupt or out-of-se-
quence databases; everyone is well aware of the potential. What one
needs here, then, is a degree of confidence in the quality of the backup.
There are ways to raise that probability, mostly around verification of
backups offline, and there are a number of degrees to which that can be done.
Consider next a few of them.
    First, there is the idea of running DBCCs against every backed-up da-
tabase file. Time consuming? Yes, but is it cost-effective?
    Commonly, every company that has database processing has one or
more applications that are the lynchpins of daily business; and usually, a
relatively perfunctory examination can define what is an acceptable loss
without jeopardizing the business. If the application is a robust one,
there are often log files that can be maintained for a number of days and
read back in as input. This may put the window of acceptable loss at, for
example, three days. Simply, one can recover a database from three-day-
old information, apply all of the transactions that have taken place since
that backup, and be back to current in a matter of hours (with luck, few;
in other cases, many). Often those same log files will give the time to
complete the reload without stopping business entirely. Posting of trans-
actions will be delayed, but not lost.
   If one knows that there is a three-day window on this extremely im-
portant database, then using DBCC on every third backup can ensure re-
covery at any time. Likewise with less important applications, the
window might be greater or the loss accrued to failure may well be less.
In these cases, the same vehicle -- DBCC -- can be used less frequently
with effective results.
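    The every-third-backup rule can be sketched as follows, assuming one
backup is taken per day and that log files cover the stated window. The
function names are hypothetical, and the sketch ignores the time a DBCC
itself takes to complete.

```python
from datetime import date


def backups_to_verify(backup_dates: list[date], window_days: int) -> list[date]:
    """Pick every window_days-th backup for a DBCC, oldest first.

    Assumes exactly one backup per day; with gaps in the sequence the
    spacing guarantee no longer holds.
    """
    if window_days < 1:
        raise ValueError("window_days must be at least 1")
    return sorted(backup_dates)[::window_days]


def newest_verified_on_or_before(verified: list[date], failure_day: date):
    """Return the latest verified backup usable on failure_day, or None."""
    candidates = [d for d in verified if d <= failure_day]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    daily = [date(2000, 4, d) for d in range(1, 10)]
    checked = backups_to_verify(daily, 3)
    for day in daily:
        base = newest_verified_on_or_before(checked, day)
        print(day, "restore from", base, "then replay", (day - base).days, "days of logs")
```

With daily backups and a three-day window, the verified backup nearest any
failure date is always less than three days old, so the retained log files
are enough to bring the database back to current.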


RECOMMENDATION
This article has discussed maintaining a backup system and performing
DBCCs on every database backup created, at least on an occasional ba-
sis. What if one were to build a schema around the databases being
backed up as a function of their importance and sensitivity with regard
to conducting business? Is this starting to sound like the applications pri-
ority lists one used to wrench from senior management at the beginning
of erstwhile planning exercises?
   In this instance, however, rather than applying the prioritization to the
restoration of the application, one is applying it to the verification of the
validity of the backups that will be used to restore it. On a regular basis,
backups will be verified for accuracy.


SOLUTION
To properly verify the validity of a backed-up database, the backup must
first be downloaded onto the recovery (or test) machine and the DBCC
process run. Results should be recorded for use in reporting to manage-
ment. Obviously, the first backups to be DBCC'ed will be the aforemen-
tioned critical applications. In turn, most likely over several weeks,
sampling of other databases will be done to check for validity. Over the
period of a year, each database being backed up should have had at least
one generation loaded, DBCC'ed, and the results recorded. In the case of
critical applications, such tests may be performed multiple times
throughout the year.
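    A rotation of this kind can be sketched as a simple weekly schedule.
Everything specific here is an assumption chosen for illustration: the
function name build_schedule, the 52-week year, four checks a year for
critical applications, and the database names. Critical databases are
placed first and repeated at even intervals; the rest are sampled once
each in the weeks that follow.

```python
def build_schedule(critical, others, weeks=52, critical_checks=4):
    """Map week number (1-based) to the databases to reload and DBCC.

    Critical databases are checked critical_checks times a year, evenly
    spaced; every other database is checked once.
    """
    schedule = {week: [] for week in range(1, weeks + 1)}
    interval = max(1, weeks // critical_checks)
    for offset, name in enumerate(critical):
        for repeat in range(critical_checks):
            week = (offset + repeat * interval) % weeks + 1
            schedule[week].append(name)
    for index, name in enumerate(others):
        # Non-critical databases start after the first critical pass.
        week = (len(critical) + index) % weeks + 1
        schedule[week].append(name)
    return schedule


if __name__ == "__main__":
    plan = build_schedule(["orders", "billing"], ["hr", "inventory", "archive"])
    for week in range(1, 6):
        print(week, plan[week])
```

If there are more databases than weeks, the sketch simply wraps around and
doubles up weeks, which is the point at which additional staff time would
have to be justified.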
    The application of this practice has demonstrated a number of things to
those organizations that have taken the steps to verify backups in this
manner. For one, it has been discovered that the right things were not being
backed up in the first place. Recovery would be impossible, as the most
important components were left out of the backup process. For another,
the tendency to take backups in real time has led to sequencing problems that
render the backups nearly useless. This had not become apparent until a
regular routine of reloading and DBCC'ing was undertaken.
   Putting this type of process into action will normally call for the addi-
tion of at least a part-time analyst. Assuming that business resumption
planning is either a part-time or single-person responsibility at this time,
it might well make it either a full-time position or justify the addition of
another full-time employee (FTE). Possibly, the function could be moved
to the DBA area with a lower level analyst performing the weekly reload
and DBCC on a predetermined segment of the overall database backups.
Again, the target should be not less than to reload 100 percent of the ap-
plications backups at least annually, with more sensitive databases being
verified more often.
   The most important aspect of this function is to record what is found.
By the time one is a few months into the verification process, patterns
will be beginning to emerge as to areas where more problems are discov-
ered. The application of this data will allow the DBA group to address
the issues to prevent recurrence and to stabilize the backup process.
Again, the logging of errors is a tool to demonstrate to management the
care being taken in backup and recovery.
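    Recording results need not be elaborate. The sketch below assumes a
simple record per reload-and-DBCC run; the class CheckResult, its fields,
and the cause labels are hypothetical. It tallies failures by database and
by cause so that the patterns described above become visible.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one reload-and-DBCC run (fields are illustrative)."""
    database: str
    backup_date: str
    passed: bool
    cause: str = ""  # free-text reason recorded only for failures


def failure_patterns(results):
    """Return (failures per database, failures per cause) as Counters."""
    failures = [r for r in results if not r.passed]
    by_database = Counter(r.database for r in failures)
    by_cause = Counter(r.cause for r in failures)
    return by_database, by_cause


if __name__ == "__main__":
    log = [
        CheckResult("orders", "2000-04-01", True),
        CheckResult("hr", "2000-04-01", False, "wrong files"),
        CheckResult("hr", "2000-04-08", False, "sequencing"),
    ]
    by_database, by_cause = failure_patterns(log)
    print(by_database.most_common())
    print(by_cause.most_common())
```

The same tallies can feed the report to management and point the DBA group
at the databases or causes that recur.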

SUMMARY
Today's server-hosted database applications are very different from their
predecessor mainframe-based flat-file systems. The historically effective
means for creating backups, then, are not as likely to be effective. This is
for a number of reasons, not the least of which is the relatively large file
size of databases, often in the many-gigabyte range. Added to the sheer
magnitude of databases is their dependency on absolute sequencing. A
further factor is the nature of the new technology used to create
backups. High-speed, high-volume tools such as
ADSM can be applied to automated and high-speed backup and storage.
   With the above known, a number of places have been created where
things could go wrong without leaving a clear trail -- one could believe
a backup valid when in truth, it is not.
   A number of things could cause these problems: an error in transmis-
sion during backup, dropped bits of information, an update function dur-
ing backup, or even the backing up of improper files.
   The answer to these issues is to methodically spot-check what is being
backed up. A target of going through all systems backed up at least annu-
ally is the minimum acceptable. Some percentage of backups, starting with
the most critical, should be reloaded and a DBCC run against them. Errors
discovered must be logged, communicated to the DBA, and remedial ac-
tion taken -- whether it is a new backup or a fix to the backup routine.
   Companies today must recognize their dependence upon these mega-
backups and authorize the addition of staff and resources to perform the
above functions if they are to stay competitive in today's business envi-
ronment.



Mark B. Desman has been a practitioner in information security and contingency planning for the past 20 years.
His background includes being one of the first information security managers for American Savings of California,
as well as CalFed Bank (now NationsBank) and Gibraltar Savings in southern California. Most recently, he was
manager of information security, contingency planning, and the technical Help Desk for a multi-state bank-holding
company in New England. Currently, Mr. Desman is manager of information security at Micron Technology, Inc.



