DEIMOS Information Systems Component
Hardware Resources Required
$Id: hardware.html,v 1.5 1996/03/12 19:34:37 de Exp de $
In order to manage, and offer open access to, the volume and
variety of data described in these specifications, something
more powerful than ASCII files and shell scripts (or C programs)
is indicated. The most stable and well-understood currently
available technology is a relational database management
system (RDBMS) using some superset of ANSI SQL for data retrieval.
See the recommended software specification
for a discussion of various RDBMS products and their application to the
information management problem. To support a reliable relational
database server, dedicated or near-dedicated hardware is recommended.
Consult the Glossary for
explication of any specialized
terms and abbreviations used in this document.
Type of Host
Assuming for the moment that we decide to use a Sybase RDBMS (since
there is some precedent for this choice and we know enough about
Sybase to make some useful predictions about performance and platform
requirements), we need to make some estimate of the hardware resources
needed to support the engine. Given that the RDBMS is an essential
logging and archival tool, it wants to run on a host which is secure
and relatively immune to radical load average shifts, reboots, crashes,
etc. A non-login machine not directly connected to any experimental
or custom hardware would be a good choice.
The host which supports the database server should be located in the
Keck-II computer room with the rest of the machines which directly support
the observing process. The database server will be used to store
(log) operational data during the night, as well as provide information
for the observer or for the rest of the observing software. It
should be considered an integral part of the observing software suite,
and co-located with its peer machines which perform instrument,
telescope, and dome control.
We should obviously choose a "standard" platform, either a Sparc running
Solaris or a DEC Alpha running OSF.
Configuration
64MB of main memory is a good median configuration for a Sybase server.
Data space is reconfigurable after server installation and startup,
but we would want to start with a reasonable disk configuration (enough
space for at least a year's worth of operation). If we wish to do
volume mirroring, then we need to duplicate, on another spindle, each
partition we choose to mirror. For example, one of my Sybase servers
has a 500MB data partition on a 2GB drive; this partition is mirrored
to a 500MB drive on the same machine. The rest of the data on the 2GB
drive are either non-Sybase or non-mirrored. If we choose not to mirror
any partitions, then only one spindle is really required (though performance
improvements can be realized by using multiple smaller spindles).
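As a concrete (and purely illustrative) sketch, such a mirrored data
device might be set up from the shell roughly as follows; the server
name, raw-device paths, and sizes are placeholders, not a proposal:

#!/bin/sh
# Sketch: create a 500MB Sybase data device on one spindle and
# mirror it to a second spindle.  Server name and device paths are
# hypothetical; isql prompts for the sa password.
isql -Usa -SDEIMOS_DB <<EOF
disk init
    name = "deimos_data",    /* logical device name */
    physname = "/dev/rsd1g", /* partition on the 2GB drive */
    vdevno = 2,
    size = 256000            /* in 2K pages, i.e. 500MB */
go
disk mirror
    name = "deimos_data",
    mirror = "/dev/rsd2c"    /* partition on the second spindle */
go
EOF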
One Server or Two?
We should decide whether there is just one Sybase server which handles
logging, interactions with the observing software, and also random
queries to the public portion of the archive, or whether there is
a "production" server handling public queries, using a downloaded copy
of data from the "critical" server at the telescope which is protected
from outside access altogether. I strongly recommend a 2-server model,
mostly because it is easier to ensure security, good performance, and
uptime if no random outside connections are permitted to the critical machine.
A 2-server model also ensures a working backup of the data, and the
2nd server should preferably not be at the same site.
For example, a Sybase server at Lick might offer the public portion of the
data archive via WWW pages, getting fresh data daily from the private
server on Mauna Kea.
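The nightly refresh could be as simple as a dump-and-load cycle run
from cron. The sketch below assumes hypothetical server names
(DEIMOS_MK, DEIMOS_LICK) and a database called "archive":

#!/bin/sh
# Sketch of a nightly refresh: dump the database on the private
# Mauna Kea server, ship the dump to Lick, and load it into the
# public server.  All names and paths are placeholders.
isql -Usa -SDEIMOS_MK <<EOF
dump database archive to "/dumps/archive.dmp"
go
EOF
rcp /dumps/archive.dmp lick-host:/dumps/archive.dmp
# bring the freshly loaded copy back on line (System 10 and later)
isql -Usa -SDEIMOS_LICK <<EOF
load database archive from "/dumps/archive.dmp"
go
online database archive
go
EOF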
Backups
Local backups of the archived data also must be made, in a format which
permits fast recovery of the critical server should it suffer catastrophic
media failure. Tools exist which make complete ASCII, human-readable
backups of Sybase databases from which recovery of entirely destroyed
servers can be done in a matter of a few hours. The backup files go
to normal filesystem space, whence they can be backed up again to
tape media. Both the critical server and the production server should
be backed up at their respective sites.
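Sybase's bcp utility in character mode (-c) is one such tool; a
minimal sketch, with hypothetical database and table names:

#!/bin/sh
# Sketch: copy each table of the database out to a flat, human-
# readable ASCII file.  These land in normal filesystem space and
# are then swept to tape by the ordinary backup cycle.
# (bcp prompts for the password; -P could be fed from a protected file.)
BACKUPDIR=/data/dbbackup
for TABLE in slitmask obslog imghdr; do
    bcp archive..$TABLE out $BACKUPDIR/$TABLE.asc -c -Usa -SDEIMOS_DB
done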
Goals and Requirements
We should consider our hardware requirements in the light of our
three major goals: to preserve
- slitmask data,
- operational/calibration/parametric data, and
- acquired data.
The slitmask library and operational (logged) data represent
only a modest problem of volume and accumulation. The image
header data likewise do not represent a real challenge in terms
of storage space. It is the images themselves, not stored in the
database (as discussed in the image archive document), which pose
the real problem of storage space and access time. The data actually
stored in the RDBMS represent only a few hundred megabytes per year.
Some maintenance and re-indexing may be needed to ensure rapid
access to the data as the accumulated record grows, but these
tasks can be at least partially automated.
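For instance, a periodic cron job along these lines (database and
table names hypothetical) would keep the optimizer's statistics
current and check consistency as the record grows:

#!/bin/sh
# Sketch of automated maintenance: refresh index statistics so the
# query optimizer keeps choosing good plans, then run a consistency
# check.  Names are placeholders.
isql -Usa -SDEIMOS_DB <<EOF
use archive
go
update statistics obslog
go
update statistics imghdr
go
dbcc checkdb(archive)
go
EOF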
Given that these tables are unlikely to exceed a mega-record in
a couple of years, I don't see a call for sophisticated
multiprocessor architecture or other expensive high-performance CPU
power. SCSI-II disk speed would help to improve response time,
but otherwise no state-of-the-art or specialized hardware is
required for this fairly basic application.
In summary, no particularly "heavy" hardware must be acquired
to support the RDBMS portion of the software specification. An older
sparc2 with 64MB of memory and a 2GB dedicated data disk would probably
meet our needs for at least the first 2 years of operation. This
machine could also serve a second purpose, if that second purpose did
not compromise security and/or uptime. We would be wise, of course,
to overspecify slightly (4GB of disk and a sparc5, for example).
Location
The database server hardware should be situated in the computer room with the
rest of the telescope control computers, not at the end of a long and
vulnerable link to some other site. Failure of network connectivity
to the database engine could have a perceptible impact on the
observer, forcing manual processes which could slow down observing.
Failure of network connectivity could result in lost log information
as well, damaging the historical record which we are trying to preserve.
The database server wants to be on the local network with the rest of
the instrument and telescope control equipment.
Very Approximate Costs
Supposing that we purchased hardware and a Sybase license specifically
for this project, the total acquisition cost for the database engine
should be on the order of $15K. This would cover one license, one
low-end sparc station, and enough disk to get started. Note that
no storage of image data is to take place in the database; other disk
space and storage mechanisms must be developed for the temporary and
permanent storage of images. This is why a modest amount of disk
space will be adequate for the database server.
If we were to "make do and mend" by
using the existing Sybase license on Mauna Kea, building a server
out of miscellaneous used parts, etc., we could probably reduce this
cost to no more than the price of a disk drive; however, we'd have
to examine what other functions were required of the existing Sybase
server and whether those requirements conflicted with the restrictions
recommended above.
An economical suggestion:
It is possible that functions could be combined so that the database
server host was also the designated host for some other low-level
integral function. This function would have to be non-login,
and involve no unpredictable and/or sudden load changes or interruptions
of uptime. It should also not consume so much memory as to compete
heavily with the database engine and drive the host into swapping.
If such functions can be identified, sharing the hardware would be
practical and even desirable.
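Since the Sybase engine grabs a fixed block of shared memory at
startup, the competition can be bounded explicitly. For example
(the value is illustrative, and the parameter name follows System 11
usage):

#!/bin/sh
# Sketch: cap the server's memory allocation at 32MB of a 64MB
# host, leaving headroom for the co-resident function.  The value
# is in 2K pages.
isql -Usa -SDEIMOS_DB <<EOF
sp_configure "total memory", 16384
go
EOF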
A more luxurious suggestion:
It's been suggested (D. Koo) that two hosts should be constructed,
one which is primarily a database server platform and can in case
of emergency serve some other basic machine control function; the
other is primarily a machine control or other low-level service
host, but can function as a database server. This would provide
a rapid recovery path should either host suffer hardware failure;
however, it involves doubled hardware costs and some maintenance
overhead.
Acquisition Schedule
Since Lick already owns two Sybase engines, DEIMOS applications
can be designed and tested without immediate purchase of any
additional software or hardware. The acquisition of Sybase
license(s) and workstation(s) can be deferred until fairly close
to the time of DEIMOS commissioning.
de@ucolick.org
De Clarke
UCO/Lick Observatory
University of California