DEIMOS Information Systems Component
Hardware Resources Required
$Id: hardware.html,v 1.5 1996/03/12 19:34:37 de Exp de $
In order to manage, and offer open access to, the volume and
variety of data described in these specifications, something
more powerful than ASCII files and shell scripts (or C programs)
is indicated. The most stable and well-understood currently
available technology is a relational database management
system (RDBMS) using some superset of ANSI SQL for data retrieval.
See the recommended software specification
for a discussion of various RDBMS products and their application to the
information management problem. To support a reliable relational
database server, dedicated or near-dedicated hardware is recommended.
Consult the Glossary for
explication of any specialized
terms and abbreviations used in this document.
Type of Host
Assuming for the moment that we decide to use a Sybase RDBMS (since
there is some precedent for this choice and we know enough about
Sybase to make some useful predictions about performance and platform
requirements), we need to make some estimate of the hardware resources
needed to support the engine. Given that the RDBMS is an essential
logging and archival tool, it wants to run on a host which is secure
and relatively immune to radical load average shifts, reboots, crashes,
etc. A non-login machine not directly connected to any experimental
or custom hardware would be a good choice.
The host which supports the database server should be located in the
Keck-II computer room with the rest of the machines which directly support
the observing process. The database server will be used to store
(log) operational data during the night, as well as provide information
for the observer or for the rest of the observing software. It
should be considered an integral part of the observing software suite,
and co-located with its peer machines which perform instrument,
telescope, and dome control.
We should obviously choose a "standard" platform, either a Sparc running
Solaris or a DEC Alpha running OSF.
Configuration
64MB of main memory is a good median configuration for a Sybase server.
Data space is reconfigurable after server installation and startup,
but we would want to start with a reasonable disk configuration (enough
space for at least a year's worth of operation). If we wish to do
volume mirroring, then we need to duplicate, on another spindle, each
partition we choose to mirror. For example, one of my Sybase servers
has a 500MB data partition on a 2GB drive; this partition is mirrored
to a 500MB drive on the same machine. The rest of the data on the 2GB
drive are either non-Sybase or non-mirrored. If we choose not to mirror
any partitions, then only one spindle is really required (though performance
improvements can be realized by using multiple smaller spindles).
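As a concrete (and purely illustrative) sketch, such a mirrored data
device might be set up from the shell roughly as follows; the server
name, raw-device paths, and sizes are placeholders, not a proposal:

#!/bin/sh
# Sketch: create a 500MB Sybase data device on one spindle and
# mirror it to a second spindle.  Server name and device paths are
# hypothetical; isql prompts for the sa password.
isql -Usa -SDEIMOS_DB <<EOF
disk init
    name = "deimos_data",    /* logical device name */
    physname = "/dev/rsd1g", /* partition on the 2GB drive */
    vdevno = 2,
    size = 256000            /* in 2K pages, i.e. 500MB */
go
disk mirror
    name = "deimos_data",
    mirror = "/dev/rsd2c"    /* partition on the second spindle */
go
EOF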
One Server or Two?
We should decide whether there is just one Sybase server which handles
logging, interactions with the observing software, and also random
queries to the public portion of the archive, or whether there is
a "production" server handling public queries, using a downloaded copy
of data from the "critical" server at the telescope which is protected
from outside access altogether. I strongly recommend a 2-server model,
mostly because it is easier to ensure security, good performance, and
uptime if no random outside connections are permitted to the critical machine.
A 2-server model also ensures a working backup of the data, and the
2nd server should preferably not be at the same site.
For example, a Sybase server at Lick might offer the public portion of the
data archive via WWW pages, getting fresh data daily from the private
server on Mauna Kea.
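The nightly refresh could be as simple as a dump-and-load cycle run
from cron. The sketch below assumes hypothetical server names
(DEIMOS_MK, DEIMOS_LICK) and a database called "archive":

#!/bin/sh
# Sketch of a nightly refresh: dump the database on the private
# Mauna Kea server, ship the dump to Lick, and load it into the
# public server.  All names and paths are placeholders.
isql -Usa -SDEIMOS_MK <<EOF
dump database archive to "/dumps/archive.dmp"
go
EOF
rcp /dumps/archive.dmp lick-host:/dumps/archive.dmp
# bring the freshly loaded copy back on line (System 10 and later)
isql -Usa -SDEIMOS_LICK <<EOF
load database archive from "/dumps/archive.dmp"
go
online database archive
go
EOF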
Backups
Local backups of the archived data also must be made, in a format which
permits fast recovery of the critical server should it suffer catastrophic
media failure. Tools exist which make complete ASCII, human-readable
backups of Sybase databases from which recovery of entirely destroyed
servers can be done in a matter of a few hours. The backup files go
to normal filesystem space, whence they can be backed up again to
tape media. Both the critical server and the production server should
be backed up at their respective sites.
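Sybase's bcp utility in character mode (-c) is one such tool; a
minimal sketch, with hypothetical database and table names:

#!/bin/sh
# Sketch: copy each table of the database out to a flat, human-
# readable ASCII file.  These land in normal filesystem space and
# are then swept to tape by the ordinary backup cycle.
# (bcp prompts for the password; -P could be fed from a protected file.)
BACKUPDIR=/data/dbbackup
for TABLE in slitmask obslog imghdr; do
    bcp archive..$TABLE out $BACKUPDIR/$TABLE.asc -c -Usa -SDEIMOS_DB
done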
Goals and Requirements
We should consider our hardware requirements in the light of our
three major goals: to preserve
- slitmask data,
- operational/calibration/parametric data, and
- acquired data.
The slitmask library and operational (logged) data represent
only a modest problem of volume and accumulation. The image
header data likewise do not represent a real challenge in terms
of storage space. It is the images themselves, not stored in the
database (as discussed in the image archive document), which pose
the real problem of storage space and access time. The data actually
stored in the RDBMS represent only a few hundred megabytes per year.
Some maintenance and re-indexing may be needed to ensure rapid
access to the data as the accumulated record grows, but these
tasks can be at least partially automated.
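For instance, a periodic cron job along these lines (database and
table names hypothetical) would keep the optimizer's statistics
current and check consistency as the record grows:

#!/bin/sh
# Sketch of automated maintenance: refresh index statistics so the
# query optimizer keeps choosing good plans, then run a consistency
# check.  Names are placeholders.
isql -Usa -SDEIMOS_DB <<EOF
use archive
go
update statistics obslog
go
update statistics imghdr
go
dbcc checkdb(archive)
go
EOF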
Given that these tables are unlikely to exceed a mega-record in
a couple of years, I don't see a call for sophisticated
multiprocessor architecture or other expensive high-performance CPU
power. SCSI-II disk speed would help to improve response time,
but otherwise no state-of-the-art or specialized hardware is
required for this fairly basic application.
In summary, no particularly "heavy" hardware must be acquired
to support the RDBMS portion of the software specification. An older
sparc2 with 64MB of memory and a 2GB dedicated data disk would probably
meet our needs for at least the first 2 years of operation. This
machine could also serve a second purpose, if that second purpose did
not compromise security and/or uptime. We would be wise, of course,
to overspecify slightly (4GB of disk and a sparc5, for example).
Location
The database server hardware should be situated in the computer room with the
rest of the telescope control computers, not at the end of a long and
vulnerable link to some other site. Failure of network connectivity
to the database engine could have a perceptible impact on the
observer, forcing manual processes which could slow down observing.
Failure of network connectivity could result in lost log information
as well, damaging the historical record which we are trying to preserve.
The database server wants to be on the local network with the rest of
the instrument and telescope control equipment.
Very Approximate Costs
Supposing that we purchased hardware and a Sybase license specifically
for this project, the total acquisition cost for the database engine
should be on the order of $15K. This would cover one license, one
low-end sparc station, and enough disk to get started. Note that
no storage of image data is to take place in the database; other disk
space and storage mechanisms must be developed for the temporary and
permanent storage of images. This is why a modest amount of disk
space will be adequate for the database server.
If we were to "make do and mend" by
using the existing Sybase license on Mauna Kea, building a server
out of miscellaneous used parts, etc., we could probably reduce this
cost to no more than the price of a disk drive; however, we'd have
to examine what other functions were required of the existing Sybase
server and whether those requirements conflicted with the restrictions
recommended above.
An economical suggestion:
It is possible that functions could be combined so that the database
server host was also the designated host for some other low-level
integral function. This function would have to be non-login,
and involve no unpredictable and/or sudden load changes or interruptions
of uptime. It should also not consume so much memory as to compete
heavily with the database engine and drive the host into swapping.
If such functions can be identified, sharing the hardware would be
practical and even desirable.
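Since the Sybase engine grabs a fixed block of shared memory at
startup, the competition can be bounded explicitly. For example
(the value is illustrative, and the parameter name follows System 11
usage):

#!/bin/sh
# Sketch: cap the server's memory allocation at 32MB of a 64MB
# host, leaving headroom for the co-resident function.  The value
# is in 2K pages.
isql -Usa -SDEIMOS_DB <<EOF
sp_configure "total memory", 16384
go
EOF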
A more luxurious suggestion:
It's been suggested (D. Koo) that two hosts should be constructed,
one which is primarily a database server platform and can in case
of emergency serve some other basic machine control function; the
other is primarily a machine control or other low-level service
host, but can function as a database server. This would provide
a rapid recovery path should either host suffer hardware failure;
however, it involves doubled hardware costs and some maintenance
overhead.
Acquisition Schedule
Since Lick already owns two Sybase engines, DEIMOS applications
can be designed and tested without immediate purchase of any
additional software or hardware. The acquisition of Sybase
license(s) and workstation(s) can be deferred until fairly close
to the time of DEIMOS commissioning.
de@ucolick.org
De Clarke
UCO/Lick Observatory
University of California