

Building Databases


Introduction

The catDB utility program allows you to build small databases (containing fewer than ten thousand compounds) for the use of individual chemists and their project teams, and large databases (containing hundreds of thousands of compounds) suitable for use by an entire corporation.

Background Information

The background information in the following sections introduces some key concepts that you need to understand before proceeding with database building. Most of these concepts are not unique to Catalyst, but are true for all database systems. However, even if you are familiar with other database systems, it is important to understand how Catalyst implements the following concepts:

How Do Databases Differ from Spreadsheets?

Databases contain collections of compounds and data such as IC50s and molecular weights for those compounds. Spreadsheets act as windows to the contents of databases. Spreadsheets are the products of browse and search commands on a database. A browse will return to the spreadsheet all of the database's contents, while a search query will typically return only a subset of the database's contents.

The data in a spreadsheet is a local copy of the database's data. If you make changes in the spreadsheet data, these changes are only made to the local spreadsheet copy. In order to permanently change data in the database itself, you must apply the Commit 1D Changes To Database command; this will update your database by copying in the altered local spreadsheet data.

Spreadsheets are designed for use by one person only. You can export your spreadsheet so your neighbor can import it and use it, but once your neighbor modifies a field, you must re-import that version to see the changes. Databases are shared so that you see any changes your neighbor makes to the data in the database. Thus, when you synthesize a new compound, enter it into a database, and then submit it to biological testing, the biologists in another building can enter their data into the database using Catalyst, and you will see their data immediately updated in the database.

Client-Server Operation

Catalyst is designed as a Client-Server application which permits distribution of computation for database searches over multiple computers. A typical search is a multi-step process consisting of the following steps:

  1. 2D indexing
  2. 3D indexing
  3. Relational database searching
  4. Final verification of the compound against the query (isomorphism)

Steps 1, 2, and 4 are performed on the catDisk server. Step 3 is performed in the relational database engine (Oracle). Step 4 can be executed in parallel. To control the execution of Step 4, set the Search.server.hosts variable in your ~/.Catalyst file.

Example 1:

Search.server.hosts = "DefaultServer"
Placing the above line in your ~/.Catalyst file sets the search to use the default server on the computer that holds the .0bdb file.

Example 2:

Search.server.hosts = "InClient"

Specifying "InClient" means that Step 4 (isomorphism) is performed on the client on which the Catalyst or catSearch application is running. This configuration emulates the behavior of Catalyst version 2.3.

Example 3:

Search.server.hosts = "HostA:HostA:HostB"

The variable can be set to a string of host names separated by colons. The above line sets up a three-process parallel search on hosts named "HostA" and "HostB". Here, HostA must be a multi-processor machine and both hosts must run the catDisk server. For help on how to configure your client-server setup most effectively, consult MSI Scientific Support.
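As an illustration of how the colon-separated value in Example 3 maps to processes, the following minimal Python sketch counts how many search processes a host string requests on each host. The function name processes_per_host is hypothetical and is not part of Catalyst; the sketch also assumes the value is a plain host list rather than one of the keywords "DefaultServer" or "InClient".

```python
from collections import Counter


def processes_per_host(hosts_value):
    """Count the search processes requested per host in a colon-separated string.

    Assumption: hosts_value is a host list such as "HostA:HostA:HostB",
    not one of the keywords "DefaultServer" or "InClient".
    """
    names = [name for name in hosts_value.split(":") if name]
    return dict(Counter(names))


if __name__ == "__main__":
    layout = processes_per_host("HostA:HostA:HostB")
    print(layout)                 # {'HostA': 2, 'HostB': 1}
    print(sum(layout.values()))   # 3 processes in total
```

A host that appears more than once in the string receives more than one process, which is why such a host needs multiple processors.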

Indexing and Its Effect on Performance

Indexes are another way of dealing with large volumes of data. Functionally, Catalyst indexes are like the indentations for each letter of the alphabet cut into the front edge of a dictionary. The thumb index formed by these indentations allows you to jump quickly to the d's when looking up the word data. Catalyst builds indexes automatically, and they are exploited automatically during database searching. Nevertheless, if you are building a large database, you must be aware of them and make certain specifications related to indexing.

Catalyst databases support integrated 1D-2D-3D searching. That is, you can search for compounds that match a hypothesis composed of 1D components (constraints on data fields), 2D components (constraints on atom or bond types), and 3D components (constraints on geometric relationships). For example, your hypothesis can contain a 1D component such as MolecularWeight < 800, a 2D component that specifies a nitrogen that is not in an amide bond, and a 3D component that restricts the nitrogen to 5 to 7 angstroms from the center of a phenyl ring.

For efficient searching of large databases in Catalyst, indexes should be constructed for the 1D, 2D, and 3D data components of all compounds. For 1D data, however, you can choose at build time which fields are indexed. Fields without indexes save disk space, but searches on them are slower. You cannot easily change those indexing specifications after the database is built, or add indexes for new data fields appended to an existing database; call MSI Scientific Support for assistance.
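The trade-off between indexed and unindexed 1D fields can be pictured with a generic Python sketch. This is only an analogy for the thumb-index idea and makes no claim about how Catalyst implements its indexes; the compound names and molecular weights are invented for illustration.

```python
import bisect

# Hypothetical (name, molecular weight) records; the values are illustrative only.
RECORDS = [("cmpd_a", 912.4), ("cmpd_b", 217.3), ("cmpd_c", 455.0), ("cmpd_d", 799.9)]


def scan_below(records, limit):
    """Unindexed search: every record is tested against the constraint."""
    return sorted(name for name, mw in records if mw < limit)


def build_index(records):
    """Index: the records sorted by weight (costs extra storage, built once)."""
    return sorted((mw, name) for name, mw in records)


def indexed_below(index, limit):
    """Indexed search: binary search for the cutoff, then read only the matches."""
    cut = bisect.bisect_left(index, (limit,))
    return sorted(name for _, name in index[:cut])


if __name__ == "__main__":
    index = build_index(RECORDS)
    print(scan_below(RECORDS, 800))    # ['cmpd_b', 'cmpd_c', 'cmpd_d']
    print(indexed_below(index, 800))   # ['cmpd_b', 'cmpd_c', 'cmpd_d']
```

Both functions return the same answer for a constraint such as MolecularWeight < 800; the indexed version avoids touching every record at the price of storing the sorted copy.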

Data Model, Property Dictionary, and crefs

The data model defines which data fields are associated with each molecule in a database. In a Catalyst database each compound is uniquely identified by its name and by an internal compound reference number called a cref (pronounced see-ref). Each entry in the database can be retrieved by its molecular structure, and this structure can exist more than once in the database. Thus, you can build a Catalyst database with one entry named Captopril and another named (S)-1-(3-mercapto-2-methyl-1-oxopropyl)-L-proline. As a user, you have the responsibility for ensuring the uniqueness of chemical structures in the database, if that is what you want.

For those familiar with relational database terminology, while Catalyst's database engine supports full relational database functionality, only a single-table data model is accessible to the user. Molecular topology (for which duplicates can exist) acts as a secondary key, molecule name (for which duplicates are not allowed) acts as a secondary key, and the single primary key is the cref. The major consequence of the single-table data model is that you cannot store multiple values for a property in a Catalyst database. Thus, you cannot use a Catalyst database to store an indeterminate and unlimited number of IC50s for a compound. With Catalyst, you can store only a single value per property (usually the average or median).
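Because only one value per property can be stored, repeated measurements have to be reduced to a single number before they are loaded. The following Python sketch shows one way to do that with the median; the function name and the IC50 values are hypothetical, and the choice of median over average is only an example.

```python
from statistics import median


def collapse_measurements(measurements):
    """Reduce each compound's list of measured values to one value (the median).

    Compounds with an empty list are left out, since there is nothing to store.
    """
    return {name: median(values) for name, values in measurements.items() if values}


if __name__ == "__main__":
    # Invented example data: several IC50 measurements per compound.
    ic50s = {"Captopril": [21.0, 23.0, 28.0], "cmpd_x": [5.0, 7.0], "cmpd_y": []}
    print(collapse_measurements(ic50s))  # {'Captopril': 23.0, 'cmpd_x': 6.0}
```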

You can specify the molecular topology and the molecule name for each database entry. The system provides the crefs, which are invisible to the Catalyst user. The construction of very large databases requires that the person building the database be aware of how the crefs are assigned.

Each database can optionally have an associated property dictionary (.bpd) file, which the database builder provides. For each property, the following information must be specified:

Catalyst Database Files

A Catalyst database named Fred has the following files associated with it:

Furthermore, tables are stored for Fred in the 1D-data server's database. Each table contains the database ID encoded in the table's name (6458 in this example). These database tables are observable only to your system administrator.

Database Files and Their Disk Space Requirements

Database operations in Catalyst can require significant disk space, depending upon the size of the databases being manipulated. Guidelines for developing a rough estimate of the disk space needed for your database follow:

  1. The Conformational Models (.0bdb) file contains all structural information for compounds and requires:

    For each compound: 1 byte per character in a compound name + approximately 4 bytes per heavy (nonhydrogen) atom + 20 bytes overhead.

    For each conformer: Approximately 8 bytes per heavy (nonhydrogen) atom.

    Note: If compounds are quite large (more than 200 heavy atoms), double the estimates listed above. The guidelines for the .0bdb-file space requirements are approximate. Disk usage depends on many factors including the size of the compound, connectivity, and the number and positions of functional points (hydrophobes, hydrogen bond acceptors and donors, etc.) on each conformer.

  2. The amount of disk space required for the 1D database depends primarily on the amount of property data you have.

  3. The 2D Index (.2bdb) file contains structural indexing information to speed up database searches and requires about 40 bytes per compound. This file is optional; the catDB program will construct it only if you answer y or yes to the Database has 2D index? prompt when you are executing the catDB CONFIG command to specify your database configuration file.

  4. The 3D Index (.3bdb) file contains geometric indexing information to speed up database searches and requires about 1 kilobyte (Kb) per compound. This file is optional; the catDB program will construct it only if you answer y or yes to the Database has 3D index? prompt when you are executing the catDB CONFIG command to specify your database configuration file.

  5. The Feature Dictionary (.chm) file is used for building 3D indices on the database; it contains functions and features that are typically used for searching databases. The default dictionary file is $CATALYST_CONF/DBDictionary.chm. Although this file is optional, it is required if you do build 3D indices for the database; the catDB program will use it only if you answer y or yes to the Use the default database feature dictionary? prompt when you are executing the catDB CONFIG command to specify your database configuration file.

    If you want to use a .chm file other than the default, it must be consistent with the 3D indices template. For more information about using non-default .chm files, call MSI Scientific Support.
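The guidelines above can be turned into a rough per-compound calculation. The Python sketch below is an estimate under stated assumptions, not a catDB feature: the function name is hypothetical, "about 1 kilobyte" is taken as 1024 bytes, the doubling rule for compounds with more than 200 heavy atoms is applied to the whole .0bdb estimate, and the 1D data and the .chm file are left out because the guidelines give no per-compound figure for them.

```python
def estimate_compound_bytes(name_length, heavy_atoms, conformers,
                            index_2d=True, index_3d=True):
    """Rough disk estimate in bytes for one compound, following the guidelines.

    .0bdb: name characters + 4 bytes per heavy atom + 20 bytes overhead,
           plus 8 bytes per heavy atom for each conformer
           (doubled when the compound has more than 200 heavy atoms).
    .2bdb: about 40 bytes per compound (optional).
    .3bdb: about 1 kilobyte per compound (optional; assumed to be 1024 bytes).
    """
    models = name_length + 4 * heavy_atoms + 20
    models += conformers * 8 * heavy_atoms
    if heavy_atoms > 200:
        models *= 2
    total = models
    if index_2d:
        total += 40
    if index_3d:
        total += 1024
    return total


if __name__ == "__main__":
    # A 9-character name, 14 heavy atoms, 100 conformers, both indexes.
    print(estimate_compound_bytes(9, 14, 100))  # 12349
```

Multiplying the per-compound figure by the number of compounds gives a first approximation for the structural files; actual usage varies with the factors listed in the note under item 1.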

Note: If you want to install your database as a corporate database for others to use, see "Moving Databases" in the MSI document, Installing and Maintaining Catalyst.

To familiarize yourself with some UNIX and catDB program commands that are often useful when constructing databases, you can work through the following exercise.

  1. From your system administrator, find out where databases are stored on your system so that you can substitute the names of the appropriate directories for path in the steps that follow. The generic structure for path is
    .../database/platform/name

    in which the database's name appears in place of name and the platform name is substituted for platform.

  2. Open a UNIX shell window.

  3. Use the UNIX more command to display the contents of the configuration file for the sample database installed in your Stockroom by typing
    more /path/database/platform/Sample/Sample.bdb
    and pressing Enter. Remember to substitute the names of the directories you obtained in step 1 for path and platform. For example,
    more /biocad/r3.0/database/irix5r3/Sample/Sample.bdb
    is the appropriate entry for a system on which databases are stored in /biocad/r3.0 and the type of platform is irix5r3.

  4. To use the UNIX ls -l command sequence to determine the size of each of the files pointed to by the configuration file, type
    ls -l /path/database/platform/Sample
    and press Enter. The size in bytes of each file is listed in the column to the left of the date each was last saved.

Before You Start Building Databases: Determining if the 2D/3D Server Program Is Running

To construct a database, the Catalyst 2D/3D server, catDisk, must be running on the host on which you are building the database. To determine if the server program is running:

  1. Type
    ps -ef | grep catDisk | grep -v grep
    and press the Enter key. If the command results in output such as

       biocad   623   511  0 12:31:36 ?    0:00 /biocad/r3.0/software/qa/iris/bin/catDisk ca3.0

    catDisk is running and you can skip step 2. If you get no output, catDisk is not running, and you should go to the next step.

  2. Ask your system administrator where catDisk is running and continue this exercise on that machine. If you do not have a system administrator, refer to "Checking and Restarting the Catalyst Database Server" in Installing and Maintaining Catalyst.

    Note: To start catDisk you need one catDisk token, which is sufficient to run single-process searches. Parallel searches require additional catDisk tokens, one for each additional process.
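The check in step 1 can also be scripted. The following Python sketch filters ps -ef output for catDisk in the same way as the grep pipeline; the function names are hypothetical, and the sketch assumes that ps -ef is available on the host and that the server process name contains the string catDisk.

```python
import subprocess


def catdisk_lines(ps_output):
    """Return the lines of `ps -ef` output that mention catDisk, ignoring grep itself."""
    return [line for line in ps_output.splitlines()
            if "catDisk" in line and "grep" not in line]


def catdisk_is_running():
    """Run `ps -ef` on this host and report whether a catDisk process is listed."""
    result = subprocess.run(["ps", "-ef"], capture_output=True, text=True, check=True)
    return bool(catdisk_lines(result.stdout))


if __name__ == "__main__":
    print("catDisk running" if catdisk_is_running() else "catDisk not running")
```

Keeping the text filter separate from the call to ps makes it possible to try the filter on saved output from another machine.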

There are several ways to construct Catalyst databases depending on what you want to do and how much control you want over the operation. Select from the following topics for stepwise procedures:

Building a Default Database from Your StockroomDB

When you create a database using the Catalyst Create Database... command, the files of which it is composed are stored according to the specifications in the default configuration file set up by the catDB program. Also by default, the database is constructed with conformational analysis for each compound and with indexes that make searching for 2D and 3D structures more efficient. If you want more control over catDB commands and options, for example, to suppress conformational analysis or to defer index construction, see "Building a Custom Database from a StockroomDB with catDB". Follow these steps to convert your private StockroomDB into a Catalyst database you can share with others:

  1. In the View Database workbench, browse your StockroomDB by dragging and dropping the StockroomDB icon on the report area. The names of all the compounds in your Stockroom appear in the report.

  2. To exclude compounds from the spreadsheet from which you will construct your database, extend select the compounds' row numbers and then select Clear Selected Report Rows from the Edit menu.

  3. Select the local spreadsheet resulting from browsing your StockroomDB and then select Save Report To Lab As Spreadsheet... from the Data menu.

  4. When Catalyst displays the Save Report To Lab As Spreadsheet dialog box, type a name in the Name field and select the Save button.

  5. Select the saved spreadsheet on the shelf and then select Create Database... from the Databases menu.

  6. Catalyst displays the Create Database dialog box in which you should make your specifications and selections from the following options:

    Input Spreadsheet. The name of the currently selected spreadsheet appears in this text box. To specify another spreadsheet, click in the text box, use the Backspace or Delete key to remove unwanted characters, and type in the spreadsheet's name.

    Output Database. Click in this text box and type in a name for your database. If you fail to specify a name for your database, Catalyst displays an alert message reminding you to do so when you try to create the database.

    Conformational Model Generation

    Generation Type. Click on either Fast or Best to select the type of conformational models you want. Fast is the default. See "BEST/FAST" for detailed information on these options.

    Existing Conformers. Select Discard to generate a new conformational model for each compound or, if you are using Fast generation, you can select Use to retain conformers for the compounds in your spreadsheet. Discard is the default. See "ExistingConfs=" for detailed information on these options.

    Maximum Number of Conformers. Click in the text box and type a number for the upper limit on the number of conformers to be generated during conformational analysis. The default value is 100. See "MaxConfs=" for detailed information on this option.

    Job Options... Click on this button to display the Job Options dialog box. Specifying values for Start Time, Queue After, and Process Name is the same as for setting up a hypothesis generation job as described in "To Set Up and Generate a Hypothesis Automatically".

    Where To Execute. Select Locally to build the database on your machine if you have catDisk running on it. Otherwise, select Remotely On to construct the database on a machine other than your own.

    Remote Host. Click on a machine name from the scroll list to specify a remote computer on which to construct the database, and Catalyst enters the machine name in this text box. Alternatively, you can click in this text box, use the Backspace or Delete key to remove unwanted characters, and then type in the name of a remote machine. Note that to construct a database, catDisk (the 2D/3D server program) must be running on the remote host. See "Before You Start Building Databases" for information on how to determine if catDisk is running.

    Local Directory. Catalyst provides a unique default directory name of the form processnameDir. If the database is constructed locally, this is where it will be placed. Should you want to specify a different directory name, you may edit this field.

    Remote Directory. Catalyst provides a default directory name of the form
    /usr/tmp/processnameDir in this text box. You should specify a directory for your database that is on a partition that is local (not NFS mounted) to the machine on which catDisk is running. To find out if the partition is local, in a UNIX shell window on the host in the directory from which you intend to create a database, type

    df .

    (make certain there is a space between df and the .) and press Enter. If efs (extended file system) appears under Type, the partition is local. If nfs (network file system) appears under Type, the partition is NFS mounted.

    Cancel. Select Cancel to close the Job Options dialog box without changing any settings.

    Help. Select Help to display Catalyst's Help window.

    OK. Select OK to make your specifications effective, close the Job Options dialog box, and return to the Create Database dialog box.

    Create. Starts the database construction process according to your specifications in the Create Database and Job Options dialog boxes.

    Cancel. Closes the Create Database dialog box without changing any settings.

    Help. Displays Catalyst's Help window.

  7. When you have finished making your selections and specifications, select Create. Catalyst displays a message advising you that the setup for batch generation has been completed and is running or scheduled depending on your specifications. Select the Acknowledged button to remove this message. (See "When a Background Process Starts" under "What Happens if There Are Not Enough Tokens?" for additional information.)

  8. You can check on the progress of database construction by selecting Process Information... from the Data menu in the Stockroom. See "Process Information..." for details on the Process Information dialog box and handling batch process data.

Building a Custom Database from a StockroomDB with catDB

Follow these steps to use the catDB program to convert your private StockroomDB into a Catalyst database you can share with others:

  1. In the View Database workbench, browse your StockroomDB by dragging and dropping the StockroomDB icon on the report area. The names of all the compounds in your Stockroom appear in the report.

  2. To exclude compounds from the spreadsheet from which you will construct your database, extend select the compounds' row numbers and then select Clear Selected Report Rows from the Edit menu.

  3. Select the local spreadsheet resulting from browsing your StockroomDB and then select Save Report To Lab As Spreadsheet... from the Data menu.

  4. When Catalyst displays the Save Report To Lab As Spreadsheet dialog box, type a name in the Name field and select the Save button. Note that the name of your database will be the same as the name of this spreadsheet.

  5. With your saved spreadsheet selected, select Export... from the Data menu to display the Export Data dialog box.

  6. Under Export Type select Database Spreadsheet File (ESP), enter your other specifications, and then select the Export button to write your spreadsheet file to disk as an extended spreadsheet (.esp) file. You should specify a target directory that is local (not NFS mounted) to a machine on which catDisk, the 2D/3D database server program, is running. To find out if the partition is local, in a UNIX shell window on the host in the directory in which you intend to create the database from the .esp file, type
    df .

    (make certain there is a space between df and the .) and press Enter. If efs (extended file system) appears under Type, the partition is local. If nfs (network file system) appears under Type, the partition is NFS mounted.

  7. After you have exported the extended spreadsheet file, if necessary remotely log in to the machine to which you exported it.

  8. Open a UNIX shell window and change to the directory containing your extended spreadsheet file with the UNIX Change Directory command. For example, if you exported the extended spreadsheet file to the directory
    /home/arlene/Training, you would type
    cd /home/arlene/Training

    and press the Enter key.

  9. Type
    catDB CREATE extendedspreadsheetname.esp

    substituting the name of your extended spreadsheet file for extendedspreadsheetname and then press the Enter key.

    The catDB program first displays the contents of the default configuration (.bdb) file for the database you are creating, as shown below. Values such as the database ID, host names, and paths depend on your specific hardware installation and will be different for you. The notes in parentheses at the right are explanatory and are not part of the output.

    catDB version 2.2
    Default configuration:
    ! Copyright © 1991-1999
    ! All Rights Reserved
    ! Biocad Database Configuration file
    Database Name = extendedspreadsheetname
    catDB version = 2.2
    Database ID = 29878               (Unique ID chosen by catDB. Yours will be different.)
    Conformational Models:            (Specifications for the .0bdb file.)
    host = ravel                      (Host computer for .0bdb file.)
    path = /home/arlene/Training/     (Directory for .0bdb file.)
    !
    1D Data:                          (Specifications for the 1D property data file.)
    host = blackhole                  (Host computer for 1D property data file.)
    server = cat1-2.2                 (Name for server program for 1D property data file.)
    !
    2D Index:                         (Specifications for the .2bdb file, the 2D searching indexes.)
    host = ravel
    path = /home/arlene/Training/
    !
    3D Index:                         (Specifications for the .3bdb file, the 3D searching indexes.)
    host = ravel
    path = /home/arlene/Training/
    !
    Feature Dictionary:               (Specifications for the .chm file.)
    host = ravel
    path = /home/arlene/Training

  10. Respond to the prompts the catDB program displays. The default specification for each prompt is shown in [ ] (square brackets). You can just press the Enter key if you want to accept the default. For detailed information on all catDB CREATE prompts and options, see "CREATE".

  11. After the catDB program displays messages informing you that database construction is complete, you should perform the following check to verify you have done things properly. In the UNIX window, type
    catDB INFO databasename.bdb

    (substituting the name of the database for databasename) and press the Enter key. Executing the catDB INFO command prints out the number of compounds in the database, and a description of its property dictionary.

Other users can now install this database in their Catalyst sessions and browse it in the View Database workbench. To verify that the 1D property data is present in your database:

  1. Return to your Catalyst session and select Install Database from the Databases menu in the Stockroom.

  2. In the Install Database dialog box, click on the name of the configuration (.bdb) file corresponding to the name of the database you want to install.

  3. Click on the Install button, and then on the Cancel button to dismiss the dialog box. Catalyst places a database icon on the shelf of the Stockroom.

  4. Drag and drop the database icon on the View Database button in the tool bar or double-click on the icon to open a new View Database workbench.

  5. Browse the database and verify that the 1D property data is present by changing the column headings in variable fields to display values for properties other than the defaults. See "Customizing a Report of Spreadsheet Data" for details on changing properties in variable fields.

  6. After you have created and verified the database, you may want to make it easily accessible to other Catalyst users. To do this, copy the database configuration (.bdb) file into the $CATALYST_WCONF/ directory. Now your database will be visible in the primary (default) directory list box when a user chooses the Install Database command.

Building a Database from a Subset of Another Database

Another way of building a Catalyst database is to construct one from an existing database. If someone in your organization has already built a Catalyst version of your corporate database and you want to build a database containing a subset of that (for example, all molecules synthesized by project 452 or all the D2 antagonists), you can follow a sequence of steps similar to the ones for creating a database from compounds in your Stockroom.

  1. In the View Database workbench search the database containing the subset of compounds from which you want to construct a new database. See "To Search a Database" for more information on searching.

  2. Save the local spreadsheet resulting from your search with the Save Report To Lab As Spreadsheet... command from the Data menu.

  3. Depending upon your needs:

  4. If you created a custom database with the catDB command, install the database in Catalyst with the Install Database... command in the Databases menu in the Stockroom.

Building a Single-conformer Database

The following exercise is provided to illustrate how to construct a single-conformer database and how to overcome certain common problems including

For this exercise /installdir/cattrain/ex9.sd is the input file. For installdir substitute the name of your Catalyst directory in which your training materials were installed. The input file contains a small number of errors of the types listed above. The purpose of the exercise is to introduce you to the techniques for resolving typical problems that might be encountered when constructing a large database. Dealing with such problems generally requires a degree of UNIX and Catalyst sophistication. The exercise assumes that you have some UNIX expertise and a familiarity with the UNIX nawk (new awk) utility. The non-Catalyst input formats that catDB can read are SMILES, MOL, and SD files. All specify molecular topology; SD files specify 1D data as well.

Step 1: What Is in the Input File?

The first questions to ask about an SD file are

Finding the Number of Compounds in an SD File

The following one-line UNIX statement entered on the command line counts the number of termination records (occurrences of $$$$) in the file, thus determining the number of compounds by tabulating the number of delimiters that separate one compound's definition from another:

grep '$$$$' filename | wc -l

Substitute the path and file name of your SD file for filename.

Note: Using ex9.sd, the result of executing the statement is 200.
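The same count can be obtained without grep. The Python sketch below counts the termination records in an SD file; the function name is hypothetical, and the sketch assumes that every compound record ends with a line beginning with $$$$.

```python
def count_sd_records(lines):
    """Count compound records by counting lines that begin with the $$$$ terminator."""
    return sum(1 for line in lines if line.startswith("$$$$"))


if __name__ == "__main__":
    import sys

    # Usage: python count_sd.py filename
    with open(sys.argv[1]) as handle:
        print(count_sd_records(handle))
```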

Determining 1D Properties in an SD File

The UNIX statement

grep ">  <" filename | sort -u

locates all occurrences of the >  < (greater than, space, space, less than) character sequence, the delimiter that precedes a property-data specification in the SD file, and pipes them to the sort utility with the unique option to display a list of the file's property names.

Note: You must include two spaces between the greater than and less than characters. If you provide only one space, your command statement will fail to locate the delimiter for 1D data specifications in SD files. Using ex9.sd, the result of executing the statement is

>  <CAS_number>
>  <IC50>

Computing the Average Molecular Weight in an SD File

To find an approximate average size of the molecules in an SD file, use the following nawk command to print out the average molecular weight of a molecule:

cat filename | nawk '/^\$\$\$\$/ { for (i=1;i<=4;i++) getline; atomsum+=substr($0,1,3); ct++; };
END {print (14.7*atomsum)/ct;}'
Enter this command statement as one long line, pressing the Enter key only after you have finished typing in both of the lines shown above.

Note: Using ex9.sd, the result of executing the statement is 230.937.
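For readers who prefer a script to a one-liner, the following Python sketch makes the same kind of estimate. It is not a translation of the nawk statement: it reads the atom count from the fourth line of every record, including the first one in the file, so its result may differ slightly from the figure quoted above. The function name is hypothetical, the factor 14.7 is the per-atom weight used in the nawk statement, and the sketch assumes that each record has a three-line header followed by a counts line whose first three characters hold the atom count.

```python
def average_estimated_mw(lines, weight_per_atom=14.7):
    """Estimate the average molecular weight from the atom counts in an SD file.

    Assumption: line 4 of each record is the counts line, and its first
    three characters hold the number of atoms listed in the record.
    """
    atom_counts = []
    line_in_record = 0
    for line in lines:
        if line.startswith("$$$$"):
            line_in_record = 0          # the next line starts a new record
            continue
        line_in_record += 1
        if line_in_record == 4:
            atom_counts.append(int(line[0:3]))
    if not atom_counts:
        raise ValueError("no counts lines found")
    return weight_per_atom * sum(atom_counts) / len(atom_counts)


if __name__ == "__main__":
    import sys

    # Usage: python average_mw.py filename
    with open(sys.argv[1]) as handle:
        print(average_estimated_mw(handle))
```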

Finding the Total Number of Explicit Unknown Chiral Centers

To calculate the total number of chiral centers explicitly marked as unknown, execute

cat filename | nawk 'BEGIN{ct=0;}; NF==10 {if ($7==3) ct++;};
END {print ct;}'
Again, enter this command statement as one long line before pressing the Enter key.

Note: Using ex9.sd, the result of executing the statement is 0.

If the average molecular weight is more than 400 or if the number of unknown stereocenters is large (more than 0.5 per molecule on average), you can anticipate that the conformer generation step in database building will be exceptionally time-consuming. (The catDB program automatically populates both chiralities if the center is listed as unknown.)
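The two warning signs described above can be checked together. The Python sketch below mirrors the nawk test (a line with ten whitespace-separated fields whose seventh field is 3) and then applies the thresholds from the text; both function names are hypothetical, and the field-counting test inherits the same assumption about the atom-line layout as the nawk statement.

```python
def count_unknown_centers(lines):
    """Count lines with ten fields whose seventh field is 3 (stereo marked unknown)."""
    count = 0
    for line in lines:
        fields = line.split()
        if len(fields) == 10 and fields[6] == "3":
            count += 1
    return count


def expect_slow_conformer_generation(average_mw, unknown_centers, compounds):
    """Apply the rule of thumb: average weight above 400, or more than
    0.5 unknown stereocenters per molecule on average."""
    too_heavy = average_mw > 400
    too_many_unknown = compounds > 0 and unknown_centers / compounds > 0.5
    return too_heavy or too_many_unknown


if __name__ == "__main__":
    # Figures reported for ex9.sd in the text: 200 compounds, 230.937, 0 unknown centers.
    print(expect_slow_conformer_generation(230.937, 0, 200))  # False
```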

  1. For the ex9.sd file, verify that it contains 200 compounds by typing
    grep '$$$$' /installdir/cattrain/ex9.sd | wc -l
    (substituting your Catalyst directory name for installdir) and pressing the Enter key.

  2. Confirm that the 1D property fields in ex9.sd are CAS_number and IC50 by typing
    grep ">  <" /installdir/cattrain/ex9.sd | sort -u
    and pressing the Enter key. The system displays the following result:

    >  <CAS_number>
    >  <IC50>

  3. Determine that the average molecular weight of the compounds in ex9.sd is about 231 by typing
    cat /installdir/cattrain/ex9.sd | nawk '/^\$\$\$\$/ { for (i=1;i<=4;i++) getline; atomsum+=substr($0,1,3); ct++; }; END {print (14.7*atomsum)/ct;}'
    as one long line and then pressing the Enter key. The system displays 230.937 as the result.

  4. Verify that the number of explicitly unknown chiral centers in ex9.sd is zero by typing
    cat /installdir/cattrain/ex9.sd | nawk 'BEGIN{ct=0;};
    NF==10 {if ($7==3) ct++;};
    END {print ct;}'
    as one long line and then pressing the Enter key.
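The property check in step 2 can also be done with a short script that is less sensitive to spacing than the grep pattern. The Python sketch below collects the names that appear between angle brackets on data header lines; the function name is hypothetical, and the sketch assumes that every data header line starts with a greater-than character.

```python
import re

# A data header line starts with ">" and carries the property name in <...>.
_PROPERTY_HEADER = re.compile(r"^>.*?<([^<>]+)>")


def sd_property_names(lines):
    """Return the sorted, unique property names found on SD data header lines."""
    names = set()
    for line in lines:
        match = _PROPERTY_HEADER.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


if __name__ == "__main__":
    import sys

    # Usage: python sd_properties.py filename
    with open(sys.argv[1]) as handle:
        for name in sd_property_names(handle):
            print(name)
```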

Step 2: Building the Configuration (.bdb) File

Before building the configuration (.bdb) file, you must make some decisions on where the database files will reside, which in turn is determined by the computers that will function as data servers. Available disk space and server computers are usually the critical factors in determining where to put database files. A full multiconformer 3D database requires roughly 1 megabyte per 1,000 compounds. To determine which computers are servers, consult your system administrator. Use the UNIX df command to report the free disk space on each server. The disks that can be "served" by the server must be local to that machine, and not NFS mounted.

You will usually use the 1D server common to all users. The catDB program looks up the default 1D server for your installation and displays that information as part of its listing of specifications for a default configuration file. To practice building a .bdb file, follow these steps:

  1. Remotely log on to the machine that will function as your 2D/3D server by typing
    rlogin machinename
    (substituting the name of the computer for machinename) and pressing the Enter key.

  2. Change to the directory in which you want the files to reside with the UNIX cd command.

  3. We will call our database exercise9. Type
    catDB CONFIG exercise9
    and press the Enter key. The catDB CONFIG command returns a listing for a default configuration file in the same manner that the catDB CREATE command does.

  4. Notice that the program displays the name of the 1D server for you and selects a unique database ID. It also chooses your current directory as the location for storing database files and your current machine as the owner of these files. Similarly, the catDB CONFIG command also issues the prompt,
    Do you want to use the default configuration shown above? [y] :

  5. Type y (for yes) and press the Enter key or just press the Enter key to accept the default settings. (See "CONFIG" for detailed information on each of the prompts you can elicit by typing n for no and all of the settings you can specify with the catDB CONFIG command.) By default, the program builds the new files, exercise9.bdb and exercise9.nnnnn.chm (where nnnnn is the database ID). If you specified a feature dictionary file, the exercise9.nnnnn.chm file is a copy of it. Otherwise, exercise9.nnnnn.chm is a copy of the default feature dictionary, $CATALYST_CONF/DBDictionary.chm.

  6. Verify the configuration file's contents with the UNIX more command. To change the contents of the configuration file, execute
    catDB RECONFIG exercise9.bdb

Step 3: Defining the Property Dictionary (.bpd) File

There are two methods for building a property dictionary file. The first is to create it directly in an editor, using $CATALYST_CONF/Corporate.bpd as a template. That is, you can copy this file into an editor, edit it to specify the properties you want, and save it with your database name and a .bpd extension. Remember that the Special field should always be NULL for your definitions. (Non-NULL values are reserved for internal Catalyst use only.) Properties that are nearly always present for each molecule should be given a UNIVERSAL Schema specification; properties that occur for less than fifty percent of the compounds should be given a SPECIFIC specification. For properties you expect to search on with any constraint other than approximately equals (substring search), for example RGDIC_50 < 10.0, specify QUICK_REF for Reference. Otherwise you can save some disk space on the 1D server by choosing SLOW_REF for Reference.

The other method for building a property dictionary file is to define all the properties for your new database interactively in Catalyst by editing the property dictionary for your StockroomDB using the Edit Property Dictionary command in the Stockroom Databases menu. (See "Edit Property Dictionary..." for details on using the Edit Property Dictionary... command.) After editing your StockroomDB properties, perform a Save StockroomDB command. You can then use the UNIX cp command to make a copy of your Stockroom property dictionary for your exercise9 database as follows:

cp catdata/StockroomDB.bpd exercise9.bpd
Use a text editor to remove any unwanted property definitions. For this exercise, use a text editor to append the following definitions for IC50 and CAS_number to the property dictionary file:
IC50 FLOAT UNIVERSAL QUICK_REF NULL IC50 activity value
CAS_number STRING UNIVERSAL QUICK_REF NULL CAS registry number
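If you prefer to script the edit rather than use a text editor, the two definitions can be appended from the shell. The sketch below is for a Bourne-type shell; the function name add_exercise_props is hypothetical, and the duplicate check assumes that each definition starts in the first column with the property name followed by a space.

```shell
# Append the two exercise definitions to a property dictionary file,
# skipping any property whose name already starts a line in that file.
# Assumption: a definition begins in column 1 with "<name> ".
add_exercise_props() {
    bpd=$1
    while IFS= read -r line; do
        name=${line%% *}
        grep -q "^$name " "$bpd" 2>/dev/null || printf '%s\n' "$line" >> "$bpd"
    done <<'EOF'
IC50 FLOAT UNIVERSAL QUICK_REF NULL IC50 activity value
CAS_number STRING UNIVERSAL QUICK_REF NULL CAS registry number
EOF
}

# Example: add_exercise_props exercise9.bpd
```

Because existing names are skipped, running the function twice leaves only one copy of each definition.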

Step 4: Building the Database from the Input Files

When you have the necessary input files (.sd, .bdb, and .bpd), you're ready to build the database.

Note: If the .sd file contains a property whose value is the name of each compound (e.g. RegCmpdName), you should specify that property name in your ~/.Catalyst file before carrying out the catDB SD command. For example, adding the line:

importMOL.realCompoundNameProperty=RegCmpdName
to your ~/.Catalyst file specifies that the property called RegCmpdName will be used (if encountered) instead of the normal compound name field in each MOL header in building the database. The reason for using an alternate property to hold the compound name is that the .sd file format limits the number of characters in a compound name in the MOL header to 80, and many compounds have names longer than 80 characters.

The catDB SD command for building a database has many options, but for the purposes of this exercise, type the following as one long line and then press Enter:

nohup catDB SD /installdir/cattrain/ex9.sd exercise9 PropDict=exercise9
MaxConfs=1 errData=exercise9.err.sd >& exercise9.log &
The MaxConfs=1 option specifies building only one conformer for each molecule in the input ex9.sd file. The file you name with the PropDict= option specifies the property dictionary the catDB program uses when constructing the database; catDB automatically appends the .bpd extension to the file name you type. The errData= option puts all unprocessed data in the file you specify to the right of the = (equals) sign. The >& directs standard output and standard error to a file named exercise9.log. Following the command statement with the & operator runs the process in the background. The nohup command ensures that the catDB process will not be killed if the current window is killed.
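The >& redirection shown above is C-shell syntax. In a Bourne-type shell (sh, ksh, bash) the portable spelling is > file 2>&1. The sketch below wraps that form in a helper; the name run_logged is hypothetical, and the catDB arguments in the example are simply those of this exercise.

```shell
# Run a command in the background under nohup, sending both standard
# output and standard error to the log file named by the first argument.
run_logged() {
    log=$1
    shift
    nohup "$@" > "$log" 2>&1 &
}

# Example (Bourne-shell form of the exercise command):
# run_logged exercise9.log catDB SD /installdir/cattrain/ex9.sd exercise9 \
#     PropDict=exercise9 MaxConfs=1 errData=exercise9.err.sd
```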

Step 5: Evaluating Results and Isolating Problems

Building the exercise9 database should take about three minutes on an R3000 Indigo. After processing is complete, verify the number of compounds which were successfully converted into a Catalyst database using the command

catDB INFO exercise9.bdb
Since the INFO command reports 197 compounds in the database, 3 compounds did not convert. In the exercise9.log file, for each molecule successfully processed, you'll see lines like the following:

56145029 : Processing...
56145029 : CC(=O)N[C@?H]1N[C@?H](Cl)[C@?H](N)[C@?H](Cl)N1
70924833 : Processing...
70924833 : CC(=O)NC[C@?H]1CC[C@?H](CC1)CNC(=O)C
The log file also records which molecules had problems. These three compounds have been written into exercise9.err.sd because the errData= option was specified. In a real-life situation you would need to examine the data for the three molecules and, using knowledge of the MOL format, determine how they must be fixed. For example, a tetravalent, neutral nitrogen is a common problem; the charge must be specified in either the atom line or at the end in a CHG field. You may need to go back to your original documentation to retrieve an original drawing of the structure, which you can then draw using Catalyst. Exporting this new MOL file and replacing the original MOL data with it should correct the problem.
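A quick cross-check of the counts can be made from the shell. The sketch below relies on the SD file convention that every record ends with a line of four dollar signs, and it assumes that the log contains one "Processing..." line per molecule as in the excerpt above; the function names are hypothetical.

```shell
# Number of records in an SD file: each record ends with a "$$$$" line.
count_sd_records() {
    grep -c '^\$\$\$\$' "$1"
}

# Number of molecules the log reports as processed, assuming one
# "<id> : Processing..." line per molecule.
count_processed() {
    grep -c ' : Processing\.\.\.' "$1"
}

# Example:
# count_sd_records /installdir/cattrain/ex9.sd
# count_sd_records exercise9.err.sd
# count_processed exercise9.log
```

For this exercise, the record count of exercise9.err.sd should match the number of compounds that did not convert.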

Verify that you can install your newly built single-conformer database in Catalyst, and that you can see the 1D data for CAS_number and IC50. Also verify that you can perform a 3D search. Change one of the IC50 values and commit that change to the database, dispose of the workbench, and do the search again to verify that your updated value has been recorded.

Building a Multiconformer Database

The strategy for building a large, multiconformer database is to 1) build the conformational models in parallel, 2) merge the database segments into a single database, 3) add the 1D data, and 4) build the 2D and 3D searching indexes. Any problems with nonimported structures are best resolved by iteratively appending the structures to the database after it is built.

The first task is the generation of conformational models for each molecule in the database. We recommend that you specify up to 100 FAST conformers per molecule; the catDB program will build a conformational model composed of fewer if the conformational space can be adequately covered by fewer than 100. At this level of conformational analysis, building a multiconformer database should require roughly one R4400 CPU-day per 8,000 molecules, although the actual time depends on the size and the flexibility of the compounds.

Note: The default conformational model energy range is 20 kcal/mol. For FAST conformational analysis you can specify a different energy range by adding the user parameter

confAnalysis.catDB.maxEnergySpread = n

to your .Catalyst file. Substitute a value in Joules for n.

The procedure outlined below assumes constructing a 250,000-compound database with input data provided in three SD files (file1.sd, file2.sd, and file3.sd) containing 100,000, 100,000, and 50,000 compounds, respectively. Each of the SD files contains a collection of property data for the compounds in them. Approximately one gigabyte of disk space will be required to store such a 1D/2D/3D database.

  1. Identify the machines you plan to use, and on multiprocessor machines, the number of processors that will be used. You also need to consider the number of Catalyst/Info tokens you have, since each catDB process consumes one token. For the purpose of this sample procedure, we assume that the conformational models will be built on four processors of a multiprocessor workstation/server named hana.

  2. Identify where sufficient disk space is available. Ideally the disk space should reside on the machine on which you will run the catDB processes so that a very large load will not be placed on your network. This sample procedure assumes that the disk space is mounted locally on hana with the path name /home4/DB.

  3. As a rule of thumb, it is best to partition the building of a database into 10,000-compound pieces. To construct the 25 database configuration files that are needed for this example, first log in to hana and change your current directory to /home4/DB. Next execute the following commands:
    csh
    set count = 1
    while ($count < 26)
    ? catDB CONFIG part${count}.bdb
    ? @ count++
    ? end
    The question marks are prompts and are not something you are supposed to type. It is also important to type a space before and after the < and the @ characters.

  4. The procedure above causes catDB to construct the 25 database configuration files (each with a name of the form part#.bdb, where # is an integer from 1 to 25). In each case, do not accept the defaults provided, but instead accept the conformer database host (hana) and the conformer database path (/home4/DB); answer no to the questions concerning 1D data, 2D indexes, and 3D indexes; and finally accept this configuration by answering yes.

  5. Construct three script files (run1, run2, run3) from which the parallel building processes will be controlled. The contents of these script files are provided below, and you can obtain these files from MSI Scientific Support.

    File run1:

    nohup catDB SD file1.sd part1.bdb maxconfs=100 startafter='$0' \
      stopafter='$10000' >& part1.out &
    nohup catDB SD file1.sd part2.bdb maxconfs=100 startafter='$10001' \
      stopafter='$20000' >& part2.out &
    nohup catDB SD file1.sd part3.bdb maxconfs=100 startafter='$20001' \
      stopafter='$30000' >& part3.out &
    nohup catDB SD file1.sd part4.bdb maxconfs=100 startafter='$30001' \
      stopafter='$40000' >& part4.out &
    #nohup catDB SD file1.sd part5.bdb maxconfs=100 startafter='$40001' \
    #  stopafter='$50000' >& part5.out &
    #nohup catDB SD file1.sd part6.bdb maxconfs=100 startafter='$50001' \
    #  stopafter='$60000' >& part6.out &
    #nohup catDB SD file1.sd part7.bdb maxconfs=100 startafter='$60001' \
    #  stopafter='$70000' >& part7.out &
    #nohup catDB SD file1.sd part8.bdb maxconfs=100 startafter='$70001' \
    #  stopafter='$80000' >& part8.out &
    #nohup catDB SD file1.sd part9.bdb maxconfs=100 startafter='$80001' \
    #  stopafter='$90000' >& part9.out &
    #nohup catDB SD file1.sd part10.bdb maxconfs=100 startafter='$90001' \
    #  stopafter='$100000' >& part10.out &
    
    Note: The StartAfter= and StopAfter= options control the portion of the .sd file that is processed. Only the first four commands will be executed in this script file's current form because only four processors are being used in the parallel build. If more commands are uncommented (by removing the leading # character), a larger number of simultaneous processes will be started.

    File run2:

    #nohup catDB SD file2.sd part11.bdb maxconfs=100 startafter='$0' \
    #  stopafter='$10000' startcref=100001 >& part11.out &
    #nohup catDB SD file2.sd part12.bdb maxconfs=100 startafter='$10001' \
    #  stopafter='$20000' startcref=100001 >& part12.out &
    #nohup catDB SD file2.sd part13.bdb maxconfs=100 startafter='$20001' \
    #  stopafter='$30000' startcref=100001 >& part13.out &
    #nohup catDB SD file2.sd part14.bdb maxconfs=100 startafter='$30001' \
    #  stopafter='$40000' startcref=100001 >& part14.out &
    #nohup catDB SD file2.sd part15.bdb maxconfs=100 startafter='$40001' \
    #  stopafter='$50000' startcref=100001 >& part15.out &
    #nohup catDB SD file2.sd part16.bdb maxconfs=100 startafter='$50001' \
    #  stopafter='$60000' startcref=100001 >& part16.out &
    #nohup catDB SD file2.sd part17.bdb maxconfs=100 startafter='$60001' \
    #  stopafter='$70000' startcref=100001 >& part17.out &
    #nohup catDB SD file2.sd part18.bdb maxconfs=100 startafter='$70001' \
    #  stopafter='$80000' startcref=100001 >& part18.out &
    #nohup catDB SD file2.sd part19.bdb maxconfs=100 startafter='$80001' \
    #  stopafter='$90000' startcref=100001 >& part19.out &
    #nohup catDB SD file2.sd part20.bdb maxconfs=100 startafter='$90001' \
    #  stopafter='$100000' startcref=100001 >& part20.out &
    

    Note: The StartCref= option ensures that this portion of the database does not conflict with the pieces built by the run1 script. The number used (100001) is one greater than the total number of compounds in file1.sd. For a given .sd file referenced in a script, the value of the StartCref= option should always be the same, e.g., 100001 in the example above.

    File run3:

    #nohup catDB SD file3.sd part21.bdb maxconfs=100 startafter='$0' \
    #  stopafter='$10000' startcref=200001 >& part21.out &
    #nohup catDB SD file3.sd part22.bdb maxconfs=100 startafter='$10001' \
    #  stopafter='$20000' startcref=200001 >& part22.out &
    #nohup catDB SD file3.sd part23.bdb maxconfs=100 startafter='$20001' \
    #  stopafter='$30000' startcref=200001 >& part23.out &
    #nohup catDB SD file3.sd part24.bdb maxconfs=100 startafter='$30001' \
    #  stopafter='$40000' startcref=200001 >& part24.out &
    #nohup catDB SD file3.sd part25.bdb maxconfs=100 startafter='$40001' \
    #  stopafter='$50000' startcref=200001 >& part25.out &
    

    Note: The StartCref= option ensures that this portion of our database does not conflict with the pieces built by scripts run1 and run2. The number used (200001) is one greater than the total number of compounds in file1.sd and file2.sd.

  6. Start building the database by invoking the first script by typing csh run1 at the UNIX prompt. The first four jobs will begin. As each job completes, you should "comment out" its command line in the script (insert a # character in the leftmost space of the command line) and remove the # character from the next command line to be executed. For example, when part1 completes, comment out the first two lines and remove the leading # from lines nine and ten for part5. Next execute the newly uncommented command at the UNIX prompt (see below) and repeat this procedure until all pieces have been constructed.
    nohup catDB SD file1.sd part5.bdb maxconfs=100 startafter='$40001' \
     stopafter='$50000' >& part5.out &
    Note: Invariably one or more of your database construction jobs will be interrupted by normal system maintenance or an unexpected system failure. This is covered in "Stopping and Restarting Database Construction".

  7. When all of the database pieces have been built, it is time to merge them into three larger pieces that correspond directly to the original SD files. This is also a convenient time to extract all of the compounds that were not correctly imported and/or built. To prepare for the merging process, make three new database configuration files named db1.bdb, db2.bdb, and db3.bdb using the catDB CONFIG command. In each case, do not accept the default configuration; accept the defaults for the conformer database server, the conformer database path, and the 1D (biological) database components. Answer no to the questions about 2D and 3D indexes. Use these commands to merge the parts:
    catDB MERGE db1 \
      dblist=part1,part2,part3,part4,part5,part6,part7,part8,part9,part10  no1D
    

    catDB MERGE db2 \
      dblist=part11,part12,part13,part14,part15,part16,part17,part18,part19,part20  no1D

    catDB MERGE db3 dblist=part21,part22,part23,part24,part25 no1D

    The No1D option suppresses the generation of 1D property data for the resulting database because they will be created in the next step.

    Note: To conserve disk space, the conformational model binary data files named part#.#.0bdb should be backed up onto tape after each merge has been completed. Then use the following command to purge the database files from disk.

    catDB DELETE_DB part#

  8. At this point, it is time to create the 1D component of the three merged databases. The first step is to identify the properties that are represented in the input .sd files and to construct a property dictionary file required in the succeeding steps. This is most easily accomplished with the UNIX command
    grep ">  <" file1.sd file2.sd file3.sd | sort -u
    where the string being searched for is >  < (the greater than, space, space, and less than characters). See "Defining the Property Dictionary (.bpd) File" for a discussion of the construction of a property dictionary file. It is best to strictly limit the number of UNIVERSAL properties (preferably fewer than ten) that are indexed (QUICK_REF) as the indexes can require significant disk space, and they also tend to slow the 1D data creation process significantly. Once the property dictionary file has been constructed with the name DB.bpd, create the default 1D data with the following commands:
    catDB CREATE_1D db1 propdict=DB.bpd

    catDB CREATE_1D db2 propdict=DB.bpd

    catDB CREATE_1D db3 propdict=DB.bpd

  9. The final step in constructing the 1D component of the three merged databases is to load the property data from the original .sd files. This is also a convenient point to collect the corrected compounds that were not built originally (for example, because of problems with importing or with constructing conformational models) for subsequent processing. The commands for loading property data are
    catDB SD_UPDATE file1.sd db1 errData=file1.errors.sd

    catDB SD_UPDATE file2.sd db2 errData=file2.errors.sd

    catDB SD_UPDATE file3.sd db3 errData=file3.errors.sd

    Note: To conserve disk space, the three .sd files can be archived and deleted after the SD_UPDATE procedures have completed.

  10. To begin the final phase of the procedure, construct a database configuration file (called CorpDB, for the sake of this example) with the command
    catDB CONFIG CorpDB
    Use the default configuration that is provided. Merge the three pieces of the database using
    catDB MERGE CorpDB DBlist=db1,db2,db3 No2Dindex No3Dindex

    Note: To conserve disk space, back up the conformational model binary data files named db#.#.0bdb onto tape after merging the component databases. Then use the following commands to purge the database files from disk:

    catDB DELETE_DB db1

    catDB DELETE_DB db2

    catDB DELETE_DB db3

  11. The very last step is to calculate the 2D and 3D indexing information for speeding up searches of the database. Execute
    catDB RECALC CorpDB

    This procedure requires approximately one-half hour on an R4400 for every 10,000 compounds in the database, although this is highly dependent on the size and functionality of the compounds in the database.

At this point the database is ready for general use by the scientists within your organization.
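Typing the run1, run2, and run3 scripts by hand is error-prone, so you may prefer to generate the command lines. The Bourne-shell sketch below prints them in the same pattern as the scripts in step 5, including the same StartAfter= and StopAfter= numbering; the function name gen_run_script and its argument order are hypothetical. It prints every command uncommented, so you still need to comment out all but the first few before use.

```shell
# Print the catDB SD command lines for one input file, one pair of lines
# per 10,000-compound piece, following the pattern of run1, run2 and run3.
#   $1 = .sd file            $2 = number of compounds in it
#   $3 = number of first part   $4 = StartCref value (omit for none)
gen_run_script() {
    sd=$1 total=$2 part=$3 cref=$4
    chunk=10000
    start=0
    while [ "$start" -lt "$total" ]; do
        stop=$(( start + chunk ))
        if [ "$stop" -gt "$total" ]; then stop=$total; fi
        # The first piece starts after record 0; later pieces use the
        # previous stop value plus one, as in the scripts in step 5.
        after=$start
        if [ "$start" -gt 0 ]; then after=$(( start + 1 )); fi
        printf '%s\n' "nohup catDB SD $sd part$part.bdb maxconfs=100 startafter='\$$after' \\"
        printf '%s\n' "  stopafter='\$$stop'${cref:+ startcref=$cref} >& part$part.out &"
        part=$(( part + 1 ))
        start=$stop
    done
}

# Example:
# gen_run_script file1.sd 100000 1         > run1
# gen_run_script file2.sd 100000 11 100001 > run2
# gen_run_script file3.sd  50000 21 200001 > run3
```

The generated lines keep the csh >& redirection, so run them with csh as described in step 6.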

Stopping and Restarting Database Construction

It is reasonable to expect interruptions at some point during the construction of a large database. These interruptions could be orderly (for example, routine preventive maintenance), or they could be unexpected in nature (for example, a power outage or system failure).

Stopping Database Construction

If you need to halt catDB processes for any reason, you can do so on a per-process basis. Each running database construction process is associated with a database configuration file that has a .bdb extension. To stop a database building process, change your current working directory to the directory that contains the database configuration file and create a new file whose name is the database configuration file name with .stop as an additional extension. For example, if the database configuration file is named part1.bdb and this file resides in the directory /home4/DB, the following commands stop the ongoing construction process:

cd /home4/DB
touch part1.bdb.stop

Database building stops once the next compound is processed. The only outward sign that a stop instruction has been received is the removal of the newly created file with the .stop extension. Once a stop instruction has been received it cannot be reversed.
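Because the disappearance of the .stop file is the only confirmation, it can be convenient to create the file and wait for it to vanish in one step. The Bourne-shell sketch below does this; the function name request_stop and the polling interval are hypothetical choices. It waits indefinitely if no catDB process is building that database, so interrupt it in that case.

```shell
# Ask the catDB process building $1 (a .bdb file) to stop, then wait
# until the .stop file has been removed, which signals that the stop
# instruction was received. $2 is the polling interval in seconds
# (default 10).
request_stop() {
    bdb=$1
    interval=${2:-10}
    touch "$bdb.stop"
    while [ -e "$bdb.stop" ]; do
        sleep "$interval"
    done
}

# Example: cd /home4/DB && request_stop part1.bdb
```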

Important Note: Do not employ the procedure described above if your catDB process is using the AllowNFS option, because you will not be able to restart the process as described below. For additional information, see "AllowNFS".

Restarting Database Construction

Restarting database construction is independent of the way in which it was stopped or interrupted. The first task is to determine the last compound that was written to the database. This cannot reliably be obtained from any of the output and/or error files produced by the catDB program. You must use the command procedure that follows. For this example, we will restart the construction of part1.bdb.

catDB INFO part1.bdb Detail No1D |& grep Last

This command reports the name of the last compound that is saved in the database, for example:

Last compound in the database is 'Methylacetate'.

The next step is to modify the original command that started the process to indicate where database construction should resume. The modifications, a new StartAfter= value and the added APPEND option, are shown in the example below and should be made in the run1 script file being used to track the parallel building processes.

nohup catDB SD file1.sd part1.bdb maxconfs=100 startafter="Methylacetate" \
  stopafter='$10000' APPEND >& part1.out &

The argument change for the StartAfter= option provides the name of the last compound in double quotes, in contrast to the single quotes that were used to indicate a specific starting record number. The APPEND option is required to inform the process that you are intentionally adding to an existing database file.
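When several pieces have to be restarted, extracting the compound name by eye becomes tedious. The Bourne-shell sketch below pulls the name out of the INFO output; it assumes the line has exactly the form shown in the example above, and the function name last_compound is hypothetical.

```shell
# Read catDB INFO output on standard input and print the name of the
# last compound, assuming a line of the form
#   Last compound in the database is 'NAME'.
last_compound() {
    sed -n "s/^Last compound in the database is '\(.*\)'\.\$/\1/p"
}

# Example (Bourne-shell form of the INFO command above):
# catDB INFO part1.bdb Detail No1D 2>&1 | last_compound
```

The printed name is the value to place in double quotes after StartAfter= in the restart command.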

Important Note: If your catDB process is using the AllowNFS option, you will not be able to restart it as described above. For more information, see "AllowNFS".

Common Database Construction Problems

While the previous sections outline reliable ways to construct databases, there are a variety of problems that might be encountered. The most common ones are described below.

Overlapping crefs

Databases to be merged must not have overlapping crefs (internal compound reference numbers). Conflicts among crefs most commonly occur when multiple input files are used in the database construction process, and the StartCref= option is misused or not used at all. A simple but wasteful resolution of cref conflicts is to rebuild the affected portions of your database. A better solution is provided by the REPAIR_DB command, which has the following syntax:

catDB REPAIR_DB DBname.bdb StartCref=n
An appropriate starting cref number can be obtained with the catDB INFO command and its Detail option. See "INFO" for a detailed description of the command and its usage.

Note that the REPAIR_DB command has many restrictions. A database with crefs to be shifted must not have 1D property data, 2D indexes, or 3D indexes; each of these components will need to be remade after the cref shift has been completed. The REPAIR_DB command also works entirely in memory. Thus, its use might be limited by your computer system's resources. See "REPAIR_DB" for details.
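Before merging, it can help to compare the cref ranges of the pieces so that a conflict is caught early. The Bourne-shell sketch below tests two ranges for overlap; the function name crefs_overlap is hypothetical, and the low and high values are assumed to be the ones you read from the catDB INFO output for each database.

```shell
# Succeed (exit status 0) if the cref ranges [lo1, hi1] and [lo2, hi2]
# share at least one value; fail otherwise.
#   $1 = lo1   $2 = hi1   $3 = lo2   $4 = hi2
crefs_overlap() {
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Example:
# if crefs_overlap 1 100000 100001 200000; then
#     echo "cref conflict: consider catDB REPAIR_DB"
# fi
```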

NFS-mounted Disk Partitions

A less serious problem arises when you construct a large database using CPU resources that are distributed across a network and need to access disk partitions that are mounted using NFS. In general it is better not to use NFS partitions, because of problems with file locking and, more broadly, the network traffic that results; however, an override has been provided for the careful user. To permit the catDB program to use an NFS disk partition, use the AllowNFS option on the catDB command line. See "AllowNFS" for details.

Errors in 1D Property Definitions

If you made a mistake in a 1D property definition (such as specifying a DATE as a STRING), the most robust solution is to remove the 1D portion of the database, correct the property definition in the .bpd file, and reload the data using one of the UPDATE commands supported by the catDB program.

Database maintenance falls into the main categories covered in the sections below.

Read and write access to any database is controlled by UNIX file permissions. If you have write permission for a database, you can update it.

Updating 1D Data in Individual Property Data Fields

Updating 1D data is easy in Catalyst, and is a primary reason for building a database with 1D data. To change or remove values in a database's 1D component that you created:

  1. Install the database in Catalyst, and browse or search it to bring the compounds of interest into a local spreadsheet.

  2. Edit the local spreadsheet and save it. See "Changing Values in a Spreadsheet" for details.

  3. Select Commit 1D Changes to Database from the Databases menu in the View Database workbench to replace the values in the database with the new ones in your spreadsheet. All other users of this database can now see those changes. For additional information on this step, see "To Update 1D Values in a Database".

You can also perform a mass update by making your changes in a spreadsheet and exporting it as a spreadsheet (.spst) file. Then use the catDB SPST_UPDATE command with the database's configuration file to update the values in the database with the ones in the .spst file. The syntax of the command is

catDB SPST_UPDATE spreadsheetname.spst outputDBname.bdb
Substitute the name of the database's configuration file for outputDBname.bdb and the name of the spreadsheet file for spreadsheetname.spst.

Adding or Deleting Properties from the Property Dictionary

You can add or delete only specific properties; universal properties can only be changed by rebuilding the database. To add a set of specific properties, use the catDB BPD command to make a property dictionary (.bpd) file defining those properties, and then use the catDB ADD_PROPERTY command with the database's configuration and property dictionary files. Similarly, to delete a set of specific properties, use the catDB BPD command to make a property dictionary file defining them, and then use the catDB DELETE_PROPERTY command. The syntax for these commands follows:
catDB ADD_PROPERTY outputDBname.bdb propertydictionaryname.bpd

catDB DELETE_PROPERTY outputDBname.bdb propertydictionaryname.bpd

You can also add and delete properties while you're working in Catalyst. For more information, see "Edit Property Dictionary...".

Deleting Compounds from a Database

To delete a set of compounds, first install the database in Catalyst and build a spreadsheet containing the names of all of the molecules to be removed. If a whole class of molecules is to be removed, the easiest way to do this is to perform a 1D search. Alternatively, you can use the Find command to locate molecules individually and save them in the Stockroom; in that case, browse the Stockroom, clear all rows except those with the names of compounds to be deleted, and save the resulting spreadsheet. In either case, export the spreadsheet as an .spst file. You can then use the spreadsheet file to remove the compounds in it from your database by issuing the command

catDB SPST_DELETE spreadsheetname.spst outputDBname.bdb
in which you substitute the name of your spreadsheet file for spreadsheetname.spst and the name of your database's configuration file for outputDBname.bdb.

The catDB program also lets you remove compounds from a database with extended spreadsheet files. See "DELETE" for details.

Adding Compounds to a Database

Adding new compounds to an existing database will be necessary to maintain an up-to-date archive of information for the scientists in your organization. This is most easily accomplished with the sequence of commands below. This example details an addition to a database named CorpDB that is described by a database configuration file called CorpDB.bdb.

  1. Determine the range of internal compound reference numbers in the database to which compounds are to be added with
    catDB INFO CorpDB Detail No1D |& grep Cref
    The command returns a single line of data such as

    Cref low, high = 13002, 13145

    You will need to add 1 to the higher number reported and use that value to specify the StartCref= option in a subsequent database building step. For this example, the correct specification is StartCref=13146 for the database building step described below.

  2. Build a database with the compounds to be added to CorpDB. This is most easily accomplished by creating in the View Database or Generate Hypothesis workbench a spreadsheet containing the compounds and saving this spreadsheet to the Stockroom with a name like UpdateCorpDB.

  3. Export this spreadsheet as an extended spreadsheet (.esp) file. Then create a database from it in a UNIX shell window, specifying the StartCref= value determined in step 1:
    catDB CREATE UpdateCorpDB.esp StartCref=13146

  4. The final step is to update the original database using the MERGE command. Prior to merging, it is important to ensure that no one is accessing the database from Catalyst or the catSearch program. Then combine the databases with
    catDB MERGE CorpDB DBlist=UpdateCorpDB,CorpDB

    The compounds in UpdateCorpDB will replace those in CorpDB if duplicate names are present. This procedure can be used to update existing compounds in a database. Note, however, that if compounds are replaced using this technique rather than using the Append/Replace Database Compounds... command in Catalyst, all existing 1D data for the replaced compounds will be lost. See "Append/Replace Database Compounds" for information on replacing 2D and 3D topologies while preserving 1D data.
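The StartCref= value needed in step 3 can also be computed rather than worked out by hand. The Bourne-shell sketch below reads the INFO output and prints the highest cref plus one; it assumes the line has the form shown in step 1, and the function name next_cref is hypothetical.

```shell
# Read catDB INFO output on standard input and print the next free
# cref (highest cref + 1), assuming a line of the form
#   Cref low, high = 13002, 13145
next_cref() {
    awk -F'[=,]' '/^ *Cref low, high/ { print $NF + 1; exit }'
}

# Example (Bourne-shell form of the INFO command in step 1):
# catDB INFO CorpDB Detail No1D 2>&1 | next_cref
```

For the sample line in step 1, the function prints 13146, the value used in step 3.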

Moving the Location of Database Files

As your system expands over time, it might be necessary to change the physical location of one or more of the database servers and the database files that store the various data components in the database. Such database configuration problems are handled with the catDB RECONFIG command. See "RECONFIG" for details on altering the locations of conformational models, the 2D indexes, the 3D indexes, and the feature dictionary used to construct the 3D indexes. Consult MSI Scientific Support for instructions on how to alter the location of the 1D data server for a database.





Last updated April 25, 1996 at 09:42am PDT.

Copyright © 1999, Molecular Simulations Inc. All rights reserved.