
catDB Database Management


Introduction

The catDB program supports commands that build and maintain databases outside of Catalyst. With a couple of exceptions, you must use catDB commands from a UNIX command line (see "How to Use a UNIX Command Line"). The exceptions are the Create Database and Append/Replace Database Compounds commands, which are available within the View Database workbench.

A Catalyst database is organized so that 1D scalar data (values associated with properties such as name, molecular weight, formula, activities, and so on) are stored in a 1D database file; 2D data (topological) are stored in a 2D database file; and 3D data (geometric) reside in a 3D database file. The locations of the 1D, 2D, and 3D components of a Catalyst database are controlled with a configuration (.bdb) file that you create and specify with the catDB CONFIG or CREATE commands.

With the catDB program you can create databases from a variety of files: extended spreadsheet (.esp) or spreadsheet (.spst) files you build and save in Catalyst; Catalyst compound (.cpd); MDL MOL (.mol); SMILES (.smi); Catalyst topology (.tpl); or MDL SD (.sd) files. Before you use catDB commands, see "Building Databases" for background information on database structure, performance, data model, and step-by-step procedures for constructing databases.

Once you have constructed a Catalyst database, you can use the Catalyst Install Database command from the Stockroom's Databases menu to gain access to that database from within Catalyst. Then you can search it, change its 1D values, and so on. You can also maintain your database with catDB commands for removing compounds, adding or removing properties, deleting databases, and so on.

How to Use catDB Commands

At the UNIX command line, type a catDB command statement, which has the form

catDB COMMAND arguments [options]

substituting the name of a catDB command for COMMAND, the values required by the command for arguments, and optional parameters for options. The complete syntax of each catDB command is shown before its description on the following pages. Italic type indicates that you should substitute the name of a value, and [ ] (square brackets) indicate optional parameters (separated by a | character) that you can include or omit. Do not type square brackets in your command statement; separate catDB, the command name, arguments, and options with blank spaces. (To display a list of all catDB commands, type catDB HELP and press the Enter key in a UNIX shell window.)

Note: Operations that modify a database can be performed only by the owner of the database, i.e. the UNIX user who created it.
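Because only the owner can modify a database, a script can check ownership before it runs a modifying command. The following is a minimal sketch, not part of catDB: it assumes that the UNIX owner of the configuration (.bdb) file is the user who created the database, which this document does not state explicitly, and the function name owns_db is hypothetical.

```shell
# Sketch only: succeeds when the current user owns the given .bdb file.
# Assumption: ownership of the .bdb file mirrors ownership of the database.
owns_db() {
    [ -O "$1" ]
}

# Example use (mydb.bdb is a placeholder name):
#   owns_db mydb.bdb || echo "mydb.bdb is not owned by $(id -un)"
```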

Suggestions for Building Catalyst Databases

Some source databases allow chemically (topologically) or geometrically invalid structures with valence violations or coincident atoms. The catDB program identifies such invalid structures during the database building process and displays the names of the invalid compounds on the screen; the program does not put the invalid structures in the database. If you use the errFile= option, the program also sends the names of the incorrect compounds to the file you specify.

If the database you want to build will contain more than 5,000 compounds, it is probably best to set up a script that constructs small databases and then combines them two at a time with the catDB MERGE command. In this way, if catDB runs into a problem with a particular molecule while building a small database, it can be more easily detected and corrected. In addition, large molecules (larger than pentapeptides) use a lot of memory and can cause a crash because of insufficient swap space. It is recommended that these types of molecules be built separately and then merged with the rest of the database. See "Building Databases" for background information on database structure, performance, data model, and step-by-step procedures for constructing databases.
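When the source is a single large MDL .sd file, one way to prepare the small builds suggested above is to cut the file into pieces at the $$$$ record delimiters. The following is a rough sketch that uses only standard awk; the function name split_sd, the chunk size, and the output file naming are illustrative choices, not catDB features.

```shell
# Sketch only: split an .sd file into pieces of at most N records each.
# Records are assumed to end with a line that starts with $$$$.
split_sd() {
    # $1 = source .sd file, $2 = records per piece, $3 = prefix for output files
    awk -v n="$2" -v prefix="$3" '
        BEGIN { chunk = 1; count = 0; out = sprintf("%s_%04d.sd", prefix, chunk) }
        { print > out }                      # copy every line to the current piece
        /^\$\$\$\$/ {                        # end of one compound record
            count++
            if (count == n) {                # piece is full: start the next one
                close(out)
                chunk++
                count = 0
                out = sprintf("%s_%04d.sd", prefix, chunk)
            }
        }
    ' "$1"
}

# Example use (file names are placeholders):
#   split_sd vendor.sd 1000 vendor_part
# produces vendor_part_0001.sd, vendor_part_0002.sd, and so on.
```

Each piece can then be built with the SD command into its own database and the results combined with MERGE, keeping in mind that MERGE requires the input databases to have disjoint crefs.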

HELP

HELP displays usage information about the catDB program.

Usage: catDB HELP

An ASCII text screen of catDB commands and options is displayed when you execute catDB HELP. This screen is also displayed when you give catDB an insufficient or inappropriate argument.


Commands for Building Databases

CONFIG

CONFIG creates a database configuration (.bdb) file for use with the ADD_PROPERTY, BPD, CPD, CREATE, CREATE_1D, DELETE_DB, DELETE_PROPERTY, INFO, MERGE, MOL, RECALC, RECONFIG, SD, SD_UPDATE, SMILES, SPST_CREATE, SPST_DELETE, SPST_UPDATE, TPL, and VALIDATE commands.

Usage: catDB CONFIG [outputDBname.bdb]

Although the CREATE command lets you specify a configuration file interactively, if you want a database configuration file to have a name that is different from your source extended spreadsheet (.esp) file, you must create the configuration file with the CONFIG command and then use the OutputDB= option with the CREATE command.

[outputDBname.bdb] For outputDBname.bdb specify the name for a database configuration file. If you fail to provide a file name, the program prompts you to specify one.

Before executing the CONFIG command: 1) remotely log on to the machine on which catDisk is running at your installation and which will function as your 2D/3D server; 2) use the UNIX cd command to change to the directory in which you want the files to reside; 3) from the UNIX command line type catDB CONFIG outputDBname.bdb, substituting the name for a database configuration file for outputDBname.bdb and press Enter.
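Steps 2 and 3 above can be wrapped in a small helper so that the configuration file is always created from the intended directory. This is a sketch under stated assumptions: the remote logon in step 1 has already been done, catDB is on the PATH, and the function name make_config and the example paths are hypothetical.

```shell
# Sketch only: run catDB CONFIG from the directory that should hold the files.
# The subshell keeps the caller's working directory unchanged.
make_config() {
    # $1 = target directory, $2 = name for the configuration (.bdb) file
    ( cd "$1" && catDB CONFIG "$2" )
}

# Example use (directory and file name are placeholders):
#   make_config /data/catalyst/dbs leads.bdb
```

CONFIG remains interactive when started this way, so its prompts still appear in the terminal.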

The CONFIG command is interactive: it prompts you with default values that you can accept, or you can specify your own values for the configuration file for the database you are about to create. In the prompts, defaults are the values displayed in [ ] (square brackets). To accept default values you can type the value shown in square brackets and press the Enter or Return key (depending on how it is labeled on your keyboard), or you can simply press the Enter key. The program displays the next prompt and waits for your specification. Each prompt and its description follows. For a more detailed discussion on the information specified in the configuration file, see "Relationships between Database Components" in the MSI document, Installing and Maintaining Catalyst.

Do you want to use the default configuration shown above? [y] The catDB CONFIG command returns a listing for a default configuration file in the same manner that the catDB CREATE command does. Notice that the program displays the name of the 1D server for you and selects a unique database ID. It also chooses your current directory as the location for storing database files and your current machine as the owner of these files. Type n (for no) and press the Enter key to start displaying all of the prompts that permit you to alter the specifications of the default configuration file. Type y (for yes) to accept the default configuration or just press Enter. The program creates the configuration file for you, and you can use it as an argument for the CREATE command.

Conformational models host [DefaultMachineName] Specify the name of the host computer for the .0bdb file containing conformational models, or press Enter to accept the default machine name.

Conformational models path [DefaultPath] Specify the path for the directory to contain the .0bdb file for conformational models, or press Enter to accept the default path listed in square brackets.

Database has 1D data ? [y] Answer y to display prompts for making specifications for a 1D property data file. Answer n to skip specifying and building a 1D data file.

1D data server host [DefaultMachineName] Specify the name of the host computer for the 1D property data file, or press Enter to accept the name of the default machine. (If there is more than one host at your site, ask your system administrator for the name of your 1D data server host.)

1D data server name [cat1-3.0] Specify the name of the 1D property data server (Oracle), or press Enter to accept the default.

Database has 2D index ? [y] Answer y to display prompts for making specifications for a 2D index (.2bdb) file. Answer n to skip specifying and building a 2D index (for example, to save space at the expense of speed).

2D Index host [DefaultMachineName] Specify the name of the host computer for the .2bdb file or accept the default machine name in square brackets.

2D Index path [DefaultPath] Specify the path for the directory to contain the .2bdb file or accept the default.

Database has 3D index ? [y] Answer y to display prompts for making specifications for a 3D index (.3bdb) file. Answer n to skip specifying and building a 3D index (for example, to save space at the expense of speed).

3D Index host [DefaultMachineName] Specify the host computer for .3bdb file or accept the default.

3D Index path [DefaultPath] Specify the path for the directory to contain the .3bdb file or accept the default.

Use the default database feature dictionary? [y] Answer y to use the default feature dictionary, $CATALYST_CONF/DBDictionary.chm, when building a 3D index. Answer n to display prompts for making specifications for a different feature dictionary (.chm) file.

Feature dictionary host [DefaultMachineName] Specify the host computer for the .chm file or accept the name of the default machine.

Where is the feature dictionary file currently located? [$CATALYST_CONF/DBDictionary.chm] Specify a source .chm file or accept the default file. This prompt is displayed only when you answer no to the Use the default database feature dictionary? prompt.

Specify path for feature dictionary for this database: [DefaultPath] Specify the path for the directory to contain the copied .chm file or accept the default.

Is this correct ? [y/n] After you have finished responding to the prompts, the program displays a list of your specifications and prompts for confirmation. If you answer y to accept the database configuration, the program builds a new configuration file with the name you specified and a feature dictionary file with an .nnnnn.chm (where nnnnn is the database ID) extension. If you specified that 3D indexes are to be built, a copy of the feature dictionary file is made by the CONFIG command. If you type n, the program prompts you to execute the catDB CONFIG command again to make your specifications.

RECONFIG

RECONFIG changes a database configuration (.bdb) file.

Usage: catDB RECONFIG DBname.bdb

DBname.bdb For DBname.bdb substitute the name of the configuration file you want to alter.

The RECONFIG command is interactive. Once you invoke it, the catDB program makes a copy of your configuration (.bdb) file and appends an .old extension to its name so that you have a backup. Then the program displays the current settings in your configuration file and prompts you to specify changes. Default values for responses are displayed in [ ] (square brackets). After you have made your specifications, the catDB program displays the information in your altered configuration file and, if possible, moves files according to your specifications.

CREATE

CREATE constructs a database (or appends data to an existing database) from copies of compounds referenced in a Catalyst extended spreadsheet (.esp) file that can contain edited 1D data.

Usage: catDB CREATE extendedspreadsheetname.esp [OutputDB=outputDBname.bdb] [DEFAULT] [errFile=errorfilename] [PropDict=propertydictionaryname] [ExistingConfs=Use|Discard] [BEST|FAST] [MaxConfs=n] [ConfBuffer=n] [No2DIndex] [No3DIndex] [APPEND] [StartCref=n] [AllowNFS]

Both the database that is created or modified and the databases referenced by the source .esp file must have a 1D component. The catDB program first copies 1D data values from the referenced databases and then updates this 1D data to reflect any edited data in the extended spreadsheet file. If any compound in the source extended spreadsheet file comes from a StockroomDB, the topology and conformational model for the compound must be present as a CPD file with a file name of the form compoundname.cpd in the same directory in which you run catDB.

Before executing the CREATE command: 1) remotely log on to the machine on which catDisk is running at your installation and which will function as your 2D/3D server; 2) use the UNIX cd command to change to the directory in which you want files to reside. Descriptions of the extended spreadsheet source file argument, the OutputDB= and DEFAULT options, and using the CREATE command interactively follow.

extendedspreadsheetname.esp For extendedspreadsheetname.esp substitute the name of your Catalyst extended spreadsheet (.esp) file. Note that you must supply the .esp part of your extended spreadsheet's file name when you use it with the CREATE command. The name for your database files is derived from the name of your extended spreadsheet file. Use the OutputDB= option if you want to specify a different name.

[OutputDB=outputDBname.bdb] Use this option if you want the name of your database to be different from the name of the .esp file from which you build your database. First, name and build the configuration (.bdb) file for your database with the CONFIG command. Then execute the CREATE command substituting the name of the configuration file for outputDBname.bdb. If you want to append new structures and data from your extended spreadsheet (.esp) file to a database you have already created, substitute the name of the target database's configuration file for outputDBname.bdb.

[DEFAULT] Use this option if you do not want to be prompted for specifications. The catDB CREATE command uses the default values for all applicable options. You can also include the DEFAULT option along with some specified options in your command statement, and the program uses your specifications, suppresses prompts, and assigns default values to every option not explicitly specified.
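As an illustration of combining DEFAULT with explicitly specified options, the sketch below wraps two non-interactive CREATE statements: one that builds a new database and one that appends to an existing one. It uses only options listed in the usage line above; the function names, the file names in the example, and the choice to derive the error file name from the .bdb name are assumptions made for the example.

```shell
# Sketch only: non-interactive CREATE statements built from the documented options.
create_db() {
    # $1 = source .esp file, $2 = configuration (.bdb) file made with catDB CONFIG
    catDB CREATE "$1" "OutputDB=$2" DEFAULT "errFile=${2%.bdb}.err"
}

append_db() {
    # Same arguments; APPEND adds the .esp data to the existing target database.
    catDB CREATE "$1" "OutputDB=$2" DEFAULT APPEND "errFile=${2%.bdb}.err"
}

# Example use (file names are placeholders):
#   create_db leads.esp leads_db.bdb
#   append_db more_leads.esp leads_db.bdb
```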


Using the CREATE Command Interactively

If you do not use the DEFAULT option, which suppresses prompts, the catDB CREATE command is interactive; that is, once you enter a command statement, the program prompts you for specifications for creating a database from an .esp file or for appending to a target database data derived from an .esp file. The values displayed in [ ] (square brackets) are the defaults. To accept default values you can type the value shown in square brackets and press the Enter or Return key (depending on how it is labeled on your keyboard), or you can simply press the Enter key. The program displays the next prompt and waits for your specification. Each prompt and its description follows.

Do you want to use the default config file for your database? [y]: Unless you have specified a configuration file with the OutputDB= option, the program displays a listing for a default database configuration file. Answer y to accept its specifications, or type n and press Enter to start building a configuration file interactively. See CONFIG for details on specifying a database configuration file.

Would you like an error file? [n] Specify y if you want the one-line error messages issued by the catDB program when it encounters a problem sent to a file in the directory in which you are using the program. Press Enter to accept the default condition in which no error file is generated.

Error file name = [OutputDBname.err] If you answer y to the preceding prompt, the program proposes the default error file name OutputDBname.err in which the name of your database configuration (.bdb) file name is substituted for OutputDBname. Press Enter to accept this name. To specify a different name for the error file, type it in and press the Enter key. If during database creation the program encounters a compound that is invalid, presents problems with conformational model generation, or has a duplicate name, it issues a one-line error message and records it in the error file you specify. If no problems are encountered, the error file is empty.
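Since the error file is empty when no problems occur and otherwise holds one line per problem, a build script can summarize it after catDB finishes. The following is a minimal sketch; the function name check_errfile and its messages are hypothetical, and it relies only on the one-line-per-problem behavior described above.

```shell
# Sketch only: summarize a catDB error file after a build.
# Returns 0 when the file is empty, 1 when problems were logged, 2 when it is missing.
check_errfile() {
    # $1 = error file named with errFile= or at the interactive prompt
    if [ ! -e "$1" ]; then
        echo "no error file named $1"
        return 2
    fi
    if [ -s "$1" ]; then
        # one-line messages, so the line count is the number of problems
        echo "$(wc -l < "$1" | tr -d ' ') problem(s) logged in $1"
        return 1
    fi
    echo "no problems logged in $1"
}

# Example use (file name is a placeholder):
#   check_errfile leads_db.err
```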

Do you want to specify a property dictionary (.bpd) file? [n] If you answer n to this prompt, the program creates a default 1D property database component containing only the default properties, compound names and molecular weights. Answering y evokes the next prompt.

Please input .bpd file name: Use this option if you want your new database or the target database to which you are appending data to have the 1D property definitions contained in the property dictionary file you specify. The resulting database will contain the union of the default properties (name and molecular weight), the properties defined in the specified property dictionary file, and the properties in the databases referred to in your source extended spreadsheet (.esp) file.

Do you want to embellish the existing conformational models? [n] To retain the conformational models in the source extended spreadsheet file and suppress conformational model generation, accept the default no. If you answer y to specify conformational analysis, the catDB program discards the conformational models in the .esp file and all unregistered conformers in CPD files, and then attempts to generate the maximum number of conformers that you specify in response to the prompt below.

Maximum number of conformers per compound. [100] To accept the default maximum shown in square brackets, press Enter. Otherwise, type a value and press the Enter key. Note that if the conformational space can be adequately covered by fewer than the number of conformers specified for the MaxConfs= option, the program displays a message with the compound's name and the number of conformers actually generated for its model.

Skip generation of property data? [n] To generate 1D property data for all compounds accept the default by answering no. To suppress the generation of 1D property data, answer yes.

Skip generation of 2D indexes? [n] : To generate 2D index information to expedite searching for 2D data, accept the default by answering no. To suppress the generation of 2D indexes, answer yes. (This prompt is displayed only if the configuration file specifies 2D indexes.)

Skip generation of 3D indexes? [n] : To generate 3D index information to expedite searching for 3D data, accept the default by answering no. To suppress the generation of 3D indexes, answer yes. (This prompt is displayed only if the configuration file specifies 3D indexes.)

Are you appending data to an existing database? [n] : If you are using the CREATE command to build a new database from an .esp file, accept the default n. If you are using the CREATE command to add data to a database that you own, type y and press the Enter key. Your specification influences how the catDB program interprets your command statement: specifying n for no causes the program to build a new database from the data in your extended spreadsheet (.esp) source file and give the database the extended spreadsheet source file's name unless you specify a different name with the OutputDB= option. Specifying y causes the program to add data from your source .esp file to the end of the target database you specify with the OutputDB= option. By default, the program assigns crefs (unique internal identification numbers) to each compound added from the source .esp file, starting with a cref value that is greater than the largest cref value in the existing database.

Setting : OptionsSettings

OK? [y] This prompt contains the names of options and the settings you have specified for them. To accept your settings and continue with building a new database or appending data to a target database, answer yes. To reject these settings and stop catDB, answer no. The program halts and displays the Build aborted message before it returns control to the UNIX command line.

MERGE

MERGE combines two or more Catalyst databases.

Usage: catDB MERGE outputDBname.bdb DBlist=inputDB1name.bdb, inputDB2name.bdb... [errFile=errorfilename] [PropDict=propertydictionaryname.bpd] [APPEND] [ExistingConfs=Use|Discard] [BEST|FAST] [MaxConfs=n] [ConfBuffer=n] [No1D] [No2DIndex] [No3DIndex] [AllowNFS]

The MERGE command is intended for combining two or more related databases created with the catDB program to construct a single database containing the information from all of them.

outputDBname.bdb For outputDBname.bdb substitute the name of the configuration (.bdb) file for the database resulting from merging inputDB1name.bdb, inputDB2name.bdb, and so on. When you are combining two databases, you can specify either inputDB1name.bdb or inputDB2name.bdb for outputDBname.bdb, and the contents of the resulting database overwrite the one with the same name.

inputDB1name.bdb For inputDB1name.bdb substitute the name of the configuration (.bdb) file for the database you want to combine with the other input databases you specify. When combining two or more separate databases, the input database files must have disjoint crefs; that is, no cref in one file can also be in another. (A cref is the unique internal number for each compound in a database.)

inputDB2name.bdb For inputDB2name.bdb substitute the name of the configuration file for a database you want to combine with the others you specify.
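To combine several small databases two at a time, a script can fold each one into an accumulating database, reusing the first input as the output as permitted above. The sketch below is an assumption-laden illustration: the function name merge_into is hypothetical, the DBlist= value is written as a comma-separated list without spaces so that the shell passes it as one argument (check the exact separator expected at your installation), and the inputs are assumed to have been built with disjoint crefs.

```shell
# Sketch only: merge further databases into an accumulating one, two at a time.
merge_into() {
    # $1 = accumulating .bdb (output and first input); remaining arguments = other .bdb files
    merge_out=$1
    shift
    for merge_db in "$@"; do
        # stop at the first failed merge so the problem database is easy to identify
        catDB MERGE "$merge_out" "DBlist=$merge_out,$merge_db" || return 1
    done
}

# Example use (file names are placeholders):
#   merge_into all.bdb part2.bdb part3.bdb
```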


You can merge databases that contain 1D, 2D, and 3D data, 1D data only, 2D data only, 3D data only, or some other combination of the three. If the output database has 1D data, so must both of the input databases. The same applies for 2D and 3D data. The 1D properties in the two databases being combined must have the same data types. For example, if property A has a string data type in one database but an integer data type in the other database, catDB issues an error message and halts execution of the MERGE command. The database that results from the MERGE command contains the union of the properties in the two input databases.

Each compound in a database to be merged must have a unique name. If the same compound name exists in both databases being merged, the catDB program puts the first one encountered in outputDBname.bdb and displays a message on the screen for each duplicate thereafter. If you used the [errFile=errorfilename] option, the message is also sent to the error file you specified.

The feature dictionaries (.chm files) belonging to each input database must be identical. If they are not, the catDB program issues an error message and stops combining the two databases.
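Before starting a long merge, it can be worth confirming that the .chm dictionary files of the input databases match. The following is a minimal sketch that treats "identical" as byte-for-byte equality, which is an assumption about how catDB compares them; the function name same_dictionary and the example file names are hypothetical.

```shell
# Sketch only: succeed when two .chm files have exactly the same contents.
same_dictionary() {
    # $1, $2 = paths to the .chm files of the two input databases
    cmp -s "$1" "$2"
}

# Example use (paths are placeholders):
#   same_dictionary db1.chm db2.chm || echo "dictionaries differ; MERGE is expected to stop"
```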

CPD

CPD creates a database from Catalyst .cpd compound files.

Usage: catDB CPD CPDfilenames outputDBname.bdb [PropDict=propertydictionaryname.bpd] [APPEND] [ExistingConfs=Use|Discard] [BEST|FAST] [MaxConfs=n] [ConfBuffer=n] [No1D] [No2DIndex] [No3DIndex] [StartCref=n] [StartAfter=cmpd] [StopAfter=cmpd] [errData=errordatafilename] [errFile=errorfilename] [AllowNFS]

CPDfilenames For CPDfilenames substitute the name of a file you have created that contains a list of the names of the individual .cpd files from which to generate your database. Each .cpd file name should appear on a separate line and precisely match the name of the file it represents. File names should be relative to the current directory and should include the .cpd suffix. Each .cpd file in the list of files should describe one compound; that is, the file should consist of a single topology and zero or more conformers.

outputDBname.bdb For outputDBname.bdb substitute the name of a configuration (.bdb) file you have created with the CONFIG command. If you responded y to the CONFIG command's prompt, Database has 1D data?, and you do not include the PropDict= option in your command statement, the catDB program builds a default 1D database that includes only the default 1D properties, name and molecular weight, and their associated values. The name assigned to each compound is the name of the .cpd file, without a path name or extension. For example, a compound created from .cpd file /home/joe/jklmn.cpd is named jklmn. Note that although the name of the compound is also stored in the file, it is ignored. This convention makes it easier to rename a compound by just renaming the file, with no need to edit it.
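The list file and the compound names derived from it can be produced with standard shell tools. The sketch below assumes the compound files sit in the current directory; the function names make_file_list and compound_name and the list file name are illustrative, and the same helpers would apply to lists of .mol or .tpl files.

```shell
# Sketch only: print the files with a given suffix, one name per line.
make_file_list() {
    # $1 = suffix without the dot (for example cpd)
    for list_f in *."$1"; do
        if [ -e "$list_f" ]; then
            printf '%s\n' "$list_f"
        fi
    done
}

# Sketch only: the name catDB is described as assigning, that is, the file
# name without its path or extension.
compound_name() {
    basename "$1" ".${1##*.}"
}

# Example use (cpdfiles.lst and mydb.bdb are placeholder names):
#   make_file_list cpd > cpdfiles.lst
#   catDB CPD cpdfiles.lst mydb.bdb
#   compound_name /home/joe/jklmn.cpd    # prints jklmn
```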


MOL

MOL builds a database from standard MDL .mol compound files.

Usage: catDB MOL MOLfilenames outputDBname.bdb

MOLfilenames For MOLfilenames substitute the name of a file you have created that contains a list of the names of the individual .mol files from which to generate your database. Each file name should appear on a separate line and precisely match the name of the file it represents. File names should be relative to the current directory and should include the .mol suffix. Each .mol file in the list of files should describe one compound; that is, the file should consist of a single topology and a 2D or 3D conformer.

outputDBname.bdb For outputDBname.bdb substitute the name of a configuration (.bdb) file you created with the CONFIG command. If you responded y to the CONFIG command's prompt, Database has 1D data?, and you do not include the PropDict= option in your command statement, the catDB program builds a default 1D database that includes only the default 1D properties, name and molecular weight, and their associated values. The name assigned to each compound is the name of the .mol file, without a path name or extension. For example, a compound created from .mol file /home/joe/jklmn.mol is named jklmn. Note that although the name of the compound is also stored in the file, it is ignored. This convention makes it easier to rename a compound by just renaming the file, with no need to edit it.


SD

SD builds a database from an MDL .sd compound file.

Usage: catDB SD SDfilename.sd outputDBname.bdb [PropDict=propertydictionaryname.bpd] [APPEND] [ExistingConfs=Use|Discard] [BEST|FAST] [MaxConfs=n] [ConfBuffer=n] [No1D] [No2DIndex] [No3DIndex] [StartCref=n] [StartAfter=cmpd] [StopAfter=cmpd] [errData=errordatafilename] [errFile=errorfilename] [AllowNFS]

SDfilename.sd For SDfilename.sd substitute the name of a standard MDL .sd file consisting of a set of compound descriptions in MOL file format with each compound description delimited by $$$$. The .sd file can contain 1D property data.

outputDBname.bdb For outputDBname.bdb substitute the name of a configuration (.bdb) file you have created with the CONFIG command. If you responded y to the CONFIG command's prompt, Database has 1D data?, and you do not include the PropDict= option in your command statement, the catDB program builds a default 1D database that includes the default 1D properties, name and molecular weight, and their associated values. The default 1D database also includes all properties in the .sd file that are defined in $CATALYST_CONF/Biocad.bpd or $CATALYST_CONF/Corporate.bpd plus the values for these properties. If a 1D property from the source .sd file is not defined in these property dictionary files or one you specify, the program issues a message and skips over it to continue processing with the next property. When the .sd file contains no values for a particular property, the catDB program assigns it a NULL value; that is, space is reserved for these values and you can enter them by editing a spreadsheet in Catalyst. For additional information on editing 1D data, see "Changing Values in a Spreadsheet."
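Because each compound description in an .sd file is delimited by $$$$, counting those delimiter lines gives a quick estimate of how many compounds a build will process. This is a small sketch with a hypothetical function name; it assumes every record, including the last, ends with a $$$$ line.

```shell
# Sketch only: count compound records in an .sd file by counting $$$$ lines.
count_sd_records() {
    # $1 = .sd file; grep -c prints the number of matching lines
    grep -c '^\$\$\$\$' "$1"
}

# Example use (vendor.sd is a placeholder name):
#   n=$(count_sd_records vendor.sd)
#   [ "$n" -gt 5000 ] && echo "consider building smaller databases and merging them"
```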


SMILES

SMILES creates a database from SMILES (Simplified Molecular Input Line Entry System) files.

Usage: catDB SMILES SMILESfilename.smi outputDBname.bdb [PropDict=propertydictionaryname.bpd] [APPEND] [ExistingConfs=Use|Discard] [BEST|FAST] [MaxConfs=n] [ConfBuffer=n] [No1D] [No2DIndex] [No3DIndex] [StartCref=n] [StartAfter=cmpd] [StopAfter=cmpd] [errData=errordatafilename] [errFile=errorfilename] [AllowNFS]

SMILESfilename.smi For SMILESfilename.smi substitute the name of a SMILES file consisting of one or more lines of standard SMILES or Catalyst stereochemical SMILES input, each corresponding to a compound's topology. The name of each compound should follow its SMILES specification, separated by white space (a blank or tab character) on the same line. (See "Writing Catalyst's Stereochemically Extended SMILES" for details on the file format.)

outputDBname.bdb Substitute the name of a configuration file you have created with the CONFIG command for outputDBname.bdb. If you responded y to the CONFIG command's prompt, Database has 1D data?, and you do not include the PropDict= option in your command statement, the catDB program builds a default 1D database that includes only the default 1D properties, name and molecular weight, and their associated values.
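A SMILES input file therefore has one compound per line, for example a line reading "CCO ethanol" or "c1ccccc1 benzene". A simple pre-check, sketched below, flags lines that lack a name after the SMILES string. The function name check_smi is hypothetical, the check does not validate the SMILES itself, and treating blank lines as errors is an assumption.

```shell
# Sketch only: report lines of a .smi file that do not have both a SMILES
# string and a compound name separated by white space.
check_smi() {
    # $1 = .smi file; exit status is non-zero when any line is flagged
    awk '
        NF < 2 { printf "line %d: expected SMILES and name\n", NR; bad = 1 }
        END    { exit bad + 0 }
    ' "$1"
}

# Example use (compounds.smi is a placeholder name):
#   check_smi compounds.smi && catDB SMILES compounds.smi mydb.bdb
```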


SPST_CREATE

SPST_CREATE constructs a database from a Catalyst spreadsheet (.spst) file.

Usage: catDB SPST_CREATE spreadsheetname.spst outputDBname.bdb directory fileformat [PropDict=propertydictionaryname.bpd] [APPEND] [ExistingConfs=Use|Discard] [BEST|FAST] [MaxConfs=n] [ConfBuffer=n] [No1D] [No2DIndex] [No3DIndex] [StartCref=n] [errFile=errorfilename] [AllowNFS]

spreadsheetname.spst For spreadsheetname.spst substitute the name of your spreadsheet (.spst) file exported by Catalyst via its Export... command.

outputDBname.bdb Substitute the name of a configuration (.bdb) file you have created with the CONFIG command for outputDBname.bdb. If you responded y to the CONFIG command's prompt, Database has 1D data?, and you do not include the PropDict= option in your SPST_CREATE command statement, the catDB program builds a 1D database that includes any properties in the spreadsheet that are also in the files $CATALYST_CONF/Biocad.bpd or $CATALYST_CONF/Corporate.bpd.

directory For directory substitute the UNIX directory in which the files containing the 2D/3D data for each compound in the spreadsheet reside. To specify the current directory type a . (period).

fileformat For fileformat substitute one of the permissible file formats: smi (SMILES), cpd (Catalyst compound), tpl (Catalyst topological), mol (MDL MOL). For each compound in the spreadsheet, there must be a corresponding file in the directory you specify. The name of the compound must correspond to the prefix of the file name, and the suffix of the file name must consist of the file format; for example, for compounds called CMPD1 and CMPD2 in the source spreadsheet, the SMILES files CMPD1.smi and CMPD2.smi should reside in the specified directory if you specified smi as the file format.

[PropDict=propertydictionaryname.bpd] For propertydictionaryname.bpd substitute the name of a property dictionary (.bpd) file with property definitions that you want in your output database. This option places the union of the default properties (name and molecular weight) along with their respective values and the properties defined in the property dictionary file in your output database. All properties referenced in the .spst file must be defined in $CATALYST_CONF/Biocad.bpd, $CATALYST_CONF/Corporate.bpd, or propertydictionaryname.bpd.
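The requirement that every compound in the spreadsheet has a matching structure file in the directory can be checked before running SPST_CREATE. The sketch below assumes you already have the compound names in a plain text file, one per line; how to obtain that list from the .spst file is not covered here, and the function name missing_structure_files is hypothetical.

```shell
# Sketch only: list the structure files that SPST_CREATE would expect but
# that are not present in the given directory.
missing_structure_files() {
    # $1 = text file of compound names, $2 = directory, $3 = file format (smi, cpd, tpl, mol)
    while IFS= read -r cmpd_name; do
        [ -n "$cmpd_name" ] || continue          # skip blank lines
        [ -e "$2/$cmpd_name.$3" ] || printf '%s\n' "$cmpd_name.$3"
    done < "$1"
}

# Example use (names.txt is a placeholder; . is the current directory):
#   missing_structure_files names.txt . smi
```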


TPL

TPL creates a database from Catalyst .tpl compound files.

Usage: catDB TPL TPLfilenames outputDBname.bdb [PropDict=propertydictionaryname.bpd] [APPEND] [ExistingConfs=Use|Discard] [BEST|FAST] [MaxConfs=n] [ConfBuffer=n] [No1D] [No2DIndex] [No3DIndex] [StartCref=n] [StartAfter=cmpd] [StopAfter=cmpd] [errData=errordatafilename] [errFile=errorfilename] [AllowNFS]

TPLfilenames For TPLfilenames substitute the name of a file you have created that contains a list of the names of the individual .tpl files from which to generate your database. Each .tpl file name should appear on a separate line and precisely match the name of the file it represents. File names should be relative to the current directory and should include the .tpl suffix. Each .tpl file in the list of files should describe one compound; that is, the file should consist of a single topology and zero or more conformers (see "TPL File Format Description").

outputDBname.bdb For outputDBname.bdb substitute the name of a configuration (.bdb) file you have created with the CONFIG command. If you responded y to the CONFIG command's prompt, Database has 1D data?, and you do not include the PropDict= option in your command statement, the catDB program builds a default 1D database that includes only the default 1D properties, name and molecular weight, and their associated values. The name assigned to each compound is the name of the .tpl file, without a path name or extension. For example, a compound created from .tpl file /home/joe/jklmn.tpl is named jklmn.

Note that although the name of the compound is also stored in the file, it is ignored. This convention makes it easier to rename a compound by just renaming the file, with no need to edit it.
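The naming convention can be illustrated with a short Python sketch. It is not part of catDB; the function names are hypothetical, and skipping blank lines in the list file is an assumption.

```python
from pathlib import PurePosixPath


def compound_name(tpl_path):
    """Name a compound after its .tpl file: no directory, no extension."""
    return PurePosixPath(tpl_path).stem


def compound_names_from_list(list_text):
    """Map each .tpl file named in a TPLfilenames list (one per line) to its compound name."""
    names = {}
    for line in list_text.splitlines():
        entry = line.strip()
        if not entry:
            continue  # assumption: blank lines in the list file are ignored
        if not entry.endswith(".tpl"):
            raise ValueError(f"list entry lacks the .tpl suffix: {entry!r}")
        names[entry] = compound_name(entry)
    return names
```

With this sketch, compound_name("/home/joe/jklmn.tpl") yields jklmn, matching the example above.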

For detailed information on individual options, see the description under each option's heading (for example, PropDict=, ExistingConfs=, and MaxConfs=).


Commands for Databases

BPD

BPD displays a database's property definitions as text. You can use this text to build and edit a template property dictionary (.bpd) file for use with the catDB program when creating a database, adding properties, and deleting properties.

Usage: catDB BPD DBname.bdb

DBname.bdb For DBname.bdb substitute the name of the configuration (.bdb) file of the database for which you want to display property definitions.

A sample display resulting from executing the BPD command appears below.

You can copy the displayed text from your screen into a text editor or you can redirect the output of the command into a file by executing

catDB BPD DBname.bdb > propertydictionaryname.bpd

from the UNIX command line. Then you can customize the property definitions to suit your needs, save the customized text as a file ending with a .bpd (for BioCAD property dictionary) file extension, and use this file to define the properties of a new database when you create it with one of the catDB commands for building databases.

Each line in a .bpd file must contain the following attribute fields separated by white space: property name, property type, schema, reference, special type, and description. Only the description field is permitted to have embedded white space. If the first character on a line is ! (exclamation point), the catDB program treats the line as a comment. Specifications for each property attribute field follow:

Property Name. Use the alphanumeric characters A through Z, a through z, 0 through 9, and the _ (underscore) character to specify a property name. A property name must start with an alphabetic character. A property name cannot begin with the underscore character nor can it contain embedded spaces. Many names are reserved, and attempts to use them will elicit error messages. You cannot use any of these words (in upper- or lowercase characters) as a property name. A list of reserved words follows:

abs, acos, add, after, all, alter, and, any, arg, array, array1, array2, array3, as, asin, assertion, atan, authorization, automatic, avg, before, begin, between, buffersize, by, bytestring, cascade, case, cast, catalog, ceiling, char, character, check, cluster, coalesce, column, comment, commit, consistency, constraint, constraints, contain, contains, continue, corresponding, cos, cosh, count, create, current, csstring, cvb, cvbo, cvbs, cvco, cvcs, cvd, cvda, cvdp, cvdu, cve, cvi, cvl, cvli, cvlist, cvr, cvre, cvs, cvset, cvsm, cvti, cvts, cvu, database, date, datetime, day, dbfile, dec, decimal, default, delete, depth, desc, describe, displacement, distinct, div, do, does, domain, double, drop, duration, each, else, end, enumerated, escape, exact, except, exists, exp, expand, extend, false, fetch, file, fix, fixed, first, float, floor, for, forein, fraction, from, function, geq, grant, group, having, help, identified, immediate, in, index, inf, inherit, inner, insert, int, integer, intersect, interval, into, is, key, language, last, lb1, lb2, lb3, length, leq, level, like, list, ln, local, lock, log, log2, long, match, max, maxof, median, min, minof, minute, mod, modify, module, month, move, national, natural, neq, new, next, none, normalize, not, ntst, ntsta, null, nullif, numeric, object, of, off, on, only, option, or, order, outer, partial, precision, prefix, prepare, preserve, primary, privileges, procedure, program, public, random, read, real, reference, references, referencing, remove, rename, replace, revoke, rollback, row, rows, schema, second, select, sequence, session, set, similar, sin, sinh, size, smallint, some, sql, sqrt, statistics, string, substring, sum, synonym, system, table, tan, tanh, temporary, then, time, timestamp, to, transaction, trigger, true, truncate, tst, tsta, tuple, ub1, ub2, ub3, union, unique, uniquefileid, units, unlock, unset, update, user, using, value, values, varchar, varying, view, when, where, with, work, write, year, zero

Property Data Type. Specify one of STRING for character data, FLOAT for floating-point number data, INTEGER for whole number data, or DATE for month-day-year data. An example of each property data type follows:

STRING R.B.Woodward
FLOAT 7.2
INTEGER 1043
DATE Jul-3-1993

Schema. The schema determines the general data layout of your database and should be specified according to the following guidelines:

  1. If you think you will need to add or delete a particular property once your database is built, specify SPECIFIC, because you cannot add or delete a universal property after you have created a database.
  2. If all or most of the compounds in the database are likely to have data for the property, specify UNIVERSAL. If fewer than half of the compounds are likely to have data for the property, specify SPECIFIC; in that case the universal schema uses much more storage space than the specific schema. If more than half of the compounds are likely to have data for the property, the specific schema uses more space.
  3. If much less than 50 percent of the compounds will have data for the property and disk space is a concern, specify SPECIFIC.
  4. When performance is an issue, searching and retrieving universal property data is, in general, faster only if more than 50 percent of the compounds have values.
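The schema guidelines reduce to a simple rule of thumb, sketched below in Python. This is an illustration, not catDB behavior: the function name is hypothetical, and treating exactly 50 percent coverage as SPECIFIC is an assumption, because the guidelines do not address that case.

```python
def suggest_schema(fraction_with_data, may_add_or_delete_later=False):
    """Suggest UNIVERSAL or SPECIFIC for one property.

    fraction_with_data: expected share of compounds (0.0 to 1.0) that will
    have a value for the property.
    may_add_or_delete_later: True if the property might be added to or
    deleted from the database after it is built.
    """
    if not 0.0 <= fraction_with_data <= 1.0:
        raise ValueError("fraction_with_data must be between 0.0 and 1.0")
    if may_add_or_delete_later:
        # Universal properties cannot be added or deleted after creation.
        return "SPECIFIC"
    # Assumption: exactly half falls on the SPECIFIC side.
    return "UNIVERSAL" if fraction_with_data > 0.5 else "SPECIFIC"
```

For example, a property expected for 90 percent of the compounds comes out as UNIVERSAL, unless you expect to delete it later, in which case the sketch returns SPECIFIC.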

NOT_SAVED is a schema specification that is allowed only in the property dictionary file Biocad.bpd in $CATALYST_CONF. NOT_SAVED indicates a reserved property name that cannot be saved in a StockroomDB or any other database. Such reserved names include: Error, Est, 2D, 3D, Color. Do not use this schema specification.

Reference. Specify QUICK_REF to build an index to expedite data retrieval when the property is a searching criterion specified with = (equals function), < (less than function), ≤ (less than or equal function), > (greater than function), and ≥ (greater than or equal function) in the Edit 1D Hypothesis dialog box in Catalyst. Note that in the current implementation, for string properties, QUICK_REF is permitted only for the compound name property. Note also that QUICK_REF requires a great deal of disk space. SLOW_REF suppresses building an index and provides slower response time when the property is part of a query, but it also requires less disk space because of the lack of an index. SLOW_REF is appropriate for string fields that will usually be searched with the approximately-equals (substring search) operator. The same property name can be defined with QUICK_REF for one database and SLOW_REF for another.

Special Type. Certain properties have a special defined meaning for Catalyst; for example, they are computed automatically (molecular weight), or their values are used as feedback for the system. Legal special types are listed in the file Biocad.bpd in $CATALYST_CONF.


The only other allowable specification for the Special Type field is NULL; use it when you define a property that does not have a special meaning for Catalyst. At most, one property name can refer to a given special type.

Description. For this attribute, specify a short textual definition of the property. Only the description field can contain white space (space characters and tab characters).
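The field rules above can be checked with a small parser before a .bpd file is passed to catDB. The Python sketch below is illustrative and is not the parser catDB uses. It assumes that keywords appear in uppercase as written in this section, that blank lines are ignored, and that the special type field is passed through unchecked. RESERVED_SAMPLE holds only a few entries from the reserved-word list, and the LogP property in the usage note is hypothetical.

```python
import re
from dataclasses import dataclass

DATA_TYPES = {"STRING", "FLOAT", "INTEGER", "DATE"}
SCHEMAS = {"UNIVERSAL", "SPECIFIC"}  # NOT_SAVED is allowed only in Biocad.bpd
REFERENCES = {"QUICK_REF", "SLOW_REF"}
# Illustrative subset of the reserved-word list, not the complete list.
RESERVED_SAMPLE = {"abs", "date", "float", "select", "value", "year"}
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


@dataclass
class PropertyDefinition:
    name: str
    data_type: str
    schema: str
    reference: str
    special_type: str
    description: str


def parse_bpd_line(line):
    """Parse one .bpd line; return None for comment lines and blank lines."""
    if line.startswith("!") or not line.strip():
        return None
    # Split on white space at most five times so that the sixth field,
    # the description, keeps its embedded white space.
    fields = line.split(None, 5)
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, found {len(fields)}")
    name, data_type, schema, reference, special_type, description = fields
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid property name: {name!r}")
    if name.lower() in RESERVED_SAMPLE:
        raise ValueError(f"reserved word used as property name: {name!r}")
    if data_type not in DATA_TYPES:
        raise ValueError(f"unknown property type: {data_type!r}")
    if schema not in SCHEMAS:
        raise ValueError(f"unknown schema: {schema!r}")
    if reference not in REFERENCES:
        raise ValueError(f"unknown reference: {reference!r}")
    return PropertyDefinition(
        name, data_type, schema, reference, special_type, description.strip()
    )
```

A line such as "LogP FLOAT SPECIFIC SLOW_REF NULL Octanol-water partition coefficient" parses into six fields, with the whole trailing phrase kept as the description.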

ADD_PROPERTY

ADD_PROPERTY appends 1D properties to a database.

Usage: catDB ADD_PROPERTY outputDBname.bdb propertydictionaryname.bpd

outputDBname.bdb For outputDBname.bdb specify the name of the configuration (.bdb) file for the database to which you want to add properties.

propertydictionaryname.bpd For propertydictionaryname.bpd specify the name of the property dictionary file containing the definitions of the properties you want to add to your database. Note that you can add only those properties with a SPECIFIC schema type; you cannot add to an existing database properties whose schema is UNIVERSAL. For additional information on schema type, see the description of the BPD command.

You can use the catDB BPD command to redirect another database's property definitions to a file with the .bpd (BioCAD property dictionary) extension, edit that file with a text editor so that only the definitions of the properties you want to add remain, and then supply that file as the property dictionary argument to the catDB ADD_PROPERTY command.

After you have successfully executed the ADD_PROPERTY command, the catDB program displays a message listing the names of the properties that were appended to your database. You can also add properties to a database (that you own) in Catalyst by selecting the Edit Property Dictionary... command from the Databases menu in the Stockroom. See "Edit Property Dictionary" for details.

CREATE_1D

CREATE_1D constructs a 1D database for a corresponding 2D/3D database.

Usage: catDB CREATE_1D outputDBname.bdb [PropDict=propertydictionaryname.bpd] [errFile=errorfilename]

outputDBname.bdb For outputDBname.bdb specify the name of the configuration (.bdb) file for the database for which you want to create a 1D database.

For detailed information on individual options, see the description under each option's heading (for example, PropDict=).

REPAIR_DB

REPAIR_DB reassigns crefs, the unique internal identification numbers for compounds in a database. Use this command to shift crefs to make them disjoint so that you can combine two or more databases with the MERGE command.

Usage: catDB REPAIR_DB DBname.bdb [StartCref=n] [No1D] [No2DIndex] [No3DIndex]

DBname.bdb For DBname.bdb substitute the name of the configuration file for the database whose crefs you want to shift.

For detailed information on individual options, see the description under each option's heading (for example, No2DIndex and No3DIndex).

The REPAIR_DB command is useful for reassigning crefs when you have databases that you want to combine but cannot because their crefs are not disjoint (and thus would not be unique after you merged the databases). A typical set of steps for "repairing" such databases follows:

  1. Remove any 1D data with the DELETE_DB_1D command.
  2. Execute the REPAIR_DB command using the No1D, No2DIndex (if the database has a 2D index), and No3DIndex (if the database has a 3D index) options. If you need to start cref numbering with a value other than 1, use the StartCref= option.
  3. Use the CREATE_1D, SD_UPDATE, SPST_UPDATE, or UPDATE commands to restore 1D data so that it is synchronized with the new cref numbers.
  4. Execute the RECALC command to recompute 2D and 3D indexes so that they are synchronized with the new cref numbers.
  5. Use the MERGE command to combine the databases.
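Steps 1 through 4 can be assembled into command statements with a short script. The Python sketch below is illustrative only: the function names are hypothetical, and it chooses CREATE_1D for step 3 although SD_UPDATE, SPST_UPDATE, or UPDATE may suit your data better. It only builds the statements and does not run them; note that RECALC prompts for confirmation when it is run. The MERGE statement for step 5 is left out because its syntax is described elsewhere.

```python
import shlex


def next_start_cref(highest_cref_in_other_db):
    """Pick a StartCref= value just above the highest cref of the other database.

    The highest cref is reported by the INFO command with the Detail option.
    """
    return highest_cref_in_other_db + 1


def repair_commands(db, host, server, db_id, start_cref=1,
                    has_2d_index=True, has_3d_index=True):
    """Build the catDB statements for steps 1 through 4 as argument lists."""
    repair = ["catDB", "REPAIR_DB", db]
    if start_cref != 1:
        repair.append(f"StartCref={start_cref}")
    repair.append("No1D")
    if has_2d_index:
        repair.append("No2DIndex")
    if has_3d_index:
        repair.append("No3DIndex")
    return [
        ["catDB", "DELETE_DB_1D", host, server, str(db_id)],  # step 1
        repair,                                               # step 2
        ["catDB", "CREATE_1D", db],                           # step 3 (one choice)
        ["catDB", "RECALC", db],                              # step 4
    ]


if __name__ == "__main__":
    # Hypothetical database, host, server, and identification number.
    start = next_start_cref(5000)
    for argv in repair_commands("libB.bdb", "dbhost", "dbserver", 17, start):
        print(shlex.join(argv))
```

Reviewing the printed statements before typing them at the UNIX command line keeps the sequence in the order given above.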


Commands for Updating Databases

RECALC

Use the RECALC command to improve performance of the database after alterations. For example, use RECALC to recompute the 2D and 3D indexing structures for a Catalyst database after you've made a lot of additions or deletions. Or, use the RECALC command to recalculate 1D indices after changing the property dictionary.

Usage: catDB RECALC DBname.bdb [AllowNFS]

DBname.bdb For DBname.bdb substitute the name of the configuration file of the database for which you want to recompute indexing structures.

When you invoke this command statement, the program displays a message informing you that 2D and 3D index files (with names of the form DBname.IDnumber.2bdb and DBname.IDnumber.3bdb) exist and prompts you to confirm their deletion or cancel the recalculation. Type y for yes or n for no and press the Enter key after each prompt. The catDB program must delete the 2D and 3D index files before recomputing the indexing structures. This process is time-consuming, so the prompts give you the opportunity to cancel it, if it is currently inconvenient.

SD_UPDATE

SD_UPDATE replaces 1D data in a database with new values from an MDL .sd compound file.

Usage: catDB SD_UPDATE SDfilename.sd outputDBname.bdb [PropDict=propertydictionaryname.bpd] [errData=errordatafilename] [errFile=errorfilename]

SDfilename.sd For SDfilename.sd substitute the name of a standard MDL .sd file containing 1D property data and a set of compound descriptions in MOL file format with each compound description delimited by $$$$.

outputDBname.bdb For outputDBname.bdb specify the name of the configuration file for the database containing the data to be replaced by the 1D data in the .sd file.

For detailed information on individual options, see the description under each option's heading (for example, PropDict=).

The SD_UPDATE command adds only the 1D property data in the .sd file to the target database. The command skips over all compound data that does not exist in the 2D/3D portion of the database. (That is, if for some reason a compound was not built due to an error in structure generation, the compound is not in the database and the SD_UPDATE command skips it.) If you use the errData= option, all records that are skipped are written to the error data file you specify. Definitions for all 1D properties must exist in the 1D database component or in a property dictionary (.bpd) file specified with the PropDict= option.

SPST_UPDATE

SPST_UPDATE replaces 1D data in a database with new values from a spreadsheet (.spst) file.

Usage: catDB SPST_UPDATE spreadsheetname.spst outputDBname.bdb [PropDict=propertydictionaryname.bpd]

spreadsheetname.spst For spreadsheetname.spst substitute the name of the file containing the names of compounds and associated "dirty" 1D values (values in cells that Catalyst turns gray once you edit them) with which you want to update a database.

outputDBname.bdb For outputDBname.bdb specify the name of the configuration file for the database containing the data to be replaced by the data in your spreadsheet file.

The SPST_UPDATE command is similar to the Commit Changes to Database command in Catalyst, but changes 1D data in only one database at a time. NULL values (those for which an empty cell exists) in the spreadsheet file do not overwrite values in the database; that is, the values in the database are unchanged by NULL values in the spreadsheet file. To use the SPST_UPDATE command you must build a spreadsheet in Catalyst and export it as a spreadsheet (.spst) file. For additional information on editing 1D data, see "Changing Values in a Spreadsheet."

For additional information, see "PropDict=".

UPDATE

UPDATE replaces 1D data in a database with new values from an extended spreadsheet (.esp) file derived from the database.

Usage: catDB UPDATE extendedspreadsheetname.esp

extendedspreadsheetname.esp For extendedspreadsheetname.esp substitute the name of the file containing the "dirty" 1D values (values in cells that Catalyst turns gray once you edit them) with which you want to update a database.

The UPDATE command is similar to the Commit Changes to Database command in Catalyst, but it changes 1D data in only one database at a time. NULL values (those for which an empty cell exists) in the extended spreadsheet file do not overwrite values in the database; that is, the values in the database are unchanged by NULL values in the extended spreadsheet file. To use the UPDATE command you must build a spreadsheet in Catalyst and export it as an extended spreadsheet (.esp) file. For additional information on editing 1D data, see "Changing Values in a Spreadsheet."

Commands for Deleting Compounds, Properties, and Databases

DELETE

DELETE removes the compounds listed in an extended spreadsheet (.esp) file from a database.

Usage: catDB DELETE extendedspreadsheetname.esp

extendedspreadsheetname.esp For extendedspreadsheetname.esp substitute the name of an extended spreadsheet (.esp) file containing the compounds that you want to remove from a database. Note that you must supply the .esp extension portion of your extended spreadsheet's file name when you use the DELETE command.

Follow this procedure to delete compounds and all their associated data from a database using an extended spreadsheet (.esp) file:

  1. In Catalyst prepare a spreadsheet that contains the compounds that you want to remove from a database and save it with the Save Report To Lab As Spreadsheet... command from the Data menu.
  2. With that spreadsheet selected on the shelf, select the Export... command from the Data menu to display the Export Data dialog box.
  3. To save your spreadsheet as an extended spreadsheet (.esp) file in your Catalyst directory, select Database spreadsheet file (ESP) under Export Type and then select Export.
  4. If the database from which you want to remove compounds is installed in an active session of Catalyst, you must either exit Catalyst or use the Deinstall Database... command from the Stockroom's Databases menu before you use the catDB DELETE command. Otherwise, the catDB program issues the messages, Cannot lock file for update, try again later and Cannot update conformer database, to inform you that the database is in use and cannot be revised.
  5. In your Catalyst directory, from the UNIX command line, type catDB DELETE followed by the name of your .esp file containing the compounds to be deleted, and press the Enter key.

When you execute the catDB DELETE command, for each compound in the .esp file, it removes the corresponding compound and all of its associated data from the database. When you next open a spreadsheet that was derived from the database before you deleted compounds from it (that is, the spreadsheet data is older than the changed data in the database), Catalyst displays an Alert message informing you that changes have been made to the source data and that the spreadsheet is no longer usable.

SPST_DELETE

SPST_DELETE removes the compounds listed in a spreadsheet (.spst) file from a target database that includes 1D data containing corresponding compound names.

Usage: catDB SPST_DELETE spreadsheetname.spst outputDBname.bdb

spreadsheetname.spst For spreadsheetname.spst substitute the name of a spreadsheet (.spst) file containing the names of the compounds that you want to remove from a database.

outputDBname.bdb For outputDBname.bdb substitute the name of the configuration (.bdb) file for the database from which you want to delete compounds. This database must contain 1D data that includes the names of the compounds to be deleted.

Follow this procedure to delete compounds and all their associated data from a database using a spreadsheet (.spst) file:

  1. In Catalyst prepare a spreadsheet that contains the compounds that you want to remove from a database and save it with the Save Report To Lab As Spreadsheet... command from the Data menu.
  2. With that spreadsheet selected on the shelf, select the Export... command from the Data menu to display the Export Data dialog box.
  3. To save your spreadsheet as a spreadsheet (.spst) file in your Catalyst directory, make certain that Catalyst spreadsheet file (SPST) under Export Type is selected and then select Export.
  4. If the database from which you want to remove compounds is installed in an active session of Catalyst, you must either exit Catalyst or use the Deinstall Database... command from the Stockroom's Databases menu before you use the catDB SPST_DELETE command. Otherwise, the program issues the messages, Cannot lock file for update, try again later and Cannot update conformer database, to inform you that the database is in use and cannot be revised.
  5. In your Catalyst directory, from the UNIX command line, type catDB SPST_DELETE followed by the name of your .spst file and the name of the .bdb file for the database containing the compounds to be deleted. Then press the Enter key.

When you execute the catDB SPST_DELETE command, for each compound in the .spst file, the catDB program removes the corresponding compound and all of its associated data from the target database. When you next open a spreadsheet derived from the database before you deleted compounds from it, Catalyst displays an Alert message informing you that changes have been made to the source data and that the spreadsheet is no longer usable.

DELETE_PROPERTY

DELETE_PROPERTY removes specified 1D properties and their associated data from a database.

Usage: catDB DELETE_PROPERTY outputDBname.bdb propertydictionaryname.bpd

outputDBname.bdb For outputDBname.bdb specify the name of the configuration (.bdb) file for the database from which you want to remove properties.

propertydictionaryname.bpd For propertydictionaryname.bpd specify the name of the property dictionary file containing the definitions of the properties you want to remove from your database. Note that you can delete only those properties with a SPECIFIC schema type; you cannot delete properties whose schema is UNIVERSAL. For more information on schema types, see the description of the BPD command.

Use the catDB BPD command to display a list of properties in the database and redirect the output of the command into a file by executing

catDB BPD DBname.bdb > propertydictionaryname.bpd

from the UNIX command line. Load the resulting text file into a text editor and edit the text so that only the definitions of the properties you want to delete remain, save the resulting text as a file with the .bpd (BioCAD property dictionary) extension, and supply this file as the property dictionary argument to the catDB DELETE_PROPERTY command.

After you have successfully executed the DELETE_PROPERTY command, the catDB program displays a message listing the names of the properties that were deleted from your database. You can also remove properties from a database (that you own) in Catalyst by selecting the Edit Property Dictionary... command from the Databases menu in the Stockroom. See "Edit Property Dictionary" for details.

DELETE_DB

DELETE_DB removes from disk all of the data files for the specified database.

Usage: catDB DELETE_DB DBname.bdb [No1D]

DBname.bdb For DBname.bdb substitute the name of the configuration (.bdb) file of the database for which you want to delete all data files.

For detailed information on this option, see No1D.

When you execute the DELETE_DB command, the catDB program displays a message to remind you that all the data in your database will be expunged and prompts you to confirm the deletion. (The only way you can recover a database that you have deleted is from a backup tape that was made sometime before you executed the DELETE_DB command. Thus, it is a good idea to check with your system administrator about backup schedules before you delete a database containing data that might be difficult to restore if you needed to.) If you type n for no and press the Enter key, the program halts and displays a message indicating that your database was not deleted. If you type y for yes and press the Enter key, the program displays messages advising you when each data file has been removed and when the deletion is complete. The DELETE_DB command does not delete the database's configuration (.bdb) file or the feature dictionary (.chm file). Use the UNIX rm (Remove) command to delete these files.

Note: You should always use the DELETE_DB command when you want to remove a database's data files; use the UNIX rm command to delete only the configuration (.bdb) and the feature dictionary (.chm) files. Attempting to delete data files manually with the UNIX rm command could result in unpredictable results and is not recommended. Only the DELETE_DB command and the DELETE_DB_1D command can remove a database's 1D data.

If you receive an error message that the program cannot lock the database for deletion, it means that it is currently in use in an active Catalyst session. Exit the Catalyst session or deactivate the database with the Deinstall command from the Databases menu in the Stockroom to enable the DELETE_DB command.

DELETE_DB_1D

DELETE_DB_1D removes the 1D property data from the specified database.

Usage: catDB DELETE_DB_1D hostmachinename servername ID

hostmachinename For hostmachinename substitute the name of the computer on which the 1D data resides. Use the INFO command to display the name of the host computer for the database's 1D data. See INFO for a description of this command and its output.

servername For servername substitute the name of the 1D data server. If you don't know the name of the 1D data server, use the INFO command to display it.

ID For ID substitute the database's identification number. To find out the identification number, use the INFO command.

When you execute this command, the program displays a message advising you that the 1D database is being deleted and another when the operation is complete.

INFO

INFO compiles and displays on the screen the following information about a Catalyst database: 1D database identification number and location, total number of compounds, 1D properties and their definitions, and conformer database location.

Usage: catDB INFO DBname.bdb [Detail] [No1D]

DBname.bdb For DBname.bdb substitute the name of the configuration file of the database whose contents you want to summarize.

Detail displays the following additional information about a database: lowest and highest cref value, total number of fragments, total number of conformers, average molecular weight, average number of conformers per fragment, and the name of the last compound in the database (which can be useful in restarting a database construction process that has terminated abnormally before completion).

If you want a paper copy of information about a database, you can redirect the output to a file by typing

catDB INFO DBname.bdb [Detail] >& filename

making appropriate substitutions for DBname.bdb and filename, and pressing the Enter key. Print this file by typing

enscript filename

and pressing the Enter key.

Executing the INFO command with the Detail option builds data files (info.unsat, info.tot, info.stereo, info.rot, info.mw, info.hetero, info.heavy, info.endo, info.confs, info.carbon) with which you can display graphs. To display the graph for an individual file, on the UNIX command line type

xg filename

substituting the name of the file for filename. Then press Enter. To display all graphs, on the UNIX command line type

foreach i (info.*)

and press Enter. Then at the ? prompt, type

xg $i &

and press Enter. At the next ? prompt, type

end

and press Enter. Note that your UNIX DISPLAY environment variable must be set to your machine if you are running the catDB program on a remote machine.

VALIDATE

VALIDATE verifies the 2D/3D structural integrity of all compounds in a database.

Usage: catDB VALIDATE DBname.bdb

DBname.bdb For DBname.bdb substitute the name of the configuration file of the database containing the 2D/3D structures you want to verify.

The name of any chemically invalid compound is displayed on the screen, and the corresponding compound is deleted from the database.

PropDict=

PropDict=propertydictionaryname.bpd specifies a property dictionary for a new database you are creating or a database to which you are appending data.

propertydictionaryname.bpd For propertydictionaryname.bpd substitute the name of the property dictionary file containing the 1D data property definitions you want to apply to a database. The PropDict= option applies to the following catDB commands: CPD, CREATE, CREATE_1D, MERGE, MOL, SD, SD_UPDATE, SPST_UPDATE, SMILES, SPST_CREATE, TPL.

Use this option if you want your new database or the target database to which you are appending data to have the default 1D properties (name and molecular weight) and those contained in the property dictionary (.bpd) file you specify. The resulting database will contain the union of the default properties, the 1D properties defined in the specified property dictionary file, and the properties in the databases referred to in your source file. For non-SD and non-SPST files, all properties except the default properties are given NULL values; that is, space is reserved for these values and you can enter them by editing a spreadsheet in Catalyst. For additional information on editing 1D data, see "Changing Values in a Spreadsheet." The default value for PropDict= is NONE.

ExistingConfs=

ExistingConfs=Use|Discard controls the treatment of conformers from input files. Either Use or Discard must be specified; the default value is Use. The ExistingConfs= option applies to the following catDB commands: CPD, CREATE, MERGE, MOL, SMILES, SD, SPST_CREATE, and TPL.

Use enters conformers from input source files into databases being merged, appended to, or created. If there are fewer than the number of conformers specified for the MaxConfs= option, conformational analysis generates additional ones. Conformational analysis is not performed if the number of conformers in the input file equals or exceeds the number specified for the MaxConfs= option. Use is in effect unless you explicitly specify otherwise. For more information on conformational analysis, see

Discard causes catDB to generate a new conformational model for each compound in the database being created, appended to, or merged, and to ignore all existing conformers for each input compound. That is, only a compound's connectivity information is retained and used for conformational analysis. Discard prevents incorrect conformers from a source database from getting into a new database. By default, Discard is not in effect. To activate this value for the ExistingConfs= option, you must explicitly include it in your catDB command statement.

Note that if no conformational model exists for a compound, or if you specify Discard for the ExistingConfs= option and elect not to embellish conformational models when constructing a database, the catDB program issues a warning message and generates one conformer.

MaxConfs=

MaxConfs=n specifies the upper limit for the number of conformers generated during conformational analysis. This option is interpreted in conjunction with the value of the ExistingConfs= option. Original 3D representations (so-called edit conformers) in CPD files are not included in generated conformational models, nor are other conformers in input files when the ExistingConfs= option is set to Discard. Otherwise, existing conformers are considered registered and are included in the conformational model even if they are duplicates of generated conformers. (For additional information about the effects of registering conformers, see "Register Confs".) The MaxConfs= option applies to the following catDB commands: CPD, CREATE, MERGE, MOL, SMILES, SD, SPST_CREATE, and TPL.

n sets the maximum number of conformers generated. The default value is 100.
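The interaction between ExistingConfs= and MaxConfs= can be summarized in a few lines of Python. The sketch below is a reading of the descriptions above, not catDB source code, and the names are hypothetical. It covers only two questions: whether input conformers are kept, and whether conformational analysis runs. It does not model edit conformers in CPD files or the single-conformer fallback.

```python
from dataclasses import dataclass

DEFAULT_MAX_CONFS = 100  # default value of MaxConfs=


@dataclass
class ConformerPlan:
    kept_from_input: int  # input conformers carried into the database
    runs_analysis: bool   # whether conformational analysis is performed


def plan_conformers(input_conformers, existing_confs="Use",
                    max_confs=DEFAULT_MAX_CONFS):
    """Describe how a compound's input conformers are treated."""
    if existing_confs == "Discard":
        # Only connectivity is retained; a new model is always generated.
        return ConformerPlan(kept_from_input=0, runs_analysis=True)
    if existing_confs != "Use":
        raise ValueError("ExistingConfs= must be Use or Discard")
    # Use: analysis runs only when the input holds fewer conformers than MaxConfs=.
    return ConformerPlan(
        kept_from_input=input_conformers,
        runs_analysis=input_conformers < max_confs,
    )
```

Under these assumptions, a compound with 40 input conformers and the default settings keeps all 40 and still undergoes conformational analysis, while one with 100 or more is entered as is.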

ConfBuffer=

ConfBuffer=n specifies the number of conformers that are buffered in memory before being written out to disk. The ConfBuffer= option applies to the following catDB commands: CPD, CREATE, MERGE, MOL, SMILES, SD, SPST_CREATE, and TPL.

n sets the number of conformers held in the buffer. The default value of n is 5,000. Specifying a number smaller than 5,000 increases the number of times that conformers are written out to disk and lengthens execution time, but it also reduces the risk of data loss in the event of a computer crash. Similarly, specifying a number larger than 5,000 reduces execution time, but increases the chance of data loss in the event of a failure.
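The trade-off can be estimated with simple arithmetic. The Python sketch below assumes that the buffer is written to disk each time it fills and once more for any remainder; that is an interpretation of the description above, not a documented detail of catDB, and the function name is hypothetical.

```python
import math

DEFAULT_CONF_BUFFER = 5000  # default value of ConfBuffer=


def disk_writes(total_conformers, conf_buffer=DEFAULT_CONF_BUFFER):
    """Estimate how many times buffered conformers are written to disk."""
    if conf_buffer < 1:
        raise ValueError("conf_buffer must be at least 1")
    if total_conformers < 0:
        raise ValueError("total_conformers cannot be negative")
    # One write per full buffer, plus one for a final partial buffer.
    return math.ceil(total_conformers / conf_buffer)
```

Under this assumption, 12,000 conformers take 3 writes with the default buffer and 12 writes with ConfBuffer=1000, and the conformers held only in memory at any moment never exceed the buffer size.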

BEST
FAST

The BEST and FAST options control the type of conformational analysis for generating conformational models. By default, the FAST option is in effect unless you explicitly specify otherwise. To specify best-quality conformational analysis, you must include BEST in your catDB command statement. BEST and FAST apply to the following catDB commands: CPD, CREATE (when not used interactively), MERGE, MOL, SMILES, SD, SPST_CREATE, and TPL.

Note: The default conformational model energy range is 20 kcal/mol. For FAST conformational analysis you can specify a different energy range by adding the user parameter

confAnalysis.catDB.maxEnergySpread = n to your .Catalyst file. Substitute a value in joules for n.

BEST specifies best-quality conformational analysis. Best-quality conformational analysis requires five to ten times more processing time than FAST. For detailed information on fast and best conformational analysis, see "Which Type of Generation to Use."

FAST specifies fast-quality conformational analysis. Fast-quality conformational analysis requires five to ten times less processing time than BEST. For detailed information on fast and best conformational analysis, see "Which Type of Generation to Use."

No2DIndex
No3DIndex

No2DIndex suppresses the generation of 2D indexing information that makes database searching more efficient. By default this option is not in effect. You must explicitly specify it by including it in your catDB command statement or by answering yes to the CREATE command's Skip generation of 2D indexes? prompt. Use this option to defer 2D index construction to a more convenient time when building large databases. The No2DIndex option applies to the following catDB commands: CPD, CREATE, MERGE, MOL, REPAIR_DB, SMILES, SD, SPST_CREATE, and TPL.

No3DIndex suppresses the generation of 3D indexing information that makes database searching more efficient. By default this option is not in effect. You must explicitly specify it by including it in your catDB command statement or by answering yes to the CREATE command's Skip generation of 3D indexes? prompt. Since a feature dictionary (.chm file) is needed only for 3D index generation, a feature dictionary is not required when you specify this option. Use this option to defer 3D index construction to a more convenient time when building large databases. The No3DIndex option applies to the following catDB commands: CPD, CREATE, MERGE, MOL, REPAIR_DB, SMILES, SD, SPST_CREATE, and TPL.

APPEND

APPEND adds data from source files to a target database.

By default, the APPEND option is not in effect. That is, to add data when building or merging a database, you must explicitly include the APPEND option in your catDB command statement or answer yes to the Are you appending data to an existing database? prompt from the CREATE command.

If the StartCref= option has been specified, the program uses its value unless there is a potential conflict with crefs (unique compound identifiers) already in the database, in which case catDB issues an error message to that effect and halts. If you do not specify the StartCref= option, catDB sets it to a cref that is larger than all crefs in the target database.
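
The rule above can be sketched in Python as follows. This is an illustration of the described behavior, not catDB's actual implementation. It assumes that crefs are assigned consecutively from the starting value and that a "potential conflict" means the new range would overlap a cref already in the database. The manual says only that the default is larger than all existing crefs; the sketch picks the next integer. The function name resolve_start_cref is hypothetical.

```python
from typing import Iterable, Optional


def resolve_start_cref(existing_crefs: Iterable[int],
                       new_count: int,
                       start_cref: Optional[int] = None) -> int:
    """Pick the first cref to use when appending new_count compounds.

    Assumptions (not confirmed by the manual): crefs are assigned
    consecutively, and a conflict is any overlap with an existing cref.
    """
    existing = set(existing_crefs)
    if start_cref is None:
        # No StartCref= given: use a value larger than every existing cref.
        return max(existing, default=0) + 1
    requested = range(start_cref, start_cref + new_count)
    clashes = sorted(c for c in existing if c in requested)
    if clashes:
        # Mirrors the documented behavior: report the problem and halt.
        raise ValueError(f"StartCref={start_cref} conflicts with crefs {clashes}")
    return start_cref


if __name__ == "__main__":
    print(resolve_start_cref({1, 2, 3}, new_count=5))                  # default choice
    print(resolve_start_cref({1, 2, 3}, new_count=5, start_cref=100))  # explicit value
```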

The APPEND option applies to the following catDB commands: CPD, CREATE (when not used interactively), MERGE, MOL, SMILES, SD, SPST_CREATE, TPL.

StartCref=

StartCref=n specifies the integer with which the catDB program begins assigning unique internal identification numbers (crefs, pronounced see´ refs) to each compound in a database. By default, the catDB program starts cref numbering with 1. The StartCref= option applies to the following catDB commands: CPD, CREATE (when not used interactively), MOL, REPAIR_DB, SMILES, SD, SPST_CREATE, TPL.

n specifies the integer with which cref numbering begins. That is, you must explicitly include the StartCref= option in your catDB command statement if you want numbering to start with a value other than 1.

Note: If you anticipate that a database will be merged with another database, you must specify a starting cref number that avoids duplicating crefs in the other database. See "Building a Multiconformer Database" for examples of using the StartCref= option in scripts that control building conformational models in parallel.

StartAfter=
StopAfter=

StartAfter=cmpd specifies the input file compound with which the CPD, MOL, SD, SMILES, or TPL command should begin processing. The default is to begin with the first compound in the source file. See "Building a Multiconformer Database" for examples of using the StartAfter= and StopAfter= options in scripts that control building conformational models in parallel.

cmpd can be the compound's name enclosed in double quotation marks; for example, StartAfter="XYZ52137" instructs the program to begin with the molecule called XYZ52137. You can also specify the position in the input file at which to begin by substituting a single quotation mark, a dollar sign, an integer, and another single quotation mark for cmpd. For example, StartAfter='$1000' begins processing with the one-thousandth compound in the source file.

StopAfter=cmpd specifies the input file compound with which the CPD, MOL, SD, SMILES, or TPL command should quit processing. The default is to halt after the last compound in the source file.

cmpd can be the compound's name or position in the input file as with the StartAfter= option. For example, StopAfter="XYZ52137" halts processing after the compound named XYZ52137; StopAfter='$1000' halts processing after the one-thousandth molecule in a source file.
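
When a large source file is split across several catDB runs, the positional forms of these options can be generated rather than typed by hand. The Python sketch below is an illustration under stated assumptions: it treats both positions as inclusive, as in the examples above, and it assumes that each compound uses one cref, so that giving each chunk a StartCref= value offset by its first position keeps the cref ranges from overlapping. It produces only the option text, because the arguments of each command are described elsewhere. The function name chunk_options is hypothetical.

```python
from typing import List


def chunk_options(total_compounds: int, chunk_size: int,
                  first_cref: int = 1) -> List[str]:
    """Return one option string per chunk of the source file.

    Assumptions (not confirmed by the manual): positions are inclusive,
    and every compound in the file consumes at most one cref.
    """
    if total_compounds < 0 or chunk_size <= 0:
        raise ValueError("total_compounds must be >= 0 and chunk_size > 0")
    options = []
    for first in range(1, total_compounds + 1, chunk_size):
        last = min(first + chunk_size - 1, total_compounds)
        cref = first_cref + first - 1  # cref reserved for this chunk's first compound
        options.append(
            f"StartAfter='${first}' StopAfter='${last}' StartCref={cref}"
        )
    return options


if __name__ == "__main__":
    for line in chunk_options(2500, 1000):
        print(line)
```

The single quotation marks matter when these strings are pasted into a UNIX shell, because they keep the shell from treating $1000 as a variable reference.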

errFile=
errData=

errFile=errorfilename creates a file to hold error messages issued during execution of catDB commands. Unless you specify otherwise, the file name is the name of the database configuration file with a .err extension. The program issues an error message for any compound that is invalid, has problems with conformational model generation, or has a duplicate name. The default value for errFile= is NONE.

errorfilename For errorfilename substitute the name of a file to hold any diagnostic error messages issued by the catDB program while executing the following commands: CPD, CREATE, MERGE, MOL, SMILES, SD, SPST_CREATE, TPL, CREATE_1D.

errData=errordatafilename creates a file to collect all unprocessed data in their original formats. This option permits you to capture erroneous data so that you can correct the problems that prevented their incorporation in a database.

errordatafilename For errordatafilename substitute the name of a file to hold any data that could not be processed by the CPD, MOL, SMILES, SD, or TPL commands.

No1D

No1D prevents the generation of all 1D property data for all compounds in a database. You can also use this option to preserve the 1D component of a database when you use the DELETE_DB command to remove other database components (conformational models, 2D indexes, 3D indexes, and the feature dictionary). The No1D option applies only to the following commands: CPD, DELETE_DB, INFO, MERGE, MOL, REPAIR_DB, SD, SMILES, TPL. By default, No1D is not in effect.

AllowNFS

AllowNFS enables writing to database files across NFS-mounted disk partitions even though such files cannot be locked for writing. Data corruption can result from this unsafe practice because files can be accessed by other processes. Consult with MSI Scientific Support before using this option. The AllowNFS option's default condition is off. Although not recommended, AllowNFS can be used with the following commands: CPD, CREATE, MERGE, MOL, RECALC, SD, SMILES, SPST_CREATE, TPL.

Important Note: If you use the AllowNFS option, you will not be able to restart building a database as described in "Stopping and Restarting Database Construction".

xdbConfigList File

Adding or Changing Databases Automatically Loaded into Catalyst

The system file xdbConfigList determines which databases are automatically installed when you start your Catalyst session. Catalyst searches the directory from which you invoked it for an xdbConfigList file and installs any databases specified in it if they are not already installed. Then Catalyst installs any databases specified in the xdbConfigList file stored in the $CATALYST_CONF directory if they are not already installed. The entry for the StockroomDB must be present in every xdbConfigList file, whether local or in $CATALYST_CONF.
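
The installation order described above can be sketched in Python. This is an illustration only; it assumes that Catalyst decides whether a database is "already installed" by comparing database names, which this manual does not spell out. The function name install_order and the database names in the example are hypothetical.

```python
from typing import Iterable, List


def install_order(already_installed: Iterable[str],
                  local_names: Iterable[str],
                  conf_names: Iterable[str]) -> List[str]:
    """Return the databases that would be newly installed, in order.

    local_names: entries from the xdbConfigList in the startup directory.
    conf_names:  entries from the xdbConfigList in $CATALYST_CONF.
    Assumption: a database counts as installed if its name was seen before.
    """
    seen = set(already_installed)
    newly_installed = []
    # The local file is processed first, then the $CATALYST_CONF file.
    for name in list(local_names) + list(conf_names):
        if name not in seen:
            seen.add(name)
            newly_installed.append(name)
    return newly_installed


if __name__ == "__main__":
    print(install_order(
        already_installed=["StockroomDB"],
        local_names=["StockroomDB", "MyDB"],
        conf_names=["StockroomDB", "CorpDB", "MyDB"],
    ))
```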

The xdbConfigList file format is

DBname:DBtype:DBpath:DBAccessList where a : (colon) delimits each field in the specification. Only one database can be specified per line. You must use the catEdit program in $BIN to alter the xdbConfigList file. Field descriptions follow:

DBname For DBname substitute the name of the database.

DBtype DBtype is always Biocad. The catEdit program automatically supplies this specification.

DBpath For DBpath substitute the database's full UNIX path. For the xdbConfigList file in $CATALYST_CONF, the paths to a corporate database must be identical as seen on all hosts.

DBAccessList DBAccessList is always * (asterisk).

An example of a $CATALYST_CONF/xdbConfigList file your system administrator might create to make a corporate database available company-wide follows:

StockroomDB:Biocad:catdata/StockroomDB.bdb:*
DBname1:Biocad:DBpath:*
DBname2:Biocad:DBpath:*
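
To check an xdbConfigList file before relying on it, the four-field format can be validated with a short script. The Python sketch below is a read-only illustration and does not replace catEdit, which remains the required tool for changing the file. It assumes that no field contains a colon. The names DbEntry, parse_entry, and parse_config are hypothetical, as are any paths used with them.

```python
from typing import List, NamedTuple


class DbEntry(NamedTuple):
    name: str     # DBname
    dbtype: str   # DBtype, always "Biocad"
    path: str     # DBpath
    access: str   # DBAccessList, always "*"


def parse_entry(line: str) -> DbEntry:
    """Parse one DBname:DBtype:DBpath:DBAccessList line."""
    fields = line.strip().split(":")
    if len(fields) != 4:
        raise ValueError(f"expected 4 colon-delimited fields: {line!r}")
    entry = DbEntry(*fields)
    if entry.dbtype != "Biocad":
        raise ValueError(f"DBtype must be Biocad: {line!r}")
    if entry.access != "*":
        raise ValueError(f"DBAccessList must be *: {line!r}")
    return entry


def parse_config(text: str) -> List[DbEntry]:
    """Parse a whole file, one database per non-blank line."""
    entries = [parse_entry(ln) for ln in text.splitlines() if ln.strip()]
    # Every xdbConfigList file must contain the StockroomDB entry.
    if not any(e.name == "StockroomDB" for e in entries):
        raise ValueError("the StockroomDB entry is missing")
    return entries
```

A line with too few or too many fields is rejected, which catches entries that were accidentally split across two lines.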




Last updated April 18, 1996 at 11:18am PDT.
Copyright © 1999, Molecular Simulations Inc. All rights reserved.