catDB Database Management
The catDB program supports commands that build and maintain databases outside of Catalyst. With a couple of exceptions, you must use catDB commands from a UNIX command line (see "How to Use a UNIX Command Line"). The exceptions are the Create Database and Append/Replace Database Compounds commands, which are available within the View Database workbench.
A Catalyst database is organized so that 1D scalar data (values associated with properties such as name, molecular weight, formula, activities, and so on) are stored in a 1D database file; 2D data (topological) are stored in a 2D database file; and 3D data (geometric) reside in a 3D database file. The locations of the 1D, 2D, and 3D components of a Catalyst database are controlled with a configuration (.bdb) file that you create and specify with the catDB CONFIG or CREATE commands.
With the catDB program you can create databases from a variety of files: extended spreadsheet (.esp) or spreadsheet (.spst) files you build and save in Catalyst; Catalyst compound (.cpd); MDL MOL (.mol); SMILES (.smi); Catalyst topology (.tpl); or MDL SD (.sd) files. Before you use catDB commands, see "Building Databases" for background information on database structure, performance, data model, and step-by-step procedures for constructing databases.
Once you have constructed a Catalyst database, you can use the Catalyst Install Database command from the Stockroom's Databases menu to gain access to that database from within Catalyst. Then you can search it, change its 1D values, and so on. You can also maintain your database with catDB commands for removing compounds, adding or removing properties, deleting databases, and so on.
At the UNIX command line, type a catDB command statement, which has the form
catDB COMMAND arguments [options]
substituting the name of a catDB command for COMMAND, the values required by the command for arguments, and optional parameters for options. The complete syntax of each catDB command is shown before its description on the following pages. Italic type indicates that you should substitute the name of a value, and [ ] (square brackets) indicate optional parameters that you can include or omit; alternative values for a parameter are separated by a | character. Do not type square brackets in your command statement; separate catDB, the command name, arguments, and options with blank spaces. (To display a list of all catDB commands, type catDB HELP and press the Enter key in a UNIX shell window.)
Note: Operations that modify a database can be performed only by the owner of the database, that is, the UNIX user who created it.
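As a minimal sketch of the statement form described above, the following Python helper joins catDB, a command name, its arguments, and any options with blank spaces, and rejects square brackets. The helper name build_catdb_statement, the file names, and the option values are hypothetical; the CPD command and the MaxConfs= and errFile= options are taken from the usage lines in this document. The helper only builds the text of a statement; it does not run catDB.

```python
def build_catdb_statement(command, arguments, options=()):
    """Join catDB, the command name, arguments, and options with single blanks."""
    parts = ["catDB", command, *arguments, *options]
    for part in parts:
        # Square brackets are notation for optional parameters, not input.
        if "[" in part or "]" in part:
            raise ValueError(f"do not type square brackets: {part!r}")
    return " ".join(parts)


# Hypothetical file names and values, for illustration only.
statement = build_catdb_statement(
    "CPD",
    ["cpdlist.txt", "mydb.bdb"],
    ["MaxConfs=100", "errFile=build.err"],
)
print(statement)
```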
Some source databases allow chemically (topologically) or geometrically invalid structures with valence violations or coincident atoms. The catDB program identifies such invalid structures during the database building process and displays the names of the invalid compounds on the screen; the program does not put the invalid structures in the database. If you use the errFile= option, the program also sends the names of the incorrect compounds to the file you specify.
If the database you want to build will contain more than 5,000 compounds, it is probably best to set up a script that constructs small databases and then combines them two at a time with the catDB MERGE command. In this way, if catDB runs into a problem with a particular molecule while building a small database, the problem can be more easily detected and corrected. In addition, large molecules (larger than pentapeptides) use a lot of memory and can cause a crash because of insufficient swap space. Build molecules of this kind separately and then merge them with the rest of the database. See "Building Databases" for background information on database structure, performance, data model, and step-by-step procedures for constructing databases.
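The batching strategy above can be planned before any catDB command is run. The following Python sketch splits a list of source files into batches (one small database per batch) and lists the merge steps that combine the resulting databases two at a time. All names (plan_batches, plan_pairwise_merges, the file names, and the database names) are hypothetical, and the sketch deliberately stops short of writing MERGE statements, since the MERGE argument syntax is not shown in this section.

```python
def plan_batches(source_files, batch_size):
    """Split source files into batches, one small database per batch."""
    return [source_files[i:i + batch_size]
            for i in range(0, len(source_files), batch_size)]


def plan_pairwise_merges(db_names):
    """Return (left, right, output) steps that combine databases two at a time."""
    steps = []
    if not db_names:
        return steps
    merged = db_names[0]
    for count, name in enumerate(db_names[1:], start=1):
        output = f"merged{count}"  # hypothetical name for the intermediate result
        steps.append((merged, name, output))
        merged = output
    return steps


files = [f"cmpd{i}.cpd" for i in range(1, 8)]  # seven hypothetical files
batches = plan_batches(files, batch_size=3)     # batches of 3, 3, and 1 files
names = [f"part{i}" for i in range(1, len(batches) + 1)]
for step in plan_pairwise_merges(names):
    print(step)
```

Keeping each batch small means that a failure points to a short list of candidate molecules, which is the benefit the paragraph above describes.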
Usage: catDB HELP
An ASCII text screen of catDB commands and options is displayed when you execute catDB HELP. This screen is also displayed when you give catDB an insufficient or inappropriate argument.
Usage: catDB CONFIG [outputDBname.bdb]
Although the CREATE command lets you specify a configuration file interactively, if you want a database configuration file to have a name that is different from your source extended spreadsheet (.esp) file, you must create the configuration file with the CONFIG command and then use the OutputDB= option with the CREATE command.
[outputDBname.bdb] For outputDBname.bdb specify the name for a database configuration file. If you fail to provide a file name, the program prompts you to specify one.
Before executing the CONFIG command: 1) remotely log on to the machine on which catDisk is running at your installation and which will function as your 2D/3D server; 2) use the UNIX cd command to change to the directory in which you want the files to reside; 3) from the UNIX command line type
catDB CONFIG outputDBname.bdb
substituting the name of your database configuration file for outputDBname.bdb, and press Enter.
The CONFIG command is interactive: it prompts you with default values that you can accept, or you can specify your own values for the configuration file for the database you are about to create. In the prompts, defaults are the values displayed in [ ] (square brackets). To accept default values you can type the value shown in square brackets and press the Enter or Return key (depending on how it is labeled on your keyboard), or you can simply press the Enter key. The program displays the next prompt and waits for your specification. Each prompt and its description follows. For a more detailed discussion on the information specified in the configuration file, see "Relationships between Database Components" in the MSI document, Installing and Maintaining Catalyst.
Do you want to use the default configuration shown above? [y] The catDB CONFIG command returns a listing for a default configuration file in the same manner that the catDB CREATE command does. Notice that the program displays the name of the 1D server for you and selects a unique database ID. It also chooses your current directory as the location for storing database files and your current machine as the owner of these files. Type n (for no) and press the Enter key to start displaying all of the prompts that permit you to alter the specifications of the default configuration file. Type y (for yes) to accept the default configuration or just press Enter. The program creates the configuration file for you, and you can use it as an argument for the CREATE command.
RECONFIG
RECONFIG changes a database configuration (.bdb) file.
The RECONFIG command is interactive. Once you invoke it, the catDB program makes a copy of your configuration (.bdb) file and appends an .old extension to its name so that you have a backup. Then the program displays the current settings in your configuration file and prompts you to specify changes. Default values for responses are displayed in [ ] (square brackets). After you have made your specifications, the catDB program displays the information in your altered configuration file and, if possible, moves files according to your specifications.
CREATE
CREATE constructs a database (or appends data to an existing database) from copies of compounds referenced in a Catalyst extended spreadsheet (.esp) file that can contain edited 1D data.
Both the database that is created or modified and the databases referenced by the source .esp file must have a 1D component. The catDB program first copies 1D data values from the referenced databases and then updates this 1D data to reflect any edited data in the extended spreadsheet file. If any compound in the source extended spreadsheet file comes from a StockroomDB, the topology and conformational model for the compound must be present as a CPD file with a file name of the form compoundname.cpd in the same directory in which you run catDB.
Before executing the CREATE command: 1) remotely log on to the machine on which catDisk is running at your installation and which will function as your 2D/3D server; 2) use the UNIX cd command to change to the directory in which you want files to reside. Descriptions of the extended spreadsheet source file argument, the OutputDB= and DEFAULT options, and using the CREATE command interactively follow.
[OutputDB=outputDBname.bdb] Use this option if you want the name of your database to be different from the name of the .esp file from which you build your database. First, name and build the configuration (.bdb) file for your database with the CONFIG command. Then execute the CREATE command substituting the name of the configuration file for outputDBname.bdb. If you want to append new structures and data from your extended spreadsheet (.esp) file to a database you have already created, substitute the name of the target database's configuration file for outputDBname.bdb.
If you do not use the DEFAULT option, which suppresses prompts, the catDB CREATE command is interactive; that is, once you enter a command statement, the program prompts you for specifications for creating a database from an .esp file or for appending to a target database data derived from an .esp file. The values displayed in [ ] (square brackets) are the defaults. To accept default values you can type the value shown in square brackets and press the Enter or Return key (depending on how it is labeled on your keyboard), or you can simply press the Enter key. The program displays the next prompt and waits for your specification. Each prompt and its description follows.
Do you want to use the default config file for your database? [y]: Unless you have specified a configuration file with the OutputDB= option, the program displays a listing for a default database configuration file. Answer y to accept its specifications, or type n and press Enter to start building a configuration file interactively. See CONFIG for details on specifying a database configuration file.
Would you like an error file? [n] Specify y if you want the one-line error messages issued by the catDB program when it encounters a problem sent to a file in the directory in which you are using the program. Press Enter to accept the default condition in which no error file is generated.
MERGE
MERGE combines two or more Catalyst databases.
You can merge databases that contain 1D, 2D, and 3D data, 1D data only, 2D data only, 3D data only, or some other combination of the three. If the output database has 1D data, so must both of the input databases. The same applies for 2D and 3D data. The 1D properties in the two databases being combined must have the same data types. For example, if property A has a string data type in one database but an integer data type in the other database, catDB issues an error message and halts execution of the MERGE command. The database that results from the MERGE command contains the union of the properties in the two input databases.
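The property rule above (matching data types are required, and the result holds the union of the properties) can be illustrated with a short Python sketch. It models each database's 1D properties as a dictionary from property name to data type; the function name merge_property_types and the sample properties are hypothetical, and this is not how catDB itself is implemented.

```python
def merge_property_types(first, second):
    """Return the union of two {property name: data type} maps.

    Raises ValueError when a shared property has different data types,
    mirroring the error that halts the MERGE command.
    """
    merged = dict(first)
    for name, data_type in second.items():
        if name in merged and merged[name] != data_type:
            raise ValueError(
                f"property {name!r}: {merged[name]} conflicts with {data_type}"
            )
        merged[name] = data_type
    return merged


# Hypothetical property sets for two input databases.
db_a = {"name": "STRING", "molecular_weight": "FLOAT", "A": "STRING"}
db_b = {"name": "STRING", "molecular_weight": "FLOAT", "B": "INTEGER"}
print(sorted(merge_property_types(db_a, db_b)))
```

Changing the type of A in db_b to INTEGER makes the function raise, which corresponds to the string-versus-integer example in the paragraph above.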
Each compound in a database to be merged must have a unique name. If the same compound name exists in both databases being merged, the catDB program puts the first one encountered in outputDBname.bdb and displays a message on the screen for each duplicate thereafter. If you used the [errFile=errorfilename] option, the message is also sent to the error file you specified.
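The duplicate-name behavior can be sketched in Python as follows: the first compound encountered under a given name is kept, and every later compound with that name is reported. The function name merge_compound_names and the compound names are hypothetical, and the sketch assumes that the first input database is read before the second, which this section does not state explicitly.

```python
def merge_compound_names(first_db, second_db):
    """Keep the first occurrence of each name; collect later duplicates."""
    kept, duplicates = [], []
    seen = set()
    for name in [*first_db, *second_db]:
        if name in seen:
            duplicates.append(name)  # catDB displays a message for each of these
        else:
            seen.add(name)
            kept.append(name)
    return kept, duplicates


kept, duplicates = merge_compound_names(
    ["aspirin", "caffeine"],     # hypothetical first database
    ["caffeine", "ibuprofen"],   # hypothetical second database
)
print(kept)
print(duplicates)
```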
The function dictionaries (.chm files) belonging to each input database must be identical. If they are not, the catDB program issues an error message and stops combining the two databases.
Usage: catDB CPD CPDfilenames outputDBname.bdb [PropDict=propertydictionaryname.bpd] [APPEND] [ExistingConfs=Use|Discard] [BEST|FAST] [MaxConfs=n] [ConfBuffer=n] [No1D] [No2DIndex] [No3DIndex] [StartCref=n] [StartAfter=cmpd] [StopAfter=cmpd] [errData=errordatafilename] [errFile=errorfilename] [AllowNFS]
outputDBname.bdb For outputDBname.bdb substitute the name of a
configuration (.bdb) file you have created with the CONFIG command. If you
responded y to the CONFIG command's prompt, Database has 1D data?, and
you do not include the PropDict= option in your command statement, the catDB
program builds a default 1D database that includes only the default 1D properties,
name and molecular weight, and their associated values. The name assigned to
each compound is the name of the .cpd file, without a path name or extension.
For example, a compound created from .cpd file /home/joe/jklmn.cpd is named
jklmn. Note that although the name of the compound is also stored in the file, it is
ignored. This convention makes it easier to rename a compound by just renaming
the file, with no need to edit it.
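The naming convention above (file name without path or extension) corresponds to the stem of the path. A minimal Python sketch, using the example path from the text; the function name compound_name is hypothetical:

```python
from pathlib import PurePosixPath


def compound_name(file_path):
    """Return the file name without its directory path or extension."""
    return PurePosixPath(file_path).stem


print(compound_name("/home/joe/jklmn.cpd"))  # prints jklmn
```

The same rule applies to any single-extension file name, so renaming jklmn.cpd to another name changes the compound name accordingly the next time the file is used to build a database.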
Usage: catDB MOL MOLfilenames outputDBname
outputDBname.bdb For outputDBname.bdb substitute the name of a
configuration (.bdb) file you created with the CONFIG command. If you responded
y to the CONFIG command's prompt, Database has 1D data?, and you do not
include the PropDict= option in your command statement, the catDB program
builds a default 1D database that includes only the default 1D properties, name
and molecular weight, and their associated values. The name assigned to each
compound is the name of the .mol file, without a path name or extension. For
example, a compound created from .mol file /home/joe/jklmn.mol is named
jklmn. Note that although the name of the compound is also stored in the file, it is
ignored. This convention makes it easier to rename a compound by just renaming
the file, with no need to edit it.
Usage: catDB SD SDfilename.sd outputDBname.bdb [PropDict=propertydictionaryname.bpd] [APPEND] [ExistingConfs=Use|Discard] [BEST|FAST] [MaxConfs=n] [ConfBuffer=n] [No1D] [No2DIndex] [No3DIndex] [StartCref=n] [StartAfter=cmpd] [StopAfter=cmpd] [errData=errordatafilename] [errFile=errorfilename] [AllowNFS]
outputDBname.bdb For outputDBname.bdb substitute the name of a configuration (.bdb) file you have created with the CONFIG command. If you responded y to the CONFIG command's prompt, Database has 1D data?, and you do not include the PropDict= option in your command statement, the catDB program builds a default 1D database that includes the default 1D properties, name and molecular weight, and their associated values. The default 1D database also includes all properties in the .sd file that are defined in $CATALYST_CONF/Biocad.bpd or $CATALYST_CONF/Corporate.bpd plus the values for these properties. If a 1D property from the source .sd file is not defined in these property dictionary files or one you specify, the program issues a message and skips over it to continue processing with the next property. When the .sd file contains no values for a particular property, the catDB program assigns it a NULL value; that is, space is reserved for these values and you can enter them by editing a spreadsheet in Catalyst. For additional information on editing 1D data, see "Changing Values in a Spreadsheet."
Usage: catDB SMILES SMILESfilename.smi outputDBname.bdb [PropDict=propertydictionaryname.bpd] [APPEND] [ExistingConfs=Use|Discard] [BEST|FAST] [MaxConfs=n] [ConfBuffer=n] [No1D] [No2DIndex] [No3DIndex] [StartCref=n] [StartAfter=cmpd] [StopAfter=cmpd] [errData=errordatafilename] [errFile=errorfilename] [AllowNFS]
SMILESfilename.smi For SMILESfilename.smi substitute the name of a SMILES file consisting of one or more lines of standard SMILES or Catalyst stereochemical SMILES input, each corresponding to a compound's topology. The name of each compound should follow its SMILES specification, separated by white space (a blank or tab character) on the same line. (See "Writing Catalyst's Stereochemically Extended SMILES" for details on the file format.)
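The line layout described above (a SMILES specification, white space, then the compound name) can be checked before a build with a short Python sketch. The function name parse_smiles_line and the two sample lines are hypothetical; the sketch treats everything after the first run of blanks or tabs as the name, which is an assumption about names that themselves contain blanks. It does not validate the SMILES itself.

```python
def parse_smiles_line(line):
    """Split one line into (SMILES specification, compound name)."""
    fields = line.split(None, 1)  # split at the first run of blanks or tabs
    if len(fields) != 2:
        raise ValueError(f"expected a SMILES followed by a name: {line!r}")
    smiles, name = fields
    return smiles, name.strip()


# Two hypothetical lines: one tab-separated, one blank-separated.
sample = "CCO\tethanol\nc1ccccc1 benzene\n"
for line in sample.splitlines():
    print(parse_smiles_line(line))
```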
outputDBname.bdb Substitute the name of a configuration file you have created
with the CONFIG command for outputDBname.bdb. If you responded y to the
CONFIG command's prompt, Database has 1D data?, and you do not include
the PropDict= option in your command statement, the catDB program builds a
default 1D database that includes only the default 1D properties, name and
molecular weight, and their associated values.
Usage: catDB SPST_CREATE spreadsheetname.spst outputDBname.bdb directory fileformat [PropDict=propertydictionaryname.bpd] [APPEND] [ExistingConfs=Use|Discard] [BEST|FAST] [MaxConfs=n] [ConfBuffer=n] [No1D] [No2DIndex] [No3DIndex] [StartCref=n] [errFile=errorfilename] [AllowNFS]
Usage: catDB TPL TPLfilenames outputDBname.bdb [PropDict=propertydictionaryname.bpd] [APPEND] [ExistingConfs=Use|Discard] [BEST|FAST] [MaxConfs=n] [ConfBuffer=n] [No1D] [No2DIndex] [No3DIndex] [StartCref=n] [StartAfter=cmpd] [StopAfter=cmpd] [errData=errordatafilename] [errFile=errorfilename] [AllowNFS]
TPLfilenames For TPLfilenames substitute the name of a file you have created that contains a list of the names of the individual .tpl files from which to generate your database. Each .tpl file name should appear on a separate line and precisely match the name of the file it represents. File names should be relative to the current directory and should include the .tpl suffix. Each .tpl file in the list of files should describe one compound; that is, the file should consist of a single topology and zero or more conformers (see "TPL File Format Description").
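A list file of the kind described above can be generated rather than typed by hand. The following Python sketch writes the names of all .tpl files in a directory to a list file, one name per line, each including the .tpl suffix. The function name write_tpl_list and the list file name tplfiles.txt are hypothetical, and the sketch assumes catDB will be run from that same directory so that the bare file names are relative to the current directory. The demonstration uses empty stand-in files in a temporary directory, not real topologies.

```python
import tempfile
from pathlib import Path


def write_tpl_list(directory, list_name="tplfiles.txt"):
    """Write the names of the .tpl files in directory, one per line."""
    directory = Path(directory)
    names = sorted(p.name for p in directory.glob("*.tpl"))
    (directory / list_name).write_text("".join(f"{n}\n" for n in names))
    return names


with tempfile.TemporaryDirectory() as tmp:
    for stem in ("benzene", "ethanol"):
        Path(tmp, f"{stem}.tpl").write_text("")  # empty stand-in files
    listed = write_tpl_list(tmp)
    list_text = Path(tmp, "tplfiles.txt").read_text()
print(list_text, end="")
```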
Usage: catDB BPD DBname.bdb
A sample display resulting from executing the BPD command appears below.
You can copy the displayed text from your screen into a text editor or you can redirect the output of the command into a file by executing
from the UNIX command line. Then you can customize the property definitions to suit your needs, save the customized text as a file ending with a .bpd (for BioCAD property dictionary) file extension, and use this file to define the properties of a new database when you create it with one of the catDB commands for building databases.
Each line in a .bpd file must contain the following attribute fields separated by white space: property name, property type, schema, reference, special type, and description. Only the description field is permitted to have embedded white space. If the first character on a line is ! (exclamation point), the catDB program treats the line as a comment. Specifications for each property attribute field follow:
abs, acos, add, after, all, alter, and, any, arg, array, array1, array2, array3, as, asin, assertion, atan, authorization, automatic, avg, before, begin, between, buffersize, by, bytestring, cascade, case, cast, catalog, ceiling, char, character, check, cluster, coalesce, column, comment, commit, consistency, constraint, constraints, contain, contains, continue, corresponding, cos, cosh, count, create, current, csstring, cvb, cvbo, cvbs, cvco, cvcs, cvd, cvda, cvdp, cvdu, cve, cvi, cvl, cvli, cvlist, cvr, cvre, cvs, cvset, cvsm, cvti, cvts, cvu, database, date, datetime, day, dbfile, dec, decimal, default, delete, depth, desc, describe, displacement, distinct, div, do, does, domain, double, drop, duration, each, else, end, enumerated, escape, exact, except, exists, exp, expand, extend, false, fetch, file, fix, fixed, first, float, floor, for, forein, fraction, from, function, geq, grant, group, having, help, identified, immediate, in, index, inf, inherit, inner, insert, int, integer, intersect, interval, into, is, key, language, last, lb1, lb2, lb3, length, leq, level, like, list, ln, local, lock, log, log2, long, match, max, maxof, median, min, minof, minute, mod, modify, module, month, move, national, natural, neq, new, next, none, normalize, not, ntst, ntsta, null, nullif, numeric, object, of, off, on, only, option, or, order, outer, partial, precision, prefix, prepare, preserve, primary, privileges, procedure, program, public, random, read, real, reference, references, referencing, remove, rename, replace, revoke, rollback, row, rows, schema, second, select, sequence, session, set, similar, sin, sinh, size, smallint, some, sql, sqrt, statistics, string, substring, sum, synonym, system, table, tan, tanh, temporary, then, time, timestamp, to, transaction, trigger, true, truncate, tst, tsta, tuple, ub1, ub2, ub3, union, unique, uniquefileid, units, unlock, unset, update, user, using, value, values, varchar, varying, view, when, where, with, work, write, year, zero
STRING R.B.Woodward
Description. For this attribute, specify a short textual definition of the property.
Only the description field can contain white space (space characters and tab
characters).
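The .bpd line layout (six attribute fields separated by white space, with only the description allowed to contain blanks, and ! marking a comment line) can be read with a short Python sketch. The function name parse_bpd_line and the sample line are hypothetical; in particular, REF and NONE are placeholders for the reference and special type fields, whose permitted values are not listed in this section.

```python
FIELDS = ("name", "type", "schema", "reference", "special_type", "description")


def parse_bpd_line(line):
    """Return a dict of the six attribute fields, or None for a comment line."""
    if line.startswith("!"):
        return None
    # Split into at most six fields so the description keeps its blanks.
    values = line.split(None, 5)
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields: {line!r}")
    record = dict(zip(FIELDS, values))
    record["description"] = record["description"].rstrip()
    return record


# Hypothetical property definition; REF and NONE are placeholder values.
sample = "Chemist STRING SPECIFIC REF NONE Name of the submitting chemist"
print(parse_bpd_line(sample))
print(parse_bpd_line("! this line is a comment"))
```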
propertydictionaryname.bpd For propertydictionaryname.bpd specify the
name of the property dictionary file containing the definitions of the properties you
want to add to your database. Note that you can add only those properties with a
SPECIFIC schema type; you cannot add to an existing database properties
whose schema is UNIVERSAL. For additional information on schema type, see
the description of the BPD command.

FLOAT 7.2
INTEGER 1043
DATE Jul-3-1993
ADD_PROPERTY
ADD_PROPERTY appends 1D properties to a database.
You can use the catDB BPD command to redirect another database's property definitions to a file with the .bpd (BioCAD property dictionary) extension, edit that file with a text editor so that only the definitions of the properties you want to add remain, and then supply that file as the property dictionary argument to the catDB ADD_PROPERTY command.
After you have successfully executed the ADD_PROPERTY command, the catDB program displays a message listing the names of the properties that were appended to your database. You can also add properties to a database (that you own) in Catalyst by selecting the Edit Property Dictionary... command from the Databases menu in the Stockroom. See "Edit Property Dictionary" for details.
CREATE_1D
CREATE_1D constructs a 1D database for a corresponding 2D/3D database.
outputDBname.bdb For outputDBname.bdb specify the name of the configuration (.bdb) file for the database for which you want to create a 1D database.
Usage: catDB REPAIR_DB DBname.bdb [StartCref=n] [No1D] [No2DIndex] [No3DIndex]
DBname.bdb For DBname.bdb substitute the name of the configuration file for
the database whose crefs you want to shift.
The REPAIR_DB command is useful for reassigning crefs when you have databases that you want to combine but cannot because their crefs are not disjoint (and thus would not be unique after you merged the databases). A typical set of steps for "repairing" such databases follows:
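Separately from that procedure, the meaning of "not disjoint" can be illustrated in Python. The sketch below takes the lowest and highest cref of each database (values that INFO with the Detail option reports) and tests whether the two ranges overlap. The function names and the numbers are hypothetical. Treating the value one past the highest cref as a candidate for StartCref=n is an assumption about how that option is used, not something this section confirms.

```python
def cref_ranges_overlap(first, second):
    """Each argument is a (lowest cref, highest cref) pair."""
    return first[0] <= second[1] and second[0] <= first[1]


def first_cref_above(*ranges):
    """Return the smallest value above every range (a candidate starting cref)."""
    return max(high for _low, high in ranges) + 1


db_a = (1, 5000)     # hypothetical lowest and highest cref of one database
db_b = (4000, 9000)  # hypothetical range of a second database
print(cref_ranges_overlap(db_a, db_b))
print(first_cref_above(db_a, db_b))
```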
Usage: catDB RECALC DBname.bdb [AllowNFS]
DBname.bdb For DBname.bdb substitute the name of the configuration file of
the database for which you want to recompute indexing structures.
When you invoke this command statement, the program displays a message informing you that 2D and 3D index files (with names of the form DBname.IDnumber.2bdb and DBname.IDnumber.3bdb) exist and prompts you to confirm their deletion or cancel the recalculation. Type y for yes or n for no and press the Enter key after each prompt. The catDB program must delete the 2D and 3D index files before recomputing the indexing structures. This process is time-consuming, so the prompts give you the opportunity to cancel it, if it is currently inconvenient.
SD_UPDATE
SD_UPDATE replaces 1D data in a database with new values from an MDL .sd compound file.
The SD_UPDATE command adds only the 1D property data in the .sd file to the target database. The command skips over all compound data that does not exist in the 2D/3D portion of the database. (That is, if for some reason a compound was not built due to an error in structure generation, the compound is not in the database and the SD_UPDATE command skips it.) If you use the errData= option, all records that are skipped are written to the error data file you specify. Definitions for all 1D properties must exist in the 1D database component or in a property dictionary (.bpd) file specified with the PropDict= option.
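The skipping behavior can be pictured as a partition of the .sd records into those whose compound exists in the database and those that are skipped (and, with errData=, written to the error data file). A minimal Python sketch follows; the function name split_sd_records and the compound names are hypothetical, and matching records to compounds by name is an assumption made for illustration.

```python
def split_sd_records(record_names, database_names):
    """Separate .sd records into those applied and those skipped."""
    present = set(database_names)
    applied = [name for name in record_names if name in present]
    skipped = [name for name in record_names if name not in present]
    return applied, skipped


applied, skipped = split_sd_records(
    ["aspirin", "caffeine", "ibuprofen"],  # hypothetical records in the .sd file
    ["aspirin", "ibuprofen"],              # hypothetical compounds in the database
)
print(applied)
print(skipped)
```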
Usage: catDB SPST_UPDATE spreadsheetname.spst outputDBname.bdb [PropDict=propertydictionaryname.bpd]
outputDBname.bdb For outputDBname.bdb specify the name of the
configuration file for the database containing the data to be replaced by the data
in your spreadsheet file.
The SPST_UPDATE command is similar to the Commit Changes to Database command in Catalyst, but changes 1D data in only one database at a time. NULL values (those for which an empty cell exists) in the spreadsheet file do not overwrite values in the database; that is, the values in the database are unchanged by NULL values in the spreadsheet file. To use the SPST_UPDATE command you must build a spreadsheet in Catalyst and export it as a spreadsheet (.spst) file. For additional information on editing 1D data, see "Changing Values in a Spreadsheet."
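The NULL rule above can be sketched in Python: only non-empty spreadsheet cells replace stored values. The function name apply_spreadsheet_row and the property names are hypothetical, and None stands in for an empty (NULL) cell.

```python
def apply_spreadsheet_row(database_row, spreadsheet_row):
    """Return database_row updated with the non-NULL spreadsheet values."""
    updated = dict(database_row)
    for prop, value in spreadsheet_row.items():
        if value is not None:  # None stands in for an empty (NULL) cell
            updated[prop] = value
    return updated


stored = {"name": "jklmn", "activity": 7.2, "batch": 1043}
edited = {"activity": 6.9, "batch": None}  # batch cell left empty
print(apply_spreadsheet_row(stored, edited))
```

In this example the activity value is replaced while the batch value is left as stored, because its spreadsheet cell is empty.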
For additional information, see "PropDict=".
UPDATE
UPDATE replaces 1D data in a database with new values from an extended spreadsheet (.esp) file derived from the database.
extendedspreadsheetname.esp For extendedspreadsheetname.esp
substitute the name of the file containing the "dirty" 1D values (values in cells that
Catalyst turns gray once you edit them) with which you want to update a
database.
The UPDATE command is similar to the Commit Changes to Database command in Catalyst, but it changes 1D data in only one database at a time. NULL values (those for which an empty cell exists) in the extended spreadsheet file do not overwrite values in the database; that is, the values in the database are unchanged by NULL values in the extended spreadsheet file. To use the UPDATE command you must build a spreadsheet in Catalyst and export it as an extended spreadsheet (.esp) file. For additional information on editing 1D data, see "Changing Values in a Spreadsheet."
Commands for Deleting Compounds, Properties, and Databases
DELETE
DELETE removes the compounds listed in an extended spreadsheet (.esp) file from a database.
extendedspreadsheetname.esp For extendedspreadsheetname.esp
substitute the name of an extended spreadsheet (.esp) file containing the
compounds that you want to remove from a database. Note that you must supply
the .esp extension portion of your extended spreadsheet's file name when you
use the DELETE command.
Follow this procedure to delete compounds and all their associated data from a database using an extended spreadsheet (.esp) file:
When you execute the catDB DELETE command, for each compound in the .esp file, the catDB program removes the corresponding compound and all of its associated data from the database. When you next open a spreadsheet that was derived from the database before you deleted compounds from it (that is, the spreadsheet data is older than the changed data in the database), Catalyst displays an Alert message informing you that changes have been made to the source data and that the spreadsheet is no longer usable.
SPST_DELETE
SPST_DELETE removes the compounds listed in a spreadsheet (.spst) file from a target database that includes 1D data containing corresponding compound names.
outputDBname.bdb For outputDBname.bdb substitute the name of the
configuration (.bdb) file for the database from which you want to delete
compounds. This database must contain 1D data that includes the names of the
compounds to be deleted.
Follow this procedure to delete compounds and all their associated data from a database using a spreadsheet (.spst) file:
When you execute the catDB SPST_DELETE command, for each compound in the .spst file, the catDB program removes the corresponding compound and all of its associated data from the target database. When you next open a spreadsheet derived from the database before you deleted compounds from it, Catalyst displays an Alert message informing you that changes have been made to the source data and that the spreadsheet is no longer usable.
Usage: catDB DELETE_PROPERTY outputDBname.bdb propertydictionaryname.bpd
propertydictionaryname.bpd For propertydictionaryname.bpd specify the
name of the property dictionary file containing the definitions of the properties you
want to remove from your database. Note that you can delete only those
properties with a SPECIFIC schema type; you cannot delete properties whose
schema is UNIVERSAL. For more information on schema types, see the
description of the BPD command.
Use the catDB BPD command to display a list of properties in the database and redirect the output of the command into a file by executing
from the UNIX command line. Load the resulting text file into a text editor and edit the text so that only the definitions of the properties you want to delete remain, save the resulting text as a file with the .bpd (BioCAD property dictionary) extension, and supply this file as the property dictionary argument to the catDB DELETE_PROPERTY command.
After you have successfully executed the DELETE_PROPERTY command, the catDB program displays a message listing the names of the properties that were deleted from your database. You can also remove properties from a database (that you own) in Catalyst by selecting the Edit Property Dictionary... command from the Databases menu in the Stockroom. See "Edit Property Dictionary" for details.
DELETE_DB
DELETE_DB removes from disk all of the data files for the specified database.
Select No1D for detailed information on this option.
When you execute the DELETE_DB command, the catDB program displays a message to remind you that all the data in your database will be expunged and prompts you to confirm the deletion. (The only way you can recover a database that you have deleted is from a backup tape that was made sometime before you executed the DELETE_DB command. Thus, it is a good idea to check with your system administrator about backup schedules before you delete a database containing data that might be difficult to restore if you needed to.) If you type n for no and press the Enter key, the program halts and displays a message indicating that your database was not deleted. If you type y for yes and press the Enter key, the program displays messages advising you when each data file has been removed and when the deletion is complete. The DELETE_DB command does not delete the database's configuration (.bdb) file or the feature dictionary (.chm file). Use the UNIX rm (Remove) command to delete these files.
Note: Always use the DELETE_DB command when you want to remove a database's data files; use the UNIX rm command to delete only the configuration (.bdb) and the feature dictionary (.chm) files. Deleting data files manually with the UNIX rm command can have unpredictable results and is not recommended. Only the DELETE_DB command and the DELETE_DB_1D command can remove a database's 1D data.
If you receive an error message that the program cannot lock the database for deletion, it means that it is currently in use in an active Catalyst session. Exit the Catalyst session or deactivate the database with the Deinstall command from the Databases menu in the Stockroom to enable the DELETE_DB command.
DELETE_DB_1D
DELETE_DB_1D removes the 1D property data from the specified database.
hostmachinename For hostmachinename substitute the name of the computer on which the 1D data resides. Use the INFO command to display the name of the host computer for the database's 1D data. See INFO for a description of this command and its output.
ID For ID substitute the database's identification number. To find out the
identification number, use the INFO command.
When you execute this command, the program displays a message advising you that the 1D database is being deleted and another when the operation is complete.
INFO
INFO compiles and displays on the screen the following information about a Catalyst database: 1D database identification number and location; total number of compounds, 1D properties and their definitions, and conformer database location.
Detail displays the following additional information about a database: lowest and
highest cref value, total number of fragments, total number of conformers,
average molecular weight, average number of conformers per fragment, and the
name of the last compound in the database (which can be useful in restarting a
database construction process that has terminated abnormally before
completion).
If you want a paper copy of information about a database you can redirect the output to a file by typing
and pressing the Enter key.
Executing the INFO command with the Detail option builds data files (info.unsat, info.tot, info.stereo, info.rot, info.mw, info.hetero, info.heavy, info.endo, info.confs, info.carbon) with which you can display graphs. To display the graph for an individual file, on the UNIX command line type
substituting the name of the file for filename. Then press Enter. To display all graphs, on the UNIX command line type
and press Enter. Then at the ? prompt, type
and press Enter. At the next ? prompt, type
and press Enter. Note that your UNIX DISPLAY environment variable must be set to your machine if you are running the catDB program on a remote machine.
VALIDATE
VALIDATE verifies the 2D/3D structural integrity of all compounds in a database.
DBname.bdb For DBname.bdb substitute the name of the configuration file of
the database containing the 2D/3D structures you want to verify.
The name of any chemically invalid compound is displayed on the screen, and the corresponding compound is deleted from the database.
PropDict=
PropDict=propertydictionaryname.bpd specifies a property dictionary for a new database you are creating or a database to which you are appending data.
Use this option if you want your new database or the target database to which you are appending data to have the default 1D properties (name and molecular weight) and those contained in the property dictionary (.bpd) file you specify. The resulting database will contain the union of the default properties, the 1D properties defined in the specified property dictionary file, and the properties in the databases referred to in your source file. For non-SD and non-SPST files, all properties except the default properties are given NULL values; that is, space is reserved for these values and you can enter them by editing a spreadsheet in Catalyst. For additional information on editing 1D data, see "Changing Values in a Spreadsheet." The default value for PropDict= is NONE.
ExistingConfs=
ExistingConfs=Use|Discard controls the treatment of conformers from input files. Either Use or Discard must be specified; the default value is Use. The ExistingConfs= option applies to the following catDB commands: CPD, CREATE, MERGE, MOL, SMILES, SD, SPST_CREATE, and TPL.
Discard causes catDB to generate a new conformational model for each
compound in the database being created, appended to, or merged, and to ignore
all existing conformers for each input compound. That is, only a compound's
connectivity information is retained and used for conformational analysis. Discard
prevents incorrect conformers from a source database from getting into a new
database. By default, Discard is not in effect. To activate this value for the
ExistingConfs= option, you must explicitly include it in your catDB command
statement.
Note that if no conformational model exists for a compound, or if you specify Discard for the ExistingConfs= option and elect not to embellish conformational models when constructing a database, the catDB program issues a warning message and generates one conformer.
MaxConfs=
MaxConfs=n sets the maximum number of conformers generated. The default value is 100.
ConfBuffer=
ConfBuffer=n specifies the number of conformers that are buffered in memory before being written out to disk. The ConfBuffer= option applies to the following catDB commands: CPD, CREATE, MERGE, MOL, SMILES, SD, SPST_CREATE, and TPL.
BEST
The BEST and FAST options control the type of conformational analysis for generating conformational models. By default, the FAST option is in effect unless you explicitly specify otherwise. To specify best-quality conformational analysis, you must include BEST in your catDB command statement. BEST and FAST apply to the following catDB commands: CPD, CREATE (when not used interactively), MERGE, MOL, SMILES, SD, SPST_CREATE, and TPL.
FAST
FAST specifies fast-quality conformational analysis. Fast-quality conformational analysis requires five to ten times less processing time than BEST. For detailed information on fast and best conformational analysis, see "Which Type of Generation to Use."
APPEND
APPEND adds data from source files to a target database.
By default, the APPEND option is not in effect. That is, to add data when building or merging a database, you must explicitly include the APPEND option in your catDB command statement or answer yes to the Are you appending data to an existing database? prompt from the CREATE command.
If the StartCref= option has been specified, the program uses its value unless there is a potential conflict with crefs (unique compound identifiers) already in the database, in which case catDB issues an error message to that effect and halts. If you do not specify the StartCref= option, catDB sets it to a cref that is larger than all crefs in the target database.
The APPEND option applies to the following catDB commands: CPD, CREATE (when not used interactively), MERGE, MOL, SMILES, SD, SPST_CREATE, and TPL.
No2DIndex
No2DIndex suppresses the generation of 2D indexing information that makes database searching more efficient. By default this option is not in effect. You must explicitly specify it by including it in your catDB command statement or by answering yes to the CREATE command's Skip generation of 2D indexes? prompt. Use this option to defer 2D index construction to a more convenient time when building large databases. The No2DIndex option applies to the following catDB commands: CPD, CREATE, MERGE, MOL, REPAIR_DB, SMILES, SD, SPST_CREATE, and TPL.
No3DIndex
StartCref=
StartCref=n specifies the integer with which the catDB program begins assigning unique internal identification numbers (crefs, pronounced see´ refs) to each compound in a database. By default, the catDB program starts cref numbering with 1. The StartCref= option applies to the following catDB commands: CPD, CREATE (when not used interactively), MOL, REPAIR_DB, SMILES, SD, SPST_CREATE, and TPL.
Note: If you anticipate that a database will be merged with another database, you
must specify a starting cref number that avoids duplicating crefs in the other
database. See "Building a Multiconformer Database" for examples of using the
StartCref= option in scripts that control building conformational models in
parallel.
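The StartCref= rules described for the APPEND option can be sketched in Python as follows. This is only an illustration, not catDB's implementation: the function name resolve_start_cref is hypothetical, choosing the highest existing cref plus one is one way to satisfy "larger than all crefs in the target database", and the conflict test (a requested value at or below the highest existing cref) is an assumption, because the manual does not state exactly how catDB detects a potential conflict.

```python
from typing import Iterable, Optional


def resolve_start_cref(existing_crefs: Iterable[int],
                       requested: Optional[int] = None) -> int:
    """Pick the first cref to assign when appending compounds.

    Hypothetical helper; the conflict rule below is an assumption.
    """
    # Highest cref already present; 0 for an empty database, so that
    # numbering starts at 1, matching the documented default.
    highest = max(existing_crefs, default=0)

    if requested is None:
        # No StartCref= given: use a cref larger than every existing one.
        return highest + 1

    if requested <= highest:
        # Assumed rule: numbering upward from `requested` could collide
        # with a cref that is already in the database.
        raise ValueError(
            f"StartCref={requested} may conflict with existing crefs "
            f"(highest existing cref is {highest})"
        )
    return requested
```

For example, when two databases that will later be merged are built in parallel, giving the second build a StartCref= value above the highest cref expected in the first keeps the two cref ranges disjoint.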
StartAfter=
StartAfter=cmpd specifies the input file compound with which the CPD, MOL, SD, SMILES, or TPL command should begin processing. The default is to begin with the first compound in the source file. See "Building a Multiconformer Database" for examples of using the StartAfter= and StopAfter= options in scripts that control building conformational models in parallel.
cmpd can be the compound's name enclosed in double quotation marks; for example, StartAfter="XYZ52137" instructs the program to begin with the molecule called XYZ52137. You can also specify the position in the input file at which to begin by substituting a single quotation mark, a dollar sign, an integer, and another single quotation mark for cmpd. For example, StartAfter='$1000' begins processing with the one-thousandth compound in the source file.
StopAfter=
StopAfter=cmpd specifies the input file compound with which the CPD, MOL, SD, SMILES, or TPL command should quit processing. The default is to halt after the last compound in the source file.
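The two forms of cmpd (a name in double quotation marks, or a position written as '$n') can be illustrated with a short Python sketch. The function names find_index and select_compounds are hypothetical, and the sketch makes two assumptions: both end points are inclusive, and the quotation marks are still part of the value being parsed. It mimics the selection logic only and does not call catDB.

```python
import re
from typing import List, Optional


def find_index(names: List[str], cmpd: str) -> int:
    """Return the 0-based index of the compound selected by a cmpd value."""
    # Position form: '$n' selects the n-th compound (1-based) in the file.
    position = re.fullmatch(r"'\$(\d+)'", cmpd)
    if position:
        n = int(position.group(1))
        if not 1 <= n <= len(names):
            raise ValueError(f"position {n} is outside the input file")
        return n - 1

    # Name form: "NAME" selects the compound with that name.
    name = re.fullmatch(r'"(.*)"', cmpd)
    if name:
        return names.index(name.group(1))  # ValueError if the name is absent

    raise ValueError(f"unrecognized cmpd value: {cmpd}")


def select_compounds(names: List[str],
                     start_after: Optional[str] = None,
                     stop_after: Optional[str] = None) -> List[str]:
    """Return the compounds processed for the given StartAfter=/StopAfter=."""
    # Defaults: begin with the first compound, halt after the last one.
    first = find_index(names, start_after) if start_after is not None else 0
    last = (find_index(names, stop_after) if stop_after is not None
            else len(names) - 1)
    return names[first:last + 1]
```

With a five-compound input named A through E, select_compounds(names, start_after='"B"', stop_after="'$4'") yields B, C, and D under these assumptions.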
errFile=
errFile=errorfilename creates a file to hold error messages issued during execution of catDB commands. Unless you specify otherwise, the file name is the name of the database configuration file with a .err extension. The program issues an error message for any compound that is invalid, has problems with conformational model generation, or has a duplicate name. The default value for errFile= is NONE.
errData=
No1D
No1D prevents the generation of all 1D property data for all compounds in a database. You can also use this option to preserve the 1D component of a database when you use the DELETE_DB command to remove other database components (conformational models, 2D indexes, 3D indexes, and the feature dictionary). The No1D option applies only to the following commands: CPD, DELETE_DB, INFO, MERGE, MOL, REPAIR_DB, SD, SMILES, TPL. By default, No1D is not in effect.
AllowNFS
AllowNFS enables writing to database files across NFS-mounted disk partitions even though such files cannot be locked for writing. Data corruption can result from this unsafe practice because files can be accessed by other processes. Consult with MSI Scientific Support before using this option. The AllowNFS option's default condition is off. Although not recommended, AllowNFS can be used with the following commands: CPD, CREATE, MERGE, MOL, RECALC, SD, SMILES, SPST_CREATE, TPL.
xdbConfigList File
Adding or Changing Databases Automatically Loaded into Catalyst
The system file xdbConfigList determines which databases are automatically installed when you start your Catalyst session. Catalyst searches the directory from which you invoked it for an xdbConfigList file and installs any databases specified in it if they are not already installed. Then Catalyst installs any databases specified in the xdbConfigList file stored in the $CATALYST_CONF directory if they are not already installed. The entry for the StockroomDB must be present in every xdbConfigList file, whether local or in $CATALYST_CONF.
The xdbConfigList file format is
DBname:DBtype:DBpath:DBAccessList
DBname For DBname substitute the name of the database.
DBtype DBtype is always Biocad. The catEdit program automatically supplies this specification.
DBAccessList DBAccessList is always * (asterisk).
An example of a $CATALYST_CONF/xdbConfigList file your system administrator might create to make a corporate database available company-wide follows:
StockroomDB:Biocad:catdata/StockroomDB.bdb:*
DBname1:Biocad:DBpath:*
DBname2:Biocad:DBpath:*
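To make the record layout concrete, the following Python sketch reads text in the DBname:DBtype:DBpath:DBAccessList format and checks that the required StockroomDB entry is present. It is a minimal illustration and not part of Catalyst: the names DbEntry and parse_xdb_config_list are hypothetical, and the sketch assumes that no field contains a colon and that blank lines can be ignored.

```python
from typing import List, NamedTuple


class DbEntry(NamedTuple):
    """One xdbConfigList record (hypothetical container for illustration)."""
    name: str
    db_type: str
    path: str
    access_list: str


def parse_xdb_config_list(text: str) -> List[DbEntry]:
    """Parse xdbConfigList text into entries and verify StockroomDB exists."""
    entries = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines carry no meaning
        fields = line.split(":")
        if len(fields) != 4:
            raise ValueError(
                f"line {line_number}: expected 4 colon-separated fields, "
                f"found {len(fields)}"
            )
        entries.append(DbEntry(*fields))

    # Every xdbConfigList file must contain the StockroomDB entry.
    if not any(entry.name == "StockroomDB" for entry in entries):
        raise ValueError("missing required StockroomDB entry")
    return entries
```

Applied to the example file above, the function returns three entries, each with the type Biocad and the access list *.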