View Database Workbench - How To


View Database Workbench Functional Areas

Introduction to Catalyst Databases

Categories of Data

In Catalyst you can save compounds, hypotheses, and spreadsheets. They can be exported to special ASCII files (see "To Export an Object"), or they can be stored in a Catalyst database. There are two types of databases that Catalyst uses to store information: source databases and the StockroomDB.

Source databases. Source databases contain three categories of data for a set of compounds:

With Catalyst you can search corporate and project databases for compounds that match the 1D, 2D, and 3D criteria of your hypotheses, create and update your own source databases, and print reports of retrieved data in a variety of formats. You can also convert databases in standard formats such as SMILES, MDL MOL, and MDL SD to Catalyst format with the catDB utility program. See "catDB Database Management" for more information about converting and modifying source databases.

StockroomDB. The StockroomDB is your special database that contains all objects (compounds, hypotheses, spreadsheets, labs, and databases) that are on the shelf of your Catalyst Stockroom. You can search or browse the compounds in the StockroomDB as you would any source database.

Introduction to 1D Data

Property or 1D data refers to numerical or textual values associated with the defined attributes of compounds and hypotheses. Property data for compounds are stored in Catalyst databases and in saved spreadsheets. The 1D query component of a hypothesis is stored with the hypothesis.

Property Attributes. Each property has a set of attributes: its name, data type, schema (storage type), reference, special type, and description. The following table shows some typical property attributes.

    Property Name   Data Type   Schema      Reference   Special Type    Description
    Name            STRING      UNIVERSAL   QUICK_REF   Compound_Name   Internal name
    MolWt           FLOAT       UNIVERSAL   QUICK_REF   MolWt           Molecular weight
    Chemist         STRING      UNIVERSAL   QUICK_REF   NULL            Lead chemist
    TestDate        DATE        SPECIFIC    SLOW_REF    NULL            12-13-88
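As a rough illustration of how the attributes in the table fit together, the rows can be modeled as records keyed by property name. This is only a sketch in Python: the class name PropertyAttributes, the PROPERTY_DICTIONARY mapping, and the data_type_of helper are hypothetical and do not reflect how Catalyst stores its property dictionaries. NULL special types are represented as None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertyAttributes:
    """One row of the property attribute table (hypothetical model)."""
    name: str
    data_type: str
    schema: str
    reference: str
    special_type: Optional[str]  # None stands in for NULL
    description: str


# The four example rows from the table, keyed by property name.
PROPERTY_DICTIONARY = {
    p.name: p
    for p in (
        PropertyAttributes("Name", "STRING", "UNIVERSAL", "QUICK_REF",
                           "Compound_Name", "Internal name"),
        PropertyAttributes("MolWt", "FLOAT", "UNIVERSAL", "QUICK_REF",
                           "MolWt", "Molecular weight"),
        PropertyAttributes("Chemist", "STRING", "UNIVERSAL", "QUICK_REF",
                           None, "Lead chemist"),
        PropertyAttributes("TestDate", "DATE", "SPECIFIC", "SLOW_REF",
                           None, "12-13-88"),
    )
}


def data_type_of(property_name: str) -> str:
    """Look up the data type of a property by name."""
    return PROPERTY_DICTIONARY[property_name].data_type
```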

Each database contains a property dictionary describing the properties for compounds in that database. In addition, there are global property dictionary files: $CATALYST_CONF/Biocad.bpd contains a noneditable list of compound properties specific to Catalyst; $CATALYST_CONF/Corporate.bpd contains a list of compound properties specific to your organization. For information on

To Specify Searching Criteria for 1D Property Data

You can use the Create/Edit Hypothesis 1D Properties... command to specify searching criteria for the values of 1D properties such as name, chemist, molecular weight, and so on. For example, you can use it to create a 1D hypothesis with which you can search a database or a spreadsheet for all compounds that fall within a particular range of molecular weights. Moreover, you can edit a 2D/3D hypothesis and define 1D searching criteria for it so that you can then use it to search a database or spreadsheet for compounds that match both its 1D and 2D/3D specifications. To create or edit 1D searching criteria for a hypothesis:

  1. In the View Database or View Hypothesis workbench ensure that the databases you want to search are on the shelf (so that Catalyst can generate a complete list of properties from which you can select). Also make certain that any 2D/3D hypotheses with which you want to search are also on the shelf.

  2. To create or edit 1D searching criteria for a hypothesis, select it on the shelf and then select Create/Edit Hypothesis 1D Properties... from the Edit menu. To create a new 1D-only hypothesis, make certain no hypotheses are selected, and then select the Create/Edit Hypothesis 1D Properties... command. Catalyst displays the Create/Edit Hypothesis 1D Properties dialog box shown in the illustration that follows.

  3. Click on the Property Dictionary... button to display the Edit Hypothesis 1D property dictionary dialog box.

  4. Click on the P next to the property whose values you want to restrict to see a description of the property. Drag and drop the property icon into one of the Property fields in the Create/Edit Hypothesis 1D Properties dialog box to add it to the hypothesis. Alternatively, you can type in the name of a property in a Property field.

  5. Make your specifications for the fields or buttons as described below:

Hypothesis. The default value for this field is the name of the previously selected hypothesis or the name of the hypothesis in the workspace of the View Hypothesis workbench if there are no icons selected and there is only one hypothesis in the workspace. If you have no hypotheses selected and are creating a 1D hypothesis, Catalyst supplies a default name of the form Local-n (where n is an integer) and displays a local hypothesis icon on the shelf after you select the Save button.

Min Fit for Search (Max n.n). When searching a database with a hypothesis with weighted features, the value in this field specifies the threshold at which a compound's fit value qualifies as a match. Catalyst considers only mappings involving all weighted features during a database search. The lowest possible (and default) Min Fit for Search value is 0.0. The maximum possible fit value is the sum of the weights of all features; Catalyst computes the maximum value for the hypothesis you selected and substitutes it for the n.n in (Max n.n) next to the Min Fit for Search box. Note that the value you set in this field is functional only during database or spreadsheet searches.

The value you specify for Min Fit for Search controls the number of hits returned from a database search. Higher values specify a more discriminating search (and fewer matches) while lower values return more hits. If the hypothesis with which you are searching can estimate activities, there is a relationship between Min Fit for Search and estimated activity: the maximum value displayed in the dialog box is

-log10 (highest possible estimated activity)

Higher fit values correlate with better estimated activity. Each unit decrease in the value specified for Min Fit for Search allows compounds with estimated activities that are one order of magnitude poorer to match the hypothesis. Specifying a Min Fit for Search of zero would return those compounds that map all functions in the hypothesis and cover the entire range of estimated activities; that is, the maximum number of matches. Note that the estimated activities that Catalyst computes during a database search are approximations and should be considered as relative or qualitative values.
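The arithmetic behind Min Fit for Search can be sketched in Python. The sketch assumes that the relationship stated for the maximum fit (fit = -log10 of estimated activity) holds for every fit value, and that a compound qualifies when its fit is greater than or equal to the threshold; both are assumptions, not documented Catalyst behavior. The function names and the feature weights used in the example are hypothetical.

```python
def max_fit(feature_weights):
    """Maximum possible fit: the sum of the weights of all features."""
    return float(sum(feature_weights))


def qualifies(fit_value, min_fit=0.0):
    """Assumed rule: a compound matches when its fit reaches the threshold."""
    return fit_value >= min_fit


def estimated_activity_at(fit_value):
    """Assumed inverse of fit = -log10(estimated activity)."""
    return 10.0 ** (-fit_value)


# Hypothetical hypothesis with three weighted features.
weights = [2.0, 1.5, 1.5]
ceiling = max_fit(weights)                      # value shown as (Max n.n)
best_activity = estimated_activity_at(ceiling)  # highest possible estimated activity

# Lowering Min Fit by one unit admits activities one order of magnitude poorer.
poorer_activity = estimated_activity_at(ceiling - 1.0)
```

Under these assumptions, a Min Fit for Search of 0.0 admits every compound that maps all features, whatever its estimated activity.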

Property Dictionary... button. Clicking on this button displays the Edit Hypothesis 1D property dictionary dialog box from which you can select properties to include as searching criteria for your hypothesis. The scrolling list of property names is derived from all databases on the current workbench shelf, all databases with at least one compound in the current report in the View Database workbench, and the global property dictionary files, Biocad.bpd and Corporate.bpd. When you click on a property icon a brief description of the property appears in the Description scrolling window, and the property dictionary from which the property is derived is displayed under Dictionary. The property's data type is displayed under Type.

Add Row button. Clicking on this button creates another row of boxes and buttons below the last property search criterion so that you can specify additional 1D requirements and restrictions.

( field. If you want to group a searching criterion so that it is evaluated before another, type a ( (left parenthesis) in this field to mark the beginning of the statement. To delimit the end of the statement, type a ) (right parenthesis) in the ) field. Catalyst evaluates the specification within a pair of parentheses first. For example,

specifies compounds that have both a molecular weight under 500 and a value under 100 for Activity_3, or that have a value under 5 for Activity_1. An example that means, "Find all compounds of low molecular weight that have shown binding at one of the beta receptors" would be:

Note that your specification must contain matched parentheses. If it does not, Catalyst displays an alert box prompting you to match them.

Min field. When you specify a range of values for a 1D data search, type the value for the lower boundary in this field. For example, the specification

locates compounds with pKas between 5 and 7. The 5 in the Min field sets the lower boundary. The Min field also applies to alphanumeric characters, which are evaluated according to the ASCII collating sequence 0-9, A-Z, a-z. The following entry would locate compounds synthesized by four different departments (A706 through A709):

= button. Click on this button to display a popup menu of arithmetic/lexical function symbols with which you can further restrict your 1D search criteria. These functions are defined as follows:

= function. The equals function locates values that are the same as the value you specify for a given property. The equals function is the default value for 1D property searching criteria. When searching for string type data (alphanumeric characters) this function finds only matching strings. For example, specifying Sam for a property called Chemist and searching with the equals function locates all instances of Sam, SAM, sam, and so on; the function is not case sensitive. This query will not find any matches when the value of the Chemist property is Samson; use the approximately equals function when you want to search for partial strings.

Not equal to function. For the specified 1D property, the not equal to function locates all values that are not the same as the one you set.

< function. The less than function finds all values that are smaller than the one you specified for a 1D property.

Less than or equal function. The less than or equal function finds all values that are the same or smaller than the one you specify for a given property.

> function. The greater than function locates all values that are larger than the one you specify.

Greater than or equal function. The greater than or equal function finds all values that are the same or larger than the one you specify.

Approximately equal function. For a given property with a string data type (not numeric types), the approximately equal function locates all strings in which the value you specify occurs. For example, specifying Sam for a property called Chemist and searching with the approximately equal function locates all instances of Samson, Samantha, Sam, SAM, sam, and so on; the function is not case sensitive.

Wildcard characters. You can use the % (percent sign) as a wildcard character to represent unknown characters; for example, specifying a searching string of a%b will locate all character strings that begin with a and end with b. The % wildcard character can also locate substrings within strings: specifying %amino% will find all character strings containing an embedded amino substring. Use the _ (underscore) to represent any one character in a search; for example, %a_b% will locate xyzaNb, but not xyzaNNb. If you want to find a string that includes either the percent or underscore characters, type a \ (backslash) before them in your searching specification. For instance, %a\%b% will find all occurrences of the substring a%b.
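The string-matching rules described above can be approximated in Python as follows. This is a sketch, not Catalyst's implementation: the function names are hypothetical, and it assumes that wildcard patterns are matched against the whole string and are case insensitive like the equals and approximately equal functions (the text does not say either way).

```python
import re


def equals(value, pattern):
    """Equals function: whole-string match, not case sensitive."""
    return value.lower() == pattern.lower()


def approximately_equals(value, pattern):
    """Approximately equal function: substring match, not case sensitive."""
    return pattern.lower() in value.lower()


def wildcard_to_regex(pattern):
    """Translate % (any run of characters), _ (any one character), and
    backslash escapes into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # A backslash makes the next character literal.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
        i += 1
    # Case-insensitive matching is an assumption (see lead-in).
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def wildcard_match(value, pattern):
    """True when the whole value matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None
```

With these definitions, equals("SAM", "Sam") is true while equals("Samson", "Sam") is false, and wildcard_match("xyzaNb", "%a_b%") is true while wildcard_match("xyzaNNb", "%a_b%") is false, mirroring the examples in the text.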

Property field. This field is for the name of the 1D property that is part of the search criterion that you are defining. In addition to dragging a property icon from the Edit Hypothesis 1D property dictionary dialog box and dropping it in the Property field as described in steps 3 and 4 above, you can also specify a property by clicking in the field and typing a property name. Valid property names include those defined for all databases on the current workbench shelf, all databases with at least one compound in the current report, and the global property dictionary files, Biocad.bpd and Corporate.bpd.

Max field. When you specify a range of values for a 1D data search, type the value for the upper boundary in this field. For example, the specification

locates compounds with pKas between 5 and 7. The 7 in the Max field sets the upper boundary. The Max field also applies to alphanumeric characters, which are evaluated according to the ASCII collating sequence 0-9, A-Z, a-z. The following entry would locate compounds synthesized by four different departments (A706 through A709):
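A range test of this kind can be sketched in Python, whose built-in string comparison orders ASCII characters as 0-9, then A-Z, then a-z. The function name in_range is hypothetical, and treating both boundaries as inclusive is an assumption based on the wording "A706 through A709".

```python
def in_range(value, minimum, maximum):
    """True when value lies between the Min and Max boundaries.

    Both boundaries are treated as inclusive (an assumption). Numbers
    compare numerically; strings compare character by character in
    ASCII order, so digits sort before uppercase letters, which sort
    before lowercase letters.
    """
    return minimum <= value <= maximum


# Numeric range: values between 5 and 7.
numeric_hit = in_range(6.2, 5, 7)

# String range: department codes A706 through A709.
codes = ["A705", "A706", "A708", "A709", "A710", "a707"]
department_hits = [c for c in codes if in_range(c, "A706", "A709")]
```

Note that a707 is not selected in this sketch, because lowercase letters sort after uppercase ones in the ASCII sequence.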

) field. When grouping searching criteria so that one restrictive statement is evaluated before another, type a ) (right parenthesis) in this field to mark the end of the statement. To delimit the beginning of the statement, type a ( (left parenthesis) in the ( field. See the description under "( field" above for examples of grouped statements.

Boolean button. Click on this button to display a menu of logic operators with which you can further restrict your 1D search criteria. The NOT operator takes precedence over the AND operator, which in turn takes precedence over the OR operator. These Boolean operators are defined as follows:

AND operator. For statements A, B, and C, the AND of A, B, and C is true if all statements are true, false if any statement is false.

OR operator. For statements A, B, and C, the OR of A, B, and C is true if at least one statement is true, false if all statements are false.

NOT operator. For a statement A, the NOT of A is true if A is false, false if A is true.
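Python's not, and, and or operators follow the same precedence order as described here (NOT before AND before OR), so the effect of grouping can be sketched directly. The thresholds come from the molecular weight example under "( field" above; the function names, the regrouped variant, and the token list passed to the parenthesis check are hypothetical.

```python
def grouped_as_in_text(c):
    """(MolWt < 500 AND Activity_3 < 100) OR Activity_1 < 5"""
    return (c["MolWt"] < 500 and c["Activity_3"] < 100) or c["Activity_1"] < 5


def without_parentheses(c):
    """Same criteria with no parentheses: AND still binds before OR."""
    return c["MolWt"] < 500 and c["Activity_3"] < 100 or c["Activity_1"] < 5


def regrouped(c):
    """MolWt < 500 AND (Activity_3 < 100 OR Activity_1 < 5)"""
    return c["MolWt"] < 500 and (c["Activity_3"] < 100 or c["Activity_1"] < 5)


def parentheses_balanced(tokens):
    """Check that every ( has a matching ) in a sequence of tokens."""
    depth = 0
    for token in tokens:
        if token == "(":
            depth += 1
        elif token == ")":
            depth -= 1
            if depth < 0:  # a ) appeared before its (
                return False
    return depth == 0


# A heavy compound that is nevertheless potent in Activity_1.
compound = {"MolWt": 650, "Activity_3": 50, "Activity_1": 2}
```

For this compound, the first two functions return True (the Activity_1 criterion alone is enough), while regrouped returns False because the molecular weight requirement now applies to every match.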

Save button. Click on this button when you have finished creating or editing 1D properties. Catalyst assigns a name to the hypothesis of the form hypothesisname-1D-n where hypothesisname is the name of the hypothesis you edited and n is an integer. If you created a 1D-only hypothesis, the name is of the form Local-n. Catalyst places a local hypothesis on the shelf of the currently open workbench. You can use the hypothesis with the Fast Flexible Search Databases/Spreadsheets or Best Flexible Search Databases/Spreadsheets command from the Tools menu to locate compounds that match your 1D criteria, or you can edit the 1D properties further. If you want to retain the hypothesis for future use, be certain to select the hypothesis and then the Save To Lab As... command from the Data menu before you dispose of the current workbench, and finally select the Save StockroomDB command from the Stockroom's Data menu before you exit Catalyst.

Cancel button. Click on this button to close the dialog box without making any changes.

Help button. Click on this button to display the Help window.

To Edit a Spreadsheet

Changing Values in a Spreadsheet

For a spreadsheet derived from databases you created, you can change values for editable properties such as Activ, Chemist, CAS_num, and so on. See "Creating Databases" for more information on creating your own databases. Follow these steps to change values for editable properties:

  1. With the spreadsheet in the report area, select the value you want to change by clicking on it. Catalyst displays a dashed yellow outline around the cell containing the value and also displays the value in the Edit box above the report area. If the cell contains a value with more characters than can be displayed in the report style or if you want to enter a multiline value, see "Editing Multiline Values in a Spreadsheet".

  2. To enter a new value into an empty cell, or to replace all of the characters in the selected cell, just start typing after selecting the cell. Alternatively, to edit some characters in a cell, click in the Edit box, press the Backspace or Delete key to remove unwanted characters, and then type in a new value. If you did not create the database from which the spreadsheet is derived, you do not have write access to the spreadsheet and cannot change values in it. Attempting to change such values causes Catalyst to beep. A property's data type determines which characters are valid entries for a property value:

    Integer data type. Numerals 0 through 9.

    Real data type. Numerals 0 through 9, e, - (minus), + (plus), and . (decimal point).

    String data type. All printable characters.

    Date data type. Alphanumeric characters A through Z; a through z; 0 through 9; - (hyphen); , (comma); and space. Catalyst supports the following date formats in which you should substitute the day, month, and year for d, m, and y:

    mm-dd-yyyy             1-23-1994
    mmm-dd-yyyy            Jan-23-1994
    mmmmmmmmm dd, yyyy     January 23, 1994
    dd-mm-yyyy             23-1-1994
    dd-mmm-yyyy            23-Jan-1994
    dd mmmmmmmmm yyyy      23 January 1994

    Only one date format is allowed per Catalyst session.

    Note: Default names for the months of the year are stored in $CATALYST_CONF/month.data so that the natural language in which they are expressed can be changed (for example, janvier for January, février for February, and so on). Since Catalyst first looks for the month.data file in the directory in which you invoked Catalyst, then in your home directory, and finally in $CATALYST_CONF, you can make your own customized file, store it in the directory from which you use Catalyst or in your home directory, and use it instead of the default version in $CATALYST_CONF.

    A quick way to find out a property's data type is to select Sort by Property from the Tools menu to display the Sort by Property dialog box and click on the P icon next to the name of the property. Catalyst displays the data type in the box under Type. Select Cancel to close the dialog box.

  3. After typing in a new value for a property, press the Enter or Return key to transfer the new value to the report area. Catalyst makes the cell containing the new value gray ("dirty") to indicate that the value differs from the one in your source database. Catalyst also moves the dashed yellow outline to the next cell for the same property.

    Note on Navigation Keys: Once you have selected a value by clicking on it, you can use keystrokes to move the dashed yellow outline over other cells in the spreadsheet. Activate navigation keys by clicking in the Edit box. On Silicon Graphics machines, the arrow keys advance the dashed yellow outline in the report area to the left, right, up, and down respectively. In addition, pressing the Tab key moves the dashed yellow outline to the right and pressing the Shift and the Tab keys simultaneously moves it to the left.

  4. If you want to change values for a property not displayed by the currently active report style, double-click on the name of a property in a variable field (such as Activ in Compound-Table report) in the report area. Catalyst displays the Change Report Property dialog box containing a scrolling list of the property names for the spreadsheet's source databases as well as for all databases with icons on the current workbench shelf and the global property dictionary files $CATALYST_CONF/Biocad.bpd and $CATALYST_CONF/Corporate.bpd.

  5. Click on the P icon next to the name of the property whose values you want to change, and then select the Change button. Catalyst substitutes the property and its associated values for the property whose name you double-clicked on. (Data in any edited gray cells for the previously displayed property are not lost; they are just no longer displayed.)

  6. Select and edit the property's values as described in steps 1 through 3 above. You can double-click on this property's name after you have finished editing values and change to another editable property for further revisions.

  7. To remove a compound and its associated data from a spreadsheet report, select the compound you want to delete by clicking on its number in the Row column of a Compound-Table report or double-clicking on a 2D or 3D drawing of it in any of the other report styles that come with Catalyst.

  8. After selecting a row as described above, select Clear Selected Report Rows from the Edit menu. Catalyst removes the compound and all of its associated data from the copy of the spreadsheet in the report area.

  9. To save this edited copy of the spreadsheet without changing the values in the original spreadsheet, select Save Report To Lab As Spreadsheet... from the Edit menu, specify a unique name, and select the Save button. For detailed information on the Save Report To Lab As Spreadsheet... command, see "Saving Hit Compounds as a Spreadsheet." To make your changes apply to the original spreadsheet, select the Save Report To Lab As Spreadsheet... command, leave the spreadsheet's name unchanged and select the Save button. Catalyst displays an Alert box notifying you that a spreadsheet of the same name exists and requests confirmation to overwrite it.

  10. To keep the contents of your saved spreadsheets consistent with the values stored in your databases, you should always execute the Save StockroomDB command after you have used the Commit 1D Changes To Database command to update a database. In this way, your saved Stockroom files will reflect the changes you made to the 1D values in your databases.
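The data-type rules and date formats listed in step 2 can be sketched in Python. The mapping from each Catalyst date format to a strptime pattern, the function names, and the idea of passing a single session format are assumptions made for illustration. Python's %b and %B codes match month names in the current locale (English by default), which only loosely corresponds to the month.data mechanism described in the note.

```python
import re
from datetime import datetime

# Catalyst date formats from step 2, mapped to assumed strptime patterns.
DATE_FORMATS = {
    "mm-dd-yyyy": "%m-%d-%Y",
    "mmm-dd-yyyy": "%b-%d-%Y",
    "mmmmmmmmm dd, yyyy": "%B %d, %Y",
    "dd-mm-yyyy": "%d-%m-%Y",
    "dd-mmm-yyyy": "%d-%b-%Y",
    "dd mmmmmmmmm yyyy": "%d %B %Y",
}


def parse_date(text, session_format):
    """Parse a date using the one format chosen for the session.

    Raises ValueError when the text does not fit that format.
    """
    return datetime.strptime(text, DATE_FORMATS[session_format]).date()


def is_valid_integer(text):
    """Integer data type: numerals 0 through 9 only."""
    return re.fullmatch(r"[0-9]+", text) is not None


def is_valid_real(text):
    """Real data type: numerals, e, -, +, and the decimal point,
    arranged so that the text reads as a number."""
    if re.fullmatch(r"[0-9e+\-.]+", text) is None:
        return False
    try:
        float(text)
    except ValueError:
        return False
    return True
```

Because 1-23-1994 and 23-1-1994 differ only in field order, parsing 23-1-1994 with the mm-dd-yyyy format fails in this sketch, which suggests why only one date format is allowed per session.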

Editing Multiline Values in a Spreadsheet

When you need to add or edit multiline values in a spreadsheet, you must use the Large Data Editor window as follows:

  1. Click in the cell containing the value you want to edit. Catalyst displays a dashed yellow rectangle to indicate that the cell is selected.

  2. Click on the Large Editor button to the right of the Edit box to display the Large Data Editor window. Catalyst shows the cell's data in this window.

  3. Click in the Large Data Editor window next to the characters you want to edit or where you want to insert new values. Use the Backspace or Delete key to remove characters and type in your new text. Press the Enter or Return key to start new lines of text. You can use the Tab and arrow keys on your keyboard to move the cursor among the characters in the Large Data Editor window. To navigate among cells in the spreadsheet, click on the buttons in the Large Data Editor window; Catalyst moves the dashed yellow selection rectangle to the appropriate cell and displays its contents in the window.

  4. When you have finished editing values in the cell, click on the Update button to put your edited changes in the spreadsheet. To remove the Large Data Editor window, click on the Dismiss button. To make your changes take effect in the spreadsheet's source database, see "To Update 1D Values in a Database".

To Update 1D Values in a Database

To update 1D values in a database that you created (i.e. to which you have write access), first make your changes in a spreadsheet derived from it (by searching/browsing in the View Database workbench or by sampling a compound from the StockroomDB to the Generate Hypothesis workspace), save the changed spreadsheet with the Save Report To Lab As Spreadsheet... command, and follow the procedure below to use the Commit 1D Changes To Database command. For additional information on changing values in a spreadsheet, see "To Edit a Spreadsheet".

  1. Make certain you have saved the StockroomDB by selecting the Save StockroomDB command from the Data menu in the Stockroom.

  2. Drag and drop the icon of the spreadsheet containing the values with which you want to update its source databases on the View Database tool.

  3. When the workbench opens, drag and drop the spreadsheet into the report area to see the current list of compounds and current gray cells containing changed data.

  4. From the Databases menu select Commit 1D Changes to Database. Catalyst matches the gray cells in your spreadsheet to the corresponding cells in the spreadsheet's source databases and replaces the database values with the ones in the gray cells. Catalyst also removes the gray from the "dirty" cells (the ones containing the values you changed, saved, and committed to the database) in this spreadsheet to show you that these values are now the ones in the database. Moreover, Catalyst updates the values in corresponding cells in other spreadsheets containing them unless those cells contain previously saved "dirty" data and are already gray.

    Note: Executing the Commit 1D Changes to Database command makes your changes take effect immediately; that is, you cannot "undo" this mass data update by exiting Catalyst without saving your work. It is also recommended that you dispose of open workbenches and save the StockroomDB after performing the Commit 1D Changes to Database command to ensure database/spreadsheet file synchronization.
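The bookkeeping behind gray ("dirty") cells and the commit step can be modeled with a small Python sketch. The Cell class, the dictionaries keyed by (compound, property), and the function names are hypothetical; they illustrate the behavior described in step 4, not Catalyst's internal storage.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    value: object
    dirty: bool = False  # True corresponds to a gray cell


def edit_cell(sheet, key, new_value):
    """Changing a value marks the cell as dirty (gray)."""
    sheet[key] = Cell(new_value, dirty=True)


def commit_1d_changes(sheet, database, other_sheets=()):
    """Write dirty values to the database and clear their gray state.

    Corresponding cells in other spreadsheets are refreshed unless they
    hold their own uncommitted (dirty) edits.
    """
    for key, cell in sheet.items():
        if not cell.dirty:
            continue
        database[key] = cell.value
        cell.dirty = False
        for other in other_sheets:
            other_cell = other.get(key)
            if other_cell is not None and not other_cell.dirty:
                other_cell.value = cell.value


# Hypothetical data: one compound, one editable property.
key = ("cmpd-1", "Chemist")
database = {key: "Sam"}
edited = {key: Cell("Sam")}
clean_copy = {key: Cell("Sam")}
dirty_copy = {key: Cell("Lee", dirty=True)}

edit_cell(edited, key, "Pat")
commit_1d_changes(edited, database, [clean_copy, dirty_copy])
```

After the commit in this sketch, the database and clean_copy hold the new value, the edited cell is no longer gray, and dirty_copy keeps its own pending edit.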

To Change a Report's Layout

Customizing a Report of Spreadsheet Data

The report style you select determines the data that is displayed for compounds in the report area, but you can customize a report's layout by selecting different properties for variable fields. Most report styles contain both fixed and variable fields. Properties and their associated values that appear in fixed fields are always displayed. In the report styles that come with Catalyst, properties such as compound name and molecular weight are fixed for most styles. You can replace properties in variable fields with other properties selected from the Change Report Property dialog box. Variable fields are marked with square brackets [ ] in tabular style reports and with an icon (shown below) in other report styles. To specify different properties for display in a report:

  1. With your spreadsheet in the report area, select the style you want from the ReportStyle menu.

  2. Double-click on the name of a property in a variable field. Catalyst displays the Change Report Property dialog box containing a scrolling list of all the property names in the spreadsheet's source databases, all databases with icons on the current workbench shelf, and the global property dictionary files Biocad.bpd and Corporate.bpd.

  3. Click on the P icon next to the name of the property whose values you want to display in the report, and then select the Change button. Catalyst substitutes this property and its associated values for the one whose name you double-clicked on, and displays them in the report area. The values displayed for the previous property are not changed or lost, just not currently displayed.

Displaying Computed Property Data

In addition to the data associated with properties defined in the property dictionary file for a database, the $CATALYST_CONF/Biocad.bpd property dictionary file provides the following additional properties for which you can compute values: Fit (calculated fit), BestFit (best calculated fit), Est (estimated activity), BestEst (best estimated activity), NumFrags (number of fragments in compound), ConfModelSize (number of conformers in conformational model), NumAtoms (total number of atoms in compound), HeavyAtoms (number of nonhydrogen atoms in compound), CarbonAtoms (number of carbon atoms in compound), HeteroAtoms (number of heteroatoms in compound), RotBonds (number of rotatable bonds in compound), EndoCycBonds (number of endocyclic bonds in compound), UnsatBonds (number of unsaturated bonds in compound), KnownSC (number of known stereocenters in compound), Cref (compound's internal reference number in source database), DBName (name of source database containing compound).

To display values for computed properties:

  1. Change a property in a variable field to one of the computed properties listed above. See "Customizing a Report of Spreadsheet Data" for details on changing properties in reports. When a spreadsheet is in the report area, Catalyst displays the name of the computed property and its associated values for all computed properties except Fit, BestFit, Est, and BestEst. Otherwise, the values of the computed properties (except Fit, BestFit, Est, and BestEst) are displayed when you browse or search databases and spreadsheets.

  2. To compute values for Fit, BestFit, Est, or BestEst, you must have a spreadsheet with those properties in the report area and a hypothesis selected on the shelf before you select Compute Property... from the Tools menu. Catalyst displays the Compute Property dialog box.

  3. Select Fit, BestFit, Est, or BestEst. Catalyst highlights the property you select, displays a description of it, the name of the property dictionary in which the property is defined, and its data type. The name of the selected hypothesis appears in the Hypothesis text box.

  4. To change the hypothesis for which you want to compute Fit, BestFit, Est, or BestEst, click in the Hypothesis text box, use the Backspace key to remove unwanted characters, type in the name of another hypothesis, and press the Enter key.

  5. To compute values for the property, select the Compute button. While Catalyst is processing the computation, it displays a Busy box that indicates the number of compounds for which the calculation has been completed. The Busy box also contains a Stop button that halts what can be a time-consuming computation depending on the nature of the hypothesis and the compounds in the spreadsheet.

You can save spreadsheets containing data for computed properties, and you can also sort the compounds in them by their computed properties. For example, you can sort compounds in a spreadsheet in ascending or descending order by their estimated activities, fit values, molecular weights, and so on. See "Sort by Property..." for information on reordering compounds by their values for individual properties.

You cannot search for values of computed properties because they are not stored in the database, but rather are calculated when you search or browse it.
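To illustrate why such values can be calculated on the fly rather than stored, the atom-count properties can be derived from a list of element symbols, as in the Python sketch below. The function name, the input representation, and the definition of a heteroatom as any atom other than carbon or hydrogen are assumptions; Catalyst's own definitions may differ.

```python
from collections import Counter


def atom_counts(elements):
    """Derive count-style computed properties from element symbols."""
    counts = Counter(elements)
    num_atoms = len(elements)
    heavy = num_atoms - counts["H"]   # nonhydrogen atoms
    carbons = counts["C"]
    return {
        "NumAtoms": num_atoms,
        "HeavyAtoms": heavy,
        "CarbonAtoms": carbons,
        "HeteroAtoms": heavy - carbons,  # assumed: not C and not H
    }


# Hypothetical compounds given as flat lists of element symbols.
compounds = [
    {"Name": "ethanol", "elements": ["C", "C", "O"] + ["H"] * 6},
    {"Name": "methane", "elements": ["C"] + ["H"] * 4},
]

# Sorting by a computed property, in ascending order of HeavyAtoms.
by_heavy_atoms = sorted(
    compounds, key=lambda c: atom_counts(c["elements"])["HeavyAtoms"]
)
```

Nothing is stored with the compounds here; the counts are recomputed whenever they are needed, which is consistent with the restriction that computed properties cannot be used as search criteria.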

Introduction to the Workbench

You can use the View Database Workbench to browse and search for compounds in spreadsheets, databases, or your StockroomDB. You can search any database that was in your Stockroom at the start of your current session or that you have installed with the Install Database... command. For information on how to install databases, see "Install Database."

You can use the View Database workbench to browse through a database to see all the compounds, or you can search for compounds that meet specified criteria using a hypothesis. The database search process can use any hypothesis you built in the View Hypothesis workbench, QuickTool, or generated in the Generate Hypothesis workbench. In addition, you can use the View Database workbench to search for particular classes of compounds such as beta-lactams, compounds containing particular functional groups, or ones that have particular properties such as a positive charge or hydrogen bond donor properties, or that belong to a class of compounds with a particular type of activity.

To open a View Database workbench, double-click on the View Database tool in the tool bar or drop a database, spreadsheet, compound, or hypothesis icon on the View Database tool in the tool bar.

Tasks You Can Perform in the View Database Workbench

You can do the following tasks in this workbench:

To Browse a Database or Spreadsheet

The Browse Databases/Spreadsheets command lets you look at the contents of one or more databases (including the StockroomDB) and spreadsheets. To view the data in databases and spreadsheets:

  1. Ensure that the View Database workbench shelf contains the icons for the databases and spreadsheets through which you want to browse.

  2. From the ReportStyle menu, select the type of report you want. For more information on report styles, see "ReportStyle Menu."

  3. Drag and drop the icons for the databases and spreadsheets you want to view onto the report area of the workbench to begin browsing. Alternatively, select the database and spreadsheet icons on the shelf and then select the Browse Databases/Spreadsheets command from the Tools menu. Catalyst displays a Busy dialog box with a slider at the bottom indicating the progress of data retrieval.

    You can halt the retrieval process before it is complete by selecting the Stop button in the Busy dialog box, and Catalyst will display data for only those compounds it located before you selected Stop. Otherwise, when retrieval is complete, the data for the compounds appear in the report area and are displayed according to the style of report you selected.

    Note: The total number of compounds that the Browse Databases/Spreadsheets command retrieves is limited locally to the number specified in the Max Search Hits field of the View Database Options dialog box and globally to the number set in the Max Search Hits field of the View Database Preferences dialog box. The local limit takes precedence over the global one. For information on specifying these values, see "Specifying the Maximum Number of Search Hits for the Current Workbench" and "Workbench Preferences - View Database."

  4. Use the horizontal and vertical scroll bars to view any entries which are not visible. (Drag the left scroll bar down to move information in the report area up and display information at the bottom of the report page. Drag the horizontal scroll bar to the right to move information in the report area to the left and display material on the right side of the report page. Drag the right scroll bar down to display information that appears on succeeding report pages.)

  5. To display a compound from a Compound-Table or HypoGen report, double-click on its number in the Row column. From other report styles, double-click on a 2D or 3D drawing of the compound. Catalyst brings up the Hit Mappings Row dialog box and displays the selected compound in it. For details on the Hit Mappings Row dialog box, see "Displaying a Hit Compound."

Note: Simultaneously browsing multiple databases and/or spreadsheets is a way to combine their contents into one spreadsheet. To find multiple occurrences of the same compound in the resulting spreadsheet, you can use the Sort by Property... command from the Tools menu to reorder the contents of the spreadsheet by name. Duplicate names will be next to each other in the Name field, and you can eliminate unwanted copies with the Clear Selected Report Rows command from the Edit menu.
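The sort-then-inspect approach in the note above can be sketched outside Catalyst as follows. This is only an illustration in Python: the row layout (a dictionary with a "Name" key) and the function name are assumptions made for the example, not part of Catalyst.

```python
# Hypothetical model of a combined report: each row is a dict with a "Name" key.
def sort_and_flag_duplicates(rows):
    """Sort rows by name and return (sorted_rows, indices_of_repeated_names)."""
    ordered = sorted(rows, key=lambda row: row["Name"])
    # After sorting, rows with the same name are adjacent, so a repeat is any
    # row whose name equals the name of the row just before it.
    repeats = [
        i for i in range(1, len(ordered))
        if ordered[i]["Name"] == ordered[i - 1]["Name"]
    ]
    return ordered, repeats


combined = [
    {"Name": "compound-B", "Source": "database-1"},
    {"Name": "compound-A", "Source": "database-1"},
    {"Name": "compound-B", "Source": "spreadsheet-2"},
]
ordered, repeats = sort_and_flag_duplicates(combined)

# Dropping the flagged rows corresponds to clearing the unwanted copies.
unique_rows = [row for i, row in enumerate(ordered) if i not in repeats]
```

Because the sort is stable, the first occurrence of each name is the one that is kept.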


To Search a Database

Before using the View Database workbench to search a database or spreadsheet for compounds that fit a hypothesis, you must first create a hypothesis with the View Hypothesis QuickTool or the View Hypothesis workbench, or have Catalyst generate one with the Generate Hypothesis workbench.

For example, if you want to search a database for all compounds that contain a beta-lactam group, create a hypothesis consisting of a single beta-lactam functional group. If you want to search a database for compounds that contain the functional parts of an ACE antagonist, build a hypothesis that represents the distinguishing characteristics of ACE antagonists. If you want to search the database for other compounds that might exhibit the same type of activity as a series of compounds you have studied, use the Generate Hypothesis workbench to generate a hypothesis based on those compounds. Finally, if you want to search a database for values of "1D" properties such as name or molecular weight, use the Create/Edit Hypothesis 1D Properties... command from the Edit menu in the View Database workbench to construct a 1D hypothesis that you can use as a search query.

Differences between Fast and Best Flexible Searching

The concept of fast and best is a consistent theme in Catalyst algorithms. In general, fast algorithms are intended to give you an approximate solution to a problem quickly. If the solution is "interesting", you can apply the best algorithm to enhance your results. Putting it another way, fast and best algorithms offer trade-offs between speed and quality.

The Fast Flexible Search Databases/Spreadsheets command uses precomputed conformations to model the flexibility of a molecule during a search; the fast algorithm finds the best fit among existing conformers. Best Flexible Search Databases/Spreadsheets has the ability to modify the conformations of molecules during execution to provide a more precise database/spreadsheet search; the best algorithm finds the best fit among conformations, permitting no conformer's energy to rise by more than the default of 40,000 joules (about 9.5 kcal). You can change the default by specifying a value for the flexfit.excessEnergyPerConf parameter in your .Catalyst file in your home directory with the following statement:

flexfit.excessEnergyPerConf = n

Catalyst uses the value you substitute for n (in joules) when you use the Best Flexible Search Databases/Spreadsheets command.
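Because the parameter is given in joules while energy limits are often quoted in kcal, a small conversion helper may be convenient. The Python sketch below is not part of Catalyst; the function names are illustrative, the conversion assumes the thermochemical calorie (1 kcal = 4184 J), and writing the value as a whole number is an assumption about what the .Catalyst file accepts.

```python
JOULES_PER_KCAL = 4184.0  # thermochemical calorie: 1 kcal = 4184 J


def kcal_to_joules(kcal):
    """Convert an energy in kcal to joules."""
    return kcal * JOULES_PER_KCAL


def joules_to_kcal(joules):
    """Convert an energy in joules to kcal."""
    return joules / JOULES_PER_KCAL


def excess_energy_statement(joules):
    """Format the statement for the .Catalyst file (value rounded to a whole number)."""
    return f"flexfit.excessEnergyPerConf = {round(joules)}"


# The default of 40,000 joules expressed in kcal (roughly 9.56).
default_in_kcal = joules_to_kcal(40000)

# A hypothetical tighter limit of 5 kcal, expressed as a statement in joules.
statement = excess_energy_statement(kcal_to_joules(5))
```

With these definitions, `statement` holds the text `flexfit.excessEnergyPerConf = 20920`.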

In general, you should use Fast Flexible Search Databases/Spreadsheets when speed is a primary consideration. Best Flexible Search Databases/Spreadsheets is appropriate for

What Determines if a Compound Matches a Hypothesis?

When you use a hypothesis to search a database or spreadsheet, Catalyst compares the hypothesis to each compound in them. The hypothesis matches a compound if

If the hypothesis contains a hydrogen bond donor function and a CCX fragment (where X is specified to be either oxygen or nitrogen), a matching compound must contain any set of atoms that can function as a hydrogen bond donor and also have either (or both) a CCN or CCO fragment. If any of the required atom specifications or functions in the hypothesis cannot be satisfied by the compound, the compound fails the topology test.
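The topology test in the example above can be pictured as a set of requirements, each of which lists its acceptable alternatives. The Python sketch below is a deliberately simplified model and not Catalyst's actual matching code: the feature labels ("HB_DONOR", "CCO", "CCN") and the function name are hypothetical, and a real match works on atoms rather than on precomputed labels.

```python
def passes_topology_test(hypothesis, compound_features):
    """Return True if every requirement is met by at least one alternative.

    hypothesis: list of sets; each set holds the acceptable alternatives
                for one required function or fragment.
    compound_features: set of labels the compound can provide.
    """
    return all(
        any(option in compound_features for option in alternatives)
        for alternatives in hypothesis
    )


# A hydrogen bond donor plus a CCX fragment where X is oxygen or nitrogen.
hypothesis = [
    {"HB_DONOR"},
    {"CCO", "CCN"},
]

compound_a = {"HB_DONOR", "CCO", "CCN"}  # donor and both fragments
compound_b = {"HB_DONOR", "CCN"}         # donor and one fragment
compound_c = {"CCO"}                     # fragment but no donor

results = [
    passes_topology_test(hypothesis, compound_a),
    passes_topology_test(hypothesis, compound_b),
    passes_topology_test(hypothesis, compound_c),
]
```

In this model the first two compounds pass, and the third fails because the donor requirement cannot be satisfied.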

Searching for Substructures

The most efficient way to search a database or spreadsheet for compounds that contain a particular substructure or combination of substructures, features, and functions is to create a hypothesis that represents them as follows:

  1. Open the View Database workbench, select the View Hypothesis QuickTool, and create a hypothesis that defines the chemical substructures for which you want to search.

  2. Select Return from QuickTool from the QuickTool menu. Catalyst brings the hypothesis into the View Database workbench, places a local object representing the hypothesis on the shelf, and assigns it a unique name of the form Local-n where n is an integer. Use the Save To Lab As... command from the Data menu if you want to retain this hypothesis.

  3. Use the hypothesis to search a database or a spreadsheet as described in "To Search a Database or Spreadsheet for Compounds That Match a Hypothesis."

If a particular substructure that you need is not available in the Feature Dictionary, an alternate procedure for quickly producing a hypothesis includes these steps:

  1. Select the View Compound QuickTool to draw the substructure, and then select Return from QuickTool from the QuickTool menu.

  2. Drop the local compound on the View Hypothesis QuickTool, convert the compound to a hypothesis, and edit it.

  3. Select Return from QuickTool from the QuickTool menu and use the hypothesis for searching databases and spreadsheets.


To Search a Database or Spreadsheet for Compounds That Match a Hypothesis

The Fast Flexible Search Databases/Spreadsheets and Best Flexible Search Databases/Spreadsheets commands let you find and retrieve compounds that match specified criteria expressed in a hypothesis. (See "Differences between Fast and Best Flexible Searching" for advice on when to use each command.)

Database searching can be carried out in parallel processes. For more information about setting up parallel searches, see "Client-Server Operation".

To search one or more databases and/or spreadsheets:

  1. Ensure that the View Database shelf contains the databases and spreadsheets you want to search as well as the hypothesis with which to search them.

  2. From the ReportStyle menu, select the type of report you want. Compound-Table is the default report style. For detailed information on report styles, see "ReportStyle Menu."

  3. Select the databases, spreadsheets, and hypothesis on the shelf, and then select Fast Flexible Search Databases/Spreadsheets or Best Flexible Search Databases/Spreadsheets from the Tools menu. Catalyst displays a Busy dialog box with a slider at the bottom indicating the progress of the search. (You can halt the process before it is complete by selecting the Stop button in the Busy dialog box, and Catalyst displays data for only those compounds it has retrieved before you selected Stop.)

  4. When the search is complete, Catalyst displays the number of hits (compounds that matched your hypothesis) in the Status Area. The data for the matching compounds appear in the report area according to the style of report you selected. If you searched a single database or a single spreadsheet, Catalyst assigns a unique temporary name of the form

    Search-Database/SpreadsheetName-n

    (where Database/SpreadsheetName is the name of the source database or spreadsheet that you searched and n is an integer) to the results of your search and places a local (temporary) spreadsheet icon representing it on the shelf. Similarly, if you searched multiple objects, Catalyst creates a local icon and names it Local-n.

    Note: The number of compounds that Fast Flexible Search Databases/Spreadsheets or Best Flexible Search Databases/Spreadsheets retrieves is limited locally to the number specified in the Max Search Hits field of the View Database Options dialog box and globally to the number set in the Max Search Hits field of the View Database Preferences dialog box. The local setting takes precedence over the global one. For information on specifying these values, see "Specifying the Maximum Number of Search Hits for the Current Workbench" and "Workbench Preferences - View Database."

  5. Use the horizontal and vertical scroll bars to see any entries which are not visible. (Drag the left scroll bar down to move information in the report area up and display information at the bottom of the report page. Drag the horizontal scroll bar to the right to move information in the report area to the left and display material on the right side of the report page. Drag the right scroll bar down to display information that appears on succeeding report pages.)

  6. To display a compound from a Compound-Table or HypoGen report, double-click on its number in the Row column. Catalyst brings up the Hit Mappings Row dialog box and displays the compound that you selected in it. For information on viewing the compound in the Hit Mappings Row dialog box, see "Displaying Hit Compounds."


To Save Search/Browse Results

When database or spreadsheet searching or browsing is complete, Catalyst displays the number of hits (compounds that matched your hypothesis) in the Status Area. The data for the matching compounds appear in the report area according to the style of report selected. In addition, if you search or browse only one object, Catalyst creates a local spreadsheet on the shelf and assigns it a unique temporary name of the form

Browse-Database/SpreadsheetName-n or

Search-Database/SpreadsheetName-n

(where Database/SpreadsheetName is the name of the source database or spreadsheet, and n is an integer). If you search or browse multiple objects, Catalyst creates a local spreadsheet on the shelf and assigns it a unique temporary name of the form Local-n, where n is an integer.
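The naming rule described above can be summarized in a few lines of Python. This is a sketch of the convention only: the function name is hypothetical, and how Catalyst chooses the integer n is not stated here, so n is simply passed in.

```python
def local_result_name(operation, sources, n):
    """Return the temporary name for the local spreadsheet holding the results.

    operation: "Browse" or "Search"
    sources:   names of the databases and spreadsheets that were processed
    n:         the integer that makes the name unique (chosen by the caller here)
    """
    if operation not in ("Browse", "Search"):
        raise ValueError("operation must be 'Browse' or 'Search'")
    if not sources:
        raise ValueError("at least one source is required")
    if len(sources) == 1:
        # One object: the name records the operation and the source.
        return f"{operation}-{sources[0]}-{n}"
    # Several objects: the name is generic.
    return f"Local-{n}"


single = local_result_name("Search", ["MyDatabase"], 1)
browsed = local_result_name("Browse", ["MySpreadsheet"], 2)
several = local_result_name("Search", ["MyDatabase", "MySpreadsheet"], 3)
```

Here `single` is `Search-MyDatabase-1`, `browsed` is `Browse-MySpreadsheet-2`, and `several` is `Local-3`.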

Note: Simultaneously browsing multiple databases and/or spreadsheets is a way to combine their contents into one spreadsheet. To find multiple occurrences of the same compound in the resulting spreadsheet, you can use the Sort by Property... command from the Tools menu to reorder the contents of the spreadsheet by name. Duplicate names will be next to each other in the Name field, and you can eliminate unwanted copies with the Clear Selected Report Rows command from the Edit menu.

The spreadsheet resulting from searching or browsing is local; that is, it exists only as long as the View Database workbench in which it was created exists. Unless you explicitly use a command from the Data menu to save the local spreadsheet before you dispose of the View Database workbench, the local spreadsheet will be lost, and you will have to repeat the search or searches that generated the data if you need it again. To keep the results, see "Saving Hit Compounds as a Spreadsheet."

Saving Hit Compounds as a Spreadsheet

To save the hits from a database/spreadsheet browse or search as a spreadsheet, follow these steps:

  1. With the data from searching or browsing in the report area, select Save Report To Lab As Spreadsheet... from the Data menu. Alternatively, you can select the icon on the shelf representing the local spreadsheet you want to save and then select the Save Report To Lab As Spreadsheet... command. Catalyst displays the Save To Shelf dialog box.

  2. If you want to save the spreadsheet in a lab you have already created (rather than saving it directly into the Stockroom), click on the Stockroom button in the Save To Shelf dialog box to display a list of your labs and then click on the name of one to select it.

  3. You can accept the default local name in the Name box, or you can click in it, press the Backspace or Delete key to delete unwanted characters, type a name for the spreadsheet, and then select the Save button. Catalyst saves the spreadsheet with the name you specified. If you selected a lab, the spreadsheet is saved in it; otherwise, Catalyst saves it in the Stockroom. Catalyst also replaces the local icon with a spreadsheet icon that has the name you specified.

  4. If you want to retain this spreadsheet in your Stockroom or the lab you specified so that you can use it in future sessions, be certain to select the Save StockroomDB command from the Data menu in the Stockroom before you exit Catalyst.


Saving a Hit Compound

To save an individual compound you located with the Browse Databases/Spreadsheets, Fast Flexible Search Databases/Spreadsheets, or Best Flexible Search Databases/Spreadsheets command:

  1. From a Compound-Table report (the default style), double-click on the compound's row number to display the Hit Mappings Row dialog box. From other report styles, double-click on a 2D or 3D drawing of the compound to display the dialog box.

  2. Click on the Save To Shelf button in the Hit Mappings Row dialog box to display the Save To Shelf dialog box showing the hit compound and its name.

  3. If you want to save the compound in a lab you have already created (rather than directly into the Stockroom), click on the Stockroom button to display a list of your labs and then click on the name of one to select it.

  4. If you want to save the hit compound with a different name, click in the Name box, press the Backspace key to remove unwanted characters, and type a new name. Then select the Save button. Catalyst saves the compound with the name shown in the Name box. If you selected a lab, the compound and its icon are saved in it; otherwise, Catalyst saves the compound in the Stockroom and places an icon for it there. Catalyst also places an icon for the compound on the shelf of the currently open View Database workbench.

  5. If you want to retain this compound in your Stockroom or the lab you specified so that you can use it in future sessions, be certain to select the Save StockroomDB command from the Data menu in the Stockroom before you exit Catalyst.

Creating Databases

You can build your own Catalyst database from a spreadsheet with the Create Database... command in the Databases menu or with the catDB utility program. See "Building a Database from a Subset of Another Database" for procedures on creating new databases from spreadsheets. For background information on Catalyst databases and the various ways of creating and maintaining them, see "Introduction to Catalyst Databases" and "catDB Database Management."

Export

Exporting the Results of a Database/Spreadsheet Browse or Search

You can save the results of searching or browsing a database or spreadsheet in disk files outside of Catalyst with the Export command in the Data menu. However, you must first save the local results of the browse or search to the shelf before you can execute the Export command. For details, see "Saving Hit Compounds as a Spreadsheet" and "To Export an Object."

Displaying Hit Compounds

Displaying a Hit Compound

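After you browse or search a database or spreadsheet, Catalyst places the hit compounds (the ones that match your hypothesis or that are included in the set returned by browsing) in the report area of the workbench. You can display a hit compound in a 3D workspace by double-clicking on its number in the Row column of a Compound-Table or HypoGen report, or by double-clicking on its 2D or 3D drawing in other report styles.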

When Catalyst displays the Hit Mappings Row dialog box with one of the conformers of a hit compound in the dialog box's 3D workspace, if the query contained a 2D substructure, the mapping (the parts of the compound that match the hypothesis) appears in a dotted style, as shown in the example below.

If you select Save to Shelf in the Hit Mappings Row dialog box, Catalyst displays the Save to Shelf dialog box so that you can save the hit compound (along with all the conformers that were stored with it in the database) in a lab or in the Stockroom. If the compound was retrieved from the StockroomDB, an attempt to save the compound elicits an Alert message advising you that the compound already exists. You can rotate, move, and resize the molecule in the dialog box's workspace just as you can in any 3D workspace.

To view each mapping between the hypothesis and the compound:

  1. Make certain that Mappings is showing on the button below the Conf Energy box. If Confs is showing on the button, click on it and select Mappings.

  2. You can now click to the left or the right of the slider to go to the next or previous mapping, or on the Fast Forward or Fast Reverse button to step through each of a hypothesis's mappings for the compound. You can also drag the slider to move it to a particular mapping.

  3. Click on the Stop button if you want to examine one mapping in particular.

To view each conformer that contains a particular mapping:

  1. Click on the Mappings button and select Confs.

  2. You can now click on the Fast Forward or Fast Reverse button to step through each of the conformers.

  3. When you are stepping through a set of conformers, click on the Stop button if you want to examine one conformer in particular.

The box above the horizontal slider provides a useful readout while you view the compound's mappings and conformers. For example, M:2/4 indicates that the second of a total of four mappings is currently displayed in the 3D workspace. Similarly, C12:67 indicates that the 3D workspace contains the twelfth of 67 conformers that fit this mapping.

Note that you can view individual mapping/conformer combinations by clicking on the slider and moving it horizontally. The number that appears above it indicates the current mapping number on display if Mappings is selected or the current conformer number if Confs is selected. Select the Cancel button to close the Hit Mappings Row dialog box.

For information on changing the style in which hit compounds are displayed in the Hit Mappings Row dialog box, see "3D Rendering".

Displaying 2D Structures for Hit Compounds

You can display 2D structures for hit compounds by selecting Structure-Activity-View from the ReportStyle menu. This report style is specifically intended for viewing 2D structures (not for printing them). You can also display 2D structures with any report style that includes 2D drawings. For example, if your current report style is Compound-Table, select 2D3D-Compound, Four-per-page, Six-per-page, Structure-Activity-9, Structure-Activity-24, or Structure-Activity-View from the ReportStyle menu. For detailed information on individual report styles, see "ReportStyle Menu."

If you have a large number of hit compounds and want to view the 2D drawings for a group with a common property, follow these steps:

  1. With a report style that displays 2D drawings, select Sort by Property... from the Tools menu.

  2. Click on the P next to the property of interest, for example, MW.

  3. To arrange the hit compounds in order of increasing values for MW, click on 1,2,3,4...; click on 9,8,7,6... to sort them by decreasing values.

  4. Click on OK to sort by the selected property and close the dialog box; click on Sort to reorder the compounds and leave the dialog box open.

When the compounds are sorted, you can quickly scroll to the part of the report in which you are interested.

To Print the Results of a Search

The Print command in the Data menu lets you send the results of searching or browsing directly to a black and white or a color printer for printing, or to a file for printing later on. The following instructions give details on how to print the current report. (See "Print" for a complete discussion of all the options for the Print command.)

  1. With your data in the View Database workbench report area, select the type of report you want from the ReportStyle menu. Compound-Table is the default report style. For detailed information on report styles, see "ReportStyle Menu."

  2. Select Print... from the Data menu to display the Print dialog box shown in the illustration that follows.

  3. Next to Print From, select Report.

  4. For Pages, select All if you want to print the entire report. Otherwise, select the button next to First, and specify the first and last pages you want to print by typing their respective values in the boxes next to First and Last. The number of compounds that are displayed on a page is specified in each report style's format; except for page boundaries, the "page" displayed on the screen is the same as the printed page.

  5. Specify the number of copies of the report by typing a value in the Copies box.

  6. For Destination, select Printer and type the name of a color or black and white printer to which your computer has access. If you do not know a valid printer name, see your system administrator.

    To save the data in the report as a PostScript file (for example, to print again or to print later on), select File as the Destination. Click in the File box to specify a file name. Type only a file name if the file is to be saved in the current directory, or enter a full UNIX path preceding the file name if the file is to be saved in a directory other than the current one.

    Note: Catalyst prints 3D objects by sending many 2D filled polygons, lines, and text to a PostScript printer. Thus, reports containing complex 3D objects may require considerably more time to print than reports lacking them.

    To save the data in a report to an encapsulated PostScript file that you can later import into a graphics or word-processing application, select both File and EPS. Specify a file name as described in the preceding paragraph.

    Note: Catalyst does not automatically supply an .eps file name suffix. Thus, if the application into which you plan to import the file requires a particular file name suffix, type the suffix when you specify the file name.

  7. After completing the selections in the dialog box, select the Print button. Catalyst grays out the dialog box while processing your printing specifications and removes the dialog box when processing is complete. If you are sending report data to a file, the data is copied and saved in the file immediately. If you are sending your data to a printer, your printing request enters the printer queue and will be printed when it reaches the top of the queue.
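The file-naming rules in step 6 can be illustrated with a short Python sketch. It is not Catalyst code: the function names are hypothetical, and joining a relative name to the current directory is an assumption about how a bare file name is resolved. The second helper reflects the note that no .eps suffix is supplied automatically, so any suffix must be typed as part of the name.

```python
import posixpath  # UNIX-style paths, as in the instructions above


def resolve_output_path(file_name, current_directory):
    """Return the location implied by what was typed in the File box."""
    if posixpath.isabs(file_name):
        # A full UNIX path is used as given.
        return posixpath.normpath(file_name)
    # A bare file name is taken relative to the current directory.
    return posixpath.normpath(posixpath.join(current_directory, file_name))


def with_suffix_if_missing(file_name, suffix=".eps"):
    """Add a suffix only when the typed name has none."""
    _root, ext = posixpath.splitext(file_name)
    return file_name if ext else file_name + suffix


in_current = resolve_output_path("report.ps", "/home/user/work")
elsewhere = resolve_output_path("/tmp/reports/report.ps", "/home/user/work")
eps_name = with_suffix_if_missing("figure")
```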

Maximum Number of Search Hits for the Current Workbench

Specifying the Maximum Number of Search Hits for the Current Workbench

To change the maximum number of compounds that the Browse Databases/Spreadsheets, Fast Flexible Search Databases/Spreadsheets, and Best Flexible Search Databases/Spreadsheets commands can locate in the currently open View Database workbench:

  1. From the Workbench menu select Workbench Options... to display the View Database Options dialog box.

  2. Click in the Max Search Hits box. Type an integer representing the upper limit for the number of compounds that the Browse Databases/Spreadsheets, Fast Flexible Search Databases/Spreadsheets, and Best Flexible Search Databases/Spreadsheets commands can locate for this workbench only.

  3. Select one of the following:

Apply. Makes the value you specified effective only for the current workbench and leaves the View Database Options dialog box open. The global default value set in the Global Preferences dialog box, available from the Preferences menu, remains the same, and the next View Database workbench you open will use that global value.

OK. Works in the same manner as Apply, but closes the dialog box.

Reset. Restores the settings in the dialog box to the values you set the last time you used Apply, or to their original values when you opened the dialog box.

Cancel. Closes the dialog box and makes no changes to any of its settings.

Help. Displays the Catalyst Help window.

When you use the Browse Databases/Spreadsheets, Fast Flexible Search Databases/Spreadsheets, and Best Flexible Search Databases/Spreadsheets commands, Catalyst limits the number of compounds it locates to the value specified in the Max Search Hits box in the View Database Options dialog box. When you initially open the workbench, that value is the same as the one specified in the View Database Preferences dialog box available from the Preferences menu. (For additional information on the global value of Max Search Hits, see "Workbench Preferences - View Database.")
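The relationship between the workbench (local) limit and the global limit can be sketched as follows. This Python fragment only models the behavior described above; the function names are illustrative and are not Catalyst settings.

```python
def effective_max_hits(global_limit, local_limit=None):
    """Return the limit that applies to one workbench.

    A workbench starts with the global value; once a local value has been
    set for that workbench, the local value takes precedence.
    """
    return global_limit if local_limit is None else local_limit


def limit_hits(hits, global_limit, local_limit=None):
    """Keep no more hits than the effective limit allows."""
    return hits[:effective_max_hits(global_limit, local_limit)]


hits = ["compound-%d" % i for i in range(1, 11)]  # ten hypothetical hits

new_workbench = limit_hits(hits, global_limit=5)                     # global value applies
tuned_workbench = limit_hits(hits, global_limit=5, local_limit=8)    # local value applies
```

A workbench opened later would again start from the global value, since the local setting belongs to one workbench only.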


Differences between Search and Fit Mappings

Comparing the Results of Database Searches and Fit Operations

Sometimes the way a molecule matches a hypothesis in the Hit Mappings Row dialog box from the View Database workbench is different from the way the molecule fits the hypothesis when used with the Compare/Fit and Estimate Activity commands. A database search returns only compounds (and their respective conformers) that fit all the constraints and specifications expressed in the hypothesis. The Compare/Fit and Estimate Activity commands find the best fit of the molecule to the hypothesis; the best fit does not necessarily involve all of the location constraints. A close fit to two location constraints may be better than a poor fit to three. Furthermore, the database search mechanism is concerned only with a mapping that finds a match above some specified minimum. It does not attempt to find the mapping giving the best fit.
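The contrast between the two behaviors can be made concrete with a toy model. The Python sketch below is an assumption-laden illustration, not Catalyst's algorithm: each candidate mapping is a tuple of per-constraint fit scores (None where a constraint is not mapped), the scores are invented, and summing the scores stands in for whatever fit measure Compare/Fit actually uses.

```python
def search_match(mappings, minimum):
    """Database-search style: return the first mapping that satisfies every
    constraint at or above the minimum, without looking for a better one."""
    for mapping in mappings:
        if all(score is not None and score >= minimum for score in mapping):
            return mapping
    return None


def best_fit(mappings):
    """Fit style: return the mapping with the highest total score, even if
    some constraints are left unmapped (they contribute nothing)."""
    return max(mappings, key=lambda m: sum(s for s in m if s is not None))


# Three hypothetical mappings of one compound onto a three-constraint hypothesis.
mappings = [
    (0.4, 0.4, 0.4),   # poor fit to all three constraints
    (0.9, 0.9, None),  # close fit to two constraints, third unmapped
    (0.5, 0.5, 0.5),   # somewhat better fit to all three
]

found_by_search = search_match(mappings, minimum=0.3)
found_by_fit = best_fit(mappings)
```

In this model the search returns the first mapping, which satisfies all three constraints, while the fit prefers the second mapping, whose close fit to two constraints outscores a poor fit to three.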

For additional information on the Compare/Fit command, see "Comparing a Compound and a Hypothesis."

To Convert Standard Database Formats to Catalyst Format

Catalyst can install only databases that are in Catalyst format. See "Database Utilities" for detailed information on converting and creating databases.




Last updated April 17, 1996 at 12:16pm PDT.
Copyright © 1999, Molecular Simulations Inc. All rights reserved.