QSAR



6       Working with Molecules

The first step in developing a QSAR equation consists of identifying and building or loading the molecules that will define the training set into the Cerius2 environment and into the QSAR Study Table. This chapter describes the following major activities related to working with molecules in QSAR+:

Loading molecules into Cerius2.

Adding molecules to the study table.

Setting molecule-processing preferences.

Loading molecules directly from SD files.

Exporting molecules to SD files.

Managing conformations.


Loading molecules into Cerius2

You can build molecular structures using the various Cerius2 building and sketching tools. Alternatively, you can load structural data from a variety of common formats generated by other molecular modeling and chemical database software. For each molecule with which you want to work, you can choose one of the following:

For detailed information about using the 3D Sketcher, using the Analog Builder, and importing files, refer to the appropriate discussions in the Cerius2 Builders and Modeling Environment books.


Adding molecules to the study table

QSAR+ uses a study table to maintain and display the data for a QSAR analysis. Much like a conventional spreadsheet, a study table is made up of cells that can contain both numeric and textual data. Additionally, these cells can contain and display molecular structures. Each row in a study table represents a single observation (or experiment) and includes a molecular structure, user-entered biological activity data, and the results of descriptor calculations. For detailed information about the study table, see Chapter 5, Working with the Study Table.

You can add molecules from the Cerius2 Models window to the study table by using the items in the Molecules menu in the study table:

QSAR+ simplifies the process of adding molecules to the study table by enabling you to take advantage of default settings and automated processing. The following occurs by default:

QSAR default processing


Setting molecule-processing preferences

You can override and change the processing that occurs automatically whenever a molecule is added to the study table. To do so, choose Preferences/Molecules... on the study table menubar (or in the QSAR card's Preferences menu). The QSAR Molecule Preferences control panel appears:

This can be very time-consuming.

Note

If a QSAR equation has been added to the study table, activating this option also causes immediate recalculation of the predicted activity for the molecule being edited. This provides a quick and convenient way of checking the effects of structural changes on the activity.  

Setting preferences for charges, minimization and conformations

Charge calculation, energy minimization, and conformation generation are performed according to default values unless you specify other criteria. QSAR+ lets you set the processing preferences for each of these three steps.

To set charge calculation preferences

1.   Click Charges... in the Molecule Preferences control panel.

The Charges control panel appears. The default charging scheme is Charge-Equilibration, though you can also select Gasteiger as the Charge Calculation Method.

2.   Use the action buttons in the Charges control panel or select the appropriate Preferences button to edit and calculate charges with the current charging method.

For detailed information about using the Charges, QEq Preferences, and Gasteiger Preferences control panels, refer to the Cerius2 Simulations Tools.

To set minimization preferences

1.   Select OFF (default) or MMFF in the popup next to the Minimization option on the Molecule Preferences control panel to choose between the Open Force Field and the Merck Molecular Force Field.

2.   Click Controls... next to the popup to open the OFF or the MMFF preferences control panel:

3.   Use the Energy Minimization control panels to set termination criteria and other options and to access control panels governing minimization variables and output.

For detailed information about using the Energy Minimization control panel and about the Cerius2·Minimizer module, refer to Cerius2 Simulations Tools.

To set conformation generation preferences

1.   Click Conformations... in the Molecule Preferences control panel.

The QSAR Conformation Generation control panel appears:

2.   Use the QSAR Conformation Generation control panel to select a default conformational search method, apply an energy cutoff, specify the number of conformations that will be generated, and so on. You can also use the Generate Conformers button at the top of the control panel to immediately generate conformations for all molecules, for only the selected molecules, or for only the current molecule.

For information about the Cerius2 Conformer Search and Conformer Analysis modules, refer to the Cerius2 Conformational Search and Analysis book.


Loading molecules directly from SD files

This section shows how to import molecules and their associated data from an SD file to the QSAR+ study table. The SD file format provides a general and flexible way to store chemical structures and data (biological activities, molecule name, values of calculated molecular descriptors, etc.), and QSAR+ offers functionality to take full advantage of these capabilities.

To import molecules from an SD file, select Molecules/From SD File... on the study table menubar (or from the Molecules menu on the QSAR card). The Add Molecules from SD File control panel appears:

You can use this panel to select the SD file you want to read, which molecules you want to import, and whether to also import data fields associated with molecules (if any).

Selecting the SD file

The IMPORT MOLECULES pushbutton at the top of the control panel is initially inactive (grayed out). To import molecules from an SD file into the study table, first select the file: use the file browser to navigate to the directory that contains the SD file, then either highlight the file name and click SELECT or double-click the file name. As soon as an SD file is selected, QSAR+ opens the file, counts the number of molecules in it, and displays this information at the top of the Add Molecules from SD File control panel. If the Read Data Fields checkbox is checked and the file contains data, the names of the data fields are also displayed in the panel.

The IMPORT MOLECULES pushbutton is only active after you have selected a specific SD file.
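The scan that takes place when a file is selected can be pictured with a short Python sketch. It is not the QSAR+ implementation. It only assumes the general SD file layout, in which each record ends with a "$$$$" line and each data item starts with a header line such as "> <FieldName>". The function name summarize_sd and the abbreviated demo records are illustrative.

```python
import io

def summarize_sd(handle):
    """Return (record_count, data_field_names) for an SD file stream."""
    count = 0
    fields = []
    pending = False  # True while an unterminated record has content
    for raw in handle:
        line = raw.rstrip("\r\n")
        if line.strip() == "$$$$":  # "$$$$" closes one molecule record
            count += 1
            pending = False
            continue
        if line.strip():
            pending = True
        if line.startswith(">"):
            # Data headers look like "> <FieldName>".
            start, end = line.find("<"), line.rfind(">")
            if 0 <= start < end:
                name = line[start + 1:end]
                if name not in fields:
                    fields.append(name)
    if pending:  # tolerate a last record without a terminator
        count += 1
    return count, fields

# Abbreviated demo records (the atom and bond blocks are omitted).
demo = io.StringIO(
    "mol1\n\n\nM  END\n> <Activity>\n1.5\n\n$$$$\n"
    "mol2\n\n\nM  END\n> <Activity>\n2.0\n\n> <LogP>\n0.3\n\n$$$$\n"
)
print(summarize_sd(demo))
```

Field names are collected in order of first appearance, so a field that occurs in only some records is still listed once.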

Selecting the range of molecules to import

You can import all the molecules in the file (the default behavior), a range of molecules, or only a single molecule from the SD file. You specify which option you want using the controls provided at the bottom of the panel.
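As a rough illustration of the three choices, the following Python sketch maps a read mode to the record numbers that would be imported. The mode names ("all", "range", "single") and the 1-based numbering are assumptions made for this example, not QSAR+ identifiers.

```python
def record_indices(total, mode="all", first=1, last=None):
    """Return the 1-based record numbers to import for a given read mode."""
    if mode == "all":
        return list(range(1, total + 1))
    if mode == "single":
        if not 1 <= first <= total:
            raise ValueError("record number out of range")
        return [first]
    if mode == "range":
        last = total if last is None else last
        if not 1 <= first <= last <= total:
            raise ValueError("invalid range")
        return list(range(first, last + 1))
    raise ValueError(f"unknown mode: {mode!r}")

print(record_indices(5))                 # every record in a 5-molecule file
print(record_indices(5, "range", 2, 4))  # records 2 through 4
print(record_indices(5, "single", 3))    # record 3 only
```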

Reading data fields

If the Read Data Fields checkbox is checked when an SD file is selected, and if there is data in the file, the names of the data fields are displayed in the panel. You can then select any of the data fields to be added to the study table together with the corresponding molecules. These data fields appear as columns in the study table. To select data fields, you mark them with the cursor, using <Shift>- and <Control>-click to select groups of data fields. You can also use the Select All and Deselect All buttons to select and deselect all the available fields.

You can also assign one of the available data fields as the molecule name when reading the molecules from the SD file. To do this, make sure that only one data field is selected and click Set mol name from selected data field.
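The idea of taking the molecule name from a data field can be sketched in Python as follows. This is only an illustration based on the general SD layout (a "> <FieldName>" header, one or more value lines, then a blank line). The function names and the fallback to the first line of the record are assumptions, not documented QSAR+ behavior.

```python
def parse_data_fields(record_text):
    """Return {field_name: value} for the data items of one SD record."""
    fields = {}
    name = None
    for line in record_text.splitlines():
        start, end = line.find("<"), line.rfind(">")
        if line.startswith(">") and 0 <= start < end:
            name = line[start + 1:end]
            fields[name] = []
        elif name is not None:
            if line.strip() in ("", "$$$$"):
                name = None  # a blank line ends the current data item
            else:
                fields[name].append(line)
    return {key: "\n".join(value) for key, value in fields.items()}

def molecule_name(record_text, name_field):
    """Use one data field as the name, else the first line of the record."""
    fields = parse_data_fields(record_text)
    if fields.get(name_field):
        return fields[name_field]
    lines = record_text.splitlines()
    return lines[0].strip() if lines else ""

# Abbreviated demo record (the atom and bond blocks are omitted).
record = "title\n\n\nM  END\n> <CompoundID>\nABC-1\n\n> <Activity>\n1.5\n\n$$$$\n"
print(molecule_name(record, "CompoundID"))
```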

Loading the molecules

After you have selected the SD file and, optionally, specified data fields and a range of molecules, click the IMPORT MOLECULES button to load the molecules into the Cerius2 Models window and into the study table.

All the molecule-processing options (hydrogen filling, energy minimization, charging, conformation generation) available for adding molecules from the Cerius2 Models window to the study table are also in effect when adding molecules from an SD file. You should check and activate only those options you want to use when importing the molecules from the SD file.

Special memory-saving options

Sometimes, especially when importing a large number of molecules from an SD file, it is convenient or even necessary not to keep all the molecules in memory. It may also be desirable not to keep in memory the study table row corresponding to each molecule. QSAR+ offers functionality to optionally delete each model from the Cerius2 Models window after it is added to the study table, and to optionally export the corresponding row to a file (after the desired molecular descriptors have been calculated) and delete the row from the study table.

You can access these options by clicking the Preferences... button at the top of the Add Molecules from SD File control panel. The SD File Preferences control panel appears:

This control panel allows you to activate the options to delete both the model and the study table row after reading each molecule from the SD file. It also provides an option to control whether or not information regarding the SD file (file name, file type, and file index) is added to the study table.
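The memory-saving pattern described here (process one molecule, write its row out, then discard it) can be sketched in Python with a generator. This is an illustration under stated assumptions, not QSAR+ code. The CSV output, the describe callable (a stand-in for a descriptor calculation), and the 1-based file index are all hypothetical. The File name, File type, and File index columns mirror the SD file information mentioned above.

```python
import csv
import io

def iter_sd_records(handle):
    """Yield one SD record at a time so the whole file is never held in memory."""
    lines = []
    for raw in handle:
        line = raw.rstrip("\r\n")
        if line.strip() == "$$$$":
            yield "\n".join(lines)
            lines = []
        else:
            lines.append(line)
    if any(item.strip() for item in lines):
        yield "\n".join(lines)  # last record without a terminator

def stream_rows(sd_handle, out_handle, describe, source_name):
    """Write one row per record; each record is dropped after its row is written."""
    writer = csv.writer(out_handle)
    writer.writerow(["File name", "File type", "File index", "Value"])
    count = 0
    for index, record in enumerate(iter_sd_records(sd_handle), start=1):
        writer.writerow([source_name, "SD", index, describe(record)])
        count = index
    return count

# Demo with abbreviated records; the "descriptor" is just the title line.
out = io.StringIO()
sd = io.StringIO("mol1\nM  END\n$$$$\nmol2\nM  END\n$$$$\n")
print(stream_rows(sd, out, lambda rec: rec.splitlines()[0], "demo.sdf"))
print(out.getvalue())
```

Because only the current record is held at any time, memory use stays roughly constant regardless of how many molecules the file contains.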

Recovering deleted molecules

If you chose to delete the models after adding them to the study table (see Special memory-saving options, above), you can easily recover the molecules from the original SD file and restore them to the Cerius2 Models window and to the corresponding rows in the study table.

Select Molecules/Recover Molecules... in the Study Table. The Recover Molecules control panel appears:

Select the rows for which you want to recover the corresponding molecules in the study table and click Reconstruct in the Recover Molecules control panel.

The information contained in the File name, File type, and File index columns in the study table is used to go back to the original SD file and extract the molecules, which are loaded into the Cerius2 Models window and into the corresponding study table cells.
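A minimal Python sketch of this lookup follows, assuming that the File index column holds the 1-based position of the record in the SD file (an assumption; the manual does not state the numbering). The function name fetch_record is illustrative.

```python
import io

def fetch_record(handle, file_index):
    """Return the text of the SD record at 1-based position file_index."""
    if file_index < 1:
        raise ValueError("file_index is 1-based")
    current, lines = 1, []
    for raw in handle:
        line = raw.rstrip("\r\n")
        if line.strip() == "$$$$":
            if current == file_index:
                return "\n".join(lines)
            current, lines = current + 1, []
        elif current == file_index:
            lines.append(line)  # keep only the lines of the wanted record
    if current == file_index and any(item.strip() for item in lines):
        return "\n".join(lines)  # last record without a terminator
    raise IndexError(f"no record {file_index} in file")

# Demo with abbreviated records; a real call would open the file
# named in the File name column instead of using an in-memory stream.
sd_text = "mol1\nM  END\n$$$$\nmol2\nM  END\n$$$$\n"
print(fetch_record(io.StringIO(sd_text), 2))
```

Only the requested record is accumulated, so the lookup does not need to load the other molecules.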


Loading molecules directly from Daylight SMILES files

This section shows how to add a set of structures from a Daylight SMILES file to the QSAR+ study table.

This functionality is especially useful if the SMILES file contains a large number of molecules and loading them all into the model manager would require too much memory. Using a SMILES file and checking the Delete The Model After Adding checkbox (on the SMILES File Preferences control panel) can save memory that would otherwise be used to keep each molecule in memory.

To import a set of structures from a Daylight SMILES file, choose Molecules/From SMILES File... in the study table (or from the Molecules menu on the QSAR card). The Add Molecules from SMILES File control panel appears:

Selecting the SMILES file

The IMPORT MOLECULES pushbutton at the top of the control panel is initially inactive (grayed out). To import molecules from a SMILES file into the study table, first select the file: use the file browser to navigate to the directory that contains the SMILES file, then either highlight the file name and click SELECT or double-click the file name. As soon as a SMILES file is selected, QSAR+ opens the file, counts the number of molecules in it, and displays this information at the top of the Add Molecules from SMILES File control panel.

The IMPORT MOLECULES pushbutton is only active after you have selected a specific SMILES file.
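Counting the entries in a SMILES file can be sketched in Python as shown below. The sketch assumes the common layout of one SMILES string per line, optionally followed by whitespace and a name. The layout that QSAR+ expects may differ, and the function name read_smiles is illustrative.

```python
import io

def read_smiles(handle):
    """Return a list of (smiles, name) pairs, one per non-blank line."""
    entries = []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)  # split on the first run of whitespace
        name = parts[1].strip() if len(parts) > 1 else ""
        entries.append((parts[0], name))
    return entries

demo = io.StringIO("c1ccccc1 benzene\n\nCCO\tethanol\nC\n")
entries = read_smiles(demo)
print(len(entries), "molecules")
```

No chemical validation is attempted here; each SMILES string is only separated from its optional name.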

Selecting the range of molecules to import (Read mode)

You have the option to import all the molecules in the file (the default behavior), a range of molecules, or only a single molecule from the SMILES file. You specify which option you want with the controls provided at the center of the panel.

Loading the molecules

After you have selected the SMILES file and, optionally, a read mode, click the IMPORT MOLECULES button to load the molecules into the Cerius2 Models window and into the study table.

All the molecule-processing options (hydrogen filling, energy minimization, charging, conformation generation) available for adding molecules from the Cerius2 Models window to the study table are also in effect when adding molecules from a SMILES file. You should be careful to check and activate only those options you want to use when importing the molecules from the SMILES file.

Special memory-saving options

Sometimes, especially when importing a large number of molecules from a SMILES file, it is convenient or even necessary not to keep all the molecules in memory. It may also be desirable not to keep in memory the study table row corresponding to each molecule. QSAR+ offers functionality to optionally delete each model from the Cerius2 Models window after it is added to the Study Table, and to optionally export the corresponding row to a file (after the desired molecular descriptors have been calculated) and delete the row from the Study Table.

You can access these options by clicking the Preferences... button at the top of the Add Molecules from SMILES File panel. The following panel appears:

This control panel allows you to activate the options to delete both the model and the Study Table row after reading each molecule from the SMILES file. It also provides an option to control whether or not information regarding the SMILES file (file name, file type, and file index) is added to the Study Table.

Recovering deleted molecules

In cases where you choose to delete the models after adding them to the Study Table (see Special memory-saving options), there is a way to easily recover the molecules from the original SMILES file and include them in the Cerius2 Models window and in the corresponding row in the Study Table.

To do so, select Molecules/Recover Molecules... on the Study Table pulldown. The following panel appears:

Select (highlight) the rows for which you want to recover the corresponding molecules in the Study Table and click Reconstruct in the Recover Molecules panel.

The information contained in the File name, File type, and File index columns in the Study Table is used to go back to the original SMILES file and extract the molecules, which are loaded into the Cerius2 Models window and into the corresponding Study Table cells.

SMARTS table derivations

The following derivations can be entered as column headers in the study table:


==daysss(Structure, <SMARTS string>) and 


==daysss_unique(Structure, <SMARTS string>) 

For example:


==daysss(Structure, "c1ccccc1") 


==daysss_unique(Structure, "[C;H1]NOH") 





Exporting molecules to SD files

This section shows how to export molecules, and selected columns, from the Study Table to an SD file. You can access this functionality by selecting Molecules/Export to SD File... on the Study Table menubar (or from the Molecules menu on the QSAR card). This brings up the following panel:

You can export the molecules from all the rows in the Study Table or only from the currently selected rows. A file browser allows you to select the name of the SD file in which you want to save the molecules (if you want to overwrite it) or to specify a new name. An option is provided to export data contained in columns in the Study Table together with the molecules as data fields. You can choose among several options:


Managing conformations

The Tools/Conformers... command on the Study Table menubar brings up a panel with controls that you can use to display additional information that relates to conformations and contingent descriptors:

Conformation refers to the placement in 3D space of the atoms and bonds in a molecule. Many of the descriptors that QSAR+ calculates are conformationally dependent (that is, they depend on the 3D structure of a molecule). Among these conformationally dependent descriptors, for example, are Molecular Volume (Vm), Dipole Moment (Dipole), and Principal Moment of Inertia (PMI).

If you have generated conformations for the structures in a Study Table, QSAR+ stores information about the generated conformations both in the Study Table and in a separate Conformers table for each structure.

Each row in a Study Table can contain information about only one conformation (that is, the current conformation) and its 3D coordinates. You can display and work with information about the other generated conformations by accessing the appropriate Conformers table. You can also update the Study Table with information about one of these other conformations (that is, you can select another conformation as the current conformation).

Displaying conformation information

This section describes the following activities related to displaying conformation information:

Displaying information about the current conformation

The Conformer Summary check box in the Conformers panel allows you to toggle ON and OFF the display of four conformation-related columns in a Study Table. For each structure in a Study Table, these columns provide the following information:

Displaying the Conformers table

If you have generated conformations for the structures in a Study Table, QSAR+ can build a separate Conformers table for each structure. The Conformers table stores information about each of the conformations generated for a structure and contains one row for each conformation. Each table row contains the following information:

Note

The current row in the Conformers table identifies the conformation being used in the Study Table (that is, the current conformation).  

You can use the Conformers table to gather information about the highest and lowest energy conformations, to observe how a given 3D property is affected by various changes in conformation, to determine which conformations are within a specified kilocalorie range of the lowest energy conformation, and so on. You can also use the Conformers table to update the conformation that is currently being used in the Study Table (that is, to select another conformation as the current conformation).
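As an example of the energy-range question mentioned above, the following Python sketch picks out the conformations that lie within a given window of the lowest-energy conformation. The energies are made-up values assumed to be in kcal/mol, and the function name is illustrative.

```python
def within_energy_window(energies, window_kcal):
    """Return indices of conformations within window_kcal of the minimum energy."""
    if not energies:
        return []
    lowest = min(energies)
    return [i for i, energy in enumerate(energies) if energy - lowest <= window_kcal]

# Illustrative conformer energies, one per Conformers table row.
energies = [12.4, 10.0, 15.1, 11.9]
print(within_energy_window(energies, 2.0))
```

The indices are 0-based positions in the input list; the lowest-energy conformation itself is always included.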

You must have generated conformations for the structures in the Study Table to be able to display a Conformers table. Then select the Study Table structure for which you want to display a Conformers table and do either of the following:

or

When you do so, QSAR+ generates a Conformers table for the selected structure by calculating all of the conformationally dependent properties used in the Study Table for each conformation of that structure. When it finishes its calculations, QSAR+ displays the Conformers table for the specified structure.

The following figure illustrates a portion of a sample Conformers table:

Working with a Conformers table

Just as with any Cerius2 table, you can use the icons on the table tool bar to work with a Conformers table. Additionally, you can click Update to rebuild the Conformers table (that is, to recalculate all of the conformationally dependent properties for each conformation of a structure). You might perform this activity, for example, if you previously interrupted QSAR+ processing before the Conformers table was completely built.

To update the Study Table

Recall that at any given time, the Study Table can contain only one conformation (that is, only one set of X, Y, and Z coordinates) for each structure. Conversely, a Conformers table contains information about all the conformations generated for a structure.

You can update the Study Table so that it contains information about another conformation by doing either of the following:

or

When you do so, QSAR+ updates the Study Table with the conformation that you selected and updates all of the conformationally dependent properties shown in the Study Table for that structure, as appropriate. Thus, the selected conformation becomes the new current conformation. Additionally, QSAR+ displays the selected conformation in the Cerius2 Models window and identifies it as the current structure.

Displaying contingent descriptors

Contingent descriptors are those needed to calculate other descriptors. For example, density is equal to molecular weight divided by volume. Therefore, in order to calculate a value for the Density descriptor, QSAR+ requires the calculations for both the Molecular Weight (MW) and Molecular Volume (Vm) descriptors. In this example, Molecular Weight and Molecular Volume are referred to as the contingent descriptors.

QSAR+ stores information about contingent descriptors in the Study Table. You can toggle the display of these descriptors on and off, as appropriate.
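The dependency between a descriptor and its contingent descriptors can be sketched in Python as follows. This is an illustration only: the calculate function, the FORMULAS table, and the volume value are hypothetical, and the density expression is the plain ratio from the text with no unit conversion applied.

```python
def calculate(name, formulas, inputs, cache=None):
    """Compute a descriptor, first computing any descriptors it depends on."""
    cache = {} if cache is None else cache
    if name in cache:
        return cache[name]
    if name in inputs:
        cache[name] = inputs[name]
    else:
        needs, func = formulas[name]
        values = [calculate(item, formulas, inputs, cache) for item in needs]
        cache[name] = func(*values)
    return cache[name]

FORMULAS = {
    # Density depends on two contingent descriptors: MW and Vm.
    "Density": (("MW", "Vm"), lambda mw, vm: mw / vm),
}

cache = {}
# MW is the molecular weight of benzene; Vm is an illustrative placeholder.
density = calculate("Density", FORMULAS, {"MW": 78.11, "Vm": 80.0}, cache)
print(density)
print(sorted(cache))  # the contingent descriptors are kept alongside Density
```

The cache plays the role of the stored contingent values: they are computed once and remain available whether or not they are displayed.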

To display contingent descriptors

Toggle the Contingent Descriptors checkbox in the Conformers panel ON and OFF to display or hide contingent descriptors.




Last updated May 18, 2000 at 05:51PM Pacific Daylight Time.
Copyright © 2000, Molecular Simulations Inc. All rights reserved.