| QSAR |

You can use Cerius2·QSAR+ to help you make informed decisions about which candidate compounds should be considered (based on estimates of biological activity), as well as to help you gain insight into various underlying biological processes. You can also use QSAR+ to provide basic insight into structure-property relationships. This information can be gathered before modeling the atomic-level mechanisms behind these relationships (using other Cerius2 modules). Using the analysis capabilities of the QSAR+ module, you can then correlate the values calculated from the modeling programs with various properties. This correlation ability makes Cerius2·QSAR+ a useful complement to your molecular modeling programs.
This tutorial familiarizes you with Cerius2·QSAR+ by illustrating the step-by-step procedure for building a QSAR equation, including:

To complete this tutorial, you need a licensed copy of Cerius2 that includes these modules:

Starting with a new Cerius2 session, select the File/Load Model menu item in the main Cerius2 Visualizer panel and use the file browser to navigate to the directory Cerius2-Resources/EXAMPLES/DBH.
Select all the .msi files from dbh02 through dbh52 (<Shift>-click) and click LOAD to load them into Cerius2.
A total of 47 models should be loaded.
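If you want to check the file selection outside Cerius2, the range "dbh02 through dbh52" can be expressed as a small filter. This is only a sketch: the `select_models` helper and the sample listing are hypothetical, and it assumes the files follow the `dbhNN.msi` naming seen in this tutorial.

```python
import re

def select_models(filenames, first=2, last=52):
    """Return the dbhNN.msi files whose number lies in [first, last], sorted by name."""
    pattern = re.compile(r"^dbh(\d+)\.msi$")
    selected = []
    for name in filenames:
        match = pattern.match(name)
        if match and first <= int(match.group(1)) <= last:
            selected.append(name)
    return sorted(selected)

# Hypothetical directory listing; a real check would pass os.listdir() of the DBH directory.
listing = ["dbh02.msi", "dbh04.msi", "dbh52.msi", "dbh53.msi", "notes.txt"]
chosen = select_models(listing)
print(len(chosen), chosen)
```

Run against the real directory, the length of the returned list should match the number of models reported after loading.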
Entering molecular descriptors
A descriptor is a molecular property that QSAR+ can calculate and use in determining a new QSAR. QSAR+ calculates a wide variety of spatial, electronic, topological, and other descriptors. QSAR+ also enables you to modify existing descriptors and to create or import new descriptors from other Cerius2 modules, such as Molecular Field Analysis (MFA) and Receptor, to meet your specific requirements.
1. Create an empty study table.
Go to the QSAR deck of cards and select Show Study Table on the QSAR card. This opens a new, empty Study Table control panel.
Select the Descriptors/Add Default menu item in the study table to add a set of default descriptors to the study table.
You can add more descriptors to the study table:
Open the Descriptors control panel by selecting the Descriptors/Select menu item in the study table.
In the Descriptors control panel, set the Descriptors in family popup to Topological.
Make sure that the other popups are set to Display and All, and click the associated action button to display them in the table in the Descriptors control panel.
Descriptors 58 (Balaban) to 65 (Zagreb) are displayed.
Set the popup just to the right of the action button to Select and click the action button to Select All descriptors in the Topological family.
The descriptors in the family are highlighted.
Click ADD in the Descriptors control panel to add the selected descriptors to the study table.
An additional set of 29 descriptors from the Topological family is added to the study table, giving a total of 49 descriptors.
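To make the idea of a topological descriptor concrete, here is a minimal sketch of one of them computed by hand. It assumes the common definition of the first Zagreb index (the sum of squared vertex degrees of the hydrogen-suppressed molecular graph). Whether QSAR+ uses exactly this variant is an assumption, and the `zagreb_index` helper and bond lists are illustrative only.

```python
def zagreb_index(bonds):
    """First Zagreb index: sum of squared vertex degrees of a hydrogen-suppressed graph."""
    degree = {}
    for a, b in bonds:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return sum(d * d for d in degree.values())

# Isobutane carbon skeleton: central atom 0 bonded to atoms 1, 2, and 3.
isobutane = [(0, 1), (0, 2), (0, 3)]
# n-Butane carbon skeleton: a chain of four atoms.
n_butane = [(0, 1), (1, 2), (2, 3)]

print(zagreb_index(isobutane))  # degrees 3,1,1,1 -> 9 + 1 + 1 + 1 = 12
print(zagreb_index(n_butane))   # degrees 1,2,2,1 -> 1 + 4 + 4 + 1 = 10
```

The two isomers have the same formula but different index values, which is why descriptors of this kind can capture branching.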
Load the molecules into the QSAR study table
You can now add the molecules in the Cerius2 models window to the study table.
Select the Molecules/Add All menu item in the study table to add all the molecules in the Models window to the study table.
As each molecule is added, QSAR+ automatically calculates charges, adds hydrogens, and performs an energy minimization. In addition, all the molecular descriptors are calculated.
Entering biological activity data
Next, you need to enter biological activity data for all the molecules into the study table, the same way as you enter data into any Cerius2 table. That is, you can type the activity data directly into study table cells or copy the activity data from another table. Here, you will enter data directly into the study table.
1. Add data to the study table.
| Molecule | Activity | Molecule | Activity |
|---|---|---|---|
| dbh02 | 3.00 | dbh28 | 4.12 |
| dbh04 | 3.15 | dbh29 | 4.21 |
| dbh06 | 3.30 | dbh30 | 4.28 |
| dbh07 | 3.45 | dbh31 | 4.28 |
| dbh08 | 3.47 | dbh32 | 4.31 |
| dbh09 | 3.47 | dbh33 | 4.33 |
| dbh10 | 3.70 | dbh34 | 4.33 |
| dbh11 | 3.76 | dbh35 | 4.44 |
| dbh12 | 3.81 | dbh36 | 4.48 |
| dbh13 | 3.83 | dbh37 | 4.51 |
| dbh14 | 3.94 | dbh38 | 4.55 |
| dbh15 | 4.08 | dbh39 | 4.77 |
| dbh16 | 4.13 | dbh34 | 4.92 |
| dbh17 | 4.13 | dbh31 | 4.92 |
| dbh18 | 4.16 | dbh42 | 5.25 |
| dbh19 | 3.24 | dbh44 | 5.29 |
| dbh20 | 3.45 | dbh45 | 5.62 |
| dbh21 | 3.69 | dbh46 | 5.66 |
| dbh22 | 3.80 | dbh48 | 5.70 |
| dbh23 | 3.83 | dbh49 | 5.82 |
| dbh24 | 3.92 | dbh50 | 5.92 |
| dbh25 | 3.99 | dbh51 | 6.17 |
| dbh26 | 4.01 | dbh52 | 7.13 |
| dbh27 | 4.02 | | |
Before you generate a QSAR equation you need to specify which columns in the study table should be used as dependent and independent variables.
Select the column named Activity in the study table by clicking the column heading. Mark this column as a dependent variable (Y) by selecting the Variables/Set Y menu item in the study table.
2. Set the independent variables
By default, the descriptor columns are automatically marked as independent variables (X) when they are added to the study table. If this did not happen, select all the descriptor columns, from Charge to Zagreb, in the study table. Mark these columns as independent variables by selecting the Variables/Set X menu item in the study table menubar.
Exploring the data
You can now analyze the dependent and independent variables using the statistical and graphics tools available in QSAR.
Generate histograms of selected variables by selecting one or more of the columns and selecting the Tools/Graphics/Histogram Plots menu item in the study table menubar.
Calculate descriptive statistics for all dependent and independent variables by selecting the Tools/Statistical/Summary Statistics menu item in the study table.
The statistics are calculated before the Descriptive Statistics control panel appears.
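The kind of numbers these tools report can be sketched in a few lines. The example below is an illustration only, not the QSAR+ implementation: it uses the first eight activity values from the table in this tutorial, and the `summarize` and `histogram` helpers, the bin start, and the bin width are all assumptions chosen for the example.

```python
import math
import statistics

# First eight activity values from the tutorial table (left column, dbh02 to dbh11).
activities = [3.00, 3.15, 3.30, 3.45, 3.47, 3.47, 3.70, 3.76]

def summarize(values):
    """Return basic descriptive statistics for one column."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

def histogram(values, start, width, bins):
    """Count values in equal-width bins [start + i*width, start + (i+1)*width)."""
    counts = [0] * bins
    for value in values:
        index = math.floor((value - start) / width)
        if 0 <= index < bins:  # values outside the binned range are ignored
            counts[index] += 1
    return counts

summary = summarize(activities)
counts = histogram(activities, start=3.0, width=0.25, bins=4)
print(summary)
print(counts)
```

A skewed histogram or a very small standard deviation for a descriptor column is a hint that the column may contribute little to a regression.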
Generate a QSAR equation
You are now ready to generate a QSAR equation. Several regression methods are available in QSAR, including multiple linear regression, partial least squares (PLS), simple linear regression, stepwise multiple linear regression, principal components regression (PCR), genetic function approximation (GFA), and genetic partial least squares (G/PLS). In this session you will use the GFA method.
Select GFA in the Methods popup in the study table. Then click the RUN pushbutton to start a GFA calculation with the default parameters.
The GFA calculation takes a few minutes.
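For orientation, the sketch below shows what the simplest of the listed methods, simple linear regression, does with one descriptor column (X) and the activity column (Y). It is not GFA, which evolves a population of multi-term equations. The `fit_line` helper and both data columns are hypothetical values made up for the example.

```python
def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept, plus the r-squared statistic."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical descriptor values and activities, not taken from the DBH data set.
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [3.0, 3.5, 3.9, 4.6]

slope, intercept, r_squared = fit_line(descriptor, activity)
print(f"activity = {slope:.2f} * descriptor + {intercept:.2f} (r^2 = {r_squared:.3f})")
```

The multi-descriptor methods generalize the same idea: they choose coefficients that minimize the squared difference between predicted and observed activity.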
Analyzing the QSAR equation
The GFA calculation performed in the previous step results in a set of 99 QSAR equations. You can analyze each of these equations with the Equation Viewer control panel.
1. View the equation terms, coefficients, and statistics
Open the Equation Viewer control panel (if it does not appear automatically) by selecting the Tools/Equation Viewer... menu item in the study table.
Click an equation row number in the upper table in the Equation Viewer control panel to display the terms, coefficients, and statistics for that equation in the lower part of the control panel.
2. Connect the 2D plot to the equation viewer
You may want to move the Graph window so that it doesn't overlap the Equation Viewer control panel.
You can also identify points in the 2D plot with molecules in the QSAR study table:
The rows corresponding to the selected points in the 2D plot are highlighted in the study table. In addition, the corresponding molecules appear in the models window, and information about the selected molecules is printed in the text window.
Saving the QSAR equations
QSAR+ allows you to save the QSAR equations generated in the current session for later retrieval and use.
Open the Save QSAR Equations control panel by clicking the Save Equations... button at the top of the Equation Viewer control panel.
The entire set of 99 GFA equations is saved in the file testset.qsar. You can read QSAR equations saved in .qsar files back into the Equation Viewer by using the Open Equations button.
Predicting the activity of new molecules
Once you have calculated a QSAR equation, it is easy to use it to predict the activity of a molecule outside the training set.
Select the File/Load Model menu item in the Cerius2 Visualizer and navigate to the same directory you used at the beginning of the session:
Cerius2-Resources/EXAMPLES/DBH
Select the dbh02.msi file and click LOAD to load the molecule into Cerius2. The copy of dbh02 is named dbh02_1.
2. Add the new model to the study table
Make sure that dbh02_1 is current in the Models window and add it to the study table by selecting the Molecules/Add Current menu item in the study table.
The new molecule is added at the bottom of the study table. QSAR+ automatically calculates charges, adds hydrogens, and performs an energy minimization (as for the original molecules). All the descriptors are automatically calculated, including the QSAR equation column (GFA Predicted Activity), which should show the same value as for the original dbh02 molecule in row 1.
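Why the copy gets the same predicted value can be seen from how a linear QSAR equation is evaluated: the prediction depends only on the descriptor values. The sketch below assumes a purely linear equation. GFA equations may also contain other term types, and the coefficients, descriptor values, and `predict_activity` helper here are hypothetical.

```python
def predict_activity(equation, descriptors):
    """Evaluate intercept + sum(coefficient * descriptor value) for one molecule."""
    total = equation["intercept"]
    for name, coefficient in equation["terms"].items():
        total += coefficient * descriptors[name]
    return total

# Hypothetical equation using two descriptor column names from the study table.
equation = {"intercept": 2.0, "terms": {"Zagreb": 0.05, "Charge": -0.5}}

# Hypothetical descriptor values; the copy has identical descriptors by construction.
dbh02 = {"Zagreb": 40.0, "Charge": 0.0}
dbh02_1 = dict(dbh02)

print(predict_activity(equation, dbh02))
print(predict_activity(equation, dbh02_1))
```

Editing the copy changes its descriptor values, and the predicted activity changes with them once the descriptors are recalculated.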
Open the Molecule Preferences control panel by selecting the Preferences/Molecules menu item in the study table.
Check the Recalculate Descriptors When Models are Edited checkbox in the Molecule Preferences control panel.
Immediately after picking the sulfur atom, QSAR+ checks and fixes the number of hydrogens, recalculates charges, minimizes the molecule, and recalculates the descriptors in the study table corresponding to model dbh02_1.
Saving the study
You can save your QSAR study, including molecules and the QSAR study table, using the Cerius2 Save Session function:
Summary
This tutorial familiarized you with QSAR+ by illustrating the steps you could perform to build a QSAR equation, including: