Combi-Chem


3f. Library comparison and augmentation


Comparing library diversity

1.   Start a new Cerius2 session

If you have just finished the previous section, start a new session by selecting File/New Session from the Visualizer menu bar. Click Confirm in the Re-initialize All message window. You can also start a brand new Cerius2 session.

2.   Open an empty study table

Go to the LIBRARY ANALYSIS card located in the COMBI-CHEM I deck and select the Show Study Table item.

3.   Recall the benzodiazepine library

Open the Open Study Table control panel by selecting the File/Open... item from the menu bar in the Study Table control panel. Select the table benzo.tbl and click the OPEN button. Alternatively, if you did not complete the earlier part of this tutorial, select the table ./Cerius2-Resources/COMBICHEM/demos/benzo_720.tbl and click the OPEN button.

4.   Set the molecule import defaults

Open the Molecule Preferences control panel by selecting the Preferences/Molecules... item from the menu bar in the Study Table control panel. Uncheck both the Add Hydrogens and the Minimize Energy using check boxes.

5.   Import the WDI structures into the study table

Open the Add Molecules from SD File control panel by selecting the Molecules/From SD File... item from the menu bar in the Study Table control panel. Select the file ./Cerius2-Resources/COMBICHEM/wdi/wdi-400.sd and click the SELECT pushbutton.

The SD file is read, and the number of structures in the file is reported to be 399.

Click the Preferences... pushbutton from the top right of the Add Molecules from SD File control panel to open the SD File Preferences control panel. Check the Delete Model After Adding check box and leave the other controls unchanged. Click the IMPORT MOLECULES pushbutton on the Add Molecules from SD File control panel.

You can follow the progress of the import process in the Cerius2 interrupt window.

Importing the structures and calculating the descriptors takes about 5 minutes on an Indigo2 R10000 workstation. After the import has finished, you can close all the control panels except the Study Table control panel.

6.   Perform a PCA analysis

Run a simple PCA analysis as described in the earlier example, Using principal component analysis. Remember to set the Label Row by popup to Star in the 3D Plot panel before starting the analysis.

If you forgot to set up the plotting options, you can re-plot the compounds using the same 3D Plot Samples control panel.

7.   Define compound libraries

Go to the LIBRARY COMPARISON card located in the COMBI-CHEM I deck and select the Define Libraries item.

This opens the Define Libraries control panel.

Select rows 1-720 in the study table by selecting the first row, scrolling down to row 720, and <Shift>-clicking it. Identify these rows as Library A by ensuring that the Set Selected Rows as Library entry box is set to LibA and clicking the associated action button in the Define Libraries control panel. Select rows 721-1119 in the same way: select row 721, scroll down, and <Shift>-click row 1119. Identify these rows as Library B by ensuring that the Set Selected Rows as Library entry box is set to LibB and clicking the associated action button in the Define Libraries control panel.
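The row bookkeeping in this step can be checked with a short Python sketch. The function name assign_libraries is hypothetical and is not part of Cerius2; the sketch only mirrors the row ranges given above.

```python
def assign_libraries(ranges):
    """Map 1-based row numbers to library names.

    `ranges` maps a library name to an inclusive (first_row, last_row) pair.
    Hypothetical helper for illustration; not a Cerius2 function.
    """
    labels = {}
    for name, (first, last) in ranges.items():
        for row in range(first, last + 1):
            labels[row] = name
    return labels


labels = assign_libraries({"LibA": (1, 720), "LibB": (721, 1119)})
counts = {name: sum(1 for v in labels.values() if v == name)
          for name in ("LibA", "LibB")}
print(counts)  # {'LibA': 720, 'LibB': 399}
```

The LibB count matches the 399 structures reported when the SD file was read.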

You can now close the Define Libraries control panel.

8.   Visualizing compound libraries

From the tool bar in the Study Table control panel, click the 3D Plot tool to open the 3D Plot Samples control panel. Set the X, Y, and Z coordinates to PC1, PC2, and PC3, respectively. Set the Label Row by popup to Star. Select the column Library in the study table, set the Colors entry box to 2, and click the Color using Selected Column action button. Do not change the other defaults in the 3D Plot Samples control panel. Then click the 3D Plot pushbutton in the 3D Plot Samples control panel.

The 3D graph appears in the Cerius2 Models window.

The benzodiazepine library and WDI compounds appear in grey and green, respectively. You can see that the WDI compounds cover more of the PCA space than the benzodiazepines. The next step assesses numerically whether the WDI compounds offer more diversity than the benzodiazepine library.

9.   Comparing compound libraries

Go to the LIBRARY COMPARISON card located in the COMBI-CHEM I deck and select the Compare Libraries --> Diversity Integral item to open the Compare Libraries Diversity Integral control panel. Set Library 1 and Library 2 to LibA and LibB, respectively.

The settings on this panel define that 1000 points are to be used to sample PCA space. Also, 1007 samples (90% of the compounds) are to be taken from the study table to perform the library comparison.

Click the Compare Libraries pushbutton in the Compare Libraries Diversity Integral control panel.

For each of the 1000 sampling points, distance information to the closest compound in Library A and Library B is provided. The overall diversity assessment is based on statistical analysis of these distances and is provided in the text window. For detailed information on diversity measurements using this technique (diversity integral), please refer to the Combinatorial Chemistry Methodologies section of this documentation.
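The idea of sampling property space and measuring the distance to the closest compound in each library can be illustrated with a minimal Python sketch. This is not the Cerius2 diversity integral, whose exact statistics are described in the methodology section. It is a simplified stand-in that assumes uniform random sampling inside a bounding box and uses the mean nearest-compound distance as the summary, so a smaller value means the library covers the sampled space more closely. The names mean_nearest_distance and n_compared are hypothetical.

```python
import math
import random

# 90% of the 720 + 399 = 1119 compounds in the study table
n_compared = round(0.9 * (720 + 399))  # 1007


def mean_nearest_distance(library, bounds, n_points=1000, seed=0):
    """Average distance from random sample points to the closest compound.

    `library` is a list of coordinate tuples (for example PC1, PC2, PC3) and
    `bounds` is a list of (low, high) pairs, one per coordinate.
    Illustrative stand-in only; not the Cerius2 implementation.
    """
    rng = random.Random(seed)  # same seed gives the same sample points
    total = 0.0
    for _ in range(n_points):
        point = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        total += min(math.dist(point, c) for c in library)
    return total / n_points
```

Calling the function with the same seed for both libraries compares them against identical sample points, which keeps the comparison fair.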

10.   Comparing diversity scores

As shown in the text window, the sample of WDI compounds is indeed more diverse than the benzodiazepine library.

You can now close the Compare Libraries Diversity Integral control panel.

Distance-based library augmentation

11.   Continuing from the previous example

12.   Library augmentation

Go to the LIBRARY COMPARISON card located in the COMBI-CHEM I deck and select the Complement Library item.

This opens the Complement Library control panel.

Set Diverse Molecules to 50 and set From Library and To Library to LibA and LibB, respectively.

These settings specify that 50 compounds should be selected from the benzodiazepine library to complement the set of WDI compounds.

Click the SELECT pushbutton in the Complement Library control panel.

Compounds are selected using a Monte Carlo optimization of the MaxMin diversity function, and the Sample Diversity graph is updated to reflect the selection of compounds.
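A Monte Carlo optimization of a MaxMin function can be sketched in Python as follows. The sketch assumes a zero-temperature scheme (a random swap is kept only if the score does not decrease) and Euclidean distances in property space. The acceptance rule, step count, and function names are assumptions for illustration; the tutorial does not state the Cerius2 settings.

```python
import math
import random


def maxmin_score(selected, fixed):
    """Smallest distance from any selected compound to any other selected
    compound or to any compound in the fixed set."""
    best = math.inf
    for i, a in enumerate(selected):
        for b in selected[i + 1:]:
            best = min(best, math.dist(a, b))
        for b in fixed:
            best = min(best, math.dist(a, b))
    return best


def complement_library(candidates, fixed, k, n_steps=2000, seed=0):
    """Pick k candidates that maximize the MaxMin score against `fixed`.

    Hypothetical sketch: requires 0 < k < len(candidates).
    Returns (sorted candidate indices, final score).
    """
    rng = random.Random(seed)
    indices = list(range(len(candidates)))
    rng.shuffle(indices)
    chosen, pool = indices[:k], indices[k:]
    score = maxmin_score([candidates[i] for i in chosen], fixed)
    for _ in range(n_steps):
        i = rng.randrange(len(chosen))
        j = rng.randrange(len(pool))
        chosen[i], pool[j] = pool[j], chosen[i]      # propose a swap
        new_score = maxmin_score([candidates[c] for c in chosen], fixed)
        if new_score >= score:
            score = new_score                        # accept
        else:
            chosen[i], pool[j] = pool[j], chosen[i]  # reject: undo the swap
    return sorted(chosen), score
```

In the setting of this tutorial, candidates would hold the benzodiazepine compounds, fixed the WDI compounds, and k would be 50.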

13.   Interpretation of the results

The compounds highlighted in green are the 399 WDI compounds, which are taken as a fixed set in the optimization process. The compounds in red are those that have been selected from the benzodiazepine library to complement the set of WDI compounds. These compounds also appear as selected rows in the study table. The compounds in grey are the remaining compounds from the benzodiazepine library.

Close the Complement Library control panel.

Hole identification and hole filling

14.   Continuing from the previous example

15.   Hole identification

Go to the LIBRARY COMPARISON card located in the COMBI-CHEM I deck and select the Holes in Property Space/Find Holes item to open the Find Holes control panel. Uncheck All Molecules in Study Table and check Molecules in Library, set to LibB.

These settings specify that 10,000 points are to be used to sample PCA space, looking for holes in the set of WDI compounds using a random-point search technique. Click the FIND HOLES pushbutton in the Find Holes control panel. The 100 largest holes are recorded and output to the study table and to the text window.
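A random-point hole search can be sketched in Python as follows. The sketch assumes that points are drawn uniformly inside the bounding box of the library and that the size of a hole is the distance from the sample point to the nearest compound. These are assumptions for illustration, and find_holes is a hypothetical name, not a Cerius2 function.

```python
import heapq
import math
import random


def find_holes(compounds, n_points=10000, n_holes=100, seed=0):
    """Return the n_holes largest (radius, center) pairs found by sampling.

    `compounds` is a list of coordinate tuples in property space.
    Illustrative sketch only; not the Cerius2 implementation.
    """
    rng = random.Random(seed)
    n_dims = len(compounds[0])
    # Bounding box of the library along each coordinate
    bounds = [(min(c[d] for c in compounds), max(c[d] for c in compounds))
              for d in range(n_dims)]
    holes = []
    for _ in range(n_points):
        center = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        radius = min(math.dist(center, c) for c in compounds)
        holes.append((radius, center))
    return heapq.nlargest(n_holes, holes)
```

Each returned pair gives the empty radius around a sample point, with the largest holes first.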

16.   Hole filling

Go to the LIBRARY COMPARISON card located in the COMBI-CHEM I deck and select the Holes in Property Space/Fill Holes item to open the Fill Holes control panel. Uncheck All Molecules in Study Table and check Molecules in Library, set to LibA.

These settings specify that all the holes reported in the previous step should be considered. Also, only compounds from the benzodiazepine library should be used to fill the holes.

Click the FILL HOLES pushbutton in the Fill Holes control panel.

The compounds from the benzodiazepine library that successfully fill holes are reported in the text window. These compounds also appear as selected rows in the study table.
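The tutorial does not state the criterion Cerius2 uses to decide that a compound fills a hole. The Python sketch below assumes one plausible rule: a candidate fills a hole if it lies closer to the hole center than the hole radius, and the closest such candidate is reported. The function name fill_holes and the (radius, center) layout are hypothetical.

```python
import math


def fill_holes(holes, candidates):
    """Map hole index -> index of the candidate compound that fills it.

    `holes` is a list of (radius, center) pairs and `candidates` a list of
    coordinate tuples. Assumed rule: the closest candidate fills the hole
    only if it lies strictly inside the hole radius.
    """
    filled = {}
    for h, (radius, center) in enumerate(holes):
        dists = [math.dist(center, c) for c in candidates]
        best = min(range(len(candidates)), key=dists.__getitem__)
        if dists[best] < radius:
            filled[h] = best
    return filled


holes = [(0.7, (0.5, 0.5)), (0.2, (3.0, 3.0))]
candidates = [(0.4, 0.6), (2.0, 2.0)]
print(fill_holes(holes, candidates))  # {0: 0}
```

The second hole stays unfilled because its nearest candidate lies outside the hole radius.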

Distance histogram - library comparison

1.   Start a new Cerius2 session

If you have just finished the previous section, start a new session by selecting File/New Session from the Visualizer menu bar. Click Confirm in the Re-initialize All message window. Alternately, you can start a brand new Cerius2 session.

2.   Open an empty study table

Go to the LIBRARY ANALYSIS card located in the COMBI-CHEM I deck and select the Show Study Table item.

3.   Recall a file with library definitions

If you are continuing from the previous section of this tutorial: Open the Open Study Table control panel by selecting the File/Open... item from the menu bar in the Study Table control panel. Select the table benzo_dipep_iso.tbl and click the OPEN button. Alternately, if you did not complete the earlier part of this tutorial, select the table ./Cerius2-Resources/COMBICHEM/demos/benzo_dipep_iso.tbl and click the OPEN button.

4.   Library comparison

Go to the LIBRARY COMPARISON card located in the COMBI-CHEM I deck and select the Compare Libraries --> Similarity item.

This opens the Compare Libraries Similarity control panel.

Set Candidate Library and Reference Library to dipep and benzo, respectively. Change the number at the bottom right next to Minimum Distance to 1.000. Now click Preferences... to bring up the Compare Libraries Preferences control panel and set Number of bins for histograms to 100.

These settings specify that, for every compound in the Candidate dipeptide library, the closest compound in the Reference library will be identified and the distance reported. Both similarity plot (minimum distance for each candidate compound) and similarity histogram (histogram distribution of nearest neighbor distances) will be reported. Since PC1, PC2 and PC3 are the only independent variables identified, the inter-compound distances will be based on these parameters.
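The two outputs described here can be sketched in Python: a list of nearest-neighbor distances (the similarity plot) and a binned count of those distances (the similarity histogram). The sketch assumes Euclidean distances over the coordinates supplied, for example PC1, PC2, and PC3, and equal-width bins. The function names are hypothetical.

```python
import math


def nearest_neighbor_distances(candidate, reference):
    """For each candidate compound, the distance to its closest reference compound."""
    return [min(math.dist(c, r) for r in reference) for c in candidate]


def histogram(values, n_bins=100):
    """Equal-width histogram; returns (lowest value, bin width, counts)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all values are equal
    counts = [0] * n_bins
    for v in values:
        # The maximum value falls in the last bin
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return lo, width, counts


candidate = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (3.0, 0.0, 0.0)]
distances = nearest_neighbor_distances(candidate, reference)
print(distances)                       # [1.0, 4.0]
print(histogram(distances, n_bins=3))  # (1.0, 1.0, [1, 0, 1])
```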

Click the Compare Libraries pushbutton in the Compare Libraries Similarity control panel.

The Cerius2 Graphs panel displays both representations.

5.   Interpretation of the results

The graph on the left (similarity plot) represents, for every compound in the dipeptide library (X axis), the distance to the closest compound in the benzodiazepine library (Y axis). The graph on the right (similarity histogram) represents the distribution of those same distances, with the number of compounds in each bin plotted against the distance bins.

Distance histograms shifted to the left indicate redundancy between two libraries, whereas histograms shifted to the right indicate complementarity. For this reason, we may be interested in isolating the compounds that fall towards the right of the histogram as a complementary set. This is the subject of the following exercise, "Distance histogram - library augmentation".

Distance histogram - library augmentation

... Continuing from the previous example...

6.   Library augmentation

In the previous Compare Libraries Similarity panel, check the Select Molecules in Candidate Library box and set the Minimum Distance requirements to 1.0. In the Compare Libraries Preferences panel, set the Number of bins for histograms to 100.

This will select only those molecules in the Candidate library that meet those distance requirements.

Click the Compare Libraries pushbutton in the Compare Libraries Similarity control panel.

The Cerius2 Study Table now shows those selected molecules that meet the specified distance requirements.
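The selection rule can be sketched in Python as a filter on nearest-neighbor distance. The sketch assumes Euclidean distances and assumes that a compound exactly at the Minimum Distance value is selected; the tutorial does not say how Cerius2 treats that boundary case. The function name select_by_min_distance is hypothetical.

```python
import math


def select_by_min_distance(candidate, reference, min_distance=1.0):
    """Return 0-based indices of candidate compounds whose nearest reference
    compound is at least `min_distance` away (boundary included by assumption)."""
    selected = []
    for row, compound in enumerate(candidate):
        nearest = min(math.dist(compound, ref) for ref in reference)
        if nearest >= min_distance:
            selected.append(row)
    return selected


candidate = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.5, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.2), (3.0, 0.0, 0.0)]
print(select_by_min_distance(candidate, reference, min_distance=1.0))  # [1]
```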

Note

Distance thresholds for selecting compounds may be chosen according to the number of samples desired or derived from more involved descriptor validation experiments. For example, when working with fingerprints and Tanimoto distances, values of 0.15 have been recommended as thresholds (Brown and Martin 1996; 1997).
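For fingerprints, a Tanimoto distance threshold can be applied as in the Python sketch below. It assumes fingerprints are stored as sets of on-bit indices and uses the standard definition of Tanimoto distance, one minus the ratio of shared bits to total distinct bits. The function names and the toy fingerprints are hypothetical.

```python
def tanimoto_distance(fp_a, fp_b):
    """1 - |A & B| / |A | B| for fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # assumption: two empty fingerprints are treated as identical
    return 1.0 - len(fp_a & fp_b) / union


def select_beyond_threshold(candidates, reference, threshold=0.15):
    """Indices of candidates whose nearest reference fingerprint is at least
    `threshold` away in Tanimoto distance."""
    return [i for i, fp in enumerate(candidates)
            if min(tanimoto_distance(fp, ref) for ref in reference) >= threshold]


reference = [{1, 2, 3, 4}]
candidates = [{1, 2, 3, 4}, {7, 8}]
print(select_beyond_threshold(candidates, reference))  # [1]
```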




Last updated May 19, 2000 at 01:50PM Pacific Daylight Time.
Copyright © 2000, Molecular Simulations Inc. All rights reserved.