MSI Product Previous Next Contents Index Top
Combi-Chem


3e. C2·LibProfile Penalty based Diversity and Similarity Selection

Diversity and Similarity selections can be influenced by the need to preferentially select molecules with desirable properties. This tutorial demonstrates one diversity selection method; however, the procedure is applicable to all other selection methods. The example given in this tutorial uses the study table. This functionality is also available using BDF files.

The design of libraries can be constrained at three levels: product property ranges, product property profiles, and reagent properties. Penalties can be imposed simultaneously at any or all levels.

Product Diversity based on property ranges

In this section, a diverse subset of a dipeptide virtual library will be selected in which the desired ranges of molecular weight and hydrogen bond acceptors will be controlled.

15.   Open the study table and create a 3D plot

Go to the LIBRARY ANALYSIS card located on the COMBICHEM-I card deck and select Show Study Table. Open the file Cerius2-Resources/COMBICHEM/demos/dipep_400_pca.tbl.

Create a 3D plot from the first three principal components. First, select the first three columns marked PC in the Study Table.

Click the 3D Plot button on the Study Table button bar (this is the sixth button from the right, marked with a picture of a cube).

This will bring up the 3D Plot Samples control panel.

On the 3D Plot Samples control panel, click Set XYZ and then 3D Plot.

16.   Examining Property Profiles for the Virtual Library

The library design in the next step will constrain the molecular weight and number of hydrogen bond acceptors. First the distributions of these properties in the virtual library will be examined.

In the study table, scroll right and select the MW and Hbond acceptor columns (Control-click to add columns to the selection). Then, push the Histogram button (the button with a tiny picture of a histogram) in the center right of the Study Table tool bar.

17.   Set up the penalty ranges

From the LIBRARY ANALYSIS card on the COMBICHEM-I deck select Restraints Property Ranges. Push the Get Properties button.

This will show all properties for which restraints can be imposed (typically all those present in the study table or BDF file).

Scroll down and select MW from the list of properties. Set the Upper Bound to 225. Push the ADD button.

Now select Hbond acceptor and set the Upper Bound to 8. Push ADD, then LIST.

The current penalty profile is listed in the text port. Note that this profile can be saved for future use using the Save Ranges... button.
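The effect of an upper bound can be sketched in a few lines of Python. The tutorial does not state the functional form of the penalty used by C2·LibProfile, so the linear distance outside the range used here, and the names range_penalty and molecule_penalty, are assumptions made purely for illustration.

```python
def range_penalty(value, lower=None, upper=None):
    """Return 0.0 inside the range, otherwise the distance outside it (assumed form)."""
    if lower is not None and value < lower:
        return lower - value
    if upper is not None and value > upper:
        return value - upper
    return 0.0


# Restraints mirroring the tutorial values: MW <= 225 and Hbond acceptor <= 8.
# The default weight of 1.0 is an assumption.
restraints = {
    "MW": {"upper": 225.0, "weight": 1.0},
    "Hbond acceptor": {"upper": 8.0, "weight": 1.0},
}


def molecule_penalty(props, restraints):
    """Weighted sum of the range penalties for one molecule."""
    total = 0.0
    for name, r in restraints.items():
        total += r["weight"] * range_penalty(props[name], r.get("lower"), r.get("upper"))
    return total
```

Under these assumptions a molecule with MW 200 and 6 acceptors carries no penalty, while one with MW 250 and 10 acceptors is penalized on both properties.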

From the LIBRARY ANALYSIS card on the COMBICHEM-I deck select Select Molecules Diverse Distance Based.

Change Optimize to Diversity-Penalty. Enter 25 Molecules and click the SELECT button.

Note that the selection has been constrained to the left hand side of the 3D chemical space plot. The first principal component is primarily size related and thus "favorable" molecules are located towards the left.

Back on the Penalty Ranges panel, reselect MW. Change the Weight to 100. Push ADD to modify the MW restraint and LIST to see the new list.

This will now weight this restraint 100 times more than the diversity or the Hbond acceptor penalty.

From the Select Diverse panel select 25 Molecules and press SELECT.

Note that the selection is even more concentrated towards the left hand end of the space.

To emphasize the diversity more than the properties, the weights on the restraints can be set to fractional values. Diversity always carries a weight of 1.0.
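The balance between diversity and the weighted restraints can be illustrated with a small sketch. It assumes that the quantity being optimized is the diversity (weight 1.0) minus the weighted sum of the penalties; the tutorial does not give the actual expression, and the two candidate subsets and their numbers are invented for the example.

```python
def objective(diversity, penalties, weights):
    """Diversity (fixed weight 1.0) minus the weighted sum of restraint penalties."""
    weighted = sum(weights[name] * value for name, value in penalties.items())
    return 1.0 * diversity - weighted


# Two hypothetical candidate subsets (made-up numbers, not Cerius2 output).
candidates = {
    "A": {"diversity": 10.0, "penalties": {"MW": 0.5, "Hbond acceptor": 0.0}},
    "B": {"diversity": 6.0, "penalties": {"MW": 0.0, "Hbond acceptor": 0.0}},
}


def preferred(weights):
    """Name of the candidate subset with the highest objective value."""
    return max(
        candidates,
        key=lambda name: objective(
            candidates[name]["diversity"], candidates[name]["penalties"], weights
        ),
    )
```

With an MW weight of 1.0 or 0.1 the more diverse subset A is preferred; raising the MW weight to 100 makes the small MW violation dominate, so subset B is preferred.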

18.   Examining Property Profiles for the Subset Library

In the Study Table, select the MW and Hbond acceptor columns without unselecting the rows that form the subset library (Control-click to add the column selections to the row selections).

Push the Histogram button (the button with a tiny picture of a histogram) in the center right of the Study Table tool bar.

Note that the distributions in the design library have been constrained to be within the desired ranges.

Diversity based on profiling product properties of an existing dataset

In this section a diverse subset of a dipeptide library will be selected that is designed to have similar property profiles to a set of benzodiazepines.

1.   Open the benzodiazepines study table

Select File/New Session from the Visualizer control panel and click Confirm.

This will reinitialize all parameters and settings for this tutorial.

Go to the LIBRARY ANALYSIS card located on the COMBICHEM-I card deck and select Show Study Table. Open the file Cerius2-Resources/COMBICHEM/demos/benzo720.tbl.

2.   Create the profile of a set of structures

In the study table, scroll right and select the Hbond acceptor column. From the Study Table, select Preferences/Histograms....

This brings up the Histogram Preferences control panel.

Check the Create Penalty Profile File checkbox. Set the Number of Bins to 4 and Error for lower and upper bounds to 0.05. Under Penalty Profile File Name, enter benzo720.ppf. Then, push the Create Histogram button (you can select more than one property to profile simultaneously by selecting multiple columns).

Note that the range of values is from 1 to 4.

Uncheck the Create Penalty Profile File checkbox before closing the panel.
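Conceptually, a property profile is a binned distribution of a property over a reference set. The sketch below computes the fraction of values in each of a given number of equal-width bins. It is not the format of the .ppf file, it does not model the Error for lower and upper bounds setting, and the function name property_profile and the sample values are hypothetical.

```python
def property_profile(values, n_bins):
    """Fraction of values falling in each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything goes in the first bin
        else:
            # The maximum value is placed in the last bin.
            idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


# Hypothetical Hbond acceptor counts spanning 1 to 4, binned into 4 bins.
example_profile = property_profile([1, 1, 2, 2, 2, 3, 4, 4], 4)
```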

3.   Apply the profile to selection

Open dipep_400_pca.tbl and create a 3D plot.

Create a 3D plot from the first three principal components. First, select the first three columns marked PC in the Study Table.

Click the 3D Plot button on the Study Table button bar (this is the sixth button from the right, marked with a picture of a cube).

This will bring up the 3D Plot Samples control panel.

On the 3D Plot Samples control panel, click Set XYZ and then 3D Plot.

Select the Hbond acceptor column and push the histogram button.

Note that the distribution has a maximum frequency at value 7.

From the LIBRARY ANALYSIS card select the Restraints Clear All Penalties option.

From the LIBRARY ANALYSIS card select Restraints Property Profiles. Select the file benzo720.ppf created above. Select the Hbond acceptor column and push ADD. Push the LIST button to see that only 1 restraint is set.

From the LIBRARY ANALYSIS card select the Select Molecules Diverse Distance-based to bring up the Select Diverse control panel. Click the Preferences... button on the Select Diverse control panel to bring up the Analysis Preferences. Change Monte Carlo Steps to 100000 and the Terminate after Idle Steps to 10000.

Close the Preferences panel, and, in the Select Diverse panel, enter 25 Molecules, change Optimize to Diversity-Penalty and push the Select button.
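The two preference settings can be read as a cap on the total number of trial moves and a stopping rule for when no improvement has been found for a while. The following is a minimal sketch of that idea, not the Cerius2 algorithm: it swaps one member of the subset at a time, accepts only improving swaps, and uses the smallest pairwise distance as an assumed diversity measure. All names and data are hypothetical.

```python
import random
from itertools import combinations
from math import dist


def min_distance(points, chosen):
    """Smallest pairwise distance within the chosen subset (assumed diversity measure)."""
    return min(dist(points[i], points[j]) for i, j in combinations(chosen, 2))


def select_diverse(points, k, max_steps, idle_limit, seed=0):
    """Random-swap search; stops after max_steps or idle_limit steps without improvement."""
    rng = random.Random(seed)
    chosen = rng.sample(range(len(points)), k)
    best = min_distance(points, chosen)
    idle = 0
    for _ in range(max_steps):
        if idle >= idle_limit:
            break
        trial = list(chosen)
        unused = [i for i in range(len(points)) if i not in chosen]
        # Replace one randomly chosen member with a randomly chosen non-member.
        trial[rng.randrange(k)] = rng.choice(unused)
        trial_score = min_distance(points, trial)
        if trial_score > best:
            chosen, best, idle = trial, trial_score, 0
        else:
            idle += 1
    return sorted(chosen), best


# Ten hypothetical molecules on a one-dimensional descriptor axis.
points = [(float(i),) for i in range(10)]
subset, diversity = select_diverse(points, k=2, max_steps=20000, idle_limit=1000)
```

Raising the idle limit makes it less likely that the search stops before reaching a good subset, at the cost of a longer run.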

Keeping the selected rows, select the Hbond acceptor column (control-click on the column header). Push the Histogram button (the button with a tiny picture of a histogram) in the center right of the Study Table tool bar.

Note that the selection is constrained so that the distribution is skewed towards the lower values typical in the benzodiazepines. The diversity plot shows that a diverse selection has still been made.
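One way to picture a profile restraint is as a mismatch between the binned distribution of the selection and the target profile. The sketch below sums the absolute differences between bin fractions. The actual penalty used by the program is not described in this tutorial, and the bin edges, target fractions, and function names here are hypothetical.

```python
from bisect import bisect_right


def bin_fractions(values, edges):
    """Fraction of values in each bin; values outside the edges fall in no bin."""
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # still counted in the denominator below
        idx = min(bisect_right(edges, v) - 1, n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def profile_penalty(values, target_fractions, edges):
    """Sum of absolute differences between observed and target bin fractions."""
    observed = bin_fractions(values, edges)
    return sum(abs(o - t) for o, t in zip(observed, target_fractions))


# Hypothetical target profile for Hbond acceptor counts 1 to 4.
edges = [0.5, 1.5, 2.5, 3.5, 4.5]
target = [0.25, 0.375, 0.125, 0.25]
```

A selection whose values all lie above the profile range (for example, acceptor counts of 6 to 8) receives the largest mismatch, which is why a penalized selection drifts towards the lower values.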

Note

To include the combinatorial constraint in the above selections, access the diversity selection from the R-Group Subsetting menu rather than the Select Molecules menu. For a selection that is restrained by the properties but not by any diversity or similarity, select the Restrained item from Select Molecules or R-Group Subsetting.

Diversity Constrained by Reagent Properties

In this section a diversity selection will be made from a dipeptide library while controlling the properties of the reagents which make up the design library.

1.   Setting up the Reagent Information

If it is not already open, reopen dipep_400_pca.tbl and recreate the 3D plot.

From the LIBRARY ANALYSIS card select the Restraints Clear All Penalties option.

From the LIBRARY ANALYSIS card select Restraints Reagent Penalties. Push the Browse... button to select the reagents table:
Cerius2-Resources/COMBICHEM/demos/dipep_reagents.dat.

Push the Browse... button to select the suppliers table:
Cerius2-Resources/COMBICHEM/demos/dipep_suppliers.dat.

Check the Total Reagent Cost checkbox.

Reagent based constraints operate on two text files. The first contains information about each reagent in the virtual library (name, MW, supplier, cost per unit and a user definable penalty for using that reagent); this file should be prepared from the results of the database searches used to select reagents to build the virtual library. The second file contains information about each supplier (name and a user definable penalty to be attached to the use of that supplier).
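The kind of bookkeeping these two files make possible can be sketched as follows. The layout of dipep_reagents.dat and dipep_suppliers.dat is not shown in this tutorial, so the dictionaries below are hypothetical stand-ins with invented values, and the sketch assumes that each distinct reagent is purchased once regardless of how many positions use it.

```python
# Hypothetical reagent records (not the contents of dipep_reagents.dat).
reagents = {
    "ala": {"supplier": "SupplierA", "cost": 10.0},
    "arg": {"supplier": "SupplierB", "cost": 25.0},
    "asn": {"supplier": "SupplierA", "cost": 15.0},
}

# Hypothetical supplier penalties (not the contents of dipep_suppliers.dat).
supplier_penalty = {"SupplierA": 0.0, "SupplierB": 5.0}


def reagents_used(r1, r2):
    """Distinct reagents needed across both R-group positions."""
    return set(r1) | set(r2)


def total_reagent_cost(r1, r2):
    """Sum of the unit costs of the distinct reagents used."""
    return sum(reagents[name]["cost"] for name in reagents_used(r1, r2))


def total_supplier_penalty(r1, r2):
    """Sum of the penalties of the distinct suppliers involved."""
    used = {reagents[name]["supplier"] for name in reagents_used(r1, r2)}
    return sum(supplier_penalty[s] for s in used)
```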

2.   Selecting a diverse library and evaluating its cost to make

From the LIBRARY ANALYSIS card select the Rgroup Subsetting Diverse Library to bring up the Diverse Library control panel. Click the Preferences... button on the Diverse Library control panel to bring up the Analysis Preferences. Change Monte Carlo Steps to 100000 and the Terminate after Idle Steps to 10000.

Close the Preferences panel, and, in the Diverse Library panel, enter 5,5 in the Number of fragments to select in each Rgroup text box. Push the Estimate Optimum Number of Cells button. Make sure the Optimize menu is set to Only Diversity. Push the SELECT RGROUP FRAGMENTS button.

This will select a 5x5 library on the basis of diversity only.

From the LIBRARY ANALYSIS card select Restraints Calculate Penalties. Push the Calculate Penalty Function for Selected Rows button.

A report will be written to the text port, showing the total cost of the library.

3.   Constraining the diverse library to reduce cost

Change the Optimize menu on the Diverse Library control panel from Only Diversity to Diversity-Penalty. Push the SELECT RGROUP FRAGMENTS button.

This will select a 5x5 library on the basis of diversity and reagent cost.

On the Calculate Penalty control panel, push the Calculate Penalty Function for Selected Rows button.

The report in the text port shows that the cost of the library has been halved.

4.   Constraining the diverse library to reduce the total number of reagents

On the Reagent Penalties control panel, uncheck the Total Reagent Cost checkbox and check the Number of Different Reagents box. Change the Weight of this penalty to 10.

On the Diverse Library control panel, push the SELECT RGROUP FRAGMENTS button.

This will select a 5x5 library on the basis of diversity and total number of reagents used (since R1 and R2 are selected from the same list it is possible to use the same reagents at both positions).

On the Calculate Penalty control panel, push the Calculate Penalty Function for Selected Rows button.

A total of only 5 reagents are now being used.
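The reason a 5x5 library can need as few as 5 reagents is that the count is taken over distinct reagents across both positions. A minimal illustration, with hypothetical reagent lists and a hypothetical function name:

```python
def number_of_different_reagents(r1, r2):
    """Count distinct reagents across both R-group positions."""
    return len(set(r1) | set(r2))


# A 5x5 library with no overlap between positions (hypothetical lists).
disjoint = number_of_different_reagents(
    ["ala", "arg", "asn", "asp", "cys"], ["gln", "glu", "gly", "his", "ile"]
)

# A 5x5 library that reuses the same five reagents at both positions.
shared = number_of_different_reagents(
    ["ala", "arg", "asn", "asp", "cys"], ["cys", "asp", "asn", "arg", "ala"]
)
```

The first library needs ten reagents and the second only five, so a penalty on this count favors libraries that reuse reagents at both positions.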

5.   Fixing some reagents in the design library

On the Diverse Library control panel, push the Fix Fragments in Rgroups... button.

This brings up the Fix Fragments control panel, which allows some reagents to be fixed in the design library. The selection methods will augment this selection to best satisfy the diversity and penalty function.

On the Fix Fragments control panel, push the Get Rgroup Fragments button. In the R1 fix column toggle on ala and arg (i.e., click the cells in the R1 fix column corresponding to ala and arg, changing each value from No to Yes). Scroll right to the R2 fix column and toggle on asn and asp.

Now, on the Diverse Library control panel, push the SELECT RGROUP FRAGMENTS button.

This will select a 5x5 library on the basis of diversity and total number of reagents used but will only vary 3 of the 5 reagents at each position.

Finally, on the Calculate Penalty control panel, push the Calculate Penalty Function for Selected Rows button.

A total of 5 reagents are being used, including the two required reagents at each position.
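Fixing fragments restricts the search to subsets that contain the required reagents, and the remaining positions are filled to give the best score. The sketch below shows the idea for a single position using exhaustive enumeration and the total pairwise distance on one descriptor as an assumed diversity measure. Cerius2 uses its own selection methods; the descriptor values and function names are hypothetical.

```python
from itertools import combinations


def total_pairwise_distance(values):
    """Sum of absolute differences over all pairs (assumed diversity measure)."""
    return sum(abs(a - b) for a, b in combinations(values, 2))


def augment(fixed, pool, k, descriptor):
    """Best k-member subset of pool that contains every fixed member."""
    free = [name for name in pool if name not in fixed]
    best_subset, best_score = None, float("-inf")
    for extra in combinations(free, k - len(fixed)):
        subset = list(fixed) + list(extra)
        score = total_pairwise_distance([descriptor[n] for n in subset])
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


# Hypothetical one-dimensional descriptor values for six fragments.
descriptor = {"ala": 0.0, "arg": 1.0, "asn": 2.0, "asp": 4.0, "cys": 6.0, "gln": 9.0}
result = augment(["ala", "arg"], list(descriptor), 4, descriptor)
```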

This tutorial has shown the use of reagent and product penalties separately with diversity. Both types of constraint can be applied simultaneously to any selection.

Advanced binning and factorial design

Additional binning options have been added as of Cerius2 version 4.5 to allow more user control over the binning of descriptors than had been possible in C2 Diversity. This tutorial shows the use of one of these binning options to do a factorial design on a small library. Such design methods might also be applied to individual reagent lists that will make up a combinatorial library.

1.   Open the study table

Go to the ADVANCED BINNING card on the COMBI-CHEM-I card deck and select Show Study Table.

Select the Descriptors/Select menu item in the Study Table. In the Descriptors control panel, select (use <Ctrl>-click) Rotbonds, Hbond acceptor, and AlogP98 (items 45, 46, and 50 in the table). Click the ADD pushbutton to add the descriptors to the study table.

Select the Molecules/From SD File menu item in the study table. In the Add Molecules from SD File control panel, select Cerius2-Resources/COMBICHEM/demos/benz_analogs.sd. Click the IMPORT MOLECULES button.

Once the molecules have been loaded, save the study table.

2.   Setting up the bins

Go to the ADVANCED BINNING card on the COMBI-CHEM-I card deck and select Define Binning. In the Define Binning control panel, click the Load properties button.

Clicking the properties in the listbox allows you to see the distributions of each property.

Select Population-weighed from the upper popup in the Define Binning control panel. Set the Number of bins to 3 and click the Bin all properties action button. Click the Display binning table button.

This opens the Binning Thresholds control panel, which is a table showing the break points for each bin of each property. Having examined the table, close this panel. The binning scheme can be saved and restored using the Load Binning and Save Binning buttons in the Define Binning control panel.
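If the population-based option is read as placing roughly equal numbers of molecules in each bin (an assumption, since the tutorial does not define it), the break points are simply quantiles of the property. A minimal sketch with hypothetical function names and values:

```python
from bisect import bisect_right


def equal_population_thresholds(values, n_bins):
    """Break points that split the sorted values into n_bins groups of similar size."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def assign_bin(value, thresholds):
    """Zero-based bin index; a value equal to a threshold goes to the upper bin."""
    return bisect_right(thresholds, value)


# Nine hypothetical property values split into 3 bins.
values = [9, 1, 5, 3, 7, 2, 8, 4, 6]
thresholds = equal_population_thresholds(values, 3)
bins = [assign_bin(v, thresholds) for v in sorted(values)]
```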

3.   Manual selection of compounds from bins

From the ADVANCED BINNING card on the COMBI-CHEM-I deck, select Analyze Binning. Check the Browse Models checkbox in the Bin Analysis control panel. Click the Display histogram action button.

This loads an interactive histogram into the model window showing the filled cells. The cell number, cell coordinates (i.e., bin numbers on each axis), and cell occupancies are shown in the listbox on the right side of the Cell Browser control panel.
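The relationship between cell coordinates, cell numbers, and occupancies can be sketched as below. The numbering convention shown (row-major, starting at 1, covering all cells rather than only the filled ones) is an assumption for illustration and may differ from the numbering displayed in the Cell Browser.

```python
from collections import Counter


def cell_number(coords, bins_per_axis):
    """Map zero-based bin coordinates to a one-based cell number (assumed convention)."""
    number = 0
    for c in coords:
        number = number * bins_per_axis + c
    return number + 1


def occupancies(all_coords, bins_per_axis):
    """Count how many molecules fall in each occupied cell."""
    return Counter(cell_number(c, bins_per_axis) for c in all_coords)


# Four hypothetical molecules binned on three properties with 3 bins each.
molecule_cells = [(0, 0, 0), (0, 0, 0), (0, 1, 2), (2, 2, 2)]
occupied = occupancies(molecule_cells, 3)
```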

In the histogram, click the blue dot for cell 1.

Molecules may be manually selected by toggling entries in the Selected column in the Cell Browser control panel from No to Yes. This process can be repeated for each occupied cell to select a subset of the library or reagent list. After each cell is processed, clicking Select Rows in Study Table updates the selected rows in the study table.

After processing all cells, the set of selected rows forms the total selection.

4.   Automatic selection of compounds from bins

Having created a binning scheme, cell-based diversity methods can be used to select compounds automatically.

From the ADVANCED BINNING card on the COMBI-CHEM-I deck, select Select Molecules. Set the Normalization popup to None. Select the Plot Cells in 3D Space box and click Select Molecules.

(The Optimum Binning option was the default binning scheme in previous releases of C2 Diversity.)

5.   Working with multiple libraries

Two libraries can be compared on the basis of the number of cells that are occupied by members of either or both libraries.

Select the Molecules/From SD File menu item in the study table. In the Add Molecules from SD File control panel, select the file Cerius2-Resources/COMBICHEM/demos/benzo720.sd. Choose the Range radio button and set the upper limit to 100. Click the IMPORT MOLECULES button to add the first 100 molecules to the study table.

Select the first 144 rows of the study table (i.e., the benzamidine entries). From the LIBRARY COMPARISON card on the COMBI-CHEM-I deck, select Define Libraries. In the Define Libraries control panel, click the Set Selected Rows as Library button to mark these structures as LibA. Repeat this process to mark the next 100 rows (the benzodiazepines) as LibB. Resave the study table.

From the ADVANCED BINNING card on the COMBI-CHEM-I deck, select Define Binning. In the Define Binning control panel, click the Load properties pushbutton to update the list and then the Bin all properties action button to set up a binning scheme combining the two libraries.

Click the Analyze binning pushbutton. In the Bin Analysis control panel, select All from the cells popup, then library from the from popup. Select LibA from the pulldown that appears. Click the Display histogram action button.

This shows the cells occupied by the benzamidines.

Select LibB from the Add from library pulldown and click the Add from library action button. This adds the new library to the existing histogram in a contrasting color.

At this point selections can be made from cells containing members of both libraries (for focussing one library to another) or from cells containing only one library (to augment the other).
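In terms of occupied cells, the comparison reduces to set operations. A minimal sketch with hypothetical cell numbers and a hypothetical function name:

```python
def compare_libraries(cells_a, cells_b):
    """Split occupied cells into those shared by both libraries and those unique to each."""
    a, b = set(cells_a), set(cells_b)
    return {"both": a & b, "only_a": a - b, "only_b": b - a}


# Hypothetical occupied cell numbers for LibA and LibB.
comparison = compare_libraries([1, 2, 5, 9], [2, 9, 14])
```

Cells in "both" are candidates for focusing one library on the other; cells in "only_a" or "only_b" are candidates for augmenting the other library.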




Last updated May 19, 2000 at 01:50PM Pacific Daylight Time.
Copyright © 2000, Molecular Simulations Inc. All rights reserved.