Combi-Chem


4c. Analog Builder


The C2·Analog Builder module generates analog structures from a model by substituting user-specified groups for hydrogen atoms in a parent structure. Generated structures are placed in their own model spaces and can then be studied using any of the Cerius2 tools and modules.

This section contains information on:

How the analog builder works

Outline of general procedure

Specific methodology for building analogs

Table 2

For information about            See

3D-Sketcher templates            The "Building Models" chapter in the Cerius2 Modeling Environment book.

Manipulating the analog table    The "Working with Tables" chapter in the Cerius2 Modeling Environment book.

Accessing the tools

The analog builder is (by default) available in two cards in Cerius2: the ANALOG BUILDER card in the BUILDERS 2 card deck and the LIBRARY CONSTRUCTION card in the COMBI-CHEM I card deck. The control panels available from both cards are identical.

Additional information

Please see the help for additional information on all controls in all control panels that are found in the analog builder. (To access help, click the right mouse button when the cursor is over an item about which you want information.)

How the analog builder works

Parent structure

The analog builder operates on the model in the current model space. This parent structure is typically an organic molecule with one or more hydrogen atoms that can readily be replaced by substituent groups to form analog structures.

Substitution points

Several hydrogen atoms can be selected and defined as R-group substitution points R1, R2, ... (Figure 1). The analog builder generates structures by replacing these selected atoms with substituent groups.

Figure 1

Substituent groups

Analog structures are built by replacing the hydrogen atoms at the substitution points with groups (for example methyl, phenyl, or hydroxy) chosen from the directory Cerius2-Models/templates/organics or from an sd file in which you have stored your own groups. You can specify as many substituent groups as you want for each substitution point. The analog builder creates structures for every possible combination of selected substituents.

You can also create analogs using double-linked R-groups, symmetrical R-groups, or multiply bonded R-groups.

Analog table

When you have specified all required substituent groups, a table is created showing all the analog structures that can be generated from the specified substituent groups. The total number of analog structures (and, therefore, entries in the table) is given by:

(Number of substituent groups specified for R1) x (Number of substituent groups specified for R2) x (Number of substituent groups specified for R3) x ...
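The size of the analog table can be checked outside Cerius2 with a short calculation. The following is a minimal Python sketch, not part of the analog builder itself; the substituent names and the dictionary layout are illustrative assumptions.

```python
from itertools import product
from math import prod

# Hypothetical substituent lists for three substitution points.
substituents = {
    "R1": ["methyl", "phenyl", "hydroxy"],
    "R2": ["methyl", "hydroxy"],
    "R3": ["phenyl", "hydroxy"],
}

def analog_count(substituents):
    """Multiply together the number of groups specified for each R-group."""
    return prod(len(groups) for groups in substituents.values())

def analog_table(substituents):
    """Build one row per combination of substituents, in R1, R2, ... order."""
    names = list(substituents)
    combos = product(*(substituents[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]

rows = analog_table(substituents)
print(analog_count(substituents), "analogs;", "first row:", rows[0])
```

Here three groups for R1 and two each for R2 and R3 give 3 x 2 x 2 = 12 rows, so adding one more group at any substitution point multiplies the size of the table rather than adding to it.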

Generate structures

Analog structures that correspond to each row of the analog table are built. Each structure generated is placed in the next empty model space and is given the name of the parent structure suffixed with the corresponding row number from the analog table (there are other naming options). For example, the structure generated from row four of the analog table for parent structure CAFFEINE would be named CAFFEINE04.
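The default naming scheme can be mimicked in a script that post-processes analog names. This Python sketch assumes a two-digit, zero-padded row number, which is inferred only from the CAFFEINE04 example; the padding actually used by Cerius2 for larger tables is not stated here, so the width is left as a parameter.

```python
def analog_name(parent, row, width=2):
    """Parent name followed by the zero-padded analog-table row number.

    width=2 is an assumption taken from the CAFFEINE04 example.
    """
    return f"{parent}{row:0{width}d}"

print(analog_name("CAFFEINE", 4))
```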

Structures may be output to an sd file.

Outline of general procedure

Required steps

1.   First load a model into Cerius2 or build the desired model. This model serves as the core structure to which various groups will be added to form analogs.

2.   To access tools for defining the R-group attachment sites in the core structure (where substitution will occur) and for specifying the substituent groups, click the Analog Builder card menu item to open the Analog Builder control panel.

3.   Define one or more of the core model's hydrogen atoms as R groups.

4.   Load the substituent groups from files and define which ones to add at which R groups.

5.   Finally, instruct Cerius2 to begin generating analogs.

Note

The H in the core model that is defined as the R group and an atom in the substituent group whose element type is X are both removed when the core and substituent are bonded together during analog generation.
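The removal of the H and X atoms can be pictured with a small graph model. The Python sketch below is illustrative only: the dictionary-based molecule representation and the attach function are assumptions made for this example and do not reflect how Cerius2 stores models.

```python
from collections import Counter

def neighbours(mol, atom):
    """Atoms bonded to `atom` in a molecule given as atoms + bond pairs."""
    out = []
    for a, b in mol["bonds"]:
        if a == atom:
            out.append(b)
        elif b == atom:
            out.append(a)
    return out

def attach(core, r_site, fragment):
    """Bond a fragment to the core, dropping the core's R-site H and the
    fragment's X atom. Hypothetical helper, single attachment only."""
    if core["atoms"][r_site] != "H":
        raise ValueError("this sketch only handles hydrogen R sites")
    x_atoms = [a for a, el in fragment["atoms"].items() if el == "X"]
    if len(x_atoms) != 1:
        raise ValueError("fragment must contain exactly one X atom")
    x_atom = x_atoms[0]
    (core_anchor,) = neighbours(core, r_site)      # atom that carried the H
    (frag_anchor,) = neighbours(fragment, x_atom)  # atom that carried the X

    # Prefix atom ids so core and fragment ids cannot collide.
    atoms = {("core", a): el for a, el in core["atoms"].items() if a != r_site}
    atoms.update({("frag", a): el for a, el in fragment["atoms"].items() if a != x_atom})
    bonds = [(("core", a), ("core", b)) for a, b in core["bonds"] if r_site not in (a, b)]
    bonds += [(("frag", a), ("frag", b)) for a, b in fragment["bonds"] if x_atom not in (a, b)]
    bonds.append((("core", core_anchor), ("frag", frag_anchor)))
    return {"atoms": atoms, "bonds": bonds}

methane = {
    "atoms": {"C": "C", "H1": "H", "H2": "H", "H3": "H", "H4": "H"},
    "bonds": [("C", "H1"), ("C", "H2"), ("C", "H3"), ("C", "H4")],
}
methyl = {  # X-CH3 fragment
    "atoms": {"X": "X", "C": "C", "Ha": "H", "Hb": "H", "Hc": "H"},
    "bonds": [("X", "C"), ("C", "Ha"), ("C", "Hb"), ("C", "Hc")],
}

ethane = attach(methane, "H1", methyl)
print(Counter(ethane["atoms"].values()), len(ethane["bonds"]), "bonds")
```

In this example the methane core loses H1, the fragment loses its X atom, and a new bond joins the two carbon atoms, which gives the ethane skeleton with two carbons, six hydrogens, and seven bonds.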

Once the analogs are generated, you can name them, save them in files, save the Cerius2 session, and perform other routine operations. Please see the Cerius2 Modeling Environment book for details on these and other basic procedures. Additional work with analogs can involve other modules, which are documented in the Combi-Chem documentation.

Optional procedures

Sometimes you may want to edit the list of analogs before actually generating them. You can add or eliminate substituents using the controls on the Analog Builder control panel.

You can store a set of fragments defined as R-X in a single sd file and then recall them as a set of fragments.

With MDL's Project Library program you can specify the core structure, its R groups, and the substituent groups. If you have done this, click the Import Library card menu item to open the Import Library control panel, which allows you to import these library definitions. Reading a definition file results in creation of a core model with its R groups already specified and in creation of a directory containing the fragments. Then you instruct Cerius2 to begin generating analogs.

Note

Analog Builder can import and export MDL RG files consistent with the 1997 version of the MDL V2000 format. Please see Importing and exporting fragments for more detailed information.

Finally, you can choose to minimize your analog structures or you may choose to use the time-saving Clean Analogs option instead. Please see Setting preferences for more detailed information.

Specific methodology for building analogs

Following some introductory material (below), this section describes how to carry out these procedures within the Cerius2 interface:

Creating a core structure

Accessing the analog-building tools

Defining R group attachment sites

Specifying substituent groups

Reading all required structures from a library (alternative procedure)

Generating analogs

You can manually create or read in the core structure (Creating a core structure), then access the analog builder tools (Accessing the analog-building tools) to define its R-group attachment sites (Defining R group attachment sites) and specify the substituent groups (Specifying substituent groups) that will be added to the core structure.

Alternatively, you can read the core structure and substituent groups in from a library generated by MDL's Project Library program (Reading all required structures from a library).

In either case, you might want to edit the list of analogs to be formed before actually generating them.

Finally, you generate the analogs (Generating analogs).

Creating a core structure

Start Cerius2 and then create a core structure upon which to make analogs by reading in or constructing a model.

Please see Cerius2 Modeling Environment for details on constructing and reading in models and on how to specify a given model as current.

Accessing the analog-building tools

The analog builder contains the tools you need for specifying and generating analog structures and is (by default) available in two cards in Cerius2:

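To access the C2·Analog Builder module through the BUILDERS 2 card deck, click the deck selector in the main control panel and choose BUILDERS 2 from the list that appears. The ANALOG BUILDER card should be the only one in the deck of cards menu.

To access the same tools through the COMBI-CHEM I card deck, choose COMBI-CHEM I from the deck selector list and use the LIBRARY CONSTRUCTION card.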

Analog Builder control panel

To access the Analog Builder control panel, click the Analog Builder menu item on the ANALOG BUILDER or LIBRARY CONSTRUCTION card. Other control panels are accessed via menu items on the ANALOG BUILDER and LIBRARY CONSTRUCTION cards and/or through buttons on the Analog Builder or other control panels.

Defining R group attachment sites

To define the R-group attachment sites, click the R-group selection tool in the upper-right corner of the Analog Builder control panel.

Selecting R group attachment sites

Then use the cursor to pick hydrogen atoms in the current model displayed in the model window. Each picked hydrogen atom is defined as an R-group attachment site.

Editing the attachment sites

You can decide to remove the specification of an R-group attachment site from your core model (at any time before you generate the analogs). To do so, use the left- or right-facing arrow below the R list box in the Analog Builder control panel to make the desired R group list current (i.e., it appears in that list box). Then click the Delete R Group push-button at the bottom of the control panel.

Additional information

Please see the on-screen help for information about all the controls in this control panel.

Specifying substituent groups

By default, the contents of the Cerius2-Models/templates/organic directory are listed in the Set 1 list box (on the left side of the control panel). In addition, hydrogen is always listed as a substituent (the directory does not actually contain an XH group). If this list is adequate for your needs, you can proceed to Specifying substituents for R groups.

Nonstandard file of substituent groups

However, if you have made your own custom set (file) of substituent groups, you can define it as the default directory (if you want it to always appear) by creating a USER_FRAGMENTS environment variable before starting Cerius2.

Alternatively, you can access a different substituent-group file in addition to the default file by clicking the Add Fragments Set push-button to access the Select New Fragments Set control panel. This control panel contains standard file-browsing controls, whose operation is described in Cerius2 Modeling Environment. To load the entire directory, click the name of any file within the directory.

Note

sd files containing a set of fragments may be created from a selection of .msi files. Add them to a table and then export them as an sd file. Template files are .msi format files of models in which one of the atoms is defined as an X element. The X atom is removed during analog generation.

Multiple substituent-group directories

When you have more than one fragment list loaded, you can move from one fragment list to another by clicking the left- or right-facing arrow below the Set 1 list box in the Analog Builder control panel. The file whose contents are displayed in this list box is considered current.

If you are not using any substituent groups from one or more of your open substituent-group files, you may make that file current and delete it from the Set 1 list box by clicking the Delete Fragments Set push-button.

Specifying substituents for R groups

To add the same list of substituent groups to each R group, select the desired groups by clicking their names in the Set 1 list box. (Click a selected name again to deselect it.) The Select All and Deselect All push-buttons below the Set 1 list box can also be used. Next, ensure that the To popup below the R1 list box is set to All R Groups and click the right-facing arrow located between the two list boxes. The group names appear in the R1 list box (this list can be edited, as described under Editing the lists of substituent groups).

To add different lists of groups to different R groups, follow the same procedure, except set the To popup below the R1 list box to Current R Group. To move from one R group list to another, click the left- or right-facing arrow below the R1 list box. (The group that appears in the R group list is considered current.)

Select R-group fragments based on selected study table rows

To quickly select R-groups that you used in a previous study, click the Select from Study Table button. This looks at the rows currently selected in the study table, determines what fragments are present in the models in the selected rows, and selects the corresponding fragments in the Analog Builder control panel.

Editing the lists of substituent groups

To edit the lists of substituent groups for each R group, make the desired R group list current by clicking the left- or right-facing arrow below the R1 list box. Then select the group(s) you want to keep in or remove from the list and click the Remove Unselected or Remove Selected push-button, respectively. The Select All and Deselect All push-buttons below the R1 list box can also be used. Clicking a selected R-group name deselects that name.

Substituents selected by diversity criteria

To use an automatic method based on a diversity criterion to select a certain number of substituent groups that will actually be used during analog generation, click the Select Diverse... push-button (in the Analog Builder control panel) to open the Select Diverse Fragments control panel.

Select the method to be used by choosing it from the Use Diversity Measure popup. If you want to specify additional preferences affecting the selection process, click the Preferences... push-button to access another control panel.

After setting all controls as desired, click the SELECT push-button in the Select Diverse Fragments control panel to start selecting the diverse substituent groups.

Additional information

Please see the on-screen help for information about all the controls in these control panels.

Double-linked and Symmetrical R-groups

You can use Analog Builder to create analogs using bridging R-groups, that is, R-groups whose fragments have two attachments to the core model (for example, the R in CH3-(R)-COOH).

Defining a new bridging R-group on the core model

You specify a doubly attached R-group in the same manner as a singly attached R-group, but instead of picking a hydrogen atom, you must pick a doubly attached atom. The atom you pick may be doubly bonded to other atoms but may only be attached to a total of two atoms. For example, the carbon of -C=O is a valid selection, while that of -CH2- is not (it is attached to 4 atoms).

Note

The atom you select for a singly-attached R-group is not required to be a hydrogen atom. It may be any singly-attached atom, for example, H-, Cl-, or O=.
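The two picking rules (one attached atom for a singly-attached site, exactly two attached atoms for a bridging site, regardless of bond order) can be expressed as a simple neighbour count. The following Python sketch uses a hypothetical bond-list representation and function names; it illustrates the rule only and is not a Cerius2 interface.

```python
def attachment_count(bonds, atom):
    """Number of distinct atoms bonded to `atom`; bond order is ignored."""
    partners = set()
    for a, b, _order in bonds:
        if a == atom:
            partners.add(b)
        elif b == atom:
            partners.add(a)
    return len(partners)

def r_site_kind(bonds, atom):
    """Classify an atom as a 'single' or 'bridging' R-group site, or None."""
    n = attachment_count(bonds, atom)
    if n == 1:
        return "single"
    if n == 2:
        return "bridging"
    return None

# Ketene, CH2=C=O, as (atom, atom, bond order) triples.
ketene = [("C1", "H1", 1), ("C1", "H2", 1), ("C1", "C2", 2), ("C2", "O", 2)]

for atom in ("H1", "O", "C2", "C1"):
    print(atom, r_site_kind(ketene, atom))
```

In ketene the central carbon C2 is attached to two atoms and qualifies as a bridging site even though both of its bonds are double bonds, the terminal oxygen and the hydrogens qualify as singly-attached sites, and C1 (three attached atoms) is not a valid pick.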

Defining a new bridging R-group fragment

A bridging R-group fragment must have two connection points. You specify these by attaching or replacing two atoms in the fragment with X atoms, or with an X atom and an XX atom. For example, X-CH2-X and X-CH2-O-XX are both valid bridging R-group fragments.

Note

You may use a bridging R-group fragment at a singly attached R-group connection site. In this case only the primary X atom makes an attachment. For example, (R)-CH3 with R = X-COO-XX produces the analog CH3-COO-XX.

If you use a fragment containing only a single X atom for a bridging R-group, the fragment will only be attached to the primary attachment point of the R-group in the core model. This may be useful for ring-opening but will otherwise lead to bond cleavage. For example, Cl-(R)-CH3 with R = XOH produces Cl-OH and CH4 (assuming Cl is the primary attachment atom -- the extra hydrogen is added as a result of saturation).

Bridging R-group attachment ordering

When a bridging R-group fragment is asymmetrical, you may need to specify the order in which the fragment atoms are attached. For example, a core model CH3-(R)-NH2 where R contains the fragment X-CH2-O-X could produce either CH3-CH2-O-NH2 or CH3-O-CH2-NH2 as analogs. (In fact it would produce just one of these, depending on the order in which the bonds occur in both the core and fragment models.) To ensure a specific ordering is adopted, you must specify the order for both the core and fragment models.

The primary and secondary attachment atoms for the fragment model are simply the atoms labeled X and XX respectively. When you select a new bridging R-group on the core model, the bond to the secondary attachment atom is marked ("). You can move this between the two R-group bonds to re-specify the secondary attachment atom for a currently selected bridging R-group. To do this, click the button (marked with an R and a double-headed arrow) located in the upper right hand corner of the Analog Builder control panel.
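The effect of the attachment order can be illustrated with a linear, text-only model. In this Python sketch the core is written left-(R)-right, the fragment chain is listed from its X (primary) end to its XX (secondary) end, and the primary argument states which side of the core bonds to the X end. The function and its arguments are hypothetical and serve only to show why the two orderings give different analogs.

```python
def bridge(left, right, chain, primary="left"):
    """Insert a bridging fragment between the two sides of a core.

    chain   -- fragment groups listed from the X end to the XX end
    primary -- which core side ("left" or "right") bonds to the X end
    """
    if primary not in ("left", "right"):
        raise ValueError("primary must be 'left' or 'right'")
    ordered = list(chain) if primary == "left" else list(reversed(chain))
    return "-".join([left, *ordered, right])

# Core CH3-(R)-NH2 with the fragment X-CH2-O-XX.
print(bridge("CH3", "NH2", ["CH2", "O"], primary="left"))
print(bridge("CH3", "NH2", ["CH2", "O"], primary="right"))
```

With the CH3 side as the primary attachment the result is CH3-CH2-O-NH2; moving the primary attachment to the NH2 side gives CH3-O-CH2-NH2.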

Bridging R-group special case: the null-fragment

It is possible to build fragment libraries using fragments such as X-X and X=XX. Since the X atoms are removed when added at an R-group site in the core model, these fragments effectively contain zero atoms, although they do contain bonding information. For a core model C-(R)-N with R = {X-X,X=X}, the resulting analogs built would be C-N and C=N respectively. (See the note below on exporting libraries containing null-fragments.)

Symmetrical R-groups

With Analog Builder, you can also define repeated R-groups. If an R-group is repeated at two or more sites, the same R-group fragment appears simultaneously at each site when the analogs are generated. For example, the core model CH2(R1)-CH(R2)-(R1) with R1 = {XCH3,XBr}, R2 = {XF} produces just two analogs, CH2(CH3)-CH(F)-CH3 and CH2(Br)-CH(F)-Br.
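Because a repeated R-group takes the same fragment at every site, it contributes only one factor to the number of analogs, however many sites it occupies. The Python sketch below reproduces the example above using a text template; the template syntax and function name are assumptions made for illustration.

```python
from itertools import product

def enumerate_repeated(template, choices):
    """Fill a core template once per combination of R-group choices.

    Every occurrence of the same label in the template receives the same
    fragment, which is how a repeated R-group behaves.
    """
    labels = list(choices)
    analogs = []
    for combo in product(*(choices[label] for label in labels)):
        analogs.append(template.format(**dict(zip(labels, combo))))
    return analogs

core = "CH2({R1})-CH({R2})-{R1}"          # R1 appears at two sites
choices = {"R1": ["CH3", "Br"], "R2": ["F"]}

analogs = enumerate_repeated(core, choices)
print(analogs)
```

The count is 2 x 1 = 2, not 2 x 1 x 2 = 4, because the two R1 sites are never varied independently.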

Defining a repeated R-group on the core model

The ANALOG BUILDER panel, shown above, contains a check button called New Rgroup, which is checked by default; this indicates that when you click on an atom to define an R-group, a new R-group is created and added to the list. After you have added at least one new R-group to your core model, you may create a repeated instance of an R-group using the following steps:

1.   Uncheck the New Rgroup option.

2.   Select an existing R-group by clicking on the R-group atom in the model while the pick function is in Add-Rgroup mode, or by using the Next and Previous buttons of the R-group table.

3.   Select an atom in the core model to become the site for the repeated R-group.

Where a bridging R-group is repeated, you may specify the attachment order for each R-group instance independently.

Note

You cannot remove or replace a particular repeated R-group. Clicking the Remove Selected button removes all instances of the currently selected R-group.

Multiply bonded R-groups

You can select core model atoms that have multiple bonds as R-group sites. The original bonding is retained when the analogs are produced. For example, (R1)=CH-(R2)=O with R1 = {XO,XS} and R2 = {XCHX,XNX} produces the four analogs, O=CH-CH=O, S=CH-CH=O, O=CH-N=O and S=CH-N=O.

Note

The bond order between the X atom(s) and the R-group fragment atom it is attached to is irrelevant, except in the special case of a null-fragment (as described above).

Fragment extraction from database searches

The Extract Fragments panel (shown below) is accessible from both the COMBI-CHEM I/LIBRARY CONSTRUCTION and DATABASES/CATALYST INTERFACE menu cards. This command allows you to extract fragments from models that match a model query and save them in an MDL-format SD file. Typically the model query would be one previously used to search a database, while the file that the fragments are extracted from would be the result of that search. The main purpose of this panel is to automatically generate a set of substituent R-groups from a set of reagents.

[Image]

The Extract Fragments command requires that you also have a licensed version of Catalyst 4.0, since the matching of the query model to models in the input SD file is performed using Catalyst search functionality. Hence, the input file is typically a hit list from a Catalyst search, although fragments can also be extracted from the results of an ISIS database search or from a more general SD file.

Using Extract Fragments to extract molecules

The simplest use for the Extract Fragments command is to extract molecules from a list of molecules in an SD file. If the input file is a random list of molecules then this functionality is similar to performing a database query. If the query model was based on a query used to produce the input file as the results of a previous database search then this functionality is similar to performing a query on a subset of a database.

To extract a list of molecules, complete the following steps:

1.   Go to the COMBI-CHEM I card deck, and on the LIBRARY CONSTRUCTION card select Extract Fragments, with your query model as the current Cerius2 model. This opens the Extract Fragments control panel.

2.   If query model cleavage bond atoms are defined, click the CLEAR button to unspecify them.

3.   Select the filenames for the input (Hit List to Search) and output (Save Result In) or leave them as the default names (DBAccess.sd is the default name for the results of database searches).

4.   Click the EXTRACT button to perform the command and generate the output file.

Note

Only molecule structure is extracted from the input SD file, that is, any extra property data associated with the extracted molecules in the input file is not transferred to the resulting SD file.

Using Extract Fragments to extract fragments

The primary use of the Extract Fragments command is to actually extract fragments of molecules from an SD file, from those molecules which match the query. These fragments are generated by defining a cleavage bond within the query model. Molecules which match the query model are cleaved at this bond to produce two fragments, one of which, usually that containing the bulk of the query-matched atoms, is discarded and the other is written to the output file. An X atom is added to the output fragments where the fragment was cleaved from the rest of the molecule.

To produce extracted fragments, follow the same steps as in Using Extract Fragments to extract molecules, above, except for step 2. Instead of clearing the cleavage bond atoms, define the cleavage bond: set the pick mode to cleavage bond mode by clicking the icon next to the EXTRACT button (see the illustration above), then pick two bonded atoms in the query model. The selected atoms are listed in the panel. The order in which you pick the atoms is important, since it determines which half of each matched molecule is kept. Usually an outermost atom is picked first, so that fragments attached to the query at this point are produced when the fragment extraction is performed. However, you may wish to retain the query part of the matched molecules so that further fragment extraction can be performed.
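The cleavage step itself can be sketched as a graph operation: remove the cleavage bond, collect everything still connected to the atom on the side to be kept, and cap that atom with an X atom. The Python code below is a minimal illustration with a hypothetical data layout. It names the kept and discarded atoms explicitly rather than assuming how the pick order in the panel maps onto them, and it does not perform the Catalyst query matching.

```python
from collections import deque

def cleave(atoms, bonds, keep_atom, discard_atom):
    """Return (atoms, bonds) of the fragment on the keep_atom side of a bond,
    with an X atom marking where the bond was cut."""
    if not any(set(b) == {keep_atom, discard_atom} for b in bonds):
        raise ValueError("the two atoms are not bonded")
    remaining = [b for b in bonds if set(b) != {keep_atom, discard_atom}]

    adjacency = {a: set() for a in atoms}
    for a, b in remaining:
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Breadth-first search from the kept atom.
    seen = {keep_atom}
    queue = deque([keep_atom])
    while queue:
        current = queue.popleft()
        for nxt in adjacency[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    if discard_atom in seen:
        raise ValueError("bond is part of a ring; cutting it does not split the molecule")

    x_id = "X*"  # hypothetical id, assumed not to clash with existing atom ids
    frag_atoms = {a: atoms[a] for a in seen}
    frag_atoms[x_id] = "X"
    frag_bonds = [b for b in remaining if b[0] in seen]
    frag_bonds.append((x_id, keep_atom))
    return frag_atoms, frag_bonds

# Heavy atoms of CH3-CH2-C(=O)-Cl; cut between C2 (kept) and C3 (discarded).
atoms = {"C1": "C", "C2": "C", "C3": "C", "O": "O", "Cl": "Cl"}
bonds = [("C1", "C2"), ("C2", "C3"), ("C3", "O"), ("C3", "Cl")]

frag_atoms, frag_bonds = cleave(atoms, bonds, keep_atom="C2", discard_atom="C3")
print(sorted(frag_atoms.values()), frag_bonds)
```

Cutting the bond next to the acid chloride group and keeping the C2 side yields the fragment X-CH2-CH3, while swapping the two arguments would keep the acid chloride part instead.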

Note

You may load the resulting set of fragments, with X atoms to define the attachment points, directly into Analog Builder as a new fragment set. However, a general query will be attached to several potential fragments, so cleaving the query at one bond in multiple matching molecules will often produce duplicate fragments in the results. These duplicate fragments are not removed, so you must take care when using the results of a fragment extraction to build combinatorial libraries.
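One way to screen an extracted fragment file for obvious duplicates before loading it is to compare the connection tables of its records. The Python sketch below relies on two properties of the MDL formats: SD records end with a line containing $$$$, and V2000 atom and bond lines use fixed columns. It is a rough filter only. It recognizes duplicates whose atoms are numbered in the same order, and fragments that are chemically identical but numbered differently are not detected; a cheminformatics toolkit with canonical numbering would be needed for that. The helper names and the sample data are assumptions.

```python
def split_sd_records(text):
    """Split SD file text into records; each record ends at a '$$$$' line."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    return records

def connection_key(record):
    """Coordinate-free key: element symbols and bonds, in file order."""
    counts = record[3]                      # line 4 of a V2000 molfile
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])
    atom_lines = record[4:4 + n_atoms]
    bond_lines = record[4 + n_atoms:4 + n_atoms + n_bonds]
    symbols = tuple(line[31:34].strip() for line in atom_lines)
    bond_set = tuple((int(l[0:3]), int(l[3:6]), int(l[6:9])) for l in bond_lines)
    return symbols, bond_set

def drop_duplicates(records):
    """Keep the first record for each connection key."""
    seen, unique = set(), []
    for record in records:
        key = connection_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

def molblock(name, atoms, bonds):
    """Build a tiny V2000 record for the demonstration (x coordinate only)."""
    lines = [name, "  sketch", "",
             f"{len(atoms):3d}{len(bonds):3d}  0  0  0  0  0  0  0  0999 V2000"]
    for symbol, x in atoms:
        lines.append(f"{x:10.4f}{0.0:10.4f}{0.0:10.4f} {symbol:<3} 0  0")
    for a, b, order in bonds:
        lines.append(f"{a:3d}{b:3d}{order:3d}  0")
    lines += ["M  END", "$$$$"]
    return "\n".join(lines) + "\n"

text = (molblock("a", [("X", 0.0), ("C", 1.5)], [(1, 2, 1)])
        + molblock("b", [("X", 0.2), ("C", 1.7)], [(1, 2, 1)])   # same as "a"
        + molblock("c", [("X", 0.0), ("O", 1.4)], [(1, 2, 1)]))

unique = drop_duplicates(split_sd_records(text))
print([record[0] for record in unique])
```

Records "a" and "b" differ only in their coordinates, so only the first is kept, together with the distinct record "c".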

Reading all required structures from a library

Accessing the tools

If you want to read in libraries containing a core structure (with defined R groups) and analogs that were output by MDL's Project Library program, select the Import Library menu item on the LIBRARY CONSTRUCTION card to open the Import Library control panel.

Loading a library file

If you want to set preferences before loading your library file, click the Preferences... push-button to open the Import Library Preferences control panel.

Use the file browser controls in the Import Library control panel to find and load the desired library file. The operation of standard file browser controls is covered in Cerius2 Modeling Environment.

Generating analogs from a library

To generate the analogs, click the GENERATE ANALOGS push-button in the Analog Builder control panel (see Generating analogs).

Additional information

Please see the on-screen help for information on the controls in these control panels.

Importing and exporting fragments

RG file importer and exporter

Analog Builder can both import and export standard MDL RG files that are consistent with the 1997 version of the MDL V2000 format. (Previous versions of Cerius2 did not allow for the reading (or writing) of RG files containing bridging and/or repeated R groups and did not process R bond order information.) The following points should be noted:

1.   When you import a library into Cerius2 with Analog Builder, singly attached R-group sites appear as hydrogen atoms, and bridging R-group sites appear as oxygen atoms, regardless of bond order and of what these atoms appeared as when you exported the library.

2.   MDL RG file readers may have a problem reading libraries exported by Cerius2 which contain null fragments, as described above. (Most likely you will not wish to use such fragments.)

3.   Pre-1997 versions of the V2000 format did not allow for a specified atom attachment order ($AAL entry). Programs which only support older versions of V2000 should be able to import Cerius2-exported RG files if the default attachment order for bridging R-groups is used.

Export selected R-group fragments to SD files

To save the currently selected fragments in your set of R-groups, click the Export Selected Fragments button in the Analog Builder control panel. This saves the fragments in a new SD file.

Generating analogs

Click the GENERATE ANALOGS push-button in the Analog Builder control panel to generate analogs containing all possible combinations of R groups, as specified in the preceding steps. (Text immediately below the push-button tells you how many analogs will be generated.)

Tip

If you want to change how analog generation works, click the Preferences... push-button to open the Analog Builder Preferences control panel (Setting preferences).

The generated analogs automatically appear as models in the model window and as entries in the Study Table panel, which appears automatically.

Tip

If you close a study table, you can view it again by selecting the Show Study Table menu item in the LIBRARY ANALYSIS card (in the COMBI-CHEM I card deck).

Setting preferences

To change various aspects of how the analog builder works, click the Preferences... push-button in the Analog Builder control panel to access the Analog Builder Preferences control panel. The controls in this panel affect the analog-generation process throughout the current Cerius2 session.

To minimize the analogs as they are generated, check the Minimize Analog check box.

Clean Analogs option

The Analog Builder Preferences control panel contains a Clean Analogs option, which is deselected by default. When this option is checked, subsequently generated analogs are "cleaned" using the Cerius2 Clean functionality. This option is an alternative to Minimize Analogs that also produces reasonable analog structures while requiring significantly less time. Using it in conjunction with the minimize option may improve the quality and/or performance of the analog generation process beyond that obtained with the minimize option alone.

To append each newly constructed analog to an MDL SD file, check the Append Analogs to SD File check box. Enter the name of the MDL SD file in the SD File entry box.

To remove analogs from current Cerius2 memory after all actions are completed, check the Delete Analog When Actions Completed check box. By default, enough information about the analog is added to the study table so that the model can be recovered later.

To automatically add all generated analogs to the study table, be sure that the Add Analogs to Study Table option is checked.

Fast enumeration to an SD file

You have the option to enumerate the libraries directly to an SD file, without adding the analogs to the Cerius2 Models Manager or to the Study Table. Enumeration using this option is two to three orders of magnitude faster than the normal enumeration in Analog Builder.

To use the new option, define your library as you normally do, but check the option to Enumerate Directly to SD File, and specify a name for the SD file that will contain the enumerated library.

Note

When you use this option, all other options in the Analog Builder Preferences panel are ignored. The analogs are never added to the Study Table.

Note

Upon generation of analogs, the resulting sd file contains all the enumerated structures in addition to their synthetic derivation. The sd file contains data fields indicating the nature of each substituent R-group for each analog. This information is essential for operations performed in the C2·LibSelect module.
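The data fields in an SD record follow the standard MDL layout: a header line that starts with > and carries the field name in angle brackets, one or more value lines, and a blank line. The Python sketch below reads such fields from one record. The field names R1 and R2 and the sample values are hypothetical; check a generated file for the names that Analog Builder actually writes.

```python
import re

def read_data_fields(record_lines):
    """Return {field name: value} for the data items after 'M  END'."""
    start = 0
    for index, line in enumerate(record_lines):
        if line.startswith("M  END"):
            start = index + 1
            break

    fields = {}
    name = None
    for line in record_lines[start:]:
        if line.startswith(">"):
            match = re.search(r"<([^>]+)>", line)
            name = match.group(1) if match else None
            if name is not None:
                fields[name] = []
        elif name is not None:
            if line.strip() == "":
                name = None          # a blank line ends the current item
            else:
                fields[name].append(line)
    return {key: "\n".join(value) for key, value in fields.items()}

# Tail of a hypothetical analog record (molblock body omitted).
record = [
    "M  END",
    ">  <R1>",
    "methyl",
    "",
    ">  <R2>",
    "hydroxy",
    "",
]

print(read_data_fields(record))
```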

Additional information

Please see the on-screen help for information about all the controls in these control panels.




Last updated May 19, 2000 at 01:51PM Pacific Daylight Time.
Copyright © 2000, Molecular Simulations Inc. All rights reserved.