Screening Assistant 2


Last edited: 18/11/2011
Sourceforge: SA2 website
Help: SA2 forums
Documentation index

This page provides all the information needed to understand how properties are stored in the database, and how to store your own properties in the database.

Table of Contents

  1. Introduction
  2. Existing properties
  3. Existing fingerprints
  4. Adding new properties
  5. Adding new fingerprints

Introduction

One of the most important features of SA2 is that it has been designed to associate any number of properties with each molecule. Focusing on molecular descriptors alone, there are potentially thousands of values that can be associated with each molecule. Dealing with such a large dataset is far from trivial, especially in a database meant to store millions of molecules. Various solutions were tested to deal with this amount of data, and we finally settled on the most basic one, which actually provides the best compromise between performance, stability and flexibility.

In SA2, properties are organized in Tables. A property can be anything: a biological activity, a molecular descriptor... Properties are stored in different Tables, so that you can group related properties together. In other words, each Table can be seen as a simple spreadsheet (e.g. in Excel), where rows correspond to molecules and columns correspond to properties.

Fingerprints, on the other hand, are stored in dedicated Tables, in binary format. This is probably not the best choice in terms of storage, but hey... disk space is certainly not the most expensive thing that you will need in your drug discovery pipeline ;)
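
To give a feel for what storing a fingerprint "in binary format" can cost, here is a minimal Python sketch. This page does not describe the actual layout used by SA2, so the encoding below (one bit per fingerprint position, packed into bytes, most significant bit first) is purely an assumption, and the function names are hypothetical.

```python
def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, most significant bit first.

    Assumed layout for illustration only; not necessarily what SA2 uses.
    """
    packed = bytearray((len(bits) + 7) // 8)  # round up to whole bytes
    for i, bit in enumerate(bits):
        if bit:
            packed[i // 8] |= 1 << (7 - i % 8)
    return bytes(packed)


def unpack_bits(data, size):
    """Recover the first `size` bits from a packed byte string."""
    return [(data[i // 8] >> (7 - i % 8)) & 1 for i in range(size)]


# Toy 10-bit fingerprint: it needs 2 bytes once packed.
fingerprint = [1, 0, 1, 1, 0, 0, 0, 0, 0, 1]
blob = pack_bits(fingerprint)
restored = unpack_bits(blob, len(fingerprint))
```

Under this assumption, a 1024-bit fingerprint occupies 128 bytes, so a million molecules need roughly 128 MB per fingerprint table, before any database overhead.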

Additionally, each Table is associated with a Category, and each Category can contain multiple Tables and sub-categories. This way, you can bring a bit of organization to the plethora of data that can be associated with each molecule, although there is of course still much room for improvement...
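
The Category / Table / property organization described above can be pictured with a small Python sketch. This is only a conceptual model, not the SA2 database schema: the class names, table names, property names and values are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """A spreadsheet-like table: rows are molecules, columns are properties."""
    name: str
    properties: list                           # column names
    rows: dict = field(default_factory=dict)   # molecule id -> {property: value}

    def set_value(self, molecule_id, prop, value):
        if prop not in self.properties:
            raise KeyError(f"unknown property {prop!r} in table {self.name!r}")
        self.rows.setdefault(molecule_id, {})[prop] = value


@dataclass
class Category:
    """A category holds tables and, optionally, sub-categories."""
    name: str
    tables: list = field(default_factory=list)
    subcategories: list = field(default_factory=list)

    def all_tables(self):
        """Yield every table in this category and in its sub-categories."""
        yield from self.tables
        for sub in self.subcategories:
            yield from sub.all_tables()


# Hypothetical example data.
descriptors = Table("Basic descriptors", ["MW", "logP"])
descriptors.set_value("mol-1", "MW", 180.16)

activity = Table("Kinase assay", ["IC50_nM"])
activity.set_value("mol-1", "IC50_nM", 250.0)

root = Category("Properties", subcategories=[
    Category("Descriptors", tables=[descriptors]),
    Category("Biological activity", tables=[activity]),
])
table_names = [t.name for t in root.all_tables()]
```

Keeping related properties in the same table means that one molecule appears as one row per table, which mirrors the spreadsheet picture given above.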

Existing properties

In SA2, various property tables are created when you create a new SA2 database. Here is the complete list of tables available by default in the current version. Note that to populate these tables with actual values, you need to either enable / run the corresponding worker (if available, e.g. CDK / JOELib workers), or retrieve the properties yourself and import them into SA2 using the supported input files (e.g. the MOE tables).

Existing fingerprints

Adding new properties

You can add your own properties, either on the fly when importing new molecules / importing new properties into the database, or by creating them explicitly by hand (Database->Properties, or using the List of properties window). Simple enough, not much to say :)

Just note that when importing properties into the database using the import wizard (either SDF or CSV), you will be able to create new tables on the fly, but you will not be able to create new properties in existing tables! To do so, use the List of properties.

Adding new fingerprints

Adding new fingerprints deserves a bit more explanation, although it is very similar to importing properties.

First of all, when you create a new fingerprint yourself, you actually only create the storage for your fingerprint. There is by definition no way to compute this new fingerprint within SA2 (unless you create your own worker). As a consequence, the actual fingerprint values for each molecule must be computed using your own tool, and subsequently imported into SA2 (in a very similar way to the CSV import), as illustrated in the quickstart guide.

This has one important consequence: when you run a similarity search on a new molecule (drawn in the sketcher), you will not be able to use your own fingerprints as the basis of the search. For such a molecule, you will only be able to use the fingerprints that can be computed within SA2, i.e. all fingerprints that can be calculated through the available workers. That makes sense, right? :)
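
To see why, here is a minimal Python sketch of a fingerprint-based similarity search. It uses the Tanimoto coefficient, which is a common choice for bit fingerprints; this page does not say which measure SA2 actually uses, and the molecule identifiers and bit patterns are invented. The key point is that the search needs a fingerprint for the query molecule itself, of the same kind as the stored ones.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two equal-length 0/1 lists."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same size")
    common = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b) - common
    return common / total if total else 0.0


def similarity_search(query_fp, stored_fps, threshold):
    """Return (molecule id, score) pairs scoring at least `threshold`, best first."""
    hits = [(mol_id, tanimoto(query_fp, fp)) for mol_id, fp in stored_fps.items()]
    hits = [(mol_id, score) for mol_id, score in hits if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Fingerprints already stored in the database (hypothetical values).
stored_fps = {
    "mol-1": [1, 1, 0, 0, 1, 0, 0, 0],
    "mol-2": [0, 1, 0, 1, 0, 0, 1, 0],
}

# The query fingerprint has to be computed for the sketched molecule.
# For an imported, externally computed fingerprint there is nothing
# that could produce this list, hence the limitation described above.
query_fp = [1, 1, 0, 0, 0, 0, 0, 0]
hits = similarity_search(query_fp, stored_fps, threshold=0.5)
```

With these toy values only "mol-1" passes the 0.5 threshold.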

Note also that you can't create new fingerprints when importing molecules, nor can you import your fingerprints directly when you import a new set of molecules. You will have to first import your molecules, and then import the fingerprints.
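
As a rough illustration of the "compute with your own tool, then import" workflow, the Python sketch below writes externally computed fingerprints to a CSV file once the molecules are already in the database. The exact file layout expected by SA2 is not specified on this page (see the quickstart guide), so the column names, the 0/1 string encoding and the function name are assumptions. The size check reflects the fact that a fingerprint is created with a fixed size.

```python
import csv
import io


def write_fingerprint_csv(fingerprints, size, out):
    """Write one row per molecule: identifier and fingerprint as a 0/1 string.

    Hypothetical layout; adapt the columns to what the SA2 import expects.
    """
    writer = csv.writer(out)
    writer.writerow(["molecule_id", "fingerprint"])  # assumed header
    for mol_id, bits in fingerprints.items():
        if len(bits) != size:
            raise ValueError(f"{mol_id}: expected {size} bits, got {len(bits)}")
        writer.writerow([mol_id, "".join(str(bit) for bit in bits)])


# Toy 4-bit fingerprints computed by an external tool (invented values).
fingerprints = {"mol-1": [1, 0, 1, 1], "mol-2": [0, 0, 1, 0]}

buffer = io.StringIO()  # stands in for a real file opened with newline=""
write_fingerprint_csv(fingerprints, size=4, out=buffer)
csv_lines = buffer.getvalue().splitlines()
```

Checking the size before writing catches mismatches early, rather than at import time.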

So... to create a new fingerprint, you will have to use the Properties window: left-click on any of the available categories and select Add new fingerprint. Step by step:

  1. Open the Properties window.

  2. Left-click on any of the available categories and select Add new fingerprint. A good idea would be to either create a new category for your own fingerprints, or to use the Fingerprint category. This is what we will do here.

    An image

  3. Click Next.

  4. Give a name, a size and an optional description to your new fingerprint.

    An image

  5. Click Finish.

The new fingerprint table should now appear in the Properties window (in our case, in the Fingerprint category), as shown in the following screenshot:

An image
