Last edited: 06/10/2011
By: VLG
Sourceforge: SA2 website
Help: SA2 forums
This page will guide you through the basic use of SA2. In this tutorial, you will learn how to perform simple and common tasks such as importing molecules, importing properties, and viewing your molecules. To go through this tutorial, a MySQL server must be installed on your machine or on a machine reachable over your network. See the installation and requirements section for all installation instructions. You may also want to read about the important terminology used by SA2 before starting this tutorial. At the end of this tutorial, you will be pointed to other sections of the documentation dealing with more specific functionalities of SA2.
In this small tutorial, you will learn how to import molecules, visualise your molecules, and create new subsets of molecules (Libraries). We will use some SDF files that we have randomly extracted from a database of more than 6 million molecules. Each molecule in these files has been standardized using a specific Pipeline Pilot protocol, and the 3D coordinates have been generated using Corina. We also provide you with a set of MOE2D descriptors that we will use to illustrate the import of properties into the database. All the files used are located in the sample-data/ directory, which has been included in the SA archive since version 1.0.2b.
SADB ID is the field corresponding to the identifier of each molecule.
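If you want to check the identifiers in an input file before importing it, the data fields of an SD file can be read with a few lines of Python. This is only a minimal sketch, not part of SA2: the function name read_sdf_fields, the sample records, and the identifier values are made up for illustration, and it assumes the tag is spelled exactly "SADB ID" in your files.

```python
import io
from typing import Dict, Iterator, Optional, TextIO


def read_sdf_fields(handle: TextIO) -> Iterator[Dict[str, str]]:
    """Yield one dict of data fields per record of an SD file."""
    fields: Dict[str, str] = {}
    current: Optional[str] = None  # name of the data field being read
    in_data = False                # True once the molblock has ended
    for raw in handle:
        line = raw.rstrip("\r\n")
        if line.strip() == "$$$$":          # record terminator
            yield fields
            fields, current, in_data = {}, None, False
        elif line.startswith("M  END"):     # end of the connection table
            in_data = True
        elif in_data and line.startswith(">"):
            # Data header, e.g. ">  <SADB ID>"
            start, end = line.find("<"), line.rfind(">")
            current = line[start + 1:end] if 0 <= start < end else None
            if current is not None:
                fields[current] = ""
        elif current is not None:
            if not line.strip():            # a blank line ends the value
                current = None
            else:
                fields[current] += ("\n" if fields[current] else "") + line


# Two empty, made-up records; the identifiers are hypothetical.
SAMPLE_SDF = (
    "mol1\n  sketch\n\n"
    "  0  0  0  0  0  0  0  0  0  0999 V2000\n"
    "M  END\n"
    ">  <SADB ID>\n"
    "SADB-0001\n"
    "\n"
    "$$$$\n"
    "mol2\n  sketch\n\n"
    "  0  0  0  0  0  0  0  0  0  0999 V2000\n"
    "M  END\n"
    ">  <SADB ID>\n"
    "SADB-0002\n"
    "\n"
    "$$$$\n"
)

# Usage: [r["SADB ID"] for r in read_sdf_fields(io.StringIO(SAMPLE_SDF))]
```

For a real file, pass an open file handle instead of the in-memory sample.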
OK, let's start using SA2. Run the SA2 executable (located in $SA/bin/), or use the shortcut that was installed if you used the automatic installer. Note that you cannot run two (or more) instances of SA2 at the same time. After a few seconds, the following window should pop up.
At this point you have to (1) connect to the MySQL server of your choice, and (2) create a new SA2 database. Let's look at these two steps in more detail.
You can either connect to a local server (i.e. a server installed on the same computer as SA2) or to a server running on a different computer that can be reached over the network. To connect to a server installed on your computer, enter "localhost" in the server text field. To connect to a remote server, enter a valid, reachable address (e.g. an IP address) in the server text field.
Once done, enter in the appropriate fields the user name and password that you created just after installing your MySQL server. Click on the connect button to establish the connection.
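If the connection fails, it can help to check first that the server machine accepts TCP connections at all. The sketch below is not part of SA2; server_reachable is a hypothetical helper, and it assumes the server listens on the standard MySQL port 3306, which may not be true for your installation. It only tests reachability and does not check your user name or password.

```python
import socket

MYSQL_DEFAULT_PORT = 3306  # standard MySQL port; your server may use another


def server_reachable(host: str, port: int = MYSQL_DEFAULT_PORT,
                     timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unknown host names.
        return False


# Usage: server_reachable("localhost")
```

A result of False means the address or port is wrong, the server is not running, or a firewall blocks the connection.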
Once connected, a list of databases compatible with the current version of SA2 should appear in the "Existing database" table. In our case, this list should be empty, as this is the first time you've run SA2. Let's populate it. Click on the "New" button.
In this window, you simply have to enter the name of the database and an optional description. You also have to choose which handler to use for your database (note that this choice cannot be changed later). The name of the database must fit a specific pattern: it must not contain any spaces or special characters. If it does, a warning message will be displayed as shown in the screenshot, and you won't be able to create your database until the name fits the pattern.
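The exact pattern SA2 enforces is not given here. As an illustration only, the sketch below assumes a name made of letters, digits and underscores; both the regular expression and the helper name is_valid_db_name are assumptions, and SA2's real rule may be stricter or looser.

```python
import re

# Assumed pattern: letters, digits and underscores only.
# This is a guess for illustration; SA2's actual rule may differ.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_db_name(name: str) -> bool:
    """Return True if the whole name matches the assumed pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


# Usage: is_valid_db_name("sa2_tutorial") accepts the name,
# while is_valid_db_name("my database") rejects it because of the space.
```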
When you are done, click on the Finish button. The database will be created. Many operations will be performed, such as creating tables and inserting various pre-computed data (e.g. PCA-based reference chemical spaces - DRCS). Creating the database should take one or two seconds, depending on how fast your computer is.
Once the database has been created, it will appear in the table. Select it, and click on the "Open" button to open it. Then proceed to the next section.
Once you open a database, a set of windows opens automatically. When running SA2 for the first time, a default window organization (layout) will be set up. Note that you can completely reorganize your windows. Try to play around with the open windows (drag them, undock them using left click...) so you can get used to the windowing system and see all the possibilities it offers. Here is an example of a layout we often use in our lab:
A more detailed overview of the Graphical User Interface (GUI) can be found in the dedicated section of this documentation.
Let's import new molecules! Note that the full workflow describing what SA2 will do when importing a new molecule can be found in the Import workflow section of this documentation.
Click on the second button on the toolbar, or use File->Import SDF in the menubar.
There are 4 configuration steps before starting the actual import process. Let's detail each of them.
Important note: we will import an SDF file containing properties here. If you are not interested in importing existing / new properties, you can skip the 4th step, thereby making the import process a bit faster.
During this step, you must tell SA2 where your compounds come from. If you are building a database dedicated to storing chemical vendor collections, you will want to assign each collection to its dedicated provider. If you are importing a library that corresponds to a medicinal chemistry project, just create a new provider for this project. In our case, we will create a new dummy provider.
Note that during this step, you can also associate your molecules with new or existing libraries. Libraries are slightly different from Providers: they represent subsets of molecules in the database, while providers represent the origin of molecules. Learn more about these simple (yet important) concepts in the dedicated section.
In this example, we will not create libraries.
Let's now import descriptors into the database. This step actually allows you to import properties available in your input file into existing or new Property Tables. In our case, we will import the MOE descriptors available in the input file into the MOE table that is already available in SA2.
The window is split into 3 main parts. The left part is a simple list of properties that have been detected in the input file. The right part is split into two more parts: at the top, you will find the properties that have been assigned to new tables, and at the bottom, you will find the list of properties that have been assigned to existing tables. In both parts, you can click on the arrow pointing to the left (red one) to remove a property from the import process, and on the right arrow (blue) to import a property into a new (top) or existing (bottom) table. The creation of new tables will be described later.
Last but not least, the additional calculations. If you haven't done so yet, learn about workers in the Terminologies section or in the specific section where a detailed description of each available worker is provided.
Here, just leave all workers unchecked. Note that each worker can be configured through the buttons located in the third column of this table. You will be able to select what the worker will do (which descriptors should be calculated...), and optionally set more specific parameters.
Click on Finish to start the import.
The import procedure should not take too long. An output window should open to inform you about the various steps and any errors detected, and a progress bar is shown in the bottom right part of the main SA2 window.
Once the import is finished, you may want to practice a bit and repeat the process for the Provider1b.sdf input file. The only difference is that you will assign this input file to the same provider as the first imported file instead of creating a new one.
With a bit of practice, it takes me no more than 10 seconds to complete the 4 steps described previously.
We will now import another SDF file, one that does not contain any properties. Once done, we will import the corresponding MOE2D descriptors stored in a separate semicolon-separated text file.
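To inspect a semicolon-separated descriptor file before importing it, Python's csv module can be used with delimiter=";". The sketch below is not part of SA2 and makes several assumptions: that the file has a header row, that one column holds the molecule identifier (here named "SADB ID"), and that all other columns are numeric. The function name read_descriptors, the column names, and the values in the sample are illustrative only.

```python
import csv
import io
from typing import Dict, TextIO


def read_descriptors(handle: TextIO,
                     id_column: str = "SADB ID") -> Dict[str, Dict[str, float]]:
    """Map each molecule identifier to its descriptor values."""
    reader = csv.DictReader(handle, delimiter=";")
    table: Dict[str, Dict[str, float]] = {}
    for row in reader:
        mol_id = row.pop(id_column)  # assumed identifier column
        # Every remaining column is assumed to hold a number.
        table[mol_id] = {name: float(value) for name, value in row.items()}
    return table


# Made-up sample content; identifiers and values are hypothetical.
SAMPLE_CSV = (
    "SADB ID;Weight;a_don;a_acc\n"
    "SADB-0001;180.16;1;4\n"
    "SADB-0002;151.16;2;2\n"
)

# Usage: read_descriptors(io.StringIO(SAMPLE_CSV))["SADB-0001"]["Weight"]
```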
We will now import the values of a fingerprint available in the database: the SSKey fingerprint that was available in SA1 (and is still available in SA2). As mentioned previously, this fingerprint can be directly calculated using the JOELib worker.
As you will see, importing fingerprints is quite similar to importing properties. The main difference is that you can't create new storage for a new fingerprint during this process.
Your fingerprints will be imported, and you will now be able to use them for e.g. similarity searching or diverse subset creation.
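To give an idea of what similarity searching on fingerprints involves, here is a minimal sketch based on the Tanimoto coefficient, a common similarity measure for binary fingerprints. Whether SA2 uses this measure is not stated here, so treat it as an assumption. The fingerprints are represented as sets of "on" bit positions, and the function names and example data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared on-bits / on-bits set in either."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def most_similar(query: Set[int], library: Dict[str, Set[int]],
                 top_n: int = 5) -> List[Tuple[str, float]]:
    """Return the top_n library molecules ranked by similarity to query."""
    scored = [(mol_id, tanimoto(query, bits))
              for mol_id, bits in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Usage with made-up fingerprints:
# most_similar({1, 2, 3}, {"a": {1, 2, 3}, "b": {2, 3, 4}}, top_n=2)
```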
Let's now take a closer look at our compounds and properties.
We will now describe a very straightforward way of viewing our compounds. Before doing so, let's ensure that the appropriate windows are open. Most of these windows should already be open if you read the documentation about setting up a better default layout before running SA2 for the first time.
In simple tables as well as in the various plotting facilities, you can interactively select one or several molecules. When doing so, the full list of selected molecules will appear in the Selection window, usually located on the right of the main SA2 window. You can subsequently perform various operations on these selected molecules using the vertical toolbar located on the right of the window.
As you will see, you can also synchronize the selection between the different views by checking the Synchronized checkbox. This way, when you select one or several molecules in either a table or a plot, all open views will be updated to select the same molecules (if available).
Learn more about this view in the GUI section of this documentation.
We will now create new libraries. As a reminder, Libraries in SA2 represent subsets of molecules. We will illustrate this point using three simple approaches: (1) create a library based on selected molecules, (2) create a library using simple filtering rules, and (3) create a library grouping molecules that have a common scaffold (or framework). Note that there are other ways of creating libraries, e.g. by merging existing libraries, by complementing an existing library, by using a diversity algorithm, etc.
A simple way of creating a selection is to use the Selection window. Let's do this by creating a new selection containing all fragment-like (RO3-compliant, i.e. Rule of Three) molecules.
Your library has been saved. It should now be visible in various windows, including the List of libraries window, and in all other views (Flags table...) that allow you to view only one particular library.
Let's now create a filtered library. We will create the exact same library as previously, but in a smarter way. Indeed, you probably noticed that the previous process is OK when you have only a few molecules, but becomes quite tedious when you deal with a large database. Moreover, you don't want to use the sort capability of these tables on a large database. Let's make the process a bit more automatic.
Your new library has been created.
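For reference, the kind of rule applied by such a filter can be sketched as follows. The thresholds are the usual Rule of Three criteria (molecular weight of at most 300, logP of at most 3, and at most 3 hydrogen bond donors and 3 acceptors). How SA2 itself defines its RO3 flag is not described here, and it may include additional criteria, so this is an assumption. The Molecule class, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Molecule:
    mol_id: str
    weight: float      # molecular weight
    logp: float        # octanol/water partition coefficient
    h_donors: int      # hydrogen bond donors
    h_acceptors: int   # hydrogen bond acceptors


def is_ro3_compliant(mol: Molecule) -> bool:
    """Usual Rule of Three criteria; SA2's own definition may differ."""
    return (mol.weight <= 300 and mol.logp <= 3
            and mol.h_donors <= 3 and mol.h_acceptors <= 3)


def filter_fragments(molecules: Iterable[Molecule]) -> List[Molecule]:
    """Keep only the molecules that pass the rule."""
    return [m for m in molecules if is_ro3_compliant(m)]


# Usage with made-up values:
# filter_fragments([Molecule("m1", 150.2, 1.2, 1, 2),
#                   Molecule("m2", 420.5, 4.1, 2, 6)])
```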
If you are not convinced, open the List of libraries window. You should see your two libraries. Left click on each of them, and select Properties. You should see that they both contain the same number of compounds. This properties window is available for most database entities (Providers, Libraries, Properties / Tables...), and must be used if you want to change the name / description of a particular entity, or see some interesting properties (e.g. the % of explained variance of each component for a DRCS model).
Let's finally create a library containing all molecules that belong to a particular framework.
Go back to any of the tables previously opened, and select the checkbox named "Lib". Select your newly created library and you should now see only the molecules that belong to your library and share the same scaffold.
You've learned how to perform basic operations within SA2. There are plenty of other things that you can do with the software. The documentation is not exhaustive yet, but be patient: it will improve over time. Here is a subset of interesting pages you may want to read: