Home   ●   About   ●   Help

Welcome to the SickleGen database.

Use the menu at the top to navigate.

Attention: This is a prototype database (GDDTEST). The real database requires pre-approval and IRB protocol certification and is located at http://lobstah.bu.edu/gdd/. This prototype has 500 re-deidentified subjects from the CSSCD and MSH databases with limited questionnaire and SNP data, and is only meant for demonstration purposes in conjunction with the SickleGen grant proposal.

The SickleGen database is a collection of phenotype and genotype data from different databases, brought together to study sickle cell anemia. The databases currently included are CSSCD, MSH, C-Data, and BU-PH. The main features of the database are outlined below. For a more complete help manual, follow the Help link at the top of this page.

The Questionnaires tab contains data from the various data sets produced from patient questionnaires.

Each data set can be sorted by any field by clicking the field name. Each set can also be filtered, and the result saved as a group of IDs (see filtering for more information).

The set name and database name are printed at the top of the page, followed by the filter box, then the page information and navigation, and finally the top 100 records returned by the database.

The Documentation tab provides access to documents related to the data sets, as well as to the original data sets (before MySQL import).

Documents of the following types are indexed and can be searched via the SEARCH tab: .doc, .txt, .csv, .wpd, .fmt.

This application uses full-text search with boolean operators. The index skips words of two characters or fewer, so words like "at" were not indexed. However, searching for "at*" will find "attribute". Commonly used words (and, you, them...) were also excluded from the index.

The following examples demonstrate some search strings that use boolean full-text operators:

  • 'apple banana' - Find rows that contain at least one of the two words.

  • '+apple +juice' - Find rows that contain both words.

  • '+apple -macintosh' - Find rows that contain the word "apple" but not "macintosh".

  • 'apple*' - Find rows that contain words such as "apple", "apples", "applesauce", or "applet".

  • '"some words"' - Find rows that contain the exact phrase "some words" (for example, rows that contain "some words of wisdom" but not "some noise words"). Note that the " characters that enclose the phrase are operator characters that delimit the phrase. They are not the quotes that enclose the search string itself.
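The operator rules listed above can be illustrated with a small Python sketch. This is not the database's own search code (the search runs inside the database); it is a simplified model of the +, -, * and "phrase" operators only, and the function names matches, term_matches and tokenize are made up for this illustration. It ignores relevance ranking, the minimum word length and the excluded common words.

```python
import re
import shlex


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def term_matches(term, words):
    """Check a single search term against the words of one row."""
    if " " in term:  # quoted phrase: the words must appear consecutively
        phrase = tokenize(term)
        n = len(phrase)
        return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))
    if term.endswith("*"):  # trailing wildcard: prefix match
        prefix = term[:-1].lower()
        return any(w.startswith(prefix) for w in words)
    return term.lower() in words


def matches(query, text):
    """Return True if the row text satisfies the boolean search string."""
    words = tokenize(text)
    required, excluded, optional = [], [], []
    # shlex.split keeps a "quoted phrase" together as one term
    for raw in shlex.split(query):
        if raw.startswith("+"):
            required.append(raw[1:])
        elif raw.startswith("-"):
            excluded.append(raw[1:])
        else:
            optional.append(raw)
    if any(term_matches(t, words) for t in excluded):
        return False
    if not all(term_matches(t, words) for t in required):
        return False
    # With no required terms, at least one optional term must be present
    if not required:
        return any(term_matches(t, words) for t in optional)
    return True


print(matches("apple banana", "a banana split"))           # True
print(matches("+apple -macintosh", "apple macintosh"))     # False
print(matches('"some words"', "some noise words"))         # False
```

In this simplified model a search string made only of excluded terms (for example '-macintosh') matches nothing, since there is no positive term to find.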

SNP data is available for some data sets and can be queried by RS ID or by chromosome location.

You can query with either a single SNP RS ID or a list of RS IDs (pasted or uploaded).
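As a rough illustration of preparing a list of RS IDs before pasting or uploading it, the Python sketch below pulls identifiers of the form "rs" followed by digits out of free text and removes duplicates. The function name parse_rs_ids is hypothetical, and the separators the database itself accepts are not documented here, so treat this as a convenience for cleaning up a list, not as a description of the site's parser.

```python
import re


def parse_rs_ids(pasted):
    """Extract unique RS IDs (e.g. rs334) from text, keeping first-seen order."""
    seen = []
    for match in re.findall(r"\brs\d+\b", pasted, flags=re.IGNORECASE):
        rs_id = match.lower()  # normalize "RS334" to "rs334"
        if rs_id not in seen:
            seen.append(rs_id)
    return seen


# Example identifiers separated by commas, spaces and a line break
print(parse_rs_ids("rs334, RS7482144\nrs334 rs11886868"))
# ['rs334', 'rs7482144', 'rs11886868']
```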

A location can be either typed in or selected from the region drop-down list (regions are created from the workspace menu).

The results can be given for the entire data set, a single subject, or a group of subjects (groups are created from the workspace menu).

The Phenotype Builder menu (in the workspace tab) is similar to the Questionnaires menu, with two important differences.

First, only the IDs are displayed (so they can be saved as a group later).

Second, filtering across data sets is allowed: the results can be combined using INTERSECT or UNION. Because subjects and identifiers differ between databases, it only makes sense to combine tables that came from a single database.
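The difference between the two ways of combining results can be shown with plain Python sets. This is only a sketch of the set logic: the function name combine and the subject IDs are invented for the example, and both ID lists are assumed to come from data sets of the same source database.

```python
def combine(id_sets, mode):
    """Combine the ID results of several filtered data sets."""
    if not id_sets:
        return set()
    sets = [set(s) for s in id_sets]
    if mode == "INTERSECT":
        # keep only the IDs that pass the filter in every data set
        return set.intersection(*sets)
    if mode == "UNION":
        # keep the IDs that pass the filter in at least one data set
        return set.union(*sets)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical filter results from two data sets of one database
first_filter_ids = {101, 102, 105}
second_filter_ids = {102, 105, 110}

print(sorted(combine([first_filter_ids, second_filter_ids], "INTERSECT")))
# [102, 105]
print(sorted(combine([first_filter_ids, second_filter_ids], "UNION")))
# [101, 102, 105, 110]
```

INTERSECT narrows a group to subjects who satisfy all filters, while UNION widens it to subjects who satisfy any of them.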

Your workspace is saved as long as you access this database from the same machine (identified by its IP address).

The workspace includes questionnaire filters and custom settings, as well as created chromosome regions and uploaded groups of individuals.

Your workspace may be accessed from another machine if you save the session string provided and type it in at a later time. You may also simply bookmark the link provided on the workspace page.