
This is the help manual. Use the index in the left menu bar to navigate it.

DATA - The following data sets are present in the database:

  1. CSSCD - data sets received as SAS database files. They were exported to an Access database and converted to MySQL import files with the exportSQL script. A few sets exceeded the Access limit of 255 fields, so each of those was split in two.

  2. C-DATA - data sets received as SAS database files. They were exported to an Access database and converted to MySQL import files with the exportSQL script.

  3. MSH - data sets received as SAS database files. They were exported as DBF files and imported into MySQL through Navicat. The 2 files that exceeded the 255-field limit were imported as CSV after the table structure had been created with Navicat from the DBF files. Empty entries were replaced with NULL using the sed replacement program.

  4. BU-PH - data sets were given to me as Excel files. They were converted to CSV and then to SQL import files via the http://www.sqldbu.com/eng/sections/tips/mysqlimport.html website. Spaces were removed from some table field names.
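
The empty-entry replacement described for MSH was done with sed, and the exact command is not recorded here. As a rough illustration only, the same idea can be sketched in Python. The function name and the sample data below are made up for this example.

```python
import csv
import io

def empty_to_null(csv_text, null_marker="NULL"):
    """Replace every empty CSV field with null_marker.

    Illustrative stand-in for the sed step; not the command actually used.
    """
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([field if field != "" else null_marker for field in row])
    return out.getvalue()

# Hypothetical sample: two records with one missing value each.
sample = "id,age,sex\n1,,F\n2,34,\n"
print(empty_to_null(sample))
```

Using a CSV parser instead of a plain text substitution avoids touching commas that appear inside quoted fields.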

Documentation - The Documentation tab provides access to documents related to the data sets, as well as to the original data sets (before MySQL import). Documents of the following types are indexed and can be searched via the SEARCH tab: .doc, .txt, .csv, .wpd, .fmt.

Download - The Download button on the questionnaire page allows downloading of data in text format. Please note that if the set has been filtered, the filtered version will be downloaded.

  1. Your system - select Windows or Linux/Mac for proper end-of-line conversion.
  2. Delimiter - choose the field delimiter.
  3. Pretty Dates - unchecking this will return dates in numeric SAS format for some fields.
  4. Field Names - unchecking this will remove the first line containing field names.
  5. Field Labels - checking this will replace field names with labels when available.
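
To make the effect of these options concrete, here is a minimal Python sketch of how such an export could behave. It is not the application's code: the function and parameter names are hypothetical, and it assumes the numeric dates are SAS date values (days counted from 1 January 1960).

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)  # SAS date values count days from this day

def sas_date_to_iso(days):
    """Convert a numeric SAS date value to a readable YYYY-MM-DD string."""
    return (SAS_EPOCH + timedelta(days=days)).isoformat()

def export_rows(field_names, rows, delimiter="\t", system="linux",
                include_field_names=True, labels=None,
                date_fields=(), pretty_dates=True):
    """Render rows as delimited text (hypothetical helper mirroring the options)."""
    eol = "\r\n" if system == "windows" else "\n"   # "Your system"
    lines = []
    if include_field_names:                          # "Field Names"
        # "Field Labels": use a label when one is available, else the name.
        header = [(labels or {}).get(name, name) for name in field_names]
        lines.append(delimiter.join(header))
    for row in rows:
        cells = []
        for name, value in zip(field_names, row):
            if pretty_dates and name in date_fields and value is not None:
                value = sas_date_to_iso(value)       # "Pretty Dates"
            cells.append("" if value is None else str(value))
        lines.append(delimiter.join(cells))
    return eol.join(lines) + eol

# Hypothetical record: anonid 20 with a visit date stored as SAS value 366.
text = export_rows(["anonid", "visit"], [(20, 366)], delimiter=",",
                   system="windows", date_fields={"visit"})
print(repr(text))
```

With pretty_dates=False the same record would keep the raw value 366 in the visit column.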

Filtering - The filter allows you to filter out results you do not want. Select a field from the FIELD textbox and a comparison operator from the OPERATOR textbox, then type the desired value into the VALUE textbox. For example, selecting "anonid" + "equal to" + 20 and pressing APPLY will filter out the results that do not have anonid equal to 20. Once you have entered a filter, it will be displayed above the selection line. To remove a filter, uncheck the checkbox in front of it and press APPLY.

Pay attention to the number of repeats in the "original set" and always compare it to the number of repeats "after filter" as displayed below the filter. This indicates how many repeats are left after the filters you applied.
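
The filtering logic can be pictured with the short Python sketch below. It is an illustration only, not the application's code. Only the "equal to" operator is named in this manual; the other operator names, the function name and the sample records are assumptions.

```python
import operator

# Only "equal to" appears in this manual; the other names are assumed.
OPERATORS = {
    "equal to": operator.eq,
    "not equal to": operator.ne,
    "greater than": operator.gt,
    "less than": operator.lt,
}

def apply_filters(records, filters):
    """Keep the records that satisfy every (field, operator, value) filter."""
    kept = records
    for field, op_name, value in filters:
        compare = OPERATORS[op_name]
        kept = [r for r in kept
                if r.get(field) is not None and compare(r[field], value)]
    return kept

# Hypothetical records with repeated entries for anonid 20.
records = [
    {"anonid": 20, "age": 31},
    {"anonid": 21, "age": 45},
    {"anonid": 20, "age": 52},
]
filtered = apply_filters(records, [("anonid", "equal to", 20)])
print(f"original set: {len(records)}, after filter: {len(filtered)}")
```

Comparing the two counts, as recommended above, shows at a glance how much of the set a filter removed.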

Phenotype Builder - The Phenotype Builder menu (in the Workspace tab) is similar to the Questionnaires menu, with two important differences. First, only the ids are displayed (so they can be saved as a group later). Second, filtering across data sets is allowed, using INTERSECT or UNION joining of results. Because subjects and identifiers differ between databases, it only makes sense to join tables that came from a single database.
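
The difference between INTERSECT and UNION joining can be shown with plain sets of ids. The Python sketch below is an illustration with made-up ids and a hypothetical function name, not the application's code.

```python
def combine_ids(id_sets, mode="INTERSECT"):
    """Combine the id sets returned by several filtered data sets."""
    if not id_sets:
        return set()
    result = set(id_sets[0])
    for ids in id_sets[1:]:
        if mode == "INTERSECT":
            result &= set(ids)   # keep ids present in every result
        elif mode == "UNION":
            result |= set(ids)   # keep ids present in any result
        else:
            raise ValueError(f"unknown mode: {mode}")
    return result

# Hypothetical ids returned by filters on two data sets of one database.
set_a = {1, 2, 3}
set_b = {2, 3, 4}
print(sorted(combine_ids([set_a, set_b], "INTERSECT")))  # [2, 3]
print(sorted(combine_ids([set_a, set_b], "UNION")))      # [1, 2, 3, 4]
```

INTERSECT narrows the group to subjects that pass every filter, while UNION widens it to subjects that pass at least one.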

Questionnaires - The Questionnaires tab contains data from various data sets produced from patient questionnaires. Each data set can be sorted by any field by clicking the field name. Each set can be filtered and saved as a group of ids (see Filtering for more information). The set name and database name are printed at the top of the page, followed by the filter box, then the page information and navigation, and finally the top 100 records returned by the database.

Search - FULL TEXT search with boolean operators is used in this application. The length cutoff for indexed words is 2, so words like "at" were not indexed. However, a wildcard search such as "at*" will still find "attribute". Commonly used words (and, you, them...) were also excluded from the index.

The following examples demonstrate some search strings that use boolean full-text operators:

  • 'apple banana'

    Find rows that contain at least one of the two words.

  • '+apple +juice'

    Find rows that contain both words.

  • '+apple -macintosh'

    Find rows that contain the word "apple" but not "macintosh".

  • 'apple*'

    Find rows that contain words such as "apple", "apples", "applesauce", or "applet".

  • '"some words"'

    Find rows that contain the exact phrase "some words" (for example, rows that contain "some words of wisdom" but not "some noise words"). Note that the " characters that enclose the phrase are operator characters that delimit the phrase. They are not the quotes that enclose the search string itself.
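
    To illustrate how these operators combine, the following Python sketch implements a simplified matcher for the five patterns above. It is an approximation written for this manual, not the database's search engine: it ignores the word-length cutoff, the excluded common words and relevance ranking, and all function names are hypothetical.

```python
import re

def parse_query(query):
    """Split a query into (sign, term) pairs; sign is '+', '-' or ''."""
    return re.findall(r'([+-]?)("[^"]+"|\S+)', query)

def term_matches(term, text):
    """Check one term (word, prefix* or "phrase") against a row's text."""
    words = re.findall(r"\w+", text.lower())
    term = term.lower()
    if term.startswith('"'):
        # Phrase: its words must appear next to each other, in order.
        phrase = re.findall(r"\w+", term)
        n = len(phrase)
        return any(words[i:i + n] == phrase
                   for i in range(len(words) - n + 1))
    if term.endswith("*"):
        # Wildcard: any word starting with the given prefix.
        return any(w.startswith(term[:-1]) for w in words)
    return term in words

def row_matches(query, text):
    """Apply +required, -excluded and optional terms to one row."""
    groups = {"+": [], "-": [], "": []}
    for sign, term in parse_query(query):
        groups[sign].append(term)
    if any(term_matches(t, text) for t in groups["-"]):
        return False
    if not all(term_matches(t, text) for t in groups["+"]):
        return False
    if groups["+"]:
        return True
    # No required terms: at least one optional term must match.
    return any(term_matches(t, text) for t in groups[""])

print(row_matches("+apple -macintosh", "apple pie"))         # True
print(row_matches('"some words"', "some words of wisdom"))   # True
print(row_matches('"some words"', "some noise words"))       # False
```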

* For document search, documents of the following types are indexed and can be searched via the SEARCH tab: .doc, .txt, .csv, .wpd, .fmt.

SNP Data - SNP data is available for some sets. SNP data can be queried by RS ID or by chromosome location. Either a single SNP RS ID or a list of RS IDs (pasted or uploaded) can be used. The location can be either typed in or selected from the region drop-down list (regions are created from the workspace menu). The results can be given for the entire data set, for a single subject, or for a group of subjects (groups are created from the workspace menu).
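
The two ways of querying can be pictured with the Python sketch below. It is only an illustration: the record layout (rsid, chrom, pos), the region format (chromosome, start, end), the function name and the sample values are all assumptions, not the application's actual data model.

```python
def query_snps(snps, rs_ids=None, region=None):
    """Select SNPs by a list of RS IDs and/or a chromosome region.

    region is an assumed (chromosome, start, end) tuple, bounds inclusive.
    """
    wanted = set(rs_ids) if rs_ids is not None else None
    results = []
    for snp in snps:
        if wanted is not None and snp["rsid"] not in wanted:
            continue
        if region is not None:
            chrom, start, end = region
            if snp["chrom"] != chrom or not (start <= snp["pos"] <= end):
                continue
        results.append(snp)
    return results

# Made-up SNP records for illustration only.
snps = [
    {"rsid": "rs1", "chrom": "11", "pos": 1000},
    {"rsid": "rs2", "chrom": "11", "pos": 5000},
    {"rsid": "rs3", "chrom": "2", "pos": 1500},
]
by_id = query_snps(snps, rs_ids=["rs1", "rs3"])
by_region = query_snps(snps, region=("11", 900, 2000))
print([s["rsid"] for s in by_id], [s["rsid"] for s in by_region])
```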

Workspace - Your workspace is saved as long as you access this database from the same machine (based on IP address). The workspace includes questionnaire filters and custom settings, as well as created chromosome regions and uploaded groups of individuals. Phenotype Builder is accessible from this menu.

Your workspace may be accessed from another machine if you save the session string provided and type it in at a later time. You may also simply bookmark the link provided on the workspace page.