How to access Shared Data sets on IDEA

Get Help

Posted on July 6, 2016. Updated on September 21, 2022.

Data Sets Index

ClinicalTrials.gov: Two versions of the database are currently available:
- September 2014 - Use schema clintrialsgov_201409
- March 2015 - Use schema clintrialsgov_201503
dbSNP: Human, Mouse, Fruit Fly.
- Naming conventions:
  - dbsnp_(genome data set)_(build number)_(major genome version)_(minor genome version)
  - dbsnp_main_(build number)
- Current versions:
  - dbsnp_main_145
  - dbsnp_human_9606_144_38_2
  - dbsnp_fruitfly_7227_130_0_0
  - dbsnp_mouse_10090_142_38_3
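The naming conventions above can be illustrated with a small parser. This is only a sketch: it assumes the last three underscore-separated fields are the build number and the major and minor genome versions, and that everything between "dbsnp" and those fields (for example "human_9606") is the genome data set label. The names DbsnpSchema and parse_dbsnp_schema are hypothetical helpers, not part of IDEA.

```python
from typing import NamedTuple, Optional


class DbsnpSchema(NamedTuple):
    dataset: str                 # e.g. "main" or "human_9606"
    build: int                   # dbSNP build number
    genome_major: Optional[int]  # None for dbsnp_main_(build number)
    genome_minor: Optional[int]  # None for dbsnp_main_(build number)


def parse_dbsnp_schema(name: str) -> DbsnpSchema:
    """Split a dbSNP schema name into the parts named in the conventions."""
    parts = name.split("_")
    if len(parts) < 3 or parts[0] != "dbsnp":
        raise ValueError(f"not a dbSNP schema name: {name!r}")

    # dbsnp_main_(build number) has no genome version fields.
    if parts[1] == "main":
        if len(parts) != 3:
            raise ValueError(f"unexpected main schema name: {name!r}")
        return DbsnpSchema("main", int(parts[2]), None, None)

    # Need at least one data set field plus build, major and minor.
    if len(parts) < 5:
        raise ValueError(f"too few fields in schema name: {name!r}")

    # Assumption: the last three fields are build, major, minor; the rest
    # is the genome data set label.
    *dataset, build, major, minor = parts[1:]
    return DbsnpSchema("_".join(dataset), int(build), int(major), int(minor))


if __name__ == "__main__":
    for schema in ("dbsnp_main_145", "dbsnp_human_9606_144_38_2"):
        print(schema, "->", parse_dbsnp_schema(schema))
```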

Connecting to the Database

Investigators may request access to our public datasets hosted on the IDEA platform by completing the Public Data Service Request form. Once access is granted, log in with your regular Partners ID as the username, together with your password.

The data sources are hosted on HAWQ, a Postgres-compatible relational database. pgAdmin III or a similar Postgres-compatible tool may be used to connect to it. Because HAWQ is built on a fork of PostgreSQL 8.2.14, please use pgAdmin III v1.20.0 to avoid compatibility issues.

The screenshot below shows the server connection setup for pgAdmin III.

[Screenshot: Connect to HAWQ]

For questions about the public data sets, contact the IDEA Support Team at: @email.

Querying the data in IDEA

Once connected to HAWQ, you can query the data in the database. The publicdatasets folder shows the available data sets, organized into schemas. Access is granted only to the data sets selected in the Public Data Service Request form.

[Screenshot: HAWQ Interface]

Use the SQL button to open the SQL editor. All commands follow Postgres syntax.
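As an illustration of the kind of statement that can be pasted into the SQL editor, the sketch below assembles two queries as plain strings: one that lists the tables in a schema and one that previews a few rows of a table. It assumes HAWQ exposes the standard information_schema views as PostgreSQL does. The helper names and the table name "some_table" are hypothetical; substitute a table that actually exists in a schema you have access to.

```python
import re

# Conservative pattern for unquoted, lower-case Postgres identifiers.
# Names are checked against it because they are pasted directly into SQL.
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def _check_identifier(name: str) -> str:
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unexpected identifier: {name!r}")
    return name


def list_tables_sql(schema: str) -> str:
    """Return a statement listing the tables in one schema."""
    schema = _check_identifier(schema)
    return (
        "SELECT table_name FROM information_schema.tables "
        f"WHERE table_schema = '{schema}' ORDER BY table_name;"
    )


def preview_sql(schema: str, table: str, limit: int = 10) -> str:
    """Return a statement previewing the first rows of schema.table."""
    schema = _check_identifier(schema)
    table = _check_identifier(table)
    return f"SELECT * FROM {schema}.{table} LIMIT {int(limit)};"


if __name__ == "__main__":
    print(list_tables_sql("clintrialsgov_201503"))
    # "some_table" is a placeholder, not a real table name.
    print(preview_sql("clintrialsgov_201503", "some_table", limit=5))
```

Keeping a LIMIT on exploratory queries avoids pulling a whole table into the client by accident.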

If you need selected data outside the platform, please extract it with one of the query methods below. We strongly recommend against making duplicate copies of the datasets.

From any other Postgres/SQL database: Use postgres_fdw.
From Python: Use PyGreSQL.
From R: Use RPostgreSQL.
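For the Python route, a minimal sketch using PyGreSQL's DB-API module (pgdb) might look like the following. It runs one query and serializes the result as CSV text. The function names run_query and rows_to_csv are hypothetical, and the host and database values in the commented example are placeholders rather than real IDEA connection details; use the values from your own pgAdmin III server setup.

```python
import csv
import io
from typing import Any, Iterable, List, Optional, Sequence, Tuple


def run_query(
    sql: str,
    params: Optional[Sequence[Any]] = None,
    *,
    host: str,
    database: str,
    user: str,
    password: str,
) -> Tuple[List[str], List[tuple]]:
    """Run one query and return (column_names, rows)."""
    # PyGreSQL's DB-API 2.0 module. Imported here so the CSV helper below
    # stays usable on machines where PyGreSQL is not installed.
    import pgdb

    con = pgdb.connect(host=host, database=database, user=user, password=password)
    try:
        cur = con.cursor()
        cur.execute(sql, params)
        columns = [col[0] for col in cur.description]
        rows = [tuple(row) for row in cur.fetchall()]
        return columns, rows
    finally:
        con.close()


def rows_to_csv(columns: Sequence[str], rows: Iterable[Sequence[Any]]) -> str:
    """Serialize a header and rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


# Example call (host and database names are placeholders, not real values):
#
#   import getpass
#   cols, rows = run_query(
#       "SELECT table_name FROM information_schema.tables"
#       " WHERE table_schema = %s",
#       ("clintrialsgov_201503",),
#       host="hawq.example.org:5432",
#       database="publicdatasets",
#       user="your_partners_id",
#       password=getpass.getpass(),
#   )
#   print(rows_to_csv(cols, rows))
```

Selecting only the rows and columns you need, rather than exporting whole tables, keeps this in line with the recommendation not to duplicate the datasets.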

Go to KB0028033 in the IS Service Desk
