Enterprise-wide data services are available to help Mass General Brigham investigators and research groups expedite the process of obtaining data for analysis or to gauge study feasibility. Repositories of clinical samples and data, as well as public data sets, are accessible to researchers through the tools and processes outlined below.
A new initiative, the Mass General Brigham Big Data Commons, enables Big Data to be integrated with the RPDR and the RPDR to be more tightly integrated with Epic. The Big Data Commons focuses on three areas: creating a Research Patient Portal for direct patient engagement in Epic; creating a distributed query system that allows more types of Partners Big Data to be integrated and become discoverable by researchers; and building specific integration platforms, such as the Biobank Portal, that deliver new forms of Big Data to researchers in easily consumable forms.
Collibra provides researchers a common language for RISC data assets across data repositories in a single location. Details of these data assets are organized into a series of dashboards that contain relevant information, access instructions, security and compliance details, and data dictionary links.
Available RISC Data Dictionaries within Collibra are:
- COVID-19 Tools including:
- RPDR Detailed Data Files
These Data Dictionaries describe the structure and arrangement of the data and the relationships among data elements within the data repository. Researchers can view the full data dictionary content or search for specific assets within one dictionary or across all the RISC Data Dictionaries.
How to Access
All MGB researchers have access to Collibra. Log in with your MGB credentials.
To get started, view the Navigation Video Tutorial that illustrates how to navigate and use the Collibra Tool.
For questions or suggestions regarding Collibra and Data Dictionaries, please contact MGBCOVIDResearchRequest@partners.org and address your message to Peter Gray.
Overview of the RPDR
The Research Patient Data Registry (RPDR) is a centralized clinical data registry, or data warehouse, that gathers clinical information from various Mass General Brigham hospital systems. An online Query Tool allows researchers to explore clinical data through a self-service system in order to:
- Assess clinical study feasibility
- Identify patients for clinical trials
- Investigate hospital operations and patient care
- Provide identified patient data with approved IRB protocol
- Find control patients for previously defined populations
- Search clinical notes for specific text terms and phrases in order to identify patient cohorts who have notes/reports that contain the searched text.
- Query for patients with blood samples in the Mass General Brigham Biobank
- Supply a workbench that allows viewing and download access for MGH and BWH Radiology images
The RPDR ensures the security of patient information by controlling and auditing the distribution of patient data within the guidelines of the IRB and with the use of several built-in, automated security measures. Because searches run online in real time, researchers get faster data turnaround and can specify patient criteria in extensive detail.
Functions of the RPDR
The RPDR has two related but separate functions:
- The online query tool provides users with aggregate numbers of patients that meet user-defined characteristics and criteria such as diagnoses, procedures, medications and/or laboratory results.
- The Data Request Wizards allow the user to ask for more detailed medical record information on the identified patient population. This process requires an approved IRB protocol.
Obtaining access to the RPDR
In order to use the RPDR, a person must first become registered in the RPDR system. Registration is handled differently for faculty vs. non-faculty members.
- A faculty member is defined as an attending physician or other research staff (such as a PhD or nurse researcher) who has the title of Instructor, Associate Professor, Assistant Professor, Professor, or Lecturer at the Harvard Medical School (HMS). Additionally, in order to become an RPDR user, the HMS user must be affiliated with a Mass General Brigham institution. Faculty members can self-register to become a Faculty Sponsor from the RPDR homepage at (http://RPDR).
- A non-faculty member is eligible to use the RPDR as a workgroup member. The Faculty Sponsor (Workgroup Leader) can add non-faculty members to their RPDR workgroup. Access will then be granted to the RPDR for the non-faculty member. Faculty Sponsors can add workgroup members from the RPDR homepage at (http://rpdr.partners.org).
*Note: http://RPDR (this link is internal only and requires Internet Explorer; you must be logged on to a Mass General Brigham workstation or connected to the Mass General Brigham network via VPN)*
Getting to the RPDR query tool
The RPDR query tool can be accessed through various entrance points once a user has been granted access.
- RPDR home page, click on the “Launch RPDR Web Query Tool” link
- Mass General Brigham workstation, click on Start menu, hover over Mass General Brigham Applications and select Research Patient Database Enhanced Query Tool
- Mass General Brigham Portal page via VPN, select ‘Partners Applications’ -> ‘MY CITRIX APPS’ and then ‘RPDR Web’. Citrix will need to be installed before launching the Portal Page.
Tools Available in the RPDR
*To query data and notes that are updated daily, visit the RPDR Daily Query tool. This can be found from the RPDR homepage (http://rpdr) or your Partners Application menu.*
This new functionality allows users to query source systems that are updated in the RPDR on a daily basis. These include Encounter Detail, Demographic Detail, Laboratory Tests, Radiology Tests, Providers, Specimens, Transfusion Services, Biobank Patient Consents, and, perhaps the most powerful, the Notes Search (EPIC + LMR Ambulatory notes and Clinical Reports). The Notes Search in the Daily Query tool lets users search for terms contained in notes without waiting for delayed coded information.
In addition to having more patients available, the daily query tool is helpful for finding time-sensitive patients for enrollment in studies and clinical trials.
Identified detailed data (Demographic and Identifying Patient Information) can be requested through the Daily Query Tool.
*Note: http://rpdr.partners.org (this link is internal only and requires Internet Explorer; you must be logged on to a Mass General Brigham workstation or connected to the Mass General Brigham network via VPN)*
The Biobank Portal is a tool that links consented subjects from the Mass General Brigham Biobank with their healthcare data from the electronic medical record (EMR), as well as health information survey and genomic data. The Portal allows researchers to query this data for aggregate totals, and to request blood and plasma samples, genomic data, and EHR data for these subjects. Users must have a valid Mass General Brigham logon and be a registered Research Patient Data Registry (RPDR) user to use the Portal.
In addition to comprehensive electronic medical record data, the Biobank Portal includes:
- Biobank Health Information Survey, patient-reported lifestyle, environment, and family history information.
- Curated Disease Populations, sets of subjects within the Biobank population who have been statistically determined to have a particular disease such as Type 2 diabetes, rheumatoid arthritis, congestive heart failure, and others. These cohorts are often called disease phenotypes.
- Healthy Populations, an index that statistically groups patients by co-morbidities (using the Charlson Index) in order to help select relatively healthy controls from the Biobank population.
- Biobank Sample Types, including DNA, plasma, and serum. Both de-identified and identified samples may be requested from the Portal. Requesting identified samples requires a valid Mass General Brigham IRB protocol.
- Biobank Genomic Data, genotyped and imputed genomic data are available for a subset of the Biobank population and may be requested via the Portal. Both de-identified and identified samples may be requested from the Portal. Requesting identified samples requires a valid Mass General Brigham IRB protocol.
- Querying by Genomic Data, single nucleotide polymorphism (SNP) and insertion and deletion (indel) variants, along with their related annotations, are available for a subset of genotyped subjects in the Biobank Portal and may be used to query for subjects within the Portal.
- Download of de-identified patient data enables the creation and download of data-obfuscated limited data sets (LDS) for further analysis. The downloaded file includes the Biobank Subject IDs, which may be used to request samples or genomic data.
Researchers can use the Biobank Portal to identify eligible case and control subjects, request samples, and perform analyses related to using consented samples for research.
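To make the Healthy Populations idea above more concrete, here is a minimal sketch of how a Charlson-style comorbidity score could be used to pick relatively healthy controls. It is an illustration only and not the Biobank Portal's actual method: the weights follow the original 1987 Charlson index, while the condition labels, the subject records, the `charlson_score` and `select_controls` functions, and the `max_score` cutoff are all hypothetical.

```python
# Weights from the original Charlson Comorbidity Index (1987).
# The condition labels are hypothetical names chosen for this sketch.
CHARLSON_WEIGHTS = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "peripheral_vascular_disease": 1,
    "cerebrovascular_disease": 1,
    "dementia": 1,
    "chronic_pulmonary_disease": 1,
    "connective_tissue_disease": 1,
    "peptic_ulcer_disease": 1,
    "mild_liver_disease": 1,
    "diabetes": 1,
    "hemiplegia": 2,
    "moderate_or_severe_renal_disease": 2,
    "diabetes_with_end_organ_damage": 2,
    "any_tumor": 2,
    "leukemia": 2,
    "lymphoma": 2,
    "moderate_or_severe_liver_disease": 3,
    "metastatic_solid_tumor": 6,
    "aids": 6,
}


def charlson_score(conditions):
    """Sum the weights of the distinct conditions recorded for one subject.

    Simplification: overlapping categories (for example, diabetes with and
    without end organ damage) are not reconciled here.
    """
    return sum(CHARLSON_WEIGHTS[c] for c in set(conditions))


def select_controls(subjects, max_score=0):
    """Return IDs of subjects whose score is at most max_score (assumed cutoff)."""
    return [
        s["subject_id"]
        for s in subjects
        if charlson_score(s["conditions"]) <= max_score
    ]


# Made-up subjects for illustration only.
SUBJECTS = [
    {"subject_id": "S1", "conditions": []},
    {"subject_id": "S2", "conditions": ["diabetes", "congestive_heart_failure"]},
    {"subject_id": "S3", "conditions": ["metastatic_solid_tumor"]},
]

if __name__ == "__main__":
    for subject in SUBJECTS:
        print(subject["subject_id"], charlson_score(subject["conditions"]))
    print("controls:", select_controls(SUBJECTS))
```

A lower score means fewer or less severe recorded comorbidities, so raising `max_score` widens the pool of candidate controls at the cost of including less healthy subjects.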
Please contact the Biobankportalhelp@partners.org mailbox with any questions or support requests.
The ACT Network is a real-time platform allowing researchers to explore and validate feasibility for clinical studies across the NCATS Clinical and Translational Science Award (CTSA) consortium, from their desktops. ACT helps researchers design and complete clinical studies and is secure, HIPAA-compliant, and IRB-approved.
ACT was developed collaboratively by members of NCATS’ Clinical and Translational Science Award (CTSA) consortium, with funding from the NIH National Center for Advancing Translational Sciences.
Why Use ACT?
- Explore patient populations
  - Learn about your patient population in-depth and in real-time, from your desktop.
- Check feasibility of clinical studies
  - Iteratively test and refine inclusion and exclusion criteria to confirm the feasibility of your clinical study.
- Find partner sites
  - Search for patient cohorts across the CTSA network to identify potential partners for multi-site studies.
- Demonstrate feasibility
  - Access easy-to-download results for use in funding proposals and IRB submissions to demonstrate the feasibility of your clinical study.
Visit https://www.actnetwork.us/harvardmgb for additional details and to get started at Mass General Brigham (MGB).
mi2b2 - Medical Image Access Tool
The Medical Imaging Informatics Bench to Bedside (mi2b2) workbench serves as a secure bridge between a researcher and the Mass General Brigham PACS systems, which aims to:
- Facilitate searching for, reviewing, and accessing clinically acquired images that are stored in several PACS (Picture Archive and Communication System) systems that serve the Mass General Brigham institutions
- Enable researchers to extend the use of the Research Patient Data Registry (RPDR) to access clinical images on patients of interest for enhanced research studies, with proper IRB approval
- Enable efficient retrieval of medical images (DICOM format) for lists of patients generated from research, teaching, and clinical activities in keeping with all regulatory guidelines
- Provide access to only patient images authorized by approved IRBs and provide audit trails for HIPAA compliance.
Using the RPDR Query tool and the RPDR Data Acquisition Engine (Image Request Wizard), a user can either obtain aggregate numbers of patients with user-defined characteristics based on a query or upload a pre-defined list of Medical Record Numbers, and then receive more detailed medical information about the resulting patient cohort. Either way, the user is provided with a personalized mi2b2 workbench, directly configured to include the queried cohort information. It is delivered in a folder along with the encrypted RPDR data results. For instructions on how to use the RPDR to request an mi2b2 workbench, please visit: (http://rpdr.partners.org) -> Help -> Request Images Help.
Detailed tutorial support for new users of the mi2b2 software is found at http://mi2b2help.partners.org.
The IDEA Platform hosts public data sets that research teams at Mass General Brigham can access using standard data analysis tools. Many public data sets take researchers days to locate, download and convert into a format suitable for further analysis. Now researchers can access these data sets directly in the IDEA repository, without needing to perform any further data movement, loading, or transformation.
Public Data Sets Available on IDEA
ClinicalTrials.gov is a registry and results database of publicly and privately supported clinical studies of human participants conducted around the world.
dbSNP is a database of single nucleotide polymorphisms (SNPs) and multiple small-scale variations that include insertions/deletions, microsatellites, and non-polymorphic variants. The IDEA platform hosts the human, mouse, and fruit fly genomes.
Read-only access to these data sets is available to all Mass General Brigham researchers.
As part of the service, the ERIS team will update local copies of the data as new versions of datasets are published. Prior versions of data sets will be archived and kept available for a limited period.
How it Works
The data sets are stored on the IDEA platform and are available to researchers using either standard SQL tools or the MapReduce programming model.
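As a rough illustration of the two access styles mentioned above, the sketch below computes the same aggregate twice: once with a SQL `GROUP BY` and once with explicit map and reduce steps. It runs entirely against an in-memory SQLite table and does not connect to IDEA. The table name, column names, and records are hypothetical and are not the IDEA schema.

```python
import sqlite3
from functools import reduce

# Made-up records loosely modeled on a clinical trial registry.
# Field names and values are assumptions for this sketch only.
TRIALS = [
    {"trial_id": "T001", "status": "Recruiting"},
    {"trial_id": "T002", "status": "Completed"},
    {"trial_id": "T003", "status": "Recruiting"},
    {"trial_id": "T004", "status": "Terminated"},
    {"trial_id": "T005", "status": "Completed"},
]


def count_by_status_sql(trials):
    """SQL style: load the records into a table and aggregate with GROUP BY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trials (trial_id TEXT, status TEXT)")
    conn.executemany(
        "INSERT INTO trials VALUES (:trial_id, :status)", trials
    )
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM trials GROUP BY status"
    ).fetchall()
    conn.close()
    return dict(rows)


def map_status(record):
    """Map step: emit a (key, 1) pair for each record."""
    return (record["status"], 1)


def reduce_counts(totals, pair):
    """Reduce step: fold each emitted pair into the running totals per key."""
    key, value = pair
    totals[key] = totals.get(key, 0) + value
    return totals


def count_by_status_mapreduce(trials):
    """MapReduce style: map every record to a pair, then reduce by key."""
    return reduce(reduce_counts, map(map_status, trials), {})


if __name__ == "__main__":
    print(count_by_status_sql(TRIALS))
    print(count_by_status_mapreduce(TRIALS))
```

Both functions return the same counts per status. On a real platform the map and reduce steps would run in parallel across many machines; here they run in a single process purely to show the shape of the computation.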
To access a data set, a user must request authorization to access the IDEA repository for that data set. The public data sets are freely available with no restrictions on the use of the data.
How to get Access
For access to a data set or questions about a data set, please complete the Shared Data Set Service Request Form.
The ERIS Collaboration Site provides additional information on hosted public data sets, including how to request access, data dictionaries, schemas, and links to the websites of the sources of the public data. You will be prompted to log in with your Mass General Brigham username and password to access this documentation.
Additional Data Sets
We are collecting information about other public data sets that might be of interest to researchers.
Public data sets currently under consideration for hosting on IDEA include PubMed, ExAC, ClinVar, GTEx, GDAC, and TCGA.
Requests for additional public data sets will be considered based on demand and availability of ERIS resources to support. Please send an email to firstname.lastname@example.org with your request.
Use of the Epic/Mass General Brigham eCare Reporting Workbench for research is medical records research.
REMINDER: The Epic/Mass General Brigham eCare Reporting Workbench tool is for use in clinical care.
For research data queries, investigators should use the Research Patient Data Registry (RPDR); the Epic/Mass General Brigham eCare Reporting Workbench tool should not be used.
If RPDR is insufficient for your research, you may use the Epic/Mass General Brigham eCare Reporting Workbench if:
- you have IRB approval AND
- use is in accordance with specific restrictions as outlined in the FAQs.
Use of medical records and protected health information (PHI) for research must be compliant with the HIPAA Privacy Rule, the Common Rule, and other laws that regulate patient and study subject privacy, in addition to Mass General Brigham policies on encryption and Information Security.
Paul J. Anderson, MD, PhD, Chief Academic Officer, Brigham and Women’s Hospital
O’Neil Britton, MD, CHIO, Partners HealthCare
Shelly Greenfield, MD, MPH, Chief Academic Officer, McLean Hospital
Jigar Kadakia, Chief Information Security and Privacy Officer, Partners HealthCare
Anne Klibanski, MD, Chief Academic Officer, Partners HealthCare
Janet (Jodi) Larson, MD, Acting Associate Chief Medical Officer, Newton Wellesley Hospital
Gregg S. Meyer, MD, Chief Clinical Officer, Partners HealthCare
Harry W. Orf, PhD, Senior Vice President for Research, Massachusetts General Hospital
Kerry Ressler, MD, PhD, Chief Scientific Officer, McLean Hospital
Ross Zafonte, DO, Senior Vice President of Medical Affairs, Research and Education, Spaulding Rehabilitation Network