Partners Enterprise Research IS Webinar: Applications of Data Science to Life Science Research WebEx Info

Did you miss this event? All are welcome to view the recording of the webinar to learn how big data and open-source tools can be used to tackle complex data challenges in health care. 

Find additional materials in our Presentation archive.


Do you have colleagues who have been talking about using data science and machine learning tools on their genomic and patient data? Do you wonder how these tools and technologies could be used in your research, how they differ from the tools you use today, or how they could help you set up your datasets for analysis more quickly? Have you heard of big data tools such as Spark, MongoDB, and Greenplum (GPDB)?

ERIS' Scientific Computing team invites you to a webinar to learn how a big data analytics platform can be used for complex analytics and data transformations across large and diverse datasets in the life and health science domains.

Led by Sarah Aerni, PhD Biomedical Informatics
Principal Data Scientist for the life sciences at Pivotal, an open-source platform software and services provider.

Through an in-depth analysis of real-world use cases, participants will come away with an understanding of big data analytics in the life and health sciences, the tools available, and how this work can be done at Partners HealthCare using the Integrated Data Environment for Analytics (IDEA) platform. Sarah will cover a use case on building models from patient data in electronic medical records to predict outcomes in hospital settings. She will also cover how data-driven approaches can be applied to drug discovery using genomics, image-based proteomics, and structural data.

All are welcome to join the webinar to learn how big data and open-source tools can be used to tackle complex data challenges in health care. There will be time for questions at the end.


The IDEA platform gives research teams access to a range of open-source tools, including Hadoop, Spark, and MongoDB, hosted on a highly scalable infrastructure designed specifically for large analytics workloads. In addition to well-known Hadoop applications such as Pig, Hive, and HBase, the platform integrates specialized machine learning and natural language processing tools. It also includes HAWQ and Greenplum, two of the leading big data relational database solutions. Complete a webform to request an IDEA account today, or contact @email with questions or concerns.

ERIS provides information services and technologies to enable and drive innovation in research across the academic medical centers of Partners HealthCare.