IDEA Analytics Platform Documentation

The IDEA Analytics platform is designed for analytics across large and diverse datasets using the Hadoop distributed computing system. In addition to Hadoop and related open-source tools for machine learning and natural language processing, IDEA includes tools such as Spark for high-performance parallel operations.

TYPICAL USES

  • Association studies combining genomic information with medical records
  • Applying natural language processing to textual datasets
  • Predicting outcomes by applying modeling algorithms to large datasets

SUPPORTED METHODS OF CONNECTING TO THE CLUSTER 

  • SSH command-line terminal access to the Hadoop workspace
  • Web portals and applications

Getting an account?

DOCUMENTATION

All IDEA documentation is hosted on the IDEA Confluence space. The most popular links are listed here.

 

Go to KB0034146 in the IS Service Desk
