Linux HPC Applications Specialist

Visit the Partners careers page to apply online. Job ID is 3116286

General Summary/Overview Statement

Partners Healthcare Systems’ (PHS) Enterprise Research IS (ERIS http://rc.partners.org) is immediately seeking a candidate for the position of High Performance Computing (HPC) Applications Specialist. Our Scientific Computing platform supports the computational and analytics needs of our research and innovation communities across the Partners Healthcare Systems member hospitals. This position is responsible for all aspects of the self-service applications and hosted data within the Linux HPC cluster and associated services. The role is challenging and varied, requiring technical, interpersonal and problem-solving abilities.

The successful candidate will be responsible for the applications and runtime environment of the computing cluster and for assisting the user community in its use. The cluster service includes 6000 CPU cores with remote desktop accessibility, numerous scientific applications, large-memory systems, GP-GPU equipped systems for machine learning, and Intel Xeon Phi co-processors. Key technical aspects of the role are Linux system administration, scientific software installation, and monitoring and troubleshooting performance issues relating to software and user workloads. Experience using comparable computing systems is essential. In the course of expanding and improving the service capability, this position will interface with commercial hardware and software vendors, select and deploy new technologies, and create how-to guides and training for end users.

The position requires an eagerness to learn, assess and deploy new technologies in test and production environments, versatility in technical skills, a demonstrated ability to work independently, effective communication with team and management, outstanding customer service skills, and the ability to help teams and projects successfully accomplish their research objectives. Candidates are not expected to be deeply knowledgeable in all areas but must demonstrate the ability and desire to learn how to support a large and diverse research environment in its use of analytics and HPC. Ideal candidates thrive on variety and innovation in their daily work, on interaction with customers who are world-renowned leaders in their scientific field, and on working with a wide range of technologies in a decentralized, non-standard (academic) environment.

ERIS enables and supports the highly successful and innovative research programs of the largest teaching hospitals in the nation--Massachusetts General (MGH), Brigham and Women’s (BWH), McLean and Spaulding Rehabilitation Hospitals--with their more than 3200 grant-sponsored programs in the biomedical sciences, from basic to clinical and applied research.

Principal Duties and Responsibilities

  • Maintenance of software and runtime environments: effective installation, configuration and troubleshooting of open-source and commercial scientific applications and toolsets.
  • Technical and Customer Support: Provide support to scientific researchers who use a broad spectrum of applications from diverse fields (genetics, epidemiology, drug development, natural language processing, and medical imaging). 
  • Provide applications and tools assistance to the scientific community, including break/fix support, setup/installation support, escalation support, and solutions support. Applications include MATLAB, R, Python, TensorFlow, Git, and bioinformatics applications.
  • Develop and maintain system documentation as well as user-facing knowledge base articles and how-to guides
  • Offer small group training classes in Linux and cluster computing
  • Create and close tasks/tickets using established standards.
  • Analyze and resolve customer and technical problems: tuning cluster scheduling parameters, memory/CPU contention, and scientific application compilation and run-time issues.
  • Troubleshoot scheduler submission problems and assist with user access and Linux/scheduler command-line help.
  • Develop containerization solutions for the environment. Work with users to deploy Singularity (Docker) applications.
  • Analyze results of server monitoring and implement changes to improve performance, processing and utilization. Propose, maintain and enforce policies, practices and security procedures.
  • Leverage industry standard system monitoring and reporting tools to ensure the maintainability, scalability and availability of the infrastructure environment.
  • Perform other duties as required or assigned by the situation and circumstances.
  • Use the Partners HealthCare values to govern decisions, actions and behaviors. These values guide how we get our work done: Patients, Affordability, Accountability & Service Commitment, Decisiveness, Innovation & Thoughtful Risk; and how we treat each other: Diversity & Inclusion, Integrity & Respect, Learning, Continuous Improvement & Personal Growth, Teamwork & Collaboration.

Qualifications

  • BA/BS degree or equivalent combination of skills/experience required. Master's or higher degree preferred.
  • Minimum of 5 years’ experience working with HPC or similar analytics environments supporting and deploying scientific applications and tools.
  • Strong understanding of, and experience managing, different runtime environments in Linux.
  • Strong verbal and written communication and interpersonal skills
  • Research experience with scientific applications is highly desired

Skills/Abilities/Competencies Required

  • Must be capable of contributing within a team, exhibit a high level of initiative, and have an eagerness to learn and implement new technologies.
  • Experience with EasyBuild or similar HPC software build and installation frameworks highly desired
  • Demonstrated ability in providing HPC cluster and scheduler troubleshooting to a community with diverse computing needs. 
  • Candidate must possess advanced knowledge and understanding of Linux runtime environments and a strong ability to debug and troubleshoot complex installations.
  • Extensive experience using Stack Overflow or similar forums for debugging complex installations is highly desired.
  • Knowledge of Git and Jira tools.
  • Knowledge of a common scheduling system, e.g. Platform LSF, SLURM, etc.
  • Ability to multitask and prioritize work requirements, keeping team and management informed.
  • Excellent interpersonal skills to communicate effectively with cross-functional teams and staff at all levels of the organization, including technical and non-technical, medical and non-medical personnel.
  • Ability to successfully negotiate and collaborate with others of different skill sets, backgrounds and levels within and external to the organization.

 

Working Conditions

  • Standard office environment.
  • Travel to remote buildings required, consisting of onsite work around the Massachusetts General/Brigham and Women’s/McLean Hospitals campuses and the Partners Data Centers.
  • As projects and priorities dictate, may be required to work occasional non-standard hours to support major projects
     

EEO Statement

Partners HealthCare is an Equal Opportunity Employer. By embracing diverse skills, perspectives and ideas, we choose to lead. All qualified applicants will receive consideration for employment without regard to race, color, religious creed, national origin, sex, age, gender identity, disability, sexual orientation, military service, genetic information, and/or other status protected under law.

Primary Location: MA-Somerville-Assembly Row - PHS
Work Locations: Assembly Row - PHS, 399 Revolution Drive, Somerville 02145
Job: Systems/Network Administration
Organization: Partners HealthCare(PHS)
Schedule: Full-time
Standard Hours: 40
Shift: Day Job
Employee Status: Regular 
Recruiting Department: PHS Information Systems
Job Posting:  September 17, 2019

Visit the Partners careers page to apply online. Job ID is 3106960