HPC Systems Engineer

Location: Houston, TX, US

Company: ExxonMobil

Job Role Summary

The HPC Systems Engineer works within a team to provide a performant, reliable, and secure high performance computing (HPC) environment. The role involves designing and engineering our HPC system and managing day-to-day operations and maintenance, including but not limited to: troubleshooting issues as they arise, monitoring overall system health, performing system maintenance tasks, and evaluating new hardware and system software.

Primary Job Functions

  • Establish strategies for overall support of the system
  • Evaluate new hardware and software and assess their potential benefits and impact on the environment
  • Perform hardware maintenance
  • Perform software installations and upgrades, including the operating system
  • Monitor overall system performance and health
  • Be available periodically for on-call support and weekend maintenance activities
  • Provide support for the management of data in the environment
  • Work with users to resolve problems and ensure they are able to effectively utilize the system
  • Interact with business customers and technical teams that are globally distributed across varied time zones
  • Engage with vendors to resolve problems with existing infrastructure and to discuss roadmaps and new technologies for evaluation
  • Foster a supportive work environment and maintain open, productive interactions within the team and across organizations
  • Build and maintain cross-organizational contacts to facilitate execution of work

Job Requirements

  • B.S. in Computer Science or a related field (e.g. Computer Engineering, Information Systems), or equivalent skills and work experience
  • Excellent technical, analytical, and communication skills
  • A minimum of 3 years of hands-on Linux experience (e.g. RHEL, CentOS) and production infrastructure support (e.g. networking, storage, monitoring, compute)
  • Experience in system administration and technical support (e.g. installation, configuration, maintenance, upgrade, retirement, problem resolution)
  • Experience in HPC technologies such as parallel/distributed file systems (e.g. Lustre, GPFS), high-speed interconnect fabrics (e.g. InfiniBand, Omni-Path), and HPC batch scheduling software suites (e.g. PBS Pro, Slurm)
  • Proficiency in technical writing and documentation of solutions
  • Works well in a team environment
  • Self-motivated

Preferred Knowledge/Skills/Abilities

  • Strong IT skills in infrastructure and applications
  • Experience with supporting large scale production environments
  • Experience in implementing changes and security controls in a global framework
  • Understanding of data center operations fundamentals in networking, cooling, and power
  • Knowledge and experience with installing/compiling vendor and open source software
  • Knowledge and experience with application/infrastructure deployment and support in one or more of the major cloud environments


  

 


ExxonMobil is an Equal Opportunity Employer.  All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, age, sexual orientation, gender identity, national origin, citizenship status, protected veteran status, genetic information, or physical or mental disability.


Nearest Major Market: Houston

Job Segment: Engineer, Systems Engineer, Information Systems, Computer Science, Cloud, Engineering, Technology