Advanced Research Computing Solutions Engineer

Rice University
Houston TX
30+ days ago
rice.edu

Job Description

Special Instructions to Applicants: All interested applicants should attach a cover letter and a resume in the Supporting Documents section of the application. We suggest the documents be in a PDF format to avoid formatting issues.

Rice University is growing! Under President DesRoches, our research footprint is expanding, and we are hiring 200 new faculty. The Office of Research has launched several new research institutes for materials science, biology, sustainability, digital health, bioengineering, and more. Our research computing & data services are growing rapidly to support the needs of our research community, and to provide leading technologies to accelerate and advance the world-class research and scholarship that is underway across the University.

The Center for Research Computing (CRC), within the Office of IT, enables faculty and researchers to effectively use on- and off-premises resources and services, including (1) shared high-performance computing systems, (2) VM and cloud computing, (3) data storage infrastructure, (4) a wide range of scientific instruments, and (5) broader cyberinfrastructure and services. The CRC currently manages three HPC systems and three research data storage services, including a general-purpose HPC/HTC cluster and two small GPU clusters.

Position Summary

We are seeking an experienced HPC Systems Administrator to join our team. Reporting to the Director of the Center for Research Computing, the Advanced Research Computing Solutions Engineer works with the HPC team to perform specialized functions for systems installation, management, problem-solving, and solution design, and serves as primary back-up for the lead HPC systems engineer. Additional technical functions include the implementation and support of HPC research environments, including databases, containers, HPC & hybrid/cloud compute and storage services, and security and access controls. The incumbent will participate on the HPC Systems & User-facing team to proactively and reactively identify and solve operational and software problems on our HPC systems, and will collaborate with Rice Information Security to properly secure the environment and any related information services, whether cloud-based or on-premises.

Research using commercial and federally funded cloud resources is increasingly important, and responsibilities for this role will include working with CRC teams and faculty to facilitate best practices for cloud-based research computing. Additionally, while this is primarily a systems-facing role, the incumbent will participate in the training of scholars and students on campus for the use of the HPC and research computing facilities to support research, education, and outreach to industrial and governmental partners.

The ideal candidate has broad experience with managing HPC systems in research environments and the ability to work with a wide range of scholars to support the selection and use of cost-effective environments in which to carry out their research. This includes supporting research in environments including, but not limited to, cloud computing, regional or national data repositories, supercomputers, and other federal and institutional research computing resources.

Workplace Requirements

Working onsite is required for this job. After the 6-month probationary period, the incumbent may be allowed to work up to 2 days remotely, with supervisor approval, provided they remain in the local area. Per Rice policy 440, work arrangements may be subject to change.

This is a full-time, benefits-eligible position, and the proposed salary range is $108,000 to $118,000 annually, depending on qualifications and experience. Exempt (salaried) positions under FLSA are not eligible for overtime.

Minimum Requirements:

  • Bachelor's Degree
    • In lieu of the education requirement, additional related experience, above and beyond what is required, on an equivalent year-for-year basis may be substituted.
  • 3+ years of experience in HPC systems integration and management and supporting researchers with HPC and/or cloud computing solutions.
    • In lieu of the experience requirement, additional related education, above and beyond what is required, on an equivalent year-for-year basis may be substituted.
  • Skills:
    • Proven ability to develop appropriate plans to meet computing needs
    • Proven ability to work on large/complex system deployment projects in a team environment
    • Proficient level of understanding in the architecture, design, and development of High Performance Computing solutions
    • Advanced knowledge of security trends and best practices
    • Familiarity with generally accepted principles, patterns, and practices of domain-driven design, test-driven design, and continuous integration
    • Able to use critical thinking to provide support and troubleshoot systems
    • Possess attention to detail, organizational skills, and excellent time management skills
    • Strong communication skills, both written and oral

Preferences

  • Master’s and/or Ph.D. in computer science or another STEM discipline.
  • 5+ years of experience developing, installing, managing, and provisioning large-scale High Performance and High Throughput Computing environments.
  • 2+ years developing Cloud-based solutions for research projects, managing the migration of projects from local HPC environments to commercial or academic cloud platforms
  • Minimum of two years’ experience in Linux systems administration.
  • Experience with GPUs and GPU-based clusters
  • Ability to optimize workflows and job scripts for optimal use of HPC systems.
  • Experience in a university or similar research-oriented environment.
  • Familiarity with schedulers such as SLURM (Simple Linux Utility for Resource Management).
  • Familiarity with the design of HPC systems.
  • Experience with implementing and maintaining system security strategies, policies, and procedures
  • Advanced knowledge of parallel programming with OpenMP, MPI, and CUDA.
  • Familiarity with virtualization environments for running background research applications.
  • Proven experience working with Big Data applications
  • Experience providing user support and training for High-Performance Computing (HPC) environments

Essential Functions

  • Administer and program high performance and research computing environments that may include cloud-based systems, as well as local physical and virtual systems.
  • Provide system maintenance and troubleshooting, primarily for Linux operating systems, leveraging industry standards and best practices
  • Utilize monitoring and reporting tools on system health and status to inform CRC services
  • Installs and maintains operating systems, utilities, and applications software on computing systems
  • Works collaboratively to resolve complex system issues that impact the integrity of user data and systems
  • Engages in long-term planning about systems development and integration
  • Performs capacity planning for system configuration, software services, network services, load distribution, and service interrelationships among computer systems
  • Acts as a technical expert or lead for local computer system administration
  • Manages vendor relationships and cost-effective hardware and software maintenance agreements with vendors
  • Actively foster a collaborative work environment, promoting teamwork and open communication across departments.
  • Performs all other duties as assigned

Additional Functions

  • May be required to work extended hours (evenings and weekends) in emergency situations or to restore systems.

Rice University HR | Benefits: https://knowledgecafe.rice.edu/benefits

Rice Mission and Values: Mission and Values | Rice University

Rice University is an Equal Opportunity Employer committed to diversity at all levels and considers for employment qualified applicants without regard to race, color, religion, age, sex, sexual orientation, gender identity, national or ethnic origin, genetic information, disability, or protected veteran status.


Boasting a 300-acre tree-lined campus in Houston, Rice University is ranked among the nation’s top 20 universities by U.S. News & World Report. Rice has a 6-to-1 undergraduate student-to-faculty ratio and a residential college system, which supports students intellectually, emotionally and culturally through social events, intramural sports, student plays, lecture series, courses and student government. Developing close-knit, diverse college communities is a strong campus tradition, which is why Rice is highly ranked for best quality of life and best value among private universities.

Visit Original Source:

http://www.indeed.com/viewjob