Systems Engineer

Job Description

Title: Systems Engineer III
Location: Redmond, WA
Duration: 12-month contract, with possible extension or conversion to hire.

Reality Labs at Meta is seeking a contractor for the role of Systems Engineer in AR/VR audio to work in our research lab. The job entails supporting the audio research team in achieving their research goals by performing systems setup and job management for high-performance computing (HPC) clusters and the associated technology stacks being built for deployment on those clusters.

Responsibilities:
• Manage, monitor, and ensure proper functioning and interoperability of existing systems and infrastructure.
• Write, maintain, standardize, and test custom scripts to automate processes and reduce the need for human intervention.
• Explore the potential expansion of the existing infrastructure.
• Evaluate, identify, and propose automated solutions to optimize workflows.
• Oversee and potentially contribute to the development and/or installation of new hardware and software.
• Write unambiguous, concise, and well-structured documentation and reports.
• Help ensure high availability and high quality of technical solutions.
• Collaborate with different teams (research, software, cluster management) to achieve optimal performance and stability.

Top 3 must-have HARD skills:
• HPC cluster management (e.g., InfiniBand)
• Scripting in Python/Bash/PowerShell or light programming in C/C++
• Containers (Kubernetes, Docker)

Good-to-have skills:
• NVIDIA DGX experience
• DevOps
• Networking

Minimum Qualifications:
• A BS degree in Computer Science, Engineering, or a similar field, or equivalent work experience.
• 2+ years’ experience in programming/scripting (e.g., Python, MATLAB, C/C++)
• 2+ years’ experience in using command shells for both UNIX/Linux-based and Windows systems (e.g., bash, PowerShell)
• 2+ years’ experience with virtualization and software containerization (e.g., Kubernetes, VMware)
• 1+ years’ experience in working with HPC clusters and automation software.

Preferred Qualifications:
• 2+ years’ experience with scientific computing and NVIDIA software/hardware stacks (e.g., DGX machines)
• 2+ years of professional experience in systems engineering and/or DevOps engineering.
• Proven experience and/or certifications in working with IT tools/concepts such as networking, information security, storage solutions, distributed version-control systems, and cloud services.
• Proven experience in configuring, deploying, and working with cloud computing (e.g., AWS)
• Proven experience in creating, analyzing, and repairing large-scale distributed systems.

Years of experience: 3 years desired

Degrees or certifications required:
• BS in CS/Engineering or equivalent preferred.

Candidates should be able to commute to Redmond, WA once offices reopen.