Senior Software Engineer - Infrastructure Engineering

Posted 2 Days Ago
2 Locations
Remote
160K-200K Annually
Senior level
Artificial Intelligence • Cloud • Hardware • Machine Learning • Other • Software • Infrastructure as a Service (IaaS)
We build infrastructure for machine learning
The Role
Design and implement systems for infrastructure management, automate tooling, and drive new infrastructure rollouts while collaborating with cross-functional teams.

Voltage Park is seeking a Senior Software Engineer for our Infrastructure Engineering team. Our team is responsible for building automation, tooling, and API-driven systems to bridge the gap between our physical infrastructure and the systems that our customers depend on for AI/ML training, inference, and HPC workloads at scale.

In this role, you’ll design and implement systems that enable humans and software to interact programmatically with thousands of bare-metal servers, storage clusters, and high-performance networks. You will work closely with teams across Voltage Park to drive new infrastructure rollouts and improve the lifecycle management of existing resources.

This is a fully remote position, although candidates must be based in the continental United States. Unfortunately, we are unable to provide sponsorship for this role.

Responsibilities:
  • Design, build, and maintain tools, APIs, and automation frameworks to manage physical infrastructure at scale.

  • Build and extend systems for server lifecycle management.

  • Implement observability, telemetry, and logging systems that enable visibility and insights into the health of our hardware.

  • Collaborate with our Network, Infrastructure Operations, Platform Engineering, and Customer Experience teams to define requirements for and build new tools.

  • Participate in architectural discussions to help define the direction of infrastructure engineering at Voltage Park.

  • Write clear design documents and technical documentation.

Qualifications:
  • 8+ years of professional experience in software engineering, infrastructure engineering, or related fields.

  • Strong experience with Linux in production environments.

  • Proficiency in Python or similar object-oriented programming languages.

  • Familiarity with containerization and orchestration concepts.

  • Understanding of HPC infrastructure fundamentals, bare-metal provisioning, and out-of-band management.

  • Experience balancing pragmatic shipping with good long-term architecture.

  • Comfortable navigating ambiguity.

  • Strong written and verbal communication skills.

Ideal Experience:
  • Experience with bare-metal hardware troubleshooting and provisioning; extra points for working with Dell hardware.

  • Experience with GPU servers, either in bare-metal form or under virtualization.

  • Deep experience with network switches, routers, and firewalls, particularly SONiC switches, Palo Alto firewalls, and Juniper Networks hardware.

  • Experience with VAST storage systems.

Culture:
  • Enjoy collaborating with a growing, motivated team focused on execution.

  • Comfortable operating with a high degree of autonomy and able to independently prioritize tasks aligning with company objectives.

  • Possess a breadth of knowledge in your domain while also embracing the opportunity to take on diverse responsibilities.

  • Value the importance of clear communication and documentation in driving success.

Voltage Park is an equal opportunity employer and makes employment decisions on the basis of merit. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected under federal, state, or local law. If you require an accommodation during the job application process, please notify your recruiter.

Compensation Range: $160K - $200K


Top Skills

Containerization
HPC Infrastructure
Linux
Orchestration
Python

The Company
HQ: San Francisco, CA
51 Employees
Hybrid Workplace
Year Founded: 2023

What We Do

The market for cutting-edge ML compute is broken. Startups, researchers and even big AI labs are scrambling to buy or rent access to the latest chips for ML training. But demand far outstrips supply, and what’s available is only accessible to the well-resourced, placing an artificial damper on innovation.

To solve this challenge, we've launched Voltage Park, and we’re on a mission to make machine learning infrastructure accessible to all, from large enterprises and research universities, to seed-stage startups and nonprofits.

With around 24,000 NVIDIA H100 GPUs, the Voltage Park cloud is one of the most powerful collections of cutting-edge ML compute in the world. Our clusters consist of 80GB H100 SXM5 GPUs fully interconnected with 3.2T InfiniBand.

Why Work With Us

You’ll play a pivotal role as a member of the founding team that will change the face of machine learning infrastructure. As an early hire, you’ll have outsize influence in defining the company’s culture and ensuring mission success.

Voltage Park Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Typical time on-site: Flexible
HQ: San Francisco, CA
