Data Center Engineer
San Jose, CA, USA
About Etched
Etched is building hardware for frontier intelligence. We co-design chips, racks, software, and manufacturing to deliver best-in-class throughput and latency across both prefill and decode workloads. Our first products are heavily focused on inference. Backed by hundreds of millions from top-tier investors and staffed by leading engineers, Etched is redefining the infrastructure layer for the fastest-growing industry in history.
Job Summary
Deploying next-generation inference hardware at scale requires more than great chips - it demands world-class physical infrastructure. As a Data Center Engineer at Etched, you will own the end-to-end lifecycle of our data center and hardware lab environments: from facility selection and rack design, through power distribution, networking layout, cabling, hardware management, and day-to-day operations, to long-term capacity planning. You'll work directly with the hardware, platform, and software teams to bring Sohu systems online faster, keep them running harder, and push the limits of what dense, high-power AI inference design and manufacturing infrastructure can do.
This is not a traditional data center operations role. We expect you to treat data center engineering with the same rigor and craftsmanship we apply to our chip design - thinking from first principles about power density, thermal constraints, network topology, and physical security. You'll be making real architectural decisions that directly shape how our products are engineered, manufactured, and delivered to customers.
You will be on the ground in our co-location facilities and lab environments, working hands-on with custom server platforms and high-speed networking. You'll drive the processes, tooling, and vendor relationships that allow us to scale our infrastructure as fast as our product roadmap demands.
Key Responsibilities
Own rack layout, capacity planning, power distribution, network design, cabling, and physical deployment of Etched high-performance computing platforms across data center and hardware lab environments.
Design and manage power distribution and redundancy architectures - from utility feeds and PDUs to per-rack power and cooling budgets - for high-density AI compute deployments pushing 90 kW per rack.
Collaborate with physical infrastructure and facilities teams, as well as external vendors, to build and manage highly sophisticated hardware labs used for bring-up, EVT, and customer demos.
Partner with co-location vendors and internal teams to evaluate sites, negotiate contracts, and enforce SLAs around power, cooling, physical security, and network connectivity.
Architect and implement high-speed networking infrastructure (100G/200G/400G Ethernet) connecting compute nodes, storage, and upstream peering, in coordination with network and platform engineering teams.
Develop and maintain asset management systems, rack diagrams, and change control processes to ensure full visibility into physical infrastructure state at all times.
Build and operate monitoring and alerting for environmental health (temperature, humidity, power draw, UPS state) and drive rapid response to hardware and facility incidents.
Define and execute preventive maintenance schedules and hardware lifecycle processes, including RMA coordination with vendors and on-site repair.
Lead capacity planning cycles in lockstep with the hardware roadmap, forecasting power, space, and network needs 6–18 months out and translating those forecasts into facility agreements and procurement plans.
Establish and enforce physical security procedures, access control policies, and audit trails across all data center sites.
You may be a good fit if you
Have 5+ years of hands-on data center engineering or operations experience, with direct responsibility for physical hardware deployment, power architecture, and facility management.
Have designed and deployed high-density compute environments (20 kW/rack and above) and have first-hand experience managing the thermal and power challenges that come with them.
Are deeply comfortable with structured cabling, fiber and copper plant management, and high-speed networking hardware at scale.
Can read and interpret electrical one-line diagrams, raised-floor and hot-aisle/cold-aisle plans, and co-location facility documentation without assistance.
Have built or operated monitoring and DCIM tooling and treat infrastructure visibility as a non-negotiable property of any environment you own.
Are a strong vendor manager - you know how to write an RFP, run a competitive evaluation, and hold a co-lo or hardware vendor to their commitments.
Thrive in fast-moving environments where requirements shift quickly and you need to make confident decisions with incomplete information.
Are driven by ownership and take pride in environments that are clean, documented, and operationally excellent - not just ones that are "up."
Strong candidates may also have experience with
Physical deployment and bring-up of custom or semi-custom server platforms, including early-stage hardware that doesn't come with vendor support.
Liquid cooling systems (direct liquid cooling, rear-door heat exchangers, or immersion cooling) and the facility requirements they impose.
AI or HPC cluster environments - GPU or ASIC clusters, high-radix switch fabrics, RDMA networking.
Scripting and automation (Python, Bash, Ansible) for asset tracking, environmental monitoring integration, or operational workflows.
Working within a semiconductor or hardware startup, where roadmaps compress and infrastructure needs to keep pace with silicon.
Benefits
Medical, dental, and vision packages with generous premium coverage
$500 per month credit for waiving medical benefits
Housing subsidy of $2k per month for those living within walking distance of the office
Relocation support for those moving to San Jose (Santana Row)
Various wellness benefits covering fitness, mental health, and more
Daily lunch and dinner in our office
Unlimited compute budget subject to ROI justification
How We're Different
Etched believes in the Bitter Lesson. We are the first inference-focused frontier AI system, betting early on transformer and transformer-like architectures and on increasing model sizes. Our addressable market is the entirety of inference, unlike many of our competitors.
We are a fully in-person team in San Jose (Santana Row), and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both and work across disciplines as needed.