Etched.ai

Performance Modeling Engineer

San Jose · $175k–$275k · Full-time · Mid-level · Added today

About this role

Etched seeks a Performance Modeling Engineer to develop analytical models and analyze deep learning workloads on custom inference hardware. You'll identify architectural bottlenecks, drive hardware-software co-optimization, and inform next-generation chip design decisions for an AI infrastructure startup backed by top-tier investors.

What you'll do

  • Build performance models and projections across varying workloads and system configurations
  • Profile deep learning workloads on hardware to identify micro-architectural bottlenecks
  • Drive hardware/software co-optimization by analyzing architectural features for performance gains
  • Validate performance models against real systems and silicon through regression testing
  • Pathfind architectural decisions during design and proof-of-concept phases
  • Analyze inference serving workloads and system-level performance implications

What they're looking for

  • Performance modeling and analysis (analytical or simulation-based)
  • Computer architecture and micro-architecture knowledge
  • Deep learning workload profiling on accelerators
  • Software engineering fundamentals
  • GPU architectures and CUDA programming
  • Transformer model inference optimization
  • Architecture simulators (gem5, trace-driven tools)
  • ASIC/FPGA/CGRA accelerator development

Benefits

  • Medical, dental, and vision coverage with $500/month credit option
  • $2,000/month housing subsidy for those living within walking distance of the office
  • Relocation support to San Jose
  • Wellness benefits including fitness and mental health
  • Daily lunch and dinner provided
  • Unlimited compute budget, subject to ROI justification

Likely interview questions

  • Walk us through your experience building performance models—were they analytical, simulation-based, or both? How did you validate them against real hardware?
  • Describe a time you profiled a deep learning workload on an accelerator (GPU, TPU, ASIC, etc.) and identified a micro-architectural bottleneck. What did you do with that insight?