OpenAI

Data Engineer, Scaling Analytics

San Francisco (Remote) · Full-time · Mid-level

About this role

OpenAI's Scaling Analytics team seeks a Data Engineer to build and maintain the data infrastructure supporting global infrastructure operations, capacity planning, and supply chain decisions. You'll design scalable pipelines and reporting systems that transform operational data into actionable insights for hardware deployment, site execution, and strategic planning at unprecedented scale.

What you'll do

  • Design and maintain production data pipelines for infrastructure deployment, operations, and capacity planning across global data centers
  • Develop trusted datasets and reporting systems for hardware inventory, deployment status, and operational performance visibility
  • Partner with cross-functional teams to define metrics, establish data standards, and improve decision-making across infrastructure organizations
  • Create scalable data models supporting consistent reporting from multiple operational sources
  • Implement data quality checks, monitoring, and governance practices for critical infrastructure datasets
  • Support executive reporting, forecasting, and strategic planning through reliable analytical foundations

What they're looking for

  • SQL and scalable data modeling
  • Python or similar data engineering programming language
  • Modern data warehouses (Snowflake, BigQuery, Redshift)
  • Orchestration frameworks (Airflow, Dagster)
  • ETL/ELT workflow design and optimization
  • Data quality and observability implementation
  • Cross-functional stakeholder collaboration
  • Large-scale operational telemetry systems