OpenAI
Data Engineer, Scaling Analytics
San Francisco (Remote) | Full-time | Mid-level
About this role
OpenAI's Scaling Analytics team seeks a Data Engineer to build and maintain the data infrastructure supporting global infrastructure operations, capacity planning, and supply chain decisions. You'll design scalable pipelines and reporting systems that transform operational data into actionable insights for hardware deployment, site execution, and strategic planning at unprecedented scale.
What you'll do
- Design and maintain production data pipelines for infrastructure deployment, operations, and capacity planning across global data centers
- Develop trusted datasets and reporting systems for hardware inventory, deployment status, and operational performance visibility
- Partner with cross-functional teams to define metrics, establish data standards, and improve decision-making across infrastructure organizations
- Create scalable data models supporting consistent reporting from multiple operational sources
- Implement data quality checks, monitoring, and governance practices for critical infrastructure datasets
- Support executive reporting, forecasting, and strategic planning through reliable analytical foundations
What they're looking for
- SQL and scalable data modeling
- Python or similar data engineering programming language
- Modern data warehouses (Snowflake, BigQuery, Redshift)
- Orchestration frameworks (Airflow, Dagster)
- ETL/ELT workflow design and optimization
- Data quality and observability implementation
- Cross-functional stakeholder collaboration
- Large-scale operational telemetry systems