OpenAI

Data Engineer

San Francisco | Full-time | Mid-level

About this role

OpenAI is hiring a Data Engineer to design and maintain critical data pipelines and warehouse infrastructure that power analytics, safety systems, and model training. You'll collaborate across teams to build scalable, fault-tolerant systems that support business decisions and AI research.

What you'll do

  • Design and manage data pipelines for ingesting user event data into the data warehouse
  • Develop canonical datasets to track product metrics like user growth, engagement, and revenue
  • Collaborate with Infrastructure, Data Science, Product, Marketing, Finance, and Research teams on data solutions
  • Build robust, fault-tolerant systems for data ingestion and processing
  • Participate in data architecture decisions and technical planning
  • Ensure data security, integrity, and compliance with industry standards

What they're looking for

  • Data pipeline design and development
  • Python, Scala, or Java
  • Apache Spark (writing, debugging, optimizing)
  • Distributed processing frameworks (Hadoop, Flink)
  • ETL schedulers (Airflow, Dagster, Prefect)
  • Distributed storage systems (HDFS, S3)
  • Data warehouse management
  • SQL and data modeling

Benefits

  • Work on AI systems with significant real-world impact
  • Collaborate with the ChatGPT research team
  • San Francisco HQ location
  • Relocation assistance provided
  • Equal opportunity employer