OpenAI
Software Engineer, Data Acquisition
San Francisco · Full-time · Mid-level
About this role
Join OpenAI's Data Acquisition team to design and maintain large-scale data collection infrastructure supporting model training. You'll lead engineering projects in web crawling, data ingestion, and search while collaborating across teams to ensure compliance and system reliability.
What you'll do
- Lead data acquisition engineering projects including web crawling, data ingestion, and search systems
- Build and deploy highly scalable distributed systems to handle petabyte-scale data volumes
- Design data indexing and search algorithms for efficient information retrieval
- Develop backend services for data storage and synchronization using key-value databases
- Collaborate with Data Processing, Architecture, and Scaling teams on data flow and operability
- Work with the legal team on compliance and data privacy matters
What they're looking for
- Distributed systems architecture and design
- Kubernetes and Infrastructure-as-Code
- Large-scale web crawling
- Data processing and indexing
- Backend service development
- Key-value database management
- System experimentation and analysis
- Cross-team collaboration and communication