Iambic Therapeutics
Machine Learning Scientist — Agentic data pipelines
Boston Office (Remote) | $148k–$210k | Full-time | Mid-level | Added 2 days ago
About this role
Join Iambic Therapeutics as a Machine Learning Scientist to design and build AI-powered data pipelines that acquire, clean, and quality-control biomedical datasets for training the Enchant multimodal transformer model. You'll develop LLM-based agents to automate data processing across diverse sources and modalities, combining software engineering rigor with biomedical domain knowledge.
What you'll do
- Design and build agentic systems for automated data acquisition from biomedical sources
- Develop LLM-based pipelines for data cleaning, normalization, and formatting across multiple modalities
- Implement automated quality-control workflows to detect anomalies and enforce data standards
- Evaluate and optimize agent architectures, prompting strategies, and tool-use patterns
- Collaborate with ML scientists to translate data requirements into scalable systems
- Monitor production pipelines, diagnose failures, and document data provenance
What they're looking for
- Python software engineering and production-quality code
- LLM APIs (Claude, GPT) and agentic patterns (tool use, orchestration)
- Biomedical and chemical data formats (PDB, UniProt, ChEMBL, FASTA)
- ETL design and data validation at scale
- Data engineering fundamentals for structured and unstructured data
- Agent orchestration frameworks
- Cloud infrastructure and workflow orchestration (AWS, Docker, Kubernetes)
- Multimodal biomedical data (molecules, proteins, genomics, clinical records)
Benefits
- Competitive salary and equity
- Company-paid healthcare and flexible spending accounts
- 401(k) matching and voluntary life insurance
- Uncapped vacation
- State-of-the-art San Diego facility with onsite gym and dining
- Remote work options (US or UK) with on-site availability in Bristol and Boston