Bland

Machine Learning Researcher, Multimodal LLMs

San Francisco · $140k–$250k · Full-time · Mid-level · Added 2 days ago

About this role

Bland is seeking an ML researcher to develop next-generation multimodal LLMs that combine speech, text, and real-time reasoning for AI phone agents. You'll take research ideas into production systems that handle millions of calls a day, with a focus on natural conversational interactions that integrate streaming audio and tool execution.

What you'll do

Develop and improve multimodal LLM architectures combining speech, text, tools, and real-time reasoning
Design and run rapid experiments from hypothesis to conclusion within days
Translate research advances into production systems serving enterprise-scale voice interactions
Optimize for latency, correctness, and natural conversational behavior
Build datasets and fine-tuning approaches for speech-language integration
Drive models from research phase through deployment and user-facing improvements

What they're looking for

Large language models and multimodal model development
Speech-language systems or neural audio codecs
Prompting, fine-tuning, and model alignment techniques
Experimental design and rapid iteration methodology
Real-time voice systems or conversational AI
Tool-using agents and agent frameworks
Systems thinking and full-stack model deployment
Product intuition for user experience optimization

Benefits

Competitive salary: $180,000 – $260,000
Meaningful equity
Full healthcare, dental, and vision coverage
Office in Jackson Square, San Francisco (or remote)
High autonomy and high-impact work
Opportunity to shape foundational AI voice technology
