Bland

Machine Learning Researcher, Multimodal LLMs

San Francisco · $140k–$250k · Full-time · Mid-level · Added 2 days ago

About this role

Bland is seeking an ML researcher to develop next-generation multimodal LLMs that combine speech, text, and real-time reasoning for AI phone agents. You'll move research ideas into production systems that handle millions of daily calls, focusing on natural conversational interactions that integrate streaming audio and tool execution.

What you'll do

  • Develop and improve multimodal LLM architectures combining speech, text, tools, and real-time reasoning
  • Design and run rapid experiments from hypothesis to conclusion within days
  • Translate research advances into production systems serving enterprise-scale voice interactions
  • Optimize for latency, correctness, and natural conversational behavior
  • Build datasets and fine-tuning approaches for speech-language integration
  • Drive models from the research phase through deployment and user-facing improvements

What they're looking for

  • Large language models and multimodal model development
  • Speech-language systems or neural audio codecs
  • Prompting, fine-tuning, and model alignment techniques
  • Experimental design and rapid iteration methodology
  • Real-time voice systems or conversational AI
  • Tool-using agents and agent frameworks
  • Systems thinking and full-stack model deployment
  • Product intuition for user experience optimization

Benefits

  • Competitive salary: $180,000 – $260,000
  • Meaningful equity
  • Full healthcare, dental, and vision coverage
  • Office in Jackson Square, San Francisco (or remote)
  • High autonomy and high-impact work
  • Opportunity to shape foundational AI voice technology