Interested in advancing how agents and LLMs learn from feedback in realistic environments?
You’ll be joining a research-driven AI company building reinforcement learning simulation environments for agents and large language models, with a focus on post-training, evaluation, and scalable supervision.
Their tools are already used in production by leading AI labs and enterprises, and they are growing fast to meet demand.
As a Research Scientist, you'll work hands-on on fundamental problems spanning LLM post-training, RL environments, and agentic evaluation. Your work will shape core methods and benchmarks, and you'll see your research deployed in production systems. The team actively publishes and collaborates with external research labs, with recent work appearing at ACL and NeurIPS.
Conduct research on LLM post-training methods (RLHF, RLAIF, RLVR)
Design and build realistic RL simulation environments for agents
Develop agentic evaluation and supervision frameworks
Create and maintain benchmarks for emerging AI capabilities
Collaborate with engineers to take research from idea to deployed systems
Experience in applied research in reinforcement learning, LLM post-training, or agent-based systems
Strong understanding of transformer architectures and LLM fine-tuning
Ability to translate research ideas into working, production-ready systems
Publications at top-tier venues (NeurIPS, ICML, ACL, EMNLP)
Experience working on evaluation, safety, or oversight for advanced AI systems
Prior work on large-scale training or simulation environments
SF-based. Compensation up to $300k base salary (flexible, depending on experience) plus equity, unlimited PTO, and benefits.
Interested in working on the foundations of AI training, evaluation, and safety, while publishing high-quality research that ships?
All applications will receive a response.