Remote (United States)

Senior Research Engineer, Post-training & Evaluation

Reddit is a community of communities. It’s built on shared interests, passion, and trust, and is home to the most open and authentic conversations on the internet.

Company

Compensation

Not listed

Schedule

Full-Time

Quick facts

Location: Remote (United States), US

Work style: Remote

Industry: Artificial Intelligence

Recruiter: Reddit talent team

Posted: 2026-03-20

Source: greenhouse-reddit

Confidence: 90/100

Why it fits

Senior Research Engineer, Post-training & Evaluation is a high-signal remote role open across the United States, and it is most realistic for United States residents.

Company snapshot

Reddit builds large-scale consumer, ads, and platform systems with hiring across mobile, backend, machine learning, and product engineering.

Eligibility

United States residents
United States citizens
Candidates already authorized to work in the United States

Technical signals

ai, llm, machine-learning, research, python, data, api, mobile, backend, platform

Subscriber advantage

Source signal

Verified public source

90/100 confidence from greenhouse-reddit.

Eligibility

Clear market access

United States residents

Comp signal

Compensation hidden

Plan to ask for compensation range early in the process.

Role overview

What this role actually needs.

Senior Research Engineer, Post-training & Evaluation at Reddit, remote within the United States. UpJobz keeps this listing high-signal for applicants targeting serious high-tech roles across the United States, Canada, and Mexico.

Responsibilities

Day-to-day expectations

A clear list of the work this role is designed to cover.

Architect and maintain the "Reddit Benchmark" evaluation suite: A comprehensive harness that rigorously tests model capabilities across Safety, Reasoning, and Reddit-specific knowledge (slang, norms).
Build scalable SFT (Supervised Fine-Tuning) pipelines: Implement efficient, distributed training loops for instruction tuning, converting raw base models into helpful assistants.
Develop Model-as-a-Judge systems: Engineer automated evaluation pipelines using strong models (e.g., GPT-5, Nova, Claude) to grade the outputs of our internal models, enabling rapid iteration cycles.
Execute Synthetic Data generation strategies: Create and curate high-quality instruction sets to improve model generalization where human data is scarce.
Collaborate with Safety Engineering: Translate high-level safety policies into concrete evaluation metrics and unit tests that run in our CI/CD pipelines.
Debug post-training instability: Dive deep into loss curves and evaluation logs to identify when fine-tuning is causing alignment tax or capability degradation.

Requirements

What a strong candidate brings

This keeps the job page specific, readable, and easier to match.

4+ years of professional experience in machine learning engineering, with a focus on LLM fine-tuning or evaluation.
Fluency in Python and PyTorch, with experience using libraries like Hugging Face Transformers, vLLM, or lm-eval-harness.
Deep understanding of Instruction Tuning (SFT) and how data quality impacts model behavior.
Experience building Evaluation Pipelines: You know the difference between MMLU and GSM8K, and how to build a custom domain-specific benchmark.
Familiarity with distributed training (FSDP/DeepSpeed) for fine-tuning jobs.
Strong data engineering skills for curating and cleaning instruction datasets.

Benefits

Why people would want this job

Benefits help searchers understand whether the role is a real fit before they apply.

Comprehensive Healthcare Benefits and Income Replacement Programs
401(k) with Employer Match
Global Benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
Family Planning Support
Gender-Affirming Care
Mental Health & Coaching Benefits

Browse similar jobs

Remote (United States)

AI Deployment Engineer- ChatGPT Ecosystem

OpenAI · $200.8K - $345K

Remote (United States)

AI Deployment Engineer- Codex

OpenAI · $157.6K - $222.4K

Remote (United States)

AI Deployment Manager - US Remote

OpenAI · $197K - $278K

Subscriber playbook

Turn this listing into an application plan.

This is the first pass at the premium UpJobz layer: a fast brief that helps serious applicants move with more clarity.

Next moves

Tailor your resume around AI and LLM work instead of sending a generic application.
Use the first two bullets of your application to connect your background directly to the Senior Research Engineer, Post-training & Evaluation role.
Open the role quickly if it fits and bookmark three similar jobs before you leave the page.

Interview themes

Artificial Intelligence, Remote, ai, llm, machine-learning, research

Watchouts

Compensation is hidden, so get range clarity in the first recruiter conversation.
State your United States residency up front so the recruiter does not have to infer it.
Lead with distributed collaboration, async delivery, and timezone discipline.
