Student Researcher [Seed LLM - Code Generation] 2026 Start (PhD)

ByteDance

San Jose, CA
Internship
Intern
135k-135k
5 days ago

Job Description

About the Role

The Seed LLM Code Generation Team at ByteDance is dedicated to enhancing the model's coding capabilities and building a bridge for AI to interact with the digital world. The team focuses on large-scale automated synthesis of coding problems, reinforcement learning for code agents, and constructing high-quality code pre-training datasets. PhD internships at ByteDance offer students the opportunity to contribute to products and research, engage in hands-on learning, community-building, and collaboration with industry experts. The internship is designed to blend practical experience with exposure to emerging technologies and organizational growth.

Key Responsibilities

  • Develop methods for code generation and editing using large language models, including improving performance on tasks such as synthesis, repair, documentation, and test generation.
  • Conduct research on self-evolving agents that can learn to write, edit, and optimize code over time with minimal supervision.
  • Explore multi-agent reinforcement learning settings where agents collaborate, compete, or communicate to solve complex programming tasks.
  • Investigate instruction tuning, multi-turn prompting, retrieval-augmented generation, and reinforcement learning for program synthesis.
  • Build benchmarks and tools for evaluating model performance in code understanding, collaborative reasoning, and long-horizon programming scenarios.

Requirements

  • Currently pursuing a PhD in Computer Science, Machine Learning, Programming Systems, or a related field.
  • Research experience in code generation, program synthesis, or LLMs for software engineering.
  • First-author publications in top venues such as NeurIPS, ICLR, ICML, PLDI, or OOPSLA.
  • Familiarity with LLM toolchains (e.g., HuggingFace, PyTorch) and structured code/data handling.

Nice to Have

  • Experience with reinforcement learning or agent-based learning in structured environments.
  • Background in multi-agent coordination, curriculum learning, or self-improving systems.
  • Understanding of software reasoning tasks such as static analysis, refactoring, or automated debugging.
  • Familiarity with open code benchmarks (e.g., HumanEval, MBPP, CodeContests, SWE-bench).

Benefits & Perks

  • Interns have day-one access to health insurance, life insurance, wellbeing benefits, and more.
  • Interns receive 10 paid holidays per year and paid sick time (56 hours if hired in the first half of the year, 40 hours if hired in the second half).
  • Interns who are not working 100% remotely may be eligible for housing allowance.
  • The hourly rate for this position is $65.

Working at ByteDance

ByteDance's mission is to inspire creativity and enrich life through innovative products that help people express themselves, discover, and connect. The company values curiosity, humility, and impact, fostering a culture of continuous iteration and an 'Always Day 1' mindset. ByteDance is committed to diversity, inclusion, and creating an environment that reflects the communities it serves. It emphasizes collaboration, innovation, and making meaningful breakthroughs for its employees, users, and society.

Job Details

Posted At: Aug 1, 2025
Job Category: Data Science
Salary: 135k-135k
Job Type: Internship
Work Mode: Onsite
Experience: Intern

About ByteDance

Website

bytedance.com

Location

San Jose, CA

Industry

All Other Professional, Scientific, and Technical Services
