MLSys 2026 Career Opportunities

Here we highlight career opportunities submitted by our Exhibitors, and other top industry, academic, and non-profit leaders. We would like to thank each of our exhibitors for supporting MLSys 2026.

Team Description:

The AI Foundations team is at the center of bringing our vision for AI at Capital One to life. Our work touches every aspect of the research life cycle, from partnering with academia to building production systems. We work with product, technology, and business leaders to apply the state of the art in AI to our business.

In this role, you will:

Partner with a cross-functional team of data scientists, software engineers, machine learning engineers and product managers to deliver AI-powered products that change how customers interact with their money.

Leverage a broad stack of technologies (PyTorch, AWS UltraClusters, Hugging Face, Lightning, VectorDBs, and more) to reveal the insights hidden within huge volumes of numeric and textual data.

Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation.

Engage in high impact applied research to take the latest AI developments and push them into the next generation of customer experiences.

Flex your interpersonal skills to translate the complexity of your work into tangible business goals.

The Ideal Candidate:

You love the process of analyzing and creating, but also share our passion to do the right thing. You know at the end of the day it’s about making the right decision for our customers.

Innovative. You continually research and evaluate emerging technologies. You stay current on published state-of-the-art methods, technologies, and applications and seek out opportunities to apply them.

Creative. You thrive on bringing definition to big, undefined problems. You love asking questions and pushing hard to find answers. You’re not afraid to share a new idea.

A leader. You challenge conventional thinking and work with stakeholders to identify and improve the status quo. You’re passionate about talent development for your own team and beyond.

Technical. You’re comfortable with open-source languages and are passionate about developing further. You have hands-on experience developing AI foundation models and solutions using open-source tools and cloud computing platforms.

Has a deep understanding of the foundations of AI methodologies.

Experience building large deep learning models, whether on language, images, events, or graphs, as well as expertise in one or more of the following: training optimization, self-supervised learning, robustness, explainability, RLHF.

An engineering mindset as shown by a track record of delivering models at scale both in terms of training data and inference volumes.

Experience in delivering libraries, platform level code or solution level code to existing products.

A professional with a track record of coming up with new ideas or improving upon existing ideas in machine learning, demonstrated by accomplishments such as first author publications or projects.

Possess the ability to own and pursue a research agenda, including choosing impactful research problems and autonomously carrying out long-running projects.

About Modular

At Modular, we’re on a mission to revolutionize AI infrastructure by systematically rebuilding the AI software stack from the ground up. Our team, made up of industry leaders and experts, is building cutting-edge, modular infrastructure that simplifies AI development and deployment. By rethinking the complexities of AI systems, we’re empowering everyone to unlock AI’s full potential and tackle some of the world’s most pressing challenges.

If you’re passionate about shaping the future of AI and creating tools that make a real difference in people’s lives, we want you on our team. You can read about our culture and careers to understand how we work and what we value.

About the role:

In the Cloud Inference team, we are focused on building end-to-end distributed LLM inference deployments that are fully vertically integrated with the MAX stack. Our goal is to make inference both the fastest and most scalable while also building the easiest platform for deploying and scaling models for enterprises and developers alike. We're seeking engineers who are passionate about pushing the boundaries of distributed inference systems and enjoy working at the intersection of large-scale systems and machine learning. We are looking for candidates based on their breadth and depth of experience in backend engineering, AI inference, and distributed systems development. If this sounds exciting, we invite you to join our world-leading AI infrastructure team and help drive our industry forward!

LOCATION: Candidates based in the US or Canada are welcome to apply. You can work out of our office in Los Altos, CA or remotely from home. Onboarding for new hires is conducted in-person in our Los Altos, CA office.

What you will do:

  • Build and ship an LLM-focused inference platform using best-in-class inference techniques (disaggregated inference, multi-node deployment of large models, high-performance networking, distributed KV-cache management, high-throughput batch processing, etc.).
  • Push the envelope for operational excellence with request-to-kernel observability, multi-cloud deployments, clever autoscaling, cold-start optimizations, and more.
  • Collaborate with our kernels and genAI teams to achieve SOTA application performance by integrating SOTA kernel and serving optimizations with SOTA cluster optimizations.
  • Build Helm charts, Kubernetes operators, and more to create simple, effective, maintainable deployments.

What you bring to the table:

  • 5+ years of experience working in backend engineering
  • Experience with Kubernetes and operating your own services
  • Ability to create durable, reusable software tools and libraries that are leveraged across teams and functions
  • Experience in machine learning technologies and use cases
  • Creativity and curiosity for solving complex problems, a team-oriented attitude that enables you to work well with others, and alignment with our culture
  • Strong identification with our core company cultural values

Helpful but not required:

  • Experience with high-performance computing / networking
  • Experience working on high-scale ML inference infrastructure (traditional AI or genAI)
  • Familiarity with Go

About the job

Google's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale, and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google’s needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. We need our engineers to be versatile, display leadership qualities and be enthusiastic to take on new problems across the full-stack as we continue to push technology forward.

As a PhD graduate, your research expertise is invaluable to us. Explore a variety of projects, collaborate with various teams, and contribute to products that are changing the world, across many product areas, including AI & Infrastructure, Cloud, and more!

Our Engineering teams include thousands of PhDs who bring their deep knowledge and research experience to enhance our systems and products. As a Google PhD Software Engineer, you will work on critical projects, with many opportunities to learn and follow your interests. We expect our engineers to be creative and versatile, leading and identifying new problems to push the field and Google technology forward.

Google is one of the world’s leading suppliers and consumers of ML and AI technology, with decades of experience in designing, deploying, and using ML software and custom ML hardware infrastructure at massive scale.

The US base salary range for this full-time position is $147,000-$211,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.

Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.

Responsibilities

Participate in or lead team projects to carry out design, analysis, and development of advanced systems across the stack, using your research expertise. Write documents that set new technical directions. Contribute to existing documentation or educational content, and adapt content based on product/program updates and user feedback.
Study, diagnose and resolve complex technical issues by analyzing the sources of the issues and the impact on software, hardware, network, or service operations and quality. Develop code, review code developed by other developers, and provide feedback to ensure best practices (e.g., style guidelines, accuracy, testability, and efficiency).

The D. E. Shaw group seeks a highly motivated and entrepreneurial technical product engineer to join its newly formed private equity venture, Cove, and help build the AI-powered platform at its core. This role sits at the intersection of product strategy and technical execution, offering the opportunity to define, shape, and deliver technology solutions that will become the operational backbone of the group. As an early team member, this product engineer will play a key role in addressing the open challenge of applying AI to private equity investments and operations, with the backing of one of the most technologically sophisticated investment firms in the world.

What you'll do day-to-day

You'll be involved in all aspects of building and scaling technology products for the fund's investment activities, including:
  • Working closely with the investment and operations teams to surface high-impact opportunities, pressure-test ideas, and translate workflow challenges into clear product direction.
  • Owning product design end-to-end, from how data is structured and connected to the business logic that determines how a tool actually behaves, bringing both conceptual clarity and technical precision to each iteration.
  • Designing and building AI-native products that use LLMs to change how investment teams work, with a solid intuition for how model behavior shapes user experience and where AI can add genuine leverage.
  • Driving products from prototype to production, contributing code directly, especially in early stages, when tight product and business judgment matters most.

Who we're looking for
  • A bachelor’s degree or higher, an impressive record of academic and professional achievement, and at least five years of relevant experience.
  • At least two years of experience developing technology products in direct collaboration with engineering teams, including at least one year focused on workflow products that streamline business operations and processes.
  • Experience successfully taking a product from conception to completion, ideally in a startup environment; prior experience developing products for vertical-specific or industry-focused applications is a plus.
  • A solid technical foundation in full-stack product development—spanning APIs, databases, and user interfaces—with the ability to read and execute code, and proficiency in overseeing technical aspects from architecture decisions to implementation details.
  • At least one year of experience developing and integrating LLM-powered systems into production applications, with knowledge of agentic frameworks and their practical implementation; demonstrated ability to translate AI capabilities (including autonomous agents, tool use, and multi-step reasoning) into practical product features that solve real-world problems.
  • Well-developed communication skills, a collaborative and entrepreneurial mindset, and the ability to successfully manage multiple projects at once.
  • The expected annual base salary for this position is $185,000 to $250,000. Our compensation and benefits package includes variable compensation in the form of a year-end bonus, guaranteed in the first year of hire, and benefits including medical and prescription drug coverage, 401(k) contribution matching, wellness reimbursement, family building benefits, and a charitable gift match program.

Location: San Francisco · On-site


ABOUT THE COMPANY

We're building autonomous research agents for recursive self-improvement (multi-agent systems that propose, run, and analyze machine learning experiments). We're a small team based in San Francisco, working on-site.

ABOUT THE ROLE

You'll be researching making models efficient: quantization, speculative decoding, sparse and structured attention, distillation, mixture-of-experts inference, and the training-time techniques that make those methods possible. The work spans algorithm design, careful evaluation, and pushing methods to where they actually run.

This is a senior research role with a clear engineering edge. You'll spend time at the intersection of model architecture and inference performance, designing methods that move accuracy/latency/cost trade-offs in our favor (then partnering with engineers to make those wins real in production).

WHAT YOU'LL DO

  • Research and develop quantization methods: post-training quantization, quantization-aware training, mixed-precision regimes, low-bit-width arithmetic
  • Design and evaluate speculative decoding approaches: draft models, tree attention, parallel speculation, lookahead decoding
  • Investigate training-time efficiency methods that compose well with inference: distillation, sparse attention, mixture-of-experts, low-rank adaptation, pruning
  • Run controlled experiments at production scale; characterize what works on real workloads, not just toy benchmarks
  • Co-design methods with the inference engineering team: push results to where they actually run rather than stopping at the paper
  • Read deeply across the efficient ML / efficient inference literature; translate the most useful ideas into our stack
  • Publish when the work warrants it; share findings internally
  • Partner with model and training researchers so efficiency choices align with model architecture and post-training decisions

WHAT WE'RE LOOKING FOR

  • Strong track record of ML research on efficiency methods: quantization, speculative decoding, distillation, MoE, sparse attention, or adjacent
  • 5+ years of hands-on research experience
  • Deep familiarity with both training and inference performance characteristics
  • Fluent in PyTorch, JAX, or equivalent; comfortable working at the kernel and serving-framework level when methods require it
  • Track record of moving efficiency research from prototype to production
  • Strong statistical expertise: you'd notice a flawed comparison before someone else points it out
  • Strong written communication
  • Published research at NeurIPS, ICML, ICLR, MLSys, or comparable venues

NICE TO HAVE

  • PhD in ML, systems, or related field
  • Open-source contributions to quantization, speculative-decoding, or efficient-inference libraries
  • Experience with hardware-aware optimization and accelerator-specific tooling
  • Background in numerical methods, low-precision arithmetic, or approximate computation

THIS ROLE IS PROBABLY NOT FOR YOU IF

  • You want to focus on pretraining large models from scratch (that's a different role)
  • You prefer abstract algorithmic research without hands-on implementation
  • You want a fixed benchmark with stable targets (our targets shift with what our models actually need to do)

P.S. We’re also hosting a small private dinner during MLSys for people interested in agents, recursive self-improvement, and AI infrastructure. Apply to join us here: https://luma.com/u6yt1gri

About Modular

At Modular, we’re on a mission to revolutionize AI infrastructure by systematically rebuilding the AI software stack from the ground up. Our team, made up of industry leaders and experts, is building cutting-edge, modular infrastructure that simplifies AI development and deployment. By rethinking the complexities of AI systems, we’re empowering everyone to unlock AI’s full potential and tackle some of the world’s most pressing challenges.

If you’re passionate about shaping the future of AI and creating tools that make a real difference in people’s lives, we want you on our team. You can read about our culture and careers to understand how we work and what we value.

About the role:

At Modular, we're on a mission to make AI accessible to everyone. As a Mojo Libraries Engineer, you'll help realize this vision by shaping the foundational libraries of the Mojo programming language. Mojo combines performance and expressiveness to unlock the full potential of modern hardware for AI. In this role, you'll design and implement core abstractions across the Mojo Standard Library and kernel libraries — empowering developers to build efficiently, integrate with AI systems, and accelerate innovation across the AI stack.

LOCATION: Candidates based in the US or Canada are welcome to apply. You can work in our office in Los Altos, CA or remotely from home. Onboarding for new hires is conducted in-person in our Los Altos, CA office.

What you will do:

  • Partner with teams across Modular (including compiler, kernel, and runtime engineers) to design APIs that are ergonomic, expressive, and performance-conscious. Your work will directly shape the developer experience and deliver high business impact by enabling critical customer workloads and accelerating platform adoption.
  • Design and implement abstractions that enable the development and optimization of kernels and algorithms running on GPUs and CPUs, delivering top-tier performance and accuracy throughout the GenAI inference solution stack.
  • Optimize library functionality for performance, memory efficiency, and hardware utilization.
  • Participate in design discussions and code reviews to uphold high engineering standards.
  • Collaborate with other engineering teams to align the standard library with broader platform needs.
  • Contribute to a fully open source project: everything you build will be public and part of our GitHub repo.
  • Engage with the open source community to support high-quality contributions and foster a welcoming, inclusive environment.

What you bring to the table:

  • Experience with GPU programming languages like CUDA or OpenCL.
  • Deep understanding of GPU architecture (memory hierarchies, tensor cores, etc.) and how it influences algorithm and API design.
  • Proficiency in modern systems languages (C++, Rust, or Swift), with strong skills in low-level programming, performance optimization, and memory management.
  • Passion for both library design and performance, with a commitment to following modern API patterns and language evolution in C++, Rust, and Swift ecosystems.
  • Hands-on open source experience, either through meaningful contributions or by maintaining and reviewing community submissions.
  • A collaborative mindset, intellectual curiosity, and drive to solve complex technical challenges as part of a high-performing team.
  • Strong alignment with Modular's cultural values and enthusiasm for building a thoughtful, inclusive engineering culture.

Helpful, but not required:

  • Familiarity with GPU assembly languages such as PTX and SASS.
  • Knowledge of cutting-edge APIs such as CUTLASS or Triton.
  • Prior experience with Mojo, especially with accepted contributions to the Mojo Standard Library.
  • Familiarity with LLVM, MLIR, or Python, particularly in systems or compiler-related contexts.
  • An advanced degree in Computer Science or a related field.

Quants at the D. E. Shaw group apply mathematical techniques and write software to develop, analyze, and implement statistical models for our computerized financial trading strategies. They utilize their creativity and innovation to create novel approaches to trade profitably in markets around the globe with a firm that offers a collegial, collaborative, and engaging working environment.

What you'll do day-to-day

Specific responsibilities range from leveraging financial data in an effort to increase profitability, decrease risk, and reduce transaction costs to conceiving new trading ideas, formulating them into systematic strategies, and critically evaluating their performance.

Who we're looking for
  • Successful candidates will have impressive records of academic achievement and be the top students in their respective math, statistics, physics, engineering, computer science, and other technical and quantitative programs.
  • The expected annual base salary for this position is $275,000 for applicants who have completed undergraduate or master’s degrees and $300,000 for applicants who have completed PhDs (or have comparable professional experience). Our compensation and benefits package includes substantial variable compensation in the form of a year-end bonus, guaranteed in the first year of hire, a sign-on bonus, a relocation bonus, and benefits including medical and prescription drug coverage, 401(k) contribution matching, wellness reimbursement, family building benefits, and a charitable gift match program.