MLSys 2026 Career Opportunities

Here we highlight career opportunities submitted by our Exhibitors, and other top industry, academic, and non-profit leaders. We would like to thank each of our exhibitors for supporting MLSys 2026.

Inception creates the world’s fastest, most efficient AI models. Our Mercury model is the world’s fastest reasoning LLM and first commercially available diffusion LLM, delivering 5x greater speed and efficiency than today’s LLMs, with best-in-class quality.

We are the AI researchers and engineers behind such breakthrough AI technologies as diffusion models, flash attention, and DPO.

The Role

We're looking for engineers and scientists to design, optimize, and maintain the core systems that enable scalable, efficient reinforcement learning for large models. This role sits at the intersection of research and large-scale systems engineering: you'll wear many hats, from optimizing rollout and reward pipelines to enhancing reliability, observability, and orchestration, collaborating closely with researchers to make RL stable, fast, and production-ready.

Key Responsibilities

- Design, build, and optimize the infrastructure that powers large-scale reinforcement learning and post-training workloads.
- Improve the reliability and scalability of RL training pipelines, distributed RL workloads, and training throughput.
- Develop shared monitoring and observability tools to ensure high uptime, debuggability, and reproducibility for RL systems.

Qualifications

- BS/MS/PhD in Computer Science, Engineering, or a related field (or equivalent experience).
- Understanding of ML frameworks (PyTorch, TensorFlow, Ray, Megatron) from a systems perspective.
- Experience working with reinforcement learning workloads (PPO, DPO, RLHF, or reward modeling).
- Experience with containerization (Docker), orchestration (Kubernetes), and CI/CD pipelines.

Preferred Skills

- Experience building and maintaining large-scale language models with tens of billions of parameters or more.
- Experience with ML workflow orchestration tools (Kubeflow, Airflow).
- Background in performance optimization and profiling of ML systems.

Why Join Inception

- Work with World-Class Talent: Collaborate with the inventors of diffusion models and leading AI researchers
- Shape Foundational Technology: Your decisions will influence how the next generation of AI products are built and used
- Immediate Impact: Join at the ground floor where your contributions directly shape product direction and company trajectory

Perks & Benefits

- Competitive salary and equity in a rapidly growing startup
- Flexible vacation and paid time off (PTO)
- Health, dental, and vision insurance
- Catered meals (breakfast, lunch, & dinner)
- Commuter subsidies
- A collaborative and inclusive culture

Location

Santa Clara, California, USA or Toronto, Canada


Description

At Lemurian Labs, we’re on a mission to bring the power of AI to everyone, without leaving a massive environmental footprint. We care deeply about the impact AI has on our society and planet, and we’re building a rock-solid foundation for its future, ensuring AI grows sustainably and responsibly. Because let’s face it, what good is innovation if it doesn’t help the world?

We are building a high-performance, portable compiler that lets developers “build once, deploy anywhere.” Yes, anywhere. We’re talking about seamless cross-platform compatibility, so you can train your models in the cloud, deploy them to the edge, and everything in between—all while optimizing for resource efficiency and scalability.

If the idea of sustainably scaling AI motivates you and you’re excited about making AI development both powerful and accessible, then we’d love to have you. Join us at Lemurian Labs, where you can have fun building the future—without leaving a mess behind.

The Role

We're looking for a Senior ML Performance Engineer to architect and lead our Performance Testing Platform from the ground up. You'll be the technical authority on how we measure, validate, and optimize the performance of large language models (Llama 3.2 70B, DeepSeek, and others) before and after compiler optimization on modern GPU architectures.

This is a high-impact role where you'll directly influence our product quality and our customers' success. You'll work at the intersection of ML systems, GPU architecture, and performance engineering—building the infrastructure that proves our compiler delivers real value.

Here is what you will do:

- Design and build a comprehensive performance testing platform for evaluating LLM inference workloads across GPU clusters
- Define and implement the benchmarking methodology, metrics, and test suites that measure latency, throughput, memory utilization, power consumption, and model accuracy
- Establish baseline performance for unoptimized models (Llama 3.2 70B, DeepSeek, etc.) and validate post-optimization improvements
- Develop automated testing pipelines for continuous performance validation across compiler releases and model updates
- Investigate performance bottlenecks using profiling tools (ROCm profilers, GPU traces, system-level monitoring) and work with the compiler team to drive optimizations
- Create dashboards and reporting that provide clear visibility into performance trends, regressions, and wins
- Collaborate cross-functionally with compiler engineers, ML engineers, and DevOps to ensure performance testing is integrated into our development workflow
- Document best practices for performance testing and optimization of ML workloads on GPU hardware

Essential Skills and Experience:

- BS degree in computer science, computer engineering, electrical engineering, or equivalent practical experience
- 7+ years of experience in performance engineering, benchmarking, or systems engineering roles
- Deep understanding of ML inference workloads, particularly transformer-based models and LLMs
- Hands-on experience with GPU programming and optimization (CUDA, ROCm, or similar)
- Strong programming skills in Python and C/C++
- Proven track record of building performance testing infrastructure or benchmarking platforms from scratch
- Experience with ML frameworks (PyTorch, TensorFlow, ONNX Runtime, vLLM, TensorRT-LLM, etc.)
- Proficiency with profiling and debugging tools for GPU workloads
- Strong analytical skills with the ability to design experiments, analyze results, and communicate findings clearly
- Experience with CI/CD systems and test automation frameworks

About the job

Google Cloud’s mission is to make every business successful through AI by combining cutting-edge technology, infrastructure, and talent. AI/ML software engineers in Cloud bridge the gap between pioneering models and a massive product vehicle reaching billions. Our talent density and AI-powered tools drive rapid development, rooted in a culture of empowerment and a bias to action. In this role, you aren’t just building technology; you’re shaping the frontier of enterprise and driving the evolution of advanced models.

We build the industry's best data agents to help customers make more, better, and faster data-driven decisions—achieved by enriching the customer knowledge layer, automating data preparation, providing tailored agent harnesses, and leveraging the advanced capabilities of BigQuery and its ecosystem.

The AI and Infrastructure team is redefining what’s possible. We empower Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity. Our customers include Googlers, Google Cloud customers, and billions of Google users worldwide.

We're the driving force behind Google's groundbreaking innovations, empowering the development of our cutting-edge AI models, delivering unparalleled computing power to global services, and providing the essential platforms that enable developers to build the future. From software to hardware, our teams are shaping the future of world-leading hyperscale computing, with key teams working on the development of our TPUs, Vertex AI for Google Cloud, Google Global Networking, Data Center operations, systems research, and much more.

The US base salary range for this full-time position is $207,000-$300,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.

Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.

Responsibilities

- Lead the technical strategy and architectural design of the core reasoning engine that translates natural language into reliable SQL insights, ensuring the platform scales to support complex enterprise data exploration.
- Drive cross-functional collaboration with AI/ML, UX, and Product teams to define the "agentic" future of BigQuery, bridging the gap between raw data and business-ready answers.
- Establish and maintain engineering excellence by setting the bar for performance, reliability, and observability of production-critical agent services across the BigQuery ecosystem.
- Mentor and influence a broad group of engineers, identifying and refining ambiguous, high-impact problems into tractable projects that advance our data-centric AI capabilities.

Inception creates the world’s fastest, most efficient AI models. Our Mercury model is the world’s fastest reasoning LLM and first commercially available diffusion LLM, delivering 5x greater speed and efficiency than today’s LLMs, with best-in-class quality.

We are the AI researchers and engineers behind such breakthrough AI technologies as diffusion models, flash attention, and DPO.

The Role

We're looking for engineers and scientists to design, optimize, and scale the systems that power our diffusion LLMs in production. Your work will make inference faster, more cost-effective, and more reliable.

Key Responsibilities

- Build and optimize high-performance model serving systems for low-latency inference of diffusion LLMs.
- Extend orchestration frameworks (Kubernetes, Ray, SLURM) for distributed inference, evaluation, and large-batch serving.
- Implement and manage load balancing, autoscaling, and traffic routing for model endpoints.
- Build systems for model versioning, canary deployments, and zero-downtime rollouts.
- Develop monitoring, alerting, and observability tooling to ensure SLA compliance and rapid incident response.
- Collaborate with ML researchers to translate model advances (new architectures, quantization techniques, batching strategies) into production-ready serving improvements.

Qualifications

- BS/MS/PhD in Computer Science, Engineering, or a related field (or equivalent experience).
- Knowledge of ML serving frameworks (SGLang, vLLM, Triton Inference Server, TensorRT-LLM).
- Understanding of ML frameworks (PyTorch, TensorFlow) from a systems perspective.
- Familiarity with high-performance computing and GPU programming (CUDA).
- Experience with containerization (Docker), orchestration (Kubernetes), and CI/CD pipelines.
- Background in performance optimization and profiling of ML systems.

Preferred Skills

- Experience building and maintaining large-scale language models with tens of billions of parameters or more.
- Experience with distributed systems and cloud computing platforms (AWS/GCP/Azure).
- Experience with ML workflow orchestration tools (Kubeflow, Airflow).
- Experience with model optimization techniques (quantization, distillation, speculative decoding, continuous batching).
- Knowledge of ML-specific infrastructure challenges (checkpointing, resource scheduling, etc.).

Why Join Inception

- Work with World-Class Talent: Collaborate with the inventors of diffusion models and leading AI researchers
- Shape Foundational Technology: Your decisions will influence how the next generation of AI products are built and used
- Immediate Impact: Join at the ground floor where your contributions directly shape product direction and company trajectory

Perks & Benefits

- Competitive salary and equity in a rapidly growing startup
- Flexible vacation and paid time off (PTO)
- Health, dental, and vision insurance
- Catered meals (breakfast, lunch, & dinner)
- Commuter subsidies
- A collaborative and inclusive culture

Overview:

At Capital One, we are creating trustworthy and reliable AI systems, changing banking for good. For years, Capital One has been leading the industry in using machine learning to create real-time, intelligent, automated customer experiences. From informing customers about unusual charges to answering their questions in real time, our applications of AI & ML are bringing humanity and simplicity to banking. We are committed to building world-class applied science and engineering teams and continue our industry leading capabilities with breakthrough product experiences and scalable, high-performance AI infrastructure. At Capital One, you will help bring the transformative power of emerging AI capabilities to reimagine how we serve our customers and businesses who have come to love the products and services we build.

Team Description:

The AI Foundations team is at the center of bringing our vision for AI at Capital One to life. Our work touches every aspect of the research life cycle, from partnering with Academia to building production systems. We work with product, technology and business leaders to apply the state of the art in AI to our business.

In this role, you will:

Partner with a cross-functional team of data scientists, software engineers, machine learning engineers and product managers to deliver AI-powered products that change how customers interact with their money.

Leverage a broad stack of technologies (PyTorch, AWS Ultraclusters, Hugging Face, Lightning, VectorDBs, and more) to reveal the insights hidden within huge volumes of numeric and textual data.

Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation.

Engage in high impact applied research to take the latest AI developments and push them into the next generation of customer experiences.

Flex your interpersonal skills to translate the complexity of your work into tangible business goals.

The Ideal Candidate:

You love the process of analyzing and creating, but also share our passion to do the right thing. You know at the end of the day it’s about making the right decision for our customers.

Innovative. You continually research and evaluate emerging technologies. You stay current on published state-of-the-art methods, technologies, and applications and seek out opportunities to apply them.

Creative. You thrive on bringing definition to big, undefined problems. You love asking questions and pushing hard to find answers. You’re not afraid to share a new idea.

A leader. You challenge conventional thinking and work with stakeholders to identify and improve the status quo. You’re passionate about talent development for your own team and beyond.

Technical. You’re comfortable with open-source languages and are passionate about developing further. You have hands-on experience developing AI foundation models and solutions using open-source tools and cloud computing platforms.

A deep understanding of the foundations of AI methodologies.

Experience building large deep learning models, whether on language, images, events, or graphs, as well as expertise in one or more of the following: training optimization, self-supervised learning, robustness, explainability, RLHF.

An engineering mindset as shown by a track record of delivering models at scale both in terms of training data and inference volumes.

Experience in delivering libraries, platform level code or solution level code to existing products.

A professional with a track record of coming up with high quality ideas or improving upon existing ideas in machine learning, demonstrated by accomplishments such as first author publications or projects.

The ability to own and pursue a research agenda, including choosing impactful research problems and autonomously carrying out long-running projects.

The D. E. Shaw group seeks a highly motivated and entrepreneurial technical product engineer to join its newly formed private equity venture, Cove, and help build the AI-powered platform at its core. This role sits at the intersection of product strategy and technical execution, offering the opportunity to define, shape, and deliver technology solutions that will become the operational backbone of the group. As an early team member, this product engineer will play a key role in addressing the open challenge of applying AI to private equity investments and operations, with the backing of one of the most technologically sophisticated investment firms in the world.

What you'll do day-to-day

You'll be involved in all aspects of building and scaling technology products for the fund's investment activities, including:

- Working closely with the investment and operations teams to surface high-impact opportunities, pressure-test ideas, and translate workflow challenges into clear product direction.
- Owning product design end-to-end, from how data is structured and connected to the business logic that determines how a tool actually behaves, and bringing both conceptual clarity and technical precision to each iteration.
- Designing and building AI-native products that use LLMs to change how investment teams work, with a solid intuition for how model behavior shapes user experience and where AI can add genuine leverage.
- Driving products from prototype to production, contributing code directly, especially in early stages, when tight product and business judgment matters most.

Who we're looking for
  • A bachelor’s degree or higher, an impressive record of academic and professional achievement, and at least five years of relevant experience.
  • At least two years of experience developing technology products in direct collaboration with engineering teams, including at least one year focused on workflow products that streamline business operations and processes.
  • Experience successfully taking a product from conception to completion, ideally in a startup environment; prior experience developing products for vertical-specific or industry-focused applications is a plus.
  • A solid technical foundation in full-stack product development—spanning APIs, databases, and user interfaces—with the ability to read and execute code, and proficiency in overseeing technical aspects from architecture decisions to implementation details.
  • At least one year of experience developing and integrating LLM-powered systems into production applications, with knowledge of agentic frameworks and their practical implementation; demonstrated ability to translate AI capabilities (including autonomous agents, tool use, and multi-step reasoning) into practical product features that solve real-world problems.
  • Well-developed communication skills, a collaborative and entrepreneurial mindset, and the ability to successfully manage multiple projects at once.
  • The expected annual base salary for this position is $185,000 to $250,000. Our compensation and benefits package includes variable compensation in the form of a year-end bonus, guaranteed in the first year of hire, and benefits including medical and prescription drug coverage, 401(k) contribution matching, wellness reimbursement, family building benefits, and a charitable gift match program.