

MLSys 2026 Career Opportunities

Here we highlight career opportunities submitted by our Exhibitors and other top industry, academic, and non-profit leaders. We would like to thank each of our exhibitors for supporting MLSys 2026.

Search Opportunities

About Unconventional

Since 2022, AI has entered the mainstream, reshaping entire industries from education and software development to fundamental consumer behaviors. This revolution has created an unprecedented demand for computation, a demand that is now fundamentally limited by energy, not just in the datacenter but at a global scale.

At Unconventional, our mission is to solve this. We are rethinking computing from the ground up to build a new foundation for AI that is 1000x more efficient. We're doing this by exploiting the rich physics of semiconductors, mapping neural networks directly to the device physics rather than relying on layers of inefficient abstraction.

The Role

As a Member of Technical Staff, AI Systems, you will develop state-of-the-art architectural components, write their bespoke implementations for our unconventional software framework, and map them efficiently down to the physical silicon. You are critical to preparing our software stack for upcoming tapeouts by acting as the bridge between model architecture and physical compute.

What You'll Do

  • AI Architectural Modeling: Co-design and evaluate next-generation AI models (e.g., transformers, diffusion, flow, and energy-based models). Collaborate closely across the team to combine, modify, and implement core modeling components, both conventional (e.g., attention, normalization, Mixture-of-Experts, FFNs) and unconventional, and ensure that they function optimally across our novel compute substrates.

  • Performance Modeling & Scaling: Establish and test scaling laws specific to our novel hardware. Develop rigorous performance models to evaluate compute vs. memory trade-offs.

  • Advanced Mapping & Partitioning: Drive the partitioning and mapping of complex AI models down to hardware. Apply and invent advanced optimization strategies from first principles, including custom quantization schemes, sparsity/pruning, and distillation to fit the physical constraints of our substrates.

  • GPU Optimization & Kernel Development: Develop and optimize GPU kernels using low-level programming models like CUDA, Triton, or CUTLASS. Profile and debug complex ML codebases to resolve performance bottlenecks (training and inference).

  • Cross-Functional Collaboration: Act as a translator, discussing algorithmic trade-offs with theorists and converting model requirements into concrete specifications for infrastructure and hardware engineering teams.
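Several of the bullets above hinge on model-compression techniques such as custom quantization schemes. As a rough, dependency-free sketch of the core idea, here is symmetric per-tensor int8 quantization; the function names are illustrative, not part of any Unconventional codebase.

```python
# Minimal sketch of symmetric per-tensor int8 quantization.
# All names here are illustrative, not from any real framework.

def quantize_int8(values):
    """Map floats to int8 codes with a single per-tensor scale."""
    # Guard against an all-zero tensor, which would give scale 0.
    scale = max(abs(v) for v in values) / 127.0 or 1.0
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]

weights = [0.5, -1.27, 0.003, 1.27]
codes, scale = quantize_int8(weights)
approx = dequantize(codes, scale)
# Rounding error is bounded by half a quantization step (scale / 2).
assert all(abs(a - w) <= scale / 2 for a, w in zip(approx, weights))
```

The trade-off this sketch exposes, representation range versus rounding error, is exactly what a custom scheme tuned to a physical substrate has to negotiate.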

Minimum Qualifications

  • Education: An MS/PhD or equivalent research/project experience in a quantitative field such as AI/Machine Learning, Computer Science, Physics, Electrical Engineering, or Applied Math.

  • Experience: Deep, practical understanding of the modern AI/ML stack and optimized compilation and execution of algorithms on modern GPU systems.

  • Proven experience in profiling, identifying, and resolving performance bottlenecks in complex ML codebases.

  • Systems Fluency: Demonstrated ability to map state-of-the-art AI model architectures (e.g., Transformers, Mixture of Experts, diffusion models) to system performance implications and apply advanced efficiency techniques such as sparsity, quantization, and distillation.

  • Software Development: Deep experience with PyTorch, including its internals, torch.compile, and distributed data parallel (DDP) / fully sharded data parallel (FSDP) libraries.

Preferred Qualifications (Nice to Have)

  • Unconventional Co-Design: A forward-looking perspective on co-designing algorithms for unconventional computing paradigms that map closely to the physics of underlying systems.

  • Next-Gen Efficiency: Theoretical or research experience in advanced approximation/compression techniques beyond standard quantization.

About the job

Google's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale, and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google’s needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. We need our engineers to be versatile, display leadership qualities and be enthusiastic to take on new problems across the full-stack as we continue to push technology forward.

In this role, you will ensure Google Search consistently understands and satisfies user needs for timely, up-to-date information, and drive the freshness quality of AI Overviews (AIO) and AI Mode (AIM). By strategically feeding our models real-time news context and freshness signals, we deliver accurate, world-aware AI products, reducing reliance on the model's parametric knowledge.

In Google Search, we're reimagining what it means to search for information – any way and anywhere. To do that, we need to solve complex engineering challenges and expand our infrastructure, while maintaining a universally accessible and useful experience that people around the world rely on. In joining the Search team, you'll have an opportunity to make an impact on billions of people globally.

The US base salary range for this full-time position is $147,000-$211,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.

Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.

Responsibilities

Write product or system development code. Collaborate with peers and stakeholders through design and code reviews to ensure best practices amongst available technologies (e.g., style guidelines, checking code in, accuracy, testability, and efficiency). Contribute to existing documentation or educational content and adapt content based on product/program updates and user feedback. Triage product or system issues and debug/track/resolve by analyzing the sources of issues and the impact on hardware, network, or service operations and quality. Implement solutions in one or more specialized ML areas, utilize ML infrastructure, and contribute to model optimization and data processing.

Location: Santa Clara, California, US, or Toronto, Canada


Description

At Lemurian Labs, we’re on a mission to bring the power of AI to everyone—without leaving a massive environmental footprint. We care deeply about the impact AI has on our society and planet, and we’re building a rock-solid foundation for its future, ensuring AI grows sustainably and responsibly. Because let’s face it, what good is innovation if it doesn’t help the world?

We are building a high-performance, portable compiler that lets developers “build once, deploy anywhere.” Yes, anywhere. We’re talking about seamless cross-platform compatibility, so you can train your models in the cloud, deploy them to the edge, and everything in between—all while optimizing for resource efficiency and scalability.

If the idea of sustainably scaling AI motivates you and you’re excited about making AI development both powerful and accessible, then we’d love to have you. Join us at Lemurian Labs, where you can have fun building the future—without leaving a mess behind.

Here is what you will do:

  • Design, develop, maintain, and improve our heterogeneous AI compiler.

  • Design and implement new capabilities in our compiler based on our novel compiler architecture.

  • Propose improvements to and expansions of our novel compiler architecture in light of new advancements in machine learning model architectures and hardware.

  • Use the latest techniques in parallelization and partitioning to automatically generate and exploit highly optimized kernels.

  • Generate and use performance data to identify opportunities and drive improvements.

  • Work with our product team to understand the evolving needs of ML engineers and drive improvements in architecture.

Essential Skills and Experience:

  • BS degree in computer science, computer engineering, electrical engineering, or equivalent practical experience.

  • 4+ years of experience working with compilers.

  • Very strong knowledge of compiler algorithms and data structures.

  • Experience and interest in low-level code generation, object-file manipulation, and target-specific optimizations.

  • 4+ years of experience with C/C++.

  • Strong written and oral communication, and the ability to write clear and concise documentation.

  • A team-first attitude.

  • Detail-oriented.

Preferred Skills and Experience:

  • Master's or PhD degree in computer science, computer engineering, electrical engineering, or equivalent practical experience.

  • Knowledge of traditional compiler techniques: instruction selection, register allocation, and classical analyses such as dominance and def-use.

  • Knowledge of calling conventions and ABIs, linking, and relocations.

  • Working knowledge of LLVM.

  • Experience with loop optimizations (vectorization, unrolling, fusion, parallelization, etc.).

  • Experience with machine learning workloads and their demands on hardware.
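The "traditional analyses" named above, such as dominance, are classic data-flow problems. As a hedged illustration of the idea, here is the naive iterative dominator computation over a toy control-flow graph; the CFG and all names are hypothetical, not drawn from the Lemurian Labs compiler.

```python
# Naive iterative dominator analysis over a small CFG, given as an
# adjacency map {node: [successors]}. Assumes every node is reachable
# from the entry (so every non-entry node has at least one predecessor).

def dominators(cfg, entry):
    """Return dom[n] = set of nodes that dominate n (including n itself)."""
    nodes = set(cfg)
    dom = {n: set(nodes) for n in nodes}   # start from the full set
    dom[entry] = {entry}
    preds = {n: [p for p in nodes if n in cfg[p]] for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            # n is dominated by itself plus whatever dominates
            # every one of its predecessors.
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

# Diamond CFG: entry branches to a and b, which rejoin at exit.
cfg = {"entry": ["a", "b"], "a": ["exit"], "b": ["exit"], "exit": []}
dom = dominators(cfg, "entry")
assert dom["exit"] == {"entry", "exit"}  # neither branch dominates the join
```

Production compilers use faster formulations (e.g., Cooper-Harvey-Kennedy on a reverse-postorder traversal), but the fixed-point structure is the same.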

Salary depends on experience and geographical location.

This salary range may be inclusive of several career levels and will be narrowed during the interview process based on a number of factors, such as the candidate’s experience, knowledge, skills, and abilities, as well as internal equity among our team.

Additional benefits for this role may include: equity, company bonus opportunities, medical, dental, and vision benefits; retirement savings plan; and supplemental wellness benefits.

Lemurian Labs ensures equal employment opportunity without discrimination or harassment based on race, color, religion, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity or expression, age, disability, national origin, marital or domestic/civil partnership status, genetic information, citizenship status, veteran status, or any other characteristic protected by law.

EOE

About the job

Google Cloud’s mission is to make every business successful through AI by combining cutting-edge technology, infrastructure, and talent. AI/ML software engineers in Cloud bridge the gap between pioneering models and a massive product vehicle reaching billions. Our talent density and AI-powered tools drive rapid development, rooted in a culture of empowerment and a bias to action. In this role, you aren’t just building technology; you’re shaping the frontier of enterprise and driving the evolution of advanced models.

We build the industry's best data agents to help customers make more, better, and faster data-driven decisions—achieved by enriching the customer knowledge layer, automating data preparation, providing tailored agent harnesses, and leveraging the advanced capabilities of BigQuery and its ecosystem.

The AI and Infrastructure team is redefining what’s possible. We empower Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity. Our customers include Googlers, Google Cloud customers, and billions of Google users worldwide.

We're the driving channel behind Google's groundbreaking innovations, empowering the development of our cutting-edge AI models, delivering unparalleled computing power to global services, and providing the essential platforms that enable developers to build the future. From software to hardware, our teams are shaping the future of world-leading hyperscale computing, with key teams working on the development of our TPUs, Vertex AI for Google Cloud, Google Global Networking, Data Center operations, systems research, and much more.

The US base salary range for this full-time position is $207,000-$300,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.

Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.

Responsibilities

Lead the technical strategy and architectural design of the core reasoning engine that translates natural language into reliable SQL insights, ensuring the platform scales to support complex enterprise data exploration. Drive cross-functional collaboration with AI/ML, UX, and Product teams to define the "agentic" future of BigQuery, bridging the gap between raw data and business-ready answers. Establish and maintain engineering excellence by setting the bar for performance, reliability, and observability of production-critical agent services across the BigQuery ecosystem. Mentor and influence a broad group of engineers, identifying and refining ambiguous, high-impact problems into tractable projects that advance our data-centric AI capabilities.

The Research Engineering team is dedicated to accelerating the velocity of machine learning research and expanding the exploration space for innovations at PDT. We partner with PDT’s quantitative researchers to design and build a state-of-the-art environment for testing ideas rapidly and efficiently.

Research at PDT requires significant compute, and as such, we are looking for a talented engineer with in-depth knowledge of ML techniques and the DL ecosystem to help us build the infrastructure capable of supporting complex scientific research at scale.

This is a hybrid position and will require working from our New York City office at least 3 days a week.

Why join us?

PDT Partners has a stellar 30+ year track record and a reputation for excellence. Our goal is to be the best quantitative investment manager in the world. PDT’s exceptional employee-retention rate speaks for itself. Our people are intellectually curious, collaborative, down-to-earth, and diverse.

Responsibilities:

Partner with the research team to understand future research directions and build the next generation of highly scalable infrastructure for alpha, signal, and portfolio construction.

Incorporate advancements in machine learning, hardware accelerators and high-performance computing to optimize research workflows.

Maintain, develop, and re-imagine the extensive internal research stack that continues to be a differentiating factor for PDT's business.

Optimize models for inference and use in real time trading systems.

Below is a list of skills and experiences we think are relevant. Even if you don’t think you’re a perfect match, we still encourage you to apply because we are committed to developing our people.

Experience with building infrastructure for training/fine-tuning large ML models.

Intellectual curiosity and a strong interest in solving difficult problems.

Exceptional programming skills and proficiency in identifying performance bottlenecks.

Experience with the Python scientific stack and DL libraries (PyTorch, TensorFlow, etc.).

Experience with hardware accelerators.

Previous experience in quantitative finance is not required.

The salary range for this role is between $190,000 and $250,000. This range is not inclusive of any potential bonus amounts. Factors that may impact the agreed upon salary within the range for a particular candidate include years of experience, level of education obtained, skill set, and other external factors.

PRIVACY STATEMENT: For information on ways PDT may collect, use, and process your personal information, please see PDT’s privacy notices.

Team Description:

The AI Foundations team is at the center of bringing our vision for AI at Capital One to life. Our work touches every aspect of the research life cycle, from partnering with academia to building production systems. We work with product, technology, and business leaders to apply the state of the art in AI to our business.

This is an individual contributor (IC) role driving strategic direction through collaboration with Applied Science, Engineering and Product leaders across Capital One. As a well-respected IC leader, you will guide and mentor a team of applied scientists and their managers without being a direct people leader. You will be expected to be an external leader representing Capital One in the research community, collaborating with prominent faculty members in the relevant AI research community.

In this role, you will:

Partner with a cross-functional team of data scientists, software engineers, machine learning engineers and product managers to deliver AI-powered products that change how customers interact with their money.

Leverage a broad stack of technologies (PyTorch, AWS UltraClusters, Hugging Face, Lightning, vector databases, and more) to reveal the insights hidden within huge volumes of numeric and textual data.

Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation.

Engage in high impact applied research to take the latest AI developments and push them into the next generation of customer experiences.

Flex your interpersonal skills to translate the complexity of your work into tangible business goals.

The Ideal Candidate:

You love the process of analyzing and creating, but also share our passion to do the right thing. You know at the end of the day it’s about making the right decision for our customers.

Innovative. You continually research and evaluate emerging technologies. You stay current on published state-of-the-art methods, technologies, and applications and seek out opportunities to apply them.

Creative. You thrive on bringing definition to big, undefined problems. You love asking questions and pushing hard to find answers. You’re not afraid to share a new idea.

A leader. You challenge conventional thinking and work with stakeholders to identify and improve the status quo. You’re passionate about talent development for your own team and beyond.

Technical. You’re comfortable with open-source languages and are passionate about developing further. You have hands-on experience developing AI foundation models and solutions using open-source tools and cloud computing platforms.

A deep understanding of the foundations of AI methodologies.

Experience building large deep learning models, whether on language, images, events, or graphs, as well as expertise in one or more of the following: training optimization, self-supervised learning, robustness, explainability, RLHF.

An engineering mindset as shown by a track record of delivering models at scale both in terms of training data and inference volumes.

Experience in delivering libraries, platform level code or solution level code to existing products.

A professional with a track record of coming up with new ideas or improving upon existing ideas in machine learning, demonstrated by accomplishments such as first author publications or projects.

The ability to own and pursue a research agenda, including choosing impactful research problems and autonomously carrying out long-running projects.

Key Responsibilities:

Partner with a cross-functional team of scientists, machine learning engineers, software engineers, and product managers to deliver AI-powered platforms and solutions that change how customers interact with their money.

Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation.

Inception creates the world’s fastest, most efficient AI models. Our Mercury model is the world’s fastest reasoning LLM and the first commercially available diffusion LLM, delivering 5x greater speed and efficiency than today’s LLMs, with best-in-class quality.

We are the AI researchers and engineers behind breakthrough AI technologies such as diffusion models, FlashAttention, and DPO.

The Role

We seek experienced engineers to architect and scale the core infrastructure behind distributed training pipelines and petabyte-scale data catalogs. You'll work directly with researchers to accelerate experiments, develop new datasets, improve infrastructure efficiency, and enable key insights across our data assets.

Key Responsibilities

  • Design, build, and operate scalable, fault-tolerant infrastructure for LLM research: distributed compute, data orchestration, and storage across modalities.

  • Develop high-throughput systems for data ingestion, processing, and transformation, including training data catalogs, deduplication, quality checks, and search.

  • Build systems for web crawling, data ingestion, and real-time data processing to support model training operations.

  • Develop tools and frameworks for efficient data storage, retrieval, and versioning across distributed systems.

  • Ensure data collection adheres to privacy regulations.
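One of the responsibilities above, deduplication of training data, can be sketched in its simplest exact-match form: key each document by a content hash and keep only the first occurrence. This is a toy illustration; real pipelines at this scale typically add near-duplicate detection (e.g., MinHash), which is omitted here.

```python
# Exact deduplication of a document stream by SHA-256 content hash.
# Memory grows with the number of distinct documents; large-scale
# pipelines would shard or use probabilistic structures instead.
import hashlib

def dedup_exact(docs):
    """Yield each distinct document once, in first-seen order."""
    seen = set()
    for doc in docs:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            yield doc

docs = ["hello world", "foo", "hello world", "bar"]
assert list(dedup_exact(docs)) == ["hello world", "foo", "bar"]
```

Hashing rather than storing the documents themselves keeps the seen-set small and makes the approach easy to distribute by partitioning on the digest.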

Qualifications

  • BS/MS/PhD in Computer Science, Machine Learning, or a related field (or equivalent experience).

  • 3+ years of experience building data processing pipelines at scale, particularly for AI/ML applications.

  • Strong proficiency in Python and experience with data processing frameworks (Apache Spark, Beam, Airflow).

  • Familiarity with synthetic data generation techniques and data augmentation strategies.

  • Familiarity with web scraping, crawling technologies, and Common Crawl datasets.

  • Solid understanding of machine learning fundamentals and experience with ML frameworks (PyTorch, TensorFlow).

  • Experience with SQL and NoSQL databases for managing structured and unstructured data.

Preferred Skills

  • Experience with large language models and an understanding of tokenization, embeddings, and model architectures.

  • Experience managing human annotation workflows and quality control processes.

  • Experience with vector databases and embedding-based retrieval systems.

  • Knowledge of data privacy regulations and ethical AI practices.

  • Experience with distributed computing and large-scale data storage systems (HDFS, S3, BigQuery).

Why Join Inception

  • Work with World-Class Talent: Collaborate with the inventors of diffusion models and leading AI researchers.

  • Shape Foundational Technology: Your decisions will influence how the next generation of AI products are built and used.

  • Immediate Impact: Join at the ground floor, where your contributions directly shape product direction and company trajectory.

Perks & Benefits

  • Competitive salary and equity in a rapidly growing startup

  • Flexible vacation and paid time off (PTO)

  • Health, dental, and vision insurance

  • Catered meals (breakfast, lunch, and dinner)

  • Commuter subsidies

  • A collaborative and inclusive culture