MLSys 2025 Career Opportunities
Here we highlight career opportunities submitted by our exhibitors and other top industry, academic, and non-profit leaders. We would like to thank each of our exhibitors for supporting MLSys 2025.
Search Opportunities
Lecturer / Senior Lecturer
The Department of Computer Science at the University of Bath invites applications for up to seven faculty positions at various ranks from candidates who are passionate about research and teaching in artificial intelligence and machine learning. These are permanent positions with no tenure process. The start date is flexible.
The University of Bath is based on an attractive, single-site campus that facilitates interdisciplinary research. It is located on the edge of the World Heritage City of Bath and offers the lifestyle advantages of working and living in one of the most beautiful areas in the United Kingdom.
For more information and to apply, please visit: https://www.bath.ac.uk/campaigns/join-the-department-of-computer-science/
VLM Run – Founding ML Systems Engineer / Researcher
We're building bleeding-edge infrastructure for Vision Language Models (VLMs). Join us as a founding engineer to reimagine the visual AI infrastructure layer for enterprises.
📍 Location: Santa Clara, CA (3+ days/week)
🧠 Roles: ML Systems Engineer & Applied ML/CV Researcher
💰 Comp: $150K – $220K + 0.5 – 3% equity
📬 Apply: hiring@vlm.run with GitHub + standout work
🧱 What We’re Building
VLM Run is a horizontal platform to fine-tune, serve, and specialize VLMs with structured JSON outputs — for docs, images, video, and beyond.
Think of it as the orchestration layer for next-gen visual agents — built on a developer-friendly API and production-grade runtime.
We're tackling:
- Fast inference: High-throughput, low-latency inference for multimodal ETL (vLLM-style, but for visual content like images, videos, and streaming content)
- Fine-tuning infra: Scalable fine-tuning and distillation for structured, multi-modal tasks (OCR++, layout parsing, video QA)
- Compiler infra: All kinds of optimizations to make our GPUs go brrr (OpenAI Triton kernels, speculative/guided decoding, etc.)
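To make the speculative decoding idea above concrete, here is a minimal toy sketch of the greedy variant: a cheap draft model proposes a block of tokens, and the expensive target model verifies them and keeps the longest agreeing prefix. Both "models" here are hypothetical stand-ins (simple deterministic functions), not VLM Run's actual stack.

```python
def draft_next(seq):
    # Hypothetical cheap draft model: predicts (last token + 1) mod 10.
    return (seq[-1] + 1) % 10

def target_next(seq):
    # Hypothetical expensive target model: agrees with the draft except after token 7.
    return 0 if seq[-1] == 7 else (seq[-1] + 1) % 10

def speculative_decode(prompt, num_tokens, k=4):
    """Greedy speculative decoding: identical output to decoding with the
    target model alone, but most tokens are proposed by the cheap draft."""
    seq = list(prompt)
    while len(seq) - len(prompt) < num_tokens:
        # 1) Draft proposes k tokens autoregressively (cheap).
        proposal, ctx = [], seq[:]
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2) Target verifies the k positions (in a real system, one batched pass).
        accepted, ctx = [], seq[:]
        for t in proposal:
            if target_next(ctx) == t:
                accepted.append(t)
                ctx.append(t)
            else:
                break
        # 3) Keep the accepted prefix; on a mismatch, emit the target's own token.
        seq += accepted
        if len(accepted) < k:
            seq.append(target_next(seq))
    return seq[len(prompt):][:num_tokens]
```

The payoff is that when draft and target mostly agree, each expensive verification pass yields several tokens instead of one, without changing the output.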
We’re early — you’ll define the infrastructure backbone of VLMs in production.
💡 Why This Matters
Most VLMs are stuck in demos — slow, flaky, and hard to deploy.
We're fixing that with:
- Developer-native APIs (not chat-based hacks)
- Structured JSON outputs for automation
- Fast, predictable inference on non-text modalities
You'll work on core ML systems — not glue code — with full ownership over compiler paths, serving infra, and fine-tuning pipelines.
👩‍💻 What You’ll Do
You'll shape the future of how VLMs are trained, served, and used in production. Your work could include:
- Building low-latency runtimes and speculative decoders
- Shipping distillation pipelines that power real-time visual agents
- Designing APIs that make visual data programmable for developers
✅ You Might Be a Fit If:
- Built or optimized ML compilers, kernels, or serving infra (Triton, vLLM, TVM, XLA, ONNX)
- Deep PyTorch/HuggingFace experience; trained ViTs or LLaMA/Qwen-class models
- 2+ YOE post-MS or 4+ YOE post-BS in ML infra, CV systems, or compiler teams
- Bonus: Published OSS or papers, shipped SaaS infra, or scaled training/serving infra
🌎 Logistics
- Compensation: $150K – $220K + 0.5 – 3% equity
- In-Person: 3+ days/week in Santa Clara, CA
- Benefits: Top-tier healthcare, 401(k), early ownership
🔗 Apply Now
📧 hiring@vlm.run
🌐 www.vlm.run
💼 LinkedIn
📎 Send GitHub, standout projects, or a quick note on why this is a fit.
Let’s build the future of visual intelligence — fast, structured, and programmable.
Fireworks AI – Lead Security Engineer
Location: Redwood City, CA or New York, NY
About Us:
Here at Fireworks, we’re building the future of generative AI infrastructure. Fireworks offers the generative AI platform with the highest-quality models and the fastest, most scalable inference. We’ve been independently benchmarked to have the fastest LLM inference and have been getting great traction with innovative research projects, like our own function calling and multi-modal models. Fireworks is funded by top investors, like Benchmark and Sequoia, and we’re an ambitious, fun team composed primarily of veterans from PyTorch and Google Vertex AI.
The Role:
As the Lead Security Engineer at Fireworks AI, you will be responsible for envisioning and implementing a world-class security program from the ground up. Our cutting-edge infrastructure, world-class inference technology, proprietary in-house research, and the open-source large language models we host operate at extreme scale and are prime targets for sophisticated threat actors the world over. You will be entrusted to secure our AI platform, models, and infrastructure from all manner of attackers.
Key Responsibilities:
- Hardening our multi-cloud infrastructure to secure customer models and compute clusters
- Defining and implementing a right-sized Secure Software Development Lifecycle
- Performing code, architecture, and system security reviews
- Designing a scalable security program by leveraging automation wherever possible
- Securing corporate-managed devices
- Conducting security assessments and risk analyses
- Ensuring compliance with security frameworks, regulations, and standards (e.g., SOC 2, ISO 27001, GDPR)
Minimum Qualifications:
- 5+ years of experience in application, product, or infrastructure security
- Experience working with Python and/or Go
- Experience working with AWS, GCP, and/or Oracle Cloud
- Knowledge of Docker and Kubernetes concepts
- Experience integrating security tools such as Snyk or Semgrep with common CI tools such as Jenkins, Circle CI, or GitHub Actions
- A high degree of comfort working in a Linux server environment, including on the CLI
Preferred Qualifications
- Experience securing Kubernetes clusters
- Familiarity with common web frameworks for Python and/or Go
- Experience securing multi-cloud environments
- Familiarity with Oracle Cloud security controls
- Experience working in complex codebases
- Experience working with EDR and/or XDR solutions
- Experience with IaC technology, such as Terraform
- Familiarity with mobile device management systems
- Experience securing Google Workspace
- Experience configuring identity providers such as Okta or One Login
Why Fireworks AI?
- Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure, from low-latency inference to scalable model serving.
- Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally.
- Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results.
- Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation.
Fireworks AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.
Location: San Jose, California, US
Alternate Location: San Francisco, CA; Seattle, WA
Meet the Team
Cisco’s AI Research team consists of AI research scientists, data scientists, and network engineers with subject matter expertise who collaborate on both basic and applied research projects. We are motivated by tackling unique research challenges that arise when connecting people and devices at a world-wide scale.
Who You’ll Work With
You will join a newly formed, dynamic AI team as one of the core members, and have the opportunity to influence the culture and direction of the growing team. Our team includes AI experts and networking domain experts who work together and learn from each other. We work closely with engineers, product managers and strategists who have deep expertise and experience in AI and/or distributed systems.
What You’ll Do
Your primary role is to produce research advances in the field of generative AI that improve the capabilities of models or agents for networking automation, human-computer interaction, model safety, or other strategic gen-AI-powered networking areas. You will research domain-specific foundational representations relevant to networking that provide differentiated value across diverse sets of applications. You will be a thought leader in the global research community, publishing papers, giving technical talks, and organizing workshops.
Minimum qualifications
- PhD in Computer Science or a relevant technical field with experience in an industry or academic research lab, or a Master's degree with strong LLM pre-training and post-training experience in an industry or academic research lab, plus a minimum of 3 publications at top AI venues such as ACL, EMNLP, ICLR, ICML, NAACL, or NeurIPS
- Experience working with machine learning models and familiarity with associated frameworks, such as TensorFlow, PyTorch, Hugging Face, or equivalent platforms
Preferred qualifications
- Experience driving research projects within an industry or university lab
- Interest in combining representation learning and problem-specific properties
- Experience building and fine-tuning foundation models, including LLMs, multi-modal models, or domain-specific models
- Ability to maintain cutting-edge knowledge in generative AI, Large Language Models (LLMs), and multi-modal models and apply these technologies innovatively to emerging business problems, use cases, and scenarios
- Outstanding communication, interpersonal, and relationship-building skills conducive to collaboration
- Experience working in an industrial research lab (full-time, internship, etc.)
Fireworks AI – AI Researcher
Location: Redwood City, CA or New York, NY
About Us:
Here at Fireworks, we’re building the future of generative AI infrastructure. Fireworks offers the generative AI platform with the highest-quality models and the fastest, most scalable inference. We’ve been independently benchmarked to have the fastest LLM inference and have been getting great traction with innovative research projects, like our own function calling and multi-modal models. Fireworks is funded by top investors, like Benchmark and Sequoia, and we’re an ambitious, fun team composed primarily of veterans from PyTorch and Google Vertex AI.
The Role:
As an AI Researcher, you’ll push the boundaries of generative AI, advancing LLMs and multimodal systems through foundational research. Your work will enhance model efficiency, accuracy, and scalability, directly shaping our high-performance AI infrastructure. You'll collaborate with top experts in deep learning, distributed systems, and optimization to bring cutting-edge research into real-world applications. You'll also have the opportunity to shape how some of the world’s leading companies build and deploy AI through the models and tools you help create.
Minimum Qualifications:
- Research background in Artificial Intelligence, Machine Learning, Physics, or similar field
- Experience solving analytical problems using analytic and quantitative approaches
- Experience communicating research to audiences with different backgrounds
- Experience coding in C/C++, Python, or other similar languages
Preferred Qualifications:
- PhD degree in Computer Science, Computational Physics, Mathematics, or a similar field
- Research and engineering experience demonstrated via grants, fellowships, patents, internships, work experience, and/or coding competitions
- Experience having first-authored publications at peer-reviewed conferences or journals
- Experience working with ML frameworks such as PyTorch, TensorFlow, or Jax
Why Fireworks AI?
- Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure, from low-latency inference to scalable model serving.
- Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally.
- Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results.
- Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation.
Fireworks AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.
Location: San Francisco, California
Founded in late 2020 by a small group of machine learning researchers, Mosaic AI enables companies to create state-of-the-art AI models from scratch on their own data. From a business perspective, Mosaic AI is committed to the belief that a company’s AI models are just as valuable as any other core IP, and that high-quality AI models should be available to all. From a scientific perspective, Mosaic AI is committed to reducing the cost of training state-of-the-art models - and sharing our knowledge about how to do so with the world - to allow everyone to innovate and create models of their own.
Now part of Databricks since July 2023 as the GenAI Team, we are passionate about enabling our customers to solve the world's toughest problems by building and running the world's best data and AI platform. We leap at every opportunity to solve technical challenges, striving to empower our customers with the best data and AI capabilities.
You will:
- Explore and analyze performance bottlenecks in ML training and inference
- Design, implement, and benchmark libraries and methods to overcome these bottlenecks
- Build tools for performance profiling, analysis, and estimation for ML training and inference
- Balance the tradeoff between performance and usability for our customers
- Support our community through documentation, talks, tutorials, and collaborations
- Collaborate with external researchers and leading AI companies on various efficiency methods
We look for:
- Hands-on experience with the internals of deep learning frameworks (e.g., PyTorch, TensorFlow) and deep learning models
- Experience with high-performance linear algebra libraries such as cuDNN, CUTLASS, Eigen, or MKL
- General experience with the training and deployment of ML models
- Experience with compiler technologies relevant to machine learning
- Experience with distributed systems development or distributed ML workloads
- Hands-on experience writing CUDA code and knowledge of GPU internals (preferred)
- Publications in top-tier ML or systems conferences such as MLSys, ICML, ICLR, KDD, or NeurIPS (preferred)
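Benchmarking work of the kind described above starts with disciplined measurement. The sketch below is a generic, stdlib-only illustration of that discipline (warmup runs, best-of-N timing), not Databricks or Mosaic AI tooling; the interpreter-bound loop versus the C-implemented builtin stands in for a slow versus an optimized kernel.

```python
import time

def benchmark(fn, *args, warmup=3, repeats=10):
    """Time fn(*args), returning the best wall-clock time over several runs.

    Warmup runs amortize one-time costs (allocation, caches, JIT) so the
    measurement reflects steady state; the same discipline applies when
    profiling GPU kernels, where you would also synchronize the device
    before reading the clock.
    """
    for _ in range(warmup):
        fn(*args)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

def slow_sum(xs):
    total = 0.0
    for x in xs:  # interpreter-bound: one bytecode dispatch per element
        total += x
    return total

data = [1.0] * 100_000
t_loop = benchmark(slow_sum, data)   # the "bottleneck" implementation
t_builtin = benchmark(sum, data)     # C-implemented reduction, same result
```

Taking the best of several runs (rather than the mean) filters out scheduler noise, which matters when deciding whether an optimization is a real win.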
We value candidates who are curious about all parts of the company's success and are willing to learn new technologies along the way.
The D. E. Shaw Group – Machine Learning Developer
Location: New York
Quantitative Strategies
Overview
Machine learning developers at the D. E. Shaw group work closely with researchers to creatively apply their knowledge of machine learning and software engineering to design, build, and maintain systems for high-performance, large-scale knowledge discovery in financial data. Machine learning developers have the opportunity to be part of an inclusive, collaborative, and engaging working environment.
What you’ll do day-to-day
Specific responsibilities include designing, implementing, testing, and documenting modules for all stages of the pipeline from data to predictions, assembling these modules into end-to-end systems, and interacting with researchers to achieve highly productive experimentation, model construction, and validation.
Who we’re looking for
- Successful candidates will have a strong knowledge of software engineering, machine learning, and open-source machine learning ecosystems. A track record of building and applying high-performance machine learning systems is desired. While an impressive record of academic achievement is a plus, we welcome outstanding candidates from diverse academic disciplines and backgrounds.
- The expected annual base salary for this position is USD 250,000 to USD 350,000. Our compensation and benefits package includes substantial variable compensation in the form of a year-end bonus, guaranteed in the first year of hire, a sign-on bonus, and benefits including medical and prescription drug coverage, 401(k) contribution matching, wellness reimbursement, family building benefits, and a charitable gift match program.
AWS – Software Engineer, AWS Neuron Compiler
Location: Seattle, WA or Cupertino, CA
Description
Do you want to be part of the AI revolution? At AWS, our vision is to make deep learning pervasive for everyday developers and to democratize access to AI hardware and software infrastructure. To deliver on that vision, we’ve created innovative software and hardware solutions that make it possible. AWS Neuron is the SDK that optimizes the performance of complex ML models executed on AWS Inferentia and Trainium, our custom chips designed to accelerate deep-learning workloads.
This role is for a software engineer on the Compiler team for AWS Neuron. As part of this role, you will be responsible for building the next-generation Neuron compiler, which transforms ML models written in ML frameworks (e.g., PyTorch, TensorFlow, and JAX) to be deployed on AWS Inferentia and Trainium based servers in the Amazon cloud. You will be responsible for solving hard compiler optimization problems to achieve optimal performance for a variety of ML model families, including massive-scale large language models like Llama and DeepSeek, as well as stable diffusion, vision transformers, and multi-modal models. You will be required to understand how these models work inside and out to make informed decisions on how best to coax the compiler into generating optimal implementations. You will leverage your technical communication skills to partner with internal and external customers and stakeholders, and will be involved in pre-silicon design and bringing new products and features to market, ultimately making the Neuron compiler highly performant and easy to use.
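To give a flavor of the compiler optimization problems mentioned above, here is a toy constant-folding pass over a miniature expression IR. This is a self-contained sketch of the general technique, not the Neuron compiler's actual IR or API; all node names are hypothetical.

```python
from dataclasses import dataclass

# Miniature expression IR: constants, variables, and two binary ops.
@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Add:
    lhs: object
    rhs: object

@dataclass(frozen=True)
class Mul:
    lhs: object
    rhs: object

def fold(node):
    """Recursively replace operations on constants with their results,
    plus two algebraic identities (x * 1 -> x, x + 0 -> x)."""
    if isinstance(node, (Const, Var)):
        return node
    lhs, rhs = fold(node.lhs), fold(node.rhs)
    if isinstance(lhs, Const) and isinstance(rhs, Const):
        op = (lambda a, b: a + b) if isinstance(node, Add) else (lambda a, b: a * b)
        return Const(op(lhs.value, rhs.value))
    if isinstance(node, Mul) and rhs == Const(1.0):
        return lhs
    if isinstance(node, Add) and rhs == Const(0.0):
        return lhs
    return type(node)(lhs, rhs)

# (x * (2 * 3)) + 0  folds to  x * 6
expr = Add(Mul(Var("x"), Mul(Const(2.0), Const(3.0))), Const(0.0))
```

Production compilers run many such rewrites over tensor graphs rather than scalar expressions, but the shape is the same: traverse the IR, simplify what is statically known, and rebuild the rest unchanged.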
Experience in object-oriented languages like C++ or Java is a must; experience with compilers, or with building ML models using ML frameworks on accelerators (e.g., GPUs), is preferred but not required. Experience with technologies like OpenXLA, StableHLO, or MLIR is an added bonus!
Explore the product and our history!
https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-cc/index.html
https://aws.amazon.com/machine-learning/neuron/
https://github.com/aws/aws-neuron-sdk
https://www.amazon.science/how-silicon-innovation-became-the-secret-sauce-behind-awss-success
AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services.
Key job responsibilities
You will design, implement, test, deploy, and maintain innovative software solutions to transform the Neuron compiler’s performance, stability, and user interface. You will work side by side with chip architects, runtime/OS engineers, scientists, and ML apps teams to seamlessly deploy state-of-the-art ML models from our customers on AWS accelerators with optimal cost/performance benefits. You will have the opportunity to work with open-source software (e.g., StableHLO, OpenXLA, MLIR) to pioneer the optimization of advanced ML workloads on AWS software and hardware. You will also build innovative features that deliver the best possible experience for our customers – developers across the globe.
Fireworks AI – Applied Machine Learning Engineer
Location: Redwood City, CA or New York, NY
About Us:
Here at Fireworks, we’re building the future of generative AI infrastructure. Fireworks offers the generative AI platform with the highest-quality models and the fastest, most scalable inference. We’ve been independently benchmarked to have the fastest LLM inference and have been getting great traction with innovative research projects, like our own function calling and multi-modal models. Fireworks is funded by top investors, like Benchmark and Sequoia, and we’re an ambitious, fun team composed primarily of veterans from PyTorch and Google Vertex AI.
Job Overview
As an Applied Machine Learning Engineer, you will serve as a vital bridge between cutting-edge AI research and practical, real-world applications. Your work will focus on developing, fine-tuning, and operationalizing machine learning models that drive business value and enhance user experiences. This is a hands-on engineering role that combines deep technical expertise with a strong customer focus to deliver scalable AI solutions.
Responsibilities
- Customer Success: Collaborate directly with the GTM team (Account Executives and Solutions Architects) to ensure smooth integration and successful deployment of ML solutions.
- Demo / Proof of Concept (PoC): Build and present compelling PoCs that demonstrate the capabilities of our AI technology.
- Application Build: Design, develop, and deploy end-to-end AI-powered applications tailored to customer needs.
- Platform Features / Bug Fixes: Contribute to the internal ML platform, including adding features and resolving issues.
- New Model Enablements: Integrate and enable new machine learning models into the existing platform or client environments.
- Performance Optimizations: Improve system performance, efficiency, and scalability of deployed models and applications.
- Partnership Enablement: Work closely with partners to enable joint AI solutions and ensure seamless collaboration.
Minimum Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related technical field.
- 5+ years of experience in a software engineering role, with a strong preference for customer-facing roles.
- Robust coding skills required, preferably with proficiency in Python.
- Demonstrated ability to lead and execute complex technical projects with a focus on customer success.
- Strong interpersonal and communication skills; ability to thrive in dynamic, cross-functional teams.
Preferred Qualifications
- Master’s degree in Computer Science, Engineering, or a related technical field.
- Experience working in a startup or fast-paced environment.
- Hands-on experience fine-tuning machine learning models, including supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF or RFT).
- Solid understanding of generative AI, machine learning principles, and enterprise infrastructure.
Why Fireworks AI?
- Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure, from low-latency inference to scalable model serving.
- Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally.
- Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results.
- Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation.
Fireworks AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.
Location: San Jose, California, US
Alternate Location: San Francisco, CA; Seattle, WA
Why You’ll Love Cisco
Everything is converging on the Internet, making networked connections more meaningful than ever before in our lives. Our employees' groundbreaking ideas impact everything. Here, that means we take creative ideas from the drawing board to build dynamic solutions that have real-world impact. You'll collaborate with Cisco leaders, partner with mentors, and develop incredible relationships with colleagues who share your interest in connecting the unconnected. You'll be part of a team that cares about its customers and enjoys having fun, and you'll take part in changing the lives of those in our local communities. Come prepared to be encouraged and inspired.
Who We Are
Cisco’s AI Research team consists of AI research scientists, data scientists, and network engineers with subject matter expertise who collaborate on both basic and applied research projects. We are motivated by tackling unique research challenges that arise when connecting people and devices at a world-wide scale.
Who You’ll Work With
You will join a newly formed, dynamic AI team as one of the core members, and have the opportunity to influence the culture and direction of the growing team. Our team includes AI experts and networking domain experts who work together and learn from each other. We work closely with engineers, product managers and strategists who have deep expertise and experience in AI and/or distributed systems.
What You’ll Do
Your primary role is to produce research advances in the field of generative AI that improve the capabilities of models or agents for networking automation, human-computer interaction, model safety, or other strategic gen-AI-powered networking areas. You will research domain-specific foundational representations relevant to networking that provide differentiated value across diverse sets of applications. You will be a thought leader in the global research community, publishing papers, giving technical talks, and organizing workshops.
Minimum qualifications
- PhD in Computer Science or a relevant technical field and 2+ years of experience in an industry or academic research lab, or a Master's degree and 6+ years of experience in an industry or academic research lab, plus a minimum of 3 publications at top AI venues such as ACL, EMNLP, ICLR, ICML, NAACL, or NeurIPS
- Experience working with machine learning models and familiarity with associated frameworks, such as TensorFlow, PyTorch, Hugging Face, or equivalent platforms
Preferred qualifications
- Experience driving research projects within an industry or university lab
- Interest in combining representation learning and problem-specific properties
- Experience building and fine-tuning foundation models, including LLMs, multi-modal models, or domain-specific models
- Ability to maintain cutting-edge knowledge in generative AI, Large Language Models (LLMs), and multi-modal models and apply these technologies innovatively to emerging business problems, use cases, and scenarios
- Outstanding communication, interpersonal, and relationship-building skills conducive to collaboration
- Experience working in an industrial research lab (full-time, internship, sabbatical, etc.)