



MLSys 2024 Career Website

The MLSys 2024 conference is not accepting applications to post at this time.

Here we highlight career opportunities submitted by our Exhibitors and other top industry, academic, and non-profit leaders. We would like to thank each of our exhibitors for supporting MLSys 2024. Opportunities can be sorted by job category or location, and filtered by any other field using the search box. For information on how to post an opportunity, please visit the help page linked in the navigation bar above.

Search Opportunities

Please use the link below to review all opportunities at Cerebras Systems. We are actively hiring across our Machine Learning, Software, Hardware, Systems, Manufacturing, and Product organizations.

Why Join Cerebras

People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we’ve reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:

  1. Build a breakthrough AI platform beyond the constraints of the GPU
  2. Publish and open source their cutting-edge AI research
  3. Work on one of the fastest AI supercomputers in the world
  4. Enjoy job stability with startup vitality
  5. Enjoy a simple, non-corporate work culture that respects individual beliefs

Read our blog: Five Reasons to Join Cerebras in 2024.

Apply today and join the forefront of groundbreaking advancements in AI.

Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.


Apply

d-Matrix has fundamentally changed the physics of memory-compute integration with our digital in-memory compute (DIMC) engine. The “holy grail” of AI compute has been to break through the memory wall and minimize data movement. We’ve achieved this with a first-of-its-kind DIMC engine. Having secured over $154M, including $110M in our Series B offering, d-Matrix is poised to scale generative inference acceleration for Large Language Models with our chiplet and in-memory compute approach. We are on track to deliver our first commercial product in 2024 and to meet the energy and performance demands of these Large Language Models. The company has 100+ employees across Silicon Valley, Sydney and Bengaluru.
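To make the memory wall concrete: in batch-1 generative inference, every model weight must be streamed from memory for each generated token, so bandwidth rather than compute becomes the bottleneck. The back-of-the-envelope sketch below (in Python, with assumed model and hardware numbers that are illustrative, not d-Matrix figures) shows why:

```python
# Rough arithmetic-intensity estimate for single-batch LLM token generation.
# All numbers are assumptions for illustration.
params = 7e9                # assumed 7B-parameter model
bytes_per_param = 2         # fp16/bf16 weights
flops_per_token = 2 * params            # ~one multiply-add per parameter
bytes_per_token = params * bytes_per_param

intensity = flops_per_token / bytes_per_token   # FLOPs per byte moved
print(f"arithmetic intensity: {intensity:.1f} FLOPs/byte")  # ~1.0

# An accelerator with, say, 500 TFLOPs of compute and 3 TB/s of memory
# bandwidth needs ~167 FLOPs/byte to stay compute-bound, so token generation
# is memory-bound by two orders of magnitude -- the wall that moving compute
# into memory is meant to break.
```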

Our pedigree comes from companies like Microsoft, Broadcom, Inphi, Intel, Texas Instruments, Lucent, MIPS and Wave Computing. Our past successes include building chips for all the cloud hyperscalers globally - Amazon, Facebook, Google, Microsoft, Alibaba, Tencent - along with enterprise and mobile operators like China Mobile, Cisco, Nokia, Ciena, Reliance Jio, Verizon, and AT&T. We are recognized leaders in the mixed-signal and DSP connectivity space, now applying our skills to next-generation AI.

ML Compiler Backend Developer https://jobs.ashbyhq.com/d-Matrix/ed7241c7-8fe0-4023-9813-efb93b43180f

Machine Learning Senior Staff https://jobs.ashbyhq.com/d-Matrix/7bd32e05-677e-48ec-98cb-fbfb4c6a14f3

Machine Learning Performance Architect https://jobs.ashbyhq.com/d-Matrix/64ba00d5-55b7-44c6-a564-eba934c07c2b

Software Quality Assurance (SQA) Engineer https://jobs.ashbyhq.com/d-Matrix/bc81c7b1-98aa-40a9-99b7-740592585da0

AI / ML System Software Engineer https://jobs.ashbyhq.com/d-Matrix/71b6738b-1b65-4471-8505-6893e4261ae0


Apply

San Francisco, CA


As a Systems Research Engineer specialized in Machine Learning Systems, you will play a crucial role in researching and building the next generation AI platform at Together. Working closely with the modeling, algorithm, and engineering teams, you will design large-scale distributed training systems and a low-latency/high-throughput inference engine that serves a diverse, rapidly growing user base. Your research skills will be vital in staying up-to-date with the latest advancements in machine learning systems, ensuring that our AI infrastructure remains at the forefront of innovation.
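To give a concrete flavor of the systems work this role describes, here is a minimal sketch of a distributed data-parallel training loop, assuming PyTorch with the NCCL backend and a torchrun launch; it is an illustrative toy, not Together's actual training stack:

```python
# Minimal distributed data-parallel training loop (illustrative sketch).
# Launch with: torchrun --nproc_per_node=<num_gpus> train_sketch.py
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")                  # one process per GPU
    rank = dist.get_rank()
    device = torch.device("cuda", rank % torch.cuda.device_count())

    model = torch.nn.Linear(1024, 1024).to(device)   # stand-in for a real model
    model = DDP(model, device_ids=[device.index])    # syncs grads via all-reduce
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):
        x = torch.randn(32, 1024, device=device)     # stand-in batch
        loss = model(x).pow(2).mean()
        opt.zero_grad()
        loss.backward()          # gradient all-reduce overlaps with backward
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Scaling loops like this one to large clusters, and driving latency out of the serving path, is the day-to-day substance of the role.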

Requirements

- Strong background in machine learning systems, such as distributed learning and efficient inference for large language models and diffusion models
- Knowledge of ML/AI applications and models, especially foundation models such as large language models and diffusion models, how they are constructed and how they are used
- Knowledge of system performance profiling and optimization tools for ML systems
- Excellent problem-solving and analytical skills
- Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or equivalent practical experience

Responsibilities

- Optimize and fine-tune the existing training and inference platform to achieve better performance and scalability
- Collaborate with cross-functional teams to integrate cutting-edge research ideas into existing software systems
- Develop your own ideas for optimizing the training and inference platforms and push the frontier of machine learning systems research
- Stay up-to-date with the latest advancements in machine learning systems techniques and apply them to the Together platform

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.


Apply

As a Capital One Machine Learning Engineer (MLE), you'll be part of an Agile team dedicated to productionizing machine learning applications and systems at scale. You’ll participate in the detailed technical design, development, and implementation of machine learning applications using existing and emerging technology platforms. You’ll focus on machine learning architectural design, develop and review model and application code, and ensure high availability and performance of our machine learning applications. You'll have the opportunity to continuously learn and apply the latest innovations and best practices in machine learning engineering.


Apply

d-Matrix has fundamentally changed the physics of memory-compute integration with our digital in-memory compute (DIMC) engine. The “holy grail” of AI compute has been to break through the memory wall and minimize data movement. We’ve achieved this with a first-of-its-kind DIMC engine. Having secured over $154M, including $110M in our Series B offering, d-Matrix is poised to scale generative inference acceleration for Large Language Models with our chiplet and in-memory compute approach. We are on track to deliver our first commercial product in 2024 and to meet the energy and performance demands of these Large Language Models. The company has 100+ employees across Silicon Valley, Sydney and Bengaluru.

Our pedigree comes from companies like Microsoft, Broadcom, Inphi, Intel, Texas Instruments, Lucent, MIPS and Wave Computing. Our past successes include building chips for all the cloud hyperscalers globally - Amazon, Facebook, Google, Microsoft, Alibaba, Tencent - along with enterprise and mobile operators like China Mobile, Cisco, Nokia, Ciena, Reliance Jio, Verizon, and AT&T. We are recognized leaders in the mixed-signal and DSP connectivity space, now applying our skills to next-generation AI.

Location:

Hybrid, working onsite at our Santa Clara, CA headquarters 3 days per week.

The role: Software Engineer, Staff - Kernels

What you will do:

In this role you will be part of the team that productizes the software stack for our AI compute engine. As part of the Software team, you will develop, enhance, and maintain software kernels for next-generation AI hardware. You have experience building software kernels for hardware architectures, a very strong understanding of those architectures, and a command of how to map algorithms, and the computational graphs generated by AI frameworks, onto them. You have worked across the full-stack toolchain, understand the nuances of optimizing and trading off various aspects of hardware-software co-design, and can build and scale software deliverables in a tight development window. You will work with a team of compiler experts to build out the compiler infrastructure, collaborating closely with other software (ML, Systems) and hardware (mixed signal, DSP, CPU) experts in the company.
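As an illustration of what "mapping algorithms to the architecture" means in practice, the sketch below tiles a matrix multiply so that each fixed-size block of work can be dispatched to a compute unit; it is a hedged NumPy toy with an assumed tile size, not d-Matrix kernel code:

```python
# Illustrative tiled GEMM: the tiling/blocking pattern kernel developers use
# to map a large operator onto fixed-size hardware compute units.
import numpy as np

TILE = 64  # assumed tile size matching a hypothetical accelerator array

def tiled_matmul(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    M, K = A.shape
    K2, N = B.shape
    assert K == K2 and M % TILE == 0 and N % TILE == 0 and K % TILE == 0
    C = np.zeros((M, N), dtype=A.dtype)
    for i in range(0, M, TILE):
        for j in range(0, N, TILE):
            acc = np.zeros((TILE, TILE), dtype=A.dtype)
            for k in range(0, K, TILE):
                # Each (i, j, k) tile product is the unit of work a kernel
                # would dispatch to the hardware, keeping operands local.
                acc += A[i:i+TILE, k:k+TILE] @ B[k:k+TILE, j:j+TILE]
            C[i:i+TILE, j:j+TILE] = acc
    return C
```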

What you will bring:

Minimum:

MS or PhD in Computer Engineering, Math, Physics or related degree with 5+ years of industry experience.

Strong grasp of computer architecture, data structures, system software, and machine learning fundamentals. 

Proficient in C/C++ and Python development in a Linux environment using standard development tools.

Experience implementing algorithms in high-level languages such as C/C++ and Python.

Experience implementing algorithms for specialized hardware such as FPGAs, DSPs, GPUs, and AI accelerators using frameworks such as CUDA.

Experience implementing operators commonly used in ML workloads: GEMMs, convolutions, BLAS routines, and SIMD operators for operations like softmax, layer normalization, and pooling (see the sketch after this list).

Experience with development for embedded SIMD vector processors such as Tensilica. 

Self-motivated team player with a strong sense of ownership and leadership. 
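
The sketch referenced above shows two of the listed operators in plain NumPy; production kernels would realize the same math with SIMD/vector intrinsics on the target hardware, but the numerics are the same:

```python
# Reference implementations of softmax and layer normalization, two
# operators named in the qualifications above.
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # Subtract the row max before exponentiating for numerical stability.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x: np.ndarray, gamma: np.ndarray, beta: np.ndarray,
               eps: float = 1e-5) -> np.ndarray:
    # Normalize over the last dimension, then apply learned scale and shift.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta
```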

Preferred:

Prior startup, small team or incubation experience. 

Experience with ML frameworks such as TensorFlow and/or PyTorch.

Experience working with ML compilers and algorithms, such as MLIR, LLVM, TVM, Glow, etc.

Experience with a deep learning framework (such as PyTorch or TensorFlow) and ML models for CV, NLP, or Recommendation.

Work experience at a cloud provider or AI compute / sub-system company.


Apply