



MLSys 2024 Career Website

The MLSys 2024 conference is not accepting applications to post at this time.

Here we highlight career opportunities submitted by our Exhibitors and other top industry, academic, and non-profit leaders. We would like to thank each of our exhibitors for supporting MLSys 2024. Opportunities can be sorted by job category and location, and filtered by any other field using the search box. For information on how to post an opportunity, please visit the help page, linked in the navigation bar above.

Search Opportunities

d-Matrix has fundamentally changed the physics of memory-compute integration with our digital in-memory compute (DIMC) engine. The “holy grail” of AI compute has been to break through the memory wall to minimize data movement. We’ve achieved this with a first-of-its-kind DIMC engine. Having secured over $154M in funding, including $110M in our Series B offering, d-Matrix is poised to scale generative inference acceleration for Large Language Models with our chiplet and in-memory compute approach. We are on track to deliver our first commercial product in 2024 and to meet the energy and performance demands of these Large Language Models. The company has 100+ employees across Silicon Valley, Sydney, and Bengaluru.

Our pedigree comes from companies like Microsoft, Broadcom, Inphi, Intel, Texas Instruments, Lucent, MIPS, and Wave Computing. Our past successes include building chips for all the global cloud hyperscalers - Amazon, Facebook, Google, Microsoft, Alibaba, Tencent - along with enterprise and mobile operators like China Mobile, Cisco, Nokia, Ciena, Reliance Jio, Verizon, and AT&T. We are recognized leaders in the mixed-signal and DSP connectivity space, now applying our skills to next-generation AI.

Location:

Hybrid, working onsite at our Santa Clara, CA headquarters 3 days per week.

What You Will Do:

In this role, you will be part of the team that productizes the software stack for our AI compute engine. As part of the Software team, you will be responsible for developing, enhancing, and maintaining the next-generation AI hardware simulation tools and for developing software kernels for the hardware. You have experience building functional simulators for new hardware architectures, a very strong understanding of various hardware architectures, and know how to map algorithms onto them. You understand how to map computational graphs generated by AI frameworks to the underlying architecture. You have worked across all aspects of the full-stack toolchain and understand the nuances of optimizing and trading off various aspects of hardware-software co-design. You are able to build and scale software deliverables in a tight development window. You will work with a team of compiler experts to build out the compiler infrastructure, working closely with other software (ML, Systems) and hardware (mixed signal, DSP, CPU) experts in the company.
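To make the simulator responsibility concrete, here is a toy sketch of a functional simulator: it models instruction semantics rather than cycle-accurate timing, and shows how a small computational graph can be lowered onto an invented three-op ISA. The ISA, register-file layout, and example program are illustrative assumptions, not d-Matrix's design.

```python
import numpy as np

# Toy functional simulator for a hypothetical accelerator ISA: it models
# what each instruction computes, not how many cycles it takes.
class FunctionalSim:
    def __init__(self, num_regs=16):
        self.regs = [None] * num_regs          # register file holding tensors

    def load(self, rd, tensor):
        self.regs[rd] = np.asarray(tensor)

    def execute(self, program):
        # Each instruction is (opcode, dst, src1, src2).
        for op, rd, ra, rb in program:
            if op == "matmul":
                self.regs[rd] = self.regs[ra] @ self.regs[rb]
            elif op == "add":
                self.regs[rd] = self.regs[ra] + self.regs[rb]
            elif op == "relu":
                self.regs[rd] = np.maximum(self.regs[ra], 0)
            else:
                raise ValueError(f"unknown opcode: {op}")
        return self.regs

# A small computational graph, y = relu(W @ x + b), lowered to the toy ISA.
if __name__ == "__main__":
    sim = FunctionalSim()
    sim.load(0, np.random.randn(4, 8))   # W
    sim.load(1, np.random.randn(8, 1))   # x
    sim.load(2, np.random.randn(4, 1))   # b
    program = [
        ("matmul", 3, 0, 1),
        ("add",    4, 3, 2),
        ("relu",   5, 4, 4),             # src2 unused for unary ops
    ]
    regs = sim.execute(program)
    print(regs[5].shape)                 # (4, 1)
```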

What You Will Bring:

• MS or PhD in Computer Science, Electrical Engineering, Math, Physics, or a related field preferred, with 12+ years of industry experience.

• Strong grasp of computer architecture, data structures, system software, and machine learning fundamentals. 

• Proficient in C/C++ and Python development in a Linux environment using standard development tools.

• Experience implementing functional simulators in high-level languages such as C/C++ or Python.

• Self-motivated team player with a strong sense of ownership and leadership.

Desired: 

• Prior startup, small team or incubation experience. 

• Experience implementing algorithms for specialized hardware such as FPGAs, DSPs, GPUs, AI accelerators. 

• Experience with ML algorithms and frameworks such as PyTorch and/or TensorFlow.

• Experience with ML compilers and frameworks such as MLIR, LLVM, TVM, Glow.

• Experience with a deep learning framework (such as PyTorch, Tensorflow) and ML models for CV, NLP, or Recommendation. 

• Work experience at a cloud provider or AI compute / sub-system company.


Apply

MatX is on a mission to be the compute platform for AGI. We are developing vertically integrated, full-stack solutions, from silicon to systems, including hardware and software, to train and run the largest ML workloads for AGI. We are looking for people who are excited about systems-focused ML research.

Responsibilities include:

- Train and optimize LLMs for our hardware
- Run quality evaluations
- Build and set up distributed infrastructure for training and inference
- Advise on the hardware architecture from an ML perspective

Requirements:

- Excellent software engineering skills
- Experience training and tweaking neural networks, ideally LLMs
- Perhaps: experience optimizing neural networks for hardware efficiency, for example regarding FLOPs, memory bandwidth, communication bandwidth, precision, parallel layout, batch sizes

Compensation: The US base salary range for this full-time position is $120,000 - $400,000 + equity + benefits.

As part of our dedication to the diversity of our team and our focus on creating an inviting and inclusive work experience, MatX is committed to a policy of Equal Employment Opportunity and will not discriminate against an applicant or employee on the basis of race, color, religion, creed, national origin or ancestry, sex, gender, gender identity, gender expression, sexual orientation, age, physical or mental disability, medical condition, marital/domestic partner status, military and veteran status, genetic information or any other legally recognized protected basis under federal, state or local laws, regulations or ordinances.

All candidates must be authorized to work in the United States and work from our offices in Mountain View Tuesdays-Thursdays.

This position requires access to information that is subject to U.S. export controls. This offer of employment is contingent upon the applicant's capacity to perform job functions in compliance with U.S. export control laws without obtaining a license from U.S. export control authorities.


Apply

Location: Palo Alto, CA - Hybrid


The era of pervasive AI has arrived. In this era, organizations will use generative AI to unlock hidden value in their data, accelerate processes, reduce costs, drive efficiency and innovation to fundamentally transform their businesses and operations at scale.

SambaNova Suite™ is the first full-stack, generative AI platform, from chip to model, optimized for enterprise and government organizations. Powered by the intelligent SN40L chip, the SambaNova Suite is a fully integrated platform, delivered on-premises or in the cloud, combined with state-of-the-art open-source models that can be easily and securely fine-tuned using customer data for greater accuracy. Once adapted with customer data, customers retain model ownership in perpetuity, so they can turn generative AI into one of their most valuable assets.

Job Summary

We are looking for a world-class engineering leader to guide a team of talented Machine Learning engineers and researchers driving the development and innovation of our vision technology. You must thrive in a fast-paced environment, where you'll work closely with cross-functional teams to optimize performance and drive velocity. Leveraging cutting-edge techniques, you will play a vital role in our overall success in deploying state-of-the-art AI capabilities all around the globe.

Responsibilities

- Lead and mentor a high-performing team of machine learning engineers in a fast-paced environment, providing technical guidance, mentorship, and support to drive their professional growth and development.
- Oversee the rapid development and implementation of machine learning models, leveraging advanced algorithms and techniques to optimize performance.
- Collaborate closely with cross-functional teams, including product managers, software engineers, and data engineers, to deliver data-driven insights and recommendations that enhance our solutions in an agile environment.
- Stay at the forefront of industry trends, emerging technologies, and best practices in machine learning, vision, and MLOps. Apply this knowledge to drive innovation, meet tight deadlines, and maintain a competitive edge.
- Establish and maintain strong relationships with stakeholders, providing clear communication of technical concepts and findings to both technical and non-technical audiences.

Skills & Qualifications

- Master's or PhD in a quantitative field such as Data Science, Computer Science, Statistics, or a related discipline.
- 10+ years of experience in Machine Learning, with a focus in vision.
- 5+ years of proven success in technical leadership, delivering impactful projects across the organization.
- Strong expertise in machine learning algorithms and data analysis techniques.
- Proficiency in Python, with hands-on experience using machine learning libraries and frameworks such as PyTorch, TensorFlow, or JAX.
- Strong communication and collaboration skills, with the ability to effectively convey technical concepts to both technical and non-technical stakeholders in a fast-paced context.
- Experience and familiarity with production ML environments, including model release, evaluation, and monitoring.

Preferred Qualifications

- Track record of published ML papers and/or blogs.
- Track record of engagement with the open-source ML community.
- Experience with Vision applications in AI for Science, Oil and Gas, or medical imaging.
- Experience with Vision and Multi-modal foundation models such as Stable Diffusion, ViT, and CLIP.
- Experience with performance optimization of ML models.
- 2+ years of experience in a startup environment.


Apply

US and Canada only

Cerebras' systems are designed with a singular focus on machine learning. Our processor is the Wafer Scale Engine (WSE), a single chip with performance equivalent to a cluster of GPUs, giving the user cluster-scale capability with the simplicity of programming a single device. Because of this programming simplicity, large model training can be scaled out using simple data parallelism to the performance of thousands of GPUs. ML practitioners can focus on their machine learning, rather than parallelizing and distributing their applications across many devices. The Cerebras hardware architecture is designed with unique capabilities including orders of magnitude higher memory bandwidth and unstructured sparsity acceleration, not accessible on traditional GPUs. With a rare combination of cutting-edge hardware and deep expertise in machine learning, we stand among the select few global organizations capable of conducting large-scale innovative deep learning research and developing novel ML algorithms not possible on traditional hardware.
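As context for the data-parallelism claim above, here is a minimal, generic PyTorch DistributedDataParallel sketch of the per-device setup that multi-GPU training normally requires. This is illustrative only, not Cerebras code: the toy model, dummy loss, and `torchrun` launch are assumptions.

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# Generic GPU-cluster data parallelism (launched with `torchrun`), shown only
# to illustrate the per-device coordination that a single large device avoids.
def main():
    dist.init_process_group(backend="nccl")          # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])       # set by torchrun
    torch.cuda.set_device(local_rank)

    model = nn.Linear(1024, 1024).cuda(local_rank)   # placeholder model
    model = DDP(model, device_ids=[local_rank])      # gradient sync across ranks

    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    for _ in range(10):                              # toy training loop
        x = torch.randn(32, 1024, device=local_rank)
        loss = model(x).square().mean()              # dummy loss
        opt.zero_grad()
        loss.backward()                              # DDP all-reduces gradients here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Run with, e.g., `torchrun --nproc_per_node=8 train_ddp.py` (the filename is hypothetical); the gradient all-reduce during `backward()` is the distribution work the paragraph above says practitioners can sidestep.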

About the role

Cerebras has senior and junior research scientist roles open with a focus on co-design and demonstration of novel state-of-the-art ML algorithms on this unique specialized architecture. We are working on research areas including advancing and scaling foundation models for natural language processing and multi-modal applications, new weight and activation sparsity algorithms, and novel efficient training techniques. A key responsibility of our group is to ensure that state-of-the-art techniques can be applied systematically across many important applications.

As part of the Core ML team, you will have the unique opportunity to research state-of-the-art models as part of a collaborative and close-knit team. We deliver important demos of Cerebras capability as well as publish our findings as ways to support and engage with the community. A key aspect of the senior role will also be to provide active guidance and mentorship to other talented and passionate scientists and engineers.

Research Directions

Our research focuses on improving state-of-the-art foundation models in NLP, computer vision, and multi-modal settings by studying many dimensions unique to the Cerebras architecture:

  • Scaling laws to predict and analyze large-scale training improvements: accuracy/loss, architecture scaling, and hyperparameter transfer (an illustrative functional form follows this list)
  • Sparse and low-precision training algorithms for reduced training time and increased accuracy. For instance, weight and activation sparsity, mixture-of-experts, and low-rank adaptation
  • Optimizers, initializers, normalizers to improve training dynamics and efficiency
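For readers unfamiliar with the first direction above, scaling-law studies typically fit empirical power laws to held-out loss so that large-scale behavior can be predicted from smaller runs. A minimal illustration of the functional form (following Kaplan et al., 2020; not a Cerebras-specific result):

```latex
% Empirical scaling laws for language-model loss (Kaplan et al., 2020),
% shown for illustration: N is parameter count, D is dataset size, and
% N_c, D_c, \alpha_N, \alpha_D are constants fitted to measured runs.
L(N) = \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) = \left(\frac{D_c}{D}\right)^{\alpha_D}
```

Fits of this kind are what let researchers extrapolate loss across model and dataset scale, and reason about hyperparameter transfer, before committing to a large training run.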

Responsibilities

  • Develop novel training algorithms that advance state-of-the-art in model quality and compute efficiency
  • Develop novel network architectures that address foundational challenges in language and multi-modal domains
  • Co-design ML algorithms that take advantage of existing unique Cerebras hardware advantages and collaborate with engineers to co-design next generation architectures
  • Design and run research experiments that show novel algorithms are efficient and robust
  • Analyze results to gain research insights, including training dynamics, gradient quality, and dataset preprocessing techniques
  • Publish and present research at leading machine learning conferences
  • Collaborate with engineers in co-design of the product to bring the research to customers

Requirements

  • Strong grasp of machine learning theory, fundamentals, linear algebra, and statistics
  • Experience with state-of-the-art models, such as GPT, LLaMA, DALL-E, PaLI, or Stable Diffusion
  • Experience with machine learning frameworks, such as TensorFlow and PyTorch.
  • Strong track record of research success through relevant publications at top conferences or journals (e.g. ICLR, ICML, NeurIPS), or patents and patent applications

Apply

San Francisco, CA


As a Systems Research Engineer specialized in Machine Learning Systems, you will play a crucial role in researching and building the next generation AI platform at Together. Working closely with the modeling, algorithm, and engineering teams, you will design large-scale distributed training systems and a low-latency/high-throughput inference engine that serves a diverse, rapidly growing user base. Your research skills will be vital in staying up-to-date with the latest advancements in machine learning systems, ensuring that our AI infrastructure remains at the forefront of innovation.

Requirements

- Strong background in machine learning systems, such as distributed learning and efficient inference for large language models and diffusion models
- Knowledge of ML/AI applications and models, especially foundation models such as large language models and diffusion models, how they are constructed and how they are used
- Knowledge of system performance profiling and optimization tools for ML systems
- Excellent problem-solving and analytical skills
- Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or equivalent practical experience

Responsibilities

- Optimize and fine-tune the existing training and inference platform to achieve better performance and scalability
- Collaborate with cross-functional teams to integrate cutting-edge research ideas into existing software systems
- Develop your own ideas for optimizing the training and inference platforms and push the frontier of machine learning systems research
- Stay up-to-date with the latest advancements in machine learning systems techniques and apply many of them to the Together platform

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.


Apply

Remote

What You’ll Do

- Remotely provision and manage large-scale HPC clusters for AI workloads (up to many thousands of nodes)
- Remotely install and configure operating systems, firmware, software, and networking on HPC clusters, both manually and using automation tools
- Troubleshoot and resolve HPC cluster issues, working closely with physical deployment teams on-site
- Provide context and details to an automation team to further automate the deployment process
- Provide clear and detailed requirements back to the HPC design team on gaps and improvement areas, specifically in the areas of simplification, stability, and operational efficiency
- Contribute to the creation and maintenance of Standard Operating Procedures
- Provide regular and well-communicated updates to project leads throughout each deployment
- Mentor and assist less-experienced team members
- Stay up-to-date on the latest HPC/AI technologies and best practices

You

- Have 10+ years of experience in managing HPC clusters
- Have 10+ years of everyday Linux experience
- Have a strong understanding of HPC architecture (compute, networking, storage)
- Have an innate attention to detail
- Have experience with Bright Cluster Manager or similar cluster management tools
- Are an expert in configuring and troubleshooting:
  - SFP+ fiber, InfiniBand (IB), and 100 GbE network fabrics
  - Ethernet, switching, power infrastructure, GPUDirect, RDMA, NCCL, Horovod environments
  - Linux-based compute nodes, firmware updates, driver installation
  - SLURM, Kubernetes, or other job scheduling systems
- Work well under deadlines and structured project plans
- Have excellent problem-solving and troubleshooting skills
- Have the flexibility to travel to our North American data centers as on-site needs arise or as part of training exercises
- Are able to work both independently and as part of a team

Nice to Have

- Experience with machine learning and deep learning frameworks (PyTorch, TensorFlow) and benchmarking tools (DeepSpeed, MLPerf)
- Experience with containerization technologies (Docker, Kubernetes)
- Experience working with the technologies that underpin our cloud business (GPU acceleration, virtualization, and cloud computing)
- Keen situational awareness in customer situations, employing diplomacy and tact
- Bachelor's degree in EE, CS, Physics, Mathematics, or equivalent work experience

About Lambda

- We offer generous cash & equity compensation
- Investors include Gradient Ventures, Google’s AI-focused venture fund
- We are experiencing extremely high demand for our systems, with quarter-over-quarter, year-over-year profitability
- Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG
- We have a wildly talented team of 200, and growing fast
- Health, dental, and vision coverage for you and your dependents
- Commuter/Work from home stipends
- 401k Plan with 2% company match
- Flexible Paid Time Off Plan that we all actually use

Salary Range Information

Based on market data and other factors, the salary range for this position is $170,000-$230,000. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

A Final Note:

You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer

Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.


Apply

Bay Area, California Only

Cerebras has developed a radically new chip and system to dramatically accelerate deep learning applications. Our system runs training and inference workloads orders of magnitude faster than contemporary machines, fundamentally changing the way ML researchers work and pursue AI innovation.

At Cerebras, we're proud to be among the few companies globally capable of training massive LLMs with over 100 billion parameters. We're active contributors to the open-source community, with millions of downloads of our models on Hugging Face. Our customers include national labs, global corporations across multiple industries, and top-tier healthcare systems. Recently, we announced a multi-year, multi-million-dollar partnership with Mayo Clinic, underscoring our commitment to transforming AI applications across various fields.

The Role

As the Cerebras ML Product Manager, you'll spearhead the transformation of AI across various industries by productizing critical machine learning (ML) use cases. Collaborating closely with Product leadership and ML research teams, you'll identify promising areas within the industry and research community, balancing business value and ML thought leadership. Your role involves translating abstract neural network requirements into actionable tasks for the Engineering team, establishing roadmaps, processes, success criteria, and feedback loops for product improvement. This position requires a blend of deep technical expertise in ML and deep learning concepts, familiarity with modern models, particularly in the Large Language Model (LLM) space, and a solid grasp of mathematical foundations. Ideal candidates will anticipate future trends in deep learning and understand connections across different neural network types and application domains.

Responsibilities

  • Understand deep learning use cases across industries through market analysis, research, and user studies
  • Develop and own the product roadmap for neural network architectures and ML methods on the Cerebras platform
  • Collaborate with end users to define market requirements for AI models
  • Define software requirements and priorities with engineering for ML network support
  • Establish success metrics for application enablement, articulating accuracy and performance expectations
  • Support Marketing, Product Marketing, and Sales by documenting features and defining ML user needs
  • Collaborate across teams to define product go-to-market strategy and expand user community
  • Clearly communicate roadmaps, priorities, experiments, and decisions

Requirements

  • Bachelor’s or Master’s degree in computer science, electrical engineering, physics, mathematics, a related scientific/engineering discipline, or equivalent practical experience
  • 3-10+ years product management experience, working directly with engineering teams, end users (enterprise data scientists/ML researchers), and senior product/business leaders
  • Strong fundamentals in machine learning/deep learning concepts, modern models, and the mathematical foundations behind them; understanding of how to apply deep learning models to relevant real-world applications and use cases
  • Experience working with a data science/ML stack, including TensorFlow and PyTorch
  • An entrepreneurial sense of ownership of overall team and product success, and the ability to make things happen around you. A bias towards getting things done, owning the solution, and driving problems to resolution
  • Outstanding presentation skills with a strong command of verbal and written communication

Preferred

  • Experience developing machine learning applications or building tools for machine learning application developers
  • Prior research publications in the machine learning/deep learning fields demonstrating deep understanding of the space

Apply

d-Matrix has fundamentally changed the physics of memory-compute integration with our digital in-memory compute (DIMC) engine. The “holy grail” of AI compute has been to break through the memory wall to minimize data movement. We’ve achieved this with a first-of-its-kind DIMC engine. Having secured over $154M in funding, including $110M in our Series B offering, d-Matrix is poised to scale generative inference acceleration for Large Language Models with our chiplet and in-memory compute approach. We are on track to deliver our first commercial product in 2024 and to meet the energy and performance demands of these Large Language Models. The company has 100+ employees across Silicon Valley, Sydney, and Bengaluru.

Our pedigree comes from companies like Microsoft, Broadcom, Inphi, Intel, Texas Instruments, Lucent, MIPS, and Wave Computing. Our past successes include building chips for all the global cloud hyperscalers - Amazon, Facebook, Google, Microsoft, Alibaba, Tencent - along with enterprise and mobile operators like China Mobile, Cisco, Nokia, Ciena, Reliance Jio, Verizon, and AT&T. We are recognized leaders in the mixed-signal and DSP connectivity space, now applying our skills to next-generation AI.

ML Compiler Backend Developer https://jobs.ashbyhq.com/d-Matrix/ed7241c7-8fe0-4023-9813-efb93b43180f

Machine Learning Senior Staff https://jobs.ashbyhq.com/d-Matrix/7bd32e05-677e-48ec-98cb-fbfb4c6a14f3

Machine Learning Performance Architect https://jobs.ashbyhq.com/d-Matrix/64ba00d5-55b7-44c6-a564-eba934c07c2b

SQA (Software Quality Engineer) https://jobs.ashbyhq.com/d-Matrix/bc81c7b1-98aa-40a9-99b7-740592585da0

AI / ML System Software Engineer https://jobs.ashbyhq.com/d-Matrix/71b6738b-1b65-4471-8505-6893e4261ae0


Apply

Seattle or Remote

OctoAI is a leading startup in the fast-paced generative AI market. Our mission is to empower businesses to build differentiated applications that delight customers with the latest generative AI features.

Our platform, OctoAI, delivers generative AI infrastructure to run, tune, and scale models that power AI applications. OctoAI makes models work for you by providing developers easy access to efficient AI infrastructure so they can run the models they choose, tune them for their specific use case, and scale from dev to production seamlessly. With the fastest foundation models on the market (including Llama-2, Stable Diffusion, and SDXL), integrated customization solutions, and world-class ML systems under the hood, developers can focus on building apps that wow their customers without becoming AI infrastructure experts.

Our team consists of experts in cloud services, infrastructure, machine learning systems, hardware, and compilers as well as an accomplished go-to-market team with diverse backgrounds. We have secured over $130M in venture capital funding and will continue to grow over the next year. We're based largely in Seattle but have a remote-first culture with people working all over the US and elsewhere in the world.

We dream big but execute with focus and believe in creativity, productivity, and a balanced life. We value diversity in all dimensions and are always looking for talented people to join our team!

Our Automation team specializes in developing the most efficient engine for generative model deployment. We concentrate on enhancements from detailed GPU kernel adjustments to broader system-level optimizations, including continuous batching.
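Since the posting highlights continuous batching, here is a toy Python sketch of the idea: finished sequences are evicted and queued requests admitted at every decoding step, rather than waiting for a whole batch to drain. This is a simplified illustration under assumed names (`Request`, `ContinuousBatcher`, `fake_step`), not OctoAI's engine.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt: str
    max_new_tokens: int
    tokens: list = field(default_factory=list)

    @property
    def done(self):
        return len(self.tokens) >= self.max_new_tokens

def fake_step(batch):
    # Stand-in for a real forward pass: append one dummy token per sequence.
    for req in batch:
        req.tokens.append(0)

# Iteration-level ("continuous") batching: slots are refilled every step
# instead of waiting for the slowest sequence in a static batch.
class ContinuousBatcher:
    def __init__(self, step_fn, max_batch_size):
        self.step_fn = step_fn
        self.max_batch_size = max_batch_size
        self.queue = deque()              # waiting requests
        self.active = []                  # requests currently decoding

    def submit(self, request):
        self.queue.append(request)

    def run(self):
        results = []
        while self.queue or self.active:
            # Admit waiting requests into any free batch slots.
            while self.queue and len(self.active) < self.max_batch_size:
                self.active.append(self.queue.popleft())
            # One decoding step over the whole active batch.
            self.step_fn(self.active)
            # Evict finished sequences immediately, freeing slots mid-flight.
            still_running = []
            for req in self.active:
                (results if req.done else still_running).append(req)
            self.active = still_running
        return results

if __name__ == "__main__":
    batcher = ContinuousBatcher(step_fn=fake_step, max_batch_size=2)
    for i, n in enumerate([3, 1, 2]):
        batcher.submit(Request(prompt=f"req{i}", max_new_tokens=n))
    finished = batcher.run()
    print([len(r.tokens) for r in finished])   # -> [1, 3, 2] in completion order
```

Production engines such as vLLM apply the same iteration-level scheduling to GPU KV-cache slots, which is where most of the throughput gain comes from.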

We are seeking a highly skilled and experienced Machine Learning Systems Engineer with experience in CUDA kernel optimization to join our dynamic team. In this role, you will be responsible for driving significant advancements in GPU performance optimizations and contributing to cutting-edge projects in AI and machine learning.


Apply

San Francisco, CA


Together AI is looking for an ML Engineer who will develop systems and APIs that enable our customers to perform inference and fine-tune LLMs. Relevant experience includes implementing runtime systems that perform inference at scale using AI/ML models, from simple models up to the largest LLMs.

Requirements

- 5+ years of experience writing high-performance, well-tested, production-quality code
- Bachelor’s degree in computer science or equivalent industry experience
- Demonstrated experience in building large-scale, fault-tolerant, distributed systems like storage, search, and computation
- Expert-level programmer in one or more of Python, Go, Rust, or C/C++
- Experience implementing runtime inference services at scale or similar
- Excellent understanding of low-level operating systems concepts, including multi-threading, memory management, networking and storage, performance and scale
- GPU programming, NCCL, CUDA knowledge a plus
- Experience with PyTorch or TensorFlow a plus

Responsibilities

- Design and build the production systems that power the Together Cloud inference and fine-tuning APIs, enabling reliability and performance at scale
- Partner with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world
- Perform architecture and research work for AI workloads
- Analyze and improve efficiency, scalability, and stability of various system resources
- Conduct design and code reviews
- Create services, tools & developer documentation
- Create testing frameworks for robustness and fault-tolerance
- Participate in an on-call rotation to respond to critical incidents as needed

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.


Apply

Please use the link below to review all opportunities at Cerebras Systems. We are actively hiring across our Machine Learning, Software, Hardware, Systems, Manufacturing, and Product organizations.

Why Join Cerebras

People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we’ve reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:

  1. Build a breakthrough AI platform beyond the constraints of the GPU
  2. Publish and open source their cutting-edge AI research
  3. Work on one of the fastest AI supercomputers in the world
  4. Enjoy job stability with startup vitality
  5. Enjoy our simple, non-corporate work culture that respects individual beliefs

Read our blog: Five Reasons to Join Cerebras in 2024.

Apply today and become part of the forefront of groundbreaking advancements in AI.

Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.


Apply

Remote

What You'll Do

- Help scale Lambda’s high-performance cloud network
- Contribute to the reproducible automation of network configuration
- Contribute to the design and development of software-defined networks
- Help manage Spine and Leaf networks
- Ensure high availability of our network through monitoring, failover, and redundancy
- Ensure VMs have predictable networking performance
- Help with deploying and maintaining network monitoring and management tools

You

- Have led the implementation of production-scale networking projects
- Have experience managing BGP
- Have experience with Spine and Leaf (Clos) network topology
- Have experience with multi-data center networks and hybrid cloud networks
- Have experience building and maintaining Software Defined Networks (SDN)
- Are comfortable on the Linux command line, and have an understanding of the Linux networking stack
- Have Python programming experience

Nice To Have

- Experience with OpenStack
- Experience with HPC networking, such as InfiniBand
- Experience automating network configuration within public clouds, with tools like Terraform
- Experience with configuration management tools like Ansible
- Experience building and maintaining multi-data center networks
- Have led implementation of production-scale SDNs in a cloud context (e.g., helped implement the infrastructure that powers an AWS VPC-like feature)
- Deep understanding of the Linux networking stack and its interaction with network virtualization
- Understanding of the SDN ecosystem (e.g., OVS, Neutron, DPDK, Cisco ACI or Nexus Fabric Controller, Arista CVP)
- Experience with Next-Generation Firewalls (NGFW)

About Lambda

- We offer generous cash & equity compensation
- Investors include Gradient Ventures, Google’s AI-focused venture fund
- We are experiencing extremely high demand for our systems, with quarter-over-quarter, year-over-year profitability
- Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG
- We have a wildly talented team of 250, and growing fast
- Health, dental, and vision coverage for you and your dependents
- Commuter/Work from home stipends
- 401k Plan with 2% company match
- Flexible Paid Time Off Plan that we all actually use

Salary Range Information

Based on market data and other factors, the salary range for this position is $180,000 - $230,000. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

A Final Note:

You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer

Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.


Apply

Bengaluru, Karnataka, India

Cerebras has developed a radically new chip and system to dramatically accelerate deep learning applications. Our system runs training and inference workloads orders of magnitude faster than contemporary machines, fundamentally changing the way ML researchers work and pursue AI innovation.

We are innovating at every level of the stack – from chip, to microcode, to power delivery and cooling, to new algorithms and network architectures at the cutting edge of ML research. Our fully-integrated system delivers unprecedented performance because it is built from the ground up for deep learning workloads.

About the role

The AppliedML team is seeking a senior technical leader to spearhead new initiatives on Generative AI solutions. In this role, you will lead a team of research and software engineers to plan, develop, and deliver end-to-end solutions trained on massive supercomputers. These projects may be part of our customer collaborations or open-source initiatives. These solutions will be trained on some of the largest systems, using unique datasets we have developed in partnership with our diverse collaborators. You will plan and design experiments, execute them using Cerebras' unique workflow, and share the findings with internal stakeholders and external partners.

Responsibilities

  • Lead the technical exploration – from framing the problem statement, defining the option space, and approaching the options in a data-driven way to identify the final approach
  • Design experiments to test the different hypotheses, analyze output to distill the learnings, and use them to adjust the project direction
  • Keep up with the state-of-the-art in Generative AI – efficient training recipes, model architecture, alignment, and instruction tuning, among others
  • Influence and mentor a distributed team of engineers
  • Integrate and enhance the latest research in model compression, including sparsity and quantization, to achieve super-linear scaling in model performance and accuracy
  • Achieve breakthrough efficiency by co-designing hardware capabilities, model architecture, and training/deployment recipes

Requirements

  • MS in Computer Science, Statistics, or related fields
  • 2+ years of experience providing technical leadership for a moderate-size team
  • Hands-on experience with training DL models for speech, language, vision, or a combination of them (multi-modal)
  • Experience with being the technical lead of a feature or project from conception through productization
  • Experience operating in a self-directed environment with multiple stakeholders
  • Experience working with other leaders to define strategic roadmaps
  • Proven track record of clearly articulating the findings to a broad audience with varying technical familiarity with the subject matter

Preferred

  • Ph.D. in Computer Science, Statistics, or related fields
  • Publications in top conferences such as NeurIPS, ICML, and CVPR, among others
  • Track record of building impactful features through open source or productization
  • People management experience

Apply

Seattle or Remote


OctoAI is a leading startup in the fast-paced generative AI market. Our mission is to empower businesses to build differentiated applications that delight customers with the latest generative AI features.

Our platform, OctoAI, delivers generative AI infrastructure to run, tune, and scale models that power AI applications. OctoAI makes models work for you by providing developers easy access to efficient AI infrastructure so they can run the models they choose, tune them for their specific use case, and scale from dev to production seamlessly. With the fastest foundation models on the market (including Llama-2, Stable Diffusion, and SDXL), integrated customization solutions, and world-class ML systems under the hood, developers can focus on building apps that wow their customers without becoming AI infrastructure experts.

Our team consists of experts in cloud services, infrastructure, machine learning systems, hardware, and compilers as well as an accomplished go-to-market team with diverse backgrounds. We have secured over $130M in venture capital funding and will continue to grow over the next year. We're based largely in Seattle but have a remote-first culture with people working all over the US and elsewhere in the world.

We dream big but execute with focus and believe in creativity, productivity, and a balanced life. We value diversity in all dimensions and are always looking for talented people to join our team!

Our MLSys Engineering team specializes in developing the most efficient and feature-packed engines for generative model deployment. This includes feature enablement and optimization for popular generative models, such as Mixtral, Llama-2, Stable Diffusion, SDXL, SVD, and SD3, and thus requires a broad understanding of the various system layers, from the serving API down to the hardware level. We do this by building systems that innovate new techniques as well as leveraging and contributing to open source projects including TVM, MLC-LLM, vLLM, CUTLASS, and more.

We are seeking a highly skilled and experienced Machine Learning Systems Engineer to join our dynamic team. In this role, you will be responsible for contributing to the latest techniques and technologies in AI and machine learning.


Apply

Please visit our careers page at the link below.


Apply