



MLSys 2024 Career Website

The MLSys 2024 conference is not accepting applications to post at this time.

Here we highlight career opportunities submitted by our Exhibitors, and other top industry, academic, and non-profit leaders. We would like to thank each of our exhibitors for supporting MLSys 2024. Opportunities can be sorted by job category, location, and filtered by any other field using the search box. For information on how to post an opportunity, please visit the help page, linked in the navigation bar above.


San Francisco, CA


As a Systems Research Engineer specialized in GPU Programming, you will play a crucial role in developing and optimizing GPU-accelerated kernels and algorithms for ML/AI applications. Working closely with the modeling and algorithm team, you will co-design GPU kernels and model architecture to enhance the performance and efficiency of our AI systems. Collaborating with the hardware and software teams, you will contribute to the co-design of efficient GPU architectures and programming models, leveraging your expertise in GPU programming and parallel computing. Your research skills will be vital in staying up-to-date with the latest advancements in GPU programming techniques, ensuring that our AI infrastructure remains at the forefront of innovation.
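
By way of illustration, kernel work of this kind is often prototyped in Triton, one of the frameworks named in the requirements below. The following is a generic vector-add sketch (standard Triton tutorial material), not a Together kernel:

    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
        # Each program instance handles one BLOCK_SIZE-wide slice of the tensors.
        pid = tl.program_id(axis=0)
        offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
        mask = offsets < n_elements  # guard the tail block
        x = tl.load(x_ptr + offsets, mask=mask)
        y = tl.load(y_ptr + offsets, mask=mask)
        tl.store(out_ptr + offsets, x + y, mask=mask)

    def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        out = torch.empty_like(x)
        n = out.numel()
        grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
        add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
        return out

In practice the role's work starts where this sketch ends: choosing block sizes, memory layouts, and fusion strategies that match the model architecture.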

Requirements

  • Strong background in GPU programming and parallel computing, such as CUDA and/or Triton
  • Knowledge of ML/AI applications and models
  • Knowledge of performance profiling and optimization tools for GPU programming
  • Excellent problem-solving and analytical skills
  • Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or equivalent practical experience

Responsibilities

  • Optimize and fine-tune GPU code to achieve better performance and scalability
  • Collaborate with cross-functional teams to integrate GPU-accelerated solutions into existing software systems
  • Stay up-to-date with the latest advancements in GPU programming techniques and technologies

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.


Apply

Remote

What You’ll Do

- Build out beautiful and easy-to-use interfaces to deliver an industry-leading ML and AI cloud
- Bring the best models, tooling, workflows, etc. from the AI space to our platform
- Own features end-to-end, from design to deployment to monitoring

You

- Have frontend web app development experience – minimum of 6 years building product-grade "responsive" frontend software using:
  - TypeScript
  - React (or equivalent strong experience in Vue or Svelte)
  - HTML and modern CSS
  - Vite
- Have backend web app development experience – minimum of 8 years of experience implementing business-critical services, from initial conception to successful launch, using:
  - Python, Unix/command line
  - Django or FastAPI (a minimal FastAPI sketch follows this list)
  - A relational database like PostgreSQL or MySQL
- Have CI/CD experience – automation around testing and deployment to create a smooth developer experience
- Have reliability & observability experience – building highly available systems and SRE work including observability, alerting, and logging
- Have experience with Cloud Native Services – strong understanding of public cloud features like Cloudflare, Okta, AWS, etc.
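
For a flavor of the backend stack listed above, here is a minimal FastAPI service. The route and fields are purely illustrative assumptions, not part of Lambda's actual product:

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class Instance(BaseModel):
        name: str
        gpu_type: str

    # Hypothetical route for illustration only; not a real Lambda API.
    @app.post("/instances")
    def create_instance(instance: Instance) -> dict:
        # A real service would call into provisioning logic here.
        return {"status": "created", "name": instance.name}

Run locally with: uvicorn main:app --reload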

Nice-to-haves

- Experience in Kubernetes
- IaC (Terraform, Atlantis, Crossplane, etc.)
- Worked with event-based or serverless technologies (AWS Lambda, Kinesis, etc.)
- Experience in machine learning, AI, or data science fields
- Held a leadership role, with the ability to lead and mentor junior team members
- Knowledge of application security and web hardening
- Strong engineering background – EECS preferred; Mathematics, Software Engineering, Physics

About Lambda

- We offer generous cash & equity compensation
- Investors include Gradient Ventures, Google's AI-focused venture fund
- We are experiencing extremely high demand for our systems, with quarter-over-quarter, year-over-year profitability
- Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG
- We have a wildly talented team of 250, and growing fast
- Health, dental, and vision coverage for you and your dependents
- Commuter/work-from-home stipends
- 401k plan with 2% company match
- Flexible Paid Time Off plan that we all actually use

Salary Range Information

Based on market data and other factors, the salary range for this position is $169,000-$243,000. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

A Final Note:

You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer

Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.


Apply

San Francisco, CA


Together AI is looking for an AI Engineer who will develop systems and APIs that enable our customers to perform inference and fine-tune LLMs, and integrate those APIs into third-party AI toolchains such as Langchain. Relevant experience includes building developer tools used and loved by developers around the world.
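
As a hedged sketch of the kind of developer-facing API this role builds, the snippet below calls a generic OpenAI-style completions endpoint over HTTP. The base URL, model name, and response shape are assumptions for illustration, not a description of Together's actual API:

    import os
    import requests

    # Assumed endpoint and model name, for illustration only.
    BASE_URL = "https://api.example.com/v1"

    def complete(prompt: str, model: str = "example-llm-7b") -> str:
        resp = requests.post(
            f"{BASE_URL}/completions",
            headers={"Authorization": f"Bearer {os.environ['API_KEY']}"},
            json={"model": model, "prompt": prompt, "max_tokens": 64},
            timeout=30,
        )
        resp.raise_for_status()  # surface HTTP errors to the caller
        return resp.json()["choices"][0]["text"]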

Requirements

  • 5+ years of experience writing large-scale AI developer tools or similar
  • Bachelor's degree in computer science or equivalent industry experience
  • Expert-level programmer in one or more of Python, Go, Rust, or C/C++
  • Experience integrating with AI inference and fine-tuning APIs or similar
  • GPU programming, NCCL, and CUDA knowledge a plus
  • Experience with PyTorch or TensorFlow a plus

Responsibilities

  • Design and build the production systems that power the Together Cloud inference and fine-tuning APIs, enabling reliability and performance at scale
  • Integrate Together Cloud inference and fine-tuning APIs with third-party AI toolchains such as Langchain
  • Partner with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world
  • Perform architecture and research work for AI workloads
  • Analyze and improve efficiency, scalability, and stability of various system resources
  • Conduct design and code reviews
  • Create services, tools & developer documentation
  • Create testing frameworks for robustness and fault-tolerance
  • Participate in an on-call rotation to respond to critical incidents as needed

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.


Apply

Our mission at Capital One is to create trustworthy, reliable, and human-in-the-loop AI systems, changing banking for good. For years, Capital One has been leading the industry in using machine learning to create real-time, intelligent, automated customer experiences. From informing customers about unusual charges to answering their questions in real time, our applications of AI & ML are bringing humanity and simplicity to banking. Because of our investments in public cloud infrastructure and machine learning platforms, we are now uniquely positioned to harness the power of AI. We are committed to building world-class applied science and engineering teams and to continuing our industry-leading capabilities with breakthrough product experiences and scalable, high-performance AI infrastructure. At Capital One, you will help bring the transformative power of emerging AI capabilities to reimagine how we serve our customers and businesses who have come to love the products and services we build.

We are looking for an experienced Lead Generative AI Engineer to help build and maintain APIs and SDKs to train, fine-tune and access AI models at scale. You will work as part of our Enterprise AI team and build systems that will enable our users to work with Large-Language Models (LLMs) and Foundation Models (FMs), using our public cloud infrastructure. You will work with a team of world-class AI engineers and researchers to design and implement key API products and services that enable real-time customer-facing applications.


Apply

San Francisco, CA


As a Systems Research Engineer specialized in Machine Learning Systems, you will play a crucial role in researching and building the next generation AI platform at Together. Working closely with the modeling, algorithm, and engineering teams, you will design large-scale distributed training systems and a low-latency/high-throughput inference engine that serves a diverse, rapidly growing user base. Your research skills will be vital in staying up-to-date with the latest advancements in machine learning systems, ensuring that our AI infrastructure remains at the forefront of innovation.
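
As one concrete flavor of the distributed-training side of this work, here is a minimal PyTorch DistributedDataParallel setup. This is a generic sketch assuming launch via torchrun, not a description of Together's stack:

    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def setup_ddp(model: torch.nn.Module) -> DDP:
        # torchrun sets RANK, WORLD_SIZE, and LOCAL_RANK in the environment.
        dist.init_process_group(backend="nccl")
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)
        # DDP all-reduces gradients across ranks during backward().
        return DDP(model.cuda(local_rank), device_ids=[local_rank])

Large-scale systems work then layers sharding, pipeline stages, and communication/computation overlap on top of this baseline.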

Requirements

  • Strong background in machine learning systems, such as distributed learning and efficient inference for large language models and diffusion models
  • Knowledge of ML/AI applications and models, especially foundation models such as large language models and diffusion models: how they are constructed and how they are used
  • Knowledge of system performance profiling and optimization tools for ML systems
  • Excellent problem-solving and analytical skills
  • Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or equivalent practical experience

Responsibilities

  • Optimize and fine-tune the existing training and inference platform to achieve better performance and scalability
  • Collaborate with cross-functional teams to integrate cutting-edge research ideas into existing software systems
  • Develop your own ideas for optimizing the training and inference platforms, and push the frontier of machine learning systems research
  • Stay up-to-date with the latest advancements in machine learning systems techniques and apply many of them to the Together platform

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.


Apply

San Francisco, CA


As an AI Researcher, you will build next-generation open models, both large language models and computer vision models such as diffusion models, using the computation and software infrastructure at Together. You will work closely with the data engineering team to uncover the recipe for building open models that push the frontier, and with the algorithm and engineering teams to make your models widely available to everyone. You will also interact with customers to help them in their journey of training, using, and improving their AI applications built on open models. Your research skills will be vital in staying up-to-date with the latest advancements in NLP and Computer Vision, ensuring that we stay at the cutting edge of open model innovation.

Requirements

  • Strong background in Natural Language Processing or Computer Vision
  • Experience building state-of-the-art models at large scale
  • Passion for contributing to the open model ecosystem and pushing the frontier of open models
  • Excellent problem-solving and analytical skills
  • Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or equivalent practical experience

Responsibilities

  • Take advantage of Together's computational infrastructure to create the best open models in their class
  • Understand and improve the full lifecycle of building open models; release and publish your insights (blogs, academic papers, etc.)
  • Collaborate with cross-functional teams to deploy your model and make it available to a wider community and customer base
  • Stay up-to-date with the latest advancements in NLP and Computer Vision

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.


Apply

Please use this link to explore open positions across all departments at Lambda.

We encourage you to share this link with anyone you know who is also searching!


Apply

Seattle or Remote

OctoAI is a leading startup in the fast-paced generative AI market. Our mission is to empower businesses to build differentiated applications that delight customers with the latest generative AI features.

Our platform, OctoAI, delivers generative AI infrastructure to run, tune, and scale models that power AI applications. OctoAI makes models work for you by providing developers easy access to efficient AI infrastructure so they can run the models they choose, tune them for their specific use case, and scale from dev to production seamlessly. With the fastest foundation models on the market (including Llama-2, Stable Diffusion, and SDXL), integrated customization solutions, and world-class ML systems under the hood, developers can focus on building apps that wow their customers without becoming AI infrastructure experts.

Our team consists of experts in cloud services, infrastructure, machine learning systems, hardware, and compilers as well as an accomplished go-to-market team with diverse backgrounds. We have secured over $130M in venture capital funding and will continue to grow over the next year. We're based largely in Seattle but have a remote-first culture with people working all over the US and elsewhere in the world.

We dream big but execute with focus and believe in creativity, productivity, and a balanced life. We value diversity in all dimensions and are always looking for talented people to join our team!

Our Automation team specializes in developing the most efficient engine for generative model deployment. We concentrate on enhancements from detailed GPU kernel adjustments to broader system-level optimizations, including continuous batching.
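
Continuous batching, named above, is easy to sketch: instead of waiting for an entire batch of requests to finish generating, the scheduler admits new requests into the running batch at every decode step. The toy model below (with a hypothetical Request type and a one-token-per-step decode) shows the core idea, not OctoAI's implementation:

    from collections import deque
    from dataclasses import dataclass

    @dataclass
    class Request:
        tokens_left: int  # toy stand-in for remaining tokens to generate

        @property
        def done(self) -> bool:
            return self.tokens_left <= 0

    def decode_step(batch):
        # Toy decode step: every live request emits one token.
        for r in batch:
            r.tokens_left -= 1

    def continuous_batching(requests, max_batch=8):
        pending = deque(requests)
        running = []
        while pending or running:
            # Admit new requests as soon as batch slots free up,
            # instead of waiting for the whole batch to drain.
            while pending and len(running) < max_batch:
                running.append(pending.popleft())
            decode_step(running)
            running = [r for r in running if not r.done]

The payoff is utilization: short requests retire early and their slots are immediately refilled, so the GPU rarely runs a partially empty batch.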

We are seeking a highly skilled and experienced Machine Learning Systems Engineer with experience in CUDA Kernel optimization to join our dynamic team. In this role, you will be responsible for driving significant advancements in GPU performance optimizations and contributing to cutting-edge projects in AI and machine learning.


Apply

Our mission at Capital One is to create trustworthy, reliable, and human-in-the-loop AI systems, changing banking for good. For years, Capital One has been leading the industry in using machine learning to create real-time, intelligent, automated customer experiences. From informing customers about unusual charges to answering their questions in real time, our applications of AI & ML are bringing humanity and simplicity to banking. Because of our investments in public cloud infrastructure and machine learning platforms, we are now uniquely positioned to harness the power of AI. We are committed to building world-class applied science and engineering teams and to continuing our industry-leading capabilities with breakthrough product experiences and scalable, high-performance AI infrastructure. At Capital One, you will help bring the transformative power of emerging AI capabilities to reimagine how we serve our customers and businesses who have come to love the products and services we build.

We are looking for an experienced Director, AI Platforms to help us build the foundations of our enterprise AI capabilities. In this role you will work on developing generic platform services to support applications powered by Generative AI. You will develop SDKs and APIs for building agents and information retrieval, and for offering models as a service to power generative AI workflows such as optimizing LLMs via RAG.

Additionally, you will manage end-to-end coordination with operations, oversee the creation of high-quality curated datasets and the productionization of models, and work with applied research and product teams to identify and prioritize ongoing and upcoming services.
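
To make the RAG workflow mentioned above concrete, here is a toy retrieval step. It is a generic sketch, not Capital One's implementation; the embedding vectors are assumed to be computed elsewhere:

    import numpy as np

    def retrieve(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int = 3) -> np.ndarray:
        # Cosine-similarity retrieval: the "R" in RAG.
        sims = doc_vecs @ query_vec / (
            np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
        )
        return np.argsort(-sims)[:k]

    def build_prompt(question: str, docs: list, top_ids: np.ndarray) -> str:
        # Ground the LLM's answer in the retrieved passages.
        context = "\n".join(docs[i] for i in top_ids)
        return f"Answer using only the context below.\n\n{context}\n\nQ: {question}\nA:"

A production system replaces the brute-force similarity scan with a vector index and adds guardrails, but the retrieve-then-prompt shape is the same.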


Apply

Location: Palo Alto, CA - Hybrid


The era of pervasive AI has arrived. In this era, organizations will use generative AI to unlock hidden value in their data, accelerate processes, reduce costs, drive efficiency and innovation to fundamentally transform their businesses and operations at scale.

SambaNova Suite™ is the first full-stack, generative AI platform, from chip to model, optimized for enterprise and government organizations. Powered by the intelligent SN40L chip, the SambaNova Suite is a fully integrated platform, delivered on-premises or in the cloud, combined with state-of-the-art open-source models that can be easily and securely fine-tuned using customer data for greater accuracy. Once adapted with customer data, customers retain model ownership in perpetuity, so they can turn generative AI into one of their most valuable assets.

Job Summary

We are looking for a world-class engineering leader to guide a team of talented Machine Learning engineers and researchers driving the development and innovation of our vision technology. You must thrive in a fast-paced environment, where you'll work closely with cross-functional teams to optimize performance and drive velocity. Leveraging cutting-edge techniques, you will play a vital role in our overall success in deploying state-of-the-art AI capabilities around the globe.

Responsibilities

  • Lead and mentor a high-performing team of machine learning engineers in a fast-paced environment, providing technical guidance, mentorship, and support to drive their professional growth and development.
  • Oversee the rapid development and implementation of machine learning models, leveraging advanced algorithms and techniques to optimize performance.
  • Collaborate closely with cross-functional teams, including product managers, software engineers, and data engineers, to deliver data-driven insights and recommendations that enhance our solutions in an agile environment.
  • Stay at the forefront of industry trends, emerging technologies, and best practices in machine learning, vision, and MLOps. Apply this knowledge to drive innovation, meet tight deadlines, and maintain a competitive edge.
  • Establish and maintain strong relationships with stakeholders, providing clear communication of technical concepts and findings to both technical and non-technical audiences.

Skills & Qualifications

  • Master's or PhD in a quantitative field such as Data Science, Computer Science, Statistics, or a related discipline.
  • 10+ years of experience in Machine Learning, with a focus on vision.
  • 5+ years of proven success in technical leadership, delivering impactful projects across the organization.
  • Strong expertise in machine learning algorithms and data analysis techniques.
  • Proficiency in Python, with hands-on experience using machine learning libraries and frameworks such as PyTorch, TensorFlow, or JAX.
  • Strong communication and collaboration skills, with the ability to effectively convey technical concepts to both technical and non-technical stakeholders in a fast-paced context.
  • Experience and familiarity with production ML environments, including model release, evaluation, and monitoring.

Preferred Qualifications

  • Track record of published ML papers and/or blogs.
  • Track record of engagement with the open-source ML community.
  • Experience with vision applications in AI for Science, Oil and Gas, or medical imaging.
  • Experience with vision and multi-modal foundation models such as Stable Diffusion, ViT, and CLIP.
  • Experience with performance optimization of ML models.
  • 2+ years of experience in a startup environment.


Apply

US & Canada only

Cerebras is on a mission to accelerate the pace of progress in Generative AI by building AI supercomputers that deliver unprecedented performance for LLM training! Cerebras is leveraging these supercomputers to turbocharge the exploration of end-to-end solutions that address real-world challenges, such as breaking down language barriers, enhancing developer productivity, and advancing medical research breakthroughs. The AppliedML team at Cerebras is a team of Generative AI practitioners and experts who leverage Cerebras AI supercomputers to push the technical frontiers of the domain and work with our partners to build compelling solutions. Some of this team's publicly announced successes include BTLM, the Jais 30B multilingual model, and an Arabic chatbot, among others.

About the role

As an applied machine learning engineer, you will work on adapting state-of-the-art deep learning (DL) models to run on our wafer-scale system. This includes both functional validation and performance tuning of a variety of core models for applications like Natural Language Processing (NLP), Large Language Models (LLMs), Computer Vision (CV), and Graph Neural Networks (GNNs).

As a member of the Cerebras engineering team, you will implement models in popular DL frameworks like PyTorch, using insights into our hardware architecture to unlock the full potential of our chip. You will work on all aspects of the DL model pipeline, including:

  • Dataloader implementation and performance optimization (a minimal sketch follows this list)
  • Reference model implementation and functional validation
  • Model convergence and hyper-parameter tuning
  • Model customization to meet customer needs
  • Model architecture pathfinding
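
As a hedged illustration of the first pipeline item above, a minimal PyTorch dataloader tuned for throughput might look like the following. The knobs shown are standard PyTorch options rather than anything Cerebras-specific, and the toy token data is made up:

    import torch
    from torch.utils.data import DataLoader, Dataset

    class TokenDataset(Dataset):
        def __init__(self, sequences):
            self.sequences = sequences  # assumed: list of token-id lists

        def __len__(self):
            return len(self.sequences)

        def __getitem__(self, idx):
            return torch.tensor(self.sequences[idx], dtype=torch.long)

    sequences = [[1, 2, 3, 4], [5, 6, 7, 8]]  # toy data for illustration
    loader = DataLoader(
        TokenDataset(sequences),
        batch_size=2,
        num_workers=4,      # parallel workers keep the accelerator fed
        pin_memory=True,    # page-locked host memory speeds host-to-device copies
        prefetch_factor=2,  # batches each worker prepares in advance
    )

Performance work here usually means profiling whether the input pipeline, not the accelerator, is the bottleneck, then adjusting exactly these knobs.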

This role will allow you to work closely with partner companies at the forefront of their fields across many industries. You will get to see how deep learning is being applied to some of the world’s most difficult problems today and help ML researchers in these fields to innovate more rapidly and in ways that are not currently possible on other hardware systems.

Responsibilities

  • Analyze, implement, and optimize DL models for the WSE
  • Validate functional correctness and convergence of models on the WSE
  • Work with engineering teams to optimize models for the Cerebras stack
  • Support engineering teams in functional and performance scoping new models and layers
  • Work with customers to optimize their models for the Cerebras stack
  • Develop new approaches for solving real-world AI problems in various domains

Requirements

  • Master's degree or PhD in engineering, science, or a related field, with 5+ years of experience
  • Experience programming in a modern language like Python or C++
  • In-depth understanding of DL methods and model architectures
  • Experience with DL frameworks like PyTorch, TensorFlow, and JAX
  • Familiarity with state-of-the-art transformer architectures for language and vision models
  • Experience in model training and hyper-parameter tuning techniques
  • Familiarity with different LLM downstream tasks and datasets

Preferred Skills

  • A deep passion for cutting edge artificial intelligence techniques
  • Understanding of hardware architecture
  • Experience programming accelerators like GPUs and FPGAs

Apply

Location: Palo Alto, CA - Hybrid


The era of pervasive AI has arrived. In this era, organizations will use generative AI to unlock hidden value in their data, accelerate processes, reduce costs, drive efficiency and innovation to fundamentally transform their businesses and operations at scale.

SambaNova Suite™ is the first full-stack, generative AI platform, from chip to model, optimized for enterprise and government organizations. Powered by the intelligent SN40L chip, the SambaNova Suite is a fully integrated platform, delivered on-premises or in the cloud, combined with state-of-the-art open-source models that can be easily and securely fine-tuned using customer data for greater accuracy. Once adapted with customer data, customers retain model ownership in perpetuity, so they can turn generative AI into one of their most valuable assets.

Working at SambaNova

This role presents a unique opportunity to shape the future of AI and the value it can unlock across every aspect of an organization's business and operations, from strategic product pathfinding to large-scale production. We are excited to bring talented people on board to push toward democratizing modern LLM capabilities in real-world use cases.

Responsibilities

SambaNova is hiring a Principal Engineer for the Foundation LLM team.

  • Design and implement large-scale data pipelines that feed billions of high-quality tokens into LLMs (a toy packing sketch follows this list).
  • Continuously improve SambaNova’s LLM by exploring new ideas, including but not limited to new modeling techniques, prompt engineering, instruction tuning, and alignment.
  • Curate and crawl the necessary dataset to induce domain specificity.
  • Collaborate with product management and executive teams to develop a roadmap for continuous improvement of LLM and incorporate new capabilities.
  • Work closely with the product team and our customers to translate product requirements into requisite LLM capabilities.
  • Expand LLM capabilities into new languages and domains.
  • Develop applications on top of LLMs including but not limited to semantic search, summarization, conversational agents, etc.
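
To ground the data-pipeline responsibility above, here is a toy sketch of one common step: packing tokenized documents into fixed-length training sequences. The sequence length and end-of-sequence id are illustrative assumptions, and this is not SambaNova's pipeline:

    def pack_sequences(token_streams, seq_len=2048, eos_id=0):
        """Concatenate tokenized documents, separated by an end-of-sequence
        token, and emit fixed-length sequences for LLM training."""
        buf = []
        for tokens in token_streams:
            buf.extend(tokens)
            buf.append(eos_id)
            while len(buf) >= seq_len:
                yield buf[:seq_len]
                buf = buf[seq_len:]

    # Usage: packed = list(pack_sequences(tokenized_docs, seq_len=8))

Packing avoids wasting compute on padding tokens, which matters when the pipeline is feeding billions of tokens per training run.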

Basic Qualifications

  • Bachelor's or Master's degree in engineering or science fields
  • 5-10 years of hands-on engineering experience in machine learning

Additional Required Qualifications

  • Experience with one or more deep learning frameworks like TensorFlow, PyTorch, Caffe2, or Theano
  • A deep theoretical or empirical understanding of deep learning
  • Experience building and deploying machine learning models
  • Strong analytical and debugging skills
  • Experience with at least one of: Large Language Models, multilingual models, semantic search, summarization, data pipelines, domain adaptation (finance, legal, or bio-medical), or conversational agents
  • Experience leading small teams
  • Experience in Python and/or C++

Preferred Qualifications

  • Experience working in a high-growth startup
  • A team player who demonstrates humility
  • Action-oriented with a focus on speed & results
  • Ability to thrive in a no-boundaries culture & make an impact on innovation


Apply

Bay Area, California Only

Cerebras has developed a radically new chip and system to dramatically accelerate deep learning applications. Our system runs training and inference workloads orders of magnitude faster than contemporary machines, fundamentally changing the way ML researchers work and pursue AI innovation.

At Cerebras, we're proud to be among the few companies globally capable of training massive LLMs with over 100 billion parameters. We're active contributors to the open-source community, with millions of downloads of our models on Hugging Face. Our customers include national labs, global corporations across multiple industries, and top-tier healthcare systems. Recently, we announced a multi-year, multi-million-dollar partnership with Mayo Clinic, underscoring our commitment to transforming AI applications across various fields.

The Role

As the Cerebras ML Product Manager, you'll spearhead the transformation of AI across various industries by productizing critical machine learning (ML) use cases. Collaborating closely with Product leadership and ML research teams, you'll identify promising areas within the industry and research community, balancing business value and ML thought leadership. Your role involves translating abstract neural network requirements into actionable tasks for the Engineering team, establishing roadmaps, processes, success criteria, and feedback loops for product improvement. This position requires a blend of deep technical expertise in ML and deep learning concepts, familiarity with modern models, particularly in the Large Language Model (LLM) space, and a solid grasp of mathematical foundations. Ideal candidates will anticipate future trends in deep learning and understand connections across different neural network types and application domains.

Responsibilities

  • Understand deep learning use cases across industries through market analysis, research, and user studies
  • Develop and own the product roadmap for neural network architectures and ML methods on Cerebras platform
  • Collaborate with end users to define market requirements for AI models
  • Define software requirements and priorities with engineering for ML network support
  • Establish success metrics for application enablement, articulating accuracy and performance expectations
  • Support Marketing, Product Marketing, and Sales by documenting features and defining ML user needs
  • Collaborate across teams to define product go-to-market strategy and expand user community
  • Clearly communicate roadmaps, priorities, experiments, and decisions

Requirements

  • Bachelor’s or Master’s degree in computer science, electrical engineering, physics, mathematics, a related scientific/engineering discipline, or equivalent practical experience
  • 3-10+ years product management experience, working directly with engineering teams, end users (enterprise data scientists/ML researchers), and senior product/business leaders
  • Strong fundamentals in machine learning/deep learning concepts, modern models, and the mathematical foundations behind them; understanding of how to apply deep learning models to relevant real-world applications and use cases
  • Experience working with a data science/ML stack, including TensorFlow and PyTorch
  • An entrepreneurial sense of ownership of overall team and product success, and the ability to make things happen around you. A bias towards getting things done, owning the solution, and driving problems to resolution
  • Outstanding presentation skills with a strong command of verbal and written communication

Preferred

  • Experience developing machine learning applications or building tools for machine learning application developers
  • Prior research publications in the machine learning/deep learning fields demonstrating deep understanding of the space

Apply

d-Matrix has fundamentally changed the physics of memory-compute integration with our digital in-memory compute (DIMC) engine. The "holy grail" of AI compute has been to break through the memory wall to minimize data movement, and we've achieved this with a first-of-its-kind DIMC engine. Having secured over $154M, including $110M in our Series B offering, d-Matrix is poised to advance Large Language Models and scale generative inference acceleration with our chiplet and in-memory compute approach. We are on track to deliver our first commercial product in 2024, and we are poised to meet the energy and performance demands of these Large Language Models. The company has 100+ employees across Silicon Valley, Sydney, and Bengaluru.

Our pedigree comes from companies like Microsoft, Broadcom, Inphi, Intel, Texas Instruments, Lucent, MIPS, and Wave Computing. Our past successes include building chips for all the global cloud hyperscalers - Amazon, Facebook, Google, Microsoft, Alibaba, and Tencent - along with enterprise and mobile operators like China Mobile, Cisco, Nokia, Ciena, Reliance Jio, Verizon, and AT&T. We are recognized leaders in the mixed-signal and DSP connectivity space, now applying our skills to next-generation AI.

Location:

Hybrid, working onsite at our Santa Clara, CA headquarters 3 days per week.

The role: Software Engineer, Staff - Kernels

What you will do:

The role requires you to be part of the team that helps productize the SW stack for our AI compute engine. As part of the Software team, you will be responsible for the development, enhancement, and maintenance of software kernels for next-generation AI hardware. You have experience building software kernels for hardware architectures, a very strong understanding of various hardware architectures, and a firm grasp of how to map algorithms to the architecture. You understand how to map computational graphs generated by AI frameworks to the underlying architecture. You have past experience working across all aspects of the full-stack toolchain and understand the nuances of optimizing and trading off various aspects of hardware-software co-design. You are able to build and scale software deliverables in a tight development window. You will work with a team of compiler experts to build out the compiler infrastructure, working closely with other software (ML, Systems) and hardware (mixed signal, DSP, CPU) experts in the company.

What you will bring:

Minimum:

MS or PhD in Computer Engineering, Math, Physics or related degree with 5+ years of industry experience.

Strong grasp of computer architecture, data structures, system software, and machine learning fundamentals. 

Proficient in C/C++ and Python development in a Linux environment using standard development tools.

Experience implementing algorithms in high level languages such as C/C++, Python. 

Experience implementing algorithms for specialized hardware such as FPGAs, DSPs, GPUs, and AI accelerators, using frameworks such as CUDA.

Experience implementing operators commonly used in ML workloads - GEMMs, convolutions, BLAS, and SIMD operators for operations like softmax, layer normalization, and pooling (a toy softmax sketch follows these minimum requirements).

Experience with development for embedded SIMD vector processors such as Tensilica. 

Self-motivated team player with a strong sense of ownership and leadership. 
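
As a hedged illustration of the operator work named in the requirements above, here is a reference-semantics softmax in NumPy. A production kernel would instead be written against the target hardware's SIMD/vector ISA; this sketch only shows the numerics:

    import numpy as np

    def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
        # Subtracting the row max avoids overflow in exp() without
        # changing the result, since softmax is shift-invariant.
        x_max = np.max(x, axis=axis, keepdims=True)
        e = np.exp(x - x_max)
        return e / np.sum(e, axis=axis, keepdims=True)

Mapping this to an accelerator means tiling the reduction across vector lanes and keeping the max/sum passes in local memory, which is exactly the hardware-software co-design described in the role.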

Preferred:

Prior startup, small team or incubation experience. 

Experience with ML frameworks such as TensorFlow and/or PyTorch.

Experience working with ML compilers and algorithms, such as MLIR, LLVM, TVM, Glow, etc.

Experience with a deep learning framework (such as PyTorch, Tensorflow) and ML models for CV, NLP, or Recommendation. 

Work experience at a cloud provider or AI compute / sub-system company.


Apply

Remote

What You’ll Do

- Remotely provision and manage large-scale HPC clusters for AI workloads (up to many thousands of nodes)
- Remotely install and configure operating systems, firmware, software, and networking on HPC clusters, both manually and using automation tools
- Troubleshoot and resolve HPC cluster issues, working closely with physical deployment teams on-site
- Provide context and details to an automation team to further automate the deployment process
- Provide clear and detailed requirements back to the HPC design team on gaps and improvement areas, specifically in the areas of simplification, stability, and operational efficiency
- Contribute to the creation and maintenance of Standard Operating Procedures
- Provide regular and well-communicated updates to project leads throughout each deployment
- Mentor and assist less-experienced team members
- Stay up-to-date on the latest HPC/AI technologies and best practices

You

- Have 10+ years of experience in managing HPC clusters
- Have 10+ years of everyday Linux experience
- Have a strong understanding of HPC architecture (compute, networking, storage)
- Have an innate attention to detail
- Have experience with Bright Cluster Manager or similar cluster management tools
- Are an expert in configuring and troubleshooting:
  - SFP+ fiber, InfiniBand (IB), and 100 GbE network fabrics
  - Ethernet, switching, power infrastructure, GPUDirect, RDMA, NCCL, and Horovod environments
  - Linux-based compute nodes, firmware updates, and driver installation
  - SLURM, Kubernetes, or other job scheduling systems
- Work well under deadlines and structured project plans
- Have excellent problem-solving and troubleshooting skills
- Have the flexibility to travel to our North American data centers as on-site needs arise or as part of training exercises
- Are able to work both independently and as part of a team

Nice to Have

- Experience with machine learning and deep learning frameworks (PyTorch, TensorFlow) and benchmarking tools (DeepSpeed, MLPerf)
- Experience with containerization technologies (Docker, Kubernetes)
- Experience working with the technologies that underpin our cloud business (GPU acceleration, virtualization, and cloud computing)
- Keen situational awareness in customer situations, employing diplomacy and tact
- Bachelor's degree in EE, CS, Physics, Mathematics, or equivalent work experience

About Lambda

- We offer generous cash & equity compensation
- Investors include Gradient Ventures, Google's AI-focused venture fund
- We are experiencing extremely high demand for our systems, with quarter-over-quarter, year-over-year profitability
- Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG
- We have a wildly talented team of 200, and growing fast
- Health, dental, and vision coverage for you and your dependents
- Commuter/work-from-home stipends
- 401k plan with 2% company match
- Flexible Paid Time Off plan that we all actually use

Salary Range Information

Based on market data and other factors, the salary range for this position is $170,000-$230,000. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

A Final Note:

You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer

Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.


Apply