Founding MLE (On-device AI)

Greylock

Software Engineering, Data Science

Redwood City, CA, USA

Posted on May 10, 2026

Summary

One of our early-stage AI investments, founded by a successful repeat entrepreneur, is looking to hire a Senior Machine Learning Engineer or Applied Research Scientist focused on efficient on-device and edge-deployed language models.

This will be one of the earliest ML hires and a foundational technical role inside the company. The person will help architect and build core ML systems focused on efficient inference, model optimization, deployment reliability, and production-scale edge AI infrastructure.

The role sits at the intersection of applied ML research, inference systems, and production engineering.

What You’ll Work On

  • Architect and build core ML infrastructure for efficient language model deployment
  • Optimize small language models for constrained compute and memory environments
  • Improve inference latency, throughput, memory footprint, and deployment reliability
  • Develop production-ready pipelines for model evaluation, benchmarking, deployment, and monitoring
  • Translate experimental research code into scalable, maintainable production systems
  • Work closely across research and engineering to productionize new model capabilities
  • Help define long-term technical direction across edge AI and inference systems

Key Qualifications

  • Strong experience deploying ML systems or language models in constrained runtime environments
  • Deep understanding of model optimization techniques including quantization, distillation, pruning, and efficient inference
  • Experience with modern inference runtimes, deployment frameworks, or accelerated ML systems
  • Strong systems intuition around latency, memory efficiency, and real-time inference behavior
  • Strong PyTorch experience and comfort operating across both research and production environments
  • Experience building scalable ML infrastructure and evaluation pipelines
  • Ability to operate independently in ambiguous, zero-to-one startup environments
  • Prior experience leading small, high-impact technical initiatives or teams preferred

Strong Plus Signals

  • Experience working with small language models (SLMs) or edge-deployed LLM systems
  • Background in embedded AI, systems optimization, runtime engineering, or inference infrastructure
  • Familiarity with low-level performance optimization or hardware-aware ML deployment
  • Experience productionizing transformer-based systems in resource-constrained environments

Background

  • Advanced degree in Computer Science, Electrical Engineering, Applied Mathematics, or a related field preferred
  • 4+ years of relevant industry or applied research experience

Please note:

There are no fees associated with any of the support we provide our investments. Greylock Talent provides free candidate referrals/introductions to all of our active investments (one of the many services we provide).

Due to the volume of applicants we typically receive, a follow-up email will not be sent unless a match is identified.