Founding MLE (On-device AI)

Greylock

Software Engineering, Data Science

Redwood City, CA, USA

Posted on May 10, 2026

Summary

One of our early-stage AI investments, founded by a successful repeat entrepreneur, is looking to hire a Senior Machine Learning Engineer or Applied Research Scientist focused on efficient on-device and edge-deployed language models.

This will be one of the earliest ML hires and a foundational technical role inside the company. The person will help architect and build core ML systems focused on efficient inference, model optimization, deployment reliability, and production-scale edge AI infrastructure.

The role sits at the intersection of applied ML research, inference systems, and production engineering.

What You’ll Work On

Architect and build core ML infrastructure for efficient language model deployment
Optimize small language models for constrained compute and memory environments
Improve inference latency, throughput, memory footprint, and deployment reliability
Develop production-ready pipelines for model evaluation, benchmarking, deployment, and monitoring
Translate experimental research code into scalable, maintainable production systems
Work closely across research and engineering to productionize new model capabilities
Help define long-term technical direction across edge AI and inference systems

Key Qualifications

Strong experience deploying ML systems or language models in constrained runtime environments
Deep understanding of model optimization techniques including quantization, distillation, pruning, and efficient inference
Experience with modern inference runtimes, deployment frameworks, or accelerated ML systems
Strong systems intuition around latency, memory efficiency, and real-time inference behavior
Strong PyTorch experience and comfort operating across both research and production environments
Experience building scalable ML infrastructure and evaluation pipelines
Ability to operate independently in ambiguous, zero-to-one startup environments
Prior experience leading small, high-impact technical initiatives or teams preferred

Strong Plus Signals

Experience working with small language models (SLMs) or edge-deployed LLM systems
Background in embedded AI, systems optimization, runtime engineering, or inference infrastructure
Familiarity with low-level performance optimization or hardware-aware ML deployment
Experience productionizing transformer-based systems in resource-constrained environments

Background

Advanced degree in Computer Science, Electrical Engineering, Applied Mathematics, or a related field preferred
4+ years of relevant industry or applied research experience

Please note:

There are no fees associated with any of the support we provide our investments. Greylock Talent provides free candidate referrals/introductions to all of our active investments (one of the many services we provide).

Due to the volume of applicants we typically receive, a follow-up email will not be sent unless a match is identified.
