Featherless AI — ML Engineer — AI Architecture Research
About the role
Featherless AI is looking for an ML Engineer to work at the intersection of model architecture and production inference. You will evaluate emerging open-source model architectures, implement and benchmark them on the Featherless platform, and determine which architectural advances translate into real-world serving gains at scale.
What you'll do
- Implement and benchmark emerging LLM architectures (attention variants, SSMs, hybrid models) on production infrastructure
- Evaluate architectural variants for throughput, quality, and serving efficiency
- Work with AI Researchers to translate architecture research into deployable systems
- Develop tooling to automate model integration and evaluation across thousands of models
- Track open-source releases and triage new model families for platform support
Requirements
- Deep understanding of transformer architectures and their alternatives, including SSM and hybrid models (Llama, Mistral, Mamba, Jamba, etc.)
- Experience running LLM benchmarks (MMLU, HumanEval, MT-Bench, etc.)
- Proficiency in Python and PyTorch; experience reading and modifying model code
- Familiarity with model serving frameworks and inference performance profiling
About Featherless AI
Featherless AI is a serverless inference platform hosting 3,000+ open-source LLMs, letting developers call any model via a simple API without managing GPU infrastructure.