AI Engineer (SLM - Python) - Futureproof Labs
Description
About Futureproof Labs
Futureproof Labs is an AI-native product studio building human‑first AI products designed for real‑world speed and impact. You’ll be joining a team focused on transforming raw AI capability into polished, commercially deployable products.
Role Summary
We are seeking a Mid-to-Senior AI Engineer specializing in Small Language Models (SLMs) to design, finetune, and optimize lightweight models for fast, efficient inference in production environments. This is a focused ML engineering role; your primary responsibility will be improving model performance, efficiency, deployment, and reliability across Futureproof Labs' AI products.
We are looking to add the absolute best engineering talent to bolster our team as we build AI products for the US and Canadian markets. You'll get to work with experienced and established founders in North America as you build cutting‑edge solutions that can have a global impact.
Key Responsibilities
Design, train, finetune, and evaluate small/efficient language models for targeted tasks.
Optimize models for speed, memory efficiency, and cost‑effective inference on local or cloud infrastructure.
Implement techniques like quantization, distillation, pruning, and specialized training pipelines.
Build robust evaluation pipelines: benchmarks, accuracy tests, latency profiling, and regression checks.
Work closely with product teams to align model behavior with real‑world use cases.
Integrate SLMs with existing backend systems and APIs.
Ensure versioning, monitoring, safety checks, and performance tracking of deployed models.
Keep up with emerging SLM research and propose improvements.
Required Skills & Experience
3–5+ years of hands‑on ML/AI engineering experience.
Strong experience training or fine‑tuning small or efficient models.
Solid understanding of optimization techniques: quantization, pruning, distillation, low‑rank finetuning (LoRA/QLoRA).
Strong Python skills and familiarity with PyTorch or JAX.
Experience deploying models in production with attention to speed, cost, and reliability.
Ability to build clean, maintainable ML pipelines (training, inference, monitoring).
Nice to Have
Experience with retrieval‑augmented pipelines (RAG) optimized for small models.
Familiarity with on‑device ML or edge deployment.
Knowledge of evaluation frameworks, safety testing, and alignment techniques.
Understanding of GPU/accelerator performance tuning.
Posted: 18th December 2025, 10:19 am
Application Deadline: N/A