Artificial Intelligence Engineer
Job Details
Job Title: AI Engineer - Performance Optimization Specialist
About the Role
Are you obsessed with pushing the boundaries of AI model performance? Do you thrive on optimizing every aspect of AI systems, from shaving milliseconds off inference times to maximizing GPU utilization and reducing power consumption? ⚡️
We are seeking an AI Engineer to join our team and focus on building cutting-edge solutions that enhance the efficiency, scalability, and accessibility of AI systems. You'll work with state-of-the-art models, including LLMs and multimodal systems, and deploy them across large GPU clusters. This is your chance to make a significant impact by driving innovation in AI performance optimization.
Key Responsibilities
- Optimize AI Systems: Design and implement performance enhancements for large-scale AI models, ensuring minimal latency and maximum throughput.
- Distributed Inference: Develop and fine-tune systems for distributed inference, enabling seamless operation across multi-GPU and multi-node setups.
- Hardware Efficiency: Leverage advanced hardware capabilities, such as GPU acceleration and high-performance networking, to improve system efficiency and reduce energy consumption.
- Model Optimization: Research and apply techniques like quantization, pruning, and sparsity to improve model performance and resource utilization.
- Pipeline Development: Create robust deployment pipelines for AI model serving, monitoring, and continuous optimization in production environments.
- Collaborative Innovation: Work closely with cross-functional teams to drive advancements in AI infrastructure and share insights into best practices for performance engineering.
Qualifications
- Experience deploying and optimizing AI models in multi-GPU and multi-node systems.
- Proficiency in AI runtimes such as PyTorch, TensorRT, ONNX Runtime, or similar frameworks.
- Knowledge of distributed inference serving and cluster scheduling tools such as Ray Serve, Triton Inference Server, or SLURM.
- Familiarity with AI compilers, including OpenXLA, torch.compile, MLIR, or TVM.
- Understanding of high-performance networking and interconnect technologies, such as RDMA, InfiniBand, or NVLink.
- Expertise in model optimization techniques like quantization and sparsity.
- A growth mindset with a passion for AI innovation and efficiency.
- Experience contributing to open-source projects or showcasing work through personal blogs or GitHub repositories.
- Familiarity with experimental hardware setups for AI model serving and optimization.