Hiring: Lead AI Engineer (LLM Infrastructure & Orchestration)
Trivita AI is hiring a Lead AI Engineer to lead architecture, infrastructure optimization, and large-scale LLM deployment.

Job description
- Define the technical architecture roadmap for AI systems, ensuring scalability, cost efficiency, and performance
- Design and implement LLM orchestration systems, including prompt chaining, streaming responses, and long-term memory
- Build and optimize infrastructure for LLM deployment, including frameworks such as vLLM
- Manage and partition GPU resources with NVIDIA MIG (Multi-Instance GPU) to maximize throughput across model instances
- Implement inference optimization techniques such as PagedAttention to improve KV-cache memory efficiency, throughput, and latency
- Lead and mentor AI and backend engineering teams through code reviews and technical guidance
- Collaborate with Product and MLOps teams to ensure high-quality product releases
Technical requirements
- Strong understanding of Transformer architectures, LLM fine-tuning, and inference optimization
- Proficiency in Python and Java/Kotlin
- Experience building APIs for AI systems
- Experience with Docker and Kubernetes
- Hands-on experience with GPU infrastructure (NVIDIA, CUDA, MIG)
- Experience with vector databases and optimizing relational databases (PostgreSQL/MySQL)
Soft skills
- Strong leadership and team development capabilities
- Proactive problem-solving in complex distributed systems
Nice to have
- Experience building chatbot platforms or AI automation systems
- Contributions to open-source projects
- Knowledge of web performance optimization or responsive design
Benefits
- Work with experienced AI engineers and experts
- Company-provided Mac for work
- Full social insurance based on gross salary
- Modern working environment
- Access to facilities such as swimming pool, gym, and table tennis
Contact information
- Address: No. 01, Street 104, Quarter 3, Binh Trung Ward, Ho Chi Minh City
- Phone: 0909797699
- Email: hr@trivita.ai