Job Description : Our client is looking for an Agentic AI Engineer in Toronto, ON.
Must Have Primary Skills :
Our client is seeking a hands-on Agentic AI Platform Engineer with deep experience in Python, Generative AI frameworks, and containerized deployments on Kubernetes (OCP, Azure, AWS). This is a 100% coding role focused on designing, developing, and deploying production-grade AI platform components, from LLM orchestration to secure, scalable multi-agent systems.
- Expert-level Python developer — strong track record of building frameworks, SDKs, or orchestration systems.
- Hands-on experience coding and deploying GenAI / LLM-powered applications using LangChain, Semantic Kernel, or custom agent frameworks.
- Deep expertise in containerization and Kubernetes: Proficient in Docker, Helm, and Kubernetes manifests (Deployments, Services, ConfigMaps, Secrets). Experienced with OpenShift (OCP), Azure AKS, and/or AWS EKS for production-grade deployments.
- Familiar with Kubernetes networking, security (RBAC, NetworkPolicies), and monitoring.
- Strong understanding of CI/CD pipelines and automation tools such as GitHub Actions, ArgoCD, or Jenkins.
- Familiarity with observability stacks (Prometheus, Grafana, Loki, OpenTelemetry).
- Hands-on experience with microservices design, API development, and event-driven orchestration.
- Solid understanding of LLM system design, context management, and retrieval-augmented generation (RAG) architectures.
- Comfortable working across hybrid and multi-cloud environments with secure service connectivity.
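To make the RAG requirement above concrete, here is a minimal, self-contained sketch of the retrieval-augmented generation pattern the role calls for. All names are illustrative, and the bag-of-words similarity is a stand-in for a real embedding model; a production system would use a vector store and an actual LLM call.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' standing in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble the augmented prompt an LLM would receive."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Helm packages Kubernetes manifests into versioned charts.",
    "RBAC policies restrict which service accounts can read Secrets.",
    "LangChain composes LLM calls, tools, and memory into chains.",
]
print(build_prompt("How does Helm relate to Kubernetes manifests?", docs))
```

The retrieve-then-augment shape is the core of any RAG architecture; context management in a real deployment additionally handles token budgets and chunking.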
Responsibilities :
- Design, code, and deploy Python-based microservices and frameworks that enable orchestration of LLM-driven agents.
- Build and maintain containerized AI workloads using Docker and Kubernetes (OpenShift, EKS, AKS).
- Develop APIs, SDKs, and Python libraries that power GenAI and agentic workloads across the client's business lines.
- Implement end-to-end orchestration for agent workflows, integrating frameworks such as LangChain, Semantic Kernel, or Haystack.
- Integrate and operationalize MCP-Context-Forge for context management, orchestration, and inter-agent communication.
- Embed observability, monitoring, and governance into all platform services (Prometheus, Grafana, OpenTelemetry).
- Ensure secure and compliant AI operations through Kubernetes-native policies, RBAC, and network isolation.
- Collaborate closely with data scientists, AI researchers, and DevOps teams to productionize models and agent workflows.
- Prototype, benchmark, and deploy LLM pipelines on multi-cloud environments (OCP, Azure, AWS).
- Continuously enhance developer experience by contributing to internal Python SDKs, deployment automation, and CI/CD pipelines.
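The agent-workflow orchestration described above can be sketched as a simple sequential hand-off between agents sharing a context object. This is a minimal illustration, not the client's actual framework: each "agent" here is a plain callable, where a real implementation would wrap LLM calls, tools, and MCP-based context exchange.

```python
from typing import Callable

# An "agent" is any callable that takes and returns a shared context dict.
Agent = Callable[[dict], dict]

def retriever_agent(ctx: dict) -> dict:
    """Stub retrieval step: attach evidence for the task."""
    ctx["evidence"] = f"docs matching '{ctx['task']}'"
    return ctx

def writer_agent(ctx: dict) -> dict:
    """Stub generation step: draft output from the gathered evidence."""
    ctx["draft"] = f"Report on {ctx['task']} using {ctx['evidence']}"
    return ctx

def reviewer_agent(ctx: dict) -> dict:
    """Stub review step: approve the draft if it meets a simple check."""
    ctx["approved"] = "Report" in ctx["draft"]
    return ctx

def orchestrate(task: str, pipeline: list[Agent]) -> dict:
    """Run the shared context through each agent in order, like a chain."""
    ctx: dict = {"task": task}
    for agent in pipeline:
        ctx = agent(ctx)
    return ctx

result = orchestrate("Q3 risk summary", [retriever_agent, writer_agent, reviewer_agent])
print(result["approved"])
```

Production variants of this loop add branching, retries, and inter-agent messaging, but the core pattern, ordered agents mutating a shared context, is the same.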
Proven Experience In :
- Experience developing Python SDKs, internal APIs, or developer tools for enterprise platforms.
- Familiarity with model-serving frameworks (KServe, Ray Serve, BentoML) and distributed AI orchestration.
- Knowledge of service mesh architectures (Istio, Linkerd) and policy enforcement in Kubernetes.
- Experience integrating MCP-Context-Forge or similar orchestration technologies.
- Background in financial services, particularly in secure AI deployment or regulated environments.
Please send your resume to
[email protected]