Principal AI Platform Engineer (CA)

1 month ago
Full-time
Lead
DevOps and Infrastructure
PointClickCare

PointClickCare provides a leading cloud-based healthcare software platform that enables long-term and post-acute care providers to effectively manage the complete lifecycle of resident care while enhancing operational efficiency and improving resident ...

Health Care Providers & Services
1K-5K
Founded 2000
$232M raised

Description

  • Design, build, and maintain the core infrastructure layer supporting GenAI products, including model gateways, prompt/versioning stores, vector databases, and LLM evaluation tools.
  • Implement secure access controls and authentication mechanisms integrated by default into AI platform components.
  • Develop and manage observability, monitoring, and logging solutions for GenAI workloads and infrastructure.
  • Collaborate with product and engineering teams to integrate GenAI infrastructure with agent frameworks and downstream applications.
  • Optimize infrastructure for scalability, high availability, and cost efficiency for production workloads.
  • Enable seamless delivery of AI-generated insights into agent workflows by connecting AI systems with existing products.
  • Support and operate GenAI platform components across the organization and work with horizontal partners to ensure safe, scalable delivery.

Requirements

  • Extensive experience building and maintaining AI platform infrastructure, including work with Kubernetes and container security.
  • Demonstrated expertise with observability and monitoring frameworks focused on real-time performance (e.g., OpenTelemetry, MLflow).
  • Experience with AI infrastructure components such as vector databases, prompt/versioning stores, and AI IDEs.
  • Familiarity with vLLM, SGLang, or similar frameworks to host LLM inference workloads (preferred).
  • Experience with CI/CD pipelines and automation for AI model deployment and platform operations.
  • Strong knowledge of authentication and authorization frameworks integrated into AI platforms.
  • Experience optimizing infrastructure for scalability, high availability, and cost efficiency in production environments.
  • Experience collaborating with product and engineering teams to identify, build, and support generative AI solutions.

Benefits

  • Base salary range: CAD $169,000–$188,000 per year.
  • Bonus in addition to base salary.
  • Company benefits package (unspecified).
  • Remote work option (Canada) or Mississauga, ON location.
  • Position is not overtime eligible.
  • Total rewards package with individual compensation determined by job-related skills, experience, and work location; recruiter can provide more details.
