Artera

Artera specializes in modernizing and enhancing critical infrastructure for energy utilities and municipalities, providing solutions that ensure the reliable distribution and transmission of natural gas and electric power across America.

Industry: Construction & Engineering
Company size: 51-250 employees

Description

  • Develop the long-term vision and roadmap for Artera’s AI platform to support scaling inference volume and development workloads.
  • Own ML compute infrastructure, including distributed training infrastructure and developer libraries for foundation model development.
  • Build and evolve core libraries used by AI scientists to develop, launch, and monitor AI products.
  • Collaborate with model developers to improve GPU and CPU efficiency and data throughput for large-scale training runs.
  • Optimize storage and serving of terabytes of digital pathology data for large-scale training workflows.
  • Maintain and improve observability infrastructure to identify opportunities to optimize model performance across the platform.
  • Work closely with AI model developers, machine learning engineers, and platform engineers to support production deployment of optimized models.

Requirements

  • 8+ years of industry software engineering experience.
  • 4+ years of experience using ML orchestration frameworks such as Flyte, Ray, Kubeflow, Metaflow, MLflow, Dagster, Argo Workflows, or Prefect.
  • 4+ years of experience using PyTorch, TensorFlow, or JAX in Python.
  • 3+ years of experience building with AWS, Docker, and Kubernetes.
  • 1+ years of experience optimizing large-scale, high-throughput distributed machine learning training pipelines.
  • Experience with Terraform and SQLAlchemy is preferred.
  • Experience with multi-node and multi-GPU training is preferred.
  • Experience deploying and maintaining infrastructure for machine learning training and production inference is preferred.
  • Familiarity with TorchScript, ONNX Runtime, DeepSpeed, AWS Neuron, or similar inference optimization approaches is preferred.
  • Must be currently authorized to work in the United States or Canada without visa sponsorship.

Benefits

  • Base salary of $180,000 to $220,000 per year.
  • Equity is a core component of the compensation package.
  • 401(k) matching.
  • Unlimited paid time off (PTO).
  • Remote role open to candidates authorized to work in the U.S. or Canada.
