Senior Software Engineer I - CloudOps
Aurigo Software Technologies
About Aurigo
Aurigo is an American technology company founded in 2003 with a mission to help public sector agencies and facility owners plan, deliver and maintain their capital projects and assets safely and efficiently. Our award-winning Aurigo Masterworks Cloud software is now the industry-leading solution for both public and private agencies funding and maintaining large capital, infrastructure, and facilities investments. We are a privately held U.S. corporation proudly headquartered in Austin, Texas, with software development and support centers in Canada and India.
If you are ready to work for a fast-paced software company growing at over 100% YOY and interact with some of the brightest minds in the industry to solve real problems, we want to talk to you.
Description:
The SRE team at Aurigo provides hosting, infrastructure lifecycle management, security, and scaling support for Aurigo’s flagship AI/ML-powered products on AWS Cloud Infrastructure. Within SRE, the CloudOps function focuses on ensuring the reliability, scalability, and performance of applications, infrastructure, and data pipelines in production.
To deliver cutting-edge AI-driven capabilities to our products, we need to design, implement, and manage highly available, resilient, and scalable cloud infrastructures tailored for ML workloads. Aurigo is looking for a dynamic CloudOps Engineer with expertise in both ML model lifecycle management and cloud infrastructure administration. The engineer will be responsible for ensuring a highly available and reliable ML environment, handling tasks such as model deployment, pipeline automation, system administration, and release engineering. Given the computational and data-intensive nature of ML workloads, maintaining optimal performance and reliability is critical. We are committed to achieving 99.99% uptime for our platforms and services.
Requirements:
- 3-6 years of experience on CloudOps or DevOps teams with a focus on MLOps.
- Hands-on experience with AWS AI/ML services such as SageMaker, Bedrock, Lambda, and S3.
- Strong expertise in CI/CD automation for ML using tools like GitLab CI/CD, Jenkins, Azure DevOps, or AWS CodePipeline.
- Experience with workflow orchestration using Kubeflow, Apache Airflow, or Step Functions.
- Proficiency in containerization technologies (Docker, Kubernetes, AWS EKS).
- Strong scripting and automation skills using Python and Bash.
- Knowledge of model monitoring, logging, and lineage tracking for governance and compliance.
- Understanding of the ML model lifecycle, feature stores, and data engineering workflows.