Senior DevOps/MLOps Engineer

Job description

Visionary Tech Services (VTS) is expanding its AI & Data Practice and is in search of a Senior DevOps/MLOps Engineer. This position focuses on creating dependable, secure, and scalable machine learning platforms that enhance data science, generative AI, and analytics solutions for clients throughout the region.

In this pivotal role, you will bridge software engineering, cloud computing, and machine learning operations. You will take charge of the design and implementation of Continuous Integration/Continuous Deployment (CI/CD) for data and machine learning, define standards for model lifecycles, and facilitate swift and secure transitions from development notebooks to production environments.

Job requirements

What you’ll do

  • Own CI/CD for data and ML: reusable pipelines, automated tests, security scans, approvals, and releases.
  • Design cloud infrastructure for ML workloads with IaC, networking, identity/secrets, storage, and observability.
  • Standardise the model lifecycle: experiment tracking, model registry, feature store, lineage, and reproducibility.
  • Deploy models for real-time and batch use cases with safe rollout strategies and autoscaling.
  • Implement monitoring and alerting for data quality, model performance and drift, latency, throughput, and cost.
  • Collaborate with Data Engineering, Data Science, and Platform teams to ship reliable solutions quickly.
  • Lead incident response, blameless postmortems, and continuous improvement.

What you’ll bring

  • 6+ years in DevOps/SRE/Platform and MLOps roles, with production ownership of ML systems.
  • Proficiency in Python and containerisation, strong Git practices, and hands-on experience with CI/CD tooling.
  • Deep experience with at least one major cloud platform (Azure preferred), plus infrastructure-as-code.
  • Hands-on experience with the ML toolchain: MLflow, model registries, feature stores, and model serving on CPUs/GPUs.
  • Familiarity with data platforms: Databricks, Snowflake, Spark, Kafka, Airflow.
  • Strong grounding in security, networking, IAM, secret management, compliance, and cost control.

Nice to have

  • Experience with GenAI/LLM evaluation, prompt pipelines, and model serving.
  • Prior work with canary, blue/green, and shadow deployments, and with multi-region setups.
  • Knowledge of data governance, responsible AI, and auditability.

Tech we use

Python, Git, Docker, Azure (Azure ML, Azure DevOps), AWS, GCP, MLflow, Databricks, Snowflake, Spark, Kafka, Apache Airflow, FastAPI/Flask, REST/GraphQL.