DevOps Engineer - Cloud Native AI (Sovereign Cloud)
Job Description
AI Venture Builder fosters high-growth businesses by ideating, developing, and launching AI solutions that address market-specific needs. Specializing in Fintech, Pharmaceuticals, Fashion Tech, and Cybersecurity, the company turns marketable products into spin-off companies.
- Employment Type: Full-Time
Why this role matters
We are building cutting-edge AI solutions within the boundaries of the Italian National Cloud Strategy. You won't just be clicking buttons in AWS; you will be architecting compliant, self-hosted AI platforms on sovereign infrastructure. This is a role for an engineer who understands how to build cloud-native reliability without relying on public hyperscalers.
Role Summary
As a DevOps Engineer for Sovereign AI, you will design and maintain the infrastructure that powers our AI/ML workloads on Italian National Cloud providers (e.g., PSN, Aruba, TIM Enterprise). Your focus will be on self-managed Kubernetes, strict Data Sovereignty, and implementing open-source MLOps toolchains that function independently of US-based public clouds.
Key Responsibilities
- Deploy and manage production-grade Kubernetes clusters on private cloud or national provider infrastructure (using Rancher, OpenShift, or Kubespray).
- Manage underlying virtualization layers (e.g., OpenStack or VMware vSphere) if bare-metal access is required.
- Ensure high availability and disaster recovery within the specific zones/regions of the national provider.
Self-Hosted MLOps
- Architect and maintain a self-hosted MLOps stack using tools like Kubeflow, MLflow, or Polyaxon, since we cannot use managed services (like SageMaker or Vertex AI).
- Configure and optimize MinIO or Ceph for S3-compatible object storage to handle large training datasets locally.
- Manage container registries (Harbor) located strictly within Italian borders.
Compliance & Security (GDPR/AgID)
- Strictly enforce Data Sovereignty principles; ensure no data leaves Italy/EU.
- Manage strict network policies (Calico/Cilium) and air-gapped or proxy-restricted environments.
GPU & Hardware Optimization
- Configure NVIDIA vGPU or PCI passthrough on virtualized national cloud instances.
- Optimize the AI stack (CUDA drivers, Container Toolkit) for maximum performance on constrained infrastructure.
- Apply hands-on experience with serverless GPU usage.
The environment is "Cloud Native" but relies heavily on open-source and self-hosted equivalents of public cloud services.
Domain & Related Technology
- Cloud Environment - Italian National Cloud (PSN, TIM, Aruba, Almaviva)
- Orchestration - Red Hat OpenShift, SUSE Rancher, or Vanilla K8s
- AI/ML Platform - Kubeflow (Crucial), MLflow, JupyterHub
- Observability - Prometheus, Grafana, Loki (PLG Stack)
Qualifications
Required:
- 3+ years of experience in System Engineering, DevOps, or SRE.
- Mastery of Kubernetes: You must know how to deploy and fix K8s when you don't have a "Support" button from Google or Amazon.
- Deep experience with Linux system administration (RHEL, Ubuntu, CentOS).
- Understanding of Data Sovereignty and GDPR regulations in a technical context.
- Proficiency in Python and Bash scripting.
Preferred:
- Experience migrating workloads from AWS/Azure to Private/National Clouds.
- Knowledge of GitOps principles (using ArgoCD or Flux).
- Experience working with Public Sector clients or heavily regulated industries (Finance, Healthcare).
- Italian language fluency (often required for documentation with national providers).
What We Offer
- Competitive salary tailored to the Italian market.
- Opportunity to work on high-impact projects within the National Strategic framework.
- Training budget for Kubernetes (CKA/CKS) and Red Hat certifications.
If you are passionate about AI applied to the real world and want to contribute to concrete and innovative projects, apply now. We look forward to meeting you!