Head of Compute Engineering

Jobgether · Milan, Lombardy, Italy


Job description

We are looking for a Head of Compute Engineering in Italy.

The Head of Compute Engineering leads the development and operation of high-performance GPU compute infrastructure serving AI researchers, ML teams, and fast-growing startups. The role combines deep technical expertise with engineering leadership, driving architecture decisions, infrastructure reliability, and product innovation. You will manage a team of engineers, shape the roadmap for GPU provisioning and orchestration, and work closely with customers to optimize workload performance. This is a hands‑on leadership role with a focus on scalable systems, observability frameworks, and operational processes.

Accountabilities

  • Lead a team of engineers across GPU compute, platform infrastructure, backend services, and internal tooling, ensuring high standards of quality and reliability.
  • Contribute directly to architecture and code, particularly on GPU provisioning, orchestration, and customer-facing control planes.
  • Own engineering delivery, including planning, prioritization, execution, and alignment with product and company goals.
  • Implement observability, alerting, and on-call processes to maintain SLAs and operational health.
  • Collaborate with early customers to gather requirements and translate them into scalable product solutions.
  • Hire, onboard, and mentor engineers, fostering a culture of technical excellence and ownership.
  • Continuously improve compute infrastructure and operational workflows to support high-performance AI workloads.

Requirements

  • 7+ years of engineering experience, including 1‑2 years in a tech lead or engineering management role within GPU compute environments.
  • Practical experience with CUDA, NVIDIA drivers, MIG/vGPU, or similar GPU compute technologies.
  • Strong knowledge of container orchestration (Kubernetes, Docker) and infrastructure‑as‑code tools (Terraform, Ansible).
  • Expertise in GPU cluster management, bare‑metal provisioning, and inference software.
  • Startup mindset: thrives in fast‑moving, ambiguous environments and takes full ownership of outcomes.
  • Excellent leadership, communication, and mentoring skills to guide engineering teams.
  • Bonus: experience with hyperscalers, neocloud infrastructure, job schedulers (Slurm, LSF, Ray), large‑scale storage (VAST, Weka, DAOS, Ceph, AWS S3, Lustre, GPFS), networking (InfiniBand/RoCE), and GPU inference/fine‑tuning workloads (vLLM, LanceDB).

Benefits

  • Remote‑first work environment within CET‑aligned time zones.
  • Opportunity to lead GPU compute engineering in a high‑growth AI infrastructure space.
  • Hands‑on role with deep technical involvement, not just managerial responsibilities.
  • Attractive equity for an early‑stage, high‑growth company.
  • Direct collaboration with founders, influencing product and technical strategy.
  • Flexible work arrangements promoting work‑life balance.

