LLM & Generative AI Engineer

Reply · Milan, Lombardy, Italy


Job description

Are you an AI Engineer with expertise in Large Language Models and Generative AI?

We are Reply, and we are looking for you!


WHO WE ARE

Reply is a company that specialises in Consulting, Systems Integration and Digital Services with a focus on the conception, design and implementation of solutions based on the new communication channels and digital media. Reply partners with key industrial groups in defining and developing business models made possible by the new technological and communication paradigms such as Artificial Intelligence, Big Data, Cloud Computing, Digital Communication, the Internet of Things and Mobile and Social Networking.


WHAT WILL YOU DO?

  • Core activities. You will design, build, and industrialize enterprise-grade Generative AI solutions based on Large Language Models, with a strong focus on the Mistral ecosystem. You will fine-tune and optimize LLMs using techniques such as LoRA and QLoRA, ensuring efficient performance in terms of latency, throughput, and memory usage. You will develop end-to-end GenAI pipelines, from experimentation to production deployment across cloud and on-prem environments. You will also contribute to monitoring, scaling, and continuously improving model performance in real-world enterprise use cases.
  • Technologies. You will work primarily with Python and Large Language Models, leveraging the Mistral AI ecosystem. You will use fine-tuning and optimization techniques such as LoRA, QLoRA, and quantization. You will work with cloud platforms including AWS, Azure, and GCP, and apply MLOps practices for deploying and maintaining production systems. You may also use tools and frameworks for distributed systems, containerization, and API development, including Java and Spring Boot in some contexts.
  • Teamwork. You will join a cross-functional team of AI engineers, software engineers, and cloud specialists working on cutting-edge Generative AI solutions for enterprise clients. You will collaborate closely with architects, data scientists, and business stakeholders to translate requirements into scalable AI systems. You will work in an environment that values experimentation, engineering excellence, and rapid iteration, with a strong focus on delivering production-ready and secure AI solutions.


WE'LL TOTALLY LOVE YOU IF YOU HAVE…

  • Academic background. Bachelor's or Master's Degree in Informatics, Computer Engineering, Telecommunications Engineering, Electronic Engineering, Automation Engineering, or Robotics Engineering.
  • Valuable expertise. You have at least 2 years of experience in backend development and system integration, with exposure to scalable digital solutions. You have experience with Python and Large Language Models, going beyond simple usage into real implementation and production scenarios. You are familiar with fine-tuning techniques such as LoRA and QLoRA, as well as model optimization strategies including quantization, VRAM reduction, and latency/throughput tuning. You have experience building end-to-end GenAI pipelines and deploying ML/LLM solutions in production environments. You are comfortable working with cloud platforms such as AWS, Azure, or GCP and understand distributed and cloud-native architectures. Knowledge of Java and Spring Boot is appreciated.
  • Nice to have. Experience with the Mistral AI ecosystem or other open-source LLM stacks. Exposure to MLOps tools and practices (e.g. model monitoring, CI/CD for ML, orchestration frameworks). Familiarity with vector databases, retrieval-augmented generation (RAG), and agent-based systems. Understanding of GPU optimization and inference acceleration frameworks. Previous experience in enterprise or consulting environments will be considered a plus.
  • Soft skills. You have strong problem-solving skills and a pragmatic mindset when dealing with complex AI systems. You are able to communicate technical concepts clearly to both technical and non-technical stakeholders. You demonstrate ownership and accountability in delivering production-ready solutions. You are proactive, curious, and comfortable working in fast-evolving environments where experimentation and iteration are key.


YOU WILL LOVE WORKING WITH US BECAUSE…

  • We have a start-up heart. Hundreds of small units with their own projects and teams. Guaranteed hands-on experience, flexibility, table football and free coffee.
  • But we dream worldwide. We have the structure to make your ideas matter. We partner with major groups across 4 continents and 15 countries.
  • We are customer-obsessed. Excellence is in our DNA. We strive for the best. We get our hands dirty. We get results.
  • We are learning. We at Reply are always aiming for true innovation, even when it may still look unreal.


WHAT ARE THE NEXT STEPS?

The first step of our recruiting process will be meetings with our technical contacts, followed by a face-to-face interview with the HR team. We care about a fair and equal recruiting process.


Interested?


Reply is committed to embracing diversity and creating an inclusive work environment by valuing the uniqueness of people regardless of age, gender, sexual orientation, religion, nationality, or disabilities as protected by Italian Law (L.68/99).

Furthermore, Reply is committed to ensuring a fair and accessible selection process: to help you during the recruitment process, please let us know of any kind of support you may need.
