Lead Data Engineer (Scraping Operations)
Job Description
Job Title: Lead Data Engineer
Location: Hybrid
Company: PREFE
Who We Are:
PREFE is an AI-driven tech startup active in the Fast-Moving Consumer Goods (FMCG) industry. We are a dynamic team passionate about leveraging data to create solutions that improve people’s lives, and we are expanding our tech department. If you are eager to grow in a fast-paced environment and work on impactful projects, we’d love to meet you!
Position Overview:
We are seeking a highly motivated Lead Data Engineer to serve as Head of Web Scraping and take over the leadership of our large-scale scraping operations. In this role, you will oversee, optimize, and expand our scraping infrastructure while ensuring high data quality and reliability, directly contributing to the company’s success.
What you’ll be doing:
- Lead and manage daily scraping operations, running 7,000+ retail store spiders using the Scrapy framework.
- Orchestrate workflows with Apache Airflow and monitor production pipelines.
- Manage and optimize infrastructure running on Docker within VM environments.
- Ensure data quality and code quality by reviewing and quality-checking the work of team members.
- Use PostgreSQL databases to query, validate, and analyze scraped data.
- Contribute hands‑on by building and maintaining spiders using Scrapy.
- Own and improve documentation and best practices to standardize the way of working.
- Collaborate with cross-functional teams to ensure scraped data meets business needs.
- Use GitHub for version control and code collaboration.
What we’re looking for:
- Proven experience with web scraping at scale (Scrapy required).
- Strong knowledge of Apache Airflow for orchestration.
- Hands‑on experience with Docker and VM‑based environments.
- Solid knowledge of SQL and ability to work with PostgreSQL for querying and data validation.
- Strong code review and QA skills with an eye for data accuracy and performance.
- Experience managing freelancers or small teams is a plus.
- Excellent documentation, organization, and communication skills.
- Proficiency with Git/GitHub workflows.
What will make you immediately stand out:
- Experience managing proxy infrastructure (Scrapoxy or similar).
- Familiarity with Selenium and Playwright for dynamic content extraction.
- Knowledge of Italian language (B2).
What We Offer:
- Flexible working hours and remote work opportunities.
- A young, dynamic, collaborative, and inclusive work environment.
- Fast professional growth opportunities.
- The chance to work on innovative projects that make a real difference in people’s lives.
How to Apply:
If you’re excited about this opportunity and meet the requirements, please send your CV, a brief cover letter, and links to relevant projects or GitHub repositories to
PREFE is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.