Sr. Site Reliability Engineer, Security (Remote - US)

CentralReach, LLC · Work from home, Veneto, Italy · 50€ - 70€


Job Description

CentralReach is a leading provider of autism and IDD care software for Applied Behavior Analysis (ABA), multidisciplinary therapy, and special education. Trusted by more than 200,000 users, we enable therapy providers, educators, and employers to scale the way they deliver ABA and related therapies with innovative technology, market‑leading industry expertise, and world‑class customer satisfaction.

The Engineering Operations group at CentralReach builds the underlying technologies that power our Public and Private Cloud Platforms worldwide. The group is responsible for storage, data infrastructure, IT, observability systems, DevOps, SRE, provisioning, compute, orchestration platform, internal tools, internal platforms (laptops, networks, systems, etc.) and services – all the components that make up the CentralReach Platform.

If you have a passion for the future, enjoy and thrive in an agile, fast‑moving, ever‑changing startup environment, and are ready to take on technical challenges of all shapes and sizes, then read on!

As a Sr. SRE, you will work closely with the key stakeholders in Software Engineering to drive adoption of modern reliability practices like SLOs, error budget policies, actionable alerts, incident retrospectives, chaos testing, and end‑to‑end ownership.

Key Accountabilities

  • Take responsibility for availability, latency, performance, efficiency, monitoring/observability, emergency response, and capacity planning; set and maintain SLOs, SLIs, and error budgets; and create dashboards.
  • Analyze, troubleshoot, and resolve operational challenges contributing to defined SLOs.
  • Manage site stability, performance, reliability, and maintain uptime for production environments.
  • Develop a fully automated multi‑environment observability stack based on the existing system and extend it to predict capacity needs based on usage patterns.
  • Strive for automation to reduce toil and increase development velocity.
  • Perform application‑specific production support, incident management, change management, problem management, RCAs, and service restoration as needed.
  • Identify changes for the product architecture from the reliability, performance, and availability perspective with a data‑driven approach.
  • Document resolution runbooks and standard operating procedures.
  • Actively look for opportunities to improve the availability and performance of the system by applying the learnings from monitoring and observation.
  • Collaborate with software development teams on release management, help shape the future roadmap, and establish strong operational readiness across teams.
  • Implement reliability and observability tools (such as New Relic, Prometheus, and Grafana).
  • Collaborate with the Security team and other platform engineering teams to build reliable, maintainable, and scalable solutions that improve our security posture.

Desired Skills and Experience

  • Strong background as an SRE supporting a 24x7 highly available production environment for a SaaS or cloud service provider.
  • Solid experience with Monitoring/APM/Observability tools (Splunk, New Relic, etc.).
  • Experience implementing observability plans around logs, metrics, and traces.
  • Experience in an agile development team developing software.
  • Experience with cloud infrastructure environments, preferably AWS, and Infrastructure as Code (Terraform, CloudFormation).
  • Extensive experience with Docker, Kubernetes, Helm, CI/CD, and configuration management tools such as Ansible and Chef.
  • Strong experience with containerization technology and/or Kubernetes.
  • Experience with release automation, system administration, and configuration management.
  • Experience with programming languages (Java, Python, Go, etc.).
  • Strong understanding of Linux, Windows, software development, systems, networking, and cloud concepts.
  • Strong interpersonal and teaming skills – ability to set and enforce process and influence engineers who are not direct reports.
  • Strong analytical and programming skills (Python, Go, Java, etc.).
  • Deep understanding of best practices for modern cloud security.
  • Proven experience building observability for security concerns, such as privilege escalations and bot detection.

Base Salary Range

$160,000 – $180,000 USD

Backed by Roper Technologies, Inc. (Nasdaq: ROP), and led by award‑winning CEO Chris Sullens, CentralReach is entering an exciting phase of growth, innovation, and scale.

Recognized more than 10 times as one of the best places to work by organizations such as Inc., Built In, and NJBIZ, we have built a culture centered on impact, inclusion, and flexibility. As a hybrid company with collaborative offices in Ft. Lauderdale, FL; Holmdel, NJ; and Verona, Italy, we foster a workplace where top talent can thrive and make a real difference in the lives of those we serve.

Benefits

  • Competitive compensation
  • Comprehensive health benefits
  • Generous PTO
  • 401(k) matching
  • Paid parental leave
  • Hybrid work schedules
  • Career development support
  • Wellness programs
  • Community engagement initiative (CR Cares™)

As set forth in CentralReach’s Equal Employment Opportunity policy, we do not discriminate on the basis of any protected group status under any applicable law.
