Site Reliability Engineer (SRE) Job at Openkyber, California

  • Openkyber
  • California

Job Description

Overview:

Dataflix is seeking a highly experienced Senior or Lead Platform Engineer/Site Reliability Engineer (SRE)/Hadoop Admin to manage and enhance our petabyte-scale, on-premises data platform. This platform is built using the open-source Hadoop ecosystem. The ideal candidate brings deep technical expertise, a strong understanding of distributed systems, and extensive experience operating and optimizing large-scale data infrastructure.

Responsibilities:
  • Own and operate the end-to-end infrastructure of a large-scale, on-prem Hadoop-based data platform, ensuring high availability and reliability.
  • Design, implement, and maintain core platform components, including Hadoop, Hive, Spark, NiFi, Iceberg, ELK, OpenSearch, and Ambari.
  • Automate infrastructure management, monitoring, and deployments using CI/CD pipelines (GitLab) and scripting.
  • Implement and enforce security controls, access management, and compliance standards.
  • Perform system upgrades, patching, performance tuning, and troubleshooting across platform components.
  • Optimize observability and telemetry using tools like Prometheus, Grafana, and OpenTelemetry for real-time performance monitoring and alerting.
  • Proactively monitor system health, resolve incidents, and conduct root-cause analyses to prevent recurrence.
  • Collaborate with data engineering, analytics, and infrastructure teams to align platform capabilities with evolving needs.

Requirements:
  • 10+ years of experience in Platform Engineering, Site Reliability Engineering, or similar roles, with proven success managing large-scale, distributed Hadoop infrastructure.
  • Deep expertise in the Hadoop ecosystem, including HDFS, YARN, Hive, Spark, NiFi, Ambari, and Iceberg.
  • Strong Linux system administration skills (CentOS/Rocky preferred), including system tuning, performance optimization, and troubleshooting.
  • Proficiency in containerization and orchestration using Docker and Kubernetes.
  • Solid experience with automation and Infrastructure as Code, using tools like GitLab CI/CD and scripting in Python and Bash.
  • Practical knowledge of monitoring and observability tools (e.g., Prometheus, Grafana, OpenTelemetry) and understanding of system health, alerting, and telemetry.
  • Familiarity with networking concepts, security protocols, and data compliance requirements.
  • Experience managing petabyte-scale data platforms and implementing disaster recovery strategies.
  • Understanding of data governance, metadata management, and operational best practices.
