Site Reliability Engineer - Operations (PST Timezone)
Skyflow
Location
Remote - India
Employment Type
Full time
Location Type
Remote
Department
Engineering, Site Reliability
Skyflow is a data privacy vault company built to radically simplify how companies isolate, protect, and govern their customers’ most sensitive data. With its global network of data privacy vaults, Skyflow is also a comprehensive solution for companies around the world looking to meet complex data localization requirements. Skyflow currently supports a diverse customer base that spans verticals like fintech, retail, travel, and healthtech.
Skyflow is headquartered in Palo Alto, California and was founded in 2019. For more information, visit www.skyflow.com or follow Skyflow on X and LinkedIn.
About the role:
As a Site Reliability Engineer, you will drive the effort to identify, design, and develop the best technical and field solutions to automate our production systems. You will collaborate frequently with internal and external business and engineering teams. You will also have an opportunity to champion and instill a culture of Site Reliability Engineering at Skyflow.
We know great Site Reliability Engineers come from diverse technical backgrounds, so no single individual may have all the desired skills on day one. But if you are the kind of software engineer who would have loved to engineer solutions for the Stripe or Twilio APIs, the Slack or Zendesk app, or the Snowflake or MongoDB platform, we want to talk to you.
You have:
2+ years in infrastructure automation and SRE-driven software delivery.
Strong experience with at least one major cloud platform (AWS preferred; Azure or GCP acceptable).
Programming experience in Go (preferred) or Python.
Hands-on experience with:
Infrastructure as Code: Terraform, CloudFormation
CI/CD tools: Jenkins
Configuration management: Ansible
Containers & orchestration: Docker, Kubernetes
Linux systems engineering and scripting for automation
RDBMS and production-grade infrastructure
Deep understanding of site reliability practices, observability patterns, and operational excellence.
Experience working with large-scale, distributed infrastructures.
Ability to collaborate with distributed global teams, providing technical guidance and leadership.
You will:
Provide operational support aligned with US time zones, ensuring system reliability and availability.
Design, build, and maintain highly available and scalable cloud infrastructure using AWS and modern SRE practices.
Develop and maintain CI/CD pipelines for automated testing, building, and deployment of applications.
Automate infrastructure provisioning, configuration, and deployment using Terraform, CloudFormation, Helm, and Ansible.
Work extensively with Docker and Kubernetes for container orchestration and service management.
Implement and maintain observability solutions including monitoring, logging, alerting, and tracing.
Evaluate and improve reliability, performance, scalability, and security of production systems.
Support migration and modernization initiatives, choosing the right approach based on prior experience.
Collaborate with cross-functional teams and clients to deliver robust, cloud-based solutions and best-in-class customer experiences.
Act as a thought leader within the SRE team, contributing to processes, standards, and the overall SRE culture.
Benefits:
Work-from-home expenses
Excellent Health Insurance Options
Very generous PTO
Flexible Hours
Generous Equity
At Skyflow, we believe that diverse teams are the strongest teams. We invite applicants of all genders, races, ethnicities, nationalities, ages, religions, sexual orientations, disability statuses, educational experiences, family situations, and socio-economic backgrounds.