Site Reliability Engineer (USA Timezone)
Skyflow
Skyflow is a data privacy vault company built to radically simplify how companies isolate, protect, and govern their customers’ most sensitive data. With its global network of data privacy vaults, Skyflow is also a comprehensive solution for companies around the world looking to meet complex data localization requirements. Skyflow currently supports a diverse customer base that spans verticals like fintech, retail, travel, and healthtech.
Skyflow is headquartered in Palo Alto, California and was founded in 2019. For more information, visit www.skyflow.com or follow on X and LinkedIn.
About the role:
As a Site Reliability Engineer, you will be responsible for driving the effort to identify, design, and develop the best technical and field solutions to automate our production systems. This position will collaborate often with various internal and external business and engineering teams. You will also have an opportunity to eventually lead efforts to champion and instill a culture of Site Reliability Engineering at Skyflow.
We know great Site Reliability Engineers come from diverse backgrounds, so no single individual may have all the desired skills on day one. But if you are the kind of software engineer who would have loved to engineer solutions for the Stripe or Twilio APIs, the Slack or Zendesk app, or the Snowflake or MongoDB platform, we want to talk to you.
Desired Qualifications:
3+ years of overall hands-on experience, including 2+ years in infrastructure automation and software delivery using SRE practices
Familiarity with cloud platforms (e.g., AWS, Azure, GCP).
Coding experience with Go (preferred) or Python.
Experience with SRE tools such as CloudFormation/Terraform, Jenkins, and Ansible
Hands-on experience with Linux Systems Engineering, Docker and Kubernetes container orchestration, RDBMS, and scripting for automation
Ability to work with distributed teams to provide technical guidance and leadership
Solid understanding of the common challenges with migrations and modernizations, and the ability to choose the right path based on previous experience
Expertise with application observability patterns and site reliability practices
Extensive experience working with large distributed infrastructures
Responsibilities:
Provide support during USA time zone hours, i.e., either 8 PM to 2 AM IST or 4 AM to 10 AM IST
Use programming languages like Python and Go, container tools including Docker and Kubernetes, configuration management (CM) tools including Terraform and Helm, and a variety of AWS tools and services on a daily basis
Develop and maintain CI/CD pipelines to enable automated testing, building, and deployment of applications.
Collaborate with cross-functional teams and clients to deliver robust cloud-based solutions that drive best-in-class experiences for Skyflow customers
Automate and maintain tools and systems for software builds, continuous testing, automated deployments, software health monitoring, and software releases
Evaluate reliability, performance, scalability, and engineering aspects to ensure a smooth software production rollout and delivery
Be a thought leader and key contributor within our SRE team and help build a Site Reliability Engineering culture
Benefits:
Work from home expense
Excellent Health Insurance Options
Very generous PTO
Flexible Hours
Generous Equity
At Skyflow, we believe that diverse teams are the strongest teams. We invite applicants of all genders, races, ethnicities, nationalities, ages, religions, sexual orientations, disability statuses, educational experiences, family situations, and socio-economic backgrounds.