Description

Fortanix is a dynamic start-up solving some of the most demanding data protection challenges for companies and governments around the world. Our disruptive technology maintains data privacy across its entire lifecycle (at rest, in motion, and in use) and across any enterprise IT infrastructure: public cloud, on-premises, hybrid cloud, and SaaS.

With key strategic partners like Microsoft, Intel, ServiceNow, and Snowflake, Fortanix customers like PayPal, Google, and Adidas are reaping the benefits. Recognized by Gartner as a “Cool Vendor”, Fortanix is revolutionizing cybersecurity.

Join the revolution!

Why work with us?

We're seeking passionate people to work with us to change the very idea of how people use cloud computing. We take pride in making Fortanix a great place to work. Coworkers recognize that great ideas can come from anyone, and everyone is encouraged to jump in, contribute, and ask questions.

In tackling the hardest problems, we believe that working together will produce better solutions.

The Job

As part of this job, you will be responsible for the reliability of multiple Fortanix production environments. You will design operations as code, work continuously with product engineering to improve reliability, implement an actionable monitoring framework, and take part in production on-call.

Key Responsibilities

  • Improve production reliability of multiple Fortanix products via automation.
  • Participate in production upgrades, migrations, disaster recovery drills, backup/restore, securing cloud environments, logging, log analytics, etc.
  • Work with DevOps, Networking, Customer Success, and Development to continuously improve the production environment.
  • Manage the service status and incident portal.
  • Participate in the on-call incident response for critical issues.
  • Respond to and communicate with impacted customers, providing root-cause analysis and an action plan.
  • Design tests to simulate scenarios/events before they occur.
  • Manage IAM for production systems.

Requirements

Technical Experience

Experience with modern enterprise site reliability engineering, along with experience in the following areas:

  • Automation experience with Python, Ansible, Terraform, CloudFormation, etc.
  • Advanced experience with Linux administration and automation.
  • Experience with production debugging and the ability to implement fast workarounds.
  • Advanced experience managing software deployments in the cloud via pipelines (for example, Bitbucket or GitLab) and in the datacentre.
  • Understanding of DevOps practices for how modern software is deployed, upgraded, and monitored.
  • Experience with both managed (AKS, EKS, GKE) and unmanaged (on-prem) Kubernetes, especially production experience with Kubernetes and Docker.
  • Experience with high-level network infrastructure for datacentre and cloud.

Key Requirements

  • Engineering: 5+ years of engineering experience, including 3+ years of core site reliability engineering experience.
  • Solid understanding of Cloud technologies.
  • Demonstrated ability to coordinate cross-functional work teams toward completion.
  • Demonstrated multitasking, effective leadership, and analytical skills.
  • Must be a team player.

Benefits

  • Unlimited PTO (it’s between you and your work)
  • Health, dental, and vision insurance.
  • Friendly culture that brings the best out of everybody.
  • Great working environment. We believe this in its truest form: "Never doubt that a small group of thoughtful, committed technologists can change the world. Indeed, it is the only thing that ever has." - Margaret Mead


Fortanix is an equal opportunity employer that celebrates diversity and is committed to creating an inclusive workplace with equal opportunity for all applicants and teammates. Our goal is to recruit the most talented people from a diverse candidate pool regardless of race, color, religion, age, gender, gender identity, sexual orientation, or any other status. If you’re interested in working in a fast-growing, exciting environment, we encourage you to apply!