Senior Staff Engineer - Platform QA
Graphcore
About Graphcore
How often do you get the chance to build a technology that transforms the future of humanity? Graphcore products have set the standard in made-for-AI compute hardware and software, gaining global attention and industry acclaim. Now we are developing the next generation of artificial intelligence compute with systems that will allow AI researchers to develop more advanced models, help scientists unlock exciting new discoveries, and power companies around the world as they put AI at the heart of their business. Graphcore recently joined SoftBank Group – bringing large and ongoing investment from one of the world’s leading backers of innovative AI companies.
Job Summary
We need an engineer to join our Platform Quality Assurance group. You will creatively exercise our product to provide feedback to the engineering delivery and product management teams, helping them make decisions about engineering effort and the trajectory of the product. In a component Quality Assurance team, you will have a close working relationship with the relevant development teams. In the integration Quality Assurance team, you will exercise the aggregated and composed system at a much larger scale, with a bias towards informing the product management team.
You will provide valuable observations and measurements of the product, looking beyond the natural focus of the delivery and product teams, painting a comprehensive behavioural picture of the product in typical and atypical scenarios.
Responsibilities and Duties
- Planning, constructing, and executing tests, and writing reports optimised for their different decision-making readers, including the delivery engineering and product management teams.
- Organising and maintaining a repository of results, and collaborating with the QA team.
- Developing or refining your expertise in the domain area of the product component or the system in aggregate and at scale.
- Specific domains include Workload Management (Kubernetes, Ray, and so on); Cloud Development (Cloud Infrastructure Automation); Management & Observability (open source and commercial monitoring, observability and DCIM solutions).
Skills and Experience
[Essential]
- Strong, relevant programming experience in Python, Go, C++, infrastructure-as-code scripting, or other languages related to the domain.
- Experience working in Linux environments.
- Automation of building/testing with continuous integration systems.
- Strong impartial report writing optimised for the reader.
- Aptitude for planning, constructing, and executing the responsibilities and duties above.
- English: C1 level.
[Desirable]
- Domain experience of the products under test: Containerisation (e.g. Docker), Virtualisation and Provisioning, Workload and job scheduling (e.g. Kubernetes, Ray) on high core-count machines and rack-scale installations, Management and Observability (e.g. Prometheus, OpenTelemetry, DataDog, Splunk, etc.).
- 10+ years of relevant experience in quality assurance or testing teams.
- Experience with the Atlassian suite and CI/CD platforms such as Jenkins, GitHub Actions, or GitLab CI/CD.
Benefits
In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar!
We welcome people of different backgrounds and experiences; we’re committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.