Katmai Tech Inc.

USA, Canada

Posted on: 09 June 2024

Senior Site Reliability Engineer

ABOUT KATMAI

Katmai is pioneering the future of virtual experiences and hybrid work. The platform brings people together inside an easy-to-navigate 3D environment, enabling natural communication & collaboration, spontaneous interactions, and a sense of place that’s been missing from the digital world. The simplicity of the user experience means no headsets are required — Katmai runs in the browser of any webcam-enabled computer.

Katmai was founded in 2020 and partners with innovation-focused brands to create everything from branded virtual offices to one-off interactive experiences to digital twins of physical real estate.

JOIN THE TEAM

The Katmai team spans the globe from Alaska to the Netherlands, with many stops along the way. We are a fully remote team that works inside our own product. Call us biased, but we think it’s awesome.

Working at Katmai, we experience first-hand the advantages of being together in the same space. We feel the way it facilitates more natural communication and self-expression. We see it turn strangers into close colleagues and friends. Yes, we are building a virtual game changer here.

Ready to be part of the future?  Join Katmai’s innovative team and together, let’s redefine virtual experiences and hybrid work in a headset-free world of 3D collaboration and spontaneous interactions!

We work inside of the Katmai Virtual Office.

WHAT WE’RE SEEKING

We are looking for a Senior Site Reliability Engineer to join our DevOps team: someone with a track record of crafting solutions that improve availability, performance, monitoring, and incident management. If you’re passionate about mentoring and crave a role where your contributions make a significant impact, we want to hear from you.

WHAT YOU WILL DO

As a Senior SRE, you will work with the DevOps team (DevX, Infrastructure, SecOps, ITOps) and other engineering teams to ensure performance, availability, monitoring, and incident response for all our systems. Additionally, you will collaborate with the support and operations teams to establish a streamlined process for incident management. Duties include, but are not limited to:

  • Assist in the design of systems with a focus on reliability. You will work closely with the DevOps team to ensure that systems are built with reliability in mind from the ground up.
  • Monitor systems to track the performance and health of components within the infrastructure, including configuring alerts that notify the team of potential issues before they escalate into outages.
  • Lead efforts to quickly diagnose and resolve incidents, minimizing downtime and impact on users.
  • Develop and maintain tools and scripts to automate repetitive tasks, streamline processes, and improve efficiency.
  • Analyze system performance, stress test systems and collect metrics to forecast future capacity needs and plan for scaling infrastructure accordingly.
  • Design and implement incident management systems and disaster recovery plans so that systems can recover quickly from outages or failures. This includes implementing backup systems and failover mechanisms and conducting regular disaster recovery drills.

WHAT YOU BRING

  • Technical Skills
    • Knowledge of and experience with logging and monitoring tools such as Dynatrace, Datadog, Splunk, New Relic, or similar.
    • Work experience with alerting systems (PagerDuty or similar) and on-call response.
    • Ability to set up and manage an incident management system by integrating and using different tool sets.
    • Experience building dashboards that combine metrics and monitoring data to show the health of our systems and services.
    • Excellent coding and scripting skills in more than one language (Python, shell, Groovy); Terraform is a plus.
    • Experience as an SRE in a public cloud environment, with a good understanding of cloud services (AWS), networking, and storage.
    • Skills and experience in observability, debugging, and performance tuning.
    • Ability to address issues based on priority and urgency, and to facilitate follow-up and resolution to meet SLAs for production availability.
    • Ability to recognize when a problem keeps recurring and take the initiative to automate the fix.
    • A good understanding of CI/CD processes and systems.
    • Bachelor's Degree in Computer Science, Computer Engineering, or equivalent work experience.
  • Communication
    • Ability to communicate openly and effectively within the team and across teams.
    • Openness to learning new technologies and tools and making the most of their potential.
    • Always be approachable and foster an inclusive environment.
  • Teamwork
    • You excel in remote team environments, drawing on your extensive experience to contribute effectively.
    • You help others with specific, constructive feedback to support the growth and development of your peers.
    • You can set team direction to inspire collaboration towards shared goals.
    • You are passionate about mentoring junior engineers as well as peers.
    • You take ownership and responsibility in any situation.

 

Katmai is a fully remote company currently hiring in the United States and Canada. The minimum and maximum full-time annual base salaries for this role are based on your job-related skills and experience and on your location. Please note that this base salary information applies solely to candidates hired to perform work within one of these locations and refers to the amount Katmai is willing to pay at the time of this posting. Actual base salaries will vary depending on factors including but not limited to job-related skills, experience, performance, and work location. The salary range listed is just one component of Katmai’s total compensation package for employees. These ranges may be modified in the future.

  • United States: USD $165,000 to USD $185,000 for a Senior SRE
  • Canada: CA$160,000 to CA$185,000 for a Senior SRE

 

Katmai is an equal opportunity/affirmative action employer. All qualified applicants will be considered for employment without unlawful discrimination based on race, color, creed, national origin, sex, age, disability, marital status, sexual orientation, military status, prior record of arrest or conviction, citizenship status, current employment status, or caregiver status.

 

Benefits

Katmai is committed to offering a comprehensive portfolio of employee benefits designed to support the health and wellbeing of you and your family. Benefits vary by location and include:

  • 100% remote work environment
  • MacBook Pro and monitor
  • Health, dental, and vision insurance (100% of premium paid)
  • Katmai swag
  • Vacation, Sick Leave, and Holiday Pay
  • Short-term Disability and Life Insurance
  • Employee Assistance Plan
