Site Reliability Engineer

Company Info
Tempo Software
Boston, MA, United States

Web Site: tempo.io


Title: Site Reliability Engineer
Location: Boston, MA
Job ID: 67627

Job Description:

 

*The position can be performed remotely or in office from Iceland, USA or Canada*


The Role

We at Tempo are looking for a Software Developer for Site Reliability Engineering to find innovative ways to optimize the development pipeline, runtime performance, and the availability and efficiency of all our cloud applications and services. If cloud security, reliability, performance, and cost optimization are projects you find exciting, and you have hands-on experience operating large-scale cloud services, we may well have the job for you!

 

The role involves

  • Developing solutions to scalability and performance challenges to keep our high-availability products up and running.
  • Supporting our developers with continuously delivering their software to the Tempo cloud platform.
  • Ensuring reliability, responding to outages and supporting the team in resolving pressing and complex technical issues.
  • Proactively finding ways to optimize our platform to ensure effective scalability and reduce operational costs.
  • Supporting various initiatives regarding cloud security, including identity verification, access controls, and permissions.
  • Developing and maintaining our tools for deployment and cluster management.
  • Debugging complex production issues using our observability tools, including logging, metrics and APM.

 

The ideal candidate

  • Has a BS or MS degree in Computer Science or a related technical field.
  • Has experience managing production systems in AWS.
  • Has experience managing microservice solutions with Kubernetes.
  • Has experience programming in Python and Go.
  • Has experience with Terraform and Ansible.
  • Has experience with DevOps, various deployment strategies and maintaining large-scale, multi-tenant applications.
  • Has an impressive track record in managing and working with cloud platforms and cloud automation and monitoring tools. Our stack includes AWS, GCP, Kubernetes, Docker, RabbitMQ, and Datadog. Experience working with these tools or alternative tools in a production setting is a must.
  • Has a solid understanding of configuration management and engineering for large-scale websites and/or products, including networking, databases, and operating systems.
  • Has a deep understanding of distributed version control systems like Git, including branching and merging strategies.
  • Has solid working knowledge of software build tools (e.g. Gradle and Maven) and continuous integration tools.
  • Is proactive and creative in identifying ways to improve systems and their reliability.
  • Is passionate about automation: we strongly believe in the benefits that repeatable environments bring to a software organization.

 

What's In It For You

  • Remote work options!
  • Unlimited vacation in most of our locations!
  • Great benefits plan including health, dental, vision and more
  • Great office spaces in Canada & Iceland
  • Diverse and dynamic teams
  • Challenging and exciting work
  • An opportunity to have a real impact on our business
  • Free breakfast and snacks
  • A great range of social activities
  • And so much more!

  Note: As our hiring teams are global, please submit your resume in English only!