Site Reliability Engineer (SRE)
Why is the Site Reliability Engineer important at PowerChord?
PowerChord’s digital platform helps manufacturers grow online engagement and in-store sales through their dealer networks. Working with leading brands from around the world, we’ve combined years of industry expertise with advanced technology to provide easy-to-use, scalable tools that help dealers compete against large online retailers.
PowerChord needs an experienced Site Reliability Engineer (SRE) who is passionate about service reliability and performance to help lead the creation and evolution of our SaaS platform, which powers millions of web pages serving customers on three continents. As we continue to build out the platform, we recognize the need to empower a detail-oriented and results-driven SRE.
Traits of an ideal candidate for PowerChord’s Site Reliability Engineer:
- Loves writing Go (golang) to automate away manual processes and eliminate friction from continuous integration and deployment pipelines
- Is experienced with, and wants to leverage, Kubernetes on Google Cloud Platform (GKE) to achieve great things
- Wants to be empowered to make a meaningful impact on our organization
- Recognizes that the software that deploys and observes our software is our most important software
What are some of the things you will help us achieve?
- Build and maintain a resilient and fast pipeline that detects and recovers from common failures
- Ship often
- Help everyone who writes code feel comfortable deploying and rolling back to the last known good state
- Invest in observability and instrumentation so that, after a deployment, we can confidently verify that each release is working and nothing strange is occurring
- Invest in canaries and in promoting traffic gradually and automatically
- Enrich alerts by attaching meaningful details, especially to new alerts seen within an hour of deployment
- Build and continuously refine scalable infrastructure that delivers improved reliability, repeatability, and security, raising service levels while lowering costs and increasing productivity
- Build reliable, available, and sustainable observability into our products, infrastructure, and organization
- Partner in the design and development of new and evolving services, architecture, and performance standards
- Set the rules and create tools needed to automate processes, and facilitate the deployment and rollback of new services or changes to existing ones
- Collaborate on, and directly contribute to, instrumenting systems so their internal state is observable and highly visible within the organization
- Write code and scripts to automate resource provisioning, service configuration, and the promotion of code using tools and languages such as Go (golang) and Bash
- Develop and maintain interactive runbooks that describe problems, steps to triage, actions to resolve, escalation paths, and other guidance to help team members quickly identify and resolve issues in production systems
- Troubleshoot, diagnose, and resolve problems in production and other environments, including sharing on-call responsibilities
- Minimize the risk of failures affecting durability, availability, performance, and correctness
- Document incidents and produce incident response reports and post-mortem reports
- Collaborate and partner with the rest of engineering, working together to come up with solutions that meet the dynamic and changing needs of our business
What qualifications are we looking for?
- 5+ years of experience in Software Engineering and/or project management
- BA/BS in a technical discipline or equivalent experience
- Strong technical knowledge and experience working in complex technology engagements
- Software Engineering and/or hands-on programming experience
- Experience with Atlassian JIRA, Confluence, and the JIRA Portfolio add-on
- Knowledge of server-side technologies such as Go (golang) and Node.js, and of public cloud providers including Google Cloud and AWS
- Experience with Kubernetes (k8s), GKE, and cloud-native software
- Most of our Engineering Team is distributed throughout the continental United States; however, preference for this position is given to candidates in the Tampa, FL region.
Working at PowerChord
- Competitive compensation, including a SIMPLE IRA with 3% company match
- 14 days of vacation in your first year – woo hoo! Plus 2 additional days each year, up to a maximum of 22
- Health is important – we carry a variety of plans to meet your needs and budget
- Paid short-term and long-term disability plans that run concurrently with FMLA leave
- 2 weeks of fully paid family leave for the birth or adoption of a new child (mother or father)
- Unlimited coffee, cold beverages & snacks
- Dress for success in your own personal style – shorts and flip flops will do!
- Company-sponsored tech talks and happy hours
- Much more…
For more information please visit us at www.PowerChord.com.
PowerChord is an equal opportunity employer. Must be legally eligible to be employed in the United States without sponsorship/transfer.