Rootly: Developer Time is Precious

by Mars Garza

A common theme we look for when investing in early-stage companies is the move from “single player” to “multiplayer” mode: how a product can provide value both to the individual and across roles at a company to unify execution. This is particularly important when something fails. Systems encompass much more than code and commands; they include the exec team, engineering team, customer success, users, and vendors, and of course they cost time and money. This complexity requires collaboration, and yet there are no existing solutions that coordinate response across multiple teams when there’s a fire and a product cannot be effectively delivered to its customers.

This white space has only widened during the pandemic, and we take the view that it will continue to grow as teams stay increasingly remote. Issues are harder to resolve, and engineers no longer use physical war rooms to muscle through an incident, let alone prevent and predict incidents by improving the actual tooling. The stats support this perspective: 73% of companies saw a significant increase in demand as customers expect services to be always-on; 51% of companies report slower incident response times as COVID-19 forced the world remote; and demand for Site Reliability Engineering is growing rapidly, making it the 5th fastest growing job title per LinkedIn. In parallel, system complexity is growing exponentially as companies shift toward cloud-native microservices and depend more on third-party services. As Rootly co-founder Quentin says, “It is unlikely for one person to possess all of the tribal knowledge required to fix the problem, and anyone who does probably is not going to be available to help in the moment of need.”

Regardless, teams continue to be largely non-collaborative during fires, which is in direct conflict with the nature of incidents: urgent, complex, and novel. Instead, engineers memorize manual processes and use fragmented tooling while under pressure to resolve an incident as quickly as possible. And in the end, downtime costs an average of more than $300,000 per hour. Examples of downtime in the last year include Slack, Robinhood, and Fastly, which only goes to show that every company experiences chaotic incidents and everyone needs a better solution.

This is where Rootly comes in. Rootly is grounded in the insight that incidents will 100% happen, and that teams should benefit from them rather than be harmed by them. Not only do the Rootly platform and Slackbot support incident resolution by automating manual administrative tasks (and through much more robust, integrated, and centralized tooling), but they also support future prevention with retrospectives and actionable insights. The key to SRE is a successful and blameless postmortem that creates learning opportunities across teams.

We believe there is no one better to build and deliver this solution than Quentin and JJ, who understand the problem at a very practical level. Quentin was an early Site Reliability Engineer at Instacart, building much of the foundation for Instacart’s system, which went from handling thousands of orders a week to millions as the company scaled. JJ applied his own set of best practices on the product side of Instacart as the company experienced crazy growth during the pandemic. Together they’ve built an elegant solution that takes the best practices and tools used by teams with million-dollar budgets and makes them accessible to any organization, effectively eliminating the need for a company to build something in-house.


XYZ is excited to announce our lead investment in Rootly’s $3.2M seed round, alongside 8VC, Y Combinator, and C-suite executives at companies like Dropbox, GitHub, and Instacart, to help execute the team’s vision to change the SRE landscape and save companies millions in revenue and lost productivity. We think there is no better time than now: we’ve had enough fires.
