37Signals had a fantastic blog post last week detailing the internal metrics service they use for all their products. After using a larger stats package, they moved to a custom system for many of the same reasons that we created Metricfire:
“While we still use some of those tools today, we found ourselves wanting more – more information about the distribution of timing, more control over what’s being measured, a more intuitive user interface, and more real-time access to data when something’s going wrong.”
We’re pleased to see that the reasons they moved away from existing tools, and the design decisions they made in their own system, are not a million miles away from our own. We also deliver stats over UDP and compute instant aggregations of min, max, sum, observations and standard deviation. Data transformations are baked in too, which makes other statistical information, like moving averages, available without you having to mess with code.
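To give a feel for what those aggregations involve, here is a minimal Python sketch. It is purely illustrative and not Metricfire's or 37Signals' actual code: the `RunningStats` class and `moving_average` function are hypothetical names, and the standard deviation is assumed to be the population form, computed with Welford's online method so that no raw samples need to be stored.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical aggregator: updates its summary one observation at a time."""
    count: int = 0                 # number of observations
    total: float = 0.0             # sum of all observations
    minimum: float = math.inf
    maximum: float = -math.inf
    _mean: float = 0.0             # running mean (Welford)
    _m2: float = 0.0               # running sum of squared deviations (Welford)

    def add(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)
        # Welford's update keeps the variance numerically stable.
        delta = value - self._mean
        self._mean += delta / self.count
        self._m2 += delta * (value - self._mean)

    @property
    def stddev(self) -> float:
        # Population standard deviation (an assumption for this sketch).
        return math.sqrt(self._m2 / self.count) if self.count else 0.0


def moving_average(values, window):
    """Trailing moving average; emits one value per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]   # drop the value leaving the window
        if i >= window - 1:
            averages.append(running / window)
    return averages


stats = RunningStats()
for timing_ms in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.add(timing_ms)
```

Because each update only touches a handful of counters, a summary like this can be refreshed the moment a measurement arrives, which is what makes "instant" aggregation practical.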
We started Metricfire to solve the ‘Everything in One Place’ problem, where you try to get several systems designed for different purposes working together to produce a holistic service that covers the whole problem. Generally, it either doesn’t work or just takes too long and leaves you with plenty of quirks to work around. We’re taking care of the entire chain, from the point where you want to know how a certain part of your app is performing, right up to the point at which you wake someone up in the middle of the night.
It’s interesting to get this sort of perspective (and validation) from a company like 37Signals. Let us know how we can help you get the same sort of application monitoring for your own systems.