Welcome back to the series on Datadog pricing. In the previous blog post we covered some of the reasons why you should care about your Datadog costs. In this post, we're going to cover how Datadog products get billed in general, and what causes those end-of-month invoices to be a bit, let's say, unexpected.
As in the previous post, a disclaimer first - Datadog is an AMAZING product. And although throughout this series I'm covering how costly it is, it’s also very valuable, and the observability gained by it is exceptionally good. You pay a lot, but also get quite a lot in return.
Datadog Products at a High Level
Generally, we can break down Datadog product pricing into 3 major buckets:
Host based pricing
Volume based pricing
User based pricing
Some of these pricing models have exceptions and additional add-ons, but the general scheme of things is the above three.
To best understand the end-of-the-month invoice, it’s important to understand each of those, and how they can be optimized, or forecasted.
Host Based Products
In host-based products, such as APM, Infrastructure, and CSM, Datadog offers a monitoring platform for hosts (instances), with various capabilities.
In these products' pricing model, the customer commits to a number of monitored hosts on a monthly basis, at what seems to be a very reasonable price of a few tens of dollars per host. The interesting, and perhaps surprising, part is that Datadog bills by the hourly count of concurrently monitored hosts (hosts active in 5-minute intervals, averaged over the hour). So when you commit to X monitored hosts, you're actually committing to X concurrently active hosts in each hour of the month.
Why can that be surprising? Because our systems are usually elastic and might (auto) scale - the 100 hosts you've committed to can become 170 at the peak hours of your system's traffic, and 70 at its low point, but you have to put a fixed number on your commitment. And here it becomes a game of dynamic optimization: commit to just enough hosts to not be slaughtered on on-demand rates during your peaks, while not overcommitting in your low hours.
Moreover, Datadog pushes you to commit up front in this pricing model by applying quite an aggressive method: if you don't commit to any quantity, they bill by your 99th-percentile usage hour, NOT by actual usage. I.e., the number of hosts active in your 99th-percentile busiest hour determines your end-of-month invoice. And as engineers, we know how unpredictable our 99th percentile can be.
On the engineering side, I can understand why this pricing model was chosen - when Datadog has no estimate of your usage, they need to overprovision their system to support any load, and thus carry a larger operational cost. Whether that justifies p99 billing is up to you to decide.
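To make the host-based model concrete, here is a rough Python sketch comparing a commitment against p99 billing over a month of hourly host counts. The rates and the exact overage mechanics are illustrative assumptions for this sketch, not Datadog's actual price list:

```python
import math

# Hypothetical rates -- check your own contract for real numbers.
COMMITTED_RATE = 23.0   # $/host-month for committed hosts
ON_DEMAND_RATE = 34.5   # $/host-month equivalent above the commitment

def committed_monthly_cost(hourly_hosts, committed):
    """Cost with a commitment: pay for `committed` hosts up front,
    plus on-demand rates for each hour's usage above the commitment."""
    hours = len(hourly_hosts)
    overage_host_hours = sum(max(h - committed, 0) for h in hourly_hosts)
    # Convert overage host-hours into host-month equivalents.
    return committed * COMMITTED_RATE + (overage_host_hours / hours) * ON_DEMAND_RATE

def p99_monthly_cost(hourly_hosts):
    """No commitment: billed by the 99th-percentile busiest hour."""
    ranked = sorted(hourly_hosts)
    p99 = ranked[min(math.ceil(0.99 * len(ranked)) - 1, len(ranked) - 1)]
    return p99 * ON_DEMAND_RATE

# An elastic fleet: 70 hosts at night, 170 at peak, over ~720 monthly hours.
usage = [70] * 300 + [100] * 300 + [170] * 120

p99_monthly_cost(usage)             # no commitment: you pay as if you ran 170 hosts
committed_monthly_cost(usage, 100)  # committing to the baseline is far cheaper
```

Playing with the `committed` parameter against your own usage curve is exactly the "dynamic optimization game" described above.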
Volume Based Products
In volume-based products, such as Logs and Synthetics, you pay per use - quite as simple as that. You send Datadog 100GB of logs, you pay for 100GB of logs; you execute 1M synthetic checks, you pay for 1M synthetic checks.
The commitment model here is quite simple too - you commit to a volume and get a reduced price, and once you exhaust your committed volume, you get billed at on-demand rates.
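As a sketch, with hypothetical rates (check your own contract for the real ones), the volume commitment model boils down to:

```python
# Hypothetical rates -- illustrative only, not Datadog's actual price list.
COMMITTED_RATE = 0.10       # $/GB at the committed (discounted) rate
ON_DEMAND_MULTIPLIER = 1.5  # on-demand at ~150% of the committed price

def volume_cost(used_gb, committed_gb):
    """Pay the full commitment regardless of use; overage bills at on-demand rates."""
    overage = max(used_gb - committed_gb, 0)
    return committed_gb * COMMITTED_RATE + overage * COMMITTED_RATE * ON_DEMAND_MULTIPLIER

volume_cost(80, 100)   # under-use: you still pay the full commitment
volume_cost(150, 100)  # 50 GB of overage billed at 150% of the committed rate
```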
This is the basic SaaS business model, and Datadog is no different from many other vendors here.
Logs, being a very popular Datadog product, varies a bit from other volume-based products and is worth diving a bit deeper into.
It’s important to understand that in the Logs product, sending logs to the Datadog platform has multiple billings applied to it:
Ingestion - you pay for the GB volume of logs you send (usually a fixed price of $0.10 / GB of data)
Indexing - you pay for the number (count) of logs you store on the Datadog platform, and the price varies with the retention period for which the data is kept (the longer you keep the data, the more expensive it is)
Rehydration - if, for any reason, you want to make logs that have passed their retention period available again, you need to rehydrate them, and pay for it.
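Putting the first two components together, a simplified monthly logs bill can be estimated like this - the $0.10/GB ingestion price is the one mentioned above, while the per-retention-tier indexing rates are illustrative assumptions:

```python
INGESTION_RATE = 0.10  # $/GB ingested, as mentioned above

# Hypothetical indexing rates: $/million indexed log events, by retention tier.
INDEXING_RATES = {7: 1.06, 15: 1.27, 30: 1.70}  # days -> $/M events

def logs_monthly_cost(ingested_gb, indexed_millions, retention_days):
    """Estimate a monthly logs bill: ingestion by volume plus indexing by count."""
    ingestion = ingested_gb * INGESTION_RATE
    indexing = indexed_millions * INDEXING_RATES[retention_days]
    return ingestion + indexing

# 500 GB ingested, 300M events indexed for 15 days:
logs_monthly_cost(500, 300, 15)  # ingestion is the small part; indexing dominates
```

Note how indexing, not ingestion, dominates the bill - which is why exclusion filters on indexes are usually the first optimization lever.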
Up until this point, no surprises - you send data to Datadog, therefore you pay for it, totally makes sense.
The interesting part about logs is that you pay regardless of usage - i.e. even if you never open the Datadog log search page, you still pay those amounts. That makes total business sense for Datadog, since they have to process, store, and index all these logs anyhow, and bear the cost of doing so. But as a user, you might be paying for something you never actively use, or use very rarely. I don’t have numbers to support this, but if I had to guess from personal experience, I’d say that less than 0.1% of indexed logs are ever actually used. The bottom line here is to think before you log (I’ve written in the past on log cost optimization).
Logs aren’t free, and aren’t cheap.
It’s important to mention that in usage-based products it is very easy to overrun commitments and land on on-demand pricing, which is >150% of the committed price (by default).
All it takes is changing a synthetic API test to run every 5 minutes instead of 10 to double its cost, or launching a new service to production that adds another 20% chunk of logs - things developers can do (and are doing) without giving it much thought.
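The synthetic-test example is simple arithmetic - the number of runs (and hence the bill) is inversely proportional to the check interval:

```python
def monthly_runs(interval_minutes, locations=1):
    """Synthetic test executions in a 30-day month.

    Each run from each configured location is billed separately.
    """
    return (30 * 24 * 60 // interval_minutes) * locations

monthly_runs(10)  # 4320 runs
monthly_runs(5)   # 8640 runs -- halving the interval doubles the bill
```

Multiply by the number of locations and tests, and a one-line config change quietly reshapes the invoice.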
And to be honest, when it comes to user-based products, I have no complaints against Datadog - they are quite gentlemanly about defining what “active” users are.
Buy One Get Some for Free
The last piece of the puzzle is products such as containers, custom metrics, and indexed spans.
For these products, you get a “free amount” bundled with some of the “main” Datadog products you use.
For example, for each monitored host, you’ll get (on an Enterprise account):
4 monitored containers
200 custom metrics
1M APM spans
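Assuming, as is commonly the case, that these allotments are pooled across all monitored hosts, a rough sketch of what actually becomes billable (using the Enterprise numbers above) might look like:

```python
# Per-host free allotments from the list above (assumed Enterprise plan).
FREE_CONTAINERS_PER_HOST = 4
FREE_CUSTOM_METRICS_PER_HOST = 200

def billable_containers(avg_containers, hosts):
    """Containers above the pooled per-host allotment are billed separately."""
    return max(avg_containers - FREE_CONTAINERS_PER_HOST * hosts, 0)

def billable_custom_metrics(metric_series, hosts):
    """Custom metric series above the pooled allotment are billed separately."""
    return max(metric_series - FREE_CUSTOM_METRICS_PER_HOST * hosts, 0)

# 100 hosts running a dense Kubernetes cluster:
billable_containers(3000, 100)        # only 400 containers are "free"
billable_custom_metrics(50_000, 100)  # only 20,000 series are "free"
```

In a dense Kubernetes cluster the 4-containers-per-host allotment evaporates instantly, which is how these “free” products end up as real line items.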
These products will generally behave like the host-based / volume-based products mentioned above.
They are the hardest to understand, reason about, and get their cost under control. I won’t get into the dirty details of why they are so difficult to manage, but I’ll try to summarize the general theme below.
The Challenge of controlling container monitoring cost:
In highly containerized and dynamic environments such as Kubernetes, each host (i.e. cloud provider instance) can run a huge number of unique active containers, causing the host-monitoring cost to skyrocket. In our case it increased our cost by ~50%.
To be fair, Datadog tries to average out outliers, but in the age of Kubernetes, containers come and go and incur additional Datadog costs.
Since custom metrics cost is per unique tag combination, the cost grows non-linearly when adding additional tags! (It depends greatly on the cardinality of each tag and whether the tags are codependent.)
In addition, using the right observability primitive is critical! Distribution/histogram custom metrics are billed at least 5 times higher than gauges/counters.
If you’ve ever built modern, complex, microservice-based systems, you can probably understand how the cardinality of those tags can get out of hand quite easily.
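A quick sketch of why tag cardinality bites: in the worst case (fully independent tags), the number of billable unique series is the product of each tag's cardinality. The tag names and cardinalities below are made up for illustration:

```python
from math import prod

def series_count(tag_cardinalities):
    """Worst case: unique time series = product of independent tag cardinalities."""
    return prod(tag_cardinalities)

# A request-latency metric tagged by endpoint (50), status code (8), region (4):
series_count([50, 8, 4])      # 1600 series
series_count([50, 8, 4, 30])  # adding a 30-value `pod` tag -> 48000 series
```

One innocent-looking extra tag multiplied the series count by 30 - and if the metric is a distribution rather than a counter, multiply the bill again.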
The Complexity of the Datadog Pricing Mental Model
Datadog has about 20-30 products, sub-products, and additional fine-print add-ons. All of them behave roughly according to the pricing models above (with some variations): some are priced per GB, some per 1M events, some per 1K events, some per 10K events, some per host-month, some per host-hour.
Browser tests are not 2.4 times more expensive than API tests - they are 24 times more expensive.
Or take the part about non-committed hosts being priced by the p99 usage hour - it isn’t mentioned on the main Infrastructure/APM pricing pages; it’s tucked away in the billing FAQ, a page that took me 5 minutes to find while writing this post, even though I knew it existed and knew what I was looking for.
Someone needs to hold this mental model in their head to understand, provision, and commit to a budget.
It took me more than two weeks to get my head around all those bits and bytes and create a usage-to-cost translation for our application. Those were two weeks of full focus on solving that problem, and I already had prior experience with many other observability platforms.
A week later I had forgotten half of what I learned. Expecting anyone who does this commitment budgeting once a year (at best) to understand all those bits and bytes is quite unreasonable.
This is hard on the company’s FinOps engineers - although they probably understand these pricing models, they usually don’t know the actual technical behaviour of the system and aren’t involved in the technical roadmap (and thus don’t know how many new services will be spawned or scrapped, which affects usage).
This is hard on the Dev team too - doing budgeting and cost allocation isn’t their core competence, and even for Devs it’s hard to understand and forecast volumes.
The bottom line is that this is a guessing game until you get experience with forecasting usage, translating it to cost, and figuring out the right cost-effective commitment. It shouldn’t be.
I hope I was able to shed some light on how the Datadog pricing model works, help you understand what you’re paying for, and maybe show where to start looking for optimization opportunities or for adjusting your commitments to better suit your needs.
In the next part we’ll cover what can be done to optimize Datadog costs.
As always, thoughts and comments are welcome on Twitter at @cherkaskyb.