As a data practitioner, new technologies and concepts click for me when I understand how they can be used to make my job easier. One of these new technologies is MetricFlow, Transform’s metric framework, which is now open source and available to all.
My “ah-ha” moment with the metric layer, and specifically with MetricFlow, came when I realized I could replace hundreds of lines of code in my analytics code base with MetricFlow’s abstractions to build the metric I needed. My favorite PRs to author and review as an analytics engineer are those that remove more code than they add!
For this post, I'm going to focus on creating a cumulative metric with MetricFlow. Relying on MetricFlow’s abstractions (measures, dimensions, data sources, and metrics) to express this metric made my code base easier to maintain and allowed me to focus on uncovering insights with my business partners instead of spending time maintaining and debugging SQL.
Cumulative metrics are just one metric type supported by MetricFlow. I’m focusing on this metric type because it’s often difficult to calculate and manage, and usually tracks company north star metrics like Recurring Revenue or Weekly Active Users (WAUs).
For those of you who want a primer on what a metric layer is and why we built it, our open source announcement for MetricFlow, Transform’s metric framework, is a great place to start.
Behind the scenes: How MetricFlow works
In MetricFlow, we take a “functional approach” to defining metrics. For a cumulative metric, like Weekly Active Users (WAUs), we only need to specify:
- The measure (i.e. the column we’re aggregating)
- The accumulation window (i.e. 7 days, 28 days, all time)
- Optional parameters such as grain-to-date, which restricts accumulation to the current grain period (for example, month-to-date). These parameters are specified in a few lines of YAML (see Exhibit 1 below).
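As a rough sketch of what such a definition looks like, here is a cumulative WAU metric in YAML. The metric and measure names are invented for this example, and the exact field names may differ depending on your MetricFlow version:

```yaml
# Sketch of a cumulative metric definition. The measure `active_users`
# is assumed to be defined on a data source elsewhere in the config.
metric:
  name: wau_rolling_7
  type: cumulative
  type_params:
    measures:
      - active_users
    window: 7 days
```

With only the measure and the accumulation window specified, MetricFlow owns the rest of the calculation.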
Once we’ve defined a metric, we can start interacting with this metric through the Transform Server or the MetricFlow API. This is where the magic happens.
I’m going to talk through the case of using MetricFlow and Transform together. Transform exposes a GraphQL API that we can use to request the metric we just defined. When a metric is requested, MetricFlow will figure out the best way to return the correct metric values. This might mean generating SQL that is executed against your data warehouse, or retrieving the results from a caching layer. Let’s focus on the case where the cache hasn’t been primed, and we’re both generating and executing SQL against the data warehouse.
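A toy model of this request flow is shown below. All of the names here are invented for illustration, and this is not Transform’s actual implementation; it only sketches the decision between serving from a primed cache and generating SQL:

```python
# Toy sketch of the metric-request flow: serve from the cache when it
# has been primed, otherwise generate SQL, execute it, and cache the
# result. Function and metric names are invented for this example.
cache = {}

def generate_sql(metric: str) -> str:
    # Stand-in for MetricFlow's SQL generation step.
    return f"SELECT ds, value FROM {metric}_source"

def run_query(sql: str) -> list:
    # Stand-in for executing the generated SQL against the warehouse.
    return [("2022-01-01", 42)]

def request_metric(metric: str) -> list:
    # Cache hit: return the previously computed result.
    if metric in cache:
        return cache[metric]
    # Cache miss: generate and execute SQL, then prime the cache.
    result = run_query(generate_sql(metric))
    cache[metric] = result
    return result

first = request_metric("wau")   # cache miss: generates & executes SQL
second = request_metric("wau")  # cache hit: served from the cache
```

The second request never touches the warehouse, which is why priming the cache for common requests improves performance.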
In the WAU example in Image 1, the SQL we generate to calculate WAUs is actually quite complex; because MetricFlow generates it for you, you don’t have to worry about owning and maintaining this code. At a high level, we select the requested columns from your data source, join to a date spine based on the accumulation window, outer join any requested dimensions that live in another data source, and filter the output by any constraints you have on your metrics (for example, filtering out test users).
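To make the shape of that generated SQL concrete, here is a minimal, self-contained sketch using SQLite: events are joined to a date spine, only events inside the trailing 7-day window are kept, and distinct users are counted per spine date. The table and column names are invented for this example and are much simpler than what MetricFlow actually emits:

```python
import sqlite3

# Tiny in-memory dataset: a fact table of user events and a date spine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, event_date TEXT);
    INSERT INTO events VALUES
        (1, '2022-01-01'), (2, '2022-01-01'),
        (1, '2022-01-05'),
        (3, '2022-01-09'), (4, '2022-01-09');
    CREATE TABLE date_spine (ds TEXT);
    INSERT INTO date_spine VALUES
        ('2022-01-01'), ('2022-01-05'), ('2022-01-09');
""")

# Cumulative WAU: for each spine date, count distinct users whose
# events fall inside the trailing 7-day accumulation window.
wau_sql = """
    SELECT
        d.ds AS metric_time,
        COUNT(DISTINCT e.user_id) AS wau
    FROM date_spine d
    LEFT JOIN events e
        ON e.event_date <= d.ds
       AND e.event_date > DATE(d.ds, '-7 days')
    GROUP BY d.ds
    ORDER BY d.ds
"""
rows = list(conn.execute(wau_sql))
for row in rows:
    print(row)
```

The window condition in the join is the piece that makes the metric cumulative: each spine date "looks back" seven days rather than aggregating a single day in isolation.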
MetricFlow builds what we call a dataflow plan (Image 2), which is essentially our strategy for computing the metric. If you’re interested in learning more about how we construct cumulative metrics, check out our MetricFlow repository. We can request this metric at any time granularity and cut it by any dimension available in the graph of your data sources. MetricFlow translates these requests into SQL and ensures both accuracy and flexibility for our metric.
Creating a cumulative metric BEFORE MetricFlow
How does this differ from a normal workflow? Without a metrics layer, the SQL needed to generate this metric would live in your analytics code base, likely as a leaf node on your DAG (directed acyclic graph). Another common pattern is keeping this logic as a one-off query saved in a BI tool or locally in an analyst’s IDE. This serves the basic use case of generating a summary table for the metric, but you’ll quickly run into issues if you need to update the time granularity or cut the analysis by another dimension for a stakeholder. Most teams approach this either by pulling the SQL out of the code base and modifying the underlying query to make a new view, or by adding different versions of the model to the code base. Both approaches have issues.
If you start writing custom SQL based on your original model, it’s tough to ensure logic consistency: multiple analysts could be making different updates, and you’ll inevitably duplicate effort by rewriting and re-running the query multiple times. If you instead try to anticipate what your stakeholders will ask for by adding each permutation of granularity and dimensions to your transformation layer, your DAG will quickly become bloated and hard to manage. Now your team is responsible for maintaining an unsustainable number of models; think of models called daily_active.sql, weekly_active.sql, monthly_active.sql, etc. Clearly, neither of these options is ideal for a data team.
Creating a cumulative metric WITH MetricFlow and Transform
The way that MetricFlow solves the challenges mentioned above is what gets me excited about the product. First, you’re able to DRY (Don’t Repeat Yourself) out your code base by relying on MetricFlow’s abstractions, allowing you to avoid managing models at different levels of granularity or with different dimensions. MetricFlow acts as a denormalization engine, joining in the tables that contain the needed dimensions and aggregating your measure to the right level of granularity. This can be done on the fly, or you can prime the cache for common requests to increase performance.
Second, the calculation of your most important metrics is standardized. You don’t need to worry about analysts modifying the WAU model, like adding a country split, and coming up with an incorrect value. Your organization has a canonical definition of WAU, defined in code, that is flexible enough to meet the needs of multiple stakeholders across finance, product, and marketing.
Lastly, and my personal favorite (!), your data team no longer has to maintain a large, brittle model! This means less time debugging and maintaining code.
Ultimately, new technologies should empower your data team. Defining metrics in code, and relying on MetricFlow’s abstractions to generate them, makes it much easier for data teams to manage a complex piece of SQL, like the one needed to generate cumulative metrics. It’s a shift in thinking: define the metric once instead of adding more nodes to your DAG.