Defining a revenue metric in Transform (Part 1): Capture metric logic in code

Revenue is a critically important metric for every business, but often, it’s poorly defined across growing organizations. For example, it’s not uncommon for teams like sales, finance, and product to end up with their own slightly different definitions for what revenue actually means! This can lead to a number of obvious problems with everything from compliance to reporting, but it also introduces inefficiencies and confusion that can hurt an organization’s ability to make the best decisions.

To create alignment, your organization needs a centralized place to capture and publish vetted metric definitions and other institutional knowledge around your most important metrics. Transform provides a central catalog, framework, and API to capture metrics definitions, add context, and help people find the metrics they need to make decisions.

To see Transform in action, watch our product demo.

Using a revenue metric as an example, this post will walk through how a fictional song manufacturing company, CustomSong Inc., implements best practices around metrics.

  • In this article (part 1 of this series), you’ll see how CustomSong uses Transform to record an approved, change-controlled metric definition. Once they’ve codified the definition, you’ll see how they use the Transform API to analyze the metric and associated metadata in downstream tools.
  • In part 2 of this series, we’ll share how to add context to your metrics with the Metrics Catalog.

Creating a consistent definition of revenue in code

Inconsistent metric definitions lead to costs, inefficiencies, duplicative work, misalignment, and imprecise, or even inaccurate decision making. Airbnb’s Tech Blog shared a great post about the processes they use to create alignment across teams.

Without a standard and centralized definition of revenue, analysts may spend hours writing repetitive SQL statements from saved queries that may not match reporting from other parts of the organization. If an organization uses multiple business intelligence (BI) tools, analysts are likely duplicating work in addition to calculating the same metrics, over and over.

Transform computes metrics at the appropriate level of aggregation enabling analysts to leverage the same code-based definition to do analyses or develop reporting in any downstream BI tool. The definitions are captured in the Transform Metric Framework.

In our example, CustomSong derives revenue from the transactions table in their accounting system schema.

This table can be represented in the Transform Framework as a data source using the following YAML configuration:

 sql_table: accounting_system.transactions

This first field,  sql_table, specifies the name of the SQL table that this configuration applies to. For something more complex, a SQL query (as sql_query) can be used.

sql_query: |
    SELECT
        t.transaction_id
        , t.customer_id
        , t.order_id
        , t.transaction_amount_usd
        , tt.transaction_type
    FROM
       accounting_system.transactions AS t
    JOIN
       accounting_system.transaction_type AS tt
    ON
       tt.id = ta.transaction_type_id

Measures are aggregatable expressions written from the various columns of this underlying table or query. The definition of a measure includes an aggregation type that can be used to construct metrics. This image is from the Transform Framework datasource configuration file. In addition to defining the source table or query, this file captures the measure and dimensions associated with that source.

The name field can either be the column name or, when coupled with the expr field, a new name altogether—as seen in the transaction_orders measure. Name and agg (short for aggregation) are the only required fields, and fields can be expressed in any order.

agg defines how the field will be aggregated over a date or dimension granularity. For example, a sum aggregation type over a granularity of day would sum the values across a given day. Transform currently supports all common aggregations—sum, count, min, max, average, sum_boolean, and count_distinct.

measures:
   - name: transaction_amount_usd
     agg: sum
     description: The total USD value of the transaction.
   - name: transaction_orders
     description: The total number of transaction orders.
     expr: order_id
     agg: count_distinct
   - name: cancellations_usd
     description: The total USD value of the transactions that were 	 cancelled.
     expr: CASE WHEN transaction_type_name = 'cancellation' 
           THEN transaction_amount_usd ELSE 0 END
     agg: sum

In addition to the measures defined above, you will define identifiers to enable this query construction. Identifiers act as the join keys between tables and the Transform Metrics Framework uses the Identifier type (primary, foreign, unique) to automatically construct appropriate joins with other tables. Identifiers also have all the properties of a dimension in that they can act as a granularity for aggregation.

Transform supports the most common metric types, including ratio, expression, and cumulative metrics. Using the measures created in the datasource YAML, constructing a revenue metric is simple. The revenue metric will be an expression metric (expr). In Transform, expression metrics allow you to pass in any valid SQL expression using a combination of measures. Revenue is net of cancellations and alterations.

metric:
 name: revenue_usd
 description: |
   Revenue (USD) is the income generated from normal business
   operations net of cancellations and alterations, expressed in
   USD.
 owners:
   - support@transformdata.io
 display_name: Revenue (USD)
 type: expr 
 type_params:
   expr: transaction_amount_usd - cancellations_usd + alterations_usd
   measures:
     - transaction_amount_usd
     - cancellations_usd
     - alterations_usd
 tier: "1"

In addition to measures and metrics, you can define related dimensions. Dimensions are non-aggregatable expressions that are used to define the level of aggregation that a Transform Metrics Framework user would like a metric to be aggregated to.

 dimensions:
   - name: ds
     type: time
     type_params:
       is_primary: True
       time_format: YYYY-MM-DD
       time_granularity: day
   - name: is_large
     type: categorical
     expr: CASE WHEN transaction_amount_usd >= 30 THEN TRUE ELSE FALSE END

Once you create the definitions listed  above, you can answer questions with the API:

  • What was CustomSong’s revenue for last year?
  • What is CustomSong’s year-over-year revenue growth?
  • What was CustomSong’s revenue for last year from large transactions?
  • What is the week-over-week revenue growth of large transactions?

If another data source were added that described a users table, you could quickly answer more complex questions:

  • What was CustomSong’s revenue by user's country?
  • What was CustomSong’s revenue from returning customers?
  • What was CustomSong’s year-over-year revenue growth from US users?

At this point, the Transform MQL server will generate and compute the query against your data warehouse—no SQL required. Results are cached so subsequent queries are lighting fast. While these examples are simple, queries can quickly become complex when many metrics, dimensions, and joins are involved.

By using Transform Metric Framework, CustomSong will have a codified, canonical definition of the revenue metric, as well as programmatic access to the metric’s value via the API.

Analyze a consistent revenue metric in all your downstream tools

Transform's goal is to allow you to query your key metrics simply and reliably within your existing data tools and workflows. After users define their important metrics in our Framework, we offer a wide variety of interfaces to access this information, which all rely on the core Metrics Query Language API (MQL).

MQL is the single source-of-truth layer that allows you to access consistent query results in all your favorite tools. Check out our recent post on our JDBC driver. In addition to the JDBC driver, Transform currently supports various interfaces that you can use for your last-mile analysis.

We believe that SQL should be treated with the same rigor as any other business logic: peer-reviewed, version-controlled, and abiding by the principle of Don't Repeat Yourself or a commitment to reduce the duplication of logic. To this last point, Transform can eliminate the need for writing (or copy-pasting) SQL into any of the tools used to do last-mile data visualization and analysis, in exchange for making simple Transform queries reliably across all of these tools.

The simplification of requests not only eliminates the need for complex and redundant logic in your downstream tools, but also ensures that your most important metrics are consistent across all of them, whether it be a CRM product, anomaly detection algorithm, or a BI tool.

With revenue available in all tools, the CustomSong growth team can evaluate experiments with the same metric that the finance team reports revenue to the board. They can all sing the same revenue song in harmony.

In our next post in this series, we share how you can bring context to your revenue metrics, including asking questions, adding annotations, and assigning metrics owners with the Transform Metrics Catalog.

Belinda Bennett

Belinda Bennett

Belinda is a Data Science Manager at Transform.