Transform is a centralized metrics store in your data stack that enables you to define all your important metrics in a single place using MetricFlow, our Metrics Framework, explore and collaborate on these metrics using the Metrics Catalog, and perform further analysis by querying for metrics within your favorite tools using our Metrics API.
How to query for metrics within your favorite tools and workflows
Transform's goal is to allow you to query your key metrics simply and reliably within your existing data tools and workflows. After users define their important metrics in our Framework, we offer a wide variety of interfaces to access this information, which all rely on the core API layer, MetricFlow Query Language (MQL). MQL is the single source-of-truth layer that allows you to access consistent query results in all your favorite tools.
We believe that SQL should be treated with the same rigor as any other business logic: peer-reviewed, version-controlled, and abiding by the principle of Don't Repeat Yourself. To this last point, Transform can eliminate the need to write (or copy-paste) SQL into each of the tools used for last-mile data visualization and analysis, replacing it with simple Transform queries that work reliably across all of these tools.
The simplification of requests not only eliminates the need for complex and redundant logic in your downstream tools, but also ensures that your most important metrics are consistent across all of them, whether it be a CRM product, anomaly detection algorithm, or a BI tool.
In this post, we will explore the interfaces that Transform currently supports. More important than any individual interface is that they are all consistent with Transform's core API, so you can resolve all your metrics in the best tool for the job. You can even switch between tools without skipping a beat as your analysis needs evolve.
MQL: a GraphQL API
The core MQL API is a GraphQL interface, exposing a strongly-typed API schema for Metric metadata and query results. This MQL API forms the basis for all our other interfaces below and acts as the single source of truth that all requests are routed through. This allows all downstream interfaces to enjoy the same ease-of-use, authentication, and reliability promises made by this core API layer.
Why choose GraphQL?
- A strongly-typed API schema reduces bugs and allows our API to be fairly self-documenting for common use cases.
- GraphQL queries allow clients to build the query response they want, rather than having to piece together data across different endpoints.
- GraphiQL, a GraphQL IDE, allows our users to easily explore the graph and iteratively build and test the queries they want to make (more below).
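To illustrate the second point, here is a minimal sketch of how a client shapes its own response by listing only the fields it needs. The `metrics` field and its `name` and `description` sub-fields are assumptions made for illustration, not Transform's published schema. The request body follows the common GraphQL-over-HTTP convention of a JSON object with `query` and `variables` keys.

```python
import json
from typing import Optional

# Hypothetical query: the `metrics` field and its sub-fields are assumed
# for illustration and are not taken from Transform's actual schema.
QUERY = """
query ListMetrics {
  metrics {
    name
    description
  }
}
"""


def build_payload(query: str, variables: Optional[dict] = None) -> str:
    """Return the JSON body of a conventional GraphQL POST request."""
    return json.dumps({"query": query.strip(), "variables": variables or {}})


payload = build_payload(QUERY)
print(payload)
```

Because the client names the fields it wants, adding or dropping a field is a change to the query text and does not require a different endpoint.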
GraphQL API Explorer: a built-in GraphQL IDE
- Autocomplete, documentation, typed schema, built-in authentication
- Never deal with API tokens and cURL requests if you don't want to
A note on authentication and data governance:
All requests to the MQL API must be authenticated to a valid user on the Transform service using a JSON Web Token or Transform API key.
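As a rough sketch of what an authenticated request could look like from a script, the example below attaches a credential to a GraphQL POST request using only Python's standard library. The endpoint URL is a placeholder, and sending the key as a `Bearer` token in the `Authorization` header is an assumption; check the Transform documentation for the actual scheme. The request is built but never sent.

```python
import json
import urllib.request

# Placeholder endpoint, not a real Transform URL.
API_URL = "https://example.invalid/graphql"


def build_request(credential: str, query: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GraphQL POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumed header scheme: the credential is passed as a bearer token.
        "Authorization": f"Bearer {credential}",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


request = build_request("my-api-key", "query { __typename }")
print(request.get_method(), request.full_url)
```

The query `{ __typename }` is valid against any GraphQL schema, which makes it a convenient way to check that a credential is accepted before running real queries.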
Transform CLI Tool
The CLI is the Swiss Army knife interface. You can use it to instantly answer questions about your metrics and their corresponding dimensions.
Consider a single metric called rainfall that expresses rainfall in inches during the month of October 2020. At this point, we've already used MetricFlow to define this metric, and now we want to look at some of the data.
First, list all available metrics to make sure rainfall exists:
Next, let's explore some values by asking for rainfall over time using MQL. The syntax lets you quickly slice and dice data over a single dimension or multiple dimensions. In this case, we've used a time dimension (defined as ds) to look at values over time.
The CLI interface is a great way to do basic metric querying to explore your defined metrics and their associated attributes.
We know that many data analysts have a strong preference for Python. Transform's Python interface allows users to make MQL requests in Python scripts or notebooks, so the clean and accurate metrics that we produce can be pulled in for further analysis by your data teams.
Taking the same example we demonstrated with the CLI, we are now using Python to query the metric rainfall over time and by the dimension country.
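The sketch below shows the general shape of such a workflow, not Transform's actual Python API. `query_metric` is a hypothetical stand-in that returns a few made-up rows so the example runs offline. In practice, that call would be replaced by a request through the Python interface, and the values would come from your warehouse.

```python
from collections import defaultdict


def query_metric(metric: str, dimensions: list) -> list:
    """Hypothetical stand-in for a metrics client call.

    Returns made-up sample rows (not real rainfall data) shaped like a
    result for `metric` grouped by `dimensions`.
    """
    return [
        {"ds": "2020-10-01", "country": "US", "rainfall": 0.5},
        {"ds": "2020-10-01", "country": "CA", "rainfall": 1.0},
        {"ds": "2020-10-02", "country": "US", "rainfall": 0.25},
        {"ds": "2020-10-02", "country": "CA", "rainfall": 0.75},
    ]


def total_by(rows: list, dimension: str, metric: str) -> dict:
    """Sum a metric column across rows, grouped by one dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[metric]
    return dict(totals)


rows = query_metric("rainfall", ["ds", "country"])
totals = total_by(rows, "country", "rainfall")
print(totals)
```

Once the metric values are in ordinary Python structures, any further analysis (aggregation, plotting, modeling) uses the tools your team already relies on.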
The Python interface allows you to use your existing workflows and tools to do additional analysis on top of your metrics.
While our API should allow you to derive complex results on its own, we understand that for many use cases and problem sets, SQL is king for data analysis and many downstream tools rely on it.
Using the MQL JDBC connector, you can express an API request inside of a SQL expression, which allows you to consume metrics from Transform in various other tools that support this interface. Under the hood, we send the SQL to the same MQL API, extract the MQL-specific queries, and resolve them into fully qualified SQL.
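The extract-and-resolve step can be pictured with a small sketch. This is not Transform's implementation. It assumes a simplified `MQL(...)` call with no nested parentheses, an illustrative inner syntax (`rainfall BY ds, country`), and a hypothetical `fake_compile` function standing in for the service that turns an MQL request into warehouse SQL.

```python
import re

# Matches a simplified MQL(...) call whose body has no nested parentheses.
MQL_CALL = re.compile(r"\bMQL\(([^()]*)\)", re.IGNORECASE)


def resolve_mql(sql: str, compile_mql) -> str:
    """Replace each MQL(...) call with a parenthesized SQL subquery."""

    def substitute(match):
        mql_text = match.group(1).strip()
        return "(" + compile_mql(mql_text) + ")"

    return MQL_CALL.sub(substitute, sql)


def fake_compile(mql_text: str) -> str:
    # Hypothetical compiler stand-in: the table name is invented.
    return "SELECT ds, country, rainfall FROM hypothetical_rainfall_table"


sql = "SELECT * FROM MQL(rainfall BY ds, country) AS r WHERE r.country = 'US'"
resolved = resolve_mql(sql, fake_compile)
print(resolved)
```

The surrounding SQL is left untouched, which is what lets the same statement filter, join, or aggregate the metric data like any other subquery.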
With the same example that we've provided throughout this post (the rainfall metric), we are querying metrics and selecting rainfall based on time and country by expressing our requests in an MQL() call.
You can also build additional datasets by joining the Transform metric data into tables in your data warehouse that are not part of the metric dataset.
The JDBC connector unlocks the flexibility to view, analyze and enrich your Transform metric data with common SQL-based tools that support the interface.
In summary, the metrics API can be reached through any generic interface (SQL, Python, GraphQL, etc.) and can push metrics to the downstream tools that end users already rely on. By doing so, we are simply extending the single source of truth from the metrics layer to every tool in your data stack so that everybody in your organization stays on the same page.