Published on June 20, 2023

Creating a data reporting strategy with Google Pub/Sub, BigQuery, & Looker Studio

By Richard Headon · 7 min read · Engineering

If you’re a Mux Data customer, you’re probably aware that you have the option to export the raw data collected by Mux into your own data warehouse. This is available either as a daily CSV file or as a data stream via Google Pub/Sub or AWS Kinesis; the data stream delivers each view in real time as it completes.

Why should you store this data? First, and most importantly, this is your data.

The Mux dashboard is a powerful tool for analyzing your Quality of Experience metrics. For many customers, the dashboard covers all of their requirements, but one of the first limitations you may hit is wanting to analyze data older than Mux’s maximum 100-day retention period. This alone is a reason to permanently store the Mux view data in your own data warehouse.

Another reason is that you might want to perform a complex query not directly available in the Mux dashboard. You could submit a product request — or with the data in your own data warehouse, you could simply help yourself. You may not know what insights you want to look at now, but when the time comes, you’ll be glad all your data was committed to a data warehouse.

For the purposes of this article, we will focus on creating a data reporting pipeline using tools within Google Cloud Platform, or GCP. We will set up a Pub/Sub data stream to receive the view exports from Mux as they complete, permanently persist the data in BigQuery, and then connect Looker Studio to BigQuery so you can easily analyze the data that Mux has provided.

Set up a data pipeline

Before we get started, we need to download the latest version of video_view.proto from the mux-protobuf repository at https://github.com/muxinc/mux-protobuf/tree/main/video_view. This is used to create the Pub/Sub schema and generate the BigQuery schema.

Open up your Google Cloud Console at https://console.cloud.google.com/ and select the GCP project in which you want to ingest your Mux video views.

Create a Pub/Sub schema

In the navigation menu, find Pub/Sub and click on Schemas.

A screenshot showing the location of the Schemas submenu item within the Pub/Sub parent menu

Click on CREATE SCHEMA.

Create the schema

  • Name your schema.
  • Select Protocol buffer for the schema type.
  • Copy and paste the contents of video_view.proto into the schema definition.
A screenshot showing the new schema form. The schema ID value is mux-video-view-schema. The Schema type is "Protocol buffer". The Schema definition contains the proto file contents mentioned above.

In the schema definition, find the sequence_number field under the Event object. The unsigned integer data type uint64 is unsupported in BigQuery, which will cause the BigQuery subscription step to fail. Change this data type to int64.

required int64 sequence_number = 7;

Validate the schema and click CREATE.
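If you prefer the command line, the same schema can be created with the gcloud CLI. The schema name below is just an example (matching the screenshot); substitute your own, and make sure the proto file already contains the int64 fix described above:

```shell
# Create a Pub/Sub schema from the (edited) proto definition.
# "mux-video-view-schema" is an example name; use your own.
gcloud pubsub schemas create mux-video-view-schema \
  --type=protocol-buffer \
  --definition-file=video_view.proto
```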

A screenshot showing a "Schema is valid" message, which has appeared after clicking on the button labeled "Validate definition."

Create a Pub/Sub topic

In the navigation menu, find Pub/Sub and click on Topics.

A screenshot showing the location of the Topics submenu item within the Pub/Sub parent menu

Click on CREATE TOPIC.

Create the topic

  • Name your topic.
  • Check the box for Use a schema.
  • Select the Pub/Sub schema created above.
  • Select Binary message encoding.
A screenshot of the "New topic" form prefilled with the values described above.

Click CREATE.
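The equivalent gcloud command is below; the topic and schema names are examples, so substitute the IDs you chose:

```shell
# Create a topic bound to the schema, using binary message encoding.
gcloud pubsub topics create mux-video-views \
  --schema=mux-video-view-schema \
  --message-encoding=binary
```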

Create a BigQuery table

Before creating the BigQuery table, we must first create the schema definition from the same video_view.proto as used in the Pub/Sub topic.

Google provides a plugin for the Protocol Buffers compiler that reads protocol buffer schemas and converts them to BigQuery schema files in JSON: https://github.com/GoogleCloudPlatform/protoc-gen-bq-schema.

Install the Protocol Buffers compiler (https://github.com/protocolbuffers/protobuf) and the plugin, ensuring that both the protoc and protoc-gen-bq-schema commands are available on your $PATH.

Open a terminal window and run the following command, substituting the output directory and the path to your video_view.proto file:

protoc --bq-schema_out=/PATH/TO/OUTPUT --bq-schema_opt=single-message video_view.proto

This will generate a file named video_view.schema in JSON format.

In the Navigation menu, find BigQuery and click on SQL workspace.

A screenshot showing the location of the SQL workspace submenu item within the BigQuery parent menu

If you haven’t already, create a data set in BigQuery in which you can create your table.

A screenshot showing the location of the "Create data set" action, located under a kebab menu.

In the Explorer menu, select Create table on your data set.

A screenshot showing the location of the "Create table" action item, located under a kebab menu.

Create the table

  • Set Create table from to Empty table.
  • Name your table.
  • Set Table type to Native table.

Partitioning your table is optional, but it is good practice in BigQuery: filtering on the partition reduces the amount of data each query scans (and is billed for). You can require users to filter by partition by checking the Require WHERE clause to query data box.

A screenshot of the "Create table" form prefilled with the values described above.

Define the table schema

  • Toggle the Edit as text switch.
  • Paste the JSON schema generated from video_view.proto.
A screenshot of a form field for editing the schema as text. The textarea is prefilled with the JSON schema as described above.

Click CREATE TABLE.
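The table can also be created with the bq CLI. The dataset and table names below are examples; the partitioning flag sets up daily ingestion-time partitioning, which you can omit if you skipped partitioning:

```shell
# Create an empty table from the generated JSON schema file.
# "mux_data.video_views" is an example dataset.table name.
bq mk --table \
  --schema=/PATH/TO/OUTPUT/video_view.schema \
  --time_partitioning_type=DAY \
  mux_data.video_views
```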

Create a Pub/Sub to BigQuery subscription

Navigate to your Pub/Sub topic and click EXPORT TO BIGQUERY.

A screenshot of a section containing a button labeled "Export to BigQuery"

Select BigQuery as the sink, select Use Pub/Sub, and click CONTINUE.

A screenshot of a section labeled "Export." The "Sink" dropdown value is selected as "BigQuery" and the Use Pub/Sub radio is selected.

Create a subscription

  • Provide a subscription ID.
  • Select your Pub/Sub topic.
  • Select Write to BigQuery as the delivery type.
  • Select your BigQuery data set.
  • Enter the name of the table you created in BigQuery.
  • Check the box Use topic schema.
  • Check the box Drop unknown fields.
A screenshot of the "New subscription" form containing the values described above.

Click CREATE.
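For reference, the same subscription can be created with gcloud. All names here are examples, and PROJECT_ID is a placeholder for your own project:

```shell
# Create a BigQuery subscription that writes topic messages
# straight into the table, using the topic's schema.
gcloud pubsub subscriptions create mux-video-views-bq \
  --topic=mux-video-views \
  --bigquery-table=PROJECT_ID:mux_data.video_views \
  --use-topic-schema \
  --drop-unknown-fields
```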

If you see an error message like the one below, then you need to grant the service account Cloud Pub/Sub Service Agent and BigQuery Data Editor roles.

Service account service-XXXXXXXX@gcp-sa-pubsub.iam.gserviceaccount.com is missing permissions required to write to the BigQuery table: bigquery.tables.get, bigquery.tables.updateData.

To grant access, start by making a copy of this service account ID.

From the navigation menu, open IAM and admin and select IAM (it’s best to open this in a new tab).

A screenshot showing the location of the "IAM" submenu item within the "IAM and admin" parent menu

Click GRANT ACCESS.

A screenshot showing the "Grant access" tab selected within the "IAM" section.

In Add principals, paste the service account ID. Under Assign roles, select Cloud Pub/Sub Service Agent and BigQuery Data Editor.

A screenshot showing the "Grant access" form prefilled using the values described above.

Click SAVE.
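The same grants can be made from the command line. PROJECT_ID and PROJECT_NUMBER are placeholders for your own project's values:

```shell
# Grant the Pub/Sub service agent the roles it needs
# to write into the BigQuery table.
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataEditor"

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com" \
  --role="roles/pubsub.serviceAgent"
```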

Navigate back to your subscription and click CREATE.

Configure Mux to push exports to Pub/Sub

If you’ve made it this far, you’re now ready to configure your streaming exports on Mux.

Open your Mux Dashboard at https://dashboard.mux.com/, navigate to the Streaming Exports page under Settings, and click New Streaming Export.

A screenshot showing the "Streaming Exports" submenu item selected, located under "Settings" in the Mux dashboard.

Select the Mux environment that you would like to export, select Video Views, and set the service to Google Cloud Pub/Sub.

A screenshot of the "New Streaming Export" form, prefilled with the values described above.

Take a copy of the Mux Google Cloud Service Account and navigate back to your Google Pub/Sub Topic on your Google Cloud Console: https://console.cloud.google.com/.

A screenshot showing the location of the "Topics" submenu item within the "Pub/Sub" parent menu

Select your Pub/Sub topic.

On the INFO PANEL on the right, navigate to PERMISSIONS and click ADD PRINCIPAL.

A screenshot of the Topics page, showing the "Permissions" section which contains a button labeled "Add Principal."

Paste the Mux Google Cloud Service Account ID into New principals and assign the role of Pub/Sub Publisher.

A screenshot of the "Add principals" form, prefilled using the values described above.

Click SAVE.
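Alternatively, the grant can be made with gcloud. The topic name is an example, and MUX_SERVICE_ACCOUNT is a placeholder for the service account address copied from the Mux dashboard:

```shell
# Allow the Mux service account to publish to your topic.
gcloud pubsub topics add-iam-policy-binding mux-video-views \
  --member="serviceAccount:MUX_SERVICE_ACCOUNT" \
  --role="roles/pubsub.publisher"
```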

Take a copy of the Pub/Sub topic name.

Navigate to the Mux Dashboard and paste the Pub/Sub topic name.

A screenshot of the "Pub/Sub topic name" field within the Mux dashboard, populated with the value of our example topic name.

Click Enable export.

Congratulations! You have successfully set up your Mux Video Views Exports data pipeline to push to BigQuery.

Connect to your BigQuery data source within Looker Studio

In a new browser tab, open up Looker Studio (https://lookerstudio.google.com/) and select Data Sources.

A screenshot of Looker Studio dashboard. The "Data sources" tab is selected.

Click the Create button in the top left corner and select Data source.

A screenshot of the Looker Studio dashboard. The "Create" button is selected, and a dropdown menu is appearing. The "Data source" dropdown entry is highlighted.

In the list of Google connectors, select BigQuery.

A list of Google Connectors within the Looker Studio dashboard. There are several options, but BigQuery is the correct one to choose.

Then select the Google Cloud project, data set, and table you created in BigQuery earlier.

A screenshot of the BigQuery connector settings. The example table we created in BigQuery earlier in this article is shown as selected.

Click CONNECT.

You can now create a report in Looker Studio using your Mux data stored in BigQuery.
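As a quick sanity check that views are flowing in before you build a report, you can query the table directly. This example assumes the ingestion-time-partitioned table name used earlier in this article and only counts rows, so it does not depend on any particular column names:

```shell
# Count ingested views per day for the last week.
# "mux_data.video_views" is an example dataset.table name.
bq query --use_legacy_sql=false \
  'SELECT DATE(_PARTITIONTIME) AS day, COUNT(*) AS views
   FROM `mux_data.video_views`
   GROUP BY day
   ORDER BY day DESC
   LIMIT 7'
```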

Wrapping up

So there you have it. Your Mux view data is now permanently stored in BigQuery and is updated in real time as views complete. With Looker Studio, you can build powerful insights into your customers’ video Quality of Experience over the life of your product.

Written By

Richard Headon – Solutions Architect
