Analyzing Google Analytics Data in BigQuery (Part 1)

What is BigQuery?

The Google Cloud Platform product family includes Google App Engine, Google Compute Engine, Google Cloud Datastore, Google Cloud Storage, Google BigQuery (for analytics), and Google Cloud SQL.

The most important product for a BI analyst is BigQuery. It is a fully managed OLAP data warehouse that supports joins and lets developers use SQL to query massive amounts of data in seconds.

Why BigQuery?

The main advantage is that BigQuery integrates with Google Analytics. We can easily export session and event data to BigQuery and build custom analyses that go beyond the built-in Google Analytics reports.

In other words, raw GA data can be dumped into BigQuery, so custom analyses that cannot be performed in the GA interface can be produced in BigQuery instead.

Moreover, we can also bring third-party data into it.

The difficulty for a BI analyst is that the export contains raw data rather than ready-made reports, so every metric has to be calculated in queries.

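Which SQL is preferred in BigQuery?

Standard SQL is the preferred dialect in BigQuery nowadays. The older legacy SQL dialect is still supported, but new queries should be written in standard SQL.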
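To make the difference concrete, here is a minimal sketch of how a table is referenced in standard SQL. It assumes Google's public sample dataset (bigquery-public-data.google_analytics_sample), which is not part of this post's own setup and only stands in for a real GA export. The legacy form is shown in a comment for comparison.

```sql
#standardSQL
-- Standard SQL: the table path is wrapped in backticks and separated by dots.
-- Assumption: the public sample dataset stands in for your own GA export.
SELECT COUNT(*) AS session_rows
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801`;

-- Legacy SQL (for comparison only) uses square brackets and a colon
-- between project and dataset:
--   SELECT COUNT(*) FROM [bigquery-public-data:google_analytics_sample.ga_sessions_20170801]
```

The optional #standardSQL prefix on the first line tells BigQuery which dialect to use, regardless of the editor's default setting.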

How can we get the data from Google Analytics?

GA data is exported to BigQuery daily. Within each dataset, a table is imported for each day of export. Its name format is ga_sessions_YYYYMMDD.

We can also set up some steps to make sure the tables, dashboards and data transfers are always up to date.
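Because each day lands in its own table, a query over several days usually goes through a wildcard table. The sketch below assumes the public sample dataset (bigquery-public-data.google_analytics_sample) as a stand-in for your own export, and the date range is an arbitrary illustration.

```sql
-- Count exported session rows per day across one week of daily tables.
-- Assumption: the public sample dataset stands in for your own GA export dataset.
SELECT
  _TABLE_SUFFIX AS export_date,   -- the YYYYMMDD part matched by the * wildcard
  COUNT(*) AS session_rows
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20170701' AND '20170707'
GROUP BY export_date
ORDER BY export_date;
```

Filtering on _TABLE_SUFFIX limits which daily tables are scanned, which also helps to keep the amount of data billed for the query down.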

How to give it a try?

First, set up a Google Cloud Billing account. With a Google Cloud Billing account, we can use the BigQuery web UI with Google Analytics 360.

The next step is to run a SQL query and visualize the output. The query editor follows standard SQL syntax.

For example, here is a sample query that returns user-level data together with total visits and page views.

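-- One row per session: who visited, where they came from, and session totals.
SELECT fullVisitorId,
       visitId,
       trafficSource.source,
       trafficSource.medium,
       totals.visits,
       totals.pageviews
-- Table names take backticks, not single quotes; replace YYYYMMDD with the export date.
FROM `ga_sessions_YYYYMMDD`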
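Since metrics are not precomputed in the export, they have to be aggregated in the query itself. As a rough sketch, the query below turns the session rows into users, sessions and page views per traffic source. It assumes the public sample dataset as a stand-in for your own export, and the metric definitions are a common simplification rather than an exact match for the GA interface.

```sql
-- Users, sessions and page views per source/medium for one export day.
-- Assumption: the public sample dataset stands in for your own GA export.
SELECT
  trafficSource.source AS source,
  trafficSource.medium AS medium,
  COUNT(DISTINCT fullVisitorId) AS users,   -- distinct visitors seen that day
  SUM(totals.visits) AS sessions,           -- totals.visits is 1 for sessions with interactions
  SUM(totals.pageviews) AS pageviews
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801`
GROUP BY 1, 2
ORDER BY sessions DESC
LIMIT 10;
```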

At this step, to get a good understanding of the ga_sessions_ tables in BigQuery, we need to know which raw GA data fields are available in BigQuery.

We can use an interactive visual representation as the reference.
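The available fields can also be listed from within BigQuery. The sketch below reads the INFORMATION_SCHEMA.COLUMN_FIELD_PATHS view, which includes nested fields such as totals.pageviews. It assumes the public sample dataset as a stand-in for your own export, so the dataset and table names should be swapped for your own.

```sql
-- List every field (including nested ones) of one daily export table.
-- Assumption: the public sample dataset stands in for your own GA export.
SELECT
  field_path,   -- e.g. totals.pageviews or trafficSource.source
  data_type
FROM `bigquery-public-data`.google_analytics_sample.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE table_name = 'ga_sessions_20170801'
ORDER BY field_path;
```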

In the next post, we will give more examples of how to analyze GA data in BigQuery by date range or by other dimensions such as users, sessions, traffic sources, etc.

If you are interested in Business Intelligence or BigQuery, or have any problems with them, feel free to contact me.

Or you can connect with me through my LinkedIn.

Author: Jacqui

Data Science|Business Intelligence
