Quickly writing this up, as I need to publish this for a much larger post I’m due to go out with regarding chart-specific calculated fields around combining Google Ads and Google Search Console data via Google Data Studio.
What’s the issue with Chart-specific calculated fields?
The background to this post is simple: there is problem with Google Data Studio that doesn’t allow you to calculate division & multiplication with Chart-specific calculated fields properly on the basis that you have more than one dimension type within your Join Key and Dimension field enabled within a blended data set setup.
You’ll end up with charts calculating CTR using Date, Country and Device looking like the chart above (“Crazy CTR Chart”) because the underlying data set is working out the CTR for each Dimension and adding them up instead of calculating the SUM of all the Dimensions first.
This article should hopefully explain why there is an issue in the first place, explain a part solution, and ultimately leads us to the final solution which is dependent on Google fixing a specific bug with chart-specific calculated fields — with the help of a Google employee who kindly responded to my Issue Tracker ticket. A massive thank you to said Google employee. You can read more about the Issue Tracker bug report here.
So, what is Google’s explanation?
I’ve sort of already explained what the issue is and why there is a problem, but I wouldn’t do this post justice to the Google employee who responded to the Issue Tracker with an extremely thorough response on the Issue Tracker and also the sample Google Data Studio report I sent over. He/she did preface their response by saying “sorry for the wall of text, it’s hard without pictures”, so I’ll try and provide the missing pictures/visualisations to his explanations.
So I supplied Google with a dashboard that effectively gave them an explanation of the problem, and then they used that dashboard and the data source itself to run some tests & do some debugging – here’s a screenshot of that report:
I have effectively given my take on the problem in the dashboard I created for them, and they’ve gone away and found a part solution to fixing the problem through debugging — although there is an actual solution which fixes a semi-related bug, which then totally gets rid of this problem — that’s explained at the end of this post.
Here is the response from Google:
Note: Some of this is paraphrased.
— start quote —
Jonathan’s blend was Google Ads and Search Console, blended on the following dimensions and metrics:
|Data Source||Date Dimension||Query Dimension||Device Dimension|
|Google Ads||Day||Search Term||Device Type (PPC)|
|Google Search Console||Date||Query||Device Category|
Then Jonathan had Clicks, CTR, Impressions and Average Position showing for Google Search Console – as metrics, and for Google Ads, he had Clicks as a metric showing.
Why you’re seeing what you’re seeing?
When you make a blended datasource, you can think of it as creating a new spreadsheet that has every column that you selected. Any charts you use with your datasource will work with this as your underlying data. When you make a formula on a chart with a blended datasource, that formula is calculated BEFORE any per-chart reaggregation is applied.
- So a formula like “Clicks / Impressions” will be (“Clicks” of the Query-Device-Date) / (“Impressions of the Query-Device-Date). After you’ve created it, you can think of that metric as part of the underlying datasource.
- If you make a chart that uses only one or two of the dimensions, the CTR column is reaggregated according to whatever aggregation type you set in your metrics. When you set it to SUM, it will add up the CTR ratios and give you garbage data.
That’s also why AVG looks closer to correct, as Seth pointed out. The SUM of the ratios will always be way higher than the actual ratio, but average is usually more realistic.
It’s wrong…except in special cases, you can’t fix reaggregation problems with AVG. But they look close.
How to fix it
As a general rule, the fix for reaggregation problems is to stop reaggregation from happening in the first place.
Aggregation only occurs if you use fewer dimensions in your chart than there are in your datasource. That means you can fix your reaggregation problems by removing dimensions from the blended datasource that aren’t used in your chart.
For example, one of Jonathan’s tables had just the “Query” dimension with metrics and formulas:
The fix for that table was to create a copy of the blend and remove the “Date = Day” and “Device Category = Device Type” join keys, leaving only “Query = Search Term” (side note: you can duplicate a blend from the “Resource > Manage blended datasources” menu):
Note: that even when you don’t include dates in your blend, a date range filter will still work because dates are pushed down to the underlying datasource (and your underlying “spreadsheet” is rebuilt with new values).
Since the blended datasource and chart use the same dimensions, there’s no re-aggregation and a CTR formula (and any other formulas) end up with the correct values.
The other example from Jonathan’s report was a Time Series chart with Date, two CTR metrics, and a formula that summed those two metrics.
Using a blend with all 3 join keys gives you ridiculous summed ratios over 1000% because of the issue described above:
The fix for this is basically the same. Create a duplicate blend, remove all but the “Date” dimension, and leave everything else the same.
Again, there’s no re-aggregation because the chart uses all the dimensions in the blended datasource.
To sum up
- To prevent reaggregation, just remove dimensions that aren’t in your chart from your blended datasource.
- Until the SUM(metric1)/SUM(metric2) fix is submitted, this is the only way you can get accurate values.
— end quote —
The ultimate fix – ‘SUM(metric1)/SUM(metric2)’
Ultimately, the fix he mentions on ‘SUM(metric1)/SUM(metric2)’ would be the actual solution to this problem – as this would allow you to use important dimensions that give you the ability to slice your data up and give you insight – with the part solution that is mentioned above, you’re unable to slice your data up (as far as I know).
Whilst an ETA was not provided for the fix to SUM(metric1)/(metric2), I’ve asked the question in the Issue Tracker via the bug report I filed in there, so hoping to get an answer back or be magically surprised when they fix it through Google Data Studio eventually! 🙂
Edit (08/01/2019) ETA for the fix – update:
Jan 8th, 2019: “We’re taking a look this week at the SUM(metric1)/SUM(metric2) thing. Not sure what that means in terms of it being available, but I’ll update this thread once I get a better time estimate.”
As a side note, you can see the ‘Invalid formula’ message occurs when you try and add SUM to a chart-specific calculated field:
Bug report itself: https://issuetracker.google.com/issues/119319856