1

I am using a query to calculate daily retention on my Firebase Analytics data exported to BigQuery. It is working well and the numbers match with the numbers in Firebase, but when I try to filter the query by a cohort of users, the numbers don't add up.

I want to compare the results of an A/B test from Firebase, and so I've looked at the user_property "firebase_exp_2" which is my A/B test, and I've split up the users in each group (0/1). The retention numbers do not match (at all) the numbers that I can see in my A/B test results in Firebase - actually they show the opposite pattern.

The query is adapted from here: https://github.com/sagishporer/big-query-queries-for-firebase/wiki/Query:-Daily-retention All I've changed is adding the following under the "WHERE" clause:

WHERE
event_name = 'user_engagement' AND user_pseudo_id IN 
(SELECT user_pseudo_id
FROM `analytics_XXX.events_*`,
UNNEST (user_properties) user_properties
WHERE user_properties.key = 'firebase_exp_2' AND user_properties.value.string_value='1')

Firebase says that there are 6,043 users in the Control group and 6,127 in the Variant A group, but my numbers are 5,632 and 5,730, and the retained users are around 1,000 users more than what Firebase reports.

What am I doing wrong?

1 Answers1

0

The export to BigQuery happens on a daily basis and each imported table is named events_YYYYMMDD. Additionally, a table is imported for events received throughout the current day. This table is named events_intraday_YYYYMMDD.

The additions you made are querying from events_* which is fine. The example uses events_201812* though which would ignore the intraday table. That would explain why your numbers a lower. You are missing users added to the A/B test during the current day.

kphil
  • 891
  • 1
  • 9
  • 14
  • Hi, you're right, but I changed that in the query already, so that's not what makes the difference. But I guess the BigQuery export from Firebase is made at certain time intervals, so there could still be data missing from the previous days, thanks.Any idea what event firebase uses to define the date that someone signs up? Is it the timestamp of first_open, the user_first_touch_timestamp or something else? – Maja From Andersen Apr 25 '19 at 07:15
  • The export happens daily and the intraday table is populated in real-time as events are collected. So you shouldn't be missing data. Its more the other way around, because i don't think the firebase console A/B test data updates in real-time. The `event_timestamp` for `first_open` and `user_first_touch_timestamp` are exactly the same for me. – kphil Apr 25 '19 at 10:42
  • Okay then I don't understand, because there is still more than 500 users in difference between the numbers I get and the ones in Firebase. – Maja From Andersen Apr 25 '19 at 11:06
  • What's your condition for the A/B test and 1000 more retained users is for day 1? – kphil Apr 25 '19 at 12:19
  • The condition for a user being part of it? That they are part of an audience with first_open events. No it's for day 2-3 retention – Maja From Andersen Apr 26 '19 at 13:58
  • Did you checked [Firebase vs BigQuery Active Users Discrepancies] (https://stackoverflow.com/questions/54919813/firebase-vs-bigquery-active-users-discrepancies) ? Where is mentioned that if you want ensure the match you have to query the event_timestamp value in your BigQuery table to match up with what's in your timezone. – Enrique Zetina Sep 11 '19 at 20:38