Google BigQuery
Overview
Google BigQuery is a fully managed, analytics data warehouse that is part of the Google Cloud Platform.
Using the Lytics integration with Google BigQuery, you can import and export events, and also export user profiles and audience changes to BigQuery. Granular event data in BigQuery will allow for ease of auditing, querying, and reporting through other data visualization tools.
Authorization
If you haven't already done so, you will need to set up a Google BigQuery account before you begin the process described below. You can follow the Google quickstart guide and may also want to read the Google BigQuery documentation.
If you are using an IP allowlist for your Google BigQuery account, contact your account administrator to add Lytics' IP addresses to your allowlist. Reach out to your Lytics Account Manager for the current list of IP addresses.
There are two different options for authorization with Google BigQuery:
User Account and Service Account. See Google's authentication documentation for details on the differences between user and service accounts. See the end of this document for information on how to give 3rd parties access to export data in BigQuery.
If you are new to creating authorizations in Lytics, see the Authorizations documentation for more information.
- Select Google Cloud from the list of providers.
- Select the authorization method. Note that different methods may support different job types. Google BigQuery supports the following authorization methods:
- Enter a Label to identify your authorization.
- (Optional) Enter a Description for further context on your authorization.
- Complete the configuration steps needed for your authorization. These steps will vary by method.
- Click Save Authorization.
User account
This option uses your Google user account to access your BigQuery data. Note that you must have projectEditor
permissions on the project you are importing from in order for BigQuery exports and imports to run properly with User Account authorization.
- Select Big Query OAuth.
- From the user selection window, select the Google account you want to use from the list of accounts.
- Click Allow.
Service account
This option uses a service account to access your Google BigQuery data.
- You will need a service account credential file to use this authorization type. If you do not have one, follow the instructions in Google's documentation to create one.
- For imports and exports, the service account must be able to list projects, create and write to tables, and create jobs. The combination of
BigQuery Data Editor
andBigQuery Job User
permissions will satisfy the requirements. Learn more about the BigQuery permissions here.
- For imports and exports, the service account must be able to list projects, create and write to tables, and create jobs. The combination of
- In your Lytics account, select Big Query Service Account JWT. Note there are different options for Import and Export.
- In Cert JSON textbox, copy the contents of the service account's credential file.
Give 3rd Parties Access to Export Data in BigQuery
When exporting data from Lytics to BigQuery, you can specify email addresses that you would like to give access to. However, if you later decide that you would like to give access to another person, you can do this from within BigQuery. See Google's documentation regarding access controls for datasets.
Import Activity Data
Many applications let you write to tables in Google BigQuery and you can easily import this custom data into Lytics. Once imported, you can leverage powerful insights on this custom data provided by Lytics data science to drive your marketing efforts.
Integration Details
- Implementation Type:Server-side Integration.
- Implementation Technique: REST API Integration.
- Frequency: Batch Integration.
- Resulting data: User Profiles, User Fields, Raw Event Data.
This integration utilizes the Google BigQuery APIs to import user data. Once the import is started the job will:
- Start a query job for data in Google BigQuery.
- Export the result to Google Cloud Storage.
- Import file from Google Cloud Storage into the selected Lytics data stream.
Fields
Fields imported through Google BigQuery may require custom data mapping. Fields and Mappings can be configured under Building Profiles > Schema.
Configuration
Follow these steps to set up and configure an import of Google BigQuery events into the Lytics platform. If you are new to creating jobs in Lytics, see the Data Sources documentation for more information.
- Select Google Cloud from the list of providers.
- Select the BigQuery: Import Activity Data job type from the list.
- Select the Authorization you would like to use or create a new one.
- Enter a Label to identify this job you are creating in Lytics.
- (Optional) Enter a Description for further context on your job.
- In the Stream text input, enter the stream where imported data will go.
- From the BigQuery Project input, select the Google BigQuery Project you want to import data from.
- From the BigQuery Dataset input, select the Google BigQuery Dataset you want to export data from.
- From the BigQuery Table input, select the Google BigQuery table you would like to import. NOTE: "nested tables" are not supported by this import process.
- From the Timestamp Field input, select the name of the field that contains the event timestamp. On continuous imports, the most recent time in this field will be saved. On the next import, only rows with a timestamp greater than that value will be imported. The field must be of type
Datetime
orTimestamp
in BigQuery. - Select the Keep Updated checkbox to continuously run this import. If this option is selected, the Timestamp Field must be provided.
- From the Frequency input, select how often do you want to run the import job. By default, the import job will run every 4 hours.
- (Optional) From the Time of Dayinput, select the time of day to start importing.
- (Optional) From the Time Zone input, select the time zone of the time of day above.
- Click Complete to start import.
The time necessary to import a table depends on the table size. It can take from 30 minutes to several hours to complete the import.
It's important to be aware that big imports can make your entire Lytics account slower as data is processed and audience memberships are updated.
Export Activity Data
Export event data from any Lytics data stream to Google BigQuery. NOTE: Unlike user fields, events are not represented within the Lytics dashboard. Events are the raw data received from integrations as seen in your Data Streams.
Integration Details
- Implementation Type: Server-side Integration.
- Implementation Technique: REST API Integration.
- Frequency: Batch Integration.
- Resulting data: Raw Event Data is stored in the resulting table.
This integration utilizes the Google BigQuery APIs to send user data. Once the export is started the job will:
- Check if dataset exists, if not create it in Google BigQuery.
- Check if a table exists, if not create one.
- Generate a CSV for each table to be created or updated in BigQuery by:
- Scan through events on the data stream.
- Generate CSV row(s) for each scanned event.
- Write CSV row to Google Cloud Storage.
- Upload the CSV(s) to BigQuery.
There are a few things to know when running an event export:
- The event export scans the complete data stream from the first collected event to the current event, unless otherwise specified by the export configuration.
- The export will run continuously throughout the day.
- Field names in BigQuery will not be an exact match of field names in Lytics. Names will be formatted to fit BigQuery's field naming schema.
- If two field names are transformed to BigQuery format and conflict, one will get an integer appended to the end of the field name. For example, if both
first_name
andFirst Name
exist in Lytics and are exported, they'll be written tofirst_name
andfirst_name1
in BigQuery. - Low usage fields (fields populated on <0.1% of total volume for the exported stream) will not be included within the event export.
Fields
The fields included will depend on the raw event in Lytics data. All fields in the selected stream will be included in the exported table. To see a list of fields in a stream, select the stream name from Data Pipeline > Streams in the Lytics platform.
Configuration
Follow these steps to set up and configure an export of events from Lytics to Google BigQuery. If you are new to creating jobs in Lytics, see the Destinations documentation for more information.
- Select Google Cloud from the list of providers.
- Select the BigQuery: Export Activity Data job type from the list.
- Select the Authorization you would like to use or create a new one.
- Enter a Label to identify this job you are creating in Lytics.
- (Optional) Enter a Description for further context on your job.
- From the BigQuery Project input, select the Google BigQuery Project you want to export data to.
- From the BigQuery Dataset input, select the Google BigQuery Dataset you want to export data to.
- From the Data Streams to Export input, select data streams to export. A stream is a single source/type of data. You may choose more than one. If none are selected, all streams will be exported.
- In the Maximum numeric field, enter the number of events to be exported. If left blank, all events will be exported.
- In the Start Date text input, enter the date of the oldest event you want to export. Events from this date onwards will be exported. Use
yyyy-mm-dd
format (e.g.2015-08-13
). - In the End Date text input, enter the most recent date you want to export. Events before, but NOT including this date will be exported. Use
yyyy-mm-dd
format (e.g.2015-08-13
). - Click on the Show Advanced Options tab to expand the advanced configuration.
- Select the Partitioned Table checkbox to use partitioned tables that can help lower query costs by allowing you to target a specific date range of data instead of the whole data set. Tables will be partitioned by the event date.
- From the Partition Duration dropdown, select the duration of the table's partitions. This is ignored if Partitioned Table is not selected.
- Select the Flatten Streams checkbox to export all streams to one BigQuery table.
- Select the Keep Updated checkbox to continuously run this export.
- Select the Start Export From Now Onwards checkbox to only export events collected from now onwards. This will override any start or end date configuration made.
- Click Start Export.
Export Audience Changes
Export audience changes from Lytics to Google BigQuery.
Integration Details
- Implementation Type: Server-side Integration.
- Implementation Technique: REST API Integration.
- Frequency: Batch Integration.
- Resulting data: Audience segment change events.
This integration utilizes the Google BigQuery APIs to send user data. Once the export is started the job will:
- Check if dataset exists, if not create it in Google BigQuery.
- Check if a table exists, if not create one.
- Generate a CSV in BigQuery by:
- Receive segment change events.
- Write change event to Google Cloud Storage as a CSV row.
- Every 10 minutes Upload the CSV to BigQuery and start writing events to a new CSV.
Fields
By default, Lytics exports the following fields to Google BigQuery:
Lytics User Field | Description | Google BigQuery Field | Type |
---|---|---|---|
Email Address | string | ||
- | Audience Slug | segment_slug | string |
- | Event Time | timestamp | TIMESTAMP |
- | was this an audience Exit | exit | boolean |
Configuration
Follow these steps to set up and configure an export of audience change events from Lytics to Google BigQuery. If you are new to creating jobs in Lytics, see the Destinations documentation for more information.
- Select Google Cloud from the list of providers.
- Select the Export Audience Changes job type from the list.
- Select the Authorization you would like to use or create a new one.
- Enter a Label to identify this job you are creating in Lytics.
- (Optional) Enter a Description for further context on your job.
- From the BigQuery Project input, select the Google BigQuery Project you want to export data to.
- From the Audiences input, select the Lytics audience(s) you'd like to track. Leave empty to track all segment changes.
- From the BigQuery Dataset input, select the Google BigQuery Dataset you want to export data to.
- From the ID Field input, select the ID field in Lytics that'll get mapped to an identifying field in BigQuery.
- Click Start Export.
Export Audience Definitions
Use the Export Audience Definitions to export your Lytics audience definitions to Google BigQuery. You can use the exported audiences to do segmentation of users in other tools.
Integration Details
- Implementation Type: Server-side Integration.
- Implementation Technique: REST API Integration, File Based Transfer Integration.
- Frequency: Batch Integration.
- Resulting data: Audience Definitions
The selected audience definitions are retrieved, if they use any INCLUDE statements the included audience is inlined into the exported audience's definition replacing the INCLUDE statement.
Steps/Pattern:
- Check if the dataset exists in BigQuery, create it if it doesn't exist
- Check if the table exists in BigQuery, if it does, check that schema match, if it does not, create a new table
- For each selected audience:
- Retrieve the FilterQL for the audience
- Inline any INCLUDE statements
- Add row to CSV file
- Start Load Job in BigQuery
Fields
By default, Lytics exports the following fields to Google BigQuery:
Lytics Audience Field | Description | Google BigQuery Field | Type |
---|---|---|---|
ID | Audience ID (configurable) | id | string |
Name | Audience name (configurable) | name | string |
Slug | Audience slug (configurable) | slug | string |
FilterQL | Filter statement | filter | string |
ly_exported_ts | Time of export | ly_exported_ts | date |
ly_created_ts | Time audience was created | ly_created_ts | date |
ly_updated_ts | Time audience was updated | ly_updated_ts | date |
Configuration
Follow these steps to set up and configure an export job for Google BigQuery in the Lytics platform. If you are new to creating jobs in Lytics, see the Destinations documentation for more information.
- Select Google Cloud from the list of providers.
- Select the Export Audience Definitions job type from the list.
- Select the Authorization you would like to use or create a new one.
- Enter a Label to identify this job you are creating in Lytics.
- (Optional) Enter a Description for further context on your job.
- From the BigQuery Project input, select the Google BigQuery Project you want to export data to.
- (optional) From the BigQuery Dataset input, select the Google BigQuery Dataset you want to export data to. If no dataset is selected, the lytics_(AID) dataset will be used. If the selected Dataset does not exist it will be created.
- (optional) In the BigQuery Table text box, enter the Google BigQuery table you want to export data to. If an existing table is not selected then one will be created. If no table name is entered
lytics_(AID)_audience_definition_export
will be used. - (optional) Select the Include Audience ID checkbox, to include the Audience's ID in the row.
- (optional) Select the Include Audience Name checkbox, to include the Audience's Name in the row.
- (optional) Select the Include Audience slug checkbox, to include the Audience's slug in the row.
- Click Start Export.
Export Audiences
Export user profiles including user fields and audience memberships from Lytics to Google BigQuery.
Integration Details
- Implementation Type: Server-side Integration.
- Implementation Technique: REST API Integration.
- Frequency: Batch Integration.
- Resulting data: User Fields will be exported to Google BigQuery as either a flat table or multiple tables with join keys for non scalar fields (maps and sets).
- Field names in BigQuery will not be an exact match of field names in Lytics. Names will be formatted to fit BigQuery's field naming schema.
- If two field names are transformed to BigQuery format and conflict, one will get an integer appended to the end of the field name, e.g. if both
first_name
andFirst Name
exist in Lytics and are exported, they'll be written tofirst_name
andfirst_name1
in BigQuery. - If non-scalar fields are exported, each field will be written to a separate table, with a unique identifier
id
field that can be used to join the main user table to the fields tables. - This
id
is not an identifier that exists in Lytics, but rather it's a hash of all the user unique identifier fields. This is calculated for the express purpose of providing a way to join the non-scalar field tables to the main user table. This field's value may change between exports.
This integration utilizes the Google BigQuery's APIs to send user data. Once the export is started the job will:
- Check if dataset exists, if not create it in Google BigQuery.
- Generate a CSV for each table to be created or updated in BigQuery by:
- Scan the selected Audience.
- Generate CSV row(s) for each scanned user.
- Write CSV row to Google Cloud Storage.
- Upload the CSV(s) to BigQuery.
There are a few things to know when running a BigQuery user export:
- The user export scans the complete Lytics audience and exports all entries into a BigQuery table.
- The export will run at most once a day, but large audiences may take more than a day to complete.
- For large audiences (bigger than a million users), users will first be written into a dated user table user_ + audience slug + _ + date as the table builds in BigQuery. As more batches of users are exported, they will be appended to the dated table. Once the complete audience has been sent to BigQuery, the table will be renamed with the date removed.
- If continuous export is selected, the work will start running again on the next day at the time specified. A new temporary dated audience table will be created in BigQuery alongside the completed non-dated table.
- If an audience is large enough to take over a day to export and continuous export is selected, the work will not run again until after the previous export completes.
Fields
The fields exported to the Google BigQuery will depend on the Export Non-Scalar Fields and the Export Scalar Fields options in the workflow's configuration. Any user field in your Lytics account may be available for export.
Configuration
Follow these steps to set up and configure an export of users from Lytics to Google BigQuery. If you are new to creating jobs in Lytics, see the Destinations documentation for more information.
- Select Google Cloud from the list of providers.
- Select the BigQuery: Export Audience job type from the list.
- Select the Authorization you would like to use or create a new one.
- Enter a Label to identify this job you are creating in Lytics.
- (Optional) Enter a Description for further context on your job.
- From the BigQuery Project input, select the Google BigQuery Project you want to export data to.
- From the BigQuery Dataset input, select the Google BigQuery Dataset you want to export data to.
- From the Audience input, select the audience you would like to export to BigQuery.
- From the Maximum numeric field, enter a limit to the number of users to be exported. If left blank, all users in the selected audiences will be exported.
- Select the Audience Membership checkbox to export the audiences each user is a member of. See Export Non-Scalars as JSON below to select how this field is exported to BigQuery.
- From the Export Scalar Fields input, select a list of scalar fields to export. NOTE: these will be created in the same table. If none are selected, all fields are exported.
- From the Export Non-Scalar Fields input, select a list of non-scalar fields to export. See Export Non-Scalars as JSON below to select how this field is exported to BigQuery.
- By default the Export Non-Scalars as JSON is selected and non-scalar fields, including the audience membership, are exported to a single column of type JSON. See the Google Bigquery documentation for more information: https://cloud.google.com/bigquery/docs/reference/standard-sql/json-data. Uncheck this field to export non-scalar fields to separate tables.
- Select the Keep Updated checkbox to continuously run this export.
- Select the Keep Old checkbox to keep daily exports in dated tables.
- From the Time of Day input, select the time of day to start export each day.
- From the Timezone input, select the timezone for the time of day.
- Click Complete to start the BigQuery job.
Exporting table data to BigQuery
Please see the Google BigQuery documentation on Exporting Table Data for options and instructions for exporting data to BigQuery.
Google Cloud: Export All Users
Lytics Conductor users can export all profiles in their Lytics Account to BigQuery via the Google Cloud: Export All Users workflow. This workflow is optimized for scanning the Lytics user table without the need for considering audience membership, and offers superior performance to the audience export workflows as a result. This makes it a great option for Lytics users who wish to maintain a daily sync of all their Lytics profiles to a BigQuery data warehouse of their choice.
Integration Details
- Implementation Type: Server-side
- Implementation Technique: REST API
- Frequency: Batch
- Resulting data: User Fields will be exported to Google BigQuery a flat table by default. To export non-scalars to separate join tables, deselect Export Non-Scalars as JSON and see the Export Audiences workflow documentation for information on how the target export tables are formatted in this mode.
- Field names in BigQuery will not be an exact match of field names in Lytics. Names will be formatted to fit BigQuery's field naming schema.
- If two field names are transformed to BigQuery format and conflict, one will get an integer appended to the end of the field name, e.g. if both
first_name
andFirst Name
exist in Lytics and are exported, they'll be written tofirst_name
andfirst_name1
in BigQuery. - If non-scalar fields are exported, they will be written to BigQuery columns of type JSON by default. See the BigQuery documentation for more information on working with JSON data
This integration utilizes the Google BigQuery's APIs to send user data. Once the export is started the job will:
- Check if dataset exists, if not create it in Google BigQuery.
- Generate a CSV for each table to be created or updated in BigQuery by:
- Scan the Lytics
user
table. - Generate a CSV row for each scanned profile.
- Write the CSV row to Lytics-owned Google Cloud Storage.
- Scan the Lytics
- Upload the CSV(s) to BigQuery.
There are a few things to know when running a BigQuery user export:
- The user export scans the complete Lytics user table and exports all entries into a BigQuery table.
- The export will run at most once a day, but large accounts may take more than a day to complete.
- For large accounts (with more than a million profiles), users will first be written into a dated user table user_ + audience slug + _ + date as the table builds in BigQuery. As more batches of users are exported, they will be appended to the dated table. Once the complete audience has been sent to BigQuery, the table will be renamed with the date removed.
- If Keep Updated is selected, the work will start running again on the next day at the time specified. A new temporary dated audience table will be created in BigQuery alongside the completed non-dated table.
- If an account is large enough to take over a day to export and continuous export is selected, the work will not run again until after the previous export completes.
Fields
The fields exported to the Google BigQuery will depend on the Export Non-Scalar Fields and the Export Scalar Fields options in the workflow's configuration. Any user field in your Lytics account may be available for export.
Configuration
Follow these steps to set up and configure an export job for Google Cloud in the Lytics platform. If you are new to creating jobs in Lytics, see the Jobs Dashboard documentation for more information.
- Select Google Cloud from the list of providers.
- Select the BigQuery: Export All Users from the job list.
- Select the authorization you would like to use or create a new one.
- Enter a Label to identify this job you are creating in Lytics.
- (Optional) Enter a Description for further context on your job.
google - Complete the configuration steps for your job.
- From the BigQuery Project input, select the Google BigQuery Project you want to export data to.
- (optional) From the BigQuery Dataset input, select the Google BigQuery Dataset you want to export data to.
- (optional) From the Export Non-Scalar Fields input, select a list of non-scalar fields to export.
- (optional) Select the Export Non-Scalars as JSON checkbox, to by default, non-scalar fields JSON columns. Deselect this to export non-scalar fields to separate join tables.
- (optional) From the Export Scalar Fields input, select a list of scalar fields to export (note: these will be created in the same table. If none are selected, all fields are exported).
- (optional) Select the Keep Updated checkbox, to continuously run this export.
- (optional) Select the Keep Old checkbox, to select to keep daily exports in dated tables.
- (optional) From the Time of Day input, select time of day to start export each day.
- (optional) From the Timezone input, select timezone for time of day.
- Click Complete to start the job
Google Cloud: Export System Events
Export Lytics system events to the BigQuery table and monitor jobs, authorizations, audiences, and queries in your Lytics instance.
Integration Details
- Implementation Type: Server-side Integration.
- Implementation Technique: REST API Integration.
- Frequency: Batch Integration.
- Resulting data: Lytics system events will be stored as a record in a BigQuery table.
This integration utilizes the Google BigQuery APIs to send system events. Once the export is started the job will:
- Check if given dataset exists in Google BigQuery.
- Check if given table exists, if not, then it will create one. If the name is given, it will use that name to create the table, otherwise the new table name will be of this
lytics_system_event_export_{AID}
format. - Stores the system events in a csv file.
- The csv file is uploaded to BigQuery via Upload the CSV to BigQuery. The file is uploaded for every 100,000 events or when file size reaches 1GB. Once the file is uploaded, the export job will go to sleep for 15 minutes to let BigQuery process the file.
- The job will repeat the step from 1-4 until it sends all the events. Once it sends all the events, the job will sleep for 24 hours and will run again if the job is continuous.
Fields
By default, Lytics exports the following fields to Google BigQuery:
ID | Date | AID | User ID | Subject Type | Subject ID | Verb | Notes | Related |
---|---|---|---|---|---|---|---|---|
ID of the event | Date when the event occurred | Account AID | User associated with the event | What the event is about | ID associated with the subject | Action described by the event | Extra information about the event | Any related information associated with the event |
Configuration
Follow these steps to set up and configure an export of Lytics system events to Google BigQuery. If you are new to creating jobs in Lytics, see the Destinations documentation for more information.
- Select Google Cloud from the list of providers.
- Select the BigQuery: Export System Events job type from the list.
- Select the Authorization you would like to use or create a new one.
- Enter a Label to identify this job you are creating in Lytics.
- (Optional) Enter a Description for further context on your job.
- From the BigQuery Project input, select the Google BigQuery Project you want to export system events to.
- From the BigQuery Dataset input, select the Google BigQuery Dataset you want to export system events to.
- (Optional) From the BigQuery Table input, select the Google BigQuery table you want to export system events to. If an existing table is not selected then one will be created. If no table name is entered
lytics_system_event_export_{AID}
will be used. - (Optional) Using the Start Date textbox, enter the date to export system events after this date (ex: 2020-01-01). If no start date is given then system events for last 30 days will be exported.
- (Optional) Using the End Date textbox, enter the date to export system events up to this date (ex: 2020-01-01). If no end date is given then system events up to the current date will be exported.
- (Optional) Using the Subject Type textbox, enter the Subject Type of system events you would like to export (work, segment, account). Only events specific to this Subject Type will be exported.
- (Optional) Using the Subject ID textbox, enter the Subject ID for above system event subject type you would like to export (work_id, segment_id, account_id). Only events specific to this Subject ID will be exported.
- Select the Keep Updated checkbox to continuously run this export. The export will run daily once it sends all the data.
- From the Time of Day input, select the time of day to start export each day.
- From the Timezone input, select the timezone for the time of day.
- Click Start Export.
Updated 3 months ago