What is BigQuery?
BigQuery is nothing more (and nothing less) than a fully managed cloud data warehouse: a storage space for data that you work with through SQL. SQL lets you access the data, connect different sources, and set up a table to work with for analysis purposes.
Sign up and create an account. The amount of data you can pull and store for free is limited, but it's enough to get started, and even if you go beyond that, the cost stays very low. It's a super quick way to analyze and visualize an insane amount of data in just seconds: aggregate data, build dashboards, and transform data into understandable graphs.
There is no need to manually build tables like you do in MySQL. You do have to create a 'dataset' and a table, but once you've done that, you upload the data directly into the table. From there you can merge multiple CSV files into one table and show whatever you want; SQL gives it power and makes it easy to transform the data.
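As an illustration of that SQL power, once a few CSV uploads have landed in separate tables, a single UNION ALL query can stack them into one combined result. This is a minimal sketch; the dataset and table names (`my_dataset`, `sales_jan`, `sales_feb`) are made up for the example, and in a real run you would pass the generated SQL to the BigQuery client.

```python
# Sketch: build a UNION ALL query that stacks several uploaded CSV
# tables on top of each other. Names here are hypothetical examples.
def build_union_query(dataset, tables):
    """Return a SQL string that merges the given tables into one result."""
    parts = [f"SELECT * FROM `{dataset}.{table}`" for table in tables]
    return "\nUNION ALL\n".join(parts)

sql = build_union_query("my_dataset", ["sales_jan", "sales_feb"])
print(sql)
# To actually run it against BigQuery:
#   from google.cloud import bigquery
#   bigquery.Client().query(sql).result()
```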
GCP is powerful because BigQuery connects easily with Data Studio. As long as you keep your BigQuery tables up to date with the latest data, Data Studio is almost a live dashboard.
Doing this means that you're basically there. But it wouldn't be me if I didn't think one step further. If there is a daily update, for example, are you going to spend time every day uploading that file? That's boring stuff, and why do boring stuff if you don't have to, right? So, automate it!
To automate this, you could pay for a program or pay a company to set up a pipeline; some tools let you upload your data, connect it, and have it update automatically. But if they can do it, you can also write a script for it, so I decided to go that route with the BigQuery library in Python. I might use software in the future, but I want to understand how it works, and that's why for now I'd rather write it myself. Another huge advantage is that a script does exactly what you want, for free. You just need to figure out how to write it.
Upload to BigQuery
Upload from Google Sheets to BigQuery
Automate the upload to BigQuery
from google.cloud import bigquery

bigqueryClient = bigquery.Client()

jobConfig = bigquery.LoadJobConfig()
jobConfig.skip_leading_rows = 1  # skip the CSV header row
jobConfig.source_format = bigquery.SourceFormat.CSV
jobConfig.write_disposition = bigquery.WriteDisposition.WRITE_APPEND

datasetName = "dataset-name"
targetTable = "table-name"
uri = "gs://bucket-name/file-name.csv"

tableRef = bigqueryClient.dataset(datasetName).table(targetTable)
bigqueryJob = bigqueryClient.load_table_from_uri(uri, tableRef, job_config=jobConfig)
bigqueryJob.result()  # wait for the load job to finish
So, the most important thing is done: the data is imported! Save the Python file. You can either run it yourself in your terminal (or command line), or set up a timer so your computer runs it every morning, every Monday, the first of the month, whatever. Check that the timer actually fires at the moments you care about.
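On Linux or macOS, a cron entry is one way to set such a timer (on Windows you'd use Task Scheduler instead). The script path below is just an example; point it at wherever you saved your own file.

```shell
# Edit your crontab with `crontab -e`, then add a line like this
# to run the upload script every morning at 07:00.
# The path is a hypothetical example.
0 7 * * * /usr/bin/python3 /home/me/upload_to_bigquery.py
```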
OWOX BI is a Google Sheets add-on that makes it very user-friendly and easy to upload data into BigQuery.
- Download the add-on
- Click on Add-ons → OWOX BI BigQuery requests → Upload data to BigQuery
- Specify the project, dataset, table name, and data types (INT64, FLOAT64, DATE, STRING, etc.)
Automate uploading data to BigQuery
- Click on Add-ons → OWOX BI BigQuery requests → Schedule reports
- Select your own preferences under Schedule report.
- Select the report(s) you want to be updated.
- Choose whether you want to receive email notifications. If yes: all or in case of errors only.
- Confirm that you understand and agree with the Google BigQuery pricing policy.
That’s pretty much it!