Automation: the why, the process and the how. Automation is more than just automating data downloads and uploads from a source. This article is about automating processes, but also about how I approach new data-related things.
I’ve been working with data for quite a few years now, and one thing I’ve come to realize is that data analysis is only a small part of your job as a data analyst. Preparation, cleaning, decision making, analysis and visualization are an ongoing process, but that’s not all of it. There’s another side, a side I’m learning a lot about and would like to share.
It’s the more technical side. It’s about storing data, databases, processing data, funneling data, pushing and pulling data. All of this is what gets you the data to work with in the first place.
Downloading a data file is probably one of the most straightforward ways to get data to work with. It’s a one-off, often a static CSV or Excel file. You save it somewhere and you start working with it. It is actually a labour-intensive process, especially if you have to do this every day, with loads of data sources, for multiple clients. Just imagine. Not very efficient.
That’s where automation becomes helpful. Pull data, store it somewhere, and push it to a place where it can be used for analysis and/or visualization, preferably every day. And what if you have multiple data sources that you have to combine?
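To make the pull, combine and push steps concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a finished pipeline: the two CSV strings, the source names and the `pull`, `combine` and `push` helpers are all hypothetical, and in practice the pull step would download from an API or read files instead of using in-memory text.

```python
import csv
import io

# Hypothetical daily exports from two sources (made-up numbers).
# In practice these would be downloaded from an API or read from files.
FACEBOOK_CSV = "date,clicks\n2024-01-01,120\n2024-01-02,95\n"
ANALYTICS_CSV = "date,clicks\n2024-01-01,300\n2024-01-02,280\n"


def pull(csv_text, source):
    """Parse one export and tag every row with the source it came from."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {"date": row["date"], "source": source, "clicks": int(row["clicks"])}
        for row in reader
    ]


def combine(*batches):
    """Merge rows from all sources into one list, sorted by date, then source."""
    rows = [row for batch in batches for row in batch]
    return sorted(rows, key=lambda r: (r["date"], r["source"]))


def push(rows):
    """Write the combined rows as CSV text, ready for a sheet or dashboard."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["date", "source", "clicks"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


combined = combine(pull(FACEBOOK_CSV, "facebook"), pull(ANALYTICS_CSV, "analytics"))
report = push(combined)
print(report)
```

Tagging each row with its source is what makes the combined data usable later: you can still tell the sources apart after they end up in one table. Running a script like this on a schedule, once a day, is the automation part.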
Databases and automation
I’ve been working with BigQuery for a while and I just started working with MySQL. I use this data warehouse and this database because they are free* (*BigQuery’s free tier is limited to 10 GB per month). I prefer them over software that is advertised online as free but that you often end up paying for. I’d recommend you start working with MySQL and BigQuery so you force yourself to do the research and learn automation, and everything that’s involved, from scratch. I’m sure plenty of other free options are available as well. It’s about finding your way and whatever works best for you.
A quick search online will give you several scripts to push data into MySQL, depending on what you need. Combine one of those with a little Python script (also freely available online) to pull your data and you’ve pretty much set up your own automation script. Using those scripts helped me to increase my understanding of what automation actually is and how it works. It took me quite some time to understand the scripts properly and to make them work, but the result was that I helped a business pull their data from Facebook and push it into Data Studio, via Google Sheets, with an automatic update every day. Because I understood how the scripts work, I was able to debug errors and adjust some little things myself.
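As an illustration of the "push data into a database" step, here is a minimal sketch of a daily load that can be rerun without creating duplicate rows. It is not one of the scripts mentioned above. To keep it runnable without a server it uses SQLite from the Python standard library as a stand-in for MySQL, and the table name `ad_stats`, its columns and the sample rows are hypothetical.

```python
import sqlite3


def load_daily_rows(conn, rows):
    """Insert the rows, replacing any existing row for the same day and campaign."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS ad_stats (
            date TEXT NOT NULL,
            campaign TEXT NOT NULL,
            spend REAL NOT NULL,
            PRIMARY KEY (date, campaign)
        )
        """
    )
    # The primary key makes a rerun overwrite rows instead of duplicating them.
    conn.executemany(
        "INSERT OR REPLACE INTO ad_stats (date, campaign, spend) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a real database connection
rows = [("2024-01-01", "spring_sale", 12.5), ("2024-01-01", "newsletter", 4.0)]

load_daily_rows(conn, rows)
load_daily_rows(conn, rows)  # same job run twice: still two rows
load_daily_rows(conn, [("2024-01-01", "spring_sale", 13.0)])  # corrected figure

count = conn.execute("SELECT COUNT(*) FROM ad_stats").fetchone()[0]
spend = conn.execute(
    "SELECT spend FROM ad_stats WHERE campaign = 'spring_sale'"
).fetchone()[0]
print(count, spend)
```

With a MySQL driver such as mysql-connector-python the structure stays the same, but the placeholders are written `%s` instead of `?`, and the overwrite would be expressed with `REPLACE INTO` or `INSERT ... ON DUPLICATE KEY UPDATE`. Making the load safe to rerun matters for a daily job, because sooner or later a run fails halfway and has to be started again.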
The next step is connecting Facebook data, via Google Analytics data, to BigQuery. Pushing it into BigQuery makes it possible to merge data at a later stage (more about that in a later article). OWOX BI is a brilliant add-on for Google Sheets that can be downloaded for free and that easily lets you push data to BigQuery. You could also write a (not too complex) Python script. I’ll show you one in one of my next articles.
Every time I approach something I don’t know, I start with the result I’d like to have and work my way back to what I think is going to be the first step. While doing that, I prefer to simplify things. In my opinion, complex wording is often used to make stuff sound complex. Half of the time it’s quite easy if you use different wording.
Another important reason for doing this is that I need to be able to explain it to my clients. If they don’t understand what I want to do and where their money is going, how can I expect them to pay me?
A third reason for this approach is that explaining is how I test my own understanding. If I’m able to explain to someone what I’m doing, and I’m capable of answering their questions, I truly understand what’s going on. I’m mastering the skill. And that skill is something you need if you have to report to management or a boss. If you’re able to explain it, you completely own the knowledge.
Automating anything related to your business can potentially have a lot of benefits. People in data-related jobs, like data analysts and data scientists, understand this most of the time. In my experience so far, managers and people in other lines of business often do not. Managers sometimes don’t realise that it is actually possible, and CEOs want to focus on their business, not on those details. That’s why I bring up this topic and explain it, so people will understand.
I’ve now worked for quite a few businesses of different sizes where no attention has been paid to data, data organisation or data analysis, EVER. The business is growing and doing well, money is being made, so there’s not much attention or time for data. But all the data that comes in from doing business is important and should be collected, so it can be used for insights and analysis to see whether the business is on the right track.