Those of you who follow Tim Wilson and heard this talk of his would probably guess what I mean by the title. Those of you who haven't, first of all: please do, for it has wisdom that one of the most prominent digital analysts alive today has gathered over the years (and it has nothing to do with the fact that it made me feel good about my skills after feeling like a dinosaur ever since I started working with digital analysts).
So, to give you some background: I recently started working at a Digital Analytics consultancy, MarketLytics, as their 'Statistics-with-R-guy', tasked with extending our capabilities beyond what the usual Digital Analytics tools offer. After the usual on-boarding and getting a tad familiar with the most commonly used tools, I have been given three jobs: developing an app that helps audit analytics setups and detect potential causes of faults; analyzing and modeling data obtained from analytics engines or provided by clients; and building data processing and modeling/forecasting pipelines.
So what's it about?
After having found a wealth of articles and talks by Digital Analytics gurus, written for Digital Analysts, about how to use R, Statistics and Machine Learning to expand their capabilities, I thought about writing from the perspective of someone who comes from the other direction, i.e., a user of R for Statistics and Machine Learning starting to work with Digital Analytics platforms. After all, it's just some new data to work on, right? Right?...
First off: Google Analytics has a lot, and I mean a lot, of features and tools, and having this wealth of tools at their fingertips seems to give analysts a certain way of thinking and a skill in navigating through data that people working with command-line tools like R find... too fast. So much so that I felt like a dinosaur in their midst.
For example, an experienced digital analyst would, in the span of a few minutes, run through so many transformations of the data just to explore it, before a guy like me could get his brain out of the rabbit-hole of all the pre-processing I would have needed to enable those transformations in the first place, had I been given the same raw data that GA has. The sheer ease that GA's pre-processing and its wealth of tools provide takes the worry of ‘how?’ totally out of the analysis process. That lets analysts focus completely on the information the data holds, and practically gives them more experience in analysis and exploration than dinosaurs like us would get in the same number of hours on the terminal.
Advice: Use Google Analytics for a couple of weeks working on tasks assigned by an expert, show them your work/findings every day, and watch as they prove how slow you are each time. A week or two of such a routine will help you figure out what goes on in their brains and how to think like them, what they prioritize, and how to make the most of the mech that GA is.
Secondly: Don't let the ease soften you. Even though Google Analytics does a lot of pre-processing, don't let that delude you into not inspecting the data for all the treacherous parasites and trash that you are used to. GA, after all, stores the data sent to it through tags written by humans, from websites or apps coded by humans. That means a landing page dimension can hold tens of different strings that all point to the same page. The reason can be anything: inconsistent capitalization in href attributes, paths that changed over the course of time, or whatever else happens in the realm of web developers. It would be an absolute bummer if you found that out after you did a tedious stepwise model selection in a regression analysis over a fairly large dataset, found the 'best' fit, and started analysis to report the most influential variables. It might have happened to someone out there, I have absolutely no idea who. You can ask around. Definitely not me. Nope. Someone I know told me about that, so yeah it's bad. Please focus on the point, it wasn't me.
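A cheap guard against that kind of surprise is to collapse the cosmetic variants of a landing page before modeling and look at which raw strings end up sharing a key. Below is a minimal sketch in Python (the same idea carries over to R with a few string functions). The function names are mine, and the rules are assumptions you should check against your own site: it assumes paths are case-insensitive, that a trailing slash does not matter, and that query strings can be dropped.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize_landing_page(raw: str) -> str:
    """Collapse cosmetic variants of a landing page string into one key.

    Assumptions (verify for your site): case does not matter, a trailing
    slash does not matter, and the query string can be ignored.
    """
    path = urlsplit(raw.strip()).path.lower()
    # "/pricing/" and "/pricing" become the same key; the root stays "/".
    return path.rstrip("/") or "/"


def group_variants(pages):
    """Return only the keys that more than one distinct raw string maps to."""
    groups = defaultdict(set)
    for page in pages:
        groups[normalize_landing_page(page)].add(page)
    return {key: sorted(raw) for key, raw in groups.items() if len(raw) > 1}


if __name__ == "__main__":
    # Hypothetical values of a landing page dimension.
    pages = ["/Pricing", "/pricing/", "/pricing?ref=a", "/blog", "/"]
    for key, variants in group_variants(pages).items():
        print(key, "<-", variants)
```

This only catches the mechanical duplicates. Paths that were renamed over time still need a hand-written old-to-new mapping, because no string rule can know that two different paths were once the same page.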
In the course of my work, I also had the chance to work with MixPanel, another Analytics platform, which I learned to love for one reason: it gives access to the raw event data.
That's more like it. The absolute garbage, complete with the stench that drives regular software developers away, and that too in the God-forsaken ndjson format. What else do we live for? And what could be a better assignment than to code a pipeline to automatically process that data, conduct some tests and modeling, and spit out findings on any analytics setup thrown at it? This, my friends, is THE LIFE. And yet, after working on the project, handing it over to our analyst friends to audit setups, debugging it in production after seeing it blow up in our faces every time it was unleashed on a new setup, and finally reaching a point where it is being used and helping in our workflows, when I try to recall it all now, I can't even remember what experience to share with you! So I took the next logical step for someone writing this blog: I opened up my code and went through it to see if it would trigger some memories.
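For anyone who hasn't met ndjson yet: it is one JSON object per line, so a file can be read line by line without loading it whole. What follows is not our pipeline, only a minimal Python sketch of the first step any such pipeline needs, namely reading the lines and noting which ones are unusable instead of crashing on them. The function names are made up, and the `"event"` field name is an assumption about the export; pass whatever key your data actually uses.

```python
import io
import json
from collections import Counter


def read_ndjson(stream):
    """Yield (line_number, record) per non-empty line; record is None if unparseable."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are skipped, not reported
        try:
            yield line_number, json.loads(line)
        except json.JSONDecodeError:
            yield line_number, None


def audit_events(stream, name_field="event"):
    """Count events by name and collect line numbers of unusable records.

    `name_field` is an assumed key for the event name.
    """
    counts = Counter()
    bad_lines = []
    for line_number, record in read_ndjson(stream):
        # Unparseable lines, non-object values, and objects without a name
        # are all reported rather than silently dropped.
        if not isinstance(record, dict) or name_field not in record:
            bad_lines.append(line_number)
            continue
        counts[record[name_field]] += 1
    return counts, bad_lines


if __name__ == "__main__":
    # Hypothetical export: two good lines, one broken line.
    sample = io.StringIO('{"event": "signup"}\nnot json\n{"event": "purchase"}\n')
    counts, bad_lines = audit_events(sample)
    print(dict(counts), bad_lines)
```

Keeping the bad line numbers around is the design choice that matters here: on a new setup, the lines that fail to parse are usually the first clue about what the tagging got wrong.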
Update: I had to take a 5-minute break to fix a piece of code for a case I hadn't thought of before. But it did trigger a memory: Digital analysts are so used to interfaces with tons of bells and whistles, drag-n-drop-py objects, variables pretending to be interactive buttons and whatnot, that anything you make in Shiny is going to look LAME.
Just kidding O:). By all means do learn everything. Expand your skillset and become the most valuable resource for your employers O:). I'm an angel. See the ring above my head?
That said: working in Digital Analytics certainly exposed me to the world of possibilities that ease of setting up data-collection and processing pipelines opens up, and how much analysts can achieve by removing the dirtiest parts of this job from their workflows.
As a closing point, here is something I haven't stopped thinking about since I started working in this field, and thus can't help but share: how much are these tools, through their sheer speed and omnipresence, going to speed up the evolution of our behaviors as consumers by continuously optimizing them around consumption? It's really something that could be a subject of study over time. Those of you who couldn't figure out the subject of your doctoral thesis, you can thank me later.