At a time when self-service tools are making data more accessible to non-technical users, anybody can call themselves an analyst, build reports, and derive insights. This is widely viewed as a positive move: analytical resource is freed to work on more complicated insights, while internal clients can now “self-serve”. What could possibly be the downside?
Analysts, data scientists, statisticians and the like are generally known for their attention to detail and inquisitive nature. The reports that are now so easily produced would previously have been hand-crafted, with consideration given to the other factors that could influence the analysis.
Take the case below as an example. This chart shows the proportion of web visitors who log into the site. In general, we would expect this to increase over time; the chart suggests that there has been an almost exponential increase in logged-in users in the last week:
However, when we overlay the start date of each of our data capture sources, we can see that the increase is mainly caused by new ways of asking people to log in:
In this case the cause of the trend may be fairly self-evident, but the causes of missing data or an underlying trend can be so varied that they are often difficult to spot. Even if the cause is known, would a non-technical user have the skills to properly attribute ‘logged in’ growth to each data source?
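One way to make that attribution concrete is to compare the headline login rate with a like-for-like rate that only counts sources present from the start of the period. The following is a minimal sketch, not the method behind the charts above: the weekly figures and the source names (`site_header`, `newsletter_prompt`) are invented for illustration, and it assumes each login is recorded against exactly one source.

```python
# Hypothetical weekly data: total visitors, and logged-in visitors per capture source.
# "newsletter_prompt" is an assumed new source that only starts in week 3.
weeks = [
    {"visitors": 1000, "logins": {"site_header": 100}},
    {"visitors": 1000, "logins": {"site_header": 105}},
    {"visitors": 1000, "logins": {"site_header": 110, "newsletter_prompt": 90}},
]


def login_rate(week, sources=None):
    """Percentage of visitors who logged in, optionally restricted to some sources."""
    counted = sum(
        count
        for source, count in week["logins"].items()
        if sources is None or source in sources
    )
    return counted * 100 / week["visitors"]


# Sources that were already capturing logins in the first week.
established = set(weeks[0]["logins"])

headline = [login_rate(week) for week in weeks]
like_for_like = [login_rate(week, established) for week in weeks]

for number, (all_sources, same_sources) in enumerate(zip(headline, like_for_like), 1):
    print(f"week {number}: headline {all_sources:.1f}%, like-for-like {same_sources:.1f}%")
```

With these made-up numbers the headline rate appears to double in the final week, while the like-for-like rate rises only slightly; the gap between the two is the part of the "growth" that belongs to the new capture source.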
Let us explore how the same data could be interpreted differently. The chart below shows three KPIs for visitors in a store, collected by different means; we can compare them to validate each other. The KPIs are not normalised, which means the actual numbers are expected to differ, but they are generally expected to show a similar pattern.
A good way to compare these KPIs is to normalise them against a benchmark or index. The two charts below show different ways of normalising the data:
The first chart uses August as a benchmark and looks at percentage changes in the other months compared to August, whereas the second chart uses the average value of all the months. We can see here that, using the August index, sources 2 and 3 look similar (from August onwards), whereas sources 1 and 2 look more aligned using the average.
Looking back at the raw data, we see that sources 1 and 2 generally sit closer to each other than either does to source 3. When we also consider that in this case we would expect more activity in August given the school holidays, the spike in activity in source 3 in October does not make sense; hence this KPI is likely to be invalid.
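The two normalisations described above can be sketched in a few lines. This is an illustrative sketch only: the monthly figures are invented rather than read from the charts, and the function names `index_to_benchmark` and `index_to_mean` are hypothetical.

```python
from statistics import mean

# Hypothetical monthly visitor counts for a single source (not the article's data).
monthly_visitors = {"Aug": 1200, "Sep": 900, "Oct": 960, "Nov": 1140}


def index_to_benchmark(series, benchmark_key):
    """Percentage change of each period relative to one benchmark period."""
    base = series[benchmark_key]
    return {period: (value - base) / base * 100 for period, value in series.items()}


def index_to_mean(series):
    """Percentage change of each period relative to the mean of all periods."""
    base = mean(list(series.values()))
    return {period: (value - base) / base * 100 for period, value in series.items()}


vs_august = index_to_benchmark(monthly_visitors, "Aug")
vs_average = index_to_mean(monthly_visitors)

for month in monthly_visitors:
    print(f"{month}: {vs_august[month]:+.1f}% vs August, {vs_average[month]:+.1f}% vs average")
```

The choice of base matters. A single-month benchmark makes every other month inherit whatever was unusual about that month, while the all-months average spreads the effect of any one outlier across the whole series. That is why the same three sources can appear to pair up differently under each index.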
Five top tips for avoiding costly mistakes
- DON’T RUSH your reports – there is a reason analysts used to take a while to get you those reports
- Question everything – in the case above a peak in October didn’t make sense, and helped identify an issue with that data source and KPI
- KISS (Keep it simple) – there is often a temptation to find a pattern that isn’t there, but remember anything you say now could cause issues and confusion further down the line
- Validate your findings where possible – for instance by using a set of KPIs, not just one
- Don’t re-invent the wheel – standardise the reports that you can, to avoid changing KPIs
There is a saying “a good analyst keeps you honest”; just because you now have better access to the data does not mean that you will have a perfect month every month.
Over-analysing to force positive conclusions can lead to confusion over what you are actually doing well; in cases such as this, focus on how you will use the findings to do better.
What is the role of an analyst now?
Now that there are self-service options out there, does this remove the analyst from getting involved in reporting? Wrong! There are still several ways in which you can support this process:
1 Setting up the reporting structure
All of those years of experience gained by reviewing, cleaning, and understanding data need to be translated into making better use of the new reporting solutions and tools. Where possible, you should look to include auto-validation checks to ensure that the reports are accurate.
Helping to build those first standard reports also gives users a baseline to compare their results against.
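As one illustration of what an auto-validation check might look like, the sketch below indexes several sources that measure the same thing and flags any month where one source disagrees with the others. Everything here is an assumption made for the example: the figures, the source names, the function `flag_disagreements`, and the 15-point tolerance are all hypothetical and would need tuning against real data.

```python
from statistics import mean, median

# Hypothetical monthly figures for three sources measuring the same store visitors.
sources = {
    "source_1": {"Aug": 1000, "Sep": 800, "Oct": 820},
    "source_2": {"Aug": 5000, "Sep": 4100, "Oct": 4000},
    "source_3": {"Aug": 300, "Sep": 250, "Oct": 420},
}


def index_to_mean(series):
    """Rescale a series so that its own mean equals 100."""
    base = mean(list(series.values()))
    return {month: value / base * 100 for month, value in series.items()}


def flag_disagreements(sources, tolerance=15.0):
    """Return (source, month) pairs whose index is far from the cross-source median.

    The tolerance is an assumed threshold in index points, not a standard value.
    """
    indexed = {name: index_to_mean(series) for name, series in sources.items()}
    months = list(next(iter(sources.values())))
    flags = []
    for month in months:
        mid = median(ix[month] for ix in indexed.values())
        for name, ix in indexed.items():
            if abs(ix[month] - mid) > tolerance:
                flags.append((name, month))
    return flags


flags = flag_disagreements(sources)
print(flags)
```

A flag is a prompt to investigate rather than a verdict. Because each series is indexed to its own mean, one unusual month also shifts that source's index in every other month, so a single bad data point can raise more than one flag.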
2 User support
It is unlikely that a new user will understand all of the detail straight away, but it is important not to just produce the report for them; instead, support and train users, giving them the confidence to have a go on their own.
3 Complex analysis
By reducing the time you spend building standard reports, you should now have more time to dig a little deeper. It is good to work with other teams to define the key analytical projects that will add value. These projects may not always be an advanced model, but most reporting packages cannot cope with any advanced data mining (on the level of SQL or SAS).