As mentioned in the first blog in the series, we recently advertised a Data Analyst role, which had the following desirable skills listed:
- creating and querying data models using SQL
- working with both structured and semi-structured data
- exploring and visualising data
- using probability and statistics to perform and support analyses
- drawing insights from large and complex datasets
- using hypothesis tests to create rigorous insights from data
- working in an agile manner to continuously deliver work
- articulating results to a broad range of audiences
We didn’t expect any candidates to tick every single box, but the most compelling applications at the first stage ticked a number of them. The prior blog covered the first couple of bullet points, which I referred to as “Data Engineering skills”. This blog covers the second set of skills we look for: Data Analysis.
What is Data Analysis?
The name of the job! Once we have our data, we want to analyse it to produce meaningful insights. This is all about having the knowledge and skills to explore the data and identify patterns/trends – you’ll want to be sure that what you’re saying is justified, and so knowledge of statistics is crucial.
How does it relate to desirable skills?
In our job ad, the four middle bullet points were the Data Analysis skills we were interested in:
- exploring and visualising data
- using probability and statistics to perform and support analyses
- drawing insights from large and complex datasets
- using hypothesis tests to create rigorous insights from data
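To make the last bullet point concrete, here is a minimal sketch of one kind of hypothesis test, a permutation test for a difference in means, using only the Python standard library. The function name `permutation_test`, the data, and the scenario (page load times under two designs) are all made up for illustration; this is one reasonable way to approach the question, not a description of how our team runs analyses.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference and an approximate p-value.
    (Hypothetical helper written for this example.)
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable,
        # so shuffle the pooled values and split them again.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up illustrative data: page load times (seconds) under two designs.
design_a = [12.1, 11.8, 12.5, 13.0, 12.2, 11.9, 12.7, 12.4]
design_b = [11.2, 11.5, 10.9, 11.8, 11.1, 11.4, 11.0, 11.6]

observed_diff, p_value = permutation_test(design_a, design_b)
print(f"Observed difference in means: {observed_diff:.2f}")
print(f"Approximate p-value: {p_value:.4f}")
```

A small p-value suggests the observed difference would be unlikely if the two designs really performed the same. In practice you would also want to think about sample size, how the data was collected, and whether the difference is large enough to matter.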
How to learn Data Analysis
There’s a lot covered in the desired skills, and a lot to learn. The books and links below are a good starting point:
What do you need to know? | How can you learn? |
---|---|
Working knowledge of statistical principles | To carry out exploratory data analysis and produce rigorous, meaningful analyses, you need a working knowledge of statistics. You can build this from books, but many online “Data Science” courses also introduce the key ideas. DataCamp has some good Data Science courses that cover the basics and let you choose between Python and R (we use both!). Delphine from our Data Science team recommends this Udemy Data Science bootcamp, which is reasonably priced and very thorough. (You can read more about how Delphine moved from lab scientist to data scientist here.) |
How to visualise data to ease interpretation | Aside from the technical details of how you visualise data, the goal of doing so is to make it easier to interpret. This helps people quickly answer: What is this data telling me? To do this as well as possible, it’s important to understand how humans interpret visual information – read this paper by Cleveland and McGill to learn more about that. How Charts Lie by Alberto Cairo will also provide a good grounding in visualisation. |
Hopefully you’ve already read my blog on how to get started with Data Engineering. There’s just one more skill to look at, Data Evangelism, and you can read all about it here.