Quantitative data analysis is one of those things that often strikes fear into students when they reach the research stage of their degree. It’s totally understandable – quantitative data analysis is a complex topic, full of daunting lingo like medians, modes, correlation and covariance. Suddenly we’re all wishing we’d paid a little more attention in math class.
The good news is that while quantitative data analysis is a mammoth topic, gaining a working understanding of the basics isn’t that hard, even for those of us who avoid numbers and math at all costs. In this post, we’re going to break quantitative analysis down into simple, bite-sized chunks so you can get comfy with the core concepts and approach your research with confidence.
What is quantitative data analysis?
Despite being a mouthful, quantitative data analysis simply means analysing data that is numbers-based (as opposed to words-based), or data that can be easily “converted” into numbers without losing any meaning. For example, category-based variables such as gender, ethnicity, or native language could all be “converted” into numbers without losing meaning.
What’s it used for?
Quantitative data analysis is typically used to measure differences between groups (for example, the popularity of different clothing colours), relationships between variables (for example, the relationship between weather temperature and voter turnout), and to test hypotheses in a scientifically rigorous way. This contrasts with qualitative data analysis, which can be used to analyse people’s perceptions and feelings about an event or situation. To learn more about the differences between qualitative and quantitative research, check out this article.
How does it work?
Since quantitative data analysis is all about analysing numbers, it’s no surprise that it involves statistics. Statistical analysis methods and techniques are the engine that powers quantitative data analysis, and these methods and techniques can vary from pretty basic calculations (for example, averages and medians) through to more sophisticated analyses (for example, correlations and regressions).
Do I need to become a statistician?
There are loads of different statistical techniques and, admittedly, things can get pretty complicated. Don’t stress though – you don’t need to be a master statistician to undertake quality research. You just need a solid understanding of the basics and you can learn about the analysis techniques that will be relevant to your specific research as you progress. We’ll take a look at those basics here.
What are the main analysis methods & techniques?
As we discussed earlier, quantitative data analysis is powered by statistical analysis. There are two main “branches” of statistical methods/techniques that are used – descriptive statistics and inferential statistics. In your research, you might only use descriptive statistics, or you might use a mix of both, depending on what you’re trying to figure out (in other words, depending on your research questions, aims and objectives).
But first – a quick detour:
Before we look at these two branches of statistics, you need to understand two very important words: population and sample.
The population is the entire group of people (or animals or companies or whatever) that you’re interested in researching. For example, if you were interested in researching Tesla owners in the US, then the population would be all Tesla owners in the US.
However, it’s extremely unlikely that you’re going to be able to interview or survey every single Tesla owner in the US. You’ll likely only be able to get access to a few hundred, maybe a few thousand owners. This group of accessible people whose data you actually collect is called your sample.
In other words, the population is the full chocolate cake, whereas a slice of that cake is the sample.
Right, now let’s get back to those two branches of statistics – descriptive and inferential:
Descriptive statistics serve a simple but critically important role in your research – to describe your data set (who would have thought?). In other words, they help you understand the details of your sample (the small slice of the population). Unlike inferential statistics (which we’ll get to soon), descriptive statistics are not aiming to make inferences about the entire population – they’re just interested in the details of your specific sample.
When you’re writing up your analysis chapter, descriptive statistics are the first set of stats you’ll cover, before moving on to inferential statistics. However, depending on your research objectives and questions, they may be the only type of statistics you use. Whatever the case, they’re essential.
Some common statistical techniques used in this branch include:
- Mean – this is simply the mathematical average of a range of numbers.
- Median – this is the middle point of a range of numbers (if those numbers were arranged from low to high).
- Standard deviation and variance – these indicate how dispersed a range of numbers is. In other words, how close (or far) all the numbers are to (or from) the average.
- Skewness – this indicates how symmetrical a range of numbers is. In other words, do they tend to cluster into a smooth bell curve shape in the middle (this is called a “normal distribution”), or do they skew to the left or right?
Here’s an example of these descriptive statistics in action. In this example, we’re looking at the bodyweight of 10 people. In other words, our sample consists of 10 respondents.
Together, these descriptive statistics give us a clear view of the data set:
- The mean/average weight is 72.4 kilograms.
- The median is very similar, suggesting that this data set has a relatively symmetrical distribution (i.e. a smooth bell curve shape).
- The standard deviation of 10.6 indicates that there’s quite a wide spread of numbers (ranging from 55 to 90).
- The skewness of -0.2 tells us that the data is slightly negatively skewed.
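If you'd like to see how these descriptives are calculated, here's a minimal sketch in Python using the standard library's `statistics` module. The ten weights are hypothetical, illustrative values – not the exact data set from the example above – so the resulting numbers will differ slightly.

```python
import statistics

# Ten hypothetical bodyweights in kilograms (illustrative only -
# not the exact data set from the example above).
weights = [55, 60, 65, 68, 70, 72, 75, 80, 85, 90]

mean = statistics.mean(weights)      # arithmetic average
median = statistics.median(weights)  # middle value when sorted
stdev = statistics.stdev(weights)    # sample standard deviation

# Simple (Fisher-Pearson) skewness: the average cubed deviation from
# the mean, divided by the population standard deviation cubed.
n = len(weights)
pop_sd = statistics.pstdev(weights)
skewness = sum((x - mean) ** 3 for x in weights) / (n * pop_sd ** 3)

print(f"mean={mean:.1f}, median={median:.1f}, "
      f"stdev={stdev:.1f}, skewness={skewness:.2f}")
```

As in the example above, a mean and median that sit close together, plus a skewness near zero, suggest a roughly symmetrical distribution.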
Why descriptive statistics matter
While these are all fairly basic statistics to calculate (you can calculate all of them in Excel with a few clicks), they’re incredibly important for a few reasons:
- They help you get both a macro and micro-level view of your data. In other words, they help you understand both the big picture and the finer details.
- They help you spot potential errors in the data – for example, if an average is way higher than you’d intuitively expect, or responses to a question are highly varied.
- They help inform which inferential statistical techniques you can use, as those techniques depend on the shape (symmetry and normality) of the data.
Simply put, descriptive statistics are really important, even though the statistical techniques used are fairly basic. All too often, we see students skimming over the descriptives in their eagerness to get to the seemingly more exciting inferentials, and then ending up with some very flawed results. Don’t be a sucker – give your descriptive statistics the love and attention they deserve.
As we discussed earlier, while descriptive statistics are all about the details of your specific data set (your sample), inferential statistics aim to make inferences about the population. In other words, inferential statistics aim to make predictions about what you’d find in the full population. This could include predictions about:
- Differences between groups – for example, height differences between children grouped by their favourite meal.
- Relationships between variables – for example, the relationship between body weight and the number of hours a week a person does yoga.
In other words, inferential statistics (when done correctly) allow you to connect the dots and predict what will happen in the real world, based on what you observe in your sample data. For this reason, inferential statistics are used for hypothesis testing – in other words, testing statements of change or of difference.
Of course, when you’re working with inferential statistics, the composition of your sample is really important. In other words, if your sample doesn’t accurately represent the population you’re researching, then your findings won’t necessarily be very useful – i.e. you won’t be able to infer very much.
For example, if your population of interest is a mix of 50% male and 50% female, but your sample is 80% male, you can’t make inferences about the population based on your sample, since it’s not representative. This area of statistics is called sampling, but we won’t go down that rabbit hole here (it’s a deep one!) – we’ll save that for another post.
Some common inferential statistical techniques include:
- T-tests – these compare the means (averages) of two groups of data to assess whether they’re significantly different. In other words, is the difference between the group averages likely to be real, or could it just be down to chance?
- ANOVAs – these are similar to t-tests, but they allow you to analyse multiple groups, not just two.
- Correlations – these assess the relationship between two variables. In other words, if one variable goes up, does the other variable go up, go down, or stay the same?
- Regressions – these are similar to correlations, but go a step further by modelling how one variable predicts another, rather than just whether they move together. Keep in mind that regression alone doesn’t prove cause and effect – establishing causation also depends on how your study is designed.
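To make the t-test slightly more concrete, here's a minimal sketch of an independent-samples (pooled-variance) t statistic calculated from first principles. The two groups and their values are hypothetical, and in practice you'd typically lean on a statistics package rather than hand-rolling this:

```python
import math
import statistics

# Hypothetical measurements for two independent groups (illustrative only).
group_a = [3.1, 2.8, 3.4, 3.0, 2.9]
group_b = [3.9, 4.1, 3.8, 4.3, 4.0]

mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
n_a, n_b = len(group_a), len(group_b)

# Pooled-variance t statistic (assumes the groups have roughly equal variances).
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
t = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

print(f"t = {t:.2f} on {n_a + n_b - 2} degrees of freedom")
```

You'd then compare the absolute value of `t` against a critical value for the relevant degrees of freedom (about 2.31 at the 5% level with 8 df) to decide whether the difference between the group means is statistically significant.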
Let’s take a look at an example of a correlation in action. Picture a scatter plot of weight against height. Intuitively, we’d expect there to be some relationship between these two variables, and that’s typically what such a plot shows – the data points tend to cluster along a diagonal line running from bottom left to top right.
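If you're curious about the arithmetic behind a correlation, here's a small sketch that computes Pearson's r from scratch on hypothetical height and weight data (the values are made up purely for illustration):

```python
import math

# Hypothetical paired observations (illustrative only).
heights = [150, 160, 165, 170, 175, 180]   # cm
weights = [52, 58, 63, 68, 74, 80]         # kg

n = len(heights)
mean_h = sum(heights) / n
mean_w = sum(weights) / n

# Pearson's r: the co-variation of the two variables, scaled by
# how much each variable varies on its own.
cov = sum((h - mean_h) * (w - mean_w) for h, w in zip(heights, weights))
ss_h = sum((h - mean_h) ** 2 for h in heights)
ss_w = sum((w - mean_w) ** 2 for w in weights)
r = cov / math.sqrt(ss_h * ss_w)

print(f"Pearson's r = {r:.2f}")
```

An r close to +1 indicates a strong positive relationship (taller people tend to weigh more in this made-up sample), while values near 0 indicate little or no linear relationship.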
These are just a handful of common quantitative analysis techniques – there are many, many more. The right technique depends on many factors, including the distribution of the data (i.e. how symmetrical or skewed it is). And that’s exactly why descriptive statistics are so important – they’re the first step to knowing which inferential techniques you can and can’t use.
If this all sounds like gibberish to you right now, don’t worry. You just need to be aware that there are many options, and each option has its own set of assumptions, data requirements and limitations. It’s perfectly natural (and extremely common) to learn as you go, figuring things out as and when you need them.
How to choose the right analysis
When you start thinking about quantitative data analysis, it’s tempting to jump straight into the statistical analysis methods and techniques – for example, correlation analysis, regression analysis, etc. But before you can make any decisions about which statistical tests and analyses to use, you need to think about two very important factors:
- The type of quantitative data you have (level and shape)
- Your research questions and hypotheses
Let’s take a closer look at each of these:
The type of data you have
Unfortunately, not all quantitative data is created equal. Four different types of quantitative data reflect different levels of measurement – nominal, ordinal, interval and ratio. If you’re not familiar with this terminology, check out this post where we explain levels of measurement before continuing.
Why does this matter? Well, because different statistical methods and techniques require different types of data. For example, some techniques work with categorical data (such as nominal or ordinal data), while others work with numerical data (such as interval or ratio) – and some work with a mix.
Another important factor is the shape of the data – in other words, does it have a normal distribution (i.e. is it a smooth bell-shaped curve, centred in the middle) or is it very skewed to the left or right. Again, different statistical techniques work for different shapes of data – some are designed for symmetrical data while others are designed for skewed data. Yet another reminder of the importance of descriptive statistics!
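As a rough illustration of checking shape, here's a sketch of a common rule of thumb: if the mean sits well above (or below) the median, the data is likely skewed. The exam scores and the 10%-of-a-standard-deviation threshold are both hypothetical choices, purely for illustration – a formal normality test would be more rigorous.

```python
import statistics

# Hypothetical exam scores with one extreme high value (illustrative only).
scores = [45, 50, 52, 55, 55, 58, 60, 62, 90]

mean = statistics.mean(scores)
median = statistics.median(scores)

# Rule of thumb: a mean that sits well above the median suggests a
# right skew (and vice versa). Here, the single score of 90 drags the
# mean upward while the median stays put.
skewed = abs(mean - median) > 0.1 * statistics.stdev(scores)

if skewed:
    print("Data looks skewed - check which techniques tolerate this")
else:
    print("Data looks roughly symmetrical")
```

In this made-up sample the mean lands noticeably above the median, which would nudge you towards techniques designed for skewed data.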
Your research questions and hypotheses
The nature of your research questions and research hypotheses will heavily influence which statistical methods and techniques you use.
If you’re just interested in understanding attributes of your sample (as opposed to the entire population), then descriptive statistics are probably all you need. For example, if you just want to assess means (averages) and medians (centre points) of variables in a group of people.
On the other hand, if your research questions are investigating an entire population, looking to understand differences between groups or relationships between variables, then you’ll likely need both descriptive statistics and inferential statistics.
Therefore, it’s really important to get very clear about your research aims and objectives, and more importantly, your research questions and hypotheses, before you start looking at which statistical techniques to use. Don’t try to shoehorn a specific statistical technique into your research just because you like it or have some experience with it.
Time to recap…
You’re still with me? That’s impressive. We’ve covered a lot of ground here, so let’s recap the key points:
- Quantitative data analysis is all about analysing number-based data (which includes categorical and numerical data) using various statistical techniques.
- The two main branches of statistics are descriptive statistics and inferential statistics. Descriptives describe your sample, whereas inferentials make predictions about what you’ll find in the population.
- Common descriptive statistical techniques include mean (average), median, standard deviation (and/or variance) and skewness.
- Common inferential statistical techniques include t-tests, ANOVAs, correlation and regression analysis.
- To choose the right statistical methods and techniques, you need to consider the type of data you’re working with (nominal, ordinal, interval or ratio), as well as your research questions and hypotheses.