Qualitative Data Coding 101

How to code qualitative data, the smart way (with examples).

By: Jenna Crosley (PhD) | Reviewed by: Dr Eunice Rautenbach | December 2020

As we’ve discussed previously, qualitative research makes use of non-numerical data – for example, words, phrases or even images and video. To analyse this kind of data, the first dragon you’ll need to slay is qualitative data coding (or just “coding” if you want to sound cool). But what exactly is coding and how do you do it?

Overview: Qualitative Data Coding

In this post, we’ll explain qualitative data coding in simple terms. Specifically, we’ll dig into:

  1. What exactly qualitative data coding is
  2. What different types of coding exist
  3. How to code qualitative data (the process)
  4. Moving from coding to qualitative analysis
  5. Tips and tricks for quality data coding

Qualitative Data Coding: The Basics

What is qualitative data coding?

Let’s start by understanding what a code is. At the simplest level, a code is a label that describes the content of a piece of text. For example, in the sentence:

“Pigeons attacked me and stole my sandwich.”

You could use “pigeons” as a code. This code simply describes that the sentence involves pigeons.

So, building onto this, qualitative data coding is the process of creating and assigning codes to categorise data extracts. You'll then use these codes later down the road to derive themes and patterns for your qualitative analysis (for example, thematic analysis). Coding and analysis can take place simultaneously, but it's important to note that coding does not necessarily involve identifying themes (depending on which textbook you're reading, of course). Instead, it generally refers to the process of labelling and grouping similar types of data to make generating themes and analysing the data more manageable.

Make sense? Great. But why should you bother with coding at all? Why not just look for themes from the outset? Well, coding is a way of making sure your data is valid. In other words, it helps ensure that your analysis is undertaken systematically and that other researchers can review it (in the world of research, we call this transparency). Simply put, good coding is the foundation of high-quality analysis.

What are the different types of coding?

Now that we’ve got a plain-language definition of coding on the table, the next step is to understand what overarching types of coding exist – in other words, coding approaches. Let’s start with the two main approaches, inductive and deductive.

With deductive coding, you, as the researcher, begin with a set of pre-established codes and apply them to your data set (for example, a set of interview transcripts). Inductive coding, on the other hand, works in reverse, as you create the set of codes based on the data itself – in other words, the codes emerge from the data. Let's take a closer look at both.

Deductive coding 101

With deductive coding, you make use of pre-established codes, which are developed before you interact with the present data. This usually involves drawing up a set of codes based on a research question or previous research. You could also use a code set from the codebook of a previous study.

For example, if you were studying the eating habits of college students, you might have a research question along the lines of:

“What foods do college students eat the most?”

As a result of this research question, you might develop a code set that includes codes such as “sushi”, “pizza”, and “burgers”.

Deductive coding allows you to approach your analysis with a very tightly focused lens and quickly identify relevant data. Of course, the downside is that you could miss out on some very valuable insights as a result of this tight, predetermined focus.
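
If it helps to see the idea in a more mechanical form, here's a minimal Python sketch of applying a pre-established code set to a few extracts. It's only an illustration: the keyword lists, the `apply_codes` helper and the sample extracts are all assumptions made up for this example, and simple keyword matching is a rough stand-in for the judgement you'd apply when reading the data yourself.

```python
import re

# Pre-established code set based on the example research question.
# The keyword lists are illustrative assumptions, not a real codebook.
CODEBOOK = {
    "sushi": ["sushi", "sashimi"],
    "pizza": ["pizza"],
    "burgers": ["burger", "burgers", "cheeseburger"],
}


def apply_codes(extract, codebook):
    """Return the sorted codes whose keywords appear as whole words in the extract."""
    words = set(re.findall(r"[a-z]+", extract.lower()))
    return sorted(
        code for code, keywords in codebook.items() if words & set(keywords)
    )


# Hypothetical interview extracts.
extracts = [
    "I grab a burger between lectures most days.",
    "We order pizza or sushi when we study late.",
    "Honestly, I mostly live on instant noodles.",
]

for extract in extracts:
    print(apply_codes(extract, CODEBOOK), "<-", extract)
```

Notice that the last extract gets no code at all. That's the downside of a tight, predetermined focus in action: the noodles are relevant to the research question, but the code set wasn't built to catch them.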

Inductive coding 101

But what about inductive coding? As we touched on earlier, this type of coding involves jumping right into the data and then developing the codes based on what you find within the data.

For example, if you were to analyse a set of open-ended interviews, you wouldn’t necessarily know which direction the conversation would flow. If a conversation begins with a discussion of cats, it may go on to include other animals too, and so you'd add these codes as you progress with your analysis. Simply put, with inductive coding, you “go with the flow” of the data.

Inductive coding is great when you're researching something that isn't yet well understood because the coding derived from the data helps you explore the subject. Therefore, this type of coding is usually used when researchers want to investigate new ideas or concepts, or when they want to create new theories.

A little bit of both… hybrid coding approaches

If you’ve got a set of codes you’ve derived from a research topic, literature review or a previous study (i.e. a deductive approach), but you still donโ€™t have a rich enough set to capture the depth of your qualitative data, you canย combine deductive and inductiveย methods – this is called aย hybridย coding approach.ย 

To adopt a hybrid approach, you’ll begin your analysis with a set of a priori codes (deductive) and then add new codes (inductive) as you work your way through the data. Essentially, the hybrid coding approach provides the best of both worlds, which is why it’s pretty common to see this in research.
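
As a rough sketch of how you might keep track of this, the Python snippet below starts with a set of a priori codes and records any new code as inductive the first time it's used. The `assign_code` helper, the code names and the extracts are hypothetical, and the code assignments simply stand in for decisions you'd make while reading.

```python
# A priori (deductive) codes you start with; the names are illustrative.
codebook = {"sushi": "deductive", "pizza": "deductive", "burgers": "deductive"}


def assign_code(code, codebook):
    """Record a code, marking it as inductive if it was not in the starting set."""
    if code not in codebook:
        codebook[code] = "inductive"
    return code


# Codes a researcher might assign while reading (hypothetical judgements).
coded_extracts = [
    ("I grab a burger between lectures.", assign_code("burgers", codebook)),
    ("I skip lunch when money is tight.", assign_code("skipping meals", codebook)),
]

# Codes that emerged from the data rather than from the starting set.
new_codes = [code for code, origin in codebook.items() if origin == "inductive"]
print(new_codes)
```

Keeping the origin of each code on record makes it easy to report later which codes came from prior research and which emerged from the data.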

How to code qualitative data

Now that we’ve looked at the main approaches to coding, the next question you’re probably asking is “how do I actually do it?”. Let’s take a look at theย coding process, step by step.

Both inductive and deductive methods of coding typically occur in two stages:ย initial codingย andย line by line coding.ย 

In the initial coding stage, the objective is to get a general overview of the data by reading through and understanding it. If you’re using an inductive approach, this is also where you’ll develop an initial set of codes. Then, in the second stage (line by line coding), you’ll delve deeper into the data and (re)organise it according to (potentially new) codes.ย 

Let’s take a look at these two stages of coding in more detail.

Step 1 – Initial coding

The first step of the coding process is to identify the essence of the text and code it accordingly. While there are various qualitative analysis software packages available, you can just as easily code textual data using Microsoft Word's “comments” feature.

Let’s take a look at a practical example of coding. Assume you had the following interview data from two interviewees:

What pets do you have?

I have an alpaca and three dogs.

Only one alpaca? They can die of loneliness if they don’t have a friend.

I didn’t know that! I’ll just have to get five more.

What pets do you have?

I have twenty-three bunnies. I initially only had two, I’m not sure what happened.

In the initial stage of coding, you could assign the code of “pets” or “animals”. These are just initial, fairly broad codes that you can (and will) develop and refine later. In the initial stage, broad, rough codes are fine – they're just a starting point which you will build onto in the second stage.

How to decide which codes to use

But how exactly do you decide what codes to use when there are many ways to read and interpret any given sentence? Well, there are a few different approaches you can adopt. The main approaches to initial coding include:

  • In vivo coding
  • Process coding
  • Open coding
  • Descriptive coding
  • Structural coding
  • Values coding

Let’s take a look at each of these:

In vivo coding

When you use in vivo coding, you make use of participants’ own words, rather than your interpretation of the data. In other words, you use direct quotes from participants as your codes. By doing this, you'll avoid trying to infer meaning, instead staying as close to the original phrases and words as possible.

In vivo coding is particularly useful when your data are derived from participants who speak different languages or come from different cultures. In these cases, it's often difficult to accurately infer meaning due to linguistic or cultural differences.

For example, English speakers typically view the future as in front of them and the past as behind them. However, this isn't the same in all cultures. Speakers of Aymara view the past as in front of them and the future as behind them. Why? Because the future is unknown, so it must be out of sight (or behind us). They know what happened in the past, so their perspective is that it's positioned in front of them, where they can “see” it.

In a scenario like this one, it's not possible to derive the reason for viewing the past as in front and the future as behind without knowing the Aymara culture’s perception of time. Therefore, in vivo coding is particularly useful, as it avoids interpretation errors.

Process coding

Next up, there’s process coding, which makes use ofย action-based codes. Action-based codes are codes that indicate a movement or procedure. These actions are often indicated by gerunds (words ending in โ€œ-ingโ€) – for example, running, jumping or singing.

Process coding is useful as it allows you to code parts of data that aren’t necessarily spoken, but that are still imperative to understanding the meaning of the texts.ย 

An example here would be if a participant were to say something like, โ€œI have no idea where she isโ€. A sentence like this can be interpreted in many different ways depending on the context and movements of the participant. The participant could shrug their shoulders, which would indicate that they genuinely donโ€™t know where the girl is; however, they could also wink, showing that they do actually know where the girl is.ย 

Simply put, process coding is useful as it allows you to, in a concise manner, identify the main occurrences in a set of data and provide a dynamic account of events. For example, you may have action codes such as, โ€œdescribing a pandaโ€, โ€œsinging a song about bananasโ€, or โ€œarguing with a relativeโ€.

Descriptive coding

Descriptive coding aims to summarise extracts by using a single word or noun that encapsulates the general idea of the data. These words will typically describe the data in a highly condensed manner, which allows the researcher to quickly refer to the content.

Descriptive coding is very useful when dealing with data that appear in forms other than traditional text – i.e. video clips, sound recordings or images. For example, a descriptive code could be “food” when coding a video clip that involves a group of people discussing what they ate throughout the day, or “cooking” when coding an image showing the steps of a recipe.

Structural coding

Structural coding involves labelling and describing specific structural attributes of the data. Generally, it includes coding according to answers to the questions of “who”, “what”, “where”, and “how”, rather than the actual topics expressed in the data. This type of coding is useful when you want to access segments of data quickly, and it can help tremendously when you're dealing with large data sets.

For example, if you were coding a collection of theses or dissertations (which would be quite a large data set), structural coding could be useful as you could code according to different sections within each of these documents – i.e. according to the standard dissertation structure. What-centric labels such as “hypothesis”, “literature review”, and “methodology” would help you to efficiently refer to sections and navigate without having to work through sections of data all over again.

Structural coding is also useful for data from open-ended surveys. These data may initially be difficult to code as they lack the set structure of other forms of data (such as an interview with a strict set of questions to be answered). In this case, it would be useful to code sections of data that answer certain questions such as “who?”, “what?”, “where?” and “how?”.

Let’s take a look at a practical example. If we were to send out a survey asking people about their dogs, we may end up with a (highly condensed) response such as the following:ย 

Bella is my best friend. When Iโ€™m at home I like to sit on the floor with her and roll her ball across the carpet for her to fetch and bring back to me. I love my dog.

In this set, we could codeย Bellaย as โ€œwhoโ€,ย dogย as โ€œwhatโ€,ย homeย andย floorย as โ€œwhereโ€, andย roll her ballย as โ€œhowโ€.ย 
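
If you're keeping your codes in a script or spreadsheet rather than in Word comments, the coding above could be stored as a simple lookup. The Python sketch below is one possible layout, not a standard format, and the `segments_for` helper is a hypothetical name used only for this example.

```python
response = (
    "Bella is my best friend. When I'm at home I like to sit on the floor "
    "with her and roll her ball across the carpet for her to fetch and "
    "bring back to me. I love my dog."
)

# Structural codes assigned by hand, mirroring the example above.
structural_codes = {
    "who": ["Bella"],
    "what": ["dog"],
    "where": ["home", "floor"],
    "how": ["roll her ball"],
}


def segments_for(code, coding, text):
    """Return the segments filed under a structural code that occur in the text."""
    return [segment for segment in coding.get(code, []) if segment in text]


print(segments_for("where", structural_codes, response))
```

With the codes stored this way, you can jump straight to every “where” or “how” segment without rereading the whole response, which is exactly the kind of quick access structural coding is meant to give you.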

Values coding

Finally, values coding involves coding that relates to the participant's worldviews. Typically, this type of coding focuses on excerpts that reflect the values, attitudes, and beliefs of the participants. Values coding is therefore very useful for research exploring cultural values and intrapersonal experiences and actions.

To recap, the aim of initial coding is to understand and familiarise yourself with your data, to develop an initial code set (if you're taking an inductive approach) and to take the first shot at coding your data. The coding approaches above allow you to arrange your data so that it's easier to navigate during the next stage, line-by-line coding (we'll get to this soon).

While these approaches can all be used individually, it’s important to remember that it's possible, and potentially beneficial, to combine them. For example, when conducting initial coding with interviews, you could begin by using structural coding to indicate who speaks when. Then, as a next step, you could apply descriptive coding so that you can navigate to, and between, conversation topics easily. You can check out some examples of various techniques here.

Step 2 – Line-by-line coding

Once you’ve got an overall idea of our data, are comfortable navigating it and have applied some initial codes, you can move on to line by line coding. Line by line coding is pretty much exactly what it sounds like – reviewing your data, line by line,ย digging deeperย and assigning additional codes to each line.ย 

With line-by-line coding, the objective is to pay close attention to your data to add detail to your codes. For example, if you have a discussion of beverages and you previously just coded this as “beverages”, you could now go deeper and code more specifically, such as “coffee”, “tea”, and “orange juice”. The aim here is to scratch below the surface. This is the time to get detailed and specific so as to capture as much richness from the data as possible.
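
To make the beverages example a little more concrete, here's a small Python sketch of how broad initial codes could be refined line by line. The three lines of data and the variable names are made up for illustration, and the specific codes stand in for your own reading of each line.

```python
# Hypothetical lines from a transcript about beverages.
lines = [
    "I start every morning with a coffee.",
    "In the afternoon I switch to tea.",
    "My kids only drink orange juice.",
]

# Initial coding: one broad code for every line in the passage.
initial_codes = {line: ["beverages"] for line in lines}

# Line-by-line coding: a more specific code for each line.
# These assignments stand in for the researcher's own judgement.
detailed = {
    lines[0]: "coffee",
    lines[1]: "tea",
    lines[2]: "orange juice",
}

# Keep the broad code and add the specific one next to it.
refined_codes = {
    line: codes + [detailed[line]] for line, codes in initial_codes.items()
}

for line, codes in refined_codes.items():
    print(codes, "<-", line)
```

Keeping the broad code alongside the specific one means you don't lose the initial grouping when you add detail.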

In the line-by-line coding process, it's useful to code everything in your data, even if you don’t think you’re going to use it (you may just end up needing it!). As you go through this process, your coding will become more thorough and detailed, and you’ll have a much better understanding of your data as a result of this, which will be incredibly valuable in the analysis phase.

Moving from coding to analysis

Once you’ve completed your initial coding and line by line coding, the next step is toย start your analysis. Of course, the coding process itself will get you in “analysis mode” and you’ll probably already have some insights and ideas as a result of it, so you should always keep notes of your thoughts as you work through the coding.ย ย 

When it comes to qualitative data analysis, there areย many different types of analysesย (we discuss some of theย most popular ones here) and the type of analysis you adopt will depend heavily on your research aims, objectives and questions. Therefore, we’re not going to go down that rabbit hole here, but we’ll cover the important first steps that build the bridge from qualitative data coding to qualitative analysis.

When starting to think about your analysis, it’s useful toย ask yourselfย the following questions to get the wheels turning:

  • What actions are shown in the data?
  • What are the aims of these interactions and excerpts? What are the participants potentially trying to achieve?
  • How do participants interpret what is happening, and how do they speak about it? What does their language reveal?
  • What are the assumptions made by the participants?
  • What are the participants doing? What is going on?
  • Why do I want to learn about this? What am I trying to find out?
  • Why did I include this particular excerpt? What does it represent and how?

As with the initial coding and line-by-line coding, your qualitative analysis can follow certain steps. The first two steps are code categorisation and theme identification.

Code categorisation

Categorisation is simply the process of reviewing everything you’ve coded and then creating code categories that can be used to guide your future analysis. In other words, it's about creating categories for your code set. Let's take a look at a practical example.

If you were discussing different types of animals, your initial codes may be “dogs”, “llamas”, and “lions”. In the process of categorisation, you could label (categorise) these three animals as “mammals”, whereas you could categorise “flies”, “crickets”, and “beetles” as “insects”. By creating these code categories, you will be making your data more organised, as well as enriching it so that you can see new connections between different groups of codes.
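
As a quick illustration, the Python sketch below groups a handful of coded extracts under the “mammals” and “insects” categories from the example. The extract IDs and variable names are hypothetical; the mapping from code to category is the part you'd decide yourself as the researcher.

```python
from collections import defaultdict

# Mapping from each code to its category, as decided by the researcher.
code_to_category = {
    "dogs": "mammals",
    "llamas": "mammals",
    "lions": "mammals",
    "flies": "insects",
    "crickets": "insects",
    "beetles": "insects",
}

# Hypothetical coded extracts: (extract id, code).
coded = [(1, "dogs"), (2, "flies"), (3, "llamas"), (4, "dogs"), (5, "beetles")]

# Group the extract ids under the category of their code.
by_category = defaultdict(list)
for extract_id, code in coded:
    by_category[code_to_category[code]].append(extract_id)

print(dict(by_category))
```

Once the extracts are grouped like this, it becomes much easier to read everything in one category side by side and spot connections between codes.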

From this categorisation, you can move onto the next step, which is to identify the themes in your data.

Theme identification

From the coding and categorisation processes, you'll naturally start noticing themes. Therefore, the logical next step is to identify and clearly articulate the themes in your data set. When you determine themes, you'll take what you've learned from the coding and categorisation and group it all together to develop themes. This is the part of the coding process where you'll try to draw meaning from your data, and start to produce a narrative. The nature of this narrative depends on your research aims and objectives, as well as your research questions (sound familiar?) and the qualitative data analysis method you've chosen, so keep these factors front of mind as you scan for themes.

Tips & tricks for quality coding

Before we wrap up, let’s quickly look at some general advice, tips and suggestions to ensure your qualitative data coding is top-notch.

  • Before you begin coding, plan out the steps you will take and the coding approach and technique(s) you will follow to avoid inconsistencies.
  • When adopting deductive coding, it's useful to use a codebook from the start of the coding process. This will keep your work organised and will ensure that you don’t forget any of your codes.
  • Whether you're adopting an inductive or deductive approach, keep track of the meanings of your codes and remember to revisit these as you go along.
  • Avoid using synonyms for codes that are similar, if not the same. This will allow you to have a more uniform and accurate coded dataset and will also help you to not get overwhelmed by your data.
  • While coding, make sure that you remind yourself of your aims and coding method. This will help you to avoid directional drift, which happens when coding is not kept consistent.
  • If you are working in a team, make sure that everyone has been trained and understands how codes need to be assigned.
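
A couple of the tips above (using a codebook, keeping track of meanings and avoiding synonyms) can be supported with a very simple structure. The Python sketch below is one hypothetical way to do it: each code has a working definition and a list of known synonyms, and a small helper maps any label back to its canonical code. The entries and the `canonical_code` name are assumptions for illustration only.

```python
# A small codebook: each code has a working definition and known synonyms.
# All entries are hypothetical examples.
codebook = {
    "beverages": {
        "definition": "Any mention of something the participant drinks.",
        "synonyms": {"drinks", "refreshments"},
    },
    "pets": {
        "definition": "Animals the participant keeps at home.",
        "synonyms": {"companion animals"},
    },
}


def canonical_code(label, codebook):
    """Map a label to its canonical code, or raise if it is not in the codebook."""
    label = label.strip().lower()
    for code, entry in codebook.items():
        if label == code or label in entry["synonyms"]:
            return code
    raise KeyError(f"'{label}' is not in the codebook; define it before using it")


print(canonical_code("Drinks", codebook))
```

Raising an error for unknown labels is a deliberate choice here: it nudges you (or a teammate) to write down a definition before a new code starts appearing in the data.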

Thanks for reading this post. We hope that you have a better understanding of the qualitative data coding process and that you’re feeling more confident about getting started. Good luck!
