Can You Remove Outliers From Your Dataset?

by | Apr 6, 2026


🎯 The Short Answer: Yes, you can remove outliers, but only using a standardized, mathematically defensible method like the IQR (interquartile range) test. Always document and cite your approach to maintain transparency and credibility.

One of the most common concerns we hear from postgraduate researchers is whether it’s okay to remove statistical outliers from their dataset, and more importantly, how to do it without looking like they’re manipulating their results. It’s a legitimate worry. After all, removing data can feel suspicious, even if it’s done for all the right reasons. The good news? You absolutely can remove outliers, but you need to do it the right way.

πŸ” Why Outliers Matter in Your Research

Outliers are data points that sit far outside the normal range of your dataset. They can occur for legitimate reasons (like a measurement error or an unusual case that doesn’t represent your population) or sometimes they’re just part of natural variation. The problem is that outliers can skew your statistical results and make your findings less accurate. However, simply eyeballing your data and deciding “that looks weird, I’ll remove it” is never acceptable. Your examiners will spot this immediately, and it raises serious questions about the integrity of your research.

The key is using a standardized, mathematically justified method that you can defend in your methodology section.

πŸ“ The IQR Test

One of the most widely accepted methods for identifying outliers is the Interquartile Range (IQR) test, also known as the Tukey outlier test (or Tukey’s fences). This method is established, commonly used across disciplines, and straightforward to apply. Here’s how it works:

  1. Calculate the interquartile range (the difference between your third quartile and your first quartile) and multiply it by 1.5.
  2. Add that value to your third quartile.
  3. Any data point above that threshold is mathematically classified as an outlier.

The same process works in reverse at the lower end: subtract the same value from your first quartile, and any data point below that threshold is also classified as an outlier. This isn’t arbitrary or subjective; it’s a formula you can cite and defend.
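The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function names (`iqr_fences`, `split_outliers`) and the sample `scores` are made up for the example, and it assumes the "inclusive" quartile convention from Python's standard `statistics` module. Statistical packages use slightly different quartile conventions, so your thresholds may differ a little; report whichever one you used.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return (lower_threshold, upper_threshold) for the k * IQR rule."""
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3.
    # The "inclusive" method is an assumption; other conventions exist.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(values, k=1.5):
    """Split values into (kept, outliers) using the IQR thresholds."""
    lower, upper = iqr_fences(values, k)
    kept = [v for v in values if lower <= v <= upper]
    outliers = [v for v in values if v < lower or v > upper]
    return kept, outliers


# Hypothetical sample data, for illustration only.
scores = [12, 14, 15, 15, 16, 17, 18, 19, 20, 58]
kept, outliers = split_outliers(scores)
print(outliers)  # [58]
```

Here Q1 is 15 and Q3 is 18.75, so the IQR is 3.75 and the thresholds are 9.375 and 24.375. Only the value 58 falls outside them.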

There are plenty of tutorials online and in statistics textbooks that explain this method step-by-step. The beauty of using the IQR test is that it’s transparent, reproducible, and recognized across academic fields. When you use this method, you’re not making a judgment call; you’re applying a standardized statistical procedure.

✍️ Document and Explain Every Removal

Here’s where many students slip up: they remove outliers during data cleaning but then don’t mention it in their methodology. This is a huge red flag. Even if you use a perfectly legitimate method like the IQR test, failing to disclose that you removed data introduces uncertainty about your dataset and makes your research look less trustworthy. We often see this issue come up in our coaching sessions, and it’s easily preventable with clear documentation.

In your methodology section, you need to be explicit about what you did. Write something like:

I identified outliers using the Interquartile Range test as described by [cite a relevant source]. Using this method, I removed X data points from the original dataset of Y observations. The removed cases were [briefly describe what made them outliers].

This transparency actually strengthens your research because it shows you’ve thought carefully about your data quality and you’re not hiding anything.
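If you clean your data in code, you can generate the counts for that statement directly from the data rather than tallying them by hand. The sketch below is purely illustrative: `removal_statement` is a hypothetical helper, the two lists are made-up data, and the citation placeholder is left for you to fill in with a real source.

```python
def removal_statement(original, kept, source="[cite a relevant source]"):
    """Build a methodology sentence reporting how many points were removed."""
    removed = len(original) - len(kept)
    noun = "data point" if removed == 1 else "data points"
    return (
        "I identified outliers using the Interquartile Range test "
        f"as described by {source}. Using this method, I removed "
        f"{removed} {noun} from the original dataset of "
        f"{len(original)} observations."
    )


# Hypothetical data: ten original observations, one removed as an outlier.
original_scores = [12, 14, 15, 15, 16, 17, 18, 19, 20, 58]
cleaned_scores = [12, 14, 15, 15, 16, 17, 18, 19, 20]
statement = removal_statement(original_scores, cleaned_scores)
print(statement)
```

You would still add, in your own words, a brief description of what made the removed cases outliers.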

🎯 Cite Your Method and Check Your Field’s Norms

Different academic disciplines sometimes have slightly different conventions for handling outliers, so it’s worth checking what’s standard in your specific field. Some disciplines are stricter than others, and your supervisor will expect you to follow disciplinary norms. Once you’ve identified the appropriate method for your field, cite it properly. This might be a reference to a statistics textbook, a methodological paper, or guidance from your disciplinary association. The citation shows that you’re not inventing a new approach; you’re following established practice.

When you cite your outlier removal method, you’re essentially saying to your examiners: “This is how researchers in my field handle this situation, and I’ve applied that standard approach.” It’s a powerful statement because it demonstrates both competence and integrity. You’re not trying to hide anything; you’re following best practices.

βš–οΈ Transparency Is Your Best Defense

The underlying principle here is simple: transparency prevents suspicion. If you’re upfront about what you did, why you did it, and how you did it, your examiners will trust your work. They understand that data cleaning is a normal part of research. What they won’t tolerate is the appearance of data manipulation or hidden decisions that could bias your results. By using a standardized method, documenting it clearly, and citing your approach, you’re demonstrating that you’ve handled your data responsibly.

📌 Key Takeaways

  • Use the IQR test, not subjective judgment, to identify outliers.
  • Always document and explain outlier removal in your methodology section.
  • Cite your method to show you’re following established best practices in your discipline.
  • Transparency about data decisions builds credibility and prevents suspicion.
  • Check your field’s norms, as outlier handling conventions may vary by discipline.

