Let’s go over a few idiomatic ways to remove duplicates from lists in Python.
Method #1 - Create a new list (simplest) 🔗
This is the easiest algorithm to code, but because it builds a second list, it also uses more memory and is a bit slower.
```python
def remove_duplicates(original):
    deduped = []
    for item in original:
        if item not in deduped:
            deduped.append(item)
    return deduped
```
We take advantage of Python’s in keyword here, only adding each item to the final list if it isn’t already present.
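For example, walking the loop by hand with a small input shows that each value is kept only the first time it appears, in its original order:

```python
original = [1, 2, 2, 3, 1]
deduped = []
for item in original:
    # only keep the first occurrence of each value
    if item not in deduped:
        deduped.append(item)

print(deduped)  # [1, 2, 3]
```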
Method #2 - Create a new list with syntactic sugar (less code, harder to understand) 🔗
```python
def remove_duplicates(original):
    deduped = []
    [deduped.append(item) for item in original if item not in deduped]
    return deduped
```
Performance-wise this is identical to Method #1; it just squeezes the loop into a single line. If you're into code golf, this might be your solution.
Method #3 - Use the built-in “set” data structure (fast, loses order) 🔗
A set is a collection of values that contains no duplicates. By converting a list into a set and back, you remove all duplicates. The main drawback is that you lose your ordering.
```python
def remove_duplicates(original):
    return list(set(original))
```
This method will usually be faster than the previous two because each conversion is O(n) in big-O notation terms. Two O(n) operations in a row are still faster than one O(n^2) operation, and the first two methods are O(n^2) because the `item not in deduped` check scans the growing list on every iteration. As a bonus, it even uses less code.
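If you want to see the gap yourself, here's a rough benchmark sketch using the standard library's `timeit`; the exact numbers will vary by machine, but the set version should win comfortably on inputs this size:

```python
import timeit

data = list(range(1000)) * 2  # 2000 items, 1000 unique

def dedupe_loop(original):
    # O(n^2): membership check scans the list each time
    deduped = []
    for item in original:
        if item not in deduped:
            deduped.append(item)
    return deduped

def dedupe_set(original):
    # O(n): a set membership check is constant time on average
    return list(set(original))

loop_time = timeit.timeit(lambda: dedupe_loop(data), number=10)
set_time = timeit.timeit(lambda: dedupe_set(data), number=10)
print(f"loop: {loop_time:.4f}s  set: {set_time:.4f}s")
```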
Method #4 - Use an ordered dictionary (fast, maintains order) 🔗
```python
from collections import OrderedDict

def remove_duplicates(original):
    return list(OrderedDict.fromkeys(original))
```
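Since Python 3.7, regular dictionaries also preserve insertion order, so a plain `dict` works just as well here; this is a minor variation on the method above, not a separate technique:

```python
def remove_duplicates(original):
    # dict.fromkeys keeps the first occurrence of each key, in insertion order
    return list(dict.fromkeys(original))

print(remove_duplicates([3, 1, 3, 2, 1]))  # [3, 1, 2]
```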