A Set is a Collection that cannot contain duplicate elements. It models the mathematical set abstraction. In Java, the java.util.Set interface defines a collection where the add() method will return false if you attempt to insert an element that is already present.
Unlike a List, a Set has a very specific set of behaviors that dictate how you interact with your data:
If e1.equals(e2), you can only have one of them in the set.
A Set can hold at most one null element (except TreeSet, which rejects null under natural ordering).
There are three primary flavors of Sets in Java. Your choice depends largely on whether you care about the order of your data.
HashSet: Speed King. Uses hashing. No guarantee of order; the order can even change over time. $O(1)$ average performance.
LinkedHashSet: Order Keeper. Maintains the order in which items were inserted while still ensuring uniqueness. $O(1)$ average performance.
TreeSet: The Sorter. Elements are stored in a Red-Black tree and are always kept in sorted order. $O(\log n)$ performance.
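To make the ordering differences concrete, here is a minimal sketch that loads the same input into a HashSet, a LinkedHashSet, and a TreeSet and reports the iteration order of each. The class name SetOrderDemo, the helper iterationOrder, and the sample words are illustrative assumptions, not part of any library.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderDemo {
    // Hypothetical helper: copies the input into the given (empty) set
    // and returns the elements in that set's iteration order.
    static List<String> iterationOrder(Set<String> target, List<String> input) {
        target.addAll(input);
        return List.copyOf(target);
    }

    public static void main(String[] args) {
        // "apple" appears twice; every Set keeps only one copy.
        List<String> input = List.of("pear", "apple", "fig", "apple");

        // HashSet: order is unspecified, so no expected output is shown.
        System.out.println(iterationOrder(new HashSet<>(), input));
        // LinkedHashSet: insertion order.
        System.out.println(iterationOrder(new LinkedHashSet<>(), input)); // [pear, apple, fig]
        // TreeSet: natural (alphabetical) order.
        System.out.println(iterationOrder(new TreeSet<>(), input));       // [apple, fig, pear]
    }
}
```

The HashSet line may happen to print in a stable order on a given JVM, but code should not rely on it.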
This is the most critical technical concept. When you call set.add(obj) on a hash-based Set, Java doesn't just look at the object; it follows a two-step process to check for duplicates:
1. It calls hashCode() to find the bucket where the object would be stored.
2. It calls equals() to see if the new object is equal to any of the ones already in that bucket.
Rule: If you override equals(), you must override hashCode(). If you don't, your Set can fail to detect duplicates, leading to bugs that are incredibly hard to find.
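The rule can be illustrated with a short sketch. Point and BrokenPoint are hypothetical classes invented for this example: the first overrides both methods, the second overrides only equals() and inherits hashCode() from Object.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashContractDemo {
    // Hypothetical value class that honors the contract.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        // Equal points must produce equal hash codes.
        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    // Hypothetical broken class: equals() is overridden, hashCode() is not.
    static final class BrokenPoint {
        final int x, y;
        BrokenPoint(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof BrokenPoint)) return false;
            BrokenPoint p = (BrokenPoint) o;
            return x == p.x && y == p.y;
        }
    }

    static int distinctPoints() {
        Set<Point> set = new HashSet<>();
        set.add(new Point(1, 2));
        set.add(new Point(1, 2)); // same bucket, equals() is true: rejected
        return set.size();
    }

    static int distinctBrokenPoints() {
        Set<BrokenPoint> set = new HashSet<>();
        set.add(new BrokenPoint(1, 2));
        set.add(new BrokenPoint(1, 2)); // identity hash codes almost always differ
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctPoints());       // 1
        System.out.println(distinctBrokenPoints()); // usually 2: the duplicate slips in
    }
}
```

The second result is "usually 2" rather than "always 2" because identity hash codes can collide in principle, which is part of what makes this bug hard to reproduce.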
Sets are powerful for mathematical operations like Union (addAll), Intersection (retainAll), and Difference (removeAll). This example shows how to use a Set to find unique tags and perform a union.
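A minimal sketch of these operations follows. The tag values and the TagSetOps class are assumptions made up for illustration, and TreeSet is used only so the printed order is predictable; any Set implementation supports the same calls. Each helper copies its first argument so the inputs are not modified.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TagSetOps {
    // Union: every tag that appears in a or in b.
    static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // Intersection: only the tags present in both a and b.
    static Set<String> intersection(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Difference: tags in a that are not in b.
    static Set<String> difference(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical tag lists; the repeated "java" collapses when copied into a Set.
        Set<String> postA = new TreeSet<>(List.of("java", "sets", "java", "collections"));
        Set<String> postB = new TreeSet<>(List.of("java", "streams"));

        System.out.println(postA);                      // [collections, java, sets]
        System.out.println(union(postA, postB));        // [collections, java, sets, streams]
        System.out.println(intersection(postA, postB)); // [java]
        System.out.println(difference(postA, postB));   // [collections, sets]
    }
}
```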
If you need your unique items to stay sorted, TreeSet is your implementation. It implements NavigableSet, which provides powerful "search" methods based on value.
TreeSet<Integer> scores = new TreeSet<>(List.of(10, 50, 80, 100));
scores.lower(80);  // Returns 50 (greatest element strictly less than 80)
scores.higher(80); // Returns 100 (least element strictly greater than 80)
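Beyond lower() and higher(), NavigableSet offers inclusive lookups and range views. The sketch below reuses the same four scores; the class name NavigableSetDemo and the scores() helper are illustrative assumptions.

```java
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

class NavigableSetDemo {
    // Same sample data as the snippet above.
    static NavigableSet<Integer> scores() {
        return new TreeSet<>(List.of(10, 50, 80, 100));
    }

    public static void main(String[] args) {
        NavigableSet<Integer> scores = scores();

        System.out.println(scores.floor(80));       // 80  (greatest element <= 80)
        System.out.println(scores.ceiling(85));     // 100 (least element >= 85)
        System.out.println(scores.lower(10));       // null (nothing is below 10)
        System.out.println(scores.first());         // 10
        System.out.println(scores.last());          // 100
        System.out.println(scores.headSet(80));     // [10, 50]   (strictly less than 80)
        System.out.println(scores.tailSet(80));     // [80, 100]  (greater than or equal to 80)
        System.out.println(scores.descendingSet()); // [100, 80, 50, 10]
    }
}
```

Note that the lookup methods return null rather than throwing when no matching element exists, so callers may need a null check.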
Because HashSet is backed by a HashMap, add(), remove(), and contains() run in $O(1)$ time on average, assuming a reasonably distributed hashCode(). This is significantly faster than a List's $O(n)$ scan for checking if an item exists. If you have a collection of 1,000,000 items and you need to check if "Item X" exists, use a Set: a List may have to compare against every element, while a HashSet typically inspects a single bucket.
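The gap can be observed with a rough timing sketch like the one below. It is not a rigorous benchmark: JIT warm-up and garbage collection distort single measurements, and a harness such as JMH would be needed for trustworthy numbers. LookupDemo, range, and timeLookups are hypothetical names, and the probe value -1 is chosen because it is absent, which is the worst case for a List.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class LookupDemo {
    // Builds the list [0, 1, ..., n-1].
    static List<Integer> range(int n) {
        List<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) list.add(i);
        return list;
    }

    // Returns the nanoseconds spent calling contains(probe) `repeats` times.
    static long timeLookups(Collection<Integer> c, Integer probe, int repeats) {
        int hits = 0;
        long start = System.nanoTime();
        for (int i = 0; i < repeats; i++) {
            if (c.contains(probe)) hits++;
        }
        long elapsed = System.nanoTime() - start;
        // Using `hits` keeps the loop from being trivially optimized away.
        if (hits != 0 && hits != repeats) throw new IllegalStateException("inconsistent lookups");
        return elapsed;
    }

    public static void main(String[] args) {
        List<Integer> list = range(1_000_000);
        Set<Integer> set = new HashSet<>(list);
        Integer missing = -1; // absent, so the List must scan every element

        System.out.println("List: " + timeLookups(list, missing, 100) + " ns");
        System.out.println("Set:  " + timeLookups(set, missing, 100) + " ns");
    }
}
```

Exact timings vary by machine and JVM, so none are shown here; the expectation is only that the Set figure is far smaller than the List figure.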
If you are creating a set of Enum values, Java provides a highly optimized class called EnumSet. It is represented internally as a bit vector, so it is typically faster than HashSet and uses very little memory. Prefer EnumSet whenever the elements are enum constants.
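As a brief sketch of the EnumSet factory methods, the example below uses a hypothetical Permission enum; the enum, its constants, and the class name EnumSetDemo are assumptions for illustration only.

```java
import java.util.EnumSet;

class EnumSetDemo {
    // Hypothetical enum, declared in the order EnumSet will iterate it.
    enum Permission { READ, WRITE, EXECUTE, DELETE }

    static EnumSet<Permission> editor() {
        return EnumSet.of(Permission.READ, Permission.WRITE);
    }

    public static void main(String[] args) {
        EnumSet<Permission> editor = editor();

        // Every constant of the enum.
        System.out.println(EnumSet.allOf(Permission.class));   // [READ, WRITE, EXECUTE, DELETE]
        // Everything the editor does NOT have.
        System.out.println(EnumSet.complementOf(editor));      // [EXECUTE, DELETE]
        // A contiguous range of constants, both ends inclusive.
        System.out.println(EnumSet.range(Permission.READ, Permission.EXECUTE)); // [READ, WRITE, EXECUTE]
        // An empty set to fill in later.
        EnumSet<Permission> none = EnumSet.noneOf(Permission.class);
        none.add(Permission.DELETE);
        System.out.println(none);                               // [DELETE]
    }
}
```

EnumSet always iterates in the enum's declaration order, regardless of the order in which elements were added.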
Q: What happens if you add a duplicate element to a Set?
A: The add() method simply returns false and the set remains unchanged. No exception is thrown.
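One practical use of that boolean return value is spotting duplicates while scanning a list. The following is a minimal sketch; DuplicateFinder and duplicatesOf are hypothetical names, not library APIs.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DuplicateFinder {
    // Returns the elements that occur more than once, in order of first repeat.
    static <T> Set<T> duplicatesOf(List<T> items) {
        Set<T> seen = new HashSet<>();
        Set<T> duplicates = new LinkedHashSet<>();
        for (T item : items) {
            // add() returns false when the element was already present.
            if (!seen.add(item)) {
                duplicates.add(item);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        Set<String> roles = new HashSet<>();
        System.out.println(roles.add("admin")); // true  (newly inserted)
        System.out.println(roles.add("admin")); // false (already present, no exception)
        System.out.println(roles.size());       // 1

        System.out.println(duplicatesOf(List.of("a", "b", "a", "c", "b", "a"))); // [a, b]
    }
}
```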
Q: Why does TreeSet not allow null elements?
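A: TreeSet uses compareTo() or a Comparator to sort elements. With natural ordering, comparing an element to null throws a NullPointerException; null is accepted only if you supply a Comparator that explicitly handles it.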
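Both behaviors can be checked with a small sketch. The class and method names are illustrative assumptions; Comparator.nullsFirst and Comparator.naturalOrder are standard JDK methods.

```java
import java.util.Comparator;
import java.util.TreeSet;

class TreeSetNullDemo {
    // With natural ordering, adding null fails.
    static boolean naturalOrderRejectsNull() {
        TreeSet<String> set = new TreeSet<>();
        set.add("a");
        try {
            set.add(null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // With a null-tolerant Comparator, null is accepted and sorts first.
    static boolean nullsFirstAcceptsNull() {
        TreeSet<String> set =
                new TreeSet<>(Comparator.nullsFirst(Comparator.<String>naturalOrder()));
        set.add("b");
        set.add(null);
        set.add("a");
        return set.size() == 3 && set.first() == null;
    }

    public static void main(String[] args) {
        System.out.println(naturalOrderRejectsNull()); // true
        System.out.println(nullsFirstAcceptsNull());   // true
    }
}
```

Allowing null in a sorted set is rarely a good design, so the second variant is shown mainly to explain where the restriction comes from.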
Q: How do you convert a List with duplicates into a unique List?
A: Pass the List into a HashSet constructor, then pass that set into an ArrayList constructor (swap in LinkedHashSet if the original order must be preserved):
List<String> unique = new ArrayList<>(new HashSet<>(listWithDuplicates));
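Because HashSet does not promise any iteration order, the one-liner above may reshuffle the elements. If the first-occurrence order matters, a sketch along the following lines keeps it, using either LinkedHashSet or Stream.distinct(). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.stream.Collectors;

class DedupDemo {
    // Removes duplicates while keeping each element's first position.
    static <T> List<T> dedupKeepOrder(List<T> listWithDuplicates) {
        return new ArrayList<>(new LinkedHashSet<>(listWithDuplicates));
    }

    // Equivalent result using the Stream API; distinct() preserves
    // encounter order for ordered streams such as those from a List.
    static <T> List<T> dedupWithStream(List<T> listWithDuplicates) {
        return listWithDuplicates.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = List.of("b", "a", "b", "c", "a");
        System.out.println(dedupKeepOrder(input));  // [b, a, c]
        System.out.println(dedupWithStream(input)); // [b, a, c]
    }
}
```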
The Set Interface is your gatekeeper. It ensures data integrity by preventing duplicates and offers blazing-fast lookup speeds. In enterprise applications, Sets are indispensable for managing relationships, permissions, and distinct datasets. Choose HashSet for speed, LinkedHashSet for order, and TreeSet for sorting.