Sets allow you to examine lots of data while ignoring duplicates. For example, if you wanted to see how many unique words appear in the book Moby-Dick, you could write code such as the following:
Set<String> words = new HashSet<String>();
Scanner in = new Scanner(new File("mobydick.txt"));
while (in.hasNext()) {
    String word = in.next();
    word = word.toLowerCase();
    words.add(word);
}
System.out.println("Number of unique words = " + words.size());
This code produces the following output when run on the text of Moby-Dick (available from http://www.gutenberg.org):
Number of unique words = 30368
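The same counting idea can be tried without a file on disk. The following is a minimal sketch, assuming the text is already in a String; the class name UniqueWordCounter, the method countUniqueWords, and the sample sentence are hypothetical and not part of the original example.

```java
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

// Hypothetical helper class for illustration; not from the original text.
class UniqueWordCounter {
    // Returns the number of distinct whitespace-separated tokens in the
    // given text, ignoring differences in letter case.
    static int countUniqueWords(String text) {
        Set<String> words = new HashSet<>();
        Scanner in = new Scanner(text);
        while (in.hasNext()) {
            // add() silently ignores a word that is already in the set
            words.add(in.next().toLowerCase());
        }
        in.close();
        return words.size();
    }

    public static void main(String[] args) {
        String sample = "Call me Ishmael call me";
        // "call" and "me" each appear twice but are counted once
        System.out.println("Number of unique words = " + countUniqueWords(sample));
    }
}
```

Because Scanner splits on whitespace only, punctuation stays attached to each token, so "whale" and "whale," would count as two different words in this sketch.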
The HashSet class has a convenient constructor that accepts another collection as a parameter and puts all the unique elements from that collection into the Set. One clever use of this constructor is to find out whether a List contains any duplicates. To do so, simply construct a HashSet from the list and see whether the sizes differ:
// returns true if the given list contains any duplicate elements
public static boolean hasDuplicates(List<Integer> list) {
    Set<Integer> set = new HashSet<Integer>(list);
    return set.size() < list.size();
}
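The size comparison is not specific to integers. As a sketch under that assumption, the check can be written generically so that it accepts a collection of any element type; the class name DuplicateCheck and the sample lists are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class for illustration; not from the original text.
class DuplicateCheck {
    // Returns true if the given collection contains any duplicate elements.
    // The HashSet constructor keeps one copy of each distinct element, so
    // the set is smaller than the collection exactly when something repeats.
    static <T> boolean hasDuplicates(Collection<T> items) {
        Set<T> unique = new HashSet<>(items);
        return unique.size() < items.size();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(4, 8, 15, 8);
        List<String> letters = Arrays.asList("a", "b", "c");
        System.out.println(hasDuplicates(numbers)); // true, because 8 repeats
        System.out.println(hasDuplicates(letters)); // false
    }
}
```

A HashSet decides whether two elements are the same using their equals and hashCode methods, so this check is only meaningful for element types that define both consistently.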
One drawback of a Set is that it doesn't store elements by index. The following loop doesn't compile on a Set, because a Set has no get method:
// this code does not compile
for (int i = 0; i < words.size(); i++) {
    String word = words.get(i); // error -- no get method
    System.out.println(word);
}
Instead, if you want to loop over the elements of a Set, you must use an iterator, either explicitly or through a for-each loop, which uses one behind the scenes. Like other collections, a Set has an iterator method that creates an Iterator object to examine its elements. You can then use the familiar hasNext/next loop to examine each element:
// this code works correctly
Iterator<String> itr = words.iterator();
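A self-contained sketch of the hasNext/next loop over a set might look like the following. The class name SetIteration, the method longestWord, and the sample words are hypothetical and chosen only to give the loop something to do.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical class for illustration; not from the original text.
class SetIteration {
    // Visits every element of the set with an iterator and returns the
    // longest word found (the empty string if the set is empty).
    static String longestWord(Set<String> words) {
        String longest = "";
        Iterator<String> itr = words.iterator();
        while (itr.hasNext()) {
            String word = itr.next();
            if (word.length() > longest.length()) {
                longest = word;
            }
        }
        return longest;
    }

    public static void main(String[] args) {
        Set<String> words = new HashSet<>();
        words.add("whale");
        words.add("sea");
        words.add("harpoon");

        // Print each element; a HashSet does not guarantee any order.
        Iterator<String> itr = words.iterator();
        while (itr.hasNext()) {
            System.out.println(itr.next());
        }

        System.out.println("Longest word = " + longestWord(words));
    }
}
```

The order in which a HashSet hands back its elements is unspecified, so the three words may print in any order, while the longest word is the same regardless of order.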