Java Reference
In-Depth Information
3. The third phase simply outputs the contents of the array by displaying each element in turn, using
a collection-based for loop. The String variable, s , defined in the loop references each string in the
array in turn. You display each string by passing s as the argument to the println() method.
What you have been doing here is breaking a string up into tokens (substrings, in other words) that
are separated by delimiters, which are characters that separate one token from the next. This is such a
frequent requirement that Java provides you with an easier way to do this: using the split() method
in the String class.
Tokenizing a String
The split() method in the String class is specifically for splitting a string into tokens. It does this in a
single step, returning all the tokens from a string as an array of String objects. To do this it makes use of a
facility called regular expressions, which I discuss in detail in Chapter 15. However, you can still make use
of the split() method without knowing how regular expressions work, so I ignore this aspect here.
Just keep the split() method in mind when you get to Chapter 15.
The version of the split() method that I discuss here expects two arguments. The first is a String object
that specifies a pattern for a delimiter. Any substring that matches the pattern is treated as a separator
between tokens. Here I talk only about patterns that are simply a set of possible delimiter characters in the
string. You see in Chapter 15 that the pattern can be much more sophisticated than this. The second argument
to the split() method is an integer limit that controls how many times the pattern is applied to find tokens
and, therefore, the maximum number of tokens that can be found. If you specify the limit as a positive
value n, the pattern is applied at most n - 1 times, so the array returned has at most n elements and the last
element contains whatever remains of the string. If you specify the limit as zero, the pattern is applied as
many times as possible and any trailing empty tokens are discarded. Trailing empty tokens can arise if the
string being analyzed ends with several delimiters. If you specify the limit as a negative integer, the pattern
is also applied as many times as possible, but trailing empty tokens are retained and returned. As I said
earlier, the tokens found by the method are returned in an array of type String[] .
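To make the effect of the second argument concrete, here is a minimal sketch that splits the same string with a limit of zero, a negative limit, and a positive limit. The class name SplitLimitDemo and the helper method splitOnComma() are hypothetical names I chose for illustration. The sample string is also my own assumption, picked because it ends with two delimiters.

```java
import java.util.Arrays;

public class SplitLimitDemo {
    // Hypothetical helper: splits text at each comma using the given limit
    static String[] splitOnComma(String text, int limit) {
        return text.split("[,]", limit);
    }

    public static void main(String[] args) {
        String text = "a,b,c,,";   // Sample string ending with two delimiters

        // Limit of zero: trailing empty tokens are discarded
        System.out.println(Arrays.toString(splitOnComma(text, 0)));    // [a, b, c]

        // Negative limit: trailing empty tokens are retained
        System.out.println(Arrays.toString(splitOnComma(text, -1)));   // [a, b, c, , ]

        // Limit of 2: at most two tokens, the last holding the rest of the string
        System.out.println(Arrays.toString(splitOnComma(text, 2)));    // [a, b,c,,]
    }
}
```

With the negative limit the array has five elements, the last two being empty strings. With a limit of zero it has only three.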
The key to tokenizing a string is providing the appropriate pattern defining the set of possible delimiters.
At its simplest, a pattern can be a string containing a sequence of characters, each of which is a delimiter.
You must specify the set of delimiters in the string between square brackets. This is necessary to distinguish
a simple set of delimiter characters from more complex patterns. Examples are the string "[abc]" defining
'a' , 'b' , and 'c' as delimiters, or "[, .:;]" specifying a comma, a period, a space, a colon, or a semicolon
as delimiters. There are many more powerful ways of defining a pattern, but I defer discussing that until
Chapter 15.
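As a quick illustration of the two patterns just mentioned, the following sketch applies each of them to a short sample string. The class name DelimiterSetDemo, the two helper methods, and the sample strings are all hypothetical and chosen only for this example.

```java
import java.util.Arrays;

public class DelimiterSetDemo {
    // Hypothetical helper: 'a', 'b', and 'c' each act as a delimiter
    static String[] splitOnAbc(String text) {
        return text.split("[abc]", 0);
    }

    // Hypothetical helper: comma, space, period, colon, or semicolon as delimiter
    static String[] splitOnPunctuation(String text) {
        return text.split("[, .:;]", 0);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnAbc("1a2b3c4")));
        // [1, 2, 3, 4]

        System.out.println(Arrays.toString(splitOnPunctuation("red,green;blue:cyan")));
        // [red, green, blue, cyan]
    }
}
```

Each character between the square brackets is treated as a delimiter in its own right, so the order in which you list them makes no difference in these simple cases.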
To see how the split() method works, consider the following code fragment:
String text = "to be or not to be, that is the question.";
String[] words = text.split("[, .]", 0);   // Delimiters are comma, space, or period
The first statement defines the string to be analyzed and split into tokens. The second statement calls the
split() method for the text object to tokenize the string. The first argument to the method specifies a
comma, a space, or a period as possible delimiters. The second argument specifies the limit on the number of
applications of the delimiter pattern as zero, so it is applied as many times as necessary to tokenize the entire
string. The split() method returns a reference to an array of strings, and that reference is stored in the words variable.
In case you hadn't noticed, these two lines of code do the same thing as most of the code in main() in the
previous working example!
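If you want to try the fragment for yourself, here is a minimal sketch that wraps it in a complete program and lists the tokens. The class name TokenizeDemo and the helper method tokenize() are hypothetical names used only here. One detail is worth knowing: the comma and the space that follows it in "be, that" each count as a separate delimiter, so split() returns an empty string between them. This sketch skips empty tokens when displaying the result, which is my own choice rather than something the fragment above does.

```java
public class TokenizeDemo {
    // Hypothetical helper: tokenizes text using comma, space, or period as delimiters
    static String[] tokenize(String text) {
        return text.split("[, .]", 0);
    }

    public static void main(String[] args) {
        String text = "to be or not to be, that is the question.";
        String[] words = tokenize(text);

        // The array has 11 elements, one of which is an empty string
        System.out.println("Array length: " + words.length);

        // Display each token in turn, skipping any empty ones
        for (String s : words) {
            if (!s.isEmpty()) {
                System.out.println(s);
            }
        }
    }
}
```

The period at the end of the string would also produce an empty token, but because the limit is zero, that trailing empty token is discarded.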