Arrays and Strings - Beginning Java

3. The third phase simply outputs the contents of the array by displaying each element in turn, using

a collection-based for loop. The String variable s, defined in the loop, references each string in the

array in turn. You display each string by passing s as the argument to the println() method.
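
As a minimal sketch of this output phase, the following class visits each element of an array with a collection-based for loop. The class name ShowTokens, the helper method listing(), and the sample array are hypothetical names chosen for illustration; they are not part of the working example.

```java
class ShowTokens {
    // Builds a listing with one token per line, visiting each element in turn.
    // (Hypothetical helper, included so the loop can be checked without reading the console.)
    static String listing(String[] tokens) {
        StringBuilder out = new StringBuilder();
        for (String s : tokens) {          // s references each string in the array in turn
            out.append(s).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] tokens = {"to", "be", "or", "not"};   // assumed sample data
        for (String s : tokens) {
            System.out.println(s);         // display each string by passing s to println()
        }
    }
}
```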

What you have been doing here is breaking a string up into tokens (substrings, in other words) that

are separated by delimiters, the characters that separate one token from the next. This is such a

frequent requirement that Java provides you with an easier way to do it: the split() method

in the String class.

Tokenizing a String

The split() method in the String class is specifically for splitting a string into tokens. It does this in a

single step, returning all the tokens from a string as an array of String objects. To do this it makes use of a

facility called regular expressions, which I discuss in detail in Chapter 15. However, you can still make use

of the split() method without knowing how regular expressions work, so I ignore this aspect here.

Just keep the split() method in mind when you get to Chapter 15.

The version of the split() method that I describe here expects two arguments. The first is a String object that

specifies a pattern for a delimiter. Anything in the string that matches the pattern is treated as a separator

between tokens. Here I talk only about patterns that are simply a set of possible delimiter characters in the string.

You see in Chapter 15 that the pattern can be much more sophisticated than this. The second argument to the split()

method is an integer limit that controls how many times the pattern is applied and, therefore, the maximum number of

tokens that can be found. If the limit is a positive value n, the pattern is applied at most n - 1 times, so the array

contains at most n tokens and the last token holds whatever remains of the string. If you specify the second argument as

zero, the pattern is applied as many times as possible and any trailing empty tokens are discarded. Trailing empty tokens

can arise when the string being analyzed ends with several delimiters. If you specify the limit as a negative integer,

the pattern is also applied as many times as possible, but trailing empty tokens are retained and returned. As

I said earlier, the tokens found by the method are returned in an array of type String[].
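
The effect of the limit argument can be sketched as follows. The class name SplitLimitDemo, the helper show(), and the sample string are assumptions made for illustration; the sample deliberately ends with two delimiters so that trailing empty tokens occur.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Splits text at each comma using the given limit and formats the resulting array.
    // (Hypothetical helper for this sketch.)
    static String show(String text, int limit) {
        return Arrays.toString(text.split("[,]", limit));
    }

    public static void main(String[] args) {
        String data = "a,b,c,,";                // assumed sample ending with two delimiters
        System.out.println(show(data, 0));      // [a, b, c]       trailing empty tokens discarded
        System.out.println(show(data, -1));     // [a, b, c, , ]   trailing empty tokens retained
        System.out.println(show(data, 2));      // [a, b,c,,]      pattern applied once, two tokens
    }
}
```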

The key to tokenizing a string is providing the appropriate pattern defining the set of possible delimiters.

At its simplest, a pattern can be a string containing a sequence of characters, each of which is a delimiter.

You must specify the set of delimiters in the string between square brackets. This is necessary to distinguish

a simple set of delimiter characters from more complex patterns. Examples are the string "[abc]", which defines

'a', 'b', and 'c' as delimiters, and "[, .:;]", which specifies a comma, a space, a period, a colon, or a semicolon

as delimiters. There are many more powerful ways of defining a pattern, but I defer discussing them until

Chapter 15.
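
To illustrate the two patterns just mentioned, here is a small sketch that applies each of them to a sample string. The class name DelimiterSetDemo, the helper show(), and both sample strings are hypothetical and chosen only to make the delimiters easy to spot.

```java
import java.util.Arrays;

class DelimiterSetDemo {
    // Splits text with the given delimiter-set pattern (limit 0) and formats the tokens.
    // (Hypothetical helper for this sketch.)
    static String show(String text, String pattern) {
        return Arrays.toString(text.split(pattern, 0));
    }

    public static void main(String[] args) {
        // 'a', 'b', and 'c' each act as a delimiter.
        System.out.println(show("xaybzc", "[abc]"));                         // [x, y, z]
        // A comma, space, period, colon, or semicolon each acts as a delimiter.
        System.out.println(show("one,two.three four:five;six", "[, .:;]")); // [one, two, three, four, five, six]
    }
}
```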

To see how the split() method works, consider the following code fragment:

String text = "to be or not to be, that is the question.";

String[] words = text.split("[, .]", 0); // Delimiters are comma, space, or period

The first statement defines the string to be analyzed and split into tokens. The second statement calls the

split() method for the text object to tokenize the string. The first argument to the method specifies a

comma, a space, or a period as possible delimiters. The second argument specifies the limit on the number of

applications of the delimiter pattern as zero, so it is applied as many times as necessary to tokenize the entire

string. The split() method returns a reference to an array of strings, and that reference is stored in the words variable.

In case you hadn't noticed, these two lines of code do the same thing as most of the code in main() in the

previous working example!
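
The fragment can be placed in a complete program along the following lines. This is a sketch only: the class name TokenizeQuote and the helper tokenize() are hypothetical. One detail is worth noticing. Because each delimiter character is matched separately, the comma followed by a space after the second "be" produces an empty token between them, so the array holds 11 elements, one of which is an empty string. The sketch simply skips empty tokens when displaying the words.

```java
class TokenizeQuote {
    // Splits text at each comma, space, or period; trailing empty tokens are discarded.
    // (Hypothetical helper wrapping the split() call from the fragment.)
    static String[] tokenize(String text) {
        return text.split("[, .]", 0);
    }

    public static void main(String[] args) {
        String text = "to be or not to be, that is the question.";
        String[] words = tokenize(text);
        for (String word : words) {
            if (!word.isEmpty()) {        // skip the empty token between "," and the space after it
                System.out.println(word);
            }
        }
    }
}
```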
