A Collection of Useful Classes - Beginning Java

You saw in Chapter 4 that you could tokenize a string using the split() method for a String object. As I mentioned then, the split() method does this by applying a regular expression; in fact, the first argument to the method is interpreted as a regular expression. This is because the expression text.split(str, limit), where text is a String variable, is equivalent to the expression:

Pattern.compile(str).split(text, limit)

This means that you can apply all of the power of regular expressions to the identification of delimiters in the string. To demonstrate that this is the case, I will repeat the example from Chapter 4, but modify the first argument to the split() method so that only the words in the text are included in the set of tokens.
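
The equivalence between the two forms can be checked directly. The following is a minimal sketch, not part of the original example: the class name SplitEquivalence and the helper methods viaString() and viaPattern() are hypothetical names introduced here for illustration. It splits the same text both ways with a limit of 3 and compares the results.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitEquivalence {

  // Hypothetical helper: split using the String method
  static String[] viaString(String text, String str, int limit) {
    return text.split(str, limit);
  }

  // Hypothetical helper: split using an explicitly compiled Pattern
  static String[] viaPattern(String text, String str, int limit) {
    return Pattern.compile(str).split(text, limit);
  }

  public static void main(String[] args) {
    String text = "To be or not to be, that is the question.";
    String delimiters = "[^\\w]+";

    // With a limit of 3 the pattern is applied at most twice,
    // so the third element holds the rest of the text unsplit
    String[] a = viaString(text, delimiters, 3);
    String[] b = viaPattern(text, delimiters, 3);

    System.out.println(Arrays.toString(a));
    System.out.println("Same result: " + Arrays.equals(a, b));
  }
}
```

If you split the same text many times with the same delimiters, compiling the Pattern once and reusing it is likely to be the better choice, because it avoids recompiling the regular expression on every call.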

TRY IT OUT: Extracting the Words from a String

Here's the code for the modified version of the example:

public class StringTokenizing {
  public static void main(String[] args) {
    String text = "To be or not to be, that is the question."; // String to segment
    String delimiters = "[^\\w]+"; // One or more non-word characters

    // Analyze the string
    String[] tokens = text.split(delimiters);

    // Output the tokens
    System.out.println("Number of tokens: " + tokens.length);
    for(String token : tokens) {
      System.out.println(token);
    }
  }
}

StringTokenizing.java

Now you should get the following output:

Number of tokens: 10
To
be
or
not
to
be
that
is
the
question
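
Two details of this result are worth checking for yourself. The sketch below is an illustration only, and the class name WordDelimiters and the helper method words() are hypothetical names introduced here. It shows that the predefined class \W selects the same delimiters as the negated class [^\w], and that the period at the end of the text does not add an eleventh token because split() with the default limit discards trailing empty strings. A negative limit keeps them.

```java
import java.util.Arrays;

public class WordDelimiters {

  // Hypothetical helper: split on runs of non-word characters
  // using the predefined class \W, which matches anything that \w does not
  static String[] words(String text) {
    return text.split("\\W+");
  }

  public static void main(String[] args) {
    String text = "To be or not to be, that is the question.";

    String[] viaNegatedClass = text.split("[^\\w]+");
    String[] viaShorthand = words(text);
    System.out.println("Same tokens: " + Arrays.equals(viaNegatedClass, viaShorthand));

    // A negative limit retains the empty string that follows the final period
    String[] keepTrailing = text.split("[^\\w]+", -1);
    System.out.println("Default limit: " + viaNegatedClass.length);
    System.out.println("Negative limit: " + keepTrailing.length);
  }
}
```

Note that only trailing empty strings are discarded. If the text begins with a delimiter, the first token is an empty string, so you may need to skip it when the input can start with punctuation or whitespace.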
