5. If both URIs are hierarchical, they're ordered according to their authority
components, which are themselves ordered according to user info, host, and port,
in that order. Hosts are case insensitive.
6. If the schemes and the authorities are equal, the path is used to distinguish them.
7. If the paths are also equal, the query strings are compared.
8. If the query strings are equal, the fragments are compared.
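The ordering rules above can be checked with a short program. This is only a sketch: the class name UriOrderingDemo, the helper sorted(), and the example.com URIs are made up for illustration, and only steps 5 through 8 are exercised.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class UriOrderingDemo {

    // Returns a sorted copy of the list, using URI's natural ordering (compareTo).
    static List<URI> sorted(List<URI> uris) {
        List<URI> copy = new ArrayList<>(uris);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Step 5: hosts are compared without regard to case.
        URI lower = URI.create("http://www.example.com/a");
        URI upper = URI.create("http://WWW.EXAMPLE.COM/a");
        System.out.println(lower.compareTo(upper)); // 0

        // Steps 6-8: path first, then query string, then fragment.
        List<URI> uris = new ArrayList<>();
        uris.add(URI.create("http://www.example.com/b"));
        uris.add(URI.create("http://www.example.com/a?x=2"));
        uris.add(URI.create("http://www.example.com/a?x=1#f2"));
        uris.add(URI.create("http://www.example.com/a?x=1#f1"));
        for (URI u : sorted(uris)) {
            System.out.println(u);
        }
    }
}
```

In this sketch the /b URI sorts last even though it has no query string, because the path is compared before the query string and the fragment.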
URIs are not comparable to any type except themselves. Comparing a URI to anything
except another URI causes a ClassCastException.
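In code that uses generics, the compiler normally rejects such a comparison, so the exception tends to appear only when the URI is reached through a raw Comparable reference. The following sketch illustrates that case; the class name UriRawCompareDemo and the helper throwsClassCast() are hypothetical names chosen for this example.

```java
import java.net.URI;

class UriRawCompareDemo {

    // Compares a URI with an arbitrary object through the raw Comparable type
    // and reports whether a ClassCastException was thrown.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean throwsClassCast(Object other) {
        Comparable raw = URI.create("http://www.example.com/");
        try {
            raw.compareTo(other);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A String is not a URI, so the comparison fails at runtime.
        System.out.println(throwsClassCast("http://www.example.com/"));
        // Another URI compares normally.
        System.out.println(throwsClassCast(URI.create("http://www.example.com/")));
    }
}
```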
String Representations
Two methods convert URI objects to strings, toString() and toASCIIString():
public String toString()
public String toASCIIString()
The toString() method returns an unencoded string form of the URI (i.e., characters
like é and \ are not percent escaped). Therefore, the result of calling this method is not
guaranteed to be a syntactically correct URI, though it is in fact a syntactically correct
IRI. This form is sometimes useful for display to human beings, but usually not for
retrieval.
The toASCIIString() method returns an encoded string form of the URI. Characters
like é and \ are always percent escaped whether or not they were originally escaped. This
is the string form of the URI you should use most of the time. The form returned by
toString() may be more legible for humans, but they may still copy and paste it into
places that expect a legal URI. toASCIIString() always returns a syntactically
correct URI.
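The difference between the two methods is easiest to see with a URI that contains a non-ASCII character. The sketch below is illustrative only: the class name UriStringFormsDemo, the helper bothForms(), and the example.com address are assumptions made for this example.

```java
import java.net.URI;

class UriStringFormsDemo {

    // Returns { toString() form, toASCIIString() form } for the given URI text.
    static String[] bothForms(String text) {
        URI uri = URI.create(text);
        return new String[] { uri.toString(), uri.toASCIIString() };
    }

    public static void main(String[] args) {
        // \u00e9 is the letter é, written as an escape so the source file stays ASCII.
        String[] forms = bothForms("http://www.example.com/caf\u00e9");
        System.out.println(forms[0]); // the é is left unencoded
        System.out.println(forms[1]); // http://www.example.com/caf%C3%A9
    }
}
```

The %C3%A9 sequence is the percent-escaped UTF-8 encoding of é, which is why the second form is safe to hand to software that expects a plain ASCII URI.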
x-www-form-urlencoded
One of the challenges faced by the designers of the Web was dealing with the differences
between operating systems. These differences can cause problems with URLs: for
example, some operating systems allow spaces in filenames; some don't. Most operating
systems won't complain about a # sign in a filename, but in a URL, a # sign indicates
that the filename has ended, and a fragment identifier follows. Other special characters,
nonalphanumeric characters, and so on, all of which may have a special meaning inside
a URL or on another operating system, present similar problems. Furthermore, Unicode
was not yet ubiquitous when the Web was invented, so not all systems could handle
characters such as é. To solve these problems, characters used in URLs must
come from a fixed subset of ASCII, specifically:
• The capital letters A-Z