URLConnections - Java Network Programming


Content-Type: application/xml; charset=iso-2022-jp

In this case, getContentType() returns the full value of the Content-Type field, including the character encoding. You can use this to improve on Example 7-1 by using the encoding specified in the HTTP header to decode the document, or ISO-8859-1 (the HTTP default) if no such encoding is specified. If a nontext type is encountered, an exception is thrown. Example 7-2 demonstrates.

Example 7-2. Download a web page with the correct character set

import java.io.*;
import java.net.*;

public class EncodingAwareSourceViewer {

  public static void main(String[] args) {

    for (int i = 0; i < args.length; i++) {
      try {
        // set default encoding
        String encoding = "ISO-8859-1";
        URL u = new URL(args[i]);
        URLConnection uc = u.openConnection();
        // getContentType() returns null when the type is unknown
        String contentType = uc.getContentType();
        if (contentType != null) {
          int encodingStart = contentType.indexOf("charset=");
          if (encodingStart != -1) {
            // 8 is the length of "charset="
            encoding = contentType.substring(encodingStart + 8).trim();
          }
        }
        InputStream in = new BufferedInputStream(uc.getInputStream());
        Reader r = new InputStreamReader(in, encoding);
        int c;
        while ((c = r.read()) != -1) {
          System.out.print((char) c);
        }
        r.close();
      } catch (MalformedURLException ex) {
        // report the URL that actually failed, not always the first one
        System.err.println(args[i] + " is not a parseable URL");
      } catch (UnsupportedEncodingException ex) {
        System.err.println(
            "Server sent an encoding Java does not support: " + ex.getMessage());
      } catch (IOException ex) {
        System.err.println(ex);
      }
    }
  }
}
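Taking everything after "charset=" works for a header like the one shown above, but a Content-Type value may also quote the charset, write the parameter name in a different case, or follow it with further parameters. The following is a minimal sketch of a more tolerant parser. The class name CharsetParser and the method charsetOf() are hypothetical helpers introduced here for illustration; they are not part of java.net and do not appear in the original example.

```java
import java.util.Locale;

public class CharsetParser {

  // Hypothetical helper: returns the charset parameter of a Content-Type
  // value, or the supplied fallback if the header is null or has no charset.
  public static String charsetOf(String contentType, String fallback) {
    if (contentType == null) return fallback;
    // Parameters are separated by semicolons: type/subtype; name=value; ...
    for (String part : contentType.split(";")) {
      String p = part.trim();
      // Parameter names are case-insensitive
      if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
        String value = p.substring("charset=".length()).trim();
        // Strip optional surrounding double quotes
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
          value = value.substring(1, value.length() - 1);
        }
        return value.isEmpty() ? fallback : value;
      }
    }
    return fallback;
  }

  public static void main(String[] args) {
    // prints iso-2022-jp
    System.out.println(charsetOf("application/xml; charset=iso-2022-jp", "ISO-8859-1"));
    // prints UTF-8
    System.out.println(charsetOf("text/html; Charset=\"UTF-8\"; boundary=x", "ISO-8859-1"));
    // prints ISO-8859-1
    System.out.println(charsetOf(null, "ISO-8859-1"));
  }
}
```

The returned name could then be passed to the InputStreamReader constructor in place of the raw substring. This sketch does not handle a quoted value that itself contains a semicolon.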

public int getContentLength()

The getContentLength() method tells you how many bytes there are in the content. If there is no Content-Length header, getContentLength() returns -1. The method throws
