Java Reference
In-Depth Information
This method downloads images and other binary objects; anything that is not HTML
goes through it. HTML is handled separately because it contains links to other
pages. The method begins by creating a buffer to hold the binary data as it is read.
byte[] buffer = new byte[1024];
int length;
Next, a filename is created. The convertFilename function converts the URL into
a filename that can be saved on the local computer. It also creates the
directory structure needed to hold that file.
String filename = URLUtility.convertFilename(this.path, url, true);
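The body of convertFilename is not shown here, so the following is only a hypothetical stand-in that illustrates the idea of mapping a URL path onto a local directory tree. The class name FilenameSketch, the method name toLocalFilename, the index.html default, and the character-replacement rule are all assumptions made for illustration, not the behavior of URLUtility.

```java
import java.io.IOException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class FilenameSketch {
    /**
     * Hypothetical stand-in for URLUtility.convertFilename.
     * Maps the path part of a URL onto a file below the base directory.
     */
    static String toLocalFilename(String base, URL url, boolean mkdir)
            throws IOException {
        String urlPath = url.getPath();
        // Assumption: a URL that names a directory is saved as index.html.
        if (urlPath.isEmpty() || urlPath.endsWith("/")) {
            urlPath = urlPath + "index.html";
        }
        Path target = Paths.get(base);
        for (String segment : urlPath.split("/")) {
            // Skip empty and relative segments so the file stays below base.
            if (segment.isEmpty() || segment.equals(".") || segment.equals("..")) {
                continue;
            }
            // Assumption: replace characters that are unsafe in filenames.
            target = target.resolve(segment.replaceAll("[^A-Za-z0-9._-]", "_"));
        }
        // Optionally create the directories that will hold the file.
        if (mkdir && target.getParent() != null) {
            Files.createDirectories(target.getParent());
        }
        return target.toString();
    }
}
```

Under these assumptions, a URL such as http://example.com/images/logo.png with a base directory of /tmp/site would map to /tmp/site/images/logo.png, and the images directory would be created when the third argument is true.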
Next, the data is read in chunks using the buffer created earlier, and each chunk is written to the output file.
try {
    OutputStream os = new FileOutputStream(filename);
    do {
        length = stream.read(buffer);
        if (length != -1) {
            os.write(buffer, 0, length);
        }
    } while (length != -1);
Once the data has been read, the output stream can be closed.
    os.close();
If an exception is caught, its stack trace is printed for the user.
} catch (FileNotFoundException e) {
    e.printStackTrace();
}
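The read loop above can also be written so that the output stream is closed even when a read or write fails. The following is a minimal sketch of that variation using try-with-resources. The class name BinaryDownloadSketch and the method name saveStream are hypothetical, and returning the byte count is an addition for illustration; neither is part of the recipe.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class BinaryDownloadSketch {
    /**
     * Copies everything from stream into the named file and
     * returns the number of bytes written.
     */
    static long saveStream(InputStream stream, String filename)
            throws IOException {
        byte[] buffer = new byte[1024];
        long total = 0;
        // try-with-resources closes os on both success and failure.
        try (OutputStream os = new FileOutputStream(filename)) {
            int length;
            // read returns -1 once the end of the stream is reached.
            while ((length = stream.read(buffer)) != -1) {
                os.write(buffer, 0, length);
                total += length;
            }
        }
        return total;
    }
}
```

This sketch leaves exception handling to the caller instead of printing a stack trace, which is a design choice rather than a requirement.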
This recipe also has to handle HTML data. If a URL has HTML data, then the second
form of the spiderProcessURL method is used.
public void spiderProcessURL(URL url, SpiderParseHTML parse)
First, a filename is generated, just as was done for the binary URL. An OutputStream
is then opened to write the file.
String filename = URLUtility.convertFilename(this.path, url, true);
OutputStream os = new FileOutputStream(filename);
The OutputStream is then attached to the ParseHTML object, so that any data
read from the HTML stream is also written to the OutputStream. This saves the HTML
file to the local computer.
parse.getStream().setOutputStream(os);
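How the stream returned by getStream mirrors its data is not shown here. As a rough illustration of the idea, the sketch below wraps an InputStream so that every byte read is also copied to an optional OutputStream. The class name TeeInputStreamSketch is hypothetical, and this is an assumption about the general technique, not the stream class the spider actually uses.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical wrapper: copies every byte that is read to a second stream. */
final class TeeInputStreamSketch extends FilterInputStream {
    private OutputStream copy; // null until setOutputStream is called

    TeeInputStreamSketch(InputStream in) {
        super(in);
    }

    void setOutputStream(OutputStream os) {
        this.copy = os;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        // Mirror the byte unless the end of the stream was reached.
        if (b != -1 && copy != null) {
            copy.write(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        // Mirror only the bytes that were actually read.
        if (n > 0 && copy != null) {
            copy.write(buf, off, n);
        }
        return n;
    }
}
```

With a wrapper like this, a parser can consume the HTML as usual while the same bytes accumulate in the file. Bytes that are skipped rather than read are not mirrored in this sketch.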