Chapter 8
Working with Archive Files
In this chapter, you will learn
What archive files are
What data compression is and how to compress and decompress data
How to compute checksum for data using different algorithms
How to create files in ZIP, GZIP, and JAR file formats and read data from them
How to use the jar command-line tool to work with JAR files
What Is an Archive File?
An archive file consists of one or more files. It also contains metadata that may include the directory structure of the
files, comments, error detection and recovery information, etc. An archive file may also be encrypted. Typically, but
not necessarily, an archive file is stored in a compressed format. An archive file is created using file archiver software.
For example, utilities such as WinZip and 7-Zip are used to create archive files in the ZIP format on Microsoft Windows; the
tar utility is used to create archive files on UNIX-based operating systems. An archive file makes it easier to store and
transmit multiple files as one file. This chapter discusses in detail how to work with archive files using the Java I/O API
and the jar command-line utility that is included in the JDK.
Data Compression
Data compression is the process of applying an encoding algorithm to data so that it can be represented in a smaller size.
Suppose you have a string, 777778888. One way to encode it is 5748, which can be interpreted as “five sevens and
four eights.” By this encoding, you have reduced the length of the string from nine to four characters. The algorithm
you have applied to compress 777778888 as 5748 is called Run Length Encoding (RLE). RLE encodes data by
replacing each run of repeated data with a count and one copy of the data. RLE is easy to implement,
but it is effective only when the data contains long runs of repeated values.
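The following is a minimal sketch of RLE encoding, not a production implementation. The class name RunLengthEncoder and the cap of nine characters per run are assumptions made for this illustration. The cap keeps every count to a single digit, so the output stays unambiguous even when the data itself consists of digits.

```java
// Illustrative sketch only; the class name and MAX_RUN are assumptions.
final class RunLengthEncoder {
    // Cap each run at 9 so that every count fits in exactly one digit.
    private static final int MAX_RUN = 9;

    private RunLengthEncoder() {
    }

    // Replaces each run of identical characters with <count><character>.
    static String encode(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char current = input.charAt(i);
            int run = 1;
            // Extend the run while the next character matches and the cap allows it.
            while (i + run < input.length()
                    && input.charAt(i + run) == current
                    && run < MAX_RUN) {
                run++;
            }
            out.append(run).append(current);
            i += run;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("777778888")); // prints 5748
        System.out.println(encode("abc"));       // prints 1a1b1c
    }
}
```

The second call shows why RLE suits only data with repeated runs: with no repetition, the encoded form is twice as long as the input.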
The reverse of data compression is called data decompression. Here, you apply an algorithm to the compressed
data to get back the original data.
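As a rough sketch of decompression, the class below reverses a run-length encoding. It assumes a hypothetical format in which the compressed text is a sequence of two-character pairs, a single-digit count (1 to 9) followed by the character to repeat, so that 5748 means five 7s followed by four 8s. The class name RunLengthDecoder is likewise an assumption for this illustration.

```java
// Illustrative sketch only; the pair format and class name are assumptions.
final class RunLengthDecoder {
    private RunLengthDecoder() {
    }

    // Expands each <count><character> pair back into a run of characters.
    static String decode(String encoded) {
        if (encoded.length() % 2 != 0) {
            throw new IllegalArgumentException("Expected <count><character> pairs");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < encoded.length(); i += 2) {
            // Character.digit returns -1 when the character is not a digit.
            int count = Character.digit(encoded.charAt(i), 10);
            if (count < 1) {
                throw new IllegalArgumentException("Invalid count at index " + i);
            }
            char value = encoded.charAt(i + 1);
            for (int k = 0; k < count; k++) {
                out.append(value);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("5748")); // prints 777778888
    }
}
```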
There are two types of data compression: lossless and lossy. In lossless data compression, you get your original
data back when you decompress the compressed data. For example, if you decompress 5748, you get your original
data (777778888) back without losing any information. You can get the information back in this example because RLE
is a lossless data compression algorithm. Other lossless data compression algorithms include LZ77, LZ78, LZW, Huffman
coding, and Dynamic Markov Compression (DMC).
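The lossless property can be checked in Java with the Deflater and Inflater classes in the java.util.zip package, which implement the DEFLATE algorithm (a combination of LZ77 and Huffman coding). The sketch below is one possible way to do this under simple assumptions: the class name LosslessRoundTrip, the method names, and the 1024-byte buffer size are all chosen for this illustration, and the data is small enough to hold in memory.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Illustrative sketch only; class and method names are assumptions.
final class LosslessRoundTrip {
    private LosslessRoundTrip() {
    }

    // Compresses the whole array in memory using DEFLATE.
    static byte[] compress(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish(); // no more input will be supplied
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end(); // release native resources
        return out.toByteArray();
    }

    // Restores the original bytes from data produced by compress().
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("Truncated or unsupported input");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Not valid compressed data", e);
        } finally {
            inflater.end(); // release native resources
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = new byte[900];
        Arrays.fill(original, (byte) '7'); // highly repetitive data
        byte[] packed = compress(original);
        byte[] restored = decompress(packed);
        System.out.println(packed.length < original.length); // prints true
        System.out.println(Arrays.equals(original, restored)); // prints true
    }
}
```

Because the restored bytes equal the original bytes exactly, no information was lost in the round trip.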
 