Notice that we don't put a recovery.conf into the backup this time. That's because we're
assuming we want flexibility at the time of recovery, rather than a gift-wrapped solution. We
don't yet know when, where, or how we will be recovering, nor do we need to decide that now.
How it works...
The key point here is that we must have both the base backup and the archive in order to
recover. Where you put them is completely up to you. You can use any file system backup
technology and/or file system backup management system to do this.
Many backup management systems claim to have a PostgreSQL interface or plugin, though
this most often means only that they support logical backup. However, there's no need
for them to officially support PostgreSQL; there isn't any "Runs on PostgreSQL" badge or
certification required. If you can copy files, then you can run the preceding processes to keep
your database safe.
The preceding procedure uses a simple secure file copy, though it could also use rsync.
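In postgresql.conf, that might look like the following minimal sketch; the user, host, and
archive directory are placeholders for your own backup location, while %p and %f are
expanded by PostgreSQL to the path and the file name of the WAL file being archived:

    # backup@backupserver and /pg_archive are assumptions; substitute
    # your own backup user, host, and archive directory
    archive_mode = on
    archive_command = 'rsync -a %p backup@backupserver:/pg_archive/%f'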
If the network or the backup server goes down, the copy command will begin to fail. When
the archive_command fails, it will be retried repeatedly until it succeeds. PostgreSQL does
not remove WAL files from pg_xlog until they have been successfully archived, so the end
result is that your pg_xlog directory fills up. It's a good idea to have an archive_command
that reacts better to that condition, though that is left as an improvement for the sysadmin.
A typical action is to raise an emergency call-out, so that we can resolve the problem
manually; automatic resolution is difficult to get right, because this is a condition that is
hard to test for.
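As an illustration only, a wrapper script along these lines could serve as the
archive_command; the script path, host, and alert address are all hypothetical:

    #!/bin/bash
    # Hypothetical wrapper, installed e.g. as /usr/local/bin/archive_wal.sh
    # and configured as:
    #   archive_command = '/usr/local/bin/archive_wal.sh %p %f'
    WAL_PATH="$1"    # %p: path of the WAL file, relative to the data directory
    WAL_FILE="$2"    # %f: the file name alone
    if rsync -a "$WAL_PATH" "backup@backupserver:/pg_archive/$WAL_FILE"
    then
        exit 0       # success: PostgreSQL may now recycle this WAL file
    else
        # Raise the alarm so a human can act before pg_xlog fills up;
        # the address is a placeholder
        echo "WAL archiving failed for $WAL_FILE" |
            mail -s "PostgreSQL archiving failure" dba@example.com
        exit 1       # non-zero exit: PostgreSQL will retry this file later
    fi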
When continuously archiving, we will generate a considerable number of WAL files. If
archive_timeout is set to 30 seconds, we will generate a minimum of 2 * 60 * 24 = 2880 files
per day, each 16 MB in size, for a total volume of at least 46 GB per day. With a reasonable
transaction rate, a database server might generate 100 GB of archive data per day, so you
should use that as a rough figure for calculations until you have better measurements. Of
course, the rate could be much higher, with rates of 1 TB per day or more being possible.
Clearly, we only want to store WAL files that are useful for backup, so when we decide we
no longer wish to keep a backup, we will also want to remove the corresponding files from the
archive. In each base backup, you will find a file called backup_label. The earliest WAL file
required by a physical backup is the one named on the first line of the backup_label file. We
can use a contrib module called pg_archivecleanup to remove any WAL files earlier than that file.
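For instance, assuming the archive lives in /mnt/server/archive and a base backup we no
longer need was unpacked in /backups/old (both paths are placeholders), the cleanup might
look like this sketch; the sed expression simply pulls the file name out of the first line of
backup_label:

    # The first line of backup_label looks like:
    #   START WAL LOCATION: 0/9000028 (file 000000010000000000000009)
    EARLIEST=$(sed -n 's/^START WAL LOCATION:.*(file \([0-9A-F]*\)).*/\1/p' \
        /backups/old/backup_label)
    # Remove all archived WAL files older than the earliest one required
    pg_archivecleanup /mnt/server/archive "$EARLIEST"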
The size of the WAL archive makes it an obvious candidate for compression. Ordinary
compression is only reasonably effective; as is typically the case, a domain-specific
compression tool is usually better at compressing archives. pg_lesslog is such a tool, and is
available at the following website:
http://pgfoundry.org/frs/?group_id=1000310
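If you stay with ordinary compression, it can be applied directly as the WAL is archived; the
following is a minimal sketch with a placeholder archive directory, together with the matching
restore_command you would then need at recovery time:

    # postgresql.conf: compress each segment on its way into the archive
    archive_command = 'gzip < %p > /var/lib/pgsql/archive/%f.gz'
    # recovery.conf: decompress each segment on its way back out
    restore_command = 'gunzip < /var/lib/pgsql/archive/%f.gz > %p'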
 