Database Reference
In-Depth Information
SELECT
SUBSTRING_INDEX
(
SUBSTRING_INDEX
(
url
,
'?'
,
1
),
'.'
,
-
1
)
AS
extension
,
COUNT
(
size
)
AS
requests
,
SUM
(
size
)
AS
bytes
,
AVG
(
size
)
AS
'bytes/request'
FROM
httpdlog
WHERE
url
LIKE
'%.%'
GROUP
BY
extension
;
The
WHERE
clause selects only
url
values that have a period in them, to eliminate
pathnames that refer to files that have no extension. To extract the extension values
for the output column list, the inner
SUBSTRING_INDEX()
call strips any parameter
string from the right end of the URL and leaves the rest. (This turns a value like
/
cgi-bin/script.pl?id=43
into
/cgi-bin/script.pl
. If the value has no param‐
eter part,
SUBSTRING_INDEX()
returns the entire string.) The outer
SUBSTRING_IN
DEX()
call strips everything up to and including the rightmost period from the
result, leaving only the extension.
Other logging issues
The preceding discussion shows a simple method for hooking Apache to MySQL: write
a short script that communicates with MySQL, and tell Apache to write to the script
rather than to a file. This works well if you log all requests to a single file, but certainly
isn't appropriate for every possible configuration of which Apache is capable. For ex‐
ample, if you have virtual servers defined in your
httpd.conf
file, you might have separate
CustomLog
directives defined for each server. To log them all to MySQL, you can change
each directive to write to
httpdlog.pl
, but that results in a separate logging process for
each virtual server.
That brings up the issue of how you associate log records with the proper virtual server.
One solution is to create a separate log table for each server and modify
httpdlog.pl
to
take an argument that indicates which table to use. Another is to use a table that has a
vhost
column, an Apache log format that includes the
%v
virtual host format specifier,
and a logging script that uses the
vhost
value when it generates
INSERT
statements. The
apache/httpdlog
directory of the
recipes
distribution contains an
httpdlog2.pl
script
that implements this method, along with instructions for using it.
Logging to a database rather than a file enables you to bring the full power of MySQL
to bear on log analysis, but it doesn't eliminate the need to think about space manage‐
ment. Web servers can generate a lot of activity, and log records use space regardless of
whether you write them to a file or to a database. One way to save space is to expire
records now and then. For example, to remove log records that are more than a year
old, run the following statement periodically:
DELETE
FROM
httpdlog
WHERE
dt
<
NOW
()
-
INTERVAL
1
YEAR
;