Database Reference
In-Depth Information
SELECT
DATE
(
dt
)
AS
day
,
COUNT
(
*
)
FROM
httpdlog
GROUP
BY
day
;
Answering this question requires stripping the time-of-day part from the
dt
values
so that requests received on a given date can be grouped. The statement does this
using the
DATE()
function to convert
DATETIME
values to
DATE
values. However, if
you intend to run a lot of statements that use just the date part of the
dt
values, it
would be more efficient to create the
httpdlog
table with separate
DATE
and
TIME
columns, change the
LogFormat
directive to produce the date and time as separate
output values, and modify
httpdlog.pl
accordingly. Then you can operate on the
request dates directly without stripping the time, and you can index the date column
for even better performance.
• What is the hour-of-the-day request histogram?
SELECT
HOUR
(
dt
)
AS
hour
,
COUNT
(
*
)
FROM
httpdlog
GROUP
BY
hour
;
• What is the average number of requests received each day?
SELECT
COUNT
(
*
)
/
(
DATEDIFF
(
MAX
(
dt
),
MIN
(
dt
))
+
1
)
FROM
httpdlog
;
The numerator is the number of requests in the table. The denominator is the
number of days spanned by the records.
• What is the longest URL recorded in the table?
SELECT
MAX
(
LENGTH
(
url
))
FROM
httpdlog
;
If the
url
column is defined as
VARCHAR(255)
and this statement produces a value
of 255, it's likely that some URL values were too long to fit in the column and were
truncated at the end. To avoid this, change the column definition to permit more
characters. For example, to permit up to 5,000 characters, modify the
url
column
as follows:
ALTER
TABLE
httpdlog
MODIFY
url
VARCHAR
(
5000
)
CHARACTER
SET
latin1
COLLATE
latin1_general_cs
NOT
NULL
;
• What is the total number of bytes served and the average bytes per request?
SELECT
COUNT
(
size
)
AS
requests
,
SUM
(
size
)
AS
bytes
,
AVG
(
size
)
AS
'bytes/request'
FROM
httpdlog
;
The statement uses
COUNT(size)
rather than
COUNT(*)
to count only those requests
with a non-
NULL
size
value. If a client requests a page twice, the server may respond
to the second request by sending a header indicating that the page hasn't changed
rather than by sending content. In this case, the log entry for the request will have
NULL
in the
size
column.
• How much traffic has there been for each kind of file (based on filename extension
such as
.html
,
.jpg
, or
.php
)?