Database Reference
In-Depth Information
In this case, 9 is the modal score value.
The median of a set of ordered values can be calculated like this: 1
• If the number of values is odd, the median is the middle value.
• If the number of values is even, the median is the average of the two middle values.
Based on that definition, use the following procedure to determine the median of a set
of observations stored in the database:
1. Issue a query to count the number of observations. From the count, you can de‐
termine whether the median calculation requires one or two values, and what their
indexes are within the ordered set of observations.
2. Issue a query that includes an ORDER BY clause to sort the observations and a LIM
IT clause to pull out the middle value or values.
3. If there is a single middle value, it is the median. Otherwise, take the average of the
middle values.
Suppose that a table t contains a score column with 37 values (an odd number). To get
the median, select a single value using a statement like this:
SELECT score FROM t ORDER BY score LIMIT 18 , 1
If the column contains 38 values (an even number), select two values:
SELECT score FROM t ORDER BY score LIMIT 18 , 2
Then take the values returned by the statement and compute the median from their
average.
The following Perl function implements a median calculation. It takes a database handle
and the names of the database, table, and column that contain the set of observations.
Then it generates the statement that retrieves the relevant values and returns their aver‐
age:
sub median
{
my ( $dbh , $db_name , $tbl_name , $col_name ) = @_ ;
my ( $count , $limit );
$db_name = $dbh -> quote_identifier ( $db_name );
$tbl_name = $dbh -> quote_identifier ( $tbl_name );
$col_name = $dbh -> quote_identifier ( $col_name );
$count = $dbh -> selectrow_array ( qq{
1. The definition of median given here isn't fully general; it doesn't address what to do if the middle values in
the dataset are duplicated.
Search WWH ::




Custom Search