Databases Reference
In-Depth Information
where they are, it's pretty certain you can't get there in an absolutely straight line,
and there are often a lot of obstacles in the way, such as large rivers that require
long detours to cross a bridge. Distance is therefore a poor proxy for how close
something actually is.
• If we've looked up someone's location from his zip code or city, we're measuring
from the center of an area to the center of another area, which also adds wiggle
room. Baron lives in Charlottesville, but not exactly at its center, and he's probably
not interested in traveling precisely to the center of Washington.
Maybe you really do need the precision, but for most applications, it's just overkill. It's
analogous to significant digits: you can't have more precision in your result than you
have in the measurements. (Or, put another way, “garbage in, garbage out.”)
If you don't need a lot of precision, it might be okay to pretend that the earth is flat
instead of curved! This transforms the trigonometry into a much simpler computation
with the Pythagorean theorem, which just uses a few sums, products, and a square root
to determine whether points are within a circle on a plane. 22
But wait, why stop there? Do we even need a circle? Why not just use a square instead?
The corners of a square that's 200 miles on a side are only 141 miles from the center,
which is not so far outside the desired radius of 100 miles. Let's update our query to
look for a square that's 0.0253 radians (100 miles) from the center to the edges:
SELECT * FROM locations
WHERE lat BETWEEN 38.03 - DEGREES(0.0253) AND 38.03 + DEGREES(0.0253)
AND lon BETWEEN −78.48 - DEGREES(0.0253) AND −78.48 + DEGREES(0.0253);
Now the question is how to optimize this expression with indexes. We could certainly
index (lat, lon) or (lon, lat) . But that won't really help the query very much. As you
know, MySQL 5.5 and older versions can't take advantage of any column past the first
one that is accessed with a range condition. Only one of the columns would be used
effectively, because our query has two range conditions ( BETWEEN is equivalent to a
greater-than and a less-than-or-equal-to).
Our trusty IN() workaround comes to the rescue again. We can add two columns to
store the FLOOR() of each coordinate, and then the query can use two IN() lists of integers
to capture all points that fall within the desired square. Here's how to add the new
columns, and an index whose purpose you'll see shortly:
mysql> ALTER TABLE locations
-> ADD lat_floor INT NOT NULL DEFAULT 0,
-> ADD lon_floor INT NOT NULL DEFAULT 0,
-> ADD KEY(lat_floor, lon_floor);
22. To help out even more, you can do the trigonometry in the application, instead of making the database
server do it. Trig functions are pretty CPU-hungry. Storing radians in the table and transforming
everything into radians in the application can help a lot, for example. We're trying to keep our example
simple and free of magic numbers whose origin is unclear, so we don't show this additional optimization.
 
Search WWH ::




Custom Search