Databases Reference
In-Depth Information
Values generated by UUID() have different characteristics from those generated by a
cryptographic hash function such as SHA1() : the UUID values are unevenly distributed
and are somewhat sequential. They're still not as good as a monotonically increasing
integer, though.
Beware of Autogenerated Schemas
We've covered the most important data type considerations (some with serious and
others with more minor performance implications), but we haven't yet told you about
the evils of autogenerated schemas.
Badly written schema migration programs and programs that autogenerate schemas
can cause severe performance problems. Some programs use large VARCHAR fields for
everything , or use different data types for columns that will be compared in joins. Be
sure to double-check a schema if it was created for you automatically.
Object-relational mapping (ORM) systems (and the “frameworks” that use them) are
another frequent performance nightmare. Some of these systems let you store any type
of data in any type of backend data store, which usually means they aren't designed to
use the strengths of any of the data stores. Sometimes they store each property of each
object in a separate row, even using timestamp-based versioning, so there are multiple
versions of each property!
This design may appeal to developers, because it lets them work in an object-oriented
fashion without needing to think about how the data is stored. However, applications
that “hide complexity from developers” usually don't scale well. We suggest you think
carefully before trading performance for developer productivity, and always test on a
realistically large dataset, so you don't discover performance problems too late.
Special Types of Data
Some kinds of data don't correspond directly to the available built-in types. A time-
stamp with subsecond resolution is one example; we showed you some options for
storing such data earlier in the chapter.
Another example is an IPv4 address. People often use VARCHAR(15) columns to store IP
addresses. However, they are really unsigned 32-bit integers, not strings. The dotted-
quad notation is just a way of writing it out so that humans can read it more easily. You
should store IP addresses as unsigned integers. MySQL provides the INET_ATON() and
INET_NTOA() functions to convert between the two representations.
Schema Design Gotchas in MySQL
Although there are universally bad and good design principles, there are also issues that
arise from how MySQL is implemented, and that means you can make MySQL-specific
mistakes, too. This section discusses problems that we've observed in schema designs
 
Search WWH ::




Custom Search