systems that draw out the information and make it visible. No person or team can manually
keep tabs on all the parts.
Distributed systems, therefore, require components to generate copious logs that detail
what happened in the system. These logs are then aggregated to a central location for
collection, storage, and analysis. Systems may log information that is very high level, such as
whenever a user makes a purchase, for each web query, or for every API call. Systems may
log low-level information as well, such as the parameters of every function call in a critical
piece of code.
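The two levels of logging described above can be sketched as follows. This is only an illustration, not a prescribed design: the logger name, the `apply_discount` and `purchase` functions, and the JSON-per-line record format are all assumptions made for the example, and an in-memory buffer stands in for the central log collector.

```python
import functools
import io
import json
import logging


def make_logger(stream):
    """Return a logger that writes one JSON object per line to `stream`."""
    logger = logging.getLogger("example.service")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def log_event(logger, level, event, **fields):
    """Emit a structured record that an aggregator could parse later."""
    logger.log(level, json.dumps({"event": event, **fields}, sort_keys=True))


def log_calls(logger):
    """Decorator for low-level logging: record the parameters of every call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            log_event(logger, logging.DEBUG, "call",
                      function=func.__name__, args=list(args), kwargs=kwargs)
            return func(*args, **kwargs)
        return wrapper
    return decorator


buffer = io.StringIO()  # stands in for the central log collector
log = make_logger(buffer)


@log_calls(log)
def apply_discount(price_cents, percent):
    """Hypothetical 'critical' function whose parameters are logged."""
    return price_cents - price_cents * percent // 100


def purchase(user, price_cents):
    """Hypothetical high-level operation: one log record per purchase."""
    total = apply_discount(price_cents, 10)
    log_event(log, logging.INFO, "purchase", user=user, total_cents=total)
    return total


purchase("alice", 2000)
records = [json.loads(line) for line in buffer.getvalue().splitlines()]
```

Writing each record as a single parseable line is one common way to make later aggregation and analysis easier; other formats would serve equally well.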
Systems should export metrics. They should count interesting events, such as how many
times a particular API was called, and make these counters accessible.
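A minimal sketch of such counters, assuming a small in-process registry, might look like the following. The `Metrics` class, the counter names, and the plain-text output format are hypothetical choices for illustration; a real system would more likely use an existing metrics library.

```python
import threading
from collections import Counter


class Metrics:
    """Thread-safe event counters with a plain-text export."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = Counter()

    def increment(self, name, amount=1):
        """Count one (or `amount`) occurrences of the named event."""
        with self._lock:
            self._counts[name] += amount

    def snapshot(self):
        """Return a copy of the current counter values."""
        with self._lock:
            return dict(self._counts)

    def render(self):
        """Format counters as 'name value' lines, sorted by name."""
        snap = self.snapshot()
        return "".join(f"{name} {value}\n" for name, value in sorted(snap.items()))


metrics = Metrics()


def get_user(user_id):
    """Hypothetical API handler that counts each call."""
    metrics.increment("api.get_user.calls")
    return {"id": user_id}


for uid in (1, 2, 3):
    get_user(uid)
metrics.increment("api.list_users.calls")
```

The text returned by `render()` is the kind of output that could be served from an internal status URL so that operators and monitoring tools can read the counters.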
In many cases, special URLs can be used to view this internal state. For example, the
Apache HTTP Web Server has a “server-status” page (http://www.example.com/server-status/).
In addition, components of distributed systems often appraise their own health and make
this information visible. For example, a component may have a URL that outputs whether
the system is ready (OK) to receive new requests. Receiving as output anything other
than the byte “O” followed by the byte “K” (including no response at all) indicates that the
system does not want to receive new requests. This information is used by load balancers
to determine if the server is healthy and ready to receive traffic. The server sends negative
replies when the server is starting up and is still initializing, and when it is shutting down
and is no longer accepting new requests but is processing any requests that are still in flight.
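The health-check convention described above can be sketched as follows. The `State` and `Server` names, the "NOT READY" body, and the exact-match comparison are assumptions for illustration. Network transport is left out: `health_body()` represents whatever the health URL would return, and `None` represents no response at all.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    STARTING = "starting"   # still initializing
    READY = "ready"         # accepting new requests
    DRAINING = "draining"   # shutting down, finishing in-flight requests


class Server:
    """Hypothetical component that reports its own health."""

    def __init__(self):
        self.state = State.STARTING

    def health_body(self) -> bytes:
        """Body the health URL would return for the current state."""
        return b"OK" if self.state is State.READY else b"NOT READY"


def is_healthy(response: Optional[bytes]) -> bool:
    """Load-balancer side: only the bytes 'O','K' count as healthy.

    Anything else, including no response (None), is a negative reply.
    """
    return response == b"OK"


def eligible(servers):
    """Return the names of servers that should receive new traffic."""
    return [name for name, s in servers.items() if is_healthy(s.health_body())]


pool = {"a": Server(), "b": Server(), "c": Server()}
pool["a"].state = State.READY
pool["b"].state = State.DRAINING   # still working, but wants no new requests
# "c" is left in STARTING
```

Here `eligible(pool)` selects only server "a". Treating every unexpected reply as negative means a crashed, hung, or misconfigured server drops out of rotation without any special handling.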
1.2 The Importance of Simplicity
It is important that a design remain as simple as possible while still being able to meet the
needs of the service. Systems grow and become more complex over time. Starting with a
system that is already complex means starting at a disadvantage.
Providing competent operations requires holding a mental model of the system in one's
head. As we work we imagine the system operating and use this mental model to track how
it works and to debug it when it doesn't. The more complex the system, the more difficult it
is to have an accurate mental model. An overly complex system results in a situation where
no single person understands it all at any one time.
In The Elements of Programming Style, Kernighan and Plauger (1978) wrote:
Debugging is twice as hard as writing the code in the first place. Therefore, if you
write the code as cleverly as possible, you are, by definition, not smart enough to
debug it.