Finally, we tune the disks being used to avoid seek contention. Two drives are
devoted to snapshots: while one is serving up the current snapshot, the other
is being used to build the new snapshot. The Hub also uses two other drives, one
for raw data and one for processed data, again so that multiple tasks can run in
parallel threads without causing disk thrashing.
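The two-drive snapshot arrangement described above can be sketched as a simple double-buffering scheme. This is only an illustration under stated assumptions, not the actual Krugle implementation: the class name SnapshotSlots, the mount points /mnt/snap1 and /mnt/snap2, and the build callback are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SnapshotSlots:
    """Two snapshot locations, assumed to live on separate physical drives."""

    # Hypothetical mount points, one per snapshot drive.
    slots: List[str] = field(default_factory=lambda: ["/mnt/snap1", "/mnt/snap2"])
    active: int = 0  # index of the slot currently serving searches

    def serving_path(self) -> str:
        """Location of the snapshot that answers queries right now."""
        return self.slots[self.active]

    def building_path(self) -> str:
        """Location of the idle slot, where the next snapshot is built."""
        return self.slots[1 - self.active]

    def rebuild_and_swap(self, build: Callable[[str], None]) -> str:
        """Build a new snapshot on the idle drive, then make it the active one.

        All build I/O goes to the idle drive, so it does not compete for
        seeks with the drive that is serving the current snapshot.
        """
        target = self.building_path()
        build(target)                  # write the new snapshot to the idle drive
        self.active = 1 - self.active  # flip: the new snapshot now serves queries
        return self.serving_path()
```

Each call to rebuild_and_swap alternates the roles of the two drives, so the drive that was serving becomes the build target for the following snapshot.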
The end result is an architecture that looks like this (Fig. 6.3):
[Figure components: External Data Sources via SCMIs; The Hub; Raw Data Files; Parsed Data, Indexes; Snapshot 1; Snapshot 2; The API]
Fig. 6.3 Architecture of Krugle enterprise
6.5.4 Parsing Source Code
During early beta testing, we learned a lot about how developers search in code,
and two lessons in particular were important. First, we needed to support
semi-structured searches, for example where the user wants to limit the search
so that it only finds hits in class definition names.
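As a rough illustration of what such a semi-structured search involves, the sketch below stores class definition names in a separate field so that a query can be restricted to it. Everything here is an assumption made for the example: the field names class_names and body, the functions index_file and search, and the regular expression, which stands in for a real parser and would wrongly match the word "class" inside comments or strings.

```python
import re
from typing import Dict, List

# Naive stand-in for a parser: picks up "class Name" anywhere in the text.
CLASS_DEF = re.compile(r"\bclass\s+([A-Za-z_]\w*)")


def index_file(path: str, source: str) -> Dict[str, object]:
    """Build a tiny per-file record with a dedicated class-name field."""
    return {
        "path": path,
        "class_names": [name.lower() for name in CLASS_DEF.findall(source)],
        "body": source.lower(),
    }


def search(docs: List[Dict[str, object]], term: str, field: str = "body") -> List[str]:
    """Return paths of files matching term, optionally limited to class names."""
    term = term.lower()
    hits = []
    for doc in docs:
        if field == "class_names":
            matched = term in doc["class_names"]  # exact class-name match
        else:
            matched = term in doc["body"]         # plain substring match
        if matched:
            hits.append(doc["path"])
    return hits


docs = [
    index_file("Parser.java", "public class Parser { }"),
    index_file("Main.java", "class Main { Parser p; }"),
]
print(search(docs, "parser"))                       # unrestricted search
print(search(docs, "parser", field="class_names"))  # class definitions only
```

The unrestricted query matches both files, because Main.java merely uses Parser, while the restricted query matches only the file that defines the class.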
In order to support this, we had to be able to parse the source code. But “parsing
the source code” is a rather vague description. There are lots of compilers out there
that obviously parse source code, but full compilation means that you need to know
about include paths (or classpaths), compiler-specific switches, the settings for the
macro preprocessor in C/C++, etc. The end result is that you effectively need to be