Chapter 6
Krugle Code Search Architecture
Ken Krugler
Abstract. Krugle was one of the earliest commercial portals for searching open
source software. This chapter reviews the history of Krugle from its inception
to the present day. It follows the search engine from the initial public version to the
enterprise offering, with a particular focus on lessons learned from design decisions
about web crawling, indexing, system architecture, and deployment.
6.1 Introduction
Krugle is a search engine for source code and related technical information.
There is a public site at Krugle.org, which has information on the top
3,500 open source projects, including project descriptions, licenses, software
configuration management activity, and most importantly the source code: more than
400 million lines and growing.
There is also an enterprise version, which runs inside company firewalls and
provides the same search functionality against internal code and technical information.
In this chapter, I describe the Krugle architecture, how it evolved over
time, and the lessons we learned during that process.
6.2 Background
In 2004 I got actively involved in my first open source project, the ill-fated Chandler
PIM. It slowly dawned on me that there were literally billions of lines of open source
K. Krugler
Scale Unlimited, 14860 Uren St, Nevada City, CA, USA
e-mail: kkrugler@scaleunlimited.com