KDD Cups
All the KDD Cups, with their tasks and corresponding datasets, can
be found at http://www.kdd.org/kddcup/index.php. Here's a list:
• KDD Cup 2010: Student performance evaluation
• KDD Cup 2009: Customer relationship prediction
• KDD Cup 2008: Breast cancer
• KDD Cup 2007: Consumer recommendations
• KDD Cup 2006: Pulmonary embolism detection from image
data
• KDD Cup 2005: Internet user search query categorization
• KDD Cup 2004: Particle physics; plus protein homology
prediction
• KDD Cup 2003: Network mining and usage log analysis
• KDD Cup 2002: BioMed document; plus gene role classification
• KDD Cup 2001: Molecular bioactivity; plus protein locale
prediction
• KDD Cup 2000: Online retailer website clickstream analysis
• KDD Cup 1999: Computer network intrusion detection
• KDD Cup 1998: Direct marketing for profit optimization
• KDD Cup 1997: Direct marketing for lift curve optimization
On the other hand, you have the “real world” kind of data mining
competition, where you're handed raw data (which is often in lots of
different tables and not easily joined), you set up the model yourself,
and come up with task-specific evaluations. This kind of competition
simulates real life more closely, which goes back to Rachel's thought
experiment earlier in this topic about how to simulate the chaotic
experience of being a data scientist in the classroom. You need practice
dealing with messiness.
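
As a small illustration of that messiness, here is a minimal sketch in
Python of joining two raw tables whose keys don't line up cleanly. The
tables, the column names, and the helpers `normalize_key` and `left_join`
are all hypothetical, invented for this example; they are not taken from
any KDD Cup dataset.

```python
# Hypothetical raw tables, invented for illustration only.
students = [
    {"student_id": "S001", "name": "Ana"},
    {"student_id": "S002", "name": "Ben"},
]
attempts = [
    {"Student ID": " s001 ", "correct": "1"},  # stray spaces, lower case
    {"Student ID": "S002", "correct": "0"},
    {"Student ID": "S003", "correct": "1"},    # no matching student row
]


def normalize_key(raw):
    """Strip whitespace and upper-case so that ' s001 ' matches 'S001'."""
    return raw.strip().upper()


def left_join(attempt_rows, student_rows):
    """Attach a student name to each attempt; unmatched attempts get None."""
    by_id = {normalize_key(s["student_id"]): s for s in student_rows}
    joined = []
    for row in attempt_rows:
        key = normalize_key(row["Student ID"])
        student = by_id.get(key)
        joined.append({
            "student_id": key,
            "name": student["name"] if student else None,
            "correct": int(row["correct"]),  # raw values arrive as strings
        })
    return joined


joined = left_join(attempts, students)
# Collect keys that found no partner so they can be inspected by hand.
unmatched = [r["student_id"] for r in joined if r["name"] is None]
```

In this made-up data, the attempt for S003 has no matching student row, so
it shows up in `unmatched` instead of silently disappearing, which is the
kind of check you'd likely want before building any model.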
Examples of this second kind are KDD Cup 2007, 2008, and 2010. If
you're in this kind of competition, your approach would involve
understanding the domain, analyzing the data, and building the model.
The winner might be the person who best understands how to tailor
the model to the actual question.
 