competitors. We think it's convenient for Facebook to have interviewees
for data science positions in such a posture of gratitude for the
mere interview. Cathy thinks this distracts data scientists from asking
hard questions about what the data policies are and the underlying
ethics of the company.
Kaggle's Essay Scoring Competition
Part of the final exam for the Columbia class was an essay grading
contest. The students had to build, train, and test an automatic scoring
engine, just like in any other Kaggle competition, and group work was
encouraged. The details of the essay contest are discussed below, and
you can access the data at https://inclass.kaggle.com.
You are provided access to hand-scored essays so that you can build,
train, and test an automatic essay scoring engine. Your success depends
upon how closely your engine's scores match those of human expert
graders.
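One way to put a number on "how closely" machine scores match human scores is an agreement statistic. The text above does not say which metric the contest used, so the following is only a sketch under an assumption: it uses quadratic weighted kappa, a common choice for ordinal scores, which penalizes a disagreement more the further apart the two scores are. The function name and its parameters are hypothetical.

```python
from collections import Counter


def quadratic_weighted_kappa(human, predicted, min_score, max_score):
    """Agreement between two lists of integer scores.

    Returns 1.0 for perfect agreement, about 0.0 for chance-level
    agreement, and negative values for systematic disagreement.
    NOTE: choosing this metric is an assumption, not the contest's rule.
    """
    if len(human) != len(predicted) or not human:
        raise ValueError("score lists must be non-empty and of equal length")

    n = len(human)
    span = max_score - min_score
    if span == 0:
        return 1.0  # only one possible score, so no disagreement is possible

    observed = Counter(zip(human, predicted))  # joint counts of (human, machine)
    human_counts = Counter(human)              # marginal counts per score
    predicted_counts = Counter(predicted)

    observed_penalty = 0.0
    expected_penalty = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            weight = ((i - j) ** 2) / (span ** 2)
            observed_penalty += weight * observed[(i, j)] / n
            # Expected joint frequency if the two raters were independent.
            expected_penalty += weight * human_counts[i] * predicted_counts[j] / (n * n)

    if expected_penalty == 0:
        return 1.0  # both raters gave one identical constant score
    return 1.0 - observed_penalty / expected_penalty
```

Calling `quadratic_weighted_kappa(human_grades, engine_grades, 1, 6)` on a held-out set of essays would give a single score for comparing engines; the score range passed in has to match the rubric of the essay set being evaluated.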
For this competition, there are five essay sets. Each of the sets of essays
was generated from a single prompt. Selected essays range from an
average length of 150 to 550 words per response. Some of the essays
depend upon source information and others do not. All responses
were written by students in grades 7 to 10. All essays were hand
graded and double-scored. Each of the datasets has its own unique
characteristics. The variability is intended to test the limits of your
scoring engine's capabilities. The data has these columns:
id
A unique identifier for each individual student essay
set
1-5 An id for each set of essays
essay
The ASCII text of a student's response
rater1
Rater 1's grade
rater2
Rater 2's grade
grade
Resolved score between the raters
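As a first look at data with the columns listed above, the sketch below loads rows and summarizes each essay set. It rests on several assumptions: the file is tab-separated, the column names are exactly `id`, `set`, `essay`, `rater1`, `rater2`, and `grade`, and the three sample rows are invented for illustration and are not taken from the contest data. The helper names `load_essays` and `summarize_by_set` are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Toy rows invented for illustration only; NOT taken from the contest data.
SAMPLE_TSV = (
    "id\tset\tessay\trater1\trater2\tgrade\n"
    "1\t1\tComputers help people learn.\t4\t4\t4\n"
    "2\t1\tI think libraries matter.\t3\t4\t4\n"
    "3\t2\tThe author describes the setting.\t2\t2\t2\n"
)


def load_essays(handle):
    """Read tab-separated rows and convert the numeric columns to int."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        for col in ("id", "set", "rater1", "rater2", "grade"):
            row[col] = int(row[col])
        rows.append(row)
    return rows


def summarize_by_set(rows):
    """Per essay set: essay count, mean word count, exact rater agreement rate."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["set"]].append(row)

    summary = {}
    for set_id, items in sorted(grouped.items()):
        n = len(items)
        summary[set_id] = {
            "essays": n,
            "mean_words": sum(len(r["essay"].split()) for r in items) / n,
            "rater_agreement": sum(r["rater1"] == r["rater2"] for r in items) / n,
        }
    return summary


essays = load_essays(io.StringIO(SAMPLE_TSV))
summary = summarize_by_set(essays)
```

With the real file, `io.StringIO(SAMPLE_TSV)` would be replaced by an opened file handle. Summarizing per set rather than over the whole file is a deliberate choice here, since each set comes from a different prompt and is described as having its own characteristics.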
 