> Evaluation

The evaluation states: "We will evaluate submissions based on predictive validity, measured in the held-out test data by mean squared error loss for continuous outcomes and Brier loss for binary outcomes." How will held-out test observations with missing ground-truth data in one of the 6 outcomes be evaluated?

Posted by: kaltenburger @ April 11, 2017, 5:21 p.m.
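
[Editor's note: for concreteness, here is a minimal sketch of the two metrics named in the evaluation. This is illustrative code, not the organisers' scoring script; the function names and NumPy usage are assumptions.]

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error for a continuous outcome."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

def brier(y_true, p_pred):
    """Brier loss for a binary outcome: the mean squared difference
    between the 0/1 label and the predicted probability."""
    y_true, p_pred = np.asarray(y_true, dtype=float), np.asarray(p_pred, dtype=float)
    return np.mean((y_true - p_pred) ** 2)
```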

Great question! We are asking you to make predictions for all observations; observations with a missing value on a particular outcome in the held-out test data will be ignored in the final score for that outcome, which you receive at the end of the challenge. The reason is that there is some small chance we will ultimately track down these cases and learn the real values, and we would like to know whether those tracked-down values differ from what people predicted.

Posted by: atkindel @ April 12, 2017, 12:52 a.m.
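
[Editor's note: a minimal sketch of what "ignored in the final score" could look like in practice, assuming missing outcomes are encoded as NaN; the encoding and function name are assumptions, not the organisers' code.]

```python
import numpy as np

def masked_mse(y_true, y_pred):
    """Mean squared error over only the test rows whose ground truth is known.

    Rows whose outcome is missing (NaN here, by assumption) are dropped
    before scoring, matching the answer above; the analogous masking would
    apply to the Brier loss for binary outcomes.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    observed = ~np.isnan(y_true)  # keep only rows with known ground truth
    return np.mean((y_true[observed] - y_pred[observed]) ** 2)

# Example: the middle row is missing, so the score uses only rows 1 and 3.
print(masked_mse([1.0, np.nan, 3.0], [1.1, 0.5, 2.8]))
```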

I asked the organisers about the need to 'predict' values for the training cases in the predictions file, given that the best prediction for those cases is the observed value already provided.
Their reply was that only the non-training portion of the data counts for evaluation purposes. They would prefer, however, that we predict each case rather than just copying across the training value, presumably so they can look at things like dispersion (a minimal sketch of this workflow appears at the end of the thread).

The answer on missing values above also confirms my understanding.

Posted by: the_Brit @ April 13, 2017, 11:26 a.m.
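
[Editor's note: a minimal sketch of the workflow described above: fit on the training rows, then predict every case rather than copying observed training values into the submission. The model choice, variable names, and data shapes are hypothetical.]

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: X holds features for all observations (training and
# held-out); train_idx marks the training rows, whose outcomes are known.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
train_idx = np.arange(60)
y_train = X[train_idx] @ rng.normal(size=5) + rng.normal(scale=0.1, size=60)

# Fit on the training portion only.
model = LinearRegression().fit(X[train_idx], y_train)

# Predict every case, including the training rows, instead of copying the
# observed training values across; only the non-training rows are scored.
predictions = model.predict(X)
```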