Overwhelming number of submissions for the inaugural Big Data Derby
by Keith McCalmont
A total of 106 submissions were received for the inaugural Big Data Derby, a competition requiring entrants to provide a machine-learning model to analyze all manner of data regarding horseracing tactics, strategies and path efficiencies.
Sponsored by the New York Racing Association, Inc. (NYRA) and the New York Thoroughbred Horsemen’s Association (NYTHA) in partnership with the Kentucky Thoroughbred Association, Equibase, The Jockey Club, Breeders’ Cup and the Thoroughbred Owners and Breeders Association (TOBA), the Big Data Derby launched with the goal of better understanding the vast data set available to racing organizations and potentially developing new ways of racing and training in a highly traditional industry.
“Our main objective with this competition was to see if qualified data scientists could utilize horse-tracking data to improve the sport’s collective knowledge in key areas such as equine welfare and performance,” said NYTHA President Joe Appelbaum.
The Big Data Derby offers a total of $50,000 in prize money with $20,000 awarded to the winner and $10,000 each to the next three placings. The competition is held on Kaggle, a global data science platform with over 500,000 active users where participants compete by using machine learning to solve problems ranging from the trivial to the extremely complex.
A total of 9,349 potential competitors accessed the competition’s four data files, which provide NYRA racing data from 2019 along with in-race horse-tracking information. A wide range of submissions offered models that shed light on injury prevention, jockey decision-making metrics, race tactics, track bias and more. An open notebook of user-created content and data can be viewed at: https://www.kaggle.com/competitions/big-data-derby-2022/code.
“The response in both participants and submissions highlights the interest in alternative data sets and bodes well for potential future applications. We are very much looking forward to the results of the competition,” said Joe Longo, NYRA’s General Manager of Content Services.
A judging committee will score the submissions based on four categories – Innovation [25 points], Relevance [30 points], Competence [25 points] and Presentation [20 points]. Winners will be announced in early December.
For more information, please visit https://www.kaggle.com/competitions/big-data-derby-2022.