by Pat McKenna
The New York Racing Association, Inc. (NYRA) and the New York Thoroughbred Horsemen’s Association (NYTHA) today announced that a team led by Brendan Kumagai has been selected as the winner of the inaugural Big Data Derby competition. Kumagai, together with Gurashish Bagga, Kimberly Kroetch, Tyrel Stokes, and Liam Welsh, took the $20,000 first prize with the submission, “Bayesian Velocity Models for Horse Race Simulation.”
Kumagai’s team created a dynamic model that focused on horses’ forward versus lateral speed and examined the effects of sustained momentum and velocity within races. The team also studied jockey performance and how it affects a horse’s running style. The study further posited that, with additional biometric data, it would be possible to calculate a horse’s welfare and injury probability.
“We’re honored to be named the winners of the inaugural Big Data Derby competition,” said Kumagai. “Our team primarily works in financial analytics and hockey statistics, so working with horse racing data has been a unique challenge which allowed us to apply our skills to a relatively new and unexplored domain.”
The Big Data Derby was launched with the goal of analyzing the vast amounts of data available to racing organizations and understanding how the results of those studies could impact traditional methods of racing and training. The competition was sponsored by NYRA and NYTHA in partnership with the Kentucky Thoroughbred Association, Equibase, The Jockey Club, Breeders’ Cup and the Thoroughbred Owners and Breeders Association (TOBA).
A total of 106 submissions were received for the inaugural Big Data Derby, and 9,349 potential competitors accessed its four data files over the course of the competition. A wide and varied range of submissions offered models that shed light on injury prevention, jockey decision-making metrics, race tactics, track bias and more.
An open notebook of user-created content and data can be viewed at: https://www.kaggle.com/competitions/big-data-derby....
“Our main objective with this competition was to usher in an age of technological innovation analyzing horse racing data to suggest improvements in vital topics such as equine performance and welfare,” said NYTHA President Joe Appelbaum. “The enthusiastic response from this community suggests that our tradition-bound sport could benefit by applying knowledge gained from machine learning.”
Kumagai, a Data Science intern at Zelus Analytics, was previously part of a team that won the 2022 Big Data Bowl offered by the National Football League.
The Big Data Derby offered a total of $50,000 in prize money with $20,000 awarded to the winner and $10,000 each to the next three placings. The competition was held on Kaggle, a global data science platform with over 500,000 active users where participants compete by using machine learning to solve problems ranging from the trivial to the extremely complex.
The runners-up included Kyle King’s submission “Track Bias,” which explained the calculation of a track bias metric. Timothy Leung and Philip Leung offered “Advanced Horse Race Tactics Using Coordinate Data,” which examined the impact of drafting percentage, path efficiency, race strategy and speed fluctuation. Artem Volgin and Ekaterina Melianova submitted “Winning Strategies: What Works Better,” a study of jockey tactics and race flow.
“Data science can have a profound impact on a number of different aspects of horse racing,” said Joe Longo, GM of NYRA Content Management Services. “It can be applied to enhance our understanding of equine health and safety or day-to-day training methods by providing a new toolkit to even the most knowledgeable racing participants. NYRA is committed to embracing this pursuit moving forward.”
The competition was judged by data analyst Rob Bingel; Marshall Gramm, a Rhodes College economics professor, avid horseplayer and thoroughbred owner; and Craig Milkowski of TimeformUS.
"What came across loud and clear in reviewing all of the presentations was a passion for quantitative analysis and a learning curiosity that was piqued by the intricacies that the data-rich world of horse racing has to offer,” said Bingel, a financial planner and racing data analyst. “This competition served as a valuable introduction of the data scientist community to NYRA/NYTHA and holds the promise for future collaboration. The solutions proposed by the entrants, especially the winners, naturally leads to thoughts about what future important questions they could so capably answer.”
The 2022 fall meet at Aqueduct Racetrack continues through Saturday, December 31. For additional information, and the complete stakes schedule, visit www.NYRA.com.