Atlanta – (Special to NFLNewsbyZennie62.com) – On September 25th, 2025, The National Football League announced the return of The Big Data Bowl, the league’s premier crowd-sourcing competition for the sports analytics community. Powered by Amazon Web Services (AWS), the eighth annual competition will challenge applicants to leverage NFL player tracking, also known as Next Gen Stats (NGS), to generate insights to enhance the game. Applicants signed up to compete for a shared prize of $100,000 and the chance to present their submission to NFL teams at the 2026 NFL Scouting Combine.
2025 Big Data Bowl Has Next Gen Stats Focus
This year’s competition highlights a new pathway for leveraging NGS. For the first time, the participants will predict player movement by using data before the football is thrown to produce insights on where players will move while the football is in the air. Data will be available from the 2023 and 2024 NFL seasons, with predictions evaluated against this year’s outcomes (Weeks 14-18). Also, for the first time, applicants can enter a public leaderboard, which evaluates the accuracy of submissions by comparing predicted to actual player locations, as determined using NGS.
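To make the leaderboard idea concrete, here is a minimal sketch of how predicted and actual player locations could be compared. The article does not name the scoring metric, so root-mean-square error (RMSE) over x and y is an assumption here, and the function name `rmse_xy` and the sample coordinates are hypothetical.

```python
import math

def rmse_xy(predicted, actual):
    """Root-mean-square error over paired (x, y) locations, in yards.

    Assumption: RMSE is used only as an illustrative accuracy measure;
    the source text does not say which metric the leaderboard applies.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_error = 0.0
    for (pred_x, pred_y), (act_x, act_y) in zip(predicted, actual):
        squared_error += (pred_x - act_x) ** 2 + (pred_y - act_y) ** 2
    # Each location contributes two coordinates, so average over 2 * n values.
    return math.sqrt(squared_error / (2 * len(predicted)))

# Hypothetical locations for one player over two frames.
predicted = [(50.0, 20.0), (51.0, 21.0)]
actual = [(50.0, 21.0), (52.0, 21.0)]
print(round(rmse_xy(predicted, actual), 4))
```

A lower value means the predicted positions sit closer to where the players actually were, which is the sense in which a leaderboard like this ranks submissions.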
“Now in our eighth year, we are elevating the Big Data Bowl with a first-of-its-kind prediction competition, offering participants a unique journey through an NFL play,” said Mike Lopez, senior director of football data & analytics at the NFL. “With these additions, we hope to gain the attention of amateur and expert analysts alike, bringing out the best in sports analytics to drive innovation in our game.”
In addition to being evaluated on the public leaderboard, applicants can select one of two analytics tracks:
- University track – This is open only to individuals or groups made up entirely of undergraduate or graduate students. Verification may be required to prove eligibility.
- Broadcast visualization track – Entrants in this track produce an animation, video, or chart that best visualizes the movement of players while the ball is in the air.
Participants were allowed to work independently or form teams with other colleagues. Following the submission deadline, all entries into the analytics tracks of the Big Data Bowl will be judged by data analysts from NFL teams.
Big Data Bowl Task For 2026: Predict Player Movement While The Ball Is In The Air
For 2026, the focus was this: “participants were tasked to predict NFL player movement during the video frames after the ball is thrown,” according to the “NFL Big Data Bowl 2026 – Prediction” page, whose subheading is “Predict player movement while the ball is in the air.”
Contestants were given a data set of 49 files. Also provided was a summary of each data set in the 2026 NFL Big Data Bowl, a list of key variables to join on, and a description of each variable. The tracking data was provided by the NFL Next Gen Stats team.
Competition Phases and Data Updates
The competition proceeded in two phases:
- A model training phase using data from historic games.
- A forecasting phase with a test set made up of all games remaining in the NFL season after the submission deadline. Contestants were told to expect the scored portion of this test set to be roughly the same size as the scored portion of the test set in the first phase, with some variation in the number of plays due to the nature of football games. During the forecasting phase, the evaluation API served only data from the previously unseen games.
Applicants registered on Kaggle, which has hosted the Big Data Bowl for the past seven years, through either the leaderboard prediction competition or the analytics competition. Applicants were allowed to register for both.
The NFL Big Data Bowl 2026 Files
These are the actual file names provided. If you want to see the source page itself, click here.
train/
input_2023_w[01-18].csv
The input data contains tracking data before the pass is thrown
- game_id: Game identifier, unique (numeric)
- play_id: Play identifier, not unique across games (numeric)
- player_to_predict: whether or not the x/y prediction for this player will be scored (bool)
- nfl_id: Player identification number, unique across players (numeric)
- frame_id: Frame identifier for each play/type, starting at 1 for each game_id/play_id/file type (input or output) (numeric)
- play_direction: Direction that the offense is moving (left or right)
- absolute_yardline_number: Distance from end zone for possession team (numeric)
- player_name: player name (text)
- player_height: player height (ft-in)
- player_weight: player weight (lbs)
- player_birth_date: birth date (yyyy-mm-dd)
- player_position: the player’s position (the specific role on the field that they typically play)
- player_side: team player is on (Offense or Defense)
- player_role: role player has on play (Defensive Coverage, Targeted Receiver, Passer or Other Route Runner)
- x: Player position along the long axis of the field, generally within 0 – 120 yards. (numeric)
- y: Player position along the short axis of the field, generally within 0 – 53.3 yards. (numeric)
- s: Speed in yards/second (numeric)
- a: Acceleration in yards/second^2 (numeric)
- o: orientation of player (deg)
- dir: angle of player motion (deg)
- num_frames_output: Number of frames to predict in output data for the given game_id/play_id/nfl_id. (numeric)
- ball_land_x: Ball landing position along the long axis of the field, generally within 0 – 120 yards. (numeric)
- ball_land_y: Ball landing position along the short axis of the field, generally within 0 – 53.3 yards. (numeric)
output_2023_w[01-18].csv
The output data contains tracking data after the pass is thrown.
- game_id: Game identifier, unique (numeric)
- play_id: Play identifier, not unique across games (numeric)
- nfl_id: Player identification number, unique across players. (numeric)
- frame_id: Frame identifier for each play/type, starting at 1 for each game_id/play_id/file type (input or output). The maximum value for a given game_id, play_id and nfl_id will be the same as the num_frames_output value from the corresponding input file. (numeric)
- x: Player position along the long axis of the field, generally within 0 – 120 yards. (TARGET TO PREDICT)
- y: Player position along the short axis of the field, generally within 0 – 53.3 yards. (TARGET TO PREDICT)
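The relationship between the input and output files can be checked with a short script. The sketch below is only an illustration under stated assumptions: the two CSV excerpts are made-up rows with a reduced set of columns, and the helper names `player_key` and `frame_mismatches` are hypothetical. It joins on game_id, play_id and nfl_id (play_id alone repeats across games) and reports any player whose highest output frame_id differs from num_frames_output.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpts; the real files hold many more columns and rows.
INPUT_CSV = """game_id,play_id,nfl_id,frame_id,num_frames_output
1,10,100,1,3
1,10,100,2,3
1,10,200,1,2
1,10,200,2,2
"""
OUTPUT_CSV = """game_id,play_id,nfl_id,frame_id,x,y
1,10,100,1,50.0,20.0
1,10,100,2,50.5,20.2
1,10,100,3,51.0,20.4
1,10,200,1,60.0,30.0
"""

def player_key(row):
    # play_id is not unique across games, so the key uses all three columns.
    return (row["game_id"], row["play_id"], row["nfl_id"])

def frame_mismatches(input_file, output_file):
    """Map each output player to (expected, seen) frames when they disagree."""
    expected = {}
    for row in csv.DictReader(input_file):
        expected[player_key(row)] = int(row["num_frames_output"])
    max_frame = defaultdict(int)
    for row in csv.DictReader(output_file):
        key = player_key(row)
        max_frame[key] = max(max_frame[key], int(row["frame_id"]))
    # Only players present in the output file are checked.
    return {
        key: (expected.get(key), seen)
        for key, seen in max_frame.items()
        if expected.get(key) != seen
    }

mismatches = frame_mismatches(io.StringIO(INPUT_CSV), io.StringIO(OUTPUT_CSV))
print(mismatches)
```

In this made-up excerpt the second player is deliberately short one output frame, so the check flags that player and passes the first one.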
test_input.csv
Player tracking data at the same play as prediction. This file is provided only for convenience; the actual test data will be provided by the API.
test.csv
A mock test set representing the structure of the unseen test set. This file is provided only for convenience; the actual test_input data will be provided by the API. It contains the prediction targets as rows with columns (game_id, play_id, nfl_id, frame_id) representing each position that needs to be predicted.
kaggle_evaluation/
Files used by the evaluation API. See the demo submission for an illustration of how to use the API.
If you’re looking to explore or visualize the data prior to modeling, feel free to leverage the supplementary dataset provided through our analytics competition for additional context on plays and games.
Additionally, the future rerun dataset size will be similar to that of the public leaderboard (~60k rows).
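For readers who want a feel for the prediction targets described above, one row per game_id, play_id, nfl_id and frame_id, here is a minimal straight-line baseline. It is a sketch and not any entrant's method. Two things are assumptions not stated in the data description: that tracking frames arrive 10 times per second, and that dir is measured clockwise from the positive y axis. The function name `constant_velocity_rows` and the sample player are hypothetical.

```python
import math

FRAMES_PER_SECOND = 10  # assumption: the frame rate is not given in the text

def constant_velocity_rows(last_frame):
    """Extend a player's last pre-throw frame in a straight line.

    Returns one dict per frame to predict, keyed like the target rows
    (game_id, play_id, nfl_id, frame_id) plus the predicted x and y.
    """
    seconds_per_frame = 1.0 / FRAMES_PER_SECOND
    heading = math.radians(last_frame["dir"])
    # Assumption: dir is in degrees clockwise from the +y axis, so the
    # x component uses sine and the y component uses cosine.
    velocity_x = last_frame["s"] * math.sin(heading)
    velocity_y = last_frame["s"] * math.cos(heading)
    rows = []
    for frame_id in range(1, last_frame["num_frames_output"] + 1):
        elapsed = seconds_per_frame * frame_id
        x = last_frame["x"] + velocity_x * elapsed
        y = last_frame["y"] + velocity_y * elapsed
        rows.append({
            "game_id": last_frame["game_id"],
            "play_id": last_frame["play_id"],
            "nfl_id": last_frame["nfl_id"],
            "frame_id": frame_id,
            # Clamp to the field ranges given in the variable list.
            "x": min(max(x, 0.0), 120.0),
            "y": min(max(y, 0.0), 53.3),
        })
    return rows

# Hypothetical player moving at 5 yards/second with dir = 90 degrees.
example = {"game_id": 1, "play_id": 10, "nfl_id": 100, "x": 40.0, "y": 25.0,
           "s": 5.0, "dir": 90.0, "num_frames_output": 3}
rows = constant_velocity_rows(example)
print([(row["frame_id"], round(row["x"], 2), round(row["y"], 2)) for row in rows])
```

A baseline like this ignores acceleration, the ball landing spot and every other player, which is exactly the gap a competitive model would be expected to close.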
Previous NFL Big Data Bowl Submissions Led To Next Gen Stats Like Coverage Responsibility
Previous Big Data Bowl submissions have directly influenced the development of Next Gen Stats, including the introduction of Coverage Responsibility this season. This new stat uses advanced AI and machine learning to measure defensive coverage with unprecedented accuracy.
“The Big Data Bowl and its participants continue to have a direct impact on the ongoing development of Next Gen Stats,” said Ari Entin, head of sports marketing at AWS. “What makes the new coverage responsibility stat particularly exciting is how it evolved, from innovative concepts in the competition to an official Next Gen Stat that fans will see on TV this season. This exemplifies how the Big Data Bowl unites the brightest minds in data science to influence advances in football analytics.”
The Big Data Bowl has continued to be a significant pipeline for members of the football analytics community, as well as other professional sports leagues, since its debut in 2018. In total, over 75 participants have been hired in data and analytics roles in sports, with more than 60 joining the NFL family.
As part of the Big Data Bowl, the NFL incorporates a mentorship program aimed at increasing diversity in sports analytics by connecting experienced NFL analytics experts with interested novices. This program will include both individual meetings and monthly group training sessions, and will conclude with a virtual forum where mentees will have the opportunity to present to analysts from all 32 NFL teams.
NFL Big Data Bowl Leaderboard And Winners For 2026
The NFL has not yet officially released a list of 2026 NFL Big Data Bowl finalists as of this writing, but I, Zennie62Media, Inc’s CEO Zennie Abraham, a fan of the NFL Big Data Bowl since its creation, have followed the competition closely. The winners of the 2026 Big Data Bowl competition were listed six days ago on Kaggle, and Elizabeth Park, Competitions Program Manager for Kaggle, posted a recap. Here’s what she wrote, which seemed to draw some concern from at least one participant:
Recap of Competition – Congratulations to the Winners!
Ms. Park wrote:
Hi Kagglers,
We’re happy to announce the conclusion of the NFL Big Data Bowl 2026 – Prediction competition. Thank you everyone for your participation!
This competition ended with 8,329 registrations and 962 participants on 771 teams. We had 1,247 submissions from 69 countries. For 536 users (including 18 in the top 100!), this was their first competition. Thank you all for your hard work in this competition and congratulations to our winners and to those who gained a new ranking!
We have been working with top potential winning teams via email for the next steps. We look forward to learning more about their winning solutions.
We’ve cleaned the leaderboard and disqualified some teams that have violated the rules. If you think you were removed by mistake, or believe you have evidence that suggests another team cheated, please contact compliance. Please fill in all the fields honestly.
We highly encourage you to post a solution write-up about your approach and solution in the forums (see instructions). You may also refer to Kaggle Solution Write-Up Documentation for guidance. You are also encouraged to publish your models on Kaggle Models!
Thanks for continuing to make Kaggle a great place to learn, practice, and test data science techniques!
Happy Modeling!
Kaggle Team
Cebo
Posted 4 days ago 86th in this Competition
Hi Elizabeth. The certs themselves say “1,899”… not 1,247. To my understanding, that was because the organizers decided to count ALL submitting teams / submissions? (as opposed to just those that submitted at least once after the API submission was enforced?)
Either way… FYI – your numbers seem off!
Addison Howard
Kaggle Staff
Posted 3 days ago
Hi Cebo,
This competition had a rare format where models submitted in the first stage may not have successfully run in the second stage. We manually reset the number of total submissions behind the scenes to accurately calculate medals and rankings on Kaggle, but for general referencing purposes, the numbers in the post are correct (the number of successful submissions at the end of the competition).
Elizabeth Park Did Not Formally List The Winner, But It’s Team Ohkawa3, Led By A Japanese Engineer From Kawasaki, Kanagawa, Japan
The top prize winners, as listed on the Kaggle page, are as follows:
# | Team | Score | Entries | Last | Solution
1 | ohkawa3 | 0.46340 | 2 | 2mo | 1st Place Solution Discussion
2 | 足球小将 | 0.46475 | 1 | 2mo | 2nd Place Solution Discussion
3 | NoFreeLunch | 0.46575 | 2 | 2mo | 3rd Place Solution Discussion
The Big Winners And The Big Problem For The NFL Big Data Bowl
The winners are an engineer from Japan who is a Kaggle competitions grandmaster; a machine learning expert from Wuhan, Hubei, China, who is another Kaggle competitions grandmaster; and a four-person team of coders from various locations in China. Now, here is the problem from the perspective of relevance to football.
I have drawn a large number of football plays and created play concepts. Some, like The Eagle Offense For The NFL, and The Read-Option Jet Sweep Pass, below, were used by (then) Jon Gruden, Stanford Head Coach David Shaw, and at least six NFL teams.
So when I think of the ball in flight, I first ask: what kind of play is being used? That’s important, because I need to know if the quarterback is throwing a short pass to a receiver running an out pattern, or a long pass to one running a post pattern down the middle of the field. That, first of all, determines the path of the ball in flight. I did not see that kind of thinking reflected in the Big Data Bowl data and source material presented. What I did see was a lot of math-tech-talk that focused on how a solution itself was designed. And that brought more chatter about the structure of the model created. But there was no talk about how this can impact game plan design.
That’s what is missing from the winners of the 2026 NFL Big Data Bowl: a presentation of how these solutions are relevant to the task and how the task itself is relevant in NFL game planning. That said, the level and detail of work put into this is remarkable, and deserves further discussion and review.
