
Bureau of Reclamation


Snowcast Showdown

Estimate snow water equivalent (SWE) at a high spatiotemporal resolution over the Western U.S. using near real-time data sources.
Stage: Submission Deadline
Prize: $500,000

Challenge Overview

Seasonal mountain snowpack is a critical water resource throughout the Western U.S. Snowpack acts as a natural reservoir by storing precipitation throughout the winter months and releasing it as snowmelt when temperatures rise during the spring and summer. This meltwater becomes runoff and serves as a primary freshwater source for major streams, rivers and reservoirs. As a result, snowpack accumulation on high-elevation mountains significantly influences streamflow as well as water storage and allocation for millions of people.

Snow water equivalent (SWE) is the most commonly used measurement in water forecasts because it combines information on snow depth and density. SWE refers to the amount of liquid water contained in a snowpack, or the depth of water that would result if a column of snow was completely melted. Water resource managers use measurements and estimates of SWE to support a variety of water management decisions, including managing reservoir storage levels, setting water allocations, and planning for extreme weather events.
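Because SWE combines depth and density, a standard back-of-the-envelope conversion (a general approximation for illustration, not part of the challenge data) is SWE = snow depth × (snow density / water density):

```python
def swe_from_depth(depth, snow_density_kg_m3, water_density_kg_m3=1000.0):
    """Depth of liquid water (in the same units as `depth`) if the snow column fully melted."""
    return depth * (snow_density_kg_m3 / water_density_kg_m3)

# Example: 40 inches of snow at a typical settled density of 300 kg/m^3
print(swe_from_depth(40.0, 300.0))  # 12.0 inches of water
```

A fresh, fluffy snowpack might have a density near 100 kg/m³, while a ripe spring snowpack can exceed 400 kg/m³, which is why depth alone is a poor proxy for water content.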

Over the past several decades, ground-based instruments including snow course and SNOwpack TELemetry (SNOTEL) stations have been used to monitor snowpacks. While ground measures can provide accurate SWE estimates, ground stations tend to be spatially limited and are not easily installed at high elevations. Recently, high resolution satellite imagery has strengthened snow monitoring systems by providing data in otherwise inaccessible areas at frequent time intervals.

Given the diverse landscape in the Western U.S. and shifting climate, new and improved methods are needed to accurately measure SWE at a high spatiotemporal resolution to inform water management decisions.

 

TASK

The goal of this challenge is to estimate snow water equivalent (SWE) at a high spatiotemporal resolution over the Western U.S. using near real-time data sources. Prizes will be awarded based on the accuracy of model predictions and write-ups explaining the solutions as described below.

Getting better SWE estimates for mountain watersheds and headwater catchments will help to improve runoff and water supply forecasts, which in turn will help reservoir operators manage limited water supplies. Improved SWE information will also help water managers respond to extreme weather events such as floods and droughts.

 

CHALLENGE STRUCTURE

This competition will include two tracks. For more information on each track, see the Problem Description.

TRACK 1: Prediction Competition is the core machine learning competition, where participants train models to estimate SWE at 1km resolution across 11 states in the Western U.S.

  • Stage 1: Model Development (Dec 7 - Feb 15) 
    Historical ground truth is made available along with input data sources for model training. This period is an opportunity to build your data pipelines and test modeling approaches. A public leaderboard will be made available to provide feedback, but prizes will not be awarded during this stage.
  • Stage 2: Model Evaluation (Jan 11 - Jul 1)
  • Stage 2a: Submission Testing (Jan 11 - Feb 15)
    Package everything needed to perform inference on new data each week. This is an opportunity to make any final improvements to your model and ensure it works with approved data sources to generate predictions for real-time evaluation. Submit your code and frozen model weights to be eligible for Stage 2b.
  • Stage 2b: Real-time Evaluation (Feb 15 - Jul 1)
    After the ❄ model freeze ❄, run your model on a weekly basis to generate and submit near real-time predictions throughout the winter and spring season. Predictions will be evaluated against ground truth labels as they become available and prizes will be awarded based on final private leaderboard rankings.

TRACK 2: Model Report Competition (entries due by Mar 15) is a model analysis competition. Everyone who successfully submits a model for real-time evaluation can also submit a report that discusses their solution methodology and explains its performance on historical data.

 

PRIZES

OVERVIEW

Track 1: Prediction Competition    $440,000
Track 2: Model Report Competition  $60,000
Total                              $500,000

BREAKDOWN

Overall Predictive Performance

Feb. 15 - Jul. 1, 2022

TRACK 1 - Evaluated on predicted labels across all grid cells in the real-time evaluation period. Final rankings determined by the scoring metric and displayed on the leaderboard at the end of the challenge.             

Place   Prize Amount
1st     $150,000
2nd     $75,000
3rd     $50,000
4th     $25,000
5th     $20,000

Regional Predictions: Sierras

TRACK 1 - Evaluated on predicted labels from regional grid cells in the real-time evaluation period. Final rankings determined by the scoring metric.

Place   Prize Amount
1st     $30,000
2nd     $20,000
3rd     $10,000

Regional Predictions: Central Rockies

TRACK 1 - Evaluated on predicted labels from regional grid cells in the real-time evaluation period. Final rankings determined by the scoring metric.

Place   Prize Amount
1st     $30,000
2nd     $20,000
3rd     $10,000

 

Model Write-up Bonus

Submissions due Mar. 15, 2022

TRACK 2 - Evaluated on write-ups of modeling approaches. The top 15 finalists from the Prediction Competition are eligible to submit write-ups for judging. Final winners will be selected by a judging panel.

Place   Prize Amount
1st     $30,000
2nd     $20,000
3rd     $10,000

 

HOW TO COMPETE

  1. Click “Learn More” to be redirected to drivendata.org.
  2. Once you are on the DrivenData challenge page, select the “Compete” button in the sidebar to enroll in the competition.
  3. Get familiar with the problem through the overview and problem description. You might also want to reference some of the additional resources on the About page.
  4. Access the data from the Data tab.
  5. Create and train your model.
  6. Use your model to generate predictions that match the submission format.
  7. On the DrivenData challenge page, click “Submit” in the sidebar, then “Make new submission”. You’re in!

 

This challenge is sponsored by the Bureau of Reclamation with support from NASA and with collaborators from Bonneville Power Administration, USDA Natural Resources Conservation Service, U.S. Army Corps of Engineers, U.S. Geological Survey, and National Center for Atmospheric Research.

 


Challenge Guidelines

Problem description

The goal of this challenge is to build algorithms that can estimate SWE at 1km resolution across the entire Western US. The data come from a variety of pre-approved sources listed below, including remote sensing data, snow measurements captured by volunteer networks, observations collected for localized campaigns, and climate information.


 

Competition timeline

This competition will include two tracks. The first track is separated into two stages.

  • Prediction Competition (Dec 7 - Jul 1): Train a model to estimate SWE.
      • Stage 1: Model Development (Dec 7 - Feb 15, no prizes awarded): Scores are run on submissions against a historical ground truth dataset and displayed on the public leaderboard.
      • Stage 2a: Submission Testing (Jan 11 - Feb 15, no prizes awarded): Test the data processing and inference pipeline used to generate predictions; submit final code and model weights.
      • Stage 2b: Real-time Evaluation (Feb 15 - Jul 1): Scores are run on weekly submissions against a near real-time ground truth dataset; cumulative scores are displayed on the public leaderboard.
  • Modeling Report Competition (Dec 7 - Mar 15): Write a report on solution methodology.

 

TRACK 1: PREDICTION COMPETITION (DEC 7 - JUL 1)

In the Prediction Competition, you will train a model to estimate SWE. Stage 1 provides an opportunity to build your machine learning pipeline and evaluate its performance on a historical dataset. During Stage 2, you will package everything needed to perform inference on new data, then run your model weekly to generate and submit predictions until the end of the competition. You may not update your model weights after February 15, though you may implement minor fixes to your data processing pipeline to resolve any unforeseen changes to the input data (e.g., format, type, or API updates).

Stage 1: Model Development (Dec 7 - Feb 15)

During Stage 1 of the competition, build your data processing and integration pipelines, train your model using data collected up to the point of prediction, and evaluate model performance on a set of historical test labels. Use this opportunity to experiment with different data sources, refine your model architecture, and optimize computational performance. Establish a strong foundation for the Model Evaluation Stage (Stage 2) of the competition.

A public leaderboard will be made available to facilitate model development, but prizes will not be awarded during this stage of the competition. It is not mandatory that you participate in Stage 1 before Stage 2 opens (though it is recommended).

Stage 2: Model Evaluation (Jan 11 - Jul 1)

Stage 2a: Submission Testing (Jan 11 - Feb 15): Package everything needed to perform weekly inference on a set of 1km grid cells. Use this stage to make any final model improvements, ensure that your model works with the pre-approved data sources, and verify that it generates predictions in the correct format. A sandbox environment will be made available to test your submission. Before the end of this stage, submit a single archive, including your model code and ❄ frozen model weights ❄, to be eligible to participate in the Real-time Evaluation Stage (Stage 2b).

Stage 2b: Real-time Evaluation (Feb 15 - Jul 1): See how your model performs! After the ❄ model freeze ❄ (Feb 15), run your model each week to generate and submit near real-time predictions throughout the winter and spring season. Predictions will be evaluated against ground truth labels as they become available, and your position on the public leaderboard will be updated each week. You may need to implement minor fixes to your data processing pipeline to resolve unforeseen changes to the input data (e.g., format, type, or API updates) during this period; however, you may not update your model weights. You may be required to re-submit your model code and model weights at the end of the competition to determine prize eligibility.
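The weekly cycle during real-time evaluation can be sketched as follows. This is a minimal illustration only: `fetch_approved_inputs` and `predict_swe` are hypothetical placeholders standing in for your own data pipeline and frozen model, not part of any challenge API.

```python
from datetime import date
from pathlib import Path
import csv

def fetch_approved_inputs(run_date):
    """Placeholder: pull features for each grid cell from the pre-approved
    near real-time sources (this part may receive minor fixes during Stage 2b)."""
    return {"cell_0001": [0.2, 1.1], "cell_0002": [3.4, 0.9]}  # dummy features

def predict_swe(features):
    """Placeholder: the frozen model's SWE estimate for one grid cell
    (the weights behind this may NOT change after the model freeze)."""
    return sum(features)

def weekly_run(out_dir="predictions"):
    """Generate one week's submission file and return its path."""
    run_date = date.today()
    inputs = fetch_approved_inputs(run_date)
    out_path = Path(out_dir)
    out_path.mkdir(exist_ok=True)
    out_file = out_path / f"submission_{run_date.isoformat()}.csv"
    with open(out_file, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["cell_id", "swe"])
        for cell_id, features in inputs.items():
            writer.writerow([cell_id, predict_swe(features)])
    return out_file

weekly_run()
```

In practice this script would be scheduled (e.g., via cron) to run once per week between February 15 and July 1.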

Prizes will be awarded based on final private leaderboard rankings and code verification. Separate prizes will be awarded for performance in the Sierras and Central Rockies, and for overall performance during the evaluation period (Stage 2b).

TRACK 2: MODEL REPORT COMPETITION (ENTRIES DUE BY MAR 15)

If you are participating in the Real-time Evaluation Stage of the Prediction Competition (Stage 2b), you are eligible to submit a model report that details your solution methodology. In addition to obtaining the best possible predictions of SWE, the Bureau of Reclamation is specifically interested in model interpretability and gaining a deep understanding of which models are the most robust over a broad range of geographies and climates (based on historical data).

Final winners will be selected by a judging panel. For more information, see the Model Report description below.

 

Prediction competition

The goal of the Prediction Competition is to develop algorithmic approaches for most accurately estimating the spatial distribution of SWE at high resolution across the Western US. Participants will draw on near real-time data sources to generate weekly, 1km x 1km gridded estimates that will be compared against ground measures.

See the Development Stage description for information on the ground-based, satellite, and climate data available for Stage 1: Model Development.

 

SUBMISSION

Models will output a .csv with columns for the cell_id and predicted swe for each grid cell in the submission format. For the specific format and an example for Stage 1, see the Development Stage description.
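As a shape-only illustration of that output (the cell IDs and values below are made up; the real required `cell_id` list comes from the challenge's submission format file), a pandas sketch might look like:

```python
import pandas as pd

# Hypothetical grid-cell IDs; the actual IDs are defined by the
# submission format file on the data download page.
cell_ids = ["cell_0001", "cell_0002", "cell_0003"]
predictions = [12.4, 0.0, 33.1]  # placeholder SWE estimates

# One row per grid cell, with the two expected columns.
submission = pd.DataFrame({"cell_id": cell_ids, "swe": predictions})
submission.to_csv("submission.csv", index=False)
print(submission.shape)  # (3, 2)
```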

Note that while only grid cells listed in the submission format are required for evaluation during the challenge, winning solutions must be able to produce predictions for every 1km x 1km cell in the Western US (including those outside of the submission format). Finalists will need to deliver a solution that conforms with this requirement to be eligible for prizes.

Winners will also need to report the time it takes for their solutions to run. Solutions that can generate predictions for the entire Western US in 24 hours or less (ideally in 6 hours or less) will be of greater operational value to Reclamation.

 

EVALUATION

To measure your model’s performance, we’ll use a metric called Root Mean Square Error (RMSE), which is a measure of accuracy and quantifies differences between estimated and observed values. RMSE is the square root of the mean of the squared differences between the predicted values and the actual values. This is an error metric, so a lower value is better.

For each submission, a secondary metric called the coefficient of determination R2 will also be reported on the leaderboard for added interpretability. R2 indicates the proportion of the variation in the dependent variable that is predictable from the independent variables. This is an accuracy metric, so a higher value is better.

While both RMSE and R2 will be reported, only RMSE will be used to determine your official ranking and prize eligibility.
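To make the two metrics concrete, here is a minimal NumPy sketch; the observed and predicted SWE values are made up for illustration:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error: an error metric, so lower is better."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination: an accuracy metric, so higher is better."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)   # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)

# Made-up observed vs. predicted SWE values for four grid cells
obs = [10.0, 25.0, 0.0, 5.5]
pred = [12.0, 22.0, 1.0, 5.0]
print(round(rmse(obs, pred), 3))  # 1.887
print(round(r2(obs, pred), 3))    # 0.959
```

A perfect model would score RMSE = 0 and R2 = 1; only the RMSE value determines ranking.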

 

Model report

In addition to obtaining the best possible predictions of SWE, the Bureau of Reclamation is interested in gaining a deeper understanding of modeling approaches in two primary areas:

Interpretability is the degree to which a person can understand the outcome of the solution methodology given its inputs. For example, a solution that relies on physically-based modeling should outline a set of well-defined model governing equations.

Robustness is the extent to which solutions provide high performance over a broad range of conditions, such as different elevations, topographies, land cover, weather, and/or climates. Robustness may be evaluated by comparing solution performance over different regions or time periods (based on historical data).

 

SUBMISSION

Each report will consist of up to 8 pages including visualizations and figures, delivered in PDF format. Winners will also be responsible for submitting any accompanying code (e.g., notebooks used to visualize data).

At a minimum, write-ups should discuss data source selection, feature engineering, algorithm architecture, evaluation metric(s), generalizability, limitations, and machine specs for optimal performance. Reports that help illuminate which signal(s) in the data are most responsible for measuring SWE will be prioritized.

See the Model Report Template on the data download page (once you have entered the competition) for further information.

 

EVALUATION

The final prizes will be awarded to the top 3 reports selected by a panel of judges including domain and data experts from the Bureau of Reclamation. The judging panel may change during the competition.

Meet the judges!

The judging panel will evaluate each report based on the following criteria:

  • Interpretability (40%): To what extent can a person understand the outcome of the solution methodology given its inputs?
  • Robustness (40%): Do solutions provide high performance over a broad range of geographic and climate conditions?
  • Rigor (10%): To what extent is the report built on sound, sophisticated quantitative analysis and a performant statistical model?
  • Clarity (10%): How clearly are underlying patterns exposed, communicated, and visualized?

Note: Judging will be done primarily on a technical basis, rather than on grammar and language (since many participants may not be native English speakers). Final winners will be selected by a judging panel from among the top 15 eligible teams on the private leaderboard.

 

Find additional details on the DrivenData challenge page


Challenge Updates

Two Week Warning

March 1, 2022, 8 p.m. PST by Lulu

There are only TWO WEEKS left to submit your solution for the Snowcast Showdown. The early bird definitely gets the worm - so don't put it off! Be sure to have at least 75% of your submission complete a full week before the deadline for maximum flexibility. We'd hate to think you worked hard on a submission, just to miss the deadline by a hair (that's right - no late-night, sad email exceptions - the cut-off is real, folks!)

All that aside, thanks so much to all of you for your interest. Crowdsourcing is nothing without the crowd, and well, that's you. Yeah, you.  

We can't wait to see what the winning solution looks like.

Best of luck to all!


Teamwork Makes the Dream Work

Jan. 27, 2022, 9 a.m. PST by Lulu

Have you thought about forming a team to compete in the Snowcast Showdown?

Teams can be formed directly on the challenge page by connecting in the forum.

Some of the advantages of forming a team include:

1. Accountability

Those deadlines are less likely to get away from you when you’re working with a group.

2. Shared workload

You know the old “divide and conquer”? It works well for challenges! By finding a solid collaborative stride, a group of just three people can churn out a lot more than one person working alone.

3. Camaraderie

It might be hard for other people to relate to your lone wolf journey toward incentive competition conquest, but your team will be right there with you! A source of inspiration, motivation, and perhaps even fire-under-your-butt lighting, teammates can provide a huge emotional advantage. Just think - new internet friends!

4. Specialization

Maybe you’re a real visionary with an incredible concept, but are stuck on how the “nuts and bolts” fit together? Yeah, YOU need a team. Teammates who have the skills and special working knowledge can be a huge resource. And the benefits go both ways!

Rome wasn’t built in a day. Your team will take some time to come together, so be sure to get ahead of it and start recruiting, reaching out, and networking about the challenge now. The forum is a great place to start. You can also browse the HeroX Community by specialization at https://herox.com/crowdsourcing-community.


All About the Forum

Jan. 13, 2022, 9 a.m. PST by Lulu

Maybe you've had some questions, thoughts, or ideas about the challenge so far -- but you're still wondering where to take them? In fact, there's a quick, easy-to-use way to ask questions and start conversations about the Snowcast Showdown: the challenge forum.  

Interested? Simply go to the forum to see what people are already saying. If you'd like to start a new conversation, click "New topic" (pictured below) and begin crafting your message.

This is a great way to start connecting with other community members around different aspects of the challenge, gain insights, and even collaborate! Keep in mind that HeroX regularly checks in on the forum, so it's also a great way to get in touch with us about any questions (or suggestions) you might have. 

Hope you found this helpful! 

Best of luck, and have a great weekend.

 

