AlphaPilot – Lockheed Martin AI Drone Racing Innovation Challenge


AlphaPilot is the first large-scale open innovation challenge of its kind focused on advancing artificial intelligence (AI) and autonomy.
Overview

 

 

Calling all coders, gamers, race fans, and drone enthusiasts...

 

Lockheed Martin and The Drone Racing League (DRL) challenge you to participate in AlphaPilot, an open innovation challenge to develop artificial intelligence (AI) for high-speed racing drones.

 

Enter the AlphaPilot Innovation Challenge today for your chance to master autonomous flight and win more than $2,000,000 in cash prizes.

 

AlphaPilot will challenge teams of up to 10 participants each to design an AI framework, powered by the NVIDIA Jetson platform for AI at the edge, that is capable of flying a drone -- without any human intervention or navigational pre-programming. Autonomous drones will race head-to-head through complex, three-dimensional tracks in DRL’s new Artificial Intelligence Robotic Racing (AIRR) Circuit, starting in 2019.

AlphaPilot aims to unite a diverse community of practicing and emerging AI experts, researchers and students to inspire the next generation of autonomous drone technology. By participating in this challenge, your knowledge and ideas can contribute directly toward the future of autonomous transportation, delivery, disaster relief, and even space exploration!

 

 

Why Drones?

Drone racing is a futuristic sport and a big draw for millennials and K-12 students with an interest in technology — many of whom will become future STEM professionals, drone pilots and engineers. Lockheed Martin recognizes the important role it plays in helping to develop a workforce with the skills to compete in a 21st-century, high-tech economy. Lockheed Martin and DRL are targeting U.S. undergraduate and graduate students to apply for AlphaPilot; however, the competition is open to drone enthusiasts, coders and technologists of all ages from around the world.

 

Why is Lockheed Martin doing this?

For more than 100 years, Lockheed Martin has been redefining flight — from the fastest speeds, to the edge of space, to unmatched maneuverability and stealth. AI-enabled autonomy promises to fundamentally change the future of flight, and we are actively developing disruptive new AI technologies that will help our customers accomplish their most important missions – from reaching Mars to fighting wildfires.

 

WHAT CAN I DO RIGHT NOW?

  • Click ACCEPT CHALLENGE above to apply for AlphaPilot.
  • Read the Challenge Guidelines to learn about the competition.
  • Share this challenge on social media using the icons above. Show your friends, your family, or anyone you know who has a passion for discovery.
  • Start a conversation in our Forum to join the discussion, ask questions or connect with other innovators.

ABOUT LOCKHEED MARTIN

Headquartered in Bethesda, Maryland, Lockheed Martin is a global security and aerospace company that employs approximately 100,000 people worldwide and is principally engaged in the research, design, development, manufacture, integration and sustainment of advanced technology systems, products and services. This year, the company received three Edison awards for groundbreaking innovations in autonomy, satellite technology and directed energy. For more information, please visit www.lockheedmartin.com/alphapilot.

 

 

 

ABOUT DRONE RACING LEAGUE

DRL is the professional drone racing circuit for elite FPV pilots around the world. A technology, sports and media company, DRL combines world-class media and proprietary technology to create thrilling 3D drone racing content with mass appeal. In 2018 DRL hosted a global series of seven races, the Allianz World Championship, which aired on ESPN, Sky Sports, ProSiebenSat.1 Media SE, Groupe AB, Disney XD, OSN, and FOX Sports Asia. For more information, please visit www.drl.io.


AlphaPilot Qualifier

Test #2 – Machine Vision

 

Overview:

Test #2, “Eye Exam”, focuses on effective machine vision, as this will be critical for success in AlphaPilot. Drone racing requires efficient, low-latency visual processing in order to autonomously navigate through gates at high speeds. Therefore, to qualify for AlphaPilot, teams will first need to pass the Eye Exam!

 

What’s Changed?

The AlphaPilot team has identified and addressed a few small bugs in Test 2. As such, we are giving all teams until Monday, March 11th at 5:00PM EST (2:00PM PST) to finalize their Test 1 and 2 submissions. Please use this time to check your algorithm against the Leaderboard a few more times and make sure your submission meets all the requirements. Given the updates, teams should feel free to upload a new version of their Test 1 and 2 submissions as needed.

We have increased the maximum allowable upload size to 1GB for Test 2 submissions. This has now been updated on the Test 2 Submission Form, and teams can now submit at any time.

We want to remind teams that their submission archive needs to include everything needed for their algorithms to run in the testing environment as described. During testing, there will be no internet access, so submissions will not be able to perform actions that require internet. The only exception is that OpenCV 4.0 will be available in the testing environment, so teams do not need to include OpenCV in their submission.

We have made a few small updates to the Test 2 scripts. Some teams were still having issues with getting 0 scores on the Leaderboard due to empty gates and subsequent empty arrays in the JSON. The AlphaPilot team has fixed this bug in the scorer script, updated the scripts on the Leaderboard, and made available the fixed scripts:

Teams have requested a sample submission.zip archive so they can make sure their submissions are formatted correctly. The AlphaPilot team has now made this available, and it can be downloaded here:

As requested by some teams, the AlphaPilot team has decided to increase the frequency of the Leaderboard refreshes until the Test 2 deadline (now Monday, March 11th at 5:00PM EST as noted above). These will now be 3 times a day with the new cut-offs and subsequent refreshes being at:

  • 7:59AM PST
  • 3:59PM PST
  • 11:59PM PST

Teams who received a 0 score on the Leaderboard are encouraged to submit again now that the scorer scripts have been updated.

We updated many of the Test 2 details below with new information that has come out of the Test 2 Questions Forum. A summary of the most frequently asked questions and their responses will be continually added to the new ‘FAQ’ section at the end of this tab.

What’s yet to come?

Additional updates will be posted as needed. Otherwise, teams should submit their algorithm source code archive and technical report by March 11! Good luck!!

Evaluation:

Goal

The Drone Racing League (DRL) has developed unique racing gates for use in AIRR, their new autonomous racing league attached to AlphaPilot. These gates are equipped with visual markings (e.g. colors, patterns, logos) that will provide fiducials to aid in guidance through the course. Teams are tasked with:

(1) Developing a gate detection algorithm

(2) Describing their algorithm in a 2-page Technical Report

This gate detection algorithm needs to be capable of detecting the flyable region of AIRR Race Gates and producing a quadrilateral around its edge (see Figure 1) with a high degree of accuracy and speed.

Figure 1: Sample AIRR gate images from the training dataset, without (left) and with (right) the flyable region outlined in red.

Test 2 Scoring

Each team will receive a Test #2 score (max 100 points) that combines an objective score from their gate detection performance with a qualitative review of their written submission:

  • Algorithm Score: 70% of total Test 2 score
  • Technical Report Score: 30% of total Test 2 score

Algorithm Score

Each team’s gate detector will receive an Algorithm Score (max 70 points) that is based on a metric evaluating their algorithm’s ability to find the flyable region of AIRR gates. This is computed using the accuracy of the reported bounding coordinates compared to the labelled coordinates (referred to as “ground-truth”), along with a measure of average execution time of the algorithm in seconds (avg_time). The accuracy of all the labels is evaluated according to the Mean Average Precision (MAP) metric, and the measure of execution time is assessed according to the wall clock time for the gate detector to read an image, extract the corners of the flyable region of the gate, and output the label.

The total Algorithm Score is then calculated by subtracting the average wall clock time from the weighted MAP score and multiplying by 35 to get a maximum of 70 points:

Algorithm Score = 35 * (2 * MAP - avg_time)

Note: If a team’s execution time is longer than 2 seconds, it is possible for a team’s Algorithm Score to be negative. If this occurs, a team will be given 0 points for their Algorithm Score.

For more information on the implementation of this metric used for AlphaPilot, read more here: https://arxiv.org/abs/1405.0312
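
For illustration, the calculation above can be sketched in a few lines of Python. This helper is an illustration based on the description above, not the official implementation; the official computation is performed by the provided scorer scripts:

    def algorithm_score(map_score, avg_time):
        # weighted MAP (2 x MAP) minus average seconds per image, scaled by 35 and floored at 0
        return max(35.0 * (2.0 * map_score - avg_time), 0.0)

    # Example: MAP = 0.9 with an average of 0.4 s per image -> 35 * (1.8 - 0.4) = 49 points
    print(algorithm_score(0.9, 0.4))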

Technical Report Score

Each team’s Technical Report Score (max 30 points) is based on a rubric evaluated by judges from academia, industry, and government. The judges will review each report for technical merit, applicability to AlphaPilot, and presentation and clarity of the approach.

 

Resources:

Data

The data used for the Eye Exam consists of images of AIRR racing gates, in JPG format, taken from various distances, angles, and lighting profiles. The AIRR gates are square with parallel edges. The internal clearance is 8ft x 8ft, the external dimensions are 11ft x 11ft, and the depth is 1ft. However, it is very likely these dimensions could change a bit for the final AlphaPilot Challenge as the drone architecture and race courses get finalized. This data has been divided into 3 sets:

  1. Training Dataset
  2. Leaderboard Testing Dataset
  3. Final Sequestered Testing Dataset

The Training Dataset (can be found in Data_Training.zip) contains roughly 9,300 images totaling 2.8GB. This will be the primary resource available to teams for development. A JSON file containing the ground-truth labels for this training dataset is now available as training_GT_labels_v2.json.

The Leaderboard Testing Dataset (can be found in Data_LeaderboardTesting.zip) is now available as a practice exam!! This dataset contains roughly 1,000 images totalling 360MB. This test dataset should be used to see how well a team’s algorithm performs on unseen images. For the Leaderboard Testing Dataset, the ground truth is not provided; it is the team’s job to predict these labels. With this dataset, teams may submit a JSON file of their labels for scoring and placement on a leaderboard.

The Final Sequestered Testing Dataset and the corresponding ground truth will not be released to teams. This test set will be used to evaluate each team’s algorithm and determine their final Eye Exam score.

See the ‘Testing’ section below for more details on Leaderboard and Final Testing.

 

Sample Submission

At this time, the final version of the sample submission and scoring code is available for teams to use. These scripts provide a framework on which to build and develop algorithms so that they meet all the submission requirements. The starter scripts folder (starter_scripts_v2.zip) contains code to help teams create submissions:

  • generate_results.py – Sample submission with example class by which to define a solution. Also reads and plots images and labels.
  • generate_submission.py – Test script that reads all test images, imports and calls a team’s algorithm, and outputs labels for all images in a JSON file
  • random_submission.json – Sample JSON file of labels; this JSON is also the output of generate_submission.py and meets the submission requirements

The scorer scripts folder (scorer_scripts_v3.zip) contains code to calculate a team’s Algorithm Score:

  • score_detections.py – Test script which calls a team’s JSON file, evaluates it against the ground-truth JSON file, and outputs a MAP score

To run the sample submission, configure your environment to match the one described in the ‘Testing’ section below. Teams will also need to install the following libraries, which can be installed using pip:

  • Shapely
  • NumPy

Please note that these libraries should be compatible with Python 3.5.2.

 

Labels

For the training dataset, the Training Dataset Ground-Truth Labels file (training_GT_labels_v2.json) contains a label for each image provided. This JSON file contains the coordinates of quadrilaterals that best fit around the flyable region of the AIRR gates (see Figure 1). For each image tested, the JSON file contains an object corresponding to the label for that image. The labels are provided as a dictionary where keys are image IDs and values are arrays corresponding to the four (x,y) corner coordinates of the quadrilateral (e.g. "IMG_####.JPG": [[x1, y1, x2, y2, x3, y3, x4, y4]]). The four corners of the quadrilateral need to be in clockwise order, preferably starting with the upper-left-most corner (see Figure 1).

JSON files are structured such that there is an array of values for each gate in an image. An image with 2 gates should have 2 arrays of coordinates. If there is no gate in the image, the values should be empty for that image_ID. Teams will not be tested on images with multiple gates in Test 2 though.

Here are a few examples:

Single Gate: "IMG_3668.JPG": [[370.3913304338685, 377.2051599830667, 7.742989976402814, 13.058747191455566, 246.50017198915896, 321.3924492511484, 342.11494534552276, 35.65517009139904]]

Two Gates: "IMG_3668.JPG": [[370.3913304338685, 377.2051599830667, 7.742989976402814, 13.058747191455566, 246.50017198915896, 321.3924492511484, 342.11494534552276, 35.65517009139904], [370.3913304338685, 377.2051599830667, 7.742989976402814, 13.058747191455566, 246.50017198915896, 321.3924492511484, 342.11494534552276, 35.65517009139904]]

No Gate: "IMG_3668.JPG": [[]]

The Training Dataset Ground-Truth Labels are crowd-sourced, and as often happens, are not perfect data. Above all, follow the guidance given for accurately identifying correct gate labels in images. Some of the training data ground-truth labels do not follow these rules accurately, and that should be considered when developing machine vision algorithms. Please address any data curation done in the Technical Report as AlphaPilot judges would like to know this when considering a team’s technical approach.

Please be assured that the Leaderboard and Final Testing ground-truth labels are very accurate, and that is what teams will be tested on.

 

Camera Calibration Images

A set of images of a checkerboard pattern, taken with the same camera and lens as the training and testing data, is available to teams in Cam_Calibration.zip. All images were taken with a Canon DSLR with an 18mm lens.
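
For teams that want to estimate the camera intrinsics from these images, a minimal OpenCV sketch is shown below. The 19 mm square size comes from the file description; the inner-corner count (9 x 6) and the folder layout are assumptions and should be adjusted to the actual checkerboard and paths:

    import glob
    import cv2
    import numpy as np

    PATTERN = (9, 6)       # inner corners per row/column -- assumption, adjust to the actual board
    SQUARE_MM = 19.0       # square size given in the Cam_Calibration.zip description

    # 3D reference points of the checkerboard corners (z = 0 plane), in millimetres
    objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_MM

    obj_points, img_points = [], []
    for path in glob.glob('Cam_Calibration/*.JPG'):          # folder/extension are assumptions
        gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, PATTERN)
        if found:
            obj_points.append(objp)
            img_points.append(corners)

    # Intrinsic matrix and distortion coefficients for the Canon DSLR / 18mm lens
    ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, gray.shape[::-1], None, None)
    print('camera matrix:\n', K, '\ndistortion:', dist.ravel())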

 

File Descriptions

Data_Training.zip – Training dataset images

Data_LeaderboardTesting.zip – Leaderboard Testing images

training_GT_labels_v2.json – Training dataset labels

Cam_Calibration.zip – Calibration dataset images. The square sizes are 19 x 19 mm.

starter_scripts_v2.zip – Final version of starter scripts for submission creation

scorer_scripts_v3.zip – Final version of scorer scripts for submission evaluation

submission.zip - Sample submission.zip archive

 

Submission Requirements:

For Test 2 submissions, each Team Captain should upload 2 items as attachments via the HeroX Test 2 submission form:

  1. 1 PDF file describing their algorithm in a 2-page Technical Report
  2. 1 zipped archive named ‘submission’ with maximum upload size of 1GB

In the folder named ‘submission’, at the highest level of the directory, please include the following:

  1. generate_results.py - Within this script, teams need to define their algorithm within the GenerateFinalDetections() class in the predict(self,img) function. This function is called within the generate_submission.py test script that reads all test images, imports and calls a team’s algorithm, and outputs labels with confidence score for all images in a JSON file.
  2. requirements.txt - A file which lists libraries that need to be installed for an algorithm’s source code to run.
  3. Other code - Teams can include additional code in the directory as needed for their algorithms to run. This can be compiled libraries, source code, etc. Teams SHOULD NOT include any of the following in their submitted archive:
    1. Any code not needed for their algorithms to run
    2. OpenCV – OpenCV 4.0 is already available in the testing environment, so teams do not need to include it in their submission
    3. generate_submission.py – AlphaPilot will run its own version of this script
    4. Scoring scripts – AlphaPilot will run its own version of this code

A sample submission with example class and predict function is included in the starter scripts, and these help teams to format and define a solution. A sample submission.zip archive has also been made available to help teams make sure their submissions are formatted correctly.

 

Test 2 Source Code and IP Concerns:

As a reminder, the judges will have access to teams’ submitted source code to help them understand the technical approach taken for Test 2. The AlphaPilot team suggests that if there are any IP concerns, that IP should be compiled into an executable that is called from your source code, and that executable should be included in your archive. However, abstracting functionality away in executables makes it more difficult for the judges to verify that teams have the technical skills and capabilities to succeed during the AlphaPilot Competition. As a result, please balance your team’s need to protect IP against transparency about your approach.

 

Algorithm Requirements:

Test 2 algorithms need to detect a single polygon per gate in the image. While there will not be multiple gates in any test images in the Eye Exam, there may be some test images with no gates. Multiple gates in view is something teams will have to deal with during Test 3 and the AlphaPilot Challenge, and the judges will be interested in hearing how teams will extend their Test 2 approach. Please address this in the Technical Report.

Each gate should be labeled with a single polygon with 4 corners around the flyable region (even if it’s partially obstructed). When all 4 corners of the flyable region of the gate are visible, teams are expected to find them (even if part of the flyable region is not visible). If that is not the case (where one or more corners is not visible in the image), teams will not be tested on these images, and they have been removed from the testing data. This is largely due to the fact that it is very difficult to accurately label the ground-truth for those cases.

Within the generate_results.py script provided, teams should define their algorithm within the GenerateFinalDetections() class in the predict(self,img) function. This function is called within the generate_submission.py test script that reads all test images, imports and calls a team’s algorithm, and outputs labels for all images in a JSON file. For an example, teams can view the random_submission.json included in, and generated by, the starter scripts.

The predict function should take an OpenCV image object as input and output an array of values for each gate in the image: the four corner coordinates followed by a confidence score (e.g. "IMG_####.JPG": [[x1, y1, x2, y2, x3, y3, x4, y4, CS]]). There are two distinct measurements needed in object detection: (1) whether the image is correctly classified (i.e. whether an object exists in the image) and (2) how well the object has been localized. Simple metrics introduce biases, so it is important to assess the risk of misclassifications; this is what the confidence score is for. An image with 2 gates should have 2 arrays of coordinates. If there is no gate in the image, the values should be empty.

Here are a few examples:

Single Gate: "IMG_3668.JPG": [[370.3913304338685, 377.2051599830667, 7.742989976402814, 13.058747191455566, 246.50017198915896, 321.3924492511484, 342.11494534552276, 35.65517009139904, 0.5]]

Two Gates: "IMG_3668.JPG": [[370.3913304338685, 377.2051599830667, 7.742989976402814, 13.058747191455566, 246.50017198915896, 321.3924492511484, 342.11494534552276, 35.65517009139904, 0.5], [370.3913304338685, 377.2051599830667, 7.742989976402814, 13.058747191455566, 246.50017198915896, 321.3924492511484, 342.11494534552276, 35.65517009139904, 0.5]]

No Gate: "IMG_3668.JPG": [[]]

Algorithms must be compatible with Python 3.5.2. Teams may leverage open-source algorithms and data libraries to assist in their design. Use of TensorFlow is recommended but not required. No other specific software or hardware is required, but teams should review the testing information below for further considerations. Further, teams can choose any approach for their algorithms (i.e. teams don’t have to use machine learning).

Teams will need to list dependencies in a requirements.txt file included in their submitted archive (this file can be generated automatically, e.g. with pip freeze). The requirements.txt file only needs to list libraries that need to be installed for an algorithm’s source code to run. These will be installed using pip with the following command:

>> pip install -r requirements.txt

Given this, all libraries listed in requirements.txt must be installable via pip. If teams want to use libraries that do not fit within this constraint, they can add compiled versions of those libraries to the submitted archive. However, teams proceed with the latter option at their own risk, because the AlphaPilot team can only guarantee successful installation of libraries using pip.
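
For example, a team using TensorFlow might submit a requirements.txt along these lines (package names and version pins are illustrative only; numpy, shapely, and OpenCV are already installed in the testing environment and do not need to be listed):

    # requirements.txt -- illustrative example only
    tensorflow-gpu==1.12.0
    Pillow==5.4.1
    scipy==1.2.0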

Note: GPU and CUDA drivers will already be installed in the testing environment. The following instance will be used as the basis for the testing environment: https://aws.amazon.com/marketplace/pp/B077GCZ4GR

Technical Report Requirements:

In the Technical Report, teams must document how their algorithm works and how they conducted analysis, including any training and data curation. Teams should detail any libraries or well-known approaches used. In addition to describing their approach, teams must also address:

  • How the team plans to build onto or modify their gate detection algorithm for use in the AlphaPilot competition should they qualify.
  • Any technical issues the team ran into and how they overcame them.

Reports need to be in PDF format, 2 pages maximum, single-spaced. Font size should be a minimum of 11pt. Insertion of text, equations, images, figures, plots, code, and pseudo-code is accepted but counts toward the 2-page limit. The only exception to the 2-page limit is a list of references.

 

Testing:

The Leaderboard and Final Exam are now out!

Leaderboard Testing

The Test 2 Leaderboard is now open for participants to test their algorithms on a practice exam. At this time, the Leaderboard Testing Dataset is available for teams to download, test their algorithms against, and evaluate their algorithm’s performance. This will give teams a rough idea of how well they will perform on the final testing, though actual results might differ slightly.

To participate in the Leaderboard Test, a team can upload, via HeroX AlphaPilot, their JSON file containing the labels and confidence score for each image in the set. Each day, a cut-off is implemented for uploading and testing. If a team has uploaded a JSON since the last cut-off, their most recently uploaded JSON file will be used to compare against the ground-truth labels for the practice images and calculate the team’s MAP value. The leaderboard shows each team’s most recent score and is updated with the top teams according to their most recent MAP value.

The AlphaPilot team has decided to increase the frequency of the Leaderboard refreshes until the Test 2 deadline. These will now be 3 times a day with the new cut-offs and subsequent refreshes being at:

  • 7:59AM PST
  • 3:59PM PST
  • 11:59PM PST

 

Final Testing

By the final deadline, teams must submit source code for testing and evaluation as previously described. During testing, there will not be internet access, and so their submissions will not be able to perform actions that require internet.

The testing environment is an instance of Ubuntu 16.04.5 LTS running on an AWS p3.2xlarge instance. The following instance will be used as the basis for the testing environment: https://aws.amazon.com/marketplace/pp/B077GCZ4GR.

OpenCV 4.0, Python 3.5.2, numpy, and shapely have also been already installed in the environment for teams.

The final algorithm testing will be conducted as such:

          1. A new Linux user home directory is created. The team’s archive will be unzipped, and the ‘submission’ folder will be unpacked into the user’s home directory. A virtual testing environment is set up in the user’s home directory.

          2. Team dependencies will be installed according to the libraries listed in the requirements.txt file. See the ‘Algorithm Requirements’ section for more details on this file. These will be installed using pip with the following command:

                    >> pip install -r requirements.txt

          3. An internal version of the testing script, generate_submission.py, will run through all available image data, call the predict(self,img) function to get the labels and execution time of the team’s algorithm for a given image, and store the results for each image in a JSON file named ‘random_submission.json’ (see the sketch after these steps). The internal version of generate_submission.py is functionally comparable to the version given to teams and only differs in that it contains additional checks performed on algorithms.

          Note: there is a cut-off for execution time for a team’s algorithm, and it is 10 seconds. If a team’s algorithm takes longer than this to output a label, the script will be stopped, and the team will be given an Algorithm Score of 0 points.

          4. Once all results are stored in a JSON, the scoring script, score_detections.py, is run to compare the outputted JSON file to a master JSON containing the ground-truth labels. The total algorithm score is then calculated using the output from the scoring script as well as the execution time outputted from running generate_submission.py.

          Note: there is a cut-off for the total submission evaluation time (Final Testing steps 3 and 4) and that is 2 hours. If a team’s algorithm takes longer than this to be evaluated, the testing will be stopped, and the team will be given an Algorithm Score of 0 points.
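
Conceptually, the per-image timing in step 3 works along the lines of the following simplified sketch (directory and variable names are assumptions; the actual internal script additionally enforces the 10-second per-image and 2-hour total cut-offs and performs extra checks):

    import glob
    import json
    import os
    import time

    import cv2
    from generate_results import GenerateFinalDetections    # the team-supplied class

    detector = GenerateFinalDetections()
    results, times = {}, []

    for path in sorted(glob.glob('test_images/*.JPG')):      # image folder name is an assumption
        img = cv2.imread(path)
        start = time.monotonic()
        labels = detector.predict(img)                       # the timed call
        times.append(time.monotonic() - start)
        results[os.path.basename(path)] = labels

    with open('random_submission.json', 'w') as f:           # output file named as in step 3
        json.dump(results, f)

    avg_time = sum(times) / len(times)                       # feeds the Algorithm Score calculation
    print('average time per image: %.3f s' % avg_time)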

Separately, judges will read and score each team’s technical report which will be subsequently used to calculate their total final score for Test 2.

 

Test 2 Algorithm Submission Checking

The final algorithm testing is completed automatically, and therefore, it is essential that teams follow the above requirements exactly. To reduce the possibility of small, technical errors causing issues for the teams, the AlphaPilot team has provided all the tools needed to check and validate Test 2 algorithm submissions. To check that your submitted source code will function correctly, we suggest teams follow the above testing process using the starter and scorer scripts provided. 

 

Frequently Asked Questions

How do we deal with edge cases in the images (i.e. where gates might be partially or fully obstructed or not visible for whatever reason)?

Each gate should be labeled with a single polygon with 4 corners around the flyable region (even if it’s partially obstructed). When all 4 corners of the flyable region of the gate are visible, teams are expected to find them (even if part of the flyable region is not visible). If that is not the case (where one or more corners is not visible in the image), teams will not be tested on these images, and they have been removed from the testing data. This is largely due to the fact that it is very difficult to accurately label the ground-truth for those cases.

 

How do you recommend we use the ground truth labels in the training dataset?

Above all, follow the guidance given in the Test 2 description for accurately identifying correct gate labels in images. Some of the training data ground-truth labels do not follow these rules accurately, and that should be considered when developing machine vision algorithms. Handling real-world challenges effectively will be critical for success in the AlphaPilot Competition, and similarly, teams need to deal with some flawed ground-truth in Test 2 and sensor noise, control drift, and modeling errors in Test 3. Address how your team plans to deal with these in the Technical Reports, because AlphaPilot judges would like to know this when considering a team’s approach. 

 

What are some safe assumptions that can be made about the AIRR gates?

The gate is square with parallel edges. The internal clearance is 8ft x 8ft, the external dimensions are 11ft x 11ft, and the depth is 1ft. However, it is very likely these dimensions could change for the final AlphaPilot Challenge (as the drone architecture and race courses get finalized). If your team qualifies, we would like to know how your team plans to deal with these real-life variations (please address in Technical Report).

 

How is execution time of algorithms calculated?

The measure of execution time is assessed according to the wall clock time for the gate detector to read an image, extract the corners of the flyable region of the gate, and output the label. In practice, this means teams have roughly 2 seconds per image before the Algorithm Score can drop to 0 (see the scoring formula above). Please also note the additional time restrictions during testing (see the ‘Testing’ section of the Eye Exam).

 

What are some safe assumptions that can be made about the camera that took all the training and testing images?

The camera used for training and testing images was a Canon DSLR with an 18mm lens. A set of images that can be used for calibration is available here (Cam_Calibration.zip). The squares are 19 x 19 mm.

 

How are teams supposed to deal with multiple gates in view?

Test 2 algorithms need to detect a single polygon per gate in the image. While there will not be multiple gates in any test images in the Eye Exam, there may be some test images with no gates. Multiple gates in view is something teams will have to deal with during Test 3 and the AlphaPilot Challenge, and the judges will be interested in hearing how teams will extend their Test 2 approach. Please address this in the Technical Report.

 

What are the requirements for the leaderboard submission json?

JSON files are structured such that there is an array of data for each gate in an image. So an image with 2 arrays of coordinates would indicate an image with 2 gates. An image with an empty array would indicate an image with no gate.

For further info, review the ‘Submission Requirements’ and ‘Algorithm Requirements’ sections above. The random_submission.json included in, and generated by, the starter scripts gives an example of the format, and this will also have the same format as the ground-truth JSON file.

 

What is the required version of Python?

AlphaPilot has modified the Python requirements. Please note that all Test #2 algorithms must be compatible with Python 3.5.2.

 

Is there a test instance on your platform we can use to see that our code will work when you score it?

The starter scripts and scorer scripts provided for Test 2 represent how each team’s source code will be tested.

We will not be providing the exact test instance for teams. However, if a team defines a GenerateFinalDetections() class with a predict(self,img) function that runs smoothly in generate_submission.py (as shown in the starter scripts), this provides the sanity checks needed on source code.
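
One additional lightweight check, suggested here for convenience rather than as an official script, is to validate the JSON produced by generate_submission.py before uploading or submitting, for example:

    import json

    with open('random_submission.json') as f:
        results = json.load(f)

    for image_id, gates in results.items():
        assert isinstance(gates, list), image_id
        for gate in gates:
            # each gate is either an empty array (no gate) or 8 corner values plus a confidence score
            assert gate == [] or len(gate) == 9, (image_id, gate)
            assert all(isinstance(v, (int, float)) for v in gate), (image_id, gate)

    print('format check passed on', len(results), 'images')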

 

How do the ground truth labels work in the testing dataset? Is there a “fudge factor” that will score correct if we’re a few pixels off?

The ground-truth (GT) labels represent the 4 coordinates which define a polygon that maximizes the flyable region of the AIRR gates. The primary measure used in the MAP score is the Intersection over Union (IoU). The IoU is computed by dividing the area of overlap between the GT and predicted bounding boxes by the area of union. For more information on the implementation of this metric used for AlphaPilot, read more here: https://arxiv.org/abs/1405.0312
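
Since Shapely is available in the testing environment, the overlap measure can be sketched for two quadrilaterals as follows (a simplified illustration only; the official scoring is implemented in score_detections.py):

    from shapely.geometry import Polygon

    def quad_iou(pred, truth):
        # pred, truth: flat lists [x1, y1, ..., x4, y4] of corner coordinates (clockwise)
        p = Polygon(list(zip(pred[0::2], pred[1::2])))
        t = Polygon(list(zip(truth[0::2], truth[1::2])))
        if not (p.is_valid and t.is_valid):
            return 0.0
        union = p.union(t).area
        return p.intersection(t).area / union if union > 0 else 0.0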

 

Is calling a C/C++ library from Python allowed for Test 2? I'm working under the assumption that it is allowed, since almost all Python libraries where performance is an issue are written in C/C++ or another higher-performance language.

This is allowed. However, please make sure your code is still compatible with the specified testing environment and requirements.

 

Can a 3D model of the gate used in Test 2 be made available?

Unfortunately, we won't be able to provide any mechanical drawings of the gates.

 

Is there any difference in the distribution of the public and private leaderboard data for Test 2?

No

 

Do you account for inference time?

The measure of execution time is assessed according to the wall clock time for generate_submission.py to run finalDetector.predict(img) as defined by your class. That is, the time for the gate detector to read an image, extract the corners of the flyable region of the gate, and output the label.

The total Algorithm Score is then calculated by subtracting the average wall clock time from the weighted MAP score and multiplying by 35 to get a maximum of 70 points:

Algorithm Score = 35 * (2 * MAP - avg_time)

Do we need to detect gates obscured by pillars?

Each gate should be labeled with a single polygon with 4 corners around the flyable region (even if it’s partially obstructed). When all 4 corners of the flyable region of the gate are visible, teams are expected to find them (even if part of the flyable region is not visible). If that is not the case (where one or more corners is not visible in the image), teams will not be tested on these images, and they have been removed from the testing data. This is largely due to the fact that it is very difficult to accurately label the ground-truth for those cases.

 

Could you elaborate on the MAP score for test 2?

Please read the paper included in the Test 2 overview. We also suggest teams do a bit of their own research into the score. This is a very common metric for data science competitions, and we utilize a common implementation in AlphaPilot.

 

Execution time for Test #2 will be benchmarked on a multicore, hyperthreaded CPU, where my task's execution may be randomly affected, primarily due to cache thrashing, by another task running on that CPU at the same time. Are you aware of that?

Each team’s algorithm will be tested on the server sequentially (one at a time), and no other tasks will be running on the server during that process.

 

Can you expand on the confidence score?

  1. There exist two distinct measurements needed in object detection:
    1. Whether the image is correctly classified (i.e. if an object exists in the image)
    2. How well the object has been localized
  2. Simple metrics introduce biases, so it is important to assess the risk of misclassifications. Thus, the “confidence score” is used. For more information on how this is used in AlphaPilot, please do some reading on the mean Average Precision (MAP) metric and how the confidence score is considered mathematically.

 

 

If you have any specific questions on Test #2, please comment in this forum thread.
