Well, for ImageNet, the gold standard for image classification, this photo of severe flooding in the Midwestern US is a toilet.
When Hurricane Maria struck Puerto Rico, Andrew Weinert and his colleagues, researchers from MIT’s Lincoln Laboratory, attempted to assess the damage to help the Federal Emergency Management Agency (FEMA). They had a large number of aerial shots (80,000) of the region, taken by the Civil Air Patrol right after the disaster. There was a problem, however.
There were too many images to sort through manually, and commercial image recognition systems were failing to identify anything meaningful. In one particularly egregious example, ImageNet, the gold standard for image classification, recommended labeling an image of a major flooding zone as a toilet.
“There was this amazing information content, but it wasn’t accessible,” says Weinert.
They soon realized this problem wasn’t unique. In any large-scale disaster scenario, teams of emergency responders like FEMA could save significant time and resources by reviewing details of on-the-ground conditions before their arrival. But most computer vision systems are trained on regular day-to-day imagery, so they can’t reliably pick out relevant details in disaster zones.
The realization compelled the team to compile and annotate a new set of photos and footage specific to emergency response scenarios. They released the data set along with a paper this week in the hopes that it will be used to train computer vision systems in the future.
The data set includes over 620,000 images and 96.5 hours of video that encompass imagery from all 50 states of the US. Most of the media were sourced from government databases or Creative Commons videos on YouTube; a small fraction were also filmed by the Lincoln Lab staff themselves.
(Image Credit: MIT Lincoln Laboratory)