Automating mass-digitisation with Inselect | Digital Collections Programme

Natural history collections provide an enormous evidence base for scientific research on the natural world. We are working to digitise our collection and provide global, open access to this data via our Data Portal.

Tray of mayflies (Ephemeroptera) with bounding boxes from the Inselect programme

To digitise the collection we are developing digital capture workflows that cater for a wide range of collection types. One of the applications we have developed is Inselect, a cross-platform, open-source desktop application that automates the cropping of images of individual specimens from whole-drawer scans.

At the Museum, there are an estimated 33 million insect specimens, housed in 130,000 drawers. It is much easier and quicker to image 130,000 drawers than 33 million individual insects.
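
As a rough illustration of the scale involved, the two estimates above can be turned into an average number of specimens per drawer image. This is simple arithmetic on the quoted figures, not a measured statistic, and the variable names are ours.

```python
# Figures quoted in the text above (both are the Museum's estimates).
specimens = 33_000_000
drawers = 130_000

# Average number of specimens captured by a single drawer image.
specimens_per_drawer = specimens / drawers
print(round(specimens_per_drawer))  # prints 254
```

In other words, each drawer image stands in for roughly 250 individual photographs, which is why cropping needs to be automated.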

However, drawer-level images are not very useful by themselves. Manually cropping each specimen from a drawer image takes too much time, and without unique identifiers the individual images are of questionable value.

How Inselect helps

The Inselect application was developed by Alice Heaton, Pieter Holtzhausen, Stéfan van der Walt and Lawrence Hudson. It identifies individual specimens with their associated labels, and places a box around each one, using simple computer vision techniques.
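
To give a flavour of what "simple computer vision techniques" can look like, here is a minimal sketch that thresholds a greyscale image and puts a box around each connected dark region. It is an illustration only and is not Inselect's actual segmentation method, which is not described in this post. The function name `find_bounding_boxes`, the threshold value and the toy image are all assumptions made for the example.

```python
from collections import deque


def find_bounding_boxes(image, threshold=128):
    """Return (top, left, bottom, right) boxes around dark regions.

    `image` is a list of rows of 0-255 grey values. Pixels darker than
    `threshold` count as specimen; everything else is background.
    Box coordinates are inclusive.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or image[y][x] >= threshold:
                continue
            # Breadth-first flood fill over 4-connected dark pixels,
            # growing the box to cover every pixel in the region.
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and image[ny][nx] < threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# A toy 5x6 "drawer": 255 is pale background, 0 is a dark specimen pixel.
drawer = [
    [255, 255, 255, 255, 255, 255],
    [255,   0,   0, 255, 255, 255],
    [255,   0,   0, 255,   0, 255],
    [255, 255, 255, 255,   0, 255],
    [255, 255, 255, 255, 255, 255],
]
boxes = find_bounding_boxes(drawer)
print(boxes)  # [(1, 1, 2, 2), (2, 4, 3, 4)]
```

Real drawer scans are far messier than this toy grid (pins, labels, shadows and touching specimens), so a production tool needs more than a single threshold, and the user still needs to be able to adjust the boxes by hand.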

This allows the user to efficiently crop and save an image of each specimen in the drawer, without having to ‘draw round’ each one manually. Inselect also captures information from relevant labels, associating this with the saved cropped image.
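
The crop-and-associate step can be sketched in a few lines. This is a hypothetical illustration, not Inselect's own code or file format: the `SpecimenCrop` class, the `crop_specimens` function and the `catalogue_number` field are names invented for the example, and the label values are assumed to have been captured already.

```python
from dataclasses import dataclass, field


@dataclass
class SpecimenCrop:
    """One cropped specimen image plus the label information that goes with it."""
    pixels: list
    metadata: dict = field(default_factory=dict)


def crop(image, box):
    """Cut an inclusive (top, left, bottom, right) box out of a list-of-rows image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def crop_specimens(image, boxes, labels):
    """Pair each bounding box with its label data and return the cropped records."""
    return [SpecimenCrop(crop(image, box), dict(label))
            for box, label in zip(boxes, labels)]


# Toy drawer image: the values 1-4 stand in for a specimen, 9 for background.
drawer = [
    [9, 9, 9, 9],
    [9, 1, 2, 9],
    [9, 3, 4, 9],
]
crops = crop_specimens(
    drawer,
    boxes=[(1, 1, 2, 2)],
    labels=[{"catalogue_number": "EXAMPLE-0001"}],  # made-up identifier
)
print(crops[0].pixels)    # [[1, 2], [3, 4]]
print(crops[0].metadata)  # {'catalogue_number': 'EXAMPLE-0001'}
```

Keeping the identifier alongside the pixels is the point: a cropped image with no link back to its specimen record has little value.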

What specimens can Inselect be used with?

Inselect has been applied to many different specimen types. At the Museum, we have used it on images ranging from small groups of pinned insects to around 80,000 microscope slides.

We have presented Inselect at conferences such as the annual meetings of the Society for the Preservation of Natural History Collections and of the Entomological Collections Network. In addition to the Museum, Inselect is now being either trialled or used by a number of other natural history organisations around the world.

For example, the Australian Museum in Sydney are using Inselect to identify individual specimens found in bulk samples of insects stored in ethanol, in their Insect Soup project, and our colleagues at the Yale Peabody Museum of Natural History are using Inselect to image fossils.

“Thanks for Inselect – it’s workflow-changing! We have been whole drawer imaging fossils, but we have also been using Inselect to image concretions (basically a chunk of rock) that hold hundreds of fossils – which is basically only possible to do now that we have Inselect. Thanks again to you and your team – Inselect rules.”

Susan H. Butts, Peabody Museum of Natural History

The Smithsonian, Agriculture and Agri-Food Canada, the Carnegie Museum of Natural History and the Field Museum have also been using the software.

“I appreciate you guys creating and maintaining this software. It will be immensely useful to the digitization community.”

Jim Fetzner, Carnegie Museum of Natural History

Inselect is just one of the methods we are using to develop our digitisation workflows for different types of specimens, and it’s fantastic that we can spread the word and share our findings with other museums who face similar challenges.

If you enjoyed reading this, you can visit the Inselect website and follow us on Twitter for regular Digital Collections Programme updates. We would love to know what you want to hear from us, so please tag #Inselect and @NHM_Digitise with your photos or comments.