Tag Archives: Data scientists

A Flutter of Data | Digital Collections Programme

Examples of some of the Lepidoptera specimens available on the Data Portal.

The final batch of data from the iCollections project has now been released through the Museum’s Data Portal – a total of 260,000 Lepidoptera specimen records, bringing the total number of Museum specimen records accessible on the Portal to just over 3.8 million.

What was iCollections?

In 2013 the Museum started to look at the best way to digitise butterflies and moths from the UK and Ireland, a collection estimated at half a million specimens. This was a pilot project to develop quick and efficient ways to digitise large Museum collections.

Digitisation Workflow

During the pilot project we trialled and adapted methods of image capture to suit the specimens, giving us an efficient workflow which can be used to digitise wider pinned insect collections. We place each specimen in a specially designed unit tray with raised sides, where we position the specimen’s labels and add a barcode encoded with the unique specimen number. We place each tray in a light box under a DSLR camera to capture an image that contains most of the specimen data. These images are ingested into a bespoke database, which allows the species name and location (within the collection) to be added to the file. The database transcription interface lets us add further data from the labels.

We photograph each specimen and its labels, and data is then added to the record via a transcription interface.
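As a rough illustration of the workflow above, the record created for each specimen could be modelled along the following lines. This is only a minimal sketch: the class, function and field names (`SpecimenRecord`, `ingest_image`, `transcribe_label` and so on) and the example values are hypothetical, and they are not taken from the Museum's bespoke database.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SpecimenRecord:
    """One digitised specimen, keyed by the number encoded in its barcode.

    All field names are illustrative assumptions, not the real schema.
    """
    barcode: str                                # unique specimen number from the barcode
    image_path: str                             # photograph of the specimen and its labels
    species_name: Optional[str] = None          # added when the image is ingested
    collection_location: Optional[str] = None   # where the specimen sits in the collection
    label_data: Dict[str, str] = field(default_factory=dict)  # transcribed later


def ingest_image(barcode: str, image_path: str,
                 species_name: str, collection_location: str) -> SpecimenRecord:
    """Create a record at ingestion, attaching species name and location."""
    return SpecimenRecord(
        barcode=barcode,
        image_path=image_path,
        species_name=species_name,
        collection_location=collection_location,
    )


def transcribe_label(record: SpecimenRecord, field_name: str, value: str) -> SpecimenRecord:
    """Add one piece of transcribed label data to an existing record."""
    record.label_data[field_name] = value.strip()
    return record


# Placeholder values, for illustration only.
example = ingest_image("EXAMPLE-0001", "tray_0001.jpg", "Danaus plexippus", "Drawer 12")
transcribe_label(example, "locality", "Cornwall")
```

Keeping the label data separate from the fields set at ingestion mirrors the two stages described above: the image, species name and location are recorded first, and the label transcription follows.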

During the iCollections project, we greatly reduced the time taken to photograph a single specimen, while keeping handling damage to these precious specimens to a minimum. We digitised the entire butterfly collection of over 180,000 specimens and made a significant start on the moths by digitising over 260,000 specimens.

In 2016 we secured further funding to carry on the digitisation of the British and Irish moths with our refined workflow. Once this has been completed, further data will be released on the Data Portal. When complete, we will have just over half a million Lepidoptera specimens accessible to anyone in the world with an internet connection. This enhances access to our collection, which has traditionally been through visits or specimen loans. In some cases a researcher may only require a digital specimen, or the digital records could help a researcher narrow down the scope of what they want to study on a visit to the Museum.

iCollections enabled us to develop an efficient, bespoke workflow for pinned insects, which we have been able to re-use. We have published a paper on the iCollections method to share it with the natural history community. We have also used the lessons learned from iCollections to start new projects, such as our current project to digitise Madagascan Lepidoptera type specimens.

Why Butterflies and Moths? 

The British Lepidoptera collection contains over half a million pinned specimens collected in the UK and Ireland spanning over 200 years. It includes donations from important collectors of the twentieth and twenty-first centuries. As we digitise the Lepidoptera collections we are georeferencing each record, mapping the distribution of species and revealing collecting trends since the mid-nineteenth century.

By providing access to this unrivalled historical, taxonomic and geographical data we can equip more scientists to conduct new research in new ways. For example, Museum scientists Steve Brooks et al. compared butterfly data with historical temperature records and found that 92% of the 51 species studied emerged earlier in years with higher spring temperatures.

‘The warming climate is already causing butterflies to emerge earlier – and unless their food plants adapt at the same rate, the insects could emerge too early to survive.’ (S. Brooks et al., 2016)

When it comes to digitising Lepidoptera, our digitisers can now process up to 300 specimens a day. They get to see and interact with the specimens up close and become extremely fast with a pair of forceps! Our digitiser Peter Wing told us, “My favourite image to digitise was a Monarch Butterfly that was pinned with a sewing needle.” While digitising, we uncover some fascinating stories behind the collection. We have been sharing some of these enlightening moments using #MothMonday on Twitter.

Our digitiser Peter with his favourite specimen

Who’s using our data?

We are on a mission to digitise the Museum collection of 80 million specimens. We want to make our unrivalled historical, geographic and taxonomic specimen data, gathered over the last 250 years, available to the global scientific community. These data, along with associated specimen images, are released through the Museum’s Data Portal.

Through the Data Portal and those of our partners like the Global Biodiversity Information Facility (GBIF), more than 5.9 billion records have been accessed in over 115,500 downloads since April 2015. Through GBIF we can also see which scientists are using our data in their papers and, through Altmetric, how many people are talking about our data online. So far we have been cited in 44 papers and referenced over 100 times online.

The Data Portal currently has around 200 non-museum users each day and contains more than 700,000 species-level (index lot) records and over 90 research datasets uploaded by NHM staff and other institutions. This includes 3D scans, images and audio recordings as well as more traditional data.

Critical information is currently locked away within hundreds of millions of specimens, labels and archives in collections across the globe. Our ultimate goal is to unlock this treasure trove of information so that scientists, researchers and data analysts from around the world can use this information to tackle some of the big questions of our time.

To make use of the Museum’s iCollections data, please visit the Data Portal. To hear more stories behind the Lepidoptera collection you can follow our #MothMonday content on Twitter or keep up to date with the Museum’s digitisation projects on the website.

Endorsing the Science International Open Data Accord | Digital Collections Programme

A growing number of museums are joining open data initiatives to publish their collection databases and digital reproductions online. The Museum has operated an open-by-default policy for our digital scientific collections.

Vince Smith, Head of Informatics reading the Science International Data Accord

By signing the International Open Data Accord, the Museum recognises the opportunities and challenges of the data revolution and adopts a set of internationally recognised principles as our response to these.

Digital Collections: the Cisco Pitstop | Digital Museum

We have a massive digital challenge. How do we transform museum collections of millions of diverse specimens, each with complex information in many forms, into digital resources – images and data – to be used by modern science and shared across the world?

The collections have been at the centre of scientific knowledge for 300 years – how do we take them into science’s future? In the words of Rod Page from Glasgow University: how do we transform a 19th Century technology into a 21st Century technology? This is the question we looked at in a Cisco Pitstop at the London Digital Catapult Centre over two days in February 2016.
