A high-throughput multispectral imaging system for museum specimens

Cite this dataset

Chan, Wei-Ping et al. (2022). A high-throughput multispectral imaging system for museum specimens [Dataset]. Dryad. https://doi.org/10.5061/dryad.37pvmcvp5

Abstract

We present an economical imaging system with integrated hardware and software to capture multispectral images of Lepidoptera with high efficiency. This method facilitates the comparison of colors and shapes among species at fine and broad taxonomic scales and may be adapted for other insect orders with greater three-dimensionality. Our system can image both the dorsal and ventral sides of pinned specimens. Together with our processing pipeline, the descriptive data can be used to systematically investigate multispectral colors and shapes based on full-wing reconstruction and a universally applicable ground plan that objectively quantifies wing patterns for species with different wing shapes (including tails) and venation systems. Basic morphological measurements, such as body length, thorax width, and antenna size, are automatically generated. This system can exponentially increase the amount and quality of trait data extracted from museum specimens.

Methods

Processed data

These data include but are not limited to all parameters generated during image processing, gridded multispectral reflectance, wing shapes, and the measurements of body size and antennae. The detailed data structure can be found on the GitHub repository.

 Map of archived materials, protocols, and tutorials

To prevent potential conflicts, scripts for different purposes on the cluster and on the local machine are provided in different protocols on Protocols.io and repositories on GitHub. The online protocols and source code are organized as follows. [Protocol] marks the corresponding step-by-step instructions on Protocols.io; [Cluster] marks a script that runs better on the cluster; [Local] marks a script designed for local machines, with relatively low CPU and memory demands.

     Raw data: files described in the following format [[Folder/File name]]: descriptions

       [[Methodology_imaging_records.csv]]: A file recording image names and the barcode of imaged specimens

       [[Drawer_img_nef]]: Drawer images in RAW (*.NEF) format (total 35 images)

o   Five sets of images: Method_1-1_dorsal, Method_1-1_ventral, Method_1-2_dorsal, Method_1-2_ventral, Method_1-r_ventral (with a scale bar placed upside-down)

       [[Drawer_img_tiff]]: Drawer images in linearized 16-bit (*.tiff) format (total 35 images)

       [[manual_bounding_box_par]]: Manually corrected bounding boxes

       [[spp_img_inspection]]: Specimen images for visualization (*.jpg)

o   [[Problematic]]: Problematic images that need manual correction

       [[spp_img_reMask_tiff_done]]: Specimen images (*.tiff) after the mask correction

       [[spp_first_level_product]]: The initial descriptive data or ‘first-level products’ (*AllBandsMask.mat). See Methods for the detailed data structure.

       [[spp_RGB_Imgs]]: Images used for manual fore- and hindwing segmentation

o   [[Seg_done]]: Images for which manual segmentation is complete (*.jpg)

o   [[Segmented]]: The fore- and hindwing segmentation parameters (*.json)

       [[spp_segmentation_analysis]]: Segmented images after inspection and manual correction

o   [[wing_segmentation_img]]: The visualizations of image segmentation (*.jpg)

o   [[wing_shape_morph-seg]]: The results of image segmentation (*morph-seg.mat)

o   [[morphology_analysis_spp_preference_table_template.csv]]: A table generated according to the images in the ‘wing_segmentation_img’ folder, which is later used for inspection

o   [[morphology_analysis_spp_preference_table.csv]]: The result after manual inspection, which records the condition of different body parts of a specimen

o   [[reflectance_table]]: The reflectance data for all body parts of all specimens

       [[spp_wing_grids_generation]]: Generated wing grids and processed data

o   [[inspect_imgs]]: The visualization (*.jpg) of wing grids (no correction was needed in these results)

o   [[spp_wing_parameters]]: Processed wing data. The original folder name is kept here.

o   [[wing_matrix_visualization]]: The summarized multispectral reflectance (NIR [740], fNIR [940], F, FinRGB, PolDiff, UV, UVF, white, whitePol1, whitePol2) according to wing grids.

       [[spp_second-level_product]]: The processed ‘second-level products’ (*_d-v_gridsPars*.mat). See Methods for the detailed data structure.

       [[group_summary]]: The summary statistics for specified groups

o   [[specimen_groups.csv]]: A table specifying groups

o   [[specimen_groups_group_barcode_list.json]]: The group table in JSON format

o   [[summary_matrices]]: The summary results according to the group table (*._summary.mat)

o   [[summary_visualization]]: The summary visualization for each group (*.png)

o   [[shp_tail_adv_vis]]: Replot wing shape and tails by scripts for advanced visualization (*.png)

o   [[tail_summary_visualization]]: Replot tails by scripts for advanced visualization (*.png)
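
The file-name patterns in the listing above can be used to locate the first- and second-level products programmatically. The following is a minimal Python sketch, not part of the archived pipeline. The folder names and patterns are copied from the listing; the function name find_products, the dictionary PRODUCT_PATTERNS, and the assumption that the archive unpacks into exactly this folder layout (including the nesting of wing_shape_morph-seg under spp_segmentation_analysis) are illustrative only.

```python
from pathlib import Path

# Folder names and glob patterns are taken from the listing above.
# ASSUMPTION: the archive unpacks into this layout under a single root folder.
PRODUCT_PATTERNS = {
    "first_level": ("spp_first_level_product", "*AllBandsMask.mat"),
    "morph_seg": ("spp_segmentation_analysis/wing_shape_morph-seg", "*morph-seg.mat"),
    "second_level": ("spp_second-level_product", "*_d-v_gridsPars*.mat"),
}


def find_products(root):
    """Return {product name: sorted list of matching paths} under `root`.

    `find_products` is a hypothetical helper written for this sketch.
    """
    root = Path(root)
    found = {}
    for name, (folder, pattern) in PRODUCT_PATTERNS.items():
        folder_path = root / folder
        # A missing folder gives an empty list instead of raising an error.
        if folder_path.is_dir():
            found[name] = sorted(folder_path.glob(pattern))
        else:
            found[name] = []
    return found
```

Calling find_products on the folder that holds the unpacked archive returns one list of paths per product type, which can then be passed to whichever reader suits the file format.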

 

    Blueprints and materials (Fig. 6)

[Protocol] https://www.protocols.io/private/2E2FB268F7AF11EBB05F0A58A9FEAC02

 

    Bash scripts and shell scripts running on the cluster

[Cluster] https://github.com/weipingchan/Bash_scripts_methodology_paper

 

    Image preprocessing to derive initial descriptive data for museum archiving

[Protocol] https://www.protocols.io/private/DEF29A74E44E11EB96DA0A58A9FEAC02

[Cluster] https://github.com/weipingchan/single_img_processing

       Inspection and manual correction of specimen bounding box (Fig. 3d)

[Local] https://github.com/weipingchan/Drawer_img_manual_define_bounding_boxes

       Inspection and manual correction of mask for background removal (Fig. 9a)

[Local] commercial painting software, such as Adobe Photoshop

 

    Data preparation and processing for color and shape quantification

[Protocol] https://www.protocols.io/private/DEF29A74E44E11EB96DA0A58A9FEAC02

       Body-part segmentation (Fig. 9c panels at right)

Manually defined fore- and hindwing segmentation data

[Local] https://github.com/weipingchan/body-seg_distribute

Segmentation

[Cluster] https://github.com/weipingchan/basic_segmentation

Inspection and manual correction of primary landmarks (Fig. 9b)

[Local] https://github.com/weipingchan/manual_landmark_correction

       Multispectral reflectance at wing-size level (as table format; Fig. 9d)

[Protocol] https://www.protocols.io/private/F3292DF1FE0211EB878B0A58A9FEAC02

[Cluster] https://github.com/weipingchan/multispectral_reflectance_wing-size_level

       Dorsal-ventral side analyses (Fig. 4)

[Protocol] https://www.protocols.io/private/F3292DF1FE0211EB878B0A58A9FEAC02

[Cluster] https://github.com/weipingchan/dorsal_ventral_analysis

       Inspection and manual correction of secondary landmarks (Fig. 1d)

[Local] https://github.com/weipingchan/manual_wing_grid_correction

 

    Visualization (Fig. 1g-h & Fig. 5)

[Protocol] https://www.protocols.io/private/F3292DF1FE0211EB878B0A58A9FEAC02

       Multispectral reflectance at wing-pattern level with wing shape summary

[Local] https://github.com/weipingchan/dorsal_ventral_summary

       Advanced visualization for wing shapes and tails (Methods)

[Local] https://github.com/weipingchan/replot_tail_and_avg_shapes

Usage notes

The pipeline was developed mainly in MATLAB and R, but the data files (e.g. *.mat, *.json) can also be read in Python or other environments.
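
As an illustration of reading these formats outside MATLAB, the following Python sketch dispatches on the file extension. It is a sketch under stated assumptions, not the authors' tooling: the helper names load_json, load_mat, and load_any are hypothetical, and the *.mat branch relies on SciPy's scipy.io.loadmat, which reads MATLAB files up to version 7.2. Files saved in the MATLAB v7.3 format are HDF5 containers that loadmat does not read; an HDF5 reader such as h5py would be needed for those. Which format the archived files use is not stated here.

```python
import json
from pathlib import Path


def load_json(path):
    """Read a *.json file (e.g. segmentation parameters) into Python objects."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def load_mat(path):
    """Read a *.mat file with SciPy.

    ASSUMPTION: the file is in MATLAB v7.2 format or earlier; v7.3 files
    are HDF5 and are not supported by scipy.io.loadmat.
    """
    # Imported lazily so the JSON path works without SciPy installed.
    from scipy.io import loadmat

    # squeeze_me drops singleton dimensions; struct_as_record=False exposes
    # MATLAB structs as objects with attribute access.
    return loadmat(str(path), squeeze_me=True, struct_as_record=False)


def load_any(path):
    """Dispatch on file extension; hypothetical helper for this sketch."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        return load_json(path)
    if suffix == ".mat":
        return load_mat(path)
    raise ValueError(f"Unsupported file type: {suffix!r}")
```

The variable names stored inside each *.mat file are not assumed here; after loading, the keys of the returned dictionary show what a given file contains.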

Funding

National Science Foundation, Award: PHY-1411123

National Science Foundation, Award: DEB-0447242

National Science Foundation, Award: PHY-1411445

United States Air Force Office of Scientific Research, Award: FA9550-14-1-0389

United States Air Force Office of Scientific Research, Award: FA9550-16-1-0322