A high-throughput multispectral imaging system for museum specimens
Data files
Dec 07, 2022 version files (14.54 GB)
● README.txt (5.78 KB)
● Supplementary_data_on_DryAd.rar (14.54 GB)
Abstract
We present an economical imaging system with integrated hardware and software to capture multispectral images of Lepidoptera with high efficiency. This method facilitates the comparison of colors and shapes among species at fine and broad taxonomic scales and may be adapted for other insect orders with greater three-dimensionality. Our system can image both the dorsal and ventral sides of pinned specimens. Together with our processing pipeline, the descriptive data can be used to systematically investigate multispectral colors and shapes based on full-wing reconstruction and a universally applicable ground plan that objectively quantifies wing patterns for species with different wing shapes (including tails) and venation systems. Basic morphological measurements, such as body length, thorax width, and antenna size, are generated automatically. This system can exponentially increase the amount and quality of trait data extracted from museum specimens.
Methods
Processed data
These data include, but are not limited to, all parameters generated during image processing, gridded multispectral reflectance, wing shapes, and measurements of body size and antennae. The detailed data structure is described in the GitHub repository.
Map of archived materials, protocols, and tutorials
To prevent potential conflicts, scripts for different purposes on the cluster and on the local machine are provided in different protocols on Protocols.io and repositories on GitHub. The online protocols and source code are summarized below. A [Protocol] tag indicates the corresponding step-by-step instructions on Protocols.io; a [Cluster] tag indicates that the script runs better on the cluster; a [Local] tag indicates that the script is designed for local machines, with relatively low CPU and memory demands.
● Raw data: files are described in the format [[Folder/File name]]: description
● [[Methodology_imaging_records.csv]]: A file recording image names and the barcode of imaged specimens
● [[Drawer_img_nef]]: Drawer images in RAW (*.NEF) format (total 35 images)
o Five sets of images: Method_1-1_dorsal, Method_1-1_ventral, Method_1-2_dorsal, Method_1-2_ventral, Method_1-r_ventral (with a scale bar placed upside-down)
● [[Drawer_img_tiff]]: Drawer images in linearized 16-bit (*.tiff) format (total 35 images)
● [[manual_bounding_box_par]]: Manually corrected bounding boxes
● [[spp_img_inspection]]: Specimen images for visualization (*.jpg)
o [[Problematic]]: Problematic images that need to be manually corrected
● [[spp_img_reMask_tiff_done]]: Specimen images (*.tiff) after the mask correction
● [[spp_first_level_product]]: The initial descriptive data or ‘first-level products’ (*AllBandsMask.mat). See Methods for the detailed data structure
● [[spp_RGB_Imgs]]: Images used for manual fore- and hindwing segmentation
o [[Seg_done]]: Done images (*.jpg)
o [[Segmented]]: The fore- and hindwing segmentation parameters (*.json)
● [[spp_segmentation_analysis]]: Segmented images after inspection and manual correction
o [[wing_segmentation_img]]: The visualizations of image segmentation (*.jpg)
o [[wing_shape_morph-seg]]: The results of image segmentation (*morph-seg.mat)
o [[morphology_analysis_spp_preference_table_template.csv]]: A table generated according to the images in the ‘wing_segmentation_img’ folder, which is later used for inspection
o [[morphology_analysis_spp_preference_table.csv]]: The result after manual inspection, which records the condition of different body parts of a specimen
o [[reflectance_table]]: The reflectance data for all body parts of all specimens
● [[spp_wing_grids_generation]]: Generated wing grids and processed data
o [[inspect_imgs]]: The visualization (*.jpg) of wing grids (no correction was needed in these results)
o [[spp_wing_parameters]]: Processed wing data. The original folder name is kept here.
o [[wing_matrix_visualization]]: The summarized multispectral reflectance (NIR [740], fNIR [940], F, FinRGB, PolDiff, UV, UVF, white, whitePol1, whitePol2) according to wing grids.
● [[spp_second-level_product]]: The processed “second-level products” (*_d-v_gridsPars*.mat). See Methods for the detailed data structure
● [[group_summary]]: The summary statistics for specified groups
o [[specimen_groups.csv]]: A table specifying groups
o [[specimen_groups_group_barcode_list.json]]: The group table in JSON format
o [[summary_matrices]]: The summary results according to the group table (*._summary.mat)
o [[summary_visualization]]: The summary visualization for each group (*.png)
o [[shp_tail_adv_vis]]: Wing shapes and tails replotted by scripts for advanced visualization (*.png)
o [[tail_summary_visualization]]: Tails replotted by scripts for advanced visualization (*.png)
● Blueprints and materials (Fig. 6)
[Protocol] https://www.protocols.io/private/2E2FB268F7AF11EBB05F0A58A9FEAC02
● Bash scripts and shell scripts running on the cluster
[Cluster] https://github.com/weipingchan/Bash_scripts_methodology_paper
● Image preprocessing to derive initial descriptive data for museum archiving
[Protocol] https://www.protocols.io/private/DEF29A74E44E11EB96DA0A58A9FEAC02
[Cluster] https://github.com/weipingchan/single_img_processing
● Inspection and manual correction of specimen bounding box (Fig. 3d)
[Local] https://github.com/weipingchan/Drawer_img_manual_define_bounding_boxes
● Inspection and manual correction of mask for background removal (Fig. 9a)
[Local] commercial painting software, such as Adobe Photoshop
● Data preparation and processing for color and shape quantification
[Protocol] https://www.protocols.io/private/DEF29A74E44E11EB96DA0A58A9FEAC02
● Body-part segmentation (Fig. 9c panels at right)
o Manually defined fore- and hindwing segmentation data
[Local] https://github.com/weipingchan/body-seg_distribute
o Segmentation
[Cluster] https://github.com/weipingchan/basic_segmentation
o Inspection and manual correction of primary landmarks (Fig. 9b)
[Local] https://github.com/weipingchan/manual_landmark_correction
● Multispectral reflectance at wing-size level (as table format; Fig. 9d)
[Protocol] https://www.protocols.io/private/F3292DF1FE0211EB878B0A58A9FEAC02
[Cluster] https://github.com/weipingchan/multispectral_reflectance_wing-size_level
● Dorsal-ventral side analyses (Fig. 4)
[Protocol] https://www.protocols.io/private/F3292DF1FE0211EB878B0A58A9FEAC02
[Cluster] https://github.com/weipingchan/dorsal_ventral_analysis
● Inspection and manual correction of secondary landmarks (Fig. 1d)
[Local] https://github.com/weipingchan/manual_wing_grid_correction
● Visualization (Fig. 1g-h & Fig. 5)
[Protocol] https://www.protocols.io/private/F3292DF1FE0211EB878B0A58A9FEAC02
● Multispectral reflectance at wing-pattern level with wing shape summary
[Local] https://github.com/weipingchan/dorsal_ventral_summary
● Advanced visualization for wing shapes and tails (Methods)
[Local] https://github.com/weipingchan/replot_tail_and_avg_shapes
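After extracting the archive, the folder layout listed under Raw data can be checked with a short inventory script. The following is a minimal sketch in Python, not part of the published pipeline: the function name `inventory` is hypothetical, and it assumes only that the archive has been unpacked into a local directory whose top-level folders match the names listed above.

```python
from collections import Counter
from pathlib import Path


def inventory(root):
    """Count files by (lowercased) extension under each top-level folder of `root`.

    Hypothetical helper for illustration; `root` is wherever the archive
    was extracted.
    """
    root = Path(root)
    counts = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Files that sit directly in `root` are grouped under ".".
        top = rel.parts[0] if len(rel.parts) > 1 else "."
        counts.setdefault(top, Counter())[path.suffix.lower()] += 1
    return counts
```

For example, calling `inventory` on the extracted directory and looking up `["Drawer_img_nef"][".nef"]` should give the number of RAW drawer images found, which can be compared with the total of 35 stated in the list above.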
Usage notes
The pipeline was developed mainly in MATLAB and R, but the data formats (e.g., *.mat, *.json) can also be read in Python or other environments.
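As an illustration of reading these formats outside MATLAB, here is a minimal Python sketch. It is an assumption-laden example, not the authors' tooling: the helper names (`read_json`, `read_mat`, `describe`) are hypothetical, the internal structure of the archived files is not assumed, and `read_mat` relies on SciPy's `scipy.io.loadmat`, which does not support MATLAB v7.3 (HDF5-based) files. If the archived *.mat files were saved in that format, a library such as h5py would be needed instead.

```python
import json
from pathlib import Path


def read_json(path):
    """Read a *.json file into plain Python objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def read_mat(path):
    """Read a *.mat file into a dict of variable name -> value.

    Requires SciPy. Assumes the file is NOT in MATLAB v7.3 format,
    which scipy.io.loadmat cannot read.
    """
    # Imported lazily so that JSON reading works without SciPy installed.
    from scipy.io import loadmat

    contents = loadmat(str(path), squeeze_me=True, struct_as_record=False)
    # Drop the metadata entries that loadmat adds (__header__, __version__, ...).
    return {k: v for k, v in contents.items() if not k.startswith("__")}


def describe(path):
    """Return a shallow summary: top-level names mapped to Python type names."""
    path = Path(path)
    data = read_mat(path) if path.suffix.lower() == ".mat" else read_json(path)
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    return {"<root>": type(data).__name__}
```

Calling `describe` on one of the *.json or *.mat files gives a quick overview of its top-level variables before consulting the detailed data structure documented by the authors.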