Rapid species discrimination of similar insects using hyperspectral imaging and lightweight edge artificial intelligence
Data files
Dec 19, 2025 version files (812.08 KB)
- README.md (4.02 KB)
- training_set.csv (538.01 KB)
- validation_set.csv (270.05 KB)
Abstract
Species discrimination of insects is an important aspect of ecology and biodiversity research. Traditional methods based on human visual experience and biochemical analysis cannot strike a balance between accuracy and timeliness. Morphological identification using computer vision and machine learning is expected to solve this problem, but image features give poor accuracy for very similar species and usually require complicated networks that are ill-suited to portable edge devices. In this work, we propose a fast and accurate method for discriminating similar insect species using hyperspectral features and a lightweight machine learning algorithm. Feature region selection, feature spectra selection and model quantification are used to optimize the discriminating network. Experimental results for six similar butterfly species in the genus Graphium show that, compared to morphological recognition with machine vision, our work achieves a higher accuracy of 100% and a shorter inference time of 0.35 ms, with the tiny convolutional neural network deployed on a neural-network chip. This study provides a rapid and highly accurate species discrimination method for insects with high appearance similarity, and paves the way for field discrimination using intelligent micro-spectrometers based on on-chip microstructures and AI chips.
https://doi.org/10.5061/dryad.cfxpnvxdc
All samples of the six species were divided into a training set and a test set at a ratio of 2:1. The integers 0 to 5 denote the species G. sarpedon, G. milon, G. doson, G. chironides, G. eurypylus and G. meyeri, respectively.
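The label encoding above can be written down as a small lookup table. This is a minimal sketch that assumes the integers follow the order in which the species are listed; the names `LABEL_TO_SPECIES` and `species_name` are illustrative and are not part of the dataset.

```python
# Assumed mapping: labels 0..5 follow the order in which the species are listed.
LABEL_TO_SPECIES = {
    0: "Graphium sarpedon",
    1: "Graphium milon",
    2: "Graphium doson",
    3: "Graphium chironides",
    4: "Graphium eurypylus",
    5: "Graphium meyeri",
}


def species_name(label):
    """Translate an integer class label into a species name.

    Raises KeyError for labels outside 0..5.
    """
    return LABEL_TO_SPECIES[int(label)]


print(species_name(3))
```

Keeping the mapping in one place makes it easier to turn predicted class indices back into readable species names.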
Description of the data and file structure
Butterflies (Lepidoptera: Rhopalocera) are the second largest group of insects, with approximately 17,000 recorded species on Earth. The classification and identification of butterfly species has long been a significant area of research in entomological taxonomy. In this work, six butterfly species in the genus Graphium were selected for experiments: G. sarpedon, G. milon, G. doson, G. chironides, G. eurypylus and G. meyeri. The total number of samples is 36, with 6 samples per species. All specimens were prepared using the standard method of softening, pinning, wing stretching, and drying. The boundary shape, spot distribution, and color of the Graphium samples are all highly similar. Fast discrimination between some of these species is therefore especially difficult, even for professional scientists.
With field use in mind, we chose the handheld Specim IQ hyperspectral camera (Specim Spectral Imaging Ltd.), which has a wavelength range of 400 to 1000 nm and a spectral resolution of 7 nm. It is a line-scan camera based on push-broom technology with 204 spectral bands. To minimize light-scattering errors, the butterfly specimens and the standard white reference (WR) were placed flat on a black, low-reflectance background. The camera lens was oriented vertically downward, at a distance of 26 cm from the butterfly specimen. Clear hyperspectral images of the samples were obtained by adjusting the focus and the exposure time; the exposure time used in this work was 20 ms. In addition to the raw image data of the target area, the hyperspectral images were further processed to yield a reflectance spectrum for every pixel. Because both ends of the camera's spectral range have a lower signal-to-noise ratio (SNR), only the 420 to 990 nm range, comprising 191 spectral bands, was used in this work.
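The spectral cropping described above amounts to keeping only the bands whose center wavelength lies within 420 to 990 nm. The sketch below illustrates the idea on made-up values; the function name `select_bands` and the example wavelengths are hypothetical, and the camera's actual band centers are not reproduced here.

```python
def select_bands(wavelengths, spectrum, low_nm=420.0, high_nm=990.0):
    """Keep only bands whose wavelength lies within [low_nm, high_nm].

    wavelengths: band center wavelengths in nm
    spectrum:    reflectance values, one per band
    Returns (kept_wavelengths, kept_values) as two lists.
    """
    if len(wavelengths) != len(spectrum):
        raise ValueError("wavelengths and spectrum must have the same length")
    kept = [(w, v) for w, v in zip(wavelengths, spectrum) if low_nm <= w <= high_nm]
    return [w for w, _ in kept], [v for _, v in kept]


# Synthetic example values (not taken from the dataset).
wavelengths = [400.0, 410.0, 420.0, 700.0, 990.0, 1000.0]
spectrum = [0.05, 0.06, 0.10, 0.40, 0.35, 0.20]
kept_w, kept_v = select_bands(wavelengths, spectrum)
print(kept_w, kept_v)
```

The published CSV files appear to contain only the retained range already, so this step would matter mainly when starting again from full 204-band camera output.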
training_set
wb: wb is the abbreviation of waveband, and its unit is 'nm'. The first row of the datasheet lists the waveband values, corresponding to the spectral wavelengths of each band.
Class label (first column): The first column indicates the class label of each sample. A total of six classes are included in the dataset, encoded as integers from 0 to 5.
Spectral data: Each row (excluding the header) represents one sample. Each column (excluding the first column) represents the reflectance value at a specific waveband. Therefore, each row corresponds to a complete spectral curve of a single sample.
Measurement unit: The spectral values represent reflectance, which is a dimensionless quantity (or normalized reflectance).
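The layout described above (a header row of wavebands, then one sample per row with the class label in the first column) can be read with the Python standard library. This is a minimal sketch under stated assumptions: the first header cell is taken to be a text label such as `wb`, and the function name `load_spectra` and the tiny in-memory sample are illustrative only.

```python
import csv
import io


def load_spectra(handle):
    """Parse a CSV with a waveband header row and label-first sample rows.

    Returns (wavebands, labels, spectra):
      wavebands: list of floats from the header row (nm)
      labels:    list of ints, one per sample
      spectra:   list of lists of floats (reflectance), one per sample
    """
    reader = csv.reader(handle)
    header = next(reader)
    # Assumption: header[0] is a text cell (e.g. "wb"); the rest are wavelengths.
    wavebands = [float(cell) for cell in header[1:]]
    labels, spectra = [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        values = [float(cell) for cell in row[1:]]
        if len(values) != len(wavebands):
            raise ValueError("row length does not match the header")
        labels.append(int(float(row[0])))
        spectra.append(values)
    return wavebands, labels, spectra


# Tiny synthetic stand-in for the real file (values are made up).
sample = io.StringIO("wb,420.5,423.5,426.5\n0,0.10,0.12,0.15\n5,0.30,0.31,0.29\n")
wavebands, labels, spectra = load_spectra(sample)
print(len(wavebands), labels)
```

For the actual files, the same function could be called as `load_spectra(open("training_set.csv", newline=""))`, provided the header really follows the assumed form.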
validation_set
wb: wb is the abbreviation of waveband, and its unit is 'nm'. The first row of the datasheet lists the waveband values, corresponding to the spectral wavelengths of each band.
Class label (first column): The first column indicates the class label of each sample. A total of six classes are included in the dataset, encoded as integers from 0 to 5.
Spectral data: Each row (excluding the header) represents one sample. Each column (excluding the first column) represents the reflectance value at a specific waveband. Therefore, each row corresponds to a complete spectral curve of a single sample.
Measurement unit: The spectral values represent reflectance, which is a dimensionless quantity (or normalized reflectance).
Key Information Sources
The dataset comprises hyperspectral measurements collected from six butterfly species belonging to the genus Graphium: G. sarpedon, G. milon, G. doson, G. chironides, G. eurypylus, and G. meyeri. A total of 36 samples were collected, with six samples per species.
All spectral data were acquired using a handheld Specim IQ hyperspectral camera (Specim Spectral Imaging Ltd.), covering a wavelength range from 400 to 1000 nm with a spectral resolution of 7 nm.
In total, the full-spectrum dataset contains 360 spectra: each individual sample contributed 10 spectra, so each species contributed 60 spectra.
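The counts above follow from simple arithmetic, and a 2:1 split of 360 spectra gives 240 and 120 whether the split is made per spectrum or per specimen. The sketch below reproduces that arithmetic and adds a helper for checking class balance in a list of labels; the constant and function names are illustrative, and the source does not state whether the published split is stratified by class.

```python
from collections import Counter

N_SPECIES = 6
SPECIMENS_PER_SPECIES = 6
SPECTRA_PER_SPECIMEN = 10

total_specimens = N_SPECIES * SPECIMENS_PER_SPECIES                 # 36
spectra_per_species = SPECIMENS_PER_SPECIES * SPECTRA_PER_SPECIMEN  # 60
total_spectra = total_specimens * SPECTRA_PER_SPECIMEN              # 360

# A 2:1 ratio puts two thirds of the spectra in the training set.
train_size = total_spectra * 2 // 3            # 240
validation_size = total_spectra - train_size   # 120


def class_counts(labels):
    """Count how many spectra carry each class label, sorted by label."""
    return dict(sorted(Counter(labels).items()))


# Synthetic, perfectly balanced label list standing in for real labels.
balanced = [label for label in range(N_SPECIES) for _ in range(spectra_per_species)]
print(total_spectra, train_size, validation_size, class_counts(balanced))
```

Applying `class_counts` to the label column of each CSV file is a quick way to see how the classes are actually distributed across the two sets.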
- Wang, Xuquan; Ma, Zhiyuan; Xing, Yujie et al. (2025). Rapid species discrimination of similar insects using hyperspectral imaging and lightweight edge artificial intelligence. Zenodo. https://doi.org/10.5281/zenodo.10518075
- Wang, Xuquan; Ma, Zhiyuan; Xing, Yujie et al. (2025). Rapid species discrimination of similar insects using hyperspectral imaging and lightweight edge artificial intelligence. Zenodo. https://doi.org/10.5281/zenodo.10518076
- Wang, Xuquan; Ma, Zhiyuan; Xing, Yujie et al. (2024). Rapid species discrimination of similar insects using hyperspectral imaging and lightweight edge artificial intelligence. Royal Society Open Science. https://doi.org/10.1098/rsos.240485
