# Data from: Assessment of conservation status of Ferula huber-morathii: Association with population genetic structure and regional climate

## Data files

## Sep 06, 2024 version files 350.22 KB

## Abstract

Ferula huber-morathii is an endemic and medicinally important plant. The species is distributed in eight Turkish localities, including three newly identified ones. Its extent of occurrence and area of occupancy are estimated at 3963 km² and 32 km², respectively. All localities are characterized by East Mediterranean and sub-Mediterranean precipitation regimes; however, temperatures rise sharply and precipitation decreases during the flowering period of the species. Population sizes are quite small, and the number of reproducing individuals in some populations is below ten. Analyses of ISSR markers showed the percentage of polymorphic loci to be 94% at the species level and 56% at the population level. The level of genetic differentiation (measured by GST) was 0.37 and the estimated level of gene flow among populations (NM) was 0.84. The percentage of variance occurring within and among populations, determined by AMOVA, was 75% and 25%, respectively. STRUCTURE analysis revealed two genetic clusters of individuals with a geographic structure similar to that found in UPGMA and an ordination analysis. Some populations had both low numbers of individuals and low genetic diversity. Since many of the populations are subject to anthropogenic disturbance, the species should remain in the EN category. In addition, it is suggested that a new in-situ conservation area be created around nearby dams, situated in the same climate area as the currently known populations.

## README: This dataset includes Figure 1, FR GEN-DIVERSITY data, and Figure 4, with explanations of each.

The Figure 1 dataset provides resources for the species' conservation strategy based on habitat and distribution areas. In this dataset, coordinates obtained from herbarium records and field studies were used to determine the AOO and EOO, which are an important part of the threat category criteria for Ferula huber-morathii. For this purpose, the distribution map of the species was created by adding the population coordinates to the GeoCat program (Bachman et al. 2011).

A total of eight localities were recorded.

- AOO: Area of Occupancy

32 km²

AOO is a minimum of 4 km² (one 2 × 2 km cell) for each population.

- EOO: Extent of Occurrence

3,963.064 km²

- FR GEN-DIVERSITY data set

Table 4 contains genetic diversity data for *F. huber-morathii* and its populations.

* N = Sample sizes

* Na = Observed number of alleles

* Ne = Effective number of alleles [Kimura and Crow (1964)]

* H = Nei's (1973) gene diversity

* I = Shannon's Information index [Lewontin (1972)]

Formulas used to obtain the data:

N = Sample sizes

Na = Number of Different Alleles

Ne = Number of Effective Alleles = 1 / (p^2 + q^2)

H = Σ (1 - p_i^2 - q_i^2) / n

I = Shannon's Information Index = -1* (p * Ln (p) + q * Ln(q))

He = Expected Heterozygosity = 2 * p * q

Here, for diploid binary data and assuming Hardy-Weinberg equilibrium, q = (1 - band frequency)^0.5 and p = 1 - q.
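The per-locus formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code used by POPGENE or GenAlEx; the function names `dominant_locus_stats` and `mean_over_loci` are hypothetical, and the band frequency of 0.75 is a made-up input rather than a value from the dataset.

```python
import math


def dominant_locus_stats(band_freq):
    """Per-locus statistics for one binary (band present/absent) locus.

    Assumes diploid data in Hardy-Weinberg equilibrium, as stated above.
    """
    if not 0.0 <= band_freq <= 1.0:
        raise ValueError("band_freq must be between 0 and 1")
    q = math.sqrt(1.0 - band_freq)  # q = (1 - band frequency)^0.5
    p = 1.0 - q                     # p = 1 - q
    ne = 1.0 / (p ** 2 + q ** 2)    # effective number of alleles
    he = 2.0 * p * q                # expected heterozygosity
    # Shannon's index; a term with frequency 0 contributes 0 by convention.
    shannon = sum(-x * math.log(x) for x in (p, q) if x > 0.0)
    return {"p": p, "q": q, "Ne": ne, "He": he, "I": shannon}


def mean_over_loci(band_freqs, key):
    """Average one statistic over a list of per-locus band frequencies."""
    values = [dominant_locus_stats(f)[key] for f in band_freqs]
    return sum(values) / len(values)


stats = dominant_locus_stats(0.75)  # hypothetical band frequency
print(stats)
```

With a band frequency of 0.75 both allele frequencies equal 0.5, which is the point where Ne, He, and I reach their maximum for a two-allele locus.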

Figure 4 was obtained by loading the output of the STRUCTURE analysis into the CLUMPAK program. To account for potential recombination when clustering the individuals of the populations, the Admixture model was used to calculate the probability of the data, Pr(X|K) or L(K), for each number of clusters from K=1 to K=10. After a burn-in period of 100,000 or 300,000 MCMC iterations for each K, 10 runs with ten iterations were performed. The amount of variance in the likelihood of each K was estimated (Evanno et al. 2005).

MCMC: integrates over possible chunk sizes and breakpoints to find different values of K.

The optimal number of groups was determined using ΔK in CLUMPAK (Kopelman et al. 2015).

Retrieving data from files for each K

K: number of populations

| K | Mean | Standard deviation | Median |
|---|---|---|---|
| 1 | -754.21 | 0.128668393770775 | -754.2 |
| 2 | -630.77 | 0.494525586350747 | -630.7 |
| 3 | -640.63 | 6.97982011866271 | -638.1 |
| 4 | -718.74 | 21.814072929597 | -722 |
| 5 | -743.32 | 15.4657902050515 | -748 |
| 6 | -743.11 | 17.6488872800022 | -739.1 |
| 7 | -728.84 | 18.4378957584644 | -732.6 |
| 8 | -882.44 | 502.315585176402 | -723.3 |
| 9 | -704.89 | 10.4891954992851 | -705.25 |
| 10 | -707.07 | 12.1466090375508 | -709.7 |

Calculating Best K by Evanno

| K | Delta(K) |
|---|---|
| 2 | 269.55127030668 |
| 3 | 9.77818895611831 |
| 4 | 2.45392046559867 |
| 5 | 1.60289255649564 |
| 6 | 0.796650790326652 |
| 7 | 9.1046181299151 |
| 8 | 0.659246915231004 |
| 9 | 17.1347745413125 |

Max Delta K: 269.55127030668

Optimal K by Evanno is: 2
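The Delta(K) values reported above can be reproduced from the per-K means and standard deviations. The sketch below assumes the usual Evanno et al. (2005) definition, Delta(K) = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)), applied to the mean L(K) values; it is an illustration, not CLUMPAK's own code, and the names `LNPK` and `evanno_delta_k` are hypothetical.

```python
# Mean and standard deviation of L(K) for each K, copied from the values above.
LNPK = {
    1: (-754.21, 0.128668393770775),
    2: (-630.77, 0.494525586350747),
    3: (-640.63, 6.97982011866271),
    4: (-718.74, 21.814072929597),
    5: (-743.32, 15.4657902050515),
    6: (-743.11, 17.6488872800022),
    7: (-728.84, 18.4378957584644),
    8: (-882.44, 502.315585176402),
    9: (-704.89, 10.4891954992851),
    10: (-707.07, 12.1466090375508),
}


def evanno_delta_k(lnpk):
    """Return {K: Delta(K)} for every K that has both neighbours K-1 and K+1."""
    delta = {}
    for k in sorted(lnpk):
        if k - 1 in lnpk and k + 1 in lnpk:
            mean_prev = lnpk[k - 1][0]
            mean_k, sd_k = lnpk[k]
            mean_next = lnpk[k + 1][0]
            # Absolute second-order difference of the means, scaled by sd(L(K)).
            delta[k] = abs(mean_next - 2.0 * mean_k + mean_prev) / sd_k
    return delta


delta_k = evanno_delta_k(LNPK)
best_k = max(delta_k, key=delta_k.get)
for k, value in delta_k.items():
    print(f"Delta(K={k}) = {value:.4f}")
print("Optimal K by Evanno:", best_k)
```

Delta(K) is undefined for the smallest and largest K tested, because each value needs both neighbours; this is why the list runs from K=2 to K=9 only.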

## Methods

The area of occupancy and other distributional data were determined with the Geospatial Conservation Assessment Tool (GeoCat) program (http://geocat.kew.org; Bachman et al. 2011), based on the 2 × 2 km cell size recommended by the IUCN, and on interpopulation distances obtained with the Google Earth program. In accordance with the IUCN criteria (2012; 2019) and the updated population data collected in the field study, these data were used to reassess the Red List category of the species.
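As an illustration of the grid-based AOO calculation, the following minimal sketch counts the 2 × 2 km cells that contain at least one locality and multiplies by the cell area. It assumes the coordinates have already been projected to kilometres; the function name `area_of_occupancy_km2` and the eight points are hypothetical and are not the localities in this dataset. GeoCat's own grid placement may differ, so results near cell borders need not match exactly.

```python
import math


def area_of_occupancy_km2(points_km, cell_km=2.0):
    """Count occupied grid cells and multiply by the area of one cell."""
    occupied = {
        (math.floor(x / cell_km), math.floor(y / cell_km))
        for x, y in points_km
    }
    return len(occupied) * cell_km ** 2


# Hypothetical projected coordinates (km), one point per locality.
localities = [
    (1.0, 1.0), (5.2, 0.4), (9.7, 3.1), (14.0, 8.5),
    (20.3, 2.2), (26.8, 11.9), (33.1, 6.6), (41.5, 15.0),
]
print(area_of_occupancy_km2(localities))
```

Eight localities that each fall in a different cell give 8 × 4 km² = 32 km², consistent with the minimum of 4 km² per population.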

The data obtained using the ISSR markers were scored in a binary matrix as present (1) or absent (0). The observed number of alleles (Na), effective number of alleles (Ne), number of polymorphic loci (P_{LS}), percentage of polymorphic loci (P_{LY}), Nei’s genetic diversity (H) (1973, 1987), and Shannon’s information index (Lewontin, 1972) served as general estimators of genetic diversity and were calculated using POPGENE ver. 1.32 (Yeh et al. 1997) and GenAlEx ver. 6.5 (Peakall and Smouse, 2012).
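To make the binary scoring concrete, here is a small sketch of how the percentage of polymorphic loci could be derived from a 0/1 matrix. It assumes rows are individuals and columns are loci, and treats a locus as polymorphic when the band is present in some individuals but not all; POPGENE and GenAlEx may apply their own criteria. The function name and the toy matrix are hypothetical and are not taken from the dataset.

```python
def percent_polymorphic_loci(matrix):
    """Percentage of loci whose band is present in some but not all individuals."""
    n_individuals = len(matrix)
    n_loci = len(matrix[0])
    polymorphic = 0
    for locus in range(n_loci):
        bands = sum(row[locus] for row in matrix)
        if 0 < bands < n_individuals:
            polymorphic += 1
    return 100.0 * polymorphic / n_loci


# Toy matrix: 3 individuals (rows) scored at 4 loci (columns).
toy_matrix = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
]
print(percent_polymorphic_loci(toy_matrix))
```

In the toy matrix the first and last loci show a band in every individual, so only the two middle loci count as polymorphic.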

Finally, individuals were classified into genetic clusters using STRUCTURE version 2.3.4 (Pritchard et al. 2000). To account for potential recombination across inferred clusters, the Admixture model was used to calculate the probability of the data, Pr(X|K) or L(K), for each number of clusters from K=1 through K=10. After a burn-in period of 100,000 or 300,000 MCMC replications for each K, 10 runs with ten iterations were performed. The amount of variance in the likelihood of each K was estimated (Evanno et al. 2005). The best-fit number of groups was determined using ΔK in CLUMPAK (Kopelman et al. 2015).