# ForGEM Functionality


### Genetic variety

Genetic variety can be measured as the number of different alleles or different genotypes in a population (Gregorius 1977, Gregorius et al. 1985).

### Genetic diversity

Genetic diversity characterizes the heterogeneity of the distribution of genetic variants in a population or a sample therefrom (Hattemer 1991). It can thus measure the allelic diversity at the k-th locus or the genotypic diversity of a deme.

${\displaystyle v_{k}={\frac {1}{\sum \limits _{i=1}^{n_{k}}(p_{i}^{k})^{2}}}\qquad \qquad 1\leq v_{k}\leq n_{k}}$(27)

with: $n_k$ the number of different genetic types (alleles, genotypes), and $p_i^k$ the frequency of the i-th genetic type. $v_k$ equals unity if there is only one genetic type, and equals $n_k$ if all genetic types are equally frequent (Gregorius 1978).
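As a minimal sketch of Eqn. (27), the diversity can be computed directly from a list of relative frequencies. The function name `diversity` and the input layout (a plain list of frequencies summing to one) are illustrative assumptions, not part of ForGEM.

```python
def diversity(freqs):
    """Effective number of genetic types, v = 1 / sum(p_i^2)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    return 1.0 / sum(p * p for p in freqs)


# One genetic type gives the lower bound (unity);
# four equally frequent types give the upper bound (n = 4).
print(diversity([1.0]))
print(diversity([0.25, 0.25, 0.25, 0.25]))
# Unequal frequencies fall between the two bounds.
print(diversity([0.7, 0.2, 0.1]))
```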

### Mean effective number of alleles

In the case of allelic diversity, $v_k$ can be considered the effective number of alleles at locus k if $n_k$ alleles occur with frequencies $p_i^k$ (i = 1, ..., $n_k$) (Hattemer 1991). The mean effective number of alleles is then the harmonic mean of $v_k$ over m loci.

${\displaystyle {\overline {v}}=m\cdot {\frac {1}{\sum \limits _{k=1}^{m}{\frac {1}{v_{k}}}}}\qquad \qquad 1\leq {\overline {v}}\leq {\frac {1}{m}}\sum \limits _{k=1}^{m}{n_{k}}}$(28)
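The harmonic mean in Eqn. (28) can be sketched as follows. The function name `mean_effective_alleles` and the choice to pass the per-locus diversities as a list are assumptions made for illustration.

```python
def mean_effective_alleles(v_per_locus):
    """Harmonic mean of the per-locus effective numbers of alleles."""
    m = len(v_per_locus)
    if m == 0:
        raise ValueError("at least one locus is required")
    return m / sum(1.0 / v for v in v_per_locus)


# Two loci with effective allele numbers 2 and 4.
v_bar = mean_effective_alleles([2.0, 4.0])
print(v_bar)
```

Because the harmonic mean is dominated by small values, a single nearly monomorphic locus pulls the mean down more strongly than it would in an arithmetic mean.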

### Hypothetical gametic multi-locus diversity

The diversity of the gametic output of populations is a special case of diversity and characterizes the adaptive potential of sexually reproducing populations (Gregorius 1978). It is hypothetical in the sense that it assumes both the absence of fertility selection and the independence of the allele distributions at different loci (i.e. no linkage) (Hattemer 1991).

${\displaystyle v_{gam}=\prod \limits _{k=1}^{m}{v_{k}}\qquad \qquad 1\leq v_{gam}\leq \prod \limits _{k=1}^{m}{n_{k}}}$(29)

with: m the number of unlinked loci, and $v_k$ the allelic diversity of the k-th locus based on Eqn. (27). $v_{gam}$ is thus a measure of the effective number of multilocus gametes that can be produced in a population (Gregorius 1978).
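A small sketch of Eqn. (29), assuming the allele frequencies of each locus are available as one list per locus. The names `gametic_diversity` and `loci` are hypothetical.

```python
import math


def gametic_diversity(loci):
    """Product over loci of the allelic diversity 1 / sum(p_i^2)."""
    per_locus = [1.0 / sum(p * p for p in freqs) for freqs in loci]
    return math.prod(per_locus)


# Locus 1: two equally frequent alleles (v = 2).
# Locus 2: four equally frequent alleles (v = 4).
loci = [[0.5, 0.5], [0.25, 0.25, 0.25, 0.25]]
print(gametic_diversity(loci))  # upper bound: 2 * 4 gamete types
```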

### Genetic distance between demes

The differentiation between two demes is characterized by counting the number of genetic variants which the demes do not share. Thus, the allelic differentiation between demes X and Y represents the genetic distance between the demes (Gregorius 1974, Gregorius & Roberts 1986).

${\displaystyle d_{xy}={\frac {1}{2}}\cdot \sum \limits _{i=1}^{n}{|x_{i}-y_{i}|}\quad \quad \quad \quad 0\leq d_{xy}\leq 1}$(30)

with: $x_i$ and $y_i$ the relative frequencies of genetic type i (an allele at a given locus, or a genotype) in demes X and Y. If the genetic distance equals zero, both populations have the same alleles or genotypes with the same frequencies. $d_{xy}$ equals unity if the populations have no alleles or genotypes in common (Gregorius 1974, 1978, 1984). Note that the genetic distance is a symmetrical statistic ($d_{xy} = d_{yx}$) and that the distance between populations X and Y cannot exceed the sum of their distances to a third population Z ($d_{xy} \leq d_{xz} + d_{yz}$) (Hattemer 1991).
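The distance can be sketched as half the sum of absolute frequency differences over all genetic types found in either deme. Representing each deme as a dictionary from type label to relative frequency is an assumption made here for convenience; types absent from a deme count as frequency zero.

```python
def genetic_distance(x, y):
    """d_xy = 1/2 * sum over types of |x_i - y_i|."""
    types = set(x) | set(y)
    return 0.5 * sum(abs(x.get(t, 0.0) - y.get(t, 0.0)) for t in types)


deme_x = {"A1": 0.5, "A2": 0.5}
deme_y = {"A1": 0.25, "A2": 0.75}
deme_z = {"A3": 1.0}

print(genetic_distance(deme_x, deme_x))  # identical demes
print(genetic_distance(deme_x, deme_z))  # no shared alleles
# Symmetry and the triangle inequality can be checked numerically.
d_xy = genetic_distance(deme_x, deme_y)
d_xz = genetic_distance(deme_x, deme_z)
d_yz = genetic_distance(deme_y, deme_z)
print(d_xy == genetic_distance(deme_y, deme_x), d_xy <= d_xz + d_yz)
```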

### Genetic differentiation among demes

This statistic represents the genetic distance between a deme and its complement, i.e. the union of all other demes (Gregorius 1985).

${\displaystyle D_{j}={\frac {1}{2}}\cdot \sum \limits _{i=1}^{n}{|p_{i}^{j}-{\bar {p}}_{i}^{-j}|}\quad \quad \quad \quad 0\leq D_{j}\leq 1}$(31)

with: $p_i^j$ the frequency of allele or genotype i in deme j, and $\bar{p}_i^{-j}$ the average frequency of allele or genotype i in the complement of deme j. The substructure of the complement has no influence on $D_j$, as different complements can yield the same $\bar{p}_i^{-j}$. Thus, identical values of $D_j$ do not necessarily indicate demes with an identical genetic structure. Conversely, however, demes with an identical genetic structure do possess an identical $D_j$ (Hattemer 1991).

### Average genetic differentiation

The average genetic differentiation among m demes is the weighted mean of $D_j$.

${\displaystyle \delta =\sum \limits _{j=1}^{m}D_{j}\cdot c_{j}\quad \quad \quad \quad 0\leq \delta \leq 1}$(32)

with: m the number of demes, and $c_j$ the relative size of deme j (Gregorius 1984, 1988). δ attains zero if all demes have the same genetic structure, and reaches unity if all demes, considered in pairs, have no gene in common (Hattemer 1991).
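The deme-versus-complement differentiation and its weighted mean can be sketched together. Two assumptions are made here that the text does not spell out: the complement frequencies are taken as the size-weighted average over all other demes, and $c_j$ is the deme size divided by the total size. All function names are illustrative.

```python
def complement_frequencies(freqs, sizes, j):
    """Size-weighted mean frequencies over all demes except deme j."""
    others = [k for k in range(len(freqs)) if k != j]
    total = sum(sizes[k] for k in others)
    return [
        sum(sizes[k] * freqs[k][i] for k in others) / total
        for i in range(len(freqs[j]))
    ]


def deme_differentiation(freqs, sizes, j):
    """D_j: genetic distance between deme j and its complement."""
    comp = complement_frequencies(freqs, sizes, j)
    return 0.5 * sum(abs(p - q) for p, q in zip(freqs[j], comp))


def average_differentiation(freqs, sizes):
    """delta: mean of D_j weighted by relative deme size c_j."""
    total = sum(sizes)
    return sum(
        sizes[j] / total * deme_differentiation(freqs, sizes, j)
        for j in range(len(freqs))
    )


# Rows are demes, columns are allele frequencies in a fixed allele order.
freqs = [[0.5, 0.5], [0.5, 0.5], [1.0, 0.0]]
sizes = [10, 10, 20]
print([deme_differentiation(freqs, sizes, j) for j in range(3)])
print(average_differentiation(freqs, sizes))
```

Note that demes 0 and 1 have the same genetic structure and obtain the same $D_j$ in this example, while the third deme stands apart from its complement.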

### Differentiation within a population

The concept of differentiation can also be applied within a population by considering each individual in that population a deme. The number of identical individuals can be counted and expressed relative to the number of other genetic types:

${\displaystyle \delta _{T}={\frac {N}{N-1}}\cdot \left(1-\sum \limits _{i=1}^{n}p_{i}^{2}\right)\quad \quad \quad \quad 0\leq \delta _{T}\leq 1}$(33)

with: N the sample size, and $p_i$ the frequency of genetic type i (allele or genotype). $\delta_T$ indicates the total genetic difference between all individuals of a population. $\delta_T$ equals zero if all individuals of the population are of the same genotype, and equals unity if all individuals are different (Gregorius 1987, 1988). $\delta_T$ represents the probability that two individuals sampled from the same population without replacement represent different variants (Hattemer 1991).
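A minimal sketch of the within-population differentiation, assuming the commonly used form N/(N-1) · (1 - Σ p_i²) for this statistic. The function name and the input (a list with one genetic-type label per sampled individual) are hypothetical.

```python
from collections import Counter


def total_differentiation(sample):
    """N/(N-1) * (1 - sum(p_i^2)) for a list of genetic-type labels."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(sample)
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))


print(total_differentiation(["AA", "AA", "AA", "AA"]))  # all identical
print(total_differentiation(["AA", "AB", "BB", "BC"]))  # all different
print(total_differentiation(["AA", "AA", "AB", "BB"]))  # intermediate
```

The factor N/(N-1) is what makes the statistic reach unity in a finite sample in which every individual is different.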

Note that all differentiation measures range between zero and unity, whereas the genetic diversity measures range between unity and the number of genetic types, n (Gregorius 1987).

### Actual heterozygosity

${\displaystyle H_{a}=\sum \limits _{i<j}P_{ij}}$(34)

with: $P_{ij}$ the frequency of the genotype with alleles i and j, with i ≠ j. $H_a$ indicates the fraction of observed heterozygotes in the population.

### Fixation index

The fixation index indicates, for the locus considered, the surplus or deficit of heterozygotes compared to Hardy-Weinberg equilibrium.

${\displaystyle F=1-{\frac {H_{a}}{H_{e}}}}$(35)

with: $H_e$ the expected heterozygosity based on Hardy-Weinberg equilibrium.
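The actual heterozygosity and the fixation index can be sketched for a single locus as follows. This assumes the usual definitions: the expected heterozygosity is one minus the sum of squared allele frequencies, and the fixation index is one minus the ratio of actual to expected heterozygosity. The function name and the genotype encoding (one tuple of two allele labels per individual) are illustrative.

```python
from collections import Counter


def heterozygosity_and_fixation(genotypes):
    """Return (H_a, H_e, F) for a list of diploid genotypes at one locus."""
    n = len(genotypes)
    # Observed fraction of heterozygous individuals.
    h_a = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies from the 2N gene copies in the sample.
    allele_counts = Counter(allele for g in genotypes for allele in g)
    h_e = 1.0 - sum((c / (2 * n)) ** 2 for c in allele_counts.values())
    # F is undefined for a monomorphic locus (H_e = 0).
    f = 1.0 - h_a / h_e if h_e > 0 else float("nan")
    return h_a, h_e, f


sample = [("A", "A")] * 4 + [("A", "B")] * 2 + [("B", "B")] * 4
h_a, h_e, f = heterozygosity_and_fixation(sample)
print(h_a, h_e, f)  # positive F: fewer heterozygotes than expected
```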

### F-statistics

F-statistics measure the degree of deviation of genotypic frequencies from those expected under random mating in structured populations (Falconer 1996; Weir & Cockerham 1984, cited in Larsen 1996).

$F_{IS}$ (within-population fixation index): inbreeding coefficient of an individual relative to its own subpopulation. Measures inbreeding due to non-random mating within a subpopulation.

$F_{ST}$ (between-populations fixation index): average inbreeding of the subpopulation relative to the whole population, or the correlation between two randomly chosen alleles in a subpopulation relative to the alleles in the whole population. Measures inbreeding due to the correlation among alleles caused by their occurrence in the same subpopulation.

$F_{IT}$ (total fixation index): inbreeding coefficient of an individual relative to the whole population, or the correlation between gametes for the total population. Measures the extent of inbreeding in the entire population (for neutral alleles).

In a random mating population: $F_{IS} = 0$ and $F_{IT} = F_{ST}$. If all populations are genetically identical: $F_{ST} = 0$ and $F_{IS} = F_{IT}$.
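The two special cases can be illustrated with the classical heterozygosity-based definitions of the F-statistics. This is a sketch under stated assumptions, not the Weir & Cockerham estimators: `h_i` is the mean observed heterozygosity within subpopulations, `h_s` the mean expected heterozygosity within subpopulations, and `h_t` the expected heterozygosity of the pooled population. These names are hypothetical.

```python
def f_statistics(h_i, h_s, h_t):
    """Return (F_IS, F_ST, F_IT) from three heterozygosity levels."""
    f_is = 1.0 - h_i / h_s
    f_st = 1.0 - h_s / h_t
    f_it = 1.0 - h_i / h_t
    return f_is, f_st, f_it


f_is, f_st, f_it = f_statistics(0.3, 0.4, 0.5)
print(f_is, f_st, f_it)
# The three indices are linked by (1 - F_IT) = (1 - F_IS) * (1 - F_ST).
print(abs((1 - f_it) - (1 - f_is) * (1 - f_st)) < 1e-12)

# Random mating within subpopulations: h_i == h_s.
print(f_statistics(0.4, 0.4, 0.5))
# Genetically identical subpopulations: h_s == h_t.
print(f_statistics(0.3, 0.5, 0.5))
```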

### Spatial analysis / geostatistics

A basic tool in geostatistics is the variogram, which is defined as:

${\displaystyle \gamma (h)={\frac {1}{2}}\cdot E\left[\left(v_{j}(x)-v_{j}(x+h)\right)^{2}\right]}$

where x and x+h are two locations separated by distance h, E(.) is the mathematical expectation, and $v_j$ is some diversity function linked to location x. An empirical variogram quantifies the spatial variability of a variable as a function of the distance h, and thereby describes its spatial dependence. It is determined by calculating half the mean squared difference between observations that are separated by a distance approximately equal to h, for several values of h. Next, a so-called transitive model is fitted through these empirical variogram values. Common transitive variogram models are the exponential model, the Gaussian model and the spherical model.

All transitive models depend on the nugget ($C_0$), sill ($C_1$) and range (b) parameters. The nugget, the positive intercept of the variogram with the ordinate, represents unexplained spatially dependent variation or purely random variance. The sill, the asymptote of the variogram, is the value at which the variogram levels out, and the distance at which the levelling occurs is known as the range of the spatial dependence. The exponential model describes attributes characterised by abrupt changes at all distances, the Gaussian model describes continuous, gradually varying attributes, and the spherical model describes attributes with abrupt changes at discrete and regular spacings (range) but where the distance between the abrupt changes is not clearly defined (Burrough 1998). As a measure of how well these models fit, the ratio of the Sum of Squared Deviations to the Total Sum of Squares (SSD/SST) can be used. The closer this ratio is to 0, the better the fit.
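The workflow described above can be sketched in plain Python. Several choices here are assumptions rather than the ForGEM implementation: the empirical variogram uses the semivariance convention (half the mean squared difference), pairs are assigned to a lag when their separation lies within a tolerance `tol` of it, $C_1$ is treated as the partial sill so that the models level out at $C_0 + C_1$, and the model formulas are one common parameterisation among several in use.

```python
import math


def empirical_variogram(coords, values, lags, tol):
    """Semivariance per lag from all point pairs within tol of that lag."""
    result = []
    for h in lags:
        squares = [
            (values[i] - values[j]) ** 2
            for i in range(len(coords))
            for j in range(i + 1, len(coords))
            if abs(math.dist(coords[i], coords[j]) - h) <= tol
        ]
        result.append(0.5 * sum(squares) / len(squares) if squares else float("nan"))
    return result


def exponential(h, c0, c1, b):
    return c0 + c1 * (1.0 - math.exp(-h / b))


def gaussian(h, c0, c1, b):
    return c0 + c1 * (1.0 - math.exp(-((h / b) ** 2)))


def spherical(h, c0, c1, b):
    if h >= b:
        return c0 + c1  # beyond the range the model stays at the plateau
    return c0 + c1 * (1.5 * h / b - 0.5 * (h / b) ** 3)


def ssd_over_sst(observed, predicted):
    """Goodness of fit: values near 0 indicate a close fit."""
    mean = sum(observed) / len(observed)
    ssd = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean) ** 2 for o in observed)
    return ssd / sst


# Four points on a transect with a made-up diversity value at each point.
coords = [(0.0,), (1.0,), (2.0,), (3.0,)]
values = [1.0, 2.0, 4.0, 7.0]
lags = [1.0, 2.0, 3.0]
gamma = empirical_variogram(coords, values, lags, tol=0.1)
print(gamma)

# Compare one candidate parameter set per model (no optimisation is done here).
for model in (exponential, gaussian, spherical):
    predicted = [model(h, 0.0, 20.0, 4.0) for h in lags]
    print(model.__name__, ssd_over_sst(gamma, predicted))
```

In practice the parameters would be estimated by minimising SSD over $C_0$, $C_1$ and b for each model, after which the SSD/SST ratios of the fitted models can be compared.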