Genetic Diversity and Ancestral History of the German Angler and the Red-and-White Dual-Purpose Cattle Breeds Assessed through Pedigree Analysis
1. Introduction
Angler (RVA) and Red-and-White dual-purpose (RDN) cattle are local breeds of German origin. Both breeds have small populations found predominantly in the northern part of the country. Planned breeding of RVA dates back to 1838; however, the organisation of a central herdbook and official milk recording began in 1879 and 1902, respectively [1]. The RVA breed has been used to improve many local red breeds in central and eastern European countries as well as in the Baltic countries [2], [3]. Lactation yield of RVA cows is about 7500 kg with approximately 3.5% protein and 5% fat. Systematic breeding of RDN cattle, on the other hand, started in 1885, and since 1992 a pedigree of the breed typically has a maximum of 25% Red Holstein genes [4]. To maintain large populations for bull testing, there were exchanges of bulls between the RDN population in Germany and the Meuse-Rhine-Yssel (MRY) population in The Netherlands [5]. German RDN bulls were also used in Belgium to improve the native Red-and-White breed after the merging of provincial herdbooks into a single national herdbook in 1970 [6]. RDN cows produce on average 7000 kg milk per lactation with approximately 3.5% protein and 4% fat.
Available data from the German Federal Office for Agriculture and Food indicate a gradual decline in the herdbook numbers of RVA and RDN bulls and cows over the past two decades [7]. During the same period, however, the number of herdbook cows increased markedly for the high-performing German Holstein breed. Based on effective population size (Ne), the German National Committee on Animal Genetic Resources (AnGR) classified RVA as a monitoring population (200 < Ne ≤ 1000) and RDN as a conservation population (Ne ≤ 200) [7]. The Ne values for the categorisation were calculated from the herdbook numbers of male and female animals. This estimation procedure is generally useful in the absence of pedigree data. Where pedigree information is available, pedigree analysis can offer a better understanding of the population structure and of trends in inbreeding in these breeds. In dairy cattle, pedigree analysis has often been used in genetic diversity studies [8], [9], and in assessing the effect of inbreeding on different phenotypic traits [10]-[13]. Apart from a few studies [10], [13], most pedigree analyses involving dairy cattle focused on classical inbreeding without considering the age of inbreeding. Meanwhile, ancestral inbreeding concepts [14], [15] are well established. In a recent publication, Ballou’s and Kalinowski’s ancestral inbreeding coefficients were redefined [16], and the authors introduced the ancestral history coefficient (AHC), defined as the number of times during pedigree segregation (gene dropping) that a randomly taken allele has been in identical-by-descent (IBD) status.
In the current study, we performed both within-breed and across-breed diversity assessments by calculating classical inbreeding coefficients, ancestral inbreeding coefficients according to Ballou [15] and as suggested by Kalinowski [14], and AHC for the RVA and RDN cattle breeds. Furthermore, we calculated the effective population size and the effective numbers of founders and of ancestors, and investigated the contribution of key ancestors to inbreeding.
2. Materials and Methods
2.1. Pedigree Information
Pedigree data for RVA and RDN span the period between 1906 and 2016 and were obtained from the official computation centre responsible for breeding value estimation (Vereinigte Informationssysteme Tierhaltung w. V., Verden, Germany). The RVA dataset consisted of 93,078 animals, including 10,481 bulls and 82,597 cows. A total of 184,358 animals, including 16,068 bulls and 168,290 cows, formed the RDN pedigree dataset. In both breeds, there has been some introgression of genetic material from other conventional breeds, including the German Black-and-White Holstein (SBT), German Red-and-White Holstein (RBT), Holstein from North America (HOL), Jersey (JER), Braunvieh (BV), Fleckvieh (FV) and Scandinavian Red cattle. Breeds other than RVA and RDN were therefore present in both datasets, and consequently 12,709 animals were common to both pedigrees. The completeness of pedigree [17] was computed for all animals to ascertain the proportion of known ancestors per generation. Additionally, the number of equivalent complete generations known in the pedigree was computed as the sum over all known ancestors of the term $(1/2)^n$, where n is the number of generations separating the individual from each known ancestor [18].
2.2. Data Processing, Analysis and Software
Using SAS software (SAS 9.4, SAS Institute, Cary, NC, USA), we recoded the animal identification numbers (IDs) in the raw pedigree file from 15 digits to 14 digits, the maximum number accepted by the PEDIG software [19]. PEDIG was used for the extraction, verification and sequential recoding of the pedigrees, and for the calculation of classical inbreeding coefficients for all animals. The raw pedigree data were also recoded sequentially using the R software package QTLRel [20], and in the process mismatches regarding the sex of animals were corrected. The GRAIN software package [16] was applied for the computation of ancestral inbreeding coefficients. To describe possible disequilibrium of founder contributions to the reference population (RP, i.e. animals with both parents known), ENDOG v4.8 [21] was used to compute parameters derived from the probability of gene origin. Pedigree analysis was carried out separately for each breed and for the combined data (RVA_RDN), the latter involving 264,727 different individuals. In all three cases, animals with no known parents in the pedigree data were considered founders and assumed to be unrelated.
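As a minimal illustration of the two pedigree-quality measures described above, the following Python sketch computes the proportion of known ancestors per parental generation and the number of equivalent complete generations for one animal. The toy pedigree and all function names are hypothetical and are not taken from the software used in the study.

```python
from collections import defaultdict

# Hypothetical toy pedigree: animal -> (sire, dam); None marks an unknown parent.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", None),
    "E": ("C", "D"),
}

def known_ancestors_per_generation(animal, pedigree):
    """Count the known ancestors of `animal` per parental generation (1 = parents)."""
    counts = defaultdict(int)
    frontier = [animal]
    generation = 0
    while frontier:
        generation += 1
        next_frontier = []
        for individual in frontier:
            for parent in pedigree.get(individual, (None, None)):
                if parent is not None:
                    counts[generation] += 1
                    next_frontier.append(parent)
        frontier = next_frontier
    return dict(counts)

def completeness(animal, pedigree):
    """Proportion of known ancestors in generation g, out of 2**g possible ones."""
    counts = known_ancestors_per_generation(animal, pedigree)
    return {g: n / 2 ** g for g, n in counts.items()}

def equivalent_complete_generations(animal, pedigree):
    """Sum of (1/2)**n over all known ancestors, n generations back."""
    return sum(completeness(animal, pedigree).values())

print(completeness("E", PEDIGREE))
print(equivalent_complete_generations("E", PEDIGREE))
```

For animal E, both parents and three of four grandparents are known, so the sum is 2 × (1/2) + 3 × (1/4) = 1.75 equivalent complete generations. The sketch also shows that the equivalent complete generations equal the sum of the per-generation completeness values.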
2.3. Classical and Ancestral Inbreeding
The classical individual inbreeding coefficient (F), defined as the probability that an individual carries two alleles that are identical by descent, was calculated following [22] and averaged over all animals as well as over inbred animals only. We also calculated classical inbreeding coefficients together with the ancestral inbreeding coefficients according to Ballou [15] and as suggested by Kalinowski [14], and AHC [16], using a modified version of gene dropping [23], [24] with $10^6$ replications. Originally, Ballou’s ancestral inbreeding coefficient (Fa_Bal) refers to the cumulative proportion of an individual’s genome that has been previously exposed to inbreeding in its ancestors. Without changing the original meaning of the parameter, Fa_Bal was recently defined as the probability that any allele in an individual has been autozygous (IBD) in previous generations at least once [16]. Kalinowski’s approach to ancestral inbreeding (Fa_Kal) gives a narrower meaning of the parameter, which is defined as the probability that any allele in an individual is currently autozygous (IBD) and has been autozygous in previous generations at least once. The ancestral history coefficient is novel and, by definition, tells how many times during pedigree segregation a randomly taken allele has been in IBD status [16].
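The gene dropping idea can be sketched in Python as follows. This is an illustrative toy under stated assumptions, not the GRAIN implementation: the pedigree, the function name and the replication count are hypothetical, Kalinowski's coefficient is omitted, and it is assumed here that AHC counts IBD events in ancestors only (not in the animal itself).

```python
import random

# Hypothetical toy pedigree, parents listed before offspring: (animal, sire, dam).
PEDIGREE = [
    ("S", None, None),
    ("D", None, None),
    ("X", "S", "D"),   # X and Y are full sibs
    ("Y", "S", "D"),
    ("Z", "X", "Y"),   # inbred: its parents are full sibs
    ("U", None, None),
    ("W", "Z", "U"),   # not inbred itself, but has an inbred sire
]

def gene_drop(pedigree, replications=20000, seed=1):
    """Estimate F, Fa_Bal and AHC for every animal by Monte Carlo gene dropping."""
    rng = random.Random(seed)
    names = [animal for animal, _, _ in pedigree]
    f_sum = dict.fromkeys(names, 0.0)
    bal_sum = dict.fromkeys(names, 0.0)
    ahc_sum = dict.fromkeys(names, 0.0)
    for _ in range(replications):
        genotype = {}   # animal -> two (allele label, IBD events in ancestors) pairs
        next_label = 0
        for animal, sire, dam in pedigree:
            copies = []
            for parent in (sire, dam):
                if parent is None:
                    # Unknown parent: draw a new, unique founder allele.
                    next_label += 1
                    copies.append((next_label, 0))
                else:
                    first, second = genotype[parent]
                    label, count = rng.choice((first, second))
                    if first[0] == second[0]:
                        count += 1   # the transmitted allele was IBD in this parent
                    copies.append((label, count))
            genotype[animal] = copies
            (label_a, count_a), (label_b, count_b) = copies
            f_sum[animal] += label_a == label_b
            bal_sum[animal] += ((count_a > 0) + (count_b > 0)) / 2
            ahc_sum[animal] += (count_a + count_b) / 2
    return {
        a: {
            "F": f_sum[a] / replications,
            "Fa_Bal": bal_sum[a] / replications,
            "AHC": ahc_sum[a] / replications,
        }
        for a in names
    }

RESULT = gene_drop(PEDIGREE)
print(RESULT["Z"], RESULT["W"])
```

In this toy pedigree, Z is expected to have F close to 0.25 and no ancestral inbreeding, whereas W has F = 0 but an expected Fa_Bal of 0.125, because half of its alleles come from a sire that is autozygous with probability 0.25. This illustrates why ancestral and classical coefficients can diverge for the same animal.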
2.4. Effective Population Size
Defined as the number of individuals in an ideal population that would give rise to the same rate of inbreeding as observed in the actual breeding population [25], Ne was computed by first calculating the rate of inbreeding as the regression coefficient (b) of the classical inbreeding coefficient (F) on the equivalent complete generation [21]. Secondly, Ne values were obtained using Equation (1) below.
$$N_e = \frac{1}{2b} \qquad (1)$$
Additionally, we applied the same estimation procedure to calculate what we term the ancestral effective population size, defined as the size of a population as reflected by its rate of ancestral inbreeding. In this regard, three values were distinguished: Ne_Bal, Ne_Kal and Ne_AHC.
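The two-step procedure can be sketched in Python as follows, assuming the usual relation Ne = 1/(2b), with b the ordinary least-squares slope of the inbreeding coefficient on equivalent complete generations. The function names and the data are hypothetical; replacing F with Fa_Bal, Fa_Kal or AHC would give the corresponding ancestral estimates.

```python
def inbreeding_rate(ecg, f):
    """Ordinary least-squares slope b of inbreeding coefficients on
    equivalent complete generations."""
    n = len(ecg)
    mean_x = sum(ecg) / n
    mean_y = sum(f) / n
    sxx = sum((x - mean_x) ** 2 for x in ecg)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ecg, f))
    return sxy / sxx

def effective_population_size(ecg, f):
    """Effective population size under the assumed relation Ne = 1 / (2b)."""
    b = inbreeding_rate(ecg, f)
    if b <= 0:
        raise ValueError("Ne is undefined for a non-positive rate of inbreeding")
    return 1.0 / (2.0 * b)

# Hypothetical individual records: inbreeding rises by 0.005 per generation.
ecg_values = [1.0, 2.0, 3.0, 4.0, 5.0]
f_values = [0.005, 0.010, 0.015, 0.020, 0.025]
print(effective_population_size(ecg_values, f_values))  # approximately 100
```

Under the same assumed relation, an Ne of 156 would correspond to a rate of inbreeding of about 0.0032 per equivalent complete generation.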
2.5. Probability of Gene Origin Based Parameters
The effective number of founders ($f_e$) and the effective number of ancestors ($f_a$) best describe the unbalanced representation of founder contributions in a reference population. Parameter $f_e$ defines the number of equally contributing founders that would be expected to produce the same genetic diversity as in the population under study [26]. Except in situations where each founder contributes equally to a reference population, $f_e$ is always smaller than the actual number of founders. Calculation of $f_e$ follows Equation (2) below,
$$f_e = \frac{1}{\sum_{k=1}^{f} q_k^2} \qquad (2)$$
where $q_k$ is the probability that a gene randomly sampled in the population originates from founder k, and $f$ is the total number of founders [27]. Parameter $f_a$, on the other hand, refers to the minimum number of ancestors, not necessarily founders, explaining the complete genetic diversity of a population [27] and can be computed using Equation (3).
$$f_a = \frac{1}{\sum_{j=1}^{a} q_j^2} \qquad (3)$$
In Equation (3), $q_j$ represents the marginal genetic contribution of ancestor j, i.e. the genetic contribution made by an ancestor that is not explained by previously chosen ancestors, and a is the total number of ancestors considered. To calculate marginal genetic contributions, the first major ancestor was found based on its raw genetic contribution (i.e. $q_k = q_j$) following an iterative procedure. Next, the genetic contribution of the nth major ancestor was calculated conditional on the genetic contribution of the n − 1 already chosen ancestors. Reference [27] presents a detailed algorithm to compute marginal genetic contributions. Parameter $f_a$ therefore accounts for the losses of genetic variability that result from the unbalanced use of reproductive individuals producing a bottleneck. Furthermore, the ratio of the two parameters (i.e. $f_e/f_a$) reflects the role of bottlenecks in the development of the population. Values close to one indicate the absence of a bottleneck.
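The effective number of founders can be illustrated with a short Python sketch. It is a toy example, not the ENDOG implementation: the pedigree and function names are hypothetical, and every animal is assumed to have either both parents known or both unknown. The same reciprocal-of-squared-contributions function would apply to the effective number of ancestors once marginal contributions are available, but the iterative procedure of [27] for obtaining them is not reproduced here.

```python
# Hypothetical toy pedigree, parents listed before offspring: (animal, sire, dam).
PEDIGREE = [
    ("A", None, None),
    ("B", None, None),
    ("C", None, None),
    ("D", None, None),
    ("E", "A", "B"),
    ("F", "A", "C"),
    ("G", "A", "D"),
]

def founder_contributions(pedigree, reference):
    """Expected proportion of the reference gene pool originating from each founder."""
    per_animal = {}
    for animal, sire, dam in pedigree:
        if sire is None and dam is None:
            per_animal[animal] = {animal: 1.0}   # a founder carries only its own genes
            continue
        shares = {}
        for parent in (sire, dam):
            for founder, share in per_animal[parent].items():
                shares[founder] = shares.get(founder, 0.0) + 0.5 * share
        per_animal[animal] = shares
    q = {}
    for animal in reference:
        for founder, share in per_animal[animal].items():
            q[founder] = q.get(founder, 0.0) + share / len(reference)
    return q

def effective_number(contributions):
    """Reciprocal of the sum of squared contributions."""
    return 1.0 / sum(c * c for c in contributions)

q = founder_contributions(PEDIGREE, reference=["E", "F", "G"])
print(q)
print(effective_number(q.values()))
```

Here founder A sires all three reference animals and contributes 0.5 of the gene pool, while B, C and D contribute 1/6 each, so four actual founders yield an effective number of founders of 3. With four equally contributing founders the function returns exactly 4.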
3. Results and Discussion
3.1. Completeness of Pedigree
Figure 1 shows the completeness of pedigree across parental generations for the RVA, RDN and RVA_RDN datasets. In the first parental generation, completeness of the RVA pedigree was about 90%, higher than that of the RDN pedigree (64%). For the same parental generation, pedigree completeness was intermediate when the two pedigrees were combined (RVA_RDN). The proportion of known ancestors decreased quite steadily with increasing pedigree depth, such that completeness was below 50%, 20% and 30% at the seventh parental generation for RVA, RDN and RVA_RDN, respectively. Beyond the thirteenth parental generation, completeness was close to zero in all three cases. Published estimates of pedigree completeness for cattle vary widely, ranging from 99% in recent generations to below 10% in founder generations [28]-[30]. Mean values for the number of known equivalent complete generations were 5.59, 2.7 and 3.73 for RVA, RDN and RVA_RDN, respectively, consistent with the ranking of the three scenarios based on the trends in pedigree completeness across parental generations. The equivalent complete generation is an appropriate criterion to characterise pedigrees [18]. The estimated mean equivalent complete generation for RVA in this study was higher than the estimate for French Holstein (4.75) [18] but lower than that reported for German Holstein cows (6.15) [9].
Figure 1. Pedigree completeness indicating the percentage of known ancestors per parental generation, computed for the RVA, RDN and RVA_RDN pedigrees (Parental generation 1 represents parents, 2 represents grandparents, etc.).
Similar to the results of previous studies, there is a general trend of decreasing pedigree completeness with increasing pedigree depth. Pedigree recording in the study populations started over a century ago, at a time when little was known about planned breeding. Recognition of breed importance and improvements achieved in breeding over the years are contributing factors to the observed increase in data recording from founder to recent generations. The incompleteness of the pedigrees in this study calls for caution in relying on our data for inbreeding estimation. It was demonstrated that with only 10% of the pedigree unknown, inbreeding is strongly underestimated [27].
3.2. Different Inbreeding Coefficients
The numbers of inbred individuals were 59,000, 39,477 and 95,343, representing 64%, 21% and 36% of the RVA, RDN and RVA_RDN pedigrees, respectively. The percentage of inbred individuals was low for the RDN pedigree, which is due to the inability of the pedigree data to fully capture the relationships between all animals, as discussed previously. Table 1 summarises the mean values of classical inbreeding coefficients for all and for inbred individuals, Ballou’s and Kalinowski’s ancestral inbreeding coefficients, and ancestral history coefficients calculated for the three pedigree datasets. Generally, the inbreeding estimates were highest for the RVA breed and intermediate for the combined breed analysis. The average classical inbreeding coefficient of inbred individuals in this study ranged from 1.94% to 2.19% and was lower than the estimate for German Holstein dairy cattle (3.25%) [13]. More interesting are the changes in inbreeding over time for all individuals in each pedigree dataset. As shown in Figure 2, the level of inbreeding started rising steadily only after the 1940s. Before this period, only a few animals existed in all pedigrees, as most animals were born after the 1960s. From the 1960s onwards, inbreeding levels increased markedly and continuously in all three cases but dropped for the RDN breed after the 1990s. The average inbreeding coefficient usually increases over time, especially in small and closed populations where the mating of related individuals is unavoidable. Comparatively, the RVA breed has a smaller population size, which may have accounted for the high and continuous increase in the rate of inbreeding. There may have been an intervention to curb the increase of inbreeding in the RDN population from the 1990s, but we do not have adequate information to substantiate this.
Table 1. Average classical inbreeding coefficients and estimates of average ancestral inbreeding coefficients computed for the RVA, RDN and RVA_RDN pedigrees.
Knowing the population-level inbreeding rate alone is not enough; what matters is the effect of inbreeding, manifested as the reduction in an individual’s performance per unit increase in inbreeding coefficient (i.e. inbreeding depression). Ballou’s concept of ancestral inbreeding proposes a measure that tells which individuals or populations harbour fewer detrimental genes. Thus, higher values of the parameter indicate the likelihood of an individual having fewer detrimental genes. Following this concept, it can only be said that the RVA breed population has endured high inbreeding at the ancestral level (Fa_Bal = 3.69%) and is probably prone to fewer incidents of inbreeding depression. The mean estimates for Fa_Kal were much lower, i.e. 0.16%, 0.05% and 0.09% for RVA, RDN and RVA_RDN, respectively. By definition, Fa_Kal deals with alleles that are homozygous because they have met in the past, and only includes the ancestral inbreeding of relationship. This means that, unlike Fa_Bal, Fa_Kal for an individual remains zero when its classical inbreeding coefficient is zero. Note that our analysis did not include the second component of the parameter, which deals with new inbreeding. To our knowledge, the results on AHC in this study represent one of the first tests of this coefficient using real data. Estimates of AHC were high, i.e. 3.94%, 1.49% and 2.37% for RVA, RDN and RVA_RDN, respectively, and very similar to the estimates of Fa_Bal. The advantage of AHC is that it offers an appropriate measure of inbreeding when selection against deleterious recessive alleles is less than fully efficient.
Figure 2. Changes in the per-decade average inbreeding coefficients for the RVA, RDN and RVA_RDN complete pedigrees.
The correlations between the different inbreeding coefficients are presented in Table 2. Similar estimates of the classical inbreeding coefficient were obtained from the computation following [22] (F_Meuw) and from the gene dropping method as implemented in GRAIN (F_Gendrop). The correlation between F_Meuw and F_Gendrop was near perfect (r = 0.99, p < 0.001) both for the analysis involving the complete pedigree (above diagonal) and for the calculation using only inbred individuals (below diagonal). This served as a check of accuracy for the gene dropping procedure used in computing all ancestral inbreeding coefficients. For the complete data, AHC and Fa_Bal were almost identical (r = 0.99, p < 0.001); however, the correlation between either of the two coefficients and the classical inbreeding coefficient was intermediate (r = 0.50, p < 0.001). The situation was the same for inbred individuals, but the estimates were slightly lower. Correlations between the classical inbreeding coefficient and Fa_Bal calculated for inbred animals in the current study were slightly lower than those found in previous studies: 0.61 [13] and 0.36 to 0.40 [10]. The correlations between the classical inbreeding coefficient and Fa_Kal were also slightly lower in our study than in the aforementioned studies. Between Fa_Bal and Fa_Kal, the correlation estimates in the current study were lower than those by [13] (0.89) but higher than those found by [10] (0.28 to 0.38). Based on the weak correlation they obtained between Fa_Bal and Fa_Kal, the authors of [10] argued that the two coefficients measure different population statistics. However, our results and those of [13] suggest some kind of relationship between the two coefficients. Differences in correlation estimates between the different inbreeding coefficients as reported by different authors are expected, since populations differ in pedigree structure.
3.3. Effective Population Size
Estimates of effective population size are given in Table 3. The effective population size computed based on the increase of the classical inbreeding coefficient was higher for RDN (170) than for RVA (156). Note that the RDN pedigree recorded the lowest average inbreeding coefficient but also the poorest pedigree quality. Here, effective population size values were estimated by regressing the individual inbreeding coefficients on the equivalent complete generations traced and considering the regression coefficient as the rate of inbreeding. The same procedure was applied to the individual estimates of Fa_Bal, Fa_Kal and AHC to calculate, for the first time, the ancestral effective population size, which we defined as the size of a population as reflected by its rate of ancestral inbreeding. The ancestral effective population size estimates based on Fa_Bal (Ne_Bal) and on AHC (Ne_AHC) were similar and ranged from 50 to 59 animals for all data considerations. Estimates of ancestral effective population size based on Fa_Kal (Ne_Kal), on the other hand, were unrealistically high and above 1000 animals. Nevertheless, these high estimates are not surprising since Fa_Kal considers only “old” inbreeding. Applying different computation methods [30], pedigree-based Ne values ranging from 47 to 167 animals were reported for the Rotes Hoehenvieh cattle breed. The estimated Ne values in the current study are higher than the threshold number of 50 animals [31], and between 50 and 100 animals [32], below which the fitness of a population is expected to decrease.
Table 2. Correlations between different inbreeding coefficients computed for all animals in the combined pedigree (n = 264,727, above diagonal elements) and for inbred animals (n = 95,343, below diagonal elements) with p value < 0.001 in all cases.
Table 3. Estimates of effective population size computed based on classical and ancestral inbreeding concepts for the RVA, RDN and RVA_RDN datasets.
3.4. Founder and Ancestor Contributions
The parameters derived from the probability of gene origin account for the unbalanced use of founders in a pedigree and, unlike Ne, are less affected by pedigree errors [27]. The results for these parameters are given in Table 4, together with statistics on the number of animals that formed the reference and base populations. The RDN pedigree had a slightly lower RP number (73,749) although it has the highest total number of animals. A total of 10,059, 24,101 and 30,911 ancestors, some of which were not founders, contributed to the RVA, RDN and RVA_RDN reference populations, respectively.
Table 4. Number of animals in the reference and base populations, and the effective number of founders and ancestors computed from the probability of gene origin for the RVA, RDN and RVA_RDN pedigrees.
The $f_e$ values obtained were 310 (RVA), 519 (RDN) and 532 (RVA_RDN). Published $f_e$ values for other cattle breeds range from 40 to 649 animals [28], [33], [34]. These values depend on the actual number of founders, which makes it more informative to interpret $f_e$ in relation to the actual number of founders than to simply compare absolute values across studies. Compared to the actual number of founders, the estimated $f_e$ values in the current study suggest an unbalanced genetic contribution in the founder population in all three cases. A simple offspring analysis of our data revealed an excessive use of some individuals, especially males, as parents. One individual in the Angler pedigree, for instance, sired 1485 offspring (results not shown). Meanwhile, the obtained $f_a$ values show that only 90, 189 and 159 animals explained the complete genetic diversity in the RVA, RDN and RVA_RDN reference gene pools, respectively. In all three cases, the $f_e/f_a$ ratio indicates the occurrence of a genetic bottleneck since the foundation of the population, and the RVA population is the most impacted. The use of artificial insemination in these breeds is a contributing factor to the observed genetic bottleneck.
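As a worked check on the reported values, the short Python sketch below computes the ratio of the effective number of founders to the effective number of ancestors for each dataset. Taking the ratio in this direction (founders over ancestors) is an assumption made here; the variable names are illustrative only.

```python
# Effective numbers of founders and ancestors as reported in the text above.
reported = {
    "RVA": (310, 90),
    "RDN": (519, 189),
    "RVA_RDN": (532, 159),
}

# Ratio of effective founders to effective ancestors (direction is an assumption);
# values well above one point to a bottleneck.
ratios = {name: fe / fa for name, (fe, fa) in reported.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")

most_impacted = max(ratios, key=ratios.get)
print("Largest ratio:", most_impacted)
```

The ratios are about 3.44 (RVA), 2.75 (RDN) and 3.35 (RVA_RDN). All are well above one, and the RVA value is the largest, in line with the statement that the RVA population is the most affected by the bottleneck.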
The marginal genetic contributions of the top 10 ancestors to the RVA, RDN and RVA_RDN reference populations are given in Table 5. The total genetic contributions of the top 10 ancestors were 26.3%, 18% and 19% for RVA, RDN and RVA_RDN, respectively. These ancestors either had many offspring or contributed enormously through their famous offspring, as was the case for the only female ancestor (ID = 840000000005304), whose two sons sired 226 and 15 immediate progeny. Interestingly, some of the top ancestors (see superscripts a, b and c) were common to both the RVA and RDN pedigrees. These ancestors and some others (superscripts d, e and f under RVA, and g under RDN) also appear in the combined breed analysis as major ancestors. The current populations of the Angler and Red-and-White dual-purpose cattle breeds are not genetically distinct. In fact, they share common ancestors, some of which can be traced back to as early as 1965. Most striking is the high genetic contribution of the Red-and-White Holstein breed to the breed populations under study. For the Red-and-White dual-purpose breed, it has been established that a pedigree of the breed has a maximum of 25% Red Holstein genes [4]. These findings highlight the crossbreeding schemes established some decades ago to improve the performance of local cattle breeds.
Table 5. Marginal genetic contribution of the top 10 ancestors to the RVA, RDN and RVA_RDN reference populations given by the sex of individual, birth year, breed type and number of offspring.
a-g Ancestor IDs with the same superscript indicate the same animal appearing in the different pedigree datasets (Ancestors were selected based on marginal contribution calculated following [27]).
4. Conclusion
Analysing the Angler and Red-and-White dual-purpose local cattle pedigrees has shed some light on the population structure of these breeds in Germany. The current study demonstrates that Ballou’s approach to estimating ancestral inbreeding and the novel ancestral history coefficient are similar approaches that produce comparable results. Moreover, these coefficients provide an avenue for calculating effective population size at the ancestral level. The effective population size of the breeds did not raise concern; however, owing to the incompleteness of the pedigree data used, the parameters derived from the probability of gene origin were essential for characterising the genetic diversity within the populations. For both breeds, the reference populations were raised from founder or ancestor groups within which genetic contributions were typically unbalanced, with male animals being favoured. Consequently, only a few animals explained the complete genetic diversity in the populations under study. The Red Holstein breed is a key progenitor of the current Angler and Red-and-White dual-purpose cattle populations in Germany. Based on the high genetic contribution of key ancestors belonging to other breeds, we recommend an extensive investigation of the foreign blood percentage in both breeds.
Acknowledgements
This work was conducted as part of the research activities of the operational group, “Animal Genetic Resources” that operates under the auspices of the Agricultural European Innovation Partnership (EIP-AGRI) project. The authors are thankful to the European Commission for providing funds for the project. Personnel at the “Vereinigte Informationssysteme Tierhaltung” in Lower Saxony are also acknowledged for the provision of data and their relentless efforts in answering questions about the datasets used.