This study examined the feasibility and accuracy of using Illumina BovineSNP50 genotypes to estimate individual cattle breed composition and heterosis relative to estimates from pedigree. First, pedigree was used to compute breed fractions for 1124 crossbred cattle. Given the breed composition of sires and dams, retained heterosis and retained heterozygosity were computed for all individuals. Second, the genotypes of all animals were used to compute individual genomic breed fractions by applying a cross-validation method. Average genome-wide heterozygosity and retained heterozygosity based on genomic breed fractions were computed. Lastly, accuracies of breed composition, retained heterozygosity, and retained heterosis were assessed as Pearson’s correlation between pedigree- and genome-based predictions. The average breed compositions observed were 0.52 Angus, 0.23 Charolais, and 0.25 Hereford for pedigree-based prediction and 0.46, 0.26, and 0.28 for genome-based prediction, respectively. Correlations of predicted breed composition ranged from 0.94 to 0.96. Genome-based retained heterozygosity and retained heterosis from pedigree were also highly correlated (0.96). A positive association of nonadditive genetic effects was observed for growth traits, reflecting the importance of heterosis for these traits. Genomic prediction can aid analyses that depend on knowledge of breed composition and serve as a reliable method to predict heterosis to improve the efficiency of commercial crossbreeding schemes.
Breed composition is useful in multiple aspects of beef cattle production, including assessment of admixture, validation of breed status of an animal for registration purposes, correction for population stratification in across-breed genetic evaluation, genome-wide association studies, and planning crossbreeding programs that will exploit heterosis and breed complementarity. Absent genotypes, accurate assessment of breed composition requires reliable pedigree information which is not always available. Availability of affordable genotyping for various livestock species provides an opportunity for more accurate prediction of breed composition using single nucleotide polymorphism (SNP) markers (Gorbach et al. 2010). Efforts to predict genomic breed composition previously utilised information from the purebred population to make predictions which resulted in more accurate estimates (Gorbach et al. 2010; Kuehn et al. 2011). Here, predictions will be based largely on crossbred data using established procedures (Alexander et al. 2009).
Breed composition can be used to predict heterosis (Dickerson 1973). Heterosis refers to the performance advantage of crossbred animals over the average of their purebred parents. The amount of heterosis retained in crossbreds after several generations of crossbreeding is proportional to the retained heterozygosity (Dickerson 1973). Therefore, the objectives of this study were to predict genomic breed composition in crosses of Angus, Hereford, and Charolais without using purebred information, to compare the result of this approach to that of pedigree-based prediction, and further to compare estimates of heterosis based on expectations of heterozygosity from pedigree versus heterozygosity observed in the genotype assay.
Materials and Methods
All management and procedures involving live animals conformed to the guidelines of the Canadian Council on Animal Care (1993).
Animals, phenotypes, and genotypes
Data were collated from 1124 crossbred beef cattle born in the spring between 2002 and 2012 at the Lacombe Research and Development Centre, Lacombe, AB, Canada. The foundation breeds were largely made up of Aberdeen Angus, Red Angus, Charolais, and Hereford. Phenotypic records for growth and carcass traits were available for all of the 1124 animals. The traits studied were birth weight, weaning weight, pre-weaning daily gain, average daily gain, yearling weight, hot carcass weight, back fat thickness, rib eye area, marbling score, lean meat yield, and yield grade. Data were edited to remove records more than 3 SD above or below the mean after correcting for systematic effects of sex, age of dam, herd, and year of birth. Contemporary groups were formed based on herd, year, sex, and management groups. Pedigree extending to purebred ancestors was known and available for all animals studied.
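The data-editing step described above can be illustrated with a minimal sketch. It assumes that the systematic effects are removed by an ordinary least squares fit and that the 3 SD rule is applied to the resulting residuals; the function name, the design matrix coding, and the use of NumPy are illustrative assumptions and are not taken from the study.

```python
import numpy as np


def keep_mask_after_editing(y, X, n_sd=3.0):
    """Return a boolean mask of records retained after residual-based editing.

    y : (n,) phenotypic records for one trait.
    X : (n, p) design matrix for the systematic effects (hypothetical coding
        of sex, age of dam, herd, and year of birth).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    # Fit the systematic effects by ordinary least squares (an assumption).
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Keep records whose residual lies within n_sd standard deviations.
    sd = resid.std(ddof=1)
    return np.abs(resid - resid.mean()) <= n_sd * sd


# Toy example: 50 records with an intercept-only model and one extreme value.
y = np.array([40.0 + (i % 5) for i in range(50)])
y[10] = 400.0
X = np.ones((50, 1))
mask = keep_mask_after_editing(y, X)
```

In this toy example only the extreme record is flagged for removal; with real data the design matrix would carry one column per level of each systematic effect.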
A blood sample was collected from each animal by jugular venipuncture. Samples were collected into evacuated tubes containing EDTA (Vacutainer, Becton Dickinson and Co., Franklin Lakes, NJ, USA) and refrigerated at 4 °C until DNA extraction using the Qiagen DNeasy 96 blood and tissue kit (Qiagen Sciences, Germantown, MD, USA). Scoring of marker genotypes was performed using the BovineSNP50 BeadChip (50K; Illumina, San Diego, CA, USA) and was completed at Delta Genomics, Edmonton, AB, Canada. Quality control was performed to remove SNPs with minor allele frequency <0.01, call rate <0.90, and heterozygosity excess >0.15 (Lu et al. 2016). Missing genotypes were imputed using FImpute v2.0 (Sargolzaei et al. 2014). Only autosomal SNPs with known genome position according to the UMD_3.1 bovine assembly map (Zimin et al. 2009) were used. After editing, 42 610 SNPs were available for subsequent analyses.
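As an illustration of the first two quality control filters, the sketch below computes call rate and minor allele frequency per SNP and applies the thresholds stated above. It assumes genotypes coded as 0, 1, or 2 copies of one allele with missing calls stored as NaN; the function name and coding are hypothetical. The heterozygosity excess filter is omitted because its exact definition is not given in this text.

```python
import numpy as np


def snp_qc_mask(geno, maf_min=0.01, call_rate_min=0.90):
    """Return a boolean mask of SNPs passing the MAF and call rate filters.

    geno : (n_animals, n_snps) array coded 0/1/2, with np.nan for missing.
    """
    geno = np.asarray(geno, dtype=float)
    called = ~np.isnan(geno)
    # Proportion of animals with a called genotype at each SNP.
    call_rate = called.mean(axis=0)
    # Allele frequency from called genotypes only.
    n_called = called.sum(axis=0)
    allele_sum = np.nansum(geno, axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        p = allele_sum / (2.0 * n_called)
    maf = np.minimum(p, 1.0 - p)
    # Comparisons with NaN are False, so SNPs with no calls are dropped.
    return (call_rate >= call_rate_min) & (maf >= maf_min)


# Toy example: SNP 0 passes, SNP 1 is monomorphic, SNP 2 has call rate 0.8.
nan = np.nan
geno = np.array([
    [0, 0, 1], [1, 0, 1], [2, 0, nan], [1, 0, nan], [0, 0, 2],
    [1, 0, 0], [2, 0, 1], [0, 0, 1], [1, 0, 2], [1, 0, 0],
])
mask = snp_qc_mask(geno)
```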
Statistical model and analysis
Pedigree-derived breed composition was assigned to each individual. The 1124 crossbred beef cattle were from 64 sires and 495 dams over four generations. Parentage testing was previously performed using 50K genotypes to update any missing information in the pedigree and to improve the reliability of the pedigree information (Akanno et al. 2014a). Genomic breed compositions were predicted for all individuals using the ADMIXTURE software (Alexander et al. 2009). A 10-fold cross-validation procedure available in the ADMIXTURE program was performed to find the K value with the lowest cross-validation error (Alexander et al. 2009). The resulting breed fractions at K = 4 were selected from the ADMIXTURE analysis and aligned with the breed information from pedigree to identify the breed ancestries present in the dataset. Aberdeen Angus and Red Angus were considered as a single breed in developing the breed fractions above (Kuehn et al. 2011). Expected retained heterozygosity from pedigree (RHETp) and genomics (RHETg) was calculated for each individual as $1 - \sum_{i=1}^{n} P_i^2$, where $P_i$ is the fraction of each of the n contributing breeds. Average genome-wide heterozygosity (H) was calculated for each individual as the total number of heterozygous loci divided by the total number of SNPs. Expected retained heterosis (RH) was also calculated for each individual as 1 minus the sum, over breeds, of the products of the fractions of the same breed from the sire ($P_{Si}$) and dam ($P_{Di}$), that is $1 - \sum_{i=1}^{n} P_{Si} P_{Di}$.
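The quantities defined in this paragraph can be sketched for a single animal as follows. The sketch assumes that an offspring's pedigree breed fractions are the average of those of its sire and dam, that retained heterozygosity is one minus the sum of squared breed fractions, and that retained heterosis is one minus the sum over breeds of the sire fraction times the dam fraction (the usual expressions attributed to Dickerson 1973). Function names and the 0/1/2 genotype coding are illustrative only.

```python
import numpy as np


def offspring_breed_fractions(sire, dam):
    """Expected breed fractions of an offspring: mean of sire and dam."""
    return 0.5 * (np.asarray(sire, dtype=float) + np.asarray(dam, dtype=float))


def retained_heterozygosity(p):
    """1 minus the sum of squared breed fractions of the individual."""
    p = np.asarray(p, dtype=float)
    return 1.0 - np.sum(p ** 2)


def retained_heterosis(sire, dam):
    """1 minus the sum over breeds of (sire fraction x dam fraction)."""
    s = np.asarray(sire, dtype=float)
    d = np.asarray(dam, dtype=float)
    return 1.0 - np.sum(s * d)


def genome_wide_heterozygosity(genotypes):
    """Share of heterozygous loci, with genotypes coded 0/1/2 (1 = heterozygous)."""
    g = np.asarray(genotypes)
    return float(np.mean(g == 1))


# Breed order: Angus, Charolais, Hereford.
sire = [1.0, 0.0, 0.0]   # purebred Angus
dam = [0.0, 0.5, 0.5]    # Charolais x Hereford F1
fractions = offspring_breed_fractions(sire, dam)   # 0.50, 0.25, 0.25
rhet = retained_heterozygosity(fractions)          # 1 - 0.375 = 0.625
rh = retained_heterosis(sire, dam)                 # 1 - 0 = 1.0
h = genome_wide_heterozygosity([0, 1, 2, 1])       # 2 of 4 loci = 0.5
```

The same `retained_heterozygosity` function applies to pedigree fractions (RHETp) and to genomic fractions from an admixture analysis (RHETg); only the input differs.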
Proportions of total phenotypic variance explained by direct additive genetic effects were assessed using two models defined as pedigree-based or genome-based models according to the type of relationship matrix (pedigree or genomics) used. The genomic relationship matrix was constructed following the formulas provided by VanRaden et al. (2009). Depending on the trait analysed, fixed and random factors included in each model are defined in Table 3. Residuals for each model were assumed to be independent. All traits were analyzed using ASReml version 4.1 (Gilmour et al. 2015). Thereafter, heterosis effects were determined for each trait by fitting separately the fixed covariates of either RHETp or RH in the pedigree-based model alongside the factors defined in Table 3, while fixed covariates of either RHETg or H were fitted separately in the genome-based model in a similar approach. A total of four scenarios were tested for the heterosis effects. Accuracy of predictions of genomic breed composition and heterosis was determined by linear regression of genome-based prediction on pedigree-based prediction and Pearson’s correlations of genome-based predictions with pedigree-based calculations.
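The text does not state which VanRaden formulation was used for the genomic relationship matrix. As a hedged illustration, the sketch below assumes the widely used form in which genotypes coded 0/1/2 are centred by twice the allele frequency and the cross-product is scaled by twice the sum of p(1 - p), with allele frequencies estimated from the observed genotypes. The function name and these choices are assumptions, not a description of the authors' implementation.

```python
import numpy as np


def genomic_relationship_matrix(geno):
    """Genomic relationship matrix under the assumed VanRaden-style scaling.

    geno : (n_animals, n_snps) array coded 0/1/2 with no missing values
           (for example, after imputation).
    """
    M = np.asarray(geno, dtype=float)
    # Allele frequencies estimated from the sample (an assumption).
    p = M.mean(axis=0) / 2.0
    # Centre each SNP column by twice its allele frequency.
    Z = M - 2.0 * p
    # Scale so that G is on a scale comparable to pedigree relationships.
    denom = 2.0 * np.sum(p * (1.0 - p))
    return Z @ Z.T / denom


# Toy example with 4 animals and 3 SNPs.
geno = np.array([[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]])
G = genomic_relationship_matrix(geno)
```

Because the columns are centred with sample frequencies, each row of `G` sums to zero in this sketch; using base-population frequencies instead would change that property.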
Results and Discussion
Prediction of genomic breed composition
This study applied a genomic approach based on a bovine 50K SNP panel to predict the fractions of founder breeds in crossbred beef cattle. Means, standard deviations, and Pearson’s correlation between pedigree- and genome-based estimates of breed fractions are summarized in Table 1. Predicted proportions of Angus, Charolais, and Hereford were similar when predictions were made using pedigree and genomic information (Fig. 1).
Table 1. Mean and standard deviation (SD) of breed fractions, Pearson’s correlation (r) between estimated breed fractions from pedigree and genomics, and the regression coefficient (β) for regressing estimated breed fractions from pedigree on genomic estimates in beef cattle (n = 1124).
Genomic predictions traced the ancestry of these crossbred cattle to three founding genetic groups which corresponded closely with Angus, Charolais, and Hereford ancestral proportions based on pedigree; r = 0.94 in Angus to 0.96 in Charolais (Table 1). Linear regression coefficients of genomic predictions on pedigree were approximately 1. Tshipuliso et al. (2008) demonstrated the fidelity of predicted breed proportions in tracing ancestry to parental stocks under backcrossing. Further, Kuehn et al. (2011) and Frkonja et al. (2012) reported prediction accuracies of similar magnitude in cattle breeds. However, the aforementioned studies assumed knowledge of breed ancestry in the crossbreds and used this information in their predictions. In a complicated crossbreeding system, it may be infeasible to track ancestries of crossbred animals or the founder breeds may be wrongly assigned. In such a situation, genotypes of the crossbred individuals may be used to estimate the ancestries directly. Using a cross-validation procedure and the ADMIXTURE software (Alexander et al. 2009), as in this study, allows identification of the K possible ancestral populations with the least cross-validation error (0.537). Here, founding genetic groups were predicted accurately without using data from the presumed ancestral breeds. Departures from 100% accurate prediction may be ascribed to laboratory (including missing genotypes), pedigree, sample identification, and independent errors.
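The two accuracy measures used here, Pearson’s correlation and the regression coefficient of genomic on pedigree predictions, can be computed as in the sketch below. The data are invented for illustration and the function name is hypothetical; the regression direction follows the Materials and Methods, and the arguments can be swapped to obtain the reverse regression.

```python
import numpy as np


def accuracy_metrics(pedigree, genomic):
    """Return (Pearson r, slope of genomic regressed on pedigree)."""
    x = np.asarray(pedigree, dtype=float)
    y = np.asarray(genomic, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    # Least squares slope: cov(x, y) / var(x).
    slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return r, slope


# Invented breed fractions for five animals (not data from this study).
ped = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
gen = 0.05 + 0.9 * ped
r, slope = accuracy_metrics(ped, gen)
```

A slope close to 1 together with a high correlation indicates that genomic fractions track pedigree fractions without systematic inflation or deflation.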
Breed fractions predicted for Aberdeen Angus using genomics and pedigree information were found to be moderately correlated (r = 0.42–0.65) with Red Angus fractions. Consequently, Aberdeen Angus and Red Angus fractions were pooled into one breed called Angus, while Charolais and Hereford were maintained as separate breeds throughout the study. This pooling improved the accuracy of genomic prediction for Angus breed fractions from 0.79 to 0.94. This similar genetic architecture of Aberdeen Angus and Red Angus has been previously observed and reported by Kuehn et al. (2011).
Rapid prediction of breed composition using genomics in beef cattle populations may be beneficial for checking the integrity of pedigree recording for seed stock. Even without intention to falsify pedigree, errors were found in the pedigree-based breed fractions assigned to a few individuals in this study. Incorrect breed assignment has the potential to reduce the heterosis attained by commercial producers when implementing a crossbreeding system.
Estimates of genetic parameters
The numbers of observations, phenotypic means, and standard deviations for the studied traits are summarized in Table 2. Consistent with previous studies, there was substantial phenotypic variation in this sample of crossbred beef cattle (Akanno et al. 2014a, 2014b). The proportions of total phenotypic variance explained by all genetic effects according to the model defined for each trait in Table 3 are given in Table 4. Two types of models that utilized pedigree and genomic relationships were applied. Maternal effects were included for pre-weaning traits in both models; however, maternal heritabilities were found to be close to zero for birth weight and zero for pre-weaning daily gain and weaning weight in this dataset. Heritability estimated from the genome-based model was slightly lower than the pedigree-based estimate for most traits using the same data (Table 4). Reduced estimates of heritability from a genome-based model as opposed to a pedigree-based model have been previously observed (Lopes et al. 2015). This phenomenon has been referred to as the “missing heritability problem” (Lee et al. 2011). It arises because the heritability from the genome-based model includes only the contribution of causal variants in linkage disequilibrium with the SNP markers and not the contribution of all causal variants as in the pedigree-based model. The additive heritabilities observed for growth traits in this study using both pedigree- and genome-based models were higher than the values of 0.22–0.55 reported in Canadian (Schenkel et al. 2004) and US (Schiermiester et al. 2015) beef cattle populations. These differences in heritability estimates may be attributed to the small number of animals used in the estimation. However, the moderate to high heritabilities observed for carcass traits were within the range of values (0.15–0.97) reported in a review by Utrera and Van Vleck (2004).
Table 2. Number of animals with records (N), mean, and standard deviation (SD) for growth and carcass traits of animals with raw genotypes.
Table 3. Model definitions for growth and carcass traits of beef cattle.
Table 4. Proportions of total phenotypic variance explained by additive, maternal, and maternal permanent environmental (MPE) effects based on pedigree-based and genome-based models for each trait in crossbred beef cattle.
Genomic prediction of heterotic effects
Effects of heterosis estimated from pedigree-based and genome-based models are presented in Table 5. Measures of heterosis that were based on pedigree information had greater effect sizes than those that used genomic information. For growth traits, the heterosis effects were significant (P < 0.05) for average daily gain and yearling weight based on the average heterozygosity from genomics. Other measures of heterosis were not significant for growth traits. The heterosis effects from all four measures were relatively large and positive for weaning weight and yearling weight, which suggests the importance of heterosis for these traits. Previous studies have observed high heterosis effects for weaning weight and post-weaning gain in crosses of British and Continental cattle breeds (Williams et al. 2010; Schiermiester et al. 2015). Negative heterotic effects were found for most carcass traits except for back fat thickness and yield grade, which gave positive estimates for all four measures. A negative heterotic effect means that the interactions between paternal and maternal alleles of different breeds yielded a value that was lower than the interactions between paternal and maternal alleles of the same breed.
Table 5. Effects of retained heterosis (RH) from pedigree, retained heterozygosity from pedigree (RHETp) and genomics (RHETg), and genome-wide heterozygosity (H) for each trait in crossbred beef cattle.
Table 6 shows the Pearson’s correlations and regression coefficients between measures of heterosis estimated from pedigree and genomic information. The accuracy of genomic predictions, assessed as correlations between pedigree and genomic predicted values, ranged from 0.67 to 0.96. A notable finding was the Pearson’s correlation coefficient of 0.96 observed between retained heterosis from pedigree and retained heterozygosity from genomics. This prediction had a regression coefficient of 1.73, suggesting a slightly upward bias. If the retained heterosis from pedigree is assumed to be proportional to the amount of F1 heterosis retained in future crosses (Dickerson 1973), then the genome-based heterozygosity is a good predictor of heterosis. Moreover, the heterotic effects reported in Table 5 depend on the assumption that heterosis is proportional to expected breed heterozygosity irrespective of the source of information (pedigree or genomics). The genomic prediction approach can aid analyses that depend on knowledge of breed composition and serve as a reliable method to predict heterosis to improve the efficiency of commercial crossbreeding schemes.
Table 6. Results of Pearson’s correlation between various heterosis parameters predicted from pedigree and genomics (above diagonal) and the regression coefficient for regressing pedigree-based prediction on genomic-based prediction (below diagonal).
Genomic prediction of breed composition from crossbred genotypes will be worthwhile when pedigree information is incomplete and will be beneficial in the identification of an optimal crossbreeding program for increased heterozygosity and heterosis in crossbred cattle. This study utilized genomic information to gain insights into the admixture level and the contribution of nonadditive genetic effects to phenotypic variation in British and Continental cattle breeds, mainly Angus, Hereford, and Charolais. A positive association of nonadditive genetic effects was observed for growth traits, which reflects the importance of heterosis for these traits in crossbred cattle. It is reassuring that a small sample of crossbred beef cattle will suffice to predict breed composition and assess heterotic effects. However, more samples should be added to this resource to facilitate accurate estimation of breed composition and to develop an efficient genomic approach for predicting heterosis.
We gratefully acknowledge funding support from Alberta Livestock and Meat Agency Ltd., Alberta Innovates Bio Solutions, and Alberta Agriculture and Forestry (AF) and in-kind contribution in animals, facilities, and people received from Agriculture and Agri-Food Canada, Lacombe Research and Development Centre, with special thanks to Cletus Sehn, Ken Grimson, and their staff for animal care and management. Special thanks are also extended to Cathy Bryant and Sheldon Johnston of AF and Lisa McKeown (Livestock Gentec) for project coordination, data collection, and database management.