Research Article

A rapid screening of ancestry for genetic association studies in an admixed population from Pernambuco, Brazil

Abstract

Genetic association studies determine how genes influence traits. However, undetected population substructure may bias the analysis and produce spurious results. One method to detect substructure is to genotype ancestry informative markers (AIMs) alongside the candidate variants, quantifying how much each ancestral population contributes to the samples’ genetic background. The present study aimed to use a minimum number of markers while retaining full potential to estimate ancestries. We tested the feasibility of using a subset of the 12 most informative markers from a previously established study to estimate the contributions of three ancestral populations: European, African and Amerindian. In an ethnically diverse sample (N = 822) derived from the 1000 Genomes database, the 12 AIMs had the same capacity to estimate ancestries as the original set of 128 AIMs, since estimates from the two panels were closely correlated. These 12 SNPs were then used to estimate ancestry in a new sample (N = 192) from an admixed population in Recife, Northeast Brazil. The ancestry estimates for the Recife subjects agreed with previous studies, showing that Northeastern Brazilian populations are predominantly influenced by European ancestry (59.7%), followed by African (23.0%) and Amerindian (17.3%) ancestries. Self-classification of ethnicity according to skin color was confirmed to be a poor indicator of population substructure in Brazilians, since ancestry estimates overlapped between classifications. Thus, our streamlined panel of 12 markers may substitute for larger panels while retaining the capacity to control for population substructure and admixture, thereby reducing sample processing time.
