Genetic ancestry, ethnicity & race

Definition

In GWAS and cohort genomics, these labels answer different questions. Genetic ancestry is inferred from DNA, as proportions or positions along axes estimated relative to reference panels; ethnicity and race are social identities, usually recorded by self-report or administrative assignment. They correlate in some populations because of history and sampling, but they are not interchangeable: using one as a proxy for the other misstates what was measured.

How they differ

| | Ancestry | Ethnicity | Race |
| --- | --- | --- | --- |
| What it is | Genetic: how much of the genome is attributed to specified source populations or positions along axes of allele-frequency variation (e.g. PCA, supervised assignment, local ancestry). | Social / cultural identity (language, nationality, migration history, community affiliation), often self-described. | Socially defined category, historically tied to racism and power; often collected as self-report or observer assignment in biomedical settings. |
| Typical source | Genotypes + a reference panel or model. | Questionnaires, registries, EHR fields. | Questionnaires, registries, EHR fields; reporting standards (e.g. NIH) may require race/ethnicity reporting separately from genetics. |
| GWAS use | Control population stratification, choose analysis models, interpret portability of scores across groups; not a moral or social label by itself. | Recruitment equity, harmonization across sites, reporting; not a substitute for genotype-based ancestry adjustment. | Same cautions as ethnicity; must not be treated as a biological proxy for allele frequencies or causation. |
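To make the "Genotypes + a reference panel or model" row concrete, here is a minimal sketch of how ancestry axes can be derived from genotypes alone with PCA. It assumes a samples-by-variants matrix of allele dosages (0/1/2) and uses NumPy only; the data are synthetic, and the names `ancestry_pcs`, `freq_a`, `freq_b`, and the placeholder self-report labels are hypothetical. Real pipelines typically add LD pruning, relatedness filtering, and projection onto a reference panel, none of which is shown here.

```python
import numpy as np

def ancestry_pcs(genotypes, n_pcs=2):
    """Top principal components of a samples x variants dosage matrix (0/1/2)."""
    g = np.asarray(genotypes, dtype=float)
    freq = g.mean(axis=0) / 2.0                # sample allele frequency per variant
    keep = (freq > 0) & (freq < 1)             # drop monomorphic variants
    g, freq = g[:, keep], freq[keep]
    # Center each variant by 2p and scale by sqrt(2p(1-p)).
    z = (g - 2 * freq) / np.sqrt(2 * freq * (1 - freq))
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]            # PC scores per sample

# Synthetic example (assumption): two groups whose allele frequencies differ slightly.
rng = np.random.default_rng(0)
n_per_group, n_variants = 50, 500
base = rng.uniform(0.1, 0.9, n_variants)
freq_a = np.clip(base + rng.normal(0, 0.1, n_variants), 0.05, 0.95)
freq_b = np.clip(base + rng.normal(0, 0.1, n_variants), 0.05, 0.95)
geno = np.vstack([
    rng.binomial(2, freq_a, (n_per_group, n_variants)),
    rng.binomial(2, freq_b, (n_per_group, n_variants)),
])

pcs = ancestry_pcs(geno, n_pcs=2)

# Self-reported fields stay in their own column; they are never used to
# compute the PCs and are not overwritten by them.
self_report = np.array(["questionnaire_label"] * (2 * n_per_group))  # placeholder values
covariates = {"PC1": pcs[:, 0], "PC2": pcs[:, 1], "self_report": self_report}
```

The design point is the separation: the PC columns come only from genotypes and would be the ones used to adjust for stratification, while the questionnaire field is carried alongside for reporting.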

Rule of thumb: If the Methods section says PCA, admixture, or projection onto 1000 Genomes, the authors mean genetic ancestry. If a table lists “White,” “Black,” “Asian,” or “Hispanic,” check whether that field comes from self-report (race/ethnicity) or from genetic inference, and never assume a self-reported category matches an individual's genetic ancestry.

Other meanings (optional)

  • Cohort labels: Studies sometimes use coarse population or super-population codes (e.g. reference-panel abbreviations) that are not the same as race or ethnicity; read the data dictionary. See 1000 Genomes Project and cohort-specific documentation.
  • Ancestry in plain English: Outside genetics, “ancestry” can mean genealogy or family history without genotyping; in this dictionary the ancestry entry is genetic ancestry unless context says otherwise.
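As an illustration of the cohort-label point, the sketch below recognizes a small, illustrative subset of 1000 Genomes population and super-population codes. The mapping covers only a few codes, and the helper name `classify_label` is hypothetical; it is a reading aid, not a substitute for the cohort's data dictionary.

```python
# Illustrative subset of 1000 Genomes population -> super-population codes.
SUPERPOP_OF = {
    "YRI": "AFR", "LWK": "AFR",
    "CEU": "EUR", "GBR": "EUR",
    "CHB": "EAS", "JPT": "EAS",
    "GIH": "SAS", "PJL": "SAS",
    "MXL": "AMR", "PEL": "AMR",
}
SUPERPOPS = set(SUPERPOP_OF.values())

def classify_label(code):
    """Say whether a column value looks like a reference-panel code (hypothetical helper)."""
    code = code.strip().upper()
    if code in SUPERPOP_OF:
        return f"reference-panel population (super-population {SUPERPOP_OF[code]})"
    if code in SUPERPOPS:
        return "reference-panel super-population"
    return "not a known panel code in this subset; check the data dictionary"

for value in ["YRI", "eur", "White"]:
    print(value, "->", classify_label(value))
```

A value that matches a panel code describes which reference sample set a genome resembles or was drawn from; it does not record anyone's self-identified race or ethnicity.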

References