Ancient DNA Studies in India: Key Discoveries So Far

Ancient DNA (aDNA) research has revolutionized our understanding of human migration, population mixing, and the deep genetic history of civilizations across the globe. For South Asia - a region home to over 1.4 billion people and extraordinary genetic diversity - archaeogenetics has begun to answer questions that archaeology, linguistics, and historical texts alone could never resolve.

Yet India remains one of the most challenging regions on Earth for ancient DNA work. The subcontinent's tropical climate, diverse burial traditions, and the relatively recent arrival of advanced genomic techniques to the region mean that the number of ancient genomes recovered from South Asia is still a fraction of what has been sequenced from Europe or Central Asia. Despite these obstacles, the studies that have succeeded have produced some of the most consequential findings in the entire field of archaeogenetics.

This article provides a comprehensive overview of the major ancient DNA studies relevant to India - the landmark papers, the key sites, the breakthrough techniques, and the discoveries that have reshaped our understanding of who the people of the Indian subcontinent are and where they came from.

Why Ancient DNA Matters for India: India is home to over 4,600 distinct population groups with extraordinary genetic diversity. Ancient DNA is the only tool that can directly reveal the genetic makeup of past populations - showing us who lived here thousands of years ago, how they mixed, and how they gave rise to the complex mosaic of modern Indian ancestry. Without aDNA, we are left guessing based on indirect evidence from modern genomes alone.

The Landscape of Ancient DNA Research in South Asia

Before diving into individual studies, it is important to understand the broader context. Ancient DNA research in South Asia has followed a distinctive trajectory compared to other regions. In Europe, hundreds of ancient genomes were being published by 2015, enabling detailed reconstructions of population movements during the Neolithic, Bronze Age, and Iron Age. In South Asia, the first truly impactful ancient DNA results did not emerge until 2018-2019, and the total number of ancient individuals sequenced from the Indian subcontinent remains relatively small.

The reasons for this lag are both technical and institutional. DNA degrades rapidly in warm, humid environments, and much of South Asia experiences exactly these conditions. Many ancient Indian communities practiced cremation, leaving no skeletal remains. Burial sites that do exist - such as those at Indus Valley Civilization sites - have yielded bones in poor preservation states. Additionally, until recently, there were very few ancient DNA laboratories in India itself, and international collaborations sometimes faced bureaucratic and political challenges.

Despite these hurdles, the studies that have been completed represent genuine breakthroughs. Let us examine them one by one.

Rakhigarhi: The IVC Breakthrough (Shinde et al. 2019)

The single most important ancient DNA study for understanding prehistoric India is the analysis of a skeleton from Rakhigarhi, a massive Indus Valley Civilization site in Haryana. Published in 2019 by Vasant Shinde and colleagues in the journal Cell, this study finally gave us a direct genetic window into the people who built one of the world's earliest urban civilizations.

Background and Context

Rakhigarhi is the largest known IVC site within modern India's borders, covering approximately 350 hectares. Excavations at the site uncovered a cemetery with multiple burials, providing the rare opportunity to attempt DNA extraction from IVC-period remains. The research team, led by Vasant Shinde of Deccan College, Pune, worked in collaboration with geneticists from institutions in South Korea, the United States, and other countries.

The DNA extraction was extraordinarily challenging. Multiple attempts over several years failed due to the degraded state of the ancient bones. Success finally came from the petrous bone - the dense bone behind the ear that is now known to be the best source of ancient DNA in poorly preserved remains. Even then, the amount of recoverable DNA was extremely low, and the team had to use sophisticated computational methods to reconstruct the individual's genetic profile.

Key Findings

Significance: The Rakhigarhi study addressed one of the longest-running debates in Indian archaeology. Although it rests on a single genome, its result indicates that the IVC was built by a population with deep roots in South Asia, not by migrants from Central Asia. The steppe ancestry found in many modern Indians arrived later, after 2000 BCE.

Roopkund Lake: A Mystery Spanning Centuries (Harney et al. 2019)

Roopkund, a glacial lake situated at approximately 5,029 meters altitude in the Uttarakhand Himalayas, has long been known as "Skeleton Lake" due to the hundreds of human skeletal remains scattered around its shores. The bones were first reported to the wider world in 1942, when a forest ranger came upon them. For decades, theories about who these people were ranged from a lost army to victims of a catastrophic hailstorm.

The Ancient DNA Answer

In 2019, a study led by Eadaoin Harney and colleagues, published in Nature Communications, used ancient DNA and radiocarbon dating to finally solve the mystery - and the answer was far more complex than anyone had anticipated.

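The study revealed that the skeletons at Roopkund did not represent a single catastrophic event but rather multiple episodes of death separated by roughly a thousand years. The individuals fell into three genetically distinct groups: Group 1, with South Asian-related ancestry, dated to around 800 CE; Group 2, with eastern Mediterranean-related ancestry, dated to around 1800 CE; and Group 3, a single individual with Southeast Asian-related ancestry, also dated to around 1800 CE.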

What Roopkund Tells Us

The Roopkund study demonstrated the power of ancient DNA to overturn long-held assumptions. What appeared to be a single event was actually multiple unrelated episodes. The presence of Mediterranean individuals at a remote Himalayan site speaks to the unexpected mobility of people across vast distances, even in pre-modern times. For South Asian genetics specifically, the diverse ancestry profiles of Group 1 individuals provided a snapshot of the genetic variation that existed across India over a thousand years ago.

Swat Valley, Pakistan: Tracking the Arrival of Steppe Ancestry

The Swat Valley in northwestern Pakistan has been an archaeological treasure trove, with well-preserved burial sites spanning from the Bronze Age to the Iron Age. Ancient DNA from these sites has been instrumental in tracking when and how steppe pastoralist ancestry first appeared in South Asia.

The Swat Proto-Historic Graves

Ancient DNA from the Swat Valley was analyzed as part of the landmark Narasimhan et al. 2019 study. The burials span from approximately 1200 BCE to 1 CE, providing a time transect that allowed researchers to observe genetic changes over centuries. The key findings were as follows:

The Narasimhan et al. 2019 Mega-Study

Published in Science in September 2019, the study by Vagheesh Narasimhan and over 100 co-authors remains the single most comprehensive ancient DNA study relevant to South Asian population history. It analyzed ancient genomes from 523 individuals spanning Central Asia, Iran, and South Asia, covering a time range from approximately 6000 BCE to the medieval period.

Scale and Scope

The 523 individuals came from sites across a vast geographic area: the Iranian plateau, the Central Asian steppe, the BMAC region (modern Turkmenistan and Uzbekistan), the Swat Valley, and the "Indus Periphery" sites. While none of the individuals were directly from IVC sites within India, the study included 11 individuals from sites in Turkmenistan and Iran whose genetic profiles closely matched the one reported by the Rakhigarhi study, published at the same time - the IVC genetic signature of Iranian-related farmer ancestry plus AASI.

Key Conclusions

  1. The "Indus Periphery" Population: The study identified a distinct genetic cluster that the authors termed "Indus Periphery" - individuals with a mix of Iranian-related ancestry and AASI ancestry but no steppe ancestry. These individuals, found at sites on the borders of the IVC, were genetically consistent with being members of the Harappan civilization. This finding was corroborated by the Rakhigarhi genome.
  2. Two-Source Model for Modern Indians: Modern South Asians could be modeled as mixtures of two major ancestral populations: Ancestral North Indians (ANI), who carried both IVC-like and steppe ancestry, and Ancestral South Indians (ASI), who carried IVC-like and additional AASI ancestry. The proportions of these two components vary predictably across the subcontinent.
  3. The BMAC Was a Waystation, Not a Source: The Bactria-Margiana Archaeological Complex, once hypothesized as a potential homeland for Indo-Iranian speakers, showed no steppe ancestry in its earliest phases. Steppe ancestry appeared at BMAC sites only after approximately 2100 BCE, suggesting that steppe pastoralists moved through BMAC territory on their way south rather than originating there.
  4. Steppe Ancestry Correlates with Caste: Within any given region of India, groups traditionally classified as upper caste tend to have more steppe pastoralist ancestry, while groups classified as lower caste or tribal have less. This correlation holds across both North and South India, though absolute proportions differ by region.
  5. Gradual, Not Sudden: The genetic evidence supports a model of gradual admixture over many centuries rather than a sudden invasion or conquest. The mixing of steppe-descended and IVC-descended populations appears to have been a prolonged process spanning from approximately 2000 BCE to well into the first millennium CE.
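
The two-source model in point 2 comes down to weighted averaging, which a short Python sketch can make concrete. The class name, the function name, and every proportion below are hypothetical placeholders chosen for illustration; they are not estimates from Narasimhan et al. or any other study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ancestry:
    """Shares of three deep ancestry sources; the shares should sum to 1."""
    ivc_like: float         # IVC-like ancestry (itself Iranian-related + AASI)
    steppe: float           # steppe pastoralist ancestry
    additional_aasi: float  # AASI beyond what the IVC-like component carries


def mix(a: Ancestry, b: Ancestry, share_a: float) -> Ancestry:
    """Blend two source populations; share_a is the fraction drawn from a."""
    if not 0.0 <= share_a <= 1.0:
        raise ValueError("share_a must lie between 0 and 1")
    share_b = 1.0 - share_a
    return Ancestry(
        ivc_like=share_a * a.ivc_like + share_b * b.ivc_like,
        steppe=share_a * a.steppe + share_b * b.steppe,
        additional_aasi=share_a * a.additional_aasi + share_b * b.additional_aasi,
    )


# ASSUMPTION: made-up proportions, used only to show the arithmetic.
ANI = Ancestry(ivc_like=0.70, steppe=0.30, additional_aasi=0.00)
ASI = Ancestry(ivc_like=0.60, steppe=0.00, additional_aasi=0.40)

if __name__ == "__main__":
    # A hypothetical group drawing half of its ancestry from each source.
    group = mix(ANI, ASI, share_a=0.5)
    print(group)
```

Under this simplification, a group's steppe share is its ANI share multiplied by the steppe share of ANI, which is why steppe ancestry rises and falls along the ANI-ASI gradient.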

Key Ancient DNA Studies Relevant to India

The following table summarizes the most important ancient DNA studies that have contributed to our understanding of Indian genetic history:

| Study | Year | Site(s) | Sample Date | Key Genetic Finding |
|---|---|---|---|---|
| Shinde et al. | 2019 | Rakhigarhi, Haryana, India | ~2500 BCE | IVC individual had Iranian-related + AASI ancestry; no steppe ancestry detected |
| Narasimhan et al. | 2019 | 523 individuals across Central/South Asia | ~6000 BCE - Medieval | Identified "Indus Periphery" genetic cluster; two-source model for modern Indians (ANI/ASI) |
| Harney et al. | 2019 | Roopkund Lake, Uttarakhand, India | ~800 CE and ~1800 CE | Three genetically distinct groups: South Asian, Mediterranean, and Southeast Asian from different eras |
| de Barros Damgaard et al. | 2018 | Central Asian steppe sites | ~3000-1000 BCE | Confirmed Sintashta/Andronovo populations had steppe ancestry that later spread to South Asia |
| Lazaridis et al. | 2016 | Ganj Dareh, Iran (Iranian Neolithic) | ~8000 BCE | Characterized Iranian Neolithic farmers; showed Iranian-related ancestry split from this group before farming began |
| Broushaki et al. | 2016 | Multiple Iranian Neolithic sites | ~8000-6000 BCE | Iranian farmers were genetically distinct from Anatolian farmers; farming arose independently in multiple Near Eastern regions |
| Damgaard et al. | 2018 | BMAC sites, Turkmenistan/Uzbekistan | ~2300-1500 BCE | BMAC populations initially had no steppe ancestry; steppe genes appeared only after ~2100 BCE |
| Pathak et al. | 2018 | Burzahom, Kashmir, India | ~2500 BCE | One of the first ancient DNA extractions from within India; limited data but showed South Asian affinity |

Central Asian Bronze Age Sites: BMAC, Sintashta, and Andronovo

Understanding ancient DNA from Central Asia is essential for interpreting Indian genetic history, because Central Asia served as the corridor through which steppe ancestry eventually reached South Asia. Three archaeological cultures are particularly relevant:

The Bactria-Margiana Archaeological Complex (BMAC)

The BMAC (approximately 2300-1700 BCE) was a sophisticated Bronze Age civilization centered in modern Turkmenistan and Uzbekistan. Ancient DNA from BMAC sites has shown that the population was initially composed of Iranian-related farmer ancestry with some additional western Siberian hunter-gatherer admixture. Crucially, early BMAC individuals had no steppe pastoralist ancestry.

This finding has important implications for India. The BMAC was once considered a potential staging ground for Indo-Iranian migrations into South Asia. The absence of steppe ancestry in early BMAC individuals means that the BMAC population itself was not the source of Indo-Aryan migrations. Instead, the data suggest that steppe pastoralists (from Sintashta and related cultures) moved through or alongside BMAC territory, picking up some BMAC-related ancestry along the way, before continuing south into the Indian subcontinent.

Sintashta and Andronovo Cultures

The Sintashta culture (approximately 2100-1800 BCE) and the related Andronovo horizon (approximately 2000-900 BCE) of the Central Asian steppe are the archaeological cultures most closely associated with the early Indo-Iranian-speaking peoples. Ancient DNA has confirmed that Sintashta and Andronovo individuals carried high proportions of steppe pastoralist ancestry, ultimately derived from the earlier Yamnaya culture of the Pontic-Caspian steppe.

Populations genetically similar to the Sintashta and Andronovo groups are the source of the "steppe ancestry" component found in modern Indians. This ancestry is most concentrated in northwestern South Asian populations and in groups traditionally classified as upper caste, consistent with the hypothesis that it entered the subcontinent from the northwest during the second millennium BCE.

Iranian Neolithic Samples: The Zagros Farmers

Ancient DNA from Iranian Neolithic sites, particularly Ganj Dareh (dated to approximately 8000 BCE, or roughly 10,000 years ago), has been instrumental in understanding one of the two major ancestral components of the IVC population. The key findings from Iranian Neolithic aDNA are as follows:

Key Takeaway: The Iranian-related ancestry in the Harappans does not mean they migrated from Iran. Rather, both the Zagros farmers and the ancestors of the IVC people belonged to a widespread ancestral population that existed across western and southern Asia before the Neolithic period. Their paths diverged thousands of years before the IVC was founded.

AASI: Ancient Ancestral South Indians

The other major component of ancient Indian genetics - and arguably the most important for understanding the deep roots of South Asian populations - is the Ancient Ancestral South Indian (AASI) lineage. Unlike the other ancestral components discussed in this article, no ancient DNA from a "pure" AASI individual has ever been recovered. Instead, AASI is reconstructed computationally from its genetic signature in ancient and modern genomes.

What We Know About AASI

The absence of direct AASI ancient DNA is one of the most significant gaps in South Asian archaeogenetics. Recovering pre-Neolithic ancient DNA from the Indian subcontinent - DNA from individuals who lived before the mixing of Iranian-related and AASI populations - would be a transformative discovery.

Challenges Facing Ancient DNA Research in India

Despite the breakthroughs described above, ancient DNA research in India faces substantial challenges that continue to limit the scope and speed of discoveries. Understanding these challenges helps explain why our knowledge of ancient South Asian genetics remains incomplete.

Tropical Climate and DNA Degradation

This is the single biggest obstacle. DNA is a fragile molecule that degrades through hydrolysis and oxidation. In hot, humid environments, these processes are dramatically accelerated. A skeleton buried in the cold, dry soils of Scandinavia may yield abundant DNA after 5,000 years, while a skeleton of the same age buried in the alluvial plains of the Ganges or the Indus may retain almost none.
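
One common way to reason about this temperature effect is the Arrhenius relation from chemical kinetics, under which reaction rates rise exponentially with temperature. The Python sketch below compares relative decay rates at two burial temperatures. The activation energy is an assumed, order-of-magnitude value and the two temperatures are illustrative, so the result is a rough indication only; real preservation also depends on moisture, soil chemistry, and microbial activity.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K)

# ASSUMPTION: an illustrative activation energy for DNA decay, in J/mol.
# Treat it as a placeholder, not as a measured value for any particular site.
ASSUMED_ACTIVATION_ENERGY = 127_000.0


def relative_decay_rate(warm_c: float, cool_c: float,
                        activation_energy: float = ASSUMED_ACTIVATION_ENERGY) -> float:
    """Return how many times faster decay proceeds at warm_c than at cool_c.

    Uses the Arrhenius ratio k_warm / k_cool = exp(Ea/R * (1/T_cool - 1/T_warm)),
    with temperatures converted from Celsius to Kelvin.
    """
    warm_k = warm_c + 273.15
    cool_k = cool_c + 273.15
    exponent = activation_energy / GAS_CONSTANT * (1.0 / cool_k - 1.0 / warm_k)
    return math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical mean burial temperatures: 25 C (warm plain) vs 5 C (cold soil).
    ratio = relative_decay_rate(warm_c=25.0, cool_c=5.0)
    print(f"Decay is roughly {ratio:.0f} times faster at the warmer site")
```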

The problem is compounded by seasonal monsoons, which saturate burial sites with water and accelerate chemical degradation. Even at well-preserved archaeological sites, the proportion of endogenous (original human) DNA in ancient bones from South Asia is typically less than 1%, compared to 10-50% at many European sites. This means that sequencing ancient South Asian genomes requires vastly more effort and expense per individual.
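
The cost implication follows from simple arithmetic: if only a small share of sequenced reads is endogenous, total sequencing has to scale up in inverse proportion. The Python sketch below illustrates this under simplifying assumptions that are not taken from any study: a 3.1 billion base-pair genome, 50 base-pair reads, and no losses to duplicate reads or mapping filters. The function name and its parameters are hypothetical.

```python
ASSUMED_GENOME_SIZE_BP = 3.1e9  # ASSUMPTION: approximate human genome length
ASSUMED_READ_LENGTH_BP = 50.0   # ASSUMPTION: ancient DNA fragments are short


def reads_needed(target_coverage: float, endogenous_fraction: float,
                 genome_size_bp: float = ASSUMED_GENOME_SIZE_BP,
                 read_length_bp: float = ASSUMED_READ_LENGTH_BP) -> float:
    """Estimate total reads to sequence to reach target_coverage of the genome.

    Only endogenous_fraction of the reads come from the ancient individual;
    the rest (microbial and other environmental DNA) is wasted effort.
    """
    if not 0.0 < endogenous_fraction <= 1.0:
        raise ValueError("endogenous_fraction must be in (0, 1]")
    endogenous_reads = target_coverage * genome_size_bp / read_length_bp
    return endogenous_reads / endogenous_fraction


if __name__ == "__main__":
    # Illustrative endogenous fractions: 0.5% (poor preservation) vs 20% (good).
    poor = reads_needed(target_coverage=1.0, endogenous_fraction=0.005)
    good = reads_needed(target_coverage=1.0, endogenous_fraction=0.20)
    print(f"poor preservation: {poor:.2e} reads")
    print(f"good preservation: {good:.2e} reads")
    print(f"ratio: {poor / good:.0f}x")
```

In practice, laboratories often reduce this burden with targeted enrichment rather than sequencing everything, so the ratio is best read as an upper bound on the extra effort.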

Limited Burial Practices

Many ancient Indian communities practiced cremation, which destroys all DNA. Even during the IVC period, burial practices varied - some sites show evidence of secondary burial (where bones were deposited after the flesh had decomposed elsewhere), which further reduces DNA preservation. The relative scarcity of well-preserved burial sites in India, compared to Europe or Central Asia, means there are simply fewer opportunities to attempt DNA extraction.

Few Specialized Laboratories

Ancient DNA work requires highly specialized cleanroom facilities to avoid contamination with modern DNA. As of the mid-2020s, only a handful of institutions in India have established dedicated ancient DNA laboratories. The Birbal Sahni Institute of Palaeosciences (BSIP) in Lucknow and the CSIR-Centre for Cellular and Molecular Biology (CCMB) in Hyderabad are among the leading Indian institutions in this field. However, the capacity remains limited compared to the dozens of ancient DNA facilities in Europe and North America.

Political and Cultural Sensitivities

Ancient DNA findings in South Asia inevitably intersect with politically charged debates about identity, migration, and the origins of Indian civilization. The Aryan migration question, in particular, has been a flash point. Some researchers have faced pressure to frame their findings in ways that support particular political narratives. International collaborations have sometimes been complicated by concerns about genetic sovereignty and the export of ancient samples. Navigating these sensitivities while maintaining scientific rigor is an ongoing challenge for researchers in the field.

The Future of Ancient DNA Research in India

Despite the challenges, there are strong reasons for optimism about the future of archaeogenetics in South Asia. Several developments are likely to dramatically expand our knowledge in the coming years:

Petrous Bone Techniques

The petrous bone - the densest bone in the human body, located in the skull behind the ear - has been shown to preserve DNA far better than other skeletal elements. The Rakhigarhi breakthrough was possible precisely because researchers targeted the petrous bone. As this technique becomes standard practice, success rates for DNA extraction from South Asian sites are expected to improve significantly.

More IVC Sites

Major IVC sites such as Dholavira in Gujarat, Lothal in Gujarat, Kalibangan in Rajasthan, and Banawali in Haryana have all yielded skeletal remains. Systematic attempts to extract DNA from these sites could reveal whether there was genetic diversity within the IVC - did people at different Harappan cities have different proportions of Iranian-related and AASI ancestry? Were there trading communities with foreign genetic signatures? These questions can only be answered with more data.

Pre-Neolithic Samples

The holy grail of South Asian archaeogenetics is the recovery of ancient DNA from pre-Neolithic (Mesolithic or Upper Paleolithic) individuals in the subcontinent. Such samples would represent populations before the mixing of Iranian-related and AASI lineages, providing direct evidence for the genetic composition of the earliest inhabitants of South Asia. Sites such as Bhimbetka in Madhya Pradesh and Hunsgi in Karnataka, which show evidence of habitation stretching back tens of thousands of years, are potential targets.

Ancient Pathogen DNA

Beyond human ancestry, ancient DNA can reveal the pathogens that ancient people carried. Studies of ancient plague DNA in Europe have revolutionized our understanding of disease history. Similar studies in South Asia could shed light on the disease pressures that shaped Indian populations - including questions about the role of epidemics in the decline of the IVC.

Expanding Indian Laboratory Capacity

Several new ancient DNA facilities are being established or expanded in India. The government's increased funding for genomic research, combined with training programs that send Indian researchers to leading international labs, is building the human and institutional capacity needed for large-scale archaeogenetic studies. Within the next decade, India is expected to become a major producer of ancient DNA data rather than primarily a consumer of results generated abroad.

Frequently Asked Questions

Has ancient DNA been successfully recovered from India?

Yes, but only in very limited quantities. The most notable success is the extraction of ancient DNA from a woman buried at Rakhigarhi in Haryana, dated to approximately 2500 BCE, as part of the Indus Valley Civilization. Ancient DNA has also been recovered from skeletons at Roopkund lake in Uttarakhand and from several sites in the Swat Valley of Pakistan. India's tropical climate makes DNA preservation extremely difficult, so the total number of successfully sequenced ancient genomes from the subcontinent remains far smaller than what has been achieved in Europe or Central Asia.

What did the Rakhigarhi ancient DNA study reveal?

The Rakhigarhi study (Shinde et al. 2019) revealed that a woman buried at this major Indus Valley Civilization site around 2500 BCE had a genetic profile consisting of Iranian-related farmer ancestry mixed with Ancient Ancestral South Indian (AASI) ancestry. Her genome contained no detectable steppe pastoralist ancestry. This demonstrated that the people who built the IVC were not descended from Central Asian steppe populations and that the steppe-related ancestry found in many modern Indians must have arrived after the IVC period, likely during the second millennium BCE.

Why is ancient DNA so rare in India compared to Europe?

Three main factors contribute to the scarcity of ancient DNA from India. First, the tropical climate - with its heat, humidity, and monsoon rains - rapidly degrades DNA in buried remains. A 4,000-year-old skeleton in India may contain less than 1% endogenous DNA, compared to 10-50% at comparable European sites. Second, many ancient Indian cultures practiced cremation rather than burial, leaving no skeletal remains for DNA extraction. Third, the field of ancient DNA research has historically been concentrated in European and North American institutions, and it was only in the late 2010s that significant ancient DNA work began on South Asian material.

What will future ancient DNA studies reveal about India?

Future studies are expected to provide ancient DNA from additional IVC sites, potentially revealing genetic diversity within the Harappan civilization. Advances in petrous bone extraction techniques will improve success rates in tropical climates. Researchers are particularly eager to obtain pre-Neolithic samples from India that could illuminate the earliest human settlements in the subcontinent. Studies of post-IVC transitional sites may clarify exactly when and how steppe ancestry entered different parts of South Asia. The expansion of ancient DNA laboratories within India itself will also accelerate the pace of discovery in the coming decade.

Conclusion: What Ancient DNA Has Taught Us About India

The ancient DNA revolution has, in just a few years, transformed our understanding of Indian population history. Where once we had only inference and speculation, we now have direct genetic evidence from the people who lived in and around the Indian subcontinent thousands of years ago.

The key lessons from ancient DNA research in India so far can be summarized as follows. The Indus Valley Civilization was built by a population with deep South Asian roots - a mixture of Iranian-related ancestry and indigenous AASI ancestry, with no contribution from Central Asian steppe populations. Steppe ancestry entered the subcontinent during the second millennium BCE, after the IVC had begun its decline, consistent with the Indo-Aryan migration model supported by linguistic and archaeological evidence. The mixing of steppe-descended and IVC-descended populations was a gradual process that produced the complex mosaic of modern Indian genetics. And the deepest layer of South Asian ancestry - the AASI component - has been present in the subcontinent for at least 50,000 years, making it one of the oldest continuous population lineages anywhere on Earth.

We are still in the early chapters of South Asian archaeogenetics. Every new ancient genome recovered from the subcontinent has the potential to challenge existing models and reveal new complexities. The coming years promise exciting discoveries as techniques improve, laboratories expand, and more sites are excavated with ancient DNA recovery in mind.
