Gujarati DNA & Ancestry: Genetics of Patel, Anavil, Bania & Jain Lineages

By Dr. Kavitha Krishnamurthy · June 2026 · 14 min read

Gujarat occupies a singular place in South Asian genetic history. The state sits in what was once a major region of the Indus Valley Civilization, one of humanity's earliest urban cultures, and Gujarati populations carry remarkably high proportions of this ancient ancestry. Yet modern Gujarat is also one of India's most diaspora-connected states: several hundred thousand people of Gujarati origin live in the United Kingdom alone, and Gujarati is among the most widely spoken South Asian languages in Britain.

Understanding Gujarati genetics means navigating two distinct questions. The first is scientific: what is the deep ancestry of Gujarat's many endogamous communities, from Leva Patels to Anavil Brahmins to Shvetambar Jains? The second is practical: why do Gujaratis so often receive vague, inaccurate results from mainstream DNA tests like 23andMe, and what does a purpose-built Indian ancestry analysis reveal instead?

The Deep Ancestry of Gujarat: Indus Valley at the Core

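Gujarat's genetic profile is shaped by three ancient source populations that mixed over thousands of years: Ancient Ancestral South Indians (AASI), the subcontinent's early hunter-gatherers; Iranian farmer-related ancestry, associated with the Indus Valley Civilization (IVC); and Bronze Age steppe pastoralists, associated with the spread of Indo-European languages.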

The relative proportions of these three components vary across Gujarati communities in ways that reflect occupational history, endogamy, and geographic origin — making community-level ancestry analysis far more informative than a simple "South Asian" label.

Community-by-Community: How Gujarati Groups Differ Genetically

Leva and Kadva Patel

The Patel surname, particularly common among Leva and Kadva Patels from central and north Gujarat, belongs to one of the most studied South Asian populations in human genetics research. The Gujarati Indians in Houston (GIH) sample was included in the 1000 Genomes Project, providing a widely used baseline for Gujarati genetics.

Leva Patels typically show 50–60% ANI ancestry (combining Iranian farmer and steppe components) with correspondingly higher AASI than most North Indian groups. The steppe component is generally 8–15%, reflecting lower Indo-European influence than Punjabi Jats (15–25% steppe) or North Indian Brahmins (up to 30%).

Kadva Patels show a similar but slightly distinct profile, reflecting separate geographic origins in north Gujarat compared to the central Gujarat Leva Patel homeland. Within-community variation exists: Patels from Saurashtra versus those from the Charotar region of Anand district carry subtly different ancestry signatures.

Anavil Brahmin

The Anavil Brahmins of south Gujarat (Surat, Valsad, Navsari) show a higher ANI ratio than Patels — typically 60–72% ANI — with elevated steppe ancestry (15–22%) reflecting their Brahmin status and the general pattern in which Brahmin communities across India carry more steppe ancestry than non-Brahmin groups.

Notably, Anavil Brahmins have been highly endogamous for centuries and show a distinctive genetic signature that separates them from both Nagar Brahmins (who have somewhat different origins) and from Brahmin communities in Maharashtra or Rajasthan.

Bania and Jain Communities

Gujarat's Bania and Jain communities — including the Shvetambar and Digambar Jains, Kapol Banias, and Oswal Banias — show one of the most striking genetic profiles in South Asia: very high Iranian farmer / IVC ancestry, relatively low steppe, and moderate AASI. This pattern suggests these communities descend from populations that were at the heart of the Indus Valley Civilization and experienced limited admixture from the later steppe migration.

Oswal and Maheshwari communities from Rajasthan (historically connected to Gujarat through trade) show similar patterns. The genetic distinctiveness of Jain communities reflects centuries of endogamy within trading communities.

Parsi / Zoroastrian

The Parsi community of Gujarat is perhaps South Asia's most genetically distinctive group. Descended from Zoroastrian refugees who migrated from Iran to Gujarat roughly 1,200–1,400 years ago, Parsis show approximately 50% Iranian ancestry — making them genetically intermediate between the Iranian plateau and South Asian populations. Their endogamy has been exceptionally strict, and the Parsi genetic signature is clearly identifiable in any comprehensive South Asian ancestry test.

Koli and Other Tribal/Agropastoral Communities

Gujarat's Koli communities — one of the largest caste groups in the state — generally show higher AASI ancestry than the upper-caste Patel and Brahmin groups, with lower steppe. This reflects the broader South Asian pattern where historically tribal and lower-caste communities retain more of the pre-farming, pre-steppe population ancestry.

Haplogroups in Gujarati Populations

Y-DNA (paternal lineage) and mtDNA (maternal lineage) haplogroups provide an independent layer of ancestry information beyond the admixture proportions.

| Haplogroup | Frequency | Significance |
| --- | --- | --- |
| L1a / L1b (Y-DNA) | High in Patels, Banias | Ancient South Asian / Indus Valley farmer lineage; elevated in Gujarat |
| R1a (Y-DNA) | Moderate (8–18% in Brahmins) | Indo-European steppe ancestry; lower in Gujarat than in North India |
| J2a (Y-DNA) | Present in Brahmin, Bania | West Asian / ancient Iranian ancestry |
| H1a (Y-DNA) | Common in Koli, artisan groups | Ancient South Asian lineage |
| R2 (Y-DNA) | Present in various groups | South Asian–specific; linked to IVC-era populations |
| M* / M2 (mtDNA) | High across Gujarati groups | Ancient South Asian maternal lineage |
| U2a / U7 (mtDNA) | Moderate | West Eurasian maternal lineage with ancient South Asian presence |
| I (mtDNA) | Notable in Parsi | Iranian / West Asian maternal lineage |

The L1 haplogroup frequency in Gujarat is among the highest in South Asia, supporting the idea that Gujarati populations are particularly direct descendants of the Indus Valley Civilization's farming population. R1a — the dominant marker of steppe migration — is present but at lower frequencies than in the Hindi belt or Punjab.

The ANI/ASI Ratio: Where Gujaratis Fall

The ANI (Ancestral North Indian) and ASI (Ancestral South Indian) framework, developed by Reich et al. in 2009, remains a useful lens for understanding South Asian genetic diversity even as more granular models have been developed.

| Community | ANI % | ASI % | Notes |
| --- | --- | --- | --- |
| Anavil Brahmin (south Gujarat) | 65–72% | 28–35% | Highest ANI among major Gujarati groups |
| Nagar Brahmin | 62–70% | 30–38% | Distinct from Anavil |
| Leva / Kadva Patel | 52–62% | 38–48% | Moderate ANI; lower steppe than Brahmins |
| Bania / Jain (Oswal, Kapol) | 55–65% | 35–45% | Very high IVC ancestry within ANI |
| Koli (various) | 42–55% | 45–58% | Higher ASI; lower steppe |
| Parsi | 70–80%* | 20–30%* | Includes ~50% Iranian ancestry |

What This Means For You

These are population averages. Your personal result will differ — even within the same community, individual variation exists. A DNA test reveals your specific admixture, not just a community average. Gujarati customers at Helixline have discovered everything from unexpected Parsi ancestry signals to higher-than-typical AASI reflecting pre-Brahmin lineages on one parental side.
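
To make the gap between a community average and an individual result concrete, here is a minimal Python sketch. It copies the ANI ranges from the table above and lists every community whose range contains a given personal ANI percentage. The function names and the sample value are illustrative assumptions, not part of any Helixline tool, and real community assignment relies on far more than a single percentage.

```python
# Community ANI ranges (percent), copied from the ANI/ASI table in this article.
ANI_RANGES = {
    "Anavil Brahmin": (65, 72),
    "Nagar Brahmin": (62, 70),
    "Leva / Kadva Patel": (52, 62),
    "Bania / Jain": (55, 65),
    "Koli": (42, 55),
    "Parsi": (70, 80),
}


def asi_range(ani_range):
    """In the two-way ANI/ASI model, ASI is the complement of ANI,
    so the lower and upper bounds swap places."""
    low, high = ani_range
    return (100 - high, 100 - low)


def communities_consistent_with(ani_percent):
    """Return every community whose ANI range contains the given value."""
    return [
        name
        for name, (low, high) in ANI_RANGES.items()
        if low <= ani_percent <= high
    ]


if __name__ == "__main__":
    # Hypothetical individual with 63% ANI.
    print(communities_consistent_with(63.0))
    # The Koli ASI range follows directly from the Koli ANI range.
    print(asi_range(ANI_RANGES["Koli"]))
```

A single ANI value usually falls inside more than one community's range, which is why an ANI/ASI ratio on its own cannot identify a community.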

Why 23andMe and AncestryDNA Give Poor Gujarati Results

If you are Gujarati and have tested with 23andMe or AncestryDNA, you have almost certainly encountered the frustration of seeing "Broadly South Asian" or "Northern Indian & Pakistani" with no community-level detail. There are two technical reasons for this.

Reference panel mismatch. 23andMe's South Asian reference panel relies heavily on the 1000 Genomes GIH (Gujarati Indians in Houston) sample — but this captures mainly Patel ancestry and cannot distinguish between the many Gujarati communities. AncestryDNA has similar limitations for intra-Indian diversity.

Algorithm resolution. Even where Gujarati-specific samples exist in these databases, the assignment algorithms are not optimised to make fine-grained distinctions between, say, a Leva Patel and an Anavil Brahmin or an Oswal Bania. These communities look similar at the level of generic global ancestry but are genuinely distinct at the level of community-specific reference panels.

Helixline's reference database includes dedicated samples from Gujarati communities across the spectrum — Patel, Anavil, Nagar, Bania, Jain, Koli — enabling the kind of community-level ancestry assignment that generic global tests cannot provide.

Already have 23andMe or AncestryDNA data?
If you are Gujarati and have already tested with another provider, there is no need to retest. Upload your existing raw data file to Helixline and get community-level South Asian ancestry results — including Gujarati-specific breakdowns, ANI/ASI ratios, and haplogroups.
Upload Your Data — from $25 / ₹2,500
Supports 23andMe, AncestryDNA, MyHeritage, LivingDNA, and FamilyTreeDNA raw data files.

Gujarati Diaspora and DNA Testing

The Gujarati diaspora is one of the most globally distributed and economically successful immigrant communities in the world. In the United Kingdom, people of Gujarati origin, concentrated in London and in Leicester and the wider East Midlands, make up a large proportion of the British Indian population. Patel is widely reported to be the most common Indian surname in the UK.

For diaspora Gujaratis — whether in the UK, USA, Canada, East Africa, or Australia — DNA ancestry testing offers a direct connection to ancestral roots that family history alone cannot provide. Questions like "Are we descended from the communities who came from Gujarat before going to Uganda or Kenya?" or "Our family says we are Leva Patel — does my DNA confirm this?" are exactly the kinds of questions that population-specific analysis can answer.

Most diaspora Gujaratis who have already tested with 23andMe or AncestryDNA have been disappointed by vague results. The Helixline upload product specifically addresses this gap: you upload the raw data you already have, and receive a Gujarat-specific analysis with community-level breakdowns that mainstream tests cannot provide.
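
For readers curious about what a raw data file contains, the sketch below reads the tab-separated, four-column layout (rsid, chromosome, position, genotype, with `#` comment lines) used by 23andMe raw data exports. Other providers use different layouts, and nothing here describes how Helixline processes an upload. The function names and the sample markers are made up for illustration.

```python
def parse_raw_genotypes(text):
    """Parse 23andMe-style raw data text into (rsid, chromosome, position, genotype) tuples.

    Lines starting with '#' are comments; data lines have four tab-separated fields.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip lines that do not match the assumed layout
        rsid, chromosome, position, genotype = fields
        rows.append((rsid, chromosome, int(position), genotype))
    return rows


def call_rate(rows):
    """Fraction of markers with a genotype call ('--' marks a no-call in this layout)."""
    if not rows:
        return 0.0
    called = sum(1 for row in rows if row[3] != "--")
    return called / len(rows)


# Invented sample content; the rsids below are placeholders, not real markers.
SAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAA\n"
    "rs0000002\t1\t2000\t--\n"
    "rs0000003\tMT\t300\tG\n"
)

if __name__ == "__main__":
    rows = parse_raw_genotypes(SAMPLE)
    print(len(rows), "markers, call rate", round(call_rate(rows), 2))
```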

The Gujarati Connection to the Indus Valley Civilization

One of the most exciting developments in ancient DNA research in recent years has been the recovery of ancient genomes linked to the Indus Valley Civilization. Studies published in Cell and Science in 2019 showed that IVC populations were distinct from both present-day South Asians and steppe populations: they were primarily descended from a population related to early Iranian farmers, mixed with AASI ancestry, with virtually no steppe ancestry.

Modern Gujarati populations, particularly Bania, Jain, and Patel communities, show among the highest proportions of IVC-related ancestry of any present-day South Asian group. This is not surprising given Gujarat's geographic location: Dholavira (in Kutch) was one of the largest Indus Valley Civilization cities, and Lothal (near Ahmedabad) was an important trading town of the same civilization.

What this means in practice: if you are Gujarati and receive a Helixline report, your Iranian farmer component (which proxies IVC ancestry in our analysis) is likely to be one of your largest ancestry components — sometimes exceeding your steppe component by three to four times, in contrast to North Indian Brahmin customers where steppe and IVC ancestry are often more balanced.
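
The arithmetic behind that comparison can be sketched in a few lines of Python. The example assumes a simplified three-component profile (Iranian farmer, steppe, AASI) and follows this article's description of ANI as the sum of the Iranian farmer and steppe components. The percentages are hypothetical, chosen to sit within the Patel ranges quoted earlier, and the function name is illustrative.

```python
def summarise_components(iranian_farmer, steppe, aasi):
    """Summarise a three-component ancestry profile given in percent.

    Returns the combined ANI share (Iranian farmer + steppe, as described
    in this article) and the Iranian-farmer-to-steppe ratio.
    """
    total = iranian_farmer + steppe + aasi
    if abs(total - 100.0) > 0.5:
        raise ValueError(f"components should sum to about 100, got {total}")
    if steppe <= 0:
        raise ValueError("steppe component must be positive to compute a ratio")
    return {
        "ani_percent": iranian_farmer + steppe,
        "farmer_to_steppe_ratio": round(iranian_farmer / steppe, 2),
    }


if __name__ == "__main__":
    # Hypothetical profile: 44% Iranian farmer, 12% steppe, 44% AASI.
    print(summarise_components(44.0, 12.0, 44.0))
```

With these made-up inputs the Iranian farmer component is between three and four times the steppe component, which matches the pattern described above.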

What a Helixline DNA Test Reveals for Gujaratis

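A Helixline ancestry analysis for a Gujarati customer typically includes community-level ancestry breakdowns, ANI/ASI ratios, Y-DNA and mtDNA haplogroups, and ancient DNA components.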

For Gujaratis with Decode or Infinite kits, the analysis also includes health traits, pharmacogenomics, and carrier screening — all optimised for Indian population genetics.

Discover Your Gujarati Ancestry Story
Order a Helixline kit and find out exactly which Gujarati communities your DNA most closely matches — with community-specific breakdowns that 23andMe and AncestryDNA cannot provide.
Order Kit — from ₹6,999
Free shipping · Results in 6–8 weeks · Full privacy controls

Frequently Asked Questions

What is the typical ANI/ASI ratio in Gujaratis?
Gujarati populations show a range of ANI (Ancestral North Indian) ancestry depending on community. Leva and Kadva Patels typically show 50–60% ANI, Anavil Brahmins 60–70% ANI, and Bania/Jain communities 55–65% ANI. The ASI (Ancestral South Indian) component is lower in Gujarat than in South Indian populations but higher than in most North Indian Brahmin groups. Ancient Iranian farmer ancestry (from Indus Valley Civilization populations) is notably high in all Gujarati groups.
What haplogroups are common in Gujarati men?
Y-DNA haplogroup L1 is particularly common in Gujarati communities and reflects the Indus Valley Civilization lineage. R1a (steppe ancestry) is present but at lower frequencies than in North Indian Brahmins. H1a is common in Koli and artisan communities. J2a is found in some Brahmin and Bania communities, reflecting ancient West Asian ancestry. R2 is present in certain Patel sub-groups. Overall, Gujarati Y-DNA is weighted more toward IVC-era lineages than toward the Indo-European-associated R1a lineage that is more common in the northwest.
Why do Gujaratis get poor results on 23andMe?
23andMe's South Asian reference panel cannot distinguish between Gujarati communities — Patel, Anavil, Bania, and Koli profiles all get collapsed into "Northern Indian & Pakistani" or "Broadly South Asian." Helixline's reference panel includes dedicated Gujarati community samples, providing specific community-level breakdowns rather than a generic South Asian label.
Are Gujarati genetics similar to Punjabi or different?
Gujaratis and Punjabis are genetically distinct. Punjabis typically show 15–25% steppe ancestry while Gujarati communities generally show 8–18%. Gujaratis have higher Indus Valley Civilization / Iranian farmer ancestry compared to Punjabis, reflecting Gujarat's position as an IVC heartland where steppe migration was less intense.
Can I upload my AncestryDNA or 23andMe data to get Gujarati-specific results?
Yes. If you have already tested with 23andMe, AncestryDNA, MyHeritage, LivingDNA, or FamilyTreeDNA, you can download your raw data file and upload it to Helixline. The analysis provides Gujarat-specific ancestry breakdowns, community signals, haplogroups, and ancient DNA components starting from $25 / ₹2,500 — no new saliva sample needed.