IndicXTREME is a robust, human-annotated NLU benchmark for evaluating language models on Indic languages. It consists of 9 tasks across 18 Indic languages.


Dataset Download Link
IndicSentiment link
IndicXNLI link
IndicCOPA link
IndicXParaphrase link
MASSIVE Intent Classification link
Naamapadam link
MASSIVE Slot Filling link
IndicQA link

Coming Soon

Dataset Descriptions and Examples


  • Sumanth Doddapaneni (AI4Bharat, IITM)
  • Rahul Aralikatte (McGill, MILA)
  • Gowtham Ramesh (AI4Bharat, IITM)
  • Shreya Goyal (AI4Bharat)
  • Mitesh Khapra (AI4Bharat, IITM)
  • Anoop Kunchukuttan (Microsoft, AI4Bharat, IITM)
  • Pratyush Kumar (Microsoft, AI4Bharat, IITM)

Corresponding author: Sumanth Doddapaneni


If you are using any of the resources, please cite the following article:

  doi = {10.48550/ARXIV.2212.05409},
  url = {},
  author = {Doddapaneni, Sumanth and Aralikatte, Rahul and Ramesh, Gowtham and Goyal, Shreya and Khapra, Mitesh M. and Kunchukuttan, Anoop and Kumar, Pratyush},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {IndicXTREME: A Multi-Task Benchmark For Evaluating Indic Languages},
  publisher = {arXiv},
  year = {2022}, 
  copyright = { perpetual, non-exclusive license}


IndicXTREME is released under this licensing scheme:

  • We do not own any of the text from which this data has been extracted.
  • We license the actual packaging of this data under the Creative Commons CC0 license (“no rights reserved”).
  • To the extent possible under law, AI4Bharat has waived all copyright and related or neighboring rights to IndicXTREME.
  • This work is published from: India.