Shoonya is an open-source platform that improves the efficiency of language work in Indian languages with AI tools and custom-built user interfaces and features. Efficient language work is a key requirement for creating the larger datasets needed to train AI models, such as neural machine translation systems, for a large number of Indian languages.

Shoonya is designed to support various types of language work, including translation, text validation, speech transcription, and optical character recognition. The current focus of Shoonya is on translation.

Features supported

Workplace Management

Shoonya provides a hierarchical way to organize language work into organizations, workspaces, and projects.
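
The organization, workspace, and project hierarchy can be pictured as nested containers. The following is a minimal sketch, not Shoonya's actual data model: the class names, the fields, and the `all_projects` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Project:
    # Hypothetical fields; a real project would carry far more metadata.
    name: str
    target_language: str


@dataclass
class Workspace:
    name: str
    projects: List[Project] = field(default_factory=list)


@dataclass
class Organization:
    name: str
    workspaces: List[Workspace] = field(default_factory=list)

    def all_projects(self) -> List[Project]:
        """Flatten the hierarchy into a single list of projects."""
        return [p for ws in self.workspaces for p in ws.projects]


# Example: one organization, one workspace, two translation projects.
org = Organization(
    name="Example Org",
    workspaces=[
        Workspace(
            name="Translation Team",
            projects=[
                Project("News EN-HI", "Hindi"),
                Project("News EN-TA", "Tamil"),
            ],
        )
    ],
)
project_names = [p.name for p in org.all_projects()]
```

Nesting the containers this way means permissions or settings could, in principle, be attached at any level and inherited downward.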

NMT support

Shoonya can pre-populate automatic translations from IndicTrans, which currently supports 12 Indic languages.

Transliteration Support

Shoonya simplifies text entry by letting users type in Roman characters, with transliteration provided by IndicXlit models that support 20+ languages.

Maker-Checker-Superchecker Flow

Shoonya provides multiple ways to evaluate the quality of translated data through automated maker-checker flows.
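
One way to picture a maker-checker-superchecker flow is as a small state machine in which each review stage either passes a task forward or sends it back for rework. The sketch below is an illustrative assumption, not Shoonya's implementation: the `Stage` names, the `next_stage` function, and the rule that any rejection returns the task to the maker are all hypothetical.

```python
from enum import Enum


class Stage(Enum):
    MAKER = "maker"                # translator produces or revises the translation
    CHECKER = "checker"            # first reviewer
    SUPERCHECKER = "superchecker"  # second reviewer
    ACCEPTED = "accepted"          # final state


# Assumed linear ordering of stages.
_ORDER = [Stage.MAKER, Stage.CHECKER, Stage.SUPERCHECKER, Stage.ACCEPTED]


def next_stage(stage: Stage, approved: bool) -> Stage:
    """Return the stage a task moves to after the current stage acts on it.

    For the maker, `approved=True` means "submitted for review".
    Assumption: a rejection at any review stage returns the task to the maker.
    """
    if stage is Stage.ACCEPTED:
        raise ValueError("an accepted task has no further stage")
    if not approved:
        return Stage.MAKER
    return _ORDER[_ORDER.index(stage) + 1]


# Example: a task is submitted, rejected once by the checker, then approved twice.
history = [Stage.MAKER]
for decision in [True, False, True, True, True]:
    history.append(next_stage(history[-1], decision))
```

In this sketch, `history` records every stage the task passes through, which makes the rework loop after the rejection explicit.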

Context View

Shoonya allows translators to see paragraph-level context when translating an individual sentence.
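
Paragraph-level context amounts to showing the sentences around the one being translated. The following is a minimal sketch under the assumption that a paragraph is already split into a list of sentences; the `split_context` function and its return shape are hypothetical and are not part of Shoonya's API.

```python
from typing import List, Tuple


def split_context(sentences: List[str], index: int) -> Tuple[List[str], str, List[str]]:
    """Split a paragraph into (preceding sentences, current sentence, following sentences)."""
    if not 0 <= index < len(sentences):
        raise IndexError("sentence index out of range")
    return sentences[:index], sentences[index], sentences[index + 1:]


# Example paragraph, already segmented into sentences.
paragraph = [
    "The river flooded last week.",
    "It damaged several bridges.",
    "Repairs are expected to take months.",
]
before, current, after = split_context(paragraph, 1)
```

Here the translator working on "It damaged several bridges." would also see the preceding sentence, which is what resolves the pronoun "It".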

Cross-lingual Support

For low-resource languages, Shoonya can show annotators translations of the same text in other languages.

Shoonya v2 has been released publicly.