PlainSpell data
The corpus in numbers
According to the PlainSpell database, the corpus holds 6,918,744 words and 3,375,992 confusable pairs across 5 languages, sourced from Wiktionary (kaikki.org, CC BY-SA) and an open word-frequency list, with a data vintage of May 2026. Every figure below is counted directly from the live database and is free to cite.
At a glance
PlainSpell indexes 6,918,744 words across 5 languages, with 3,375,992 confusable pairs, 27,821 homophone groups and 1,974,632 generated misspelling variants.
- 6.92M words indexed
- 3.38M confusable pairs
- 27,821 homophone groups
- 1.97M misspelling variants
Source: Wiktionary (kaikki.org, CC BY-SA) + open word-frequency list. Data vintage May 2026.
Words indexed by language
| Language | Words | Confusables | Homophones |
|---|---|---|---|
| 🇫🇷 French | | 440,172 | 21,890 |
| 🇩🇪 German | | 2,006,359 | 2,859 |
| 🇪🇸 Spanish | | 323,831 | 812 |
| 🇺🇸 English | | 529,999 | 2,182 |
| 🇧🇷 Portuguese | | 75,631 | 78 |
| All languages | 6,918,744 | 3,375,992 | 27,821 |
Counts are read live from the PlainSpell database (data vintage May 2026). Misspelling variants are generated by edit distance from each headword rather than taken from observed corpus frequencies.
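To make "generated by edit distance" concrete, here is a minimal sketch of how variants one edit away from a headword could be enumerated. It is an illustration only, not PlainSpell's actual generator: the function name `edit_distance_one_variants`, the choice of distance 1, the inclusion of adjacent-letter swaps, and the lowercase ASCII alphabet are all assumptions, since the page does not specify them.

```python
import string


def edit_distance_one_variants(
    headword: str, alphabet: str = string.ascii_lowercase
) -> set[str]:
    """Return all strings one edit away from `headword`.

    Hypothetical helper: an edit is a single-letter deletion, insertion,
    replacement, or a swap of two adjacent letters.
    """
    # Every way of cutting the word into a left and a right part.
    splits = [(headword[:i], headword[i:]) for i in range(len(headword) + 1)]

    # Drop the first letter of the right part.
    deletes = {left + right[1:] for left, right in splits if right}
    # Swap the first two letters of the right part.
    transposes = {
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    }
    # Replace the first letter of the right part with each alphabet letter.
    replaces = {
        left + ch + right[1:] for left, right in splits if right for ch in alphabet
    }
    # Insert each alphabet letter at the cut point.
    inserts = {left + ch + right for left, right in splits for ch in alphabet}

    variants = deletes | transposes | replaces | inserts
    # A replacement or swap can reproduce the headword itself; exclude it.
    variants.discard(headword)
    return variants
```

For example, `edit_distance_one_variants("cat")` contains "ct", "act", "cot" and "cart". A generator along these lines produces many candidates per headword regardless of how often people actually make each mistake, which is why such variants are not frequency data. Languages with accented letters would need a wider alphabet than the ASCII default assumed here.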
Cite these statistics
These figures are free to reuse with attribution (CC BY-SA). Copy the citation:
PlainSpell, “PlainSpell Corpus Statistics” (May 2026). Derived from Wiktionary (kaikki.org, CC BY-SA) and an open word-frequency list. https://plainspell.com/statistics
Go deeper
From the totals to the individual records.
- Explore any language in full, with definitions, IPA, etymology and misspellings: English A–Z
- See the cross-language rankings (hardest to spell, most confusable, largest homophone groups): Rankings
- Read how the corpus is built and refreshed: Methodology