This page presents the history, timeline, and latest updates of my work on Rhetorical Density, the quantitative linguistics metric I invented, and on its operationalisation in the Arabic language, which I have named the BALAGHA Score because the Arabic word for rhetoric is al-balāgha.
2025 – 2026 – Building my research laboratory and infrastructure
13 January 2026 – The Arabic Rhetorical Device Identifier is released on GitHub and Zenodo. This is a custom GPT and AI system prompt package which uses LLMs to assist in identifying Arabic rhetorical devices in classical and contemporary Arabic texts of any dialect.
15 December 2025 – Launch of BalaghaID – a web-based interactive application for identifying Arabic rhetorical devices through guided decision-making. Also available on GitHub, Zenodo, and as an offline version.
10 December 2025 – My set of rules for tokenising Arabic words for rhetorical density calculations – the Arabic Word Tokenisation Scheme (v0.1.0) – is released on GitHub and Zenodo.
Compound Arabic words need to be broken down into their constituent parts before performing a word count during a rhetorical density calculation. The Arabic Word Tokenisation Scheme offers a framework for breaking compound words in a consistent way when analysing different texts. Having a transparent and defined tokenisation scheme like this is essential to rhetorical density calculation.
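To illustrate why the tokenisation step matters, here is a minimal Python sketch. It is not the Arabic Word Tokenisation Scheme itself: the prefix list, the lexicon check, and the function names are all assumptions made purely for illustration. It only shows that splitting a compound word changes the word count that a rhetorical density calculation relies on.

```python
# Hypothetical one-letter prefixes, for illustration only. The actual
# Arabic Word Tokenisation Scheme defines its own rules, which may differ.
ASSUMED_PREFIXES = ("و", "ب", "ل")  # wa- "and", bi- "with/by", li- "for"


def split_compound(word, lexicon):
    """Split off one assumed prefix if the remainder is a known word."""
    for prefix in ASSUMED_PREFIXES:
        rest = word[len(prefix):]
        if word.startswith(prefix) and rest in lexicon:
            return [prefix, rest]
    return [word]  # no assumed rule applies, keep the word whole


def tokenise(text, lexicon):
    """Split on whitespace, then break compounds using the assumed rule."""
    tokens = []
    for word in text.split():
        tokens.extend(split_compound(word, lexicon))
    return tokens


# A toy lexicon (an assumption); a real scheme would not depend on one.
lexicon = {"قال", "الرجل"}
text = "وقال الرجل"  # "and the man said"

naive_count = len(text.split())             # whitespace-only word count
scheme_count = len(tokenise(text, lexicon))  # count after splitting "و" + "قال"
print(naive_count, scheme_count)
```

In this toy example the whitespace count and the tokenised count differ, so two analysts using different conventions would divide by different word counts for the same text. A published, fixed scheme removes that ambiguity.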
17 November 2025 – My taxonomy of Arabic rhetorical devices – The Arabic Rhetorical Device Taxonomy (v0.1.0) – is released on GitHub and Zenodo.
Consisting of 95 Arabic rhetorical devices, their definitions, taxonomy, classification, relationships, and usage examples, it is a permanent, shareable, and citable resource which can be used by anyone working in the field of Arabic rhetoric.
- Update: See my LinkedIn post on this milestone in my research project, and why it’s important for modern-day research in Arabic rhetoric.
- Update: Version 0.1.1 adds machine-readable files to the taxonomy.
16 October 2025 – Launch of BalaghaBase.org, an open knowledge base for Arabic rhetoric, which will be used to semantically map concepts, rhetorical devices, research, and interlinked data.
28 August 2025 – Creation of Rhetorical-Density.com, a dedicated site for all things rhetorical density.
17 July 2025 – The Balagha Corpus becomes a Crossref member and starts to mint DOIs for annotated texts in the Corpus. One of the first DOIs (10.64393/balagha-corpus.7451964) was for the Balagha Corpus itself!
- Update: As of November 2025, over 100 DOIs have been minted, including one for each rhetorical device listed in the Encyclopedia of Arabic Rhetoric.
26 June 2025 – I presented data about rhetorical density and the BALAGHA Score at the 13th International Quantitative Linguistics Conference (QUALICO 2025) in Brno, Czech Republic.
- The title of my presentation was "The BALAGHA Score: A Quantitative Linguistics Approach to Arabic Rhetorical Analysis".
3 April 2025 – Launch of the Balagha Corpus, which will host texts annotated for Arabic rhetorical devices. This solves the problem of how to display annotated Arabic texts, as the Corpus provides a very elegant and engaging way to publicly display Arabic texts annotated with rhetorical devices.
- As mentioned above, the Corpus can mint its own DOIs, and it also has an ISSN (yes, like a journal…), both of which enhance discovery of the Corpus and its datasets.
7 March 2025 – I presented my research on rhetorical density and the BALAGHA Score at “Pathways to the Future: A Digital Humanities Conference”, a conference jointly organised by SOAS University of London and Keio University, Tokyo, Japan.
2024 – Laying the academic foundations of my research
23 September 2024 – The start of my PhD in Arabic at SOAS University of London. The working title of my research project is “The BALAGHA Score: A Digital Humanities Approach to Assessing Arabic Rhetoric”. My supervisors are Dr Marlé Hammond and Dr Chris Lucas.
2022 – Invention of Rhetorical Density and the BALAGHA Score
8 September 2022 – I launched the Encyclopedia of Arabic Rhetoric to collate and present information about Arabic rhetorical devices.
6 September 2022 – My MA dissertation, supervised by Dr Mustafa Baig and submitted to the University of Exeter, was a feasibility study investigating the use of a numerical scoring system – a precursor to the BALAGHA Score – to objectively measure and compare the density of rhetoric in Arabic texts.
- This work was the first description of rhetorical density as a new quantitative linguistics metric.
30 August 2022 – I registered the domain name BalaghaScore.com as part of the original research for my MA in Advanced Arabic at the University of Exeter.
- This website hosted the “Arabic Rhetoric Literary Device Density Measurement System” – a world-first, working prototype implementation of rhetorical density measurement (in any language), which subsequently became the BALAGHA Score for measuring rhetorical density in Arabic specifically.
2021 – Birth of an idea, about how to assess authorship of the Quran
4 November 2021 – Scribbled in my room in a hall of residence in Exeter, these are my initial brainstorming notes about my MA research project proposal.
- About the doctrine of the Quran’s inimitability, I wrote, “One aspect [of the Quran’s inimitability] is that [the inimitability] is [in] the domain of balāgha. It is said that [the Quran] has [a] rich use of literary devices which is more than others e.g. [Surah] Hud [Verse 44] [has] 25 [literary] devices [in that one verse alone].”
- My research question was “Is it possible to objectively measure the use of balāgha in the Quran and compare it with non-Quranic texts?”
- I noted, “As far as I am aware, there is no way to objectively compare the balāgha in one text with that of another.”
- Therefore, “The objective is to create a way of objectively measuring balāgha in any text.”
This moment was the birth of the idea of Rhetorical Density as a quantitative linguistics metric, the BALAGHA Score as its implementation in Arabic, and their use in comparing the amount of rhetoric in the Quran with non-Quranic texts, to see whether the Quran is exceptional (or miraculous) in this regard.
Please also read the background to my research (and here too) and visit the main BALAGHA Score Project page.