AI, Illiteracy, and Written Language in Mali
Nouhoum Coulibaly, Alou Dembele, Michael Leventhal | RobotsMali AI4D Lab | Bamako, Mali | research@robotsmali.org
Abstract
This paper examines the potential contribution of AI to reducing extreme rates of illiteracy in Mali. Its point of departure is research demonstrating that education exclusively in French, a language Malians do not speak at home, is a primary contributing factor to the current illiteracy rate of approximately 70%. We consider the problem of addressing illiteracy when the target language is predominantly oral and propose a set of guidelines to respond to its unique challenges. We then assess the capabilities of AI to accelerate the implementation of a proposed solution for Malian languages and examine the results of a pilot project.
The Malian Context
Mali is a West African country with a long history of empires enriched by trans-Saharan trade, gold, and salt, and it is renowned as an ancient center of learning and scholarship. France ruled the country from the late 19th century until independence in 1960, introducing the use of the French language in government and in education. The country currently has one of the lowest HDI (human development index) scores in the world, a very low level of industrialisation, and persistent security threats from jihadists implanted in remote border regions. Our work focuses on uses of AI that may help to address some of the challenges Mali faces more efficiently than traditional interventions. Education is our primary focus because it is a root solution that can strengthen local capacity to foster economic and social development.
Mali has 13 national languages, that is, languages that are the first language of most Malians and are the vehicular languages between dialect communities that speak variants of the national languages. These languages are supported by government institutions and have official status. Bambara is spoken by about 80% of the population, and is part of the Manding language family covering approximately 50 million speakers of largely mutually-intelligible languages. More people speak a Manding language than, for example, Polish.
AI for Education Project
We have not seen evidence that AI can surpass human educators, but education is a domain that requires a great deal of individualized attention and therefore suffers from a perennial resource gap. AI-based systems that can fill some of that gap have been the subject of intense investigation over the last few years. Both inside and outside the classroom, AI has been vaunted as capable of increasing efficiency in the development and delivery of educational content and, for learners, of offering extreme personalization of the learning experience when and where a teacher is not available. The overall objective of using AI in education is to make quality education available to all children, more adaptive to the specific needs of each student, and more engaging through various types of immersive experiences, and to do all this with limited resources. As in other domains, there is strong and widespread concern about the potential of AI to produce content that is inappropriate or even harmful, a concern that is particularly acute where the content aims to develop the skills, knowledge, and character of children. Such AI systems must be rigorously controlled for quality and appropriateness of output, whether by reliable automated means, if feasible, or by a human-in-the-loop approach, that is, by integrating human control and feedback.
This paper describes a human-in-the-loop AI system used to produce illustrated children’s books and associated student comprehension materials and teaching guides in African languages, where few such materials exist today and resources to produce them are very limited. The paper frames the problem that we are attempting to solve by exploring the applicable educational context and describes how AI and human supervision were used in the solution. We describe the output of the project, its potential relevance as a solution to the target problem, and a limited evaluation of the system’s performance in field testing.
A Unique Educational Challenge
Mali has one of the highest rates of illiteracy in the world at approximately 70%, with low participation in, and poor results from, the formal school system. The language of instruction in Malian schools is French, yet the majority of the population does not speak French at all, and the minority of educated francophones do not use French as the primary language in the home. Research studies have demonstrated that the gap between the written language and the language used in the home and in the daily life of Malians is a major factor in the poor performance of Malian schools and in low literacy rates. There have been many efforts over the years to improve literacy by using the languages that children speak in early-grade education, motivated by the observation that children acquire reading and numeracy skills far faster and more reliably in their native language than in French. The dominant theory is that successfully onboarding children into the formal education system in their early years using their maternal language will give them a better base for acquiring more advanced literacy skills, leading to motivation to stay in school and better results as their schooling progresses.
The implementation of early-age education in maternal languages in Mali has never been widespread and, despite poor results with the current educational system and the proven benefits of the proposed alternative, it has become quite rare. Many reasons have been advanced for this, but we would like to highlight one that is rarely mentioned and that we hypothesize is, in reality, the primary reason: the lack of a written-language tradition. The objective of our AI experiments can be understood as providing the material necessary to test this hypothesis.
All Malian languages are predominantly oral languages: while they are spoken by millions of people, a vanishingly small number of native speakers can read or write them, including those who are literate in French. There is little reason to learn to read and write them, since there are no adult books of any kind originally written in Malian languages. There are a handful of translations, the most significant being the Bible. There are a small number of children’s stories, most of them traditional tales, often called “tales of Grandmother”. These are, arguably, tales taken from the oral tradition and not conceived with all the potential devices that distinguish a text anchored in a written tradition.
All languages with fully developed written forms exhibit greater or lesser degrees of diglossia between oral and written expression. Timing and intonation are critical elements of oral discourse that cannot be transmitted by a written text, while writing organizes ideas around units delimited by punctuation and larger structural units such as paragraphs, sections, and chapters. There are differences in vocabulary and grammar. One can communicate orally having mastered a few thousand words, but effective written communication requires understanding many times the number of words in an oral vocabulary. Many languages have grammatical constructs that are used only in writing. The situation of Malian languages today in relation to French might be compared to that of the vernaculars in Europe at the time when Latin was the language of writing. The vernaculars were mainly not written, and “higher” knowledge was restricted to the minority that had mastered Latin. Someone who wished to write a text in the vernacular would not have had the conventions of written language, the vocabulary, or the models of how things might be expressed in writing as a guide. Nor would the writer have had an audience to read the text. This is the situation of someone attempting to write in a Malian language today: they must literally invent a written language for a reader that does not exist.
For educators, it may seem that the status of Malian languages as written or as predominantly oral languages is far removed from their task of producing measured improvements in learning outcomes. However, if the situation of Malian languages is a primary reason for the lack of motivation to learn in maternal languages, it will greatly determine the macrosystem impact of an intervention, no matter how successful that intervention is on a microsystem level.
We therefore framed the problem of illiteracy in Mali as the question of whether it might be possible to contribute to the development of Malian languages as written languages, in order to provide the motivation needed to spread the proven intervention of maternal language education. This may be ambitious, but we are far from the first to pursue it. There is a large community of very passionate advocates for the development of national languages in Mali. In 2012 the Malian government created the Malian Academy of Languages (AMALAN) to support the development of written national languages, and in 2023 Mali adopted a new constitution which removed French as its official language, replacing it with 13 Malian languages. USAID’s SIRA program invested $54 million in Malian education between 2016 and 2021, a considerable portion of which was dedicated to producing lower-grade educational materials in national languages and children’s books.
Proposed Solution and Design Guidelines
We assembled a set of design objectives for a solution that would make a novel contribution to the task of developing the written language by motivating children to learn to read in their maternal language and motivating parents to support them in doing so.
- Create a progressive learning pathway that extends for as many years as possible:
First, if the only purpose of maternal language education is to enable the child to switch to French, the written language, as early as possible, it serves as a tacit admission that the maternal language is not suitable for developing reasoning about the modern world through text. Second, an educational system which stops educating children in their maternal language after a few years cannot produce people literate in that language who are capable of furthering its development into a fully evolved written language. We therefore set an objective to create a solution targeting levels from pre-reading through high school.
- Delight children:
Children must want to read and write in their maternal language, so the materials made available must be delightful to them. It is our view that there are no bad books and no bad educational methods if the material triggers in children a passion to read. This principle led us to the precise objective of creating richly illustrated, entertaining children’s stories, because children like stories and learn from them, and because they like stories with beautiful and interesting images that enhance understanding of the text and convey information beyond what is explicitly written. We did not exclude the creation of other types of materials, but we decided to focus mostly on the story format for the initial experiments.
- Windows and Mirrors:
Present children with what is sometimes termed “windows and mirrors” through the content. A “window” represents a view onto the greater world outside the child’s immediate experience; a “mirror” depicts a view familiar to the child to encourage self-identification with the characters and events in the story. In the “windows and mirrors” analogy, one may also note that mirrors and windows are found in the child’s home. We draw on that sense by setting all the stories in a physical and cultural environment and a language that feel like home for Malians, that is, children should be able to process all the information in the stories using the view of the world that they acquired by living in Mali. This does not mean that foreign ideas or settings should never be shown in the materials, quite the contrary, but that what is presented should be something the child has enough references to understand. Materials using these principles are rare in Mali, whether in Malian languages or French. The “tales of Grandmother” are an example of material that is very strong on the mirror side, but they are only a small part of the material that needs to be created. On the window side, it would be difficult to find a single example of a work that depicts the world outside of Mali from a Malian child’s point of view.
- Quantity:
Quantity is important. In addition to producing material meant to address children at reading levels from pre-reading through high school, there must be enough variety of material to address different interests and simply to give Malians the impression that their language has a growing literature, to stimulate interest in reading in Malian languages, and to plant the seeds of the development of Malian writing in as many directions as possible.
- Quality:
The quality of the language is important. While AMALAN has created orthographic standards for Malian languages and deals with issues related to vocabulary and how to express a variety of concepts in the language, this work is far from complete and has not been put widely into active use due to the limited number of writers. It is rare to find a printed text in a Malian language that is not rife with errors of every kind. In expanding the topical reach of Malian languages, a writer is confronted with difficulties in every sentence. While automatic translation now exists in a few Malian languages, machine translation cannot find necessary expressions that have never been coined in the target language. In creating Malian language literature, these problems must be dealt with and a text produced which builds on the standards that do exist and is as normative as it is possible to get while being readily comprehensible to children.
- Support the Teachers:
There is an extreme dearth of pedagogical materials in any Malian language past the 3rd grade level and an equally extreme dearth of lesson materials using literature as a means to teach language and writing. While creating entertaining reading materials in Malian languages is the overriding objective, addressing some part of the curriculum gap by creating learning materials around the content will help to create an environment for expanding the written language through the school system.
Motivation for Using AI and Challenges
Up until this point, we have not shown any link between our objectives and AI. Many societies throughout history have moved from a predominantly oral language to a fully developed written language without recourse to AI. As in most of its applications, AI does not do anything here that humans are incapable of doing; in fact, it almost always does things worse than the most capable human beings. It is a question of resources and of time. The problem of resources, already a severe problem in education, is greatly exacerbated in the case of predominantly oral languages by the necessity of creating almost everything ab initio.
We made use of Generative AI and machine translation in our pilot. Generative AI generates output such as text or images in response to a description, called a prompt, of what the human user wants. Machine translation takes a text in one language and produces a translation of the text in another language. Both produce output from the input they are given and from experience in the problem domain, which they acquire from data through a process called training. The output can be quite similar to what a human would produce given the same input and experience, that is, the output is produced by a process that appears to mimic human intelligence. The success of an AI system is often measured by how well it does compared to humans performing the same task. Because the data used to train Generative AI and machine translation systems contains very little content specific to African cultures and African languages and a great deal of content specific to the Global North, such systems are extremely good at producing output reflecting a Global North context and extremely poor at producing output appropriate for an African context. That is the most difficult problem we faced in creating Malian language books for Malian children. We attempted to address that problem through a human-in-the-loop approach to using AI.
Objectives: Where AI helps
We found that AI did contribute effectively and at scale to each of our objectives. AI could contribute in similar ways to the creation process in many types of writing projects, but we found that AI could be used successfully in a context where rootedness in a specific and low-resourced environment was an overarching goal despite the fact that training data for large AI systems is not representative of African cultures.
- Creating stories at different reading levels from pre-readers through high-school
We found that Generative AI excelled at adjusting plot complexity, vocabulary, sentence length, grammatical complexity, and settings according to prompts specifying the target age and environment of the reader. This was often used in producing adaptations from existing works, whether from world literature or Malian texts. Sometimes this was used to create different versions of a story adjusted for different age groups and with different emphases in the same way that, for example, Carlo Collodi’s Pinocchio exists in numerous versions targeted at many different ages of children.
- Creating beautiful books that make children enthusiastic readers
Our human authors used Generative AI as their writing partner to explore story ideas, do research, and generate drafts to develop approaches to stories. The AI partner was able to reduce what might be weeks or months of research and experimentation to minutes, and the ability of AI to modify version after version of a text allowed human authors to concentrate on the most important aspect of their task: producing a story rooted in Malian culture that would delight children. The use of AI-based image generation enabled a level of quality in the illustrations that is exceptional in the entire corpus of African children’s literature, allowing human illustrators to richly illustrate each story with accurate depictions of the Malian environment and beautiful images enhancing the textual content.
- Creating books with windows and mirrors
Many of the stories were sourced from international children’s literature, but the authors’ use of prompting allowed the content to be transformed so that it was fully comprehensible to Malian children and contained elements familiar to them. Stories set in a purely Malian environment were crafted as well, again with prompting that allowed the often Eurocentric elements of an AI-generated story to be molded into a Malian framework. Though often difficult, prompting and composition strategies in AI image generation allowed hundreds of images to be created depicting Malian settings for which no comparable illustrations exist.
- Creating an important quantity of material
A small team created thousands of pages of content in a six-month period with very specific and demanding requirements. It is difficult to quantify the exact acceleration value brought to the project by AI, but we know of no comparable project that has been able to demonstrate the ability to produce such a volume of material by traditional means. An important body of children’s literature, supported by a full battery of pedagogical material, now exists in Bambara (with smaller contributions in 11 other Malian languages) where there was very little before the project began. Anecdotally, Malian authors have related to us that they might spend a year or even years to produce a single children’s book.
- Creating texts following language standards while being readily understandable to children
To our knowledge, Generative AI was not able to produce general-purpose text in Malian languages or any other low-resource African language during the period in which we created the books. We have seen subsequent advances in the generation of Bambara and some other Malian languages, although quality, at the time of writing, remains mixed. It was therefore necessary for our authors to generate text in French or English in preparation for passing the text into an AI-based translation tool chain in order to produce Malian language text. While it may seem that generating authentic Malian language stories in a foreign language would replicate the problem of producing material culturally and linguistically detached from Mali, our process, while fraught with challenges, avoided this outcome. A key part of the prompting strategy was to produce stories designed for translation into Malian languages from the start, with Generative AI proving fairly effective in controlling the use of language such that an accurate translation to an authentic Malian expression was a likely outcome. Still, extensive review and revision of translated text was necessary. Part of the work involves the invention of a written Malian language, and automatic translation cannot create Malian text for which no model of how to express something exists. For this reason, it may be some time before Generative AI in Malian languages becomes a possible avenue for high-quality text creation. Nevertheless, there were significant advantages to the use of automatic translation, which was available for Bambara. As much as 90% of the text was found to be suitable by our language experts, meaning that they could concentrate on producing good text only for the remaining 10% where there were challenges. Also, the generated translations were generally orthographically and grammatically correct, greatly reducing the amount of time needed to validate these aspects of the text.
- Creating pedagogical materials for language arts, especially after Grade 3
One of the greatest strengths of Generative AI was its ability to create pedagogical support materials using major learning strategies from a generated text. We were able to produce thousands of comprehension questions and activities for students accompanying the stories and a Teacher’s Guide for each story with guidance on the exploration of language, themes, social issues, writing exercises, and activities suitable for use in language arts classrooms from kindergarten through high school.
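The translation workflow described in the items above (drafting in French or English, machine translation, then expert review) can be pictured with the following minimal sketch. It is an illustration only and not the tool chain used in the project: the names `Segment`, `human_in_the_loop`, and `acceptance_rate` are hypothetical, and `translate` and `review` are stand-ins for a machine translation tool and a human language expert respectively.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    source: str     # sentence drafted in French or English
    candidate: str  # machine-translated candidate in the target language
    final: str      # text kept after expert review


def human_in_the_loop(
    sentences: List[str],
    translate: Callable[[str], str],
    review: Callable[[str, str], str],
) -> List[Segment]:
    """Translate each drafted sentence, then let a reviewer keep or rewrite it.

    Both callables are hypothetical placeholders. `review` returns the
    candidate unchanged to accept it, or a corrected text to replace it.
    """
    segments = []
    for sentence in sentences:
        candidate = translate(sentence)
        final = review(sentence, candidate)
        segments.append(Segment(sentence, candidate, final))
    return segments


def acceptance_rate(segments: List[Segment]) -> float:
    """Share of segments the reviewer kept exactly as translated."""
    if not segments:
        return 0.0
    kept = sum(1 for s in segments if s.final == s.candidate)
    return kept / len(segments)
```

Under these assumptions, an acceptance rate near 0.9 would correspond to the situation reported above, in which experts concentrate their effort on the roughly 10% of segments they rewrite.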
Outcomes
The pilot produced 180 richly illustrated books for beginning readers through high school level, mainly in Bambara but also in 11 other Malian languages, in approximately 6 months. These books may be viewed, downloaded, and printed or read interactively from the Bloom Library: https://bloomlibrary.org/RobotsMali
The books were used in an initial testing phase with approximately 500 Bambaraphone students in many short-term, teacher-led reading programs. The children were almost universally excited by the books, irrespective of age, gender, schooling background, or socio-economic situation. Virtually all children were unable to read Bambara at the start, and between 67% and 90% were assessed as able to read complete books unassisted in Bambara after participation in as little as 12 hours of group reading and instruction. A video showing some of the reading sessions can be seen here: https://youtu.be/HGQ5JKEHekk?si=9rGxim-h0En_zg9v
We think that the quantity of the material produced constitutes a form of evidence that AI was an effective accelerator for the production of content. It is difficult to evaluate the extent to which other objectives were achieved and the extent to which AI contributed to them. It is also very difficult to prove that the extraordinary results we saw in literacy attainment in Bambara were due to the design of the books or to the contribution of AI. These questions can be better addressed through further research and the application of rigorous evaluation methodology. The great ambition of the project, contributing to the evolution of Malian languages into fully developed written languages, may take a generation to show evident results.
Conclusion
The proof is in the pudding. The project successfully demonstrated that, with the proper guidelines, AI is capable of greatly accelerating the production of high-quality, culturally appropriate, and delightful children’s books and supporting pedagogical materials in low-resource African languages. This is an efficient and economical solution to the lack of learning resources that has stymied attempts to strengthen African language education and the development of a reading and writing culture. Our hope is that our results will encourage similar efforts for other low-resource, predominantly oral language communities. The work also showed potential in addressing educational deficiencies and illiteracy. More research should be conducted to further test our hypothesis that motivation to develop written culture and speaking to the cultural perspective of children in the target communities may be factors with an outsized impact on educational outcomes.