“Non-Traditional Data Sources: Providing Insights into Sustainable Development”
Communications of the ACM, April 2021, Vol. 64 No. 4, Pages 88-95
Arab World Special Section: Big Trends
By Ingmar Weber, Muhammad Imran, Ferda Ofli, Fouad Mrad, Jennifer Colville, Mehdi Fathallah, Alissar Chaker, Wigdan Seed Ahmed
“Mobile phone metadata, such as the number and timing of outgoing calls, mobility patterns, and data consumption behavior, can be good predictors of socioeconomic status.”
The world is facing enormous challenges, ranging from climate change to extreme poverty. The 2030 Agenda for Sustainable Development and its 17 Sustainable Development Goals (SDGs) were adopted by United Nations Member States in 2015 as an operational framework to address these challenges. The SDGs include No Poverty, Quality Education, Gender Equality, Peace, Justice and Strong Institutions, among others, as well as a meta goal on Partnerships for the Goals. Despite limitations, the SDGs form a rare global consensus of all 193 UN member states on where we should collectively be heading.
Goals are meaningless without a way to track their progress. Data on the SDGs and the associated indicators are often outdated or unavailable, hindering progress during the Decade of Action leading up to 2030. Challenges around rapid access to data have also become apparent in the context of, for example, the Sudan revolution (public sentiment) or the Beirut explosion in August 2020 (infrastructure damage). The paucity of data has been highlighted during the COVID-19 pandemic, with its sudden impact on all aspects of life, most of which have yet to be quantified. Going beyond availability, accessibility, and timeliness, there is a need for more disaggregated data, such as by gender and town.
These challenges provide an opportunity to use non-traditional data sources as a complement to existing data and approaches, in particular through the use of artificial intelligence (AI). At the same time, a naive belief in AI as a savior risks ignoring complex, structural root causes. Furthermore, a reliance on digital traces, such as mobile phone data, risks excluding the most vulnerable—and often least connected—further aggravating inequalities. Lastly, there is a risk of taking a reductionist one-size-fits-all approach, often with a Western lens and without understanding local context, in particular in the Arab region with its diverse cultures and languages.
Here, we showcase regionally developed projects that explore the use of non-traditional data sources and AI to help measure progress toward the SDGs. Some of these projects also support countries in other parts of the world, demonstrating that the Arab world is not only a consumer of, but a contributor to, world-leading innovation.
Creating the Sudan Horizon Scanner for Detecting Real-Time Change
Project stakeholders: UNDP Sudan, Republic of Sudan’s Ministry of Labour and Social Development, Republic of Sudan’s Ministry of Trade, Republic of Sudan’s Prime Minister’s Office
Over the past 18 months, Sudan has “witnessed the people’s revolution and history-making transition process.” Along with this, Sudan has experienced a rapid change in public sentiment, a narrative that has been difficult to capture using traditional data, which detected neither the dynamics nor the drivers of this change. As part of its sensemaking efforts, UNDP Sudan’s Accelerator Lab developed the Sudan Horizon Scanner (SHS), a system to monitor a changing public narrative through real-time change detection, topic identification, sentiment classification, and summarization (see Figure 1).
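As a rough illustration of what change detection on a sentiment signal can look like, the minimal sketch below flags days on which a daily sentiment score departs sharply from its recent history. It is not the SHS implementation, which is not described here; the function name `detect_shifts`, the `window` and `threshold` parameters, and the sample scores are all assumptions made for illustration.

```python
from statistics import mean, stdev


def detect_shifts(scores, window=7, threshold=3.0):
    """Return the indices of scores that deviate from the mean of the
    preceding `window` scores by more than `threshold` standard deviations.

    `scores` is a hypothetical list of daily average sentiment values,
    for example in the range -1 (negative) to +1 (positive).
    """
    flagged = []
    for i in range(window, len(scores)):
        history = scores[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history: avoid division by zero so that
            # any departure from it still counts as a large deviation.
            sigma = 1e-9
        z = (scores[i] - mu) / sigma
        if abs(z) > threshold:
            flagged.append(i)
    return flagged


# Made-up data: a stable week followed by an abrupt drop in sentiment.
daily_sentiment = [0.10, 0.12, 0.09, 0.11, 0.10, 0.08, 0.12,
                   -0.60, -0.55, -0.62]
shift_days = detect_shifts(daily_sentiment)
print(shift_days)
```

Only the first day of the drop is flagged in this example, because the drop itself then enters the trailing window and inflates the baseline variance. A production system would likely need a more robust detector and would have to connect each flagged day to topics and other data sources before drawing conclusions.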
The Accelerator Lab is exploring whether the system will be able to eliminate noise with minimal intervention and to explain changes in public sentiment by connecting different types of data, including socioeconomic and health data, Facebook posts, and newspaper headlines. The Accelerator Lab is also including in its analysis popular underground songs, radio shows and call-ins, and Friday prayer sermons—unusual sources of data that have captured the attention of other UNDP country offices as they look to analyze rapidly changing public sentiment in their countries.
The songs and sermons have been the most effective thick datasets for detecting signals of change. A challenge in analyzing this data is the fast rate at which vocabulary and lyrics are changing, and no training data exists for the sub-languages used in these songs, colloquially called Randook. To develop the required natural language processing (NLP) functionalities, we are therefore building our own thesaurus and training data.
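One way a hand-built thesaurus might be used in such a pipeline is sketched below: known colloquial terms are mapped to canonical forms before classification, and tokens that appear in neither the thesaurus nor the known vocabulary are counted so that the most frequent ones can be prioritized for annotation. This is a sketch under stated assumptions, not the Accelerator Lab's method; the function `normalize_tokens`, the placeholder tokens such as `slang_a`, and the tiny vocabulary are hypothetical and do not represent real Randook terms.

```python
from collections import Counter


def normalize_tokens(tokens, thesaurus, known_vocabulary):
    """Map colloquial tokens to canonical terms and count unseen tokens.

    thesaurus: dict from a colloquial term to its canonical term.
    known_vocabulary: set of terms the downstream model already handles.
    Returns (normalized_tokens, unseen_counts).
    """
    normalized = []
    unseen = Counter()
    for token in tokens:
        key = token.lower()
        if key in thesaurus:
            normalized.append(thesaurus[key])
        else:
            normalized.append(key)
            if key not in known_vocabulary:
                # Candidate new term: surface it for human annotation.
                unseen[key] += 1
    return normalized, unseen


# Placeholder entries only; real entries would come from annotators.
thesaurus = {"slang_a": "money", "slang_b": "police"}
known_vocabulary = {"money", "police", "the", "is", "gone"}
lyrics = ["The", "slang_a", "is", "gone", "slang_c", "slang_c"]

normalized, unseen = normalize_tokens(lyrics, thesaurus, known_vocabulary)
print(normalized)
print(unseen.most_common(1))
```

Tracking unseen tokens by frequency is one simple way to keep a thesaurus current when vocabulary shifts quickly, since it directs limited annotation effort to the terms that occur most often.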
While striking a balance between case studies with a regional focus and those with a focus beyond the Arab region, all the initiatives presented here showcase regionally developed technology. Even for projects with a purely regional implementation, the lessons learned and the knowledge obtained are disseminated throughout the global UN system, and together they offer an excellent demonstration of the opportunities that non-traditional data sources combined with AI provide for measuring and advancing the SDGs.
At the same time, these new approaches create challenges, including how to safeguard privacy and how not to exclude people without Internet connectivity, while amplifying the voices of the already-connected. More fundamentally, it is important to note that exclusion often extends beyond the data to the process of building and deploying technology. But any technology is only as good and as fair as the socio-political system it is embedded in. Put simply, extreme poverty will not be eradicated through more advanced technology, and lack of data is not the reason for lack of action on climate change. It is now more necessary than ever to broaden the group of people who not only build technology for the SDGs but also decide what to build and how it will be used.
About the Authors:
Ingmar Weber, Qatar Computing Research Institute, Qatar.
Muhammad Imran, Qatar Computing Research Institute, Qatar.
Ferda Ofli, Qatar Computing Research Institute, Qatar.
Fouad Mrad, United Nations Economic and Social Commission for Western Asia, Lebanon.
Jennifer Colville, United Nations Development Programme Regional Hub, Jordan.
Mehdi Fathallah, United Nations Development Programme, Tunisia.
Alissar Chaker, United Nations Development Programme, Tunisia.
Wigdan Seed Ahmed, United Nations Development Programme, Sudan.