This repository includes data from the first stage of the African Digital News Database Project, led by Dr Dani Madrid-Morales with the financial support of the University of Houston's Digital Research Commons.

The project seeks to build a collection of news texts published (in English) by African digital news organisations. The data in this repository cover the period between December 6, 2020 and January 4, 2021.

The data are presented in commonly used formats in computational approaches to text analysis (document-term matrix & tokenized list of features with POS tags).
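
As a rough illustration of the first format mentioned above, the following minimal sketch builds a small document-term matrix from tokenized, POS-tagged input. The document IDs, tokens, tags, and the `build_dtm` helper are hypothetical and invented for this example; they are not taken from the corpus, and the actual file layout of the dataset may differ.

```python
from collections import Counter

# Hypothetical toy input (not from the corpus): each document is a list of
# (lemma, POS tag) pairs, assumed to be lowercased with stopwords removed.
docs = {
    "doc1": [("election", "NOUN"), ("result", "NOUN"), ("announce", "VERB")],
    "doc2": [("election", "NOUN"), ("vaccine", "NOUN"), ("election", "NOUN")],
}


def build_dtm(tokenized_docs):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    # Count lemmas per document, ignoring the POS tag.
    counts = {
        doc_id: Counter(lemma for lemma, _pos in tokens)
        for doc_id, tokens in tokenized_docs.items()
    }
    # The columns of the matrix are the sorted set of all lemmas.
    vocabulary = sorted({term for c in counts.values() for term in c})
    # One row per document, in the input order of the documents.
    matrix = [
        [counts[doc_id][term] for term in vocabulary]
        for doc_id in tokenized_docs
    ]
    return vocabulary, matrix


vocabulary, matrix = build_dtm(docs)
```

Each row of `matrix` corresponds to one document and each column to one term in `vocabulary`, so a cell holds the number of times that term occurs in that document.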

Mar 12, 2021
Madrid-Morales, D.; Lindner, P.; Periyasamy, M., 2021, "Corpus of African Digital News from 600 Websites Formatted for Text Mining / Computational Text Analysis", Texas Data Repository, V3
This dataset includes a corpus of 200,000+ news articles published by 600 African news organizations between December 4, 2020 and January 3, 2021. The texts have been pre-processed (punctuation and English stopwords have been removed, features have been lowercased, lemmatized and PO...