Mousaion - Developing a morphological normaliser for Afrikaans for use in Cross Language Information Retrieval

Volume 25, Issue 2
ISSN: 0027-2639



This article describes the development of a morphological normaliser for Afrikaans for Cross Language Information Retrieval (CLIR) purposes. Word form normalisation is necessary in CLIR because inflected forms such as plurals and past tense verbs are not included as dictionary entries and therefore cannot be translated. In developing this normaliser, a corpus of newspaper text was used to establish rules based on statistics of word form occurrences and to create a stopword list for Afrikaans. The procedure described here can normalise the majority of Afrikaans words (most past tense verb forms, most plural forms and compounds). The normaliser was tested on the original newspaper text in a CLIR environment.
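
To make the idea concrete, the following is a minimal sketch of dictionary-checked affix stripping combined with a frequency-based stopword list. It is not the article's rule set: the function names `build_stopwords` and `normalise`, the three affix rules, the tiny lexicon and the choice of "most frequent tokens" as stopwords are all assumptions made for illustration. Compound splitting, which the article also covers, is left out.

```python
from collections import Counter


def build_stopwords(tokens, top_n):
    """Return the top_n most frequent tokens as a stopword set.

    Assumption: raw frequency alone decides what counts as a stopword.
    """
    return {word for word, _ in Counter(tokens).most_common(top_n)}


def normalise(word, lexicon):
    """Map a word form to a dictionary entry using simple affix rules.

    A candidate is accepted only if it appears in `lexicon`, so a rule
    cannot strip an affix from a word that merely looks inflected.
    """
    word = word.lower()
    if word in lexicon:
        return word

    candidates = []
    if word.startswith("ge"):
        candidates.append(word[2:])   # past tense prefix: gewerk -> werk
    if word.endswith("e"):
        candidates.append(word[:-1])  # plural suffix: boeke -> boek
    if word.endswith("s"):
        candidates.append(word[:-1])  # plural suffix: tafels -> tafel

    for candidate in candidates:
        if candidate in lexicon:
            return candidate
    return word  # no rule produced a dictionary form; leave unchanged


# Hypothetical data, for illustration only.
lexicon = {"werk", "boek", "tafel"}
corpus = ["die", "kat", "die", "hond", "en", "die", "en"]
stopwords = build_stopwords(corpus, 2)
```

Checking candidates against the lexicon mirrors the CLIR motivation: a normalised form is only useful if the translation dictionary has an entry for it.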
