Natural Language Processing

The NLP team at the AIR lab has the following ongoing projects:

  • Agricultural Keyword Spotter
  • English and Luganda Keyword Spotter
  • ASR for Luganda using Deep Speech
  • Named Entity Recognition
  • Topic Classification
  • Language Identification in Speech and Text
  • Machine Translation

Building speech recognition models is a demanding task, especially when it comes to obtaining training data. To address this, we use open-source platforms such as Common Voice, a platform created by Mozilla. We have collaborated with Mozilla to drive speech donations for the Luganda language on the platform. We have also set up an online Agricultural Keyword Collection Platform, where volunteers donate their voices by reading agricultural keywords in English and Luganda. To keep data contributions steady, we have collaborated with the institute of languages and set up MUK-NLP, a community of NLP enthusiasts who engage the wider community and actively contribute to the data collection platforms. We have a Slack channel where we share, learn and connect on a daily basis. Check out our open-source datasets here.
You can also join the drive to create open-source databases by clicking the buttons below:

You can also catch up on what transpired during the VOICE TECH MEETUP by clicking the PDF file below.