Deep learning-based part-of-speech tagging of the Ethiopic language
Deep Neural Networks (DNNs) have proven highly effective in many NLP tasks across various languages. Unfortunately, resource-scarce languages such as Tigrinya still receive too little attention, so many NLP applications, such as part-of-speech tagging, are still in their early stages. Consequently, the main objective of this research is to offer effective part-of-speech tagging solutions for the Tigrinya language given only a rather small training corpus. In this paper, DNN classifiers (i.e., Feed-Forward Neural Network, Long Short-Term Memory (LSTM), Bidirectional LSTM, and Convolutional Neural Network) are investigated by applying them on top of separately trained distributional Word2Vec embeddings. In search of the most accurate solutions, the DNN models are optimized both manually and automatically. Although automatic hyperparameter optimization performs well with the Convolutional Neural Network, the manually optimized Bidirectional LSTM model achieves the highest overall accuracy of 91%.