Handwritten Tamil Word Pre-Processing and Segmentation Based on NLP Using Deep Learning Techniques

Authors

  • Dr. R. Kishore Kanna (Computer Science & Communication Engineering; Signal Processing; Bio-Engineering and Technology), Vels Institute of Science Technology and Advanced Studies
  • Iskandar Muda (Management), Universitas Sumatera Utara, https://orcid.org/0000-0001-6478-9934
  • Dr. S. Ramachandran (Mechanical and Electrical Engineering), Department of Electrical and Electronics Engineering, Paavai Engineering College, https://orcid.org/0000-0002-1579-4742

Keywords:

Tamil language, image processing, pre-processing, segmentation, AlexNet-CNN.

Abstract

Tamil is a traditional Indian language spoken mainly in South India, Sri Lanka, and Malaysia. This paper proposes a novel technique for pre-processing and segmenting handwritten Tamil words through NLP. The RGB input image is first converted to a grayscale image using a threshold value. The image is then segmented by line boundary detection using an AlexNet-based convolutional neural network (AlexNet-CNN) deep learning architecture. In the proposed system, every word image is scaled to the required pixel size before training, i.e., every scaled word has a fixed pixel count, and these scaled images are used to train the network. The findings indicate that the proposed method achieves better detection accuracy on handwritten vocabulary than comparable feature extraction techniques. A descriptive analysis over numerous images was performed in terms of effectiveness, accuracy, recall, and F1 measure.
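
The pre-processing steps named in the abstract (RGB to grayscale conversion, thresholding, and scaling each word to a fixed pixel count) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the BT.601 luminance weights, the threshold of 128, the nearest-neighbour resize, and the 227×227 target size (a common AlexNet input size) are all assumptions made for illustration, and the AlexNet-CNN segmentation stage itself is not shown.

```python
import numpy as np


def rgb_to_gray(rgb):
    """Convert an H x W x 3 RGB array to grayscale (assumed BT.601 weights)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]


def binarize(gray, threshold=128.0):
    """Mark pixels darker than the (assumed) threshold as ink (True)."""
    return gray < threshold


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D array to a fixed size."""
    in_h, in_w = img.shape
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return img[rows[:, None], cols]


def preprocess(rgb, size=(227, 227), threshold=128.0):
    """Grayscale, threshold, and scale a word image to a fixed pixel count.

    Returns a float32 array of shape `size` with 1.0 for ink and 0.0 for
    background. The default size is an assumption, not taken from the paper.
    """
    ink = binarize(rgb_to_gray(rgb), threshold)
    return resize_nearest(ink, *size).astype(np.float32)
```

Every image passed through `preprocess` comes out with the same shape, which is the property the abstract relies on when it says each scaled word contains a set pixel count for training.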

Published

2022-06-30

How to Cite

Kanna, R. K., Muda, I., & Ramachandran, S. (2022). Handwritten Tamil Word Pre-Processing and Segmentation Based on NLP Using Deep Learning Techniques. Research Journal of Computer Systems and Engineering, 3(1), 35–42. Retrieved from https://vit.technicaljournals.org/index.php/rjcse/article/view/60