LANGUAGE IN INDIA

Strength for Today and Bright Hope for Tomorrow

Volume 15:11 November 2015
ISSN 1930-2940

Managing Editor: M. S. Thirumalai, Ph.D.
Editors: B. Mallikarjun, Ph.D.
         Sam Mohanlal, Ph.D.
         B. A. Sharada, Ph.D.
         A. R. Fatihi, Ph.D.
         Lakhan Gusain, Ph.D.
         Jennifer Marie Bayer, Ph.D.
         G. Baskaran, Ph.D.
         L. Ramamoorthy, Ph.D.
         C. Subburaman, Ph.D. (Economics)
         N. Nadaraja Pillai, Ph.D.
         Soibam Rebika Devi, M.Sc., Ph.D.
Assistant Managing Editor: Swarna Thirumalai, M.A.


Copyright © 2015
M. S. Thirumalai



A Coherent Scrutinization on Syntactic Categories for Tagging
Tamil Lexicon

Dr. (Mrs.) Ananthi Sheshasaayee, MCA, M.Phil., Ph.D.
Angela Deepa.V.R., M.Sc., B.Ed.


Abstract

The arrangement of words according to rules is termed syntax. Every natural language has its own syntactic rules, which reflect its underlying features. Some languages allow free word order, while others place conditions on the arrangement of words. As a consequence, the smallest unit in a sentence, the word or lexical item, has a unique function that determines the nature of the sentence. The groups into which words are classified by function are termed syntactic categories, also known as parts of speech. Numerous NLP applications benefit from this syntactic information, but for morphologically rich languages like Tamil, tagging every word with a particular part of speech remains a challenging task. This paper reports on the various approaches used for developing POS taggers and discusses the POS taggers developed specifically for the Tamil language.

Keywords: Tag Set, Suffix, Prefix, Parts-of-Speech, Tagging, Morphological Analysis, Hidden Markov Model (HMM).

Introduction

Parts of speech are important for language processing because of the detailed information they give about a word and its neighbors. They are also termed POS, word classes, morphological classes, and lexical tags. The computational methods used to assign part-of-speech categories to words are termed parts-of-speech tagging. Syntactic category tagging, or parts-of-speech tagging, is defined as the process of marking each word [1] in a text with a particular part of speech according to its context. It plays a predominant role and serves as a preprocessing step in most NLP applications, such as information retrieval, word disambiguation, speech recognition, machine translation, named entity recognition, and text-to-speech. Since numerous NLP applications rely on syntactic category information, developing an efficient POS tagger is important. Although the tagging of Indian languages has gained interest in recent times, the use of different tag sets by different research scholars has led to a chaotic situation. Standardization is the only way to resolve this discrepancy. Dravidian languages like Tamil are morphologically rich and agglutinative in grammatical nature. Deep analysis is required at appropriate levels [1] to understand the features of these languages.
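To make the idea of marking a word "according to its context" concrete, the following is a minimal sketch of a Hidden Markov Model tagger decoded with the Viterbi algorithm, one of the approaches named in the keywords. It is not the authors' tagger and is not trained on Tamil data: the three-tag set, the English toy words, and all probabilities are hypothetical values chosen only for illustration.

```python
# Minimal HMM part-of-speech tagger with Viterbi decoding.
# Assumption: the tag set, vocabulary, and every probability below are
# invented for illustration; they are not taken from the article.

TAGS = ["DET", "NOUN", "VERB"]

# P(tag at sentence start)
START = {"DET": 0.5, "NOUN": 0.2, "VERB": 0.3}

# P(next tag | previous tag)
TRANS = {
    "DET":  {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.2,  "NOUN": 0.2, "VERB": 0.6},
    "VERB": {"DET": 0.6,  "NOUN": 0.3, "VERB": 0.1},
}

# P(word | tag); "book" is ambiguous between NOUN and VERB
EMIT = {
    "DET":  {"the": 1.0},
    "NOUN": {"book": 0.5, "flight": 0.5},
    "VERB": {"book": 1.0},
}


def viterbi_tag(words):
    """Return the most probable tag sequence for a list of words."""
    if not words:
        return []
    # best[i][tag] = (probability of the best path ending in tag at
    # position i, tag chosen at position i - 1)
    best = [{t: (START[t] * EMIT[t].get(words[0], 0.0), None) for t in TAGS}]
    for word in words[1:]:
        column = {}
        for tag in TAGS:
            prev = max(TAGS, key=lambda p: best[-1][p][0] * TRANS[p][tag])
            score = best[-1][prev][0] * TRANS[prev][tag] * EMIT[tag].get(word, 0.0)
            column[tag] = (score, prev)
        best.append(column)
    # Follow the back-pointers from the best final tag to the first word.
    tag = max(TAGS, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))


print(viterbi_tag(["the", "book"]))            # ['DET', 'NOUN']
print(viterbi_tag(["book", "the", "flight"]))  # ['VERB', 'DET', 'NOUN']
```

The same word, "book", receives a different tag in each sentence because the decoder weighs the neighboring tags as well as the word itself. A tagger for an agglutinative language such as Tamil would additionally need morphological information (for example suffix analysis), which this sketch leaves out.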


This is only the beginning part of the article.


Dr. (Mrs.) Ananthi Sheshasaayee, MCA, M.Phil., Ph.D.
Research Supervisor
PG & Research Department of Computer Science
Quaid-E-Millath Government College for Women
Chennai - 600002
Tamilnadu
India
ananthi.research@gmail.com

Angela Deepa.V.R., M.Sc., B.Ed.
Research Scholar
PG & Research Department of Computer Science
Quaid-E-Millath Government College for Women
Chennai - 600002
Tamilnadu
India
angelrajan.research@gmail.com



