Copyright © 2004 M. S. Thirumalai
PARSING IN TAMIL: PRESENT STATE OF ART
S. Rajendran, Ph.D.
Parsing
Parsing is the automatic analysis of texts according to a grammar. Technically, the term refers to the practice of assigning syntactic structure to a text. It is usually performed after the basic morphosyntactic categories have been identified in the text. Based on different grammars, parsing brings these morphosyntactic categories into higher-level syntactic relationships with one another. This survey of the state of the art of parsing in Tamil reflects the global scenario: the trends of the global arena in natural language processing are largely represented in Tamil too.
Overview of the Global Scenario
We try to understand larger textual units by combining our understanding of smaller ones. Linguistic theory aims to show how these larger units of meaning arise out of the combination of the smaller ones. This is modeled by means of a grammar. Computational linguistics then tries to implement this process in an efficient way. Traditionally, the task is subdivided into syntax and semantics: syntax describes how the different formal elements of a textual unit, most often the sentence, can be combined; semantics describes how the interpretation is calculated. In most language technology applications the encoded linguistic knowledge, i.e., the grammar, is separated from the processing components. The grammar consists of a lexicon and of rules that syntactically and semantically combine words and phrases into larger phrases and sentences.
A variety of representation languages have been developed for the encoding of linguistic knowledge. Some of these languages are geared more towards conformity with formal linguistic theories; others are designed to facilitate certain processing models or specialized applications. Several language technology products on the market today employ annotated phrase-structure grammars, that is, grammars with several hundreds or thousands of rules describing different phrase types. Each of these rules is annotated by features, and sometimes also by expressions in a programming language.
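The idea of a phrase-structure rule annotated by features can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from any particular product or formalism: the toy English lexicon, the single rule S -> NP VP, the feature name "number", and the helper names `combine` and `parse_two_words`.

```python
# Toy lexicon: word -> (category, features). Entries are illustrative only.
LEXICON = {
    "she": ("NP", {"number": "sg"}),
    "they": ("NP", {"number": "pl"}),
    "sleeps": ("VP", {"number": "sg"}),
    "sleep": ("VP", {"number": "pl"}),
}

# One annotated rule: S -> NP VP. The annotation says that both daughters
# must carry the same value for every feature listed under "agree".
RULES = [
    {"parent": "S", "children": ("NP", "VP"), "agree": ("number",)},
]


def combine(left, right):
    """Return the parent phrase licensed by a rule, or None if no rule applies."""
    (left_cat, left_feats), (right_cat, right_feats) = left, right
    for rule in RULES:
        if rule["children"] != (left_cat, right_cat):
            continue
        # Check the feature annotation attached to the rule.
        if all(left_feats.get(f) == right_feats.get(f) for f in rule["agree"]):
            shared = {f: left_feats.get(f) for f in rule["agree"]}
            return (rule["parent"], shared)
    return None


def parse_two_words(sentence):
    """Parse a two-word sentence with the toy lexicon and rule set."""
    words = sentence.split()
    if len(words) != 2 or any(w not in LEXICON for w in words):
        return None
    return combine(LEXICON[words[0]], LEXICON[words[1]])


print(parse_two_words("she sleeps"))
print(parse_two_words("she sleep"))
```

The first call yields an S phrase carrying the shared number value, while the second is rejected because the feature annotation on the rule is not satisfied. Note that the lexicon and the rules are kept as data, separate from the processing function, in line with the separation of grammar and processing components described above.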
Current Research
In current research, a certain polarization has taken place. At one end of the scale, very simple grammar models are employed, e.g., different kinds of finite-state grammars that support highly efficient processing. Some approaches do away with grammars altogether and use statistical methods to find basic linguistic patterns. At the other end of the scale, we find a variety of powerful, linguistically sophisticated representation formalisms that facilitate grammar engineering. The most prevalent family of grammar formalisms currently used in computational linguistics is constraint-based.
Morphological Analysis in Tamil
Tamil is a Dravidian language. It is a verb-final, relatively free word order, morphologically rich language. Like other Dravidian languages, Tamil is agglutinative: each root word can combine with multiple morphemes to generate word forms. Computationally, each root word can take a few thousand inflected word forms, of which only a few hundred will occur in a typical corpus. Subject-verb agreement is required for the grammaticality of a Tamil sentence. Tamil allows subject and object drop as well as verbless sentences. In addition, the subject of a sentence or a clause can be a possessive noun phrase (NP) or an NP in the nominative or dative case. To analyze such an inflectionally rich language, the root and the morphemes of each word have to be identified.
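Identifying the root and the morphemes of a word can be sketched as repeated suffix stripping against a lexicon. The sketch below is a deliberately naive illustration, not a description of any existing Tamil analyzer. The romanized forms (viiTu 'house', kaL plural, il locative, ai accusative), the tag names, and the function `segment` are assumptions chosen for the example. It handles plain concatenation only and checks neither the order of suffixes nor any sound change at morpheme boundaries.

```python
# Toy lexicon in an ad hoc romanization; entries are illustrative only.
ROOTS = {"viiTu": "house"}
SUFFIXES = {"kaL": "PLURAL", "il": "LOCATIVE", "ai": "ACCUSATIVE"}


def segment(word):
    """Return (root, [suffix tags]) if the word is a known root followed by
    known suffixes joined by plain concatenation; otherwise return None."""
    if word in ROOTS:
        return (word, [])
    for suffix, tag in SUFFIXES.items():
        if word.endswith(suffix) and len(word) > len(suffix):
            # Strip one suffix from the right and analyze what remains.
            rest = segment(word[: -len(suffix)])
            if rest is not None:
                root, tags = rest
                return (root, tags + [tag])
    return None


print(segment("viiTukaLil"))  # 'in houses'
print(segment("viiTukaLai"))  # 'houses' (accusative)
print(segment("pustakam"))    # root not in the toy lexicon
```

A realistic analyzer needs a far larger lexicon and must also account for stem changes, which this sketch ignores.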
The global scenario has influenced the morphological analysis of Tamil. In the last decade, computational morphology has advanced further towards real-life applications than most other subfields of natural language processing. To build a syntactic representation of the input sentence, a parser must map each word in the text to some canonical representation and recognize its morphological properties. The combination of a surface form and its analysis as a canonical form and inflection is called a lemma. The main problems are:
- Morphological alternations: the same morpheme may be realized in different ways depending on the context.
- Morphotactics: stems, affixes, and parts of compounds do not combine freely; a morphological analyzer needs to know what arrangements are valid.
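The two problems listed above can each be illustrated with a small sketch. Morphotactics is modeled here as a finite-state machine over morpheme classes, and a morphological alternation as a context-sensitive spelling rule. The state names, the class names, the function names, and the single alternation rule (root-final m surfacing as ng before the plural kaL, as in maram 'tree' and marangkaL 'trees') are simplifying assumptions made for illustration; they do not describe the full facts of Tamil or any existing analyzer.

```python
# Morphotactics: which morpheme class may follow in which state.
# In this toy model a noun is ROOT, optionally PLURAL, optionally CASE.
TRANSITIONS = {
    ("START", "ROOT"): "STEM",
    ("STEM", "PLURAL"): "PLURAL_STEM",
    ("STEM", "CASE"): "DONE",
    ("PLURAL_STEM", "CASE"): "DONE",
}
ACCEPTING = {"STEM", "PLURAL_STEM", "DONE"}


def is_valid_order(morpheme_classes):
    """Return True if the sequence of morpheme classes is a valid arrangement."""
    state = "START"
    for morpheme_class in morpheme_classes:
        state = TRANSITIONS.get((state, morpheme_class))
        if state is None:
            return False
    return state in ACCEPTING


def attach_plural(root):
    """Attach the plural suffix, applying one illustrative alternation:
    a root-final 'm' is realized as 'ng' before 'kaL'."""
    if root.endswith("m"):
        return root[:-1] + "ngkaL"
    return root + "kaL"


print(is_valid_order(["ROOT", "PLURAL", "CASE"]))  # valid arrangement
print(is_valid_order(["ROOT", "CASE", "PLURAL"]))  # case before plural is rejected
print(attach_plural("maram"))   # root-final m changes before the plural
print(attach_plural("viiTu"))   # no change needed
```

Keeping the arrangement table and the alternation rule separate mirrors the way the two problems are stated: one concerns which morphemes may combine and in what order, the other concerns how a given combination is realized on the surface.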
S. Rajendran, Ph.D.
Department of Linguistics
Tamil University
Thanjavur 613 005
Tamilnadu, India
raj_ushush@yahoo.com