LANGUAGE IN INDIA

Strength for Today and Bright Hope for Tomorrow

Volume 11 : 5 May 2011
ISSN 1930-2940

Managing Editor: M. S. Thirumalai, Ph.D.
Editors: B. Mallikarjun, Ph.D.
         Sam Mohanlal, Ph.D.
         B. A. Sharada, Ph.D.
         A. R. Fatihi, Ph.D.
         Lakhan Gusain, Ph.D.
         Jennifer Marie Bayer, Ph.D.
         S. M. Ravichandran, Ph.D.
         G. Baskaran, Ph.D.
         L. Ramamoorthy, Ph.D.




Copyright © 2010
M. S. Thirumalai



Text Extraction for an Agglutinative Language

Sankar K, Vijay Sundar Ram R and Sobha Lalitha Devi


Abstract

This paper proposes an efficient algorithm for sentence ranking, based on a graph-theoretic ranking model and applied to the text summarization task. Our approach uses word frequency statistics together with a weight calculation based on word position and string patterns to weight and rank the sentences. We apply it to Tamil, a highly agglutinative and morphologically rich language.

I. INTRODUCTION

The enormous and ongoing increase of digital data on the internet puts pressure on the NLP community to develop highly efficient automated text summarization tools. Research on text summarization has been boosted by various shared tasks such as the TIPSTER SUMMAC Text Summarization Evaluation task, the Document Understanding Conference (DUC 2001 to 2007) and the Text Analysis Conference.

A variety of automated summarization schemes have been proposed recently. NeATS [4] is an approach based on sentence position, term frequency, topic signature and term clustering, and MEAD [10] is a centroid-based approach. Iterative graph-based ranking algorithms, such as Kleinberg's HITS algorithm [3] and Google's PageRank [1], have been used successfully in web-link analysis, social networks and, more recently, in text processing applications [8], [7], [2] and [9]. These iterative approaches have a high time complexity and are slow in practice for dynamic summarization. Comparatively little work has been done on text extraction for Indian languages.
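To make the cost of the iterative approaches concrete, the following is a minimal sketch of a generic PageRank-style sentence ranker in Python. It is not the algorithm proposed in this paper. The function names (tokenize, similarity, rank_sentences), the word-overlap similarity measure, and the damping and tolerance values are all assumptions chosen for illustration. Each iteration visits every pair of sentences, which is why such methods become slow on large or frequently changing inputs.

```python
import math


def tokenize(sentence):
    """Split on whitespace and strip common punctuation (illustrative only)."""
    return [w.strip(".,;:!?\"'()") for w in sentence.lower().split()]


def similarity(a, b):
    """Word-overlap similarity, normalised by sentence length (an assumption)."""
    wa, wb = set(tokenize(a)), set(tokenize(b))
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids a zero denominator for one-word sentences
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def rank_sentences(sentences, damping=0.85, tol=1e-6, max_iter=100):
    """Return one PageRank-style score per sentence."""
    n = len(sentences)
    # Build the weighted sentence graph: O(n^2) similarity computations.
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    # Iterate until the scores stop changing: O(n^2) work per iteration.
    for _ in range(max_iter):
        new_scores = []
        for i in range(n):
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        converged = max(abs(x - y) for x, y in zip(new_scores, scores)) < tol
        scores = new_scores
        if converged:
            break
    return scores
```

Sentences that share vocabulary with many other sentences receive higher scores, and a summary can be formed by taking the top-scoring sentences in their original order.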

In this paper we present a novel automatic and unsupervised graph-based ranking algorithm, which gives improved results compared to other ranking algorithms in the context of the text summarization task. We apply it to Tamil.


This is only the beginning part of the article.


Sankar K, Vijay Sundar Ram R., and Sobha Lalitha Devi
AU-KBC Research Centre
MIT Campus of Anna University
Chennai
Tamilnadu
India


