LANGUAGE IN INDIA

Strength for Today and Bright Hope for Tomorrow

Volume 10 : 10 October 2010
ISSN 1930-2940

Managing Editor: M. S. Thirumalai, Ph.D.
Editors: B. Mallikarjun, Ph.D.
         Sam Mohanlal, Ph.D.
         B. A. Sharada, Ph.D.
         A. R. Fatihi, Ph.D.
         Lakhan Gusain, Ph.D.
         K. Karunakaran, Ph.D.
         Jennifer Marie Bayer, Ph.D.
         S. M. Ravichandran, Ph.D.
         G. Baskaran, Ph.D.

AN APPEAL FOR SUPPORT

We seek your support to meet the expenses relating to the formatting of articles and books, and to maintaining and running the journal through hosting, correspondence, etc. Please write to the Editor at his e-mail address languageinindiaUSA@gmail.com to find out how you can support this journal. Thank you. Thirumalai, Editor.

BOOKS FOR YOU TO READ AND DOWNLOAD FREE!

Development of a Hindi to Punjabi Machine Translation System, A Doctoral Dissertation ... Vishal Goyal, Ph.D.

A Report on the State of Urdu Literacy in India, 2010 ...
Omar Khalidi, Ph.D.

English for Medical Students of Hodeidah University, Yemen - A Pre-sessional Course ...
Arif Ahmed Mohammed Hassan Al-Ahdal, Ph.D. Scholar

Global Perspective of Teaching English Literature in Higher Education in Pakistan ...
Rabiah Rustam, M.S., Ph.D. Candidate

Improving Chemmozhi Learning and Teaching - Descriptive Studies in Classical-Modern Tamil Grammar ...
A. Boologa Rambai, Ph.D.

A Phonetic and Phonological Study of the Consonants of English and Arabic ...
Abdulghani A. Al-Hattami, Ph.D. Candidate

Some Aspects of Teaching-Learning English as a Second Language ...
R. Krishnaveni, M.A., M.Sc., M.Phil., Ph.D. Candidate

The Influence of First Language Grammar (L1) on the English Language (L2) Writing of Tamil School Students: A Case Study from Malaysia ...
Mahendran Maniam, Ph.D. (ESL)

Economics of Crime : A Comparative Analysis of the Socio-Economic Conditions of Convicted Female and Male Criminality In Selected Prisons in Tamil Nadu ...
S. Santhanalakshmi, Ph.D.

Technique as Voyage of Discovery: A Study of the Techniques in Dante's Paradiso ...
Raji Narasimhan, M.A.

A Critical Study of The Wasteland - Poetry as Metaphor ...
K. R. Vijaya, M.A., M.Phil.

Language and Literature: An Exposition - Papers Presented in the Karunya University National Seminar ...
Editor: J. Sundar Singh, Ph.D.

Purism and Language Planning in a Multilingual Context ...
L. Ramamoorthy, Ph.D.

Papers Presented in the All-India Conference on Multimedia Enhanced Language Teaching - MELT 2009 ...
L. Ramamoorthy, Ph.D. and J.R. Nirmala, Ph.D.

A Phonological Study of Variety of English Spoken by Oriya Speakers in Western Orissa - A Doctoral Dissertation ... Arun K. Behera, Ph.D.

Phonological Analysis of English Phonotactics of Syllable Initial and Final Consonant Clusters by Yemeni Speakers of English ... Abdulghani. M. A. Al-Shuaibi, M.A.

A Study of Structural Duplication in Tamil and Telugu - A Doctoral Dissertation ... Parimalagantham, Ph.D.

The Politics of Survival in the Novels of Margaret Atwood ... Pauline Das, Ph.D.

Nonverbal Communication in Tamil Novels - A Book in Tamil ... M. S. Thirumalai, Ph.D.

Girish Karnad as a Modern Indian Dramatist - A Study ...
B. Reena, M.A., M.Phil.

A Study of English Loan Words in Selected Bahasa Melayu Newspaper Articles...
Shamimah Binti Haja Mohideen, M.HSc. (TESL)

The Internal Landscape and the Existential Agony of Women in Anjana Appachana's Novel LISTENING NOW, A Doctoral Dissertation ...
M. Poonkodi, Ph.D.

Trends and Spatial Patterns of Crime in India - A Case Study of a District in India ...
M. Jayamala, Ph.D.

The Trading Community in Early Tamil Society Up To 900 AD ...
R. Jeyasurya, M.A., M.Phil., Ph.D.

A Study of Auxiliaries in the Old and the Middle Tamil ...
A.Boologarambai, M.A., Ph.D.

History of Growth and Reforms of British Military Administration in India, 1848-1949 ...
Hemalatha, M.A., M.Phil.

Language of Mass Media: A Study Based on Malayalam Broadcasts - A Doctoral Dissertation ...
K. Parameswaran, Ph.D.

Form and Function of Disorders in Verbal Narratives - A Doctoral Dissertation ...
Kandala Srinivasacharya, Ph.D.

Status Marking in Tamil - A Ph.D. Dissertation ...
P. Perumalsamy, Ph.D.

LANGUAGE AND POWER IN COMMUNICATION ...
Editors: Jennifer M. Bayer, Ph.D., and Pushpa Pai, Ph.D.

Onomatopoeia in Tamil ...
V. Gnanasundaram, Ph.D.

Linguistics and Literature ...
C.Shunmugom, Ph.D., and C. Sivashanmugam, Ph.D., V. Thayalan, Ph.D. and C. Sivakumar, Ph.D. (Editors)

Translation: New Dimensions ...
C.Shunmugom, Ph.D., and C. Sivashanmugam, Ph.D., Editors

Language of Headlines in Kannada Dailies ...
M. N. Leelavathi, Ph.D.

Cooperative Learning Incorporating Computer-Mediated Communication: Participation, Perceptions, and Learning Outcomes in a Deaf Education Classroom ...
Michelle Pandian, M.S.

The Effects of Age on the Ability to Learn English As a Second Language ...
Mariam Dadabhai, B.A. Hons.

A STUDY OF THE SKILLS OF READING COMPREHENSION IN ENGLISH DEVELOPED BY STUDENTS OF STANDARD IX IN THE SCHOOLS IN TUTICORIN DISTRICT, TAMILNADU ...
A. Joycilin Shermila, Ph.D.

A Socio-Pragmatic Comparative Study of Ostensible Invitations in English and Farsi ...
Mohammad Ali Salmani-Nodoushan, Ph.D.

ADVANCED WRITING - A COURSE TEXTBOOK ...
Parviz Birjandi, Ph.D.
Seyyed Mohammad Alavi, Ph.D.
Mohammad Ali Salmani-Nodoushan, Ph.D.

TEXT FAMILIARITY, READING TASKS, AND ESP TEST PERFORMANCE: A STUDY ON IRANIAN LEP AND NON-LEP UNIVERSITY STUDENTS - A DOCTORAL DISSERTATION ...
Mohammad Ali Salmani-Nodoushan, Ph.D.

A STUDY ON THE LEARNING PROCESS OF ENGLISH
BY HIGHER SECONDARY STUDENTS
WITH SPECIAL REFERENCE TO DHARMAPURI DISTRICT IN TAMILNADU ...
K. Chidambaram, Ph.D.

SPEAKING STRATEGIES TO OVERCOME COMMUNICATION DIFFICULTIES IN THE TARGET LANGUAGE SITUATION - BANGLADESHIS IN NEW ZEALAND ...
Harunur Rashid Khan

THE PROBLEMS IN LEARNING MODAL AUXILIARY VERBS IN ENGLISH AT HIGH SCHOOL LEVEL ...
Chandra Bose, Ph.D. Candidate

THE ROLE OF VISION IN LANGUAGE LEARNING
- in Children with Moderate to Severe Disabilities ...
Martha Low, Ph.D.

SANSKRIT TO ENGLISH TRANSLATOR ...
S. Aparna, M.Sc.

A LINGUISTIC STUDY OF ENGLISH LANGUAGE CURRICULUM AT THE SECONDARY LEVEL IN BANGLADESH - A COMMUNICATIVE APPROACH TO CURRICULUM DEVELOPMENT by
Kamrul Hasan, Ph.D.

COMMUNICATION VIA EYE AND FACE in Indian Contexts by
M. S. Thirumalai, Ph.D.

COMMUNICATION
VIA GESTURE: A STUDY OF INDIAN CONTEXTS by M. S. Thirumalai, Ph.D.

CIEFL Occasional
Papers in Linguistics,
Vol. 1

Language, Thought
and Disorder - Some Classic Positions by
M. S. Thirumalai, Ph.D.

English in India:
Loyalty and Attitudes
by Annika Hohenthal

Language In Science
by M. S. Thirumalai, Ph.D.

Vocabulary Education
by B. Mallikarjun, Ph.D.

A CONTRASTIVE ANALYSIS OF HINDI
AND MALAYALAM
by V. Geethakumary, Ph.D.

LANGUAGE OF ADVERTISEMENTS
IN TAMIL
by Sandhya Nayak, Ph.D.

An Introduction to TESOL:
Methods of Teaching English
to Speakers of Other Languages
by M. S. Thirumalai, Ph.D.

Transformation of
Natural Language
into Indexing Language:
Kannada - A Case Study
by B. A. Sharada, Ph.D.

How to Learn
Another Language?
by M.S.Thirumalai, Ph.D.

Verbal Communication
with CP Children
by Shyamala Chengappa, Ph.D.
and M.S.Thirumalai, Ph.D.

Bringing Order
to Linguistic Diversity
- Language Planning in
the British Raj by
Ranjit Singh Rangila,
M. S. Thirumalai,
and B. Mallikarjun

REFERENCE MATERIAL

UNIVERSAL DECLARATION OF LINGUISTIC RIGHTS

Lord Macaulay and
His Minute on
Indian Education

In Defense of
Indian Vernaculars
Against
Lord Macaulay's Minute
By A Contemporary of
Lord Macaulay

Languages of India,
Census of India 1991

The Constitution of India:
Provisions Relating to
Languages

The Official
Languages Act, 1963
(As Amended 1967)

Mother Tongues of India,
According to
1961 Census of India

BACK ISSUES

FROM MARCH 2001

E-mail your articles and book-length reports in Microsoft Word to languageinindiaUSA@gmail.com.
Contributors from South Asia may e-mail their articles to
B. Mallikarjun,
Central Institute of Indian Languages,
Manasagangotri,
Mysore 570006, India mallikarjun@ciil.stpmy.soft.net.

PLEASE READ THE GUIDELINES GIVEN IN HOME PAGE IMMEDIATELY AFTER THE LIST OF CONTENTS.

Your articles and book-length reports should be written following the APA, MLA, LSA, or IJDL Stylesheet.

The Editorial Board has the right to accept, reject, or suggest modifications to the articles submitted for publication, and to make suitable stylistic adjustments. High quality, academic integrity, ethics and morals are expected from the authors and discussants.

Copyright © 2010
M. S. Thirumalai

Development of a Hindi to Punjabi
Machine Translation System
A Doctoral Dissertation
Vishal Goyal, Ph.D.

Abstract

Machine Translation is the task of automatically translating text from one natural language to another. Even after more than 60 years of research, Machine Translation remains an open problem. Work on Machine Translation systems for Indian languages is still in its infancy. This research work is an attempt to develop a Machine Translation system from Hindi to Punjabi. A number of Machine Translation systems have already been developed, though their accuracy needs to be improved. Machine Translation is not a trivial task, owing to the nature of the translation process itself, but translating between closely related languages eases the task. We call a language pair closely related if the languages have grammars that are close in structure, contain similar constructs with almost the same semantics, and share a great deal of lexicon. By closely related languages, we also mean inflectively and morphosyntactically similar languages. Some linguists define closeness between languages on the basis of features such as a common root, similar alphabets, similar verb patterns, structural similarity, similar grammar, similar religio-cultural and demographic contexts and references, and a similar, clearly displayed ability to blend with foreign tongues. Generally, such languages have originated from the same source and are spoken in areas in close proximity. Hindi and Punjabi belong to the same subgroup of the Indo-European family and are thus sibling languages. Our analysis found that Hindi and Punjabi share all the features of closely related languages. For such closely related sibling languages, effective word-for-word translation can be achieved (Hajic et al., 2000) [90]. Thus, for our system, the Direct Machine Translation approach, which seems promising, has been used.

The challenges in developing a Hindi to Punjabi Machine Translation system relate mainly to the non-availability of lexical resources, spelling variations, word sense disambiguation, transliteration, named entity recognition, and collocations.
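The word-for-word idea behind the direct approach can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the three-entry lexicon, the function name `translate_word_for_word`, and the choice to pass unknown words through unchanged are all assumptions made for illustration. A full system would hand unknown words to a transliteration module and resolve ambiguous words before lookup.

```python
# Toy Hindi -> Punjabi lexicon. The entries are illustrative only and are
# not taken from the dissertation's lexical resources.
LEXICON = {
    "मेरा": "ਮੇਰਾ",
    "घर": "ਘਰ",
    "है": "ਹੈ",
}


def translate_word_for_word(sentence, lexicon=LEXICON):
    """Translate each whitespace-separated token independently, keeping word order."""
    output = []
    for token in sentence.split():
        # Assumption: tokens missing from the lexicon are passed through
        # unchanged; a real system would transliterate them instead.
        output.append(lexicon.get(token, token))
    return " ".join(output)


if __name__ == "__main__":
    print(translate_word_for_word("मेरा घर है"))
```

Because the two languages share word order and most constructs, the sketch never reorders tokens. That is the property that makes a direct approach plausible for closely related pairs and unsuitable for structurally distant ones.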

Synopsis

This research work addresses the problems in the various stages of the development of a complete Hindi to Punjabi Machine Translation system and discusses potential solutions. The thesis has been divided into eight chapters.

The first chapter of the thesis introduces the general concept of Machine Translation, the various approaches to Machine Translation systems, and the key activities involved in Machine Translation. It also provides a formal description of the research question undertaken for this study. The objectives, need, and scope of the study are also discussed. Some of the key application areas of Machine Translation systems are then explored. Afterwards, the approach followed to solve this research problem, along with the reasons behind its selection, is explained in brief. An overview of the design of the Machine Translation system developed in this research work is provided later. The chapter concludes by presenting the major contributions of this research work and an outline of the study.

Chapter 2 discusses the existing work in the field of Machine Translation in India and outside India. This literature survey forms the basis of our work on developing the Machine Translation system and later helps us compare our work with the existing state of the art in Machine Translation systems.

Chapter 3 explains and compares Hindi and Punjabi languages with respect to orthography, grammar, and Machine Translation.

Chapters 4 and 5 provide the design and implementation details of the various activities involved in the Machine Translation system. Chapter 4 describes the system architecture and the preprocessing stage. The chapter starts with the choice of approach and discusses the motivation behind its selection. The required resources are then discussed, followed by a description of the system architecture. The details of the preprocessing phase, which involves text normalization, identifying collocations, and identifying proper nouns, are discussed. Then the tokenization process is explained. The details of the translation system, involving identifying titles, identifying surnames, lexicon lookup, the word sense disambiguation module, the transliteration module, and the post processing modules, are discussed in Chapter 5.
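To make the preprocessing steps concrete, here is a small Python sketch of normalization followed by tokenization that keeps a known multiword unit together. It rests on assumptions throughout: the names `normalize`, `tokenize`, and `MULTIWORD_UNITS` are hypothetical, the single multiword entry is a made-up example, and Unicode NFC plus whitespace collapsing stands in for whatever normalization rules the dissertation actually applies.

```python
import unicodedata

# Hypothetical list of two-word units that should survive as one token.
# A place name such as this one would be mistranslated if its parts were
# looked up separately in a word-for-word lexicon.
MULTIWORD_UNITS = {("उत्तर", "प्रदेश")}


def normalize(text):
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def tokenize(text, units=MULTIWORD_UNITS):
    """Split on whitespace, merging adjacent word pairs listed in `units`."""
    words = normalize(text).split()
    tokens = []
    i = 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) in units:
            # Keep the pair together so later stages see a single unit.
            tokens.append(words[i] + " " + words[i + 1])
            i += 2
        else:
            tokens.append(words[i])
            i += 1
    return tokens


if __name__ == "__main__":
    print(tokenize("उत्तर   प्रदेश"))
```

Merging multiword units before lexicon lookup is one simple way to order these stages. The sketch handles only two-word units, and longer units would need a longest-match search.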

Chapter 6 describes the post processing stage of the system. Chapter 7 provides the evaluation of the system and its results. Chapter 8 concludes the thesis by providing a summary of the research work undertaken, its contributions and limitations, and some directions in which this work could be extended in the future. Appendix A discusses the interface designed for text translation, website translation, and email translation. The test data sets for the intelligibility test and the accuracy test are available in Appendices B and C respectively. The system has been rigorously evaluated, and its accuracy has been found to be 94% on the basis of the intelligibility test and 90.84% on the basis of the accuracy test.

This is only the beginning part of the dissertation. The entire dissertation is available in a printer-friendly version.


Vishal Goyal, Ph.D.
Department of Computer Science
Punjabi University, Patiala
Punjab, India

Send your articles
as an attachment
to your e-mail to
languageinindiaUSA@gmail.com.
Please ensure that your name, academic degrees, institutional affiliation and institutional address, and your e-mail address are all given on the first page of your article. Also include a declaration that your article or work submitted for publication in LANGUAGE IN INDIA is an original work by you and that you have duly acknowledged the work or works of others you either cited or used in writing your articles, etc. Remember that by maintaining academic integrity we not only do the right thing but also help the growth, development and recognition of Indian scholarship.
