LANGUAGE IN INDIA

Strength for Today and Bright Hope for Tomorrow

Volume 4 : 2 February 2004

Editor: M. S. Thirumalai, Ph.D.
Associate Editors: B. Mallikarjun, Ph.D.
         Sam Mohanlal, Ph.D.
         B. A. Sharada, Ph.D.


Copyright © 2001
M. S. Thirumalai

THE SYSTEM OF PANINI
Girish Nath Jha, Ph.D.


1. AShTAdhyAyI (AD) and Indian Linguistic Tradition (ILT)

Panini's grammar AD (approximately 7th century BCE) is important for linguistic computation for two reasons. One, it provides a comprehensive, rule-based account of a natural language in about 4000 rules - the only complete grammatical account of any language so far. Two, the model of a 'grammar-in-motion' that it provides seems to closely mimic a fully functional Natural Language Processing (NLP) system -

SOUND CLASSES (phonetic module)
            |
RULE-BASE (parser/grammar module)
            |
LEXICONS (lexical interface modules)

The possibility that a Natural Language (NL) parser based on Panini can help analyze Indian languages has gained momentum in recent years.

Panini was the culmination of a long, unbroken tradition of linguistic thought in India which started with the Vedas about 5000 years ago. Kapoor (1993) has divided the Indian linguistic tradition into four phases -

Phase I: earliest times up to Panini
Speculations in shruti texts, four of the six vedangas (vyAkaraNa, chanda, nirukta, shikShA), work of Yaska, Rk PrAtishAkhya, AcAryas mentioned by Panini.
Phase II: Panini up to Anandavardhana (9th century CE)
AD of Panini, vArttika of Katyayana, mahAbhAShya of Patanjali, mImAmsAsUtra of Jaimini, vAkyapadIya of Bhartrhari, works on poetics from Bharata up to Anandavardhana.
Phase III: Ramachandra (11th century CE) to Nagesh Bhatta (18th century CE)
Pedagogical grammars based on Panini's AD. Investigations into principles of grammar and also attempts to apply Paninian model to describe other languages.
Phase IV: Franz Kielhorn onwards
Modern textual interpretations and machine analysis of language. Works of Kielhorn, Bhandarkar, Carudev Shastri, Katre, Dandekar, among many others.

For details on the preceding classification, refer to Kapoor (1993) and Kapoor (2004, in press).

2. AD - Structure and Organization

AD has 8 chapters (adhyAyas), each divided into 4 pAdas. A sUtra or rule is referenced as x.x.x (adhyAya, pAda, sUtra). For example, sUtra 1.1.1 (vRRiddhirAdaic) is adhyAya one, pAda one, sUtra one. The components of AD are as follows -

  • akSharasamAmnAya (14 sUtras called shiva-sUtras) (AS)
  • sUtrapATha (4000 sUtras - 3983 in kAshikAvRRitti) (SP)
  • dhAtupATha (1967 verb roots - 2014 including kaNDvAdi roots) (DP)
  • gaNapATha (other pertinent items like primitive nominal bases, avyayas) (GP)

The AS, DP, and the GP can be called the three most basic databases of the Paninian system containing duly arranged and structured data. The SP is Panini's comprehensive rule base for Sanskrit.

2.1 akSharasamAmnAya

The AS contains a sound catalog with 14 classes of sounds based on their phonological properties. Panini's 'shiva sUtras' (SS), the repertory of phonemes, consist of 14 strings beginning with aiuN (simple vowels) and ending with hal (voiced fricative). This sound catalog does not list the long and prolated vowels, the anusvAra, visarga, jihvAmUlIya and upadhmAnIya, or the suprasegmental features. Instead, each listed vowel represents a phoneme class that covers the features of length, nasality, and the three accents (rising /, falling \, level -).

  1. aiuN
  2. RRiLLik
  3. eo~N
  4. aiauc
  5. hayavaraT
  6. laN
  7. ~nama~NaNanam
  8. jhabha~n
  9. ghaDhadhaSh
  10. jabagaDadash
  11. khaphachaThathacaTatav
  12. kapay
  13. shaShasar
  14. hal

These sUtras (known by different names - 'shiva sUtra', 'mAheshvara sUtra', 'pratyAhAra sUtra') allow Panini to form pratyAhAras, or keys representing various sets of sounds, to be called in the operations.

Panini tells us how to form a pratyAhAra (sigla) -

Adirantyena sahetA (1.1.71)

Take a phoneme from a sUtra and add one of the anubandhas (the sound which ends a sUtra). The two-letter pratyAhAra thus formed covers all the phonemes from the first phoneme in the pratyAhAra name up to the last phoneme before the named anubandha. Any intervening anubandhas are excluded from the list. For example, ac = [a i u RRi LLi e o ai au] includes

[a i u] from aiuN (SS 1)
[RRi LLi] from RRiLLik (SS 2)
[e o] from eo~N (SS 3)
[ai au] from aiauc (SS 4)
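
The expansion just described can be sketched in a few lines of Python. This is only an illustrative sketch: the table `SHIVA_SUTRAS` and the function name `expand_pratyahara` are hypothetical names introduced here, sounds are written in ITRANS without the inherent vowel, and the function stops at the first sUtra closed by the named marker once the starting sound has been found.

```python
# The 14 shiva sUtras in ITRANS, each as (sounds, closing anubandha).
# Names and layout are assumptions made for this sketch.
SHIVA_SUTRAS = [
    (["a", "i", "u"], "N"),
    (["RRi", "LLi"], "k"),
    (["e", "o"], "~N"),
    (["ai", "au"], "c"),
    (["h", "y", "v", "r"], "T"),
    (["l"], "N"),
    (["~n", "m", "~N", "N", "n"], "m"),
    (["jh", "bh"], "~n"),
    (["gh", "Dh", "dh"], "Sh"),
    (["j", "b", "g", "D", "d"], "sh"),
    (["kh", "ph", "ch", "Th", "th", "c", "T", "t"], "v"),
    (["k", "p"], "y"),
    (["sh", "Sh", "s"], "r"),
    (["h"], "l"),
]


def expand_pratyahara(first, marker):
    """Return the sounds from `first` up to the first sUtra, at or after
    the one containing `first`, that is closed by `marker`.
    The anubandhas themselves are never part of the result."""
    collected = []
    started = False
    for sounds, anubandha in SHIVA_SUTRAS:
        for sound in sounds:
            if not started and sound == first:
                started = True
            if started:
                collected.append(sound)
        if started and anubandha == marker:
            return collected
    raise ValueError(f"cannot form a pratyAhAra from {first!r} and {marker!r}")


print(expand_pratyahara("a", "c"))
print(expand_pratyahara("a", "k"))
```

The two calls correspond to the sigla ac (all vowels of SS 1-4) and aK (SS 1-2). Because the function always picks the first matching marker, it does not model cases where the same anubandha closes two different sUtras.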

These rules generate the sigla (whose number can run into three figures) needed for grammatical operations. The possible sound groups generated by them can be given as follows -

$$1 + \sum_{j=1}^{13} \left\{ \sum_{i=1}^{N_j} (N_j - i + 1) + \sum_{k=j+1}^{14} N_k \right\}$$

where N_j is the number of sounds in group j (3, 2, 2, 2, ... across the 14 groups), i is the element number within group j, and k is the number of a following group. The AD uses 41 such sigla (42 including rA, according to later Paniniyas), obtained by applying 1.3.2 and 1.3.3 along with 1.1.71.

Panini calls these sound classes in his rules through this mechanism. Some of them can be interpreted as follows -

aC -> vowel phoneme classes with suprasegmental features of length (hrasva, dIrgha, pluta), accent (udAtta, anudAtta, svarita) and nasality (anunAsika) - 1.1.10
eC -> diphthongs - 8.3.17
jash -> voiced unaspirated stops - 1.1.58/8.2.39
jhash -> voiced stops - 8.4.53
yaN -> semivowels - 1.1.45
shaR -> sibilants - 7.4.4
haL -> consonants - 1.1.7

pratyAhAras are also used for precision and brevity, much as arrays are used in programming. For example, in the sUtra

akaH savarNe dIrghaH (6.1.101)

Panini uses the pratyAhAra 'aK' [a i u RRi LLi], drawn from SS 1-2, to state that vowel lengthening operates on the list called 'aK'.

Problem with aN

By this mechanism of pratyAhAra expansion, how do we interpret aN? Does it mean [a i u] from SS 1, or [a i u RRi LLi …..la] from SS 1-6? That is, does it include only vowels, or vowels plus semivowels? Panini has given the answer: this pratyAhAra adds the semivowels to the vowel list only in 1.1.69 (aNuditsavarNasya cApratyayaH). Elsewhere it includes only vowels.
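
The two readings of aN can be made concrete with a small Python sketch. It is an illustration only: `FIRST_SIX`, `readings` and `an_for_rule` are hypothetical names, only the first six shiva sUtras are tabulated, and the choice between the readings is hard-coded from the statement above that the longer list applies only in 1.1.69.

```python
# First six shiva sUtras in ITRANS as (sounds, closing anubandha).
# The anubandha N closes both SS 1 and SS 6, which causes the ambiguity.
FIRST_SIX = [
    (["a", "i", "u"], "N"),
    (["RRi", "LLi"], "k"),
    (["e", "o"], "~N"),
    (["ai", "au"], "c"),
    (["h", "y", "v", "r"], "T"),
    (["l"], "N"),
]


def readings(first, marker):
    """Return every candidate expansion of first+marker, one for each
    sUtra (from the one containing `first` onwards) closed by `marker`."""
    candidates = []
    collected = []
    started = False
    for sounds, anubandha in FIRST_SIX:
        for sound in sounds:
            started = started or sound == first
            if started:
                collected.append(sound)
        if started and anubandha == marker:
            candidates.append(list(collected))
    return candidates


def an_for_rule(rule_number):
    """Pick the reading of aN: the long list only for rule 1.1.69."""
    short, long_ = readings("a", "N")
    return long_ if rule_number == "1.1.69" else short


print(an_for_rule("6.1.101"))  # vowels of SS 1 only
print(an_for_rule("1.1.69"))   # SS 1-6, vowels plus h and the semivowels
```

Keeping both candidates and selecting by rule number mirrors the way the ambiguity is resolved by context rather than by the siglum itself.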

2.2. sUtrapATha

The SP contains about 4000 sUtras arranged in chapters (adhyAya) and sub-chapters (pAda) in a particular order. Faddegon (1936) gave a general sketch of what is covered in different sections of the grammar and also analyzed the subsections. In the arrangement of the SP he noted a "tendency towards dichotomy" and divided the rules into two main sections, Chs 1-5 and Chs 6-8, which he called the analytic and synthetic parts respectively. Kapoor (1992) arranges the subject matter in four divisions: Chs 1-2 deal with classification and enumeration of bases and categories, Chs 3-5 consist of prakrti-pratyaya enumeration and derivation of bases, Chs 6-8.1 deal with the synthesis of prakrti-pratyaya, and Chs 8.2-8.4 deal with the rules of morphophonemics.

sUtras are verb-less sentences, unlike those of natural language, and give the impression of formulae or program-like code. They are of the following types -

vidhi (operational)
Example: akaH savarNe dIrghaH (6.1.101) -> simple vowels [a i u RRi LLi] will be lengthened if they are followed by a similar (savarNa) vowel
samj~nA (introduces classes and conventions)
Example: supti~Nantam padam (1.4.14) -> bases ending in nominal case affixes (suP) or verbal affixes (ti~N) are called padas (syntactic words)
paribhAShA (metarules)
Example: vipratiShedhe param kAryam (1.4.2) -> if two rules of equal power conflict, the later one prevails
adhikAra (headings)
Example: pratyayaH (3.1.1) -> henceforth starts the topic of 'pratyaya'
atidesha (extensions)
Example: kartur Ipsitatamam karma (1.4.49) tathA yuktam cAnIpsitam (1.4.50) -> that which is most desired by the kartA (agent) is called 'karma' (1.4.49), and so is that which is undesired (1.4.50)
niyama (restriction)
Example: patiH samAsa eva (1.4.8) -> this rule restricts the application of the previous rule, sheSho ghyasakhi (1.4.7)
niShedha (negation)
Example: tulyAsya prayatnam savarNam (1.1.9) nAjjhalau (1.1.10) -> savarNa is the class of sounds with comparable place and manner of articulation (1.1.9). This relation cannot hold across vowels and consonants, even if they happen to have comparable place and manner of articulation (1.1.10)

2.3. dhAtupATha

The DP lists about 1967 verb roots (2014 including kaNDvAdi roots) distributed in 10 conjugation classes (gaNas), each of which undergoes its own operations. Each gaNa (class) takes its name from the first member of the class, like bhvAdi (`bhU' etc.) and curAdi (`cur' etc.). The following four gaNas account for most of the verb roots -

bhvAdi (1000)
divAdi (140)
tudAdi (150)
curAdi (410)

The position of a root in the DP, together with its control characters and accents, determines the morphological processing it will undergo.

2.4. gaNapATha

The primitive nominal bases are contained in the GP. The various classes like kRRiT, taddhita, strI, suP, ti~N and the 18 upasargas operate on these bases (including 23 pronouns).

3. Other technical devices of Panini

The sUtra style was adopted for the sake of lAghava (brevity); for the ancient grammarians, saving even half a syllable was like celebrating the birth of a son (`ardha mAtra lAghavena putrotsava manyante vaiyAkaranAH'). As a result, Panini's system can be opaque unless it is decoded using a particular set of conventions. Interpreting each of these aphorisms involves several stages: reversing sandhi, identifying adhikAra and anuvRRitti, inserting the adhikAra and anuvRRitti padas, rearranging the vibhakti order (in the 5-7-6-1 manner), adding the verb `be', and finally interpreting the sUtra through the meta-linguistic meaning of the cases (Kapoor 1992).

Patanjali (pashpashAhnika of MB) highlights the necessity of positing a finite number of sUtras to account for an infinite linguistic output. Panini was able to abstract his mother tongue in just about 4000 linguistic statements by using some technical devices - the pratyAhAras being the most important of them. Besides, Panini uses many abbreviations like suP, ti~N, kRRit etc for different sets of affixes for the purpose of brevity. `suP' for example, is made up of `sU' which is the first case affix, and of `P' which is the marker of the last case affix `suP'. Similarly, 'ti~N' denotes verb affixes from `tiP' to `mahi~N'.

Panini's samj~nA sUtras introduce various other such classes and abbreviations that are to be called in the sUtras - vRRiddhi (1.1.1), guNa (1.1.2), anunAsika (1.1.8), savarNa (1.1.9), hrasva-dIrgha-pluta (1.2.27), udAtta-anudAtta-svarita (1.2.29-31), samprasAraNa (1.1.45), prAtipadika (1.2.45), pada (1.4.14), amredita (8.1.2), niShThA (1.1.26) etc. The construct `Adi' (etc.) as part of a compound with a technical word is used to denote a bigger class, as in `bhU AdayaH dhAtavaH' (1.3.1), which refers to the entire DP, since the latter begins with the root `bhU'. A similar technique is applied in designating smaller classes. For example, `adiprabhRRitibhyaH' (2.4.72) refers to a subgroup in the DP beginning with `ad' (eat), and `kaNDAraaH' (2.2.38) refers to a group of items beginning with `kaNDAra'. Affixes ending in a common `it' undergo similar processing.

The compactness of the SP and the kind of ekavAkyatA that we find in it owe much to devices like adhikAra and anuvRRitti, as well as the particle `ca', which are used to avoid unnecessary repetition. The concept of adhikAra is intended to regulate the meaning of the rules that follow, in the sense that the whole of the adhikAra rule is to be read with the subsequent sUtras. For example, the rule `pratyayaH' (3.1.1) is an adhikAra sUtra which applies till the end of the fifth book. That is, anything treated after this rule will get the designation `pratyaya' (except 3.1.5, 3.2.24, 3.2.25). The next sUtra is `parashca' (3.1.2), which is itself an adhikAra, and along with 3.1.1 will be read as `pratyayaH parashca', meaning that an affix will be placed after a base. The next sUtra, `AdyudAttashca' (3.1.3), if read along with 3.1.1 and 3.1.2, means that a pratyaya (3.1.1) which is to be placed after a base (3.1.2) has an acute accent on its first syllable (3.1.3).
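
The way an adhikAra is carried forward into later rules can be sketched as a simple accumulation. This is a simplified illustration under stated assumptions: `RULES` and `read_with_adhikara` are hypothetical names, the flags mark only the two sUtras described above as adhikAras, and the scope limits of a real adhikAra (for instance, 3.1.1 ending with the fifth book) are not modelled.

```python
# (rule number, rule text in ITRANS, is this rule an adhikAra?)
# The flags follow the description in the text; everything else is assumed.
RULES = [
    ("3.1.1", "pratyayaH", True),
    ("3.1.2", "parashca", True),
    ("3.1.3", "AdyudAttashca", False),
]


def read_with_adhikara(rules):
    """Return each rule as it is to be read, with the text of all
    earlier adhikAra rules supplied in front of its own text."""
    inherited = []
    full_readings = {}
    for number, text, is_adhikara in rules:
        full_readings[number] = " ".join(inherited + [text])
        if is_adhikara:
            inherited.append(text)
    return full_readings


for number, reading in read_with_adhikara(RULES).items():
    print(number, reading)
```

A fuller model would attach an end point to every adhikAra and drop it from `inherited` once that point is passed.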

Through anuvRRitti, Panini passes on the vibhakti based information to a following sUtra (whether immediate or not) having the same vibhakti type. In this case, there has to be some recursive searching for the same case ending in a sUtra. In case of a match, the words with dissimilar case endings from the previous sUtra will be understood in the later sUtra. For example, the rule `AdguNaH' (6.1.87) will read `at' (abl. sing. 5-1) `aci' (loc. sing. 7-1 from `ikah yaNaci' 6.1.77) `samhitAyAm' (loc. sing. 7-1) `pUrvaparayoH' (gen. dual 6-2) `ekaH' (nom. sing. 1-1 from 6.1.84) `guNaH' (nom. sing. 1-1 from 6.1.87) `bhavati' (part of convention).

The use of the particle `ca' (as conjunction or disjunction) at the end of a rule requires that the immediately preceding sUtra, along with the adhikAra, if any, be read with the sUtra containing `ca'. For example, `Dati ca' (1.1.25) is to be read along with `ShnAntA ShaT' (1.1.24), which provides that numerals (1.1.23) with `Sha' or `Na' as the final sound are called `ShaT', along with the numerals with `Dati' as suffix.

4. Morpho-phonemics

The sandhi or euphonic combination of sounds can take place between vowels and vowels, vowels and semivowels, semivowels and semivowels, consonants and consonants, and between visarga and other sounds. Sandhi is necessary for the internal structuring of constituents like roots and padas (internal sandhi), as well as for the combination of two words (external sandhi). Among the general rules for such morphophonemic combinations, the following can be noted -

Vowel lengthening: akaH savarNe dIrghaH (6.1.101)
[+voc] -> [+len] / _ [+voc, +savarNa]
Example: rAma + avatAra -> rAmAvatAra
Voicing: jhalAm jashonte (8.2.39)
[+cons] -> [+voice] / _ [+voice]
Example: vAk + IshaH -> vAgIshaH
Retroflexization: ShTunAShTuH (8.4.41)
[+cons, dental] -> [+cons, retroflex] / _ [+cons, retroflex]
Example: rAma-s- + ShaShTha -> rAma-Sh-ShaShTha
Palatalization: stoH shcunAshcuH (8.4.40)
[+cons, dental] -> [+cons, palatal] / _ [+cons, palatal]
Example: sat + cit -> saccit
Nasalization: yaro'nunAsike'nunAsiko vA (8.4.45)
[+cons, -nas] -> [+cons, +nas] / _ [+cons, +nas]
Example: ya-t- + nAsti -> ya-n-nAsti
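
As an illustration of how one of these rules might be applied mechanically, the following Python sketch implements only the vowel-lengthening rule (6.1.101) on ITRANS strings. It is a toy under stated assumptions: the names `LONG_FORM`, `VOWELS` and `lengthen` are hypothetical, savarNa is approximated as "same vowel quality regardless of length", LLi is left out, and the other sandhi rules listed above are not attempted.

```python
# Long counterpart of each aK vowel handled by this sketch (LLi omitted).
LONG_FORM = {
    "a": "A", "A": "A",
    "i": "I", "I": "I",
    "u": "U", "U": "U",
    "RRi": "RRI", "RRI": "RRI",
}

# ITRANS vowels, longest spellings first so that 'ai' is not read as 'i'.
VOWELS = ["RRi", "RRI", "LLi", "LLI", "ai", "au",
          "a", "A", "i", "I", "u", "U", "e", "o"]


def final_vowel(word):
    """Return the vowel spelling that ends `word`, or None."""
    for vowel in VOWELS:
        if word.endswith(vowel):
            return vowel
    return None


def initial_vowel(word):
    """Return the vowel spelling that begins `word`, or None."""
    for vowel in VOWELS:
        if word.startswith(vowel):
            return vowel
    return None


def lengthen(left, right):
    """Join two words by vowel lengthening, or return None when the
    rule does not apply to this pair."""
    last, first = final_vowel(left), initial_vowel(right)
    if last in LONG_FORM and first in LONG_FORM \
            and LONG_FORM[last] == LONG_FORM[first]:
        stem = left[:len(left) - len(last)]
        return stem + LONG_FORM[last] + right[len(first):]
    return None


print(lengthen("rAma", "avatAra"))  # rAmAvatAra
print(lengthen("deva", "indra"))    # None: a and i are not savarNa
```

Returning None instead of a plain concatenation leaves room for a caller to try the remaining rules in turn.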

5. Derivational Process

The purpose of Panini's derivational process is to generate complete syntactic words called padas, which Panini defines as `supti~Nantam padam' (that is, bases ending in one of the 21 suP affixes or one of the 9+9 ti~N affixes). Padas with suP affixes constitute the NPs (subanta pada), and those with ti~N affixes can be called VPs (ti~Nanta pada). In a subanta pada the base is called a prAtipadika (pdk), which is either primitive (as stored in the GP) or derived through primary (kRRit), secondary (taddhita), or feminine (strI) affixation, or by compounding (samAsa). These prAtipadikas undergo suP affixation under conditions of case, gender, number and the end-characters of the bases to return syntactic words. For example:

Derivation of `rAmaH'

rAma (pdk) + sU (4.1.2/4.1.3) -> rAma + s (1.3.2/1.3.9/1.1.60) -> rAma + rU (8.2.66) -> rAma + r (1.3.2/1.3.9/1.1.60) -> rAmaH (1.4.110/8.3.15).

The verb roots are likewise either basic or derived. The basic roots are stored in the DP, distributed in 10 gaNas, each with a fixed infix called vikaraNa. The affix `L' is introduced after a verb root to mark temporal situations (through ten lakAras like laT, liT ...) and the agent. The rule `lasya' (of `la') replaces the affix `L' by a set of 9+9 (parasmai and Atmane padins) affixes distributed according to person and number (3x3). The ten lakAras are grouped into two sets, sArvadhAtuka (sdk) and ArdhadhAtuka (adk), of 5 each, each taking a particular set of arguments. For example:

Derivation of `paThati'

paThA -> paTh (1.3.1/1.3.2/1.3.9/1.1.60) -> paTh + laT (3.1.91/3.2.123/4.1.2/4.1.3) -> paTh + la (1.3.2/1.3.9/1.1.60) -> paTh + tiP (3.4.79) -> paTh + shaP + tiP (3.1.68) -> paTh + a + ti (1.3.8/1.3.9/1.1.60) -> paThati (syntactic word meaning `reads').

Thus the string `rAmaH paThati' is a complete basic sentence. Cardona (1988) posits the following formula for such a formation -

( N - En )p ... ( V - Ev )p

which consists of related padas (p) in which the nominal affixes (En) and verbal affixes (Ev) follow the respective bases.

6. kAraka and vAkya

`vAkya' does not come under Panini's samj~nA category. It and other non-technical words like `yoga', `samartha', `sAkA~NkSha', `sambandha', `anabhihita', `samAnAdhikaraNa', and `kAraka' are used in AD to express relationships between syntactic constituents. Panini's rules pertaining to kAraka explain a situation in terms of an action (kriyA) and the factors (kArakas) which have a function in the accomplishment of that action. In other words, the six Paninian kArakas, that is, apAdAna (source, 1.4.24), sampradAna (beneficiary, 1.4.32), karaNa (means, 1.4.42), adhikaraNa (location, 1.4.45), karman (patient, 1.4.49), and kartRRi (agent, 1.4.54), specify the possible semantic relationships that hold between the nouns and the verb in a grammatical sentence.

For example, the sentence `devadattaH odanam pacati' (Devadatta cooks rice) can be described as `devadattaH' (agent or kartRRi), `odanam' (patient or karman), `pacati' (action or kriyA in present tense). The present tense equivalent in Panini, `laT', expresses agent-ship, goal, or goal-less state (intransitivity). The ti~Nanta component will take the root `pac' with `laT' (and related morphological specifications) to generate the action `pacati' with the affix `tiP'. Similarly, the subanta component will generate the goal (object) by adding the accusative singular (2-1) case affix `am' to `odana', and the agent by adding the nominative singular (1-1) `sU' to `devadatta'. Thus

(devadatta + sU) (odana + am) (pac + tiP) -> devadattaH odanam pacati
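
The assembly of this sentence from its subanta and ti~Nanta components can be mimicked with a toy Python sketch. It is not an implementation of Panini's rule cascade: the functions `subanta`, `tinganta` and `sentence` are hypothetical, they cover only bases ending in short `a' with the affixes `sU' and `am', and only roots that take the vikaraNa `shaP' before `tiP', and they jump directly to the surface forms shown in the derivations above.

```python
def subanta(base, affix):
    """Surface form of an a-final nominal base with a case affix.
    Only the two affixes used in the example are covered."""
    if not base.endswith("a"):
        raise ValueError("this sketch handles only bases ending in short a")
    if affix == "sU":    # nominative singular (1-1): s -> rU -> r -> H
        return base + "H"
    if affix == "am":    # accusative singular (2-1): a + am surfaces as am
        return base + "m"
    raise ValueError(f"affix {affix!r} is not covered by this sketch")


def tinganta(root, affix):
    """Surface form of a root with the present tense (laT) affix tiP,
    assuming the vikaraNa shaP, which leaves 'a' between root and 'ti'."""
    if affix == "tiP":
        return root + "a" + "ti"
    raise ValueError(f"affix {affix!r} is not covered by this sketch")


def sentence(agent, patient, root):
    """Agent (kartRRi) + patient (karman) + action (kriyA)."""
    return " ".join([
        subanta(agent, "sU"),
        subanta(patient, "am"),
        tinganta(root, "tiP"),
    ])


print(sentence("devadatta", "odana", "pac"))  # devadattaH odanam pacati
```

Each function stands in for a whole chain of rule applications; replacing the shortcuts with explicit steps is where a rule-based engine would begin.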

7. Conclusion

Panini's is an essentially formal system, which lends itself well to computation with little further formalization. This leaves ample scope for drawing language processing insights from Panini. The AShTAdhyAyI is also important for its linguistic insights into the structure and functioning of the many Indian languages genealogically related to Sanskrit. The NLP/Computational Linguistics community has already started using Panini as a model for Indian languages with reasonable success. It may be interesting to see whether the Paninian formalism will work for other languages of the Indo-European family.


References

Cardona, George, 1965, On translating and formalizing Paninian rules. Journal of Oriental Institute, Baroda, vol 14, 306-14.

Cardona, George, 1970, Some Principles of Panini's Grammar. Journal of Indian Philosophy, vol 1, 40-74.

Cardona, George, 1974, Panini's Karakas: Agency, Animation and Identity. Journal of Indian Philosophy, vol 2, 231-306.

Cardona, George, 1976, Some features of Paninian Derivations. History of thought and contemporary Linguistics.

Cardona, George, 1987, Panini: His work and its traditions (vols 1-3, first edn 1987), Motilal Banarasidass, 1988.

Deshpande, Madhav M. 1992, Panini in the context of Modernity. Language and text. ed. R.N.Srivastava et al, Kalinga Publications, Delhi.

Faddegon, Barend, 1936, Studies on Panini's Grammar. Amsterdam.

Jha, Girish Nath, 1993, Morphology of Case Affixes: a computational analysis, M.Phil. thesis submitted to J.N.U. New Delhi

Joshi, S.D. 1969, Sentence structure according to Panini. Indian Antiquary.

Kanthan, K.L., Formal language system of Panini. Chemical Bank, Information Technology Management, New York.

Kapoor, Kapil, 1991, Panini Vyakarana: Nature, Applicability and Organization. Course notes for NLP-91, IIT-Kanpur, 1991.

--------------------, 1992, Norm and Variation: A Classical Indian Debate. Language and Text. ed. R.N.Srivastava et al, Kalinga Publications, Delhi.

--------------,1993, Text and Interpretation: The Indian Tradition, under publication, D.K. Print World, Delhi.

--------------,2004, Essays on Panini's Ashtadhyayi, under publication D.K. Print World, Delhi.

Katre, Sumitra, M. 1985, Astadhyayi of Panini (first Indian edn., Motilal Banarasidass, 1989).

Mishra, Vidya, Niwas, 1966, The Descriptive Technique of Panini: An Introduction. Mouton & Co., The Hague, Paris.

Sangal, R. 1991, Karaka Theory - sentence and nominals. Course notes for NLP-91, IIT-Kanpur.

Sharma, Ram, Nath, 1987, The Astadhyayi of Panini. Vol I, Munshiram Manoharlal Publishers Pvt. Ltd., New Delhi.

Sharma, Ram, Nath, 1990, The Astadhyayi of Panini. Vol II, Munshiram Manoharlal Publishers Pvt. Ltd., New Delhi.


Appendix: Phonetic chart (ITRANS 5.0)

Vowels

a  A  i  I  u  U  RRi  RRI  LLi  LLI  e  ai  o  au  aM  aH

Consonants

k  kh  g  gh  ~N
c  ch  j  jh  ~n
T  Th  D  Dh  N
t  th  d  dh  n
p  ph  b  bh  m
y  r  l  v
sh  Sh  s  h  L
kSh  j~n  shr




Girish Nath Jha, Ph.D.
Special Centre for Sanskrit Studies
Jawaharlal Nehru University
New Delhi 110067
E-mail: girishj@mail.jnu.ac.in