Introducing Sanskrit Wordnet

Malhar Kulkarni
Department of Humanities and Social Sciences, Indian Institute of Technology Bombay
malhar@iitb.ac.in

Chaitali Dangarikar
Center for Indian Language Technology, Indian Institute of Technology Bombay
chaitali.dangarikar@gmail.com

Irawati Kulkarni
Center for Indian Language Technology, Indian Institute of Technology Bombay
irawatikulkarni@gmail.com

Abhishek Nanda
Center for Indian Language Technology, Indian Institute of Technology Bombay
abhi.nanda@gmail.com

Pushpak Bhattacharyya
Center for Indian Language Technology, Indian Institute of Technology Bombay
pb@cse.iitb.ac.in
                   
                                                                     
                   
                                                                             
Abstract

How does one build the wordnet of a language that has a rich lexical tradition spanning millennia? The sheer volume of words and their nuances, the rich, deep and diverse grammatical tradition, the pressure of modern developments on the language: all these factors and more combine to pose unique challenges in creating lexical resources for such languages. This paper describes the construction of the Sanskrit wordnet, which is being built using the expansion approach. It presents the processes and challenges involved in this task, which aims to uncover the intimate linkage that underlies Indian languages, most of which have speaker populations numbering 20 to 500 million.

1 Introduction

Sanskrit is historically an Indo-Aryan language (Deshpande, 1992) and one of the 22 official languages of India. It has a vast literature, and interest in analyzing and translating these texts is steadily rising worldwide.

Specifically, our motivation for building the Sanskrit wordnet arises from the following facts:

1. For all languages of the Indo-European family in India, the roots can be traced to Sanskrit. A large part of the vocabulary of these languages is derived from Sanskrit, which can therefore provide the pivot resource for many Indian languages. The speaker populations of these languages range from 10 million (Konkani) to 500 million (Hindi/Urdu).
2. Sanskrit being a heritage language, there is a need to digitize and preserve ancient texts in Sanskrit. This activity is greatly helped by word lists. An Optical Character Recognition (OCR) system for Sanskrit, for example, would need spelling correction after scanning, and this would need an exhaustive lexicon.
3. Similarly, there is a real need to translate ancient texts in order to preserve traditional culture and knowledge. An online wordnet would no doubt be a great help to a translator.
4. Machine aided translation (MAT) is maturing fast, and automatic translation of Sanskrit text is a challenging problem that needs a wordnet.
5. There is an enormous amount of Sanskrit text which should be available in keyword-based searchable form. Text search is greatly helped by wordnets.
6. The tradition of developing lexical resources is very old in Sanskrit. There are diverse koshas (traditional and rich monolingual dictionaries) in Sanskrit (see section 1.2 below). The Sanskrit wordnet will serve as a single reference point representing and pointing to all these resources.

1.1 Sanskrit language

The Indian subcontinent is inhabited by a very large population who speak languages belonging to 4 major families: Indo-Aryan (a sub-family of Indo-European), Dravidian, Tibeto-Burman and Austro-Asiatic. Sanskrit is the oldest member of the Indo-Aryan language
                                                
family, a sub-branch of Indo-Iranian, which in turn is a branch of the Indo-European language family.

There is a traditional fourfold division of lexical units of Indian languages into:

1. tatsama[1] - words having their origin in Sanskrit and accepted in the modern Indo-Aryan languages without any change in their phonology.
2. tadbhava[2] - words which have their origin in Sanskrit but whose phonological forms are changed as per the rules of the modern Indo-Aryan languages.
3. deshi - words which are the native words of the particular language, and
4. videshi - words borrowed from foreign languages.

The links to tatsama and tadbhava words, in particular, will be a great pan-Indian linguistic resource for computational purposes. Table 1 below lists some examples of Sanskrit words in the Hindi wordnet[3].

[Table 1 columns: HWN synset, Tatsama word, HWN synset, English meaning. The Devanagari entries are not recoverable from this capture. English meanings of the four example synsets: basil; eyebrow, brow, supercilium; muscle, musculus; eggplant, aubergine, mad_apple.]

Table 1: Tatsama words in the HWN

These representative examples show that the synsets in the Hindi wordnet contain 60-70% tatsama (directly borrowed from Sanskrit) words.

[1] Tatsama Shabda Kosha (Tatsama words dictionary) was published by Kendriya Hindi Nideshalaya, Shiksha Vibhaga, Manava Samsadhana Vikasa Mantralaya, Bharata Sarakara in 1988.
[2] See Hindi ki Tadbhava Shabdavali (Sarma, 1968).
[3] www.cfilt.iitb.ac.in/wordnet/webhwn.

1.2 Rich lexical tradition of Sanskrit

Sanskrit has a rich tradition of creating lexica (Kulkarni, 2008). Nighantu[4] (700 BC), on which Yaska is believed to have written a commentary called Nirukta, is the oldest known treatise that arranged lexical material from the point of view of synonymy as well as homonymy, and this tradition continued into the Pali[5] tradition as well. The first and most popular lexicographic work in classical Sanskrit is Amarasimha's Amarakosha (6th century AD) (Oka, 1913). The Catalogus Catalogorum lists at least 40 commentaries on the Amarakosha alone, which shows how important and popular this dictionary of synonyms was in ancient India.

There were many other lexica created more or less in the style of the Amarakosha; 11 of them are given in Appendix A.

The first modern-day dictionary of Sanskrit was the Sanskrit-English Dictionary compiled by Professor H.H. Wilson and published in 1819 (Wilson, 1819). Two Indian dictionaries came out soon after, namely the Shabdakalpadruma[6] (Deb, 1988) of Pt. Sir Raja Radhakanta Dev and the Vacaspatyam[7] (Bhattacharya, 2003) compiled by Pt. Taranatha Tarkavacaspati.

So far the electronic lexical resources available for Sanskrit are mainly online dictionaries.[8]

[4] Nighantu is the Sanskrit term for a collection of words grouped into thematic categories, with brief annotations.
[5] Pali is a Middle Indo-Aryan language (or Prakrit) of India. It is best known as the language of the earliest extant Buddhist scriptures.
[6] Shabdakalpadruma is the first monolingual Sanskrit dictionary arranged on modern alphabetical principles. It gives full quotations and definitions from the original Koshas, which were unavailable in print at that time. Sets of synonymous words from the traditional Koshas are arranged under the headword, followed by a brief gloss. Each entry in the lexicon includes the headword, its category, its meaning, and usages in Sanskrit texts.
[7] Vacaspatyam is a modern monolingual Sanskrit lexicon. It arranges words in Sanskrit alphabetical order and gives grammatical information with word derivations as per traditional Sanskrit grammar. It contains about 46970 unique words. Each entry in the lexicon includes the headword, its category, its meaning, a set of synonymous words, usages and some other information.
[8] The online dictionaries available for Sanskrit are (1) Monier Williams dictionary <http://webapps.uni-koeln.de/tamil/>, (2) Apte's Sanskrit-English Dictionary <http://www.aa.tufs.ac.jp/~tjun/sktdic/>, (3) Apte's English-Sanskrit Dictionary <http://www.sanskrit-lexicon.uni-koeln.de/aequery/index.html> and (4) Spoken Sanskrit Dictionary: an online hypertext dictionary for Sanskrit - English and English - Sanskrit <http://spokensanskrit.de/>. Apart from these, various scanned versions of the printed dictionaries prepared by European scholars are available at <http://www.sanskrit-lexicon.uni-koeln.de/>.

The linguistic resources like Shabdakalpadruma
                                      
and Vacaspatyam are vast. For example, a comparison of the entries for the word war in these electronic dictionaries with the synsets of the same word in the Sanskrit Wordnet is a good indicator of the richness of this lexical tradition in Sanskrit.

1. Spoken Sanskrit Dictionary: (7 words) [Devanagari word list not recoverable]
2. Apte's Sanskrit-English Dictionary: (7 words) [Devanagari word list not recoverable]
3. Monier Williams Dictionary: (56 words) [Devanagari word list not recoverable]
4. Sanskrit Wordnet: ( words) [Devanagari word list not recoverable]

1.4 Expansion approach for Indian language wordnets

Wordnet construction activities in India started in 2000, and the Hindi wordnet[9] (Narayan et al., 2002) was the first, released on the Web in 2006. It was built ab initio using words from available lexical resources of Hindi. The design of the Hindi wordnet follows that of the well-known English WordNet[10].

While following the expansion approach, the Sanskrit wordnet follows the hierarchy preservation principle (HPP) (Tufis et al., 2008). In the hierarchy of the Hindi wordnet, if synset H_2 is a hyponym of synset H_1, and the translation equivalents in the Sanskrit wordnet for H_1 and H_2 are S_1 and S_2 respectively, then in the hierarchy of the Sanskrit wordnet S_2 should be a hyponym of synset S_1. Thus, in the expansion approach lexicographers are spared the task of establishing semantic relations afresh for the synsets of the Sanskrit wordnet. Appendix 2 describes and shows screenshots of the lexicographers' interface for creating the Sanskrit wordnet.
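
The hierarchy preservation principle can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the project's actual tooling: the function name project_hyponymy, the pair-based representation of hyponymy, and the synset identifiers are all hypothetical. A hyponymy pair is carried over to the Sanskrit side only when both Hindi synsets already have a Sanskrit translation equivalent.

```python
from typing import Dict, Set, Tuple


def project_hyponymy(
    hindi_hyponym_of: Set[Tuple[str, str]],
    hindi_to_sanskrit: Dict[str, str],
) -> Set[Tuple[str, str]]:
    """Project (hyponym, hypernym) pairs from Hindi synsets to Sanskrit synsets.

    hindi_hyponym_of holds pairs (H_2, H_1) meaning "H_2 is a hyponym of H_1".
    hindi_to_sanskrit maps a Hindi synset id to its Sanskrit equivalent.
    """
    projected = set()
    for h2, h1 in hindi_hyponym_of:
        s2 = hindi_to_sanskrit.get(h2)
        s1 = hindi_to_sanskrit.get(h1)
        # Skip pairs where either side has no Sanskrit equivalent yet.
        if s1 is not None and s2 is not None:
            projected.add((s2, s1))
    return projected


# Hypothetical synset identifiers, for illustration only.
hindi_relations = {("H2", "H1"), ("H3", "H1")}
links = {"H1": "S1", "H2": "S2"}  # H3 has not been translated yet
sanskrit_relations = project_hyponymy(hindi_relations, links)
```

In this sketch the pair (H3, H1) is left out because H3 has no Sanskrit equivalent yet; the relation would appear once a lexicographer links that synset.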

1.5 Synset creation in Sanskrit wordnet

Domains: Initially, the Sanskrit wordnet team started creating synsets from randomly chosen synsets of the Hindi Wordnet. Later on, lists of important Sanskrit words were acquired from different sources. The University of Hyderabad provided a list of the most frequent words in their Sanskrit corpus. It consisted of 8338 words. Another word list, available on the indology forum[11], contains 127796 unique words from two major epics of Sanskrit literature: Ramayana[12] and Mahabharata[13]. The third list was prepared from the lexicon called Bharatiya Vyavahara Kosha (Naravane, 1961). Table 2 shows the part of speech distribution of Naravane's lexicon. It contains 2766 words
                                     ', X, ?, ?V, E 
                                                 B                    B                                                                                         which are used for 1969 concepts related to the 
                                                                                                                                                                day to day life. Table 3 shows a comparison be-
                                     1.3            The  process  of  building  the  Sanskrit                                                                   tween the lists of Sanskrit words gleaned from 
                                                     wordnet                                                                                                    various sources mentioned above.  
                                      There are two methods to develop a Wordnet: 
                                     (1) Expand method and (2) Merge method (Vos-
                                     sen, 2002). In the first method, a wordnet is con-                                                                                                                          
                                                                                                                                                                9 www.cfilt.iitb.ac.in/wordnet/webhwn 
                                     structed  based  on  an  existing  wordnet.  In  the                                                                       10
                                                                                                                                                                   Wordnet.princetoon.edu 
                                     second  method,  sub-Wordnets  for  specific  do-                                                                          11
                                                                                                                                                                    
                                     mains are  built  and  later  merged.  For  Sanskrit                                                                       12
                                                                                                                                                                   Ramayana is an ancient Sanskrit epic. The Valmiki Ra-
                                     Wordnet, the Hindi wordnet is considered as the                                                                            mayana is published in 7 volumes, Baroda: University of 
                                     source  resource.  Though  expanded  from  Hindi                                                                           Baroda Oriental Institute, 1960-1975. 
                                                                                                                                                                13
                                     wordnet, care was taken to ensure that Sanskrit                                                                               Mahabharata is one of the two important epics of India. 
                                                                                                                                                                The Critical Edition of the Mahabharata is prepared by the 
                                     wordnet  captures  the  real  lexical  structure  of                                                                       Bhandarkar Oriental Institute,  Pune from April 1919 to 
                                     Sanskrit language.                                                                                                         September 1966. It has 19 volumeconsisting18 Parvan-s; 
                                                                                                                                                                89000+ verses in the Constituted Text, and an elaborate 
                                                                                                                                                                Critical Apparatus. 
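The expand method described in Section 1.3 can be illustrated with a minimal sketch. It assumes a simplified in-memory representation: the `Synset` class, the `expand` function, the synset ID 101 and the placeholder member string are all hypothetical and are not taken from the Hindi or Sanskrit wordnet software. The Hindi members and the two glosses are the transliterated forms quoted in the paper's own example for "to weep".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Synset:
    synset_id: int       # shared between source and target wordnets
    pos: str             # part of speech
    members: list[str]   # words expressing the concept
    gloss: str           # definition of the concept


def expand(source: dict[int, Synset],
           entries: dict[int, tuple[list[str], str]]) -> dict[int, Synset]:
    """Build target-language synsets on top of existing source synsets.

    Each target synset reuses the ID and part of speech of its source
    synset; only the members and the gloss are supplied anew.
    """
    target: dict[int, Synset] = {}
    for sid, (members, gloss) in entries.items():
        src = source[sid]  # raises KeyError if the source synset is unknown
        target[sid] = Synset(sid, src.pos, list(members), gloss)
    return target


# Hypothetical ID; members and glosses in the paper's transliteration.
hindi = {
    101: Synset(101, "verb",
                ["ronA", "rudana karanA"],
                "AMkha se AMsu girAnA"),
}
sanskrit_entries = {
    101: (["<Sanskrit member, to be supplied by the lexicographer>"],
          "sukha-duHkhayoH bhAvanAvegAt netrAbhyAm aZrupatan-rUpaH vyApAraH"),
}

sanskrit = expand(hindi, sanskrit_entries)
print(sanskrit[101].pos, "|", sanskrit[101].gloss)
```

Keeping the source synset ID is what links the two wordnets in this sketch; a merge-style construction would instead build the target synsets independently and align them afterwards.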
                           
The above-mentioned words are organized into 52 domains [14]. Omitting function words, a core set of concepts was prepared, and by Sept. 2009 synsets for all these core concepts had been created [15].

Nouns    Verbs    Adjectives    Adverbs
1512     225      180           52

Table 2: POS distribution of the synsets created (core concepts)

List             Description                                                                  Words
Sanskrit List 1  Univ. of Hyderabad list of most frequent words in Sanskrit (Amba Kulkarni)   8338
Sanskrit List 2  Sanskrit word list (based on Ramayana and Mahabharata)                       127796
Sanskrit List 3  Number of Sanskrit words in Naravane's Bhasha Vyavahar Kosh                  2766
Hindi List 1     Hindi wordnet: total number of unique words                                  105157

Table 3: Sanskrit word list

While creating synsets, the following considerations are kept in mind:

Inserting concepts or glosses in the Sanskrit wordnet: A combination of the glosses given in dictionaries like Shabdakalpadruma and the translation of the gloss of the Hindi wordnet synset is used to create the Sanskrit synset glosses. While writing the gloss, complicated sandhis [16] and samAsas (compounds) are avoided. Whenever lengthy compounds (having 5-6 members) became necessary, the members of the compound were invariably joined with the hyphen symbol (-), as in अन्य-स्थान-संयोगानुकूल-व्यापारः (anya-sthAna-saMyogAnukUla-vyApAraH), meaning "the activity that is helpful in reaching a place", where the members of the compound are अन्य (anya), स्थान (sthAna), संयोग (saMyoga), अनुकूल (anukUla) and व्यापार (vyApAra) [17], and their boundaries are indicated by inserting hyphens. For example, the gloss of a verb in Sanskrit is generally created using technical terms like व्यापार vyApAra 'action', जन्य janya 'produced', अनुकूल anukUla 'helpful', etc. [18]

2 Problems faced in the expansion approach

In this section we enumerate the challenges faced in creating the synsets of the Sanskrit wordnet in consonance with those of Hindi.

14 These domains are: 1) Grains and Cereals, 2) Limbs of Humans, 3) Medical treatment, 4) Tools & implements, 5) Worms & Insects, 6) Minerals, 7) Food and Drinks, 8) Games & sports, 9) Ornaments & Trinkets, 10) Household articles, 11) Limbs of animals, 12) Post office, 13) Vegetables, 14) Directions, 15) Country, 16) Religion, 17) Court, 18) Birds, 19) Trees & plants, 20) Dress, 21) Nature, 22) Animals, 23) Fruits, 24) Flowers, 25) Young-ones of animals, 26) Amusement, 27) Spices, 28) Weights & measures, 29) Colours, 30) Relatives, 31) Diseases, 32) Reptiles, 33) Conveyances, 34) Occupations, 35) Education, 36) Time, 37) Government, 38) Verbs, 39) Adverbs, 40) Abstract nouns, 41) Adjectives, 42) Prepositions, 43) Numerals, 44) Conjunctions, 45) Collective words, 46) Pronouns, 47) Ordinals, 48) Feminines, 49) Interjections, 50) War, 51) House, and 52) Miscellaneous.
15 From this time Sanskrit Wordnet became a part of the IndoWordNet activity, which provided a common platform for the lexicographers working on various Indian language wordnets.
16 Phonological conjoining
17 This way of giving definitions is typical of the Sanskritic tradition, which used to strongly emphasise precision. The long compound simply defines the act of going.
18 Using these expressions, the Hindi Wordnet (HWN) gloss is adapted for the Sanskrit wordnet (SWN) in the following ways:
(1) {ronA, rudana karanA, AMsu bhAnA, krandana karanA} HWN: AMkha se AMsu girAnA; SWN: sukha-duHkhayoH bhAvanAvegAt netrAbhyAm aZrupatan-rUpaH vyApAraH.
(2) {mAranA, pITanA, prahAra karanA, ThokanA, piTAI karanA, dhunanA, dhunAI karanA, tADanA, pratADanA, rasIda karanA} HWN: kisi par kisI vastu Adi se AghAta karanA; SWN: kasmin api kena api vastunA Ahanana-pUrvakaH vyApAraH.
(3) {kharIdanA, kraya karanA, mola lenA, lenA} HWN: paise Adi dekar kisI dukAna, vyakti Adi se kuch saudA mol lenA; SWN: ApaNe vastu tathA cha tanmUlyam etayoH AdAna-pradAnAtmakaH vyApAraH.
(4) {rUThanA, ruSTa honA, anakhanA, rUsanA, risAnA, phUlanA, anasAnA, anakhAnA} HWN: aprasanna hokara udAsIna, cupa yA alaga ho jAnA; SWN: aprasannatAhetujanyaH viyogarUpaH audAsInya-phalajanakaH vA vyApAraH.
(5) {AnaA, pahuMcanA, pahucanA, padhAranA, avanA, AgamanA} HWN: eka stAna se Akara dUsare stAna para upasthita honA; SWN: anya-sthAna-viyoga-pUrvakaH anya-sthAna saMyogAnukUla-vyApAraH.
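The hyphenation convention for long compounds and the counts reported in Table 2 lend themselves to a quick mechanical check. The sketch below is only an illustration: the helper name `compound_members` is hypothetical, and it works on the transliterated form of the compound quoted in the text. It recovers only the segments separated by hyphens, so saMyogAnukUla stays one segment even though the text lists saMyoga and anukUla as separate members; splitting it would need sandhi analysis, which is not attempted here.

```python
def compound_members(compound: str) -> list[str]:
    """Return the hyphen-separated segments of a compound gloss term."""
    return [part for part in compound.split("-") if part]


# Compound quoted in the text (transliterated).
example = "anya-sthAna-saMyogAnukUla-vyApAraH"
members = compound_members(example)
print(members)

# Counts from Table 2; their sum should equal the number of core concepts
# (1969) reported for Naravane's lexicon.
pos_counts = {"Nouns": 1512, "Verbs": 225, "Adjectives": 180, "Adverbs": 52}
total = sum(pos_counts.values())
print(total)
```

The sum of the four part-of-speech counts agrees with the figure of 1969 concepts given in the Domains paragraph.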