                          A Rule Based Approach for Connective in Malayalam Language
                          Kumari Sheeja S (1, 2), Lakshmi S (1), Sobha Lalitha Devi (1)
                          (1) AU-KBC Research Centre, Anna University, Chrompet, Chennai
                          (2) KCG College of Technology, Karapakkam, Chennai
                          sheeja@kcgcollege.com, slakshmi@au-kbc.org, sobha@au-kbc.org
                         Abstract. Discourse connectives signal the relationship between two coherent spans of text, and
                         the text spans they relate are called their arguments. Discourse relations link clauses in text and
                         compose the overall text structure. Discourse connectives are an important part of modeling
                         Malayalam discourse structure. We present a rule based approach to identifying discourse
                         connectives in Malayalam. Discourse connectives may or may not be explicitly present in a
                         relation. In this work we focus on the rule based identification of a particular connective in
                         Malayalam text and report encouraging results.
                         Keywords: Discourse connectives, rule based approach, Malayalam discourse, connective arguments
                         1       Introduction
                         Discourse relations connect clauses and sentences in a text and compose the overall text
                         structure. Discourse analysis is concerned with analyzing how clause or sentence level units of text
                         are related to each other within a larger unit of text. The two basic units of discourse relations are
                         discourse markers and their arguments. Discourse markers are the words or phrases that connect
                         two clauses or sentences and establish a relation between two discourse units.
                          Kamala went to hospital but doctor was not there.
                         In this example the connective “but” establishes a relation between the two clauses and makes
                         the text coherent. Discourse relations are used in NLP applications and are important for
                         discourse analysis. Identifying discourse relations is a challenging task in natural language
                         processing. Discourse connectives, despite their common function of connecting the contents of
                         two different clauses, can also act as ordinary conjunctions [11], so it is difficult to distinguish
                         discourse markers from non-discourse markers. Identifying argument boundaries is even more
                         difficult in large texts. Malayalam is a South Indian (Dravidian) language with free word order,
                         although the verb stays in final position. Discourse connectives are important for producing and
                         interpreting text in Malayalam. The paper is organized as follows. Section 2 describes related
                         work. Section 3 gives an overview of discourse relations and Section 4 explains the rule based
                         approach. The paper ends with the conclusion of the work.
                             2       Related Work
                         Annotation of discourse connectives and their arguments has been explored in various languages
                         such as Turkish [12], Arabic [2] and English [7]. The Penn Discourse Treebank (PDTB) is the first
                         to follow the lexically grounded approach to annotation of discourse relations and it is unique in
                         adopting a theory-neutral approach to annotation. PDTB provides the argument structure of
                         discourse relations and sense labels for each relation in text, following a hierarchical
                         classification scheme. Elwell et al. [9] used maximum entropy rankers and achieved a 3.6%
                         improvement over the state of the art in identifying arguments of discourse connectives. Versley
                         [11] worked on tagging German discourse connectives and arguments using English training data
                         and a German-English parallel corpus. Versley's approach was to transfer a tagger for English
                         discourse connectives by annotation projection, using a freely accessible list of connectives. He
                         achieved
                               an F-score of 68.7% for the identification of discourse connectives. Ghosh [5] used a data
                               driven approach to identify arguments of explicit discourse connectives in the PDTB corpus.
                               Al Saif [1] used machine learning algorithms to automatically identify explicit discourse
                               connectives and their arguments in Arabic. Wang et al. [12] used sub-trees as features and
                               achieved a significant improvement in identifying arguments and explicit and implicit
                               discourse relations. Published work on discourse relation annotation in Indian languages is
                               available for Hindi, Malayalam and Tamil by Sobha et al. [3]. They have also worked on
                               automatic identification of discourse relations in these three Indian languages [10] using
                               the CRF technique. Other published works on Indian languages cover Hindi [6], [7] and
                               Tamil [8]. In this paper we explore various discourse connectives and a rule based approach
                               for a particular connective in Malayalam.
                                  3      Discourse Connectives In Malayalam
                               Malayalam is a free word order language in which words are agglutinated, so most
                               connectives also appear in agglutinated form. A discourse relation in Malayalam can be
                               signalled syntactically (by a suffix) or lexically [10]. It can hold within a clause, between
                               clauses or between sentences. Discourse connectives are an important part of modeling
                               discourse structure. In this paper we describe the various connectives present in Malayalam
                               and a rule based approach to identifying the connective “pakshe” (but).
                                  3.1      Discourse Relation categorization
                                  Discourse markers can be realized in any of the following ways. There are two major categories,
                               explicit and implicit relations. We also observed other types of relations.
                                   3.2          Explicit connectives
                                      Explicit connectives are morphemes or free words that trigger discourse relations in
                               Malayalam. They signal the presence of a discourse relation between sentences or clauses.
                               A connective can occur at the initial, final or medial position of an argument in Malayalam
                               [12]. An example of an explicit connective in Malayalam is given below.
                                         [prameham  oru  nishabdha  kolayaaLiyaaN.]/arg1
                                          diabetes  one  silent     killer
                                         ennaal  [niyanthrichu  nirthiyaal  kuzhappamilla]/arg2
                                         but      control       kept if     no problem
                               (Diabetes is a silent killer. But when kept in control it is not a problem.)
                               In the above example, the connective “ennaal” occurs inter-sententially, connecting the two
                               sentences. The connective occurs at the initial position of the second argument. Here the
                               connective explicitly realizes the relation between the two arguments. Four types of explicit
                               connectives have been observed.
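
The inter-sentential case described above can be sketched in code. The following Python fragment is only a minimal illustration, not the rule set used in this work. It assumes romanized, whitespace-tokenized input, and the names EXPLICIT_INITIAL and split_on_initial_connective are hypothetical. When the second sentence starts with a listed connective, the first sentence is taken as arg1 and the rest of the second sentence as arg2.

```python
# Hypothetical list of sentence-initial connectives (romanized). The two
# entries are taken from the examples in this paper; a real list would be larger.
EXPLICIT_INITIAL = {"ennaal", "athinaal"}


def split_on_initial_connective(sent1, sent2):
    """Return (arg1, connective, arg2) if sent2 starts with a listed
    connective, otherwise None."""
    tokens = sent2.split()
    if not tokens:
        return None
    first = tokens[0].lower()
    if first in EXPLICIT_INITIAL:
        # arg1 is the whole preceding sentence; arg2 is what follows the connective.
        return sent1.strip(), first, " ".join(tokens[1:])
    return None


result = split_on_initial_connective(
    "prameham oru nishabdha kolayaaLiyaaN.",
    "ennaal niyanthrichu nirthiyaal kuzhappamilla",
)
print(result)
```

A lookup of this kind only covers connectives that appear as free words in sentence-initial position; connectives in medial or final position, or realized as suffixes, would need additional rules.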
           3.3      Explicit connective Types
             Subordinate Conjunctions. This type of conjunction connects the main clause with an adverbial,
           noun or adjectival clause. The most commonly observed subordinate conjunctions in all three
           languages are since, because and when. The following example illustrates a subordinate
           conjunction in Malayalam.
                 [pachakkarikaL  vevichu  kazhikkumpoL]/arg1
                  Vegetables     boil     when eat
                 [athiluLLa  poshakam   nashtamaakum]/arg2
                  In that    nutrients  loss
            (When vegetables are boiled and consumed, the nutrients in them are lost.)
           As the example shows, both lexical items and morphemes can serve as connectives.
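
A connective carried by a morpheme rather than a free word can be illustrated with a small suffix check. This is a sketch under stated assumptions: the example glosses “kazhikkumpoL” as “when eat”, so the romanized ending “umpoL” is treated here as the connective morpheme. That segmentation, the table SUFFIX_CONNECTIVES and the function find_suffix_connective are illustrative assumptions, not a morphological analysis from this work.

```python
# Assumed romanized suffix table. "umpoL" -> "when" is inferred from the
# gloss of "kazhikkumpoL" in the example; it is an illustrative assumption.
SUFFIX_CONNECTIVES = {"umpoL": "when"}


def find_suffix_connective(tokens):
    """Return (index, suffix, gloss) for the first token that ends in a
    listed suffix, or None if no token matches."""
    for i, tok in enumerate(tokens):
        word = tok.strip(".,")
        for suffix, gloss in SUFFIX_CONNECTIVES.items():
            # Require a non-empty stem before the suffix.
            if word.endswith(suffix) and len(word) > len(suffix):
                return i, suffix, gloss
    return None


tokens = "pachakkarikaL vevichu kazhikkumpoL athiluLLa poshakam nashtamaakum".split()
hit = find_suffix_connective(tokens)
if hit is not None:
    i, suffix, gloss = hit
    arg1 = " ".join(tokens[: i + 1])  # clause ending in the suffixed verb
    arg2 = " ".join(tokens[i + 1:])   # the main clause that follows
    print(suffix, gloss, "|", arg1, "|", arg2)
```

Plain string matching of this kind will also fire on unrelated words that happen to end in the same letters, so in practice it would have to be combined with morphological analysis.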
           Co-ordinate Conjunctions. These conjunctions give equal emphasis to two clauses. They connect
           two words, phrases or clauses. The most commonly observed co-ordinate conjunctions in the corpus
           are “but” and “and”. In Malayalam, “pakshe” (but) is a co-ordinate conjunction. An
           intra-sentential co-ordinating conjunction occurs between the clauses it connects.
           Conjunct Adverbs. These modify the clauses or sentences in which they occur and join
           independent clauses together. They are a special type of conjunction because they behave partly
           as adverbs and partly as conjunctions. An example of such a relation is given below.
                 [kazhuth,  mukham,  kaiviralukal  ennivitangalil
                  Neck,     face,    fingers       all+these+places
                  karuthaniramuNtaakaan  kozhuppu
                  black+color+come       fat
                  kaaraNamaakum.]/arg1  athinaal   [eNNayil
                  reason+will+be        Therefore   oil
                  varutha  aahaaram,  kozhuppulla  Bakshanam
                  fried    food       fatty        food
                  enniva     ozhivaakkaNam.]/arg2
                  all+these  avoid
           (Fat can make the neck, face and fingers turn black. Therefore we have to avoid oily and fatty
           foods.)
           In the above example “athinaal” is the adverbial conjunction. It signals a cause and effect
           relationship in which arg1 is the cause and arg2 is the effect.
           Correlative conjunction. Correlative conjunctions are pairs of conjunctions used within a
           sentence to join words or groups of words. They are not used to connect sentences; instead they
           link two or more words or clauses of equal importance within a single sentence.
                 [indyayennaal  innu   sachin
                  india means   today  sachin
                  maathramalla,]/arg1  [pakshe  innum
                  not only              but     also today
                  Sachinillaathe  indyaye
                  sachin without  india
                  sankalppikkaan  prayaasam.]/arg2
                  think           cannot
           (Today India means not only Sachin, but also one cannot think of an India without Sachin.)
           Here “maathramalla-pakshe” is the correlative connective. The second member, “pakshe”, may
           be dropped in certain cases.
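
The paired behaviour of a correlative connective, including the case where the second member is dropped, can be sketched as follows. This is a minimal illustration assuming romanized, whitespace-tokenized input; the function name find_correlative and its return convention are hypothetical and are not part of the rules developed in this work.

```python
def find_correlative(tokens, first="maathramalla", second="pakshe"):
    """Locate a correlative pair in a list of tokens.

    Returns (first_index, second_index). second_index is None when the
    second member is dropped. Returns None if the first member is absent.
    """
    cleaned = [t.strip(".,").lower() for t in tokens]
    if first not in cleaned:
        return None
    i = cleaned.index(first)
    # Look for the second member only to the right of the first one.
    for j in range(i + 1, len(cleaned)):
        if cleaned[j] == second:
            return i, j
    return i, None


sentence = ("indyayennaal innu sachin maathramalla, pakshe innum "
            "Sachinillaathe indyaye sankalppikkaan prayaasam.")
pair = find_correlative(sentence.split())
print(pair)
```

Returning the index of the first member even when “pakshe” is missing lets a later rule still place the argument boundary after “maathramalla”.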
           Complementizer clause. The complementizer is considered a special type of connective. It is a
           conjunction that marks a complement clause.
                 [avare  vila   kalppikkunnilla]/arg1  ennu  [nethaakkal
                  they   value  not given              that   leaders
                  abhinayichu]/arg2
                  pretend
           (The leaders pretended that they were not given a value.)
           3.4       Implicit Connectives
            An implicit relation can be inferred if a relationship exists between an adjacent pair of
           sentences and no explicit connective is present in the text. We use the label “IMPLICIT”
           wherever an implicit relation was inferred [12].
           (7)  [pilkaalath  niravadhi  svadeshikal  bekkarute
                 later       many       people       bekkar's
                 paatha  pinthutarnnu.]/arg1  IMPLICIT  [mattu
                 way     followed                        some
                 chilaraakatte  kaayalil   svadesheeyamaaya
                 People         backwater  traditional
                 Reethiyil  kayal      nikathi  krishi  bhoomi
                 style      backwater  filled   farm    land
                 uNdaakkiyetuthu.]/arg2
                 made
           (Later many people followed bekkar's path. Some people in their traditional style filled up
           backwaters and made their farm land.)
           In the above example the two sentences are not explicitly connected, but a relationship between
           them can be inferred.
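
The distinction between explicit and implicit relations over adjacent sentences can be sketched as a simple labelling pass. This is an illustration under stated assumptions, not the annotation procedure used in this work: the connective set EXPLICIT and the function label_adjacent_pairs are hypothetical, the input is romanized text, and a pair without a listed connective is only a candidate for the “IMPLICIT” label, since a human annotator must still decide whether a relation can actually be inferred.

```python
# Hypothetical connective list built from the examples in this paper.
EXPLICIT = {"ennaal", "athinaal", "pakshe"}


def label_adjacent_pairs(sentences):
    """Label each adjacent sentence pair.

    A pair is ("EXPLICIT", connective) when the second sentence contains
    a listed connective as a free word; otherwise it is ("IMPLICIT", None),
    meaning only that it is a candidate for an implicit relation.
    """
    labels = []
    for _first, second in zip(sentences, sentences[1:]):
        words = {w.strip(".,").lower() for w in second.split()}
        found = sorted(words & EXPLICIT)
        labels.append(("EXPLICIT", found[0]) if found else ("IMPLICIT", None))
    return labels


implicit_pair = [
    "pilkaalath niravadhi svadeshikal bekkarute paatha pinthutarnnu.",
    "mattu chilaraakatte kaayalil svadesheeyamaaya Reethiyil kayal nikathi "
    "krishi bhoomi uNdaakkiyetuthu.",
]
explicit_pair = [
    "prameham oru nishabdha kolayaaLiyaaN.",
    "ennaal niyanthrichu nirthiyaal kuzhappamilla",
]
print(label_adjacent_pairs(implicit_pair))
print(label_adjacent_pairs(explicit_pair))
```

Because the check looks only for free words, connectives realized as suffixes would be missed and their sentence pairs wrongly proposed as implicit candidates.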
           4      Rule Based Approach
               Malayalam is a language of the Dravidian family in which words are agglutinated. For this work
           we collected Malayalam sentences from websites; the resulting document consists of 3000 sentences.