INTERNATIONAL JOURNAL OF SCIENTIFIC & TECHNOLOGY RESEARCH VOLUME 8, ISSUE 11, NOVEMBER 2019 ISSN 2277-8616

Hindi-English Neural Machine Translation Using Attention Model

Charu Verma, Aarti Singh, Swagata Seal, Varsha Singh, Iti Mathur

Abstract: Translation is the technique by which a system translates text from a source natural language to a target natural language so that the original message is retained in the target language. Deep neural networks are powerful models that have achieved remarkable results on challenging learning tasks such as visual object recognition and speech recognition, and they work well whenever large training sets are available. This paper presents Hindi-to-English machine translation on a Hindi-English parallel corpus in which a supervised learning algorithm is applied with an attention model: one Recurrent Neural Network maps the input sequence to a vector of fixed dimensionality, and another Recurrent Neural Network decodes the target sequence from that vector. We show how neural machine translation is a better way to translate data from a source language to a target language.

Index Terms: Machine Translation, Deep Learning, Neural Machine Translation, LSTM

1. INTRODUCTION
To solve a particular problem, people need to discuss or share their ideas, but language understanding is a big gap. Machine translation provides access to information written in an unknown language, resolves low-level barriers in communication, and increases productivity. Translation can also be performed by humans, who provide perfect translations, so why is there a need for machine translation when it provides inferior translation quality for text with ambiguous words and sentences? Human translation is very expensive, and translators are hard to find (they require knowledge of both source and target languages), while machine translation is less expensive and is available at the click of a button on every device, such as laptops, mobiles and tablets. Current systems are able to take text in one language as input and give text in another language as output, and there are more than a hundred machine translation technology providers. For example, 'Google Translate' is a powerful translation service developed by Google that supports text and document conversion for more than a hundred languages; 'Yandex Translate' is a web service provided by Yandex that translates words, phrases and whole texts in ninety-five languages, and can translate the entire text of a website given only its URL; the 'IBM-Watson' translator translates documents from one language to another while preserving file formatting, with supported file types including MS Office, PDF, TXT, HTML, JSON, XML and OpenOffice. These technologies use deep learning to improve their accuracy and speed, and they provide a good interface so users can use them easily. In this paper, we present the experiments that we performed for training and evaluation of our Neural Machine Translation (NMT) system. The rest of the paper is structured as follows: Section 2 reviews the literature, Section 3 explains our proposed model, Section 4 shows the evaluation performed on our system, and Section 5 concludes the paper.

————————————————
Charu Verma is member technical staff, Next Generation Technologies Research Foundation, India. E-mail: vermacharu284@gmail.com
Aarti Singh is member technical staff, Next Generation Technologies Research Foundation, India. E-mail: say2aru19@gmail.com
Swagata Seal is member technical staff, Next Generation Technologies Research Foundation, India. E-mail: swagata.sita@gmail.com
Varsha Singh is member technical staff, Next Generation Technologies Research Foundation, India. E-mail: varshasingh773@gmail.com
Iti Mathur is an Associate Professor in the Department of Computer Science, Banasthali Vidyapith, India. E-mail: mathur_iti@rediffmail.com
————————————————

2 LITERATURE REVIEW
Luong et al. [1] showed how attention-based techniques improve the quality of Neural Machine Translation (NMT) models. Cho et al. [2] elaborated the different properties of the encoder-decoder model used in NMT systems. Wu et al. [3] explained the working of Google's NMT system and showed how the translation process is done end to end. Sennrich et al. [4] showed how NMT system performance degrades when out-of-vocabulary words are found in the text; they also presented an approach for dealing with this kind of problem. Luong et al. [5] addressed the problem of dealing with rare words in text while performing experiments with an NMT system. Tu et al. [6] discussed the issue of model coverage in an NMT system. Sennrich et al. [7] showed how an NMT system can be improved by using more monolingual data. Further, Sennrich et al. [8] showed the working of their NMT system. Joshi et al. [9] developed a mechanism to write in Hindi using English; they used statistical machine learning to predict a word when some of its initial characters are typed. Joshi et al. [10] also developed an Example Based Machine Translation system. Joshi et al. [11] evaluated the system developed and compared its performance with other popularly available MT engines. Gupta et al. [12] developed a rule-based stemmer for Urdu, implementing several rules, and further used this stemmer in the evaluation of some English-Urdu MT systems [13]. Singh et al. [14] developed a POS tagger for Marathi using statistical machine learning. Bhalla et al. [15] developed a procedure for transliteration of named entities from English to Punjabi. Joshi et al. [16] evaluated several open-domain MT engines, and Gupta et al. [17] did the same for English-Urdu MT engines. Singh et al. [18] developed a POS tagger for Marathi using supervised learning. Joshi et al. [19] further developed a technique for using machine learning in evaluating MT engines. Tyagi et al. [20] [21] developed an approach for translating complex English sentences by first simplifying them and then translating them into Hindi. Yogi et al. [22] developed an approach to identify candidate translations which are good for post-editing. Gupta et al. [23] further extended their stemmer by adding derivational rules to the inflectional stemmer. Asopa et al. [24] developed a mechanism for chunking Hindi sentences using a rule-based approach. Gupta et al. [25] developed a rule-based lemmatizer for Urdu which was an extension of their stemmer. Kumar et al. [26] developed several machine-learning-based classifiers for identifying different senses of a word in Hindi. Joshi et al. [27] developed a mechanism to estimate the quality of English-Hindi MT engines. Chopra et al. [28] [29] developed a named entity recognition and tagging tool for Hindi using several machine learning approaches. Gupta et al. [30] developed a POS tagger for Urdu using a machine learning approach. Mathur et al. [31] developed an ontology matching evaluation tool which used the MT engine developed by Joshi et al. Chopra et al. [32] developed a mechanism for rewriting English sentences and then translating them into Hindi, which significantly improved the performance of their MT engine. Joshi et al. [33] investigated some approaches to classifying documents and further suggested an approach for effective classification of text documents. Singh et al. [34] developed an approach to automatically generate transfer grammar rules, which significantly improved the development process of their transfer-based MT engine. Singh et al. [35] developed an approach for text processing of Hindi documents using deep neural networks; they further developed this approach to mine textual data from web documents [36]. Singh et al. [37] developed a translation memory tool which worked as a subsystem in their transfer-based MT system and further improved its accuracy. Gupta et al. [38] further showed how fuzzy logic can be used in developing NLP applications. Gupta et al. [39] used several NLP tools in preprocessing tweets that they extracted from the web, and found that this approach improves the accuracy of their machine learning model for classifying tweets. Gupta et al. [40] developed an approach which helped in identification and classification of multiword expressions in Urdu documents. Nathani et al. [41] developed a rule-based inflectional stemmer for Sindhi written in Devanagari script. Asopa et al. [42] developed a shallow parser for Hindi using conditional random fields. Gupta et al. [43] showed the use of machine learning approaches in developing NLP applications. Gupta et al. [44] used fuzzy operations in analyzing sentiments of tweets on several topics; this approach showed very promising results over traditional approaches. Sharma and Joshi [45] developed a rule-based word sense disambiguation approach for Hindi, which gave an accuracy of 73%. Katyayan and Joshi [46] studied various approaches for correct identification of sarcastic phrases in English documents. Gupta and Joshi [47] showed how tweets can be classified using NLP techniques, and how negative sentences can be handled using NLP approaches. Shree et al. [48] showed the differences between the Hindi and English languages and the problems that current state-of-the-art MT systems face while translating text. Ahmed et al. [49] showed how an MT system can be developed using an intermediate language which is related to both languages; they developed an Arabic-Hindi MT system using Urdu as the intermediate language. They further performed the same study using English, and found that given a large-sized corpus, even English, which is unrelated to Arabic and Hindi, can be used to develop an MT system [50]. Seal and Joshi [51] developed a rule-based inflectional stemmer for Assamese, which showed very good results. Singh and Joshi [52] showed the development of POS taggers for Hindi using different Markov models, and concluded that the hidden-Markov-model-based tagger produced the best results among several Markov-based POS taggers. Pandey et al. [53] showed how NLP approaches can help in developing a better ranking model for web documents; they used particle swarm optimization and NLP approaches to improve the performance of their ranking model. Singh and Joshi [54] developed a rule-based approach for identifying anaphora in Hindi discourses. Sinha et al. [55] developed a sentiment analyzer for Facebook posts using the methods developed by Gupta et al. Sharma et al. [56] [57] used some of the Markov-model-based approaches used by Singh et al. to develop their association classification model. Similar approaches were used by Goyal et al. [58] [59] for their models.

3 PROPOSED MODEL

3.1 Supervised Machine Learning
Supervised learning is a form of machine learning in which data comes in input-output pairs. In supervised learning the input could be anything, such as sensor measurements, pictures, emails or messages, and the output may be a label, a real number, or in some cases a vector or another structure (for example: negative or positive, dog or cat, spam or not spam, right or wrong). The training set is:

{(x_i, y_i)}, i = 1 to N

In this equation, each element x_i among the N examples is a feature vector (a vector in which each dimension j = 1, ..., D contains a value that describes the example in some way; that value is called a feature and is denoted x(j)), and y_i is the label of input x_i. For example, if the input x_i represents a person, then the first feature x(1) could contain gender, the second feature x(2) weight in kg, the third feature x(3) height in cm, and so on; together x(1), x(2), x(3), ... form the feature vector. This paper presents Hindi-to-English machine translation on a Hindi-English parallel corpus in which a supervised learning algorithm is applied with an attention model: one Recurrent Neural Network maps the input sequence to a vector of fixed dimensionality, and another Recurrent Neural Network decodes the target sequence from that vector.

3.2 Preprocessing Step
Preprocessing of the data is necessary before training the network. Real-world data are generally incomplete, noisy and inconsistent; to overcome these problems the data need to be preprocessed. First the text is cleaned by removing extra spaces and other unnecessary symbols from the sentences. The network does not understand the text format, so conversion of text into vectors is necessary. In sequence-to-sequence translation every word in a sentence needs a unique identity, so each word in a language is represented as a one-hot vector: a giant vector containing zeros everywhere except for a single one. In the given example, the sentence contains several words, each shown as a vector; every sentence in the corpus contains SOS, which represents the start of the sentence, and EOS, which represents the end of the sentence. An example of this is shown in Figure 1.

[Fig. 1: Assignment of Weights — each word is assigned a one-hot vector of the form <0 0 0 0 1 0 0 ...>]

3.3 Training of NMT using Attention Model
We developed our Neural Machine Translation system by jointly learning to align and translate. Here, attention can be read as a neural extension of the encoder-decoder model. The encoder-decoder model has several limitations which are resolved by attention: a neural network works on vectors, so in the encoder-decoder approach it must compress all the important information of the source sentence into a single vector, which makes it difficult for the network to work with long sentences, mainly those sentences which are longer than the training corpus sentences. This is shown in Figure 1. In the decoding phase, at every time step t, we first take as input the hidden state h_t at the top layer of the stacking LSTM. To capture relevant source-side information for finding the current target word y_t, a context vector c_t is used and shared across the subsequent steps. Once the model knows how the context vector c_t is derived, then given the target hidden state h_t and the source-side context vector c_t, a simple concatenation layer combines the information from both vectors and produces an attentional hidden state as follows:

h~_t = tanh(W_c [c_t ; h_t])

The attentional vector h~_t is then fed through a softmax layer, which yields the predictive distribution:

p(y_t | y_<t, x) = softmax(W_s h~_t)

4 EVALUATION

TABLE 1
Evaluation Results of BLEU at Document Level

        Baseline NMT    NMT with Attention Model
Doc1    0.375784        0.552926
Doc2    0.338189        0.507185
Doc3    0.363287        0.506252
Doc4    0.358607        0.533271
Doc5    0.361307        0.515314

Table 2 shows the results of evaluation done by human annotators. Table 3 shows the correlation between these studies.

TABLE 2
Results of Human Evaluation at Document Level

        Baseline NMT    NMT with Attention Model
Doc1    0.501443        0.552926
Doc2    0.394018        0.507185
Doc3    0.333809        0.506252
Doc4    0.428654        0.533271
Doc5    0.456911        0.515314

TABLE 3
Pearson Correlation Between Human and BLEU Evaluation Metrics for all Engines

Engine                      Human-BLEU Correlation Score
Baseline NMT                0.467728
NMT with Attention Model    1.0

5 CONCLUSION
In this paper, we showed the development of Hindi-English MT using a neural approach. We tested the developed engines on 500 sentences. For this, we did both human and automatic evaluation. In the automatic evaluation, we found that BLEU produced better results for the Attention Based Model, which was an improvement over the Baseline Model.

REFERENCES
[1] Luong, M.T., Pham, H. and Manning, C.D., 2015. Effective approaches to attention-based neural machine translation.