Citation Summary
| Citing sentences |
|---|
| J94-4003 1 16:730 Substantial application of semantic or pragmatic knowledge about the word and its context requires compiling huge amounts of knowledge, the usefulness of which for practical applications in broad domains has not yet been proven (e.g. , Lenat et al. 1990; Nirenburg et al. 1988; Chodorow, Byrd, and Heidorn 1985). |
| J94-4003 2 695:730 It seems, however, that Brown et al. expect that target word selection would be determined mainly by translation probabilities (the second factor in the above term), which should be derived from a bilingual corpus (Brown et al. 1990, p. 79). |
| W07-0401 3 52:352 Among all possible target language sentences, we will choose the sentence with the highest probability: $\hat{e}_1^I = \operatorname{argmax}_{I, e_1^I} \{ \Pr(e_1^I \mid f_1^J) \}$ (1) $= \operatorname{argmax}_{I, e_1^I} \{ \Pr(e_1^I) \, \Pr(f_1^J \mid e_1^I) \}$ (2) This decomposition into two knowledge sources is known as the source-channel approach to statistical machine translation (Brown et al. , 1990). |
| P95-1050 4 4:64 1 Introduction In a number of recent studies it has been shown that word translations can be automatically derived from the statistical distribution of words in bilingual parallel texts (e.g. Catizone, Russell & Warwick, 1989; Brown et al. , 1990; Dagan, Church & Gale, 1993; Kay & Röscheisen, 1993). |
| W99-0602 5 10:212 Bilingual alignments have so far shown that they can play multiple roles in a wide range of linguistic applications, such as computer assisted translation (Isabelle et al. , 1993; Brown et al. , 1990), terminology (Dagan and Church, 1994) lexicography (Langlois, 1996; Klavans and Tzoukermann, 1995; Melamed, 1996), and cross-language information retrieval (Nie et al. , * This research was funded by the Canadian Department of Foreign Affairs and International Trade (http://~.dfait-maeci.gc.ca/), via the Agence de la francophonie (http://~. |
| W08-0301 6 36:195 As to the pioneering IBM word-based SMT models (Brown et al., 1990), IBM models 3, 4 and 5 handle spurious source words by considering them as corresponding to a particular EMPTY word token on the English side, and by the fertility model which allows the English EMPTY to generate a certain number of foreign words. |
| P91-1034 7 5:89 INTRODUCTION An alluring aspect of the statistical approach to machine translation rejuvenated by Brown et al. \[Brown et al. , 1988, Brown et al. , 1990\] is the systematic framework it provides for attacking the problem of lexical disambiguation. |
| P91-1034 8 31:89 From the Viterbi alignments for 1,002,165 pairs of short French and English sentences from the Canadian Hansard data \[Brown et al. , 1990\], we have extracted a set of 12,028,485 connections. |
| P91-1034 9 17:89 STATISTICAL TRANSLATION Following Brown et al. \[Brown et al. , 1990\], we choose as the translation of a French sentence F that sentence E for which $\Pr(E \mid F)$ is greatest. |
| P91-1034 10 81:89 This system is an enhanced version of the one described by Brown et al. \[Brown et al. , 1990\] in that it uses a trigram language model, and has a French vocabulary of 57,802 words, and an English vocabulary of 40,809 words. |
| P91-1034 11 26:89 Brown et al. \[Brown et al. , 1990\], show an example of such an automatically derived alignment in their Figure 3. |
| P91-1034 12 23:89 The translation model used by Brown et al. \[Brown et al. , 1990\] incorporates the concept of an alignment in which each word in E acts independently to produce some of the words in F. If we denote a typical alignment by A, then we can write the probability of F given E as a sum over all possible alignments: $\Pr(F \mid E) = \sum_{A} \Pr(F, A \mid E)$. |
| A94-1006 13 115:178 have been used in statistical machine translation (Brown et al. , 1990), terminology research and translation aids (Isabelle, 1992; Ogden and Gonzales, 1993; van der Eijk, 1993), bilingual lexicography (Klavans and Tzoukermann, 1990; Smadja, 1992), word-sense disambiguation (Brown et al. , 1991b; Gale et al. , 1992) and information retrieval in a multilingual environment (Landauer and Littman, 1990). |
| A94-1006 14 114:178 3 Bilingual Task: An Application for Word Alignment 3.1 Sentence and word alignment Bilingual alignment methods (Warwick et al. , 1990; Brown et al. , 1991a; Brown et al. , 1993; Gale and Church, 1991b; Gale and Church, 1991a; Kay and Roscheisen, 1993; Simard et al. , 1992; Church, 1993; Kupiec, 1993a; Matsumoto et al. , 1993; Dagan et al. , 1993). |
| P98-1117 15 99:146 RALI/Salign The second method proposed by RALI is based on a dynamic programming scheme which uses a score function derived from a translation model similar to that of (Brown et al. , 1990). |
| P96-1021 16 6:177 1 Motivation The statistical translation model introduced by IBM (Brown et al. , 1990) views translation as a noisy channel process. |
| C02-1162 17 49:141 WordNet was constructed with what is commonly referred to as a differential theory of lexical semantics (Miller et al. , 1990), which aims to differentiate word senses by grouping words into synonym sets (synsets), which are constructed as to allow a user to easily distinguish between different senses of a word. |
| C02-1162 18 131:141 There has also been a lot of work involving bilingual corpora, including the IBM Candide project (Brown et al. , 1990), which used statistical data to align words in sentence pairs from parallel corpora in an unsupervised fashion through the EM algorithm; Church (1993) used character frequencies to align words in a parallel corpus; Smadja et al. |
| C02-1162 19 48:141 4 Experiment Details 4.1 Ontologies The ontologies selected for alignment in this work were the American English WordNet (Miller et al. , 1990) version 1.7, and the Mandarin Chinese HowNet (Dong, 1988).2 There are two main reasons why these particular two ontologies were chosen: they represent very different languages, and were constructed with very different approaches. |
| C96-1033 20 20:167 A small community have experimented with either purely statistical approaches (Brown et al. , 1990; Schütze, 1993) or connectionist based approaches (Berg, 1991; Miikkulainen and Dyer, 1991; Jain, 1991; Wermter and Weber, 1994). |
| D07-1003 21 56:243 Similarly, Murdock and Croft (2005) adopted a simple translation model from IBM model 1 (Brown et al. , 1990; Brown et al. , 1993) and applied it to QA. |
| C00-2168 22 23:87 In some other approaches, parameters and parameter values are either not sought out or are expected to be obtained automatically (e.g., Brown et al. 1990; Goldstein 1998), and, while holding promise for the future as a potential component of an elicitation system, cannot, at this time, form the basis of an entire system of this kind. |
| W03-2804 23 62:186 The evaluator just needs to indicate whether each of the marked items is an actual error or whether it can rather be considered as an alternative translation. This metric resembles very much the one proposed in (Brown et al, 1990). |
| A92-1014 24 18:224 Though several studies with similar objectives have been reported \[Church, 1988\], \[Zernik and Jacobs, 1990\], \[Calzolari and Bindi, 1990\], \[Garside and Leech, 1985\], \[Hindle and Rooth, 1991\], \[Brown et al. , 1990\], they require that sample corpora be correctly analyzed or tagged in advance. |
| W06-3110 25 34:125 As a decision rule, we obtain: $\hat{e}_1^I = \operatorname{argmax}_{I, e_1^I} \left\{ \sum_{m=1}^{M} \lambda_m h_m(e_1^I, f_1^J) \right\}$ (3) This approach is a generalization of the source-channel approach (Brown et al. , 1990). |
| P95-1035 26 24:172 For example, (Beaven, 1992a) employs a chart to avoid recalculating the same combinations of signs more than once during testing, and (Popowich, 1994) proposes a more general technique for storing which rule applications have been attempted; (Brew, 1992) avoids certain pathological cases by employing global constraints on the solution space; researchers such as (Brown et al. , 1990) and (Chen and Lee, 1994) provide a system for bag generation that is heuristically guided by probabilities. |
| C02-1158 27 16:146 However, in Statistical MT (Brown et al. , 1990), large amounts of translation examples are required in order to obtain high-quality translation. |
| P04-1068 28 5:207 1 Introduction Compilation of translation lexicons is a crucial process for machine translation (MT) (Brown et al. , 1990) and cross-language information retrieval (CLIR) systems (Nie et al. , 1999). |
| W08-0302 29 8:197 Phrase-based MT systems are straightforward to train from parallel corpora (Koehn et al., 2003) and, like the original IBM models (Brown et al., 1990), benefit from standard language models built on large monolingual, target-language corpora (Brants et al., 2007). |
| J04-2003 30 45:412 The translation models they presented in various papers between 1988 and 1993 (Brown et al. 1988; Brown et al. 1990; Brown, Della Pietra, Della Pietra, and Mercer 1993) are commonly referred to as IBM models 1-5, based on the numbering in Brown, Della Pietra, Della Pietra, and Mercer (1993). |
| J93-1001 31 322:408 (Sinclair et al. 1987; p. xv) The experience of writing the COBUILD dictionary is documented in Sinclair (1987), a collection of articles from the COBUILD project; see Boguraev (1990) for a strong positive review of this collection. |
| J93-1001 32 39:408 (Waibel and Lee 1990; p. 4) A number of data collection efforts have helped to bring about this change in the speech community, especially the Texas Instruments' Digit Corpus (Leonard 1984), TIMIT and the DARPA Resource Management (RM) Database (Price et al. 1988). |
| H91-1026 33 3:200 Much of the current excitement surrounding parallel texts was initiated by Brown et al. (1990), who outline a self-organizing method for using these parallel texts to build a machine translation system. |
| A94-1016 34 115:136 Instead, we are planning to use an English language model on the output, in a manner similar to that done by speech and statistical translation systems (Brown et al. , 1990). |
| W02-0902 35 6:199 The seminal work by Brown et al. [1990] at IBM on the Candide system laid the foundation for much of the current work in Statistical Machine Translation (SMT). |
| N03-1018 36 130:188 We trained an IBM style translation model (Brown et al. , 1990) using GIZA++ (Och and Ney, 2000) on the 500 test lines used in our experiments paired with corresponding English lines from an online Bible. |
| C02-1064 37 10:207 In such translation, given a source language text, S, the translated text, T, in the target language that maximizes the probability $P(T \mid S)$ is selected as the most appropriate translation, $T_{\text{best}}$, which is represented as (Brown et al. , 1990) $T_{\text{best}} = \operatorname{argmax}_T P(T \mid S) = \operatorname{argmax}_T \left( P(S \mid T) \, P(T) \right)$. |
| C98-1106 38 63:156 2The WORD SPACE method is closely related to Latent Semantic Indexing (LSI) (Deerwester et al., 1990), where document-by-word matrices are processed by SVD instead of word-by-word matrices. |
| C98-1106 39 15:156 1In fact, this is partly shown by the fact that many MT systems have substitutable domain-dependent (or "user" ) dictionaries . relies on translation probabilities estimated from large bilingual corpora (Brown et al., 1990)(Brown et al., 1991). |
| P91-1023 40 41:211 Table 3: An Entry in a Probabilistic Dictionary (from Brown et al. , 1990). Columns: English, French, $\mathrm{Prob}(\text{French} \mid \text{English})$. Rows: the, le, 0.610; the, la, 0.178; the, l', 0.083; the, les, 0.023; the, ce, 0.013; the, il, 0.012; the, de, 0.009; the, à, 0.007; the, que, 0.007. Table 4: A Bilingual Concordance bank/banque ("money" sense) and the governor of the et le gouverneur de la 800 per cent in one week through % ca une semaine ~ cause d' ut~ bank/banc ("place" sense) bank of canada have frequently banque du canada ont fréquemment bank action. |
| P91-1023 41 40:211 Aligning sentences is just a first step toward constructing a probabilistic dictionary (Table 3) for use in aligning words in machine translation (Brown et al. , 1990), or for constructing a bilingual concordance (Table 4) for use in lexicography (Klavans and Tzoukermann, 1990). |
| P91-1023 42 6:211 Introduction Researchers in both machine translation (e.g. , Brown et al, 1990) and bilingual lexicography (e.g. , Klavans and Tzoukermann, 1990) have recently become interested in studying bilingual corpora, bodies of text such as the Canadian Hansards (parliamentary debates) which are available in multiple languages (such as French and English). |
| P91-1023 43 1:211 A PROGRAM FOR ALIGNING SENTENCES IN BILINGUAL CORPORA William A. Gale Kenneth W. Church AT&T Bell Laboratories 600 Mountain Avenue Murray Hill, NJ, 07974 ABSTRACT Researchers in both machine translation (e.g. , Brown et al. , 1990) and bilingual lexicography (e.g. , Klavans and Tzoukermann, 1990) have recently become interested in studying parallel texts, texts such as the Canadian Hansards (parliamentary proceedings) which are available in multiple languages (French and English). |
| C08-1045 44 6:150 1 Introduction A wide variety of machine translation (MT) methods are being studied (Nagao, 1996; Brown et al., 1990; Vogel et al., 2003), but to obtain high-quality translations between languages belonging to different families that are alien to each other is difficult. |
| W08-0315 45 20:95 2 Ngram-based SMT System Our translation system implements a log-linear model in which a foreign language sentence $f_1^J = f_1, f_2, \ldots, f_J$ is translated into another language $e_1^I = e_1, e_2, \ldots, e_I$ by searching for the translation hypothesis $\hat{e}_1^I$ maximizing a log-linear combination of several feature models (Brown et al., 1990): $\hat{e}_1^I = \operatorname{argmax}_{e_1^I} \left\{ \sum_{m=1}^{M} \lambda_m h_m(e_1^I, f_1^J) \right\}$ where the feature functions $h_m$ refer to the system models and the set of $\lambda_m$ refers to the weights corresponding to these models. |
| D07-1055 46 49:198 The posterior probability $\Pr(e_1^I \mid f_1^J)$ is modeled directly using a log-linear combination of several models (Och and Ney, 2002): $p_{\lambda_1^M}(e_1^I \mid f_1^J) = \dfrac{\exp\left( \sum_{m=1}^{M} \lambda_m h_m(e_1^I, f_1^J) \right)}{\sum_{I', {e'}_1^{I'}} \exp\left( \sum_{m=1}^{M} \lambda_m h_m({e'}_1^{I'}, f_1^J) \right)}$ (1) This approach is a generalization of the source-channel approach (Brown et al. , 1990). |
| P96-1041 47 4:170 1 Introduction Smoothing is a technique essential in the construction of n-gram language models, a staple in speech recognition (Bahl, Jelinek, and Mercer, 1983) as well as many other domains (Church, 1988; Brown et al. , 1990; Kernighan, Church, and Gale, 1990). |
| C92-2079 48 32:273 For P. Brown et al. of IBM, the goal is to compute the parameters of the probabilistic machine translation model that they want to build \[Brown et al. , 1988 ; Brown et al. , 1990\]. |
| P93-1001 49 5:194 Introduction Parallel texts have recently received considerable attention in machine translation (e.g. , Brown et al, 1990), bilingual lexicography (e.g. , Klavans and Tzoukermann, 1990), and terminology research for human translators (e.g. , Isabelle, 1992). |
| P93-1001 50 1:194 Char_align: A Program for Aligning Parallel Texts at the Character Level Kenneth Ward Church AT&T Bell Laboratories 600 Mountain Avenue Murray Hill NJ, 07974-0636 kwc@research.att.com Abstract There have been a number of recent papers on aligning parallel texts at the sentence level, e.g., Brown et al (1991), Gale and Church (to appear), Isabelle (1992), Kay and Rösenschein (to appear), Simard et al (1992), Warwick-Armstrong and Russell (1990). |
| D08-1065 51 39:259 Statistical MT (Brown et al., 1990; Och and Ney, 2004) can be described as a mapping of a word sequence F in the source language to a word sequence E in the target language; this mapping is produced by the MT decoder (F). |
| N04-1023 52 20:201 1.1 Generative Models for MT The seminal IBM models (Brown et al. , 1990) were the first to introduce generative models to the MT task. |
| N04-1023 53 10:201 1 Introduction The noisy-channel model (Brown et al. , 1990) has been the foundation for statistical machine translation (SMT) for over ten years. |
| P03-1019 54 14:237 2 is the so-called source-channel approach to statistical machine translation (Brown et al. , 1990). |
| W03-0304 55 3:87 1 Introduction Since the pioneering work of the IBM machine translation team almost 15 years ago (Brown et al. , 1990), statistical methods have proven to be valuable tools in approaching the automation of translation. |
| J92-4003 56 23:185 Language Models Figure 1 shows a model that has long been used in automatic speech recognition (Bahl, Jelinek, and Mercer 1983) and has recently been proposed for machine translation (Brown et al. 1990) and for automatic spelling correction (Mays, Damerau, and Mercer 1990). |
| W03-0611 57 7:143 This is broadly similar in concept to the use of parallel multilingual corpora in machine translation (Brown et al. , 1990), except that our parallel corpus consists of texts and underlying numeric data, not texts and their translations. |
| H93-1040 58 20:135 The 1992 Evaluation tested three research MT systems: CANDIDE (IBM, French English) uses a statistical language modeling technique based on speech recognition algorithms (see Brown et al. , 1990). |
| A00-2011 59 10:171 Many corpus-based MT systems require parallel corpora (Brown et al. , 1990; Brown et al. , 1991; Gale and Church, 1991; Resnik, 1999). |
| E93-1015 60 5:236 Some applications using information extracted from bilingual corpora are statistical MT (\[Brown et al. , 1990\]), bilingual lexicography (\[Catizone et al. , 1989\]), word sense disambiguation (\[Gale et al. , 1992\]), and multilingual information retrieval (\[Landauer and Littmann, 1990\]). |
| W06-3108 61 38:203 As a decision rule, we obtain: $\hat{e}_1^I = \operatorname{argmax}_{I, e_1^I} \left\{ \sum_{m=1}^{M} \lambda_m h_m(e_1^I, f_1^J) \right\}$ (3) This approach is a generalization of the source-channel approach (Brown et al. , 1990). |
| H05-1010 62 11:145 For example, when considering whether to align two words in the IBM models (Brown et al. , 1990), one cannot easily include information about such features as orthographic similarity (for detecting cognates), presence of the pair in various dictionaries, similarity of the frequency of the two words, choices made by other alignment systems on this sentence pair, and so on. |
| H05-1010 63 16:145 Word alignment is cast as a maximum weighted matching problem (Cormen et al. , 1990) in which each pair of words $(e_j, f_k)$ in a sentence pair $(e, f)$ is associated with a score $s_{jk}(e, f)$ reflecting the desirability of the alignment of that pair. |
| H05-1010 64 6:145 1 Introduction The standard approach to word alignment from sentence-aligned bitexts has been to construct models which generate sentences of one language from the other, then fitting those generative models with EM (Brown et al. , 1990; Och and Ney, 2003). |
| C98-2225 65 27:179 2 Review: Noisy Channel Model The statistical translation model introduced by IBM (Brown et al., 1990) views translation as a noisy channel process. |
| N04-1033 66 8:290 1 Introduction In statistical machine translation, we are given a source language (French) sentence $f_1^J = f_1 \ldots f_j \ldots f_J$, which is to be translated into a target language (English) sentence $e_1^I = e_1 \ldots e_i \ldots e_I$. Among all possible target language sentences, we will choose the sentence with the highest probability: $\hat{e}_1^I = \operatorname{argmax}_{e_1^I} \{ \Pr(e_1^I \mid f_1^J) \}$ (1) $= \operatorname{argmax}_{e_1^I} \{ \Pr(e_1^I) \, \Pr(f_1^J \mid e_1^I) \}$ (2) The decomposition into two knowledge sources in Equation 2 is known as the source-channel approach to statistical machine translation (Brown et al. , 1990). |
| P91-1017 67 10:208 Substantial application of semantic or pragmatic knowledge about the word and its context for broad domains requires compiling huge amounts of knowledge, whose usefulness for practical applications has not yet been proven (Lenat et al. , 1990; Nirenburg et al. , 1988; Chodorow et al. , 1985). |
| W06-1905 68 52:193 IBM applied the noisy channel model idea to translation of sentences from aligned parallel corpora, where the source language sentence is the distorted signal, and the target language sentence is the original signal (Brown et al. , 1990). |
| P05-2016 69 8:31 The first work on SMT done at IBM (Brown et al. , 1990; Brown et al. , 1992; Brown et al. , 1993; Berger et al. , 1994), used a noisy-channel model, resulting in what Brown et al. |
| P06-1092 70 7:176 1 Introduction The noisy channel model approach is being successfully applied to various natural language processing (NLP) tasks, such as speech recognition (Jelinek, 1985), spelling correction (Kernighan et al. , 1990), machine translation (Brown et al. , 1990), etc. In this approach an NLP system is composed of two modules: one is a task-dependent part (an acoustic model for speech recognition) which describes a relationship between an input signal sequence and a word, the other is a language model (LM) which measures the likelihood of a sequence of words as a sentence in the language. |
| P95-1033 71 6:198 1 Introduction Parallel corpora have been shown to provide an extremely rich source of constraints for statistical analysis (e.g. , Brown et al. 1990; Gale & Church 1991; Gale et al. 1992; Church 1993; Brown et al. 1993; Dagan et al. 1993; Dagan & Church 1994; Fung & Church 1994; Wu & Xia 1994; Fung & McKeown 1994). |
| P95-1033 72 98:198 A simpler, related idea of penalizing distortion from some ideal matching pattern can be found in the statistical translation (Brown et al. 1990; Brown et al. 1993) and word alignment (Dagan et al. 1993; Dagan & Church 1994) models. |
| J06-4004 73 9:388 The first SMT systems were developed in the early nineties (Brown et al. 1990, 1993). |
| P06-1067 74 10:241 N-gram language models have also been used in Statistical Machine Translation (SMT) as proposed by (Brown et al. , 1990; Brown et al. , 1993). |
| W04-3009 75 31:174 They simplified a statistical machine translation (MT) model called an IBM model (Brown et al. , 1990), and tried to construct a general post-processor that can correct errors generated by any speech recognizer. |
| W04-3009 76 46:174 2 Noisy Channel Error Correction Model The noisy channel error correction framework has been applied to a wide range of problems, such as spelling correction, statistical machine translation, and ASR error correction (Brill and Moore, 2000; Brown et al. , 1990; Ringger and Allen, 1996). |
| W04-3009 77 55:174 Following (Brown et al. , 1990), we refer to the number of post-channel words $o_i$ produced by a pre-channel word $w_i$ as a fertility. |
| P98-2230 78 27:182 2 Review: Noisy Channel Model The statistical translation model introduced by IBM (Brown et al. , 1990) views translation as a noisy channel process. |
| W98-1307 79 40:307 This transducer always adopts an "onward" form, in which the output substrings are assigned to the edges in such a way that they are as "close" to the initial state as they can be (see Oncina et al. , 1993 \[15\], Reutenauer, 1990 \[22\]; for a recent reelaboration of these concepts see Mohri, 1997 \[13\]). |
| P04-1062 80 8:264 But for other tasks, such as machine translation (Brown et al. , 1990), the chief merit of unlabeled data is simply that nothing else is available; unsupervised parameter estimation is notorious for achieving mediocre results. |
| P04-1062 81 83:264 2.3 Prior work DA was originally described as an algorithm for clustering data in $\mathbb{R}^N$ (Rose et al. , 1990). |
| P91-1022 82 7:182 INTRODUCTION Recent work by Brown et al. , \[Brown et al. , 1988, Brown et al. , 1990\] has quickened anew the long dormant idea of using statistical techniques to carry out machine translation from one natural language to another. |
| P91-1022 83 20:182 THE HANSARD CORPORA Brown et al. , \[Brown et al. , 1990\] describe the process by which the proceedings of the Canadian Parliament are recorded. |
| W01-1401 84 11:167 Statistical Machine Translation (SMT): SMT learns models for translation from corpora and dictionaries and searches for the best translation according to the models in run-time (Brown et al. , 1990; Knight, 1997; Ney et al. , 2000). |
| W01-1401 85 135:167 Knowledge of EBMT Many EBMT studies (Sato and Nagao, 1990; Sato, 1991; Furuse et al. , 1994; Sadler, 1989) assume the existence of a bank of aligned bilingual trees or a set of translation patterns. |
| W01-1401 86 14:167 EBMT retrieves the translation examples that are best matched to an input expression and adjusts the examples to obtain the translation (Nagao, 1981; Sadler 1989; Sato and Nagao, 1990; Sumita and Iida, 1991; Kitano, 1993; Furuse et al. , 1994; Watanabe and Maruyama, 1994; Cranias et al. , 1994; Jones, 1996; Veale and Way, 1997; Carl, 1999, Andriamanankasina et al. , 1999; Brown, 2000). |
| H01-1035 87 27:245 All corpora were automatically word-aligned by the now publicly available EGYPT system (Al-Onaizan et al. , 1999), based on IBM's Model 3 statistical MT formalism (Brown et al. , 1990). |
| J95-4004 88 144:404 Part-of-speech tagging is an active area of research; a great deal of work has been done in this area over the past few years (e.g. , Jelinek 1985; Church 1988; Derose 1988; Hindle 1989; DeMarcken 1990; Merialdo 1994; Brill 1992; Black et al. 1992; Cutting et al. 1992; Kupiec 1992; Charniak et al. 1993; Weischedel et al. 1993; Schutze and Singer 1994). |
| J95-4004 89 154:404 Almost all recent work in developing automatically trained part-of-speech taggers has been on further exploring Markov-model based tagging (Jelinek 1985; Church 1988; Derose 1988; DeMarcken 1990; Merialdo 1994; Cutting et al. 1992; Kupiec 1992; Charniak et al. 1993; Weischedel et al. 1993; Schutze and Singer 1994). |
| J95-4004 90 13:404 An effort has recently been undertaken to create automated machine translation systems in which the linguistic information needed for translation is extracted automatically from aligned corpora (Brown et al. 1990). |
| J95-4004 91 11:404 Endemic structural ambiguity, which can lead to such difficulties as trying to cope with the many thousands of possible parses that a grammar can assign to a sentence, can be greatly reduced by adding empirically derived probabilities to grammar rules (Fujisaki et al. 1989; Sharman, Jelinek, and Mercer 1990; Black et al. 1993) and by computing statistical measures of lexical association (Hindle and Rooth 1993). |
| J97-3002 92 13:359 Parallel bilingual corpora have been shown to provide a rich source of constraints for statistical analysis (Brown et al. 1990; Gale and Church 1991; Gale, Church, and Yarowsky 1992; Church 1993; Brown et al. 1993; Dagan, Church, and Gale 1993; Department of Computer Science, University of Science and Technology, Clear Water Bay, Hong Kong. |
| N03-1010 93 3:227 1 Introduction Most of the current work in statistical machine translation builds on word replacement models developed at IBM in the early 1990s (Brown et al. , 1990, 1993; Berger et al. , 1994, 1996). |
| C08-2032 94 4:70 1 Introduction The bilingual lexicon is a crucial resource for multilingual applications in natural language processing including machine translation (Brown et al., 1990) and cross-lingual information retrieval (Nie et al., 1999). |
| N01-1015 95 146:156 The alignment problem in statistical machine translation (Brown et al. , 1990) is too general: long-distance displacement of large chunks of material may occur frequently when translating whole sentences, but are unlikely to play any role for the letter-to-sound mapping, though local reorderings do occur (Sproat, 2000). |
| C00-2145 96 103:153 The length of translation segments as well as their most likely initial and final words are calculated based on proba [Figure: Expected Coverage of the System (low to high) against Expected Translation Quality (low to high). Legend: 1: Sato and Nagao (1990); 2: Carl (1999); 3: Güvenir and Cicekli (1998); 4: Zer (1997); 5: Heyn (1996); 6: Collins (1998); 7: Brown (1997); 8: Brown et al.] |
| C00-2145 97 84:153 (Brown et al. , 1990) have a purely holistic view on languages. |
| C00-2145 98 72:153 [Figure: Atomicity of Representation, with axis labels phrase, word and molecular, mixed, holistic. Legend: 1: Sato and Nagao (1990); 2: EDGAR, Carl (1999); 4: ZERES, Zer (1997); 5: TRADOS, Heyn (1996); 7: Brown (1997); 8: Brown et al.] |
| P02-1052 99 26:169 Proceedings of the 40th Annual Meeting of the Association for (Brown et al. , 1990; Brown et al. , 1993), a number of other algorithms have been developed. |
| P08-1115 100 6:179 1 Introduction When Brown and colleagues introduced statistical machine translation in the early 1990s, their key insight, harkening back to Weaver in the late 1940s, was that translation could be viewed as an instance of noisy channel modeling (Brown et al., 1990). |
| J00-1004 101 228:231 At the same time, we believe our method has advantages over the approach developed initially at IBM (Brown et al. 1990; Brown et al. 1993) for training translation systems automatically. |
| J96-1001 102 162:576 This approach is quite different from those adopted for the translation of single words (Klavans and Tzoukermann 1990; Dorr 1992; Klavans and Tzoukermann 1996), since for single words polysemy cannot be ignored; indeed, the problem of sense disambiguation has been linked to the problem of translating ambiguous words (Brown et al. 1991; Dagan, Itai, and Schwall 1991; Dagan and Itai 1994). |
| N04-1022 103 74:155 3 Minimum Bayes-Risk Decoding Statistical Machine Translation (Brown et al. , 1990) can be formulated as a mapping of a word sequence $F$ in a source language to word sequence $E'$ in the target language that has a word-to-word alignment $A'$ relative to $F$. Given the source sentence $F$, the MT decoder $\delta(F)$ produces a target word string $E'$ with word-to-word alignment $A'$. Relative to a reference translation $E$ with word alignment $A$, the decoder performance is measured as $L((E, A), \delta(F))$. Our goal is to find the decoder that has the best performance over all translations. |
| W07-0412 104 90:166 The basic idea of using synchronous TAG for machine translation dates from the original definition (Shieber and Schabes, 1990), and has been pursued by several researchers (Abeille et al. , 1990; Dras, 1999; Prigent, 1994; Palmer et al. , 1999), but only recently in its probabilistic form (Nesson et al. , 2006). |
| W07-0412 105 20:166 Systems based on word-to-word lexicons, such as the IBM systems (Brown et al. , 1990; Brown et al. , 1993), incorporate further devices that allow reordering of words (a distortion model) and ranking of alternatives (a monolingual language model). |
| P04-3004 106 22:27 TransSearch exploits sentence alignment techniques (Brown et al 1990; Gale and Church 1990) to facilitate bilingual search at the granularity level of sentences. |
| W01-1404 107 5:145 Some of these studies have concentrated on finite-state or extended finite-state machinery, such as (Vilar and others, 1999), others have chosen models closer to context-free grammars and context-free transduction, such as (Alshawi et al. , 2000; Watanabe et al. , 2000; Yamamoto and Matsumoto, 2000), and yet other studies cannot be comfortably assigned to either of these two frameworks, such as (Brown and others, 1990) and (Tillmann and Ney, 2000). |
| P02-1024 108 11:229 1 Introduction The n-gram model has been widely applied in many applications such as speech recognition, machine translation, and Asian language text input [Jelinek, 1990; Brown et al. , 1990; Gao et al. , 2002]. |
| W02-1603 109 39:121 n. a financial institution that accepts deposits and channels the money into lending activities n. sloping land (especially the slope beside a body of water) In order to resolve structural ambiguity, we apply the concept of the statistical machine translation approach (Brown et al. , 1990). |
| P08-2036 110 6:92 1 Introduction Language models, i.e. models that assign probabilities to sequences of words, have been proven useful in a variety of applications including speech recognition and machine translation (Bahl et al., 1983; Brown et al., 1990). |
| W04-1118 111 30:191 2 Review of the Baseline System for Statistical Machine Translation 2.1 Principle In statistical machine translation, we are given a source language (French) sentence $f_1^J = f_1 \ldots f_j \ldots f_J$, which is to be translated into a target language (English) sentence $e_1^I = e_1 \ldots e_i \ldots e_I$. Among all possible target language sentences, we will choose the sentence with the highest probability: $\hat{e}_1^I = \arg\max_{e_1^I} \{ \Pr(e_1^I \mid f_1^J) \}$ (1) $= \arg\max_{e_1^I} \{ \Pr(e_1^I) \, \Pr(f_1^J \mid e_1^I) \}$ (2) The decomposition into two knowledge sources in Equation 2 is known as the source-channel approach to statistical machine translation (Brown et al., 1990). |
| J93-1006 112 23:392 Another example is the completely automatic, statistical approach to translation taken by the research group at IBM (Brown et al. 1990), which takes a large corpus of text with aligned translations as its point of departure. |
| W95-0106 113 87:153 It is interesting to contrast this method with the "parse-parse-match" approaches that have been reported recently for producing parallel bracketed corpora (Sadler & Vendelmans 1990; Kaji et al. 1992; Matsumoto et al. 1993; Cranias et al. 1994; Grishman 1994). |
| W95-0106 114 13:153 Numerous experiments have shown parallel bilingual corpora to provide a rich source of constraints for statistical analysis (e.g. , Brown et al. 1990; Gale & Church 1991 ; Gale et al. 1992; Church 1993; Brown et al. 1993; Dagan et al. 1993; Fung & Church 1994; Wu & Xia 1994; Fung & McKeown 1994). |
| W05-0104 115 32:104 For example, it meant that simple word alignment models like IBM models 1 and 2 (Brown et al. , 1990) and the HMM model (Vogel et al. , 1996) came many weeks after HMMs were introduced in the context of part-of-speech tagging. |
| C04-1030 116 9:215 1 Introduction In statistical machine translation, we are given a source language (French) sentence $f_1^J = f_1 \ldots f_j \ldots f_J$, which is to be translated into a target language (English) sentence $e_1^I = e_1 \ldots e_i \ldots e_I$. Among all possible target language sentences, we will choose the sentence with the highest probability: $\hat{e}_1^I = \arg\max_{e_1^I} \{ \Pr(e_1^I \mid f_1^J) \} = \arg\max_{e_1^I} \{ \Pr(e_1^I) \, \Pr(f_1^J \mid e_1^I) \}$ This decomposition into two knowledge sources is known as the source-channel approach to statistical machine translation (Brown et al., 1990). |
| P97-1063 117 56:218 It is analogous to the step in other translation model induction algorithms that sets all probabilities below a certain threshold to negligible values (Brown et al. , 1990; Dagan et al. , 1993; Chen, 1996). |
| P97-1063 118 7:218 1 Introduction Over the past decade, researchers at IBM have developed a series of increasingly sophisticated statistical models for machine translation (Brown et al. , 1988; Brown et al. , 1990; Brown et al. , 1993a). |
| C08-1056 119 75:155 Giza++ (Och and Ney, 2003) is used to induce, based on statistical principles (Brown et al., 1990), an automatic word alignment of SMS tokens with their normalized counterparts; Moses (Koehn et al., 2007) is used to learn the various parameters of the phrase-based model, to optimize the weight combination and to perform the translation using a multi-stack search algorithm; the SRI language model toolkit (Stolcke, 2002) is finally used to estimate statistical language models. |
| W05-0833 120 140:152 7 Final Remarks Finally, as (Way and Gough, 2005) observe, it is difficult to explain why to this day SMT practitioners have not made full use of the large body of existing work on EBMT, from (Nagao, 1984) to (Carl & Way, 2003) and beyond, which has contributed greatly to the field of corpus-based MT. From its very inception EBMT has made use of a range of sub-sentential data both phrasal and lexical to perform translations whereas, until quite recently, SMT models of translation were based on the relatively simple word alignment models of (Brown et al. , 1990). |
| W05-0833 121 45:152 Until quite recently, SMT models of translation were based on the simple word alignment models of (Brown et al. , 1990). |
| W08-0510 122 5:155 1 Motivation Phrase-based translation has been one of the major advances in statistical machine translation (Brown et al. 1990) in recent years and is currently one of the techniques which can claim to be state-of-the-art in machine translation. |
| W08-0510 123 6:155 Phrase-based models are a development of the word based models as exemplified by the (Brown et al. 1990). |
| H05-1050 124 143:287 While they begin with a small translation lexicon, they are sufficiently robust to the choice of this initial seed (lexicon) that it suffices to construct a single seed by crude automatic means (Brown et al. , 1990; Melamed, 1997). |
| I08-2120 125 8:174 1 Introduction Parallel corpora consisting of text in parallel translation plays an important role in data-driven natural language processing technologies such as statistical machine translation (Brown et al., 1990) and cross-lingual information retrieval (Landauer and Littman, 1990; Oard, 1997). |
| P96-1009 126 139:301 We employ a fertility model (Brown et al, 1990) that indicates how likely each word is to map to multiple words or to a partial word in the SR output. |
| P96-1009 127 106:301 To achieve this, we adapted some techniques from statistical machine translation (such as Brown et al. , 1990) in order to model the errors that Sphinx-II makes in our domain. |
| C92-2080 128 165:169 1992 References \[Brown 90\] P. F. Brown et al. "A Statistical Approach to Machine Translation", Computational Linguistics, Vol. 16, No. 2, 1990 \[Doi 92\] S. Doi and K. Muraki "Robust Translation and Meaning Interpretation Mechanism based on Examples in Dictionary", Proc. |
| H05-1086 129 59:178 They used IBM Model 1 (Brown et al. , 1990), to rank documents according to their translation probability, given the query. |
| H05-1086 130 36:178 In the estimation step, the probability that a term in the sentence translates to a term in the query is estimated using the implementation of IBM Model 1 (Brown et al., 1990) in GIZA++ (Al-Onaizan et al., 1999) out-of-the-box without alteration. |
| P98-1110 131 16:158 relies on translation probabilities estimated from large bilingual corpora (Brown et al. , 1990)(Brown et al. , 1991). |
| P98-1110 132 64:158 2The WORD SPACE method is closely related to Latent Semantic Indexing (LSI)(Deerwester et al. , 1990), where document-by-word matrices are processed by SVD instead of word-by-word matrices. |
| J03-1005 133 585:672 The resulting bilingual data have been sentence-aligned using statistical methods (Brown et al. 1990). |
| J03-1005 134 89:672 (1990) and Brown et al. |
| P99-1068 135 7:186 (Brown et al., 1990)) typically rely on large quantities of bilingual text aligned at the document or sentence level, and a number of approaches in the burgeoning field of cross-language information retrieval exploit parallel corpora either in place of or in addition to mappings between languages based on information from bilingual dictionaries (Davis and Dunning, 1995; Landauer and Littman, 1990; Hull and Oard, 1997; Oard, 1997). |
| P96-1023 136 138:283 The node mapping function f for the entire tree thus has a different role from the alignment function in the IBM statistical translation model (Brown et al. 1990, 1993); the role of the latter includes the linear ordering of words in the target string. |
| P96-1023 137 278:283 We are not advocating an approach in which linguistic structure is ignored (as it is in the IBM translator described by Brown et al. 1990), but rather one in which the syntactic and semantic structure of a string is implicit in the way it is processed by an interpreter. |
| J93-1004 138 16:365 Introduction Researchers in both machine translation (e.g. , Brown et al. 1990) and bilingual lexicography (e.g. , Klavans and Tzoukermann 1990) have recently become interested in studying bilingual corpora, bodies of text such as the Canadian Hansards (parliamentary debates), which are available in multiple languages (such as French and English). |
| J93-1004 139 56:365 (from Brown et al. 1990) English / French / Prob(French \| English): the / le / 0.610; the / la / 0.178; the / l' / 0.083; the / les / 0.023; the / ce / 0.013; the / il / 0.012; the / de / 0.009; the / ~ / 0.007; the / que / 0.007. very well documented in the published literature; consequently, there has been a lot of unnecessary subsequent work at ISSCO and elsewhere. |
| J93-1004 140 1:365 A Program for Aligning Sentences in Bilingual Corpora William A. Gale* AT&T Bell Laboratories Kenneth W. Church* AT&T Bell Laboratories Researchers in both machine translation (e.g. , Brown et al. 1990) and bilingual lexicography (e.g. , Klavans and Tzoukermann 1990) have recently become interested in studying bilingual corpora, bodies of text such as the Canadian Hansards (parliamentary proceedings), which are available in multiple languages (such as French and English). |
| J93-1004 141 39:365 Aligning sentences is just a first step toward constructing a probabilistic dictionary (Table 3) for use in aligning words in machine translation (Brown et al. 1990), or for constructing a bilingual concordance (Table 4) for use in lexicography (Klavans and Tzoukermann 1990). |
| N06-1015 142 13:205 The standard approach to word alignment is to construct directional generative models (Brown et al. , 1990; Och and Ney, 2003), which produce a sentence in one language given the sentence in another language. |
| N06-1015 143 94:205 Generative alignment models like the HMM model (Vogel et al. , 1996) and IBM models 4 and above (Brown et al. , 1990; Och and Ney, 2003) directly model correlations between alignments of consecutive words (at least on one side). |
| W97-0407 144 118:197 It is more realistic than the one in (Castellanos et al., 1994), but, unlike other corpora such as the Hansards (Brown et al., 1990), it is not unrestricted. |
| W03-0414 145 12:242 (Brown et al. , 1990; Brown et al. , 1993)) are best known and studied. |
| J04-4002 146 27:482 Yet the modeling, training, and search methods have also improved since the field of statistical machine translation was pioneered by IBM in the late 1980s and early 1990s (Brown et al. 1990; Brown et al. 1993; Berger et al. 1994). |
| C04-1117 147 130:161 4 Related Work The rise of the empirical paradigm in the field of machine translation is, to a large degree, due to the wide-spread availability of parallel corpora (Brown et al. , 1990). |
| C94-2178 148 7:102 Motivation There have been quite a number of recent papers on parallel text: Brown et al (1990, 1991, 1993), Chen (1993), Church (1993), Church et al (1993), Dagan et al (1993), Gale and Church (1991, 1993), Isabelle (1992), Kay and Rösenschein (1993), Klavans and Tzoukermann (1990), Kupiec (1993), Matsumoto (1991), Ogden and Gonzales (1993), Shemtov (1993), Simard et al (1992), Warwick-Armstrong and Russell (1990), Wu (to appear). |
| W93-0301 149 38:185 2.1.1 Brown et al.'s Model In the context of their statistical machine translation project (Brown et al., 1990), Brown et al. estimate $\Pr(f \mid e)$, the probability that f, a sentence in one language (say French), is the translation of e, a sentence in the other language (say English). |
| W93-0301 150 6:185 1 Introduction Aligning parallel texts has recently received considerable attention (Warwick et al. , 1990; Brown et al. , 1991a; Gale and Church, 1991b; Gale and Church, 1991a; Kay and Rosenschein, 1993; Simard et al. , 1992; Church, 1993; Kupiec, 1993; Matsumoto et al. , 1993). |
| W93-0301 151 7:185 These methods have been used in machine translation (Brown et al. , 1990; Sadler, 1989), terminology research and translation aids (Isabelle, 1992; Ogden and Gonzales, 1993), bilingual lexicography (Klavans and Tzoukermann, 1990), collocation studies (Smadja, 1992), word-sense disambiguation (Brown et al. , 1991b; Gale et al. , 1992) and information retrieval in a multilingual environment (Landauer and Littman, 1990). |
| P07-2026 152 30:101 3.2 Baseline System The posterior probability $\Pr(e_1^I \mid f_1^J)$ is modeled directly using a log-linear combination of several models (Och and Ney, 2002): $\Pr(e_1^I \mid f_1^J) = \frac{\exp\left(\sum_{m=1}^{M} \lambda_m h_m(e_1^I, f_1^J)\right)}{\sum_{I', {e'}_1^{I'}} \exp\left(\sum_{m=1}^{M} \lambda_m h_m({e'}_1^{I'}, f_1^J)\right)}$ (1) This approach is a generalization of the source-channel approach (Brown et al., 1990). |
| W99-0905 153 9:135 Due to the recent availability of large text corpora, various statistical approaches have been tried including using 1) parallel corpora (Brown et al. , 1990), (Brown et al. , 1991), (Brown, 1997), 2) non-parallel bilingual corpora tagged with topic area (Yamabana et al. , 1998) and 3) un-tagged mono-language corpora in the target language (Dagan and Itai, 1994), (Tanaka and Iwasaki, 1996), (Kikui, 1998). |
| W99-0905 154 121:135 , 1990), (Brown et al., 1991), (Brown, 1997), (Yamabana et al. |
| J05-4003 155 9:416 They provide indispensable training data for statistical machine translation (Brown et al. 1990; Och and Ney 2002) and have been found useful in research on automatic lexical acquisition (Gale and Church 1991; Melamed 1997), cross-language information retrieval (Davis and Dunning 1995; Oard 1997), and annotation projection (Diab and Resnik 2002; Yarowsky and Ngai 2001; Yarowsky, Ngai, and Wicentowski 2001). |
| J05-4003 156 106:416 Word alignments were first introduced in the context of statistical MT, where they are used to estimate the parameters of a translation model (Brown et al. 1990). |
| C00-2092 157 81:175 As in any statistical MT system, we wish to choose the target sentence $w_t$ so as to maximize $P(w_t \mid w_s)$ (Brown et al., 1990, p. 79). |
| C00-2092 158 27:175 2 Links between tree nodes were introduced for TAG trees, in (Shieber and Schabes, 1990), and put to use for Machine Translation by Abeillé et al. |
| C00-2092 159 112:175 6 4.1 Evaluation In a manner similar to (Brown et al., 1990, p. 83), we assigned each of the resulting sentences a category according to the following criteria. |
| C02-1137 160 82:136 Therefore, title-word-document-word translation probability P(tw|dw) can be learned from the training corpus using statistical translation model (Brown et al. , 1990). |
| W06-3103 161 38:183 As a decision rule, we obtain: $\hat{e}_1^I = \arg\max_{I, e_1^I} \left\{ \sum_{m=1}^{M} \lambda_m h_m(e_1^I, f_1^J) \right\}$ (3) This approach is a generalization of the source-channel approach (Brown et al., 1990). |
| J03-3002 162 7:541 They represent resources for automatic lexical acquisition (e.g. , Gale and Church 1991; Melamed 1997), they provide indispensable training data for statistical translation models (e.g. , Brown et al. 1990; Melamed 2000; Och and Ney 2002), and they can provide the connection between vocabularies in cross-language information retrieval (e.g. , Davis and Dunning 1995; Landauer and Littman 1990; see also Oard 1997). |
| C00-2169 163 69:160 For further processing steps we have to introduce the concept of alignment (Brown et al. , 1990). |
| W06-1008 164 12:225 A compilation of parallel texts offered in a serviceable form is called a parallel corpus. Parallel corpora are very valuable resources in various fields of multilingual natural language processing such as statistical machine translation (Brown et al., 1990), cross-lingual IR (Chen and Nie, 2000), and construction of dictionary (Nagao, 1996). |