From 4fcbc825a18e613c78f1de478634e7f39491fdcb Mon Sep 17 00:00:00 2001 From: nlpfun Date: Wed, 1 Dec 2021 00:08:59 +0800 Subject: [PATCH] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 9eaf09b..3a873a1 100644 --- a/README.md +++ b/README.md @@ -3,7 +3,7 @@ Word Embedding-Based Bilingual Sentence Aligner Bertalign is designed to facilitate the construction of bilingual parallel corpora, which have a wide range of applications in translation-related research such as corpus-based translation studies, contrastive linguistics, computer-assisted translation, translator education and machine translation. -Bertalign uses the [LaBSE multilingual BERT model](https://arxiv.org/abs/2007.01852) provided by [sentence-transformers](https://github.com/UKPLab/sentence-transformers) to represent source and target sentences so that semantically similar sentences in different languages can be mapped onto similar vector spaces. According to our experiments, Bertalign achieves more accurate results than the traditional length-, dictionary-, or MT-based alignment methods such as [Galechurch](https://aclanthology.org/J93-1004/), [Hunalign](http://mokk.bme.hu/en/resources/hunalign/) and [Bleualign](https://github.com/rsennrich/Bleualign). It also performs better than [Vecalign](https://github.com/thompsonb/vecalign) on MAC, a manually aligned parallel corpus of Chinese-English literary texts. +Bertalign uses the [LaBSE multilingual BERT model](https://arxiv.org/abs/2007.01852) (supporting 109 languages) provided by [sentence-transformers](https://github.com/UKPLab/sentence-transformers) to represent source and target sentences so that semantically similar sentences in different languages can be mapped onto similar vector spaces. According to our experiments, Bertalign achieves more accurate results than the traditional length-, dictionary-, or MT-based alignment methods such as [Galechurch](https://aclanthology.org/J93-1004/), [Hunalign](http://mokk.bme.hu/en/resources/hunalign/) and [Bleualign](https://github.com/rsennrich/Bleualign). It also performs better than [Vecalign](https://github.com/thompsonb/vecalign) on MAC, a manually aligned parallel corpus of Chinese-English literary texts. ## Installation