From c78bc5a0589b740746d0ad3f7ae19a0bd28d6144 Mon Sep 17 00:00:00 2001
From: bfsujason
Date: Sun, 28 Nov 2021 21:10:36 +0800
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index c5b5b31..54205ee 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 # Bertalign
 
 Word Embedding-Based Bilingual Sentence Aligner
 
-Bertalign is designed to facilitate the construction of sentence-aligned bilingual parallel corpora, which have a wide range of applications in translation-related research such as corpus-based translation studies, contrastive linguistics, computer-assisted translation, translator education and machine translation.
+Bertalign is designed to facilitate the construction of bilingual parallel corpora, which have a wide range of applications in translation-related research such as corpus-based translation studies, contrastive linguistics, computer-assisted translation, translator education and machine translation.
 
 Bertalign uses [cross-lingua embedding models](https://github.com/UKPLab/sentence-transformers) to represent source and target sentences in vectors in order to capture semantically similar sentences in both languages, which according to our explements achieves more accurate results than the traditional length-, dictionary-, or MT-based alignment methods such as [Galechurch](https://aclanthology.org/J93-1004/), [Hunalign](http://mokk.bme.hu/en/resources/hunalign/) and [Bleualign](https://github.com/rsennrich/Bleualign). It also performs better than [Vecalign](https://github.com/thompsonb/vecalign) on MAC, a manually aligned parallel corpus of Chinese-English literary texts.