Update README.md

bfsujason
2021-12-01 11:41:00 +08:00
committed by GitHub
parent c8b0bb7833
commit 173ee6f037

@@ -36,7 +36,7 @@ We use [Moses sentence splitter](https://github.com/moses-smt/mosesdecoder/blob/
The [auto](./data/mac/dev/auto) and [gold](./data/mac/dev/gold) directories are for automatic and gold alignments respectively. All the gold alignments are created manually using [Intertext](https://wanthalf.saga.cz/intertext).
### Bible
-[The Bible corpus](./data/bible), consisting of 5,000 source and 6,301 target sentences, is selected from the public [multilingual Bible corpus](https://github.com/christos-c/bible-corpus/tree/master/bibles). This corpus is mainly used to evaluate the speed of Bertalign.
+[The Bible corpus](./data/bible), consisting of 5,000 source and 6,301 target sentences, is selected from the public [multilingual Bible corpus](https://github.com/christos-c/bible-corpus/tree/master/bibles). This corpus is mainly used to compare the running time of various aligners.
The directory makeup is similar to MAC-Dev, except that the gold alignments for the Bible corpus are generated automatically from the original verse-aligned Bible corpus.