changeset 7:8b2f8f439817
Improves: ding parser.
* Strips greater-than and less-than signs (`<`, `>`) at the beginning and end
of words when reading a ding directory. Words enclosed by those characters
seem to be variants. This affects about 100 to 200 words for de in de-en 1.7.
author | Bernhard Reiter <bernhard@intevation.de> |
---|---|
date | Tue, 21 Feb 2017 14:14:08 +0100 |
parents | 81f75c9aac84 |
children | 200c2c3c5f67 |
files | ppgen.py |
diffstat | 1 files changed, 1 insertions(+), 1 deletions(-) |
--- a/ppgen.py Mon Feb 13 08:38:06 2017 +0100
+++ b/ppgen.py Tue Feb 21 14:14:08 2017 +0100
@@ -102,7 +102,7 @@
 languageEntry = p[0] if useLeft else p[2]
 for word in splitter.split(languageEntry):
-    word = word.strip('(",.)\'!:;').rstrip('/')
+    word = word.strip('(",.)\'!:;<>').rstrip('/')
     if len(word) > 2 and not word[0] in '[{/':
         dset.add(word)
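
The effect of the changed line can be checked in isolation. The following is a minimal sketch, not code from ppgen.py: the two character sets are copied from the diff, while the helper names `clean_word` and `keep` and the sample word `<Haus>` are made up for illustration.

```python
# Character sets copied from the old and new versions of the changed line.
OLD_CHARS = '(",.)\'!:;'
NEW_CHARS = OLD_CHARS + '<>'


def clean_word(word, chars):
    """Strip any of `chars` from both ends, then any trailing '/'.

    Hypothetical helper that mirrors the changed line in the diff.
    """
    return word.strip(chars).rstrip('/')


def keep(word):
    """Hypothetical helper that mirrors the filter following the changed line."""
    return len(word) > 2 and word[0] not in '[{/'


sample = '<Haus>'  # made-up example of a word enclosed in angle brackets

before = clean_word(sample, OLD_CHARS)  # '<Haus>': brackets survive
after = clean_word(sample, NEW_CHARS)   # 'Haus': brackets stripped

# str.strip only removes characters at the ends, so an interior
# '<' or '>' is left untouched.
interior = clean_word('a<b', NEW_CHARS)  # 'a<b'
```

With the old character set, `<Haus>` passes the `keep` filter unchanged, so it would land in the word set with its angle brackets. With the new set it collapses to `Haus`.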