Leipzig - German News Collocations (Corpora)

Format: .dsl (for GoldenDict)
Author: @tovaremeterio
Source: Leipzig Corpora Collection / Deutscher Wortschatz

We need help converting these .dsl files to .mdx format. Style improvements are also welcome.

Description: German-language news from the Leipzig Corpora Collection. It is a collection of 12 million phrases from newspapers in Germany, Austria and Switzerland, covering 2010 to 2020. It is useful as a collocations dictionary, but has the following limitation:

It is only useful for full-text search in GoldenDict, as an offline corpus.

An example of how to use the data in .dsl format: if you come across the word “Schadenfreude” and want to see examples of its use, run a full-text search in GoldenDict (Ctrl+Alt+F) to find the phrases that contain it.


P.S. Only the “Swiss Corpus” is available. Other corpora were available, but they were encoded as UTF-16 and GoldenDict can no longer read them. A conversion from UTF-16 to UTF-8 failed. Always create .dsl files as UTF-8!
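
For anyone who wants to retry the re-encoding, here is a minimal Python sketch of a UTF-16 to UTF-8 conversion. It is not the method that was tried above and it does not explain why that attempt failed. It assumes the source file starts with a byte-order mark (BOM), which Python's "utf-16" codec needs to detect the byte order. The function name and the file names in the comment are made up for illustration.

```python
from pathlib import Path


def convert_dsl_to_utf8(src: Path, dst: Path) -> None:
    """Re-encode a UTF-16 .dsl file as UTF-8, one line at a time.

    Assumption: the source file begins with a BOM. If it does not, or if it
    contains invalid UTF-16 data, Python raises a UnicodeError instead of
    silently writing a damaged file.
    """
    # newline="" keeps the original line endings (e.g. CRLF) unchanged.
    with open(src, "r", encoding="utf-16", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:  # streaming, so large corpora are not loaded whole
            fout.write(line)


# Hypothetical usage (file names are placeholders):
# convert_dsl_to_utf8(Path("corpus_utf16.dsl"), Path("corpus_utf8.dsl"))
```

If the call raises an error, the message usually shows the byte position where decoding stopped, which can help locate a damaged section of the file.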
