A fork of mdict-utils with parallel processing

I recently used the Python package mdict-utils, which is well written and well organized. It works well for small and medium projects, but for my use case, English Wikipedia, it generates MDX files too slowly.

It turns out that mdict-utils compresses and writes blocks sequentially. My fork generates MDX from TXT files significantly faster by

  • replacing zlib with deflate.
  • compressing blocks in parallel and then writing them sequentially, in batches.
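The second point can be sketched roughly as follows. This is only an illustration of the pattern, not the code from my fork: `write_blocks`, `batched`, and the `workers`, `batch_size`, and `level` parameters are hypothetical names, and it uses the standard-library zlib so that it runs without extra packages. Threads are enough here because CPython's zlib releases the GIL while compressing.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items (hypothetical helper)."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch


def write_blocks(blocks, out, workers=4, batch_size=64, level=6):
    """Compress blocks in parallel, then write them to `out` in input order.

    Returns the compressed size of each block, in order.
    """
    sizes = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in batched(blocks, batch_size):
            # The blocks of one batch are compressed concurrently.
            # pool.map() yields results in input order, so the writes
            # below stay sequential and the block order is preserved.
            for compressed in pool.map(lambda b: zlib.compress(b, level), batch):
                out.write(compressed)
                sizes.append(len(compressed))
    return sizes
```

Working in batches keeps memory bounded: only one batch of compressed blocks is held at a time, instead of the whole dictionary.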

You can try my fork on GitHub: leanhdung1994/mdict-utils (MDict pack/unpack/list/info tool).
