Forvo .DSL to Forvo .MDX (Conversion instructions)

Shuibogo · 18 August 2023, 22:16

Forvo Audios Conversion (.dsl → .mdx)

INSTRUCTIONS (Linux users)

You can find the Forvo audios and ready .dsl dictionaries for them here:

1. Create a folder (let’s call it “Forvo folder”) with:

• Forvo audios for the target language (uncompressed) inside an “audio” folder. For French, for instance, the path should look like this:

…/Forvo folder/audio/fr/{forvo usernames}/…

• .dsl file for your language
• forvo_{language}.txt (blank file)
• title.html (just open it and write the name you want to be displayed for your dictionary)
• description.html (you can add a description for the dictionary or leave this file blank)
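
If you prefer not to create the placeholder files by hand, here is a minimal Python sketch of the same setup. The function name `create_forvo_folder` and its parameters are my own (hypothetical), and it only creates the empty “audio” folder and the three small files; you still have to copy the audios and the .dsl file in yourself.

```python
from pathlib import Path


def create_forvo_folder(root: Path, language: str, title: str,
                        description: str = "") -> list[str]:
    """Create the placeholder files listed in step 1 (hypothetical helper)."""
    root.mkdir(parents=True, exist_ok=True)
    # Empty folder that will later hold audio/{language_code}/{forvo usernames}/...
    (root / "audio").mkdir(exist_ok=True)
    # Blank file that will receive the converted Octopus Mdict Source text
    (root / f"forvo_{language}.txt").touch()
    # Name displayed for the dictionary
    (root / "title.html").write_text(title, encoding="utf-8")
    # Optional description, may stay blank
    (root / "description.html").write_text(description, encoding="utf-8")
    return sorted(p.name for p in root.iterdir())
```

Calling `create_forvo_folder(Path("Forvo folder"), "french", "Forvo French")` would give you a folder ready for the audios and the .dsl file.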

2. Download this Pyglossary version with OctopusMdictSource support (attached below)

Pyglossary 4.6.1 (with MDX support).zip (896.1 KB)

Also install an interface:
• Tkinter-based interface

**Debian/Ubuntu:** apt-get install python3-tk tix
**openSUSE:** zypper install python3-tk tix
**Fedora:** yum install python3-tkinter tix
**Mac OS X:** read https://www.python.org/download/mac/tcltk/
**Nix / NixOS:** nix-shell -p python38Packages.tkinter tix

• Gtk3-based interface

**Debian/Ubuntu:** apt install python3-gi python3-gi-cairo gir1.2-gtk-3.0
**openSUSE:** zypper install python3-gobject gtk3
**Fedora:** dnf install pygobject3 python3-gobject gtk3
**ArchLinux:**
    pacman -S python-gobject gtk3
    https://aur.archlinux.org/packages/pyglossary/
**Mac OS X**: brew install pygobject3 gtk+3
**Nix / NixOS:** nix-shell -p pkgs.gobject-introspection python38Packages.pygobject3 python38Packages.pycairo

If you already have some version of Pyglossary installed, you can just add the OctopusMdictSource plugin (attached below) to …/Pyglossary_folder/pyglossary/plugins

octopus_mdict_source.py (4.7 KB)

3. Install mdict-utils

pip install mdict-utils
or
pip3 install mdict-utils

4. Open your terminal, go to the Pyglossary folder and open the application:

python3 pyglossary.pyw

The Pyglossary UI should open

5. For the input file field, select the .dsl file in the Forvo folder. The input format is ABBYY Lingvo (.dsl).
For the output field, select the .txt file in the Forvo folder. The output format is Octopus Mdict Source.

6. Once the conversion finishes, the .mdx source for the .dsl has been generated. Now you have to manually fix the paths to the audio files. Open your “forvo_{language}.txt” in a text editor that supports regular expressions [RegEx] (I use the Kate text editor) and run these two find-and-replace operations:

• FIND: \[s\]
• REPLACE: <a href="sound://

• FIND: //{language_code}/(.*?)/(.*?)\.opus\[/s\]
• REPLACE: //{language_code}/\1/\2.opus">\2</a>

Instead of {language_code}, use the code for your language, like “fr” for French or “ru” for Russian, matching the name of the zip file with the audios.
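
For anyone who would rather script the two replacements than run them in an editor, here is a rough Python sketch using the standard `re` module. The function name `fix_audio_links` is hypothetical, and the sample line assumes the converted file contains tags of the form `[s]fr/{username}/{word}.opus[/s]`; check a few lines of your own file first.

```python
import re


def fix_audio_links(text: str, language_code: str, ext: str = "opus") -> str:
    """Turn [s]...[/s] sound tags into sound:// links (hypothetical helper)."""
    # First replacement: [s]  ->  <a href="sound://
    text = re.sub(r"\[s\]", '<a href="sound://', text)

    # Second replacement: close the href and use the file name as link text.
    pattern = rf"//{re.escape(language_code)}/(.*?)/(.*?)\.{re.escape(ext)}\[/s\]"

    def close_link(match: re.Match) -> str:
        user, word = match.group(1), match.group(2)
        return f'//{language_code}/{user}/{word}.{ext}">{word}</a>'

    return re.sub(pattern, close_link, text)


# Assumed example line, not taken from a real Forvo file:
sample = "[s]fr/alice/bonjour.opus[/s]"
fixed = fix_audio_links(sample, "fr")
# fixed == '<a href="sound://fr/alice/bonjour.opus">bonjour</a>'
```

To process the real file you would read “forvo_{language}.txt” as UTF-8, pass its content through the function and write the result back.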

7. [OPTIONAL] You can link a .css file to the dictionary, so that further customization is possible

• FIND: <div style=(.*)
• REPLACE: <link rel="stylesheet" type="text/css" href="style.css" />\n<div style=\1

Applying the changes may take a minute or so.
If you do this, create “style.css” inside the Forvo folder.
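
The same optional replacement can be sketched in Python. This is only an illustration of the FIND/REPLACE pair above; the names `LINK_TAG` and `add_stylesheet_link` are made up for the example.

```python
import re

# The tag inserted before every entry's <div style=...> line
LINK_TAG = '<link rel="stylesheet" type="text/css" href="style.css" />'


def add_stylesheet_link(text: str) -> str:
    """Insert the stylesheet link on its own line before each '<div style=' (hypothetical helper)."""
    # (.*) captures the rest of the line, which is written back unchanged
    return re.sub(
        r"<div style=(.*)",
        lambda m: f"{LINK_TAG}\n<div style={m.group(1)}",
        text,
    )
```

A function is used for the replacement instead of a `\1` template so that backslashes in the entry text cannot be misread as escape sequences.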

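Before the final step, your Forvo folder should contain the “audio” folder, the .dsl file, forvo_{language}.txt, title.html, description.html and, if you did the optional step, style.css.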

8. Now you just have to compile the .mdx and the .mdd using mdict-utils

• COMPILING THE .MDX
Open your terminal inside the Forvo folder and type:

mdict --title title.html --description description.html -a forvo_{language}.txt forvo_{language}.mdx

• COMPILING THE .MDD

mdict --title title.html --description description.html -a audio forvo_{language}.mdd
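
If you want to run both compile commands from a script, something like the following Python sketch could work. It only wraps the two mdict commands shown above with `subprocess`; the function names are hypothetical, and it assumes `mdict` (from mdict-utils) is on your PATH.

```python
import subprocess
from pathlib import Path


def build_commands(language: str) -> list[list[str]]:
    """Return the two mdict command lines from step 8 as argument lists."""
    common = ["mdict", "--title", "title.html",
              "--description", "description.html", "-a"]
    return [
        # .mdx: built from the text source
        common + [f"forvo_{language}.txt", f"forvo_{language}.mdx"],
        # .mdd: built from the audio folder
        common + ["audio", f"forvo_{language}.mdd"],
    ]


def compile_dictionary(forvo_folder: Path, language: str) -> None:
    """Run both commands inside the Forvo folder; raises if mdict fails."""
    for command in build_commands(language):
        subprocess.run(command, cwd=forvo_folder, check=True)
```

For example, `compile_dictionary(Path("Forvo folder"), "french")` would run the same two commands as typing them in the terminal inside the Forvo folder.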

Now you can delete everything, except the .mdx, the .mdd and the .css

You can rename the .mdx and the .mdd as you wish, and also add an icon (.png/.jpg, etc.), but be sure the .mdx, the .mdd and the icon all have the same name (only the extensions differ)

WARNING: if you want to customize the dictionary using the .css, don’t ever rename it. The .css name is already specified within the compiled .mdx file

Of course, the ideal would be a script that automates this whole process, but unfortunately I’m not a programmer and have little experience writing Python scripts.

I’m not a Windows user so I can’t say much about the process there, but both Pyglossary and mdict-utils are available there.
For Windows, you can alternatively use MDX Builder instead of mdict-utils, but honestly I think mdict-utils is much better and faster

These instructions are for the .opus audio files. If you want to use .mp3 audio, just replace .opus with .mp3 in the RegEx steps and use the .mp3 audio folder. I don’t see any reason for doing this, since the .opus files are high quality and take much less space.

RegEx notes:

• In Kate, \1, \2, … paste what was captured by each (.*?) or (.*) group in the “FIND” field, in order of appearance. (.*?) is a non-greedy capture group: it matches as few characters as possible

• On Linux, \n is used for linebreaks, but I heard you should use \r\n on Windows
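
To see why the non-greedy form matters, here is a small Python illustration (Python’s `re` module treats `(.*?)` and `(.*)` the same way as described above). The input line with two audio paths is made up for the example.

```python
import re

# Made-up line containing two audio paths
line = "//fr/alice/a.opus[/s] //fr/bob/b.opus[/s]"

# Non-greedy: each group stops at the first place the rest of the pattern fits
lazy = re.search(r"//fr/(.*?)/(.*?)\.opus", line).groups()
# lazy == ('alice', 'a')

# Greedy: the first group swallows as much as it can, spanning both paths
greedy = re.search(r"//fr/(.*)/(.*)\.opus", line).groups()
# greedy == ('alice/a.opus[/s] //fr/bob', 'b')
```

With the greedy version the first capture runs across both links, which is why the FIND patterns for the audio paths use `(.*?)`.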

Following these same guidelines, I made Forvo French (MDX) and Forvo Persian (MDX):

• Link to the Forvo Persian (MDX): https://cloud.freemdict.com/index.php/s/M7BzW2PAWDBF5k8

• Link to the Forvo French (MDX): https://cloud.freemdict.com/index.php/s/8my4FYLCcGm2yed

tovaremeterio · 19 August 2023, 09:41

Thank you very much! Great job! You are really kind… I hope more people will be able to create their favorite pronunciation dictionaries!

Ambulante · 8 September 2024, 16:27

That’s a really great job. Thanks for sharing this tutorial. Is it possible to insert these audio files into a French (or any other language) dictionary? If I could combine the original dictionary with the pronunciations, it would be fantastic.

duomham · 19 January 2025, 16:16

Could someone help me convert a dictionary from .mdx/.mdd to ABBYY .dsl (with a .dsl.files.zip), so that I can use it with GoldenDict Mobile? I have no coding experience, only a general understanding of Python. Thanks in advance.

Wankata · 19 January 2025, 18:00

Try this: Releases · glowinthedark/pyglossary · GitHub

duomham · 19 January 2025, 18:11

Thanks for the reply.
I got this error:
Traceback (most recent call last):
  File "C:\PROGRA~1\PYGLOS~1\pyglossary\ui\ui_tk.py", line 182, in CallWrapper__call__
  File "C:\PROGRA~1\PYGLOS~1\pyglossary\ui\ui_tk.py", line 1452, in convert
  File "C:\PROGRA~1\PYGLOS~1\pyglossary\glossary_v2.py", line 1272, in convert
  File "C:\PROGRA~1\PYGLOS~1\pyglossary\glossary_v2.py", line 1221, in convertV2
  File "C:\PROGRA~1\PYGLOS~1\pyglossary\glossary_v2.py", line 1168, in _convertPrepare
  File "C:\PROGRA~1\PYGLOS~1\pyglossary\glossary_v2.py", line 1090, in _resolveSortParams
  File "C:\PROGRA~1\PYGLOS~1\pyglossary\glossary_v2.py", line 1006, in _switchToSQLite
  File "C:\PROGRA~1\PYGLOS~1\pyglossary\sq_entry_list.py", line 55, in __init__
sqlite3.OperationalError: unable to open database file

Wankata · 19 January 2025, 18:22

Then try this way:

“How to convert DSL files to MDX files” (怎样把dsl文件转换为mdx文件). You can convert not only to MDX but to other formats too.

duomham · 19 January 2025, 19:14

Thanks, Wankata.
I’ll try this. I really appreciate your help.

Wankata · 19 January 2025, 19:26

Convert to StarDict (.ifo) instead; it’s easier. I think GoldenDict Mobile can handle this format.

duomham · 19 January 2025, 20:33

I’ll try, thanks a million. God bless you.

Ambulante · 1 February 2025, 11:42

Following up on my previous reply: I haven’t been able to insert the audio files into another dictionary. However, I did manage to make an improved CSS instead.
style.css (1.1 KB)