This is possibly the most comprehensive English-French-English dictionary. The link is
here. However, no wordlist is available. Even worse, the URL stays at
https://grc.lerobert.com/login.asp no matter which word we search for.
Do you have any idea how to scrape the data in this case?
Could you please share it with me so that I can have a look?
Do you mean you already crawled
Le Grand Robert & Collins and saved the data as JSON?
Could you please share the technique (Python file,…) you use with me? I will share the mdd/mdx if I’m able to finish.
Thank you so much for your help!
I’m interested in the possibility of finding a way to secure all entries. I can’t tell from the clip whether an index is available. Scraping is always possible through browser automation if nothing else works.
It would be best to secure all the entries. I checked that website carefully and could not find any wordlist. Could you elaborate on “browser automation” in case the wordlist is not available?
vocabulary.com does not provide a wordlist either.
Le Grand Robert does not even have a short list of nearby words, as OALD does.
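For a site that does show a short list of nearby words, as OALD does, a wordlist could in principle be grown by following those lists outward from a few seed words. Here is a minimal sketch of that idea, not something anyone in this thread has confirmed using. The `get_nearby` callback is hypothetical: it stands for whatever code fetches one entry and returns its nearby words.

```python
from collections import deque


def harvest_headwords(seeds, get_nearby, limit=100_000):
    """Breadth-first walk over 'nearby words' lists, starting from seed words.

    get_nearby is a hypothetical callback: word -> iterable of nearby words.
    limit is a soft cap that stops the walk on very large sites.
    """
    seeds = list(seeds)
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(seen) < limit:
        word = queue.popleft()
        for neighbour in get_nearby(word):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return sorted(seen)
```

The walk only reaches words that are linked, directly or indirectly, from the seeds, so starting from several seeds spread across the alphabet would likely give better coverage.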
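Does the page open when you visit it normally? The 900 means the site has recognized you as a crawler, so it blocked you and returns only empty content. There can be many causes. If your IP address has not been blocked, you can disguise your requests to look more like a real browser, or you can drive a browser from Python to fetch the site content, though that requires learning more of the Python basics.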
pip install playwright
from playwright.sync_api import syn…
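To make the "drive a browser from Python" idea concrete, here is a rough sketch using Playwright's sync API. It is not the script quoted above, which is cut off. The two CSS selectors, the `start_url` argument, and the manual login step are all assumptions, since the real page structure of the site is not described in this thread.

```python
import json

# Hypothetical selectors: inspect the real page to find the right ones.
SEARCH_BOX = "input[name='search']"
RESULT_PANE = "#entry"


def fetch_entry_html(page, word, search_box=SEARCH_BOX, result_pane=RESULT_PANE):
    """Type a word into the search box and return the rendered entry HTML."""
    page.fill(search_box, word)
    page.press(search_box, "Enter")
    # Caveat: if the pane from the previous search is still present, this
    # returns immediately; a real script would need a stronger wait condition.
    page.wait_for_selector(result_pane)
    return page.inner_html(result_pane)


def scrape(words, start_url):
    """Open a visible browser, wait for a manual login, then look up each word."""
    # Imported here so the helpers can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    entries = {}
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        page = browser.new_page()
        page.goto(start_url)
        input("Log in in the browser window, then press Enter here... ")
        for word in words:
            entries[word] = fetch_entry_html(page, word)
        browser.close()
    return entries


def save_json(entries, path):
    """Write the collected entries to a UTF-8 JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(entries, f, ensure_ascii=False, indent=2)
```

Besides `pip install playwright`, the browser itself has to be downloaded once with `playwright install chromium`. Note that this approach still needs a list of words to look up, so it does not solve the missing wordlist by itself.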
On a separate note, it would take quite some time to obtain
vocabulary.com’s full wordlist, but it’s for sure possible.
Are you interested in making this dictionary together? If yes, I will share the account to access the website.
I’m most interested in finding a way to secure all entries. If there isn’t a good way to do so, I can’t guarantee I’ll help scrape it. Happy to help if that’s not the case.
@deusexmachina would you also be interested in this project?
Would you like to cooperate with
@Akira to obtain all the headwords?
The Grand Robert is by far the best French dictionary!
The project is not easy. It’s better to have a group of members joining hands :v
Grand Robert does have an index; it’s only Robert Collins that doesn’t.