Crawl data of "Le Grand Robert & Collins"

This is possibly the most comprehensive English-French-English dictionary. The link is here. However, there is no available wordlist. Even worse, the URL stays the same no matter which word we search for.

Do you have any idea how to scrape the data in this case?



Yess, I have :v

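I crawled it a while ago using the Grand Robert wordlist. The data comes back as JSON and can be read and used directly. Fully reproducing the web page requires reverse-engineering the JS, and that has basically worked.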
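
For anyone who wants to try the same approach, here is a minimal sketch of crawling JSON entries from a wordlist. It is not the script I used: the `ENDPOINT` URL is a placeholder (the real request has to be read from the browser's network tab), and `fetch_entry` and `crawl` are names made up for this example. It assumes the server answers a lookup with a JSON body.

```python
import json
import time
import urllib.parse
import urllib.request
from pathlib import Path

# Hypothetical endpoint: the real URL must be taken from the browser's
# network tab while looking up a word on the site.
ENDPOINT = "https://example.invalid/api/entry?word={word}"


def fetch_entry(word, endpoint=ENDPOINT, timeout=30):
    """Fetch one entry and decode the JSON body (assumes a JSON response)."""
    url = endpoint.format(word=urllib.parse.quote(word))
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def crawl(words, out_dir, fetch=fetch_entry, delay=1.0):
    """Save one JSON file per headword; skip words that are already saved."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for word in words:
        # Percent-encode the headword so it is always a safe file name.
        target = out / (urllib.parse.quote(word, safe="") + ".json")
        if target.exists():  # lets an interrupted run resume
            continue
        entry = fetch(word)
        target.write_text(json.dumps(entry, ensure_ascii=False), encoding="utf-8")
        saved.append(word)
        time.sleep(delay)  # pause between requests to go easy on the server
    return saved
```

Keeping one file per headword means a crashed run can simply be restarted, and the raw JSON stays available for later conversion.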

Could you please share with me so that I can have a look at it?

Do you mean you already crawled Le Grand Robert & Collins and saved the data as JSON?


Could you please share the technique (python file,…) you use with me? I will share the mdd/mdx if I’m able to finish.


Thank you so much for your help!

I’m interested in the possibility of finding a way to secure all entries. Can’t tell from the clip whether an index is available. Scraping is always possible through browser automation if nothing else works.

It would be best if we could secure all the entries. I checked that website carefully and could not find any wordlist. Could you elaborate on “browser automation” in case the wordlist is not available?

For example, also does not provide the wordlist.

Le Grand Robert does not even have a short list of nearby words, as OALD does.

On a separate note, it would take quite some time to obtain’s full wordlist, but it’s for sure possible.

Many roads lead to Rome…
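
To make “browser automation” concrete, here is a rough sketch with Playwright: take candidate headwords from some other wordlist, type each one into the site's search box, and save whatever the result pane shows. Everything site-specific is an assumption: `SITE_URL`, `SEARCH_BOX` and `RESULT_PANE` are placeholders to be replaced after inspecting the page, and logging in with the account is left out.

```python
from pathlib import Path

# All three values are placeholders, not the site's real URL or selectors.
SITE_URL = "https://example.invalid/"
SEARCH_BOX = "input[type=search]"
RESULT_PANE = "#entry"


def candidate_words(lines):
    """Clean a borrowed wordlist: strip whitespace, drop blanks and duplicates."""
    seen = set()
    result = []
    for line in lines:
        word = line.strip()
        if word and word not in seen:
            seen.add(word)
            result.append(word)
    return result


def scrape(words, out_dir, timeout_ms=10_000):
    """Look up each word through the search box and save the result pane's HTML."""
    # Imported here so candidate_words() works without Playwright installed.
    from playwright.sync_api import TimeoutError as PlaywrightTimeout
    from playwright.sync_api import sync_playwright

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    missing = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(SITE_URL)  # login steps would go here
        for i, word in enumerate(words):
            page.fill(SEARCH_BOX, word)
            page.press(SEARCH_BOX, "Enter")
            try:
                # Simplification: a real run should also check that the pane
                # changed, since the previous entry may still be displayed.
                page.wait_for_selector(RESULT_PANE, timeout=timeout_ms)
            except PlaywrightTimeout:
                missing.append(word)
                continue
            html = page.inner_html(RESULT_PANE)
            (out / f"{i:06d}.html").write_text(html, encoding="utf-8")
        browser.close()
    return missing
```

Because the URL never changes, driving the search box is the only part that needs a real browser. Words returned in `missing` are candidates the dictionary did not show an entry for within the timeout.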

Are you interested in making this dictionary together? If yes, I will share the account to access the website.

I’m most interested in finding a way to secure all entries. If there isn’t a good way to do so, I can’t guarantee I’ll help scrape it. Happy to help if that’s not the case.

@deusexmachina would you also be interested in this project?

Would you like to cooperate with @Akira to obtain all the headwords?

The Grand Robert is by far the best French dictionary!

The project is not easy. It’s better to have a group of members joining hands :v

Grand Robert does have an index; it is only Robert & Collins that doesn't.