I’ve just crawled the word list from https://www.collinsdictionary.com/dictionary/english-french. For each word, I have an associated link, for example https://www.collinsdictionary.com/dictionary/english-french/love. I would like to ask:
- In which format should I save the HTML in order to create the mdd/mdx? Is the method below fine?
    r = http.request('GET', url)
    data = r.data.decode('utf-8')
- How can I deal with the pronunciation associated with each word?
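To make the first question concrete, here is a minimal sketch of the fetching step as I currently picture it. It assumes `http` is a `urllib3.PoolManager`; the helper names `fetch_html` and `save_entry` and the one-file-per-word layout are only my own illustration, not something mdd/mdx requires as far as I know.

```python
from pathlib import Path


def fetch_html(url: str) -> str:
    """Download one entry page and return its HTML as text."""
    # Third-party package; imported here so save_entry works without it.
    import urllib3

    http = urllib3.PoolManager()  # assumption: this is what `http` refers to
    r = http.request("GET", url)
    if r.status != 200:
        raise RuntimeError(f"GET {url} returned HTTP {r.status}")
    # bytes.decode() returns a str; assumes the page is served as UTF-8
    return r.data.decode("utf-8")


def save_entry(word: str, html: str, out_dir: Path) -> Path:
    """Write one entry to <out_dir>/<word>.html as UTF-8 (hypothetical layout)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{word}.html"
    path.write_text(html, encoding="utf-8")
    return path
```

For the example link above, I would call `save_entry("love", fetch_html(url), Path("pages"))` and end up with `pages/love.html`.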
Thank you so much for your elaboration!