I downloaded the one from the first post.
Compared with the mdx that 阿弥陀佛 forwarded, it has a few more entries, such as Africa, which the OED had not included before; a single entry is so long you can hardly finish reading it.
my fault, i downloaded the .rar file.
Just change the extension to .html and it will open.
Re-downloading only takes a few seconds anyway.
as @karx said, these are HTML files of the entries on the OED site, current to december of last year. the mdx files are for the categories feature of the OED. i appreciate your interest in producing an mdx file of the OED based on these files, something a lot of people have been asking for. credit should also go to the original uploader @137229.
Where are the audio and image files?
Pronunciation is fetched online; there are no pictures.
Do you have its JavaScript? The site has been changed since then, and the new JavaScript doesn’t work with this 2021 version.
My 2021 version works with the old JS, which is likely in the mdd.
I downloaded this one whose maker is the same as the post’s OP, viz. 137223.
Since the old version can fetch pronunciation online, I suppose the raw data contain pronunciation URLs.
I don’t have the js you want.
That is just my guess. Sorry to interrupt.
Thank you very much for the file link. It seems the 2021 version has everything I need.
But it only contains content up to June 2021.
here is a text file with direct links to the sound files:
Sounds.zip (10.0 MB)
there may be some duplicates in the file above (the links are extracted raw from the OED files posted), so here is the same list, deduplicated and sorted alphabetically:
Sounds - Sorted.zip (9.3 MB)
members can compare the two and see whether deduplication and sorting led to any loss of entries. also, wget does a really good job of downloading the files.
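That comparison only needs coreutils. A minimal sketch with a synthetic link list (the placeholder URLs and filenames here stand in for the real contents of the two zips):

```shell
# synthetic raw link list with one duplicate (placeholder URLs)
printf '%s\n' \
  'https://example.com/sounds/b.mp3' \
  'https://example.com/sounds/a.mp3' \
  'https://example.com/sounds/b.mp3' > raw.txt

# deduplicate and sort, as was done for the second zip
sort -u raw.txt > deduped.txt

# rebuild the deduplicated list independently and compare:
# identical output means sorting/deduplication lost no links
sort raw.txt | uniq > check.txt
cmp check.txt deduped.txt && echo "no links lost"
```

Running the same `sort`/`uniq`/`cmp` over the two posted text files would confirm the deduplicated zip is complete.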
One more thing: the files from MediaFire aren’t numbered consecutively. For example, the first file is named “1”, but the second file is named “4”. Where are files “2” and “3”?
They’re the CSS and JS in the mdd. That’s what I need. I managed to get the audio working with 歐路.
this strangeness comes from the OED itself. the entries on there come in this form: https://www.oed.com/view/Entry/X
, where the “X” at the end is the number of an entry. so for the definition of “A”, its URL on the OED is: https://www.oed.com/view/Entry/1
this is why the filenames are numbers, with file “1” corresponding to entry “A” in the OED. likewise, the entry “Dogfood”, whose OED URL is https://www.oed.com/view/Entry/90744850
, corresponds to the file named “90744850” found in archive 6. if you try entering these URLs into your browser:
https://www.oed.com/view/Entry/2
https://www.oed.com/view/Entry/3
https://www.oed.com/view/Entry/5
https://www.oed.com/view/Entry/6
https://www.oed.com/view/Entry/7
you’ll get a 404 error from the OED site, which means that the second and third entries in the OED have the URLs https://www.oed.com/view/Entry/4
and https://www.oed.com/view/Entry/8
, which are the second and third files in the archives.
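In other words, sorting the numeric filenames numerically reproduces the OED’s entry order, gaps and all. A quick illustration with empty dummy files named after the examples above:

```shell
# recreate a few of the numeric filenames discussed above
mkdir -p entries
touch entries/1 entries/4 entries/8 entries/90744850

# numeric sort gives the entries in OED order: "A" first, "Dogfood" last
ls entries | sort -n
# prints:
# 1
# 4
# 8
# 90744850
```

A plain lexical sort would misorder larger IDs (e.g. “10” before “4”), so `sort -n` is the safe choice when reassembling the archives.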
Thank you very much for your confirmation. I just wanted to make sure I had everything before going further.
Hello, I have a question about mp3 downloading. I have extracted all the mp3 links from the HTML files and now need to download them with wget. I’ve put all 467,484 URLs in a text file. I know wget can resume a single file with -c, but with an input file like this I can’t download all the mp3s in one day, so I need to Ctrl+C to pause wget and resume the next day. Will -c be able to resume downloading the rest of the mp3s?
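For anyone repeating the extraction step, grep alone is enough; the URL pattern below is an assumption about how the mp3 links appear in the saved HTML, and the filenames are placeholders:

```shell
# a fake entry file with one embedded mp3 link (placeholder URL)
printf '<a href="https://example.com/sounds/word.mp3">play</a>\n' > entry.html

# -o prints only the matched URL, -h suppresses filename prefixes,
# -E enables extended regex; sort -u removes duplicate links
grep -ohE 'https?://[^"]+\.mp3' entry.html | sort -u > links.txt
cat links.txt
# prints: https://example.com/sounds/word.mp3
```

Run over all the HTML files (`grep -ohE '...' files/*`), this produces exactly the kind of one-URL-per-line list that wget’s -i expects.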
I’ve tried JDownloader and Free Download Manager, but they don’t support downloading from an input text file. wget does, the speed is good, and it retains file names, which is very important. I’m just not sure whether it can resume the remaining files. I set up the Windows Subsystem for Linux just to get wget. I have two options: download from all the links with pauses and resumes, or split the link list and download in smaller batches.
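On the resume question: with -i, wget works through the list top to bottom, and -c makes each already-complete file a cheap skip (assuming the server supports range requests, it answers 416 and wget moves on) and each partial file a resume. So `wget -c -i links.txt` after Ctrl+C does work; it just re-checks every finished URL first, which over ~467k links costs a lot of round-trips. Splitting into batches avoids those re-checks. A sketch of the batch option with a synthetic list (batch size and filenames are placeholders):

```shell
# synthetic stand-in for the real 467,484-line link file
printf 'https://example.com/sounds/%d.mp3\n' $(seq 1 10) > links.txt

# split into batches of 4 URLs each: batch_aa, batch_ab, batch_ac
split -l 4 links.txt batch_

wc -l batch_*   # 4 + 4 + 2 lines across the three batches

# then work through one batch per day, resumable within a batch:
#   wget -c -i batch_aa -P mp3/
```

With the real file, something like `split -l 50000 links.txt batch_` would give roughly one batch per day’s downloading.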