CJK-English Wiktionary 2021




import requests
import sys
from fake_useragent import UserAgent
from bs4 import BeautifulSoup as b
from concurrent.futures import ThreadPoolExecutor
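# All print() output is redirected into the dictionary source file.
sys.stdout = open("133000.txt", "wt", encoding="utf-8")
user = UserAgent()
# One headword per line.
with open('small_file_133000.txt', encoding='utf-8') as word_file:
    characters = [line.strip() for line in word_file]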

def function_name(character):
    # Optional proxy pool (disabled). requests takes one proxy URL per scheme,
    # so pick a single entry from each list before passing proxies=... to requests.get().
    # proxies = {
    #     'http': [
    #         'http://176.110.121.90:21776',
    #         'http://91.187.113.205:53281',
    #         'http://31.172.177.149:83',
    #         'http://116.90.229.186:35561',
    #         'http://103.83.118.10:55443',
    #         'http://116.73.14.16:80',
    #         'http://50.233.42.98:51696',
    #     ],
    #     'https': [
    #         'https://176.110.121.90:21776',
    #         'https://91.187.113.205:53281',
    #         'https://31.172.177.149:83',
    #         'https://116.90.229.186:35561',
    #         'https://103.83.118.10:55443',
    #         'https://116.73.14.16:80',
    #         'https://50.233.42.98:51696',
    #     ],
    # }
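    # Send a random User-Agent string with each request.
    chrome_ = user.random
    header = {'User-Agent': chrome_}
    url = f'https://en.m.wiktionary.org/wiki/{character}'
    r = requests.get(url, headers=header)
    soup = b(r.text, 'html.parser')
    # The page title becomes the headword once the site suffix is removed.
    titlewik = soup.title.string
    titlewik_final = titlewik.replace(" - Wiktionary", "")
    # Keep only the <main> element, turn wiki links into entry:// links,
    # and flatten the HTML onto a single line.
    find_main = soup.main
    str_main = str(find_main)
    final_main = str_main.replace("/wiki/", "entry://").replace("\n", "")
    # One record: headword, stylesheet link plus HTML, then the "</>" terminator.
    return titlewik_final + "\n<link rel='stylesheet' href='/CJK.css'>" + final_main + "\n</>"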

with ThreadPoolExecutor() as executor:
    for res in executor.map(function_name, characters):
        print(res)
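
For anyone unsure what the script actually writes out, here is a small offline sketch of the record-building step only, with no network access. `format_record` and its parameters are names made up for this illustration; the string handling copies what the script above does. The headword / HTML / `</>` layout looks like the plain-text source that MDX builder tools take as input, but that is an assumption, since the thread does not say so explicitly.

```python
def format_record(page_title, main_html, css_href="/CJK.css"):
    """Build one dictionary record from a page title and its <main> HTML.

    Hypothetical helper for illustration; it mirrors the string
    operations in the script above.
    """
    # "水 - Wiktionary" -> "水"
    headword = page_title.replace(" - Wiktionary", "")
    # Rewrite internal wiki links and flatten the HTML onto one line.
    body = main_html.replace("/wiki/", "entry://").replace("\n", "")
    # Headword line, then stylesheet link + HTML, then the "</>" terminator.
    return f"{headword}\n<link rel='stylesheet' href='{css_href}'>{body}\n</>"


# Tiny hand-written sample standing in for a downloaded page.
sample_html = '<main>\n<a href="/wiki/water">water</a>\n</main>'
record = format_record("水 - Wiktionary", sample_html)
print(record)
```

Each record therefore takes three lines in the output file, which is why the script strips newlines from the HTML before writing it.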

CJK-English Wiktionary 2021 (Python).zip (2.6 MB)
CJK.css (60.8 KB)

https://forum.freemdict.com/t/topic/6259
https://forum.freemdict.com/t/topic/6787
https://t.me/freemdict/153016


I don't quite follow. Could someone walk me through how to use this? Thanks.


Download the complete file here:

If you don't have a Telegram account, you can download the mdx here:
https://cowtransfer.com/s/af012cbe7a4741
http://dl.tiapp.ga/170851898936290503733890616/CJK-English%20Wiktionary%202021.mdx

Could that Python script be used for other languages, such as German?


You would need a German word list, and the URL might need to change to the German one. These are the two lines to adjust:
sys.stdout = open("133000.txt", "wt")
url = f'https://en.m.wiktionary.org/wiki/{character}'
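
A rough sketch of how the URL line could be made language-aware. `build_url` and its `lang` parameter are hypothetical names, not part of the original script. Whether you want the `de` edition (entries written in German) or just a German word list run against the `en` edition depends on which language you want the definitions in, so treat the default as a guess.

```python
from urllib.parse import quote


def build_url(word, lang="en"):
    """Return the mobile Wiktionary URL for `word` on the `lang` edition.

    Hypothetical helper: `lang` is the edition's subdomain, e.g. "en" or "de".
    """
    # quote() percent-encodes non-ASCII letters and spaces in the page title.
    return f"https://{lang}.m.wiktionary.org/wiki/{quote(word)}"


print(build_url("Haus", lang="de"))
print(build_url("Straße"))
```

Inside the script you would then swap the `url = ...` line for a call to `build_url(character, lang="de")` and point `open(...)` at a new output file name.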


Link: Baidu Netdisk. Extraction code: p9bs


Chinese, Japanese and Korean words from Wiktionary.

This dictionary works very well as a Chinese-English dictionary.
Also, out of curiosity, is "CJK" the name of some translation company?

Chinese, Japanese and Korean

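First, thanks to the original poster and @amob for their great work. I fixed the animated image (circled in green in the picture above) that was missing when displayed in GD, and cleaned out the useless content circled in red.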

Link: Baidu Netdisk
Extraction code: amrg

