Linguee Collocations

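An English collocation "dictionary" generated by intersecting the headwords of Linguee's English-French and English-Spanish dictionaries, merging in the 60,000-word COCA list, and running a full-text search over the result.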
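A minimal sketch of how such a pipeline could look, not the actual build script. It assumes that "merging" means a set union with the COCA words and that "full-text search" means finding every phrase that contains a headword as a whole word. The function name `build_collocations` and the toy data are hypothetical.

```python
import re

def build_collocations(pool):
    """Map each headword to the other phrases in the pool that contain it."""
    entries = {}
    for head in pool:
        # Whole-word match, so "naked" does not match inside a longer word.
        pattern = re.compile(r"(?<!\w)" + re.escape(head) + r"(?!\w)", re.IGNORECASE)
        hits = sorted(p for p in pool if p != head and pattern.search(p))
        if hits:
            entries[head] = hits
    return entries

# Toy headword sets standing in for the real dictionaries (assumed data).
en_fr = {"naked", "naked eye", "stark naked", "snake"}
en_es = {"naked", "naked eye", "stark naked", "beep"}
coca = {"naked", "snake"}

pool = (en_fr & en_es) | coca   # intersect the two dictionaries, then merge COCA
entries = build_collocations(pool)
print(entries)
```

Every headword is compared against every phrase in the pool, which is why the running time of this naive version grows with the square of the pool size.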

Sample entry:

naked

naked aggression, naked ambition, naked back, naked barley, naked bike, naked body, naked breast, naked bulb, naked cds, naked challenge, naked children, naked eye, naked feet, naked flame, naked force, naked girls, naked gun, naked licensing, naked light, naked lunch, naked man, naked mole rat, naked owner, naked people, naked photos, naked pictures, naked power, naked racism, naked ride, naked selling, naked short, naked short position, naked short selling, naked shorting, naked skin, naked state, naked steel, naked sword, naked torso, naked truth, naked version, naked woman, naked-eye

almost naked, are naked, around naked, being naked, born naked, buck naked, clothe the naked, clothing the naked, completely naked, dressed naked, emperor is naked, felt naked, get naked, go naked, half naked, half-naked, i am naked, i feel naked, i was naked, i’m naked, is naked, lay naked, leave naked, nearly naked, pose naked, run naked, semi-naked, sheer naked, stand naked, stark naked, stay naked, strip naked, stripped naked, stripping naked, swimming naked, to be naked, walk naked, was naked, while naked, you are naked

a naked flame, by the naked eye, do not spray on a naked flame, do not spray on naked flame, invisible to the naked eye, near a naked flame, no naked flame sources, no naked flame sources such as candles should be placed on t, not visible to the naked eye, obvious to the naked eye, read with the naked eye, seen by the naked eye, seen with the naked eye, small naked oats, visible to the naked eye, with a naked eye, with the naked eye

Could the collocations be sorted by frequency instead of alphabetically?

Linguee doesn't provide frequency data for words or phrases, so there is nothing I can do about that.

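Hmm, personally I think sorting by frequency would make this much more valuable. There is an approximate solution, similar to what 汉语多功能字库 (the Multi-function Chinese Character Database) does: check several databases (Collins, for example, also has this kind of data), and the more of them a collocation appears in, the higher its frequency can be assumed to be.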
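A rough sketch of that idea, assuming each source is simply a list of collocations. The source names and data below are made up for illustration, and `rank_by_source_count` is a hypothetical helper, not part of any existing tool.

```python
from collections import Counter

def rank_by_source_count(sources):
    """Order collocations by how many sources contain them, most first."""
    counts = Counter()
    for collocations in sources.values():
        # set() so a collocation repeated within one source counts only once.
        counts.update(set(collocations))
    # Ties fall back to alphabetical order.
    return sorted(counts, key=lambda c: (-counts[c], c))

# Hypothetical sample data; real sources would need cleaning and alignment.
sources = {
    "linguee": ["naked eye", "naked flame", "naked truth"],
    "collins": ["naked eye", "naked truth"],
    "other": ["naked eye"],
}
ranked = rank_by_source_count(sources)
print(ranked)
```

The source count is only a proxy for frequency: it says how widely a collocation is recorded, not how often it is actually used.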

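That's a good idea. But consolidating data from multiple sources takes effort and would add to the burden of keeping the project updated in the long run, so I've chosen to leave things as they are.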

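The collocations currently listed are taken from the intersection of three corpora (English-German, English-French and English-Spanish), which should guarantee a basic level of usage frequency. Besides, sorting by frequency only pays off when an entry has a very large number of collocations; with a few dozen or a couple of hundred, users can simply skim the whole list.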

Looking forward to the finished work!

Entries with only one collocation should be dropped, for example beep: a beep, and behalf of him: on behalf of him.
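Such a filter could be sketched as follows, assuming the entries are held as a mapping from headword to a list of collocations. The name `drop_singletons` and the threshold parameter are hypothetical.

```python
def drop_singletons(entries, min_collocations=2):
    """Keep only headwords that have at least `min_collocations` distinct collocations."""
    return {
        head: collocations
        for head, collocations in entries.items()
        if len(set(collocations)) >= min_collocations
    }

entries = {
    "beep": ["a beep"],
    "behalf of him": ["on behalf of him"],
    "naked": ["naked eye", "stark naked"],
}
filtered = drop_singletons(entries)
print(filtered)
```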

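I will simplify the processing pipeline and take only the intersection of the English-French and English-Spanish headwords.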

Current version: 0.1g

It is larger than the previous version: looking up naked now finds swimming naked.

Linguee Collocations 0.1g.7z.001 (18 MB)

Linguee Collocations 0.1g.7z.002 (14.7 MB)


Thanks for sharing, I've gotten a lot out of it. Would you consider bundling the Word Family data with it? Thanks.

I rarely look up Word Family; you can unpack the two mdx files and merge them yourself.

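Let S1, S2 and S3 be the English-German, English-French and English-Spanish headword sets, each containing more than 7 million words/phrases. There are three options for choosing the subset to run the full-text search on: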

(S1 & S2) | (S2 & S3) | (S3 & S1): 3.6 million entries

S2 & S3: 2.1 million entries

S1 & S2 & S3: 1.3 million entries

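The first option gives the best final result, but since the computation time is proportional to the square of the number of elements in the set, I had to settle for the second.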
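To make the trade-off concrete, here is a small sketch using Python's set operators on toy data, together with the relative cost implied by a quadratic running time. The toy sets and the helper `relative_cost` are illustrative assumptions; only the three entry counts come from the figures above.

```python
# Toy stand-ins for the three headword sets (assumed data).
s1 = {"naked eye", "naked flame", "buck naked", "a beep"}   # English-German
s2 = {"naked eye", "naked flame", "stark naked"}            # English-French
s3 = {"naked eye", "stark naked", "buck naked"}             # English-Spanish

option1 = (s1 & s2) | (s2 & s3) | (s3 & s1)   # present in at least two sets
option2 = s2 & s3                             # English-French and English-Spanish only
option3 = s1 & s2 & s3                        # present in all three sets

def relative_cost(n, baseline):
    """Cost of searching n items relative to the baseline, assuming O(n^2) time."""
    return (n / baseline) ** 2

# Entry counts quoted in the post, compared against option 2.
cost1 = relative_cost(3_600_000, 2_100_000)
cost3 = relative_cost(1_300_000, 2_100_000)
print(f"option 1: {cost1:.2f}x, option 3: {cost3:.2f}x the time of option 2")
```

Under the quadratic assumption, the first option would take roughly three times as long as the second, while the third would take well under half.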