List of the most frequent English words


Oddly, this page now seems to survive only as an archived copy.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>The OEC: Facts about the language</title>
    <style>body{font-family:Avenir,sans-serif}</style>
</head>
<body>
    <h1>The OEC: Facts about the language</h1>
    <small><a href="https://web.archive.org/web/20111226085859/http://oxforddictionaries.com/words/the-oec-facts-about-the-language">https://web.archive.org/web/20111226085859/http://oxforddictionaries.com/words/the-oec-facts-about-the-language</a></small>
    <p>The 20-volume historical Oxford English Dictionary is the largest record of words used in English, past and
        present. It contains words that are now obsolete or rare (such as xenagogue 'a person who guides strangers' and
        vicine 'neighbouring or adjacent') in addition to the latest coinages such as phishing and podcast.</p>
    <p>The second edition of the OED, published in 1989 and consisting of twenty volumes, contains more than 615,000
        entries, and the third, available online, is expanding all the time, with batches of 2,500 new and revised words
        and phrases being added in regular quarterly updates.</p>
    <h3>How many words are there in English?</h3>
    <p>It is a question often asked, but not so easily answered. Even the OED does not set out to include every
        specialized technical term or slang or dialect expression ever used. New words are constantly being invented,
        developed from existing words, or adopted from other languages. Most will be used rarely, or only by a small
        group of people. This means that an unlimited number of words may occur in speech and writing which will never
        be recorded in even the largest dictionary.</p>
    <p>Furthermore, what exactly is a word? Clearly we should include single units such as cat and dog. But are the
        plurals cats and dogs separate words? Should we include compounds such as walking stick, which are made up of
        two existing words? There are an almost unlimited number of such two-word compounds, which can't all be included
        in a dictionary. And what about abbreviations like BBC and Dr, or proper names such as London, Nelson, and Harry
        Potter: are they words? As you can see, the question is not a straightforward one.</p>
    <h3>How many words do we use?</h3>
    <p>Although it may be impossible to know the number of words in English, the Oxford English Corpus can help us
        assess the number of words in current use.</p>
    <p>Instead of talking about words, it's more useful in this context to talk about lemmas, a lemma being the base
        form of a word. For example, climbs, climbing, and climbed are all examples of the one lemma climb. Just ten
        different lemmas (the, be, to, of, and, a, in, that, have, and I) account for a remarkable 25% of all the words
        used in the Oxford English Corpus. If you were to read through the corpus, one word in four (ignoring proper
        names) would be an example of one of these ten lemmas. Similarly, the 100 most common lemmas account for 50% of
        the corpus, and the 1,000 most common lemmas account for 75%. But to account for 90% of the corpus you would
        need a vocabulary of 7,000 lemmas, and to get to 95% the figure would be around 50,000 lemmas.</p>
    <p>The remaining 5% of the corpus consists of a very large number of lemmas which occur rarely: words like moidore
        or parados, which may occur only once every several million words. Like all natural languages, English consists
        of a small number of very common words, a larger number of intermediate ones, and then an indefinitely long
        'tail' of very rare terms.</p>
    <table cellpadding="5" bordercolor="#000000" border="1" bgcolor="#ccffff" cellspacing="0" width="100%">
        <tbody>
            <tr valign="middle" align="left">
                <th>Vocabulary size (no. lemmas)</th>
                <th>% of content in OEC</th>
                <th>Example lemmas</th>
            </tr>
            <tr valign="middle" align="left">
                <td>10</td>
                <td>25%</td>
                <td>the, of, and, to, that, have</td>
            </tr>
            <tr valign="middle" align="left">
                <td>100</td>
                <td>50%</td>
                <td>from, because, go, me, our, well, way</td>
            </tr>
            <tr valign="middle" align="left">
                <td>1,000</td>
                <td>75%</td>
                <td>girl, win, decide, huge, difficult, series</td>
            </tr>
            <tr valign="middle" align="left">
                <td>7,000</td>
                <td>90%</td>
                <td>tackle, peak, crude, purely, dude, modest</td>
            </tr>
            <tr valign="middle" align="left">
                <td>50,000</td>
                <td>95%</td>
                <td>saboteur, autocracy, calyx, conformist</td>
            </tr>
            <tr valign="middle" align="left">
                <td>&gt;1,000,000</td>
                <td>99%</td>
                <td>laggardly, endobenthic, pomological</td>
            </tr>
        </tbody>
    </table>

    <p>The long tail means that to account for 99% of the Oxford English Corpus you would need a vocabulary of more than
        a million lemmas. This would include some words which may occur only once or twice in the whole corpus: highly
        technical terms like chondrogenesis or dicarboxylate, and one-off coinages like bootlickingly or unsurfworthy
        that people would probably understand but would be unlikely to use.</p>
    <p>If we decide that around 90-95% of the corpus gives a reasonable idea of an average vocabulary, we are left with
        a figure somewhere in the range of 7,000-50,000 lemmas: say, 25,000. What does a vocabulary of this size
        represent? It represents the set of most significant words in English: those which occur reasonably frequently
        and which account for all but a small part of everything we may encounter in speech or writing. It includes all
        the words that we actively use in general everyday life.</p>
    <p>It's interesting to note that most reasonably sized dictionaries contain significantly more than 25,000
        lemmas. The 11th edition of the Concise Oxford English Dictionary, for example, lists more than 75,000
        single-word lemmas, which means that the majority of its entries must belong to the long tail of extremely rare
        words. This makes good sense: such terms occur very infrequently, but when they do they are likely to be crucial
        to what's being said, and the reader might well want to look them up. The idea of a quantifiable vocabulary
        should be seen in this light: the words we ignore for the purposes of the exercise may be very rare, but in
        context they may be very important.</p>

    <h2>What is the commonest word?</h2>
    <p>Based on the evidence of the Oxford English Corpus, which currently contains over 2 billion words, the 100
        commonest English words found in writing around the world are as follows:</p>
    <table cellpadding="5" bordercolor="#000000" border="1" bgcolor="#ccffff" cellspacing="0" width="100%">
        <tbody>
            <tr valign="middle" align="left">
                <td>1 &nbsp;&nbsp;&nbsp; the <br>
                    2 &nbsp;&nbsp;&nbsp; be <br>
                    3 &nbsp;&nbsp;&nbsp; to <br>
                    4 &nbsp;&nbsp;&nbsp; of <br>
                    5 &nbsp;&nbsp;&nbsp; and <br>
                    6 &nbsp;&nbsp;&nbsp; a <br>
                    7 &nbsp;&nbsp;&nbsp; in <br>
                    8 &nbsp;&nbsp;&nbsp; that <br>
                    9 &nbsp;&nbsp;&nbsp; have <br>
                    10 &nbsp;&nbsp; I <br>
                    11 &nbsp;&nbsp; it <br>
                    12 &nbsp;&nbsp; for <br>
                    13 &nbsp;&nbsp; not <br>
                    14 &nbsp;&nbsp; on <br>
                    15 &nbsp;&nbsp; with <br>
                    16 &nbsp;&nbsp; he <br>
                    17 &nbsp;&nbsp; as <br>
                    18 &nbsp;&nbsp; you <br>
                    19 &nbsp;&nbsp; do <br>
                    20 &nbsp;&nbsp; at <br>
                    21 &nbsp;&nbsp; this <br>
                    22 &nbsp;&nbsp; but <br>
                    23 &nbsp;&nbsp; his <br>
                    24 &nbsp;&nbsp; by <br>
                    25 &nbsp;&nbsp; from <br>
                    &nbsp;</td>
                <td>26 &nbsp;&nbsp; they <br>
                    27 &nbsp;&nbsp; we <br>
                    28 &nbsp;&nbsp; say <br>
                    29 &nbsp;&nbsp; her <br>
                    30 &nbsp;&nbsp; she <br>
                    31 &nbsp;&nbsp; or <br>
                    32 &nbsp;&nbsp; an <br>
                    33 &nbsp;&nbsp; will <br>
                    34 &nbsp;&nbsp; my <br>
                    35 &nbsp;&nbsp; one <br>
                    36 &nbsp;&nbsp; all <br>
                    37 &nbsp;&nbsp; would <br>
                    38 &nbsp;&nbsp; there <br>
                    39 &nbsp;&nbsp; their <br>
                    40 &nbsp;&nbsp; what <br>
                    41 &nbsp;&nbsp; so <br>
                    42 &nbsp;&nbsp; up <br>
                    43 &nbsp;&nbsp; out <br>
                    44 &nbsp;&nbsp; if <br>
                    45 &nbsp;&nbsp; about <br>
                    46 &nbsp;&nbsp; who <br>
                    47 &nbsp;&nbsp; get <br>
                    48 &nbsp;&nbsp; which <br>
                    49 &nbsp;&nbsp; go <br>
                    50 &nbsp;&nbsp; me <br>
                    &nbsp;</td>
                <td>51 &nbsp;&nbsp; when <br>
                    52 &nbsp;&nbsp; make <br>
                    53 &nbsp;&nbsp; can <br>
                    54 &nbsp;&nbsp; like <br>
                    55 &nbsp;&nbsp; time <br>
                    56 &nbsp;&nbsp; no <br>
                    57 &nbsp;&nbsp; just <br>
                    58 &nbsp;&nbsp; him <br>
                    59 &nbsp;&nbsp; know <br>
                    60 &nbsp;&nbsp; take <br>
                    61 &nbsp;&nbsp; people <br>
                    62 &nbsp;&nbsp; into <br>
                    63 &nbsp;&nbsp; year <br>
                    64 &nbsp;&nbsp; your <br>
                    65 &nbsp;&nbsp; good <br>
                    66 &nbsp;&nbsp; some <br>
                    67 &nbsp;&nbsp; could <br>
                    68 &nbsp;&nbsp; them <br>
                    69 &nbsp;&nbsp; see <br>
                    70 &nbsp;&nbsp; other <br>
                    71 &nbsp;&nbsp; than <br>
                    72 &nbsp;&nbsp; then <br>
                    73 &nbsp;&nbsp; now <br>
                    74 &nbsp;&nbsp; look <br>
                    75 &nbsp;&nbsp; only <br>
                    &nbsp;</td>
                <td>76 &nbsp;&nbsp; come <br>
                    77 &nbsp;&nbsp; its <br>
                    78 &nbsp;&nbsp; over <br>
                    79 &nbsp;&nbsp; think <br>
                    80 &nbsp;&nbsp; also <br>
                    81 &nbsp;&nbsp; back <br>
                    82 &nbsp;&nbsp; after <br>
                    83 &nbsp;&nbsp; use <br>
                    84 &nbsp;&nbsp; two <br>
                    85 &nbsp;&nbsp; how <br>
                    86 &nbsp;&nbsp; our <br>
                    87 &nbsp;&nbsp; work <br>
                    88 &nbsp;&nbsp; first <br>
                    89 &nbsp;&nbsp; well <br>
                    90 &nbsp;&nbsp; way <br>
                    91 &nbsp;&nbsp; even <br>
                    92 &nbsp;&nbsp; new <br>
                    93 &nbsp;&nbsp; want <br>
                    94 &nbsp;&nbsp; because <br>
                    95 &nbsp;&nbsp; any <br>
                    96 &nbsp;&nbsp; these <br>
                    97 &nbsp;&nbsp; give <br>
                    98 &nbsp;&nbsp; day <br>
                    99 &nbsp;&nbsp; most <br>
                    100 &nbsp; us <br>
                    &nbsp;</td>
            </tr>
        </tbody>
    </table>

    <p>It's noticeable that many of the most frequently used words are short ones whose main purpose is to join other,
        longer words rather than determine the meaning of a sentence. These are known as 'function words'. It could be
        said that it's more interesting to explore the frequency of 'content words', as shown in the list below:</p>
    <table cellpadding="5" bordercolor="#000000" border="1" bgcolor="#ccffff" cellspacing="0" width="100%">
        <tbody>
            <tr valign="middle" align="left">
                <th>Nouns</th>
                <th>Verbs</th>
                <th>Adjectives</th>
            </tr>
            <tr valign="middle" align="left">
                <td>1 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; time <br>
                    2 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; person <br>
                    3 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; year <br>
                    4 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; way <br>
                    5 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; day <br>
                    6 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; thing <br>
                    7 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; man <br>
                    8 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; world <br>
                    9 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; life <br>
                    10 &nbsp;&nbsp;&nbsp;&nbsp; hand <br>
                    11 &nbsp;&nbsp;&nbsp;&nbsp; part <br>
                    12 &nbsp;&nbsp;&nbsp;&nbsp; child <br>
                    13 &nbsp;&nbsp;&nbsp;&nbsp; eye <br>
                    14 &nbsp;&nbsp;&nbsp;&nbsp; woman <br>
                    15 &nbsp;&nbsp;&nbsp;&nbsp; place <br>
                    16 &nbsp;&nbsp;&nbsp;&nbsp; work <br>
                    17 &nbsp;&nbsp;&nbsp;&nbsp; week <br>
                    18 &nbsp;&nbsp;&nbsp;&nbsp; case <br>
                    19 &nbsp;&nbsp;&nbsp;&nbsp; point <br>
                    20 &nbsp;&nbsp;&nbsp;&nbsp; government <br>
                    21 &nbsp;&nbsp;&nbsp;&nbsp; company <br>
                    22 &nbsp;&nbsp;&nbsp;&nbsp; number <br>
                    23 &nbsp;&nbsp;&nbsp;&nbsp; group <br>
                    24 &nbsp;&nbsp;&nbsp;&nbsp; problem <br>
                    25 &nbsp;&nbsp;&nbsp;&nbsp; fact <br>
                    &nbsp;</td>
                <td>1 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; be <br>
                    2 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; have <br>
                    3 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; do <br>
                    4 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; say <br>
                    5 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; get <br>
                    6 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; make <br>
                    7 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; go <br>
                    8 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; know <br>
                    9 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; take <br>
                    10 &nbsp;&nbsp;&nbsp;&nbsp; see <br>
                    11 &nbsp;&nbsp;&nbsp;&nbsp; come <br>
                    12 &nbsp;&nbsp;&nbsp;&nbsp; think <br>
                    13 &nbsp;&nbsp;&nbsp;&nbsp; look <br>
                    14 &nbsp;&nbsp;&nbsp;&nbsp; want <br>
                    15 &nbsp;&nbsp;&nbsp;&nbsp; give <br>
                    16 &nbsp;&nbsp;&nbsp;&nbsp; use <br>
                    17 &nbsp;&nbsp;&nbsp;&nbsp; find <br>
                    18 &nbsp;&nbsp;&nbsp;&nbsp; tell <br>
                    19 &nbsp;&nbsp;&nbsp;&nbsp; ask <br>
                    20 &nbsp;&nbsp;&nbsp;&nbsp; work <br>
                    21 &nbsp;&nbsp;&nbsp;&nbsp; seem <br>
                    22 &nbsp;&nbsp;&nbsp;&nbsp; feel <br>
                    23 &nbsp;&nbsp;&nbsp;&nbsp; try <br>
                    24 &nbsp;&nbsp;&nbsp;&nbsp; leave <br>
                    25 &nbsp;&nbsp;&nbsp;&nbsp; call <br>
                    &nbsp;</td>
                <td>1 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; good <br>
                    2 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; new <br>
                    3 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; first <br>
                    4 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; last <br>
                    5 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; long <br>
                    6 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; great <br>
                    7 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; little <br>
                    8 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; own <br>
                    9 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; other <br>
                    10 &nbsp;&nbsp;&nbsp;&nbsp; old <br>
                    11 &nbsp;&nbsp;&nbsp;&nbsp; right <br>
                    12 &nbsp;&nbsp;&nbsp;&nbsp; big <br>
                    13 &nbsp;&nbsp;&nbsp;&nbsp; high <br>
                    14 &nbsp;&nbsp;&nbsp;&nbsp; different <br>
                    15 &nbsp;&nbsp;&nbsp;&nbsp; small <br>
                    16 &nbsp;&nbsp;&nbsp;&nbsp; large <br>
                    17 &nbsp;&nbsp;&nbsp;&nbsp; next <br>
                    18 &nbsp;&nbsp;&nbsp;&nbsp; early <br>
                    19 &nbsp;&nbsp;&nbsp;&nbsp; young <br>
                    20 &nbsp;&nbsp;&nbsp;&nbsp; important <br>
                    21 &nbsp;&nbsp;&nbsp;&nbsp; few <br>
                    22 &nbsp;&nbsp;&nbsp;&nbsp; public <br>
                    23 &nbsp;&nbsp;&nbsp;&nbsp; bad <br>
                    24 &nbsp;&nbsp;&nbsp;&nbsp; same <br>
                    25 &nbsp;&nbsp;&nbsp;&nbsp; able <br>
                    &nbsp;</td>
            </tr>
        </tbody>
    </table>

    <h3>Nouns</h3>
    <p>The commonest nouns are time, person, and year, followed by way and day (month is 40th). The majority of the top
        25 nouns (15) are from Old English, and of the remainder, most came into medieval English from Old French, and
        before that from Latin. Notice that many of these words are very common because they have more than one meaning:
        way and part, for example, are listed in the Concise OED as having 18 and 16 different meanings respectively.
        They often also form part of common phrases: some of the frequency of time, for example, comes from its use in
        adverbial phrases like on time, in time, last time, next time, this time, etc.</p>

    <h3>Verbs</h3>
    <p>As you would expect, the commonest verbs express basic concepts. Strikingly, the 25 most frequent verbs are all
        one-syllable words; the first two-syllable verbs are become (26th) and include (27th). Of these 25, 20 are Old
        English words, and three more, get, seem, and want, entered English from Old Norse in the early medieval period.
        Only try and use came from Old French. It seems that English prefers terse, ancient words to describe actions or
        occurrences.</p>

    <h3>Adjectives</h3>
    <p>Again, most of the top adjectives are one-syllable words, and 17 out of 25 derive from Old English: only
        different, large, and important are from Latin. In terms of the words' meanings, great is higher in the ranking
        than big, probably because of its informal sense 'very good'. Little is surprisingly high at 7, as compared with
        small at 15. Bad is unexpectedly low at 23: is this because we have such a large choice of synonyms available
        for expressing 'bad things'?</p>
</body>
</html>
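
For anyone who wants to reproduce the kind of coverage figures quoted in the page (10 lemmas for 25%, 100 for 50%, and so on) on their own text, here is a minimal Python sketch. It assumes the input is already a list of lemmas; lemmatization itself is not shown. The function names `coverage_curve` and `vocabulary_needed` and the toy data are made up for illustration and do not come from the OEC.

```python
from collections import Counter


def coverage_curve(lemmas):
    """Return (vocabulary_size, cumulative_share) pairs, most frequent lemma first."""
    counts = Counter(lemmas)
    total = sum(counts.values())
    curve = []
    running = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        running += n
        curve.append((rank, running / total))
    return curve


def vocabulary_needed(lemmas, target):
    """Smallest number of top-ranked lemmas whose combined share reaches `target`."""
    for size, share in coverage_curve(lemmas):
        if share >= target:
            return size
    # Target above 100%: the whole vocabulary is the best we can do.
    return len(set(lemmas))


# Toy stand-in for a lemmatized corpus (hypothetical data, 20 tokens in total).
toy_lemmas = (
    ["the"] * 8 + ["be"] * 5 + ["to"] * 3 + ["cat"] * 2 + ["moidore", "parados"]
)

for target in (0.5, 0.85, 0.99):
    print(f"{target:.0%} coverage needs {vocabulary_needed(toy_lemmas, target)} lemmas")
```

Even on this tiny sample the shape matches the article: two lemmas cover half the text, while the last few percent require the rare items in the tail.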