I downloaded several PD word lists from the web and had them unified and converted into 26 (A-Z) files, containing a total of 304,946 words in more than 5 MB. The words are already formatted in [[Wiki style]].
Even though Wikipedia will never be finished, basically this list should be our ultimate goal, right? So, I'd like to know:
- Is there a list of all (or many) words already somewhere on Wikipedia? I couldn't find one.
- Should I upload these files to, say, [[List of all words/A]] to [[List of all words/Z]]? Or is it just a waste of memory?
Magnus
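For readers who want to try something similar, the unify-and-split step described above could be sketched roughly as follows. This is only an illustration, not the script actually used: the function names, the sample words, and the choice to drop words that do not start with A-Z are all assumptions. Only the [[...]] link format and the "List of all words/A" to "List of all words/Z" page names come from the message.

```python
from collections import defaultdict
from string import ascii_uppercase

LETTERS = set(ascii_uppercase)


def merge_word_lists(word_lists):
    """Unify several word lists into one de-duplicated set of words."""
    merged = set()
    for words in word_lists:
        for word in words:
            word = word.strip()
            if word:  # skip blank lines
                merged.add(word)
    return merged


def split_by_initial(words):
    """Group words into A-Z buckets by first letter (assumption: others are dropped)."""
    buckets = defaultdict(list)
    for word in sorted(words, key=str.lower):
        initial = word[0].upper()
        if initial in LETTERS:
            buckets[initial].append(word)
    return buckets


def to_wiki_links(words):
    """Wrap each word in double square brackets, one link per line."""
    return "\n".join(f"[[{word}]]" for word in words)


# Hypothetical sample input standing in for the downloaded lists.
list_a = ["yak", "yard", "zebra", ""]
list_b = ["yard", "Yule", "apple"]

buckets = split_by_initial(merge_word_lists([list_a, list_b]))
pages = {
    f"List of all words/{letter}": to_wiki_links(bucket)
    for letter, bucket in buckets.items()
}
print(pages["List of all words/Y"])
```

Each entry in `pages` maps a page title to the text that would be pasted into that page. Plurals and non-encyclopedic words are not filtered here, so the output would still need manual editing.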
I might not be understanding what you've done very well, Magnus. A "PD word list" is a public domain word list?
I would say it's not the case that each word in a list of (English?) words corresponds to a potential Wikipedia article. Usually, article titles are combinations of words, for one thing; and very many ordinary words (e.g., "wasn't") do not correspond to encyclopedia article topics.
So, unless again I'm not understanding (which is entirely possible), I really don't see what the point would be of including such a list. This isn't to say there *isn't* a point, just that I don't understand what it is.
There's one disadvantage I can think of though--it might encourage too many people to think of Wikipedia as a dictionary. (Such a list would be a great resource for a Wiki dictionary, if anyone ever wanted to make one.)
Larry
On Sun, 16 Sep 2001, Magnus Manske wrote:
I downloaded several PD word lists from the web and had them unified and converted into 26 (A-Z) files, containing a total of 304,946 words in more than 5 MB. The words are already formatted in [[Wiki style]].
Even though Wikipedia will never be finished, basically this list should be our ultimate goal, right? So, I'd like to know:
- Is there a list of all (or many) words already somewhere on Wikipedia? I couldn't find one.
- Should I upload these files to, say, [[List of all words/A]] to [[List of all words/Z]]? Or is it just a waste of memory?
Magnus
[Wikipedia-l] To manage your subscription to this list, please go here: http://www.nupedia.com/mailman/listinfo/wikipedia-l
To see what it looks like, I posted the "words starting with Y" at http://www.wikipedia.com/wiki.cgi?Magnus_Manske/Y
Yes, there are lots of words I never saw before, and there are several plurals, but as you can see, we already covered a few of these. Maybe we'd have to manually edit these lists. My idea was that everyone could go through the "real" encyclopedia entries, so we could mark them off one by one. Maybe we could do a "Y-week"? (there goes my mind;)
Maybe I should save this for the "dictionary:" namespace on the coming PHP wikipedia...
Magnus
-----Original Message-----
From: wikipedia-l-admin@nupedia.com [mailto:wikipedia-l-admin@nupedia.com] On Behalf Of lsanger@nupedia.com
Sent: Sunday, September 16, 2001 6:53 PM
To: Wikipedia-L@Nupedia. Com
Subject: Re: [Wikipedia-l] Ultimate goal
I might not be understanding what you've done very well, Magnus. A "PD word list" is a public domain word list?
I would say it's not the case that each word in a list of (English?) words corresponds to a potential Wikipedia article. Usually, article titles are combinations of words, for one thing; and very many ordinary words (e.g., "wasn't") do not correspond to encyclopedia article topics.
So, unless again I'm not understanding (which is entirely possible), I really don't see what the point would be of including such a list. This isn't to say there *isn't* a point, just that I don't understand what it is.
There's one disadvantage I can think of though--it might encourage too many people to think of Wikipedia as a dictionary. (Such a list would be a great resource for a Wiki dictionary, if anyone ever wanted to make one.)
Larry
I'm not sure we should have any "dictionary:" namespace on Wikipedia. I don't think we should encourage the development of a dictionary on Wikipedia; that would distract from our main mission.
Just speaking for myself, editing a long list that has huge numbers of words that will never have encyclopedia articles doesn't strike me as a good use of time. I'd get tired of deleting the likes of "yumping" and "yuckiest." ;-) If others would like to do so, though, who am I to stop them?
I *do* see considerable value in getting a list of encyclopedia article titles from other encyclopedias. By golly, that's a brilliant idea. I *really* think we should do this.
Jimbo has compiled such a list, in fact, from dmoz. It's out of date, though. He'd probably really appreciate help with that...
Larry
On Sun, 16 Sep 2001, Magnus Manske wrote:
To see what it looks like, I posted the "words starting with Y" at http://www.wikipedia.com/wiki.cgi?Magnus_Manske/Y
Yes, there are lots of words I never saw before, and there are several plurals, but as you can see, we already covered a few of these. Maybe we'd have to manually edit these lists. My idea was that everyone could go through the "real" encyclopedia entries, so we could mark them off one by one. Maybe we could do a "Y-week"? (there goes my mind;)
Maybe I should save this for the "dictionary:" namespace on the coming PHP wikipedia...
Magnus
wikipedia-l@lists.wikimedia.org