At 06:29 PM 8/7/02 +0100, Neil Harris wrote:
This is a trimmed-down version of my earlier over-length post.
All in all, a fine thing; below are a couple of specific suggestions for further improvement.
I've now added an extra filter, so that entries whose titles occur in a very large list of common words are rejected.
That sounds useful--it'll save us from having to rewrite articles about spices and such, and provide some useful stuff on less common topics.
Thus, the script will now not attempt to transfer the entry for "Wheel" or "Silk" or other common words, regardless of whether Wikipedia has an entry for that word. This is in addition to the check for not clobbering existing articles.
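For concreteness, the two title checks described above might look something like the sketch below. The names COMMON_WORDS, existing_titles and should_skip_title are hypothetical and not taken from the actual script, and the two-word set stands in for the very large word list.

```python
# Hypothetical names throughout; none of this is taken from the real script.
COMMON_WORDS = {"wheel", "silk"}  # stand-in for the very large common-word list


def should_skip_title(title, existing_titles, common_words=COMMON_WORDS):
    """Return True if the entry must not be imported."""
    key = title.strip().lower()
    # Reject common words whether or not Wikipedia already has the article.
    if key in common_words:
        return True
    # Never clobber an article that already exists.
    return key in {t.strip().lower() for t in existing_titles}
```

Comparing lowercased titles is a simplification; the real wiki only ignores the case of the first letter.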
I have also eliminated any articles containing the words "modern" or "current", which seems to catch a lot of stuff that refers to the author's contemporary information.
Again, a good idea: the Easton stuff seems more useful as a source of information about the Bible as a document than about the contemporary Middle East.
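A minimal sketch of that keyword filter, assuming whole-word, case-insensitive matching (the real script may match differently, for instance on substrings). DATED_WORDS and mentions_authors_present are illustrative names only.

```python
import re

# Assumption: whole words only, so "currents" or "modernity" do not match.
DATED_WORDS = re.compile(r"\b(?:modern|current)\b", re.IGNORECASE)


def mentions_authors_present(text):
    """Return True if the text uses 'modern' or 'current' as a word."""
    return DATED_WORDS.search(text) is not None
```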
I have also pushed the length filter up to 500 characters.
Doing all of these takes the list down to around 640 filtered articles. Wiki links to the non-imported topics still remain, inviting Wikipedians to write new articles about these topics. These remaining articles are almost entirely about obscure figures and places from the Bible.
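The length filter could be as small as the sketch below. It assumes the 500-character figure is a minimum, so that shorter entries are dropped, and that length is measured on the entry text with surrounding whitespace removed; MIN_LENGTH and long_enough are hypothetical names.

```python
MIN_LENGTH = 500  # assumption: entries shorter than this are not imported


def long_enough(text, min_length=MIN_LENGTH):
    """Return True if the entry body has at least min_length characters."""
    return len(text.strip()) >= min_length
```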
I intend to add a header to each imported article, reading something like:
''This is an entry from Easton's Bible Dictionary. The material in it is written from the viewpoint of the 19th century, and may be out-of-date or biased. Please review and edit this article to bring it up to date''
Maybe that could be expanded to note what sort of 19th century viewpoint: it's clearly Christian, but if it's a particular denomination, that's relevant (I can tell immediately that it wasn't put together by a 19th-century Jew, let alone a Buddhist or atheist).
and a trailer:
From [[Easton's Bible Dictionary (1897)]]
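Wrapping each article in the header and trailer quoted above could be sketched as follows. The wording is copied from this thread; the function name wrap_article and the blank-line spacing are assumptions.

```python
HEADER = (
    "''This is an entry from Easton's Bible Dictionary. The material in it "
    "is written from the viewpoint of the 19th century, and may be "
    "out-of-date or biased. Please review and edit this article to bring "
    "it up to date''"
)
TRAILER = "From [[Easton's Bible Dictionary (1897)]]"


def wrap_article(body):
    """Return the body with the header above it and the trailer below it."""
    return f"{HEADER}\n\n{body.strip()}\n\n{TRAILER}\n"
```

Keeping the two strings as constants means a change of wording, such as naming the denomination, only has to be made in one place.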
I intend to drip-feed the finished articles in at a rate of one every 20 minutes, allowing lots of time for human review and assimilation, once I think that there is a consensus that this is OK.
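The drip-feed itself might be no more than a loop with a pause, as in this sketch. The upload callable is a placeholder for whatever actually posts an article, and the sleep parameter is there only so the loop can be tried out without waiting.

```python
import time

INTERVAL_SECONDS = 20 * 60  # one article every 20 minutes


def drip_feed(articles, upload, interval=INTERVAL_SECONDS, sleep=time.sleep):
    """Call upload(title, text) for each article, pausing between uploads."""
    for index, (title, text) in enumerate(articles):
        if index > 0:
            sleep(interval)  # wait before every upload except the first
        upload(title, text)
```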
Can anyone suggest any further improvements, short of proof-reading all 1200 articles?
One practical fix: when I was editing the [[Amon]] article this morning, I found that it linked to itself in a couple of places. Can you tweak the script to not create self-links?
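One way the self-link fix could work, as a rough sketch: after the links have been generated, turn any link whose target is the article's own title back into plain text. The regular expression covers only simple [[Target]] and [[Target|label]] links, titles are compared in lowercase as a simplification, and unlink_self is a made-up name.

```python
import re

# Matches [[Target]] and [[Target|label]]; nested brackets are not handled.
LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def unlink_self(title, text):
    """Replace links that point at `title` with their plain display text."""
    target = title.strip().lower()

    def replace(match):
        page, label = match.group(1), match.group(2)
        if page.strip().lower() == target:
            return label if label else page
        return match.group(0)  # leave links to other pages untouched

    return LINK.sub(replace, text)
```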
Also, someone's going to have to proofread the articles--and I'd rather it be someone with more of an interest in the matter than I have, if only because such a person is more likely to catch misspellings of names.
And why not resume at Q or something, instead of back at the beginning of the alphabet?
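If the script holds the titles in a list, resuming partway through the alphabet could be as simple as the sketch below; resume_from is a hypothetical helper, not something the script is known to have.

```python
def resume_from(titles, start="Q"):
    """Return titles at or after `start`, in case-insensitive alphabetical order."""
    start_key = start.lower()
    return sorted((t for t in titles if t.lower() >= start_key), key=str.lower)
```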