To me, a major part of the problem is that the material is so out of date. It fails to take into account the past hundred years of archeological research, which is essential. Furthermore, the statistics it gives about places are hopelessly outdated. For example, Anatoth, currently 'Anata, is a fair-sized town today, not a hamlet with about 100 people. I don't know how they will deal with places like Dan, Gezer, Megiddo, Hazor, etc., but they have been major excavation sites this century, so most of what we know about them (apart from conjecture) could not possibly appear in a nineteenth-century work.
Danny
This is a trimmed-down version of my earlier over-length post.
I've now added an extra filter, so that entries whose titles occur in a very large list of common words are rejected.
Thus, the script will now not attempt to transfer the entry for "Wheel" or "Silk" or other common words, regardless of whether Wikipedia has an entry for that word. This is in addition to the check for not clobbering existing articles.
I have also eliminated any articles containing the words "modern" or "current", which seems to catch a lot of stuff that refers to the author's contemporary information.
I have also pushed the length filter up to 500 characters.
Doing all of these takes the list down to around 640 filtered articles. Wiki links to the non-imported topics still remain, inviting Wikipedians to write new articles about these topics. These remaining articles are almost entirely about obscure figures and places from the Bible.
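The filters described above could be sketched roughly as follows. This is not the actual import script: the function name `filter_entry`, the order of the checks, the whole-word matching of "modern"/"current", and the "already exists" reason string are all assumptions made for illustration. The "no comma" and "no he" checks that show up in the results are not explained in this post, so they are left out.

```python
import re

MIN_LENGTH = 500  # length threshold mentioned in the post
# Assumption: match "modern" or "current" as whole words, ignoring case.
DATED_WORDS = re.compile(r"\b(modern|current)\b", re.IGNORECASE)


def filter_entry(title, text, common_words, existing_titles):
    """Return None if the entry passes, otherwise a short rejection reason.

    common_words and existing_titles are hypothetical inputs: a set of
    lower-cased common words, and a set of titles already in the wiki.
    """
    if title.lower() in common_words:
        return "familiar word"
    if title in existing_titles:
        return "already exists"  # do not clobber existing articles
    if DATED_WORDS.search(text):
        return "modern"
    if len(text) < MIN_LENGTH:
        return "too short"
    return None
```

An entry is imported only when the function returns None; any other value is the reason to log next to the rejected title.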
I intend to add a header to each imported article, reading something like:
''This is an entry from Easton's Bible Dictionary. The material in it is written from the viewpoint of the 19th century, and may be out-of-date or biased. Please review and edit this article to bring it up to date''
and a trailer:
From [[Easton's Bible Dictionary (1897)]]
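As a small sketch of how the header and trailer might be attached, assuming each entry body is already available as a wikitext string (the helper name `wrap_entry` and the blank-line spacing are hypothetical choices, not part of the real script):

```python
HEADER = (
    "''This is an entry from Easton's Bible Dictionary. The material in it "
    "is written from the viewpoint of the 19th century, and may be "
    "out-of-date or biased. Please review and edit this article to bring "
    "it up to date''"
)
TRAILER = "From [[Easton's Bible Dictionary (1897)]]"


def wrap_entry(body):
    """Put the notice above the entry text and the source line below it."""
    return f"{HEADER}\n\n{body.strip()}\n\n{TRAILER}"
```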
I intend to drip-feed the finished articles in at a rate of one every 20 minutes, allowing lots of time for human review and assimilation, once I think that there is a consensus that this is OK.
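The drip-feed could look something like the sketch below. The `upload` callable stands in for whatever actually posts an article to the wiki and is purely hypothetical; the `sleep` parameter exists only so the pacing can be tested without waiting.

```python
import time

INTERVAL_SECONDS = 20 * 60  # one article every 20 minutes


def drip_feed(articles, upload, sleep=time.sleep, interval=INTERVAL_SECONDS):
    """Upload (title, text) pairs one at a time, pausing between uploads."""
    for index, (title, text) in enumerate(articles):
        if index > 0:
            sleep(interval)  # wait before every article except the first
        upload(title, text)
```

At this rate, roughly 640 articles would take a little under nine days to go in.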
Can anyone suggest any further improvements, short of proof-reading all 1200 articles?
Neil
--------------------------------
Here are some of the results of the filtering of the original Easton's topics:
lines with the word TITLE at the start denote articles that passed; lines with the word BAD represent articles that failed to pass the filter, with the reason for failure.
BAD A = familiar word
BAD A type Adam = no comma
BAD AEnon = too short
BAD Aaron = familiar word
BAD Aaronites = too short
BAD Abaddon = too short
BAD Abagtha = no comma
BAD Abana = modern
TITLE 1 510 Abarim
BAD Abba = familiar word
BAD Abda = too short
BAD Abdeel = too short
BAD Abdi = no he
BAD Abdiel = familiar word
TITLE 2 745 Abdon
BAD Abednego = too short
BAD Abel = familiar word
BAD Abel-beth-maachah = modern
BAD Abel-cheramim = too short
TITLE 3 551 Abel-meholah
BAD Abel-mizraim = too short
BAD Abel-shittim = too short
BAD Abez = too short
BAD Abi-albon = too short
BAD Abia = too short
BAD Abiasaph = too short
TITLE 4 1872 Abiathar
BAD Abib = too short
BAD Abida = too short
BAD Abidan = too short
BAD Abieezer = too short
BAD Abiel = too short
BAD Abiezrite = too short
BAD Abigail = familiar word
BAD Abihail = too short
TITLE 5 891 Abihu
BAD Abihud = too short
TITLE 6 2766 Abijah
BAD Abijam = too short
BAD Abilene = too short
BAD Abimael = too short
TITLE 7 3025 Abimelech
TITLE 8 817 Abinadab
BAD Abinoam = too short
TITLE 9 574 Abiram
TITLE 10 502 Abishag
TITLE 11 911 Abishai
BAD Abishua = too short
BAD Abishur = too short
BAD Abital = too short
BAD Abitub = too short
BAD Abjects = too short
BAD Ablution = familiar word
BAD Abner = familiar word
BAD Abomination = familiar word
BAD Abomination of Desolation = too short
BAD Abraham = familiar word
BAD Abraham's bosom = too short
BAD Abram = familiar word
BAD Abronah = too short
BAD Absalom = familiar word
BAD Acacia = familiar word
TITLE 12 1943 Accad
TITLE 13 574 Accho
BAD Accuser = familiar word
BAD Aceldama = modern
TITLE 14 767 Achaia
BAD Achaichus = too short
TITLE 15 823 Achan
BAD Achbor = too short
TITLE 16 1026 Achish
BAD Achmetha = familiar word
BAD Achor = familiar word
BAD Achsah = too short
BAD Achshaph = modern
BAD Achzib = modern
BAD Acre = familiar word
TITLE 17 5435 Acts of the Apostles
BAD Adah = too short
BAD Adam = familiar word
BAD Adamah = modern
BAD Adamant = familiar word
BAD Adar = familiar word
BAD Adbeel = too short
BAD Addar = no he
BAD Adder = familiar word
BAD Addi = too short
BAD Addon = too short
BAD Adiel = familiar word
BAD Adin = familiar word
BAD Adina = no he
BAD Adino = too short
BAD Adjuration = familiar word
BAD Admah = too short
BAD Adnah = too short
TITLE 18 1328 Adoni-zedec
BAD Adonibezek = too short
TITLE 19 984 Adonijah
BAD Adonikam = too short
BAD Adoniram = familiar word
BAD Adoption = familiar word
BAD Adoram = no he
BAD Adore = familiar word
BAD Adrammelech = familiar word
BAD Adramyttium = too short
BAD Adria = modern
BAD Adriel = too short
BAD Adullam = familiar word
BAD Adullamite = familiar word
BAD Adultery = familiar word
TITLE 20 571 Adummim
BAD Adversary = familiar word
BAD Advocate = familiar word
BAD Affection = familiar word
BAD Affinity = familiar word
BAD Afflictions = too short
BAD Agabus = too short
BAD Agag = familiar word
BAD Agagite = too short
BAD Agate = familiar word
BAD Age = familiar word
BAD Agee = familiar word
BAD Agony = familiar word
BAD Agriculture = familiar word
TITLE 21 615 Agrippa I
TITLE 22 655 Agrippa II
BAD Ague = familiar word
Whatever anyone may think about these particular proposals (they sound great to me), I think Neil has really shown true wiki spirit here.
I'm impressed.
At 06:29 PM 8/7/02 +0100, Neil Harris wrote:
This is a trimmed-down version of my earlier over-length post.
All in all, a fine thing; below are a couple of specific suggestions for further improvement.
I've now added an extra filter, so that entries whose titles occur in a very large list of common words are rejected.
That sounds useful--it'll save us from having to rewrite articles about spices and such, and provide some useful stuff on less common topics.
Thus, the script will now not attempt to transfer the entry for "Wheel" or "Silk" or other common words, regardless of whether Wikipedia has an entry for that word. This is in addition to the check for not clobbering existing articles.
I have also eliminated any articles containing the words "modern" or "current", which seems to catch a lot of stuff that refers to the author's contemporary information.
Again, a good idea: the Easton stuff seems more useful as a source of information about the Bible as a document than about the contemporary Middle East.
I have also pushed the length filter up to 500 characters.
Doing all of these takes the list down to around 640 filtered articles. Wiki links to the non-imported topics still remain, inviting Wikipedians to write new articles about these topics. These remaining articles are almost entirely about obscure figures and places from the Bible.
I intend to add a header to each imported article, reading something like:
''This is an entry from Easton's Bible Dictionary. The material in it is written from the viewpoint of the 19th century, and may be out-of-date or biased. Please review and edit this article to bring it up to date''
Maybe that could be expanded to note what sort of 19th century viewpoint: it's clearly Christian, but if it's a particular denomination, that's relevant (I can tell immediately that it wasn't put together by a 19th-century Jew, let alone a Buddhist or atheist).
and a trailer:
From [[Easton's Bible Dictionary (1897)]]
I intend to drip-feed the finished articles in at a rate of one every 20 minutes, allowing lots of time for human review and assimilation, once I think that there is a consensus that this is OK.
Can anyone suggest any further improvements, short of proof-reading all 1200 articles?
One practical fix: when I was editing the [[Amon]] article this morning, I found that it linked to itself in a couple of places. Can you tweak the script so that it does not create self-links?
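One way the self-link fix might be done, sketched under the assumption that links are written as `[[Title]]` or `[[Title|label]]` and that the title is matched exactly (the function name `remove_self_links` is hypothetical, and the real script may handle case or spacing differently):

```python
import re


def remove_self_links(title, wikitext):
    """Replace links that point at the article's own title with plain text."""
    # Matches [[Title]] and [[Title|label]]; the label, if any, is kept.
    pattern = re.compile(r"\[\[" + re.escape(title) + r"(?:\|([^\]]*))?\]\]")
    return pattern.sub(lambda m: m.group(1) or title, wikitext)
```

Links to other articles are left untouched, so only the redundant self-references disappear.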
Also, someone's going to have to proofread the articles--and I'd rather it be someone with more of an interest in the matter than I have, if only because such a person is more likely to catch misspellings of names.
And why not resume at Q or something, instead of back at the beginning of the alphabet?
daniwo59@aol.com wrote:
To me, a major part of the problem is that the material is so out of date. It fails to take into account the past hundred years of archeological research, which is essential. Furthermore, the statistics it gives about places are hopelessly outdated. For example, Anatoth, currently 'Anata, is a fair sized
I think I have a working solution to this sort of problem:
Old texts should be scanned (in facsimile if possible) and put on a read-only website which allows deep linking. For example, the article on the Electric Telegraph from a 19th century Swedish encyclopedia is available on the URL http://www.lysator.liu.se/runeberg/nfad/0192.html (have a look, nice pictures, all public domain).
The URL up to ...runeberg/ is the name of the website and "nf" is for the encyclopedia and "nfad" is the 4th volume of the 1st edition.
Then in the wiki, a rule is added so the shorthand "nf:ad0192" is automatically recognized and converted into a hypertext link, in a fashion similar to ISBN numbers. The example is found on the wiki page http://susning.nu/Telegraf where the wiki text "nf:ad0192" is converted into (my translation)
See [http://www.lysator.liu.se/runeberg/nfad/0192.html the article] in the 1st edition of [[Nordisk familjebok]], volume 4, 1881.
and then into HTML.
So, what you need is a stable and deep-linkable read-only website with the old contents that you want to use, and a shorthand linking scheme in the wiki software. You do not want old text copied into the wiki.
Easton's Bible Dictionary is available in a deep-linkable, stable, read-only website, the Christian Classics Ethereal Library, starting on http://www.ccel.org/e/easton/ebd/
For example, the article on Anatoth is available on the URL http://www.ccel.org/e/easton/ebd/ebd/T0000200.html#T0000233 apparently with 100 articles per HTML page, and this is article 233.
If this is a work that you often want to refer to, add the following pattern rule to the wikipedia source code for the English Wikipedia,
ebd:([0-9]+) e.g. ebd:233
translated into
'See [http://www.ccel.org/e/easton/ebd/ebd/' + sprintf("T%05d00.html#T%07d", int($1/100), $1) + " the article] in [[Easton's Bible Dictionary]] (1897)"
Adding this "ebd:" rule to the Wikipedia software doesn't hurt anybody, since 99.99% of all articles will not contain the ebd: pattern. But as soon as anybody who knows EBD and this rule starts to use it, it saves a lot of time and effort by creating links instead of copying useless text into the wiki.
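To make the rule concrete, here is a minimal Python sketch of the same substitution. It is not Wikipedia's code; the names `ebd_link` and `expand_ebd` are made up for illustration, and the page number is taken to be the article number divided by 100 and rounded down, which is what the Anatoth URL above suggests.

```python
import re

EBD_PATTERN = re.compile(r"\bebd:([0-9]+)")
BASE_URL = "http://www.ccel.org/e/easton/ebd/ebd/"


def ebd_link(number):
    """Build the wiki link text for one Easton's article number."""
    page = number // 100  # assumption: 100 articles per HTML page
    return (
        f"See [{BASE_URL}T{page:05d}00.html#T{number:07d} the article] "
        "in [[Easton's Bible Dictionary]] (1897)"
    )


def expand_ebd(wikitext):
    """Replace every ebd:NNN shorthand in the text with a full link."""
    return EBD_PATTERN.sub(lambda m: ebd_link(int(m.group(1))), wikitext)
```

For ebd:233 this yields the same address as the Anatoth example, T0000200.html#T0000233.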