Those of you who want to take a look at the new namespace manager
(Special:Namespaces) and the associated functionality can now, thanks to
Avar, check out and install the WIKIDATA branch. You need a user with
bureaucrat rights to access [[Special:Namespaces]]. Pretty much all
operations there should work now, but I have not intensively debugged it
yet.
Avar has also adapted the update.php script. The web-installer upgrade
does not yet recognize customizations, but the command-line upgrade
should correctly import customized namespaces and settings into the new
system.
As noted before, the main user-visible changes are that
a) It will become trivial for any wiki community to maintain its own
namespaces,
b) Any namespace can have an arbitrary number of synonyms which all
redirect to a default name, allowing us, for example, to rename "Image:" to
"File:" and have "Image:" redirect there. Providing namespace names in
different languages on multi-language wikis would be another application.
c) Namespaces can have a default target (e.g. "Physics"), so that any
unprefixed link _from_ that namespace will be prefixed with that target.
This is primarily useful for Wikibooks. There's also an experimental
feature to hide namespaces from lists; I'm not sure yet if this is a
good idea.
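A sketch of rule (c) in Python (MediaWiki itself is PHP; the function name and the ":" joining convention are my own illustration, not taken from the branch):

```python
def resolve_link(link, source_ns_default=None):
    """Resolve a wiki link: an unprefixed link from a namespace with a
    default target gets that target prepended; prefixed links pass
    through untouched."""
    if ":" in link:               # already prefixed, e.g. "Help:Editing"
        return link
    if source_ns_default:         # e.g. "Physics" on a Wikibooks wiki
        return source_ns_default + ":" + link
    return link

print(resolve_link("Mechanics", "Physics"))    # Physics:Mechanics
```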
From an implementation point of view, we can now also make sure that
all namespace operations are safe. I haven't finished debugging that
code yet, but the basic idea should be visible in the save() function in
Namespace.php. Try adding a few duplicate names to the namespace manager
and you'll see that quite a bit of error checking is done there already,
and no operation is executed unless all operations can be successful.
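The all-or-nothing idea in save() can be sketched language-independently; this Python version (names invented, not the actual Namespace.php code) validates every pending operation against the projected final state before applying any of them:

```python
class DuplicateNameError(Exception):
    """Raised when a pending namespace name collides with an existing one."""

def save_namespaces(existing, pending):
    """All-or-nothing save: check every pending name first; only if all
    checks pass is the real list mutated."""
    seen = set(existing)
    for name in pending:                  # pass 1: validate only
        if name in seen:
            raise DuplicateNameError("duplicate namespace name: " + name)
        seen.add(name)
    existing.extend(pending)              # pass 2: apply
    return existing
```

Trying to add a duplicate raises before anything is written, so the stored namespaces stay consistent.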
I'll have to spend a couple more days debugging and tweaking things
before we can start thinking seriously about integration into HEAD, but
I encourage you to give it a whirl if you want and report issues.
There's probably still some serious breakage in a few places, but I'm
looking into that. The other thing I want to do is cache the namespace
definitions in memcached, so they don't have to be loaded on every
pageview. Help is always welcome.
I'd also like to have a safer way to access the $wgNamespaces array;
perhaps overloading in PHP5 can be used for this. I'd prefer this to
using a static method to return namespace objects.
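The overloading idea has a direct analogue in Python's attribute hooks (Python's __getattr__ playing the role of PHP5's __get()); a sketch with an invented class and data, just to show the access pattern:

```python
class NamespaceTable:
    """Attribute-style, read-only access to namespace definitions,
    standing in for direct reads of a global $wgNamespaces-like array."""

    def __init__(self, table):
        self._table = dict(table)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, much like
        # PHP5's __get(); unknown namespaces fail loudly.
        try:
            return self._table[name]
        except KeyError:
            raise AttributeError("no such namespace: " + name)

ns = NamespaceTable({"Talk": 1, "File": 6})
print(ns.File)  # 6
```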
More on the next steps for Wikidata soon.
Erik
Hi again,
I finally got importDump.php to run, and it has already
imported _161_ GB of the English Wikipedia into the database!
That seems to be far too much.
Here is the row count:
page:       204,876
revision:   8,263,577
text:       22,768,801
user:       1
interwiki:  124
Does this seem to be reasonable???
161 GB and still running?
22 million text entries?
Only 1 user entry?
I would appreciate it if someone could give me a hint as to whether
those numbers are correct!
One final comment: the 161 GB are in the innoDB database
(when you look in mysql/data).
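As a quick sanity check on the arithmetic (a sketch only; real InnoDB files also contain indexes, undo space and unreclaimed pages, so this is an upper bound on the payload per row):

```python
# Rough plausibility check on the reported numbers.
reported_bytes = 161 * 1024**3   # the 161 GB seen in mysql/data
text_rows = 22_768_801           # rows in the text table

avg_per_text_row = reported_bytes / text_rows
print(f"{avg_per_text_row / 1024:.1f} KiB per text row on average")
# A few KiB per stored text blob is not obviously absurd for full,
# uncompressed revision text of a large full-history dump.
```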
Thank you and best regards,
Martina
Hey,
Does anyone have any suggestions about configuring Squid in reverse proxy
for MW? We're seeing that our articles aren't being cached and Squid is
still making several requests per second for the same article, despite
configuring Squid as instructed at meta.wikimedia.org. Here are our
settings:
$wgUseSquid = true;
$wgUseESI = false;
$wgInternalServer = $wgServer;
$wgSquidMaxage = 18000;
$wgSquidServers = array('10.234.169.202');
$wgSquidServersNoPurge = array();
$wgMaxSquidPurgeTitles = 400;
$wgSquidFastPurge = true;
Squid and Apache are running on different machines, and we're seeing several
200 response codes for unchanged articles from Apache, even in a small time
span of a few minutes. While the apache load has been reduced, we should be
seeing Squid handling more of the serving of normal article pages, instead
of forwarding the request to Apache almost 100% of the time.
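Before tuning further, it may help to quantify what Squid is actually doing: its access.log records a result code such as TCP_HIT or TCP_MISS for every request. A rough sketch (assumes Squid's default native log format) for computing the hit ratio:

```python
import re
from collections import Counter

def hit_ratio(log_lines):
    """Parse Squid native access.log lines and return the fraction of
    requests whose result code indicates a cache hit (TCP_HIT,
    TCP_MEM_HIT, TCP_IMS_HIT, ...)."""
    codes = Counter()
    for line in log_lines:
        m = re.search(r"(TCP_[A-Z_]+)/\d{3}", line)
        if m:
            codes[m.group(1)] += 1
    total = sum(codes.values())
    hits = sum(n for code, n in codes.items() if "HIT" in code)
    return hits / total if total else 0.0
```

A near-zero ratio despite $wgSquidMaxage usually points at the Cache-Control/Expires headers the backend is sending, or at cookies making responses uncacheable.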
Any suggestions would be appreciated,
Travis
this is a forwarded (and typo-cleaned) version of an earlier posting to the general list (http://mail.wikipedia.org/pipermail/wikipedia-l/2005-November/042901.html)
maybe the developers want to have a word in that, too...
i know this topic keeps recurring and so my point may not be very
original.
it has been said that wikipedia is "work in progress" and will probably
remain so. on the other hand it suffers from the fact that at no
given point in time can you be certain of having a wikipedia that is 1.
consistent, 2. unvandalized and 3. correct throughout. (compared to those
three points the shortcoming of non-completeness dwindles to almost
nothing.)
let me draw your attention to the fact that the construction plans for
roads to stability - or at least local optima - have long been laid out
by physics. heat a dynamic system quickly then let it cool down in a
slower and controlled fashion, allowing less and less dramatic changes
to take place as time passes. simulated annealing
(http://en.wikipedia.org/wiki/Simulated_annealing) is the magic spell
that might work for wikixyzs in a way similar to that in the real world.
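for readers unfamiliar with the algorithm, a minimal simulated annealing sketch; the toy objective, step size and cooling schedule are of my own choosing, purely illustrative:

```python
import math
import random

def anneal(f, x0, steps=5000, t0=2.0, seed=0):
    """Minimal simulated annealing: accept worse moves with probability
    exp(-delta/T), and cool T toward zero so late changes stay small."""
    rng = random.Random(seed)
    x, best = x0, x0
    for i in range(steps):
        t = t0 * (1 - i / steps) + 1e-9      # linear cooling schedule
        cand = x + rng.uniform(-1, 1)        # propose a small change
        delta = f(cand) - f(x)
        if delta < 0 or rng.random() < math.exp(-delta / t):
            x = cand                         # accept
        if f(x) < f(best):
            best = x
    return best

# Toy objective: find the minimum of (x - 3)^2, starting far away.
print(anneal(lambda x: (x - 3) ** 2, x0=-10.0))
```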
the rationale behind my suggestion is of course that articles that have
matured over time are statistically speaking less likely to improve when
large modifications are made than relatively new ones. some of the
articles have reached a stage where well-meant editing effectively mucks
up the inner structure and logic. what i think reasonable is to raise the
threshold for substantial edits, maybe not by limiting access but by
asking for more substantial background information from the authors
(references, printed, electronic,...) than the simple comment line.
there is too much unproven and partially unprovable information in the
wp. that could have been prevented long ago by obliging the authors to
give references for their information. besides, this task would make it
successively harder to simply turn established statements upside down.
whereas scientific journals have peer review, wp only offers the weak
weapons of discussion pages and reverts - by others, mostly admins, i
guess. why not confer a little more responsibility on the
authors? they could be aided by predefined lists, checkboxes,
comboboxes (for ref. type, etc.)
i find myself increasingly involved in hunting down vandals and their
work - partly due to the ease of use wp offers for non-serious edits,
too, and i can't help feeling that a larger and larger part of wp keeps
a larger and larger part of the community busy with just keeping up the
existing standard.
comments?
best
kai (kku)
-------- Original Message --------
Subject: [Foundation-l] Jihad in Defense of Objectivity (Was: Enforcing
WP:CITE)
Date: Sun, 4 Dec 2005 17:49:36 -0800 (PST)
From: Jonathan Leybovich <jleybov(a)yahoo.com>
Reply-To: Wikimedia Foundation Mailing List <foundation-l(a)wikimedia.org>
To: foundation-l(a)wikimedia.org
All-
The last several dozen messages on this list regarding
Wikipedia citation policy were prompted by Brian's
re-posting of a message I had sent earlier in the week
proposing a change to the page renderer whereby all
factual assertions within an article would
automatically be flagged (say, using red high-lights)
if they were un-sourced. I am truly gratified by the
huge debate which this suggestion has already
generated, and especially grateful to Brian for seeing
enough value in my idea to bring it again to every
one's attention.
This exchange has been truly productive, and the
disagreements that have been aired are, I think, more
apparent than real. One common misconception is that
those of us who are pushing for stronger citation
standards are doing so because we believe in citation
for its own sake, or because we want to blindly mimic
"real encyclopedias", or else because we are in some
way elitist or credentialist and always believe in
deferring to expert opinion.
What has gotten lost in the exchange, I think, is the
fact that those of us advocating a strong citation
policy are doing so only as a means to an end, with
that end being objectivity. The point of an
encyclopedia is to contain objective knowledge,
knowledge which any reasonable person could
potentially confirm by visiting the evidence provided
for it. Ideally such evidence should be as unmediated
and "direct" as possible, but in practice this often
means deferring to an expert authority, because we
either lack the means or skill to reproduce or
interpret this evidence ourselves. This is a
necessary evil, but greatly ameliorated by the fact
that all reputable scholars meticulously document
their results, allowing anyone to reproduce their
evidence later on. Anyone who's read scholarly
journals or monographs knows it is not uncommon for
the footnotes and bibliography (i.e. the evidence) to
take up more pages than the actual text (i.e. the
interpretation)!
Now, just because I think it's valuable to replicate
academic standards of evidence and objectivity does
not mean I think we should blindly reproduce academic
visual/typographic conventions. Just because scholars
put bibliographical/reference sections at the end of
their articles, or make their text unreadable with
lots of footnotes does not mean I think Wikipedia
should also. Let's collect the same data, but think
of better ways to present it.  Isn't it ironic that
the memex, the forerunner of hypertext, was thought up
because of the limitations of paper-based scholarship,
and yet we're still talking about how to reproduce
those same limitations within the web browser?
I'm sorry if a lot of this is obvious, but hopefully
the next point is less so: that objectivity-
which requires evidence, one means to which is
citation- is not just a scholarly imperative, but
also a moral one. Without objectivity, and the faith
that other people experience the world in roughly the
same ways we do, cooperation and this thing we call
community is impossible.  Everyone just does whatever
they want and never stops to consider how this
affects other people because without objectivity
knowledge of other people is by definition impossible.
To those who thus maintain that greater standards of
objectivity will damage community within Wikipedia, I
ask you to explain the [[Jihad]] article on the
English language site. This is not an obscure
article; it has gone through 100's, if not 1000's, of
edits and is in the top-10 results list when Googling
on its keyword. Yet this article is a perfect example
of community dysfunction; it is reverted constantly;
it is locked almost weekly; and yet despite all this
activity it is getting worse over time. Because there
is no agreement on what this term even means, the
article is getting shorter and shorter as more and
more of its "controversial" material is shunted off to
sub-articles, where the process repeats itself (see
[[Rules of war in Islam]], under a neutrality alert as
I write). The problem here (leaving aside anonymous
vandals), is not community, it is objectivity. The
warring editors behave unconstructively not because
they mean badly, necessarily, but because they're
trapped in an epistemological hell. It's not only
that there's not enough objective evidence provided
for each assertion, it's that people have no idea
where to find such evidence, or even have the basis
with which to recognize it as such. Thus the
impossibility of consensus, and a continuing edit war
until the article is whittled down to a links page.
Yet isn't the damage done to community, here- in terms
of anger and frustration, in terms of factionalism, in
terms of loss of goodwill and trust- even greater than
that done to knowledge?
I've been working on a new project proposal which I've
deferred announcing on this list partly because I
wanted to do some more polishing to it, but mainly
because it relied upon an enhancement to the software
(i.e. [[m:Wikidata]]) whose completion date was still
a ways off. However, now seems as good a time as any
to make an announcement, so let me provide an
overview. Much of it is identical to SJ's proposal
here and in [[m:Wikicite]].
Phase 1: Toward a more reliable Wikipedia
Citation mark-up is introduced which holds a pointer
to an enclosed factual assertion's proof; proof is
provided via either reference to another work, or with
direct evidence (a photograph, eye-witness testimony,
etc.) when appropriate for the claim. The article
renderer then highlights "evidence holes" with a
distinct, attention-grabbing style that alerts both
readers and editors. Such "footnotes" may be hidden
in the main article, but visible through a new tab
which renders them in a useful graph format. Perhaps
as part of article rating, citations must be confirmed
by the checker; data regarding which assertions were
verified is stored with other article rating
attributes.
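A toy sketch of that renderer pass in Python; the <cite>...</cite> marker and the CSS class are invented placeholders for illustration, not a proposed syntax:

```python
import re

def flag_unsourced(text):
    """Split text into sentences and wrap any sentence carrying no
    citation marker in a highlight span ("evidence hole")."""
    out = []
    # Split after sentence-ending punctuation or a closing citation tag.
    for sentence in re.split(r"(?<=[.!?])\s+|(?<=</cite>)\s+", text.strip()):
        if not sentence:
            continue
        if "<cite>" in sentence:
            out.append(sentence)                       # sourced: leave as-is
        else:
            out.append('<span class="unsourced">' + sentence + "</span>")
    return " ".join(out)

demo = "Jihad means struggle.<cite>Esposito 2002</cite> It always means war."
print(flag_unsourced(demo))
```

The second, unsourced sentence comes back wrapped in the highlight span, while the cited one passes through untouched.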
Phase 2: Creation of a citation database/authority
text map
Each citation within a Wikipedia article is now
automatically saved within a [[m:Wikidata]] text
relationship database. A text relationship joins two
"[[w:texts]]", and among its other attributes has one
called TYPE.  In the case of a Wikipedia citation, TYPE
is by default a positive evidentiary citation- the
Wikipedia article uses the cited book, document,
photograph, etc. as proof of some fact. Yet there are
many other sorts of text relationships, the most
obvious kind being negative citations- one work
attacks the authority of another.
As Wikipedia editors do their research and follow the
citations of those works which they themselves cite,
they are able to create "authority maps" for
literature within various scholarly fields. What is
considered authoritative? What is considered outdated?
They record this information into the text
relationship database. They are not merely copying
others' footnotes, though, since a text relationship
does not have to be "verbalized" within a text. If
they know a particular work contradicts some evidence,
for example, let them record it and so rightly
diminish the work's authority.
Eventually the Wikidata text relationship database
becomes a hugely valuable scholarly tool in its own
right, and acts as the first resort for Wikipedia
editors doing research. Formulas are developed which
rate sources/evidence: incoming positive citations are
good; incoming negative ones are bad. Lots of less
obvious factors like age are considered- a 50 year old
work that's still constantly invoked is probably
particularly sound. Other formula factors are
identified, though anyone can potentially create their
own formulas to run against the data.
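Such a rating formula might start as simply as this sketch; only the inputs (positive and negative incoming citations, age) come from the proposal, while the weights and the 50-year bonus are arbitrary placeholder numbers:

```python
def authority_score(positive, negative, age_years):
    """Toy source rating: incoming positive citations raise the score,
    negative ones lower it, and a work still cited after 50 years
    earns a longevity bonus. All weights are placeholders."""
    base = positive - 2 * negative
    if age_years >= 50 and positive > 0:
        base *= 1.5          # "still constantly invoked" bonus
    return base

print(authority_score(positive=40, negative=5, age_years=60))  # 45.0
```

Because the data is open, anyone could swap in their own weights and re-run the scoring against the same relationship database.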
Phase 3: The honing of Wikipedia
Using the text relationship database, editors can now
see at a glance what is authoritative within a
particular literature. The article renderer now takes
source quality (generated by the formulas discussed
above) into consideration when rendering each section
of an article. Those parts of the article relying on
weak, discredited, or out-dated sources are flagged
with one style, while perhaps especially credible
sources are "commended" using another. Hopefully a
virtuous circle begins- a citation based upon a work
of popular history is exchanged for one relying upon a
more specialized work, which is later exchanged for a
scholarly monograph or journal article, which in turn
encourages reference to primary sources, etc. By this
process Wikipedia becomes not just accurate, but
scholarly and state-of-the-knowledge.
Please see the following for more details about this
project:
http://meta.wikimedia.org/wiki/WikiTextrose
http://meta.wikimedia.org/wiki/Wikicite
Thank you for your time and sorry for the long e-mail.
To whom it may concern (lovers of EnotifWiki)
Pre-announcement:
I successfully updated EnotifWiki to the latest version of MediaWiki,
1.5.3. The new Enotif version will be published soon, most probably on
06.12.2005 UTC.
It will then be the first release since EnotifWiki 3.55 for
MediaWiki 1.5rc4 on 31.08.2005.
Please watch [1] or [2] for further news.
[1] http://www.enotifwiki.org MediaWiki incl. e-mail notification and
WYSIWYG editor (FCKeditor)
[2] http://developer.berlios.de/project/showfiles.php?group_id=4473
Brion:
> I applied someone else's patch by their request. If you've got an improved version, please send it over.
You also chose to ignore my request (who am I to complain about that, after all you're the only Brion around here) and did not deem it necessary to communicate about it (which is my main concern).
I happen to still feel responsible for EasyTimeline even though I did not deliver on earlier promises, for reasons I have explained.
As I can't work on it in the near future, I will soon post the patch and explanation, so that someone else with perl knowledge or at least server access can do final tests.
Erik Zachte
Brion, since you quoted it to someone else, you read my previous post about
what is wrong with your quick and dirty patch.
You chose not to ask or inform me beforehand, so that I could explain once again
(you were at the Frankfurt presentation) what makes the Unicode upgrade more than a one-line patch.
You chose not to reply to my wikitech post (unoffensive I would say) which was cc'ed to you.
http://mail.wikipedia.org/pipermail/wikitech-l/2005-November/032649.html
Too bad you seem to have time only for a half-baked solution and then forget about it.
See http://en.wikipedia.org/w/index.php?title=Template:Timeline_History_of_Ches…
for an example of the mess you caused.
And yes, I am pissed off. I don't know what I did to deserve this inconsiderate behaviour.
Erik Zachte
Moin,
I am proud to announce a new release of the graph extension. There have
been many bugfixes, new features, and it is more stable, robust and even
faster. The code handles more complicated graphs much better now. And the
manual got updated, too :-D
To not bore you with the details, I will just summarize the key changes
since v0.30 below.
All the info you need, including setup and download:
http://bloodgate.com/perl/graph/
http://bloodgate.com/perl/graph/manual/
If you have any questions or feedback, please do not hesitate to email.
Best wishes,
Tels
So what's new?
=========
* Boxart:
Thanx to Folke Behrens, I finally got my act together to implement Unicode
output. The new mode is called "boxart" and is ASCII art on steroids - it
draws the output with Unicode box drawing characters. And it looks good!
┌──────┐ ╔════════╗
│ Bonn │ ══> ║ Berlin ║
└──────┘ ╚════════╝
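For the curious, producing such boxart from scratch takes only the Unicode box-drawing range; a toy sketch, not Graph::Easy's actual layout code:

```python
def box(label, double=False):
    """Draw a label in a Unicode box; double=True uses double-line
    borders, as boxart does for the edge-target node."""
    h, v = ("═", "║") if double else ("─", "│")
    tl, tr, bl, br = ("╔", "╗", "╚", "╝") if double else ("┌", "┐", "└", "┘")
    top = tl + h * (len(label) + 2) + tr
    mid = v + " " + label + " " + v
    bot = bl + h * (len(label) + 2) + br
    return "\n".join([top, mid, bot])

print(box("Bonn"))
print(box("Berlin", double=True))
```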
* SVG in Mediawiki:
The Mediawiki integration can now do SVG. If you have an SVG enabled
browser (Firefox 1.5, Opera 9.0, Konqueror 3.5), you can enjoy truly scalable
output.
* New attributes:
+ font-size (set relative fontsize)
+ text-style (underline, overline, bold, italic, strike-through)
+ start/end (edge port restriction, experimental)
+ basename (for autosplit nodes)
+ arrow-style: none; for edges
* New styles:
+ broad, wide, bold-dash for node borders and edges
+ anon nodes can have a style+attributes, too:
[ ] { border: bold; } --> [ Bonn ]
* Groups
These now work much better, especially in HTML and ASCII/boxart.
The nitty-gritty details can be found in the full changelogs, if you are
brave and bored :)
http://search.cpan.org/~tels/Graph-Easy/CHANGES
http://search.cpan.org/~tels/Graph-Easy-As_svg/CHANGES
http://search.cpan.org/~tels/wikimedia-graph/CHANGES
Have fun! Tels
--
Signed on Sun Dec 4 23:35:48 2005 with key 0x93B84C15.
Visit my photo gallery at http://bloodgate.com/photos/
PGP key on http://bloodgate.com/tels.asc or per email.
"I'm not a vegetarian, but I eat animals who are" -- Groucho Marx
Several things:
1) The copyright message is moved up, between the main edit area and the
summary/checkboxes/buttons. This is to ensure that it is visible; presently on a
smallish screen it's possible to make thousands of edits without ever seeing the
notice.
This should also encourage site admins to keep the message short and legible.
2) A new message, MediaWiki:Edittools, is created which can hold those extended
character insertion boxes that some wikis use (currently hacked into the
copyright message). This message is now also displayed on the Special:Upload
form, so it can be used when composing file descriptions.
This displays below the buttons, so it's out of the way but not too far away.
3) The list of used templates is moved to the bottom, so it can grow as long as
it gets without pushing other things down.
This is online to try out on the test wiki at:
http://test.leuksman.com/edit/Main_Page
http://test.leuksman.com/view/Special:Upload (Log in to see the upload form.)
Avar has expressed objection to putting the notice above the buttons, but I
think it really *needs* to be if we want to pretend it means anything. Any final
comments, and preparation for the change (splitting bits out from the copyright
message) would be good to do; I plan to commit this change in the next day or two.
cf. http://bugzilla.wikipedia.org/show_bug.cgi?id=4100
Note: the charinsert stuff doesn't currently seem to work in Safari. Have to see
about fixing that...
-- brion vibber (brion @ pobox.com)