Since everybody's so frustrated about this, I'm going to go ahead and
force the issue with the upload server. I'll be disabling uploads and
turning off the upload.wikimedia.org web server for a few hours so we
can get everything moved over and totally copied once and for all.
Alas this'll mean not seeing images for a few hours, but it should
finally be nicer after this. :D
http://meta.wikimedia.org/wiki/November_2005_image_server
-- brion vibber (brion @ pobox.com)
Hi all,
On the Dutch Wikipedia we have a recurring discussion about our new main
page. It loads very slowly at times. Some claim it's because there are too
many images and templates in it, some claim it's because the servers are
slow, and some claim it's due to both factors.
What do experts think? ;-)
http://nl.wikipedia.org/wiki/Hoofdpagina
http://nl.wikipedia.org/wiki/Sjabloon:Inhoud (Template, versions 5 Nov 2005
13:52 and up)
Btw: User:Waerth has even gone on strike until the old main page is put
back into place... (http://nl.wikipedia.org/wiki/Gebruiker:Waerth)
Thanks,
Galwaygirl
Hi, folks. Just a quick note to let you know that there's an extension
for MediaWiki available that allows customized RDF output and in-page
user input of Turtle RDF. Code is here:
http://wikitravel.org/~evan/mw-rdf-0.3.tar.gz
This is in production on Wikitravel and only works with MediaWiki 1.4.x
(at least for the history model; probably some other stuff is broken with
the new database schema, too). More info here:
http://wikitravel.org/en/Wikitravel:RDF
http://meta.wikimedia.org/wiki/RDF
The README file is attached below for people who don't follow URLs much.
I'll add it to the extensions section of MediaWiki CVS RSN, but I've been
using darcs for version control so far and I CBA to merge to CVS yet.
~Evan
________________________________________________________________________
MediaWiki RDF extension
version 0.3
16 November 2005
This is the README file for the RDF extension for MediaWiki
software. The extension is only useful if you've got a MediaWiki
installation; it can only be installed by the administrator of the site.
The extension adds RDF (Resource Description Framework) support to
MediaWiki. It will show RDF data about a page with a new special page,
Special:Rdf. It allows users to add custom RDF statements to a page
between <rdf> ... </rdf> tags. Administrators and programmers can add
new automated RDF models, too.
This is the first version of the extension and it's almost sure to
have bugs. See the BUGS section below for info on how to report
problems.
== License ==
Copyright 2005 Evan Prodromou <evan(a)wikitravel.org>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA
== Installation ==
You have to have MediaWiki 1.4.x installed for this software to work.
Sorry, but that's the version I've got installed, so it's the one this
software works with.
You also have to install RAP, the RDF API for PHP
(www.wiwiss.fu-berlin.de/suhl/bizer/rdfapi/) . I used version 0.92,
plus some custom hacks to make the N3 parser less fragile. You have to
apply a patch to the RAP distribution if you want RDF to work; the patch
is included in this distribution. (Future versions of RAP will have
these enhancements.)
You can copy the file MwRdf.php to the extensions directory of your
MediaWiki installation. Then add these lines to your LocalSettings.php:
define("RDFAPI_INCLUDE_DIR", "/full/path/to/rdfapi-php/api/");
require_once("extensions/MwRdf.php");
== 60-second intro to RDF ==
RDF is a framework for making statements about resources. Statements
are in the form:
subject predicate object
Here, "subject" is a "resource" such as a person, place, idea, Web
page, picture, concept, or whatever. "Predicates" are names of
properties of a resource, like its color, shape, texture, size,
history, or relationships to other "resources". The object is the
value of the property. So "car color red" would be a statement about a
car; "Evan hasBrother Nate" would be a statement about a person.
Of course, it's important to be definite about which resources and
which properties we're discussing. In the Web world, each "resource"
is identified with a URI (usually a URL).
For electronic resources, this is usually pretty easy; the main page
of English-language Wikipedia, for example, has the URI
"http://en.wikipedia.org/wiki/Main_Page". However, for analog subjects
like people or ideas or physical objects, this can be a little
trickier.
There's no general solution, but the typical workaround is to use real
or made-up URIs to "stand in" for offline entities. For example, you
could use the URI for my Wikitravel user page,
"http://wikitravel.org/en/User:Evan", as the URI for me. Or you could
use my email address in URI form, like "mailto:evan@wikitravel.org".
People who need to agree on statements often create 'vocabularies' or
'schemas' that map concepts, objects, and relationships to URIs. By
popularizing such a mapping, we can all agree about what a particular
URI "means".
For example, the Dublin Core Metadata Initiative (DCMI)
(http://www.dublincore.org/) has a schema for very simple metadata,
such as you'd find on a library card. They've defined (among other
things), that the idea of authoring or creating something is
represented by the URL http://purl.org/dc/elements/1.1/creator. So
you could say:
  http://www.fsf.org
  http://purl.org/dc/elements/1.1/creator
  mailto:rms@gnu.org
... means that the creator of the Free Software Foundation is Richard
Stallman.
There are a lot of RDF models out there; you can also create your own
if you want.
RDF statements can be encoded in a number of different ways. By far
the most popular is as XML, sometimes called "RDF/XML". "Turtle" is
another format, which uses plain text rather than XML; and "Ntriples"
is still another.
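To make the encoding differences concrete, here is a small Python sketch (not part of the extension) that renders a statement in the Ntriples style: URIs go in angle brackets, literal values in quotes, and each statement ends with a period.

```python
def ntriples_line(subject, predicate, obj, literal=False):
    """Render one statement as a single N-Triples line."""
    o = '"%s"' % obj if literal else "<%s>" % obj
    return "<%s> <%s> %s ." % (subject, predicate, o)

# The FSF example from above, one statement per line.
print(ntriples_line("http://www.fsf.org",
                    "http://purl.org/dc/elements/1.1/creator",
                    "mailto:rms@gnu.org"))
```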
== Models ==
Any given resource can be described from many different
perspectives. For example, you can describe a man in terms of his
academic career, his job experience, his family members, his body
parts' size and weight, his location in space, his membership in
organizations, his hobbies and interests, etc.
In this extension, we use the term "model" to describe a perspective
on a resource. For example, listing the links to and from a page is
one model; its edit history is another model. You can choose which
models you want to know about when querying the system for RDF
statements about a subject, and only statements in that model are
returned.
This is mostly a concession to performance; it doesn't make sense to
calculate information about the history of a page if the calling
program isn't going to use it.
There are a number of models built into this extension; you can also
add your own, if you know how to code PHP. The models have short
little codenames for easy access, listed below.
Models built in:
* dcmes: Dublin Core Metadata Element Set (DCMES) data. Mostly
information about who edited a page, when, and other simple stuff.
Titles, format, etc. This is a common vocabulary that's very
useful for general-purpose bots.
* cc: Creative Commons metadata. Gives license information; there
are a few tools and search engines that use this data.
* linksto, linksfrom, links: Internal wiki links to and from a page.
"links" is a shortcut for both.
* image: DCMES information about images in a page.
* history: version history of a page; who edited the page and when.
* interwiki: links to different language versions of a page.
* categories: which categories a page is in.
* inpage: a special model for blocks of RDF embedded into the source
code of MediaWiki pages; see "In-page RDF" below for info.
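The real model registry lives in PHP (see $wgRdfModelFunctions under Customization below); purely as an illustration of the name-to-generator dispatch idea, here is a Python sketch with made-up model functions and data:

```python
# Hypothetical model functions; each returns a list of statements.
def categories_model(page):
    return [(page, "inCategory", c) for c in ("Travel", "Wine")]

def history_model(page):
    # Pretend edit history; a real model would hit the database.
    return [(page, "editedBy", "Evan")]

MODEL_FUNCTIONS = {
    "categories": categories_model,
    "history": history_model,
}

def get_statements(page, modelnames):
    """Run only the requested models -- the performance point made above."""
    out = []
    for name in modelnames:
        out.extend(MODEL_FUNCTIONS[name](page))
    return out

print(get_statements("Chile", ["categories"]))
```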
== Special:RDF ==
You can view RDF for a page using the [[Special:Rdf]] feature. It
should be listed on the list of special pages as "Rdf". Enter the
title of the page you want RDF for in the title box, and choose one or
more of the RDF models from the multiselect box. You can also select
which output format you want; XML is probably most useful and can be
viewed in a browser.
The Special:Rdf page can also be called directly, with the following
parameters:
* target: title of the article to get RDF info about. If no target
is provided, the special page shows the input form.
* modelnames: comma-separated list of model names, like
"links,cc,history". Default is a list of standard models,
configurable per-site (see below).
* format: output format; one of 'xml', 'turtle' and 'ntriples'.
Default is XML.
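Assuming the usual MediaWiki index.php?title=... calling convention (an assumption; check your own wiki's URL layout), a direct Special:Rdf request could be composed like this:

```python
from urllib.parse import urlencode

def special_rdf_url(base, target, modelnames, fmt="xml"):
    """Build a direct Special:Rdf request URL (hypothetical base URL)."""
    params = {
        "title": "Special:Rdf",
        "target": target,
        "modelnames": ",".join(modelnames),
        "format": fmt,
    }
    return base + "?" + urlencode(params)

url = special_rdf_url("http://wikitravel.org/wiki/index.php",
                      "Chile", ["links", "cc", "history"], "turtle")
print(url)
```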
== In-page RDF ==
Any user can make additional RDF statements about any resource by
adding an in-page RDF block to the page. The RDF needs to be in Turtle
format (http://www.dajobe.org/2004/01/turtle/), which is extremely
simple. It's a subset of Notation3
(http://www.w3.org/DesignIssues/Notation3.html), for which there is a
good introduction. (http://www.w3.org/2000/10/swap/Primer.html)
RDF blocks are delimited by the tag "<rdf>". They're invisible in
normal output, but they can provide information for RDF-aware tools.
Here's an example:
Mathematics is ''very'' hard.
<rdf>
<> dc:subject "Mathematics"@en .
</rdf>
Here, the rdf block says that the subject of the article is
"Mathematics". Note that <> in Turtle means "this document". Another
example:
Chilean wines are quite delicious.
<rdf>
<> dc:source <http://example.org/chileanwines.html> .
<http://example.org/chileanwines.html>
dc:creator "Bob Smith" .
</rdf>
Here, we've said that the article's source is another Web page on
another server; we can also say that that other Web page's author is
Bob Smith.
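To give a feel for what the extension has to do with such a block, here is a deliberately naive Python sketch that splits simple one-line Turtle statements into triples; the real extension parses blocks with RAP's N3 parser, which also handles multi-line statements like the second example above.

```python
import re

def parse_simple_turtle(block):
    """Very naive: handles only one 'subj pred obj .' statement per line."""
    triples = []
    for line in block.strip().splitlines():
        line = line.strip().rstrip(".").strip()
        if not line:
            continue
        m = re.match(r'(\S+)\s+(\S+)\s+(.+)$', line)
        if m:
            triples.append(m.groups())
    return triples

block = '<> dc:subject "Mathematics"@en .'
print(parse_simple_turtle(block))
```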
In-page RDF is displayed whenever the "inpage" model is requested for
Special:RDF; it's one of the defaults. It's also useful for people
making MediaWiki extensions; you can have users add information in
in-page RDF, and then extract it and read it using the function
MwRdfGetModel(). This lets users add data that isn't for presentation
but perhaps for automated tools to use.
Note also that MediaWiki templates are expanded when in-page RDF is
queried. So if the syntax of Turtle is daunting, you can add templates
that make it easier. For example, we could create a template
Template:Source for showing source documents:
<rdf>
<> dc:source <{{{1}}}> .
<{{{1}}}> dc:creator "{{{2|anonymous}}}" .
</rdf>
We could then make the same statement as above with a template
transclusion:
{{source|http://example.org/chileanwines.html|Bob Smith}}
Note that a number of namespaces are pre-defined for your RDF blocks.
Some basic namespaces are provided by RAP; you can define custom
namespaces with the global variable $wgRdfNamespaces . In addition,
each of the article namespaces is mapped to a namespace prefix in
Turtle, so you can say something like this:
<rdf>
Wikitravel_talk:Spelling dc:subject Wikitravel:Spelling .
:Montreal dc:spatial "Montreal" .
</rdf>
Note that the default prefix (":") is the article namespace.
== Customization ==
There are a few customization variables available, mostly for
programmers.
$wgRdfDefaultModels -- an array of names of the default models to use
when no model name is specified.
$wgRdfNamespaces -- You can add custom namespaces to this associative
array, of the form 'prefix' => 'uri' .
$wgRdfModelFunctions -- an associative array mapping model names to
functions that generate the model. See below for
how to add a new model.
$wgRdfOutputFunctions -- A map of output format to functions that
generate that output. You can add new output
formats by adding to this array.
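In the same illustrative Python spirit (the real tables are the PHP globals above), registering a new output format amounts to one more entry in a dispatch table; "csv" here is a made-up example format:

```python
def output_ntriples(triples):
    """Built-in style output: one 'subj pred obj .' line per statement."""
    return "\n".join("%s %s %s ." % t for t in triples)

def output_csv(triples):
    # A hypothetical extra format, registered alongside the built-ins.
    return "\n".join(",".join(t) for t in triples)

OUTPUT_FUNCTIONS = {"ntriples": output_ntriples}
OUTPUT_FUNCTIONS["csv"] = output_csv  # analogous to extending $wgRdfOutputFunctions

print(OUTPUT_FUNCTIONS["csv"]([("a", "b", "c")]))  # a,b,c
```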
== Extending ==
You can add new RDF models to the framework by creating a model
function and adding it to the $wgRdfModelFunctions array. The function
will get a single MediaWiki Article object as a parameter; it should
return a single RAP Model object (a collection of statements) as a
result. For example,
function CharacterCount($article) {
    # create a new model
    $model = ModelFactory::getDefaultModel();
    # get the article source
    $text = $article->getContent(true);
    # ... and its size
    $size = mb_strlen($text);
    # get the resource for this article
    $ar = MwRdfArticleResource($article);
    # add a statement to the model
    $model->add(new Statement($ar,
                              new Resource("http://example.org/charcount"),
                              new Literal($size)));
    # return the model
    return $model;
}
You can then give the model a name like so:
$wgRdfModelFunctions['charcount'] = 'CharacterCount';
You can add a message to the site describing your model like so:
$wgMessageCache->addMessages(array(
    'rdf-charcount' => 'Count of characters'));
You can also create model-outputting functions if you so desire; they
should accept a RAP model as input and write the corresponding output
to the Web. This is probably only useful if you want a specific RDF
encoding mechanism that's not RDF/XML, Turtle, or Ntriples; for
example, TriG or TriX.
== Future ==
These are some future directions I'd like to see things go:
* Store statements in DB: statements could be stored in the database
when the page is saved and retrieved when needed. This would make it
possible to do extended queries based on information about *all* pages.
* Performance: there wasn't much performance tuning and there are
probably way too many DB hits and reads and such.
* Semantic tuning: I'd like to make sure that the statements in the
standard models are accurate and useful.
== Bugs ==
Send bug reports, patches, and feature requests to Evan Prodromou
<evan(a)wikitravel.org> .
--
Evan Prodromou <evan(a)wikitravel.org>
Wikitravel (http://wikitravel.org/) -- the free, complete, up-to-date
and reliable world-wide travel guide
Hi,
We requested a new 'portal' namespace about one year ago, but this came
to nothing. Some days ago, we discovered that the English and German
Wikipedias are now using this namespace (sic!), so we are requesting to
be able to do the same [1]. The French word for portal is 'portail'.
Regards,
Aoineko
[1]
http://fr.wikipedia.org/wiki/Wikip%C3%A9dia:Le_Bistro/30_ao%C3%BBt_2005#Un_…
Rob Lanphier wrote:
> Subject: Re: [Wikitech-l] Re: <link> elements for interlanguage link
> information
> To: Wikimedia developers <wikitech-l(a)wikimedia.org>
> Message-ID: <1132538575.6605.35.camel(a)localhost.localdomain>
> Content-Type: text/plain
>
> I'm not aware of any <link> syntax, but one way to do it would be for
> MediaWiki to issue an HTTP 301 status (permanent redirect) to the new
> page, rather than returning 200 and giving the content. That probably
> introduces an unacceptably large performance penalty, though (extra
> round trip per request).
>
> The "Content-Location" HTTP header is a potential longshot. I don't
> think Google documents their use/non-use of this header, but it's one of
> those "can't hurt" kind of things.
>
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.14
>
> It appears it can be tacked on using a meta http-equiv tag in the HTML
> head.
>
Reading the specs, it seems to be more for differentiating among
different content retrieved from the same URI/URL, rather than (as in
our case) for stating that different URLs/URIs correspond to the same
content.
Despite this, using this header seems a brilliant idea. Does anybody
have any contact at Google or Yahoo (or another web search engine) to
ask their opinion about this and other possible solutions?
AnyFile
Timwi wrote:
<snip>
> Speaking of which - this reminds me of an idea I had a while ago and I
> was wondering if anyone would be interested to hear this. Currently many
> Wikipedia pages in Google search results are redirects (for example,
> Google for "nonogram" and look at the seventh search result). I was
> wondering if there is a <link> element one could use to say that another
> URL is the "real" page? Then the page returned for a redirect's URL
> would tell search engines the URL of the page it's redirecting to.
Is a list of the names of the pages that redirect to a page inserted
among the keywords of the target page?
A more radical way would be to respond to a request for a page that is
a redirect with an HTTP redirect, when a web crawler is detected
requesting the page.
AnyFile
Timwi <timwi@...> writes:
>
>
> > I do think these are two seperate points:
> >
> > * how to improve the discussion pages on a wiki
> > * whether each author own his/her comment or not.
>
> But the point is that the answer to the second influences whether the
> solution proposed for the first is seen as an "improvement". I feel that
> if the ability to edit other people's comments is taken away from me, I
> can't label it an "improvement".
>
You may not label it an improvement, but there are others who definitely
would.
> > Discussions, OTOH, also involve personal opinions. Danger lies ahead
> > when the opinion can be changed, but is still labeled (or signed, if
> > you wish) with the original author's name.
>
> We already have this "danger", and we've had it since the beginning of
> Phase II, and it has not turned out to be a great problem, so this is
> not an argument.
>
I've had people complain to me about moving their comments around on my
LDAP patch's page on meta. I erased one person's edit because it was a
non-working solution, and had a complaint about that.
Just because you don't think this is a problem doesn't mean it isn't a
problem. I can definitely see lawsuits based upon this. This is
definitely a valid argument.
> > Just imagine that this discussion we have is on a wiki, this is the
> > latest edition (you would need to check the history, aka mailing list
> > archives, to see the full revisions) and it contained:
> >
> > On Tuesday 01 November 2005 17:36, Timwi wrote:
> >>>Any model, if over applied, is harmful.
> >>Agree.
> >>I am strongly in favour of LiquidThreads.
> >
> > See the danger?
>
> A fallacious argument by false dilemma, or by lack of imagination, or
> whatever you wanna call it. You almost provided the answer to this one
> yourself:
>
> > (for the record, the above quote of three lines was
> > written/shortened by me, not Timwi).
>
> And that is what it should say.
>
> COMMENT #328645 by [[User:Timwi]]
>
> Agree. I am strongly in favour of LiquidThreads.
> (This comment was last edited by [[User:Tels]] <date/time>.)
>
> If <date/time> is a minute ago, I better check the diff. If it was an
> hour ago, I can probably assume that your edit was harmless.
>
> Therefore, again, your "danger" is not an argument against the ability
> to edit comments.
>
Why can you assume that the edit was harmless? During Katrina, I had no
internet access for weeks. If someone maliciously edited some of my
comments during that time, would you assume that what was there is
actually what I wrote?
Ignoring catastrophes like a large blackout or a hurricane: say someone
goes on vacation, or simply hasn't checked his discussions recently, or
an article's discussion page hasn't been updated in a long while and
someone stops checking it as often; in these cases, vandalism may go
unnoticed for QUITE a while, and readers may be seeing the vandalised
version the entire time.
In this respect, there is "danger" in others editing comments.
> > If we can improve the discussion page itself, *and* prevent
> > misrepresentation at the same time, well, that would be great :)
>
> It's really easy.
>
> Timwi
>
I think the original idea of LiquidThreads is a good solution for the
problem. I don't believe the implementation would be easy, though ;).
Ryan Lane
Hello,
I spent the last few hours trying to dig into the infrastructural
organization of the Wikimedia servers. My starting points were
[[meta:Wikimedia_servers]] and Ganglia, and my motivation was
Wikipedia's recent slowness.
In contrast to my expectations, the database servers are far from being
under high load. It even seems the pressure is so low that you could
easily live without holbach and webster for days (resp. over a month).
The bottlenecks are the Apaches and Squids (yes, I know that's nothing
new for you).
But as in all the other clusters, the load is very unequally
distributed over the machines. For example, the Yahoo! squids showed
yf1003 9.39, yf1000 7.60, yf1004 1.60, yf1002 1.44, yf1001 0.73
at noon (UTC) today and similar load values (albeit with a different
distribution) at other times.
Or the Apaches in Florida:
16 Apaches with load around 15, 9 between 1.5 and 2, 8 between 1 and
1.5 and 10 less than 1.
Where does this come from, or is this wanted? Wouldn't a more balanced
load be better?
Another point: the Yahoo! Squids do virtually nothing between 18:00 and
0:00 (and machines other than yf1000-yf1004 do virtually nothing around
the clock). How nice it would be to make them help out the other
overloaded machines in Florida and the Netherlands, at least in those
six hours.
And no, I'm not criticizing anyone, nor claiming to know how to do it
better. But the available information looks strange to me - it would be
great to get some explanations.
Speaking of explanations, I have three more simple questions:
1. The Squids at lopar have been idle ever since DNS was moved off
them. What were the problems with them, and will they be back soon?
2. Commons has been very slow since the move from the prior
"overloaded" server to the new one. Any explanation to satisfy a
simple user? And which server is the new one?
3. I read about new machines srv51-70. Where do they come from? I
can't see a recent order for them, nor are they mentioned on
[[meta:Wikimedia_servers]].
Thank you in advance,
Juergen
Hello,
I would like to import data from DokuWiki
(http://wiki.splitbrain.org/wiki:dokuwiki) into MediaWiki. The data is
stored as UTF-8 text files. Is there already a converter for this?
My idea is to import directly into the MySQL database. What I am mainly
looking for is a routine that writes into MediaWiki, ideally via a
function like insertNewArticle. As far as I can see, the PHP script
would then only have to load LocalSettings.php, connect to the
database, log in an admin, and then pass in the text, summary, and
author and write them. I would of course add the routine for reading
out the data myself.
Can anyone help me with this?
Regards,
andreas
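The markup-conversion half of such an import could be sketched like this (a Python sketch, assuming the common DokuWiki conventions for bold, italics, and headings; verify against your own pages). The database/insertNewArticle half would still have to be PHP against the MediaWiki installation.

```python
import re

def dokuwiki_to_mediawiki(text):
    """Convert a few common DokuWiki constructs to MediaWiki syntax.

    Handles only bold, italics, and headings; a real importer needs much
    more (links, lists, code blocks, ...).
    """
    # DokuWiki **bold** -> MediaWiki '''bold'''
    text = re.sub(r"\*\*(.+?)\*\*", r"'''\1'''", text)
    # DokuWiki //italic// -> MediaWiki ''italic''
    # (naive: this would also mangle http:// URLs; protect them first)
    text = re.sub(r"//(.+?)//", r"''\1''", text)
    # DokuWiki ====== H1 ====== ... == H5 == -> MediaWiki = H1 = ... ===== H5 =====
    def heading(m):
        level = 7 - len(m.group(1))
        return "%s %s %s" % ("=" * level, m.group(2).strip(), "=" * level)
    return re.sub(r"^(={2,6})\s*(.+?)\s*\1\s*$", heading, text, flags=re.M)

print(dokuwiki_to_mediawiki("====== Titel ======\n**fett** und //kursiv//"))
```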