Hi,
I have two questions. When looking at the data, there seem to be lines that refer to the same article but use different URL encodings of the title, such as
en John_Edwards_presidential_campaign%2C_2008 16 16
en John_Edwards_presidential_campaign,_2008 5 5
Is there a way that the wikistats files could take this into account and report just one line per existing article, no matter what the encoding is?
Second, I find the number of requests for the individual main pages rather low. Might this be due to the counting style, which only counts people who explicitly asked for, say, http://de.wikipedia.org/wiki/Hauptseite rather than http://de.wikipedia.org ?
Mathias
On Sun, Jan 06, 2008 at 11:05:46AM +0100, Mathias Schindler wrote:
> Second, I find the number of requests for the individual main pages rather low. Might this be due to the counting style, which only counts people who explicitly asked for, say, http://de.wikipedia.org/wiki/Hauptseite rather than http://de.wikipedia.org ?
de.wikipedia.org sends a 301 redirect response to the browser, telling it to load http://de.wikipedia.org/wiki/Hauptseite. Each request to http://de.wikipedia.org therefore generates a second browser request, this time for http://de.wikipedia.org/wiki/Hauptseite.
Regards,
jens
Mathias Schindler wrote:
> en John_Edwards_presidential_campaign%2C_2008 16 16
> en John_Edwards_presidential_campaign,_2008 5 5
> Is there a way that the wikistats files could take this into account
This is one thing you need to postprocess. Another thing is that every language has a default 8-bit encoding, which is Windows-1252 (a superset of Latin-1) for German, Swedish and English, but a different 8-bit encoding for other languages. Even though %2C and %2c mean comma in all character sets, a code such as %df can mean different things on the German and Polish Wikipedia. A third thing is that Spezial: (in German) and Special: need to be treated as the same namespace.
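
To make the three postprocessing steps concrete, here is a rough Python sketch, not anything the wikistats tools actually do. It assumes each line is "project title count ..." with the request count in the third field. The function names, the "try UTF-8 first, then fall back" rule, the pl entry in the encoding table and the one-entry alias table are all my own assumptions for illustration; only the Windows-1252 default for de, sv and en comes from the text above.

```python
from collections import defaultdict
from urllib.parse import unquote_to_bytes

# Fallback 8-bit encodings per project. The cp1252 entries follow the
# thread; the "pl" entry is an assumption made for this sketch.
LEGACY_ENCODING = {"de": "cp1252", "sv": "cp1252", "en": "cp1252",
                   "pl": "iso-8859-2"}

# Hypothetical alias table: localized namespace prefix -> canonical prefix.
NAMESPACE_ALIASES = {"de": {"Spezial": "Special"}}


def normalize_title(lang, raw_title):
    """Return one canonical form for a percent-encoded page title."""
    data = unquote_to_bytes(raw_title)  # handles %2C and %2c alike
    try:
        title = data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: fall back to the project's assumed 8-bit encoding.
        encoding = LEGACY_ENCODING.get(lang, "cp1252")
        title = data.decode(encoding, errors="replace")
    title = title.replace(" ", "_")
    prefix, sep, rest = title.partition(":")
    canonical = NAMESPACE_ALIASES.get(lang, {}).get(prefix)
    if sep and canonical:
        title = canonical + ":" + rest
    return title


def merge_counts(lines):
    """Sum the request counts of lines that normalize to the same title."""
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed lines
        lang, raw_title, count = fields[0], fields[1], int(fields[2])
        totals[(lang, normalize_title(lang, raw_title))] += count
    return dict(totals)
```

Fed the two John_Edwards lines quoted above, merge_counts returns a single entry for the comma form of the title, with the two counts added together. A real alias table would need every localized namespace name per project.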
> Second, I find the number of requests for the individual main pages rather low. Might this be due to the counting style, which only counts people who explicitly asked for, say, http://de.wikipedia.org/wiki/Hauptseite rather than http://de.wikipedia.org ?
The latter might not be included in Domas' logs, since it doesn't start with /wiki/.
wikitech-l@lists.wikimedia.org