2011/4/6 Daniel Kinzler <daniel(a)brightbyte.de>
On 06.04.2011 09:15, Alex Brollo wrote:
I saved the HTML source of a typical Page: page from it.source; the
resulting txt file is ~28 kB. Then I saved the "core html" only, i.e.
the content of <div class="pagetext">, and that file is 2.1 kB; so
there's a more than tenfold ratio between "container" and "real
content".
wow, really? that seems a lot...
Is there a trick to download the "core html" only?
there are two ways:
a) the old style "render" action, like this:
<http://en.wikipedia.org/wiki/Foo?action=render>
b) the api "parse" action, like this:
<http://en.wikipedia.org/w/api.php?action=parse&page=Foo&redirects=1…>
To learn more about the web API, have a look at
<http://www.mediawiki.org/wiki/API>
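[For the archives: a minimal sketch of option (b), using only the Python
standard library. The endpoint and parameters follow the api.php "parse"
example above; the helper names (parse_url, extract_html) and the use of
en.wikipedia.org are my own choices, not something from this thread.]

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: English Wikipedia; swap in any MediaWiki api.php endpoint.
API = "https://en.wikipedia.org/w/api.php"

def parse_url(page):
    """Build the api.php URL that returns only the parsed page body."""
    params = {
        "action": "parse",
        "page": page,
        "redirects": 1,
        "prop": "text",    # ask for the rendered HTML only
        "format": "json",
    }
    return API + "?" + urlencode(params)

def extract_html(response_json):
    """Pull the rendered HTML string out of the parse-action JSON."""
    # With format=json (formatversion 1) the HTML sits under parse.text["*"].
    return response_json["parse"]["text"]["*"]

# Usage (performs a real HTTP request):
# with urlopen(parse_url("Foo")) as r:
#     html = extract_html(json.load(r))
```

Asking for prop=text keeps the response close to the "core html" Alex
wants, rather than the full skin and container markup.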
Thanks Daniel, API stuff is a little hard for me: the more I study, the
less I edit. :-)
Just to have a try, I called the same page: the "render" action gives a file
of ~3.4 kB, the "api" action a file of ~5.6 kB. Obviously I'm thinking of bot
downloads. Are you suggesting that it would be a good idea to use an
*unlogged* bot to avoid page parsing, and to fetch the page code from some
cache? I know that a few thousand calls are nothing for the wiki servers,
but... I always try to get good performance, even from the most banal
template.
Alex