Hi!
Platonides wrote:
seth wrote:
I wrote a Perl script that works on the HTML content of some Wikipedia pages. Some of those pages are >300 kB, and Perl's LWP mirror hangs.
I was wrong: LWP's mirror did not hang, but the content was not fully loaded because of caching. After I purged the page manually, everything was OK.
- Is there a better/faster way to get the HTML content of e.g.
http://meta.wikimedia.org/wiki/Spam_blacklist/Log than my $ua = LWP::UserAgent->new; $ua->mirror($url, $filename); ?
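For what it's worth, here is a minimal sketch of working around the caching problem directly with LWP by purging before mirroring. It assumes the wiki accepts a plain GET on ?action=purge (some MediaWiki setups may instead require a POST or a confirmation step), and the filename is just an example:

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $url      = 'http://meta.wikimedia.org/wiki/Spam_blacklist/Log';
my $filename = 'Spam_blacklist_Log.html';  # example local target

my $ua = LWP::UserAgent->new;

# Ask MediaWiki to regenerate its cached copy of the page first.
# (Assumption: this wiki honours a simple GET on ?action=purge.)
$ua->get("$url?action=purge");

# mirror() sends If-Modified-Since based on the local file's mtime,
# so it only downloads when the server copy is newer; the purge above
# should make the server's cached copy current.
my $res = $ua->mirror($url, $filename);
print $res->status_line, "\n";
```

mirror() returns an HTTP::Response, so "200 OK" means a fresh download and "304 Not Modified" means the local file was already up to date.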
To get the content of Wikipedia pages, you should be using WikiProxy: http://meta.wikimedia.org/wiki/User:Duesentrieb/WikiProxy
Does that tool purge automatically? Is there a manual for it?
bye seth