Re: [Wikitech-l] thumb generation

15 Sep 2015


On Mon, Sep 14, 2015 at 4:49 PM, Platonides <platonides@gmail.com> wrote:
> ...
> You know it will fail for all kind of images included through templates
> (particularly infoboxes), right?

Indeed, it is not possible to find out what thumbnails are used by a page
without actually parsing it. Your best bet is to wait until Parsoid dumps
become available (T17017 https://phabricator.wikimedia.org/T17017), then
go through those with an XML parser and extract the thumb URLs. That's
still slow but not as slow as the MediaWiki parser. (Or you can try to find
a regexp which matches thumbnail URLs, but we all know what happens when you
use a regexp to parse HTML <http://stackoverflow.com/a/1732454/323407>.)
After that, just throw those URLs at the 404 handler.
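
A rough sketch of the extraction step, in case it helps. This is only an illustration under assumptions: it uses Python's standard html.parser rather than a strict XML parser (swap one in if the dumps turn out to be well-formed XML), it assumes thumbnails can be recognized by a "/thumb/" segment in the URL path, and the names ThumbCollector and extract_thumb_urls, as well as the sample markup and host, are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class ThumbCollector(HTMLParser):
    """Collect <img src> values whose path looks like a thumbnail URL."""

    def __init__(self):
        super().__init__()
        self.thumb_urls = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        src = dict(attrs).get("src")
        # Assumption: thumbnails live under a "/thumb/" path segment.
        if src and "/thumb/" in urlsplit(src).path:
            self.thumb_urls.append(src)


def extract_thumb_urls(html_text):
    """Return thumbnail-looking image URLs in document order."""
    collector = ThumbCollector()
    collector.feed(html_text)
    collector.close()
    return collector.thumb_urls


# Hypothetical sample; real dump markup may differ.
sample = (
    '<p><img src="//upload.example.org/wiki/thumb/a/ab/Foo.jpg/220px-Foo.jpg"/>'
    '<img src="/static/logo.png"></p>'
)
print(extract_thumb_urls(sample))
```

The resulting list is what you would then request one URL at a time; that part is left out here since it depends on how hard you are allowed to hit the servers.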
