Hey, thanks for the tip. I tried grepping with the following command:

grep -Rl "SELECT" . | grep -v "/.svn/" | grep -v "/docs/" | grep -v "/maintenance/"

and got the list below. It's really long. I have excluded the maintenance and docs directories completely. The files in the includes directory are the first place I'll be looking.
./extensions/CrossNamespaceLinks/SpecialCrossNamespaceLinks_body.php
./extensions/CategoryTree/CategoryTreeFunctions.php
./includes/SearchTsearch2.php
./includes/SpecialAncientpages.php
./includes/SpecialLonelypages.php
./includes/SpecialWithoutinterwiki.php
./includes/ImagePage.php
./includes/SearchOracle.php
./includes/Export.php
./includes/SpecialUncategorizedpages.php
./includes/SpecialRecentchanges.php
./includes/SpecialMostlinked.php
./includes/Block.php
./includes/Sanitizer.php
./includes/SpecialRecentchangeslinked.php
./includes/SpecialWantedcategories.php
./includes/FileStore.php
./includes/LinkCache.php
./includes/SpecialUnusedcategories.php
./includes/SpecialDeadendpages.php
./includes/BagOStuff.php
./includes/SpecialShortpages.php
./includes/SpecialFewestrevisions.php
./includes/filerepo/File.php
./includes/filerepo/ICRepo.php
./includes/filerepo/LocalFile.php
./includes/SpecialUnusedimages.php
./includes/QueryPage.php
./includes/SiteStats.php
./includes/SpecialUnwatchedpages.php
./includes/Parser.php
./includes/SpecialExport.php
./includes/DatabaseOracle.php
./includes/Parser_OldPP.php
./includes/SearchPostgres.php
./includes/SpecialMostcategories.php
./includes/SpecialListredirects.php
./includes/SpecialLog.php
./includes/SpecialMostlinkedtemplates.php
./includes/Title.php
./includes/SpecialDisambiguations.php
./includes/SpecialDoubleRedirects.php
./includes/SkinTemplate.php
./includes/SpecialRandompage.php
./includes/SpecialMIMEsearch.php
./includes/SpecialPopularpages.php
./includes/LinkBatch.php
./includes/SpecialWantedpages.php
./includes/api/ApiQueryRecentChanges.php
./includes/Database.php
./includes/SpecialMostlinkedcategories.php
./includes/SpecialMostimages.php
./includes/Skin.php
./includes/SpecialBrokenRedirects.php
./includes/SpecialWatchlist.php
./includes/SearchMySQL.php
./includes/DatabasePostgres.php
./includes/SpecialNewimages.php
./includes/SpecialUnusedtemplates.php
./includes/SpecialMostrevisions.php
./includes/Categoryfinder.php
./includes/SpecialAllmessages.php
./includes/SpecialNewpages.php
./includes/SpecialUncategorizedimages.php
./includes/SpecialUpload.php
./includes/LinksUpdate.php
./includes/Article.php
./includes/WatchlistEditor.php
./skins/disabled/MonoBookCBT.php
./config/index.php
./tests/DatabaseTest.php
./tests/MediaWiki_TestCase.php
./profileinfo.php
------------------------------------------
However, it turned out that access to bzipped files was way too slow, unzipped data was way too large to be of use, and re-indexing would take ages. I even tried sqlite, which bogged down. Maybe sqlite3 does better these days.
The KDE app for reading wiki dumps does exactly that, reading directly from the bz2 files, and it is not slow: http://www.kde-apps.org/content/show.php?content=65244
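My guess at why it isn't slow: it presumably keeps an index over the compressed stream boundaries, so it can seek to and decompress only the chunk holding the wanted page instead of the whole dump. A minimal sketch of that idea, assuming the dump is split into independently decompressible bz2 streams and a hypothetical pre-built sqlite table mapping titles to byte offsets:

<?php
// Sketch only. Assumed index schema (built in a separate pass):
//   CREATE TABLE page_index (title TEXT PRIMARY KEY,
//                            start INTEGER, end INTEGER);
// start/end are byte offsets of the bz2 stream holding the page.
function fetchPageChunk( $dumpFile, $indexFile, $title ) {
    $db = new SQLite3( $indexFile );
    $stmt = $db->prepare(
        'SELECT start, end FROM page_index WHERE title = :t' );
    $stmt->bindValue( ':t', $title, SQLITE3_TEXT );
    $row = $stmt->execute()->fetchArray( SQLITE3_ASSOC );
    if ( !$row ) {
        return false; // page not in the index
    }
    // Seek straight to the stream and decompress only that chunk,
    // rather than the whole multi-gigabyte dump.
    $fh = fopen( $dumpFile, 'rb' );
    fseek( $fh, $row['start'] );
    $compressed = fread( $fh, $row['end'] - $row['start'] );
    fclose( $fh );
    return bzdecompress( $compressed ); // XML fragment with the page
}

With an index like that, a lookup costs one sqlite query plus decompressing a single stream, regardless of dump size.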
On 2/19/08, Magnus Manske magnusmanske@googlemail.com wrote:
On Feb 19, 2008 3:04 AM, Apple Grew applegrew@gmail.com wrote:
On Feb 19, 2008 3:25 AM, Roan Kattouw roan.kattouw@home.nl wrote:
This more or less exists already in the API:
http://en.wikipedia.org/w/api.php?action=parse&text=%5B%5Bhello%5D%5D&am...
Roan Kattouw (Catrope)
The problem with this is that it needs a working MediaWiki install with a database, and the database must also contain the necessary template pages. If we use the API on the official website instead, we need a working internet connection (at least while parsing the XML file, not constantly), and it is pointless anyway, since the XML file already contains the template information.
One approach I took some time ago was to alter the database access script. As a quick hack, use regexps to find queries that want text or data, then return bogus data (where it's unimportant for the rendering) or real text (retrieved from the XML dump). Ignore anything that doesn't start with "SELECT" ;-)
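Very roughly, the shape of it is something like this (DumpBackedDB and the dump reader are made-up names for illustration, not actual MediaWiki classes):

<?php
// Illustrative sketch: sniff each SQL string with regexps, answer
// revision-text queries from the XML dump, and hand back
// empty-but-valid results for everything else. $dumpReader is an
// assumed helper that can pull revision text out of the dump.
class DumpBackedDB {
    private $dump;

    function __construct( $dumpReader ) {
        $this->dump = $dumpReader;
    }

    function query( $sql ) {
        // Ignore anything that doesn't start with SELECT.
        if ( !preg_match( '/^\s*SELECT/i', $sql ) ) {
            return array();
        }
        // Queries for revision text (text table: old_id, old_text)
        // get the real content from the dump.
        if ( preg_match( '/old_text.+?old_id\s*=\s*(\d+)/is', $sql, $m ) ) {
            $text = $this->dump->getText( (int)$m[1] );
            return array( array( 'old_text' => $text ) );
        }
        // Everything else gets bogus data; what these return is
        // unimportant for the rendering.
        return array();
    }
}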
However, it turned out that access to bzipped files was way too slow, unzipped data was way too large to be of use, and re-indexing would take ages. I even tried sqlite, which bogged down. Maybe sqlite3 does better these days.
Magnus
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l