Good day,
This is my first post to this list and, having read the last couple of posts, I fear my question might be a bit too low-level. But I still hope that you might point me in the right direction.
I've been working on this extension that generates qrcode bitmaps and displays them on a wiki page [0].
In that extension, I'm using the upload() method made available by the LocalFile object, as documented on [1]. In my specific case, the relevant code looks like this:
    $ft = Title::makeTitleSafe( NS_FILE, $this->_dstFileName );
    $localfile = wfLocalFile( $ft );
    $saveName = $localfile->getName();
    $pageText = 'QrCode [...]';
    $status = $localfile->upload( $tmpName, $this->_label, $pageText,
        File::DELETE_SOURCE, false, false, $this->_getBot() );
The extension is implemented as a parser function hooked into ParserFirstCallInit.
Now, I haven't found any other explanation, so I suppose this use of the upload() method leads to a peculiar behaviour on my wiki installation, exhibited by these things:
1. QrCodes are generated for pages that do not have or transclude a {{#qrcode:}} function call, in this case properties [2,3,4].
2. These uploaded files have properties [5] and they belong to a category, which means they get linked in the categorylinks table. A common result of this is that qrcode images turn up in, e.g., semantic queries [9,10].
3. Qrcodes are even generated for existing qrcodes [6,7]. One way to trigger that behaviour is to visit a File's page and click on the Delete link, without actually deleting the file. This leads to situations such as [8].
4. The files get linked from several pages, as this example shows [11]. None of the pages said to link to the file actually include it, and the set of pages varies (2 days ago 14 pages linked to it, today only 7 do).
5. Browsing the properties of the above file [12], you can see that it got somehow mixed up with a completely different event.
6. Looking at the database, the mixup hypothesis is confirmed:
    SELECT page_id, page_title, cl_sortkey
    FROM `page`
    INNER JOIN `categorylinks` FORCE INDEX (cl_sortkey)
        ON ((cl_from = page_id))
    LEFT JOIN `category`
        ON ((cat_title = page_title AND page_namespace = 14))
    WHERE (1 = 1) AND cl_to = 'Project'
    ORDER BY cl_sortkey
gives (among other data):
    page_id  page_title   cl_sortkey
    1403     SMS2Space    File:QR-Ask.png
    1244     Syn2Sat      File:QR-LetzHack.png
    1251     ChillyChill  File:QR-Syn2cat-radio-ara.png.png
This behaviour occurs in both mw 1.15.5 and 1.16. I would be very grateful if someone more experienced could have a look at this situation. Maybe I'm using the upload() method in a way I should not.
sincerely, David Raison
[0] http://www.mediawiki.org/wiki/Extension:QrCode
[1] http://svn.wikimedia.org/doc/classLocalFile.html#4b626952ae0390a7fa453a4bfec...
[2] https://www.hackerspace.lu/wiki/File:QR-Is_U19.png
[3] https://www.hackerspace.lu/wiki/File:QR-Has_SingleIssuePrice.png
[4] https://www.hackerspace.lu/wiki/File:QR-Has_Issues.png
[5] https://www.hackerspace.lu/wiki/Property:Has_SingleIssuePrice
[6] https://www.hackerspace.lu/wiki/File:QR-QR-Location.png.png
[7] https://www.hackerspace.lu/w/index.php?title=Special:RecentChanges&hideb...
[8] https://www.hackerspace.lu/wiki/File:QR-QR-QR-QR-Location.png.png.png.png
[9] https://www.hackerspace.lu/wiki/Projects#Concluded_Projects
[10] https://www.hackerspace.lu/wiki/Special:BrowseData#Q
[11] https://www.hackerspace.lu/wiki/File:QR-Syn2cat.png
[12] https://www.hackerspace.lu/wiki/Special:Browse/File:QR-2DSyn2cat.png
- -- The Hackerspace in Luxembourg! syn2cat a.s.b.l. - Promoting social and technical innovations 11, rue du cimetière | Pavillon "Am Hueflach" L-8018 Strassen | Luxembourg http://www.hackerspace.lu - ---- mailto:david@hackerspace.lu xmpp:kwisatz@jabber.hackerspaces.org mobile: +43 650 73 63 834 | +352 691 44 23 24 ++++++++++++++++++++++++++++++++++++++++++++ Wear your geek: http://syn2cat.spreadshirt.net
On Fri, Sep 24, 2010 at 2:12 PM, David Raison wrote:
I've been working on this extension that generates qrcode bitmaps and displays them on a wiki page [0].
Hi David! I've actually been peeking at this extension as I'd like to use something like this to generate scannable QR codes with Android software download links for other projects. :)
1. QrCodes are generated for pages that do not have or transclude a {{#qrcode:}} function call, in this case properties [2,3,4].
I haven't fully traced out the execution, but I do notice a few things in the code that look suspicious.
It looks like you're naming the destination file based on the wiki page that has the {{#qrcode}} in it by pulling $wgTitle:
    // Use this page's title as part of the filename (Also regenerates qrcodes when the label changes).
    $this->_dstFileName = 'QR-'.$wgTitle->getDBKey().$append.'.png';
This might be the cause of some of your problems here... background jobs may run re-parses of other seemingly unconnected wiki pages during a request, and other fun things where $wgTitle isn't what you expect, and that might be one cause of it triggering with an unexpected title. You may find that it's more reliable to use $parser->getTitle(), which should definitely return the title for the page being actively parsed.
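To make the difference concrete, here is a minimal sketch of how the two lookups can disagree. StubTitle, StubParser, nameFromGlobal() and nameFromParser() are made-up stand-ins for illustration, not the real MediaWiki classes or anything in the extension, and the page names are just examples taken from this thread:

```php
<?php
// Stand-ins for illustration only; these are NOT the MediaWiki classes.
class StubTitle {
    private $dbKey;
    public function __construct( $dbKey ) { $this->dbKey = $dbKey; }
    public function getDBkey() { return $this->dbKey; }
}

class StubParser {
    private $title;
    public function __construct( StubTitle $title ) { $this->title = $title; }
    public function getTitle() { return $this->title; }
}

// Assume the web request is for "Projects", so the global holds that title.
$wgTitle = new StubTitle( 'Projects' );

// Builds the file name from the request-wide global.
function nameFromGlobal() {
    global $wgTitle;
    return 'QR-' . $wgTitle->getDBkey() . '.png';
}

// Builds the file name from the parser that is actually doing the parse.
function nameFromParser( StubParser $parser ) {
    return 'QR-' . $parser->getTitle()->getDBkey() . '.png';
}

// Assume a background job re-parses a different page in the same request.
$jobParser = new StubParser( new StubTitle( 'SMS2Space' ) );

echo nameFromGlobal(), "\n";             // QR-Projects.png
echo nameFromParser( $jobParser ), "\n"; // QR-SMS2Space.png
```

The global still names the requested page while the parser names the page being parsed, so a file name built from the global can end up attached to the wrong page.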
More generally, using the calling page's title means that you can't easily put multiple codes on a single page, and the same code used on different pages will get copied unnecessarily.
I'd recommend naming the file using a hash of the properties used to generate the image, instead of naming it for the using page. This will make your code a bit more independent of where it gets called from, and will let you both put multiple code images on one page and let common images be shared among multiple pages.
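As a rough sketch of what hash-based naming could look like: qrCodeFileName() is a hypothetical helper, the parameter names 'label', 'size' and 'ecc' are assumptions about what determines the image, and md5 is only one possible choice of hash:

```php
<?php
// Hypothetical helper, not part of the QrCode extension: derives a
// stable file name from the parameters that determine the image.
function qrCodeFileName( array $params ) {
    // Sort by key so the same parameters always hash identically,
    // whatever order they were passed in.
    ksort( $params );
    // md5() of the serialized parameters yields 32 hex characters.
    return 'QR-' . md5( serialize( $params ) ) . '.png';
}

$a = qrCodeFileName( array( 'label' => 'http://www.hackerspace.lu', 'size' => 6, 'ecc' => 'M' ) );
$b = qrCodeFileName( array( 'ecc' => 'M', 'size' => 6, 'label' => 'http://www.hackerspace.lu' ) );
$c = qrCodeFileName( array( 'label' => 'http://www.hackerspace.lu', 'size' => 8, 'ecc' => 'M' ) );

var_dump( $a === $b ); // bool(true): same parameters, same file name
var_dump( $a === $c ); // bool(false): different size, different file name
```

Because the name depends only on the inputs, two pages asking for the same code resolve to the same file, and two different codes on one page get two different files without needing a per-page suffix.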
One potential problem is garbage collection: a code that gets generated and used, then removed and not used again will still have been loaded into the system. This is an existing problem with things like the texvc math system, but is a bit more visible here because the images appear in the local uploads area within the wiki. (However they'll be deletable by admins, so not too awful!)
6. Looking at the database, the mixup hypothesis is confirmed:
    SELECT page_id, page_title, cl_sortkey
    FROM `page`
    INNER JOIN `categorylinks` FORCE INDEX (cl_sortkey)
        ON ((cl_from = page_id))
    LEFT JOIN `category`
        ON ((cat_title = page_title AND page_namespace = 14))
    WHERE (1 = 1) AND cl_to = 'Project'
    ORDER BY cl_sortkey
gives (among other data):
    page_id  page_title   cl_sortkey
    1403     SMS2Space    File:QR-Ask.png
    1244     Syn2Sat      File:QR-LetzHack.png
    1251     ChillyChill  File:QR-Syn2cat-radio-ara.png.png
It's possible that the internal uploading process interferes with global parsing state when it generates and saves the description page for the wiki; if so, fixing that may require jumping through some interesting hoops. :)
-- brion
Hi Brion,
thanks for your quick answer!
On 25/09/10 00:27, Brion Vibber wrote:
Hi David! I've actually been peeking at this extension as I'd like to use something like this to generate scannable QR codes with Android software download links for other projects. :)
Yeah, I was really astonished that there wasn't already a qrcode extension.
1. QrCodes are generated for pages that do not have or transclude a {{#qrcode:}} function call, in this case properties [2,3,4].
I haven't fully traced out the execution, but I do notice a few things in the code that look suspicious.
It looks like you're naming the destination file based on the wiki page that has the {{#qrcode}} in it by pulling $wgTitle:
    // Use this page's title as part of the filename (Also regenerates qrcodes when the label changes).
    $this->_dstFileName = 'QR-'.$wgTitle->getDBKey().$append.'.png';
This might be the cause of some of your problems here... background jobs may run re-parses of other seemingly unconnected wiki pages during a request, and other fun things where $wgTitle isn't what you expect, and that might be one cause of it triggering with an unexpected title. You may find that it's more reliable to use $parser->getTitle(), which should definitely return the title for the page being actively parsed.
Ok, I'll try that one then, or not, as you suggest below.
More generally, using the calling page's title means that you can't easily put multiple codes on a single page, and the same code used on different pages will get copied unnecessarily.
Well you can use multiple codes on a single page, as demonstrated on the Sandbox [a] and made possible by the $append variable, but you're certainly right about the latter part.
I'd recommend naming the file using a hash of the properties used to generate the image, instead of naming it for the using page. This will make your code a bit more independent of where it gets called from, and will let you both put multiple code images on one page and let common images be shared among multiple pages.
Will do that too then.
One potential problem is garbage collection: a code that gets generated and used, then removed and not used again will still have been loaded into the system. This is an existing problem with things like the texvc math system, but is a bit more visible here because the images appear in the local uploads area within the wiki. (However they'll be deletable by admins, so not too awful!)
Having them uploaded was one of the main reasons I save the images instead of just returning a URL for the src attribute of an image tag. But I guess you could have a bot run over them, or I suppose there's a hook triggered on deleting a page that would also allow deleting the qrcodes embedded in or linked to it.
6. Looking at the database, the mixup hypothesis is confirmed:
It's possible that the internal uploading process interferes with global parsing state when it generates and saves the description page for the wiki; if so, fixing that may require jumping through some interesting hoops. :)
Well then let's hope that the $parser->getTitle() alternative solves the problem.
David
On 25/09/10 01:40, David Raison wrote:
I'd recommend naming the file using a hash of the properties used to generate the image, instead of naming it for the using page. This will make your code a bit more independent of where it gets called from, and will let you both put multiple code images on one page and let common images be shared among multiple pages.
Will do that too then.
Hmm... this is interesting though... If you check out the source, you'll see that I have replaced every use of the global $wgTitle with the object returned by $parser->getTitle().
And though I don't set any variables (I only read them), when you first refresh a page with a QrCode on it, the page's title gets replaced with the qrcode's.
6. Looking at the database, the mixup hypothesis is confirmed:
It's possible that the internal uploading process interferes with global parsing state when it generates and saves the description page for the wiki;
Maybe it is this problem after all, come to think of it! When you upload an image using the Special:Upload page, the resulting page's title exhibits exactly the behaviour mentioned above: it turns into
File:<name of uploaded file>
for example:
File:QR-ee09b666b60225368736dfaef75c62ea.png
if so, fixing that may require jumping through some interesting hoops. :)
Can I just use another method then, for example publish() in combination with some other methods that enter the upload into the database?
David
On 25/09/10 03:32, David Raison wrote:
Will do that too then.
Hmm... this is interesting though... If you check out the source, you'll see that I have replaced every call to the global $wgTitle by the object returned by $parser->getTitle()
And though I don't set a var (but only read them), when you first refresh a page with a QrCode on it, it replaces the page's title with the qrcode's
Sorry for posting three times in a row, but I just noticed what might be a sign that using $parser->getTitle() is even making things worse than using the global $wgTitle object. Cf. https://www.hackerspace.lu/wiki/Projects
cheers, David
wikitech-l@lists.wikimedia.org