Please add LiquidThreads to the "Component" list of the extensions
list of bugzilla.wikimedia.org.
The current version of the LiquidThreads extension has a problem when
you attempt to start a thread on any subpage of a user. I suggested
a fix to the original author a couple of months ago, but he hasn't
had time to review it. I am using the attached patch. I would
appreciate it if someone could tell me if this is a reasonable fix
for my problem. It seems to work, but I don't know my way around
MediaWiki and so would appreciate some feedback.
Thanks,
Jim
Hi all,
I'm proposing to apply my patch for global groups (as seen on bug
13773[1]) to the CentralAuth extension in the next few days.
Nikerabbit and VasilievVV have already had a look over it, but it's a
very big change, including changes to the centralauth schema, and so
I'm soliciting a bit more review and/or general awareness of it.
In essence, the extension stores a global equivalent of
$wgGroupPermissions and the user_groups table in the centralauth
database, and grants users the appropriate rights. A maintenance
script is available for adding all local stewards to a global steward
group (which is created by the database patch).
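Conceptually, a user's effective rights become the union of whatever their local groups and their global groups grant. A minimal Python sketch of that resolution (hypothetical names and data, not the actual CentralAuth code):

```python
# Hypothetical sketch of merging global group rights with local ones.
# Names and rights here are made up; they do not match the real code.

LOCAL_GROUP_PERMISSIONS = {          # analogue of $wgGroupPermissions
    "sysop": {"delete", "block"},
}
GLOBAL_GROUP_PERMISSIONS = {         # analogue of what the centralauth db stores
    "steward": {"userrights", "centralauth-admin"},
}

def effective_rights(local_groups, global_groups):
    """Union of rights granted by local and global group membership."""
    rights = set()
    for g in local_groups:
        rights |= LOCAL_GROUP_PERMISSIONS.get(g, set())
    for g in global_groups:
        rights |= GLOBAL_GROUP_PERMISSIONS.get(g, set())
    return rights

print(sorted(effective_rights(["sysop"], ["steward"])))
```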
I don't currently propose to immediately enable the interface for
stewards - we may need to consult the community first, and, in any
case, we may wish to grant usage of the interface to some other class
of users (perhaps shell users, for the moment?)
--
Andrew Garrett
Could someone *please* add the option to importDump.php to start from a
certain page number?
Because right now, whenever the script dies and is restarted, it goes from
the start and not from the last page it imported, which is just a waste of
time given that it is already slow. So what do you think?
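Until such an option exists, a crude pre-filter over the XML dump is roughly what I have in mind - a sketch (line-based, assuming `<page>`/`</page>` tags sit on their own lines as in the standard dumps):

```python
# Sketch of a pre-filter for resuming an import: copy an XML dump while
# skipping the first N-1 <page> blocks, so importDump.php only sees the
# remainder. Crude and line-based; not part of the real maintenance script.

def skip_pages(lines, start_page):
    """Yield header/footer lines plus pages numbered >= start_page (1-based)."""
    page_no = 0
    in_skipped = False
    for line in lines:
        if "<page>" in line:
            page_no += 1
            in_skipped = page_no < start_page
        if not in_skipped:
            yield line
        if "</page>" in line:
            in_skipped = False

# The filtered stream would then be fed to importDump.php as usual.
```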
--
--alnokta
Hi all,
I have, in my working copy, some changes to CentralAuth which
introduce the concept of a global user group. That is, a group
attached to a global account, which applies on all wikis.
To use CentralAuth as a guinea pig for core, I've included a "group
management" interface, which allows users with the appropriate
*global* permissions (local permissions don't count) to edit global
group assignments, which are stored in the CentralAuth database. A
screenshot of the new interface is available. [1] If this interface is
successful, I will consider implementing it in core, to make
permissions changes a steward request, not a shell request.
This is a pretty major change, and has very wide-reaching
implications, both politically and technically (particularly with
respect to performance). Consequently, I'm asking for wide review of
my patch, which is available on bug 13773 [2], of my methodology for
implementing this change, and of the idea itself.
Known Issues:
* It's not currently clear how to delete a group (you can only unassign all
its permissions). I'm going to add a 'delete this group' button in the
future.
* There's not currently a list of all users in a group. I need to add
this functionality to Special:GlobalUsers, as well as changing the
logging target.
[1] - http://www.mediawiki.org/wiki/Image:Global_Group_Management.png
[2] - https://bugzilla.wikimedia.org/show_bug.cgi?id=13773
--
Andrew Garrett
To keep one's website's links fresh, one uses a linkchecker to detect
broken links. But how is a linkchecker to check if a wikipedia article
exists, given that this returns
$ HEAD -H User-Agent: http://*.wikipedia.org/wiki/*|sed q
200 OK
for any article, non-existent and existent, and even the
'our servers are experiencing technical problems' message.
(-H avoids 403 Forbidden)
Should I make a list of the wikipedia URLs I want to check and send
each to an API URL instead? Will that API URL return a "more correct"
HTTP code?
Or must I do something like
$ GET $URL|grep 'wiki.* does not currently have an article called .*,
maybe it was deleted' && echo $URL Broken
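For what it's worth, the api.php query interface does report non-existence explicitly, unlike the article HTML, which returns 200 OK either way. A sketch of checking the JSON response (assuming a query like `api.php?action=query&titles=TITLE&format=json`, where a missing page carries a "missing" key):

```python
# Sketch: decide whether a page exists from a MediaWiki API query
# response, rather than from the HTTP status of the article URL.
import json

def page_exists(api_response_text):
    """True if no page in the API query response is flagged as missing."""
    data = json.loads(api_response_text)
    pages = data["query"]["pages"]
    return all("missing" not in p for p in pages.values())

# Abbreviated examples of the two response shapes:
missing = '{"query":{"pages":{"-1":{"title":"No_such_page","missing":""}}}}'
present = '{"query":{"pages":{"5043734":{"pageid":5043734,"title":"MediaWiki"}}}}'
print(page_exists(missing), page_exists(present))
```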
Hi everyone,
I am trying to import the page table from the 20080312 dump and I am getting
a duplicate entry error. Does anyone know if there is a problem with this
dump, and if not could someone help me figure out what I'm doing wrong?
I am using the following command to import the file:
mysql -u wiki -p wiki20080312 < enwiki-20080312-page.sql
And received the following error:
ERROR 1062 (23000) at line 338: Duplicate entry '0-' for key 2
The failing key seems to be page_namespace + page_title, so it looks like
there are multiple pages with an empty title? Any ideas on how to get
around this?
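One workaround I'm considering, assuming the duplicate rows are safe to skip, is rewriting the dump's INSERT statements as INSERT IGNORE so MySQL drops duplicate-key rows instead of aborting (a sketch; `mysql --force`, which continues past errors, is another option):

```python
# Sketch: rewrite INSERT INTO as INSERT IGNORE INTO so duplicate-key
# rows are skipped rather than aborting the whole import.

def to_insert_ignore(line):
    """Rewrite one dump line; non-INSERT lines pass through unchanged."""
    if line.startswith("INSERT INTO"):
        return "INSERT IGNORE INTO" + line[len("INSERT INTO"):]
    return line

# e.g. stream the dump through this and pipe it on to mysql:
#   python fix_dump.py < enwiki-20080312-page.sql | mysql -u wiki -p wiki20080312
print(to_insert_ignore("INSERT INTO `page` VALUES (1,0,'Main_Page');"))
```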
Mark
On Thu, 24 Apr 2008, Robert Stojnic wrote:
> Bryan Tong Minh wrote:
>
> >On Wed, Apr 23, 2008 at 11:06 PM, Robert Stojnic <rainmansr(a)gmail.com>
> wrote:
> >
> >> What remains unsolved, however, is keeping the index updated with the
> >> latest changes
> >> on the site. If one changes a template with a category in it, the thing
> >> goes on the job queue.
> >> I assume there would need to be some kind of hook that will either log
> >> the change somewhere
> >> or send data to lucene somehow. This is the part of the backend that
> >> needs thinking and solving.
> >>
> >LinksUpdateComplete hook?
> >
> >
>
> Something like that, yes, but hook probably couldn't just connect to the
> lucene indexer and queue updates, since the indexer might be down for
> this or that reason... It might be a better solution to put updates into
> some table with date attached and then let the indexer fetch updates.
>
Yes, absolutely what I was saying - use the hook to write to a table. In
core, the table is MyISAM and fulltext-indexed; for big wikis, it's InnoDB,
and Lucene pulls data out of it on a separately defined schedule to
build/update the index.
How about a simple table like:
* pageid - what's this, an int?
* categories - a text field
* lastchange - a datetime (stamped by mysql on update)
Then, whenever the lucene thing runs, it just looks for records changed
since the last index update (as you said, Robert).
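A self-contained sketch of that flow, with sqlite3 standing in for MySQL and made-up names (the real thing would hang off a LinksUpdate-style hook):

```python
# Sketch of the proposed tracking table and the indexer's incremental
# query. sqlite3 stands in for MySQL; table and column names are made up.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE category_text (
    pageid     INTEGER PRIMARY KEY,
    categories TEXT,
    lastchange TEXT
)""")

def record_change(pageid, categories, ts):
    """What the hook would do on edit: upsert the page's category string."""
    conn.execute(
        "INSERT OR REPLACE INTO category_text VALUES (?, ?, ?)",
        (pageid, categories, ts))

def changed_since(ts):
    """What the Lucene indexer would poll for on its own schedule."""
    return conn.execute(
        "SELECT pageid, categories FROM category_text "
        "WHERE lastchange > ? ORDER BY lastchange",
        (ts,)).fetchall()

record_change(1, "Living_people English_comedy_writers", "2008-04-24 10:00")
record_change(2, "Articles_needing_cleanup", "2008-04-24 12:00")
print(changed_since("2008-04-24 11:00"))
```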
Aerik
--
http://www.wikidweb.com - the Wiki Directory of the Web
http://tagthis.info - Hosted Tagging for your website!
<rant>
I'm currently working on the Scott Foresman image donation, cutting
large scanned images into smaller, manually optimized ones.
The category containing the unprocessed images is
http://commons.wikimedia.org/wiki/Category:ScottForesman-raw
It's shameful. Honestly. Look at it. We're the world's #9 top web
site, and this is the best we can do?
Yes, I know that the images are large, both in dimensions
(~5000x5000px) and size (5-15MB each).
Yes, I know that ImageMagick has problems with such images.
But honestly, is there no open source software that can generate a
thumbnail from a 15MB PNG without nuking our servers?
If it really isn't possible (which I doubt, since I can generate
thumbnails from these with ImageMagick on my laptop, one at a time;
maybe a slow-running thumbnail generator, at least for "usual" sizes,
on a dedicated server?), there's no use cluttering the entire page with
broken thumbnails.
Where's the option for a list view? You know, a table with linked
title, size, uploader, date, no thumbnails? They're files, so why
don't we use things that have proven useful in a file system?
And then, of course:
"There are 200 files in this category."
That's two lines below the "(next 200)" link. At that point, we know
there are more than 200 images, but we forget about that two lines
further down?
Yes, I know that some categories are huge, and that it would take too
long to get the exact number.
But would the exact number for large categories be useful? 500,000 or
500,001 entries, who cares? How many categories are that large anyway?
200 or 582 entries, now /that/ people might care about.
Why not at least try to get a number, with a limit of, say, 5001, and
* give the exact number if there are fewer than 5001 entries
* say "over 5000 entries" if the query returns 5001
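In code terms, the capped count is just a LIMIT query one row past the cap - a sketch with sqlite3 standing in for the real categorylinks table:

```python
# Sketch of a capped category count: fetch at most LIMIT+1 member rows
# and report either the exact count or "over LIMIT". sqlite3 stands in
# for the production database here.
import sqlite3

LIMIT = 5000

def category_count(conn, category):
    """Exact count up to LIMIT, otherwise 'over LIMIT entries'."""
    rows = conn.execute(
        "SELECT 1 FROM categorylinks WHERE cl_to = ? LIMIT ?",
        (category, LIMIT + 1)).fetchall()
    n = len(rows)
    return f"over {LIMIT} entries" if n > LIMIT else f"{n} entries"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT)")
conn.executemany("INSERT INTO categorylinks VALUES (?, ?)",
                 [(i, "ScottForesman-raw") for i in range(582)])
print(category_count(conn, "ScottForesman-raw"))
```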
Yes, everyone's busy.
Yes, there are more pressing issues (SUL, stable versions, you name it).
Yes, MediaWiki wasn't developed as a media repository (tell me about it;-)
Yes, "sofixit" myself.
Still, I ask: is this the best we can do?
Magnus
</rant>
On Wed, 23 Apr 2008, Robert Stojnic wrote:
>
> Roan Kattouw wrote:
>
> >Brion Vibber schreef:
> >
> >>Should check whether Robert's already hacked some of this stuff into the
> >>lucene server or what changes it would require.
> >>
> >>
> >If I understand correctly, Lucene shouldn't really care what it stores,
> >as long as it's text and it's searchable. Storing "Living_people
> >Articles_needing_cleanup" would work just fine, right? We do need to
> >think about case-sensitivity, though.
> >
>
> Let me briefly repeat what I said earlier about my experience with this
> category
> intersection thingy. Adding categories to lucene index is easy *IF* they
> are inside
> the article, e.g. try this:
>
>
> http://en.wikipedia.org/w/index.php?title=Special%3ASearch&search=%2Bincate…
>
> This will give you category intersection of "Living People" and "English
> comedy writers"
> in fraction of the second.
>
Hey Robert,
That is really cool - but it seems to be doing a text match on the whole
article, not just the categories... ?
>
> What I found that the hard part is keeping the index updated. If we want
> a fancy category
> intersection system discussed here before we need to have an index that
> is frequently updated,
> that will be integrated with the job queue, that will understand
> templates etc..
>
That is always the hurdle with Lucene, right? It doesn't do updates, just
delete, re-add, and then optimize (and I'd guess optimizing can get
resource-intensive on a big index).
>
> Lucene is not that good with very frequent updates. The usual setting is
> to have an indexer,
> make snapshots of the index at regular intervals and then rsync it onto
> searchers. The whole
> process takes time, although for a category-only index it will probably
> be fast. I assume there
> would be at least few tens of minutes lag anyhow. Our current lucene
> framework could
> easily be used for index distribution and such.
>
> What remains unsolved, however, is keeping the index updated with the
> latest changes
> on the site. If one changes a template with a category in it, the thing
> goes on the job queue.
> I assume there would need to be some kind of hook that will either log
> the change somewhere
> or send data to lucene somehow. This is the part of the backend that
> needs thinking and solving.
>
Well... this isn't a complete plan, just some thoughts (and maybe they're
naive, but I'll give it a shot anyway). I'm thinking of a new table that
holds the pageid and a text field that holds the category strings, leaving
the underscores in place. This gets updated via a hook at the same time an
edit triggers an update to the categorylinks table (I'm not familiar enough
with the code to know the exact spot) - that would handle categories in
templates etc., leveraging the logic that already deals with this. Okay, so
once you build the table, the updates to it aren't too bad. In core, this
would be a MyISAM table with a fulltext index; for larger wikis, an InnoDB
table.
Then the question (the one I think you're raising) is at what point do we
refresh or update the lucene index from that table? I'm not sure of the
best answer to that question. Is it feasible to do delete/add every time a
category is changed, and then optimize once a day or something? (probably
not, eh?)
What are we doing for the main search index? rebuild daily or so? In an
initial implementation, why not follow the same type of schedule?
Alternately, perhaps do an update and optimize once an hour? Guess it
depends on how much time/resources it takes to update and optimize the
index... But certainly using the same schedule as the main index is a safe
and conservative plan?
Best Regards,
Aerik
--
http://www.wikidweb.com - the Wiki Directory of the Web
http://tagthis.info - Hosted Tagging for your website!
Also, archive.org will not work with the ProofreadPage extension.
Birgitte SB
> Message: 10
> Date: Thu, 24 Apr 2008 11:17:04 +0200
> From: Husky <huskyr(a)gmail.com>
> Subject: Re: [Wikitech-l] raising the upload limit
> To: "Wikimedia developers"
> <wikitech-l(a)lists.wikimedia.org>
>
> On Wed, Apr 23, 2008 at 9:41 PM, David Gerard
> <dgerard(a)gmail.com> wrote:
> > archive.org is better than nothing for now, or do
> they have onerous
> > requirements?
> I haven't looked into archive.org (yet). My feeling was
> that
> Wikimedia-related files should be hosted on Wikimedia
> servers, not on
> an external site (even if that site is very
> 'friendly' towards our
> type of content by allowing CC-licenses to be added to
> files).
> However, until Tim's proposal with the drop-box is
> really built
> archive.org is of course a good (temporary) solution.
>
> -- Hay / Husky
>
>
>
> ------------------------------
>
> Message: 11
> Date: Thu, 24 Apr 2008 06:14:21 -0500
> From: "Tim Laqua" <t.laqua(a)gmail.com>
> Subject: Re: [Wikitech-l] A blank wiki where only the
> article appears
> To: "'Wikimedia developers'"
> <wikitech-l(a)lists.wikimedia.org>
>
> You could use mod_rewrite to just rewrite every incoming
> request to exactly
> what you want them to see - completely ignoring any
> arguments, query
> strings, etc.
>
> -tl
>
> > -----Original Message-----
> > From: wikitech-l-bounces(a)lists.wikimedia.org
> [mailto:wikitech-l-
> > bounces(a)lists.wikimedia.org] On Behalf Of Jim Hu
> > Sent: Thursday, April 24, 2008 12:45 AM
> > To: Wikimedia developers
> > Subject: Re: [Wikitech-l] A blank wiki where only the
> article appears
> >
> > It might be easier to page-scrape into an alternate
> site using
> > action=render. At least if you don't want sneaky
> users figuring out
> > how to switch skins. Or can one restrict skin use?
> >
> > Jim
> >
> > On Apr 21, 2008, at 2:32 PM, Michael Daly wrote:
> >
> > > Leander Tal wrote:
> > >> Hello, i need a blank wiki where only the
> text of the article
> > >> appears for
> > >> the viewer. No header, footer, navigation,
> interaction, search,
> > >> toolbox,
> > >> languages, or categories should appear. Only
> the admin, or logged
> > >> in persons
> > >> should have the standard navigation, and the
> ability to edit, and
> > >> discuss
> > >> the page. How to realize this request?
> Thanks, Leander Taler
> > >
> > > You can do that by creating a skin with the
> unwanted items not used.
> > > Have the admins and logged-in users select a
> conventional skin and
> > > have
> > > the bland skin used for others.
> > >
> > > Mike
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > Wikitech-l mailing list
> > > Wikitech-l(a)lists.wikimedia.org
> > >
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >
> > =====================================
> > Jim Hu
> > Associate Professor
> > Dept. of Biochemistry and Biophysics
> > 2128 TAMU
> > Texas A&M Univ.
> > College Station, TX 77843-2128
> > 979-862-4054
> >
> >
>
>
>
>
> ------------------------------
>
>
>
> End of Wikitech-l Digest, Vol 57, Issue 37
> ******************************************