I've added a few lightweight abstraction methods to replace some of the
direct namespace comparisons we make.
When you want to check whether a title is in a given namespace, instead of
writing this:
$title->getNamespace() == NS_USER
you can now write this (and, unless you have compatibility concerns with
older releases, PLEASE DO use it):
$title->inNamespace( NS_USER );
When you need to test whether a page is part of a subject/talk pair, e.g.
either User or User_talk, instead of something verbose like:
$title->getNamespace() == NS_USER || $title->getNamespace() == NS_USER_TALK
Please use:
$title->hasSubjectNamespace( NS_USER );
hasSubjectNamespace returns true if the subject namespace of the title's
namespace matches the subject namespace of the namespace you pass in.
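A quick sketch of the intended semantics (the page title here is just made
up for illustration):

$title = Title::newFromText( 'User talk:Example' );
$title->inNamespace( NS_USER ); // false; the title is in NS_USER_TALK
$title->hasSubjectNamespace( NS_USER ); // true; NS_USER_TALK's subject namespace is NS_USER
$title->hasSubjectNamespace( NS_USER_TALK ); // also true; same subject namespace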
If you're writing verbose code that tests whether a title is in any of a
number of namespaces by using in_array, you can use inNamespaces (note the
's'):
$title->inNamespaces( NS_USER, NS_TEMPLATE );
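That is, it's a shorter replacement for the usual in_array pattern:

// the verbose version inNamespaces replaces
in_array( $title->getNamespace(), array( NS_USER, NS_TEMPLATE ) );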
To be honest, I don't have any good example use cases on hand for where you
would use that, but I didn't want the lack of that functionality and the
simplicity of in_array to be a valid rationale for not making use of these
abstract interfaces to namespace info.
Likewise there are two MWNamespace methods to match: MWNamespace::equals
and MWNamespace::subjectEquals.
And I DO encourage people making $ns == NS_???? comparisons to use
MWNamespace::equals( $ns, NS_???? ) instead, even though right now
MWNamespace::equals is technically just `return $ns1 == $ns2;`.
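Roughly, the intended usage is:

// instead of:
$ns == NS_USER
// please write:
MWNamespace::equals( $ns, NS_USER );
// and for a subject/talk-blind check on a bare namespace index:
MWNamespace::subjectEquals( $ns, NS_USER ); // true for both NS_USER and NS_USER_TALK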
This is a little relevant to the "MediaWiki should use a reservation
system for namespaces" bug:
https://bugzilla.wikimedia.org/show_bug.cgi?id=31063
The idea is essentially to drop our practice of passing around integers and
instead start passing around keys like "USER", "SMW_PROPERTY", etc.
MediaWiki would have a namespace registration system where, when given a
new key, it would reserve a new namespace number for that key.
Instead of extensions being forced to declare what integers they are going
to use, and to coordinate with other extension developers so that
extensions don't conflict, an extension could simply make a call to
MediaWiki declaring a string-based key like "SMW_PROPERTY", which should
not be confusable with another extension's key; MediaWiki would then
register an integer in the database for that namespace and reserve it for
use with that key. This also has the benefit that if you install an
extension and then uninstall it, you shouldn't lose the contents of its
namespace, and when you re-install it, it'll start working again without
issues like conflicts with titles created in NS_MAIN that match the prefix
used. Theoretically, changing the content language of your wiki from, say,
'fr' to 'it' could then be made to work in such a way that MediaWiki won't
break existing links; instead, the old i18n'ed namespace names would end up
as aliases.
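To make that concrete, here is a purely hypothetical sketch of what the
extension-facing call might look like; registerNamespace does not exist,
it's just the shape of the idea:

// Hypothetical API; nothing like this exists yet
$ns = MWNamespace::registerNamespace( 'SMW_PROPERTY' );
// First call: MediaWiki reserves a fresh namespace number in the database.
// Later calls (even after an uninstall/reinstall): the same reserved number comes back.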
The idea also fits in with another bug asking for a namespace manager UI.
If we switch to a namespace registration system, it will be much easier to
create an administrative UI for this.
Having these abstract interfaces for namespace comparison around means that
if, in the future, we do in fact start passing around things like "USER"
instead of integers, there should be no issue of bugs cropping up when code
happens to come together such that, for some unfortunate reason, you end up
with "USER" from one source and `2` from another. Making use of
Title::inNamespace and MWNamespace::equals will ensure that "USER" and `2`
are considered equivalent, unlike what would happen if you'd used ==
directly.
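For example (purely illustrative; this is NOT how MWNamespace::equals is
implemented today), a future version could normalize both representations
before comparing:

public static function equals( $ns1, $ns2 ) {
    // hypothetical: map string keys like "USER" to their registered numbers
    return self::normalizeNamespace( $ns1 ) === self::normalizeNamespace( $ns2 );
}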
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
Hello all,
is there a query language for wiki syntax?
(NOTE: I really do not mean the Wikipedia API here.)
I am looking for an easy way to scrape data from Wiki pages.
In this way, we could apply a crowd-sourcing approach to knowledge
extraction from Wikis.
There must be thousands of data scraping approaches. But is there one
amongst them that has developed a "wiki scraper language" ?
Maybe with some sort of fuzziness involved, if the pages are too messy.
I have not yet worked with the XML transformation of the wiki markup:
action=expandtemplates
  generatexml - Generate XML parse tree
Is it any good for issuing XPath queries?
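For reference, this is the kind of thing I would like to be able to write
(an untested sketch; the <parsetree> wrapper element and the
<template>/<title> node names are my assumptions about the output format):

<?php
// Ask the API to expand templates and return the XML parse tree
$api = 'http://en.wikipedia.org/w/api.php';
$query = http_build_query( array(
    'action' => 'expandtemplates',
    'text' => '{{Foo|bar=baz}}',
    'generatexml' => 1,
    'format' => 'xml',
) );
$response = file_get_contents( "$api?$query" );

// The parse tree is embedded as escaped text inside the API response
$outer = new DOMDocument();
$outer->loadXML( $response );
$treeXml = $outer->getElementsByTagName( 'parsetree' )->item( 0 )->textContent;

// Load the tree itself and query it with XPath
$tree = new DOMDocument();
$tree->loadXML( $treeXml );
$xpath = new DOMXPath( $tree );
foreach ( $xpath->query( '//template/title' ) as $node ) {
    echo trim( $node->textContent ), "\n"; // names of templates used in the text
}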
Thank you very much,
Sebastian
--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Projects: http://nlp2rdf.org , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org
Hi,
Yesterday in #mediawiki we discussed some toolserver features and their
future availability on Labs. I know that Wikimedia Labs is still not being
used for application hosting (Wikipedia bots etc.), so I think it would be
good to expand this discussion to this mailing list as well.
I started a list of "wanted features", as Sumanah asked me to, here:
http://www.mediawiki.org/wiki/WMF_Projects/Wikimedia_Labs/Toolserver_featur…
Please add other requested stuff there if you think something is missing
(I am pretty sure there are many things I forgot to mention).
I have absolutely no clue how this part of Labs is going to be designed,
but I think that having several instances shared between bot operators (in
a similar fashion to how the current toolserver is designed), with /home
storage shared between servers, would make sense. Having a separate
instance for each bot operator would IMHO eat too many system resources
(bots need just a few MB of RAM, while an OS needs hundreds); maybe some
more complicated bots could have their own dedicated instance if they need
system customizations.
Concerning features, I told Ryan yesterday that most users would probably
appreciate at least the ability to forward system mail to their own e-mail
boxes (at the moment most accounts on the toolserver have their mail
forwarded to the account owner's e-mail). However, Ryan told me that there
are some security implications, so I don't know if this will be possible in
the future.
Another thing I forgot to mention is that the toolserver application
servers let users access their own www folder directly, so that bots can
produce output which can be reached from outside (example:
http://toolserver.org/~petrb/logs). This could also be a problem because
the virtual servers don't have public IPs, so maybe it would be good to
have a public www server connected to the same storage as the /home folders
of the application instances (probably accessible directly from
wmflabs.org).
Any ideas? Thanks!
Hi guys,
I encountered an issue with my newly installed MediaWiki 1.17.0. I first
created a page called "Hello" in the main namespace. Then I went to the
"Hello" page and tried to create a subpage of "Hello" by using [[/Test/]]
in "Hello". But it shows "/Test/" in the "Hello" page, and it is actually
not a subpage "Test" of "Hello", but a subpage "/Test/" of the Main Page.
That means I can only create a subpage of the Main Page by using
[[/example/]].
Do you know why this issue happens? Is it a bug, or did I get some settings
wrong?
Thanks a lot for your help!
Yi
hi,
I want to programmatically extract lists from list pages on Wikipedia. That
is to say, if there is a page that mostly consists of a list (list of
episodes, list of presidents, etc.), I want to be able to extract the list
from the page, with article names/links. Has anyone already done this? Can
anyone suggest a good strategy?
FredZ
If you haven't tried it yet, please give the release candidate a try:
http://download.wikimedia.org/mediawiki/1.18/mediawiki-1.18.0rc1.tar.gz
If you've tried it out and found a problem, please let us know.
But, if it works for you, please let us know that, too.
For example, I upgraded a MediaWiki site that I maintain from 1.15 to
1.18. Except for some trouble with the customized skin, things went
smoothly.
I added my report to
https://www.mediawiki.org/wiki/MediaWiki_roadmap/1.18/Installation_reports
but it is looking pretty lonely right now. Please add your own
experience.
Thanks,
Mark.
update.php now prints lines that are twice as long as they used to be:
https://bugzilla.wikimedia.org/show_bug.cgi?id=32508 . In the past, lines
that did nothing printed only half the message. Now the user gets
overloaded with long messages about items even when nothing in the database
was changed.
Why the two layers of <a>? Let's see if the page even passes validation:
$ validate http://en.wikipedia.org/wiki/Oaxtepec
*** Errors validating Oaxtepec: ***
Error at line 2, character 33: there is no attribute "class"
Error at line 152, character 10: end tag for "ul" which is not finished
Error at line 177, character 10: end tag for "ul" which is not finished
Why is there a </a> halfway through the first pair of coordinates?
Coordinates: 18°54′N 98°58′W / 18.9°N 98.967°W / 18.9; -98.967
Why does it look fine in some browsers, but get caught red-handed in
emacs-w3m? Could it be that Firefox and Chromium are fooled into thinking
that the outer <a id=...>, which lasts through the whole six coordinates,
should render as a clickable link... which it apparently does, even with
stylesheets off. Only emacs-w3m renders it right, revealing the badly
written HTML!
Here's the code,
<p><span style="font-size: small;"><a id="coordinates"><a href="/wiki/Geographic_coordinate_system" title="Geographic coordinate system">Coordinates</a>: <span class="plainlinks nourlexpansion"><a rel="nofollow" class="external text" href="http://toolserver.org/~geohack/geohack.php?pagename=Oaxtepec&params=18_…"><span class="geo-default"><span class="geo-dms" title="Maps, aerial photos, and other data for this location"><span class="latitude">18°54′N</a> <span class="longitude">98°58′W</span></span></span><span class="geo-multi-punct"> / </span><span class="geo-nondefault"><span class="geo-dec" title="Maps, aerial photos, and other data for this location">18.9°N 98.967°W</span><span style="display:none"> / <span class="geo">18.9; -98.967</span></span></span></a></span></span></span></p>