Hi,
Phabricator is now running on https://phabricator.wmflabs.org/. If you haven't previously indicated that you trust wmflabs.org certificates, your browser may warn you that the site's identity is untrusted. You should add an exception.
You should be able to log in using your WMF Google Apps credentials. If you prefer to have a separate username and password, please contact Steven Walling (swalling(a)wikimedia.org) who has volunteered to help manage accounts during the trial run.
Be aware that this is a trial run and that the setup hasn't been thoroughly tested for security vulnerabilities. So the usual caveats apply: don't put in anything that is dear to you or private. Other than that, go nuts.
Mark Traceur will administer this machine -- please direct questions to him (but also be prepared to offer help). :)
Thanks to Ryan and Jeremy for their help with configuring this machine.
--
Ori Livneh
ori(a)wikimedia.org
Hey,
You bring up some good points.
> I think we're going to need to have some of this and the synchronization
> stuff in core.
> Right now the code has nothing but the one sites table. No repo code so
> presumably the only implementation of that for awhile will be wikidata. And
> if parts of this table are supposed to be editable in some cases where there
> is no repo, but non-editable otherwise, then I don't see any way for an edit
> UI to tell the difference.
>
We indeed need some configuration setting(s) for wikis to distinguish
between the two cases. That seems to be all the "synchronisation code" we'll
need in core. It might or might not be useful to have more logic in core,
or in some dedicated extension. Personally I think having the actual
synchronization code in a separate extension would be nice, as a lot of it
won't be Wikidata specific. This is however not a requirement for Wikidata,
so the current plan is to just have it in the extension, always keeping in
mind that it should be easy to split it off later on. I'd love to discuss
this point further, but it should be clear this is not much of a blocker
for the current code, as it seems unlikely to affect it much, if at all.
On that note, consider that we're initially creating the new system in parallel
with the old one, which enables us to just try out changes and alter them
later on if it turns out there is a better way to do them. Then once we're
confident the new system is what we want to stick with, and know it works
because of its usage by Wikidata, we can replace the current code with the
new system. This ought to allow us to work a lot faster by not blocking on
discussions and details for too long.
> I'm also not sure how this synchronization which sounds like one-way will
> play with individual wikis wanting to add new interwiki links.
For our case we only need it to work one way, from the Wikidata repo to
its clients. More discussion would need to happen to decide on an
alternate approach. I already indicated that I think this is not a blocker
for the current set of changes, so I'd prefer this to happen after the
current code gets merged.
> I'm talking about things like the interwiki extensions and scripts that
> turn wiki tables into interwiki lists. All these things are written against
> the interwiki table. So by rewriting and using a new table we implicitly
> break all the working tricks and throw the user back into SQL.
>
I am aware of this. As noted already, the current new code does not yet
replace the old code, so this is not a blocker yet, but it will be for
replacing the old code with the new system. Having looked at the existing
code using the old system, I think migration should not be too hard, since
the new system can do everything the old one can do, and there is not that
much code using it. The new system also has clear interfaces, so the
scripts do not need to know about the database table at all. That ought to
facilitate the "do not depend on a single db table" goal a lot, obviously :)
> I like the idea of table entries without actual interwikis. The idea of
> some interface listing user selectable sites came to mind and perhaps sites
> being added trivially even automatically.
> Though if you plan to support this I think you'll need to drop the NOT
> NULL from site_local_key.
>
I don't think the field needs to allow for null - right now the local keys
on the repo will by default be the same as the global keys, so none of them
will be null. On your client wiki you will then have these values by
default as well. If you don't want a particular site to be usable as
"languagelink" or "interwikilink", then simply set this in your local
configuration. No need to set the local id to null. Depending on how we
actually end up handling the defaulting process, having null might or
might not turn out to be useful. This is a detail though, so I'd suggest
sticking with NOT NULL for now, and then, if it turns out it'd be more
convenient to allow null when writing the sync code, just change it
then.
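To make the defaulting idea concrete, here is a rough sketch (illustrative only: none of these field or function names come from the actual patch) of how a client could merge repo-provided site defaults with its local configuration while keeping the local key NOT NULL throughout:

```python
# Hypothetical sketch, not Wikidata code. A client wiki derives its
# effective site list from repo-supplied rows plus local overrides;
# the local key always defaults to the global key, so it is never null.

def effective_sites(repo_sites, local_config):
    """Merge repo-supplied site rows with per-wiki overrides."""
    merged = {}
    for global_key, site in repo_sites.items():
        row = dict(site)
        # Default: the local key mirrors the global key.
        row.setdefault("local_key", global_key)
        # Local config can disable usage without touching the local key.
        row.update(local_config.get(global_key, {}))
        merged[global_key] = row
    return merged

repo = {"enwiki": {"url": "https://en.wikipedia.org/wiki/$1",
                   "link_inline": True}}
local = {"enwiki": {"link_inline": False}}  # hide as interwiki link locally

sites = effective_sites(repo, local)
print(sites["enwiki"]["local_key"])    # enwiki
print(sites["enwiki"]["link_inline"])  # False
```

The point of the sketch: disabling a site locally is a config flag, not a null local key.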
> Actually, another thought makes me think the schema should be a little
> different.
> site_local_key probably shouldn't be a column, it should probably be
> another table.
> Something like site_local_key (slc_key, slc_site) which would map things
> like en:, Wikipedia:, etc... to a specific site.
>
Denny and I discussed this at some length, now already more than a month
ago (man, this is taking long...). Our conclusion was that we do not
need it and would not benefit from it much in Wikidata. In fact, it'd
introduce additional complexity, which is a good argument for not including
it in our already huge project. I do agree that conceptually it's nicer to
not duplicate such info, but if you consider the extra complexity you'd
need to get rid of it, and the little gain you'd have (removal of some minor
duplication which we've had since forever and which is not bothering anyone),
I'm sceptical we ought to go with this approach, even outside of Wikidata.
> I think I need to understand the plans you have for synchronization a bit
> more.
> - Where does Wikidata get the sites
>
The repository wiki holds the canonical copy of the sites, which gets sent
to all clients. Modification of the site data can only happen on the
repository. All wikis (repo and clients) have their own local config, so
they can choose to enable all sites for all functionality, completely hide
them, or anything in between.
> - What synchronizes the data
>
The repo. As already mentioned, it might be nicer to split this off into
its own extension at some point. But before we get to that, we first need
to have the current changes merged.
> Btw if you really want to make this an abstract list of sites dropping site_url
> and the other two related columns might be an idea.
> At first glance the url looks like something standard that every site
> would have. But once you throw something like MediaWiki into the mix with
> short urls, long urls, and an API the url really becomes type specific data
> that should probably go in the blob. Especially when you start thinking
> about other custom types.
>
The patch sitting on gerrit already includes this. (Did you really look at
it already? The fields are documented quite well, I'd think.) Every site has
a url (that's not specific to the type of site), but we have a type system,
currently with the default (general) site type and a MediaWikiSite type.
The type system works with two blob fields, one for type-specific data and
one for type-specific configuration.
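For illustration, a minimal sketch of such a row (the column names below are my own shorthand, not the actual schema from the changeset):

```python
# Hypothetical sketch of the described row layout: one url column shared
# by all types, plus two serialized blob fields - one for type-specific
# data and one for type-specific configuration.
import json

def make_site_row(global_key, site_type, url, type_data=None, type_config=None):
    return {
        "site_global_key": global_key,
        "site_type": site_type,  # e.g. "general" or "mediawiki"
        "site_url": url,
        "site_data": json.dumps(type_data or {}),      # type-specific data
        "site_config": json.dumps(type_config or {}),  # type-specific config
    }

row = make_site_row("enwiki", "mediawiki", "https://en.wikipedia.org/wiki/$1",
                    type_data={"api_path": "/w/api.php"})
print(json.loads(row["site_data"])["api_path"])  # /w/api.php
```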
Cheers
--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil.
--
Hi!
I noticed that content from bits.wikimedia.org (including WikiEditor)
is updated quite regularly - ~ every 20 minutes on Commons.
Such behavior definitely creates problems for users with slow
connections or with paid data traffic.
Are JavaScript/CSS really updated so often?
Eugene.
> Hey,
>
> > You mean site_config?
> > You're suggesting the interwiki system should look for a site by
> > site_local_key, when it finds one parse out the site_config, check if it's
> > disabled and if so ignore the fact it found a site with that local key?
> > Instead of just not having a site_local_key for that row in the first place?
> >
>
> No. Since the interwiki system is not specific to any type of site, this
> approach would be making it needlessly hard. The site_link_inline field
> determines if the site should be usable as interwiki link, as you can see
> in the patchset:
>
> -- If the site should be linkable inline as an "interwiki link" using
> -- [[site_local_key:pageTitle]].
> site_link_inline bool NOT NULL,
>
> So queries would be _very_ simple.
>
> > So data duplication simply because one wiki needs a second local name
> > will mean that one url now has two different global ids. This sounds
> > precisely like something that is going to get in the way of the whole
> > reason you wanted this rewrite.
>
> * It does not get in our way at all, and is completely disjunct from why we
> want the rewrite
> * It's currently done like this
> * The changes we do need and are proposing to make will make such a rewrite
> at a later point easier than it is now
>
> > Doing it this way frees us from creating any restrictions on whatever
> > source we get sites from that we shouldn't be placing on them.
>
> * We don't need this for Wikidata
> * It's a new feature that might or might not be nice to have that currently
> does not exist
> * The changes we do need and are proposing to make will make such a rewrite
> at a later point easier than it is now
>
> > So you might as well drop the 3 url related columns and just use the data
> > blob that you already have.
>
> I don't see what this would gain us at all. It'd just make things more
> complicated.
>
> > The $1 pattern may not even work for some sites.
>
> * We don't need this for Wikidata
> * It's a new feature that might or might not be nice to have that currently
> does not exist
> * The changes we do need and are proposing to make will make such a rewrite
> at a later point easier than it is now
>
> And in fact we are making this more flexible by having the type system. The
> MediaWiki site type could for instance be able to form both "nice" urls and
> index.php ones. Or a gerrit type could have the logic to distinguish
> between the gerrit commit number and a sha1 hash.
>
> Cheers
[Just to clarify, I'm doing inline replies to things various people
said, not just Jeroen]
First and foremost, I'm a little confused as to what the actual use
cases here are. Could we get a short summary for those who aren't
entirely following how wikidata will work, why the current interwiki
situation is insufficient? I've read I0a96e585 and
http://lists.wikimedia.org/pipermail/wikitech-l/2012-June/060992.html,
but everything seems very vague ("It doesn't work for our situation"),
without any detailed explanation of what that situation is. At most
the messages kind of hint at wanting to be able to access the list of
interwiki types of the wikidata "server" from a wikidata "client" (and
keep them in sync, or at least have them replicated from
server->client). But there's no explanation given to why one needs to
do that (are we doing some form of interwiki transclusion and need to
render foreign interwiki links correctly? Want to be able to do global
whatlinkshere and need unique global ids for various wikis? Something
else?)
>* Site definitions can exist that are not used as "interlanguage link" and
>not used as "interwiki link"
And if we put one of those on a talk page, what would happen? Or if
foo were one such site, what would [[:foo:some page]] do? (Current
behaviour is that it becomes an interwiki link.)
Although to be fair, I do see how the current way we distinguish
between interwiki and interlang links is a bit hacky.
>And in fact we are making this more flexible by having the type system. The
>MediaWiki site type could for instance be able to form both "nice" urls and
>index.php ones. Or a gerrit type could have the logic to distinguish
>between the gerrit commit number and a sha1 hash.
I must admit I do like this idea. In particular, the current
situation where we treat the value of an interwiki link as a title
(aka spaces -> underscores etc.) even for sites that do not use such
conventions has always bothered me. Having interwikis that support
URL rewriting based on the value does sound cool, but I certainly
wouldn't want said code in a db blob (and just using an integer
site_type identifier is quite far from giving us that, but it's
still a step in a positive direction), which raises the question of
where such rewriting code would go.
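To sketch one possible answer, such code could live in per-type classes rather than in the database. The class names and URL patterns below are purely hypothetical, not code from the patch:

```python
# Hypothetical per-type URL formation. MediaWikiSite applies title
# conventions; a made-up GerritSite picks a pattern by inspecting the
# value (40 hex digits -> sha1, anything else -> change number).
import re

class Site:
    def __init__(self, url_pattern):
        self.url_pattern = url_pattern

    def page_url(self, value):
        return self.url_pattern.replace("$1", value)

class MediaWikiSite(Site):
    def page_url(self, value):
        # MediaWiki titles use underscores instead of spaces.
        return self.url_pattern.replace("$1", value.replace(" ", "_"))

class GerritSite(Site):
    def __init__(self, change_pattern, commit_pattern):
        self.change_pattern = change_pattern
        self.commit_pattern = commit_pattern

    def page_url(self, value):
        pattern = (self.commit_pattern
                   if re.fullmatch(r"[0-9a-fA-F]{40}", value)
                   else self.change_pattern)
        return pattern.replace("$1", value)

wiki = MediaWikiSite("https://en.wikipedia.org/wiki/$1")
print(wiki.page_url("Main Page"))  # https://en.wikipedia.org/wiki/Main_Page

gerrit = GerritSite("https://gerrit.wikimedia.org/r/#/c/$1/",
                    "https://gerrit.wikimedia.org/r/#q,$1,n,z")
print(gerrit.page_url("14295"))  # uses the change-number pattern
```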
> The issue I was trying to deal with was storage. Currently we 100% assume
>that the interwiki list is a table and there will only ever be one of them.
Do we really assume that? Certainly that's the default config, but I
don't think that is the config used on WMF. As far as I'm aware,
Wikimedia uses a cdb database file (via $wgInterwikiCache), which
contains all the interwikis for all sites. From what I understand, it
supports doing various "scope" levels of interwikis, including per db,
per site (Wikipedia, Wiktionary, etc), or global interwikis that act
on all sites.
The feature is a bit wmf specific, but it does seem to support
different levels of interwiki lists.
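Something like the following toy lookup illustrates those scopes (the real $wgInterwikiCache is a cdb file; the key layout here is my guess for illustration, using a plain dict): the most specific scope wins.

```python
# Illustrative scoped interwiki lookup, assumed key layout:
# per-wiki ("dbname:prefix"), per site group ("_sitegroup:prefix"),
# then global ("__global:prefix").

def lookup_interwiki(cache, dbname, sitegroup, prefix):
    for key in ("%s:%s" % (dbname, prefix),
                "_%s:%s" % (sitegroup, prefix),
                "__global:%s" % prefix):
        if key in cache:
            return cache[key]
    return None

cache = {
    "enwiki:local": "https://en.wikipedia.org/wiki/$1",     # per-wiki
    "_wiktionary:fr": "https://fr.wiktionary.org/wiki/$1",  # per site group
    "__global:commons": "https://commons.wikimedia.org/wiki/$1",
}

# A wiktionary wiki resolves "fr" at the site-group scope:
print(lookup_interwiki(cache, "enwiktionary", "wiktionary", "fr"))
# https://fr.wiktionary.org/wiki/$1
```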
Furthermore, I imagine (but don't know, so let's see how fast I get
corrected ;) that the cdb database was introduced not just as a
convenience measure for easier administration of the interwiki tables,
but also for better performance. If so, one should also take into
account any performance hit that may come with switching to the
proposed "sites" facility.
Cheers,
-bawolff
On Tue, Jul 24, 2012 at 10:26 PM, Erik Moeller <erik(a)wikimedia.org> wrote:
> As one quick update, we're also in touch with Evan Priestley, who's no
> longer at Facebook and now running Phabricator as a dedicated open
> source project and potential business. If all goes well, Evan's going
> to come visit WMF sometime soon, which will be an opportunity to
> seriously explore whether Phabricator could be a viable long term
> alternative (it's probably not a near term one). Will post more
> details if this meeting materializes.
We had this conversation with Evan today. The following people
participated: David Schoonover, Brion Vibber, Rob Lanphier, Chad
Horohoe, Terry Chay, Ryan Lane, Ori Livneh, Roan Kattouw, and myself.
Evan gave us a walkthrough of Phabricator's current capabilities,
comparing it against the evaluation criteria on
https://www.mediawiki.org/wiki/Git/Gerrit_evaluation .
Some thoughts below; if you participated, please feel free to jump in
with your thoughts/impressions from the conversation, and/or to
contradict anything I'm saying. :-)
As I understood it, the big gotchas for Phabricator adoption are that
Phabricator doesn't manage repositories - it knows how to poll a Git
repo, but it doesn't have per-repo access controls or even more than a
shallow awareness of what a repository is; it literally shells out to
git to perform its operations, e.g. poll for changes - and would still
need some work to efficiently deal with hundreds of repositories,
long-lived remote branches, and some of the other fun characteristics
of Wikimedia's repos. Full repo management is on the roadmap, without
an exact date, and Evan is very open to making tweaks and changes as
needed, especially if it serves a potential flagship user like
Wikimedia.
My impression was that a lot of Phabricator's features were
well-received, including the code review / inline commenting UI itself,
its much more flexible code commenting system, the simple notification
filters, etc.
Brion suggested in the conversation that a logical way to explore
Phabricator's potential value for us might be to start using it for
one of the more experimental repos. This would enable us to give
feedback to Evan about what's working / what's not working, and to
build a working relationship with the Phabricator community. If we
believe in the potential, and the dealbreaker features are indeed
forthcoming, we could then consider more seriously a move away from
Gerrit down the road. If we hate it, the "only" cost is the cost to
that team of setting up, maintaining and then ramping down some
experimental infrastructure. (This would _not_ be a project for Rob's
group to shoulder - you'd have to do so yourself.)
Based on this, I'm wondering if there are any champions who are
willing to do the legwork to 1) set up Phabricator for a current WMF
engineering project, 2) convince their team (and potentially the rest
of the world) to start using it? I suspect that good candidates would
be projects that currently live entirely on GitHub, where Phabricator
would be a step _towards_ self-hosted OSS infrastructure, as opposed
to spinning out something from Gerrit. But I'd be concerned about
doing this with more than one project initially, and only if the
entire team is convinced that it's the right thing to do, and is
willing to somewhat slow down its velocity to do so.
Obviously, any volunteer who wants to experiment with it for their
Wikimedia-related project would be welcome to do so in Labs, as well,
and I'd be happy to connect them w/ Evan if needed.
The alternative is to take another look at Phabricator only when it
more closely matches our must-have requirements.
--
Erik Möller
VP of Engineering and Product Development, Wikimedia Foundation
Support Free Knowledge: https://wikimediafoundation.org/wiki/Donate
Hi all,
here is our update on last week's blocker email. The list got
considerably shorter, but we still have some long-standing issues on
it. No new blockers came up.
== Ongoing ==
* Merging the Wikidata branch (ContentHandler) is still open, see
<https://bugzilla.wikimedia.org/show_bug.cgi?id=38622>. There has been
no feedback in the last few weeks. Daniel is waiting for input.
* Changeset <https://gerrit.wikimedia.org/r/#/c/14295/>, bug
<https://bugzilla.wikimedia.org/show_bug.cgi?id=38705> about handling
sites. The idea is to migrate from the "interwiki" table to the new
"Sites" facility. RobLa mentioned two weeks ago that Chad seems to be
working in a similar direction, but we haven't seen comments yet. No
discussion is ongoing and no substantial feedback has been received
here either; it seems somewhat stuck.
== New in the list ==
Nothing.
== Merges ==
* https://gerrit.wikimedia.org/r/#/c/14301/ (got merged. Yay!)
== Abandoned changesets or not-blocking anymore ==
* https://gerrit.wikimedia.org/r/#/c/14084/ (abandoned)
* https://gerrit.wikimedia.org/r/#/c/8924/ (not blocking anymore but
could use some reviewing love)
* https://gerrit.wikimedia.org/r/#/c/14303/ (review in progress; not
blocking anymore if we drop the STTL extension in favour of the ULS
extension, which is currently being investigated)
* https://gerrit.wikimedia.org/r/#/c/17073/ (a change to the skin,
which we abandoned and will resolve differently)
I hope this helps,
Cheers,
Denny
--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
> On 08/08/2012 11:07 AM, Mark Holmquist wrote:
> >> Also, "it should be 4 spaces" is a matter of opinion--everyone likes
> >> different tab widths depending on their preferences and monitor size.
> >
> > http://www.mediawiki.org/wiki/CC#Tab_size
> >
> > "you should make no assumptions" appears to support Chad's statement.
> > However, I'm pretty sure that in reality, many people assume a width
> > of 4. I've definitely seen funky tab-plus-space indentations that
> > support that theory.
> >
>
> I officially redact my bogus claim that it "should" be 4 spaces.
> However, we certainly shouldn't default to 8!
I use 8 :P
(Seriously though, everything looks so crunched up with 4 space tab
width... Makes me claustrophobic!)
-bawolff
The original discussion for this was here:
<http://lists.wikimedia.org/pipermail/wikitech-l/2012-June/060992.html>
I am not sure if we can change the link in
<https://gerrit.wikimedia.org/r/#/c/14295/> as it does indeed link to
the wrong mail (sorry for the inconvenience).
I hope you can find more information in the above thread, Daniel.
Cheers,
Denny
2012/8/9 Daniel Friesen <lists(a)nadir-seen-fire.com>:
> On Thu, 09 Aug 2012 06:54:03 -0700, Denny Vrandečić
> <denny.vrandecic(a)wikimedia.de> wrote:
>
>> Hi all,
>>
>> [...]
>>
>> * Changeset <https://gerrit.wikimedia.org/r/#/c/14295/>, bug
>> <https://bugzilla.wikimedia.org/show_bug.cgi?id=38705> about handling
>> sites. The idea is to migrate from the "interwiki" table to the new
>> "Sites" facility. RobLa mentioned two weeks ago that Chad seems to be
>> working in a similar direction, but we haven't seen comments yet. No
>> discussion is ongoing and no substantial feedback has been received
>> here either; it seems somewhat stuck.
>> [...]
>>
>>
>> I hope this helps,
>> Cheers,
>> Denny
>>
>
> I would like some more information on this. The bug doesn't appear to even
> have the correct link for a discussion on this.
>
> Redoing our interwiki code to deal with some mistakes we made in storage was
> something I was hoping to do.
> So if this is something hoping to replace the interwiki system I'd like to
> look over what the plan and overall idea is with this to make sure we don't
> repeat the same mistakes.
>
> --
> ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
>
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Hi,
is there any special advice for searching in MediaWiki instances using Solr?
The only thing I found,
http://www.mediawiki.org/wiki/Extension:SolrStore,
seems to be quite specific to SMW.
We already run Solr for other projects, and it would be nice to include our wiki:
http://genwiki.genealogy.net
Uwe (Baumbach)
On Wed, 08 Aug 2012 02:37:39 -0700, Jens Albrecht <jens.alb(a)gmx.net> wrote:
> Hi,
>
>
> is there a way to show a page not by using the title form
> "index.php/Main_Page" but instead by using the page id, like
> "index.php?pageid=X"?
>
> I hope you get what I mean! Sorry for my bad English.
>
>
> Regards
?curid= should work, i.e. index.php?curid=X.
However, you should avoid it as much as possible. There are multiple
situations where the page id for one page can change.
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]