Hi,
Phabricator is now running on https://phabricator.wmflabs.org/. If you haven't previously indicated that you trust wmflabs.org certificates, your browser may warn you that the site's identity is untrusted. You should add an exception.
You should be able to log in using your WMF Google Apps credentials. If you prefer to have a separate username and password, please contact Steven Walling (swalling(a)wikimedia.org) who has volunteered to help manage accounts during the trial run.
Be aware that this is a trial run and that the setup hasn't been thoroughly tested for security vulnerabilities. So the usual caveats apply: don't put in anything that is dear to you or private. Other than that, go nuts.
Mark Traceur will administer this machine -- please direct questions to him (but also be prepared to offer help). :)
Thanks to Ryan and Jeremy for their help with configuring this machine.
--
Ori Livneh
ori(a)wikimedia.org
Hey,
You bring up some good points.
> I think we're going to need to have some of this and the synchronization
> stuff in core.
> Right now the code has nothing but the one sites table. No repo code so
> presumably the only implementation of that for awhile will be wikidata. And
> if parts of this table are supposed to be editable in some cases where there
> is no repo, but non-editable otherwise, then I don't see any way for an edit
> UI to tell the difference.
>
We indeed need some configuration setting(s) for wikis to distinguish
between the two cases. That seems to be all the "synchronisation code" we'll
need in core. It might or might not be useful to have more logic in core,
or in some dedicated extension. Personally I think having the actual
synchronization code in a separate extension would be nice, as a lot of it
won't be Wikidata specific. This is however not a requirement for Wikidata,
so the current plan is to just have it in the extension, always keeping in
mind that it should be easy to split it off later on. I'd love to discuss
this point further, but it should be clear this is not much of a blocker
for the current code, as it seems unlikely to affect it much, if at all.
On that note, consider that we're initially creating the new system in parallel
with the old one, which enables us to just try out changes and alter them
later on if it turns out there is a better way to do them. Then once we're
confident the new system is what we want to stick with, and know it works
because of its usage by Wikidata, we can replace the current code with the
new system. This ought to allow us to work a lot faster by not blocking on
discussions and details for too long.
> I'm also not sure how this synchronization which sounds like one-way will
> play with individual wikis wanting to add new interwiki links.
For our case we only need it to work one way, from the Wikidata repo to
its clients. More discussion would need to happen to decide on an
alternate approach. I already indicated that I think this is not a blocker
for the current set of changes, so I'd prefer this to happen after the
current code gets merged.
> I'm talking about things like the interwiki extensions and scripts that
> turn wiki tables into interwiki lists. All these things are written against
> the interwiki table. So by rewriting and using a new table we implicitly
> break all the working tricks and throw the user back into SQL.
>
I am aware of this. As noted already, the current new code does not yet
replace the old code, so this is not a blocker yet, but it will be for
replacing the old code with the new system. Having looked at the existing
code using the old system, I think migration should not be too hard, since
the new system can do everything the old one can do, and there is not that
much code using it. The new system also has clear interfaces, so the
scripts do not need to know about the database table at all. That ought to
facilitate the "do not depend on a single db table" goal a lot, obviously :)
> I like the idea of table entries without actual interwikis. The idea of
> some interface listing user selectable sites came to mind and perhaps sites
> being added trivially even automatically.
> Though if you plan to support this I think you'll need to drop the NOT
> NULL from site_local_key.
>
I don't think the field needs to allow for null - right now the local keys
on the repo will by default be the same as the global keys, so none of them
will be null. On your client wiki you will then have these values by
default as well. If you don't want a particular site to be usable as
"languagelink" or "interwikilink", then simply set this in your local
configuration. No need to set the local id to null. Depending on how we
actually end up handling the defaulting process, having null might or
might not turn out to be useful. This is a detail though, so I'd suggest
sticking with NOT NULL for now, and then, if it turns out it'd be more
convenient to allow null when writing the sync code, just change it
then.
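To make the defaulting idea concrete, here is a rough sketch (illustrative only: none of these field or function names come from the actual patch) of how a client could merge repo-provided site defaults with its local configuration while keeping the local key NOT NULL throughout:

```python
# Hypothetical sketch, not Wikidata code. A client wiki derives its
# effective site list from repo-supplied rows plus local overrides;
# the local key always defaults to the global key, so it is never null.

def effective_sites(repo_sites, local_config):
    """Merge repo-supplied site rows with per-wiki overrides."""
    merged = {}
    for global_key, site in repo_sites.items():
        row = dict(site)
        # Default: the local key mirrors the global key.
        row.setdefault("local_key", global_key)
        # Local config can disable usage without touching the local key.
        row.update(local_config.get(global_key, {}))
        merged[global_key] = row
    return merged

repo = {"enwiki": {"url": "https://en.wikipedia.org/wiki/$1",
                   "link_inline": True}}
local = {"enwiki": {"link_inline": False}}  # hide as interwiki link locally

sites = effective_sites(repo, local)
print(sites["enwiki"]["local_key"])    # enwiki
print(sites["enwiki"]["link_inline"])  # False
```

The point of the sketch: disabling a site locally is a config flag, not a null local key.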
> Actually, another thought makes me think the schema should be a little
> different.
> site_local_key probably shouldn't be a column, it should probably be
> another table.
> Something like site_local_key (slc_key, slc_site) which would map things
> like en:, Wikipedia:, etc... to a specific site.
>
Denny and I discussed this at some length, now already more than a month
ago (man, this is taking long...). Our conclusion was that we do not
need it and would not benefit from it much in Wikidata. In fact, it'd
introduce additional complexity, which is a good argument for not including
it in our already huge project. I do agree that conceptually it's nicer to
not duplicate such info, but if you consider the extra complexity you'd
need to get rid of it, and the little gain you'd have (removal of some minor
duplication which we've had since forever and which is not bothering anyone),
I'm sceptical we ought to go with this approach, even outside of Wikidata.
> I think I need to understand the plans you have for synchronization a bit
> more.
> - Where does Wikidata get the sites
>
The repository wiki holds the canonical copy of the sites, which gets sent
to all clients. Modification of the site data can only happen on the
repository. All wikis (repo and clients) have their own local config, so
they can choose to enable all sites for all functionality, completely hide
them, or anything in between.
> - What synchronizes the data
>
The repo. As already mentioned, it might be nicer to split this off into
its own extension at some point. But before we get to that, we first need
to have the current changes merged.
> Btw if you really want to make this an abstract list of sites dropping site_url
> and the other two related columns might be an idea.
> At first glance the url looks like something standard that every site
> would have. But once you throw something like MediaWiki into the mix with
> short urls, long urls, and an API the url really becomes type specific data
> that should probably go in the blob. Especially when you start thinking
> about other custom types.
>
The patch sitting on gerrit already includes this. (Did you really look at
it already? The fields are documented quite well, I'd think.) Every site has
a url (that's not specific to the type of site), but we have a type system,
currently with the default (general) site type and a MediaWikiSite type.
The type system works with two blob fields, one for type-specific data and
one for type-specific configuration.
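For illustration, a minimal sketch of such a row (the column names below are my own shorthand, not the actual schema from the changeset):

```python
# Hypothetical sketch of the described row layout: one url column shared
# by all types, plus two serialized blob fields - one for type-specific
# data and one for type-specific configuration.
import json

def make_site_row(global_key, site_type, url, type_data=None, type_config=None):
    return {
        "site_global_key": global_key,
        "site_type": site_type,  # e.g. "general" or "mediawiki"
        "site_url": url,
        "site_data": json.dumps(type_data or {}),      # type-specific data
        "site_config": json.dumps(type_config or {}),  # type-specific config
    }

row = make_site_row("enwiki", "mediawiki", "https://en.wikipedia.org/wiki/$1",
                    type_data={"api_path": "/w/api.php"})
print(json.loads(row["site_data"])["api_path"])  # /w/api.php
```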
Cheers
--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil.
--
Hi!
I noticed that content from bits.wikimedia.org (including WikiEditor)
is updated quite regularly - ~ every 20 minutes on Commons.
Such behavior definitely creates problems for users with slow
connections or with paid data traffic.
Are JavaScript/CSS really updated so often?
Eugene.
> Hey,
>
> > You mean site_config?
> > You're suggesting the interwiki system should look for a site by
> > site_local_key, when it finds one parse out the site_config, check if it's
> > disabled and if so ignore the fact it found a site with that local key?
> > Instead of just not having a site_local_key for that row in the first place?
> >
>
> No. Since the interwiki system is not specific to any type of site, this
> approach would be making it needlessly hard. The site_link_inline field
> determines if the site should be usable as interwiki link, as you can see
> in the patchset:
>
> -- If the site should be linkable inline as an "interwiki link" using
> -- [[site_local_key:pageTitle]].
> site_link_inline bool NOT NULL,
>
> So queries would be _very_ simple.
>
> > So data duplication simply because one wiki needs a second local name
> > will mean that one url now has two different global ids. This sounds
> > precisely like something that is going to get in the way of the whole
> > reason you wanted this rewrite.
>
> * It does not get in our way at all, and is completely disjunct from why we
> want the rewrite
> * It's currently done like this
> * The changes we do need and are proposing to make will make such a rewrite
> at a later point easier than it is now
>
> > Doing it this way frees us from creating any restrictions on whatever
> > source we get sites from that we shouldn't be placing on them.
>
> * We don't need this for Wikidata
> * It's a new feature that might or might not be nice to have that currently
> does not exist
> * The changes we do need and are proposing to make will make such a rewrite
> at a later point easier than it is now
>
> > So you might as well drop the 3 url related columns and just use the data
> > blob that you already have.
>
> I don't see what this would gain us at all. It'd just make things more
> complicated.
>
> > The $1 pattern may not even work for some sites.
>
> * We don't need this for Wikidata
> * It's a new feature that might or might not be nice to have that currently
> does not exist
> * The changes we do need and are proposing to make will make such a rewrite
> at a later point easier than it is now
>
> And in fact we are making this more flexible by having the type system. The
> MediaWiki site type could for instance be able to form both "nice" urls and
> index.php ones. Or a gerrit type could have the logic to distinguish
> between the gerrit commit number and a sha1 hash.
>
> Cheers
[Just to clarify, I'm doing inline replies to things various people
said, not just Jeroen]
First and foremost, I'm a little confused as to what the actual use
cases here are. Could we get a short summary for those who aren't
entirely following how wikidata will work, why the current interwiki
situation is insufficient? I've read I0a96e585 and
http://lists.wikimedia.org/pipermail/wikitech-l/2012-June/060992.html,
but everything seems very vague ("It doesn't work for our situation"),
without any detailed explanation of what that situation is. At most
the messages kind of hint at wanting to be able to access the list of
interwiki types of the wikidata "server" from a wikidata "client" (and
keep them in sync, or at least have them replicated from
server->client). But there's no explanation given to why one needs to
do that (are we doing some form of interwiki transclusion and need to
render foreign interwiki links correctly? Want to be able to do global
whatlinkshere and need unique global ids for various wikis? Something
else?)
>* Site definitions can exist that are not used as "interlanguage link" and
>not used as "interwiki link"
And if we put one of those on a talk page, what would happen? Or if
foo were one such site, what would [[:foo:some page]] do? (Current
behaviour is that it becomes an interwiki link.)
Although to be fair, I do see how the current way we distinguish
between interwiki and interlang links is a bit hacky.
>And in fact we are making this more flexible by having the type system. The
>MediaWiki site type could for instance be able to form both "nice" urls and
>index.php ones. Or a gerrit type could have the logic to distinguish
>between the gerrit commit number and a sha1 hash.
I must admit I do like this idea. In particular, the current
situation where we treat the value of an interwiki link as a title
(aka spaces -> underscores etc.) even for sites that do not use such
conventions has always bothered me. Having interwikis that support
URL rewriting based on the value does sound cool, but I certainly
wouldn't want said code in a db blob (and just using an integer
site_type identifier is quite far from giving us that, but it's
still a step in a positive direction), which raises the question of
where such rewriting code would go.
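To sketch one possible answer, such code could live in per-type classes rather than in the database. The class names and URL patterns below are purely hypothetical, not code from the patch:

```python
# Hypothetical per-type URL formation. MediaWikiSite applies title
# conventions; a made-up GerritSite picks a pattern by inspecting the
# value (40 hex digits -> sha1, anything else -> change number).
import re

class Site:
    def __init__(self, url_pattern):
        self.url_pattern = url_pattern

    def page_url(self, value):
        return self.url_pattern.replace("$1", value)

class MediaWikiSite(Site):
    def page_url(self, value):
        # MediaWiki titles use underscores instead of spaces.
        return self.url_pattern.replace("$1", value.replace(" ", "_"))

class GerritSite(Site):
    def __init__(self, change_pattern, commit_pattern):
        self.change_pattern = change_pattern
        self.commit_pattern = commit_pattern

    def page_url(self, value):
        pattern = (self.commit_pattern
                   if re.fullmatch(r"[0-9a-fA-F]{40}", value)
                   else self.change_pattern)
        return pattern.replace("$1", value)

wiki = MediaWikiSite("https://en.wikipedia.org/wiki/$1")
print(wiki.page_url("Main Page"))  # https://en.wikipedia.org/wiki/Main_Page

gerrit = GerritSite("https://gerrit.wikimedia.org/r/#/c/$1/",
                    "https://gerrit.wikimedia.org/r/#q,$1,n,z")
print(gerrit.page_url("14295"))  # uses the change-number pattern
```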
> The issue I was trying to deal with was storage. Currently we 100% assume
>that the interwiki list is a table and there will only ever be one of them.
Do we really assume that? Certainly that's the default config, but I
don't think that is the config used on WMF. As far as I'm aware,
Wikimedia uses a cdb database file (via $wgInterwikiCache), which
contains all the interwikis for all sites. From what I understand, it
supports doing various "scope" levels of interwikis, including per db,
per site (Wikipedia, Wiktionary, etc), or global interwikis that act
on all sites.
The feature is a bit wmf specific, but it does seem to support
different levels of interwiki lists.
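Something like the following toy lookup illustrates those scopes (the real $wgInterwikiCache is a cdb file; the key layout here is my guess for illustration, using a plain dict): the most specific scope wins.

```python
# Illustrative scoped interwiki lookup, assumed key layout:
# per-wiki ("dbname:prefix"), per site group ("_sitegroup:prefix"),
# then global ("__global:prefix").

def lookup_interwiki(cache, dbname, sitegroup, prefix):
    for key in ("%s:%s" % (dbname, prefix),
                "_%s:%s" % (sitegroup, prefix),
                "__global:%s" % prefix):
        if key in cache:
            return cache[key]
    return None

cache = {
    "enwiki:local": "https://en.wikipedia.org/wiki/$1",     # per-wiki
    "_wiktionary:fr": "https://fr.wiktionary.org/wiki/$1",  # per site group
    "__global:commons": "https://commons.wikimedia.org/wiki/$1",
}

# A wiktionary wiki resolves "fr" at the site-group scope:
print(lookup_interwiki(cache, "enwiktionary", "wiktionary", "fr"))
# https://fr.wiktionary.org/wiki/$1
```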
Furthermore, I imagine (but don't know, so let's see how fast I get
corrected ;) that the cdb database was introduced not just as a
convenience measure for easier administration of the interwiki tables,
but also for better performance. If so, one should also take into
account any performance hit that may come with switching to the
proposed "sites" facility.
Cheers,
-bawolff
On Tue, Jul 24, 2012 at 10:26 PM, Erik Moeller <erik(a)wikimedia.org> wrote:
> As one quick update, we're also in touch with Evan Priestley, who's no
> longer at Facebook and now running Phabricator as a dedicated open
> source project and potential business. If all goes well, Evan's going
> to come visit WMF sometime soon, which will be an opportunity to
> seriously explore whether Phabricator could be a viable long term
> alternative (it's probably not a near term one). Will post more
> details if this meeting materializes.
We had this conversation with Evan today. The following people
participated: David Schoonover, Brion Vibber, Rob Lanphier, Chad
Horohoe, Terry Chay, Ryan Lane, Ori Livneh, Roan Kattouw, and myself.
Evan gave us a walkthrough of Phabricator's current capabilities,
comparing it against the evaluation criteria on
https://www.mediawiki.org/wiki/Git/Gerrit_evaluation .
Some thoughts below; if you participated, please feel free to jump in
with your thoughts/impressions from the conversation, and/or to
contradict anything I'm saying. :-)
As I understood it, the big gotchas for Phabricator adoption are that
Phabricator doesn't manage repositories - it knows how to poll a Git
repo, but it doesn't have per-repo access controls or even more than a
shallow awareness of what a repository is; it literally shells out to
git to perform its operations, e.g. poll for changes - and would still
need some work to efficiently deal with hundreds of repositories,
long-lived remote branches, and some of the other fun characteristics
of Wikimedia's repos. Full repo management is on the roadmap, without
an exact date, and Evan is very open to making tweaks and changes as
needed, especially if it serves a potential flagship user like
Wikimedia.
My impression was that a lot of Phabricator's features were
well-received, including the code review / inline commenting UI itself,
its much more flexible code commenting system, the simple notification
filters, etc.
Brion suggested in the conversation that a logical way to explore
Phabricator's potential value for us might be to start using it for
one of the more experimental repos. This would enable us to give
feedback to Evan about what's working / what's not working, and to
build a working relationship with the Phabricator community. If we
believe in the potential, and the dealbreaker features are indeed
forthcoming, we could then consider more seriously a move away from
Gerrit down the road. If we hate it, the "only" cost is the cost to
that team of setting up, maintaining and then ramping down some
experimental infrastructure. (This would _not_ be a project for Rob's
group to shoulder - you'd have to do so yourself.)
Based on this, I'm wondering if there are any champions who are
willing to do the legwork to 1) set up Phabricator for a current WMF
engineering project, 2) convince their team (and potentially the rest
of the world) to start using it? I suspect that good candidates would
be projects that currently live entirely on GitHub, where Phabricator
would be a step _towards_ self-hosted OSS infrastructure, as opposed
to spinning out something from Gerrit. But I'd be concerned about
doing this with more than one project initially, and only if the
entire team is convinced that it's the right thing to do, and is
willing to somewhat slow down its velocity to do so.
Obviously, any volunteer who wants to experiment with it for their
Wikimedia-related project would be welcome to do so in Labs, as well,
and I'd be happy to connect them w/ Evan if needed.
The alternative is to take another look at Phabricator only when it
more closely matches our must-have requirements.
--
Erik Möller
VP of Engineering and Product Development, Wikimedia Foundation
Support Free Knowledge: https://wikimediafoundation.org/wiki/Donate
Hi all,
here is our update on last week's blocker email. The list got
considerably shorter, but we still have some long-standing issues on
it. No new blockers came up.
== Ongoing ==
* Merging the Wikidata branch (ContentHandler) is still open, see
<https://bugzilla.wikimedia.org/show_bug.cgi?id=38622>. There has been
no feedback in the last few weeks. Daniel is waiting for input.
* Changeset <https://gerrit.wikimedia.org/r/#/c/14295/>, bug
<https://bugzilla.wikimedia.org/show_bug.cgi?id=38705> about handling
sites. The idea is to migrate from the "interwiki" table to the new
"Sites" facility. RobLa mentioned two weeks ago that Chad seems to be
working in a similar direction, but we haven't seen comments yet. No
discussion is ongoing and no substantial feedback has been received
here either; it seems somewhat stuck.
== New in the list ==
Nothing.
== Merges ==
* https://gerrit.wikimedia.org/r/#/c/14301/ (got merged. Yay!)
== Abandoned changesets or not-blocking anymore ==
* https://gerrit.wikimedia.org/r/#/c/14084/ (abandoned)
* https://gerrit.wikimedia.org/r/#/c/8924/ (not blocking anymore but
could use some reviewing love)
* https://gerrit.wikimedia.org/r/#/c/14303/ (review in progress; not
blocking anymore if we drop the STTL extension in favour of the ULS
extension, which is currently being investigated)
* https://gerrit.wikimedia.org/r/#/c/17073/ (a change to the skin,
which we abandoned and will resolve differently)
I hope this helps,
Cheers,
Denny
--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
> On 08/08/2012 11:07 AM, Mark Holmquist wrote:
> >> Also, "it should be 4 spaces" is a matter of opinion--everyone likes
> >> different tab widths depending on their preferences and monitor size.
> >
> > http://www.mediawiki.org/wiki/CC#Tab_size
> >
> > "you should make no assumptions" appears to support Chad's statement.
> > However, I'm pretty sure that in reality, many people assume a width
> > of 4. I've definitely seen funky tab-plus-space indentations that
> > support that theory.
> >
>
> I officially redact my bogus claim that it "should" be 4 spaces.
> However, we certainly shouldn't default to 8!
I use 8 :P
(Seriously though, everything looks so crunched up with 4 space tab
width... Makes me claustrophobic!)
-bawolff
The original discussion for this was here:
<http://lists.wikimedia.org/pipermail/wikitech-l/2012-June/060992.html>
I am not sure if we can change the link in
<https://gerrit.wikimedia.org/r/#/c/14295/> as it does indeed link to
the wrong mail (sorry for the inconvenience).
I hope you can find more information in the above thread, Daniel.
Cheers,
Denny
2012/8/9 Daniel Friesen <lists(a)nadir-seen-fire.com>:
> On Thu, 09 Aug 2012 06:54:03 -0700, Denny Vrandečić
> <denny.vrandecic(a)wikimedia.de> wrote:
>
>> Hi all,
>>
>> [...]
>>
>> * Changeset <https://gerrit.wikimedia.org/r/#/c/14295/>, bug
>> <https://bugzilla.wikimedia.org/show_bug.cgi?id=38705> about handling
>> sites. The idea is to migrate from the "interwiki" table to the new
>> "Sites" facility. RobLa mentioned two weeks ago that Chad seems to be
>> working in a similar direction, but we haven't seen comments yet. No
>> discussion is ongoing and no substantial feedback has been received
>> here either; it seems somewhat stuck.
>> [...]
>>
>>
>> I hope this helps,
>> Cheers,
>> Denny
>>
>
> I would like some more information on this. The bug doesn't appear to even
> have the correct link for a discussion on this.
>
> Redoing our interwiki code to deal with some mistakes we made in storage was
> something I was hoping to do.
> So if this is something hoping to replace the interwiki system I'd like to
> look over what the plan and overall idea is with this to make sure we don't
> repeat the same mistakes.
>
> --
> ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
>
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Hi,
is there any special advice for searching in MediaWiki instances using Solr?
The only thing I found,
http://www.mediawiki.org/wiki/Extension:SolrStore,
seems to be quite specific to SMW.
We already run Solr for other projects, and it would be nice to include our wiki:
http://genwiki.genealogy.net
Uwe (Baumbach)
On Wed, 08 Aug 2012 02:37:39 -0700, Jens Albrecht <jens.alb(a)gmx.net> wrote:
> Hi,
>
>
> is there a way to show a page not by using the title form
> "index.php/Main_Page" but instead by using the page id, like
> "index.php?pageid=X"?
>
> I hope you get what I mean! Sorry for my bad English.
>
>
> Regards
?curid= should work, i.e. index.php?curid=X.
However, you should avoid it as much as possible. There are multiple
situations where the page id for one page can change.
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]