Hey,
You bring up some good points.
I think we're going to need to have some of this and the synchronization
stuff in core. Right now the code has nothing but the one sites table. There is no repo code, so presumably the only implementation of that for a while will be Wikidata. And if parts of this table are supposed to be editable in some cases (where there is no repo) but non-editable in others, then I don't see any way for an edit UI to tell the difference.
We indeed need some configuration setting(s) for wikis to distinguish between the two cases. That seems to be all "synchronisation code" we'll need in core. It might or might not be useful to have more logic in core, or in some dedicated extension. Personally I think having the actual synchronization code in a separate extension would be nice, as a lot of it won't be Wikidata specific. This is however not a requirement for Wikidata, so the current plan is to just have it in the extension, always keeping in mind that it should be easy to split it off later on. I'd love to discuss this point further, but it should be clear this is not much of a blocker for the current code, as it seems unlikely to affect it much, if at all.
On that note, consider that we're initially creating the new system in parallel with the old one, which enables us to just try out changes and alter them later on if it turns out there is a better way to do them. Then once we're confident the new system is what we want to stick to, and know it works because of its usage by Wikidata, we can replace the current code with the new system. This ought to allow us to work a lot faster by not blocking on discussions and details for too long.
I'm also not sure how this synchronization which sounds like one-way will
play with individual wikis wanting to add new interwiki links.
For our case we only need it to work one way, from the Wikidata repo to its clients. More discussion would need to happen to decide on an alternate approach. I already indicated I think this is not a blocker for the current set of changes, so I'd prefer this to happen after the current code gets merged.
I'm talking about things like the interwiki extensions and scripts that
turn wiki tables into interwiki lists. All these things are written against the interwiki table. So by rewriting and using a new table we implicitly break all the working tricks and throw the user back into SQL.
I am aware of this. As noted already, the current new code does not yet replace the old code, so this is not a blocker yet, but it will be for replacing the old code with the new system. Having looked at the existing code using the old system, I think migration should not be too hard, since the new system can do everything the old one can do and there is not that much code using it right now. The new system also has clear interfaces, so a script does not need to know about the database table at all. That ought to help a lot with the "do not depend on a single db table" goal, obviously :)
I like the idea of table entries without actual interwikis. The idea of
some interface listing user-selectable sites came to mind, and perhaps sites being added trivially, even automatically. Though if you plan to support this, I think you'll need to drop the NOT NULL from site_local_key.
I don't think the field needs to allow for null - right now the local keys on the repo will by default be the same as the global keys, so none of them will be null. On your client wiki you will then have these values by default as well. If you don't want a particular site to be usable as "languagelink" or "interwikilink", then simply set this in your local configuration. No need to set the local id to null. Depending on how we actually end up handling the defaulting process, having null might or might not turn out to be useful. This is a detail though, so I'd suggest sticking with NOT NULL for now, and then, if it turns out to be more convenient to allow null when writing the sync code, just change it then.
Actually, another thought makes me think the schema should be a little
different. site_local_key probably shouldn't be a column, it should probably be another table. Something like site_local_key (slc_key, slc_site) which would map things like en:, Wikipedia:, etc... to a specific site.
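Roughly something like this (types and details just illustrative):

CREATE TABLE site_local_key (
  slc_key  varbinary(32) NOT NULL PRIMARY KEY, -- e.g. 'en', 'Wikipedia'
  slc_site int unsigned  NOT NULL              -- the site this key maps to
);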
Denny and I discussed this at some length, now already more than a month ago (man, this is taking long...). Our conclusion was that we do not need it, nor would we benefit from it much in Wikidata. In fact, it'd introduce additional complexity, which is a good argument for not including it in our already huge project. I do agree that conceptually it's nicer to not duplicate such info, but if you consider the extra complexity you'd need to get rid of it, and the little gain you get (removal of some minor duplication which we've had since forever and is not bothering anyone), I'm sceptical we ought to go with this approach, even outside of Wikidata.
I think I need to understand the plans you have for synchronization a bit
more.
- Where does Wikidata get the sites
The repository wiki holds the canonical copy of the sites, which gets sent to all clients. Modification of the site data can only happen on the repository. All wikis (repo and clients) have their own local config, so they can choose to enable all sites for all functionality, completely hide them, or anything in between.
- What synchronizes the data
The repo. As already mentioned, it might be nicer to split this off into its own extension at some point. But before we get to that, we first need to have the current changes merged.
Btw if you really want to make this an abstract list of sites dropping site_url
and the other two related columns might be an idea. At first glance the url looks like something standard that every site would have. But once you throw something like MediaWiki into the mix with short urls, long urls, and an API the url really becomes type specific data that should probably go in the blob. Especially when you start thinking about other custom types.
The patch sitting on gerrit already includes this. (Did you really look at it already? The fields are documented quite well I'd think.) Every site has a url (that's not specific to the type of site), but we have a type system with currently the default (general) site type and a MediaWikiSite type. The type system works with two blob fields, one for type specific data and one for type specific configuration.
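In simplified form (illustration only - the actual table in the patchset has more fields and different details):

CREATE TABLE sites_simplified (
  site_url    varchar(255) NOT NULL, -- shared by every site, regardless of its type
  site_data   blob NOT NULL,         -- type specific data, e.g. path info for a MediaWikiSite
  site_config blob NOT NULL          -- type specific, wiki-local configuration
);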
Cheers
-- Jeroen De Dauw http://www.bn2vs.com Don't panic. Don't be evil. --
On 12-08-09 3:55 PM, Jeroen De Dauw wrote:
Hey,
You bring up some good points.
I think we're going to need to have some of this and the synchronization stuff in core. Right now the code has nothing but the one sites table. There is no repo code, so presumably the only implementation of that for a while will be Wikidata. And if parts of this table are supposed to be editable in some cases (where there is no repo) but non-editable in others, then I don't see any way for an edit UI to tell the difference.
We indeed need some configuration setting(s) for wikis to distinguish between the two cases. That seems to be all "synchronisation code" we'll need in core. It might or might not be useful to have more logic in core, or in some dedicated extension. Personally I think having the actual synchronization code in a separate extension would be nice, as a lot of it won't be Wikidata specific. This is however not a requirement for Wikidata, so the current plan is to just have it in the extension, always keeping in mind that it should be easy to split it off later on. I'd love to discuss this point further, but it should be clear this is not much of a blocker for the current code, as it seems unlikely to affect it much, if at all.
On that note, consider that we're initially creating the new system in parallel with the old one, which enables us to just try out changes and alter them later on if it turns out there is a better way to do them. Then once we're confident the new system is what we want to stick to, and know it works because of its usage by Wikidata, we can replace the current code with the new system. This ought to allow us to work a lot faster by not blocking on discussions and details for too long.
I'm also not sure how this synchronization which sounds like one-way
will play with individual wikis wanting to add new interwiki links.
For our case we only need it to work one way, from the Wikidata repo to its clients. More discussion would need to happen to decide on an alternate approach. I already indicated I think this is not a blocker for the current set of changes, so I'd prefer this to happen after the current code gets merged.
I'm talking about things like the interwiki extensions and scripts that turn wiki tables into interwiki lists. All these things are written against the interwiki table. So by rewriting and using a new table we implicitly break all the working tricks and throw the user back into SQL.
I am aware of this. As noted already, the current new code does not yet replace the old code, so this is not a blocker yet, but it will be for replacing the old code with the new system. Having looked at the existing code using the old system, I think migration should not be too hard, since the new system can do everything the old one can do and there is not that much code using it right now. The new system also has clear interfaces, so a script does not need to know about the database table at all. That ought to help a lot with the "do not depend on a single db table" goal, obviously :)
I like the idea of table entries without actual interwikis. The idea of some interface listing user-selectable sites came to mind, and perhaps sites being added trivially, even automatically. Though if you plan to support this, I think you'll need to drop the NOT NULL from site_local_key.
I don't think the field needs to allow for null - right now the local keys on the repo will by default be the same as the global keys, so none of them will be null. On your client wiki you will then have these values by default as well. If you don't want a particular site to be usable as "languagelink" or "interwikilink", then simply set this in your local configuration. No need to set the local id to null. Depending on how we actually end up handling the defaulting process, having null might or might not turn out to be useful. This is a detail though, so I'd suggest sticking with NOT NULL for now, and then, if it turns out to be more convenient to allow null when writing the sync code, just change it then.
You mean site_config? You're suggesting the interwiki system should look for a site by site_local_key, and when it finds one, parse out the site_config, check if it's disabled, and if so ignore the fact that it found a site with that local key? Instead of just not having a site_local_key for that row in the first place?
Actually, another thought makes me think the schema should be a little different. site_local_key probably shouldn't be a column, it should probably be another table. Something like site_local_key (slc_key, slc_site) which would map things like en:, Wikipedia:, etc... to a specific site.
Denny and I discussed this at some length, now already more than a month ago (man, this is taking long...). Our conclusion was that we do not need it, nor would we benefit from it much in Wikidata. In fact, it'd introduce additional complexity, which is a good argument for not including it in our already huge project. I do agree that conceptually it's nicer to not duplicate such info, but if you consider the extra complexity you'd need to get rid of it, and the little gain you get (removal of some minor duplication which we've had since forever and is not bothering anyone), I'm sceptical we ought to go with this approach, even outside of Wikidata.
You've added global ids into this mix. So data duplication simply because one wiki needs a second local name will mean that one url now has two different global ids. This sounds precisely like something that is going to get in the way of the whole reason you wanted this rewrite. It will also start to create issues with the sync code. Additionally, the number of duplicates needed is going to vary wiki by wiki: en.wikisource is going to need one Wikipedia: to link to en.wp, while fr.wp is going to need two, Wikipedia: and en:, to point to en.wp. I can only see data duplication creating more problems than we need.
As for the supposed complexity of this extra table: site_data and site_config are blobs of presumably serialized data. You've already eliminated the simplicity needed for this to be human editable from SQL, so there is no reason to hold back on making the database schema the best it can be. As for deletions, if you're worried about keeping them simple, just add a foreign key with cascading deletion. Then the rows in site_local_key will automatically be deleted when you delete the row in sites, without any extra complexity.
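i.e. something like this (assuming the sites table has a site_id primary key - names illustrative):

ALTER TABLE site_local_key
  ADD FOREIGN KEY (slc_site) REFERENCES sites (site_id)
  ON DELETE CASCADE;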
I think I need to understand the plans you have for synchronization a bit more.
- Where does Wikidata get the sites
The repository wiki holds the canonical copy of the sites, which gets sent to all clients. Modification of the site data can only happen on the repository. All wikis (repo and clients) have their own local config, so they can choose to enable all sites for all functionality, completely hide them, or anything in between.
Ok, I'm leaning more and more towards the idea that we should make the full sites table a second-class index of sites pulled from any number of data sources, one that you can carelessly truncate and have rebuilt (i.e. it has no more value than pagelinks). Wikidata's data syncing would be served by creating a secondary table with the local link_{key,inline,navigation}, forward, and config columns. When you sync, the data from the Wikidata repo and the site-local table would be combined to create what goes into the index table with the full list of sites. Doing it this way frees us from creating any restrictions on whatever source we get sites from that we shouldn't be placing on them. Wikidata gets site-local stuff and global data and doesn't have to worry about whether parts of the row are supposed to be editable or not. There is nothing stopping us from making our first non-Wikidata site source a plaintext file so we have time to write a really good UI. And the UI is free from restrictions placed by using this one table, so it's free to do it in whatever way fits a UI best - whether that means an editable wikitext page or, better yet, a nice UI using that abstract revision system I wanted to build.
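Very roughly, the wiki-local half of that split could look like this (column names purely illustrative):

-- Kept separately from the rebuildable sites index and merged in on sync:
CREATE TABLE site_local (
  sl_site            int unsigned NOT NULL, -- which site this row configures
  sl_link_key        varbinary(32),         -- local key, e.g. 'en' or 'Wikipedia'
  sl_link_inline     bool NOT NULL DEFAULT 0,
  sl_link_navigation bool NOT NULL DEFAULT 0,
  sl_forward         bool NOT NULL DEFAULT 0,
  sl_config          blob
);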
- What synchronizes the data
The repo. As already mentioned, it might be nicer to split this off into its own extension at some point. But before we get to that, we first need to have the current changes merged.
Btw if you really want to make this an abstract list of sites dropping site_url and the other two related columns might be an idea. At first glance the url looks like something standard that every site would have. But once you throw something like MediaWiki into the mix with short urls, long urls, and an API the url really becomes type specific data that should probably go in the blob. Especially when you start thinking about other custom types.
The patch sitting on gerrit already includes this. (Did you really look at it already? The fields are documented quite well I'd think.) Every site has a url (that's not specific to the type of site), but we have a type system with currently the default (general) site type and a MediaWikiSite type. The type system works with two blob fields, one for type specific data and one for type specific configuration.
Yeah, I looked at the schema; I know there is a data blob, that's what I'm talking about. I mean, while you'd think that a url is something every site would have one of, it's actually more of a type specific piece of data, because some site types can actually have multiple urls, etc., which depend on what the page input is. So you might as well drop the 3 url related columns and just use the data blob that you already have. The $1 pattern may not even work for some sites. For example, something like a gerrit type may want to know a specific root path for gerrit without any $1 funny business and then handle what actual url gets output in special ways. I.e. so that [[gerrit:14295]] links to https://gerrit.wikimedia.org/r/#/c/14295 while [[gerrit:I0a96e58556026d8c923551b07af838ca426a2ab3]] links to https://gerrit.wikimedia.org/r/#q,I0a96e58556026d8c923551b07af838ca426a2ab3,...
Cheers
-- Jeroen De Dauw http://www.bn2vs.com Don't panic. Don't be evil. --
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
Hey,
You mean site_config?
You're suggesting the interwiki system should look for a site by site_local_key, and when it finds one, parse out the site_config, check if it's disabled, and if so ignore the fact that it found a site with that local key? Instead of just not having a site_local_key for that row in the first place?
No. Since the interwiki system is not specific to any type of site, this approach would make it needlessly hard. The site_link_inline field determines if the site should be usable as an interwiki link, as you can see in the patchset:
-- If the site should be linkable inline as an "interwiki link" using
-- [[site_local_key:pageTitle]].
site_link_inline bool NOT NULL,
So queries would be _very_ simple.
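For example (illustrative, not a literal query from the patchset):

SELECT *
FROM sites
WHERE site_local_key = 'en'
  AND site_link_inline = 1;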
So data duplication simply because one wiki needs a second local name
will mean that one url now has two different global ids. This sounds precisely like something that is going to get in the way of the whole reason you wanted this rewrite.
* It does not get in our way at all, and is completely disjunct from why we want the rewrite
* It's currently done like this
* The changes we do need and are proposing to make will make such a rewrite at a later point easier than it is now
Doing it this way frees us from creating any restrictions on whatever
source we get sites from that we shouldn't be placing on them.
* We don't need this for Wikidata
* It's a new feature that might or might not be nice to have, and that currently does not exist
* The changes we do need and are proposing to make will make such a rewrite at a later point easier than it is now
So you might as well drop the 3 url related columns and just use the data
blob that you already have.
I don't see what this would gain us at all. It'd just make things more complicated.
The $1 pattern may not even work for some sites.
* We don't need this for Wikidata
* It's a new feature that might or might not be nice to have, and that currently does not exist
* The changes we do need and are proposing to make will make such a rewrite at a later point easier than it is now
And in fact we are making this more flexible by having the type system. The MediaWiki site type could for instance form both "nice" urls and index.php ones. Or a gerrit type could have the logic to distinguish between the gerrit commit number and a sha1 hash.
Cheers
-- Jeroen De Dauw http://www.bn2vs.com Don't panic. Don't be evil. --
Hi Daniel,
thanks for your comments. Some of the suggestions you make would extend the functionality beyond what we need right now. They certainly look useful, and I don't think that we would make implementing them any harder than it is right now -- rather the opposite.
As usual, the perfect and the next step are great enemies. I understand that the patch does not directly lead to a perfect world that would cover all of your use cases -- but it nicely covers ours.
My questions would be:
* do you think that we are going in the wrong direction, or do you think we are not going far enough yet?
* do you think that we are making some use cases harder to implement in the future than they would be now, and if so which ones?
* do you see other issues with the patch that should block it from being deployed, and which ones would that be?
Cheers, Denny
On Fri, 10 Aug 2012 05:03:58 -0700, Denny Vrandečić denny.vrandecic@wikimedia.de wrote:
Hi Daniel,
thanks for your comments. Some of the suggestions you make would extend the functionality beyond what we need right now. They certainly look useful, and I don't think that we would make implementing them any harder than it is right now -- rather the opposite.
As usual, the perfect and the next step are great enemies. I understand that the patch does not directly lead to a perfect world that would cover all of your use cases -- but it nicely covers ours.
My questions would be:
- do you think that we are going in the wrong direction, or do you
think we are not going far enough yet?
- do you think that we are making some use cases harder to implement
in the future than they would be now, and if so which ones?
I don't think you're going far enough. You're rewriting a core feature in core but key issues with the old system that should be fixed in any rewrite of it are explicitly being repeated just because you don't need them fixed for Wikidata. I'm not expecting any of you to spend a pile of time writing a UI because it's missing. But I do expect that if we have a good idea what the optimal database schema and usage of the feature is that you'd make a tiny effort to include the fixes that Wikidata doesn't explicitly need. Instead of rewriting it using a non-optimal format and forcing someone else to rewrite stuff again.
Take site_local_key as an example. Clearly site_local_key as a single column does not work. We know from our interwiki experience that we really want multiple keys. And there is absolutely no value at all in forcing data duplication.
If we switch to a proper site_local_key table now, before the code is submitted, it should be a minimal change to the code you have right now. (Unless the ORM mapper makes it hard to use joins, in which case you'd be making a bad choice from the start, since when someone does fix site_local_key they will need to break interface compatibility.)
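The interwiki lookup would simply gain a join, something like (names illustrative):

SELECT s.*
FROM site_local_key k
JOIN sites s ON s.site_id = k.slc_site
WHERE k.slc_key = 'en';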
If someone tries to do this later, they are going to have to do schema changes, a full data migration in the updater, AND they are going to have to find some way to de-duplicate the data. None of that would need to be bothered with at all if the initial rewrite just took a few extra steps.
- do you see other issues with the patch that should block it from
being deployed, and which ones would that be?
I covered a few of them in inline comments on the commit: things like not understanding the role of group, using ints for site types being bad for extensibility, etc.
Cheers, Denny
Hey,
But I do expect that if we have a good idea what the optimal database
schema and usage of the feature is that you'd make a tiny effort to include the fixes that Wikidata doesn't explicitly need.
This is entirely reasonable to ask. However, this particular change is not tiny, and it would both cost us effort to implement and make the overall change even bigger, while we're trying to keep it small. We actually did go for the low-hanging fruit we did not need ourselves here, so implying we don't care about concerns outside of our project would be short-sighted. After all, strictly speaking we do not _need_ this rewrite. We could just continue pouring crap onto the current pile and hope it does not collapse, rather than fix all of the issues our change is tackling.
Instead of rewriting it using a non-optimal format and forcing someone
else to rewrite stuff again.
We are not touching this, so you would still need to make the change if you want to fix this issue, but you would not need to do it _again_. To be honest, I don't understand why you have a problem here. We're making it easier for you to make this change. If you think it's that important, then let's get our changes through so you can start making yours without us getting in each other's way.
Unless the ORM mapper makes it hard to use joins
It basically does not affect joins - it has no facilities for them, but it does not make them harder either.
Cheers
-- Jeroen De Dauw http://www.bn2vs.com Don't panic. Don't be evil. --
On Fri, Aug 10, 2012 at 9:02 AM, Daniel Friesen lists@nadir-seen-fire.com wrote:
I don't think you're going far enough. You're rewriting a core feature in core but key issues with the old system that should be fixed in any rewrite of it are explicitly being repeated just because you don't need them fixed for Wikidata. I'm not expecting any of you to spend a pile of time writing a UI because it's missing. But I do expect that if we have a good idea what the optimal database schema and usage of the feature is that you'd make a tiny effort to include the fixes that Wikidata doesn't explicitly need. Instead of rewriting it using a non-optimal format and forcing someone else to rewrite stuff again.
I agree one billion percent with everything you've said here, and it's the *exact* point I've been trying to make all along.
I have no qualms with people trying to fix this--it's something that needs to be fixed and has been on my todo list for far longer than it should've been. But if it's going to be fixed/rewritten, time should be taken so it is done properly.
-Chad