Hi all,
I've been working for the past month on a browser-based editor for JSON, called JSONwidget. For those of you unfamiliar with JSON, it's a data serialization format, which is a fancy way of saying text markup for structured data. It's one of two alternatives to XML that are gaining traction as simpler, more compact text formats for data serialization (the other being YAML).
Anyway, what my tool does is take a JSON file and a JSON-formatted schema, and render a user interface for editing the JSON, creating neatly formatted JSON on the client for submission back to the server.
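To give a rough idea of the kind of input pair involved, here's a toy data/schema combination written out as Python literals. The field names and the schema syntax here are simplified for illustration; the demos show the exact schema format the tool actually expects.

    # Purely illustrative: a small data file and a schema describing it.
    # The real JSONwidget schema dialect may differ from this sketch.
    import json

    data = {
        "title": "My address book",
        "entries": [
            {"name": "Alice", "email": "alice@example.org"},
        ],
    }

    schema = {
        "type": "map",
        "mapping": {
            "title": {"type": "string"},
            "entries": {
                "type": "seq",
                "sequence": [{
                    "type": "map",
                    "mapping": {
                        "name": {"type": "string"},
                        "email": {"type": "string"},
                    },
                }],
            },
        },
    }

    # What the client would eventually submit back to the server:
    print(json.dumps(data, indent=2))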
Here are the demos: http://robla.net/2005/jsonwidget/#demos
Note that this version has only been tested with Firefox 1.0.7. I plan to fix many known bugs in IE and Opera in my next release, and would welcome help from users of other browsers. There are still plenty of rough edges even for Firefox users, but there should be enough there to give you an idea of where it's heading.
For those that object to fancy Javascript interfaces on philosophical grounds, you'll be pleased to know that it does have a failover mode. If Javascript is turned off, you are presented with a simple web form to edit the raw JSON. Not pretty, but functional in a pinch.
I'm not sure if something like this could be adapted for Wikidata; I haven't had a chance to really dive in and see where that project is at. But I'm throwing that out there as a possibility.
Let me know what you think.
Thanks Rob
This is interesting. Based on my own experience, I can tell you that cross-browser compatibility is going to be a PITA. For Wikidata and Ultimate Wiktionary, I'm considering using this library:
http://ianbicking.org/examples/repeat_form/form.html
It's built on top of Mochikit, a fairly powerful JavaScript library.
When dealing with complex forms on top of a relational database (which is essentially what Wikidata is going to be, plus versioning and other wiki-ness), safe failover is really hard to achieve. Essentially, one has to accept that applications of a certain complexity, especially dealing with data entry, require certain capabilities on the client end.
The above forms are one example: with JavaScript, you can easily create complex relationships between different elements; without JavaScript, you would have to submit the same form 10 or 20 times, use kludges like a fixed number of empty fields, etc. That doesn't even touch upon AJAX stuff like autocompletion, which is hard to do without in some contexts. So when it comes to building Wikidata UIs, I think it makes sense to apply the [[Pareto principle]].
Erik
On Thu, Dec 01, 2005 at 07:18:31PM +0100, Erik Moeller wrote:
This is interesting. Based on my own experience, I can tell you that cross-browser compatibility is going to be a PITA. For Wikidata and Ultimate Wiktionary, I'm considering using this library:
http://ianbicking.org/examples/repeat_form/form.html
It's built on top of Mochikit, a fairly powerful JavaScript library.
When dealing with complex forms on top of a relational database (which is essentially what Wikidata is going to be, plus versioning and other wiki-ness), safe failover is really hard to achieve. Essentially, one has to accept that applications of a certain complexity, especially dealing with data entry, require certain capabilities on the client end.
Has Ruby on Rails (http://www.rubyonrails.com) been considered for Wikidata? (From looking through the Meta discussion page, it seems this is not the case.)
I've been reading the "Agile Web Development with Rails" book and trying some tiny web apps myself, and it really is a very nice, much more abstract way of writing web applications (which most of the time leads to higher productivity).
The real challenge here would be the integration of the existing PHP function/state framework with something written in Ruby (http://www.rubyist.net/~slagell/ruby/index.html).
Ruby-to-MediaWiki bindings, single-login user session handling, combined HTML rendering, and more would be needed, but I think in the end the benefits of this increased productivity and the future Rails plugin extensions could be very interesting. Unfortunately it involves a lot of complicated work, that's for sure.
regards,
Jama Poulsen http://wikicompany.org http://debianlinux.net
Hoi,
From my visit to Berlin I can assure you that Ruby on Rails was looked at. I can also say that Erik has experimented with Ruby.
Thanks,
GerardM
Jama Poulsen wrote:
On Thu, Dec 01, 2005 at 07:18:31PM +0100, Erik Moeller wrote:
This is interesting. Based on my own experience, I can tell you that cross-browser compatibility is going to be a PITA. For Wikidata and Ultimate Wiktionary, I'm considering using this library:
http://ianbicking.org/examples/repeat_form/form.html
It's built on top of Mochikit, a fairly powerful JavaScript library.
When dealing with complex forms on top of a relational database (which is essentially what Wikidata is going to be, plus versioning and other wiki-ness), safe failover is really hard to achieve. Essentially, one has to accept that applications of a certain complexity, especially dealing with data entry, require certain capabilities on the client end.
Has Ruby on Rails (http://www.rubyonrails.com) been considered for Wikidata? (From looking through the Meta discussion page, it seems this is not the case.)
I've been reading the "Agile Web Development with Rails" book and trying some tiny web apps myself, and it really is a very nice, much more abstract way of writing web applications (which most of the time leads to higher productivity).
The real challenge here would be the integration of the existing PHP function/state framework with something written in Ruby (http://www.rubyist.net/~slagell/ruby/index.html).
Ruby-to-MediaWiki bindings, single-login user session handling, combined HTML rendering, and more would be needed, but I think in the end the benefits of this increased productivity and the future Rails plugin extensions could be very interesting. Unfortunately it involves a lot of complicated work, that's for sure.
Jama Poulsen:
Has Ruby on Rails (http://www.rubyonrails.com) been considered for Wikidata? (From looking through the Meta discussion page, it seems this is not the case.)
I was actually hacking on a RoR app when Gerard came over here. He had to drag me back screaming to work on Wikidata again. ;-) RoR is beautiful indeed, in many ways (ugly in some), but the only way it would be useful for Wikidata is if we implemented WD from scratch as something separate from MediaWiki. That makes a limited amount of sense, since our primary intended deployments are the Wikimedia projects.
We have considered using RoR to hack together some prototypes, but have for now decided against it as we want to avoid further delays that would be caused by throwaway code.
I am, however, trying to learn from RoR in that I want to come up with some clever conventions for table and column names that can lead to very quick deployment of simple applications, including applications which require relations between tables. However, complex Wikidata applications will need quite a bit of specialized coding, for which we're currently contemplating a "hooks" model similar to the one used for extensions.
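Just to illustrate the sort of thing I mean (these conventions are made up for the example, not the ones we'll actually settle on): a column named foo_id would, by convention alone, be read as a reference to the id column of table foo, so simple applications with relations need no extra configuration at all.

    # Sketch of convention-over-configuration relation guessing, roughly
    # in the Rails spirit. Column names ending in "_id" are assumed to
    # reference the "id" column of the table named by the prefix.
    # These naming rules are hypothetical, not the final Wikidata ones.

    def guess_relations(columns):
        """Map column names like 'author_id' to ('author', 'id')."""
        relations = {}
        for col in columns:
            if col.endswith("_id"):
                relations[col] = (col[:-3], "id")
        return relations

    print(guess_relations(["id", "title", "author_id", "publisher_id"]))
    # {'author_id': ('author', 'id'), 'publisher_id': ('publisher', 'id')}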
The Wikidata integration into MediaWiki is intended to be very deep. The simplest Wikidata application would be a regular MediaWiki edit page which, in addition to the normal text field, would have some other edit field, say a date (this could be useful for Wikinews, for example).
In practice, Wikidata will require creating new tables with these desired characteristics, which will then be hooked up to a namespace (hence the current focus on namespace-related work). I've decided against a fixed table model because I considered it not scalable for complex applications, so a Wikidata table always corresponds to a real table in the database (with some required minimum fields for versioning the table and connecting it to a wiki page). Another advantage of this approach is that specialized tools like phpMyAdmin can be used to design the gist of the application.
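As a very rough sketch of what "a real table with some required minimum fields" could look like (SQLite syntax for brevity, and all field names are made up for illustration, not the actual schema):

    # Hypothetical layout: an application table that is also a Wikidata
    # table, i.e. it carries a few required bookkeeping columns that tie
    # each row to a wiki page and to the versioning machinery, plus
    # whatever application-specific fields the admin designs (e.g. in
    # phpMyAdmin).
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE event (
        -- required minimum fields (illustrative names):
        wd_item     INTEGER NOT NULL,  -- stable item key
        wd_revision INTEGER NOT NULL,  -- revision this row belongs to
        wd_page     INTEGER NOT NULL,  -- wiki page the item is hooked to
        -- application-specific fields:
        title       TEXT,
        event_date  TEXT
    );
    """)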
I will publish my design specifications soon. This will also include some more details about "Ultimate Wiktionary", which is really the main thing we're working towards -- an incredibly complex and powerful application built with Wikidata (and, very likely, tons of specialized code). I hope that when we have these semi-final specs, we can put an end to much speculation. By the end of the year, we also want to have a prototype that demonstrates much of Wikidata as well as some functionality that will be in Ultimate Wiktionary.
My main challenge at the moment is to find a scalable model for versioning multiple tables with complex relations between them. Versioning is necessary for the "wiki" in "Wikidata", but at some level of complexity, it gets very tricky.
If anyone is interested in working on Wikidata, please contact me again (you may have done so before - sorry if something fell through). We're making substantial progress and are also exploring new sources of funding for future development.
Best,
Erik
On 12/1/05, Erik Moeller erik_moeller@gmx.de wrote: [snip]
My main challenge at the moment is to find a scalable model for versioning multiple tables with complex relations between them. Versioning is necessary for the "wiki" in "Wikidata", but at some level of complexity, it gets very tricky.
If anyone is interested in working on Wikidata, please contact me again (you may have done so before - sorry if something fell through). We're making substantial progress and are also exploring new sources of funding for future development.
Eh, I don't see why it's hard. Just do it like MediaWiki.
Every 'table' must have an identifying key which is non-null and immutable and has all the history attached.
For every Wikidata table there are actually two tables in the database, an item table and a revision table. In the item table, the key will be constrained to be unique and non-null; in the revision table it will be just non-null.
Generally just follow MediaWiki for the fields, but rather than text, have your data fields.
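Something like this, as a sketch (SQLite syntax and made-up field names; the point is the unique key on the item table and the append-only revision table):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE item (
        item_key INTEGER NOT NULL UNIQUE,  -- stable, immutable identifier
        latest   INTEGER NOT NULL          -- current revision
    );
    CREATE TABLE revision (
        item_key   INTEGER NOT NULL,       -- non-null, but not unique
        rev_id     INTEGER PRIMARY KEY,
        -- instead of wikitext, the application's data fields go here:
        population INTEGER,
        area_km2   REAL
    );
    """)

    # An edit never rewrites history: it appends a revision row and
    # repoints the item at it.
    conn.execute("INSERT INTO revision (item_key, population, area_km2) "
                 "VALUES (1, 3500000, 891.8)")
    rev = conn.execute("SELECT last_insert_rowid()").fetchone()[0]
    conn.execute("INSERT OR REPLACE INTO item (item_key, latest) "
                 "VALUES (1, ?)", (rev,))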
What will generally be a problem is that many forms of wiki data will want the ability to associate arbitrary name-value pairs with items, like successor="George W. Bush".
A pure SQL approach would be to have another (pair of) tables (due to versioning) with item_key, name, value, but the performance of that approach would be poor because of locality issues. If MySQL supported clustering tables on a field (does it?), then you could cluster on item_key and carry indexes on item_key and on (name, value).
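That is, something along these lines (made-up names, SQLite syntax; the locality problem obviously doesn't show up at this scale):

    # Sketch of the extra pair of tables for free-form name-value
    # attributes; the second table exists only for versioning.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE attribute (
        item_key INTEGER NOT NULL,
        name     TEXT    NOT NULL,
        value    TEXT
    );
    CREATE TABLE attribute_revision (
        item_key INTEGER NOT NULL,
        rev_id   INTEGER NOT NULL,
        name     TEXT    NOT NULL,
        value    TEXT
    );
    -- the indexes mentioned above:
    CREATE INDEX attr_item ON attribute (item_key);
    CREATE INDEX attr_name_value ON attribute (name, value);
    """)

    conn.execute("INSERT INTO attribute VALUES (42, 'successor', 'George W. Bush')")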
If MySQL had an indexed hstore datatype like PostgreSQL, then name-value data could be stored without additional tables, which would be simpler and would perform well as long as the name-value data wasn't too big.
Gregory Maxwell:
For every Wikidata table there are actually two tables in the database, an item table and a revision table. In the item table, the key will be constrained to be unique and non-null; in the revision table it will be just non-null.
Generally just follow MediaWiki for the fields, but rather than text, have your data fields.
The scenario of having a basically flat table that is versioned is a simple one. It gets tricky when you have 20 tables which are in complex relations with each other, and you want to revert or view a particular revision. Note that even our current MediaWiki is broken in this respect - when you view an old version of a page, you see it with the latest versions of the templates it uses, which can lead to very funny results.
Example: In Ultimate Wiktionary, you have an expression ("dog") which can be associated with multiple words of different types and genders, which can be associated with multiple defined meanings (each of which can be associated with a meaning text), synonyms and translations. This is a powerful concept, because it allows you to "magically" find all translations and synonyms when you know what meaning you refer to for a particular expression. The possible applications are endless.
The problem with relations of that complexity is that, if you don't want to balloon your tables, (I think) you need a lot of revision pointers between them. I'm going to try to model this with a simpler example. If you want to model something, feel free to help - it is sometimes good if multiple heads work on the same problem and come up with different solutions.
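To make "revision pointers" concrete with a deliberately tiny example (two data tables only, made-up names, and definitely not a final design): each revision of the parent record carries pointers to the exact child revisions it was saved against, so an old parent revision can be displayed with exactly the child data it referred to.

    # Tiny sketch: an expression and its translations, where each
    # expression revision remembers which translation revisions it saw.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE expression_revision (
        expr_key INTEGER NOT NULL,
        rev_id   INTEGER PRIMARY KEY,
        spelling TEXT
    );
    CREATE TABLE translation_revision (
        trans_key INTEGER NOT NULL,
        rev_id    INTEGER PRIMARY KEY,
        text      TEXT
    );
    -- the revision pointers: which translation revisions belong to
    -- which expression revision
    CREATE TABLE expr_trans_rev (
        expr_rev  INTEGER NOT NULL,
        trans_rev INTEGER NOT NULL
    );
    """)

    # Viewing expression revision 7 means joining through the pointer
    # table, not just picking up the latest translation rows.
    rows = conn.execute("""
        SELECT t.text FROM expr_trans_rev p
        JOIN translation_revision t ON t.rev_id = p.trans_rev
        WHERE p.expr_rev = 7
    """).fetchall()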
Too many people see Wikidata as being about infoboxes. I saw it that way when I first came up with the idea, but it's really more interesting to see it as a wiki engine for arbitrary database-driven applications. The goal is, to steal a Perl proverb, to make simple things simple and complex things possible. For apps like UW, a lot of custom code will have to be written, but the basic table structures can be the same as for any other WD application.
Best,
Erik
Erik Moeller wrote:
Gregory Maxwell:
For every Wikidata table there are actually two tables in the database, an item table and a revision table. In the item table, the key will be constrained to be unique and non-null; in the revision table it will be just non-null. Generally just follow MediaWiki for the fields, but rather than text, have your data fields.
The scenario of having a basically flat table that is versioned is a simple one. It gets tricky when you have 20 tables which are in complex relations with each other, and you want to revert or view a particular revision. Note that even our current MediaWiki is broken in this respect
- when you view an old version of a page, you see it with the latest
versions of the templates it uses, which can lead to very funny results.
This is mentioned on [[m:Article validation possible problems]] - you can't really rate a particular version going back too far sensibly as it'll show the current version of any templates.
One solution is to have viewing an old rev attempt to bring up whatever version of each of its templates was current at the time the rev was saved. This is more work, but old revs are viewed a *lot* less than current versions.
Variation: if a stable link is given in a news story or popular blog or whatever, it'd be eminently cacheable. Are old revs cached the way the current version is?
A more elaborate variation on the idea is to save each rev with a list of which versions of the included templates it uses.
(Yeah, I know, go write the code ;-)
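For what it's worth, the elaborate variation could be as little as one extra link table (hypothetical names here, this is not an existing MediaWiki table): at save time, record the then-current revision of every transcluded template, and consult that list when rendering old revs.

    # Hypothetical table recording, per article revision, which revision
    # of each transcluded template was current when the page was saved.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
    CREATE TABLE rev_templatelinks (
        page_rev      INTEGER NOT NULL,  -- revision of the article
        template_page INTEGER NOT NULL,  -- the template's page id
        template_rev  INTEGER NOT NULL   -- template revision at save time
    )""")

    # At save time (placeholder values):
    conn.execute("INSERT INTO rev_templatelinks VALUES (1234, 56, 789)")

    # Rendering article revision 1234 later would then fetch template
    # revision 789 instead of the template's current text.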
The problem with relations of that complexity is that, if you don't want to balloon your tables, (I think) you need a lot of revision pointers between them. I'm going to try to model this with a simpler example. If you want to model something, feel free to help - it is sometimes good if multiple heads work on the same problem and come up with different solutions.
Would the second idea above be useful to this end as well?
- d.
David Gerard wrote:
Erik Moeller wrote:
Gregory Maxwell:
For every Wikidata table there are actually two tables in the database, an item table and a revision table. In the item table, the key will be constrained to be unique and non-null; in the revision table it will be just non-null. Generally just follow MediaWiki for the fields, but rather than text, have your data fields.
The scenario of having a basically flat table that is versioned is a simple one. It gets tricky when you have 20 tables which are in complex relations with each other, and you want to revert or view a particular revision. Note that even our current MediaWiki is broken in this respect
- when you view an old version of a page, you see it with the latest
versions of the templates it uses, which can lead to very funny results.
This is mentioned on [[m:Article validation possible problems]] - you can't really rate a particular version going back too far sensibly as it'll show the current version of any templates.
Another advantage an import-only site would have: changes to templates are controlled.
BTW, the "staticwiki.php" extension now fully works for text imports. Image imports are not supported yet.
Magnus
On Thu, 2005-12-01 at 19:18 +0100, Erik Moeller wrote:
This is interesting. Based on my own experience, I can tell you that cross-browser compatibility is going to be a PITA.
Oh, I know. ;-) I actually got things working well in IE midway through developing this, which practically involved a rewrite. The IE bits tend to regress because keeping them working involves dealing with my wife's Windows box. That said, I think the scope of the bugfixes needed to get things working well is small.
For Wikidata and Ultimate Wiktionary, I'm considering using this library:
Cool, thanks for the pointer. At first I was really excited, because I thought it was doing a bunch of really complementary things, like handling the extra WHATWG datatypes (e.g. <input type="date">), but it looks like he's just started with the repeat part. That said, it looks like a really small amount of code, which is nice.
It's built on top of Mochikit, a fairly powerful JavaScript library.
I'd looked at Mochikit briefly while I was developing this, but was running into performance issues that a library might exacerbate. I was using Behaviour.js at first, and found I got a big performance boost by writing custom code rather than using Behaviour the way that I was. The good thing about using Behaviour was that it made me think about cleaner approaches to attaching events to controls.
That said, I might be able to get some cross-browser and code clarity benefits by using Mochikit, so I may revisit that decision.
When dealing with complex forms on top of a relational database (which is essentially what Wikidata is going to be, plus versioning and other wiki-ness), safe failover is really hard to achieve. Essentially, one has to accept that applications of a certain complexity, especially dealing with data entry, require certain capabilities on the client end.
Yup. Even for something as simple and specific as what I wanted to do (configure an election with an arbitrary number of candidates) I ran into problems with plain ol' HTML forms.
The above forms are one example: with JavaScript, you can easily create complex relationships between different elements; without JavaScript, you would have to submit the same form 10 or 20 times, use kludges like a fixed number of empty fields, etc. That doesn't even touch upon AJAX stuff like autocompletion, which is hard to do without in some contexts. So when it comes to building Wikidata UIs, I think it makes sense to apply the [[Pareto principle]].
Here's a way I think something like this could work with Wikidata. Right now, JSONwidget is doing all of the heavy lifting on the client, and doing it with two JSON-formatted components:
* The data
* The schema describing the data format
My next step for JSONwidget is to add another piece, somewhat analogous to a style sheet. Currently in JSONwidget, the relationship between the datatype (e.g. string, int, bool) and the input control is a hard-coded, one-to-one relationship. My thought is to add a way of custom-binding nodes of the tree (even whole subtrees) to new controls, either per schema node (by identifier/tree location) or per datatype.
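In pseudo-config terms (all names made up, just to show the shape of it), the binding layer might look something like this, with per-node bindings overriding the per-datatype defaults:

    # Hypothetical "style sheet" for the editor: bindings attach either
    # to a datatype or to a specific node/subtree in the schema.
    bindings = {
        "by_type": {
            "string": "TextInput",
            "int":    "SpinBox",
            "bool":   "Checkbox",
        },
        "by_path": {
            "/entries/email": "EmailInput",  # overrides the string default
        },
    }

    def control_for(path, datatype):
        """Pick a control for a schema node; a per-path binding wins."""
        # (real matching would handle subtrees, not just exact paths)
        if path in bindings["by_path"]:
            return bindings["by_path"][path]
        return bindings["by_type"].get(datatype, "TextInput")

    print(control_for("/entries/email", "string"))  # EmailInput
    print(control_for("/title", "string"))          # TextInput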
This same approach could be taken on the back end as well, where nodes in the schema get mapped to fields in the database. A default mapping could map this to a generalized but possibly suboptimal table layout. An administrator could then lock down a particularly popular schema, add a custom table, index the hell out of that table, and convert the existing data into the new database binding.
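The default mapping could be something generic like one row per leaf node (again, illustrative names only), which any schema can fall back to, with popular schemas later migrated to their own indexed tables:

    # Generic fallback layout: every leaf of the JSON tree becomes one
    # (document, path, value) row. Works for any schema, but won't be fast.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
    CREATE TABLE json_node (
        doc_id INTEGER NOT NULL,
        path   TEXT    NOT NULL,  -- e.g. '/entries/0/email'
        value  TEXT
    )""")
    conn.executemany(
        "INSERT INTO json_node VALUES (?, ?, ?)",
        [(1, "/title", "My address book"),
         (1, "/entries/0/name", "Alice"),
         (1, "/entries/0/email", "alice@example.org")])

    # A locked-down schema would instead get a dedicated table, e.g.
    # address_entry(doc_id, name, email) with real indexes, and the
    # existing rows converted over.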
Anyway, food for thought.
Rob