Hi all, I'm new to this list, but I'm sure this topic has come up many times before. In short, wiki syntax is inadequate for structuring discussion pages, and MediaWiki needs a forum system. LiquidThreads is the most significant effort in this direction, and it's a big step forward, but the project is apparently dead and it's not being adopted by WMF. Other wikis have been quite successful with forum systems, like the highly active YKTTW forum on TVTropes (http://tvtropes.org/pmwiki/ykttw.php). I believe that the current system substantially impedes usability of WMF projects, particularly for new users, in some of the following ways:
* New users may fail to get attention because they post in the wrong place on the page, forget to sign their post, or format their replies incorrectly. Users new to wiki syntax may be so intimidated that they don't post at all. They may also repeat questions already asked many times because the relevant threads have vanished into the archives and there's no effective search functionality for threads.
* Discussions may falter because interested contributors can't watch individual discussions, sorting by creation time encourages short threads, and archiving edits mask real edits on watchlists.
* Inconsistent formatting causes confusion in threads (e.g. indentation resetting, arbitrary section breaks, comments running together), which occasionally leads to avoidable conflict.
For many years we've dealt with kludgy solutions for these problems on En, like archiving bots, signature bots, and so on.
A common objection I hear is that wiki talk pages are better for discussing drafts of portions of articles. But as long as the contents of posts still use the same wiki syntax this is still straightforward.
There's still significant disagreement about a few specific issues like whether users, just admins, or no one should be permitted to edit comments of others, or move threads from one discussion page to another; a good permission system would allow this to be left up to the individual wiki.
There's a significant migration problem: what to do with the massive reams of content existing on discussion pages and in discussion archives? There may also be scalability issues with creating a system that can handle the kind of load Wikipedia currently sees.
In short, I'm looking to revive the much-delayed effort to get real forum support implemented and deployed to major WMF projects, and offering to contribute and head up this effort myself. What will it take, and what's the best answer to the hard design questions? What have we learned from LiquidThreads? Considering all the schema changes since LiquidThreads, is it better to use it as a starting point, or to do something new? Any feedback is appreciated.
-Vivian
P.S. I'm posting under an alias due to my work's open source contribution policy - I'm an undisclosed administrator on the English Wikipedia.
Vivian Landers wrote:
Hi all, I'm new to this list, but I'm sure this topic has come up many times before. In short, wiki syntax is inadequate for structuring discussion pages, and MediaWiki needs a forum system. LiquidThreads is the most significant effort in this direction, and it's a big step forward, but the project is apparently dead and it's not being adopted by WMF.
LQT is being revived; we've contracted David McCabe, the author of the original version, to polish up the rough edges and maintain it going forward.
-- brion
On Thu, Sep 11, 2008 at 7:29 PM, Vivian Landers <vivianlanders@gmail.com> wrote:
Hi all, I'm new to this list, but I'm sure this topic has come up many times before. In short, wiki syntax is inadequate for structuring discussion pages, and MediaWiki needs a forum system. LiquidThreads is the most significant effort in this direction, and it's a big step forward, but the project is apparently dead and it's not being adopted by WMF.
It's not dead, so the answer is probably just to be more aggressive about getting it adopted by the WMF. I suspect it will have serious performance issues that will need to be worked out before it's usable by Wikipedia.
There's still significant disagreement about a few specific issues like whether users, just admins, or no one should be permitted to edit comments of others, or move threads from one discussion page to another
As far as I'm concerned, there is not any disagreement that's significant, at least with respect to Wikimedia. We're talking about a *wiki* discussion system. Allow open editing and preserve histories, it's as simple as that. If someone wants to add options that non-Wikimedia administrators can configure if it's useful to them, I don't necessarily have any problem with that.
There's a significant migration problem: what to do with the massive reams of content existing on discussion pages and in discussion archives?
Not a big issue. Any discussion page will need to have some section that's editable like a usual wiki page, probably, to store non-discussion content: header templates, for instance. So just dump the old discussions in there.
There may also be scalability issues with creating a system that can handle the kind of load Wikipedia currently sees.
That's one of the more serious issues. I recall David was having issues in deciding on the schema, since he wanted efficient versioned trees. I don't know what he ended up settling on, but I would imagine it would have some issues when scaling up to the massive levels needed by Wikipedia. Few schemas don't. There are lots of fiddly MediaWiki-specific details that you need to understand properly, like how external storage works.
In short, I'm looking to revive the much-delayed effort to get real forum support implemented and deployed to major WMF projects, and offering to contribute and head up this effort myself. What will it take, and what's the best answer to the hard design questions? What have we learned from LiquidThreads? Considering all the schema changes since LiquidThreads, is it better to use it as a starting point, or to do something new? Any feedback is appreciated.
I don't know enough about LiquidThreads to comment intelligently on it, unfortunately.
On Thu, Sep 11, 2008 at 8:57 PM, Aryeh Gregor <Simetrical+wikilist@gmail.com> wrote:
That's one of the more serious issues. I recall David was having issues in deciding on the schema, since he wanted efficient versioned trees. I don't know what he ended up settling on, but I would imagine it would have some issues when scaling up to the massive levels needed by Wikipedia. Few schemas don't. There are lots of fiddly MediaWiki-specific details that you need to understand properly, like how external storage works.
Taking a quick look at the schema, there are some fairly clear issues already. These are the indexes on the thread table:
    PRIMARY KEY thread_id (thread_id),
    UNIQUE INDEX thread_id (thread_id),
    INDEX thread_ancestor (thread_ancestor),
    INDEX thread_article_title (thread_article_namespace, thread_article_title),
    INDEX thread_modified (thread_modified),
    INDEX thread_created (thread_created),
    INDEX thread_summary_page (thread_summary_page)
Well, first of all the duplicate key on thread_id doesn't make any sense at all. Now, I'm not sure how the thread model works exactly, but I find it surprising that thread_modified isn't on the end of any indexes, for sorting. I tried to find a sample query:
    $where = array(Threads::articleClause($this->article),
                   'thread.thread_parent is null',
                   '(thread.thread_summary_page is not null' .
                       ' OR thread.thread_type = '.Threads::TYPE_MOVED.')',
                   'thread.thread_modified < ' . $startdate->text());
    $options = array('ORDER BY thread.thread_modified DESC');
That's only partial, but it gives me a pretty good idea that you would really want an index on (thread_article_namespace, thread_article_title, thread_modified) here, at the very least. You might also need thread_parent or thread_summary_page in there, depending on the expected distribution of values there.
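To make the index point concrete, here is a rough sketch that compares query plans with and without thread_modified on the end of the index. It uses SQLite through Python purely as a stand-in, so it says nothing definite about how MySQL would plan the real LQT query. The table is trimmed to the columns in the sample query, and the index name thread_lookup, the helper names, and the page title are all made up for illustration.

```python
import sqlite3

# Cut-down stand-in for the LQT thread table; only the columns that the
# sample query touches are included, and the column types are assumptions.
SCHEMA = """
CREATE TABLE thread (
    thread_id INTEGER PRIMARY KEY,
    thread_parent INTEGER,
    thread_article_namespace INTEGER NOT NULL,
    thread_article_title TEXT NOT NULL,
    thread_modified TEXT NOT NULL
)
"""

# Simplified version of the sample query quoted above.
QUERY = """
SELECT thread_id FROM thread
WHERE thread_article_namespace = ?
  AND thread_article_title = ?
  AND thread_parent IS NULL
  AND thread_modified < ?
ORDER BY thread_modified DESC
"""


def query_plan(index_columns):
    """Return SQLite's plan steps for QUERY given one index on index_columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    # "thread_lookup" is a hypothetical index name.
    conn.execute(
        "CREATE INDEX thread_lookup ON thread (" + ", ".join(index_columns) + ")"
    )
    rows = conn.execute(
        "EXPLAIN QUERY PLAN " + QUERY, (1, "Some_page", "20080911000000")
    ).fetchall()
    conn.close()
    # The human-readable description is the last column of each plan row.
    return [row[-1] for row in rows]


def needs_sort(plan):
    """True if SQLite has to sort the rows itself instead of reading them in index order."""
    return any("TEMP B-TREE" in step for step in plan)


# Index as it exists today: (namespace, title) only.
narrow_plan = query_plan(["thread_article_namespace", "thread_article_title"])
# Suggested index: the same columns with thread_modified on the end.
wide_plan = query_plan(
    ["thread_article_namespace", "thread_article_title", "thread_modified"]
)

print("narrow index needs a separate sort:", needs_sort(narrow_plan))
print("wide index needs a separate sort:", needs_sort(wide_plan))
```

With the shorter index SQLite finds the page's rows and then sorts them; with thread_modified appended it can walk the index in order and skip the sort. Whether thread_parent or thread_summary_page also belongs in the index still depends on the real data distribution.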
Poking around a bit more, I find the method Threads::monthsWhereArticleHasThreads(), which a comment accurately describes as "Horrible, horrible!" It retrieves *all* threads for the entire page into an array, iterates through them, returns a PHP array of arbitrarily many Thread objects -- the Thread class having 25 member variables which are all loaded, AFAICT including the text of the thread -- and then iterates through that array *again*, collecting a list of distinct months.
I spot one place where this method is used, and there it apparently takes the list of months (assuming we haven't OOMed by now due to this being, e.g., ANI) to . . . well, set some member variables, I can't figure out exactly what without a fair amount of further digging. But it suffices to say that I think we can consider LQT's performance to be a work in progress. :)
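For comparison, one possible way to get the list of months without loading every Thread object is to let the database compute the distinct months. The sketch below is not what LQT does and is only an illustration: it again uses SQLite through Python as a stand-in, assumes timestamps are stored as 14-character YYYYMMDDHHMMSS strings (MediaWiki's usual format), and assumes the months come from thread_modified, which I haven't confirmed. The function name and sample rows are hypothetical.

```python
import sqlite3


def months_with_threads(conn, namespace, title):
    """Return distinct YYYYMM strings for one page's threads, newest first.

    Only the month prefixes leave the database, so no thread text or
    per-thread objects are loaded.
    """
    rows = conn.execute(
        """
        SELECT DISTINCT substr(thread_modified, 1, 6) AS month
        FROM thread
        WHERE thread_article_namespace = ?
          AND thread_article_title = ?
        ORDER BY month DESC
        """,
        (namespace, title),
    )
    return [row[0] for row in rows]


# Tiny in-memory table with made-up rows, just to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE thread (
        thread_id INTEGER PRIMARY KEY,
        thread_article_namespace INTEGER NOT NULL,
        thread_article_title TEXT NOT NULL,
        thread_modified TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO thread (thread_article_namespace, thread_article_title,"
    " thread_modified) VALUES (?, ?, ?)",
    [
        (1, "Some_page", "20080911192900"),
        (1, "Some_page", "20080901000000"),
        (1, "Some_page", "20080815120000"),
        (1, "Other_page", "20070101000000"),
    ],
)

months = months_with_threads(conn, 1, "Some_page")
print(months)
```

On a page the size of ANI this would still scan every thread row for the page, so it would want an index that starts with (thread_article_namespace, thread_article_title), but it avoids building arbitrarily many objects in PHP memory.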
wikitech-l@lists.wikimedia.org