Odd plans on Future/Parser_plan - are they true? - Wikitext-l

List overview All Threads
Download

newer

Odd plans on Future/Parser_plan - are they true?

older

Re: [Wikitext-l] Markup syntax

An update on search

vitalif＠yourcmc.ru

5 Feb 2012 5 Feb '12

9:13 p.m.

Hi wikitext-l!

I've read http://www.mediawiki.org/wiki/Future/Parser_plan recently, and the plans seemed strange and scary to me. In several places, there is the following stuff said: ...rich text editor which will let most editors and commenters contribute text without encountering source markup... ...further reducing the need of advanced editors to work with markup directly... ...by integrating well with a rich text editor, we reduce the dependence of editors and commenters on dealing with low-level wikitext... ..."oh there's that funky apostrophe thing in this old-style page". Most editors will never need to encounter it...

Such plans seem very scary to me, as I think the PLAIN-TEXT is one of the MOST IMPORTANT features of Wiki software! And you basically say you want to move away from it and turn MediaWiki to another Word, having all problems of "WYSIWYdnG" (...Is What Wou don't Get) editors. I don't think I need to explain the advantages of plain-text markup to core developers of MediaWiki... :)

I've patched the parser code slightly (in mediawiki4intranet) and I understand it's not perfect, so I support any effort for creating a new parser, but if it involves moving away from markup in the future fully...

So, my question is - is this all true, or did I misunderstand the plans?

Show replies by date

Pavel Tkachenko

6 Feb 6 Feb

7:58 a.m.

...

So, my question is - is this all true, or did I misunderstand the plans?

Judging from other dev pages and maillist activity they seem to be striking true.

I fully agree with the thread starter that the direction in which MediaWiki in general and Wikipedia in particular is going is controversial at best: it has started with a plain text markup and now it's going to end (and finish) with a "less-than-Word" solution. But I'm afraid we are not the ones who can influence this as the core devs seem to have already decided that the community needs WYSIWYG.

But since it won't harm to add a few of my own arguments I'll do so just in case someone wants to give it a second chance.

First of all, if editors can't master even the simplest wiki markup (without templates and the C/HTML mixture) than what good he's for as an editor? Basic Wikipedia markup contains a dozen of tokens, if not less. On the contrary, those who've mastered the basics have passed the first and unobtrusive"editorial filter".

WYSIWYG editor is going to level everyone and give more chances for inappropriate edits. Come on, if someone has no free 5 minutes to learn the basics once and for all than why he must be allowed to edit pages in the first place? And you'll need to have more than 5 minutes to learn WP edit rules anyway - and no WYSIWYG will save you from this unless WYSIWYG is needed to boost the statistics of 'EPD'.

WYSIWYG is known to be much less productive than "WYTIWYG" (what you think is what you get - plain text) when writing more or less long texts, is more robust (doesn't need any modern browser with contenteditable divs which in turn needs less CPU power, etc.), less error-prone (no chances to get markup that isn't visible - in plain text there are no invisible markup whatsoever; this includes semi-hidden markup like URLs which are hidden under link caption), over-9000 technical benefits (no need to reinvent/reimplement Unicode, can be edited even without a web browser in any text editor on any platform and any device, etc.). And I'm sure anyone can come up with a few more points.

I have a hunch that all of this was already discussed long time ago before taking such a critical decision in changing 'wikitext' to 'wikiword' - and given amount of work done I doubt this can be changed now. However, --hope dies last--, I can't just keep silent all the time, especially if there seem to be people thinking my way too.

A. Aharoni in his recent "reimplementation of editing functions in the Visual Editor" thread has outlined just a subset of technical difficulties and inconveniences for its first non-English users regarding rich text editing.

Y. Tarasievich has just rightly said that current MediaWiki markup is HTML by another name. IMO this is the main problem: not inconvenient editing means but inconvenient markup syntax itself.

I think the markup initially was planned for English articles. Moreover, it probably wasn't foreseen to have become so complex, with #directives and <extra markup>. This is perfectly understandable and no one is to be blamed. However, now may be the time to refactor old problems into new strengths.

Making markup language-neutral is easy enough: even a single person can carry on the research to find keyboard symbols that are easily accessible across different language standards. From my experience they are ! % * ( ) - = _ and +. This will eliminate the need to layout switches (for example, currently a Russian Wikipedia editor must switch layout 5 times in typing simple "underline" since neither < nor > are present in Russian layout; the same goes for links: "[[link]]" and "[[link|caption]]" - pipe is also not present here).

My study indicates that the number of available symbols will allow to avoid HTML-style tags completely - this will further simplify the markup. For instance, instead of "__" can be used; <ref> can be replaced by "[[*ref]]" for uniformity with links; and so on. I am ready to give expanded explanation if anyone is interested.

Special tokens like #REDIRECT, {{-}}, <imagemap>, __TOC__, etc. that all use different syntaxes can be uniformized in a way similar to template insertions: {{redir New page}}, {{clear}}, {{imagemap image.png, title x y, ...}}, {{TOC}} and so on. Templates can be called as {{tpl template arg arg arg}} - even if we keep { and } that require layout switch in some languages we eliminate the pipe which just makes things worse and text - less readable.

To sum it up plain text editing might be an issue but not the main one. Mind you, Wikipedia has gained 10.5 mln articles with 347 mln edits with its current (not very convenient to be blunt) text editor. If that was a big issue Wikipedia wouldn't be what it is today.

Old markup can still be entirely supported. I'm sure the team understands that building a visual editor on top of current markup is not the best way to 'fix' it as it will mean headaches for the developers. But it's not necessary: new parser system can incorporate both markups and use old compatibility layer to not only parse old markup into new document tree but even transparently convert it into new markup. Users won't even notice that something has changed after upgrading their MediaWiki - not until they try to edit a page and find that its markup has been seamlessly transformed.

A top-notch WYSIWYG editor most certainly would not hurt, especially if it's built on top of good document tree. But in my opinion low-level markup is primary to the rich editor and is extremely easy to implement, at the same time maintaining maximum platform compatibility. Even a single person given enough motivation and a sane amount of time can build the entire system - parser/serializer/editor. This is hardly true for ambitious visual editor with integrated Unicode/multilanguage support that is currently underway by the Wikitext team.

Signed, P. Tkachenko

2012/2/6 vitalif@yourcmc.ru:

...

Hi wikitext-l!

I've read http://www.mediawiki.org/wiki/Future/Parser_plan recently, and the plans seemed strange and scary to me. In several places, there is the following stuff said: ...rich text editor which will let most editors and commenters contribute text without encountering source markup... ...further reducing the need of advanced editors to work with markup directly... ...by integrating well with a rich text editor, we reduce the dependence of editors and commenters on dealing with low-level wikitext... ..."oh there's that funky apostrophe thing in this old-style page". Most editors will never need to encounter it...

Such plans seem very scary to me, as I think the PLAIN-TEXT is one of the MOST IMPORTANT features of Wiki software! And you basically say you want to move away from it and turn MediaWiki to another Word, having all problems of "WYSIWYdnG" (...Is What Wou don't Get) editors. I don't think I need to explain the advantages of plain-text markup to core developers of MediaWiki... :)

I've patched the parser code slightly (in mediawiki4intranet) and I understand it's not perfect, so I support any effort for creating a new parser, but if it involves moving away from markup in the future fully...

...

Wikitext-l mailing list Wikitext-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitext-l

Trevor Parscal

9:20 a.m.

I'm glad to see this discussion getting started. The points raised in this thread so far are well stated, and this is an important issue to talk openly about at a core level.

I'm the lead developer of the visual editor, so I've been involved in this space for some time now. I've heard the following argument many times before - and I would like to respond to it in particular:

On Sun, Feb 5, 2012 at 10:58 PM, Pavel Tkachenko proger.xp@gmail.comwrote:

...

First of all, if editors can't master even the simplest wiki markup (without templates and the C/HTML mixture) than what good he's for as an editor? Basic Wikipedia markup contains a dozen of tokens, if not less. On the contrary, those who've mastered the basics have passed the first and unobtrusive"editorial filter".

I think it fails to take into account a few things:

1. *It's getting harder to edit* - The relative difficulty of contributing to Wikipedia compared to other ways to publish content online has changed over time because other sites have made usability improvements 2. *We've done our research* - There are many ways to improve usability, the decision to move towards a visual editor was based on usability research conducted in a lab and through remote user testing 3. *The status quo is failing us* - We know that fewer people are editing Wikipedia than they used to, and reversing this trend has become a priority for the Wikimedia Foundation

I would also like to offer some perspective from "the visual editor team".

- We understand that if we make editing easier without also making reviewing easier, the Wiki will become littered with unreviewed edits and the backlog will strain the capacity of the community - We view this as unacceptable - There are other projects addressing the usability and scalability of review - We view these as critical - Our goal is to increase the number of people who are able to contribute and to improve the productivity of people who already contribute - This must not be at the expense of contribution quality

I hope this effectively illustrates an alternative perspective on this subject. This is a very hard problem, and any course of action will involve some level of risk. We are trying our best to manage this risk, mostly by conducting a great deal of research and development.

- Trevor

Pavel Tkachenko

11:03 a.m.

2012/2/6 Trevor Parscal tparscal@wikimedia.org:

...

I hope this effectively illustrates an alternative perspective on this subject. This is a very hard problem, and any course of action will involve some level of risk. We are trying our best to manage this risk, mostly by conducting a great deal of research and development.

This is understandable, the scale of problem is as such that it is inappropriate to take any action in one instant. I cannot say that your arguments have given me much to think about due to lack of specifics but I'm glad if they're more elaborated on the core level.

Still, I cannot see why it's impossible to first create a solid foundation without drastically changing everything and then build visual editor and other high-level structures on top of it. It's more than possible to have both plain text and visual editors in one place if it's proven that they suit different groups of users. Reworking present text editor and wiki syntax cannot be compared with writing rich editor with all MediaWiki features from scratch and it might even get some editors back. I have a hunch current WMF team can handle this in one month or less. IMO rethinking wiki syntax is still a must and if not done prior to high-level constructions it'll hit the devs (and users) hard in the end.

But I have already stated my arguments, no point in repeating them again.

Signed, P. Tkachenko

Gabriel Wicke

12:36 p.m.

On 02/06/2012 11:03 AM, Pavel Tkachenko wrote:

...

2012/2/6 Trevor Parscal tparscal@wikimedia.org:

...
I hope this effectively illustrates an alternative perspective on this subject. This is a very hard problem, and any course of action will involve some level of risk. We are trying our best to manage this risk, mostly by conducting a great deal of research and development.

This is understandable, the scale of problem is as such that it is inappropriate to take any action in one instant. I cannot say that your arguments have given me much to think about due to lack of specifics but I'm glad if they're more elaborated on the core level.

Still, I cannot see why it's impossible to first create a solid foundation without drastically changing everything and then build visual editor and other high-level structures on top of it.

The parser we are working on [1] should eventually give us the solid foundation you are lobbying for. It is strongly motivated by, but not technically tied to the visual editor. The enriched HTML DOM we are building (and actually most token stream processing including template expansion) is not tied to any specific syntax or user interface.

We do currently add some round-trip information to support a very gradual normalization of WikiText formatting without non-localized dirty diffs. Anybody wishing to experiment with an alternate markup UI could however ignore this issue initially, which should simplify the task a lot. Alternate markup-based user interfaces would require a different serializer and tokenizer, but can share the remaining parser pipeline, so this simplified task should indeed be quite manageable.

But in any case, we first have to implement a solid tokenizer for the current syntax and recreate at least a part of the higher-level functionality (template expansion etc) based on a syntax-independent representation.

Gabriel

[1]: https://www.mediawiki.org/wiki/Future/Parser_development

Yury Tarasievich

4:45 p.m.

I'm taking liberty to summarize Trevor Parscal's points (as issued on 02/06/2012 10:20 AM) as follows:

Wikipedia is losing editors, and is looking less competitive (?). The visual editor is supposed to (help) remedy that.

If true, this is already diverting us completely from the matters of *wikitext*.

To address the (sub)points themselves, the 3 points Trevor makes are irrefutable, -- however, in a very abstract way. Yes, every visual tool is, by definition, more enabling than its lower-level counterpart.

But the further "perspective from "the visual editor team" just begs for pointing out that the lower entry threshold just can't be expected to *produce* better participation levels, as the perspective seems to suggest. It might be expected just to *stimulate* those, indirectly.

All this, however, isn't addressing quite more significant *people* problems that, as I understand, have already cropped up in Wikipedia. And I won't even start on expertise-recognition and article-baby-sitting issues.

The overall goal of the project might be mentioned, too. Over the time, the wikipedians *will* eventually run out of valid topics to write about. For major subprojects this is already in sight.

All said, what are we, in fact, discussing? I get an impression everything, down to the implementation, is already decided? I wouldn't say no to some sort of visual editor, which would work *only* with the document structure, disregarding the presentation completely, but is it even "on a table"?

Yury

David Gerard

3:57 p.m.

On 6 February 2012 15:45, Yury Tarasievich yury.tarasievich@gmail.com wrote:

...

All said, what are we, in fact, discussing? I get an impression everything, down to the implementation, is already decided?

You're coming in about five years into the discussion. So yes, quite a lot has already been worked on, extremely hard.

Wikitext as it exists is an indefensible mess. To the question "Would we invent this if it didn't exist?" the only sane answer is "HELL NO."

And treating wikitext as a hazing ritual to filter out insufficiently geeky editors is contemptible.

- d/.

Yury Tarasievich

6:29 p.m.

On 02/06/2012 04:57 PM, David Gerard wrote:

...

On 6 February 2012 15:45, Yury Tarasievichyury.tarasievich@gmail.com wrote:

...
All said, what are we, in fact, discussing? I get an impression everything, down to the implementation, is already decided?

You're coming in about five years into the discussion. So yes, quite a lot has already been worked on, extremely hard.

I guessed as much.

I might note, of course, that the "annual complete rewrites of the parser" have become somewhat of a fixture in this list, and that it's rather difficult to understand what's really going on from the list alone.

Anyway, five years or none, all this looks like an answer to a problem unrelated to stated. Which was, in fact, my point.

...

Wikitext as it exists is an indefensible mess. To the question "Would we invent this if it didn't exist?" the only sane answer is "HELL NO."

And treating wikitext as a hazing ritual to filter out insufficiently geeky editors is contemptible.

That wasn't my point, of course.

As to the "mess" you mention, it quite naturally originates in picking one of the OSS packages adequately performing in a specific role, and confronting it with an ever-expanding list of requirements, with one flagship project in mind. So, at some point things "break", or, actually, become unsufferably annoying to maintainers, but is this really related to participation figures? What about unavoidable change reliability? Well, what about wikitext operating skills acquired by the already existing wikipedians?

Sorry, if digesting this was a total waste of your time.

Yury

Mihály Héder

11:04 a.m.

Hello,

...

First of all, if editors can't master even the simplest wiki markup (without templates and the C/HTML mixture) than what good he's for as an editor?

That is about the worst and most harmful topos ever invented by the Wikipedia community. I am sorry I reply this to you - I have heard this line about a dozen times from others and now my frustration became too big to not answer :(

Wikitext is not a proper IQ test - it is an "are you a geek like the rest of us?" test. And it is not really good for that either. I remember the first time I edited a MediaWiki page. Among other languages I already mastered Latex and sed by that time. Latex wasn't easy to learn but I knew how powerful it is when I finally get it. sed was even worse in some respects but finally it saved me 100s of hours of work during the years. And so there were wikitext, with its quite funny and not so consistent syntax just for producing simplified and uniform html pages. At this time I could not feel the vast expression power of plain text+markup in my hands. I think when people say stuff like " WYSIWYG is known to be much less productive than WYTIWYG" they are actually referring to stuff like latex or html source editing and they claim the same power for wikitext, but that power is not there. Having operated a lot of MediaWiki instances for years, having attempted to implement an alternate editor at least 3 times (why I'm on this list) and also doing research on the Wikipedia corpus I really grown to respect the whole project and especially every editor who carried out this huge achievement. But I still could not extend my respect to wikitext.

...

Basic Wikipedia markup contains a dozen of tokens, if not less. On the contrary, those who've mastered the basics have passed the first and unobtrusive"editorial filter".

A classical freshmen commencement procedure: awkward things to do, some humiliation, and bullies. I understand that wikitext became an important part of this culture. But I also think that the knowledge of wikitext should not be a status symbol of our group membership unless we want to appear as a sect or something like that.

...

WYSIWYG editor is going to level everyone and give more chances for inappropriate edits. Come on, if someone has no free 5 minutes to learn the basics once and for all than why he must be allowed to edit pages in the first place?

It is the same flawed argument again. We have a serious quality management problem, inappropriate edits. You assume that it is best solved by using a somewhat cryptic language which will tell us apart the good from the bad. But I don't think we know what we are measuring with our wikitext-test at all. Ethical merits? Surely not. Good or bad intentions? Surely not. Knowledge about the article one wants to edit? Surely not. Devotion to make an edit? Probably. Markup skills? Probably. Tolerance for outdated interfaces? I say this, too. As a result: 1) We know for sure that despite the wikitext barrier we have lots of false negatives, that is inappropriate edits or vandalism. That is because we can only measure the devotion to edit wiki and not the intentions or wisdom. 2) We have no clue how much the false positive is - knowledgeable people with good intentions, who fail on markup skills or initial devotion; the ones who go away when seeing wikitext. Or the ones who inadvertedly mess up a page's markup the first time they edit and getting reverted by an admin, etc.

Ah, well sorry for the harsh sentences, peace

Mihály Héder

...

And you'll need to have more than 5 minutes to learn WP edit rules anyway - and no WYSIWYG will save you from this unless WYSIWYG is needed to boost the statistics of 'EPD'.

WYSIWYG is known to be much less productive than "WYTIWYG" (what you think is what you get - plain text) when writing more or less long texts, is more robust (doesn't need any modern browser with contenteditable divs which in turn needs less CPU power, etc.), less error-prone (no chances to get markup that isn't visible - in plain text there are no invisible markup whatsoever; this includes semi-hidden markup like URLs which are hidden under link caption), over-9000 technical benefits (no need to reinvent/reimplement Unicode, can be edited even without a web browser in any text editor on any platform and any device, etc.). And I'm sure anyone can come up with a few more points.

I have a hunch that all of this was already discussed long time ago before taking such a critical decision in changing 'wikitext' to 'wikiword' - and given amount of work done I doubt this can be changed now. However, --hope dies last--, I can't just keep silent all the time, especially if there seem to be people thinking my way too.

A. Aharoni in his recent "reimplementation of editing functions in the Visual Editor" thread has outlined just a subset of technical difficulties and inconveniences for its first non-English users regarding rich text editing.

Y. Tarasievich has just rightly said that current MediaWiki markup is HTML by another name. IMO this is the main problem: not inconvenient editing means but inconvenient markup syntax itself.

I think the markup initially was planned for English articles. Moreover, it probably wasn't foreseen to have become so complex, with #directives and <extra markup>. This is perfectly understandable and no one is to be blamed. However, now may be the time to refactor old problems into new strengths.

Making markup language-neutral is easy enough: even a single person can carry on the research to find keyboard symbols that are easily accessible across different language standards. From my experience they are ! % * ( ) - = _ and +. This will eliminate the need to layout switches (for example, currently a Russian Wikipedia editor must switch layout 5 times in typing simple "underline" since neither < nor > are present in Russian layout; the same goes for links: "[[link]]" and "[[link|caption]]" - pipe is also not present here).

My study indicates that the number of available symbols will allow to avoid HTML-style tags completely - this will further simplify the markup. For instance, instead of "__" can be used; <ref> can be replaced by "[[*ref]]" for uniformity with links; and so on. I am ready to give expanded explanation if anyone is interested.

Special tokens like #REDIRECT, {{-}}, <imagemap>, __TOC__, etc. that all use different syntaxes can be uniformized in a way similar to template insertions: {{redir New page}}, {{clear}}, {{imagemap image.png, title x y, ...}}, {{TOC}} and so on. Templates can be called as {{tpl template arg arg arg}} - even if we keep { and } that require layout switch in some languages we eliminate the pipe which just makes things worse and text - less readable.

To sum it up plain text editing might be an issue but not the main one. Mind you, Wikipedia has gained 10.5 mln articles with 347 mln edits with its current (not very convenient to be blunt) text editor. If that was a big issue Wikipedia wouldn't be what it is today.

Old markup can still be entirely supported. I'm sure the team understands that building a visual editor on top of current markup is not the best way to 'fix' it as it will mean headaches for the developers. But it's not necessary: new parser system can incorporate both markups and use old compatibility layer to not only parse old markup into new document tree but even transparently convert it into new markup. Users won't even notice that something has changed after upgrading their MediaWiki - not until they try to edit a page and find that its markup has been seamlessly transformed.

A top-notch WYSIWYG editor most certainly would not hurt, especially if it's built on top of good document tree. But in my opinion low-level markup is primary to the rich editor and is extremely easy to implement, at the same time maintaining maximum platform compatibility. Even a single person given enough motivation and a sane amount of time can build the entire system - parser/serializer/editor. This is hardly true for ambitious visual editor with integrated Unicode/multilanguage support that is currently underway by the Wikitext team.

Signed, P. Tkachenko

2012/2/6 vitalif@yourcmc.ru:

...
Hi wikitext-l!

I've read http://www.mediawiki.org/wiki/Future/Parser_plan recently, and the plans seemed strange and scary to me. In several places, there is the following stuff said: ...rich text editor which will let most editors and commenters contribute text without encountering source markup... ...further reducing the need of advanced editors to work with markup directly... ...by integrating well with a rich text editor, we reduce the dependence of editors and commenters on dealing with low-level wikitext... ..."oh there's that funky apostrophe thing in this old-style page". Most editors will never need to encounter it...

Such plans seem very scary to me, as I think the PLAIN-TEXT is one of the MOST IMPORTANT features of Wiki software! And you basically say you want to move away from it and turn MediaWiki to another Word, having all problems of "WYSIWYdnG" (...Is What Wou don't Get) editors. I don't think I need to explain the advantages of plain-text markup to core developers of MediaWiki... :)

I've patched the parser code slightly (in mediawiki4intranet) and I understand it's not perfect, so I support any effort for creating a new parser, but if it involves moving away from markup in the future fully...

...

Wikitext-l mailing list Wikitext-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitext-l

Wikitext-l mailing list Wikitext-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitext-l

Pavel Tkachenko

12:19 p.m.

Mihaly Heder,

It's better to say harsh truth than soft lie. No offense.

In your speech about wikitext you were clearly keeping *current* markup in mind. I'm sure about this because you mentioned LaTeX which also is not a perfect text markup language. Wikitext is not a proper IQ test indeed, it's test for geekness at best in its current form.

But what kind of IQ are we talking about when quoting paragraphs in maillists? Are those ">"s that hard to comprehend? Or are *bold*, /italic/ and _underline_ are much harder to remember than [b], and \stuff?

Let me get this straight: there has no ideal, or even good, plain text editing language created to date, not one that is widely used and known to me. And the overwhelming majority of them are Latin-centric.

...

A classical freshmen commencement procedure: awkward things to do, some humiliation, and bullies. I understand that wikitext became an important part of this culture. But I also think that the knowledge of wikitext should not be a status symbol of our group membership unless we want to appear as a sect or something like that.

I agree with you completely: "b/w terminal does not pardon some cryptic Unix". Content must be presented and edited in clear form. I am not a big fan of wikitext myself, probably the opposite and I'm all for changing it for the best. And in my opinion visual editor isn't a solution but a concealment.

...

You assume that it is best solved by using a somewhat cryptic language which will tell us apart the good from the bad.

Absolutely not. It seems I've been misunderstood here :( I'm not voting for keeping current wikitext, I'm voting to rework it.

...

Knowledge about the article one wants to edit? Surely not. Devotion to make an edit? Probably. Markup skills? Probably. Tolerance for outdated interfaces? I say this, too.

Can you repeat all of this if someone is reluctant to read Wikipedia editing/copyright guidelines? Why in your opinion editing with plain yet intuitive markup is different from rich editor?

I would love to give practical examples if someone has outlined specific causes of "devoted and intelligent people going away when seeing wikitext". For now, I can invent my own - templates of infoboxes that appear on top of most articles can scare anyone out of his senses.

But let's be blunt and say that markup is about text MARKUP, not presentation. Not to the extent that drives such text to incomprehensibleness. If we accept the thesis that markup must be human-readable and everything else MUST be handled by the machine no matter how "complex" this might become for it we can achieve some interesting results. For example, if I'm a "newbie but intelligent" Wikipedian and open the page on "The Earth" I see this:

{{About|the planet}} {{pp-semi|small=yes}}{{pp-move-indef}} {{Infobox Planet | bgcolour = #c0c0ff | name = Earth | symbol = [[File:Earth symbol.svg|25px|Astronomical symbol of Earth]] ... }}

This is how I understand what human-readable markup means: 1. Okay, I understand that "{{" and "}}" are some special symbols - but why pipe is between what seem to be words and in the start of some new lines? 2. What is "pp-semi" why it "move-indef"? 3. Why bgcolour is there? It's presentation. Is any editor going to change background color of the infobox, ever? Perhaps it has a different tone and must be #c0c1ff?

These are just 3 basic questions. Processed, the above snippet might look like this:

{{About Earth, the planet}} {{Infobox symbol = [[File:Earth symbol.svg|25px|Astronomical symbol of Earth]] ... }}

That's it. Now the machine's part: 1. "About" is a special "template" or some other construct. When it's "ran" (processed) it accepts 2 colon-separated "arguments". The first specifies "name" of something (place, planet, object, etc.), the second - its "role". Moreover, even these are article-specific and can be changed. The point is that this line remains equally understandable regardless of the article type. 2. "{{pp ... indef}}" thing has disappeared. If that's something system it must go in system properties of the page, not be visible to anyone - what's the point in seeing it if the page is protected? And it's often protected to prevent changes to such "system" entries anyway. Vicious circle. 3. "Infobox Planet" has transformed into just "Infobox" - we've got "planet" defined in "About". 4. "bgcolor" is system presentation-specific thing, no place for it in the contents. 4. "name = Earth" - we've got this along with "Planet".

We have just reduced the source almost by 50% of lines.

Another example on the same page:

| temperatures = yes | temp_name1 = [[Kelvin]] | min_temp_1 = 184 K<ref name=asu_lowest_temp/> | mean_temp_1 = 287.2 K<ref name=kinver20091210/> | max_temp_1 = 331 K<ref name=asu_highest_temp/> | temp_name2 = [[Celsius]] | min_temp_2 = ?89.2 °C | mean_temp_2 = 14 °C | max_temp_2 = 57.8 °C ... <ref name=asu_lowest_temp>{{cite web|url=...|work=...|publisher=...|accessdate=2010-08-07}}</ref> ...

What kind of false positives are we talking about? Will any sane individual spend his precious time not editing but preparing to edit this mess?

Again, not to offend anyone and least - MediaWiki devs, but if we're talking about wikitext future the above must look much more plain: temperatures = in Kelvin: 184, 287.2, 331 temperatures min = {{Cite from web: url ..., work ..., publisher ..., access date 2010-08-07}}

5 times less "code" and still perfectly handled by the machine and - what's important too - more managable with more features: 1. "temperatures = yes" - obviously, if any temperatures are specified in the infobox then the temperature block is enabled. Is it possible otherwise? 2. No Celsius temperatures - the machine does better job converting values than human plus calculator. 3. No link in "[[Kelvin]]" - the machine can place link itself, can't it? 4. No   because it just fixes some engine problem and is problem for its devs, not editors or, mind you, users. 5. Replaced all <ref>s with another "template argument" named "temperatures min". This works simple: the machine calculates minimal value and applies given reference to it, if corresponding argument is passed. If no - no reference. 6. No degree symbols: it's cleaner and users don't have to search for the special char (°).

And at least one added benefit: Should Wikipedia core group decide that there are not just Celsius and Kelvin but some different temperature values it can be added without changing ANY article source as long as conversion between Kelvin or Celsius and the new value can be done. Moreover, since K/C is managed by the machine each Wikipedia visitor can customize what he wants to see or their order (for instance, Celsius is his native and he wants to see it first).

---

The above are just small examples not involving large markup changes. I am ready to give more detailed reviews.

Try to imagine what kind of syntax we will get if we rework it from scratch. Personally for years I have been using a home-made markup that I'm using on some of my resources as a replacement for wiki, BB-codes and HTML and from my experience it's possible to create an international and intuitive markup suitable for virtually anyone using a computer. And added a simple plain-text editor like wikEd it will combine powers of both text and rich editors.

Especially if a group like WMF undertakes this task.

Signed, P. Tkachenko

Sumana Harihareswara

7:15 p.m.

On 02/06/2012 06:19 AM, Pavel Tkachenko wrote:

...

Mihaly Heder,

It's better to say harsh truth than soft lie. No offense.

In your speech about wikitext you were clearly keeping *current* markup in mind. I'm sure about this because you mentioned LaTeX which also is not a perfect text markup language. Wikitext is not a proper IQ test indeed, it's test for geekness at best in its current form.

But what kind of IQ are we talking about when quoting paragraphs in maillists? Are those ">"s that hard to comprehend? Or are *bold*, /italic/ and _underline_ are much harder to remember than [b], and \stuff?

Let me get this straight: there has no ideal, or even good, plain text editing language created to date, not one that is widely used and known to me. And the overwhelming majority of them are Latin-centric.

...
A classical freshmen commencement procedure: awkward things to do, some humiliation, and bullies. I understand that wikitext became an important part of this culture. But I also think that the knowledge of wikitext should not be a status symbol of our group membership unless we want to appear as a sect or something like that.

I agree with you completely: "b/w terminal does not pardon some cryptic Unix". Content must be presented and edited in clear form. I am not a big fan of wikitext myself, probably the opposite and I'm all for changing it for the best. And in my opinion visual editor isn't a solution but a concealment.

...
You assume that it is best solved by using a somewhat cryptic language which will tell us apart the good from the bad.

Absolutely not. It seems I've been misunderstood here :( I'm not voting for keeping current wikitext, I'm voting to rework it.

...
Knowledge about the article one wants to edit? Surely not. Devotion to make an edit? Probably. Markup skills? Probably. Tolerance for outdated interfaces? I say this, too.

Can you repeat all of this if someone is reluctant to read Wikipedia editing/copyright guidelines? Why in your opinion editing with plain yet intuitive markup is different from rich editor?

I would love to give practical examples if someone has outlined specific causes of "devoted and intelligent people going away when seeing wikitext". For now, I can invent my own - templates of infoboxes that appear on top of most articles can scare anyone out of his senses.

But let's be blunt and say that markup is about text MARKUP, not presentation. Not to the extent that drives such text to incomprehensibleness. If we accept the thesis that markup must be human-readable and everything else MUST be handled by the machine no matter how "complex" this might become for it we can achieve some interesting results. For example, if I'm a "newbie but intelligent" Wikipedian and open the page on "The Earth" I see this:

{{About|the planet}} {{pp-semi|small=yes}}{{pp-move-indef}} {{Infobox Planet | bgcolour = #c0c0ff | name = Earth | symbol = [[File:Earth symbol.svg|25px|Astronomical symbol of Earth]] ... }}

This is how I understand what human-readable markup means:

Okay, I understand that "{{" and "}}" are some special symbols -

but why pipe is between what seem to be words and in the start of some new lines? 2. What is "pp-semi" why it "move-indef"? 3. Why bgcolour is there? It's presentation. Is any editor going to change background color of the infobox, ever? Perhaps it has a different tone and must be #c0c1ff?

These are just 3 basic questions. Processed, the above snippet might look like this:

{{About Earth, the planet}} {{Infobox symbol = [[File:Earth symbol.svg|25px|Astronomical symbol of Earth]] ... }}

That's it. Now the machine's part:

"About" is a special "template" or some other construct. When it's

"ran" (processed) it accepts 2 colon-separated "arguments". The first specifies "name" of something (place, planet, object, etc.), the second - its "role". Moreover, even these are article-specific and can be changed. The point is that this line remains equally understandable regardless of the article type. 2. "{{pp ... indef}}" thing has disappeared. If that's something system it must go in system properties of the page, not be visible to anyone - what's the point in seeing it if the page is protected? And it's often protected to prevent changes to such "system" entries anyway. Vicious circle. 3. "Infobox Planet" has transformed into just "Infobox" - we've got "planet" defined in "About". 4. "bgcolor" is system presentation-specific thing, no place for it in the contents. 4. "name = Earth" - we've got this along with "Planet".

We have just reduced the source almost by 50% of lines.

Another example on the same page:

| temperatures = yes | temp_name1 = [[Kelvin]] | min_temp_1 = 184 K<ref name=asu_lowest_temp/> | mean_temp_1 = 287.2 K<ref name=kinver20091210/> | max_temp_1 = 331 K<ref name=asu_highest_temp/> | temp_name2 = [[Celsius]] | min_temp_2 = ?89.2 °C | mean_temp_2 = 14 °C | max_temp_2 = 57.8 °C ... <ref name=asu_lowest_temp>{{cite web|url=...|work=...|publisher=...|accessdate=2010-08-07}}</ref> ...

What kind of false positives are we talking about? Will any sane individual spend his precious time not editing but preparing to edit this mess?

Again, not to offend anyone and least - MediaWiki devs, but if we're talking about wikitext future the above must look much more plain: temperatures = in Kelvin: 184, 287.2, 331 temperatures min = {{Cite from web: url ..., work ..., publisher ..., access date 2010-08-07}}

5 times less "code" and still perfectly handled by the machine and - what's important too - more managable with more features:

"temperatures = yes" - obviously, if any temperatures are specified

in the infobox then the temperature block is enabled. Is it possible otherwise? 2. No Celsius temperatures - the machine does better job converting values than human plus calculator. 3. No link in "[[Kelvin]]" - the machine can place link itself, can't it? 4. No   because it just fixes some engine problem and is problem for its devs, not editors or, mind you, users. 5. Replaced all <ref>s with another "template argument" named "temperatures min". This works simple: the machine calculates minimal value and applies given reference to it, if corresponding argument is passed. If no - no reference. 6. No degree symbols: it's cleaner and users don't have to search for the special char (°).

And at least one added benefit: Should Wikipedia core group decide that there are not just Celsius and Kelvin but some different temperature values it can be added without changing ANY article source as long as conversion between Kelvin or Celsius and the new value can be done. Moreover, since K/C is managed by the machine each Wikipedia visitor can customize what he wants to see or their order (for instance, Celsius is his native and he wants to see it first).

The above are just small examples not involving large markup changes. I am ready to give more detailed reviews.

Try to imagine what kind of syntax we will get if we rework it from scratch. Personally for years I have been using a home-made markup that I'm using on some of my resources as a replacement for wiki, BB-codes and HTML and from my experience it's possible to create an international and intuitive markup suitable for virtually anyone using a computer. And added a simple plain-text editor like wikEd it will combine powers of both text and rich editors.

Especially if a group like WMF undertakes this task.

Signed, P. Tkachenko

Wikitext-l mailing list Wikitext-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitext-l

I am forwarding below a response from Oliver Keyes, who isn't on this list.

-- Sumana Harihareswara Volunteer Development Coordinator Wikimedia Foundation Hey guys Sumana asked me to chip in; most of the arguments that can be made have already been typed up by people like Trevor, but I thought I'd go into a bit more detail and provide some links for those of you who want to do slightly deeper reading. I'm not commenting on the pros or cons of redoing the underlying wikimarkup; that's a technical issue, and I'm not a technical person. What I *am* is a community engagement person - and Pavel's line that intelligent people can parse markup languages is pretty well within my bailiwick :). The problem with this line is that it has the potential to turn into a "true scotsman" argument. Pavel, you're clearly both an intelligent and a technical man - but not all intelligence is of the same, technically-minded type, and it's not always backed up by pertinent and complex knowledge. I'm sure that you, were you a new editor, would be able to quickly parse our syntax inside your head. However, you're someone who is technically proficient and knows a lot of the background to markup languages, and most people - indeed, most *intelligent* people - simply aren't. It wasn't always the case. Early and mid-term adopters of the internet (I count myself as the latter, having first got online circa 1999) were technically proficient, could probably code, and would certainly be able to deal with not only our markup languages but markup languages generally. This isn't necessarily because they were more intelligent than anyone else, though; this is because the structure of the internet at that time penalised anyone who *wasn't* technical; websites and communications methods expected a degree of technical proficiency. Today that isn't the case. Site after site after site have realised that instituting technical barriers to participation artificially limits your audience and volunteers, and have introduced WYSIWYG editors in some way, shape or form. The result is that the generation of intelligent people we're dealing with now is not the generation of early and mid-level adopters we all know, love and are members of; it's the Facebook generation: people who have come to expect that the barriers to participating will be low, easy to comprehend, and simple to bypass. And because they've come to expect this, and the internet has indulged this, they don't necessarily have the technical knowledge or background to parse markup languages in the same way that members of this list might. Of course, it's a mistake to think that just because someone is young they won't be technical - we have a lot of great, technically minded volunteers. Similarly, it's a mistake to think that just because someone is older they will be. For some cases-in-point, I recommend the usability studies the Foundation ran a couple of years ago - there are some great examples at http://usability.wikimedia.org/wiki/Usability,_Experience,_and_Evaluation_St... http://usability.wikimedia.org/wiki/Usability_and_Experience_Study#Wiki_Synt... The simple fact of the matter is this: editing is complex and technical and we are not, as experienced people, necessarily qualified to say what the general population can or cannot do, because *we are not the general population*. The people qualified to tell us what gen pop feels comfortable doing and what gen pop expects of websites are, well, gen pop. And they've spoken, through the usability initiative and just about every conversation I've had with a reader, and, I'm sure, a heck-load of conversations other contractors and staffers have had too. The complexity of our existing markup language is a barrier, but not as much as the presence of any markup language whatsoever as a default. I appreciate this is a bit TL:DR, and as I'm not really subscribed to this list I'm unlikely to see responses unless Sumana is kind enough to act as my gopher ;). If you want to chat more about the philosophical and cultural underpinnings of usability rather than the technical, I'm always up for a natter; okeyes@wikimedia.org

Amgine

8:09 p.m.

On 12-02-06 10:15 AM, Sumana Harihareswara wrote:

...

I am forwarding below a response from Oliver Keyes, who isn't on this list.

Hey guys Sumana asked me to chip in; most of the arguments that can be made have already been typed up by people like Trevor, but I thought I'd go into a bit more detail and provide some links for those of you who want to do slightly deeper reading.

I'm not commenting on the pros or cons of redoing the underlying wikimarkup; that's a technical issue, and I'm not a technical person. What I *am* is a community engagement person - and Pavel's line that intelligent people can parse markup languages is pretty well within my bailiwick .

The problem with this line is that it has the potential to turn into a "true scotsman" argument. Pavel, you're clearly both an intelligent and a technical man - but not all intelligence is of the same, technically-minded type, and it's not always backed up by pertinent and complex knowledge. I'm sure that you, were you a new editor, would be able to quickly parse our syntax inside your head. However, you're someone who is technically proficient and knows a lot of the background to markup languages, and most people - indeed, most *intelligent* people

simply aren't.

It wasn't always the case. Early and mid-term adopters of the internet (I count myself as the latter, having first got online circa 1999) were technically proficient, could probably code, and would certainly be able to deal with not only our markup languages but markup languages generally. This isn't necessarily because they were more intelligent than anyone else, though; this is because the structure of the internet at that time penalised anyone who *wasn't* technical; websites and communications methods expected a degree of technical proficiency.

Today that isn't the case. Site after site after site have realised that instituting technical barriers to participation artificially limits your audience and volunteers, and have introduced WYSIWYG editors in some way, shape or form. The result is that the generation of intelligent people we're dealing with now is not the generation of early and mid-level adopters we all know, love and are members of; it's the Facebook generation: people who have come to expect that the barriers to participating will be low, easy to comprehend, and simple to bypass. And because they've come to expect this, and the internet has indulged this, they don't necessarily have the technical knowledge or background to parse markup languages in the same way that members of this list might.

Of course, it's a mistake to think that just because someone is young they won't be technical - we have a lot of great, technically minded volunteers. Similarly, it's a mistake to think that just because someone is older they will be. For some cases-in-point, I recommend the usability studies the Foundation ran a couple of years ago - there are some great examples at http://usability.wikimedia.org/wiki/Usability,_Experience,_and_Evaluation_St... http://usability.wikimedia.org/wiki/Usability_and_Experience_Study#Wiki_Synt...

The simple fact of the matter is this: editing is complex and technical and we are not, as experienced people, necessarily qualified to say what the general population can or cannot do, because *we are not the general population*. The people qualified to tell us what gen pop feels comfortable doing and what gen pop expects of websites are, well, gen pop. And they've spoken, through the usability initiative and just about every conversation I've had with a reader, and, I'm sure, a heck-load of conversations other contractors and staffers have had too. The complexity of our existing markup language is a barrier, but not as much as the presence of any markup language whatsoever as a default.

I appreciate this is a bit TL:DR, and as I'm not really subscribed to this list I'm unlikely to see responses unless Sumana is kind enough to act as my gopher . If you want to chat more about the philosophical and cultural underpinnings of usability rather than the technical, I'm always up for a natter; okeyes@wikimedia.org

Oliver Keyes: I do not believe anyone is disputing your general arguments, above.

The concern I see being expressed, fundamentally, is "I have developed skills, practices, and efficiencies with current Wiki syntax. Is your new parser going to destroy my investments in learning? am I going to have to start over with this new system?"

As I understand it, for the foreseeable future there will be a raw wiki syntax interface available. I hope contributors can be reassured on this point.

Amgine

David Gerard

8:28 p.m.

On 6 February 2012 19:09, Amgine amgine@wikimedians.ca wrote:

...

The concern I see being expressed, fundamentally, is "I have developed skills, practices, and efficiencies with current Wiki syntax. Is your new parser going to destroy my investments in learning? am I going to have to start over with this new system?" As I understand it, for the foreseeable future there will be a raw wiki syntax interface available. I hope contributors can be reassured on this point.

I understand that is indeed the plan. A lot of the problems with a visual editor at all have been that wikitext is such a mess, but throwing it away is not possible.

- d.

Gabriel Wicke

8:39 p.m.

...

The concern I see being expressed, fundamentally, is "I have developed skills, practices, and efficiencies with current Wiki syntax. Is your new parser going to destroy my investments in learning? am I going to have to start over with this new system?"

As I understand it, for the foreseeable future there will be a raw wiki syntax interface available. I hope contributors can be reassured on this point.

We are trying to provide an additional, easy way to edit the WikiText of regular content pages. This should not interfere with diffs or otherwise mess up existing WikiText. See many previous posts in this list for technical detail on how we are trying to ensure this.

If WikiText was to be ever removed then there would need to be a very good way to handle all of templates, parser functions and so on. Visual programming does not seem to be as popular as some people expected it to be in the 80s, so I won't hold my breath for markup to disappear any time soon.

Gabriel

Trevor Parscal

8:44 p.m.

I will also confirm that we have no plan or intention to do any of the following:

- Disable markup (plain text) editing - Allow visual editing to mangle plain text documents - Make significant changes to what Wikitext is and can do

The reason the parser and visual editor projects are so hard is very much to do with these points.

If changes to the markup are at some point desirable, the work we are doing helps that greatly by giving non-markup-geeks a way to go about editing without being exposed to markup changes should they occur. The parser work, as Gabriel mentioned before, is also a critical part of evolving Wikitext (if that's ever desirable) because you could simply make changes to the serializer and parser and migrate a document automatically.

- Trevor

On Mon, Feb 6, 2012 at 11:39 AM, Gabriel Wicke wicke@wikidev.net wrote:

...

...
The concern I see being expressed, fundamentally, is "I have developed skills, practices, and efficiencies with current Wiki syntax. Is your new parser going to destroy my investments in learning? am I going to have to start over with this new system?"

As I understand it, for the foreseeable future there will be a raw wiki syntax interface available. I hope contributors can be reassured on this point.

We are trying to provide an additional, easy way to edit the WikiText of regular content pages. This should not interfere with diffs or otherwise mess up existing WikiText. See many previous posts in this list for technical detail on how we are trying to ensure this.

If WikiText was to be ever removed then there would need to be a very good way to handle all of templates, parser functions and so on. Visual programming does not seem to be as popular as some people expected it to be in the 80s, so I won't hold my breath for markup to disappear any time soon.

Gabriel

Wikitext-l mailing list Wikitext-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitext-l

Svip

9:43 p.m.

On 6 February 2012 20:44, Trevor Parscal tparscal@wikimedia.org wrote:

...

I will also confirm that we have no plan or intention to do any of the following:

Disable markup (plain text) editing

Allow visual editing to mangle plain text documents

Make significant changes to what Wikitext is and can do

Is it true that Lua is now the scripting language for templates?

Jay Ashworth

10:02 p.m.

----- Original Message -----

...

From: "Amgine" amgine@wikimedians.ca

...

Oliver Keyes: I do not believe anyone is disputing your general arguments, above.

The concern I see being expressed, fundamentally, is "I have developed skills, practices, and efficiencies with current Wiki syntax. Is your new parser going to destroy my investments in learning? am I going to have to start over with this new system?"

Correct, and it isn't merely investments in learning; there are likely investments in wrap-around-the-outside coding which assume access to markup as well. Not All Mediawikiae Are Wikipedia.

Our point, as I share Oliver's concerns, which I think you rephrase well, is that raw markup access is just as important -- if not moreso -- to a subset of users of Mediawiki which is likely smaller than the Wikipedia editor base, but still significant. (How significant, I don't have numbers for, and I'm not sure such numbers can actually be generated...)

Cheers, -- jra

-- Jay R. Ashworth Baylink jra@baylink.com Designer The Things I Think RFC 2100 Ashworth & Associates http://baylink.pitas.com 2000 Land Rover DII St Petersburg FL USA http://photo.imageinc.us +1 727 647 1274

David Gerard

10:09 p.m.

On 6 February 2012 21:02, Jay Ashworth jra@baylink.com wrote:

...

Correct, and it isn't merely investments in learning; there are likely investments in wrap-around-the-outside coding which assume access to markup as well. Not All Mediawikiae Are Wikipedia.

Your use of "likely" there turns out to be largely incorrect - one of the biggest problems with wikitext is that it's all but unparsable by machines other than the original parser routines in MediaWiki. That fact was one of the inspirations for this list existing at all: to come up with a definition of wikitext that could be used by machine parsers at all.

- d.

Jay Ashworth

10:12 p.m.

---- Original Message -----

...

From: "David Gerard" dgerard@gmail.com

...

On 6 February 2012 21:02, Jay Ashworth jra@baylink.com wrote:

...
Correct, and it isn't merely investments in learning; there are likely investments in wrap-around-the-outside coding which assume access to markup as well. Not All Mediawikiae Are Wikipedia.

Your use of "likely" there turns out to be largely incorrect - one of the biggest problems with wikitext is that it's all but unparsable by machines other than the original parser routines in MediaWiki. That fact was one of the inspirations for this list existing at all: to come up with a definition of wikitext that could be used by machine parsers at all.

I was around when wikitext-l forked; I know pretty much exactly how unparseable MWtext is. That doesn't preclude external code which *generates* MWtext for injection into wikis.

And in fact, IIRC, there are 4 or 5 parser replacements that are between 97 and 99% accurate. Not good enough for Wikipedia, but they'd certainly be good enough for nearly anything else...

Cheers, -- jra

vitalif＠yourcmc.ru

10 Feb 10 Feb

12:47 a.m.

Hi again!

Okay. I've read all the answers and want to make a small summary for myself (IMHO):

1) The MOST important: I think Wikipedia's markup complexity problems are not the problems of Wikitext idea. I think WP should concentrate on improving WP-specific markup (tons of various infoboxes and etc), maybe move some templates to extensions, instead of changing the wikitext totally or even worse - hiding it from the user.

2) I still beg MW developers to not throw the wikitext idea away! (maybe if it will prove being useless for ALL (100%) users one day - but I'm sure this day won't come). Visual editor itself won't harm anything, I will support it. But I'll say my users: if you want to make small changes, here is the visual editor. If you want to write something big, here is the plaintext.

3) I see there are usability talks. Improving usability is good, but targetting at people who don't want to be "technically proficient" (emphasis: DON'T WANT TO BE, not just "ARE NOT") and just want to press a magic button and "that's all" is not good. There are topics like "internal vs external links, users tend to find one and ignore other" - that's ridiculous, that's NOT the usability problem. I.e. I think we shouldn't think of "usability initiatives" as of 100% correct ideas.

George Herbert

1:24 a.m.

On Thu, Feb 9, 2012 at 3:47 PM, vitalif@yourcmc.ru wrote:

...

Improving usability is good, but targetting at people who don't want to be "technically proficient" (emphasis: DON'T WANT TO BE, not just "ARE NOT") and just want to press a magic button and "that's all" is not good.

I strongly disagree, as does community consensus on this point.

There are barriers we need to have between would-be editors and the content - such as basic competence in the language, lack of conflict of interest, willingness to be a participant in shared editing processes. Technically, the presence of a web browser is a legitimate barrier, but the "learn wikitext or else" one is bogus.

-- -george william herbert george.herbert@gmail.com

Yury Tarasievich

7:17 a.m.

On 02/10/2012 02:24 AM, George Herbert wrote:

...

On Thu, Feb 9, 2012 at 3:47 PM,vitalif@yourcmc.ru wrote:

...
Improving usability is good, but targetting at people who don't want to be "technically proficient" (emphasis: DON'T WANT TO BE, not just "ARE NOT") and just want to press a magic button and "that's all" is not good.

I strongly disagree, as does community consensus on this point.

There are barriers we need to have between would-be editors and the content - such as basic competence in the language, lack of conflict of interest, willingness to be a participant in shared editing processes. Technically, the presence of a web browser is a legitimate barrier, but the "learn wikitext or else" one is bogus.

I mostly stayed away from the discussion here, but I feel I have to say (again) that:

I believe that the "wikitext obscuration" will just break things, not helping the participation figures as expected. It might be fun to produce a visual editor etc., but this is solving just a completely different problem from "lowering the barriers" stated here many times.

The Wikipedia participation problem, especially at this stage of the project's lifecycle, is a social problem, not technical. Putting on a new frontend to the "old" wiki-engine will address that how? Let's imagine one day of wikipedia's life with visual editor magically added. What would change?

The main problem is wikipedia model of encyclopedical content creation itself has an inherent, irremovable flaws, if regarded from the point of view of expertise representation. All right, /it turns/, the wikipedia is there (and has some credibility problems, and some expertise problems), but the main expenditures in its creation were not on wikitext learning and using. The essential motivating trick/move was already there (...*anybody* can edit...), and it is exactly that trick/move which creates the meanest problems now, the problems which you guys try to remedy with business-style projects like "usability-friendly visual editing of wikitext".

Sorry if I didn't make myself clear.

Yury

Erik Moeller

7:26 a.m.

On Thu, Feb 9, 2012 at 10:17 PM, Yury Tarasievich yury.tarasievich@gmail.com wrote:

...

The Wikipedia participation problem, especially at this stage of the project's lifecycle, is a social problem, not technical.

It's both. The development of a Visual Editor is a necessary but not a sufficient change to broaden and diversify the editor population.

-- Erik Möller VP of Engineering and Product Development, Wikimedia Foundation Support Free Knowledge: https://wikimediafoundation.org/wiki/Donate

Helder

10:58 a.m.

On Fri, Feb 10, 2012 at 04:26, Erik Moeller erik@wikimedia.org wrote:

...

On Thu, Feb 9, 2012 at 10:17 PM, Yury Tarasievich yury.tarasievich@gmail.com wrote:

...
The Wikipedia participation problem, especially at this stage of the project's lifecycle, is a social problem, not technical.

It's both. The development of a Visual Editor is a necessary but not a sufficient change to broaden and diversify the editor population.

Besides, not all Wikipedias are at the same stage of its lifecycle, and not every wiki is Wikipedia.

Yury Tarasievich

12:26 p.m.

On 02/10/2012 08:26 AM, Erik Moeller wrote:

...

On Thu, Feb 9, 2012 at 10:17 PM, Yury Tarasievich yury.tarasievich@gmail.com wrote:

...
The Wikipedia participation problem, especially at this stage of the project's lifecycle, is a social problem, not technical.

It's both. The development of a Visual Editor is a necessary but not a sufficient change to broaden and diversify the editor population.

Okay, okay, so it's both, but in what proportion? The social kind still beats the technical.

Now, Helder tells us "not all Wikipedias are at the same stage of its lifecycle, and not every wiki is Wikipedia". Well, that's valid, but not quite relevant to the issue. E.g., there would be no MediaWiki development as we know it, if not for the English WP.

So, you are going to "break things" for the distant and rather doubtful gain. But do you indeed want "broad and diverse population of editors", so the social problems will, in fact, flare? And not, for starters, some kind of organisational "mechanism" targeting the content quality?

Like I said, I wouldn't say no to something visual, representing the content structure as as tree with collapsible sections. I don't know, something Texmacs-like? But not some sort of not-quite-Word.

Don't let that stop you. :)

Yury

David Gerard

11 Feb 11 Feb

10:15 p.m.

On 10 February 2012 11:26, Yury Tarasievich yury.tarasievich@gmail.com wrote:

...

So, you are going to "break things" for the distant and rather doubtful gain. But do you indeed want "broad and diverse population of editors", so the social problems will, in fact, flare?

Yes, because the editor numbers are dropping horribly.

You're again positing a horrible technical editing environment as a social filter to keep out people you don't like. This is, as I noted, contemptible.

...

And not, for starters, some kind of organisational "mechanism" targeting the content quality?

Experience suggests that in general, more eyes makes higher quality.

...

Like I said, I wouldn't say no to something visual, representing the content structure as as tree with collapsible sections. I don't know, something Texmacs-like? But not some sort of not-quite-Word.

You need to understand that there are smart people who are not geeks and that there are things to be smart in that are not technical. You don't seem able to comprehend this.

- d.

Yury Tarasievich

12 Feb 12 Feb

7:49 a.m.

On 02/11/2012 11:15 PM, David Gerard wrote:

...

On 10 February 2012 11:26, Yury Tarasievichyury.tarasievich@gmail.com wrote:

...
So, you are going to "break things" for the distant and rather doubtful gain. But do you indeed want "broad and diverse population of editors", so the social problems will, in fact, flare?

Yes, because the editor numbers are dropping horribly.

You're again positing a horrible technical editing environment as a social filter to keep out people you don't like. This is, as I noted, contemptible.

Civil as always. Still, you might want to take that back, as I'm not positing any such thing.

"Many times now" I pointed out that: 1) the visual solution, or lack of it, is not relevant for the problem of dropping numbers, and 2) the solution will affect numbers only *in*directly, if at all, while militant dilettantism, meta-wikyism and wiki-cliques, to say nothing of the "anybody can edit" taken to extremes, will, and directly, at that.

I don't think anybody can map out any significant number of would-be- or already-editors who quit "just because there were no visual tool". While concerns about matters I mentioned seem to be voiced quite regularly last years. I myself pulled out from English WP years ago, as the ratio of return to effort dropped to unreasonable numbers.

Back to topic, if smart people in visual editor team can put in a visual editor without breaking things, I'm all for it. I won't use it, like I don't use visual in MoinMoin, but it's no concern for me. However, realistically, things WILL break, and with small return or none, as sketched above, at that.

...

...
And not, for starters, some kind of organisational "mechanism" targeting the content quality?

Experience suggests that in general, more eyes makes higher quality.

You are misusing the quoted proposition.

In its classical context, in an environment with a clearly defined meterstick for objective and expertise, yes. And even so, "in general" -- right now, Uwe Ohse's rant comes to mind.

Yury

Mike Dupont

7:12 a.m.

On Sun, Feb 12, 2012 at 7:49 AM, Yury Tarasievich < yury.tarasievich@gmail.com> wrote:

...

I don't think anybody can map out any significant number of would-be- or already-editors who quit "just because there were no visual tool".

I have stopped editing because of harassment and wiki-stalking from http://en.wikipedia.org/wiki/User:WhiteWriter and no one stops him from doing the same to many other people. There are gangs of radicals that live on wikipedia and just make life miserable for others, and no visual editor will stop that.

mike

-- James Michael DuPont Member of Free Libre Open Source Software Kosova http://flossk.org

Pavel Tkachenko

12:31 p.m.

On Thu, 09 Feb 2012 09:32:43 -0500, Sumana Harihareswara sumanah@wikimedia.org wrote:

...

...
On 08.02.2012 2:52, Platonides wrote:

...
At the beginning, the interesection was huge even if ability to edit was low, just because there was a lot of knowledge missing. So as the knowledge increases (eg. linearly) "people" appear to be more and more stupid for editing.

Pavel, I asked Oliver Keyes, and he said that http://meta.wikimedia.org/wiki/Research:Newbie_reverts_and_article_length may be of interest. He's not on this list, so if you have thoughts about that, please cc him.

Thanks, Sumana and Oliver. This research seems relevant but its result doesn't seem to support or refute the abovementioned statement.

On Fri, 10 Feb 2012 13:26:31 +0200, Yury Tarasievich yury.tarasievich@gmail.com wrote:

...

So, you are going to "break things" for the distant and rather doubtful gain.

I'm getting a hunch WMF guys are just desperate to do anything about the falling statistics and since they can't think of means to change the social factor they chose the visual editor approach.

On Fri, 10 Feb 2012 17:20:51 +0100, Platonides platonides@gmail.com

...

The Visual Editor is the candy with which you try to engage them.

Editors should hardly be motivated to edit by the convenient editor alone. If it's so eveybody will add an extra comma or colon just to bite this "candy".

On Fri, 10 Feb 2012 11:38:23 -0500 (EST), Jay Ashworth jra@baylink.com

...

But after all the time the web design community has spent trying to get us to deal in semantics, rather than presentation, it seems pretty ironic to me that we see such a pressure to do precisely the opposite...

Are you talking about the markup or visual editor? It it's the latter I would agree but if it's the former then I'd say it's just the trouble of (most) existing text markups that they don't follow HTML4+ steps and continue connecting tokens with presentation instead of semantics.

On Sun, 12 Feb 2012 01:08:03 +0400, vitalif@yourcmc.ru wrote:

...

At long last, most of the internet users successfully use BBCode on the forums - why they can't use wikitext?..

I agree with vitalif (I believe I have mentioned this earlier) - I have even seen housewives using BB-codes when they really need to (in those flashy pinky thread headposts).

On Sun, 12 Feb 2012 01:16:51 +0400, vitalif@yourcmc.ru wrote:

...

...
I don't even remember if I have read a manual for any of my cellphones.

I'm sure not just because you ignore the rules, but because all modern cellphones have intuitive interface. Many users read manuals for their FIRST cellphone.

Right, visual editor must be a cellphone that requires no manual while ideal markup must requiring getting familiar with any kind of text markup - any most literate people already do because we all write ordered lists in similar way and emphasize our words by underlining them. Something like that.

On Sun, 12 Feb 2012 07:12:54 +0100, Mike Dupont jamesmikedupont@googlemail.com wrote:

...

I have stopped editing because of harassment and wiki-stalking from WhiteWriter and no one stops him from doing the same to many other people. There are gangs of radicals that live on wikipedia and just make life miserable for others, and no visual editor will stop that.

...which only seconds Yuri's social factor thought.

Still, Yuri, how do you oppose the WMF studies Oliver has presented earlier? About the factor of "any markup by default".

Signed, P. Tkachenko

Yury Tarasievich

3:33 p.m.

On 02/12/2012 01:31 PM, Pavel Tkachenko wrote:

...

...
Pavel, I asked Oliver Keyes, and he said that http://meta.wikimedia.org/wiki/Research:Newbie_reverts_and_article_length may be of interest. He's not on this list, so if you have thoughts about

...

On Sun, 12 Feb 2012 07:12:54 +0100, Mike Dupont jamesmikedupont@googlemail.com wrote:

...
I have stopped editing because of harassment and wiki-stalking from WhiteWriter and no one stops him from doing the same to many other people. There are gangs of radicals that live on wikipedia and just make life miserable for others, and no visual editor will stop that.

...which only seconds Yuri's social factor thought.

Still, Yuri, how do you oppose the WMF studies Oliver has presented earlier? About the factor of "any markup by default".

I do not oppose those studies at all, nor do I deny their integrity.

Just that the research isn't accounting for much of the important factors besides the plain volume of text. What would be those, and how those might be accounted for, I plainly don't know.

Just off the top of my head, no saying how it's useful: attempts to edit what long pages were more likely to be reverted? Were the long pages the same over the years? Did the contested pages actually grow? Who were the reverters?

Just like you say, one might form a distinct impression that the participation numbers indeed *are* falling. But will *enabling* the editing with a (rushed?) visual job actually *help*?

I'm totally not blocking the mouths of visual tool-hungry masses, but why must things be done in such fashion? There were those leisurely "new parser a year" years, and now, suddenly, there is a rush? A complex tool like that, operating on a content corpus which was created without it, is just bound to break things. Are there a projections for the breakage numbers and impact? I wouldn't say Wikipedia is fit for the UNESCO heritage (what about the conservation, eh?), but it is good for some uses, after all.

Now, why not follow the MoinMoin example, and construct an extension for loading the wiki-page into OpenOffice, benefiting from the fact that the return path is already well-covered? "Order of magnitude" more simple.

Yury

Oliver Keyes

7:55 p.m.

...

Now, why not follow the MoinMoin example, and construct an extension for loading the wiki-page into OpenOffice, benefiting from the fact that the return path is already well-covered? "Order of magnitude" more simple.

Yury

Yury, I couldn't have thought of a better way to illustrate the problems

with this conversation if I'd tried. You have my thanks. The problem with the participants here - those attempting to second-guess, particularly - is the repeated attempts to put yourselves into the shoes of new users, in ignorance of the fact that *none of us can adequately impersonate members of the general public*, and that basing how we approach lowering the barrier to participation on the principle that we can is dangerous at best. When you say something like "construct an extension for loading the wiki-page into OpenOffice" as a solution to the problem that one must be (1) relatively tech-savvy and (2) willing to jump through unnecessary hoops if you want to contribute, you highlight this; a custom extension is an awkward way of doing things. and an unnecessary hoop. A custom extension designed to work with a specific piece of software that is largely not used by the general population is similarly going to restrict who can participate, for precisely the same reasons that the existing markup is a restriction.

If your solution to "we need to avoid an overcomplicated interface mostly used by people with a primacy in tech" is to develop an extension that requires potential editors to have a specific piece of software people mostly don't use or install it, you are making a very good argument for why we should work on the research the usability initiative did with actual new editors, and not the ideas of anyone attempting to put themselves into the shoes of new editors. We're not new editors. We can't impersonate them - not adequately, and not for the purpose of somehow divining what it is they want. And we should stop pretending that we can.

-- Oliver Keyes Community Liaison, Product Development Wikimedia Foundation

David Gerard

8:03 p.m.

On 12 February 2012 18:55, Oliver Keyes okeyes@wikimedia.org wrote:

...

If your solution to "we need to avoid an overcomplicated interface mostly used by people with a primacy in tech" is to develop an extension that requires potential editors to have a specific piece of software people mostly don't use or install it, you are making a very good argument for why we should work on the research the usability initiative did with actual new editors, and not the ideas of anyone attempting to put themselves into the shoes of new editors. We're not new editors. We can't impersonate them - not adequately, and not for the purpose of somehow divining what it is they want. And we should stop pretending that we can.

By the way, I covered most of this thread about a year ago:

http://davidgerard.co.uk/notes/2011/01/04/what-you-see-is-for-the-win/

(This was posted just before the big WMF push for a visual editor.)

- d.

Oliver Keyes

8:04 p.m.

Indeed; do love your blog :). Thank you for reminding me of that post - I was about to spend an hour writing a frustrating post of my own about how people shouldn't base decisions on how *they* think a mass of other people would feel. This has precluded that ;p.

On 12 February 2012 19:03, David Gerard dgerard@gmail.com wrote:

...

On 12 February 2012 18:55, Oliver Keyes okeyes@wikimedia.org wrote:

...
If your solution to "we need to avoid an overcomplicated interface mostly used by people with a primacy in tech" is to develop an extension that requires potential editors to have a specific piece of software people mostly don't use or install it, you are making a very good argument for

why

...
we should work on the research the usability initiative did with actual

new

...
editors, and not the ideas of anyone attempting to put themselves into

the

...
shoes of new editors. We're not new editors. We can't impersonate them -

not

...
adequately, and not for the purpose of somehow divining what it is they want. And we should stop pretending that we can.

By the way, I covered most of this thread about a year ago:

http://davidgerard.co.uk/notes/2011/01/04/what-you-see-is-for-the-win/

(This was posted just before the big WMF push for a visual editor.)

d.

Wikitext-l mailing list Wikitext-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitext-l

-- Oliver Keyes Community Liaison, Product Development Wikimedia Foundation

David Gerard

8:09 p.m.

On 12 February 2012 19:04, Oliver Keyes okeyes@wikimedia.org wrote:

...

Indeed; do love your blog :). Thank you for reminding me of that post - I was about to spend an hour writing a frustrating post of my own about how people shouldn't base decisions on how *they* think a mass of other people would feel. This has precluded that ;p.

The actual phenomenon is called the Typical Mind Fallacy:

http://lesswrong.com/lw/dr/generalizing_from_one_example/ http://wiki.lesswrong.com/wiki/Typical_mind_fallacy

It's why nerd-designed interfaces are so LEGENDARILY AWFUL, and why engineers are reliably shocked when they see what actual (not theoretical) users do when put in front of their stuff.

Generalising from one example is better than generalising from no examples ... but not better than doing actual usability testing.

- d.

Yury Tarasievich

9:30 p.m.

Well, Oliver, and David, too, don't go congratulating on how you showed'em yet. That's bad taste, guys.

See, I'm not trying to "impersonate" or "second-guess" poor new editors, and I'm quite sure, from where I stand, that whatever some piece of blog says, I don't show "mind fallacy" in my reasoning because I just did not mention here another people's experience at all. Simple, eh?

Of course, I'm not native English speaker, so, obviously, I'm expressing myself unclearly and/or awkwardly.

So, one more (last) time:

1) I profess no "love" for the old interface, and 2) I do not compare merits of old and visual interfaces at all.

I say that 3) you won't get your indirectly expected results (participation) but you'll surely get unexpected (?) results (breakage) because 4) introducing new I/O component of that kind of complexity will surely break things in existing text corpus formed with the previous component, such as it was.

What, am I stating the obvious? Sorry, then.

Yury

Oliver Keyes

8:42 p.m.

...

So, one more (last) time:

I profess no "love" for the old interface, and

I do not compare merits of old and visual interfaces at all.

I say that 3) you won't get your indirectly expected results (participation) but you'll surely get unexpected (?) results (breakage) because 4) introducing new I/O component of that kind of complexity will surely break things in existing text corpus formed with the previous component, such as it was.

What, am I stating the obvious? Sorry, then.

I seem to be misunderstanding you, then, and for that I apologise :). I

think there is a risk of breakage (I'm not a technical person, but doing pretty much *anything* creates some risk of breakage) but I disagree that we're not going to get the expected results.

There are two problems with getting users to participate with Wikipedia (well, 3, but we won't get into "most people don't know they can edit"). The first that concerns us is the technical hurdles; the fact that markup is complex and most people don't understand it. The second is the social hurdles; the community is not, quite a bit of the time, a fun place to be - nor is it something that can be understood easily (we have policies out the wazoo, mostly in technical and internal terms). To "fix" Wikipedia's participation issues, we need to fix both problems.

The technical hurdles are at least partly fixed by the visual editor. This is a necessary component if we want to improve things - it's not, however, the only component. Even with the visual editor, getting people to stick around is going to be difficult because of the social issues (although the New Editor Engagement Project is working on ways to solve those, too). We might get more people, but that's only because we're exposing more people to the possibilities of editing: I doubt the ratio of success:failure in terms of long-term participation will dramatically alter. That's where the second element comes in; fixing social problems. Hopefully those efforts will succeed, and the result is we'll have an interface that's open to newbies, and a community that is as well. All this is going to take a long hard slog though.

Crucially, however, just as the technical improvements won't work without social improvements, the social improvements won't work without technical ones. We're not going to see a massive boost in numbers from exposing lots and lots of people to a culture marked "here be dragons", but similarly we're not going to see a massive boost in numbers if we don't expose them at all. We need to fix both the technical hurdles and the social ones - and that's precisely what we're doing. I don't think anyone is claiming that the visual editor will, in and of itself, be the solution to our woes. But it is an essential component of that solution.

-- Oliver Keyes Community Liaison, Product Development Wikimedia Foundation

Yury Tarasievich

10:22 p.m.

On 02/12/2012 09:42 PM, Oliver Keyes wrote:

...

I seem to be misunderstanding you, then, and for that I apologise :).

Why, thank you. :)

I think there is a risk of

...

breakage (I'm not a technical person, but doing pretty much *anything* creates some risk of breakage) but I disagree that we're not going to

Well, to put it less technically, you (WMF) have now on your hands one big text corpus in somewhat-natural language (meaning not-so-formalised), which is processed bottom-to-top only (for presentation).

You want to introduce an I/O tool which is complex UI component (an enterprise in itself), which will work with the corpus, but will do through an additional top-down path, at that. So, inevitably, more formalisation would have to exist on top level w/r to the bottom level. I didn't see the specifications, but this in its turn presupposes some "sanitisation" of bottom level each time un-sanitised page is hit with new I/O tool. You may shudder now. :)

I'm not including the rest of your post, as I still think I got your (WMF) objectives first time, actually. You want to boost participation, but you have to show (market) something technical, too. Which is prefectly understandable, with said caveats.

...

get the expected results.

Well, forecasts being forecasts, I'd guess that there'll actually form sort of constantly-renewed group of newbie-editors with /short/-lived participation. How big, anybody's guess. I'd truely like to see what reforms are planned on the ideological/organisational (social) side of the matters, though.

Yury

Oliver Keyes

9:34 p.m.

...

Well, forecasts being forecasts, I'd guess that there'll actually form sort of constantly-renewed group of newbie-editors with /short/-lived participation. How big, anybody's guess. I'd truely like to see what reforms are planned on the ideological/organisational (social) side of the matters, though.

Slightly outside the scope of the mailing list, but if you're interested, drop me an email off-list and I'm happy to tell you everything I know about the various ways we're hoping to enable cultural shifts :)

-- Oliver Keyes Community Liaison, Product Development Wikimedia Foundation

David Gerard

9:51 p.m.

On 12 February 2012 21:22, Yury Tarasievich yury.tarasievich@gmail.com wrote:

...

Well, to put it less technically, you (WMF) have now on your hands one big text corpus in somewhat-natural language (meaning not-so-formalised), which is processed bottom-to-top only (for presentation). You want to introduce an I/O tool which is complex UI component (an enterprise in itself), which will work with the corpus, but will do through an additional top-down path, at that. So, inevitably, more formalisation would have to exist on top level w/r to the bottom level. I didn't see the specifications, but this in its turn presupposes some "sanitisation" of bottom level each time un-sanitised page is hit with new I/O tool. You may shudder now. :)

Yes. That's exactly why all this has taken many years. (When was a WYSIWYG editor first proposed - five or six years ago?)

- d.

Yury Tarasievich

13 Feb 13 Feb

8:24 a.m.

On 02/12/2012 10:51 PM, David Gerard wrote:

...

On 12 February 2012 21:22, Yury Tarasievichyury.tarasievich@gmail.com wrote:

...
Well, to put it less technically, you (WMF) have now on your hands one big text corpus in somewhat-natural language (meaning not-so-formalised), which

...

Yes. That's exactly why all this has taken many years. (When was a WYSIWYG editor first proposed - five or six years ago?)

That's exactly why it shouldn't be even considered OR why it should supersede editing wikitext completely.

Yury

Jay Ashworth

4:39 p.m.

----- Original Message -----

...

From: "Oliver Keyes" okeyes@wikimedia.org

...

(1) relatively tech-savvy and (2) willing to jump through unnecessary hoops if you want to contribute, you highlight this; a custom extension is an awkward way of doing things. and an unnecessary hoop. A custom extension designed to work with a specific piece of software that is largely not used by the general population is similarly going to restrict who can participate, for precisely the same reasons that the existing markup is a restriction.

Nearly thirty years experience on the net, starting in '83 with Usenet, leave me entirely unconvinced that the underlying argument here -- that the extra layer of filtering involved in having to *want* to contribute enough to dig through the technical obstacles in the way is somehow A Bad Thing -- is entirely tenable.

The tenor of discourse on the net has trended *steadily* down over those three decades, IME, and I'm not at all convinced that making it easier to contribute to WP is in fact the panacea that all the proponents of this stuff say it is.

Go. :-)

Cheers, -- jra

-- Jay R. Ashworth Baylink jra@baylink.com Designer The Things I Think RFC 2100 Ashworth & Associates http://baylink.pitas.com 2000 Land Rover DII St Petersburg FL USA +1 727 647 1274

David Gerard

6:59 p.m.

On 13 February 2012 15:39, Jay Ashworth jra@baylink.com wrote:

...

Nearly thirty years experience on the net, starting in '83 with Usenet, leave me entirely unconvinced that the underlying argument here -- that the extra layer of filtering involved in having to *want* to contribute enough to dig through the technical obstacles in the way is somehow A Bad Thing -- is entirely tenable.

Well, all those people left Usenet for message boards. And look how healthy Usenet is now!

I'd suggest that Usenet isn't a good example of what we want to happen.

- d.

Jay Ashworth

7:22 p.m.

----- Original Message -----

...

From: "David Gerard" dgerard@gmail.com

...

Well, all those people left Usenet for message boards. And look how healthy Usenet is now!

The groups I always hung out on are, by and large, just as useful now as they were then.

All *which* people left? The smart, thoughtful ones who could write proper english? Not IME.

...

I'd suggest that Usenet isn't a good example of what we want to happen.

I'd suggest that you're not using a fine enough glass to look.

Cheers, -- jra

Mihály Héder

7:34 p.m.

Hello,

I think that is really hard to show the relation of smart/not smart people and certain forms of using the internet. I believe that it is rather a generational gap. This is how it works with young folks: -I don't drive my grandfathers car and also I don't use usenet -I don't wear my fathers clothes and I don't use wikitext -I listen to the newest music and I use nice AJAX interfaces. Because as all my friends do. And unrelated to all this, I can still be either smart or dumb.

Best Mihály

On 13 February 2012 19:22, Jay Ashworth jra@baylink.com wrote:

...

----- Original Message -----

...
From: "David Gerard" dgerard@gmail.com

...
Well, all those people left Usenet for message boards. And look how healthy Usenet is now!

The groups I always hung out on are, by and large, just as useful now as they were then.

All *which* people left? The smart, thoughtful ones who could write proper english? Not IME.

...
I'd suggest that Usenet isn't a good example of what we want to happen.

I'd suggest that you're not using a fine enough glass to look.

Cheers,

-- jra

Jay R. Ashworth Baylink jra@baylink.com Designer The Things I Think RFC 2100 Ashworth & Associates http://baylink.pitas.com 2000 Land Rover DII St Petersburg FL USA http://photo.imageinc.us +1 727 647 1274

Wikitext-l mailing list Wikitext-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitext-l

Jay Ashworth

7:42 p.m.

----- Original Message -----

...

From: "Mihály Héder" hedermisi@gmail.com

...

I think that is really hard to show the relation of smart/not smart people and certain forms of using the internet. I believe that it is rather a generational gap. This is how it works with young folks: -I don't drive my grandfathers car and also I don't use usenet -I don't wear my fathers clothes and I don't use wikitext -I listen to the newest music and I use nice AJAX interfaces. Because as all my friends do. And unrelated to all this, I can still be either smart or dumb.

I guess I'm going to have to stop using "smart" to imply "willing to invest the time and energy necessary to properly present my information and opinions so that people will actually pay attention to them". My apologies.

Cheers, -- jra

Mihály Héder

7:45 p.m.

On 13 February 2012 19:42, Jay Ashworth jra@baylink.com wrote:

...

----- Original Message -----

...
From: "Mihály Héder" hedermisi@gmail.com

...
I think that is really hard to show the relation of smart/not smart people and certain forms of using the internet. I believe that it is rather a generational gap. This is how it works with young folks: -I don't drive my grandfathers car and also I don't use usenet -I don't wear my fathers clothes and I don't use wikitext -I listen to the newest music and I use nice AJAX interfaces. Because as all my friends do. And unrelated to all this, I can still be either smart or dumb.

I guess I'm going to have to stop using "smart" to imply "willing to invest the time and energy necessary to properly present my information and opinions so that people will actually pay attention to them". My apologies.

Maybe. But then I would repeat my argument to this category, too.

...

Cheers,

-- jra

Jay R. Ashworth Baylink jra@baylink.com Designer The Things I Think RFC 2100 Ashworth & Associates http://baylink.pitas.com 2000 Land Rover DII St Petersburg FL USA http://photo.imageinc.us +1 727 647 1274

Wikitext-l mailing list Wikitext-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitext-l

Platonides

10 Feb 10 Feb

5:20 p.m.

On 10/02/12 00:47, vitalif@yourcmc.ru wrote:

...

Hi again!

Okay. I've read all the answers and want to make a small summary for myself (IMHO):

The MOST important: I think Wikipedia's markup complexity problems

are not the problems of Wikitext idea. I think WP should concentrate on improving WP-specific markup (tons of various infoboxes and etc), maybe move some templates to extensions, instead of changing the wikitext totally or even worse - hiding it from the user.

Indeed, that's an important piece. We don't our new Visual Editor users scared due to being yelled for some change the editor did for them.

...

I still beg MW developers to not throw the wikitext idea away! (maybe

if it will prove being useless for ALL (100%) users one day - but I'm sure this day won't come). Visual editor itself won't harm anything, I will support it. But I'll say my users: if you want to make small changes, here is the visual editor. If you want to write something big, here is the plaintext.

If big changes can only be done with plaintext, there's something wrong in your Visual Editor. Also note, doing little plaintext changes expose the users to the syntax, so they would be slowly getting more familiar with it.

...

I see there are usability talks. Improving usability is good, but

targetting at people who don't want to be "technically proficient" (emphasis: DON'T WANT TO BE, not just "ARE NOT") and just want to press a magic button and "that's all" is not good. There are topics like "internal vs external links, users tend to find one and ignore other" - that's ridiculous, that's NOT the usability problem. I.e. I think we shouldn't think of "usability initiatives" as of 100% correct ideas.

The Visual Editor is the candy with which you try to engage them.

Jay Ashworth

5:25 p.m.

----- Original Message -----

...

From: "Platonides" platonides@gmail.com

...

...

I see there are usability talks. Improving usability is good, but

targetting at people who don't want to be "technically proficient" (emphasis: DON'T WANT TO BE, not just "ARE NOT") and just want to press a magic button and "that's all" is not good. There are topics like "internal vs external links, users tend to find one and ignore other" - that's ridiculous, that's NOT the usability problem. I.e. I think we shouldn't think of "usability initiatives" as of 100% correct ideas.

The Visual Editor is the candy with which you try to engage them.

Sure.

I think the argument being made here -- it is certainly mine -- is simply "please don't penalize the 'smart people' to benefit the masses, regardless of how many people are in each group".

Cheers, -- jra

Oliver Keyes

5:29 p.m.

I'd agree with that. Only two problems:

*First, I don't think anyone is suggesting the underlying markup syntax will not still be usable as an alternative to plaintext *It is a massive mistake to confuse "smart" with "capable of understanding markup". The second group includes the first, but not as a large chunk. "Only smart people can grok our markup" does not mean "only people who grok our markup are smart". It is important not to confuse the two.

On 10 February 2012 16:25, Jay Ashworth jra@baylink.com wrote:

...

----- Original Message -----

...
From: "Platonides" platonides@gmail.com

...
...

I see there are usability talks. Improving usability is good, but

targetting at people who don't want to be "technically proficient" (emphasis: DON'T WANT TO BE, not just "ARE NOT") and just want to press a magic button and "that's all" is not good. There are topics like "internal vs external links, users tend to find one and ignore other" - that's ridiculous, that's NOT the usability problem. I.e. I think we shouldn't think of "usability initiatives" as of 100% correct ideas.

The Visual Editor is the candy with which you try to engage them.

Sure.

I think the argument being made here -- it is certainly mine -- is simply "please don't penalize the 'smart people' to benefit the masses, regardless of how many people are in each group".

Cheers,

-- jra

Jay R. Ashworth Baylink jra@baylink.com Designer The Things I Think RFC 2100 Ashworth & Associates http://baylink.pitas.com 2000 Land Rover DII St Petersburg FL USA http://photo.imageinc.us +1 727 647 1274

Wikitext-l mailing list Wikitext-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitext-l

-- Oliver Keyes Community Liaison, Product Development Wikimedia Foundation

Jay Ashworth

5:38 p.m.

---- Original Message -----

...

From: "Oliver Keyes" okeyes@wikimedia.org

...

*First, I don't think anyone is suggesting the underlying markup syntax will not still be usable as an alternative to plaintext

That seems the underlying concern.

No one *in a position of authority* is suggesting that, no.

...

*It is a massive mistake to confuse "smart" with "capable of understanding markup". The second group includes the first, but not as a large chunk. "Only smart people can grok our markup" does not mean "only people who grok our markup are smart". It is important not to confuse the two.

If you prefer the term "power users", I'm fine with that.

But after all the time the web design community has spent trying to get us to deal in semantics, rather than presentation, it seems pretty ironic to me that we see such a pressure to do precisely the opposite...

And yes, perhaps 'smart people' is a bit elitist. But damnit, I put a lot of work into knowing the things I know. :-)

Cheers, -- jra

Oliver Keyes

6:25 p.m.

On 10 February 2012 16:38, Jay Ashworth jra@baylink.com wrote:

...

---- Original Message -----

...
From: "Oliver Keyes" okeyes@wikimedia.org

...
*First, I don't think anyone is suggesting the underlying markup syntax will not still be usable as an alternative to plaintext

That seems the underlying concern.

No one *in a position of authority* is suggesting that, no.

Well, the lead on this is Trevor - if you look at his message at http://lists.wikimedia.org/pipermail/wikitext-l/2012-February/000543.html:

"I will also confirm that we have no plan or intention to do any of the following: - Disable markup (plain text) editing - Allow visual editing to mangle plain text documents - Make significant changes to what Wikitext is and can do"

That seems to be pretty clear, and it's hard to get a higher position of authority on this project than him.

-- Oliver Keyes Community Liaison, Product Development Wikimedia Foundation

Jay Ashworth

6:33 p.m.

----- Original Message -----

...

From: "Oliver Keyes" okeyes@wikimedia.org

...

On 10 February 2012 16:38, Jay Ashworth jra@baylink.com wrote:

...
---- Original Message -----

...
From: "Oliver Keyes" okeyes@wikimedia.org

...
*First, I don't think anyone is suggesting the underlying markup syntax will not still be usable as an alternative to plaintext

That seems the underlying concern.

No one *in a position of authority* is suggesting that, no.

Well, the lead on this is Trevor - if you look at his message at http://lists.wikimedia.org/pipermail/wikitext-l/2012-February/000543.html:

"I will also confirm that we have no plan or intention to do any of the following:

Disable markup (plain text) editing

Allow visual editing to mangle plain text documents

Make significant changes to what Wikitext is and can do"

That seems to be pretty clear, and it's hard to get a higher position of authority on this project than him.

We are in violent agreement, then. ;-)

Cheers, -- jra

Mihály Héder

10:28 p.m.

Hello,

...

But after all the time the web design community has spent trying to get us to deal in semantics, rather than presentation, it seems pretty ironic to me that we see such a pressure to do precisely the opposite...

Just a minor note: I think semantics vs presentation is a false dichotomy. That drill was about structure vs presentation.

For those of us who are interested in semantics, the visual editor and its DOM+microdata scheme is really good news! I refer here to the fact that formatting info is mixed with semantic(-like) markup in wikitext. In html for example you could use GRDDL (I haven't ever seen it in application), RDFa, microformats and now microdata. They all can give you a layer for semantic annotations that is separable from the structural information. A heaven for robots collecting semantic info. So semantics is exactly why the current parser/editor plans are so exciting. I know tat they are doing it because of a tendency that the old interface is failing to attract some otherwise capable new editors. But I think there will be a huge collateral profit for the projects dealing with semantics.

Best Mihály

vitalif＠yourcmc.ru

11 Feb 11 Feb

10:08 p.m.

On Fri, 10 Feb 2012 17:20:51 +0100, Platonides wrote:

...

Also note, doing little plaintext changes expose the users to the syntax, so they would be slowly getting more familiar with it.

Yes, visual editor is really useful only for totally "technically disabled" users. At long last, most of the internet users successfully use BBCode on the forums - why they can't use wikitext?..

...

If big changes can only be done with plaintext, there's something wrong in your Visual Editor.

I mean plaintext is always more convenient for doing something big. For example, it's just faster to enter '''text''' instead of mouse-clicking [B], then switching back to keyboard, entering "text" and again clicking [B]. If you have keyboard shortcuts, then it will be Ctrl-B text Ctrl-B, which is not easier than entering '''text''' but involves switching between some modes (bold/not bold). Visual editor may be usable and may have cool features (for example, inplace editing), but I'm sure there is a lot more implementation issues and usability problems with it than with plain text. If I were MW, I would keep being simple, but if MW want to experiment - again, PLEASE experiment CAREFULLY...

Magnus Manske

10:32 p.m.

On Sat, Feb 11, 2012 at 9:08 PM, vitalif@yourcmc.ru wrote:

...

On Fri, 10 Feb 2012 17:20:51 +0100, Platonides wrote:

...
Also note, doing little plaintext changes expose the users to the syntax, so they would be slowly getting more familiar with it.

Yes, visual editor is really useful only for totally "technically disabled" users. At long last, most of the internet users successfully use BBCode on the forums - why they can't use wikitext?..

No, some/many of the internet users who use forums use BBCode, to varying degrees of success. That is a very small subsection of the whole of internet users, many of which will run away screaming from both BBCode and wikitext markup.

...

...
If big changes can only be done with plaintext, there's something wrong in your Visual Editor.

I mean plaintext is always more convenient for doing something big. For example, it's just faster to enter '''text''' instead of mouse-clicking [B], then switching back to keyboard, entering "text" and again clicking [B]. If you have keyboard shortcuts, then it will be Ctrl-B text Ctrl-B, which is not easier than entering '''text''' but involves switching between some modes (bold/not bold).

For someone who can't or don't want to wrap their head around wikitext, which I suspect is the vast majority of internet users, the difference is between "no editing" and "Word-speed editing".

Platonides

10:54 p.m.

On 11/02/12 22:08, vitalif@yourcmc.ru wrote:

...

...
If big changes can only be done with plaintext, there's something wrong in your Visual Editor.

I mean plaintext is always more convenient for doing something big. For example, it's just faster to enter '''text''' instead of mouse-clicking [B], then switching back to keyboard, entering "text" and again clicking [B]. If you have keyboard shortcuts, then it will be Ctrl-B text Ctrl-B, which is not easier than entering '''text''' but involves switching between some modes (bold/not bold).

Usually, although you get out of the central keys to type it, so the effort may be similar to enter text, go to the mouse, double click the word and press B (unless it changed to instead of bold convert it to <s><nowiki>text</nowiki></s>).

...

Visual editor may be usable and may have cool features (for example, inplace editing), but I'm sure there is a lot more implementation issues and usability problems with it than with plain text. If I were MW, I would keep being simple, but if MW want to experiment - again, PLEASE experiment CAREFULLY...

It's hard, as it needs to do a lot of things right, as opposed of receiving plain text, which is very easy. It can hardly go wrong, with any kind of device.

Stanton McCandlish

12 Feb 12 Feb

5:37 a.m.

The obvious solution to me is to build this and alpha test it internally, then beta test it on Wikia, not Wikipedia. You'll get more diverse feedback, and when things inevitably break, they'll just mess up stuff like the Battlestar Wiki, not the world's largest, most-used encyclopedia.

-- Stanton McCandlish McCandlish Consulting 9505 Tanoan Dr NE Albuquerque NM 87111-5836

505 715-7650

http://www.facebook.com/Stanton.McCandlish

Mike Dupont

6:18 a.m.

wiki already has a visual editor and a text mode. blogger has a visual editor and a html mode.

On Sun, Feb 12, 2012 at 5:37 AM, Stanton McCandlish smccandlish@gmail.comwrote:

...

then beta test it on Wikia

-- James Michael DuPont Member of Free Libre Open Source Software Kosova http://flossk.org

Daniel Friesen

10:47 p.m.

On Sat, 11 Feb 2012 20:37:12 -0800, Stanton McCandlish smccandlish@gmail.com wrote:

...

The obvious solution to me is to build this and alpha test it internally, then beta test it on Wikia, not Wikipedia. You'll get more diverse feedback, and when things inevitably break, they'll just mess up stuff like the Battlestar Wiki, not the world's largest, most-used encyclopedia.

-- Stanton McCandlish McCandlish Consulting 9505 Tanoan Dr NE Albuquerque NM 87111-5836

505 715-7650

http://www.facebook.com/Stanton.McCandlish

Wikia already has it's own Visual Editor. Wikia's Visual Editor already breaks content and has some communities complaining about it and ensuring Wikia has it disabled on their wiki. Wikia has already screwed with their edit page implementation enough that any Visual Editor built to work in vanilla MediaWiki or WikiEditor won't even function on Wikia, so it's a worthless place to test anyways. Wikia is going to take ages to shed 1.16, doesn't make a very good test bed for something that should end up running on WikiMedia which uses the newest stable code it can, especially when all that code is supposed to be based on ResourceLoader improvements and JavaScript code that comes with RL that won't be on Wikia. Wikia also has a different audience than Wikipedia. Unlike Wikipedia which has a balance of technical and non-technical users, Wikia has a large number of wikis which are tilted heavily to the non-technical of users. To the point where Wikia can release a Visual Editor, have it break piles of source text, and have no-one care because a number of wikis don't have any users that even look at the source. It's not a very good place to test a visual editor when there are wikis with users that won't even complain when something is broken.

-- ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Mihály Héder

6 Feb 6 Feb

10:23 p.m.

Hi Pavel,

thanks for your detailed reply, I understand your position better now.

Well, as they already pointed out, throwing away the current wiki markup would be immensely difficult because: -the millions of pages we already have is not easy to convert in the absence of a formalized wiki grammar -the users who are pros in wikitext (and some of them are already afraid that this skill will be obsolete because of the new editor, like the thread starter)

By following this list I hope I gathered how they plan to tackle this really hard problem: -a functional decomposition of what the current parser does to a separate tokenizer, an AST(aka WOM or now just DOM) builder and a serializer. Also, AST building might be further decomposed to the builder part and an error handling according to html specs. -in architecture terms, all this will be a separate component, unlike the old php parser which is really hard to take out from the rest of the code. In this setup there is hope that the tokenizing task can be specified with a set of rules, thus effectively creating a wikitext tokenizing standard (already a great leap forward!) Then the really custom stuff (because wikitext still lacks a formal grammar) can be encapsulated in AST building.

(I hope I reconstructed this right.) I think this is the smartest thing to do in this situation. It will not only enable them to create an alternative visual editor, which is the original goal. It is more far-reaching than that. It will also enable you to create an editor which uses the syntax you already started to envision in this thread. It will let me do a lot of stuff I dream of in our project, Sztakipedia. Also, dbpedia can be more effective having this parser. Creating books from wiki will be much easier. People will be able to migrate stuff into mediawiki from other CMS's (also migrate the other way for that matter). We could have wiki syntax warnings in the regular wikitext interface, etc. And most importantly, I'm certain it will enable many other things I cannot foresee now. So I wish them the best of luck (as I'm sure they will need it :)

---

Also, I want to answer to one particular point...

...

...
Knowledge about the article one wants to edit? Surely not. Devotion to make an edit? Probably. Markup skills? Probably. Tolerance for outdated interfaces? I say this, too.

Can you repeat all of this if someone is reluctant to read Wikipedia editing/copyright guidelines? Why in your opinion editing with plain yet intuitive markup is different from rich editor?

Yes, I would repeat this argument about reading the guidelines too. It have been my fixa idea (and research topic) for years that an editor should incorporate enough intelligence to be able to represent the community's (in this case wikipedians and readers) interests (in this case the quality requirements) as an agent. What I imagine is a system which does not bother me until I use a cite template for the first time for example - but then it tries to evaluate whether I use it according the guidelines - and probably explain me the guidelines. It would also help me to fill infoboxes, find the right templates, categories and links, warn me if I don't structure the article clearly enough (a nice feature request from the Hungarian WMF :) I know most of this is science fiction but I hope we will have something like this in the far future :)

But my point is that the guidelines would be much easier to digest if always presented in context and only the relevant part. Maybe I'm an utopist but I can imagine a wikipedia where fresh editors just start typing their knowledge with zero education and still be able to immediately produce valuable output, provided they have good intentions.

Best Mihály

Jay Ashworth

10:32 p.m.

----- Original Message -----

...

From: "Mihály Héder" hedermisi@gmail.com

...

By following this list I hope I gathered how they plan to tackle this really hard problem: -a functional decomposition of what the current parser does to a separate tokenizer, an AST(aka WOM or now just DOM) builder and a serializer. Also, AST building might be further decomposed to the builder part and an error handling according to html specs. -in architecture terms, all this will be a separate component, unlike the old php parser which is really hard to take out from the rest of the code. In this setup there is hope that the tokenizing task can be specified with a set of rules, thus effectively creating a wikitext tokenizing standard (already a great leap forward!) Then the really custom stuff (because wikitext still lacks a formal grammar) can be encapsulated in AST building.

As I noted in a reply I wrote on this thread a few minutes ago (but it was kinda buried): there are between 4 and 7 projects with varying stages of seriosity that are already in work, some of them having posted to this list one or more times.

At least a couple of them had as a serious goal producing a formalized, architecturally cleaner parser that could be dropped into Mediawiki.

The framing of your reply suggests that you needed to know that and didn't.

Cheers, -- jra

Platonides

11:39 p.m.

Replies inline in non-strict order.

Funnily, many of the syntax quirks come from the editors making the templates, they are not imposed by the wikitext in any way, and could be removed today if wished:

...

What is "pp-semi" why it "move-indef"?

Names given by the users.

...

Why bgcolour is there? It's presentation. Is any editor going to

change background color of the infobox, ever? Perhaps it has a different tone and must be #c0c1ff?

Actually, that seems wrong for an article, since I'd expect all planets to have the same background color (it'd be fine as a hidden template for templates extending it).

...

"temperatures = yes" - obviously, if any temperatures are specified

in the infobox then the temperature block is enabled. Is it possible otherwise?

Can be done. The template could instead check for any value being set.

...

No Celsius temperatures - the machine does better job converting

values than human plus calculator.

Can be done.

...

No link in "[[Kelvin]]" - the machine can place link itself, can't it?

Can be done.

...

No   because it just fixes some engine problem and is problem

for its devs, not editors or, mind you, users. 6. No degree symbols: it's cleaner and users don't have to search for the special char (°).

Both could be provided by the template.

...

Replaced all <ref>s with another "template argument" named

"temperatures min". This works simple: the machine calculates minimal value and applies given reference to it, if corresponding argument is passed. If no - no reference.

Can be done.

...

But let's be blunt and say that markup is about text MARKUP, not presentation. Not to the extent that drives such text to incomprehensibleness. If we accept the thesis that markup must be human-readable and everything else MUST be handled by the machine no matter how "complex" this might become for it we can achieve some interesting results. For example, if I'm a "newbie but intelligent" Wikipedian and open the page on "The Earth" I see this:

{{About|the planet}} {{pp-semi|small=yes}}{{pp-move-indef}} {{Infobox Planet | bgcolour = #c0c0ff | name = Earth | symbol = [[File:Earth symbol.svg|25px|Astronomical symbol of Earth]] ... }}

This is how I understand what human-readable markup means:

Okay, I understand that "{{" and "}}" are some special symbols -

but why pipe is between what seem to be words and in the start of some new lines?

I disagree. The pipe in {{About|the planet}} can look odd, but the pipe at the beginning of the line looks natural. It seems like some kind of continuation of the {{.

...

These are just 3 basic questions. Processed, the above snippet might look like this:

{{About Earth, the planet}} {{Infobox symbol = [[File:Earth symbol.svg|25px|Astronomical symbol of Earth]] ... }}

That's it. Now the machine's part:

"About" is a special "template" or some other construct. When it's

"ran" (processed) it accepts 2 colon-separated "arguments". The first specifies "name" of something (place, planet, object, etc.), the second - its "role". Moreover, even these are article-specific and can be changed. The point is that this line remains equally understandable regardless of the article type.

Space separated arguments are more readable for the casual editor, but normal editors would have a harder time to find out what's the template and what the parameters. Also, there are colons as parameters. How would you write as the parameter the article [[Gypsy: A Musical Fable]] or [[Batman: Year One]] ? By banning ':' in titles?

...

"Infobox Planet" has transformed into just "Infobox" - we've got

"planet" defined in "About". 4. "bgcolor" is system presentation-specific thing, no place for it in the contents.

Agree.

...

"name = Earth" - we've got this along with "Planet".

No. We have "Earth, the planet"!

You will need the proper name in the infobox, such as "Felis silvestris catus", even if the article is just called "Cat".

(temperatures markup)

...

What kind of false positives are we talking about? Will any sane individual spend his precious time not editing but preparing to edit this mess?

I think they copy and paste, then fill the fields. Which is a good way of learning as they encounter it.

...

Again, not to offend anyone and least - MediaWiki devs, but if we're talking about wikitext future the above must look much more plain: temperatures = in Kelvin: 184, 287.2, 331

Well, just by looking at it I have no idea what those temperatures are :)

184 and 331 are probably some kind of limits, but what's that 287.2? Some kind of boilding point? What if they were in a different order?

Platonides

7 Feb 7 Feb

12:04 a.m.

On 06/02/12 07:58, Pavel Tkachenko wrote:

...

Making markup language-neutral is easy enough: even a single person can carry on the research to find keyboard symbols that are easily accessible across different language standards. From my experience they are ! % * ( ) - = _ and +. This will eliminate the need to layout switches (for example, currently a Russian Wikipedia editor must switch layout 5 times in typing simple "underline" since neither < nor > are present in Russian layout;

No. It's not easy. It's painful. The goal of wikitext is to make html editing easy. HTML only needs a few special characters: <>&;=" but it's bothersome. So instead of <ul> <li>Dogs <li>Cats <li>Hens </ul>

We define that * is a bullet and serves to make lists: * Dogs * Cats * Hens

It's easier to type, and looks good.

Then we also want numbered lists. Instead of <ol> <li>One <li>Two <li>Three </ol>

We define # as the equivalent for numbered lists. Note that there's no usage of # for numbers in many cultures, so that's less 'visual' there. # One # Two # Three

You then continue adding tricks of "this looks like", sometimes needing a bery crazy mind. But each feature requires new symbols, and when you look at those available on every layout, you get *very* limited...

For example, I could decide to list imagemaps as `Image1´ `Image2´... (grave and acute), but oh, many keyboards don't have both accents. (I happen to have both, but had to copy the acute because it rejected being displayed alone, converting to an apostrophe...)

And obviously, you can't use something that would easily appear in a normal text (or you start defining escape codes which are uglier, too).

...

My study indicates that the number of available symbols will allow to avoid HTML-style tags completely - this will further simplify the markup. For instance, instead of "__" can be used; <ref> can be replaced by "[[*ref]]" for uniformity with links; and so on. I am ready to give expanded explanation if anyone is interested.

How do you type the *content* of the references?

...

the same goes for links: "[[link]]" and "[[link|caption]]" - pipe is also not present [in Russian layout]

[]| are some of the very few forbidden characters in the titles. That's why we can take advantage of them for title splitting. How would you differenciate between link target (page title) and link caption? It's simple to start defining a sensible syntax. But when you want to "go further", you start being limited. The most sane approach is probably to fall back to <tags> and leave them in the "complex" section. Why <> and not anything else? Just because that's what the underlying html uses. Some people is already familiar with that, too. See, mediawiki didn't define as wikitext for underline.

It allows some html as, including and . and have friendly counterparts*, has not. The reason being that use of underlining is discouraged.

*And that also turned out to have issues, ever tried to write wikitext in piedmontese?

...

Special tokens like #REDIRECT, {{-}}, <imagemap>, __TOC__, etc. that all use different syntaxes can be uniformized in a way similar to template insertions: {{redir New page}}, {{clear}}, {{imagemap image.png, title x y, ...}}, {{TOC}} and so on. Templates can be called as {{tpl template arg arg arg}} - even if we keep { and } that require layout switch in some languages we eliminate the pipe which just makes things worse and text - less readable.

{{-}} is not a wikitext token. #REDIRECT and __TOC__ are a sad effect of separate building of contents. They are incoherent with the rest of the syntax. Note you can (and some wikis do) use a {{TOC}} template. You can't wrap #REDIRECT in a template, though, because the redirect applies to the template itself (unless you use some odd escaping?)

Pavel Tkachenko

5:33 p.m.

Giant mails follow, no panic.

2012/2/6 Gabriel Wicke wicke@wikidev.net:

...

The enriched HTML DOM we are building (and actually most token stream processing including template expansion) is not tied to any specific syntax or user interface.

It is tied to HTML and it's the same. Even if all of current wikitext features can be represented by HTML (which I doubt) there's no guarantee that this will be true in furutre. This view has probably led to current messy markup.

When I'm saying that HTML can't represent even current wikitext features I imply that we're not talking about microformats and other XHTML tricks. And if we're talking about plain HTML then why not use a completely new format for storing DOM? Or at least clean XML without any standard namespaces that in theory should ease rendering of DOM into HTML (?). It will be parser-specific and won't suffer from future changes of linked namespaces, will be simple to test, etc.

On Future/Parser development diagram HTML DOM is built right after the stream has been parsed into a tree... in other words HTML5 is used to represent wiki. With tricks.

But I'm already venturing into an offtopic discussion here.

...

But in any case, we first have to implement a solid tokenizer for the current syntax and recreate at least a part of the higher-level functionality (template expansion etc) based on a syntax-independent representation.

I agree on this one.

2012/2/6 Sumana Harihareswara sumanah@wikimedia.org:

...

Pavel, you're clearly both an intelligent and a technical man - but not all intelligence is of the same, technically-minded type, and it's not always backed up by pertinent and complex knowledge.

I'm flattered with your words, thanks, Oliver.

However, this does not explain why at first Wikipedians had no troubles editing (and even creating) articles and now they are gradually loosing this skill. Is this a result of general degradation? I would hate to think this way and believe it's more what Yury has already said above - the project is just getting mature and, naturally, subjects for new articles that are left require more than a general knowledge while edits for existing articles are either complete, require some special knowledge as well or are plain unmotivated enough - new page patrol, "article babysitting", etc. are all "dirty" work and by definition are not that interesting as adding a new article section, prooflink or even correcting a simple typo.

In other words regular edits can be done by most visitors while maintenance - by small slice of them. It's the same with computers and users: most of people can use the computer but only some of them can type regedit.exe without breaking things. Is it different with Wikipedia? Is it different with most other non-commercial project?

...

The complexity of our existing markup language is a barrier, but not as much as the presence of any markup language whatsoever as a default.

Now this is something specific to argue about. I must admit that your speech has given me something to think about; perhaps you're right and the fact that initial editors of Wikipedia have come from that "first wave" of the Internet users - with this in mind it's understandable why their number is wearing out.

The usability studies that you have referred to speak with one accord that WYSIWYG is a must. I admit it sounds appropriate in that context. Still, another link suggests that even non-technical people were able to edit and (uh!) format text as bold and italic given a bit of help. And then it notes that even before doing any edits - or seeing an editor's window, be it text or visual - people were confronted with dozens of guideline links and warnings.

Which problem is more important? How you're going to present users with warnings in an inline visual editor? Or is it easier to just put "I've read and understood the rules" fobber-off and consider the matter settled?

More things to ponder about before my peaceful sleep, huh.

p.s: I wonder why people who can actually give answers are quite often not in the mailing lists.

2012/2/6 Amgine amgine@wikimedians.ca:

...

As I understand it, for the foreseeable future there will be a raw wiki syntax interface available. I hope contributors can be reassured on this point.

Combined with: 2012/2/6 Trevor Parscal tparscal@wikimedia.org:

...

Make significant changes to what Wikitext is and can do

The problem with this is that if present "raw wiki syntax" will be kept it will ensure that edits continue downfall.

...

The concern I see being expressed, fundamentally, is "I have developed skills, practices, and efficiencies with current Wiki syntax. Is your new parser going to destroy my investments in learning? am I going to have to start over with this new system?"

I think it's close in words but not in the meaning. What will you choose: cope with an old dusty car of your grandfather with annual repairs, dyeing, cleaning or find a free day, go to a nearby shop, choose a top-notch car with nano-tech-driven automatic repair, dyeing, cleaner that will serve you for the foreseeable future?

How many programmers (given the opportunity) choose to maintain old spaghetti code over refactoring it to something they'll have pleasure working with?

Quite few, as you probably know. It's the same with common folk who'll stick to old printer, scanner and copier than a new all-in-one device. But it's not right and everyone knows that it's better when they break this trend.

2012/2/7 Jay Ashworth jra@baylink.com:

...

Correct, and it isn't merely investments in learning; there are likely investments in wrap-around-the-outside coding which assume access to markup as well. Not All Mediawikiae Are Wikipedia.

I hope this was not a case for keeping old markup running. Most of the time it's better to provide backward compatibility module running on top of the new system than to fix and repair the old system trying to pursue mythical goal of supporting old versions.

Look at C++ STL and what it has become since '89. Look at Microsoft Windows and if its performance on an 4xi7 core has scaled along with Windows 95 on an 80356.

2012/2/7 Mihaly Heder hedermisi@gmail.com:

...

the millions of pages we already have is not easy to convert in the absence of a formalized wiki grammar

Indeed, but this can be solved by bringing together all pieces of modern wikitext under one roof and building a new strict grammar apart from that. Then a converter can be written that will seamlessly transform old syntax into new and warn user when this is not possible.

...

From what I know this is the direction WMF is going.

...

and some of them are already afraid that this skill will be obsolete because of the new editor, like the thread starter

This is the second time this argument appears in this thread but I don't understand it. Will you be afraid to "lose" your old coat that has worn our in a bin?

...

By following this list I hope I gathered how they plan to tackle this really hard problem: ... I think this is the smartest thing to do in this situation.

I agree, this is a way to go. If this works out then even terrible markup syntax won't be such a big trouble as it is now because if you disagree you can write your own tokenizer. The point is: why keep terrible syntax? It will be necessary to parse and transform the old wikitext syntax - I hope this is not argued - then why care how much improved markup will look like it or will it be completely different at all?

People will only greet the fact that, while a full-scale visual editor is underway, they at least can read and make occasional edits using more or less humane syntax. The best possible.

Isn't Wikimedia about a world with free knowledge? If so, it deals with texts most of the time. And the tools used have to be top-notch - there is no Microsoft to lobby OpenXML and force it down CEOs' throats.

I might sound rude but I hope for understanding. 5 years are enough for a single person (if he's minimally funded) to carry on the research, create an ideal markup language (as ideal as it can be spanning all cultures and nations), write a parser/serializer/renderer and even attach a text editor with advanced features made from scratch. And then there's even half of the time left.

Everyone at WMF able to hold the sword can get this thing done once and for all in a very short amount of time. No more annual parser rewrites, no more markup hell. After all, MediaWiki and its markup is the main workhorse of the community. It can't be kept on shelf that long...

Signed, P. Tkachenko

Pavel Tkachenko

5:35 p.m.

Platonides,

2012/2/7 Platonides platonides@gmail.com:

...

they are not imposed by the wikitext in any way, and could be removed today if wished:

Then why are they still there?

As I have said in my previous message I am ready to break down any piece of markup that you want. Templates are just most crazy part of current wikitext.

...

...

What is "pp-semi" why it "move-indef"?

Names given by the users.

It's funny that users give names that others don't understand. Even those who are "technically proficient" but not part of "the elite".

...

The pipe in {{About|the planet}} can look odd, but the pipe at the beginning of the line looks natural. It seems like some kind of continuation of the {{.

Apart from just "looking natural" argument I would put "crucial need". I think quotes look natural after template parameter names' - but they have no use and will duplicate existing functionality (a parameter cannot last after the beginning of next parameter, for instance).

In other words why do we need a pipe if we assume that a template can have two modes: inline and block. Inline contain no line feeds, block contain line feeds before each of their parameters.

But first let's think if template should have 'mixed' mode. This will make things more complex without any need and it will make source look messy because it'll depend on the user if he wants to put that particular parameter on a separate line or not (imagine two guys: one of a "VT100" terminal with 80 characters in column and one with a latest 30" plasma display). It will also require markup to provide additional means of separating parameters if line feed can be trusted no more.

It's time to sharpen our Occam's Razor or the life will do this.

...

Space separated arguments are more readable for the casual editor, but normal editors would have a harder time to find out what's the template and what the parameters.

You're thinking in terms of parameters and my point was to discard all of this stuff and think in terms of human writing.

What looks more natural - {{About|Earth|the planet}} or {{About Earth, the planet}}? Since when our handwriting produces pipes instead of spaces (and not even all of them?).

...

Also, there are colons as parameters. How would you write as the parameter the article [[Gypsy: A Musical Fable]] or [[Batman: Year One]] ? By banning ':' in titles?

Have I said something about colons and links? Links are fine with colons or any other symbols.

But if we're touching this pipes in links are not that intuitive either. Pipes are actually not present on many keyboard layouts but even apart from that it's more natural to use an equality sign. Or double, for the purpose of text markup.

In fact, doubling a symbol is a great way of both differentiating it from misoperations and making easily recognizable by human eye. Space can be used for the same purpose (as a delimiter).

Let's be concrete: 1. Links are wrapped in [[ and ]]. 2. Links may optionally have titles. A title is separated from link URL (or local page name) with a space. 3. Those links which contain spaces in their address can be given title after double equality sign. 4. Finally, in very rare cases when both space and equality symbol is necessary a special markup-wise (!) escape symbol can be used.

Note how we tie up the links between markup at a whole and its particular tokens: 1. Markup-wise, tokens (links, headings, formatting, etc.) are created using double symbols. Say, [[ and ]] for links, == and == for headings, **bold**, __underline__, etc. This is the cornerstone. 2. Markup-wise, there is a single way of escaping markup. Escaping must be particularly well-thought because it will be the trickiest part even for experienced editors.

Currently wikitext uses terrible "<nowiki>stuff</nowiki>" but it doesn't always work and HTMLTidy comes in handy with its < and >. And some places (such as link titles) cannot be escaped altogether. Of course, given this approach we are forced to use an inconvenient symbol - pipe - which is unlikely to occur in normal text and thus needs no escaping 99% of cases.

But let's put this from bottom-up ot top-down. Think about a good-looking, rarely-used symbol that we will use for escaping... I'm sure the amount of texts Wikipedia has it's easy to conduct a research to determine that symbol but in this example I'll pick tilde (~), I'm actually using it for that purpose in my home-made markups.

What follows is the complete list of cases: 1. [[Link]] 2. [[Link Title]] = [[Link|Title]] now 3. [[Space link==Title]] == [[Space link|Title]] now (pipe needs layout change while ==' doesn't) 4. An URL containing '==' in its query part: [[http://weird.url/?k~==v]]. I've put tilde before '==' to prevent it from being treated as an address/title separator. Since there's no other separator - link has no caption. How wikitext handles links with pipes? Are they banned? 5. Local page name containing tilde: [[~ (Tilde)==Title]] - nothing breaks here because tilde is only special when it escapes something... and space isn't something to be escaped, so tilde is treated as normal text. 6. Extreme case: an URL containing both tilde and double equality sign which originally (in browser address bar) looks like: http://url?k~==v. In our link we will simply triple the tilde - making first tilde to escape itself and last tilde - to escape the separator: [[http://url?k~~~==v]]. 7. For the sake of completeness, most extreme case: URL with ~== and we've got to specify a title. We don't have to invent a wheel - simply put the title after a space and properly escape the URL as in case #6: [[http://url?k~~~==v Title]].

Now if you think "wow. that's a particular mess, nobody is going to understand and use this syntax". But before that note that only the first 3 cases are standard - others are exceptions which still increase their complexity gracefully according to the task.

For example, how wikitext will handle links containing pipes? hard to say, probably [[|]]. <nowiki>? [[<nowiki&gtl]]. <nowiki>? [[&lt;nowiki&gtl]].

As it is demonstrated the above system scales well and thus there is a space for improving wikitext, even if not using exactly my scheme.

A best thing about this is that once you're memorized the above 2 fundamental rules you can apply them anywhere. Markup can be escaped using tilde: ~[[no more a link]]. Associative (aka definition) lists can be uniformized: = Definition Value = Definition with spaces == Value = Definition of ~==, now its == Value

And so on. Scalable.

...

No. We have "Earth, the planet"!

You mean that a template cannot put "the" in front because some planets have their name without "the"? If so, they are not planets (nebulas, satellites, etc.) and a different template must be used. And if it's used the machine can handle language peculiarities.

...

You will need the proper name in the infobox, such as "Felis silvestris catus", even if the article is just called "Cat".

Yes: {{About Felis silvestris catus, cat}}

...

...
What kind of false positives are we talking about? Will any sane individual spend his precious time not editing but preparing to edit this mess?

I think they copy and paste, then fill the fields. Which is a good way of learning as they encounter it.

And the best way is to create/write things from scratch on our own. Where have you learned programming, in the class copy-pasting lines from blackboard to your notebook or in the office actually hitting the keys?

Many things can be hidden under copy-paste approach, there's even a notion "code monkey". I believe if wikitext continutes the path of STL there will be "wiki monkeys", provided that WMF gets commercial (which is a last thing I want to see).

But only syntax that hides nothing and is crisp to its bones can be called fair. Only syntax that doesn't require any "templates" that you just "copy" and "feel"... sorry, "fill in". And of course, there are cases when templates (I mean wikitext {{ and }}) with parameters are of great help and there are cases when parameters with their pipes and whistles are redundant. Just like life.

...

Well, just by looking at it I have no idea what those temperatures are :)

You mean that "| max_temp_1 = 331 K<ref name=asu_highest_temp/>" gives you more ideas?

...

What if they were in a different order?

So the machine cannot sort them out and determine which is "max"?

...

184 and 331 are probably some kind of limits, but what's that 287.2? Some kind of boilding point?

Hm, you might be right on this one and that it's better to have two template parameters: "temperature" and "temperature mean". This is what discussion is for - finding rough edges, right.

...

The goal of wikitext is to make html editing easy.

HTML editing? I thought wikitext was about text editing. Why not edit HTML using HTML?

...

HTML only needs a few special characters: <>&;=" but it's bothersome.

And 4 of them are absent on my native layout.

...

We define that * is a bullet and serves to make lists: It's easier to type, and looks good.

Completely agree, this is a human-readable markup.

...

We define # as the equivalent for numbered lists. Note that there's no usage of # for numbers in many cultures, so that's less 'visual' there.

And this is to be refactored adding features along the way. Let's write ordered lists using digits! 1. One 2. Two 3. Three

The Japanese have their own kanji for numbers but they use Arabic digits sometimes as well.

Oh, I hear your thoughts - "items are moved/removed and the order is gone". Sure but machine can help us out: 1. One 1. Two 1. Three

We use a little trick here: since ordered items with identical marker value are useless in human texts the markup can use them to represent automatically ordered markers. But even if someday we need two identical markers this can be fixed using some clean syntax, such as: 1. First 1#1 First

Just as with links and [[eq==signs]] above our syntax gradually increases complexity. The person gets crazy so does the markup - but not before the user.

The added features of this approach are: 1. Lists now support, say, 6 marker types: 1. (digits) 01. (zero-padded digits) a. (lower-alpha) A. (upper-alpha) i. (lower-Roman) I. (upper-Roman) 2. Lists now can have markers of any value: 3. Third 2. Second 1. First 3. List types can be mixed: 1. Digit i. Roman a. Alpha

...

But each feature requires new symbols, and when you look at those available on every layout, you get *very* limited...

This is far from truth. During last years I have developed for my projects a markup that generally surpasses MediaWiki's and uses just common symbols. I will briefly summarize it in the end of this message.

...

For example, I could decide to list imagemaps as `Image1? `Image2?... (grave and acute), but oh, many keyboards don't have both accents.

Obviously, you cannot get a symbol for each and every particular markup case. But you don't have to. Compare how often will you use imagemaps and, say, highlighted PHP code. Right now both are pretty lengthy and angular in wikitext and the former (less used) is even shorter:

<imagemap> ... </imagemap> 10 + 11 symbols.

<source lang="php"> ... </source> 19 + 9 symbols.

Do you see my point?

...

And obviously, you can't use something that would easily appear in a normal text (or you start defining escape codes which are uglier, too).

Nah, escape codes come from C, don't forget it's not a very friendly language. Instead of escape codes one can use "quoting" (in Pascal style) - I have already touched tilde symbol above.

But I agree that markup that relies on escapes/quoting of any kind is not fair. Escapes by definition are exceptions and cannot overwhelm the common rule.

...

How do you type the *content* of the references?

As a title: [[*http:// My reference goes here]]. Or, better: we can use the same syntax for footnotes: [[*My footnote]] and references will be removed. So two types of footnotes: inline [[*text with ''markup'']], and block which start with [[*, then line break, then any kind of markup (including line breaks, more footnotes, links, even headings and lists), then line break, closing ]] and another line break. Uniform and consistent yet powerful and flexible.

...

But when you want to "go further", you start being limited.

I believe we will not if we start thinking in term of simplicity, not features - the former will give features as demonstrated above but if we focus on the latter it will force it out.

...

Just because that's what the underlying html uses.

This thinking is the problem - it is attached to particular use case. "We will use HTML 5 DOM just because we won't need to transform it when rendering". But what about PDF? XML? FB2? RTF? DOC? ODT?

Machine can handle everything, its time is much less precious than human's. Once written, a framework will perform wikitext <-> HTML transformations in an instant; so goes for some intermediate (completely detached from target, notice this) tree serialization format - even if it's binary (personally I think binary is the only choice here).

Then why care about if we will be rendering Wikipedia for someone's browser or Kindle? Because if we do we will need to invent adaptors and switches for all but the format we have chosen as primary. And things change, even that format may change and the framework will be left with DOM format theoretically based on some old HTML3 with patches here and there. It will no more use "underlying HTML".

...

The reason being that use of underlining is discouraged.

I agree but this is again target-thinking. Looking in future, markup does not necessary define presentation, even basic like bold and italic. You certainly know that is discouraged in HTML in favor of - why? Because it's semantics, not presentation. Similarly, is presentation but __ is semantics. We can define __terms__ like this. Does it look good? Can you define terms like that? No, you will need a new entity - of course soon there will be no symbols left on any keyboard, even Japanese!

Ockam's razor.

...

*And that also turned out to have issues, ever tried to write wikitext in piedmontese?

I have already said that a research of modern keyboard layouts with their approximate user count is necessary if someone is going to define an ideal keyboard layout. I am sure that even "=" can be absent in some layouts (Japanese again). But there are general symbols thanks to IBM.

...

#REDIRECT and __TOC__ are a sad effect of separate building of contents.

Half of current wikitext is a sad effect. Most of C++ standard is a sad effect. Come on, does this prevents us from producing positive?

This is what next generation of a software is meant to bring - core reworked to the last screw based on previous use experience.

...

You can't wrap #REDIRECT in a template, though, because the redirect applies to the template itself (unless you use some odd escaping?)

Once again target-thinking. Why limit {{...}} to templates? I have mentioned this in my message:

2012/2/6 Pavel Tkachenko proger.xp@gmail.com:

...

can be uniformized in a way similar to template insertions: {{redir New page}}, ... Templates can be called as {{tpl template arg arg arg}}

Note that templates are subset of {{construct}} features. "Redir" and "TOC" are actually not templates but "extensions". Or "actions", name doesn't change the meaning. The point is to have a uniform syntax for custom constructs, in other words extensions. It's obvious that no matter how well-thought a standard is there will always be missing features once it hits the reality. It must be prepared for this and this construct is one of the ways.

Now, to be concrete, what I see as a better syntax for text markup: * formatting: **bold** //italic// __underline__ (or other semantic meaning) --strikethru-- ++small++ (semantics) ^^superscript^^ * styling text - replacement for : !!(cls)text!! * code: %%unformatted%% * highlighting: %%(php)unformatted%% * lists (ordered, unordered, definition) - already covered above * blocks in different language (ISO 639): @@ru text@@ * footnotes: [[*footnote text]] * quotes: >inline ( >>older etc. ) and <[block]> * terms - replacement for : (?term desc?) or (?space term==desc?) * headings - just like current wikitext, ==heading== and so on * misc markup: ??comment?? (invisible in resulting document), HTML &&entities; (double ampersand)

The actual markup is almost twice as large as the above but you already know it: most tokens have block form as opposed to the inline (above). * styling blocks: !!(cls) content !! * code highlighting: %%(php) echo 'Hello, world!'; %% * language blocks: @@jp some Japanese text @@ * footnotes: [[* Footnote. [[Link]] More text. ]] * comments: ?? author's comment. can be shown in "draft output mode". ??

As demonstrated, it uses no HTML, BB-codes or any other tag-driven markup. Symbols used are (ordered by rough use count): * / % ( ) ! = - ? & > [ ] _ + ^

The first 9 symbols are quite common, so are _ and +. I have put [ ] in the end because if alternative ((link)) syntax is allowed then those two are only used by <[blockquote]>.

If wikitext syntax improvement is to be considered by the community I am ready to give it more details. The above listing misses several important points which require more explanations (in particular about %%code and {{action}} calls).

Signed, P. Tkachenko

Trevor Parscal

6:09 p.m.

This thread is quickly becoming quite tl;dr - but I think the discussions are valid and useful

Maybe we could break some of these desperate topics up a bit?

- Trevor

On Tue, Feb 7, 2012 at 8:35 AM, Pavel Tkachenko proger.xp@gmail.com wrote:

...

Platonides,

2012/2/7 Platonides platonides@gmail.com:

...
they are not imposed by the wikitext in any way, and could be removed today if wished:

Then why are they still there?

As I have said in my previous message I am ready to break down any piece of markup that you want. Templates are just most crazy part of current wikitext.

...
...

What is "pp-semi" why it "move-indef"?

Names given by the users.

It's funny that users give names that others don't understand. Even those who are "technically proficient" but not part of "the elite".

...
The pipe in {{About|the planet}} can look odd, but the pipe at the beginning of the line looks natural. It seems like some kind of continuation of the {{.

Apart from just "looking natural" argument I would put "crucial need". I think quotes look natural after template parameter names' - but they have no use and will duplicate existing functionality (a parameter cannot last after the beginning of next parameter, for instance).

In other words why do we need a pipe if we assume that a template can have two modes: inline and block. Inline contain no line feeds, block contain line feeds before each of their parameters.

But first let's think if template should have 'mixed' mode. This will make things more complex without any need and it will make source look messy because it'll depend on the user if he wants to put that particular parameter on a separate line or not (imagine two guys: one of a "VT100" terminal with 80 characters in column and one with a latest 30" plasma display). It will also require markup to provide additional means of separating parameters if line feed can be trusted no more.

It's time to sharpen our Occam's Razor or the life will do this.

...
Space separated arguments are more readable for the casual editor, but normal editors would have a harder time to find out what's the template and what the parameters.

You're thinking in terms of parameters and my point was to discard all of this stuff and think in terms of human writing.

What looks more natural - {{About|Earth|the planet}} or {{About Earth, the planet}}? Since when our handwriting produces pipes instead of spaces (and not even all of them?).

...
Also, there are colons as parameters. How would you write as the

parameter

...
the article [[Gypsy: A Musical Fable]] or [[Batman: Year One]] ? By banning ':' in titles?

Have I said something about colons and links? Links are fine with colons or any other symbols.

But if we're touching this pipes in links are not that intuitive either. Pipes are actually not present on many keyboard layouts but even apart from that it's more natural to use an equality sign. Or double, for the purpose of text markup.

In fact, doubling a symbol is a great way of both differentiating it from misoperations and making easily recognizable by human eye. Space can be used for the same purpose (as a delimiter).

Let's be concrete:

Links are wrapped in [[ and ]].

Links may optionally have titles. A title is separated from link

URL (or local page name) with a space. 3. Those links which contain spaces in their address can be given title after double equality sign. 4. Finally, in very rare cases when both space and equality symbol is necessary a special markup-wise (!) escape symbol can be used.

Note how we tie up the links between markup at a whole and its particular tokens:

Markup-wise, tokens (links, headings, formatting, etc.) are created

using double symbols. Say, [[ and ]] for links, == and == for headings, **bold**, __underline__, etc. This is the cornerstone. 2. Markup-wise, there is a single way of escaping markup. Escaping must be particularly well-thought because it will be the trickiest part even for experienced editors.

Currently wikitext uses terrible "<nowiki>stuff</nowiki>" but it doesn't always work and HTMLTidy comes in handy with its < and >. And some places (such as link titles) cannot be escaped altogether. Of course, given this approach we are forced to use an inconvenient symbol - pipe - which is unlikely to occur in normal text and thus needs no escaping 99% of cases.

But let's put this from bottom-up ot top-down. Think about a good-looking, rarely-used symbol that we will use for escaping... I'm sure the amount of texts Wikipedia has it's easy to conduct a research to determine that symbol but in this example I'll pick tilde (~), I'm actually using it for that purpose in my home-made markups.

What follows is the complete list of cases:

[[Link]]

[[Link Title]] = [[Link|Title]] now

[[Space link==Title]] == [[Space link|Title]] now (pipe needs

layout change while ==' doesn't) 4. An URL containing '==' in its query part: [[http://weird.url/?k~==v]]. I've put tilde before '==' to prevent it from being treated as an address/title separator. Since there's no other separator - link has no caption. How wikitext handles links with pipes? Are they banned? 5. Local page name containing tilde: [[~ (Tilde)==Title]] - nothing breaks here because tilde is only special when it escapes something... and space isn't something to be escaped, so tilde is treated as normal text. 6. Extreme case: an URL containing both tilde and double equality sign which originally (in browser address bar) looks like: http://url?k~==v. In our link we will simply triple the tilde - making first tilde to escape itself and last tilde - to escape the separator: [[http://url?k~~~==v]]. 7. For the sake of completeness, most extreme case: URL with ~== and we've got to specify a title. We don't have to invent a wheel - simply put the title after a space and properly escape the URL as in case #6: [[http://url?k~~~==v Title]].

Now if you think "wow. that's a particular mess, nobody is going to understand and use this syntax". But before that note that only the first 3 cases are standard - others are exceptions which still increase their complexity gracefully according to the task.

For example, how wikitext will handle links containing pipes? hard to say, probably [[|]]. <nowiki>? [[<nowiki&gtl]]. <nowiki>? [[&lt;nowiki&gtl]].

As it is demonstrated the above system scales well and thus there is a space for improving wikitext, even if not using exactly my scheme.

A best thing about this is that once you're memorized the above 2 fundamental rules you can apply them anywhere. Markup can be escaped using tilde: ~[[no more a link]]. Associative (aka definition) lists can be uniformized: = Definition Value = Definition with spaces == Value = Definition of ~==, now its == Value

And so on. Scalable.

...
No. We have "Earth, the planet"!

You mean that a template cannot put "the" in front because some planets have their name without "the"? If so, they are not planets (nebulas, satellites, etc.) and a different template must be used. And if it's used the machine can handle language peculiarities.

...
You will need the proper name in the infobox, such as "Felis silvestris catus", even if the article is just called "Cat".

Yes: {{About Felis silvestris catus, cat}}

...
...
What kind of false positives are we talking about? Will any sane individual spend his precious time not editing but preparing to edit this mess?

I think they copy and paste, then fill the fields. Which is a good way of learning as they encounter it.

And the best way is to create/write things from scratch on our own. Where have you learned programming, in the class copy-pasting lines from blackboard to your notebook or in the office actually hitting the keys?

Many things can be hidden under copy-paste approach, there's even a notion "code monkey". I believe if wikitext continutes the path of STL there will be "wiki monkeys", provided that WMF gets commercial (which is a last thing I want to see).

But only syntax that hides nothing and is crisp to its bones can be called fair. Only syntax that doesn't require any "templates" that you just "copy" and "feel"... sorry, "fill in". And of course, there are cases when templates (I mean wikitext {{ and }}) with parameters are of great help and there are cases when parameters with their pipes and whistles are redundant. Just like life.

...
Well, just by looking at it I have no idea what those temperatures are :)

You mean that "| max_temp_1 = 331 K<ref name=asu_highest_temp/>" gives you more ideas?

...
What if they were in a different order?

So the machine cannot sort them out and determine which is "max"?

...
184 and 331 are probably some kind of limits, but what's that 287.2? Some kind of boilding point?

Hm, you might be right on this one and that it's better to have two template parameters: "temperature" and "temperature mean". This is what discussion is for - finding rough edges, right.

...
The goal of wikitext is to make html editing easy.

HTML editing? I thought wikitext was about text editing. Why not edit HTML using HTML?

...
HTML only needs a few special characters: <>&;=" but it's bothersome.

And 4 of them are absent on my native layout.

...
We define that * is a bullet and serves to make lists: It's easier to type, and looks good.

Completely agree, this is a human-readable markup.

...
We define # as the equivalent for numbered lists. Note that there's no usage of # for numbers in many cultures, so that's less 'visual' there.

And this is to be refactored adding features along the way. Let's write ordered lists using digits!

One

Two

Three

The Japanese have their own kanji for numbers but they use Arabic digits sometimes as well.

Oh, I hear your thoughts - "items are moved/removed and the order is gone". Sure but machine can help us out:

One

Two

Three

We use a little trick here: since ordered items with identical marker value are useless in human texts the markup can use them to represent automatically ordered markers. But even if someday we need two identical markers this can be fixed using some clean syntax, such as:

First

1#1 First

Just as with links and [[eq==signs]] above our syntax gradually increases complexity. The person gets crazy so does the markup - but not before the user.

The added features of this approach are:

Lists now support, say, 6 marker types: 1. (digits) 01.

(zero-padded digits) a. (lower-alpha) A. (upper-alpha) i. (lower-Roman) I. (upper-Roman) 2. Lists now can have markers of any value: 3. Third 2. Second

First

List types can be mixed:

Digit

i. Roman a. Alpha

...
But each feature requires new symbols, and when you look at those available on every layout, you get *very* limited...

This is far from truth. During last years I have developed for my projects a markup that generally surpasses MediaWiki's and uses just common symbols. I will briefly summarize it in the end of this message.

...
For example, I could decide to list imagemaps as `Image1? `Image2?... (grave and acute), but oh, many keyboards don't have both accents.

Obviously, you cannot get a symbol for each and every particular markup case. But you don't have to. Compare how often will you use imagemaps and, say, highlighted PHP code. Right now both are pretty lengthy and angular in wikitext and the former (less used) is even shorter:

<imagemap> ... </imagemap> 10 + 11 symbols.

<source lang="php"> ... </source> 19 + 9 symbols.

Do you see my point?

...
And obviously, you can't use something that would easily appear in a normal text (or you start defining escape codes which are uglier, too).

Nah, escape codes come from C, don't forget it's not a very friendly language. Instead of escape codes one can use "quoting" (in Pascal style) - I have already touched tilde symbol above.

But I agree that markup that relies on escapes/quoting of any kind is not fair. Escapes by definition are exceptions and cannot overwhelm the common rule.

...
How do you type the *content* of the references?

As a title: [[*http:// My reference goes here]]. Or, better: we can use the same syntax for footnotes: [[*My footnote]] and references will be removed. So two types of footnotes: inline [[*text with ''markup'']], and block which start with [[*, then line break, then any kind of markup (including line breaks, more footnotes, links, even headings and lists), then line break, closing ]] and another line break. Uniform and consistent yet powerful and flexible.

...
But when you want to "go further", you start being limited.

I believe we will not if we start thinking in term of simplicity, not features - the former will give features as demonstrated above but if we focus on the latter it will force it out.

...
Just because that's what the underlying html uses.

This thinking is the problem - it is attached to particular use case. "We will use HTML 5 DOM just because we won't need to transform it when rendering". But what about PDF? XML? FB2? RTF? DOC? ODT?

Machine can handle everything, its time is much less precious than human's. Once written, a framework will perform wikitext <-> HTML transformations in an instant; so goes for some intermediate (completely detached from target, notice this) tree serialization format - even if it's binary (personally I think binary is the only choice here).

Then why care about if we will be rendering Wikipedia for someone's browser or Kindle? Because if we do we will need to invent adaptors and switches for all but the format we have chosen as primary. And things change, even that format may change and the framework will be left with DOM format theoretically based on some old HTML3 with patches here and there. It will no more use "underlying HTML".

...
The reason being that use of underlining is discouraged.

I agree but this is again target-thinking. Looking in future, markup does not necessary define presentation, even basic like bold and italic. You certainly know that is discouraged in HTML in favor of - why? Because it's semantics, not presentation. Similarly, is presentation but __ is semantics. We can define __terms__ like this. Does it look good? Can you define terms like that? No, you will need a new entity - of course soon there will be no symbols left on any keyboard, even Japanese!

Ockam's razor.

...
*And that also turned out to have issues, ever tried to write wikitext in piedmontese?

I have already said that a research of modern keyboard layouts with their approximate user count is necessary if someone is going to define an ideal keyboard layout. I am sure that even "=" can be absent in some layouts (Japanese again). But there are general symbols thanks to IBM.

...
#REDIRECT and __TOC__ are a sad effect of separate building of contents.

Half of current wikitext is a sad effect. Most of C++ standard is a sad effect. Come on, does this prevents us from producing positive?

This is what next generation of a software is meant to bring - core reworked to the last screw based on previous use experience.

...
You can't wrap #REDIRECT in a template, though, because the redirect

applies

...
to the template itself (unless you use some odd escaping?)

Once again target-thinking. Why limit {{...}} to templates? I have mentioned this in my message:

2012/2/6 Pavel Tkachenko proger.xp@gmail.com:

...
can be uniformized in a way similar to template insertions: {{redir New page}}, ... Templates can be called as {{tpl template arg arg arg}}

Note that templates are subset of {{construct}} features. "Redir" and "TOC" are actually not templates but "extensions". Or "actions", name doesn't change the meaning. The point is to have a uniform syntax for custom constructs, in other words extensions. It's obvious that no matter how well-thought a standard is there will always be missing features once it hits the reality. It must be prepared for this and this construct is one of the ways.

Now, to be concrete, what I see as a better syntax for text markup:

formatting: **bold** //italic// __underline__ (or other semantic

meaning) --strikethru-- ++small++ (semantics) ^^superscript^^

styling text - replacement for : !!(cls)text!!

code: %%unformatted%%

highlighting: %%(php)unformatted%%

lists (ordered, unordered, definition) - already covered above

blocks in different language (ISO 639): @@ru text@@

footnotes: [[*footnote text]]

quotes: >inline ( >>older etc. ) and <[block]>

terms - replacement for : (?term desc?) or

(?space term==desc?)

headings - just like current wikitext, ==heading== and so on

misc markup: ??comment?? (invisible in resulting document), HTML

&&entities; (double ampersand)

The actual markup is almost twice as large as the above but you already know it: most tokens have block form as opposed to the inline (above).

styling blocks:

!!(cls) content !!

code highlighting:

%%(php) echo 'Hello, world!'; %%

language blocks:

@@jp some Japanese text @@

footnotes:

[[* Footnote. [[Link]] More text. ]]

comments:

?? author's comment. can be shown in "draft output mode". ??

As demonstrated, it uses no HTML, BB-codes or any other tag-driven markup. Symbols used are (ordered by rough use count):

/ % ( ) ! = - ? & > [ ] _ + ^

The first 9 symbols are quite common, so are _ and +. I have put [ ] in the end because if alternative ((link)) syntax is allowed then those two are only used by <[blockquote]>.

If wikitext syntax improvement is to be considered by the community I am ready to give it more details. The above listing misses several important points which require more explanations (in particular about %%code and {{action}} calls).

Signed, P. Tkachenko

Wikitext-l mailing list Wikitext-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitext-l

Trevor Parscal

6:20 p.m.

This thread is quickly becoming quite tl;dr - but I think the discussions are valid and useful.

Also, the list just prevented me from posting this message because it was too large (because I included all the quoted replies, which my mail client does by default)

Maybe we could break some of these desperate topics up a bit?

- Trevor

Trevor Parscal

6:29 p.m.

And by desperate, I really meant disparate.

Really.

- Trevor

On Tue, Feb 7, 2012 at 9:20 AM, Trevor Parscal tparscal@wikimedia.orgwrote:

...

This thread is quickly becoming quite tl;dr - but I think the discussions are valid and useful.

Also, the list just prevented me from posting this message because it was too large (because I included all the quoted replies, which my mail client does by default)

Maybe we could break some of these desperate topics up a bit?

Trevor

Platonides

11:27 p.m.

On 02/07/12 17:35, Pavel Tkachenko wrote:

...

Platonides,

2012/2/7 Platonides platonides@gmail.com:

...
they are not imposed by the wikitext in any way, and could be removed today if wished:

Then why are they still there?

Nobody proposed to change the template in that way? :)

...

As I have said in my previous message I am ready to break down any piece of markup that you want. Templates are just most crazy part of current wikitext.

...
...

What is "pp-semi" why it "move-indef"?

Names given by the users.

It's funny that users give names that others don't understand. Even those who are "technically proficient" but not part of "the elite".

...
The pipe in {{About|the planet}} can look odd, but the pipe at the beginning of the line looks natural. It seems like some kind of continuation of the {{.

Apart from just "looking natural" argument I would put "crucial need". I think quotes look natural after template parameter names' - but they have no use and will duplicate existing functionality (a parameter cannot last after the beginning of next parameter, for instance).

...

In other words why do we need a pipe if we assume that a template can have two modes: inline and block. Inline contain no line feeds, block contain line feeds before each of their parameters.

But first let's think if template should have 'mixed' mode. This will make things more complex without any need and it will make source look messy because it'll depend on the user if he wants to put that particular parameter on a separate line or not (imagine two guys: one of a "VT100" terminal with 80 characters in column and one with a latest 30" plasma display). It will also require markup to provide additional means of separating parameters if line feed can be trusted no more.

If you start creating inline, block and mixed template modes, I suspect the syntax will end up being chaotic (I'm thinking in concrete cases in MW syntax).

...

...
Space separated arguments are more readable for the casual editor, but normal editors would have a harder time to find out what's the template and what the parameters.

You're thinking in terms of parameters and my point was to discard all of this stuff and think in terms of human writing.

That assumes that there's a non-ambiguous way to express that in natural language (plus that it is easily parseable by a machine).

...

What looks more natural - {{About|Earth|the planet}} or {{About Earth, the planet}}? Since when our handwriting produces pipes instead of spaces (and not even all of them?).

So, how do you split {{About Bijection, injection and surjection}} ?

The point of using an additional character not used in normal language is precisely for working the metalanguage.

...

...
Also, there are colons as parameters. How would you write as the parameter the article [[Gypsy: A Musical Fable]] or [[Batman: Year One]] ? By banning ':' in titles?

Have I said something about colons and links? Links are fine with colons or any other symbols.

You mentioned colons for template arguments << Now the machine's part: 1. "About" is a special "template" or some other construct. When it's "ran" (processed) it accepts 2 colon-separated "arguments". >>

I'm acting as the devil's advocate asking you how to provide those titles as parameters to a template.

...

But if we're touching this pipes in links are not that intuitive either. Pipes are actually not present on many keyboard layouts but even apart from that it's more natural to use an equality sign. Or double, for the purpose of text markup.

It's consistent with the use of pipes in templates (which do use equal in that way to name parameters). Although link syntax was probably earlier.

...

In fact, doubling a symbol is a great way of both differentiating it from misoperations and making easily recognizable by human eye. Space can be used for the same purpose (as a delimiter).

Let's be concrete:

Links are wrapped in [[ and ]].

Links may optionally have titles. A title is separated from link

URL (or local page name) with a space. 3. Those links which contain spaces in their address can be given title after double equality sign.

So is [[Batman Forever]] your syntax for [[Batman Forever|Batman Forever]] or [[Batman|Forever]] ? So much cases are bad, KISS.

...

Finally, in very rare cases when both space and equality symbol is

necessary a special markup-wise (!) escape symbol can be used.

As an example: [[2 + 2 = 5]]

...

Note how we tie up the links between markup at a whole and its particular tokens:

Markup-wise, tokens (links, headings, formatting, etc.) are created

using double symbols. Say, [[ and ]] for links, == and == for headings, **bold**, __underline__, etc. This is the cornerstone.

Would you remove === headings?

...

Markup-wise, there is a single way of escaping markup. Escaping

must be particularly well-thought because it will be the trickiest part even for experienced editors.

Currently wikitext uses terrible "<nowiki>stuff</nowiki>" but it doesn't always work and HTMLTidy comes in handy with its < and >. And some places (such as link titles) cannot be escaped altogether.

Really? I think you can.

...

Of course, given this approach we are forced to use an inconvenient symbol - pipe - which is unlikely to occur in normal text and thus needs no escaping 99% of cases.

But let's put this from bottom-up ot top-down. Think about a good-looking, rarely-used symbol that we will use for escaping... I'm sure the amount of texts Wikipedia has it's easy to conduct a research to determine that symbol but in this example I'll pick tilde (~), I'm actually using it for that purpose in my home-made markups.

What follows is the complete list of cases:

[[Link]]

[[Link Title]] = [[Link|Title]] now

[[Space link==Title]] == [[Space link|Title]] now (pipe needs

layout change while ==' doesn't) 4. An URL containing '==' in its query part: [[http://weird.url/?k~==v]]. I've put tilde before '==' to prevent it from being treated as an address/title separator. Since there's no other separator - link has no caption. How wikitext handles links with pipes? Are they banned?

They work fine. Note that external links use a space to separate url and caption (being very inconsistent with internal links).

Your proposal for forcing to edit the urls is very bad. You can't just paste, you need to go changing every = on it (which is a frequent character) to ~==.

...

Local page name containing tilde: [[~ (Tilde)==Title]] - nothing

breaks here because tilde is only special when it escapes something... and space isn't something to be escaped, so tilde is treated as normal text. 6. Extreme case: an URL containing both tilde and double equality sign which originally (in browser address bar) looks like: http://url?k~==v. In our link we will simply triple the tilde - making first tilde to escape itself and last tilde - to escape the separator: [[http://url?k~~~==v]]. 7. For the sake of completeness, most extreme case: URL with ~== and we've got to specify a title. We don't have to invent a wheel - simply put the title after a space and properly escape the URL as in case #6: [[http://url?k~~~==v Title]].

Aaaargh!!

...

Now if you think "wow. that's a particular mess, nobody is going to understand and use this syntax". But before that note that only the first 3 cases are standard - others are exceptions which still increase their complexity gracefully according to the task.

For example, how wikitext will handle links containing pipes? hard to say, probably [[|]]. <nowiki>? [[<nowiki&gtl]]. <nowiki>? [[&lt;nowiki&gtl]].

Pipes are banned from titles.

...

As it is demonstrated the above system scales well and thus there is a space for improving wikitext, even if not using exactly my scheme.

A best thing about this is that once you're memorized the above 2 fundamental rules you can apply them anywhere. Markup can be escaped using tilde: ~[[no more a link]]. Associative (aka definition) lists can be uniformized: = Definition Value = Definition with spaces == Value = Definition of ~==, now its == Value

And so on. Scalable.

Messy.

...

...
No. We have "Earth, the planet"!

You mean that a template cannot put "the" in front because some planets have their name without "the"? If so, they are not planets (nebulas, satellites, etc.) and a different template must be used. And if it's used the machine can handle language peculiarities.

Nope, I meant that you can't just get "whatever goes after About" and show it as the name of the planet in the template.

...

...
You will need the proper name in the infobox, such as "Felis silvestris catus", even if the article is just called "Cat".

Yes: {{About Felis silvestris catus, cat}}

...
...
What kind of false positives are we talking about? Will any sane individual spend his precious time not editing but preparing to edit this mess?

I think they copy and paste, then fill the fields. Which is a good way of learning as they encounter it.

And the best way is to create/write things from scratch on our own. Where have you learned programming, in the class copy-pasting lines from blackboard to your notebook or in the office actually hitting the keys?

I'm not sure this is a good analogy. Copy-pasting chunks of code look like copying phrases from other articles to make your own. That should be original. OTOH, reusing the existing LaTeX template is much more appropiate than writing your own from scratch trying to copy the style of the provided one.

Even if I write a program from scratch, I should make it consistent with other tools. That means an appropiate arguments would be sort -r --ignore-case --sort=month ./myfile instead of sort <- !case (sort as month) \\./myfile\\

regardless if I'm using getopt or not.

...

Many things can be hidden under copy-paste approach, there's even a notion "code monkey". I believe if wikitext continutes the path of STL there will be "wiki monkeys", provided that WMF gets commercial (which is a last thing I want to see).

But only syntax that hides nothing and is crisp to its bones can be called fair. Only syntax that doesn't require any "templates" that you just "copy" and "feel"... sorry, "fill in". And of course, there are cases when templates (I mean wikitext {{ and }}) with parameters are of great help and there are cases when parameters with their pipes and whistles are redundant. Just like life.

...
Well, just by looking at it I have no idea what those temperatures are :)

You mean that "| max_temp_1 = 331 K<ref name=asu_highest_temp/>" gives you more ideas?

...
What if they were in a different order?

So the machine cannot sort them out and determine which is "max"?

You are giving many attributions to the machine. Personally, I would spit out an error, just in they were eg. in different units. But you are making up your syntax, then requiring the system to adapt for you.

...

...
184 and 331 are probably some kind of limits, but what's that 287.2? Some kind of boilding point?

Hm, you might be right on this one and that it's better to have two template parameters: "temperature" and "temperature mean". This is what discussion is for - finding rough edges, right.

...

...
The goal of wikitext is to make html editing easy.

HTML editing? I thought wikitext was about text editing. Why not edit HTML using HTML?

Because it's considered cumbersome. (Actually, it's presentational editing, but as the presentation is obtained by using HTML as an intermediate language...)

...

...
HTML only needs a few special characters: <>&;=" but it's bothersome.

And 4 of them are absent on my native layout.

...
We define that * is a bullet and serves to make lists: It's easier to type, and looks good.

Completely agree, this is a human-readable markup.

...
We define # as the equivalent for numbered lists. Note that there's no usage of # for numbers in many cultures, so that's less 'visual' there.

And this is to be refactored adding features along the way. Let's write ordered lists using digits!

One

Two

Three

The Japanese have their own kanji for numbers but they use Arabic digits sometimes as well.

Oh, I hear your thoughts - "items are moved/removed and the order is gone". Sure but machine can help us out:

One

Two

Three

We use a little trick here: since ordered items with identical marker value are useless in human texts the markup can use them to represent automatically ordered markers. But even if someday we need two identical markers this can be fixed using some clean syntax, such as:

First

1#1 First

And you have complicated the originally clean syntax of 1, 2, 3

...

Just as with links and [[eq==signs]] above our syntax gradually increases complexity. The person gets crazy so does the markup - but not before the user.

The added features of this approach are:

Lists now support, say, 6 marker types: 1. (digits) 01.

(zero-padded digits) a. (lower-alpha) A. (upper-alpha) i. (lower-Roman) I. (upper-Roman) 2. Lists now can have markers of any value: 3. Third 2. Second

First

List types can be mixed:

Digit

i. Roman a. Alpha

...

...
But each feature requires new symbols, and when you look at those available on every layout, you get *very* limited...

This is far from truth. During last years I have developed for my projects a markup that generally surpasses MediaWiki's and uses just common symbols. I will briefly summarize it in the end of this message.

...
For example, I could decide to list imagemaps as `Image1? `Image2?... (grave and acute), but oh, many keyboards don't have both accents.

Obviously, you cannot get a symbol for each and every particular markup case. But you don't have to. Compare how often will you use imagemaps and, say, highlighted PHP code. Right now both are pretty lengthy and angular in wikitext and the former (less used) is even shorter:

<imagemap> ... </imagemap> 10 + 11 symbols.

<source lang="php"> ... </source> 19 + 9 symbols.

Do you see my point?

Interesting. I would have considered <imagemap> easier (no parameter, quotes...) not even realising the tag length differences.

...

...
And obviously, you can't use something that would easily appear in a normal text (or you start defining escape codes which are uglier, too).

Nah, escape codes come from C, don't forget it's not a very friendly language. Instead of escape codes one can use "quoting" (in Pascal style) - I have already touched tilde symbol above.

But I agree that markup that relies on escapes/quoting of any kind is not fair. Escapes by definition are exceptions and cannot overwhelm the common rule.

...
How do you type the *content* of the references?

As a title: [[*http:// My reference goes here]]. Or, better: we can use the same syntax for footnotes: [[*My footnote]] and references will be removed. So two types of footnotes: inline [[*text with ''markup'']], and block which start with [[*, then line break, then any kind of markup (including line breaks, more footnotes, links, even headings and lists), then line break, closing ]] and another line break. Uniform and consistent yet powerful and flexible.

...
But when you want to "go further", you start being limited.

I believe we will not if we start thinking in term of simplicity, not features - the former will give features as demonstrated above but if we focus on the latter it will force it out.

...
Just because that's what the underlying html uses.

This thinking is the problem - it is attached to particular use case. "We will use HTML 5 DOM just because we won't need to transform it when rendering". But what about PDF? XML? FB2? RTF? DOC? ODT?

Machine can handle everything, its time is much less precious than human's. Once written, a framework will perform wikitext <-> HTML transformations in an instant; so goes for some intermediate (completely detached from target, notice this) tree serialization format - even if it's binary (personally I think binary is the only choice here).

Then why care about if we will be rendering Wikipedia for someone's browser or Kindle? Because if we do we will need to invent adaptors and switches for all but the format we have chosen as primary. And things change, even that format may change and the framework will be left with DOM format theoretically based on some old HTML3 with patches here and there. It will no more use "underlying HTML".

...
The reason being that use of underlining is discouraged.

I agree but this is again target-thinking. Looking in future, markup does not necessary define presentation, even basic like bold and italic. You certainly know that is discouraged in HTML in favor of - why? Because it's semantics, not presentation. Similarly, is presentation but __ is semantics. We can define __terms__ like this. Does it look good? Can you define terms like that? No, you will need a new entity - of course soon there will be no symbols left on any keyboard, even Japanese!

Ockam's razor.

...
*And that also turned out to have issues, ever tried to write wikitext in piedmontese?

I have already said that a research of modern keyboard layouts with their approximate user count is necessary if someone is going to define an ideal keyboard layout. I am sure that even "=" can be absent in some layouts (Japanese again). But there are general symbols thanks to IBM.

...
#REDIRECT and __TOC__ are a sad effect of separate building of contents.

Half of current wikitext is a sad effect. Most of C++ standard is a sad effect. Come on, does this prevents us from producing positive?

This is what next generation of a software is meant to bring - core reworked to the last screw based on previous use experience.

...
You can't wrap #REDIRECT in a template, though, because the redirect applies to the template itself (unless you use some odd escaping?)

Once again target-thinking. Why limit {{...}} to templates? I have mentioned this in my message:

2012/2/6 Pavel Tkachenko proger.xp@gmail.com:

...
can be uniformized in a way similar to template insertions: {{redir New page}}, ... Templates can be called as {{tpl template arg arg arg}}

Note that templates are subset of {{construct}} features. "Redir" and "TOC" are actually not templates but "extensions". Or "actions", name doesn't change the meaning. The point is to have a uniform syntax for custom constructs, in other words extensions. It's obvious that no matter how well-thought a standard is there will always be missing features once it hits the reality. It must be prepared for this and this construct is one of the ways.

Now, to be concrete, what I see as a better syntax for text markup:

formatting: **bold** //italic// __underline__ (or other semantic

meaning) --strikethru-- ++small++ (semantics) ^^superscript^^

styling text - replacement for : !!(cls)text!!

code: %%unformatted%%

highlighting: %%(php)unformatted%%

lists (ordered, unordered, definition) - already covered above

blocks in different language (ISO 639): @@ru text@@

footnotes: [[*footnote text]]

quotes: >inline ( >>older etc. ) and <[block]>

terms - replacement for : (?term desc?) or

(?space term==desc?)

headings - just like current wikitext, ==heading== and so on

misc markup: ??comment?? (invisible in resulting document), HTML

&&entities; (double ampersand)

Would html links become italic? (that was a problem of wikicreole, it was defined as 'italic unless in links')

...

The actual markup is almost twice as large as the above but you already know it: most tokens have block form as opposed to the inline (above).

styling blocks:

!!(cls) content !!

code highlighting:

%%(php) echo 'Hello, world!'; %%

language blocks:

@@jp some Japanese text @@

footnotes:

[[* Footnote. [[Link]] More text. ]]

comments:

?? author's comment. can be shown in "draft output mode". ??

As demonstrated, it uses no HTML, BB-codes or any other tag-driven markup. Symbols used are (ordered by rough use count):

/ % ( ) ! = - ? & > [ ] _ + ^

The first 9 symbols are quite common, so are _ and +. I have put [ ] in the end because if alternative ((link)) syntax is allowed then those two are only used by <[blockquote]>.

If wikitext syntax improvement is to be considered by the community I am ready to give it more details. The above listing misses several important points which require more explanations (in particular about %%code and {{action}} calls).

Well, I have to say it seems well though, it "doesn't look bad".

Platonides

11:52 p.m.

On 02/07/12 17:33, Pavel Tkachenko wrote:

...

Giant mails follow, no panic.

2012/2/6 Gabriel Wicke wicke@wikidev.net:

...
The enriched HTML DOM we are building (and actually most token stream processing including template expansion) is not tied to any specific syntax or user interface.

It is tied to HTML and it's the same. Even if all of current wikitext features can be represented by HTML (which I doubt) there's no guarantee that this will be true in furutre. This view has probably led to current messy markup.

When I'm saying that HTML can't represent even current wikitext features I imply that we're not talking about microformats and other XHTML tricks. And if we're talking about plain HTML then why not use a completely new format for storing DOM? Or at least clean XML without any standard namespaces that in theory should ease rendering of DOM into HTML (?). It will be parser-specific and won't suffer from future changes of linked namespaces, will be simple to test, etc.

On Future/Parser development diagram HTML DOM is built right after the stream has been parsed into a tree... in other words HTML5 is used to represent wiki. With tricks.

But I'm already venturing into an offtopic discussion here.

...
But in any case, we first have to implement a solid tokenizer for the current syntax and recreate at least a part of the higher-level functionality (template expansion etc) based on a syntax-independent representation.

I agree on this one.

2012/2/6 Sumana Harihareswara sumanah@wikimedia.org:

...
Pavel, you're clearly both an intelligent and a technical man - but not all intelligence is of the same, technically-minded type, and it's not always backed up by pertinent and complex knowledge.

I'm flattered with your words, thanks, Oliver.

However, this does not explain why at first Wikipedians had no troubles editing (and even creating) articles and now they are gradually loosing this skill. Is this a result of general degradation?

Not necessarily. It doesn't mean that lose ability to edit, but it can mean that the subset people which know how to edit no longer have available the topics they can write about. (this is just a reformulation of your below argument)

There are two subsets: * People able to edit. * People which can add knowledge.

At the beginning, the interesection was huge even if ability to edit was low, just because there was a lot of knowledge missing. So as the knowledge increases (eg. linearly) "people" appear to be more and more stupid for editing.

...

I would hate to think this way and believe it's more what Yury has already said above - the project is just getting mature and, naturally, subjects for new articles that are left require more than a general knowledge while edits for existing articles are either complete, require some special knowledge as well or are plain unmotivated enough - new page patrol, "article babysitting", etc. are all "dirty" work and by definition are not that interesting as adding a new article section, prooflink or even correcting a simple typo. (...)

...

...
The complexity of our existing markup language is a barrier, but not as much as the presence of any markup language whatsoever as a default.

Now this is something specific to argue about. I must admit that your speech has given me something to think about; perhaps you're right and the fact that initial editors of Wikipedia have come from that "first wave" of the Internet users - with this in mind it's understandable why their number is wearing out.

The usability studies that you have referred to speak with one accord that WYSIWYG is a must. I admit it sounds appropriate in that context. Still, another link suggests that even non-technical people were able to edit and (uh!) format text as bold and italic given a bit of help. And then it notes that even before doing any edits - or seeing an editor's window, be it text or visual - people were confronted with dozens of guideline links and warnings.

Which problem is more important? How you're going to present users with warnings in an inline visual editor? Or is it easier to just put "I've read and understood the rules" fobber-off and consider the matter settled?

Ability to edit and knowledge of rules are probably orthogonal. And users have a inmense rule-blindness. They won't want to read pages and pages of rules or tutorials. They just want things done (eg. change a birth date) I think most people act the same way. What was the last time you read the VCR manual? And we should take that into account, too, not making Rules/Tutorials that look like EULAs. Nonetheless, such thing would be hard to do.

...

More things to ponder about before my peaceful sleep, huh.

p.s: I wonder why people who can actually give answers are quite often not in the mailing lists.

2012/2/6 Amgine amgine@wikimedians.ca:

...
As I understand it, for the foreseeable future there will be a raw wiki syntax interface available. I hope contributors can be reassured on this point.

Combined with: 2012/2/6 Trevor Parscal tparscal@wikimedia.org:

...
Make significant changes to what Wikitext is and can do

The problem with this is that if present "raw wiki syntax" will be kept it will ensure that edits continue downfall.

I disagree. Its existence in the backend shouldn't influcence it.

...

...
The concern I see being expressed, fundamentally, is "I have developed skills, practices, and efficiencies with current Wiki syntax. Is your new parser going to destroy my investments in learning? am I going to have to start over with this new system?"

I think it's close in words but not in the meaning. What will you choose: cope with an old dusty car of your grandfather with annual repairs, dyeing, cleaning or find a free day, go to a nearby shop, choose a top-notch car with nano-tech-driven automatic repair, dyeing, cleaner that will serve you for the foreseeable future?

How many programmers (given the opportunity) choose to maintain old spaghetti code over refactoring it to something they'll have pleasure working with?

Quite few, as you probably know. It's the same with common folk who'll stick to old printer, scanner and copier than a new all-in-one device. But it's not right and everyone knows that it's better when they break this trend.

Your proposing is like adopting a new, highly improved C² programming language and throwing all C code (which would be incompatible with the new one). You are not proposing to create a new C² language, but also to stop all support for C.

...

2012/2/7 Jay Ashworth jra@baylink.com:

...
Correct, and it isn't merely investments in learning; there are likely investments in wrap-around-the-outside coding which assume access to markup as well. Not All Mediawikiae Are Wikipedia.

I hope this was not a case for keeping old markup running. Most of the time it's better to provide backward compatibility module running on top of the new system than to fix and repair the old system trying to pursue mythical goal of supporting old versions.

Look at C++ STL and what it has become since '89. Look at Microsoft Windows and if its performance on an 4xi7 core has scaled along with Windows 95 on an 80356.

2012/2/7 Mihaly Heder hedermisi@gmail.com:

...
the millions of pages we already have is not easy to convert in the absence of a formalized wiki grammar

Indeed, but this can be solved by bringing together all pieces of modern wikitext under one roof and building a new strict grammar apart from that. Then a converter can be written that will seamlessly transform old syntax into new and warn user when this is not possible.

So you say "Oh, no problem. I will make this wonderful C2C² converter that will seamlessly produce equivalent C² code from the original C one, so you don't need to rewrite things from scratch. It will be automatically taken care of." Yes. Until that inline assembly that worked in C code makes a random memory overwrite in the kernel. And that other function, which got a compatible by luck of sizeof(int) == sizeof(void*), now in C² makes the application die horribily... and so on.

...

...
From what I know this is the direction WMF is going.

...
and some of them are already afraid that this skill will be obsolete because of the new editor, like the thread starter

This is the second time this argument appears in this thread but I don't understand it. Will you be afraid to "lose" your old coat that has worn our in a bin?

If people is concerned about it, that's a reason to be concerned even if it wasn't rational. I think the reasoning can be explaines as it's better the devil you know...

Pavel Tkachenko

8 Feb 8 Feb

11:20 a.m.

On 08.02.2012 2:52, Platonides wrote:

...

At the beginning, the interesection was huge even if ability to edit was low, just because there was a lot of knowledge missing. So as the knowledge increases (eg. linearly) "people" appear to be more and more stupid for editing.

This is well put. Perhaps this calls for RTFM but were there studies that have examined the trend from this POV?

...

Ability to edit and knowledge of rules are probably orthogonal. And users have a inmense rule-blindness. They won't want to read pages and pages of rules or tutorials. They just want things done (eg. change a birth date) I think most people act the same way. What was the last time you read the VCR manual?

Yes, I agree with you completely, I don't even remember if I have read a manual for any of my cellphones.

...

And we should take that into account, too, not making Rules/Tutorials that look like EULAs. Nonetheless, such thing would be hard to do.

Indeed.

On 07.02.2012 1:23, Mihály Héder wrote:

...

What I imagine is a system which does not bother me until I use a cite template for the first time for example - but then it tries to evaluate whether I use it according the guidelines - and probably explain me the guidelines.

I have missed this part yesterday and it's clearly interesting. I think it's even possible to implement and will certainly improve the usability. An editor without a license checkbox? Very nice, very nice indeed.

...

I know most of this is science fiction but I hope we will have something like this in the far future :)

I don't see any top-science in showing "When external links are acceptable" textbox when the editor founds a user has added an external link. In fact, current MediaWiki editor does this (shows CAPTCHA).

...

But my point is that the guidelines would be much easier to digest if always presented in context and only the relevant part.

This is the best definition.

...

Maybe I'm an utopist but I can imagine a wikipedia where fresh editors just start typing their knowledge with zero education and still be able to immediately produce valuable output, provided they have good intentions.

And combined with a visual editor evne a housewife can bring simple edits, right. This might be a way to go.

Still, back to our reality...

...

...
2012/2/6 Amgineamgine@wikimedians.ca:

...
As I understand it, for the foreseeable future there will be a raw wiki syntax interface available. I hope contributors can be reassured on this point.

Combined with: 2012/2/6 Trevor Parscaltparscal@wikimedia.org:

...
Make significant changes to what Wikitext is and can do

The problem with this is that if present "raw wiki syntax" will be kept it will ensure that edits continue downfall.

I disagree. Its existence in the backend shouldn't influcence it.

In the backend - no, it won't; in the frontend/user space - yes, it will. But we seem to be agree here.

...

Your proposing is like adopting a new, highly improved C² programming language and throwing all C code (which would be incompatible with the new one). You are not proposing to create a new C² language, but also to stop all support for C.

Yes, this is my proposal. However, your phrasing misses one crucial point: there can and will be written a compatibility layer that will even work better than current wikitext parser. I say better because when we will be writing it: 1. Old wikitext syntax will be frozen and this will allow devs create a well-round parser knowing they won't need to add patches and fixes all around later because of 'new features'. 2. Compatibility layer will work on the new parser framework and thus will be easier and faster to create.

Moreover, since current wikitext syntax is more or less based on regular expressions it is not a big deal to process it using more advanced tokenizer.

...

So you say "Oh, no problem. I will make this wonderful C2C² converter that will seamlessly produce equivalent C² code from the original C one, so you don't need to rewrite things from scratch. It will be automatically taken care of."

Correct but not completely, see below.

...

Yes. Until that inline assembly that worked in C code makes a random memory overwrite in the kernel. And that other function, which got a compatible by luck of sizeof(int) == sizeof(void*), now in C² makes the application die horribily... and so on.

Really precise comparison. Code relying on language tricks is hard to migrate. But is this meant to prevent computer languages from evolving?

Backward compatibility is important but shouldn't act like a weight on your leg. If you feel that old code and syntax is suffocating the project will you push on instead of throwing it away, manually replacing incompatible and tricky pieces of code with new?

Similarly, it is not denied that Wikipedia and many other MediaWiki-based projects use tricky markup equivalent to inline assembler in old C. Of course, such code will be troublesome to support (albeit possible with enough effort) and thus it might be better to error when converting such documents.

How many pages there exist that contain tricky wikitext? Will it be possible to convert them manually to new syntax? To not offend anyone but it is already a fault of wikitext devs that the markup wasn't being improved in consistency with growing needs; why cutting this deeper and letting old markup tag behind like a ghost?

The point of a human-readable/writable markup is to be crisp, it can't allow tricky constructs and if it doesn't then it will be a piece of cake to convert it automatically according to any future needs. And once current pages are converted to proper markup there will be less to none manual transformation necessary in future.

Signed, P. Tkachenko

Sumana Harihareswara

9 Feb 9 Feb

3:32 p.m.

On 02/08/2012 05:20 AM, Pavel Tkachenko wrote:

...

On 08.02.2012 2:52, Platonides wrote:

...
At the beginning, the interesection was huge even if ability to edit was low, just because there was a lot of knowledge missing. So as the knowledge increases (eg. linearly) "people" appear to be more and more stupid for editing.

This is well put. Perhaps this calls for RTFM but were there studies that have examined the trend from this POV?

Pavel, I asked Oliver Keyes, and he said that http://meta.wikimedia.org/wiki/Research:Newbie_reverts_and_article_length may be of interest. He's not on this list, so if you have thoughts about that, please cc him.

-- Sumana Harihareswara Volunteer Development Coordinator Wikimedia Foundation

vitalif＠yourcmc.ru

11 Feb 11 Feb

10:16 p.m.

...

...
Ability to edit and knowledge of rules are probably orthogonal. And users have a inmense rule-blindness. They won't want to read pages and pages of rules or tutorials. They just want things done (eg. change a birth date) I think most people act the same way. What was the last time you read the VCR manual?

Yes, I agree with you completely, I don't even remember if I have read a manual for any of my cellphones.

I'm sure not just because you ignore the rules, but because all modern cellphones have intuitive interface. Many users read manuals for their FIRST cellphone.

Yury Tarasievich

6 Feb 6 Feb

8 a.m.

Mind you, if the content produced with MediaWiki (so, Wikipedia) became less dependent on the low-level stuff, it might be "good thing" yet. Right now, the markup is a html by other name. Now, if it could be turned to the structure-only plus presentation-only somehow, sort of xml-ish, then even visual editor wouldn't really hurt...

Part of a bigger problem, really: how far can the arbitrary piece of software "stretch", given new, ever-expanding sets of requirements continue to be produced? Are all of those requirements wise?

Yury

On 02/05/2012 10:13 PM, vitalif@yourcmc.ru wrote:

...

I've read http://www.mediawiki.org/wiki/Future/Parser_plan recently, and the plans seemed strange and scary to me.

...

Such plans seem very scary to me, as I think the PLAIN-TEXT is one of the MOST IMPORTANT features of Wiki software! And you basically say you want to move away from it and turn MediaWiki to another Word, having all problems of "WYSIWYdnG"

...

4710

Age (days ago)

4718

Last active (days ago)

wikitext-l@lists.wikimedia.org

73 comments

20 participants

tags (0)

participants (20)

Amgine
Daniel Friesen
David Gerard
Erik Moeller
Gabriel Wicke
George Herbert
Helder
Jay Ashworth
Magnus Manske
Mihály Héder
Mike Dupont
Oliver Keyes
Pavel Tkachenko
Platonides
Stanton McCandlish
Sumana Harihareswara
Svip
Trevor Parscal
vitalif＠yourcmc.ru
Yury Tarasievich