As I understand it, there is rightfully little interest in the developer community to write a new parser function for every single template need to come along.
Therefore, when it comes to a template like {{val}}, which now generates rounding errors about 5–10% of the time because of the math-based parser functions it must use, it would be nice if the template-authoring community could have a character-counting parser function that is not only suitable for {{val}}, but which could be a general-purpose parser function that could be used for a great variety of purposes.
What {{val}} tries to do at its fundamental level is described here:
http://en.wikipedia.org/wiki/Wikipedia_talk:Manual_of_Style_(dates_and_numbe...
Is there a developer whom I can have the author of {{val}} e-mail to see if you two can arrive at a relatively easy-to-make parser function that A) meets the basic needs of {{val}}, and B) has sufficient utility to be useful for other character-counting needs?
greg_l_at_wikipedia wrote:
As I understand it, there is rightfully little interest in the developer community to write a new parser function for every single template need to come along.
Therefore, when it comes to a template like {{val}}, which now generates rounding errors about 5–10% of the time because of the math-based parser functions it must use, it would be nice if the template-authoring community could have a character-counting parser function that is not only suitable for {{val}}, but which could be a general-purpose parser function that could be used for a great variety of purposes.
I would rather have an application-specific number formatting function, rather than a character-counting function. It could be similar to PHP's number_format(). Wikitext is a terrible programming language, slow to execute and hard to understand. It's much better to write in PHP.
-- Tim Starling
On Sat, Jan 31, 2009 at 9:39 PM, Tim Starling tstarling@wikimedia.org wrote:
I would rather have an application-specific number formatting function, rather than a character-counting function. It could be similar to PHP's number_format(). Wikitext is a terrible programming language, slow to execute and hard to understand. It's much better to write in PHP.
We already have {{formatnum:}} with a very limited functionality that presumably could be extended.
Though I would like to re-emphasize that Greg's complaint principally arises because of floating-point round-off errors in #expr that are difficult for normal editors to predict or plan for, and that should be addressed irrespective of other work to improve number formatting.
-Robert Rohde
Output a big red error when giving numbers that will encounter a floating point error?
Perhaps also provide a use-limited #expr equivalent (capped in the number of times it can be called) that uses a bignum library rather than normal numbers, for use in cases where that big red error shows up.
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://nadir-seen-fire.com] -Nadir-Point (http://nadir-point.com) -Wiki-Tools (http://wiki-tools.com) -MonkeyScript (http://monkeyscript.nadir-point.com) -Animepedia (http://anime.wikia.com) -Narutopedia (http://naruto.wikia.com) -Soul Eater Wiki (http://souleater.wikia.com)
Daniel Friesen wrote:
Output a big red error when giving numbers that will encounter a floating point error?
Perhaps also provide a use-limited #expr equivalent (capped in the number of times it can be called) that uses a bignum library rather than normal numbers, for use in cases where that big red error shows up.
Well, such a library could be used unconditionally; the overhead wouldn't be large compared to the general overhead of parsing. But you have to truncate the output somewhere: you can't just let the decimal expansion of 1/3 run until you run out of memory.
There are two techniques which can be used to avoid inaccuracy due to truncation in special cases: rational arithmetic, for exact division, and decimal arithmetic, for addition of numbers written as decimals and exact multiplication/division by powers of ten.
But I don't think either of those techniques is appropriate for formatting a directly input number for presentation, adding thousands separators. What we want for that job is string processing, not arithmetic.
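As a rough illustration of the two techniques named above, here is a minimal sketch in Python (not MediaWiki or PHP code). It uses the standard `fractions` and `decimal` modules as stand-ins for whatever library a parser function might use; the helper name `one_third_to` is made up for this example.

```python
from decimal import Decimal, localcontext
from fractions import Fraction

# Rational arithmetic: division is exact, so nothing is ever truncated.
third = Fraction(1, 3)
exact_division_holds = (third * 3 == 1)

# Decimal arithmetic: numbers written as decimals add exactly, and
# multiplying by a power of ten is exact as well.
decimal_sum_holds = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
power_of_ten_holds = (Decimal("1.25") * Decimal("1000") == Decimal("1250"))

# The same sum in binary floating point is not exact.
float_sum_holds = (0.1 + 0.2 == 0.3)


def one_third_to(significant_digits):
    """Hypothetical helper: 1/3 as a decimal, cut off at a chosen precision."""
    with localcontext() as ctx:
        ctx.prec = significant_digits
        return Decimal(1) / Decimal(3)


print(exact_division_holds, decimal_sum_holds, power_of_ten_holds, float_sum_holds)
print(one_third_to(20))
```

The last call shows the truncation point raised above: decimal division still has to stop at some chosen precision, whereas the `Fraction` keeps 1/3 exact.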
-- Tim Starling
On Sat, Jan 31, 2009 at 10:21 PM, Daniel Friesen dan_the_man@telus.net wrote:
Output a big red error when giving numbers that will encounter a floating point error?
Perhaps also provide a use-limited #expr equivalent (capped in the number of times it can be called) that uses a bignum library rather than normal numbers, for use in cases where that big red error shows up.
Noticing errors isn't the hard part, but the common user will have a hard time figuring out how to avoid them.
For example:
{{#expr:floor(0.00007*100000)}} = 6
{{#expr:0.07*100 = 7}} is False
{{#expr:5/6 = (1/6)*5}} is False
{{#expr:(10^16 + 1) % 10}} = 0
All of these are examples of floating-point quirks, but at the same time none of them is so complicated that analogous expressions wouldn't come up in practical situations. From the point of view of a non-programmer MediaWiki user, this behavior is both unexpected and cryptic.
The first three are fixable, provided one applies some reasonable tolerance for when two numbers are the same. The fourth would require big math, or failing that should probably generate an error.
-Robert Rohde
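The four quirks listed above are not specific to #expr; they come from IEEE 754 double-precision arithmetic. The sketch below reproduces them in Python, which uses the same doubles, and shows one possible form of the "reasonable tolerance" idea. The helper `tolerant_floor` and the tolerance `1e-12` are assumptions made for illustration, not anything #expr does.

```python
import math

# The four expressions from the message above, as plain double arithmetic.
quirk_floor = math.floor(0.00007 * 100000)    # product falls just below 7
quirk_equal = (0.07 * 100 == 7)               # product lands just above 7
quirk_division = (5 / 6 == (1 / 6) * 5)       # the two sides round differently
quirk_modulo = (1e16 + 1) % 10                # 1e16 + 1 is not representable


def tolerant_floor(x, rel_tol=1e-12):
    """Hypothetical helper: snap to the nearest integer when x is within a
    relative tolerance of it, otherwise take the ordinary floor."""
    nearest = round(x)
    if math.isclose(x, nearest, rel_tol=rel_tol):
        return nearest
    return math.floor(x)


print(quirk_floor, quirk_equal, quirk_division, quirk_modulo)
print(tolerant_floor(0.00007 * 100000))
print(math.isclose(0.07 * 100, 7, rel_tol=1e-12))
print(math.isclose(5 / 6, (1 / 6) * 5, rel_tol=1e-12))
print((10**16 + 1) % 10)  # arbitrary-precision integers keep the last digit
```

A tolerance handles the first three cases, but not the fourth: there the lost digit never reaches the comparison, which is why it needs big-number arithmetic or an error.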
{{#expr:floor(0.00007*100000)}} = 6
{{#expr:0.07*100 = 7}} is False
{{#expr:5/6 = (1/6)*5}} is False
{{#expr:(10^16 + 1) % 10}} = 0
what is the rationale for doing math homework assignments as mediawiki wikitext?
On Sun, Feb 1, 2009 at 6:13 AM, Robert Rohde rarohde@gmail.com wrote:
For example:
{{#expr:floor(0.00007*100000)}} = 6
{{#expr:0.07*100 = 7}} is False
{{#expr:5/6 = (1/6)*5}} is False
{{#expr:(10^16 + 1) % 10}} = 0
...
The first three are fixable, provided one applies some reasonable tolerance for when two numbers are the same. The fourth would require big math, or failing that should probably generate an error.
And this one would probably be unfixable short of building in a full-fledged CAS:
{{#expr:ln(1000000) = 6*ln(10)}}
:)
Robert Rohde wrote “Noticing errors isn't the hard part, but the common user will have a hard time figuring out how to avoid them.”
I couldn’t agree more. The {val} template takes care of details such as making a slightly narrower <span> gap following the digit 1. It also substitutes the somewhat longer looking, true mathematical “minus sign” for the keyboard-typed hyphen when generating a negative exponent.
Whenever {val} chokes on a big number or does one of its rounding errors, the only work-around for editors that produces identical functionality and appearance in an article is to code this:
2.718<span style="margin-left:0.25em">281</span><span style="margin-left:0.2em">828</span>(30) × 10<sup>−23</sup> kg
This is totally beyond the ability of most editors, who expect {{val|2.718281828|(30)|e=-23|u=kg}} to work.
So merely flagging rendered output with big red text that the {val} template generated a rounding error would be minimally helpful. What would be exceedingly helpful is if the math functions wouldn’t produce rounding errors. Something tells me that fixing the math functions would be time consuming. So, the best option of all, and what the community needs, is one of the following:
1) for StringFunction to be made bulletproof (as Jason Schulz wrote), or
2) for a new general-purpose character-counting parser function to be written (which many templates could use), or
3) for a magic-word-based version of {val} to be written.
It seems that StringFunction (option 1, above) has been buggy for a long time and there is little interest in the developer community to wade through it. To my mind, the most useful thing developers could do is produce a really clean, elegant, bulletproof, tight character-counting parser function.
One of the intricacies of the {val} template is that it follows the ISO convention (also observed by the NIST and the BIPM) where numbers are delimited in groups as follows:
2.1
2.12
2.123
2.1234
2.123 45
2.123 456
So there may be two, three, or four digits in the last group in order that there is never a single hanging digit. Accordingly, the template follows this logic:
Q1: Are there five or more undelimited digits remaining after the decimal marker? No=Stop / Yes=Advance three digits and prepare to add span gap. Goto Q2.
Q2: Is the span gap to be added following the digit “1”? No=Add a span gap of 0.25 em and then goto Q1 / Yes=Add a span gap of 0.2 em and then goto Q1.
Note too that the {val} template also needs to strip off any digits to the left of the decimal marker and delimit them with commas.
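To make the Q1/Q2 logic above concrete, here is a minimal string-processing sketch in Python. It is only an illustration of the described rules, not the {val} template or any existing parser function. The names `group_fraction`, `group_integer` and `format_significand` are hypothetical, and the input is assumed to be an unsigned string of digits with at most one decimal point.

```python
def group_fraction(digits):
    """Split fractional digits into groups of three. Once there are five or
    more digits, the final group has 2, 3, or 4 digits, never a single one."""
    groups = []
    while len(digits) >= 5:        # Q1: five or more undelimited digits left?
        groups.append(digits[:3])  # yes: advance three digits
        digits = digits[3:]
    if digits:
        groups.append(digits)      # no: the rest forms the final group
    return groups


def group_integer(digits):
    """Delimit the digits left of the decimal marker with commas."""
    groups = []
    while len(digits) > 3:
        groups.insert(0, digits[-3:])
        digits = digits[:-3]
    groups.insert(0, digits)
    return ",".join(groups)


def format_significand(number):
    """Return HTML for a digit string, using CSS margins as the gaps."""
    integer, _, fraction = number.partition(".")
    out = group_integer(integer)
    if not fraction:
        return out
    groups = group_fraction(fraction)
    out += "." + groups[0]
    for previous, current in zip(groups, groups[1:]):
        # Q2: a narrower gap when the gap follows the digit 1.
        gap = "0.2em" if previous.endswith("1") else "0.25em"
        out += f'<span style="margin-left:{gap}">{current}</span>'
    return out


print(format_significand("2.718281828"))
print(format_significand("12345.6789"))
```

Because the number is handled purely as a string, no digit can be altered by rounding, however long the input is.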
How hard can it be to make a parser function capable of supporting this sort of “Q1/Q2” logic? I would think that some parser functions are already available to use in the above logic that can report to a template how many characters there are in the string. With that information, the template could do the math to know how many groups of three there will be and whether the final group will have 2, 3, or 4 characters. I would also think that the new parser function required to fulfill the rest of {val}’s needs would be very useful for many other templates and would fulfill perhaps 80% of what StringFunction would have ever been used for anyway.
I’m no programmer, but it seems to me that all this new parser function needs to do is spit out a requested number of characters from a string. After the template does the math on the number of characters it is dealing with, it asks the new parser function “give me three digits” … “give me three digits” … “and now give me four digits”. Yes?
Greg L
I should think that there must already be a parser function that a template could ask “how many characters are in the string”.
_______________________________________________ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Sun, Feb 1, 2009 at 12:13 PM, Robert Rohde rarohde@gmail.com wrote: ...
{{#expr:floor(0.00007*100000)}} = 6
{{#expr:0.07*100 = 7}} is False
{{#expr:5/6 = (1/6)*5}} is False
{{#expr:(10^16 + 1) % 10}} = 0
http://pt.php.net/manual/en/ref.bc.php
<?php echo floor(bcmul("0.00007","100000")); ?>
outputs 7
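For comparison, here is a rough Python analogue of the bcmath call above, using the standard `decimal` module in place of bcmath (an assumption made purely for illustration). It also shows that decimal arithmetic still has a precision cutoff: the 5/6 comparison fails at 28 significant digits, while a rational type keeps it exact.

```python
from decimal import Decimal, ROUND_FLOOR, localcontext
from fractions import Fraction

# Decimal inputs are stored exactly, so the product is exactly 7.
product = Decimal("0.00007") * Decimal("100000")
floored = product.to_integral_value(rounding=ROUND_FLOOR)

# 0.07 * 100 compares equal to 7 in decimal arithmetic.
decimal_equal = (Decimal("0.07") * 100 == 7)

# Division is still cut off at the context precision, so this comparison
# fails even in decimal arithmetic.
with localcontext() as ctx:
    ctx.prec = 28
    decimal_division_equal = (
        Decimal(5) / Decimal(6) == (Decimal(1) / Decimal(6)) * 5
    )

# Rational arithmetic keeps the division exact.
rational_division_equal = (Fraction(5, 6) == Fraction(1, 6) * 5)

print(floored, decimal_equal, decimal_division_equal, rational_division_equal)
```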
On 2/1/09 3:13 AM, Robert Rohde wrote:
Noticing errors isn't the hard part, but the common user will have a hard time figuring out how to avoid them.
For example:
{{#expr:floor(0.00007*100000)}} = 6
{{#expr:0.07*100 = 7}} is False
{{#expr:5/6 = (1/6)*5}} is False
{{#expr:(10^16 + 1) % 10}} = 0
All of these are examples of floating-point quirks, but at the same time none of them is so complicated that analogous expressions wouldn't come up in practical situations. From the point of view of a non-programmer MediaWiki user, this behavior is both unexpected and cryptic.
The first three are fixable, provided one applies some reasonable tolerance for when two numbers are the same. The fourth would require big math, or failing that should probably generate an error.
Added a bug entry for supporting bcmath, which has a handy extension in PHP, if anybody's interested. As Tim notes this would probably not be terribly difficult nor a perf problem though you'll still have precision cutoffs... but they'll be more predictable and comprehensible to humans used to working in decimal. :)
https://bugzilla.wikimedia.org/show_bug.cgi?id=17468
-- brion
Robert Rohde wrote:
We already have {{formatnum:}} with a very limited functionality that presumably could be extended.
Though I would like to re-emphasize that Greg's complaint principally arises because of floating-point round-off errors in #expr that are difficult for normal editors to predict or plan for, and that should be addressed irrespective of other work to improve number formatting.
-Robert Rohde
The problem doesn't lie in rounding errors per se, but in the fact that they're using math functions for string operations, because MediaWiki doesn't provide anything better.
On Sun, Feb 1, 2009 at 12:39 AM, Tim Starling tstarling@wikimedia.org wrote:
I would rather have an application-specific number formatting function, rather than a character-counting function. It could be similar to PHP's number_format(). Wikitext is a terrible programming language, slow to execute and hard to understand. It's much better to write in PHP.
Note that {{val}} uses at least a few strange HTML-specific hacks, like using CSS instead of space characters to create spaces between the digit groups (for copy-paste purposes). Something that would work like {{val}} would be considerably different from number_format().
The virtue of using CSS <span> gaps is that the entire significand can be copied and pasted into applications like Excel, where it will be automatically treated as a number without the need to hand-delete non-breaking spaces, which are unique and distinct characters. To see the difference in action, see here:
http://en.wikipedia.org/wiki/User:Greg_L#Val
Greg
On Feb 1, 2009, at 12:52 PM, Aryeh Gregor wrote:
Note that {{val}} uses at least a few strange HTML-specific hacks, like using CSS instead of space characters to create spaces between the digit groups (for copy-paste purposes). Something that would work like {{val}} would be considerably different from number_format().