[Wikitech-l] Re: Goto for microoptimisation

31 Jul 2021


On Fri, Jul 30, 2021 at 10:10 PM Tim Starling tstarling@wikimedia.org wrote:
...
In performance-sensitive tight loops, such as parsing and HTML construction, getting the best performance requires thinking about what PHP is doing on an opcode-by-opcode basis.
Certain flow control patterns cannot be implemented efficiently in PHP without using "goto". The current example in Gerrit 708880 comes down to:
if ( $x == 1 ) {
    action1();
} else {
    action_not_1();
}
if ( $x == 2 ) {
    action2();
} else {
    action_not_2();
}
If $x==1 is true, we know that the $x==2 comparison is unnecessary and is a waste of a couple of VM operations.
It's not feasible to just duplicate the actions: they are not as simple as portrayed here, and splitting them out to a separate function would incur a function call overhead exceeding the proposed benefit.
I am proposing:
if ( $x == 1 ) {
    action1();
    goto not_2; // avoid unnecessary comparison $x == 2
} else {
    action_not_1();
}
if ( $x == 2 ) {
    action2();
} else {
    not_2:
    action_not_2();
}
I'm familiar with the cultivated distaste for goto. Some people are just parroting the textbook or their preferred authority, and others are scarred by experience with other languages such as old BASIC dialects. But I don't think either rationale really holds up to scrutiny.
I feel that some people who have an absolutist stance on this issue
are directly or indirectly parroting the essay "Go To Statement
Considered Harmful" written by Edsger Dijkstra in 1968 [0][1]. I
wonder, however, how many who do so have read the short essay itself and
thought about it in the context of the time it was presented. In my
own reading of the essay I find Dijkstra arguing for structure and
practices in writing software that I think we all take for granted
today.
He is arguing for software to have a written structure which makes it
easier to form a mental model of how state will change and execution
will flow when the program is executed. To set up his argument,
Dijkstra states two 'remarks' from his own experience. I invite you to
read his original (dense academic) prose, but I will summarize the
premises as:
* A program must fulfill the business requirements to be useful.
* Human brains are better at static analysis than dynamic analysis,
and therefore code should be written to optimize for understandability
under static analysis.
These statements seem generally reasonable to me, and I believe that
many "standard practices" are in service of these premises. Unit,
end-to-end, and user acceptance testing are all tools to validate
fulfillment of business requirements. Our collective bias for smaller
functions, smaller classes, and 'separation of concerns', along with
linters and static checkers like the one commenting on Tim's gerrit
patch, are attempts to increase readability and comprehension of our
code. None of these things were in any way widely available in 1968,
but they are widely accepted tools and practices today.
Treating the title of the essay as dogma, however, I feel misses some
of the nuance of the argument. For me, this statement from the essay is
key: "The unbridled use of the go to statement has as an immediate
consequence that it becomes terribly hard to find a meaningful set of
coordinates in which to describe the process progress."
Dijkstra is arguing against what I would colloquially call 'spaghetti
code'; code where the flow of control jumps around a lot and in the
process leaves the reader confused about what is expected to happen
and why. The flow of execution becomes tangled much as a plate of
cooked noodles dumped from a pot. Finding both ends of a particular
noodle with a quick visual inspection is a mental and physical
challenge, not a trivial task.
...
I think goto is often easier to read than workarounds for the lack of goto. For example, maybe you could do the current example with break:
do {
    do {
        if ( $x === 1 ) {
            action1();
            break;
        } else {
            action_not_1();
        }
        if ( $x === 2 ) {
            action2();
            break 2;
        }
    } while ( false );
    action_not_2();
} while ( false );
But I don't think that's an improvement for readability.
You can certainly use goto in a way that makes things unreadable, but that goes for a lot of things.
Tim's example here is nice in that it shows how other PHP language
constructs could be used, but also that these are not common
constructs and that they do not immediately yield a more
understandable function.
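The equivalence of the three variants discussed in this thread can be checked with a small script. The following is only a sketch under stated assumptions: the action functions from the examples are replaced by hypothetical stand-ins that append their name to a trace array, and the function names original(), withGoto() and withBreak() are invented for illustration. It checks control flow only and says nothing about performance.

```php
<?php
// Hypothetical stand-ins: each "action" just records its name in $trace.

// Plain version: both comparisons are always evaluated.
function original( int $x ): array {
    $trace = [];
    if ( $x == 1 ) {
        $trace[] = 'action1';
    } else {
        $trace[] = 'action_not_1';
    }
    if ( $x == 2 ) {
        $trace[] = 'action2';
    } else {
        $trace[] = 'action_not_2';
    }
    return $trace;
}

// Proposed version: jump straight to the "not 2" branch when $x == 1.
function withGoto( int $x ): array {
    $trace = [];
    if ( $x == 1 ) {
        $trace[] = 'action1';
        goto not_2; // avoid unnecessary comparison $x == 2
    } else {
        $trace[] = 'action_not_1';
    }
    if ( $x == 2 ) {
        $trace[] = 'action2';
    } else {
        not_2:
        $trace[] = 'action_not_2';
    }
    return $trace;
}

// Workaround version: nested do/while(false) with break and break 2.
function withBreak( int $x ): array {
    $trace = [];
    do {
        do {
            if ( $x === 1 ) {
                $trace[] = 'action1';
                break; // leave the inner block, still run action_not_2
            } else {
                $trace[] = 'action_not_1';
            }
            if ( $x === 2 ) {
                $trace[] = 'action2';
                break 2; // leave both blocks, skip action_not_2
            }
        } while ( false );
        $trace[] = 'action_not_2';
    } while ( false );
    return $trace;
}

// Compare the traces for the three interesting cases: 1, 2, and neither.
foreach ( [ 1, 2, 3 ] as $x ) {
    $expected = original( $x );
    var_dump( withGoto( $x ) === $expected && withBreak( $x ) === $expected );
}
```

For each of the three inputs the script compares the order of recorded actions, so a difference in which branches run would show up as bool(false).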
...
I am requesting that goto be considered acceptable for micro-optimisation.
When performance is not a concern, abstractions can be introduced which restructure the code so that it flows in a more conventional way. I understand that you might do a double-take when you see "goto" in a function. Unfamiliarity slows down comprehension. That's why I'm suggesting that it only be used when there is a performance justification.
I am in agreement with Tim. I do not think that any of us should adopt
goto as a commonly used tool. I do however think there are situations
where a goto and comments actually do produce understandable code
which also conforms to a business requirement of keeping wall clock
execution time as small as possible.
Now I guess my question to the group is, how can we describe this
nuance as a replacement for statements in current coding conventions
like "Do not use the goto() syntax introduced in 5.3. PHP may have
introduced the feature, but that does not mean we should use it." [2]?
Could it be as simple as stating the bias more like "The use of `goto`
should be exceedingly rare, always accompanied by comments explaining
why it is used (likely for performance), and the author should be
prepared for others to challenge the usage"?
[0]: https://dl.acm.org/doi/10.1145/362929.362947
[1]: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD02xx/EWD215.html
[2]: https://www.mediawiki.org/wiki/Manual:Coding_conventions/PHP#Other
Bryan
-- 
Bryan Davis              Technical Engagement      Wikimedia Foundation
Principal Software Engineer                               Boise, ID USA
[[m:User:BDavis_(WMF)]]                                      irc: bd808