A previous SO question raised the issue about which idiom is better in time of execution efficency terms:
[ (var := exp) > 0 ] whileTrue: [ ... ]
versus
[ var := exp.
var > 0 ] whileTrue: [ ... ]
Intuitively it seems the first form could be more efficient during execution, because it saves fetching one additional statement (second form). Is this true in most Smalltalks?
Trying with two stupid benchmarks:
| var acc |
var := 10000.
[ [ (var := var / 2) < 0 ] whileTrue: [ acc := acc + 1 ] ] bench.
| var acc |
var := 10000.
[ [ var := var / 2. var < 0 ] whileTrue: [ acc := acc + 1 ] ] bench
Reveals no major differences between both versions.
Any other opinions?
So the question is: What should I use to achieve a better execution time?
or
In cases like this one, the best way to arrive at a conclusion is to go down one step in the level of abstraction. In other words, we need a better understanding of what's happening behind the scenes.
The executable part of a
CompiledMethodis represented by its bytecodes. When we save a method, what we are doing is compiling it into a series of low level instructions for the VM to be able to execute the method every time it is invoked. So, let's take a look at the bytecodes of each one of the cases above.Since
<expression>is the same in the same in both cases, let's reduce it drastically to eliminate noise. Also, let's put our code in a method so to have aCompiledMethodto play withNow, let's look
CompiledMethodand its superclasses for some message that would show us the bytecodes ofObject >> #m. The selector should contain the subword bytecodes, right?...
Here it is
#symbolicBytecodes! Now let's evaluate(Object >> #m) symbolicBytecodesto get:Note by the way how our
tempvariable has been renamed toTemp: 0in the bytecodes language.Now repeat with the other and get:
The difference is
versus
What this reveals is that in both cases
tempis read from the stack in different ways. In the first case, the result of our<expression>is popped intotempfrom the execution stack and thentempis pushed again to restore the stack. Apopfollowed by apushof the same thing. In the second case, instead, nopushorpophappens andtempis simply read from the stack.So the conclusion is that in the first case we will be generating two cancelling instructions
popfollowed bypush.This also explains why the difference is so hard to measure:
pushandpopinstructions have direct translations into machine code and the CPU will execute them really fast.Note however, that nothing prevents the compiler to automatically optimize the code and realize that in fact
pop + pushis equivalent tostoreInto. With such an optimization both Smalltalk snippets would result in exactly the same machine code.Now, you should be able to decide which form do you prefer. I my opinion such a decision should only take into account the programming style that you like better. Taking into consideration the execution time is irrelevant because the difference is minimal, and could be easily reduced to zero by implementing the optimization we just discussed. By the way, that would be an excellent exercise for those willing to understand the low level realms of the unparalleled Smalltalk language.