NotNil, then what?

This post is about a fix for a conceptual bug in the current version of Squeak, though it’s the detailed analysis of why this was a bug that I think is worth a blog post.

A while ago, when ifNil: and ifNotNil: were not standard in the commercial Smalltalks of the day, there was an occasional argument on c.l.s about whether they were a good thing or superfluous sugar that provided no new functionality. The typical case against ifNil: is something like

foo ifNil: [self doSomething]

which can trivially be rewritten in “classic” Smalltalk. The case that is less trivial, though, is

^self foo ifNil: [self defaultFoo]

It’s important that the value being tested is computed and not just fetched from a variable. For a computed value, we cannot in the general case reduce the above to

^self foo isNil ifTrue: [self defaultFoo] ifFalse: [self foo]

because this would evaluate self foo twice—something we would want to avoid if self foo had side effects. Avoiding double evaluation would call for a variable to hold onto the result of self foo:

| foo |
foo := self foo.
^foo isNil ifTrue: [self defaultFoo] ifFalse: [foo]

which is significantly more verbose than the ifNil: alternative. So, while ifNil: in this case can still be reduced to “classic” Smalltalk, doing so takes us down an abstraction level—from a single message doing exactly what it says to messing around with variables, tests and branches. While in some languages such exposed plumbing is a fact of life, in Smalltalk we like to hide it when we don’t need to deal with it directly.
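For reference, here is a minimal sketch of how ifNil: can be defined as an ordinary pair of methods, assuming no compiler support at all. It only illustrates the idea and is not the code of any particular dialect:

```smalltalk
Object>>ifNil: aBlock
    "A non-nil receiver is already the interesting value; ignore the block."
    ^self

UndefinedObject>>ifNil: aBlock
    "nil stands for a missing value; answer whatever the block provides."
    ^aBlock value
```

With these two methods, self foo is evaluated exactly once in ^self foo ifNil: [self defaultFoo], and no temporary shows up in the calling code.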

Now, looking at ifNotNil:, it’s important to note that its typical use case is different. In fact, this is what’s interesting about ifNil: and ifNotNil: in general—they are asymmetrical. While ifNil: allows us to provide a useful value in the cases when the “main” branch of the computation returns nil, it’s unlikely that we’d ever use ifNotNil: in a mirrored pattern as

^self foo ifNotNil: [nil]

This shows that the cause of the asymmetry is the obvious fact that typically nil is used as a token indicating the absence of a value. A non-nil object is interesting in its own right, while nil isn’t.

So, ifNotNil: is primarily useful not as a value-producing expression, but rather as a control statement that triggers computation when another expression produces a useful result, in the simplest case looking something like

foo ifNotNil: [self doSomethingWith: foo]

This use of ifNotNil: could again be reduced to isNil ifFalse:. The case when the receiver of ifNotNil: is computed is more interesting because of the same problem of avoiding multiple evaluation, the obvious solution to which would again make the code much bulkier:

| foo |
foo := self computeFoo.
foo ifNotNil: [self doSomethingWith: foo]

The way to hide the plumbing here is to have ifNotNil: accept a one-argument block and feed the receiver into it, allowing us to fold the above back into a single expression

self computeFoo ifNotNil: [:foo | self doSomethingWith: foo]

This illustrates another asymmetry between ifNotNil: and ifNil:. While the ifNil: block needs no arguments because nil is not an “interesting” object, it’s often helpful for the ifNotNil: block to take the non-nil receiver as its argument.

A number of years ago Squeak had ifNil: and ifNotNil: implemented exactly this way. The former would take a niladic block as the argument and the latter (as far as I can remember) would only accept a monadic one. When a few years ago Eliot and I were adding the two to VisualWorks we kept almost the same pattern, extending the ifNotNil: case to also accept a niladic block.
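As a rough sketch of the extended pattern (not the actual VisualWorks or Squeak source), an ifNotNil: that accepts either kind of block could dispatch on the block’s numArgs:

```smalltalk
Object>>ifNotNil: aBlock
    "Feed the receiver to a one-argument block; just evaluate a zero-argument one."
    ^aBlock numArgs = 0
        ifTrue: [aBlock value]
        ifFalse: [aBlock value: self]

UndefinedObject>>ifNotNil: aBlock
    "There is nothing to act on; answer nil without evaluating the block."
    ^nil
```

The niladic branch is there only for convenience. The monadic branch is the one that lets a computed receiver reach the block without a temporary.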

In Squeak the two messages have since been reimplemented as special messages, expanded by the compiler into the equivalent == nil ifTrue:/ifFalse: forms. Inexplicably, in the process ifNotNil: was changed to only accept a niladic block, precisely the case that is less valuable! Also interestingly, the fallback implementation in ProtoObject still allows a monadic block in a “pure” ifNotNil: but not as the ifNotNil: argument of ifNil:ifNotNil: and ifNotNil:ifNil:! Their classification under the ‘testing’ protocol is a minor nit in comparison.

And now for the fix. It is available as a zip file MonadicIfNotNil.zip with two change sets. One modifies the handling of ifNotNil: and related messages to allow monadic blocks. The second contains the tests and should obviously be filed in after the compiler changes in the first set.

The change was fairly straightforward. Most of the work was dancing around the original error checking code that assumes only niladic blocks are legal as arguments of macroexpanded messages. The expansion simply promotes the block argument into a method temp, expanding

self foo ifNotNil: [:arg | ...]

into

(arg := self foo) ifNotNil: [...]

In VisualWorks such treatment would count as incorrect, but this promotion of a block argument into a method temp is in fact the classic Smalltalk-80 trick still surviving in Squeak in the similar treatment of the to:do: message.
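To make the comparison with to:do: concrete, here is a workspace-style sketch of what that kind of inlining amounts to. The second form is only an approximation of the idea, written by hand, and not actual compiler output:

```smalltalk
| i |
"Source form: the loop variable is written as a block argument."
1 to: 3 do: [:each | Transcript show: each printString].

"Roughly the inlined form: the loop variable is the temp i of the enclosing method."
i := 1.
[i <= 3] whileTrue: [
    Transcript show: i printString.
    i := i + 1].
```

The monadic ifNotNil: expansion follows the same idea, except that the temp is assigned once instead of being stepped through a range.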

Sections Wrap-Up

To wrap up for now the thread of functional-like features, here is a very simple implementation of sections that doesn’t rely on currying:

Object>>~ aSymbol
    aSymbol numArgs = 1 ifFalse: [self error: 'Invalid selector'].
    ^[:arg | self perform: aSymbol with: arg]

Symbol>>~ anObject
    self numArgs = 1 ifFalse: [self error: 'Invalid selector'].
    ^[:arg | arg perform: self with: anObject]
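A quick workspace sketch of how the two methods behave, using subtraction because the order of its operands matters (the variable names are made up for the illustration):

```smalltalk
| subtractFromTen subtractTen |
"Left section: the receiver is fixed; the block argument becomes the message argument."
subtractFromTen := (10 ~ #-).      "behaves like [:arg | 10 - arg]"

"Right section: the argument is fixed; the block argument becomes the receiver."
subtractTen := (#- ~ 10).          "behaves like [:arg | arg - 10]"

subtractFromTen value: 3.          "7"
subtractTen value: 3.              "-7"
```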

First of all, note the ambiguity of the section operator that we didn’t discuss in the previous post. What does

#, ~ #,

mean? Is it a left section prepending a comma to the argument, or a right section appending it?

Considering the implementation, clearly it’s the latter, but this particular choice of behavior is only a side effect of the implementation. In principle though, the intended meaning is undecidable. There is no difference in Smalltalk between a symbol and a selector—so there is no way to tell which comma is the operator and which is the section argument. This creates two potential gotchas.
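Tracing the expression through the two methods above shows why, under this implementation, it comes out as a right section (appendComma is just a name for the illustration):

```smalltalk
| appendComma |
"The receiver #, is a Symbol, so Symbol>>~ is the method that runs.
 The result is [:arg | arg perform: #, with: #,]."
appendComma := (#, ~ #,).

"Evaluates 'abc' , #, so the comma ends up appended to the argument."
appendComma value: 'abc'.
```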

One, it is impossible to create a left section with a Symbol as the fixed argument. We can’t make a block to check whether the symbol #foobar: includes a given character by writing

#foobar: ~ #includes:

The second, related, problem is that the behavior of the section operator can vary if the first argument is a variable:

foo ~ #includes:

creates a left section if the current value of foo is anything but a Symbol, a right section if it is a one-argument selector, or fails otherwise.

In practice, the ambiguity can be avoided by “downgrading” the first argument to a string in situations where it might be a symbol we don’t want to be mistaken for a selector.
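A sketch of the workaround, assuming asString sent to a Symbol answers a plain String rather than the Symbol itself (includesChar is a made-up name):

```smalltalk
| includesChar |
"The receiver is now a String, so Object>>~ runs and builds a left section."
includesChar := #foobar: asString ~ #includes:.

"Same as 'foobar:' includes: $b"
includesChar value: $b.
```

The same conversion applied to the variable case, foo asString ~ #includes:, only makes sense when foo is known to be string-like to begin with.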

A few people asked for some examples of how some of the things I talked about could be useful. As I wrote before, I don’t think currying of blocks can be very useful in practical Smalltalk, simply because the Smalltalk school of expression is different.

I have a slightly different opinion about sections. I use them (the simple currying-independent implementation above) in the framework I’m working on. I hope to write more about that one day, but for now I’ll adapt the example to my past project, VisualWorks Announcements.

Remember that it’s possible to subscribe to announcements by using either a block

anObject
    when: SomethingHappened
    do: [:ann | ...do something with ann...]

or a receiver-selector pair

anObject
    when: SomethingHappened
    send: #processAnnouncement:
    to: self

Given sections, we could handle the receiver-selector case using the same message we use for blocks:

anObject
    when: SomethingHappened
    do: self ~ #processAnnouncement:

Of course, we could also simply drop the receiver-selector option and require explicitly writing

[:ann | self processAnnouncement: ann]

but arguably the block isn’t quite as succinct as the equivalent section. So, what does this buy us?

Of course, we simplify the API. There is now only one subscription message instead of two. What was a separate case becomes part of the same mechanism.

The implementation is also simpler. In the original we needed to remember the receiver and the selector to send to it. (The block case is handled by remembering the block as the receiver and setting the selector to #value:—granted, I’m simplifying the Announcements picture a little, but as I said my example is only an adaptation). Once we throw out separate support for the receiver-selector case, all we do is remember the block and evaluate it when needed.

There is even a potential performance improvement. The receiver-selector implementation always sends #perform:with: to the receiver to deliver an announcement, even for those subscriptions that use a block. The simplified implementation instead evaluates the block to set things in motion, and #perform:with: only enters the picture together with sections in those cases where we specifically want a receiver-selector option.
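The simplified delivery could be sketched like this. Subscription, its action instance variable, and deliver: are hypothetical names used only for illustration, not the actual Announcements classes or protocol:

```smalltalk
Subscription>>action: aBlock
    "Hypothetical setter: remember the one-argument block (or section) to run."
    action := aBlock

Subscription>>deliver: anAnnouncement
    "Hypothetical delivery: a single uniform path with no receiver or selector stored."
    ^action value: anAnnouncement
```

A subscription created with self ~ #processAnnouncement: would store the block built by Object>>~, so perform:with: is paid for only by subscribers that asked for the receiver-selector form.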