What’s a Binary Selector?

Intuitively, binary selectors are obvious: you string together a few funky characters and you have yourself a binary selector. And binary selectors can be useful for making some things easier to read. At least that’s the theory, and reading the ANSI standard one might assume everything is exactly that clear. The practice in Smalltalk-80 descendants is different.

To begin with, Smalltalk-80 only allowed two characters in a binary selector, ruling out beauties such as <=>, <+> or >>=. Fortunately, the ANSI standard has dropped this limit, and Squeak also allows selectors of more than two characters. VisualWorks, however, still sticks to the old ways.

Then there is the wrench that negative number literals throw in the works. Intuitively, 3 -- 4 is supposed to send the #-- message to 3, and probably cause an MNU (message not understood). Indeed, that is what would happen in an ANSI-compliant Smalltalk. Instead, Squeak will happily parse the expression and evaluate it to 7, while VisualWorks will throw a compiler error saying that an argument is expected following the first minus.

Here is why that is. A negative number literal in Smalltalk-80 is simply a minus token followed by a number token. Squeak still sticks to that interpretation, and so it sees no difference between "-4" and "-  4". Both are valid forms of writing a negative four. ANSI and VisualWorks have since moved to a less quirky behavior treating the minus as part of the literal number token. Interestingly, the cute syntax diagrams at the end of the Blue Book suggest Smalltalk-80 does the same, but evaluating "-  4" shows that it doesn’t.
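The two readings of a negative literal can be modelled in a few lines. This is a rough Python sketch, not code from any Smalltalk implementation; the function names are made up for illustration, and only plain decimal integers are handled.

```python
import re

def negative_literal_two_tokens(text):
    """Classic reading described above: a minus token, then a number token.
    Whitespace between the two tokens is irrelevant."""
    m = re.fullmatch(r"\s*-\s*(\d+)\s*", text)
    return -int(m.group(1)) if m else None

def negative_literal_one_token(text):
    """Single-token reading: the minus is part of the number token,
    so nothing may separate it from the digits."""
    m = re.fullmatch(r"\s*-(\d+)\s*", text)
    return -int(m.group(1)) if m else None

print(negative_literal_two_tokens("-4"))    # -4
print(negative_literal_two_tokens("-  4"))  # -4
print(negative_literal_one_token("-4"))     # -4
print(negative_literal_one_token("-  4"))   # None
```

Under the first function "-4" and "-  4" are the same literal; under the second only "-4" is a literal at all.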

But wait, before the second minus in "3 -- 4" got attached by Squeak to the number that follows, why did it get detached from the first one to begin with?

And here lurks another feature. This time it is in fact documented by the cute syntax diagrams. A binary selector is made of special characters, but a minus is a special case of a special character. It’s only allowed once, and only as the first character of a binary selector. So, -, -> or -+ are legal, +-, <- or -- are not. Clearly, the point of this feature was to have “3+-4” or “3--4” parse “sensibly” as arithmetic expressions rather than exotic message sends.
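The minus rule is easy to state as code. The sketch below is a Python model of the rule as described here, not a transcription of any real scanner. The SPECIAL set is an assumption (the exact set of special characters varies between dialects), and the optional max_len parameter is a made-up knob for modelling the old two-character limit.

```python
# Assumed set of "special characters"; the exact set differs between dialects.
SPECIAL = set("+/\\*~<>=@%|&?!,")

def is_binary_selector(s, max_len=None):
    """True if s is a legal binary selector under the rule described above:
    only special characters, with a minus allowed once, in first position."""
    if not s:
        return False
    if max_len is not None and len(s) > max_len:
        return False
    if any(c != "-" and c not in SPECIAL for c in s):
        return False
    return "-" not in s[1:]

def scan_binary_selector(text, start):
    """Return the longest selector starting at text[start].
    A minus anywhere but the first position ends the selector."""
    end = start
    while end < len(text) and (
        text[end] in SPECIAL or (text[end] == "-" and end == start)
    ):
        end += 1
    return text[start:end]

print([s for s in ["-", "->", "-+", "+-", "<-", "--"] if is_binary_selector(s)])
# ['-', '->', '-+']
print(scan_binary_selector("3--4", 1))   # -
print(scan_binary_selector("3+-4", 1))   # +
```

In both "3--4" and "3+-4" the selector stops after one character, which leaves the second minus free to be read as the sign of the 4.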

This also explains why VisualWorks treats “3 -- 4” as an error. It switched from the classical treatment of a minus before a literal to a more modern (or at least compliant with the Blue Book syntax diagrams) treatment, so it doesn’t see the second minus as belonging to the 4 that follows. But it still refuses to accept “--” as a valid binary selector, hence the error. Strange, that.
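To see how the two rules interact, here is a toy evaluator for chains of integer subtractions. It is a Python sketch under heavy assumptions: it knows only decimal integers and the minus selector, the function name and the minus_needs_adjacent_digit flag are invented for illustration, and it claims nothing about how Squeak or VisualWorks are actually implemented.

```python
def evaluate_minus_chain(text, minus_needs_adjacent_digit):
    """Evaluate expressions such as '3 - 4' or '3 -- 4'.

    minus_needs_adjacent_digit=False models the classic reading (minus token
    followed by a number token, whitespace allowed in between).
    minus_needs_adjacent_digit=True models the minus being part of the literal.
    """
    pos = 0

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def operand():
        nonlocal pos
        skip_spaces()
        sign = 1
        if pos < len(text) and text[pos] == "-":
            pos += 1
            sign = -1
            if not minus_needs_adjacent_digit:
                skip_spaces()
        start = pos
        while pos < len(text) and text[pos].isdigit():
            pos += 1
        if start == pos:
            raise SyntaxError("argument expected")
        return sign * int(text[start:pos])

    value = operand()
    while True:
        skip_spaces()
        if pos == len(text):
            return value
        if text[pos] != "-":
            raise SyntaxError("unexpected character")
        pos += 1  # the selector is a single minus; a second minus cannot join it
        value -= operand()

print(evaluate_minus_chain("3 -- 4", False))  # 7
print(evaluate_minus_chain("3 - -4", True))   # 7
```

Calling evaluate_minus_chain("3 -- 4", True) raises SyntaxError("argument expected"): the first minus is taken as the selector, and the second one is neither a selector character nor the start of a literal.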

And that’s not the end of it. The problem with a minus is that the grammar uses it for two purposes—as part of a number literal and also as part of a selector. Another character just like that is a vertical bar. “foo || bar” doesn’t work in Squeak and VisualWorks. And it never did, going as far back as Smalltalk-80 again.

3 || 4

This time the scanner doesn’t even try to be nice. A vertical bar always parses as a separate token, so anything like || or |> is treated by the scanner as a sequence of two tokens and then rejected by the parser as invalid syntax. So in fact the only legal binary selector with a vertical bar is the vertical bar itself. (This again is contrary to what the Blue Book claims Smalltalk-80 recognizes as a binary selector.)
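The vertical bar behaviour can be modelled the same way. Again, this is only a Python sketch of the behaviour described above: the character set is an assumption, the function name is made up, and everything that is not an operator character is simply skipped.

```python
# Assumed special characters, with the vertical bar deliberately left out
# because the scanner described above never merges it with its neighbours.
SPECIAL_NO_BAR = set("+/\\*~<>=@%&?!,")

def scan_operator_tokens(text):
    """Return the operator tokens found in text, ignoring everything else."""
    tokens = []
    i = 0
    while i < len(text):
        c = text[i]
        if c == "|":
            # A vertical bar is always a token of its own.
            tokens.append("|")
            i += 1
        elif c in SPECIAL_NO_BAR or c == "-":
            # A run of special characters; a later minus or bar ends the run.
            j = i + 1
            while j < len(text) and text[j] in SPECIAL_NO_BAR:
                j += 1
            tokens.append(text[i:j])
            i = j
        else:
            i += 1  # digits, letters and spaces are not modelled here
    return tokens

print(scan_operator_tokens("3 || 4"))    # ['|', '|']
print(scan_operator_tokens("foo |> bar"))  # ['|', '>']
print(scan_operator_tokens("a <= b"))    # ['<=']
```

With two separate bar tokens in a row, a parser has no way to read “3 || 4” as a single message send.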

All that was perhaps way too much nit-picking as far as any practical programming is concerned, but I find two things interesting about this situation.

One is that the Blue Book misrepresents the grammar of the actual implementation, and not once but twice. The version in the book is probably more of what the authors wanted the grammar to be, rather than an accurate documentation of the ad-hoc implementation in the image. Maybe this was why one of these quirks later got fixed in the ParcPlace branch of the family.

The second interesting point is why this is such a minor issue. How many people in the past 25 years have actually wanted to write “foo || bar” or “foo <- bar” and discovered they didn’t work? Do Smalltalkers use binary messages creatively, that is, outside (semi-)traditional math, concatenation and point creation? And if, as it seems, they do not, then why?

6 thoughts to “What’s a Binary Selector?”

  1. I think that binary operators are too ambiguous most of the time. Even if you have very basic objects like vectors in a vector space you use dot: or cross: to make clear which product operation you want to perform.
    If you use binary math operators you have to deal with the expectation that they “work” the same way they do with numbers. I at least would stumble over implementations of + where commutativity is not given. (I guess there are even numerical examples where a+b=b+a is false, but that does not change my expectation! :)

  2. Yeah, for clarity I’ve stopped using “@” to create points and use keyword methods instead. For example, instead of:

    100 @ 50

    I make use of

    Point x: 100 y: 50.

    Yes, it’s much more verbose but it’s very clear. To me clarity is more important than brevity via cryptic symbols.

    I also now use “String new” rather than the empty string, ''. Again for clarity but also to avoid any confusion about which instance is being referred to.

  3. I got used to x@y, but I also favor “String new” over the empty string ''.
    I also tend to use “Character space” instead of “$ ”, which does not read well in cascades or with a . at the end.

  4. I recently added (and the users appreciated it) a Python-like #% operator to strings:

    '%1 %2' % { 'Hello'. 'world' } => 'Hello world'

    Note a subtle difference between '' and String new. The former is read-only, the latter is not. One case where this is important is when #become: enters the scene. For example you can have a proxy object like

    Proxy class>>on: aBlock
        ^self basicNew set__Block: aBlock

    Proxy>>set__Block: aBlock
        block := aBlock

    Proxy>>doesNotUnderstand: aMessage
        [ block value become: self. "assumes two-way become!" ]
            on: Error do: [ :ex | ex pass. ^self ].
        ^self perform: aMessage

    Then only “Proxy on: [ String new ]” will work. Neither “Proxy on: [ '' ]” nor “Proxy on: [ 3 ]” will work. Indeed, you had better include that exception handler, or bad usage will lock up the system. Of course, this is made up, but you get the idea.

    Also interesting is “(p := Proxy on: [ p ]) foo”, which will nicely halt the system. :-)
