Archive for the ‘newspeak’ Category

Elements of Goodthink: Setter Sends

Thursday, November 10th, 2011

Smalltalkers are in a pretty good position to understand Newspeak code out of the box. Unfamiliar constructs such as nested classes and the related message lookup semantics need some experience to internalize, but they are usually only an obstacle to understanding the higher-level design. Within a method everything should look familiar, except perhaps for setter sends.

A setter send is an expression like

foo:: a doSomethingWith: b

For a Smalltalker, the short explanation would be to say it’s just like

foo := a doSomethingWith: b

Like most short explanations, it’s a useful lie. Typically when you see something like the above, there is indeed a variable or a slot named “foo” somewhere in the scope, and the expression sets the value of that slot. But it doesn’t have to always be the case.

One half of the true story is that the expression is more like

foo: (a doSomethingWith: b)

It is a message send to an implicit receiver, and the message is a plain keyword message “foo:” with a single colon. Doubling the colon makes it get parsed at a lower precedence so we don’t have to parenthesize the argument (without parentheses that would be an implicit receiver send of “foo:doSomethingWith:”).

Remember that there is no assignment operator in Newspeak. To set a slot, you invoke its setter method using an expression like the above. Without this syntax, most of those would need parentheses. We set slots quite often, so that would be a lot of parentheses. This is why the syntax is called a setter send, even though strictly speaking it does not have to invoke a setter of a slot. It’s a regular message send going through the regular message lookup procedure.

The second half of the story has to do with the value returned by the send. The value of

foo: bar

is the value returned by “foo:” The value of

foo:: bar

is “bar”. The value returned by “foo:” is thrown away. So after all, while a setter send is a regular message send as far as message lookup goes, its value is different from a regular message send. This is to recognize the fact that a setter send is usually used in place of an a assignment. Assignments can be chained, and this behavior guarantees that a chain of setter sends

foo:: bar:: baz

passes the same value to “foo:” regardless of now “bar:” is implemented.

To condense all of the above into a true short story,

foo:: a doSomethingWith: b

is just like

[:v | foo: v. v] value: (a doSomethingWith: b)

A Taste of Nested Classes, part 4

Sunday, March 8th, 2009

(Continues from the intermission).

In the previous examples I sometimes referred to the familiar Smalltalk library classes such as OrderedCollection as if it were the most natural thing to do. But what does something like

OrderedCollection with: 'foo'

really mean when found in a Newspeak method? It’s clear that with: is sent to OrderedCollection, but what is OrderedCollection?

OrderedCollection is, of course, a message. It’s sent to an implicit receiver, meaning we first look for a matching method in the lexical scope and send the message to the corresponding outer object if we find it. If not, we send the message to self and do the standard Smalltalk message lookup. It’s reasonable (and correct) to assume that Object and other superclasses don’t define a method named OrderedCollection, and that second lookup is bound to fail. Which means that in order for the above to work, one of the outer classes has to implement OrderedCollection as a method doing the right thing.

Let’s pause here and summarize the important points.

First, there are no variable references in Newspeak. Any expression that begins with foo or Foo begins its evaluation by sending that message to an implicit receiver. This applies to anything that looks like a global variable, an instance variable, and even (at least conceptually) a temporary variable in the traditional Smalltalk understanding. Even pseudo-variables like self or true, as we’ve seen in the intermission, could be implemented as message sends.

Second, because there are no variable references, there can’t be such a thing as the global scope. We can’t just say OrderedCollection anywhere in the code and expect to get one. But if that is so, how can we use library classes at all?

Let’s start by fleshing out a more concrete example.

class AsteroidsGame = ( )
(
    class Screen = ( 
        | asteroids = OrderedCollection new |
    ) (
    ...
    )
)

Here we want a slot of the nested class to hold an instance of OrderedCollection. Initializer code is no different from method code, so OrderedCollection here is an implicit receiver send. Someone on the outside has to understand that message and nobody does in this example, so it can’t possibly work. Let’s try modifying it like this:

class AsteroidsGame usingOrderedCollection: orderedCollectionClass = (
    | OrderedCollection = orderedCollectionClass |
) (
    class Screen = (
        | asteroids = OrderedCollection new |
    ) (
    ...
    )
)

This is much better because it can actually work. The top-level class is now instantiated with the usingOrderedCollection: message. The creator is expected to pass an OrderedCollection class metaobject to the instance. (For now let’s not worry where the creator gets it). The game instance stores the metaobject in a slot named OrderedCollection. Because defining a slot automatically defines accessors (in this case only a getter because of the “=” syntax which defines read-only slots), the AsteroidsGame instance now understands the OrderedCollection message. The implicit receiver send of OrderedCollection in the Screen class initializer will now be directed to the outer instance of AsteroidsGame and return the OrderedCollection class.

This example isn’t final yet, but it shows something very important. The top-level AsteroidsGame class is actually a module! It holds together the classes nested inside, and acts as a namespace for their names. But most importantly, it controls their implementation dependencies. The only way a class inside a module can use something from the outside is if that something has been explicitly declared as a requirement by passing it into the module and storing it in a slot.

Of course, passing each class that a module needs as a separate argument to its initializer would be incredibly tedious. Here is the final iteration of our example, this time showing how it’s done for real:

class AsteroidsGame usingPlatform: platform = (
    | 
    private OrderedCollection = platform Collections OrderedCollection.
    private ReadStream = platform Streams ReadStream.
    private PlatformScreen = platform Graphics Screen. 
    |
) (
    class Screen = ( 
        | asteroids = OrderedCollection new |
    ) (
    ...
    )
)

Instead of passing each dependency separately, we pass in an object we call the platform. A platform is a “supermodule”—an object that holds together a group of modules and makes them available through messages like Collections. The collections module returns the OrderedCollection implementation in response to the OrderedCollection message. The slot initializers in AsteroidsGame are now essentially “import statements,” binding the required classes to names local inside the module.

As an aside, this is reminiscent of the idiom one occasionally sees in Scheme, where values of some names are bound to the same names but local to a closure:

(let ((car car)
      (cdr cdr))
    ...)

to capture the known “standard” values and ensure they are used regardless of the possible reassignments of the outer variables. The precise motivation is different but the mechanism is similar, especially if the let is expanded into the equivalent function call.

Another change we introduced is the private modifier on the imported slots. This means that the corresponding accessors are only available via messages sent from the same object or from the instances nested inside. (Though this is not enforced in the current Newspeak prototype). That is a good, and I’d even say required, style in this particular case because it insulates the dependencies of this module. Without that anyone could rely on retrieving the OrderedCollection class from an instance of AsteroidsGame, creating an undeclared and uncontrolled “derivative dependency.”

Finally, the last import illustrates how we can locally rename an imported value to avoid a conflict with a name used inside the module.

What about star imports: just grabbing everything from another module to avoid the hassle of listing every dependency we need by hand? That’s exactly the problem with star imports: they allow using a name without declaring that name as a dependency. It’s a misguided convenience that defeats the very modularity mechanism it is a part of. Even in languages that support star imports, good programming practices dictate avoiding them. In Newspeak they are simply not supported.

And finally, how do we actually use modules? If “actually” refers to the current prototype of Newspeak hosted in Squeak, the procedure I am about to describe is not the final idea.

In the prototype, a top-level Newspeak class resides in the Smalltalk system dictionary. Because message sends in Smalltalk and Newspeak mean the same thing, it is possible for Smalltalk code to communicate with Newspeak. We can bootstrap a Newspeak module by writing code like the following in a Smalltalk workspace (or in a menu item implementation method):

module := AsteroidsGame usingPlatform: Platform new.
gameScreen := module Screen new.
gameScreen openWindow.

The first line instantiates the module, the second instantiates the module’s Screen class, the third tells the instance to open a window. Platform is a special class included in the prototype to give the Newspeak world access to what’s available in the surrounding Smalltalk world. When Newspeak code executes

platform Collections OrderedCollection.

the platform’s doesNotUnderstand: method kicks in and retrives the standard Smalltalk OrderedCollection, making sure it is indeed part of the Collections package. Again, this is a temporary bootstrapping hack and not the final idea—though even as a hack it illustrates the power of modularity based on late-bound message sends.

For the final idea, see Gilad’s post about living without global namespaces. Combined with the ability to serialize and deserialize objects (and module instances are objects too), this ability to mix and link different modules has many exciting uses. Here are just a couple of scenarios.

You have an application that relies on the platform version 3.0. However, you’d also want to use another application written back in the times of version 2.1. Imagining we have an object responsible for module management called (say) CentralServices, launching both applications at the same time is as easy as

platform30:: CentralServices platformVersion: '3.0'.
platform21:: CentralServices platformVersion: '2.1'.
(AnApplication usingPlatform: platform30) MainWindow new open.
(AnotherApplication usingPlatform: platform21) MainWindow new open.

We could just as easily try and plug platform30 into AnotherApplication to see if it can in fact run with the latest version of the platform, and we could do that without even shutting down the already running AnotherApplication linked to the older version.

This leads to another scenario which as a tools developer I’m really excited about. Working on tools in a Smalltalk-like system is great because of the instant feedback, but also dangerous because you can break the very tools you need to fix the breakage. An idea has been floating in the Smalltalk world since (at least) mid-90s to separate the developer and the developed images. You’d have your tools running in one image, working remotely on the objects living in another image. The technology is there, and there have been various prototypes and implementations, but strangely none of those gained momentum. In Newspeak the solution is much simpler. Similar to creating a new instance of AnotherApplication module to try it with the new version of the platform while using the original, we can have two tools module instances at the same time, one running the tools we are using and the other representing the tools we are working on.

This shows an interesting difference between the roles images play in Smalltalk and in Newspeak. In Smalltalk an image is a manifestation of a specific platform version and configuration. We say things like “I tried running this in a 3.1 image with Foo loaded.” In Newspeak an image is only an object universe that can host a mix of applications and libraries of any imaginable version and configuration.

Referrer watching

Saturday, March 7th, 2009

I’ve just discovered that Jörg Mittag was kind enough to mention Newspeak, Brazil and Hopscotch as the biggest thing so far in developer tools in 2009. That’s nice to hear, even if only in an answer to a question hidden away on a website whose regulars seem unlikely to vote it up.

A Taste of Nested Classes, intermission

Sunday, February 22nd, 2009

(Continues from part 3).

This post is an intermission because it’s not specifically about nested classes. In fact, it’s not about anything that really exists in Newspeak. Instead, it reviews a few hypothetical examples to show off some of the expressive capabilities of implicit receiver sends.

Most expressions in Newspeak begin with an implicit receiver send, at least conceptually. In something like

foo size

or

OrderedCollection new

both foo and OrderedCollection really are message sends, for some definition of “really”.

Of course, in some cases we can’t do without real receivers. There are literals, as well as traditional pseudo-variables (“reserved words”), whose interpretation should be hard-coded into the language.

Or should it be?

Starting with literals, imagine that every occurrence of a literal is compiled as an implicit receiver send identifying the type of the literal, with the “primitive” value of the literal as the argument. For example,

'Hello, brave new world!'

would really mean

(literalString: 'Hello, brave new world!')

with literalString: implemented in Object as

literalString: primitiveValue <String> = (
    ^primitiveValue
)

At this point this is only a possibility, but one that is explicitly mentioned in the language spec (5.1). If implemented, this would allow user code to redefine the interpretation of what appears to be literals for greater flexibility when implementing internal DSLs.

As for pseudo-variables, consider this implementation of the class Object:

class Object = (
    ...

    self = (
        "empty - return the receiver"
    )

    true = (
        ^True soleInstance
    )

    false = (
        ^False soleInstance
    )

    nil = (
        ^UndefinedObject soleInstance
    )

    thisContext = (
        ^ActivationMirror forSenderContext
    )
)

With this, most of the reserved words of Smalltalk become regular messages. The expression

^self foo: nil

now really means

^σ self foo: σ nil

with σ denoting a reference to (usually!) the receiver unrepresentable in the language syntax. We use the fact that an empty method effectively reads as

to grab that elusive σ in the self method.

super is more complicated because it doesn’t denote a separate object but rather self with a special mode of message lookup. However, even that can be lifted from language magic into a reflective implementation at the library level:

super = (
    ^SuperDispatcher forReceiver: self
)

with SuperDispatcher implementing doesNotUndestand: to reflectively look for the method implementing the message starting from the superclass of the implementor of the sender context’s method. (A mouthful, but that really is what it should do).

In a similar way, outer, the new reserved word of Newspeak not found in Smalltalk could be implemented as a message send handled by reflective code at the image level.

The last two examples, super and outer, are even more hypothetical than the other pseudo-variables. While technically possible, a reflective implementation of those is quite expensive compared to the usual “language magic” solution.

Also, this scheme in general leaves the meaning of things like self or nil open to changes by a class for all its subclasses and nested classes. Perhaps there is such a thing as too much flexibility. But even if these ideas are too much or too expensive for the core language, all of the same mechanisms are available to a designer of an internal DSL, and that’s what makes them interesting. For example, the “dispatcher receiver” pattern illustrated with super and outer was conceived in Hopscotch and used there to support NewtonScript-like object hierarchy-bound message dispatch.

(Continues in part 4).

A Taste of Nested Classes, part 3

Sunday, February 15th, 2009

(Continues from part 2).

After an unplanned hiatus caused by various real life issues including the irritating surprise of becoming unemployed in these interesting times, here is the long-promised third part of the nested classes series.

Class nesting in our earlier example represented the natural nesting relationship of an asteroid object inside the game that contained it. Implicit receiver sends were a convenient mechanism of communicating with the game from inside an asteroid instance “for free”, without the need to maintain an explicit object reference. Today we look at a few examples with implicit sends of a slightly different flavor.

In a typical Smalltalk system, methods in the Object class define the behavior common to all objects. Some of them, such as class or identityHash are implemented to return some interesting information extracted from the receiver. Others, such as isNil are overridden in subclasses to vary their behavior polymorphically, again depending on the receiver object. In both of these cases, the receiver is involved one way or the other.

There is a third kind, methods like halt, error:, or flag: (in Squeak). They don’t extract any information from the receiver. Also, they are not overridden, so the implementation in Object tells the full story of their behavior. We send them to self and expect that they behave the same regardless of what self happens to be. As far as these messages are concerned, the receiver itself is unimportant.

These messages are utilities representing “ambient” functionality—something potentially useful everywhere in the code. We send them to self because it’s the most readily available receiver, and they are implemented in Object so that the sends work regardless of what self happens to be. Object in this case doubles as the container of these utility functions and inheritance makes them globally visible.

This global visibility is what also makes such utilities potentially problematic. Suppose I want to have a logError: utility that writes an entry into a log file, and I want to use it throughout my application. It’s fine to have it in Object only as long as another application doesn’t decide to use the same message but log the error to the system console instead.

Here is an alternative way to implement such utilities in Newspeak.

class AsteroidsGame = (...)
(
    ...
    class Space = (...)
    (
        moveObject: anObject to: position = (
            ...
            logError: 'invalid position'
        )
        ...
    )

    class Asteroid = GameObject (...)
    (
        respondToHit = (
            isLive ifFalse: [logError: 'invalid asteroid'].
            ...
        )
    )
    ...

    logError: message = (
        errorStream nextPutAll: message; cr.
    )
)

We put logError: in the top-level class representing our application. Any method of any nested class can invoke that utility through an implicit receiver send. The superclass of the nested class doesn’t matter—the utility is accessible because the class is nested in the one that provides it. Compared to the Smalltalk example, the utility is now available everywhere in our application and nowhere outside of it.

Here is another thing that’s interesting about the Newspeak alternative. In Smalltalk, logError: would have to be an extension method of Object. Extension methods are officially ungood in Newspeak, and the example shows the goodthink alternative.

In fact, such use of class nesting as a scoping mechanism for utilities solves an entire class of data conversion problems traditionally solved by extensions. Say, we want to convert an integer to a string with its representation as a Roman numeral. A common solution would be to add an asRomanNumeral extension method to Integer to do the conversion. This is convenient because we can write x asRomanNumeral anywhere, but has the disadvantage of the function leaking into the rest of the system. A more controlled alternative of implementing a convertToRomanNumeral: method in one of the application classes has the disadvantage of being easily accessible only in the implementing class and its subclasses. In other places we need to arrange for a reference to an instance that understands the message. In Newspeak, convertToRomanNumeral: can be a method of the application (module) so that it’s easily accessible, but only within the module.

Here is an interesting hypothetical example that continues the same theme. Suppose we implement the following method in a top-level class:

if: condition <Boolean> then: then <Block> else: else <Block> = (
    ^condition ifTrue: then ifFalse: else
)

Anywhere in the class and its nested classes we can now write conditionals in this more familiar to the general public style:

if: a < b
then: [^a]
else: [error: 'invalid a']

This example is hypothetical because it could only be fast enough on a very sophisticated VM with a good adaptive optimizer. But as far as less common and therefore less time-critical control messages go, they are no different from other utilities. We can define them in the top-level application class and use them throughout the application. For example:

^return: 5 if: #bar notUnderstoodIn: [foo bar]

defined in a top-level class as

return: value <Object> if: selector <Symbol> notUnderstoodIn: block <[]> = (
    ^block
        on: MessageNotUnderstood
        do: [:ex |
            ex message selector = selector
                ifTrue: [ex return: value]
                ifFalse: [ex pass]]
) 

To sum up, some methods define the kind of ambient functionality that is used throughout an application. For that reason it’s best if it can be accessed implicitly, without arranging for an explicit reference to the object that provides it. In classic Smalltalk, the only mechanism to support that is inheritance, with the disadvantage that the functionality defined this way escapes the defining module. Nested classes and implicit receiver sends in Newspeak provide an alternative mechanism that keeps such ambient operations contained inside the module.

(Continues in intermission).

A Taste of Nested Classes, part 2

Sunday, December 7th, 2008

(Continues from part 1)

Let’s now look at some message sends. Suppose we design the application so that the message respondToHit is sent to an asteroid when it’s hit by a missile. In Smalltalk, a typical response would look something like

respondToHit
    game
        replace: self with: self fragments;
        incrementScore.

game here is an instance variable holding a reference to the Game instance the asteroid belongs to, presumably initialized at the time the asteroid was created. How could we port this to Newspeak? Translating everything literally, we would define a slot named “game” in the Asteroid class and port its initialization logic. We would then port all the methods. The original respondToHit would remain unchanged, apart from the overall “packaging”:

respondToHit = (
    game
        replace: self with: self fragments;
        incrementScore. 
)

But is it all the same as before? Not quite, because just like in Self, Newspeak slots are accessible only through messages. What looks like a good old game variable reference has now become a send of the message game to (an implicit) self.

Let’s hold this thought and backtrack to the end of the previous post. We said that an asteroid always knows the game instance it belongs to, it’s the enclosing object. We don’t need the game slot we ported from the Smalltalk original because it holds onto the same thing!

We can drop it and any of its initialization logic because the language now keeps track of the enclosing game instance for us. Instead suppose we define this method in Asteroid:

game = (
    "Get and return the enclosing object. To avoid getting ahead
    of the presentation, we don't show how it's done yet."
)

As for respondToHit and any other methods, they are unchanged—a perfect example of the benefits of representation independence. When game is just a message send, nobody cares how it gets its result.

Now, instead of revealing how the method game is implemented, we are going to do something even better. We are going to get rid of it altogether.

Time to talk about implicit receiver sends. I’ve hinted more than once that an implicit receiver is not always self. If it’s not self, what else can it be? Is there any other object that is somehow implicitly “present” at the location where the message is sent? Of course, it’s the enclosing object. (Or objects, if there is more than one layer of nested classes). Think of it this way: the respondToHit method is contained not only in the Asteroid class, but also indirectly in the Game class. So, the code in the method runs not only in the context of the current receiver, an instance of Asteroid, but also in the context of the enclosing instance of Game.

Without further ado I’m going to show respondToHit rewritten to take advantage of implicit receiver sends, and then describe their exact behavior. This will also explain under what specific conditions such a rewrite would be possible.

respondToHit = (
    replace: self with: fragments.
    incrementScore.
)

This reads almost like plain English. replace:with: and incrementScore are now sent to the enclosing game via the implicit receiver mechanism. Note that fragments is also changed to have an implicit receiver, though in this case we expect that the receiver is self. How does it happen that in both cases the actual receiver is what we want it to be?

Let’s recap how these potential receivers are related. We have essentially a queue of them: first self, then its enclosing object. (With more than two levels of nested classes, it would be followed by the enclosing object of the enclosing object and so on). If we represent the situation as a picture of the classes of the objects involved and their inheritance, we get this comb-like structure (to make the picture a little more interesting, Asteroid here is subclassed from a hypothetical ScreenObject, even though our original code didn’t say that).

Classes potentially involved in message processing

Because a method with a matching selector might be defined by any class in the comb, it may seem that the most intuitively reasonable lookup strategy is the one adopted by NewtonScript. The policy there was to search the entire comb starting with the receiver and its superclasses, then the receiver’s enclosing object (called parent in NewtonScript) with its superclasses, and so on. In the example we would look first in Asteroid, ScreenObject and Object with the intent of the Asteroid instance being the receiver, then Game and Object with the intent of the Game instance being the receiver. The send would fail if none of the classes in the comb implemented a method to handle the message.

I emphasized “seem” because the strategy in Newspeak is different, for the reason I am outlining in the end of the post. But first, here is how it really works:

  • If one of the classes on the “trunk” of the comb implements a method with a matching selector—in other words, if a matching method is lexically visible at the send site—the message is sent to the instance of that class.
  • Otherwise, the message is sent to self.

Even more informally, we could say that when by “looking around and up,” “around” meaning the same class definition and “up”—the definitions it’s nested in, we can see one with a matching method, the instance of that class is what receives the message. Otherwise, it is sent to self (and results in an MNU if self does not understand it).

So, the lookup in Newspeak differs from NewtonScript-like approach in two important ways. First, it doesn’t traverse the entire comb. The only potential handlers of the message are the methods defined in the trunk classes and those in the superclasses of the class of self. Second, lexical visibility takes priority over inheritance. If a method of Asteroid sends the message redraw to an implicit receiver and a method named redraw is defined in both Game and ScreenObject, the ScreenObject implementation wins in NewtonScript, while the one in Game wins in Newspeak.

This explains how the correct receiver is chosen in our implementation of respondToHit. As long as replace:with: and incrementScore are defined in Game, they are sent to the enclosing game instance as lexically visible. As for fragments, it’s sent to the asteroid no matter whether it’s defined in Asteroid or inherited. If Asteroid defines it, it’s lexically visible and the asteroid receives it according to the lexical visibility rule. If ScreenObject defines it (and assuming Game doesn’t), it’s lexically invisible and the asteroid receives it because the asteroid is self.

What if Game also had a method named fragments? In that case the definition in Game would win, so in order to get the fragments of the asteroid we would need to write the send explicitly as self fragments. What if replace:with: is inherited by Game from the superclass? In that case we also need to make the receiver explicit, because otherwise it would go to self:

game replace: self with: fragments.

This brings us back to the problem of writing a method named game returning the enclosing instance of Game. Now we know enough to come up with a very simple solution: we add the method to the class Game (not Asteroid), defined as

game = (
    ^self
)

So, now that we know how it all works—why does it work this way? Why bring lexical scoping of messages into the picture? Is it Right™ to give priority to methods of another object over inherited methods of the receiver? Why not just follow the NewtonScript example—it appears simple and reasonable enough.

The thing is that compared to NewtonScript, class nesting in Newspeak is intended to solve a very different problem.

NewtonScript is prototype-based, and its object nesting is motivated by the need to represent the nesting of UI elements and support communication within that structure. The nesting and the associated message processing is essentially a built-in Chain of Responsibility pattern, very common in the UI field. (As an aside, Hopscotch implements NewtonScript-like mechanism for sending a message up the chain of nested UI elements. This is implemented as a simple library facility with no more language magic than doesNotUnderstand:. Implicit receivers further help make this very unobtrusive).

In Newspeak, even though the “free” back pointer from Asteroid to Game in our example was nice to have, having it was not the primary reason for nesting one class in the other. In fact, class nesting is a rather unwieldy mechanism for implementing arbitrary parent-child relationships—imagine having to define a group of classes for the scenario of “buttons and list boxes in a window,” and an entirely different one for “buttons and list boxes in a tab control in a window”!

Newspeak-like class nesting and method lookup are designed to handle relationships that have more to do with the modular structure of code, and with the organization of information sharing within a module. A is nested in B when B is a larger-scale component as far as the system architecture is concerned. Even though such nesting does effectively create a parent-child relationship between Bs and As, there are parent-child relationships that do not justify nesting.

In nested Newspeak classes, a child calls onto the functionality provided by the parent not as a fallback case in chain-of-responsibility processing, but as a way of interacting with its architectural context. The lookup policy supports that and at the same time allows to modularize the context. That is the motivation behind putting lexical visibility before inheritance.

If this sounds somewhat vague, I hope some of the more realistic examples in the next part will help clarify this point.

Continues in part 3.

A Taste of Nested Classes, part 1

Thursday, December 4th, 2008

In the recent posts I mentioned that because of class nesting the receiver of an implicit-receiver message send is not necessarily “self”. The full story with a comparison to other approaches is told in Gilad’s paper On the Interaction of Method Lookup and Scope with Inheritance and Nesting. In this series of posts I provide an informal example-oriented introduction for someone familiar with Smalltalk, to help better understand the full story.

First an important terminology note. When we say “class” we can refer to either a (textual) class definition or a class metaobject. Because the relationship between the two in Smalltalk is normally one-to-one, we can pretty much ignore that ambiguity. The Newspeak situation is different, and not thinking of definitions and metaobjects as separate things is perhaps the easiest way to get confused. To avoid that, in this post I’ll stick to spelling out “class definition” or “class metaobject” fully, unless there is no danger of confusion.

Quite naturally, “nested classes” first of all mean nested class definitions. Why would we want to nest one class definition inside another? This, of course, largely depends on what exactly such nesting translates to and what benefits it brings, and that’s what this whole series is about. For starters (even though this by far is not the primary benefit) let’s assume that we do it as a matter of code organization. For example, suppose we are implementing the immortal Asteroids game and decide to nest the definition of the Asteroid class inside the Game class. In the current Newspeak syntax it would look something like this:

class Game = ()
(
    class Asteroid = () ()
)

This rather basic implementation will not bring us hours of gaming enjoyment, but at least we do now have one class definition nested inside the other. How do we work with this thing?

Starting from the outside, in the current Newspeak system hosted in Squeak, this definition would create a binding named “Game” in the Smalltalk system dictionary, holding onto the Game class metaobject. This means that in a regular Smalltalk workspace we could evaluate

Game new

and get an instance of Game.

I want to emphasize right away that despite what this example may suggest, Newspeak has no global namespace for top-level classes. This availability of Game as a global variable in Smalltalk alongside the regular Smalltalk classes is the scaffolding, useful here to write examples in plain Smalltalk to keep things more familiar for the time being.

So what about Asteroid—how could we instantiate that one? Nesting a class in another essentially adds a slot to the outer class definition, with an accessor to retrieve the value of the slot. That value is the metaobject of the inner class. So, we get an inner class (metaobject) by sending a message to an instance of its outer class (metaobject). Notice that it’s not “by sending a message to the outer class”—“an instance of” is an important difference:

game := Game new.
asteroidClass := game Asteroid.
asteroid1 := asteroidClass new.
asteroid2 := asteroidClass new

So far nothing really surprising, apart from having to go through an instance of one class to get at another class. Let’s now change the example:

class1 := Game new Asteroid.
class2 := Game new Asteroid.
asteroid1 := class1 new.
asteroid2 := class2 new

This is more interesting. asteroid1 and asteroid2 are both instances of Asteroid and behave the way the class definition says. But since the Asteroid class references come from different game instances, what can we expect beyond that? For example, what would these evaluate to:

asteroid1 class == asteroid2 class
asteroid1 isKindOf: class2

The answer is false, for both of them. The two instances of Game produced by Game new evaluated twice contain independent Asteroid class metaobjects. Their instances asteroid1 and asteroid2 are technically instances of different classes, even if those metaobjects represent the same class definition. (Relying on explicit class membership tests is a poor practice in Smalltalk, and this illustrates why it’s even more so in Newspeak).

Let’s now look at another variation:

game := Game new.
asteroid1 := game Asteroid new.
asteroid2 := game Asteroid new.

In this case

asteroid1 class == asteroid2 class

is true because the two sends of the message Asteroid retrieve the same Asteroid class metaobject held onto by the instance of Game.

Let’s now summarize the objects involved and their relationships. They are shown in the following picture

Nested classes illustration

There is a class metaobject for the Game class, contained by the system dictionary (shown in gray to emphasize that it’s not an “official” part of the scheme). An instance, connected to the class with a dotted line to represent the instance-of/class-of relationship, holds the Asteroid class metaobject. That metaobject has instances of its own, which are the asteroid1 and asteroid2 of our example.

If the Game class has additional instances (one such instance is shown small in the diagram), each instance has its own copy of the Asteroid class metaobject. If there were multiple levels of nested classes, the scheme would repeat for the deeper levels.

The diagram also illustrates a concept that will be central to the next part of this series: the enclosing object. In terms of our example, an enclosing object of an asteroid (instance) is the game (instance) that owns the class (metaobject) of the asteroid. This relationship is obviously quite important and useful: simply put, it means that an asteroid knows what game it is a part of. That’s simply by virtue of nesting the Asteroid class definition inside Game, without having to explicitly set up and maintain a back pointer from an asteroid to its game.

(Continued in part 2).

How to Learn to Stop Worrying and Love Implicit Receivers

Sunday, November 30th, 2008

I remember a thread on squeak-dev sometime in summer 2007 discussing implicit self as a hypothetical feature for Smalltalk. Back then we were not yet talking about Newspeak, and so I held my peace. But not forever—Andres recently wrote a blog post about the same thing, with a good summary of most of the same points I remember from the thread. This time Newspeak is known out there and explicitly mentioned in the post, so it’s a good chance to offer an alternative view and correct some misconceptions.

Here is the most important one. Quoting from Andres’ post:

I find that the consistency offered by a few keystrokes makes it easier for me to read and understand code faster and more accurately. Therefore, since we read code much more often than we write it, I think that favoring reading speed over typing speed is the right decision to make.

As far Newspeak is concerned, what we have is not “implicit self”, and its purpose is not saving keystrokes. What Newspeak has are implicit receivers. Because of class nesting, a message with an implicit receiver may really be sent to an object different from the “real” self (the receiver of the current message). This feature is very important in supporting the minimalist module system of Newspeak. Thus, an implicit receiver is not simply an omitted self, and inserting “self” into a message send with an implicit receiver is not a behavior-preserving transformation.

Next point:

I’d rather see self than having to assume it by scanning the first token until the first occurrence of $: (or ‘::’) to only then be able to disambiguate between a receiver and a keyword.

In short: I prefer the work of my internal parser to be made easier by the use of prefixes, rather than to have to keep a stack that only goes down in the presence of a suffix.

This aurgmnet is falwed for the smiple raeson that our percpetion dose’nt wrok this way. We do’nt hvae an intarenl parser. What we rellay hvae culod be desrcbeid as a comlepx adpative, preditcive and bakcptaching paettrn recogznier. This is why we can still read the above even though most of the words are messed up. We don’t scan the text linearly one character and one token at a time. Words are pictures, not character arrays. The structure of lines such as:

foobar baz.
foobar: baz.
foobar:: baz: foo.

and the presence or absence of : or :: in each are perceived by an experienced reader in one “processor cycle”. In short, this is a usability issue, and it’s a mistake to see it as an engineering one.

Those left unconvinced should also consider that modern IDEs, Newspeak’s Hospcotch included, do the parsing for you by colorizing the source code.

So it’s by far not a given that implicit receivers reduce readability. On the other hand, there are situations when they improve readability by eliminating noise. A good example are DSLs embedded in Newspeak. So far we have two such languages widely used in the system: Gilad’s parser combinators and my UI combinators in Hopscotch. The feature common to both are definitions written in a declarative style combining smaller things into larger ones. Compare an example of such a definition the way it’s commonly written:

heading: (
    row: {
        image: fileIcon.
        label: fileName.
    })
details:
    [column: folderContents]

with the same definition with explicit receivers:

self heading: (
    self row: {
        self image: self fileIcon.
        self label: self fileName.
    })
details:
    [self column: self folderContents]

The first example has nothing but the structure it defines. It’s important what the expressions say. The fact that they are message sends is an implementation detail. The second example leaks this implementation, and it takes some effort to see what it really says in between all the “self”s.

And the final point. There’s much to be said about the human nature and the tendency to instinctively resist change to something familiar while trying to rationalize that resistance. It takes some time and experimenting to see a change for what it is and get a feel of the new tradeoffs. I don’t know how many people with a considerable Self expertise are out there, but I doubt there are too many. I do know all the experienced Newspeakers. Which is to say, the time for conclusions hasn’t come yet.

Update:

Peter Ahé pointed out offline that I didn’t mention another important property facilitated by implicit receivers in Newspeak as well as in Self: representation independence.

Brazil Example: a Classic Smalltalk Browser

Sunday, November 23rd, 2008

Here is an example of a classic Smalltalk browser implemented in Brazil. While the main point here is to illustrate Brazil API, an important related point I want to make first is that in our system Brazil is not used the way this example shows (and neither do our browsers look like that).

A good analogy to Brazil’s role in our UI is assembly language. One can program in it directly, but most of us these days don’t do that on any significant scale. In the same vein, one can think of Brazil as the UI assembly language. Earlier I described Hopscotch as an application framework, but that was simply to associate it with something familiar to everyone. In reality, Hopscotch is more of a higher-level UI language that hides or recasts in a different form the details of the low-level language it is based on. And just like one usually isn’t concerned with the exact translation of a high-level language program into machine code, a Hopscotch programmer doesn’t work with Brazil API directly.

The picture of the browser.

What it looks like

In the example, we create a window with the usual four list boxes and a text view, and wire them up so that list selection works the way it works in Smalltalk. The screenshot shows the browser open as a native Vista UI, and browsing the Object class from Squeak. The window in the background is the Newspeak workspace with the expression I used to instantiate and open the browser. The part in parentheses has to do with class nesting and the use of nested classes to implement modules. If you haven’t followed one of Gilad’s explanations, feel free to ignore the parenthesised part for the purposes of this example. The remainder,

ClassicBrowser new open

in the end produces the result a Smalltalker would expect—if not in the entirely expected way.

And here is the ClassicBrowser class itself. First, the class definition and the initializer:

class ClassicBrowser = (
"A simple implementation of a read-only Smalltalk browser."
|
	window = Window new.
	categories = ListBox new.
	classes = ListBox new.
	protocols = ListBox new.
	selectors = ListBox new.
	code = TextView new.
|
	assembleUI.
	configureUI.
)

This may look almost like a method with a comment, a list of temporaries in a not-quite-Smalltalk syntax (but note the vertical bars flushed to the left), and two expressions in the body. The comment here is the class comment, and the “temporaries” are actually slot declarations with initializers. Slots mean instance variables with automatically generated accessor methods. All slots of this class are read-only because of the use of the equals sign tying each slot to its initializer. A read-only slot is simply one with a getter method but no setter. The slots hold onto Brazil visuals assembled and configured into a coherent interface by the two methods called from the initializer. Here is how the assembly is done.

assembleUI = (
	| navigationRow |
	window area bounds: (200 @ 200 extent: 600 @ 600).
	window
		title: 'Brazil System Browser';
		content: Column new.

	navigationRow: Row new.

	window content
		add: navigationRow;
		addBlankSize: 3;
		add: code.

	navigationRow area height: 0; elasticity: 4.
	code area height: 0; elasticity: 6.

	navigationRow
		add: categories;
		addBlankSize: 3;
		add: classes;
		addBlankSize: 3;
		add: protocols;
		addBlankSize: 3;
		add: selectors.

	categories area elasticity: 1.
	classes area elasticity: 1.
	protocols area elasticity: 1.
	selectors area elasticity: 1.
)

The interesting point here all those “area” things the method talks to. Areas is a concept introduced in Brazil layout model to avoid the mess common in other UI frameworks. What’s messy is that it’s never clear who is in charge of a widget’s layout, and how to specify that layout in a particular situation. For example, in Morphic all morphs understand the message bounds: that repositions the morph, as well as the more focused messages position: and extent:. There are also specialized ones like vResizing: or cellSpacing:, which may or may not work in a particular setup. Given that some parent morphs such as AlignmentMorph manage the layout of their children, what happens when I place a morph inside one and then send position: to it? And how can I find out which of the plethora of layout messages that Morph understands will actually work in a given context?

The general problem here is that the protocol used to control the layout of a child depends on the context (the context being first of all the parent who manages the layout). Thus, Morph needs to expose the union of all possible layout attributes, only some of which are applicable at any given time.

The solution in Brazil is simple and natural, though I don’t know of other frameworks taking the same stance. A Brazil visual (the framework’s term for a widget) is unaware of any layout attributes, and has no methods to change them. Layout parameters are managed by a separate object called the visual’s area. The area is manufactured at the time a visual is added to the parent, as an instance of a class capturing the layout attributes meaningful for that particular visual in that particular parent. Thus an area represents the capability of positioning a widget inside the current parent. When a visual is a child of CompositeVisual (a free-form container similar to VisualWorks’ CompositePart), its area allows to arbitrarily position the visual by sending the bounds: message. When the same visual is a child of a Row, the area is an instance of a different class whose attributes and behavior are meaningful for a row cell. Arbitrarily moving the child should not be possible in that situation, and it is indeed impossible since this kind of area does not even understand the bounds: message.

Besides excluding what is impossible, areas also nicely manage and document the possible layout strategies. For example, in one situation we may want to force a Button inside a CompositeVisual to have a particular size. In a different situation, we might want to allow the button to resize to match the size of its label. These two layout disciplines are captured by two area classes called Frame and Anchor, both of which are allowed as areas of a button inside a composite. Assigning one or the other to a Button inside a CompositeVisual selects the layout strategy we want to use for the button.

A third benefit of this design feature is that unlike traditional frameworks whose layout strategies are hard-coded, with individual widgets holding onto layout attributes and behavior to control it, Brazil allows for pluggable layout strategies. For example, to extend the framework with a container that positions its children using a linear constraint system, all that’s needed is the container visual itself plus one or more area classes to capture the constraint parameters. None of the existing visuals require any changes to be used as children of the new container.

Back to the example. Now that the widgets are all assembled, the next method configures and initializes them:

configureUI = (
	configureCategoryList.
	configureClassList.
	configureProtocolList.
	configureSelectorList.
	configureCodeView.
)

Everything is straightforward here (again, we are freely using implicit receiver sends to avoid the clutter of “self” everywhere). Here is just one of the individual configuration methods—others are similar:

configureCategoryList = (
	categories 
		list: Smalltalk organization categories;
		displayBlock: [:symbol | symbol asString].
	categories selectionChanged => (self ~ #categorySelected:).
)

Again, nothing unusual, except perhaps for the last line. The part in parentheses is a section, in this case a shorthand for

[:category | self categorySelected: category]

The message => is part of Ducts, Brazil’s change notification framework, a minimalist alternative to Announcements (two classes and under 20 public methods, with all the same capabilities). It ties a change notifier supplied by the selectionChanged method to the section, so that whenever the selection changes the categorySelected: message is sent to the browser.

The categorySelected: method is straightforward again.

categorySelected: selection <Symbol | nil> = (
	selection
		ifNil: [classes list: Array new]
		ifNotNil: 
			[| names |
			names: (Smalltalk organization listAtCategoryNamed: selection).
			classes list: (names asSortedCollection collect: [:each | Smalltalk at: each])]
)

How to Design a Smalltalk UI Framework

Thursday, November 13th, 2008

It’s common for Smalltalk implementations to run on multiple platforms, and that leads to the problem of designing a framework to allow an application UI to display on all those platforms. The existing solutions (not only in Smalltalk) suffer from various shortcomings. In order to solve this problem better, Brazil from the start was based on a number of principles different from the industry examples. Here is an overview of those principles.

It’s not about widgets. A UI framework is not (just) a widget library. In that sense, Morphic is on the right track (and in the mainstream, so is WPF). Interfaces visualize information, and a good interactive visualization is typically more than a window with a few labels and list boxes plopped onto it. A UI framework needs to do much of what was traditionally expected from a structured graphics (“diagramming”) framework. Of course, buttons, list boxes and other primitive interactive elements are still necessary, but the value of a framework is primarily in composing rather than in implementing them. Which, in fact, it shouldn’t do at all.

When it’s about widgets, it’s about native widgets. Any OS provides about the same core set of primitive interactive elements. They are what the users expect to see in a “real program”, and the OS by definition does a better job implementing them than anyone could ever hope to. Emulated widgets are bound to be a poor replica of the original and are thus an exercise in futility. On the way from Smalltalk-80, one attraction of the emulated approach might have been the ability to customize, extend and roll your own, for the sake of creating more expressive UIs. But when a UI framework is also a graphics framework, expressive UIs can be built by composition rather than by widget hacking—so there really is no good reason not to use what’s already available, even if its implementation is sealed inside the OS.

Cross-platform shouldn’t mean platform-agnostic. So what about the lowest common denominator problem? It’s often assumed that using native widgets in a cross-platform environment means that only the subset of their features common across all platforms can be accessible. The mistake of this view is the assumption that it’s not cross-platform unless the user is shielded from platform differences. It doesn’t have to be that way, and shouldn’t be. Platforms are different, and a properly designed cross-platform framework should embrace and model those differences rather than try and inevitably fail creating an illusion that they do not exist. In practice, such modeling means that a framework object that represents a widget can provide access to the lowest common denominator features, while platform-specific features can be available as a “capability”—a separate object one can fetch from the widget when the capability is available (i.e. when the current platform is the one whose capabilities you ask for).

Qt is not your friend. It’s a common and understandable question—why not use an existing cross-platform framework like Qt or wxWidgets. Everything that has been said above is reason enough, and another important consideration is that a large complex hard-to-debug third party layer shielding you from the OS facilities you want to use sounds like bad news (something I dubbed as “high Space Shuttle factor” in the original implementation study). Which is also related to the last point.

If it’s not written in Smalltalk, it’s broken. There is a tendency (at least in Smalltalk-80 descendants) to do OS interaction through primitives. That is so perhaps because FFI arrived relatively late to ObjectWorks, and never was available for Squeak. Primitives are an arcane enough feature, so they are not discussed in Smalltalk style guides. If they were, the guides might have said something like this: many primitives you see in VisualWorks and Squeak should not have existed. A primitive is supposed to do what cannot be expressed in Smalltalk. If a call out to the OS or a call back from the OS cannot be expressed in Smalltalk, you need an FFI, not a VM plugin. OS interaction is not something that belongs in a primitive—and if fact, a primitive is about the worst place to do that kind of work. It means writing a large body of complex code in a low-level language with a higher probability of making a mistake, and then hiding it away in the VM where it can’t be easily debugged and fixed. Hardly a winning combination.