Common Processes

Metadata

Author: Richard A. O'Keefe
Date Created: 2011.08.30
Date Revised: 2012.03.02
Experts: (any implementors want to volunteer?)
Prerequisites: None
Supersessions: None
Incompatibility: None known
Existing material: The Blue Book; Inside Smalltalk; VisualWorks Smalltalk; VisualAge Smalltalk; Squeak; Pharo; Dolphin Smalltalk; GNU Smalltalk; POSIX.2 (Single Unix Specification)
License: The code in my Smalltalk is completely free, as is all code in this STEP.
Editor: Richard A. O'Keefe until replaced.
Abstract: The Blue Book description of concurrency support in Smalltalk-80 still holds good, or very nearly so, for the Smalltalks listed above. It's time this was in the standard. Multicore machines are now commonplace; in order to exploit this in portable Smalltalk we need standard concurrency support. The de jure concurrency interface might as well be the de facto one.

Motivation

There is no concurrency support in the ANSI Smalltalk standard.

There are at least four reasons why we need concurrency support.

There are problems which are most naturally modelled or solved using concurrency. One of them is being Smalltalk: evaluating (Process allInstances size) gave 3 in VisualAge, 4 in GNU Smalltalk, 7 in Squeak, and 31 in VisualWorks. User and network interfaces are especially obvious. The Blue Book used processes for discrete event simulation, which remains an application area of practical importance for which Smalltalk could be useful.
Multicore computers are no longer exotic or expensive. The laptop I'm using this year has 4 cores; the one I used last year 2. I have access to a 16-core machine, which my own Smalltalk runs on. The TILE64 64-core chip has been around since 2008. Tilera claim to be “Delivering the World's First 100-Core General Purpose Processor”. We should be able to program machines like these effectively in Smalltalk.
Smalltalk has had concurrency longer than Java has existed, yet what do we see advertised for multicore machines? C/C++, in some markets Fortran, and always, Java. Smalltalk's long-standing suitability has not been realised.
Present-day Smalltalk implementations are quite close to the Blue Book interface and could easily be made closer to each other. The few changes I suggest below are mostly “one-liners”. The problem is that current systems offer many methods that are not mutually compatible and there is no easy way for a programmer to find out what will port and what will not.

As it happens, I don't much care for the Smalltalk concurrency model. It's a simple model, but a dangerous one. For my own system, I was originally glad that concurrency was not in the standard, because that left me free to explore the possibility of shared-nothing message-passing concurrency like Erlang's. However, as a step towards that, I quickly hacked up a thread library to sit on top of POSIX threads, and was shocked at how quickly it came together and how close it was possible to get to the Blue Book.

Now if only there were some way we could all know what concurrency operations were expected to be portable…

Technical Specification

Warning: this is a draft. I have couched it in the classic language of classes, rather than the Standard's language of protocols.

Process states

For the purposes of this description, each Process will be in exactly one of the following states:

waiting: The Process is not executing because it is waiting for some event, such as a Semaphore being signalled.
suspended: The Process is not executing because it has just been created or has suspended itself.
terminated: The Process has completed execution, either successfully or unsuccessfully. It could be useful to discover whether a Process was terminated because of an uncaught exception, but that's not (yet) in this interface, so we do not need to distinguish these kinds of termination.
ready: The Process is ready to run. Some Smalltalk systems call this “runable”, which is a spelling mistake not to be perpetuated.
active: The Process is actually executing.

There is a standard way to tell whether aProcess isTerminated; there is no standard way to discriminate the other states, because the current state of a Process is an evanescent property, except for being terminated, which is stable.

“Evanescence” refers to the fact that in a truly concurrent system, if you ask what state a shared object is in, by the time you look at the answer the object may well not be in that state any more. Even in a single-core machine, if there is pre-emptive scheduling such properties must always be treated as out of date. Some properties, like names given to processes, could change and so be evanescent, but in practice they don't change, and have been omitted. One aim in developing this proposal is that no properties that change evanescently as part of normal operation should be included. That does not mean that existing systems cannot continue to offer everything they offer now, only that intrinsically unreliable operations should not be in the standard.

global constant Processor

Each Process sees a global constant called Processor. Since a Processor might have the responsibility of scheduling tasks on a particular core, it is credible that there might be a Processor object per core. It is therefore not specified whether there is one Processor variable or many. Since a Process might be moved from one core to another, it is not guaranteed that a process will always see the same Processor object:

Processor yield; yourself == Processor

might answer false because the process was migrated between the two evaluations of Processor; just as any other global variable might have different values before and after a #yield.

The class that the/each Processor belongs to is not specified. In particular, ProcessorScheduler is not to be part of the standard.

The methods specified here for Processor all either refer to the “active” process (the one making the request) or return a fixed constant.

object methods

activePriority
same as Processor activeProcess priority
activeProcess
answer the Process corresponding to the thread of execution that did the asking; this does not imply that no other thread was active at the same time. This Process is necessarily in the active state.
highestPriority
answer an Integer such that no process may have a larger priority.
lowestPriority
answer an Integer such that no process may have a smaller priority.
suspendActive
put the activeProcess into the suspended state, which it will remain in until sent #resume.
terminateActive
cause the activeProcess to cease execution. Forcibly exit any #critical: regions it was in. It is the programmer's responsibility to ensure that all data so protected are in consistent states so that this is safe.
userInterruptPriority
answer an integer suitable as a priority for urgent tasks
userSchedulingPriority
answer an integer suitable for tasks with a user interface
userBackgroundPriority
answer an integer suitable for non-urgent tasks without a user interface
waitFor: aDuration
put Processor activeProcess into a waiting state for the given length of time; if aDuration is not positive, this has the same effect as #yield.
waitUntil: aDateAndTime
put Processor activeProcess into a waiting state until the given time point is reached; if aDateAndTime is not in the future, this has the same effect as #yield.
yield
put Processor activeProcess into the ready state and then choose any of the ready Processes with highest priority to enter the active state; that might be the same Process.

Note that Processor lowestPriority ≤ Processor userBackgroundPriority ≤ Processor userSchedulingPriority ≤ Processor userInterruptPriority ≤ Processor highestPriority, but any or all of them might be equal.

While #lowestPriority is not implemented in all the systems I checked, it is implementable by adding a one-liner.

Only Dolphin has #suspendActive, but that's a one-line addition. It is important not to include a facility for suspending any other process in the standard.

VisualAge does not have #terminateActive, but that's a one-line addition. It is important not to include a facility for terminating any other process in the standard.
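As a minimal sketch of how these requests might be used together, assuming the Processor methods proposed above are installed (several are additions in this draft, not something every existing system offers) and assuming the ANSI Duration class:

```smalltalk
"Workspace sketch using only the Processor methods proposed above."
| p |

"The priority constants are ordered, although some may be equal."
p := Processor userSchedulingPriority.
(Processor lowestPriority <= p and: [p <= Processor highestPriority])
    ifFalse: [self error: 'priority constants out of order'].

"Give other ready processes of the same priority a chance to run."
Processor yield.

"Put the active process into the waiting state for about two seconds."
Processor waitFor: (Duration seconds: 2).

"A process may only suspend or terminate itself through this interface.
 After the next line, some other process would have to send it #resume."
"Processor suspendActive."
```

The last request is left commented out because a workspace process that suspends itself has nobody to resume it.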

class Process

class methods

None presented.

instance methods

isTerminated
answer true if the receiver has completed execution, false if it has not.
name
answer a <readableString> or nil; if not nil the name is not necessarily unique
name: aString
aString should be a <readableString> or nil; it or a copy will be saved for use as the name of the process
onUncaughtExceptionDo: aBlock
If a Process gets an uncaught exception, it invokes aBlock, passing it the signalled exception and the Process. The block is executed as part of the failing Process. The effect is not specified if no handler of last resort is installed.
printOn: aStream
I'd like to say something about this, such as, the output must begin with 'a Process' and should mention the name and priority of the process. There is a rough consensus for 'a Process(name, priority, state)', but each system checked deviates from that in at least one way.
priority
answer an integer such that if process a has priority p and process b has priority q and p < q and both a and b are ready to run at the same instant it will not be the case that a is chosen to run and b is not (although it might be the case on a multicore system that both are chosen). These priorities need bear no relationship to thread or process priorities in the host operating system (if any).
priority: anInteger
if anInteger is an integer and is between Processor lowestPriority and Processor highestPriority inclusive, set the priority of the receiver to anInteger. If not, raise an exception. It is possible that there might be only one legal priority.
If it seems undesirable that one thread should be able to change another's priority, recall that pthread_setschedparam() can do the same in POSIX threads.
resume
cause a suspended process to become ready. Sending this message to a waiting process is illegal. The effect of sending this to a process in any state other than waiting or suspended is not specified.

Rationale

VisualAge lacks #isTerminated; I'm not sure if #isDead is the same thing, but I think so. This state test is included because once it becomes true, it remains true. Other state tests are evanescent.

The Blue Book (and Common) methods #suspend and #terminate are deliberately absent from this draft. The safe uses of them have been replaced by #suspendActive and #terminateActive in Processor.

#suspend and #resume are useful building blocks for an operating system, which is why they have a place in real Smalltalks. They are disasters for building reliable user-level programs, because any synchronisation pattern you might set up could be disrupted from the outside. To quote the Java documentation:

Why are Thread.suspend and Thread.resume deprecated?
Thread.suspend is inherently deadlock-prone. If the target thread holds a lock on the monitor protecting a critical system resource when it is suspended, no thread can access this resource until the target thread is resumed. If the thread that would resume the target thread attempts to lock this monitor prior to calling resume, deadlock results. Such deadlocks typically manifest themselves as 'frozen' processes.

In addition, if you want to implement Smalltalk threads on top of POSIX threads, it matters that pthreads do not provide suspension or resumption.

#onUncaughtExceptionDo: is new. The intention is to underpin an Erlang-style linking mechanism so that a failing Process can take down a whole group of Processes, or so that a worker can be restarted by a supervisor. But any approach you might want needs to be hooked in somehow, and allowing a single block to be invoked seemed the simplest and most general.
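The supervisor idea could be sketched roughly as follows. This is only an illustration under assumptions: #onUncaughtExceptionDo: is the method proposed in this draft, SharedQueue is the class specified later in this document, and the names failures, job and startWorker are hypothetical. What happens to the failing Process after the block returns is not settled by this draft; the sketch assumes it simply stops.

```smalltalk
| failures job startWorker |
failures := SharedQueue new.

"Hypothetical job that always fails with an uncaught exception."
job := [1/0].

startWorker := [
    | worker |
    worker := job newProcess.
    worker name: 'worker'.
    "The block runs inside the failing process and only reports the failure."
    worker onUncaughtExceptionDo: [:exception :process |
        failures nextPut: process name].
    worker resume.
    worker].

startWorker value.

"Supervisor: wait for one failure report, then start a replacement."
[failures next.
 startWorker value] fork.
```

A real supervisor would loop and limit the restart rate; this one restarts the worker once to keep the sketch short.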

<niladicBlock>

instance methods

atPriority: aPriority
evaluate the receiver as if by sending #value, with the priority of the invoking process temporarily set to aPriority.
fork
↑ (self newProcess) resume; yourself
forkAt: aPriority
↑ (self newProcess) priority: aPriority; resume; yourself
forkAt: aPriority named: aString
↑ (self newProcess) priority: aPriority; name: aString; resume; yourself
forkNamed: aString
↑ (self newProcess) name: aString; resume; yourself
newProcess
answer a new Process that is ready to send #value to the receiver, but is in a suspended state, and has priority Processor activePriority.

<blocks in general>

instance methods

newProcessWith: anArray
if self valueWithArguments: anArray would raise an
error, raise it now, in the calling process, otherwise
↑[self valueWithArguments: anArray] newProcess

As shown, any Smalltalk that supports #newProcess and the Process methods #name:, #priority:, and #resume can trivially support all of these methods.
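A small usage sketch, assuming #forkNamed: as proposed above together with the Semaphore and SharedQueue classes specified later in this document; the names done and results are hypothetical:

```smalltalk
"Fork three named workers and wait until each has reported its result."
| done results |
done := Semaphore new.
results := SharedQueue new.
1 to: 3 do: [:i |
    [results nextPut: i * i.
     done signal] forkNamed: 'worker ', i printString].

"Each #wait consumes one #signal, so this returns only after
 all three workers have sent #signal."
3 timesRepeat: [done wait].
```

Note that the parent never asks about the state of a worker; it learns that the work is finished only through the semaphore.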

class Semaphore

A Semaphore may be thought of as containing a non-negative integer counter and a possibly empty queue of waiting processes. Semaphores are identity objects.

class methods

new
answer a new semaphore with counter 0 and empty queue.
forMutualExclusion
answer a new semaphore with counter 1 and empty queue.

instance methods

critical: aBlock
self wait.
↑[aBlock value] ensure: [self signal].
printOn: aStream
The output begins with 'a Semaphore'.
signal
If the queue is empty, increment the counter,
otherwise remove the first process from the queue
and make it ready.
signalAfter: aDuration
signal the receiver at some time aDuration after the present. If aDuration is not strictly positive, the receiver is signalled immediately.
signalAt: aDateAndTime
signal the receiver at some time after aDateAndTime. If aDateAndTime is not in the future, the receiver is signalled immediately.
wait
If the counter is positive, decrement it,
otherwise put this process into the waiting state, add it as the last
element of the queue, and schedule some other process.

The #signalAfter: and #signalAt: methods are new to Semaphore. Most Smalltalks give that responsibility to Processor. The Blue Book and VisualAge have #signal:atTime:, GNU Smalltalk has #signal:atMilliseconds:, and Dolphin has #signal:afterMilliseconds:. Squeak has #timeoutSemaphore:afterMSecs: and puts it in Delay. With no agreement over what class has the responsibility, I've chosen to put delayed signalling in the same class as undelayed signalling. This provides a common interface to the implementation-specific methods.
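One use of delayed signalling is a wait with a timeout. The following is a sketch only, assuming #signalAfter: as proposed above and the ANSI Duration class; reply and wakeup are hypothetical names:

```smalltalk
"Wait for a reply, but give up after about five seconds."
| reply wakeup |
reply := nil.
wakeup := Semaphore new.

[reply := 6 * 7.            "stand-in for a slow computation"
 wakeup signal] fork.

wakeup signalAfter: (Duration seconds: 5).
wakeup wait.                "returns on the reply or on the timeout"

reply isNil
    ifTrue:  [Transcript show: 'timed out'; cr]
    ifFalse: [Transcript show: reply printString; cr].
```

Whichever signal arrives second is left on the semaphore. That is harmless here because the semaphore is private to this exchange and is discarded afterwards.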

DELETED METHODS

~~isEmpty~~
Answer true if the queue is empty, false if it is not.
~~notEmpty~~
Answer false if the queue is empty, true if it is not.
~~size~~
Answer the number of Processes waiting in the queue.

Beware! These are consensus methods. It is common practice for a Semaphore to be a LinkedList instead of having one, which results in these methods being inherited. They are missing from VisualAge, where Semaphores are not collections of any kind, but would have trivial implementations there.

However, these are evanescent properties. In a truly concurrent environment, the fact that a semaphore's queue was (or was not) empty when you asked a few nanoseconds ago doesn't mean it is still empty (or not) now.

They are not part of this specification.

Semaphores have a grave defect, which is that if a process tries to acquire a resource it is already holding, it deadlocks itself. Another kind of synchronisation object is needed, which Pharo and my Smalltalk, following POSIX, call a “Mutex” and VW and GNU ST call a RecursionLock. Mutex may be added to the next draft.

Priorities have been a strong feature of Smalltalk concurrency since the Blue Book. Locking using semaphores can lead to priority inversion, where a high priority process is delayed while waiting for a semaphore held by a low priority process. The best known methods for coping with this rely on knowing which process holds a lock so that its priority can be temporarily adjusted. As noted in the previous paragraph, there is no notion of a semaphore being held by a particular process. Even a semaphore created forMutualExclusion is just a semaphore initialised a particular way. This is another reason why Mutex (or RecursionLock) really belongs in the standard.
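The self-deadlock mentioned above can be shown in a few lines. This is a sketch of the failure, not something to run in a process you care about, since the inner #critical: never returns:

```smalltalk
| lock |
lock := Semaphore forMutualExclusion.
lock critical: [
    "The counter is now 0 and this process is inside the region."
    lock critical: [
        "Never reached: the inner #wait queues this process behind
         itself, and nobody is left to send #signal."
        Transcript show: 'unreachable'; cr]].
```

Because the semaphore has no record of which process decremented its counter, it cannot recognise that the second request comes from the process already inside the region.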

class SharedQueue

class methods

new
answer a new SharedQueue with a small capacity
new: size
answer a new SharedQueue with initial capacity size,
where size is a non-negative Integer.

instance methods

next
Wait, if necessary, until the queue is notEmpty.
Remove and return the first element of the queue.
nextPut: item
Add item as the new last element of the queue.

DELETED METHODS

~~isEmpty~~
answer true if the queue has no elements, false if it has some
~~notEmpty~~
answer true if the queue has some elements, false if none
~~peek~~
If the queue is empty, answer nil, otherwise answer
the first element of the queue without removing it.
~~size~~
Answer the number of elements in the queue.

Beware! #isEmpty, #notEmpty, #peek, #size are consensus methods, but they do not make sense in a truly concurrent system. In a classic system we expect

(x := aSharedQueue peek) isNil or: [x = aSharedQueue next]

to be true, but in a truly concurrent system (or even a single core system with pre-emptive scheduling) this can easily fail. These methods may be removed from the next draft.

There is another reason for omitting #peek, which is that existing systems do not agree. As described in Inside Smalltalk and implemented in GNU Smalltalk, if the queue is currently empty, #peek waits until there is an element and then returns it. As implemented in Squeak, VisualWorks, and Dolphin, #peek is really #peekOrNil, answering nil if there's nothing there at the moment.

Squeak offers a #nextOrNil method which answers nil if the queue is currently empty. Dolphin calls it #nextNoWait. Other Smalltalks do not seem to have it. Something like this might be in the next draft.

class Delay

The argument for putting Delay in the standard is that it is standard. All Smalltalk systems known to me include a Delay class which can be used to make a Process wait some amount of time, and they all include Blue Book methods.

The argument against putting Delay in the standard is that the behaviour is not common. Systems do not agree on the answer to “once a Process has begun to wait on a Delay, when is it safe to use that Delay again?” I've found three answers:

at once or even at the same time — a Delay is just a wrapped Duration and waiting is done by waiting with a timeout on a Semaphore (or other synchronisation object) that is never signalled;
when the delay time has expired — a Delay is part of the scheduling machinery and holds important information about the suspension while it is happening;
never — a Delay is part of the machinery and is never cleaned up.

The presence of #resumptionTime in the Blue Book protocol isn't compatible with the answer “at once”, although its undefinition when there is no delayed Process isn't compatible with safe use either.

One problem is that there isn't any commonly available way to tell whether a Delay is in use by another process (other than knowing that it has not escaped to any other process, of course) and in a multicore system there is no possible simple way to tell, this being an evanescent property.

The standard could provide Delay with single-shot semantics, which the other systems could support. The problem is that programmers using multi-shot Delays in their systems might think “I am using Delays; Delays are standard; therefore my program is standard” when it is not. This code, taken from a well known Smalltalk system, is not portable:

delay := Delay forMilliseconds: 50.
[self anyButtonPressed] whileFalse: [delay wait].
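Two ways the same polling loop might be written without reusing a Delay are sketched here. The first relies only on single-shot use of Delay; the second assumes the Processor #waitFor: method proposed in this draft and the ANSI Duration class. #anyButtonPressed is the method from the excerpt above.

```smalltalk
"1. Create a fresh Delay for every wait, so no Delay is used twice."
[self anyButtonPressed] whileFalse: [(Delay forMilliseconds: 50) wait].

"2. Ask Processor to wait; 1/20 of a second is 50 milliseconds."
[self anyButtonPressed] whileFalse: [
    Processor waitFor: (Duration seconds: (1/20))].
```

The first form allocates a Delay per iteration, which is the price of not depending on when a Delay becomes safe to reuse.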

variable Transcript

One new method is required so that concurrent threads may safely share the transcript.

object methods

critical: aBlock
aBlock should be a <niladicBlock>. A lock is acquired, aBlock is invoked, and the lock is released.
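A usage sketch, assuming Transcript responds to #critical: as proposed here: two processes each write three complete lines, and the lock keeps the pieces of one line from interleaving with another process's output.

```smalltalk
#('first' 'second') do: [:tag |
    [1 to: 3 do: [:i |
        Transcript critical: [
            Transcript
                show: tag;
                show: ' line ';
                show: i printString;
                cr]]] fork].
```

The order in which the lines from the two processes appear is still not determined; only the integrity of each line is.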

Negative properties

With the exception of the classes and objects in this STEP, no standard objects are intended to be shared by concurrent processes. An object may be created by one thread and handed off to another, but there must be at least one synchronisation operation between the last access from the first thread and the first from the second.
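The hand-off rule can be illustrated with a SharedQueue acting as the synchronisation operation between the two processes. This is a sketch; channel and batch are hypothetical names.

```smalltalk
| channel |
channel := SharedQueue new.

"Producer: builds the collection, hands it off, and never touches it again."
[ | batch |
  batch := OrderedCollection new.
  1 to: 10 do: [:i | batch add: i].
  channel nextPut: batch.        "last access by the producer"
  batch := nil ] fork.

"Consumer: may use the collection freely once #next has answered it."
[ | batch |
  batch := channel next.         "first access by the consumer"
  batch add: 11 ] fork.
```

The OrderedCollection itself is never accessed by both processes at once; only the SharedQueue is shared.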

Rationale

Processor

There appears to be no reason for most programs to be aware of the ProcessorScheduler class, so it's not included. Assorted books claim that there is only one Processor object. My system makes Processor a class with no instances, the easiest way to get a single named object, but that cannot be imposed on other systems. There are advantages in having a “scheduler” object per CPU core, but there's no reason that object has to be Processor.

The #terminateActive and #suspendActive methods cover the safe uses of #terminate and #suspend, so that we do not need to include those rather dangerous operations in the standard.

SharedQueue

If you have Semaphores, you can have SharedQueues; the Blue Book is quite clear about how to do that. So there's really very little reason not to provide them.

Backwards-incompatible changes

As noted, #suspend and #terminate are not in this interface. Nothing prevents an implementation adding them.

There are two changes that deserve serious consideration.

For the first we have the example of the Single Unix Specification before us. A process that terminates itself can be assumed to know what locks it is holding and to be responsible for ensuring that breaking these locks is safe. But if one process terminates another, the killer cannot know what locks the victim holds or whether it is safe to break its hold on them, nor is the victim expecting to be killed so that it can make this safe. POSIX offers robust locks: if a thread that holds a mutex is killed, the next thread to claim the mutex is given it, but warned that it is in an inconsistent state. That thread may then repair the state, and tell the mutex that all is well again. If not, a second attempt to claim the lock will be treated as an error. Mimicking that requires the inclusion of Mutex/RecursionLock.

Reference implementation

Existing Smalltalk implementations are already pretty close to this. I could supply change sets for Squeak, Pharo, Dolphin, and VisualWorks, and an additional source file for GNU Smalltalk easily enough.

Additions to Processor

This is expressed in terms of ProcessorScheduler, just so that it can be used with some existing systems.

ProcessorScheduler
  methods:
    suspendActive
      self activeProcess suspend.
    terminateActive
      self activeProcess terminate.
    waitFor: aDuration
      |d|
      0 < (d := aDuration asSeconds)
        ifTrue:  [(Delay forSeconds: d) wait]
        ifFalse: [self yield].
    waitUntil: aDateAndTime
      self waitFor: aDateAndTime - DateAndTime now

Additions to Semaphore

Semaphore
  methods:  
    signalAfter: aDuration
      |d|
      0 < (d := aDuration asSeconds) 
        ifTrue:  [[(Delay forSeconds: d) wait. self signal] fork]
        ifFalse: [self signal].
    signalAt: aDateAndTime
      self signalAfter: aDateAndTime - DateAndTime now.

SharedQueue

Here I provide a model implementation of SharedQueue for three reasons.

VisualAge omits it.
The #nextPutAll: method is not common.
Existing implementations use an OrderedCollection, which is more of a performance disaster in some systems than others, but is never good.

Time millisecondsToRun: [
  |n q|
  n ← 1000.
  q ← class new: n.
  1 to: n do: [:x | q nextPut: x].
  1 to: 1000000 do: [:i | q nextPut: q next]]

Dialect	Built-in	This version	Speedup
VisualWorks	12,345 msec	400 msec	30·85
Pharo	57,490 msec	1,094 msec	52·55
GNU	5,274 msec	3,801 msec	1·39
Dolphin	6,815 msec	5,406 msec	1·26
astc*	585 msec	342 msec	1·71

This is of course a contrived case, but it's easy to contrive. [*] The astc code uses POSIX mutexes and conditions rather than semaphores.

Object subclass: #SharedQueue
  instanceVariableNames: 'array head tail size capacity mutex avail'
  "invariants:
    array isMemberOf: Array
    array size = capacity
    1 ≤ head ≤ capacity
    1 ≤ tail ≤ capacity
    0 ≤ size ≤ capacity
    avail size = size"

  class methods for: 'instance creation'
    new
      ↑self new: 5
    new: n
      ↑self basicNew pvtPostNew: (n max: 1)

  methods for: 'initialization'
    pvtPostNew: n
      array    ← Array new: n.
      capacity ← n.
      size     ← 0.
      head     ← 1.
      tail     ← 1.
      mutex    ← Semaphore forMutualExclusion.
      avail    ← Semaphore new.

  methods for: 'accessing'      
    next
      |r|
      avail wait.
      mutex critical: [
        r ← array at: head.
        array at: head put: nil.
        head ← head = capacity ifTrue: [1] ifFalse: [head + 1].
        size ← size - 1].
      ↑r
    nextPut: item
      mutex critical: [
        size = capacity ifTrue: [
          |a n p|
          n ← capacity + size.
          a ← Array new: n.
          p ← head.
          1 to: size do: [:i |
            a at: i put: (array at: p).
            p ← p = capacity ifTrue: [1] ifFalse: [p + 1]].
          array    ← a.
          capacity ← n.
          head     ← 1.
          tail     ← head + size].
        array at: tail put: item.
        tail ← tail = capacity ifTrue: [1] ifFalse: [tail + 1].
        size ← size + 1].
      avail signal.
      ↑item
    nextPutAll: items
      |m|
      m ← items size.
      mutex critical: [
        size + m > capacity
          ifTrue: [
            |a n p|
            n ← (capacity max: m) + size.
            a ← Array new: n.
            p ← head.
            1 to: size do: [:i |
              a at: i put: (array at: p).
              p ← p = capacity ifTrue: [1] ifFalse: [p + 1]].
            items do: [:each |
              a at: (size ← size + 1) put: each].
            array    ← a.
            capacity ← n.
            head     ← 1.
            "wrap tail when the new array is exactly full"
            tail     ← size = capacity ifTrue: [1] ifFalse: [size + 1]]
          ifFalse: [
            items do: [:each |
              array at: tail put: each.
              tail ← tail = capacity ifTrue: [1] ifFalse: [tail + 1]].
            size ← size + m]].
      1 to: m do: [:i | avail signal].
      ↑items

Additions to Block and NiladicBlock

Block
  methods:
    newProcessWith: anArray
      ((anArray isKindOf: Array) and:
       [anArray size = self argumentCount]
      ) ifFalse: [self valueWithArguments: anArray "die"].
      ↑[self valueWithArguments: anArray] newProcess

NiladicBlock
  methods:
    atPriority: aPriority
      |thisProcess oldPriority|
      thisProcess ← Processor activeProcess.
      oldPriority ← thisProcess priority.
      ↑[thisProcess priority: aPriority. self value]
         ensure: [thisProcess priority: oldPriority]


The End.