Arrays considered harmless

In February 2017 I encountered a web article called “Arrays considered somewhat harmful by Eric Lippert. The article is dated September 22, 2008.

Eric Lippert is a experienced and capable programmer with a depth of knowledge about C# that I could never hope to equal. Anything he writes is worth reading. But even Homer nods. In this case, there is a fact about C# which he appears to take for granted as The Way Things Must Be. But it isn't so.

Context

Eric Lippert wrote, “I got a moral question from an author of programming language textbooks the other day requesting my opinions on whether or not beginner programmers should be taught how to use arrays.”

This is a seriously weird question. Should beginning carpenters be taught about screws? Should beginning lawyers be taught about contracts? Should beginning pilots be taught how to operate radios? We don't need to teach beginning programmers about classes or assignment statements, and arguably shouldn't. And just as we teach children the alphabet before we teach the concept of a verb, so it makes sense to teach about numbers and functions before we teach about arrays. Based on the kind of code I see 3rd-year programmers writing, I would say that it is deeply unethical to teach beginning programmers about ArrayList<T> before they have mastered arrays.

Lippert's claims

Every single one of these claims is true for C#, but false for some other languages, which shows us that the problems are not problems with the array concept but with C#. Java has the same gross defect. Ironically, C and C++ don't.

The presupposition

Arrays in general provide

Mutable arrays provide

Stretchy arrays provide the benefits of mutable arrays plus

A read-only array is an array that is not mutable.

Lippert's presupposition is that read-only arrays do not or cannot exist, that all arrays are at least mutable.

That is true in Java and C#, bad cess to them. Ironically, it isn't true in C or C++.

The Joy of Read-Only Arrays

Let's look at read-only arrays in Standard ML.

There are no mutating operators.

Read-only arrays provide compact storage, fast size and element access, and can support a rich collection of operations. (I have not begun to show a complete list here.)

My Smalltalk system has a ReadOnlyArray class (amongst many read-only collection classes), supporting 975 operations. The Array class adds only 73 more. The joy and freedom, moving from Java or C# to a language where there's a sequence data structure with minimal overheads that can be freely passed around without fears of inadvertent mutation is inexpressible. (Maybe Oh what a beautiful morning can express it.)

It isn't just arrays that are the problem

You probably should not return any mutable object as the value of a public method or property, or if you do, you should return a new copy. It's not the arrayness that's the problem, it's the mutability. Consider the following classes:

class Point {
    public int x;
    public int y;
    Point(int x_, int y_) { x = x_; y = y_; }
}
class Rectangle {
    public Point topLeft;
    public Point bottomRight;
    Rectangle(Point tl, Point br) { topLeft = tl; bottomRight = br; }
}

where you can imagine getter and setter methods if you like.

    Point aPoint = aRectangle.topLeft;
    aPoint.x = 0;

changes the position of the top left corner of aRectangle without aRectangle being involved in any way. Instead you have to write

class Rectangle {
    private int top, left, bottom, right;
    public Point topLeft() { return new Point(left, top); }
    public void topLeft(Point tl) { top = tl.y; bottom = tl.x; }
    public Point bottomRight() { return new Point(right, bottom); }
    public void bottomRight(Point br) { bottom = br.y; right = br.x; }
    public Rectangle(Point tl, Point br) { topLeft(tl); bottomRight(br); }
}

Now

    Point aPoint = aRectangle.topLeft();
    aPoint.x = 0;

does nothing to the rectangle. It isn't topLeft() returning an array that's the problem. It doesn't. It returns a Point. The problem is that topLeft() returns a mutable object. If Points were read-only, a Rectangle would not have to generate new Points every time it was asked for a corner.

A more deeply hidden belief

Why would anyone imagine that an array must be mutable? The obvious answer is “because they think that's how arrays have to be initialised”. That's not true. The building block we need is something like tabulate. For example, to implement map we can write

fun map (f, v) =
    tabulate (length v, fn i ⇒ f (sub (v, i)));