Bitwise operations

Sometimes a number is just a number. But binary computers have long offered us many ways to interpret an n-bit word:

It is usual for programming language type systems to distinguish sharply between addreses, floating point numbers, and integers.

Integers

Signed integers could be handled in many ways (e.g., by using base -2). In practice three ways are used, one of them common.

For the rest of this note, we shall assume that the bits of an integer are numbered from 0 as the least significant bit to n==31 (for a C/Java int) or n>=63 (for a C long long or Java long) and that the value of a signed integer a is that v between -2n-1 and 2n-1-1 inclusive such that

v ≡ Σi=0nai2i (mod 2n)

Bitwise operations

Recall your truth tables
x y NOT y x AND y x OR y x XOR y
FalseFalseTrue FalseFalseFalse
FalseTrue FalseFalseTrue True
True FalseTrue FalseTrue True
True True FalseTrue True False

There are several ways to represent True and False in a computer. Two machines I've used took any even number as False and any odd number as True; their conditional branch instructions looked at the least significant bit only. Some languages I've used took any negative number as True and any non-negative number as False; their conditional branch instructions looked at the sign bit only. C and its successors take 0 as False and any other number as True. Note that operations in C that deliver a newly calculated logical result return 0 or 1, never -1.

We shall take 0 for False and 1 for True. Here are the truth tables again.
x y NOT y x AND y x OR y x XOR y
001000
010011
101011
110110

The bitwise operations in C work on all the bits in a word independently:
a = ~y a[i] = NOT y[i]
a = x & y a[i] = x[i] AND y[i]
a = x | y a[i] = x[i] OR y[i]
a = x ^ y a[i] = x[i] XOR y[i]

Note the difference between ~ and !: !0 is 1, but ~0 is all 1s (= -1 in 2s complement).

But what if you want to do the equivalent of testing a[i], setting a[1] = 1, or clearing a[i] = 0? We need things called shift instructions for that. Conventionally, a machine will have four of them:

On twos-complement machines, the signed left shift, if any, will do the same as the unsigned one, except perhaps for setting flags differently. For example, the ARM CPU in your phone offers LSL (logical=unsigned shift left), LSR (logical shift right), and ASR (arithmetic=signed shift right).

In C, there are two shift operators: << and >>. Historic practice was that
unsigned << amountis logical shift left
signed << amountis arithmetic shift left
unsigned >> amountis logical shift right
signed >> amountis arithmetic shift right

although the standard weaseled out and allows a compiler to use the unsigned right shift for signed numbers if it really feels like it. To get around this, Java offers
signed << amountsome left shift, you can't tell
signed >> amountarithmetic shift right
signed >>> amountlogical shift right

So here is one way we can use these operators.
To test a[i] ((a >> i) & 1) != 0
To clear a[i] = 0 a &=~ 1 << i
To set a[i] = 1 a |= 1 << i

Counting in base B

Suppose we want to count in base B. For example, suppose we wanted to write a function to test whether a given string is parenthesis balanced. Here's an attempt:

boolean is_balanced(String s) {
    int depth = 0;
    final int n = s.length();
    for (int i = 0; i < n; i++) {
        switch (s.charAt(i)) {
            case '(': depth++; break;
            case ')': if (depth == 0) return false;
                      depth--;  break;
            default : break;
        }
    }
    return depth == 0;
}

It would be a very good idea to test this. The first time I typed it up I accidentally omitted the depth-- statement. I only noticed because the tests came out wrong. In order to test it, we'd like to generate strings of L characters drawn from the alphabet {'x','(', ')'}.

final int L = 4 /* for example */;
final char[] digit = new char[L+1];
for (int i = 0; i <= L; i++) digit[i] = 'x' /*0*/;
while (digit[L] == 'x') {
    final String s = new String(digit, 0, L);
    System.out.println(s + " => " + is_balanced(s));
    int i = 0;
    while (digit[i] == ')' /* B-1 */) {
        digit[i] = 'x';    /*  0  */
        i++;
    }
    digit[i] = digit[i] == 'x' ? '(' : ')'; /* increment digit[i] */
}

Suppose we want to count using base 2. Then we could use an array of Boolean values:

final boolean[] digit = new boolean[L+1];
for (int i = 0; i <= L; i++) digit[i] = false;
while (!digit[L]) {
    ... use digit[] ...
    int i = 0;
    while (digit[i]) {
        digit[i] = false;
        i++;
    }
    digit[i] = true;
}

But if L < 32, we can pack the bits into an int, or if L < 64, we can pack them into a (long) long.

int a = 0;
while (((a >> L) & 1) == 0) {
    ... use a ...
    int i = 0;
    while (((a >> i) & 1) != 0) { /* test bit i */
	a &=~ 1 << i; /* clear bit i */
	i++;
    }
    a |= 1 << i; /* set bit i */
}

Or we can go a step further:

for (int a = 0; a < (1 << L); i++) {
    ... use a ...
}

This is the fastest, simplest, way I know to do an exhaustive search over choices out of L things. The only way to improve on it is not to do an exhaustive search.

Sets

A string of n bits (an n-bit integer) can represent a subset of {0...n-1}. The value i is in the set if and only if the ith bit is 1.
x UNION y x | y
x INTERSECTION y x & y
x RELATIVE-COMPLEMENT y x &~ y
x SYMMETRIC-DIFFERENCE y x ^ y
i IS-ELEMENT-OF x (((x >> i) & 1) != 0
x UNION {i} x | (1 << i)
x RELATIVE-COMPLEMENT {i} x &~ (1 << i)