Search

# The bitwise complement, or not operator, is unsafe

I came upon a problem with bitwise complement (logical not): it isn’t safe. It’s nothing new, but it really registered while implementing the logical operations in Leaf. The issue is the result depends on the size of the type. This leads to generating invalid results when combined with operators like `and`, `or`, `xor`. I don’t see an easy fix for this problem.

### The problem

The bitwise complement inverts the bits of a binary value: a logical not operation on each bit. This operation only makes sense if we’re speaking of a fixed bit-size. This means numbers have different complements depending solely on the size of their type. Consider the value `1`:

• `¬1` = `0`
• `¬01` = `10`
• `¬001` = `110`
• `¬0001` = `1110`

The complement’s value depends on the size of `1`. That almost sounds silly, and slightly disconcerting. But it’s the reality in programming. And it’s problematic.

 ```1 2 3 4``` ```var a : binary 32bit = pine() var b : binary 64bit = cone() var c = b and not a ```

At first glance this seems fine, but it most likely doesn’t produce the desired result. `not a` only produces a complement for the 32 bits in `a`. It will then be 0 extended to 64-bits. The top 32-bits will always be cleared, which is not likely the intent.

Something curious happens when we use a signed value. If the top-most bit of a signed 2’s complement value is `1`, the value is negative. When negative values are extended, they are sign-extended: the new bits take the value of the highest most bit in the input. This keeps negative values negative.

This makes the not operator magically work in a lot of common situations. A positive number is negative in its complement form, and will be bit extended “correctly”.

C also promotes all small integers to the normal integer size before doing any bitwise operations. Given a `int8_t a` the result of `~a` is a 32-bit value. This again, makes some operations work as expected but is confusing when you start dealing with larger integers.

This leads to results that aren’t as immediately obvious as one might hope (the result is in the comment on each line):

 ```1 2 3 4 5 6 7 8 9``` ```printf( "%lX\n", ~int(0) & ~int64_t(1) ); //FFFFFFFFFFFFFFFE printf( "%lX\n", ~int64_t(0) & ~int64_t(1) ); //FFFFFFFFFFFFFFFE printf( "%lX\n", ~uint64_t(0) & ~uint64_t(1) ); //FFFFFFFFFFFFFFFE printf( "%lX\n", ~unsigned(0) & ~uint64_t(1) ); //FFFFFFFE printf( "8bit: %X\n", ~int(0) & 0x7F ); //7F printf( "8bit: %X\n", ~int(0) & int8_t(0x7F) ); //7F printf( "8bit: %X\n", ~int(0) & 0x80 ); //80 printf( "8bit: %X\n", ~int(0) & int8_t(0x80) ); //FFFFFF80 ```

It’s these small unexpected situations that can create hard to track defects in code.

### Leaf complexity

Leaf allows for literal rational types. This is to ensure values are not truncated prematurely, or assigned to the wrong type. It solves the 1/2 = 0 problem. It however magnifies the bitwise complement problem.

Literal values may be declared, for convenience, without an explicit type:

 ```1 2``` ```literal flag_a = 0x1 literal flag_b = 0x2 ```

That is very much different than:

 ```1 2``` ```literal flag_a : binary = 0x1 literal flag_b : binary = 0x2 ```

Which is also different than explicit bit-sizes:

 ```1 2``` ```literal flag_a : binary 16bit = 0x1 literal flab_b : binary 128bit = 0x1 ```

The `not` of these values is a different scalar depending on the size of the type. Without care it can be error-prone to use it. I have yet to think of a good general solution, but have some ideas.

#### Material nonimplication operator

One key location where this causes a problem is in the expression `a and not b`. It requires `not b` be evaluated without knowing the size of `a`, thus resulting in an incorrect complement.

On closer look though, this operation is actually the material nonimplication operation. If instead we had a single operator the problem would go away. With `a ↛ b` the two types can be unified in size first, and then the operation done. This compiles to the standard `a and not b` but with `b` having the correct bit size.

The operator `↛` works in Leaf, but on the odd chance your keyboard doesn’t have that symbol, I’ve given it a second one: `and_not`. I’m uncertain of the name, but certainly more programmers will understand that than material nonimplication. Possibly `exclude` is another name for it.

#### Removing not

I thought about removing bitwise complement entirely, but that seemed somewhat radical. Though not always safe, it is a useful operation.

I’ve also added several other operators to see if it helps: xor ↮, nor ↓, nand ↑. I won’t know if this is helpful until more Leaf code is created.

The other option is take a C-like approach and special encode the complement. Not as a negative, but perhaps a special complement marker. This would be an easy addition to rational numbers, but for binary values it would require an actual new type marker to encode. It sounds like more trouble than it’s worth.

`not` isn’t safe and I don’t know what to do about it!

Please join me on Discord to discuss, or ping me on Mastadon.

### The bitwise complement, or not operator, is unsafe

A Harmony of People. Code That Runs the World. And the Individual Behind the Keyboard.

#### Mailing List

Signup to my mailing list to get notified of each article I publish.

Search