Are All Linux Kernel Memory Barriers Transitive?

Now this one was a trick question! ;-)

This trick question involved the following code fragment, where each function foo_n() runs on CPU n:

int x, y; /* shared variables */
int r1, r2, r3; /* private variables */

void foo_0(void)
{
	ACCESS_ONCE(x) = 1;
}

void foo_1(void)
{
	r1 = x;
	smp_rmb();  /* The only change. */
	r2 = y;
}

void foo_2(void)
{
	y = 1;
	smp_mb();
	r3 = x;
}

Suppose that the following assertion runs after all of the preceding functions complete. Can this assertion ever trigger?

assert(!(r1 == 1 && r2 == 0 && r3 == 0));

Appealing to Intuition

Once again, when it comes to memory barriers, a little intuition can be a very dangerous thing. Nevertheless, let's at least see where it leads, which is pretty much the same place as before.

Let's assume that the assertion can trigger. This means that r1 == 1, which means that foo_0() must have executed before foo_1() did. For the assertion to trigger, we must also have r2 == 0, which means that foo_1() must have executed before foo_2() did. But if foo_0() executed before foo_1() and foo_1() executed before foo_2(), then foo_2()'s load from x must assuredly see foo_0()'s store to x. This in turn means that r3 == 1, preventing the assertion from triggering.

So, once again, most people's intuitions would be much happier if the assertion could never trigger.

Appealing to Linux Kernel Documentation

Digging through the Linux kernel documentation gives us the same result as before, namely that there is no guarantee.

Appealing to Hardware Documentation

But the truly insane will read the actual code along with all of the relevant hardware documentation. Because ARM does not have a weak barrier that preserves the order of reads, ARM defines smp_rmb() to be the same as smp_mb(), which means that ARM gives the same result as before. (ARM does have a weaker memory barrier that orders stores, but this is not used for smp_wmb() as of the 2.6.34 Linux kernel.)

Power uses the lwsync instruction for smp_rmb(), which orders prior loads against subsequent loads and stores and also prior stores against subsequent stores. However, foo_0()'s store to x and foo_2()'s read from x constitute a store followed by a load, for which the lwsync instruction does not guarantee ordering. The assertion can therefore trigger, and in fact does trigger on real hardware.

So if you want your memory accesses to act in a transitive fashion, use smp_mb() rather than either smp_rmb() or smp_wmb().