Declarative semantics define program behaviour in two steps: they map a program to a set of candidate executions, and then use a consistency predicate to decide which of those candidates are allowed.
1. Candidate Execution Graphs
We can represent candidate executions as graphs, where nodes represent events, i.e. memory instructions (reads, writes, updates, fences), and edges represent relations such as program order and reads-from. Such a graph is not EVERY POSSIBLE EXECUTION, but a CANDIDATE ONE.
In such a graph:
Solid edges represent program order ($\mathit{po}$). This is a strict ordering that specifies which instructions must occur before others.
Dashed edges represent reads-from ($\mathit{rf}$). This relation connects write instructions to the read instructions that read the value written.
Nodes represent memory instructions: writes ($\mathtt{W}\ x\ v$) and reads ($\mathtt{R}\ x\ v$).
Columns of nodes represent different threads.
Example 1
To map our store buffering example to a graph, we can do:
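As a concrete program, the store-buffering litmus test can be sketched in C11 as follows (a minimal sketch with illustrative names, using relaxed atomics so the program itself is race-free); each thread writes one variable and then reads the other:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

/* Shared variables; their initialisation corresponds to the writes W x 0 and W y 0. */
atomic_int x = 0, y = 0;
int a, b;

static void *left(void *arg) {
    atomic_store_explicit(&x, 1, memory_order_relaxed);   /* W x 1 */
    a = atomic_load_explicit(&y, memory_order_relaxed);   /* R y ? */
    return NULL;
}

static void *right(void *arg) {
    atomic_store_explicit(&y, 1, memory_order_relaxed);   /* W y 1 */
    b = atomic_load_explicit(&x, memory_order_relaxed);   /* R x ? */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, left, NULL);
    pthread_create(&t2, NULL, right, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("a=%d b=%d\n", a, b);   /* a=0, b=0 is the weak outcome discussed below */
    return 0;
}
```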
And here are three valid candidate execution graphs:
[Figure: three candidate execution graphs for store buffering. In (A) both reads have $\mathit{rf}$ edges from the initial writes and return 0. In (B) the read of x reads 1 from W x 1 via an $\mathit{rf}$ edge, while the read of y returns 1 without any $\mathit{rf}$ edge. In (C) the read of y returns 0 and the read of x returns the arbitrary value 67, neither being constrained by an $\mathit{rf}$ edge.]
We allow anything to be read unless it is constrained by a reads-from edge.
2. Formalising Executions
An event is a triple $\langle \mathit{id}, \mathit{tid}, \mathit{lab}\rangle$ where:
$\mathit{id}$ is a unique event identifier
$\mathit{tid}$ is a thread identifier ($0$ for initial writes)
$\mathit{lab}$ is a non-empty label
In the graphs, for conciseness, we omit $\mathit{id}$ and $\mathit{tid}$, only showing $\mathit{lab}$. Labels are defined below, where $x \in \mathsf{Loc}$ and $v, v_r, v_w \in \mathsf{Val}$:
$\mathtt{R}(x, v)$: a read of value $v$ from location $x$.
$\mathtt{W}(x, v)$: a write of value $v$ to location $x$.
$\mathtt{U}(x, v_r, v_w)$: an update (atomic read-modify-write) that reads $v_r$ from and writes $v_w$ to location $x$.
We can define some helper functions:
For a label $l$, we write $\mathrm{typ}(l)$ to denote the type of the label (i.e., $\mathtt{R}$, $\mathtt{W}$, $\mathtt{U}$).
Similarly, we can define $\mathrm{loc}(l)$, $\mathrm{val_r}(l)$, and $\mathrm{val_w}(l)$: the location accessed, the value read, and the value written (where defined).
For an event $e = \langle \mathit{id}, \mathit{tid}, \mathit{lab}\rangle$, $\mathrm{tid}(e) = \mathit{tid}$ and $\mathrm{lab}(e) = \mathit{lab}$.
We can lift $\mathrm{typ}$, $\mathrm{loc}$, $\mathrm{val_r}$, and $\mathrm{val_w}$ to events by defining $\mathrm{typ}(e) \triangleq \mathrm{typ}(\mathrm{lab}(e))$, $\mathrm{loc}(e) \triangleq \mathrm{loc}(\mathrm{lab}(e))$, $\mathrm{val_r}(e) \triangleq \mathrm{val_r}(\mathrm{lab}(e))$, and $\mathrm{val_w}(e) \triangleq \mathrm{val_w}(\mathrm{lab}(e))$.
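As a concrete (purely illustrative) instance of these helpers, consider an event labelled with a write of 1 to x:

$$e = \langle 3, 2, \mathtt{W}(x, 1)\rangle, \qquad \mathrm{tid}(e) = 2, \quad \mathrm{lab}(e) = \mathtt{W}(x, 1), \quad \mathrm{typ}(e) = \mathtt{W}, \quad \mathrm{loc}(e) = x, \quad \mathrm{val_w}(e) = 1.$$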
2.1 Relational Notation
Given a set $A$ and binary relations $r, r_1, r_2$ on $A$, we write:
$[A] \triangleq \{\langle a, a\rangle : a \in A\}$ for the identity relation of $A$.
$\mathrm{dom}(r)$ for the domain of $r$.
$\mathrm{codom}(r)$ for the range of $r$.
$r^{-1} \triangleq \{\langle b, a\rangle : \langle a, b\rangle \in r\}$ for the inverse of $r$.
$r_1 ; r_2 \triangleq \{\langle a, c\rangle : \exists b.\ \langle a, b\rangle \in r_1 \wedge \langle b, c\rangle \in r_2\}$ for the composition of $r_1$ and $r_2$.
$r^? \triangleq r \cup [A]$ for the reflexive closure of $r$.
$r^+ \triangleq \bigcup_{n \geq 1} r^n$, where $r^1 \triangleq r$ and $r^{n+1} \triangleq r ; r^n$, is the transitive closure of $r$.
$r^* \triangleq r^+ \cup [A]$ is the reflexive-transitive closure of $r$.
$\mathrm{irreflexive}(r) \triangleq \neg\exists a.\ \langle a, a\rangle \in r$ for the irreflexivity of $r$.
$\mathrm{acyclic}(r) \triangleq \mathrm{irreflexive}(r^+)$ for the acyclicity of $r$.
Given an event set $E$, a relation $r$ on $E$, a location $x$ and a thread $t$, we write:
$E_0$ denotes the initialisation events in $E$.
$E^t \triangleq \{e \in E : \mathrm{tid}(e) = t\}$ denotes the events in thread $t$.
$E_x \triangleq \{e \in E : \mathrm{loc}(e) = x\}$ denotes the events accessing location $x$.
$r|_t \triangleq r \cap (E^t \times E^t)$ denotes the restriction of $r$ to thread $t$.
$r|_x \triangleq r \cap (E_x \times E_x)$ denotes the restriction of $r$ to location $x$.
$r_{\mathsf{i}} \triangleq \{\langle a, b\rangle \in r : \mathrm{tid}(a) = \mathrm{tid}(b)\}$ is $r$ restricted to internal (same-thread) edges.
$r_{\mathsf{e}} \triangleq r \setminus r_{\mathsf{i}}$ is $r$ restricted to external edges.
$r|_{\mathsf{loc}} \triangleq \{\langle a, b\rangle \in r : \mathrm{loc}(a) = \mathrm{loc}(b)\}$ is $r$ restricted to edges on the same location.
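As a small sanity check of this notation (the set and relation here are made up for illustration), take $A = \{a, b, c\}$ and $r = \{\langle a, b\rangle, \langle b, c\rangle\}$:

$$\mathrm{dom}(r) = \{a, b\}, \quad \mathrm{codom}(r) = \{b, c\}, \quad r ; r = \{\langle a, c\rangle\}, \quad r^+ = \{\langle a, b\rangle, \langle b, c\rangle, \langle a, c\rangle\}, \quad r^* = r^+ \cup [A].$$

Here $\mathrm{acyclic}(r)$ holds; adding the edge $\langle c, a\rangle$ would create the cycle $a \to b \to c \to a$ and falsify it.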
2.2 Defining Executions
A (candidate) execution graph is a tuple $G = \langle E, \mathit{po}, \mathit{rf}\rangle$ where:
$E$ is a finite set of events, and $E_0 \subseteq E$ is the set of initialisation events in $E$.
$\mathit{po} \triangleq \big(E_0 \times (E \setminus E_0)\big) \cup \bigcup_t \mathit{po}_t$, where each $\mathit{po}_t$ is a strict total order on $E^t$. This is the program order.
$\mathit{rf}$ is the reads-from binary relation on $E$ such that $\mathit{rf}^{-1}$ is a function (i.e. if $\langle w_1, r\rangle \in \mathit{rf}$ and $\langle w_2, r\rangle \in \mathit{rf}$ then $w_1 = w_2$) and, for every $\langle w, r\rangle \in \mathit{rf}$, we have:
$\mathrm{typ}(w) \in \{\mathtt{W}, \mathtt{U}\}$, $\mathrm{typ}(r) \in \{\mathtt{R}, \mathtt{U}\}$, $\mathrm{loc}(w) = \mathrm{loc}(r)$, and $\mathrm{val_w}(w) = \mathrm{val_r}(r)$.
Let $G = \langle E, \mathit{po}, \mathit{rf}\rangle$ be an execution graph. Then:
$\mathtt{R} \triangleq \{e \in E : \mathrm{typ}(e) = \mathtt{R}\}$ is the set of read events in $G$.
$\mathtt{W} \triangleq \{e \in E : \mathrm{typ}(e) = \mathtt{W}\}$ is the set of write events in $G$.
$\mathtt{U} \triangleq \{e \in E : \mathrm{typ}(e) = \mathtt{U}\}$ is the set of update events in $G$.
$\mathtt{RU} \triangleq \mathtt{R} \cup \mathtt{U}$ is the set of read or update events in $G$.
$\mathtt{WU} \triangleq \mathtt{W} \cup \mathtt{U}$ is the set of write or update events in $G$.
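For instance, the store-buffering candidate in which both reads return 0 can be spelled out as an execution graph (event identifiers are illustrative; thread 0 holds the initialisation writes):

$$E = \{e_1{:}\,\mathtt{W}(x,0),\; e_2{:}\,\mathtt{W}(y,0),\; e_3{:}\,\mathtt{W}(x,1),\; e_4{:}\,\mathtt{R}(y,0),\; e_5{:}\,\mathtt{W}(y,1),\; e_6{:}\,\mathtt{R}(x,0)\},$$

with $e_1, e_2$ in thread 0, $e_3, e_4$ in thread 1, and $e_5, e_6$ in thread 2,

$$\mathit{po} \supseteq \{\langle e_3, e_4\rangle, \langle e_5, e_6\rangle\} \cup \big(\{e_1, e_2\} \times \{e_3, e_4, e_5, e_6\}\big), \qquad \mathit{rf} = \{\langle e_2, e_4\rangle, \langle e_1, e_6\rangle\}.$$

Note that every read has an incoming $\mathit{rf}$ edge.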
2.3 Consistency Predicates
A consistency predicate defines the set of allowed executions for a given memory model.
The most basic consistency predicate is completeness, which requires that every read event reads from some write event. Formally, an execution graph $G$ is complete if $\mathtt{RU} \subseteq \mathrm{codom}(\mathit{rf})$.
2.4 Sequential Consistency
Let $\mathit{sc}$ be a strict total order on $E$. $G$ is SC-consistent wrt $\mathit{sc}$ if:
$\mathit{po} \subseteq \mathit{sc}$. The execution order should respect program order.
$\mathit{rf} \subseteq \mathit{sc}$. The execution order should respect reads-from.
$\langle w, r\rangle \in \mathit{rf} \Rightarrow \neg\exists w' \in \mathtt{WU}_{\mathrm{loc}(r)}.\ \langle w, w'\rangle \in \mathit{sc} \wedge \langle w', r\rangle \in \mathit{sc}$. You always read the latest write.
Then $G$ is SC-consistent if it is complete and SC-consistent wrt some strict total order $\mathit{sc}$ on $E$.
Example 2
To map our store buffering example to a graph under SC, we can do:
[Figure: two candidate execution graphs for store buffering under SC. ALLOWED: the read of y returns 0 (from the initial write) and the read of x returns 1 (from W x 1). FORBIDDEN: both reads return 0, reading from the initial writes.]
2.5 Alternative SC
This is an alternative definition of SC, which is easier to reason about. Let $\mathit{mo}$ be the modification order of $G$: $\mathit{mo} \triangleq \bigcup_x \mathit{mo}_x$, where each $\mathit{mo}_x$ is a strict total order on $\mathtt{WU}_x$. Then $G$ is SC-consistent if:
It is complete ($\mathtt{RU} \subseteq \mathrm{codom}(\mathit{rf})$).
$\mathrm{acyclic}(\mathit{po} \cup \mathit{rf} \cup \mathit{mo} \cup \mathit{rb})$, where $\mathit{rb} \triangleq \mathit{rf}^{-1} ; \mathit{mo} \setminus [E]$. Reads-before means that if a read reads from a write, then any $\mathit{mo}$-later write to the same location must come after the read: it relates a read to later writes.
Intuitively, $\mathit{mo}$ is the order in which writes reach memory.
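To see this on the store-buffering candidate in which both reads return 0 (the FORBIDDEN graph of Example 2): each read reads from an initialisation write, while the other thread's write to the same location is $\mathit{mo}$-later, so $\langle \mathtt{R}(y,0), \mathtt{W}(y,1)\rangle \in \mathit{rb}$ and $\langle \mathtt{R}(x,0), \mathtt{W}(x,1)\rangle \in \mathit{rb}$. Together with program order this yields the cycle

$$\mathtt{W}(x,1) \xrightarrow{\;\mathit{po}\;} \mathtt{R}(y,0) \xrightarrow{\;\mathit{rb}\;} \mathtt{W}(y,1) \xrightarrow{\;\mathit{po}\;} \mathtt{R}(x,0) \xrightarrow{\;\mathit{rb}\;} \mathtt{W}(x,1),$$

so the acyclicity condition fails and the execution is not SC-consistent.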
Example 3
To map our store buffering example to a graph under alternative SC, we can do:
[Figure: the same two store-buffering executions with $\mathit{mo}$ and $\mathit{rb}$ edges added. ALLOWED: only the read of y has an $\mathit{rb}$ edge (to W y 1), so there is no cycle. FORBIDDEN: both reads have $\mathit{rb}$ edges to the other thread's write, closing a $\mathit{po} \cup \mathit{rf} \cup \mathit{mo} \cup \mathit{rb}$ cycle.]
Much simpler to verify, as we can just check for cycles in the union of relations.
2.6 Total Store Order (TSO)
Let $\mathit{tso}$ be a strict partial order on $E$. $G$ is TSO-consistent wrt $\mathit{tso}$ if:
$\mathit{tso}$ is total on $E \setminus \mathtt{R}$. Reads can be reordered, so it is total on everything but reads.
$\mathit{po} \setminus (\mathtt{W} \times \mathtt{R}) \subseteq \mathit{tso}$. As in SC, program order must be respected, except for write-read pairs, which can be reordered.
$\mathit{rf}_{\mathsf{e}} \subseteq \mathit{tso}$. All reads read either from the same thread ($\mathit{rf}_{\mathsf{i}}$) or from a different thread ($\mathit{rf}_{\mathsf{e}}$); external reads-from must be ordered by $\mathit{tso}$.
$\langle w, r\rangle \in \mathit{rf} \Rightarrow \neg\exists w' \in \mathtt{WU}_{\mathrm{loc}(r)}.\ \langle w, w'\rangle \in \mathit{tso} \wedge \langle w', r\rangle \in \mathit{tso} \cup \mathit{po}$. You always read the latest write that has either reached memory or sits in your own store buffer.
$\mathit{rf}_{\mathsf{e}} \triangleq \mathit{rf} \setminus \mathit{rf}_{\mathsf{i}}$ is the set of external reads-from edges in $G$. This means reads from other threads.
$G$ is TSO-consistent if it is complete and TSO-consistent wrt some strict partial order $\mathit{tso}$ on $E$.
Example 4
To map our store buffering example to a graph under TSO, we can do:
[Figure: the two store-buffering executions again. Under TSO both are ALLOWED: the outcome where the read of y returns 0 and the read of x returns 1, and the outcome where both reads return 0.]
2.7 Alternative TSO
This is an alternative definition of TSO, which is easier to reason about. Let $\mathit{mo}$ be the modification order of $G$, where each $\mathit{mo}_x$ is a strict total order on $\mathtt{WU}_x$. Then $G$ is TSO-consistent if it is complete and such that:
$\mathit{rf}_{\mathsf{i}} \cup \mathit{rb}_{\mathsf{i}} \subseteq \mathit{po}$. Internal reads-from and reads-before must respect program order.
$\mathrm{acyclic}(\mathit{ppo} \cup \mathit{rf}_{\mathsf{e}} \cup \mathit{mo} \cup \mathit{rb}_{\mathsf{e}})$, where $\mathit{ppo} \triangleq \mathit{po} \setminus (\mathtt{W} \times \mathtt{R})$ and $\mathit{rb} \triangleq \mathit{rf}^{-1};\mathit{mo} \setminus [E]$. Preserved program order relates all non-write-read pairs in program order, and external reads-from and reads-before must be consistent with modification order.
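Under this formulation the weak store-buffering outcome is allowed: in the cycle from the alternative-SC section, both program-order edges are write-read pairs, which $\mathit{ppo}$ excludes, and no other edge closes the cycle:

$$\mathtt{W}(x,1) \xrightarrow{\;\mathit{po} \setminus \mathit{ppo}\;} \mathtt{R}(y,0) \xrightarrow{\;\mathit{rb}\;} \mathtt{W}(y,1) \xrightarrow{\;\mathit{po} \setminus \mathit{ppo}\;} \mathtt{R}(x,0) \xrightarrow{\;\mathit{rb}\;} \mathtt{W}(x,1).$$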
3. COH & RA
3.1 Coherence (COH)
SC is very expensive to implement in hardware, and forbids optimisations that are sound for sequential code. Hardware and compilers preserve SC "per-location" (coherence).
An execution graph $G$ is coherent if $G$ is complete and, for every location $x$, there exists a strict total order $\mathit{sc}_x$ on $E_x$ such that:
$\mathit{po}|_x \subseteq \mathit{sc}_x$. Program order on location $x$ is respected.
$\mathit{rf}|_x \subseteq \mathit{sc}_x$ and $\langle w, r\rangle \in \mathit{rf}|_x \Rightarrow \neg\exists w' \in \mathtt{WU}_x.\ \langle w, w'\rangle \in \mathit{sc}_x \wedge \langle w', r\rangle \in \mathit{sc}_x$. Reads-from on location $x$ is respected, and there is no interleaving write.
3.2 Alternative COH
Let $\mathit{mo}$ be the modification order of $G$. Then $G$ is coherent wrt $\mathit{mo}$ if $\mathrm{acyclic}(\mathit{po}|_{\mathsf{loc}} \cup \mathit{rf} \cup \mathit{mo} \cup \mathit{rb})$ holds, where $\mathit{rb} \triangleq \mathit{rf}^{-1};\mathit{mo} \setminus [E]$. $\mathit{rf}$ and $\mathit{rb}$ already relate events on the same location, so we don't need to write $|_{\mathsf{loc}}$ for them.
$G$ is coherent if it is complete and coherent wrt some $\mathit{mo}$ for $G$.
Coherence Examples
Here are some examples:
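As a minimal sketch of a coherence test (illustrative; variable names assumed), consider one writer and one reader that loads the same location twice. COH forbids the reader from seeing the new value and then the old one, since that would close a $\mathit{po}|_{\mathsf{loc}} \cup \mathit{rf} \cup \mathit{mo} \cup \mathit{rb}$ cycle:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

atomic_int x = 0;
int a, b;

static void *writer(void *arg) {
    atomic_store_explicit(&x, 1, memory_order_relaxed);   /* W x 1 */
    return NULL;
}

static void *reader(void *arg) {
    a = atomic_load_explicit(&x, memory_order_relaxed);   /* first  R x ? */
    b = atomic_load_explicit(&x, memory_order_relaxed);   /* second R x ? */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, writer, NULL);
    pthread_create(&t2, NULL, reader, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* a=1, b=0 is forbidden by COH: it would require the cycle
       W x 1 -rf-> R x (a) -po|loc-> R x (b) -rb-> W x 1. */
    printf("a=%d b=%d\n", a, b);
    return 0;
}
```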
3.3 Problems with COH
Under COH, spinlocks do not guarantee mutual exclusion:
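As a minimal sketch, assuming a naive two-flag spinlock (each thread raises its own flag and then waits for the other's flag to be clear; this particular algorithm is an illustrative assumption):

```c
#include <stdatomic.h>

/* flag[i] = 1 means "thread i wants / holds the critical section". */
atomic_int flag[2] = {0, 0};
int shared = 0;                 /* data meant to be protected by the lock */

void naive_lock(int me) {
    atomic_store_explicit(&flag[me], 1, memory_order_relaxed);        /* W flag[me] 1 */
    while (atomic_load_explicit(&flag[1 - me], memory_order_relaxed)) /* R flag[other] */
        ; /* wait until the other thread is not interested */
}

void naive_unlock(int me) {
    atomic_store_explicit(&flag[me], 0, memory_order_relaxed);        /* W flag[me] 0 */
}
```

Each thread writes its own flag and then reads the other's, which is exactly the store-buffering shape: a COH-consistent (or TSO-consistent) execution lets both loads return 0, so both threads enter the critical section at once; only SC rules this outcome out.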
Moreover, COH does not allow message passing, so we can't really write any concurrent code. To solve this, we can strengthen COH with "observed" writes: change the $\mathit{po}|_{\mathsf{loc}}$-cycles to $\mathit{hb}|_{\mathsf{loc}}$-cycles in the acyclicity condition.
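The message-passing idiom meant here is roughly the following (a sketch with illustrative names, written with relaxed atomics so the C program itself is well-defined): one thread publishes data and then raises a flag; the other spins on the flag and then reads the data. Under COH the consumer can see flag = 1 and still read data = 0, because coherence orders accesses per location only.

```c
#include <stdatomic.h>

atomic_int data = 0, flag = 0;

void producer(void) {
    atomic_store_explicit(&data, 42, memory_order_relaxed);   /* W data 42 */
    atomic_store_explicit(&flag, 1,  memory_order_relaxed);   /* W flag 1  */
}

int consumer(void) {
    while (atomic_load_explicit(&flag, memory_order_relaxed) == 0)   /* R flag */
        ; /* wait for the flag */
    return atomic_load_explicit(&data, memory_order_relaxed);        /* R data: may be 0 under COH */
}
```

Under RA (below), the chain W data 42 $\xrightarrow{\mathit{po}}$ W flag 1 $\xrightarrow{\mathit{rf}}$ R flag 1 $\xrightarrow{\mathit{po}}$ R data is in $\mathit{hb}$, so reading the stale value 0 would create an $\mathit{hb}|_{\mathsf{loc}} \cup \mathit{mo} \cup \mathit{rb}$ cycle and is forbidden.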
3.4 Release Acquire (RA)
Let $\mathit{mo}$ be the modification order of $G$. Then $G$ is RA-consistent wrt $\mathit{mo}$ if $\mathrm{acyclic}(\mathit{hb}|_{\mathsf{loc}} \cup \mathit{rf} \cup \mathit{mo} \cup \mathit{rb})$ holds, where $\mathit{hb} \triangleq (\mathit{po} \cup \mathit{rf})^+$ and $\mathit{rb} \triangleq \mathit{rf}^{-1};\mathit{mo} \setminus [E]$. Here, $\mathit{hb}$ is the happens-before relation under RA, which relates events ordered by (chains of) program order and reads-from; the acyclicity condition restricts it to edges on the same location.
This is much like saying every write is a "release" and every read is an "acquire" in C.
An execution graph $G$ is RA-consistent if it is complete and RA-consistent wrt some $\mathit{mo}$ for $G$.
3.5 Ordering Models
Given two memory models $M_1$ and $M_2$, we can say that $M_1$ is stronger than $M_2$ if:
If an execution is $M_1$-consistent, then it is also $M_2$-consistent.
Equivalently (the contrapositive), if an execution is not $M_2$-consistent, then it is also not $M_1$-consistent.
4. C11 Semantics
We want to be careful not to over-synchronise our code, as that can lead to performance degradation. C11 allows us to specify memory access modes. Each memory access has a mode $o$:
Reads $\mathtt{R}^o(x, v)$, where $o \in \{\mathtt{na}, \mathtt{rlx}, \mathtt{acq}, \mathtt{sc}\}$.
Writes $\mathtt{W}^o(x, v)$, where $o \in \{\mathtt{na}, \mathtt{rlx}, \mathtt{rel}, \mathtt{sc}\}$.
RMWs $\mathtt{U}^o(x, v_r, v_w)$, where $o \in \{\mathtt{rlx}, \mathtt{acq}, \mathtt{rel}, \mathtt{acqrel}, \mathtt{sc}\}$.
Fences $\mathtt{F}^o$, where $o \in \{\mathtt{acq}, \mathtt{rel}, \mathtt{acqrel}, \mathtt{sc}\}$.
Given an event $e$, we write $\mathrm{mod}(e)$ to get the mode of $e$. We can order mode strength: $\mathtt{na} \sqsubset \mathtt{rlx} \sqsubset \mathtt{acq}, \mathtt{rel} \sqsubset \mathtt{acqrel} \sqsubset \mathtt{sc}$ (with $\mathtt{acq}$ and $\mathtt{rel}$ incomparable). Given a set of events $A$ and a mode $o$, we can write $A^{\sqsupseteq o} \triangleq \{e \in A : \mathrm{mod}(e) \sqsupseteq o\}$.
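In C11 source code these modes correspond to memory_order arguments, and plain accesses are non-atomic (na). A small illustrative sketch of release/acquire message passing:

```c
#include <stdatomic.h>

atomic_int flag = 0;
int data = 0;                       /* plain (non-atomic, na) location */

void send(void) {
    data = 42;                                                /* W^na  data 42 */
    atomic_store_explicit(&flag, 1, memory_order_release);    /* W^rel flag 1  */
}

int receive(void) {
    while (atomic_load_explicit(&flag, memory_order_acquire) == 0)   /* R^acq flag */
        ; /* wait for the flag */
    return data;                                              /* R^na data: guaranteed to see 42 */
}
```

The release store to flag and the acquire load that reads from it form a synchronises-with edge, so the non-atomic write to data happens-before the non-atomic read and the two do not race.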
4.1 C11 Definition
We can define happens-before with $\mathit{hb} \triangleq (\mathit{po} \cup \mathit{sw})^+$, where $\mathit{sw}$ is the synchronises-with relation defined below.
Let $\mathit{mo}$ be the modification order of $G$. Then $G$ is C11-consistent wrt $\mathit{mo}$ if $\mathrm{acyclic}(\mathit{hb}|_{\mathsf{loc}} \cup \mathit{rf} \cup \mathit{mo} \cup \mathit{rb})$ holds, where $\mathit{rb} \triangleq \mathit{rf}^{-1};\mathit{mo} \setminus [E]$. Then $G$ is C11-consistent if it is complete and C11-consistent wrt some $\mathit{mo}$ for $G$.
The synchronises-with relation is defined as $\mathit{sw} \triangleq [E^{\sqsupseteq\mathtt{rel}}] ; ([\mathtt{F}] ; \mathit{po})^? ; \mathit{rf}^+ ; (\mathit{po} ; [\mathtt{F}])^? ; [E^{\sqsupseteq\mathtt{acq}}]$.
This can be expanded to 4 different cases:
Release Sequences
We allow multiple updates between a release and an acquire, hence the $\mathit{rf}^+$ instead of just $\mathit{rf}$.
4.2 Catch Fire Semantics
Given a C11 execution graph $G$, we say that two events $a, b \in E$ are in a C11-race, $\langle a, b\rangle \in \mathit{race}$, if:
$a \neq b$. They are distinct events.
$\mathrm{loc}(a) = \mathrm{loc}(b)$. They access the same location.
$\{\mathrm{typ}(a), \mathrm{typ}(b)\} \cap \{\mathtt{W}, \mathtt{U}\} \neq \emptyset$. At least one is a write or RMW.
$\mathtt{na} \in \{\mathrm{mod}(a), \mathrm{mod}(b)\}$. At least one is non-atomic.
$\langle a, b\rangle \notin \mathit{hb}$ and $\langle b, a\rangle \notin \mathit{hb}$. They are not related by $\mathit{hb}$.
$G$ is C11-racy if $\mathit{race} \neq \emptyset$.
A program has undefined behaviour under C11 if it has a racy C11-consistent execution graph. An execution graph $G$ of a program $P$ is a C11-outcome of $P$ if $G$ is C11-consistent or $P$ has undefined behaviour under C11.
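For example, the following sketch (illustrative) has a C11-consistent execution in which the non-atomic write and read of data are unrelated by $\mathit{hb}$; they race, so the whole program has undefined behaviour under C11:

```c
#include <pthread.h>
#include <stdio.h>

int data = 0;                                   /* plain, non-atomic location */

static void *writer(void *arg) {
    data = 1;                                   /* W^na data 1 */
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, writer, NULL);
    int r = data;                               /* R^na data: not ordered by hb with the write */
    pthread_join(t, NULL);
    printf("r=%d\n", r);                        /* racy, hence undefined behaviour under C11 */
    return 0;
}
```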