Single-Static Assignment Form and PHI

We’ll take a look at the same very simple max function, as in the previous section.

int max(int a, int b) {
  if (a > b) {
    return a;
  } else {
    return b;
  }
}

Translated to LLVM IR:

define i32 @max(i32 %a, i32 %b) {
entry:
  %retval = alloca i32, align 4
  %0 = icmp sgt i32 %a, %b
  br i1 %0, label %btrue, label %bfalse

btrue:                                      ; preds = %2
  store i32 %a, i32* %retval, align 4
  br label %end

bfalse:                                     ; preds = %2
  store i32 %b, i32* %retval, align 4
  br label %end

end:                                     ; preds = %btrue, %bfalse
  %1 = load i32, i32* %retval, align 4
  ret i32 %1
}

We can see that the function allocates space on the stack with alloca [2], where the bigger value is stored. In one branch %a is stored, while in the other branch %b is stored to the stack allocated memory. However, we want to avoid using memory load/store operation and use registers instead, whenever possible. So we would like to write something like this:

define i32 @max(i32 %a, i32 %b) {
entry:
  %0 = icmp sgt i32 %a, %b
  br i1 %0, label %btrue, label %bfalse

btrue:
  %retval = %a
  br label %end

bfalse:
  %retval = %b
  br label %end

end:
  ret i32 %retval
}

This is not valid LLVM IR, because it violates the static single assignment form (SSA, [1]) of the LLVM IR. SSA form requires that every variable is assigned only exactly once. SSA form enables and simplifies a vast number of compiler optimizations, and is the de-facto standard for intermediate representations in compilers of imperative programming languages.

Now how would one implement the above code in proper SSA form LLVM IR? The answer is the magic phi instruction. The phi instruction is named after the φ function used in the theory of SSA. This functions magically chooses the right value, depending on the control flow. In LLVM you have to manually specify the name of the value and the previous basic block.

end:
  %retval = phi i32 [%a, %btrue], [%b, %bfalse]

Here we instruct the phi instruction to choose %a if the previous basic block was %btrue. If the previous basic block was %bfalse, then %b will be used. The value is then assigned to a new variable %retval. Here you can see the full code listing:

define i32 @max(i32 %a, i32 %b) {
entry:
  %0 = icmp sgt i32 %a, %b
  br i1 %0, label %btrue, label %bfalse

btrue:                                      ; preds = %2
  br label %end

bfalse:                                     ; preds = %2
  br label %end

end:                                     ; preds = %btrue, %bfalse
  %retval = phi i32 [%a, %btrue], [%b, %bfalse]
  ret i32 %retval
}

PHI in the Back End

Let’s have a look how the @max function now maps to actual machine code. We’ll have a look what kind of assembly code is generated by the compiler back end. In this case we’ll look at the code generated for x86 64-bit, compiled with different optimization levels. We’ll start with a non-optimizing backend (llc -O0 -filetype=asm). We will get something like this assembly:

max:                                    # @max
# %bb.0:                                # %entry
    cmpl    %esi, %edi                  # %edi = %a, %esi = %b
    jle     .LBB0_2
# %bb.1:                                # %btrue
    movl    %edi, -4(%rsp)              # mov src, dst
    jmp     .LBB0_3
.LBB0_2:                                # %bfalse
    movl    %esi, -4(%rsp)              # mov src, dst
    jmp     .LBB0_3
.LBB0_3:                                # %end
    movl    -4(%rsp), %eax              # return value in eax
    retq

The parameters %a and %b are passed in %edi and %esi respectively. We can see that the compiler back end generated code that uses the stack to store the bigger value. So the code generated by the compiler back end is not what we had in mind, when we wrote the LLVM IR. The reason for this is that the compiler back end needs to implement the phi instruction with real machine instructions. Usually that is done by assigning to one register or storing to one common stack memory location. Usually the compiler back end will use the stack for implementing the phi instruction. However, if we use a little more optimization in the back end (i.e., llc -O1), we can get a more optimized version:

max:                                    # @max
# %bb.0:                                # %entry
    cmpl    %esi, %edi
    jg      .LBB0_2
# %bb.1:                                # %bfalse
    movl    %esi, %edi
.LBB0_2:                                # %end
    movl    %edi, %eax
    retq

Here the phi function is implemented by using the %edi register. In one branch %edi already contains the desired value, so nothing happens. In the other branch %esi is copied to %edi. At the %end basic block, %edi contains the desired value from both branches. This is more like what we had in mind. We can see that optimization is something that needs to be applied through the whole compilation pipeline.

[1]Wikipedia: Static single assignment form