Egor Bogatov — Developer at Microsoft

11 May, 2020

Adding peephole optimization to Clang

“Adding peephole optimization to GCC” article covers it for GCC and I decided to cover Clang. Clang is basically a front-end on top of LLVM. It parses C/C++ files into an AST and converts them to an intermediate representation - LLVM IR.

High-level architecture for C++ AOT scenario.

opt is a console application, it accepts LLVM IR and optimizes it using a loop of analysis and transformation phases (produces optimized LLVM IR as an output). Then llc (a console app as well) emits machine code/assembly. opt is basically a PassManager with some generic (actually, C/C++-friendly) order of optimization passes. Other languages or JIT-compilers should use their own order of passes. Thus, when we introduce a new optimization in LLVM we improve many programming languages at once! E.g. Rust, Swift, C, C++, C# (Mono and Burst) - all of them use LLVM as a primary back-end. I assume you’re familiar with some LLVM IR basics such as SSA-form, and If you’re not I highly recommend watching/reading these:

Let’s optimize something!

Smart LLVM #1: Optimizing range checks

Sometimes I explore LLVM sources and play with godbolt.org in order to find some interesting optimizations (not only the peephole ones) so I think I’ll post some here in my blog from time to time. Also, if an optimization is simple enough I try to implement it in RuyJIT.
And today I am going to share a nice LLVM trick to optimize some common range checks.
So, let’s say we have a function that checks if a char belongs to a list of reserved chars:
(I actually copy-pasted it from CoreFX)

1
2
3
4
5
6
7
8
9
10
11
12
bool IsReservedCharacter(char character) // uint16_t
{
    return character == ';'
        || character == '/'
        || character == ':'
        || character == '@'
        || character == '&'
        || character == '='
        || character == '+'
        || character == '$'
        || character == ',';
}

Now let’s compare outputs for RuyJIT and LLVM:

Peephole optimizations in C++ and C#

“Performance gains due to improvements in compiler optimizations will double
the speed of a program every 18 years” © Proebsting’s Law

When we solve equations, we try to simplify them first, e.g. Y = -(5 - X) can be simplified to just Y = X - 5. In modern compilers it’s called “Peephole Optimizations”. Roughly speaking, compilers search for certain patterns and replace them with corresponding simplified expressions. In this blog post I’ll list some of them which I found in LLVM, GCC and .NET Core (CoreCLR) sources.

Let’s start with simple cases:

1
2
3
4
  X * 1          =>  X
 -X * -Y         =>  X * Y
-(X - Y)         =>  Y - X
  X * Z - Y * Z  =>  Z * (X - Y)

and check the 4th one in C++ and C# compilers:

1
2
3
int Test(int x, int y, int z) {
    return x * z - y * z;       //  =>  z * (x - y)
}

Now let’s take a look at what the compilers output:

1
2
3
4
5
Test(int, int, int):
  mov eax, edi
  sub eax, esi   ; -         
  imul eax, edx  ; *         
  ret

C++ (Clang, GCC, MSVC)

1
2
3
4
5
6
C.Test(Int32, Int32, Int32)  
  mov eax, edx
  imul eax, r9d  ; *
  imul r8d, r9d  ; *
  sub eax, r8d   ; -
  ret

C# (RyuJIT)

Hello world

1
Console.WriteLine("Hey!");