Sometimes I explore LLVM sources and play with godbolt.org in order to find some interesting optimizations (not only the peephole ones) so I think I’ll post some here in my blog from time to time. Also, if an optimization is simple enough I try to implement it in RuyJIT, e.g.:

Today I am going to share a nice LLVM trick to optimize some common range checks.
So, let’s say we have a function that checks if a char belongs to a list of reserved chars:
(I actually copy-pasted it from CoreFX)

Now let’s compare outputs for RuyJIT and LLVM:

As you can see C# generated a pretty simple set of 9 cmp + jumps for each logical OR. LLVM, at the same time, generated something strange with magic numbers and just two branches. Let’s try to convert (disassemble) LLVM’s output to C#:

so insted of 9 cmp we have add, cmp, shr, and Let me explain the magic constants.
First, we need to convert chars to their ASCII numbers:

The biggest is @ (64) and the smallest is \$ (36). So, the range starts from 36 and the length is 64 - 36 = 28. Thus the first if simply ignores all values outside of [36..64] range. Here is how I explained the first two magic numbers. Now it’s 314575237s turn:

Since the range is known and the length is 28 which easily fits into a 32/64bit CPU register we can encode it to a special bit-map (a set of 0 and 1) - a 32/64 bit integer (depending on a platform). Here is how it’s done:

So, for each char we push (shift) 1 to the left according to c - 36 value (as you remember 36 stands for \$ so its index will be zero - on the right)
and our bitmap becomes:

Now when we do 314575237 >> (c - 36) we either get 1 (symbol is one of the reserved) or 0 (doesn’t belong to the set)

Let’s benchmark it! I have a random string here and I need to calculate how many symbols are reserved:

The results are:

The improved version is 43% faster! (Core i7 8700K)

Feature request for RuyJIT dotnet/coreclr#12477

LLVM opt: godbolt.org (convert to switch)
LLVM llc: godbolt.org (DAG*)