# Compile-time factorization

It is well known that C++ templates are Turing-complete and therefore any algorithm can be theoretically implemented to be executed at compile-time. But what about reality? How far the current available compilers can go? In this article I’d like to discover the limits of compilers, but before that I will explain a relatively simple and suitable factorization algorithm.

Yes, it is about decomposition of an integer N into its prime factors at compile-time. A list of prime factors is not available at compile-time and a generation of such a list is rather unnecessary. Let’s go a simpler way starting with a trial division. In worst case, what is needed, is to test all the integers less or equal M, where M^2<=N. Such an algorithm is feasible even at compile-time, but can involve many template instantiations. Reducing the number of instantiations, a bigger N can be factorize. Let’s see how far that can work.

The idea

Evidently, once trying to divide by 2, it is useless to divide by any number containing 2 as a factor. The same is valid for the factor 3 and so on. So, instead of having the list of primes, it better to generate a larger list of numbers using a simple formula that would contain all the primes. The first trial factor is 2. Then the first formula is 2k+1, k = 1,2,…. It returns odd numbers and reduces the amount of all trial integers by 50%. Setting k=1,2 delivers the next trial factors:3 and 5. The next formula eliminating the factor 3 is 2*3*k + r = 6*k + r. The remainder r equals 1 or 5, otherwise the numbers generated by the formula will be divisible by 2 or 3. The second formula contains only 1/3 of all integers. With the third formula, let’s generalize. It is 2*3*5*k + r = 30*k + r, where r is 1 or any number in the range (6,30) not divisible by 2, 3 and 5, i.e. r = 1, 7, 11, 13, 17, 19, 23, 29. So, r is 1 or a prime number smaller than 30, excepting 2,3,5. However, for the next formula 210*k+r, the list of reminders r will also contain a couple of composite numbers divisible by 11, 13 and 17 and totally 48 reminders. The latter formula 210*k+r represents 48/210 (less than 23%) of all integers and contains all the prime factors. Such formulas can be seen as the sieve of Eratosthenes in a functional way – exactly what is needed in template metaprogramming.

Preliminaries – compile-time arrays

My original implementation goes back before C++11 and is based on Loki::Typelist by Andrey Alexandrescu (book Modern C++ design). Those compile-time arrays can be used to save the reminders of the formulas as well as the final factorization. In C++11, the variadic templates becomes handy and can be used as compile-time arrays too. What is need is a holder typelist. Here I follow the notation of Eric Niebler. A natural operation on the arrays is insertion of new entries from both sides as well as concatenation of two arrays

As a holder for a compile-time integer and a pair (factor, power) I define two simple class templates

I intentionally don’t want to misuse std::tuple to emphasize a purely compile-time programming.

Implementation

A formula of the type qk+r_i should start with k=1. The main class template has therefore two tasks:

1) Process all the prime factors less than (q+r). They are given as InitialPrimesList

2) Start a loop over k

The actual loop over k-values and the set of prime reminders r_i is realized in the next class template. N is currently factorized integer. It is reduced, if a factor has been found. RList is a typelist<…> containing the prime reminders r_i.

The template loop below goes first through RList and then increments K, if RList is empty and the exit condition is still not reached.

An exit condition occurs, if the square of a candidate factor becomes greater than the current N. That can happen either with an empty or non-empty RList.

Besides std::conditional from the standard header <type_traits>, there are two class templates still to define:

1) try_factor is a simple loop doing trial division of N by the given factor so many times as N is divisible. It returns power and the full factor candidate^power, even if power=0.

2) select_factors is generally optional. It can be eliminated and Reminders is taken instead of RList. But tests have shown a significant reduction of the compilation time, when for a next k, the list of reminders is sieved before applying test_factor.

Two ‘global’ typelists InitialPrimesList and Reminders are similar. The latter is derived from the first one. For instance, for q=2*3=6, InitialPrimesList should contain all primes less than 6.

Of course, such a small primes list can be generated for a given q at compile-time, but this is a topic of another article. Reminders is obtained from InitialPrimesList by taking out all the prime factors of q (all primes smaller than q) and adding sint<1> in the beginning. This simplified procedure to get Reminders is valid for formulas 6k+r and 30k+r. Starting with 210k+r Reminders will contain composite numbers, e.g. with 11 and 13 as factors. The class template below eliminates all entries from List that are less or equal to Start:

Usage

To try out the implementation, a simple output of a typelist containing either numbers in sint<> holder or factorization as a list of spair<> is needed:

… And finally the program:

Conclusion

Does this implementation work? For which values of N? Try it yourself by copying the code pieces from here in a right order and compile them with a C++11 capable compiler. Don’t forget to set a higher template depth for a large N, include <iostream> and <type_traits> headers as well as reorganize classes in a proper order. Or get the full code including an implementation using Loki::Typelist from github.

It was very surprisingly to me, that the worst case (the largest prime) in the range of unsigned int – 4294967291, could be tried within a few seconds of compilation time using g++ 4.8.3. Even more interesting, that the factorization of larger numbers from the range of unsigned long depends rather on RAM availability than the compilation time. More detailed benchmarks will be given in the next article.

The stuff here can be seen as a metaprogramming fun, but I came to this task during my work on GFFT library. The library performs fast Furier transforms of size N, where N is a compile-time integer constant. I started with the simplest and fastest case N=2^P, but later moved to a general size N, which had to be factorized. Sure, it is worthy to do that at compile-time. And I didn’t know, if a compile-time factorization will work for large numbers. That made this work even more exciting.

Get the source code to the article

References

1. Hans Riesel. Prime Numbers and Computer Methods for Factorization. 2nd Edition, 2011
2. Andrei Alexandrescu. Modern C++ Design, Generic Programming and Design Patterns Applied. 2001