The compile-time integer factorization implementation described earlier is a good benchmark for the compilation performance of C++ template meta-programs. Now, in the middle of the year, I'd like to publish the compilation performance of four compilers in their latest versions and compare it to the initial benchmarks.
All compilers have improved their handling of variadic templates, which is visible in the longer compile times. Insignificant variations in the short compile times may have causes other than meta-programming. Unfortunately, no new cases could be compiled, and the examples that crashed before still crash (marked with a dash in the tables below). The compile time from the initial benchmarks is given in parentheses for quick comparison.
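For readers who have not seen the earlier post, the kind of meta-program being timed can be illustrated roughly as follows. This is only a minimal sketch and not the benchmarked implementation: the names `factors`, `factorize`, `factorize_t` and `next_candidate` are hypothetical, and the candidate stepping (2, 3, then 6k ± 1) is an assumption about what a "6k + r" wheel might look like.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical container: a variadic list of the prime factors found so far.
template <std::uint64_t... Ps>
struct factors {};

// Assumed wheel stepping: 2, 3, then the numbers of the form 6k - 1 and 6k + 1.
constexpr std::uint64_t next_candidate(std::uint64_t d) {
    return d == 2 ? 3 : d == 3 ? 5 : (d % 6 == 5 ? d + 2 : d + 4);
}

// Primary template: the two flags are computed from N and D and select
// one of the partial specializations below.
template <std::uint64_t N, std::uint64_t D, typename List,
          bool Done = (D > N / D), bool Divides = (N % D == 0)>
struct factorize;

// D exceeds sqrt(N): whatever is left of N (if greater than 1) is prime.
template <std::uint64_t N, std::uint64_t D, std::uint64_t... Ps, bool Divides>
struct factorize<N, D, factors<Ps...>, true, Divides> {
    using type = typename std::conditional<(N > 1), factors<Ps..., N>,
                                           factors<Ps...>>::type;
};

// D divides N: record D and retry the same divisor on the quotient.
template <std::uint64_t N, std::uint64_t D, std::uint64_t... Ps>
struct factorize<N, D, factors<Ps...>, false, true>
    : factorize<N / D, D, factors<Ps..., D>> {};

// D does not divide N: move on to the next wheel candidate.
template <std::uint64_t N, std::uint64_t D, typename List>
struct factorize<N, D, List, false, false>
    : factorize<N, next_candidate(D), List> {};

// Convenience alias: start trial division at 2 with an empty factor list.
template <std::uint64_t N>
using factorize_t = typename factorize<N, 2, factors<>>::type;

// The whole computation happens during compilation.
static_assert(std::is_same<factorize_t<2310>, factors<2, 3, 5, 7, 11>>::value,
              "2310 = 2 * 3 * 5 * 7 * 11");
static_assert(std::is_same<factorize_t<65521>, factors<65521>>::value,
              "65521 is prime");
```

In a sketch like this, every candidate divisor costs one template instantiation, and the recursion depth grows with the square root of the number. That is presumably why large inputs such as 4294967291 need a raised template depth limit or cannot be compiled at all on some compilers.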
GCC 5.4
Command line: g++ -O3 -std=c++11 -ftemplate-depth-1000000
Compile-time in [s] | g++ 5.4 | 6k + r | 30k + r | 210k + r | 2310k + r |
---|---|---|---|---|---|
Variadic templates | 65521 | 0.6 (0.4) | 0.6 (0.4) | 0.6 (0.4) | 8.2 (9.1) |
Variadic templates | 4294967291 | 12.5 (10.0) | 4.7 (3.5) | 3.8 (3.2) | 47.6 (80.2) |
Variadic templates | 18446744073709551615 | 11.6 (10.0) | 4.6 (3.4) | 3.8 (3.3) | 47.4 (81.8) |
Loki::Typelist | 65521 | 0.5 (0.3) | 0.5 (0.4) | 0.6 (0.4) | 2.6 (1.9) |
Loki::Typelist | 4294967291 | 8.5 (8.0) | 3.3 (2.4) | 2.8 (1.8) | 20.1 (17.0) |
Loki::Typelist | 18446744073709551615 | 8.5 (8.0) | 3.3 (2.5) | 2.8 (1.8) | 20.0 (17.0) |
Clang 3.8
Compile-time in [s] | Clang 3.8.0 | 6k + r | 30k + r | 210k + r | 2310k + r |
---|---|---|---|---|---|
Variadic templates | 65521 | 0.8 (0.5) | 0.6 (0.4) | 0.6 (0.4) | 8.0 (5.1) |
Variadic templates | 4294967291 | - | - | 2.8 (2.2) | 16.3 (10.4) |
Variadic templates | 18446744073709551615 | - | - | 2.8 (2.2) | 16.3 (10.4) |
Loki::Typelist | 65521 | 0.6 (0.4) | 0.6 (0.4) | 0.6 (0.4) | 1.5 (1.3) |
Loki::Typelist | 4294967291 | - | - | 1.9 (1.7) | 2.8 (2.4) |
Loki::Typelist | 18446744073709551615 | - | - | 2.0 (1.7) | 2.8 (2.4) |
Intel C++ 16.0.1
Intel improved the compilation of very long variadic template parameter lists by a factor of 1.5 to 3.
Compile-time in [s] | icpc 16.0.1 | 6k + r | 30k + r | 210k + r | 2310k + r |
---|---|---|---|---|---|
Variadic templates | 65521 | 0.8 (0.6) | 0.6 (0.5) | 0.7 (0.6) | 35.0 (111.5) |
Variadic templates | 4294967291 | - | - | 4.6 (6.6) | 94.0 (305.2) |
Variadic templates | 18446744073709551615 | - | - | 4.8 (6.5) | 93.6 (306.4) |
Loki::Typelist | 65521 | 0.6 (0.6) | 0.6 (0.6) | 0.8 (0.8) | 309.8 (420.7) |
Loki::Typelist | 4294967291 | - | - | 8.5 (12.5) | 1070.4 (1441.5) |
Loki::Typelist | 18446744073709551615 | - | - | 8.6 (12.5) | 1062.4 (1436.9) |
Intel C++ 17.0.1
There are no significant changes compared to version 16.0.1.
Compile-time in [s] | icpc 17.0.1 | 6k + r | 30k + r | 210k + r | 2310k + r |
---|---|---|---|---|---|
Variadic templates | 65521 | 0.7 (0.6) | 0.7 (0.5) | 0.8 (0.6) | 37.1 (111.5) |
Variadic templates | 4294967291 | - | - | 4.7 (6.6) | 90.5 (305.2) |
Variadic templates | 18446744073709551615 | - | - | 4.7 (6.5) | 88.6 (306.4) |
Loki::Typelist | 65521 | 0.7 (0.6) | 0.7 (0.6) | 0.8 (0.8) | 269.8 (420.7) |
Loki::Typelist | 4294967291 | - | - | 9.4 (12.5) | 925.9 (1441.5) |
Loki::Typelist | 18446744073709551615 | - | - | 9.4 (12.5) | 1129.2 (1436.9) |
MSVC 14.24720 (Visual Studio 2015)
Unfortunately, there is still no setting for the template depth; it is fixed at 2046. However, the tests for the large numbers now compile significantly faster (by a factor of about 10).
Compile-time in [s] | MSVC 14 | 6k + r | 30k + r | 210k + r | 2310k + r |
---|---|---|---|---|---|
Variadic templates | 65521 | 0.8 (0.4) | 0.7 (0.67) | 0.8 (0.8) | - |
Variadic templates | 4294967291 | - | - | 4.5 (43.2) | - |
Variadic templates | 18446744073709551615 | - | - | 4.4 (42.7) | - |
Loki::Typelist | 65521 | 0.7 (0.7) | 0.7 (0.7) | 0.8 (0.8) | - |
Loki::Typelist | 4294967291 | - | - | 2.7 (42.2) | - |
Loki::Typelist | 18446744073709551615 | - | - | 2.7 (41.8) | - |
MSVC 15.26228.4 (Visual Studio 2017)
There are no significant changes compared to the previous table (MSVC 14).
Compile-time in [s] | MSVC 15 | 6k + r | 30k + r | 210k + r | 2310k + r |
---|---|---|---|---|---|
Variadic templates | 65521 | 1 (0.4) | 1 (0.67) | 1.8 (0.8) | - |
Variadic templates | 4294967291 | - | - | 4.9 (43.2) | - |
Variadic templates | 18446744073709551615 | - | - | 5.0 (42.7) | - |
Loki::Typelist | 65521 | 1 (0.7) | 1 (0.7) | 1 (0.8) | - |
Loki::Typelist | 4294967291 | - | - | 3.9 (42.2) | - |
Loki::Typelist | 18446744073709551615 | - | - | 3.9 (41.8) | - |