If a project has no includes, but uses only C++20 modules, will the compiler see every function body?

Take for example this piece of code (from the talk "LLVM Optimization Remarks" - Ofek Shilon - CppCon 2022, at 21:28):

void somefunc(const int&);
int whateva();

void f(int i, int* res)
{
    somefunc(i);
    i++;
    res[0] = whateva();
    i++;
    res[1] = whateva();
    i++;
    res[2] = whateva();
} 

Since the bodies of somefunc and whateva are not available to the compiler in this translation unit (TU from now on), many optimization opportunities can be missed: because i's address escapes into somefunc, the compiler must assume whateva might read or modify i through it, so the increments and reloads of i all have to stay.
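For concreteness, here is one hypothetical pair of bodies, invented purely for illustration; if the compiler could see them, it could prove that somefunc never captures the reference and that whateva has no side effects, so f could collapse to three plain stores:

void somefunc(const int& x) { (void)x; }  // hypothetical: never stores &x
int whateva() { return 42; }              // hypothetical: no side effects

// With those bodies visible, f() could be optimized to roughly:
void f_opt(int, int* res)
{
    res[0] = 42;  // whateva() inlined
    res[1] = 42;  // every i++ is dead, so the increments disappear
    res[2] = 42;
}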

Is this problem "solved" if your project uses only C++20 modules? I understand that, when using modules, the compiler generates a file with metadata about the module, similar to a precompiled header (let's call it a "precompiled module"), and these precompiled modules must be passed (on the command line) to the compilation of any TU that imports them.
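As one concrete example of that workflow, here is a minimal sketch using recent Clang's flags (file and module names invented; GCC instead uses -fmodules-ts and a gcm.cache directory, as in one of the answers below):

// hello.cppm - module interface unit
export module hello;
export int answer() { return 42; }   // defined in the interface itself

// main.cpp
import hello;
int main() { return answer(); }

// Build steps (recent Clang syntax):
//   clang++ -std=c++20 --precompile hello.cppm -o hello.pcm     # the "precompiled module"
//   clang++ -std=c++20 -fmodule-file=hello=hello.pcm -c main.cpp
//   clang++ -std=c++20 -c hello.pcm -o hello.o
//   clang++ main.o hello.o -o app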

Does this mean that the "translation-unit-boundary pessimization" (I made up the name; I don't know if it has one) is definitely gone when using modules, since every module sees the body of every function (thanks to the attached precompiled modules)?

In other words, what is the new "compiler visibility boundary" (ignoring LTO) if your project doesn't use a single #include, except perhaps for dynamically-linked libraries?

There are 2 answers below.

Nicol Bolas (best answer):

Is this problem "solved" if your project uses only C++20 modules?

No.

You can put those function definitions (inlined) in the module interface files. And that will make the function definition potentially available to the compiler when importing such a module. But it doesn't happen just because you built the code as a module. If you put the function in a module implementation unit, then it won't be visible to TUs that import that module.

This is fundamentally no different than putting the definition in a header vs. a source file. The primary difference is that, with modules, the compiler won't have to compile the code of that definition every time the file is imported.
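A minimal sketch of that distinction (module and function names invented):

// m.cppm - module interface unit
export module m;

export int in_interface(int x) { return x + 1; }  // body lands in the CMI, so
                                                  // importers can potentially inline it
export int in_impl(int x);                        // declaration only

// m.cpp - module implementation unit
module m;
int in_impl(int x) { return x * 2; }  // body stays in this TU; importers of m
                                      // never see it, just like a classic .cpp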

Transitive imports (you import module A, which imports module B but doesn't export it) will retain visibility. Even though you can't name B's entities directly, by importing A, everything of B's that is visible to the compiler becomes visible to your TU. So if you call an inline function of A which calls an inline function of B, the inlining can propagate properly.
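A sketch of that propagation (names invented):

// b.cppm
export module B;
export int b_step(int x) { return x * 3; }         // defined in B's interface

// a.cppm
export module A;
import B;                                          // note: not "export import B;"
export int a_step(int x) { return b_step(x) + 1; } // defined in A's interface

// main.cpp
import A;
int main()
{
    // b_step(1);            // error: B's names aren't visible here...
    return a_step(2);        // ...but the compiler has already seen b_step's
                             // body, so it can inline both a_step and b_step
}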

Arsenović Arsen:

No, modules do not replace that boundary. You still need LTO to optimize across TUs.
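For contrast, a minimal sketch of what LTO enables across two ordinary TUs (file names invented; -flto is the relevant GCC/Clang switch):

// a.cpp
int helper(int x) { return x + 1; }   // body exists only in this TU

// main.cpp
int helper(int);
int main() { return helper(41); }     // not inlinable per-TU; with LTO the
                                      // link-time step can inline it:
//   g++ -O2 -flto -c a.cpp main.cpp
//   g++ -O2 -flto a.o main.o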

If you want to verify this yourself, you could look at the contents of CMI (compiled module interface) files. As an example, I've created a module that exports a function that calls puts("foo"), but that string never appears in the resulting module:

~/f$ gcc -c -O3 -std=c++23 -fmodules-ts modtest.cpp
~/f$ grep foo gcm.cache/helloworld.gcm || echo no match
no match
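The answer doesn't show modtest.cpp itself; a plausible reconstruction, consistent with the description above (module helloworld, an exported function calling puts("foo")), would be:

// modtest.cpp - reconstructed for illustration, not the answer's actual file
module;               // global module fragment, so a header can be #included
#include <cstdio>
export module helloworld;

export void greet()
{
    std::puts("foo"); // per the grep above, this literal does not survive
                      // verbatim into gcm.cache/helloworld.gcm
}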

LTO is generally quite usable nowadays (for instance, many GNU/Linux distributions build nearly all their packages with it, as do I), so a new solution that requires changing existing code isn't really necessary; besides, modules set out to solve a different problem: the lack of logical modularity between TUs.