Is there a way to flag a global variable being set in multiple places in C++?


I'm looking for a way to, at compile time, flag/throw an error if a particular variable is set by more than one line in C++.

i.e. something like:

int a; // ~special~

void fn1();
void fn2();

int main(){
    while(1){
        fn1();
        fn2();
        print(a);
    }
}

void fn1(){
    a++;
}

void fn2(){
    print("hi");
}

This should compile fine, BUT changing fn2() to:

void fn2(){
   a = 2;
}

should throw an error, since a is modified by both fn1 and fn2.

Note a must be modified often, at runtime (so making it const, or writing a function that locks it after setting it once at runtime, does NOT work). It just shouldn't be modified by more than a single function.

Note that this MUST be done at compile-time, as ANY chance of a "collision" means a serious bug. It CAN be assumed that no two threads/cores will ever call the same function that sets a variable (i.e. if thread1 calls fn1, it can be assumed thread2 will NEVER call fn1), so it is sufficient to check that a variable is only written to on one line.

This is embedded code on a very slow processor; run-time checks would add too much overhead as well.

Sorry if this is a repeat question, I did search, but everything was for setting a variable only once (i.e. const), not on only one line.


(the rest of this question is just context, skip if you've already got an answer)

A critique of "you shouldn't be doing that!" "global variables bad!" or something isn't helpful - this is a very specific situation (multi-core bare metal real-time embedded software, whose execution path must change completely based on any input from several different cores/threads) - it's kind of gross but something like this IS the cleanest way to do this.

Note that working around this with something like:

void fn1(){
   modifyA(1);
} 

void fn2(){
   modifyA(2);
} 

void modifyA(int in){
   a = in; 
}

does NOT work (though it's fine if the answer to this question WOULDN'T catch this).

However, something like:

#define setA(in) *some preprocessor magic*(a=in)*more magic*

(so maybe something like #ifdef A_IS_SET #error "a set twice" #else #define A_IS_SET ; a=*whatever* #endif? But that doesn't quite work)

where having setA() twice in the program would cause an error.

The reason for this is there's a huge global state vector (~100 variables), each of which should only ever be SET by one single core/thread, but needs to be read by all.

Each core/thread has a local "copy" of the global state, and there's a single mutex for the global state. Each core will grab the mutex, modify some variables in the global state, copy everything else in it to its local copy (so it's synchronized with the global state), then release the mutex. This may happen at any time, in any order (so core 0 might do it five times in a row, then core 2, then 0 again, then 1, or whatever).

Simplified example:

int a;
int b;
int c;
float d;
bool e;
mutex_t globalMutex;

int core0_a; //"coren_x" is local copy of global var x, for core/thread n (yes this part is handled better in the real thing, simplified here)
...
bool core0_e;
int core1_a;
... (and so on for all other cores/vars)

void fnRunByCore0(){
    getMutex(globalMutex);
    a = core0_a;                     //so a is now "owned" by core0
    c = core0_c;                     //so c is now "owned" by core0 too

    core0_b = b;
    core0_d = d;
    core0_e = e;
    releaseMutex(globalMutex);
}

void fnRunByCore1(){
    getMutex(globalMutex);
    b = core1_b;                     //so b is now "owned" by core1
    d = core1_d;                     //so d is now "owned" by core1

    core1_a = a;
    core1_c = c;
    core1_e = e;
    releaseMutex(globalMutex);  
}

void fnRunByCore2()
... (same deal, but sets only e)

The reason for this question is to make sure that I'm keeping track of which global vars are written by each core/thread/function. In theory, I should just need to read each of those functions and make sure that they're each only set once. In practice, that's a LOT of checking by eye, this is just to double-check.

The hardware doesn't have the performance overhead available to keep something like an index that's checked at runtime for what core owns each variable, or individual mutexes, or anything like that - this is run ~100 times per loop, and any runtime code would make performance unacceptable. Besides, this indicates a bug in the program, so it should be caught at compile time anyway.


There are 5 answers below.

Answer by Weijun Zhou (accepted):

It is possible if you don't mind adding a small preamble (one macro invocation) at the top of each function you want to check, and the functions are all in the same TU.

First, we define a macro to obtain a unique type from each function.

// Requires C++20 (lambdas are allowed in unevaluated contexts since C++20).
#define DECLARE_ONE_PLACE_CHECK using _globalOnceCheck = decltype([](){})

Then, we abuse the rule that a single global variable cannot be declared twice with different types.

#define SET_ONE_PLACE(var, val) do{extern _globalOnceCheck checkUseOnce##var; var = val;}while(0)

Now, if you do,

void f1(){
    DECLARE_ONE_PLACE_CHECK;
    SET_ONE_PLACE(a, 42);
    SET_ONE_PLACE(a, 4242);
    if(true){
        SET_ONE_PLACE(a, 424242);
    }
}

The compiler will happily accept it, meaning you can call it multiple times in the same function, regardless of the scope.

However, if instead you have

void f1(){
    DECLARE_ONE_PLACE_CHECK;
    SET_ONE_PLACE(a, 42);
}

void f2(){
    DECLARE_ONE_PLACE_CHECK;
    SET_ONE_PLACE(a, 42);
}

The compiler will complain about a redeclaration of checkUseOncea with a different type, because each lambda expression has a unique closure type.

Despite the single-TU restriction (which I admit is a rather strong limitation, but it has been confirmed to work in the OP's case), this solution is truly zero-cost at run time. The decltype and extern declarations are purely compile-time, and the do { ... } while(0) wrapper is optimized away even by the dumbest compiler, so it is really the go-to solution for real-time systems unless a more general and expressive one exists.

If the check really needs to span multiple TUs, I think we can make a file that combines all the potentially problematic source files into a single one using #include, and then compile the resulting file just for the sake of checking. That file should of course be excluded from the actual build. I am aware that #includeing multiple source files is not always the same as compiling them as separate TUs, especially if static is involved. However, I believe it's worth trying and should work in many cases.
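A sketch of such a compile-only check file (the filenames are hypothetical, and this file is never linked into the real build):

```cpp
// check_single_writer.cpp -- compile-only target, excluded from the real build.
// Combining the sources into one TU lets the extern-redeclaration check
// span files. Check with e.g.:
//   g++ -std=c++20 -fsyntax-only check_single_writer.cpp
#include "core0.cpp"   // hypothetical source files containing the writers
#include "core1.cpp"
#include "core2.cpp"
```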

Answer by Jan Schultke:

You're right that your original modifyA does nothing for you, but you can write one which detects concurrent modifications.

From what I can tell, the goal is that only one thread can ever write to the variable, which can be accomplished using atomics.

// initially, a is unclaimed
std::atomic<std::thread::id> a_owner = {};
int a = 0;

void modifyA(int new_a) {
    auto me = std::this_thread::get_id();
    if (a_owner.load(std::memory_order::relaxed) == me) {
        a = new_a; // we own a, we can modify it, everything OK
        return;
    }

    // atomically do the following:
    // if (a_owner == std::thread::id{}) a_owner = me;
    std::thread::id expected = {};
    if (a_owner.compare_exchange_strong(expected, me)) {
        a = new_a; // we have become the new owner
    }
    else {
        throw ":("; // someone else already owns a, or we have failed to claim ownership
         // note: the id of the thread who stole a from us is now stored in expected
    }
}

You can also wrap this up in a nicer interface, like

struct UniquelyModified {
private:
    std::atomic<std::thread::id> ownership = {};
    int value = {};
    
public:
    UniquelyModified() = default;

    UniquelyModified& operator=(int x) {
        // TODO: what modifyA() does in the example above
        return *this;
    }

    // TODO: delete other assignment operators, add a getter, etc.
};

Alternative approach using fetch_add

It might be simpler and more efficient to rely on std::atomic::fetch_add. You can achieve consensus about who was first simply by incrementing a shared ticket counter, although this doesn't tell you who owns it.

std::atomic<int> a_ticket = {};
int a = 0;
thread_local bool own_a = false;

void modifyA(int new_a) {
    if (own_a) {
        a = new_a;
        return;
    }

    own_a = a_ticket++ == 0; // we were the first to claim a ticket
    if (own_a) {
        a = new_a;
    }
    else {
        throw ":("; // who knows who owns a, but it isn't us!
    }
}

Note on thread sanitizers

Modern compilers ship with thread sanitizers. You can enable one for Clang or GCC with -fsanitize=thread.

Thread sanitizers can automatically spot concurrent modifications for you, without any changes to your code. This obviously comes with some run-time cost though.

Answer by KamilCuk:

Looking for a way to, at compile time, flag/throw an error if a particular variable is set by more than one line in C++

Write your own small script that checks that. Add that script to tests in your chosen build system. Such scripts are typically called linters.

At its simplest, make the variable static, create a function void setA(int); to set it, and add a check like test "$(grep -R --include='*.cpp' -o setA dir_with_your_project | wc -l)" -eq 2 to your build system tests. 2 because there is the definition plus one usage, so I think that's 2.

Answer by JaMiT:

One reason your situation is difficult is that you are going against the grain of the language. You put a bunch of data out there, basically said "honor system: each variable can be claimed by only one function", and now you are trying to enforce the honor system. C++ access control is about giving access, not claiming it. Each variable should declare who can modify it.

One way to give access is by making write-access private and declaring a friend. This is not the best use of friend, but possibly not worse than using global variables. (A better approach would be to make the "friend" the owner. That is, instead of a global int a with friend class foo attached somehow, you would have class foo { static int a; } and some sort of read-only access for outside the class. However, this potentially loses data locality of the globals, which could be significant in a performance-critical portion of code.)

So how to make write access private? You could wrap each variable in a class, which would be defined via a template. This data itself would be private, but with a public getter function (which any decent compiler will optimize away, so no runtime overhead) and a friend who can change the data.

template <typename Type, class Setter>
class Global {
    friend Setter;
    Type value{};
  public:
    Type get() const { return value; }
};

With this template, you would provide the type of the variable and a class whose member functions can change the value. (The syntax makes it more convenient to declare a class friend than a function friend.) If you forward declare the friends, you can define your globals.

struct RunByCore0;
struct RunByCore1;
struct RunByCore2;

Global<int, RunByCore0> a;
Global<int, RunByCore1> b;
Global<int, RunByCore0> c;
Global<float, RunByCore1> d;
Global<bool, RunByCore2> e;

Now your "run by core" functions can be implemented as members of the forward declared classes. The main changes are using .value when setting a global value and using .get() when reading the value. For example:

struct RunByCore0 {
    static void fn(){
        getMutex(globalMutex);
        a.value = core0_a;         //so a is now "owned" by core0
        c.value = core0_c;         //so c is now "owned" by core0

        core0_b = b.get();
        core0_d = d.get();
        core0_e = e.get();
        releaseMutex(globalMutex);
    }
};

Trying to set a variable without access gives a compiler error as desired:

error: 'int Global<int, RunByCore0>::value' is private within this context

Answer by JaMiT:

In practice, that's a LOT of checking by eye,

This struck me as something that could be ameliorated without looking for the desired compiler error. The "LOT" comes from having "~100 variables". What if there were fewer variables?

You could group your variables based upon which core is allowed to modify them. This would reduce the number of variables and make it easier to spot if a core is setting a variable it should not.

struct OwnedByCore0 {
    int a;
    int c;
};
// And similarly for the other cores

These structs could be used for your global and core-local caches, and you could define access functions so that most code does not have to be concerned about where a given setting lives. The access functions should be optimized away (inlined) by a compiler that does any optimizations, so this is not runtime overhead.

// Global copy; protected by mutex
OwnedByCore0 globals0;
OwnedByCore1 globals1;
OwnedByCore2 globals2;

// Core-local copy
OwnedByCore0 core0_globals0;
OwnedByCore1 core0_globals1;
OwnedByCore2 core0_globals2;
int getCore0A() {
    return core0_globals0.a;
}

This is a bunch of compile-time infrastructure that should be mostly invisible at runtime. The one exception I can think of is that the compiler might add padding to these structures, and a compile flag might alleviate that.


Advanced version

You can get some pretty slick infrastructure going with a few more templates. The idea is to make it easy to do the right thing; harder to do the wrong. Code using the wrong approach should become obviously more convoluted than necessary. It would take some time to set up, but compare that to the time saved maintaining the code.

First, some technical mess: A utility template that converts a class template into a tuple of instantiations, one per core. That is, given a template Foo, this template becomes an alias for std::tuple<Foo<0>, ..., Foo<N-1>> where N is the number of cores. There might very well be a cleaner way to define this.

#include <mutex>
#include <utility>

constexpr std::size_t NUM_CORES = 3;

// Utility template to tie together per-core pieces.
// CoreTuple<P> is std::tuple<P<0>, ..., P<NUM_CORES-1>>.
template<template<std::size_t> class Pattern>
using CoreTuple = decltype([]<std::size_t... I>
    (std::index_sequence<I...>) -> std::tuple<Pattern<I>...> {
        return {};
    }(std::make_index_sequence<NUM_CORES>{}));

Next, define the actual data. As above, the data is grouped by which core can modify it. A difference here is that each grouping is an instantiation of a common template. This allows some generic programming (like the above template).

// Data owned by each core
template <std::size_t Core>
struct OwnedByCore; // Base template declared, not defined

template<>
struct OwnedByCore<0> {
    int a;
    int c;
};
template<>
struct OwnedByCore<1> {
    int b;
    float d;
};
template<>
struct OwnedByCore<2> {
    bool e;
};

At this point, we can define the core-local copies of the data. These are implemented as variable templates and access functions. If a variable ever needs to change owners, its definition would need to be moved from one of the above struct definitions to another, and the associated access function would need to change the number specified for get<>. Pretty simple infrastructure changes.

// Local copies of the global data (variable template)
template <std::size_t Core> CoreTuple<OwnedByCore> core_data;

// Define some convenience wrappers for reading the core data.
// The template argument to `get<>` is the core that updates the data.
template <std::size_t Core>
int getA() { return std::get<0>(core_data<Core>).a; }
template <std::size_t Core>
int getB() { return std::get<1>(core_data<Core>).b; }
template <std::size_t Core>
int getC() { return std::get<0>(core_data<Core>).c; }
template <std::size_t Core>
float getD() { return std::get<1>(core_data<Core>).d; }
template <std::size_t Core>
bool getE() { return std::get<2>(core_data<Core>).e; }

// When updating data, use this:
template <std::size_t Core>
OwnedByCore<Core>& myCoreData() { return std::get<Core>(core_data<Core>); }

Now we get to the global data. Modification access will be limited to a sync member function. This function performs the updates of the global data. This is where you make it hard to do the wrong thing, as when sync updates the global data owned by core 0, it updates from the core-specific data of core 0. You cannot go wrong unless code goes out of its way to access the wrong core's data. While not impossible, such an endeavor should stand out like a sore thumb.

Caveat: if you dig into what sync does, you can find a bit of extra work because of (hidden) assignment pairs like x = y; followed by y = x;. I'm relying on the compiler to find this also and optimize away the second assignment.

template <std::size_t Core>
class SyncedData {
  public:
    const OwnedByCore<Core>&  get() const { return value; }
    // Conversion to support assigning tuples:
    operator const OwnedByCore<Core>&() const { return get(); }

    // This hardcodes which core's data is looked at, so it is
    // pointless to call it on the wrong core.
    static void sync();

  private:
    OwnedByCore<Core> value{};
};
// The global data.
inline CoreTuple<SyncedData> global_data;
inline std::mutex globalMutex;

// Define the sync function (after `global_data` is declared).
template <std::size_t Core>
void SyncedData<Core>::sync() {
    std::lock_guard lock(globalMutex);

    // The following line should be the only place where global data is updated.
    std::get<Core>(global_data).value = std::get<Core>(core_data<Core>);
    core_data<Core> = global_data;
}

The remaining piece is to define functions that will call sync. This goes outside the question's code, but presumably they would update variables, then call SyncedData<0>::sync() where 0 is the core number. There could still be some eyeballing here. If a function calls myCoreData more than once or gives it the wrong core number, something is wrong. (The template arguments are the core numbers.)

void fnRunByCore0() {
    // Update values
    auto& data = myCoreData<0>();
    data.a = 1;
    data.c = getE<0>() ? 2 : 3; // To demonstrate reading e
    
    SyncedData<0>::sync();
}

Multi-file version

One weakness so far is that this approach relies on supplying the correct core number to certain templates. This can be made better if the code is split into multiple files, where each source file is either core-agnostic or core-specific. With this setup, the "advanced version" code (minus fnRunByCore0) goes into a header file with a few changes.

For the access functions, replace template <std::size_t Core> with static and Core with MY_CORE.

static int getA() { return std::get<0>(core_data<MY_CORE>).a; }
static int getB() { return std::get<1>(core_data<MY_CORE>).b; }
static int getC() { return std::get<0>(core_data<MY_CORE>).c; }
static float getD() { return std::get<1>(core_data<MY_CORE>).d; }
static bool getE() { return std::get<2>(core_data<MY_CORE>).e; }

static OwnedByCore<MY_CORE>& myCoreData() {
    return std::get<MY_CORE>(core_data<MY_CORE>);
}

Also add a wrapper for the correct sync function.

static void syncData() {
    SyncedData<MY_CORE>::sync();
}

These are declared static so that each translation unit can have a different definition. (This should not be a bloat issue since the compiler should be inlining all of these functions.)

What is MY_CORE? Glad you asked. In the core-specific source files, start by defining MY_CORE before including the header.

#include <cstddef>
constexpr std::size_t MY_CORE = 1;

#include "OwnedByCore.hpp" // Or whatever name you gave it

Now the "run by core" functions should be easy to check by eye. If there are any template parameters, it's wrong. They should look like the following.

void fnRunByCore0() {
    // Update values
    auto& data = myCoreData();
    data.a = 1;
    data.c = getE() ? 2 : 3; // To demonstrate reading e
    
    syncData();
}

Note: I prefer this version, but a comment on another answer added the information that the OP has all the functions in a single file, so I started with the single-file version.