Difference in Copy Elision for Trivial vs. Non-trivial Types


I'm inspecting copy elision for trivially and non-trivially copyable types when one function's by-value return is passed directly, by value, into another function. In the non-trivial case, the object appears to be handed over directly, as expected. In the trivial case, the returned object appears to be copied on the stack to create the argument object for the second function. My question is: why?

If this is expected, it is surprising, because it means the non-trivially copyable type is passed between these functions more efficiently than the trivial one.

Source:

struct Trivial_Struct
{
    unsigned char bytes[ 4 * sizeof( void* ) ];
};

struct Nontrivial_Struct
{
    unsigned char bytes[ 4 * sizeof( void* ) ];
    Nontrivial_Struct( Nontrivial_Struct const& );
};

Trivial_Struct trivial_struct_source();
Nontrivial_Struct nontrivial_struct_source();
void trivial_struct_sink( Trivial_Struct );
void nontrivial_struct_sink( Nontrivial_Struct );

void test_trivial_struct()
{
    trivial_struct_sink( trivial_struct_source() );
}

void test_nontrivial_struct()
{
    nontrivial_struct_sink( nontrivial_struct_source() );
}

GCC Output Assembly:

test_trivial_struct():
    sub     rsp, 40
    mov     rdi, rsp
    call    trivial_struct_source()
    push    QWORD PTR [rsp+24]
    push    QWORD PTR [rsp+24]
    push    QWORD PTR [rsp+24]
    push    QWORD PTR [rsp+24]
    call    trivial_struct_sink(Trivial_Struct)
    add     rsp, 72
    ret
test_nontrivial_struct():
    sub     rsp, 40
    mov     rdi, rsp
    call    nontrivial_struct_source()
    mov     rdi, rsp
    call    nontrivial_struct_sink(Nontrivial_Struct)
    add     rsp, 40
    ret

Tested on godbolt.org with GCC, Clang, and MSVC. GCC's assembly is the easiest for me to read, but all three compilers seem to generate similar code for the trivially copyable case.

Misc:

  • Apparently, I can accidentally make 'Nontrivial_Struct' trivial by defaulting the copy constructor inside the class definition: Nontrivial_Struct( Nontrivial_Struct const& ) = default; If I instead declare it in the class and add Nontrivial_Struct::Nontrivial_Struct( Nontrivial_Struct const& ) = default; after the class definition, the type remains non-trivial.
  • I can change the '4' to a larger value, such as '64', and the copy still occurs.
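The first point above can be checked at compile time. This is a minimal sketch; the two struct names are hypothetical and only mirror 'Nontrivial_Struct' from the question. A copy constructor defaulted on its first declaration is not user-provided and stays trivial, while one defaulted after the class definition counts as user-provided.

```cpp
#include <type_traits>

// Hypothetical type: copy constructor defaulted on its first declaration.
struct Defaulted_In_Class
{
    unsigned char bytes[ 4 * sizeof( void* ) ];
    Defaulted_In_Class( Defaulted_In_Class const& ) = default;
};

// Hypothetical type: copy constructor declared in the class, defaulted later.
struct Defaulted_Out_Of_Class
{
    unsigned char bytes[ 4 * sizeof( void* ) ];
    Defaulted_Out_Of_Class( Defaulted_Out_Of_Class const& );
};

// Defaulting here makes the constructor user-provided, hence non-trivial.
Defaulted_Out_Of_Class::Defaulted_Out_Of_Class( Defaulted_Out_Of_Class const& ) = default;

static_assert( std::is_trivially_copy_constructible<Defaulted_In_Class>::value,
               "defaulted in class: still trivial" );
static_assert( !std::is_trivially_copy_constructible<Defaulted_Out_Of_Class>::value,
               "defaulted out of class: user-provided, not trivial" );
```

If both assertions compile, the in-class version would be expected to follow the same passing convention as 'Trivial_Struct'.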

Speculation:

Answer by Jeff Garrett

The calling convention is mandated by the ABI. For both source functions, the ABI specifies that the caller allocates storage for the return value and passes a hidden pointer to it. For the sink functions, the ABI specifies that the trivial struct is passed by value on the stack, while the nontrivial one is passed by a hidden pointer to an object the caller has allocated. Reference: x86-64 and C++ ABIs.

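[class.temporary]/3 gives implementations latitude to create temporaries for arguments and return values when the class type's copy/move constructors and destructor are trivial, which makes the observed behavior conforming. The standard permits the extra copy; it does not mandate it.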
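That latitude can be observed from inside the program. The sketch below uses hypothetical probe types (not from the question) that remember the address at which they were constructed. For the non-trivial type, C++17 requires the prvalue to initialize the parameter directly, so the addresses match. For the trivial type the result is implementation-dependent, so it is only printed.

```cpp
#include <cassert>
#include <cstdio>
#include <type_traits>

// Hypothetical probe: trivially copyable, so a bitwise copy keeps the old address.
struct Probe_Trivial
{
    void const* born_at;
    unsigned char bytes[ 4 * sizeof( void* ) ];
    explicit Probe_Trivial( int ) : born_at( this ), bytes{} {}
};
static_assert( std::is_trivially_copyable<Probe_Trivial>::value, "probe must stay trivial" );

// Hypothetical probe: the user-provided copy constructor makes it non-trivial.
// It propagates the original address so that a copy would be detectable.
struct Probe_Nontrivial
{
    void const* born_at;
    unsigned char bytes[ 4 * sizeof( void* ) ];
    explicit Probe_Nontrivial( int ) : born_at( this ), bytes{} {}
    Probe_Nontrivial( Probe_Nontrivial const& other ) : born_at( other.born_at )
    {
        for ( unsigned i = 0; i < sizeof bytes; ++i ) bytes[ i ] = other.bytes[ i ];
    }
};

Probe_Trivial make_trivial() { return Probe_Trivial( 0 ); }
Probe_Nontrivial make_nontrivial() { return Probe_Nontrivial( 0 ); }

// True when the parameter object is the very object the constructor built.
bool arrived_in_place( Probe_Trivial p ) { return p.born_at == &p; }
bool arrived_in_place( Probe_Nontrivial p ) { return p.born_at == &p; }

void report()
{
    // Required by C++17 prvalue semantics for the non-trivial type.
    assert( arrived_in_place( make_nontrivial() ) );
    // Implementation-dependent for the trivial type; may print 0 or 1.
    std::printf( "trivial arrived in place: %d\n", arrived_in_place( make_trivial() ) );
}
```

Calling report() from main would show whether a given compiler and optimization level uses the permitted temporary for the trivial type. Inlining may change the printed value.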

The trivial struct is a return value that is initialized on the stack and must then be passed on the stack; both requirements come from the ABI. One might ask why the compiler copies the struct from its first stack location to a second one. That copy is indeed unnecessary: the compiler could have the source function write its result directly into the argument slot. The compiler could do better. Here's the GCC bug.