Generally, the default constructor should be the fastest way of making an empty container. That's why I was surprised to see that it's worse than initializing to an empty string literal:
#include <string>

std::string make_default() {
    return {};
}

std::string make_empty() {
    return "";
}
With clang 16 and libc++, this compiles to:
make_default():
    mov rax, rdi
    xorps xmm0, xmm0
    movups xmmword ptr [rdi], xmm0
    mov qword ptr [rdi + 16], 0
    ret
make_empty():
    mov rax, rdi
    mov word ptr [rdi], 0
    ret
See live example at Compiler Explorer.
Notice how returning {} zeroes 24 bytes in total, while returning "" zeroes only 2 bytes. Why is return ""; so much better?
This is an intentional decision in libc++'s implementation of std::string.

First of all, std::string has a so-called Small String Optimization (SSO), which means that very short (or empty) strings are stored directly inside the container rather than in dynamically allocated memory. That's why we don't see any allocations in either case.

In libc++, the "short representation" of a std::string consists of a flag marking the string as short and the length of the string, packed together into a single byte; padding to align the string data (none for basic_string<char>); and the string data itself, including the null terminator.

For an empty string, we only need to store two bytes of information: one zero byte holding the "short" flag and the length, and one byte for the null terminator.

The constructor accepting a const char* will only write these two bytes, the bare minimum. The default constructor "unnecessarily" zeroes all 24 bytes that the std::string contains. This may be better overall though, because it makes it possible for the compiler to emit a call to std::memset, or to use other SIMD-parallel ways of zeroing, when arrays of strings are initialized in bulk.

For a full explanation, see below.
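One way to observe SSO directly is to check whether a string's character buffer lies inside the std::string object itself. The following is a minimal sketch, not part of libc++; stores_inline and check_long_string_is_not_inline are hypothetical helper names introduced for illustration, and the pointer-to-integer comparison assumes a flat address space as on x86_64.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Returns true if s.data() points into the bytes of the std::string object
// itself, i.e. the characters live inline rather than in a heap allocation.
// (Hypothetical helper; assumes a flat address space.)
bool stores_inline(const std::string& s) {
    const auto object_begin = reinterpret_cast<std::uintptr_t>(&s);
    const auto object_end = object_begin + sizeof(std::string);
    const auto data = reinterpret_cast<std::uintptr_t>(s.data());
    return data >= object_begin && data < object_end;
}

// A 1000-character string cannot fit inside the object itself,
// so its buffer has to live somewhere else.
void check_long_string_is_not_inline() {
    const std::string big(1000, 'x');
    assert(!stores_inline(big));
}
```

On an implementation with SSO, such as libc++, stores_inline should also report true for both std::string{} and std::string(""), which matches the absence of allocations in the assembly above.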

Initializing to "" / Calling string(const char*)

To understand what happens, let's look at the libc++ source code for std::basic_string. The constructor taking a const char* ends up calling __init(__s, 0), where 0 is the length of the string, obtained from std::char_traits<char>::length.

Within __init, __set_short_size will end up writing only a single byte, because the short representation of a string packs __is_long_ and __size_ into the same byte. The null terminator is then written to the first byte of __data_.

After compiler optimizations, zeroing __is_long_, __size_, and one byte of __data_ compiles to the single two-byte store seen in make_empty(): mov word ptr [rdi], 0.

Initializing to {} / Calling string()

The default constructor is more wasteful by comparison.
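To make the contrast between the two constructors concrete, here is a toy model of a 24-byte short representation. It is a sketch under stated assumptions, not libc++'s actual code: ToyShortRep, init_minimal, init_zero_all, and is_empty_short are hypothetical names, and the layout assumes that the compiler packs the two bit-fields into one byte, as clang and GCC do on x86_64.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a short string representation (NOT libc++'s type).
struct ToyShortRep {
    unsigned char is_long : 1;  // 0 means "this is a short string"
    unsigned char size : 7;     // length, excluding the null terminator
    char data[23];              // characters, including the null terminator
};
static_assert(sizeof(ToyShortRep) == 24, "toy model assumes 24 bytes in total");

// Models string(""): write only the flag/size byte and the null terminator.
void init_minimal(ToyShortRep& rep) {
    rep.is_long = 0;
    rep.size = 0;
    rep.data[0] = '\0';
}

// Models string(): zero the entire representation.
void init_zero_all(ToyShortRep& rep) {
    std::memset(&rep, 0, sizeof rep);
}

// Both initializers leave the object in the same observable "empty" state.
bool is_empty_short(const ToyShortRep& rep) {
    return rep.is_long == 0 && rep.size == 0 && rep.data[0] == '\0';
}

// Starts from garbage bytes to show that two written bytes are enough.
void check_toy_model() {
    ToyShortRep a, b;
    std::memset(&a, 0xFF, sizeof a);
    std::memset(&b, 0xFF, sizeof b);
    init_minimal(a);
    init_zero_all(b);
    assert(is_empty_short(a) && is_empty_short(b));
    assert(a.data[1] != '\0');  // the minimal version left the rest untouched
}
```

The point of the model is that an empty short string is fully described by its first two bytes, so everything init_zero_all writes beyond them is extra work for a single string.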
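The default constructor ends up calling __default_init(), which assigns a value-initialized __rep() to the string's representation.

Value-initialization of a __rep() results in 24 zero bytes, because it zero-initializes the whole representation, which spans all 24 bytes of the std::string. These are the 16-byte and 8-byte stores seen in make_default().

Conclusion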
If you want to value-initialize everywhere for the sake of consistency, don't let this keep you from it. Zeroing out a few bytes unnecessarily isn't a big performance problem you need to worry about.
In fact, it is helpful when initializing large quantities of strings, because std::memset may be used, or some other SIMD way of zeroing out memory.
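The bulk case mentioned above can be sketched as follows. This is only an illustration: make_empty_strings, all_empty, and check_bulk_initialization are hypothetical names, and whether the compiler really turns the initialization into a single memset or a run of SIMD stores depends on the compiler, the optimization level, and the standard library version.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Value-initializes N strings at once. With an all-zero empty representation,
// an optimizer is free to zero the whole array in bulk.
template <std::size_t N>
std::array<std::string, N> make_empty_strings() {
    return {};  // every element is default-constructed
}

// Returns true if every string in the array is empty.
template <std::size_t N>
bool all_empty(const std::array<std::string, N>& strings) {
    return std::all_of(strings.begin(), strings.end(),
                       [](const std::string& s) { return s.empty(); });
}

void check_bulk_initialization() {
    const auto strings = make_empty_strings<64>();
    assert(all_empty(strings));
}
```

Compiling make_empty_strings<64> with optimizations in Compiler Explorer is a quick way to see which form of zeroing a given toolchain picks.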