Upon decompiling various programs (which I do not have the source for), I have found some interesting sequences of code. A program has a C string (str) defined in the DATA section. In some function in the TEXT section, part of that string is set by moving a hexadecimal number to a position in the string (simplified Intel assembly: MOV str,0x006f6c6c6568). Here is a snippet in C:
#include <stdio.h>
static char str[16];
int main(void)
{
    *(long *)str = 0x006f6c6c6568;
    printf("%s\n", str);
    return 0;
}
I am running macOS on a little-endian machine, so the bytes of 0x006f6c6c6568 spell hello in memory. The program compiles with no errors or warnings, and when run, prints out hello as expected. I calculated 0x006f6c6c6568 by hand, but I was wondering if C could do it for me. Something like this is what I mean:
#include <stdio.h>
static char str[16];
int main(void)
{
    // *(long *)str = 0x006f6c6c6568;
    *(str+0) = "hello";
    printf("%s\n", str);
    return 0;
}
Here I would not want "hello" treated as a string literal; instead it might be treated like this for little-endian:
*(long *)str = (long)(((long)'h') |
                      ((long)'e' << 8) |
                      ((long)'l' << 16) |
                      ((long)'l' << 24) |
                      ((long)'o' << 32) |
                      ((long)0 << 40));
Or, if compiled for a big-endian target, this:
*(long *)str = (long)(((long)0 << 16) |
                      ((long)'o' << 24) |
                      ((long)'l' << 32) |
                      ((long)'l' << 40) |
                      ((long)'e' << 48) |
                      ((long)'h' << 56));
Thoughts?
TL;DR: you want strncpy into a uint64_t. This answer is long in an attempt to explain the concepts and how to think about memory from C vs. asm perspectives, and whole integers vs. individual chars / bytes. (i.e. if it's obvious that strlen/memcpy or strncpy would do what you want, just skip to the code.)

If you want to copy exactly 8 bytes of string data into an integer, use memcpy. The object-representation of the integer will then be those string bytes.

Strings always have the first char at the lowest address, i.e. a sequence of char elements, so endianness isn't a factor because there's no addressing within a char. Unlike integers, where it's endian-dependent which end is the least-significant byte.

Storing this integer into memory will have the same byte order as the original string, just like if you'd done
memcpy to a char tmp[8] array instead of a uint64_t tmp. (C itself doesn't have any notion of memory vs. register; every object has an address except when optimization via the as-if rule allows, but assigning to some array elements can get a real compiler to use store instructions instead of just putting the constant in a register. So you could then look at those bytes with a debugger and see they were in the right order. Or pass a pointer to fwrite or puts or whatever.)

memcpy avoids possible undefined behaviour from alignment and strict-aliasing violations from *(uint64_t*)str = val;. i.e. memcpy(str, &val, sizeof(val)) is a safe way to express an unaligned strict-aliasing safe 8-byte load or store in C, like you could do easily with mov in x86-64 asm.

(GNU C also lets you typedef uint64_t aliasing_u64 __attribute__((aligned(1), may_alias)); - you can point that at anything and read/write through it safely, just like with an 8-byte memcpy.)

char* and unsigned char* can alias any other type in ISO C, so it's safe to use memcpy and even strncpy to write the object-representation of other types, especially ones that have a guaranteed format / layout like uint64_t (fixed width, no padding, if it exists at all).

If you want shorter strings to zero-pad out to the full size of an integer, use
strncpy. On little-endian machines it's like an integer of width CHAR_BIT * strlen() being zero-extended to 64-bit, since the extra zero bytes after the string go into the bytes that represent the most-significant bits of the integer.

On a big-endian machine, the low bits of the value will be zeros, as if you left-shifted that "narrow integer" to the top of the wider integer. (And the non-zero bytes are in a different order wrt. each other.)

On a mixed-endian machine (e.g. PDP-11), it's less simple to describe.
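The zero-padding behaviour described above can be sketched as follows. The helper names pack_str8 and is_little_endian are hypothetical, chosen only for illustration; this is a minimal sketch, not a definitive implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: pack up to 8 leading bytes of s into a uint64_t.
   Short strings are zero-padded; long strings are truncated to 8 bytes. */
uint64_t pack_str8(const char *s)
{
    assert(s != NULL);
    uint64_t value;
    /* strncpy writes exactly sizeof(value) bytes: the string's bytes,
       then zeros if the string is shorter than the destination. */
    strncpy((char *)&value, s, sizeof(value));
    return value;
}

/* Hypothetical helper: returns 1 if the lowest-addressed byte of an
   integer is its least-significant byte, 0 otherwise. */
int is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a little-endian target, pack_str8("hello") yields 0x6f6c6c6568, the constant from the question; on a big-endian target the same bytes in memory read back as 0x68656c6c6f000000.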
strncpy is bad for actual strings but exactly what we want here. It's inefficient for normal string-copying because it always writes out to the specified length (wasting time and touching otherwise unused parts of a long buffer for short copies). And it's not very useful for safety with strings because it doesn't leave room for a terminating zero with large source strings.

But both of those things are exactly what we want/need here: it behaves like memcpy(&val, str, 8) for strings of length 8 or higher, but for shorter strings it doesn't leave garbage in the upper bytes of the integer.

Example: first 8 bytes of a string
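A minimal sketch of such an 8-byte load, assuming the hypothetical helper names load8 and load8_demo (illustrative names, not necessarily the original code):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Compile-time check that the demo literal supplies enough bytes. */
static_assert(sizeof(uint64_t) <= sizeof("hello world!"),
              "literal must provide at least sizeof(uint64_t) bytes");

/* Hypothetical helper: load exactly sizeof(value) bytes from str.
   The caller must guarantee that many bytes are readable. */
uint64_t load8(const char *str)
{
    uint64_t value;
    memcpy(&value, str, sizeof(value));  /* alignment- and aliasing-safe load */
    return value;
}

/* With a string literal, the compiler can constant-propagate the result. */
uint64_t load8_demo(void)
{
    return load8("hello world!");
}
```

The object representation of the result is the bytes "hello wo", in string order, whatever the target's endianness.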
This compiles very simply, to one x86-64 8-byte mov instruction using GCC or clang on the Godbolt compiler explorer.
On ISAs where unaligned loads just work with at worst a speed penalty, e.g. x86-64 and PowerPC64, memcpy reliably inlines. But on MIPS64 you'd get a function call.

BTW, I used sizeof(value) instead of 8 for two reasons: first so you can change the type without having to manually change a hard-coded size.

Second, because a few obscure C implementations (like modern DSPs with word-addressable memory) don't have CHAR_BIT == 8. Often 16 or 24, with sizeof(int) == 1, i.e. the same as a char. I'm not sure exactly how the bytes would be arranged in a string literal, like whether you'd have one character per char word or if you'd just have an 8-letter string in fewer than 8 chars, but at least you wouldn't have undefined behaviour from writing outside a local variable.

Example: short strings with strncpy

The strncpy misfeatures (that make it not good for what people wish it was designed for, a strcpy that truncates to a limit) are why compilers like GCC warn about these valid use-cases with -Wall. That and our non-standard use-case, where we want truncation of a longer string literal just to demo how it would work. That's not strncpy's fault, but the warning about passing a length limit the same as the actual size of the destination is.

Big-endian examples: PowerPC64
Strangely, GCC for MIPS64 doesn't want to inline strnlen, and PowerPC can more efficiently construct constants larger than 32-bit anyway. (Fewer shift instructions, as oris can OR into bits [31:16], i.e. OR a shifted immediate.)

Compiling as C++ to allow function return values as initializers for global vars, clang (trunk) for PowerPC64 compiles the above with constant-propagation into initialized static storage in .data for these global vars, instead of calling a "constructor" at startup to store into the BSS like GCC unfortunately does. (It's weird because GCC's initializer function just constructs the value from immediates itself and stores.)

The asm for tests1() can only construct a constant from immediates 16 bits at a time (because an instruction is only 32 bits wide, and some of that space is needed for opcodes and register numbers). Godbolt

I played around a bit with getting constant-propagation to work for an initializer for a
uint64_t foo = tests1() at global scope in C++ (C doesn't allow non-const initializers in the first place) to see if I could get GCC to do what clang does. No success so far. And even with constexpr and C++20 std::bit_cast<uint64_t>(struct_of_char_array) I couldn't get g++ or clang++ to accept uint64_t foo[stringbytes2("h")] to use the integer value in a context where the language actually requires a constexpr, rather than it just being an optimization. Godbolt.

IIRC std::bit_cast should be able to manufacture a constexpr integer out of a string literal but there might have been some trick I'm forgetting; I didn't search for existing SO answers yet. I seem to recall seeing one where bit_cast was relevant for some kind of constexpr type-punning.

Credit to @selbie for the strncpy idea and the starting point for the code; for some reason they changed their answer to be more complex and avoid strncpy, so it's probably slower when constant-propagation doesn't happen, assuming a good library implementation of strncpy that uses hand-written asm. But either way still inlines and optimizes away with a string literal.

Their current answer with strnlen and memcpy into a zero-initialized value is exactly equivalent to this in terms of correctness, but compiles less efficiently for runtime-variable strings.
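For comparison, the strnlen plus memcpy variant mentioned above might look roughly like this. The name pack_str8_memcpy is hypothetical, and strnlen is a POSIX function rather than ISO C, so this sketch assumes a POSIX-like libc:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* assumption: needed on some libcs to declare strnlen */
#endif
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: same result as the strncpy approach, built from
   strnlen and memcpy into a zero-initialized integer. */
uint64_t pack_str8_memcpy(const char *s)
{
    assert(s != NULL);
    uint64_t value = 0;                      /* unused upper bytes stay zero */
    size_t len = strnlen(s, sizeof(value));  /* never scans past 8 bytes */
    memcpy(&value, s, len);                  /* copy only the string's bytes */
    return value;
}
```

Both variants produce the same object representation: the first 8 bytes of the string, or the whole string followed by zero bytes when it is shorter.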