Trouble understanding char* and string in CS50


So I know that a string is just an array of characters that are stored consecutively in a computer's memory.

I also know that to locate a string you only need the address of its first character, since the characters are stored consecutively, and that the string ends where the program or function encounters the \0 character.

But what I don't understand is :

1.

    char* s = "HI!";

Does it create an array of 4 characters? Or is it just a pointer pointing to the starting character location? Or is it doing both?

2.

    char* name = "BEN";
    printf("%c %c\n", *(name + 1), *name + 1);

Why do they both give two different outputs (E and C), instead of both giving E?

Vlad from Moscow On BEST ANSWER

In this declaration

char* s = "HI!";

Two entities are created.

The first one is the string literal "HI!", which has static storage duration and array type char[4]. (In C++, unlike in C, it has the constant character array type const char[4].)

You can check this using printf

printf( "sizeof( \"HI!\" ) = %zu\n", sizeof( "HI!" ) );

The second one is the pointer s itself. Here the character array is used as the initializer of s. In this case the array is implicitly converted to a pointer to its first element, so s holds the address of the first element of the array.
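
As a minimal sketch of the point above (the function names literal_size, pointer_size and first_char are made up for illustration and are not part of the original answer), the array and the pointer can be measured separately:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Size of the array denoted by the literal "HI!": 'H', 'I', '!', '\0'. */
size_t literal_size(void) {
    return sizeof("HI!");
}

/* Size of the pointer variable itself; it does not depend on the string. */
size_t pointer_size(void) {
    const char *s = "HI!";
    return sizeof(s);
}

/* First character reached through the pointer. */
char first_char(void) {
    const char *s = "HI!";
    assert(strlen(s) == 3); /* strlen does not count the terminating '\0' */
    return s[0];
}
```

Calling these from main, literal_size() should give 4, while pointer_size() gives whatever a pointer occupies on your platform (commonly 8 on 64-bit systems). The sketch uses const char * rather than char * only as a precaution, since writing to a string literal is undefined behavior in C.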

As for this code snippet

char* name = "BEN";
printf("%c %c\n", *(name + 1), *name + 1);

The expression name + 1 has type char * and, by pointer arithmetic, points to the second character of the string literal "BEN" (that is, 'E'). Dereferencing that pointer expression, as in *(name + 1), gives you the character it points to. In fact, the expression *(name + 1) is the same as name[1], which in turn is the same as 1[name]. :)

As for the expression *name, dereferencing the pointer name gives you the first symbol 'B' of the string literal. Then 1 is added to the internal code of the symbol ( *name + 1 ), so the expression takes the value of the symbol that follows 'B' in the character set, which is 'C' (in ASCII). The expression ( *name + 1 ) is equivalent to the expression name[0] + 1.

Using the subscript operator, as in name[1] and name[0] + 1, makes the expressions clearer.

I think it would be interesting for you to know that the call of printf may be rewritten using only the original string literal. Some examples:

printf("%c %c\n", *( "BEN" + 1), *"BEN" + 1);

or

printf("%c %c\n", "BEN"[1], "BEN"[0] + 1);

or even

printf("%c %c\n", 1["BEN"], 0["BEN"] + 1);
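
To check these equivalences yourself, here is a small sketch (second_char and first_plus_one are hypothetical helper names, and the 'C' result assumes an ASCII-compatible character set):

```c
#include <assert.h>

/* Second character of the string, written three equivalent ways. */
char second_char(const char *name) {
    char a = *(name + 1);
    char b = name[1];
    char c = 1[name]; /* legal because x[y] is defined as *(x + y) */
    assert(a == b && b == c);
    return a;
}

/* Code of the first character plus one, converted back to char. */
char first_plus_one(const char *name) {
    return (char)(*name + 1);
}
```

With "BEN" as the argument, second_char returns 'E' and first_plus_one returns 'C'.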
Ted Lyngmo On
  1. In char* s = "HI!"; the "HI!" is a string literal and s points at the H. The string literal is 4 chars long (including the terminating \0). Another (non-idiomatic) way to look at it:
    char(*s)[4] = &"HI!";
    
    ... and here s is a pointer to a char[4].
  2. Your second example comes down to the order in which things are done, that is, operator precedence.
    char* name = "BEN";
    printf("%c %c\n", *(name + 1), *name + 1);
    
    • (name + 1) adds 1 to the char* name, so you get a pointer pointing at the E. After that you dereference the pointer with * and get E.
    • *name dereferences the pointer, which points at B, and then 1 is added to that value, making it C.

It's the same as doing this:

printf("%c %c\n", name[1], name[0] + 1);
selbie On

But what I don't understand is :

char* s = "HI!";

Does it create an array of 4 characters? Or is it just a pointer pointing to the starting character location? Or is it doing both?

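The right side of the initialization, = "HI!";, provides an array of chars (4 including the null character). In C its type is char[4] rather than const char[4], but modifying it is undefined behavior, so treat it as constant. Where exactly the memory for these 4 chars is placed is not of concern; the compiler is free to choose where "HI!" lives. It does have static storage duration, though, so it exists for the whole run of the program.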

The left side, char* s, declares a pointer (on the stack, assuming s is a local variable) that is initialized to point to the first character of the array (the 'H').
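
One way to see that the pointer and the array are separate objects is to return the pointer from a function. The sketch below uses a hypothetical function named greeting. It relies on the fact that string literals in C have static storage duration, so the array is still there after the local pointer s has gone away:

```c
#include <assert.h>
#include <string.h>

/* s is a local variable, but the array it points at is not local. */
const char *greeting(void) {
    const char *s = "HI!";
    assert(strlen(s) == 3);
    return s; /* returns a copy of the address; the array outlives the call */
}
```

The caller can keep using the returned address, for example with printf("%s\n", greeting());, because only the pointer variable was local to the function.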

char* name = "BEN";

printf("%c %c\n", *(name + 1), *name + 1);

Why do they both give two different outputs (E and C), instead of both giving E?

  • name is a pointer. It holds the address of the letter B in that string.
  • name+1 is a pointer to one character past B in that string, in this case, E.
  • *name is the value held at the address pointed to by name. In other words the character 'B'.
  • *name + 1 is the same as (*name) + 1 or 'B' + 1 or 'C'. You're adding 1 to the value, not the address in this expression.
  • *(name + 1) evaluates to the second character in the string referenced by name, which is 'E', as you might expect.
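
The difference between adding to the value and adding to the address can also be seen in the numbers involved. The following is only a sketch with made-up helper names (value_plus_one, next_char, sum_size):

```c
#include <assert.h>
#include <stddef.h>

/* (*name) + 1: read the first character, then add 1 to its code. */
int value_plus_one(const char *name) {
    assert(name != NULL);
    return *name + 1;
}

/* *(name + 1): step the pointer one char forward, then read. */
int next_char(const char *name) {
    assert(name != NULL);
    return *(name + 1);
}

/* The addition is done in int, because the char operand is promoted first. */
size_t sum_size(const char *name) {
    return sizeof(*name + 1); /* the operand of sizeof is not evaluated here */
}
```

For "BEN" on an ASCII system, value_plus_one yields 67 (the code of 'C') and next_char yields 69 (the code of 'E'). printf's %c then prints those int values as characters.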