3

Let's take an example.

#include <stdio.h>
#include <string.h>

int main() {
    char str1[7] = "hello ";
    printf("Initial size of str1 is: %d\n", strlen(str1));
    char str2[] = "buddy";
    printf("%s\n", strcat(str1, str2));
    printf("Final size: %d\n", strlen(str1));
}

The output of the above program will be

Initial size of str1 is: 6
hello buddy
Final size: 11

--------------------------------
Process exited after 0.835 seconds with return value 0
Press any key to continue . . .

See how the size of str1 changed from 7 to 11 (including the null character)? Here is what I think happened:

  • Some function I do not know may have reallocated contiguous memory for str1 starting from same address as before i.e str1 with size strlen(str1)+strlen(str2)+1 +1 for null value, and then redefined it to get hello buddy.

If I am wrong please tell, if not then, what function is it and how does it work?

One more question: how can I write a code to do the above task without the use of strcat function.

I tried doing it with realloc() but didn't quite succeed. Maybe that's because realloc() can only reallocate dynamically allocated memory, is it so?

Vlad from Moscow
Raon
  • 5
    You just overflowed buffer where `str1` points to. It is undefined behavior. – dimich Aug 02 '22 at 08:38
  • 3
    And there is nothing with dynamic memory allocation. All strings are located on stack. – dimich Aug 02 '22 at 08:40
  • "is it so?" The [manpage](https://man7.org/linux/man-pages/man3/realloc.3p.html) provides the answer: "If ptr does not match a pointer returned earlier by calloc(), malloc(), or realloc() or if the space has previously been deallocated by a call to free() or realloc(), the behavior is undefined." – Gerhardh Aug 02 '22 at 10:10

5 Answers5

2

Buffer overflow

OP's code fails as strcat(str1,str2) attempts to write past the end of str1[] - result: undefined behavior (UB). @dimich

Instead use a larger destination buffer.

// char str1[7]="hello ";
char str1[7 + 5]="hello ";
char str2[]="buddy";
printf("%s\n",strcat(str1,str2));

Use correct print specifier

strlen() returns a size_t, not an int.

// printf("Initial size of str1 is: %d\n",strlen(str1));
printf("Initial size of str1 is: %zu\n",strlen(str1));

Tip: enable all warnings.

Alternative

One of many alternatives: copy str2 to the end of str1.

// printf("%s\n",strcat(str1,str2));
strcpy(str1 + strlen(str1), str2);
printf("%s\n",str1);

realloc()

realloc() can only reallocate dynamically allocated memory, is it so?

realloc() must not be called with a non-NULL pointer that was not returned by malloc(), calloc(), or realloc().
In addition to re-allocating dynamically allocated memory, realloc() can start with no prior allocation.

char *p = realloc(NULL, size);
// just like
char *p = malloc(size);

Moral of the story

  • Be mindful of memory usage with string functions.
  • Enable all warnings.
chux - Reinstate Monica
  • Can you please explain what `strcpy(str1 + strlen(str1), str2);` actually does? When I used `strcpy(str1 + strlen(str2), str2);` it returned _hellobuddy_; otherwise it works completely fine. – Raon Aug 02 '22 at 16:26
  • And yes, I got your point about the buffer overflow, but why does it give the required output every time, even if I change the string in _str1_, even though the memory is statically allocated? – Raon Aug 02 '22 at 16:35
  • @Raon `strcpy(str1 + strlen(str1), str2);` determines the location of the _null character_ with `str1 + strlen(str1)` as that is the _length_ of the _string_ from its beginning. `strcpy(str1 + strlen(str1), str2)` then copies `str2` to that location - the end of `str1`. – chux - Reinstate Monica Aug 02 '22 at 17:18
  • @Raon `strcpy(str1 + strlen(str2), str2);` should form _hellobuddy_ as the length of `str2` is one less than `str1`, so the copying destination starts one character earlier, overwriting the `' '`. – chux - Reinstate Monica Aug 02 '22 at 17:20
  • @Raon Writing beyond the end of `char str1[7]` is UB. It might return what you want, it might freeze the program, it might ... It is _not_ defined. Your _every time_ is not that many times, not with many compilers, not with many platforms, etc. To form the correct output every time for you, me and others on other computers, other compilers, we need to use a destination buffer of sufficient size like `char str1[7 + 5];`. – chux - Reinstate Monica Aug 02 '22 at 17:24
  • 1
    @Raon If you need a language that catches your mistakes, C is not the language for you. C trusts the programmer is doing the right thing and so does not need to add lots of run-time checks that slow things down. C is [compact, powerful, goes everywhere](http://s3.crashworks.org.s3-website-us-east-1.amazonaws.com/if-programming-languages-were-vehicles/) – chux - Reinstate Monica Aug 02 '22 at 17:28
  • So basically `str1+strlen(str1)` defines the point to which str2 should be copied, where `str1` is the target and `strlen(str1)` gives the index at which copying should start? One more thing: how does this not cause a buffer overflow? str1 is still defined to be 7 characters! – Raon Aug 03 '22 at 03:40
  • Does the size of the _array storing str1_ somehow increase? – Raon Aug 03 '22 at 03:41
  • 1
    @Raon: the *size* of the array `str1` does not change, only the *length* of the C string it contains does. The array `str1` must be defined with a large enough size to accommodate the final string plus a null terminator: `6 + 5 + 1` bytes. – chqrlie Aug 03 '22 at 09:30
  • @Raon "str1+strlen(str1) is defining from which point of index str2 should be copied" --> Yes. Calculating an address within an array, or one past it, is not UB. Attempting to store past the array's end is UB. Once an _array_ is defined, its size cannot change. – chux - Reinstate Monica Aug 03 '22 at 10:45
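The pointer arithmetic discussed in these comments can be sketched as follows. This is only an illustration: `copy_at` is a hypothetical helper name, not something from the answer, and it assumes the caller has already made the destination array large enough.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dest starting at index `offset`.
   The caller must guarantee that dest has room for
   offset + strlen(src) + 1 bytes; nothing is checked here. */
char *copy_at(char *dest, size_t offset, const char *src)
{
    /* dest + offset is the address of dest[offset] */
    strcpy(dest + offset, src);
    return dest;
}
```

With `char str1[12] = "hello ";`, calling `copy_at(str1, strlen(str1), "buddy")` starts writing at index 6 (the null character) and gives `hello buddy`. Calling `copy_at(str1, strlen("buddy"), "buddy")` starts at index 5, overwrites the space, and gives `hellobuddy`. In both cases `sizeof str1` stays 12.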
1
  1. You can only reallocate memory that you have dynamically allocated (i.e. using the malloc family of functions).

See? how the size of str1 changed from 7 to 11 (including null variable), regarding that what I think would have happened is : A function idk which one, but it may have reallocated contiguous memory for str1 starting from same address as before i.e str1 with size strlen(str1)+strlen(str2)+1 +1 for null value, and then redefined it to get hello buddy. If i am wrong please tell, if not then , what function is it and how does it work?

  1. You are wrong. This is an example of undefined behaviour: you have written data outside the array's memory. Undefined behaviour means that your program's behaviour from that point on is unpredictable. strcat does not reallocate any memory.

At last, how can i write a code to do the above task without the use of strcat function.

For example:

char *mystrcat(char *dest, const char *src, const int isDynamicMemory)
{
    size_t dlen = strlen(dest);
    size_t slen = strlen(src);

    if(isDynamicMemory)
    {
        dest = realloc(dest, slen + dlen + 1);
        if(!dest) return NULL;
    }
    memcpy(dest + dlen, src, slen + 1);
    return dest;
}

If dest was dynamically allocated, you can have it reallocated to the correct size by passing 1 as the isDynamicMemory parameter.
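A possible way to call mystrcat on heap memory is sketched below. The function is repeated so the snippet compiles on its own; `join_heap` is a hypothetical wrapper added only for illustration. The point it tries to show is that when realloc fails, the old block is still valid and the caller still owns it.

```c
#include <stdlib.h>
#include <string.h>

/* mystrcat as given above, repeated so this sketch is self-contained. */
char *mystrcat(char *dest, const char *src, const int isDynamicMemory)
{
    size_t dlen = strlen(dest);
    size_t slen = strlen(src);

    if(isDynamicMemory)
    {
        dest = realloc(dest, slen + dlen + 1);
        if(!dest) return NULL;
    }
    memcpy(dest + dlen, src, slen + 1);
    return dest;
}

/* Hypothetical helper: returns a newly allocated "a" followed by "b",
   or NULL on allocation failure. The caller frees the result. */
char *join_heap(const char *a, const char *b)
{
    char *buf = malloc(strlen(a) + 1);
    if (!buf) return NULL;
    strcpy(buf, a);

    char *grown = mystrcat(buf, b, 1);
    if (!grown) {
        free(buf);      /* realloc failed: the original block is untouched */
        return NULL;
    }
    return grown;       /* may or may not be the same address as buf */
}
```

For an ordinary array such as `char str1[12] = "hello ";`, pass 0 instead: `mystrcat(str1, "buddy", 0)`. The array must already be large enough in that case.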

  1. Try not to write everything in the main function. Use separate functions for tasks like this, such as writing your own strcat-like function.
0___________
1

The program has undefined behavior because in the call of strcat

printf("%s\n", strcat(str1, str2));

the memory beyond the character array str1 declared like

char str1[7] = "hello ";

is being overwritten.

There is no space in the array to accommodate the string "buddy" after the stored string "hello ".

The sizes of the arrays str1 and str2 themselves were not changed. The function strlen does not return the size of a character array. It returns the length of a string: a sequence of characters terminated by the zero-terminating character '\0'. To get the size of a character array you should use the operator sizeof.

Note that the return type of the function strlen is size_t. To output a value of this type you have to use the conversion specifier zu instead of d.

printf("Initial size of str1 is: %zu\n", strlen(str1));
                                 ^^^ 

To make the program correct you have to enlarge the character array str1. Here is a demonstration program.

#include <stdio.h>
#include <string.h>

int main( void )
{
    char str1[12] = "hello ";
    
    printf( "The size of str1 is: %zu\n", sizeof( str1 ) );
    printf( "The length of the stored string in str1 is: %zu\n", strlen( str1 ) );

    char str2[] = "buddy";

    printf("%s\n", strcat(str1, str2));

    printf( "The size of str1 is: %zu\n", sizeof( str1 ) );
    printf( "The length of the stored string in str1 is: %zu\n", strlen( str1 ) );
}

The program output is

The size of str1 is: 12
The length of the stored string in str1 is: 6
hello buddy
The size of str1 is: 12
The length of the stored string in str1 is: 11

As you can see from the output the size of the array str1 stays unchanged. What was changed is the length of the stored string in the array str1.

Note that the function strcat does not allocate or reallocate memory for the arrays passed to it as arguments. So if the destination array does not have enough memory to accommodate the appended string, the behavior is undefined, because the memory after the destination array will be overwritten.
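Since strcat cannot check the destination for you, one option is to do the check yourself before copying. The following is a minimal sketch, not a standard function: `append_checked` and its return convention are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the string in dest only if it fits.
   dest_size is the size of the whole destination array (e.g. sizeof dest).
   Returns 0 on success, -1 if there is not enough room, in which case
   dest is left unchanged. */
int append_checked(char *dest, size_t dest_size, const char *src)
{
    size_t dlen = strlen(dest);
    size_t slen = strlen(src);

    /* We need dlen + slen + 1 <= dest_size, written without overflow. */
    if (dlen >= dest_size || slen >= dest_size - dlen)
        return -1;

    memcpy(dest + dlen, src, slen + 1);   /* +1 copies the '\0' too */
    return 0;
}
```

With `char str1[7] = "hello ";` the call `append_checked(str1, sizeof str1, "buddy")` returns -1 and writes nothing. With `char str1[12]` it succeeds, because 6 + 5 + 1 bytes fit exactly.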

Vlad from Moscow
0

As I mentioned in the comment, str1 overflows. strcat() doesn't know where or how the strings are allocated. It searches for the end of the destination string and appends the source string, overwriting the null terminator. If you want dynamic allocation, it could be:

    char str1[]="hello ";
    printf("Initial size of str1 is: %zu\n", strlen(str1));
    char str2[]="buddy";
    char *str3 = malloc(strlen(str1) + strlen(str2) + 1);
    if (!str3)
        return 1;
    strcpy(str3, str1);
    strcat(str3,str2);
    printf("%s\n",str3);
    printf("Final size: %zu\n", strlen(str3));
    free(str3);

This may be optimized a little. strlen() calculates the length of a string at run time, while these strings are known at compile time, so we can use sizeof instead:

    char str1[]="hello ";
    printf("Initial size of str1 is: %zu\n", strlen(str1));
    char str2[]="buddy";
    char *str3 = malloc(sizeof(str1) + sizeof(str2) - 1);
    if (!str3)
        return 1;
    strcpy(str3, str1);
    strcpy(str3 + sizeof(str1) - 1, str2);
    printf("%s\n",str3);
    printf("Final size: %zu\n", strlen(str3));
    free(str3);

Also pay attention to strlen() return type: it is size_t. We should either cast it to int for printing with %d format or print with %zu.

EDIT: fixed length calculation, fixed printf formatting modifier.

dimich
  • Or even `memcpy(str3, str1, sizeof(str1)-1); memcpy(str3 + sizeof(str1)-1, str2, sizeof(str2));` instead of both `strcpy()`. – dimich Aug 02 '22 at 09:20
  • 1
    I think casting the `size_t` to `int` is pretty unhealthy because `size_t`'s max value is way greater in every implementation I am aware of, and casting also adds unnecessary overhead. – Brogolem35 Aug 02 '22 at 09:22
  • @Brogolem35 Of course `%z` is preferred. I just don't remember what standard defines it. But in this case the lengths definitely fit into `int`. The output will be wrong for very long strings, but the behavior will still be defined. – dimich Aug 02 '22 at 09:29
  • 1
    I couldn't find a source but from what I remember it was defined in C99. – Brogolem35 Aug 02 '22 at 09:40
  • 2
    @dimich "I just don't remember what standard defines it." `z` prefix came up in C99. To portable print a `size_t` in C89, code used `printf("%lu\n",(unsigned long) strlen(str1));`, something that worked very well back then and even today. – chux - Reinstate Monica Aug 02 '22 at 17:42
  • `char *str3 = malloc(sizeof(str1) + sizeof(str2));` is **very confusing**. It actually allocates one byte too many and good compilers will compute the allocation size at compile time for `char *str3 = malloc(strlen(str1) + strlen(str2) + 1);` without any risk of confusion and more importantly without undefined behavior in case `str1` or `str2` are changed from arrays to pointers in a separate edit. To prove my point, the urge for premature optimisation bit you immediately: `strcpy(str3 + sizeof(str1), str2);` is incorrect! it should have been `strcpy(str3 + sizeof(str1) - 1, str2);`. – chqrlie Aug 03 '22 at 09:26
  • I am downvoting your answer until you fix it. – chqrlie Aug 03 '22 at 09:26
  • @chqrlie You are right about the length calculations, it's my mistake. Thank you for pointing it out. But I disagree about the confusion from changing an array to a pointer when the variable is declared in the same function. A developer must know what he/she is working with. I've seen many times when relying on the optimizer made O(2) instead of O(1) because of redundant length calculations. A better approach is to call `strlen()` once at the beginning and use the stored value, of course. Please leave your downvote as is, it also matters. – dimich Aug 03 '22 at 12:14
  • Since you are adamant about optimisation, you might also replace `strcpy(str3, str1); strcpy(str3 + sizeof(str1) - 1, str2);` with `memcpy(str3, str1, sizeof(str1) - 1); memcpy(str3 + sizeof(str1) - 1, str2, sizeof(str2));`... why use `strcpy` when the sizes are known, especially at compile time. ` – chqrlie Aug 03 '22 at 17:03
  • @chqrlie I noted this in first comment. Profit of `memcpy()` is linear, just copying machine words instead of chars and different stop condition, which is O(1). – dimich Aug 03 '22 at 17:12
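The approach suggested in these comments (compute each length once, then copy with memcpy) could look roughly like this. `concat_alloc` is a hypothetical name used only for this sketch, and it works for pointers as well as arrays because it does not rely on sizeof.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated concatenation of a and b,
   or NULL if the allocation fails. Each length is computed exactly once. */
char *concat_alloc(const char *a, const char *b)
{
    size_t alen = strlen(a);
    size_t blen = strlen(b);

    char *out = malloc(alen + blen + 1);   /* +1 for the '\0' */
    if (!out) return NULL;

    memcpy(out, a, alen);                  /* a without its terminator */
    memcpy(out + alen, b, blen + 1);       /* b including its terminator */
    return out;
}
```

The caller owns the result and must pass it to free() when done.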
0

As other answers (1, 2) mentioned, your code has a buffer overflow, a kind of undefined behaviour.

One more question: how can I write a code to do the above task without the use of strcat function.

Either:

  1. Use snprintf()
  2. Use strlcpy()
  3. Implement your own version of strcat() (with some potential improvements)

Option 1

char str1[] = "hello ";
char str2[] = "buddy";
    
size_t total_size = sizeof(str1) + sizeof(str2) - 1; // sizeof counts \0 at the end
char res[total_size];

snprintf(res, total_size, "%s%s", str1, str2);

printf("sizeof(str1) = %zu\n", sizeof(str1));
printf("sizeof(str2) = %zu\n", sizeof(str2));
printf("total_size   = %zu\n", total_size);
printf("res = %s\n", res);

Output:

sizeof(str1) = 7
sizeof(str2) = 6
total_size   = 12
res = hello buddy

snprintf() lets you control the maximum number of characters written, which comes in handy for preventing a buffer overflow. For example, if you use sprintf(res, "%s%sxxx", str1, str2) in the above code, you'll get a buffer overflow. This is not the case with snprintf(res, total_size, "%s%sxxx", str1, str2), which truncates the output instead.
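Because snprintf() returns the number of characters that would have been written had the buffer been large enough, its return value can also be used to detect truncation. A small sketch follows; `join_snprintf` and its return codes are assumptions made for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes a followed by b into out (out_size bytes).
   Returns 0 if everything fit, 1 if the result was truncated,
   -1 if snprintf reported an error. */
int join_snprintf(char *out, size_t out_size, const char *a, const char *b)
{
    int n = snprintf(out, out_size, "%s%s", a, b);
    if (n < 0)
        return -1;
    /* n excludes the '\0', so n == out_size already means truncation. */
    return (size_t)n >= out_size ? 1 : 0;
}
```

With a 12-byte buffer, joining "hello " and "buddy" returns 0. With an 8-byte buffer it returns 1 and the buffer holds the null-terminated prefix "hello b".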


Option 2

strlcpy() lets you control the maximum number of characters copied, and guarantees null-termination (unlike strncpy()). Possible implementations: 1, 2, 3.
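strlcpy() is not part of standard C and is missing from some C libraries, so here is a rough sketch of a function with the same semantics as the BSD one. The name `my_strlcpy` is made up for this example; it is not the library implementation.

```c
#include <stddef.h>
#include <string.h>

/* Sketch modeled on BSD strlcpy semantics: copies at most size - 1
   characters from src to dst, always null-terminates when size > 0,
   and returns strlen(src) so the caller can detect truncation. */
size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    size_t slen = strlen(src);

    if (size > 0) {
        size_t n = slen < size - 1 ? slen : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return slen;
}
```

To concatenate, copy to the end of the existing string and pass the space that remains: with `size_t len = strlen(str1);`, call `my_strlcpy(str1 + len, str2, sizeof(str1) - len)`. A return value greater than or equal to the size passed in means the copy was truncated.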


Option 3 (credit: Back to Basics)

char *concatenate(char *dest, char *src)
{
    while (*dest) dest++;
    while ((*dest++ = *src++));
    return --dest;
}

Or:

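char *concatenate_n(char *dest, char *src, int n)
{
    while (*dest) dest++;
    // Copy at most n characters, then always write the terminator.
    // dest must have room for n more characters plus the '\0'.
    for (int i = 0; i < n && *src; ++i) *dest++ = *src++;
    *dest = '\0';
    return dest;
}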

Example:

char str1[12] = "hello ";
char str2[] = "buddy";

char *p = concatenate(str1, str2); // p points at the terminating '\0' of str1
// concatenate_n(str1, str2, sizeof(str1) - 1 - strlen(str1)); // Remaining space in str1

printf("str1 = %s\n", str1);
printf("p    = %s\n", p); // prints an empty string
Zakk
  • "strcpy() is very prone to BO: if str2 is larger than what str1 can hold, your code may behave in an undefined manner." --> so does `strncpy(str1 + strlen(str1), str2, strlen(str2));`. Using `strncpy()`, as posted here, has no benefit over `strcpy(str1 + strlen(str1), str2);`, just more complexity. – chux - Reinstate Monica Aug 02 '22 at 17:46
  • `concatenate_n()` is not _safer_ than `concatenate()` for similar reasons. A _safe_ concatenation function would use the size of the destination to limit copying. – chux - Reinstate Monica Aug 02 '22 at 17:50
  • @chux-ReinstateMonica I have edited the code, Now, both `strncpy()` and `concatenate_n()` accept as a third argument the remaining space available in the destination buffer: `sizeof(str1) - 1 - strlen(str1)`. – Zakk Aug 03 '22 at 08:47
  • @chux-ReinstateMonica `sizeof(str1)` is the total capacity, `-1` because of the counted `\0`, and `strlen(str1)` to exclude the used space. What remains is the free space available. Thanks for your comment btw. – Zakk Aug 03 '22 at 08:49
  • 1
    I removed the DV, but there are still some problems: `concatenate_n` does not null terminate the destination array if `n < strlen(src) + 1`. You should model this function after `strlcat`, with a prototype `char *concatenate_n(char *dest, size_t n, const char *src)` where `n` is the length of the array pointed to by `dest`, much simpler to pass than the available size at the end of `dest`. – chqrlie Aug 03 '22 at 10:42
  • @chqrlie OK. I'll fix that. Again, thanks a lot. – Zakk Aug 03 '22 at 10:45