strtok in C

strtok() function in C with example code

In one of my projects, I used the strtok function to parse the server response in TCP/IP client-server communication. I have also used it many times to parse other strings.

Before you use strtok in C, you should understand how it behaves, because using it without knowing its rules can lead to undefined behavior. The C standard also defines a safer version, strtok_s (an optional part of C11), but I will discuss it in another article. So let’s see the strtok function and its usage in C programming.

A sequence of calls to the strtok function breaks the string pointed to by s1 into a sequence of tokens, each of which is delimited by a character from the string pointed to by s2. In simple words, we can say that strtok() divides the string into tokens.

 

Syntax of strtok in C:

//General syntax of strtok()

char *strtok(char * restrict s1,
const char * restrict s2);


Parameters:

s1— The s1 string is modified and broken into smaller strings (tokens).

s2— The s2 string contains the delimiter characters. These may vary from one call to another.

Return:

The strtok function returns a pointer to the first character of a token or a null pointer if there is no token.

 

Let’s see some example code to understand how strtok works in C. In this C code, I break the string s1 into substrings using the strtok function and the delimiter s2.

#include <stdio.h>
#include <string.h>

int main()
{
    //String to break into tokens
    char s1[] = "Aticle-world-.com";

    //delimiter character
    const char *s2 = "-";

    // first call of strtok
    char *token = strtok(s1, s2);

    // Keep printing tokens until strtok
    // finds no more tokens in s1.
    while (token != NULL)
    {
        //printing token
        printf("%s\n", token);

        //subsequent calls in the sequence
        //have a null first argument.
        token = strtok(NULL,s2);
    }

    return 0;
}

Output:

Aticle
world
.com

Important points you must know before using the strtok in C:

1. You must include the string.h header file before using the strtok function in C.

2. The first call in the sequence has a non-null first argument and subsequent calls in the sequence have a null first argument. We have seen in the above example that in subsequent calls we are passing NULL.
3. The separator string pointed to by s2 may be different from call to call. Let’s see an example to understand this point.
#include <stdio.h>
#include <string.h>

int main()
{
    //String to break into tokens
    char s1[] = "?aticle???world,,,#.com";

    // first call of strtok
    char *token = strtok(s1, "?"); // token points to the token "aticle"
    printf("token => %s\n", token);

    //subsequent calls in the sequence
    //have a null first argument.
    token = strtok(NULL, ","); // token points to the token "??world"
    printf("token => %s\n", token);

    token = strtok(NULL, "#,"); // token points to the token ".com"
    printf("token => %s\n", token);

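    token = strtok(NULL, "?"); // token is a null pointer
    // passing a null pointer to %s is undefined, so print a placeholder
    printf("token => %s\n", token ? token : "(null)");

    return 0;
}

Output:

token => aticle
token => ??world
token => .com
token => (null)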
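
Changing the delimiter between calls is handy for simple key=value data, such as the kind of server response mentioned at the start of this article. The sketch below assumes a made-up response format like status=200;msg=OK. The format and the helper name parse_pair are only illustrative and do not come from any real protocol or library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads the next "key=value" pair from a sequence of
   strtok calls. Pass the buffer on the first call and NULL afterwards.
   Returns 1 when a pair was read, 0 when no complete pair is left. */
int parse_pair(char *buf, char **key, char **value)
{
    *key = strtok(buf, "=");      /* the key ends at '=' */
    if (*key == NULL)
    {
        return 0;
    }
    *value = strtok(NULL, ";");   /* the value ends at ';' or at the end */
    return *value != NULL;
}

/* Hypothetical helper: prints every pair found in buf and returns how
   many pairs were printed. buf is modified by strtok. */
int print_pairs(char *buf)
{
    char *key;
    char *value;
    int count = 0;

    for (int ok = parse_pair(buf, &key, &value); ok;
         ok = parse_pair(NULL, &key, &value))
    {
        printf("%s -> %s\n", key, value);
        count++;
    }
    return count;
}
```

Calling print_pairs on a writable array that holds status=200;msg=OK prints two pairs. This sketch does not handle empty fields, because strtok skips leading delimiters instead of reporting an empty token.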

4. On the first call of the strtok function, the strtok function searches the string pointed to by s1 for the first character that is not contained in the current separator string pointed to by s2 (delimiter).
4.1 If no such character is found, then there are no tokens in the string pointed to by s1 and the strtok function returns a null pointer. Let’s see an example code,
#include <stdio.h>
#include <string.h>

int main()
{
    //String to break into tokens
    char s1[] = "aaaaa";

    //delimiter
    const char *s2 = "ab";

    // first call of strtok
    char *token = strtok(s1, s2);
    // token is a null pointer here; passing it to %s would be undefined
    printf("token => %s\n", token ? token : "(null)");

    return 0;
}

Output:

token => (null)

When you run this code, strtok returns a null pointer because every character in s1 also appears in the delimiter string, so there is no token to return.

 

 

4.2 But if such a character is found, it is the start of the first token. The strtok function then searches from there for a character that is contained in the current separator string.

4.2.1 If no such character is found, the current token extends to the end of the string pointed to by s1, and subsequent searches for a token will return a null pointer.

Let’s see an example code,

#include <stdio.h>
#include <string.h>

int main()
{
    //String to break into tokens
    char s1[] = "aaaacbd";

    //delimiter
    const char *s2 = "a";

    // first call of strtok
    char *token = strtok(s1, s2);
    printf("token => %s\n", token);

    return 0;
}

Output:

token => cbd

 

4.2.2 If such a character is found, it is overwritten by a null character (‘\0’), which terminates the current token. The strtok function saves a pointer to the following character, from which the next search for a token will start. Each subsequent call, with a null pointer as the value of the first argument, starts searching from the saved pointer and behaves as described above.

#include <stdio.h>
#include <string.h>

int main()
{
    //String to break into tokens
    char s1[] = "I@love_Aticle#world.com";

    //delimiter
    const char *s2 = "#@._";

    //first call of strtok
    char *token = strtok(s1, s2);
    printf("token => %s\n", token);

    //second call of strtok
    token = strtok(NULL, s2);
    printf("token => %s\n", token);


    //third call of strtok
    token = strtok(NULL, s2);
    printf("token => %s\n", token);

    //fourth call of strtok
    token = strtok(NULL, s2);
    printf("token => %s\n", token);

    //fifth call of strtok
    token = strtok(NULL, s2);
    printf("token => %s\n", token);

    return 0;
}

Output:

token => I
token => love
token => Aticle
token => world
token => com
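
To make steps 4 to 4.2.2 concrete, here is a minimal sketch of the same search written with strspn and strcspn. The name tok_sketch and its save parameter are made up for illustration. A real strtok keeps the saved pointer internally instead of taking it as an argument, and library implementations may differ in detail.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative re-implementation of steps 4 to 4.2.2. tok_sketch and the
   save parameter are made-up names, not part of the C library. */
char *tok_sketch(char *s, const char *delim, char **save)
{
    if (s == NULL)
    {
        s = *save;                  /* continue from the saved position */
    }
    if (s == NULL)
    {
        return NULL;                /* nothing to continue from */
    }

    s += strspn(s, delim);          /* step 4: skip leading delimiters */
    if (*s == '\0')
    {
        *save = s;                  /* step 4.1: no token left */
        return NULL;
    }

    char *end = s + strcspn(s, delim);  /* step 4.2: find the token end */
    if (*end == '\0')
    {
        *save = end;                /* step 4.2.1: token runs to the end */
    }
    else
    {
        *end = '\0';                /* step 4.2.2: terminate the token */
        *save = end + 1;            /* and remember where to resume */
    }
    return s;
}
```

Initialize a char *save variable to NULL, pass the string on the first call and NULL afterwards, exactly as with strtok. Because the position lives in the caller's variable, two token sequences can run side by side without disturbing each other.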

 

5. The behavior is undefined if either s1 or s2 is not a pointer to a null-terminated byte string.

6. The first argument of strtok must not be a string literal, because strtok writes to the string and modifying a string literal is undefined behavior.

7. The strtok function modifies the source string (s1), so pass a copy if you need the original string later.

8. The strtok function is not thread-safe, because it keeps the saved position in internal state that all calls share. I encourage you to experiment with strtok and share your findings in the comments.
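
The shared saved position from point 8 also causes trouble in a single thread when two token sequences are interleaved. The sketch below uses a made-up function name, count_groups_nested, and a made-up input layout: groups separated by ';' and fields separated by ','.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative function (made-up name): tries to split text into groups
   on ';' and each group into fields on ','. Returns the number of groups
   the outer loop actually sees. text is modified. */
int count_groups_nested(char *text)
{
    int groups = 0;

    for (char *group = strtok(text, ";"); group != NULL;
         group = strtok(NULL, ";"))
    {
        groups++;

        /* This inner call starts a new sequence and overwrites the
           position that strtok saved for the outer loop. */
        for (char *field = strtok(group, ","); field != NULL;
             field = strtok(NULL, ","))
        {
            printf("group %d field %s\n", groups, field);
        }
    }
    return groups;
}
```

For a writable array holding a,b;c,d the function returns 1 instead of 2. The inner loop consumes the first group to its end, so the next outer call finds nothing to continue from and the second group is never visited.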

 

Is strtok changing its input string?

Do not assume that strtok() leaves the parsed string unchanged. When strtok() finds a token, it overwrites the delimiter character immediately after the token with ‘\0’ and then returns a pointer to the token.

Let’s consider an example,

char str[] = "Amlendra@Aticleworld@KR";

char * ptr = strtok (str,"@");
while (ptr != NULL)
{
  ptr = strtok (NULL, "@");
}

When you call strtok(str, "@") and then strtok(NULL, "@"), strtok() finds each token and overwrites the delimiter that follows it (the @ character) with ‘\0’, which modifies the input string. The diagram below helps to show how strtok() works.

 

                 
  str array in memory
+-------------------------------------------------------------------------------------------+
|'A'|'m'|'l'|'e'|'n'|'d'|'r'|'a'|'@'|'A'|'t'|'i'|'c'|'l'|'e'|'w'|'o'|'r'|'l'|'d'|'@'|'K'|'R'|
+-------------------------------------------------------------------------------------------+
                                  ^                                               ^
  each '@' is replaced with '\0' (ASCII value is 0)
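
To see the overwritten bytes for yourself, you can print the whole array after tokenizing it. This is a small sketch. The helper name show_bytes_after_strtok is made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative helper (made-up name): tokenizes buf on '@', then prints
   every byte of the array so the inserted '\0' characters are visible.
   size is the full array size, including the final terminator.
   Returns the number of '\0' bytes found in the array. */
int show_bytes_after_strtok(char *buf, size_t size)
{
    int nulls = 0;

    for (char *ptr = strtok(buf, "@"); ptr != NULL; ptr = strtok(NULL, "@"))
    {
        /* nothing to do here; only the side effect on buf matters */
    }

    for (size_t i = 0; i < size; i++)
    {
        if (buf[i] == '\0')
        {
            printf("\\0 ");
            nulls++;
        }
        else
        {
            printf("%c ", buf[i]);
        }
    }
    printf("\n");
    return nulls;
}
```

With char str[] = "Amlendra@Aticleworld@KR"; and a call to show_bytes_after_strtok(str, sizeof str), both @ positions are printed as \0, and printing str with %s afterwards shows only Amlendra.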

 

To keep your original str unchanged, first copy str into a temporary buffer and then pass that buffer to strtok(). See the code below.

char str[] = "Hi Aticleworld How are you";


//strlen relies on str being null-terminated, which holds for this array;
//the +1 makes room for the terminating null character
char *tmp = calloc(strlen(str) + 1, sizeof(char));

/*strcpy is acceptable here because tmp was sized from str and
  str has exactly one terminating null character*/
if (tmp != NULL)
{
    strcpy(tmp, str);
}
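
Putting the copy and the tokenizing together, a complete version might look like the sketch below. The helper name count_words_keep_original is made up for illustration, and it uses malloc instead of calloc because strcpy overwrites the buffer anyway.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative helper (made-up name): counts the space-separated words
   in str without modifying it, by tokenizing a heap copy instead.
   Returns -1 if the copy cannot be allocated. */
int count_words_keep_original(const char *str)
{
    char *tmp = malloc(strlen(str) + 1);   /* +1 for the terminating '\0' */
    if (tmp == NULL)
    {
        return -1;
    }
    strcpy(tmp, str);                      /* tmp has room for the whole string */

    int count = 0;
    for (char *ptr = strtok(tmp, " "); ptr != NULL; ptr = strtok(NULL, " "))
    {
        printf("%s\n", ptr);
        count++;
    }

    free(tmp);                             /* the copy is no longer needed */
    return count;
}
```

Calling it with the str array from the snippet above prints five words, and str still holds the full sentence afterwards because only the copy was tokenized.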