Chapter 6: Arrays and Strings: Handling Collections of Data

So far, we’ve dealt with individual variables. But what if you need to store a collection of related items, like a list of student scores or a sequence of characters that form a name? This is where arrays and strings come in.

In C, arrays are fundamental for storing multiple values of the same data type in contiguous memory locations. Strings are a special case of character arrays. This chapter will cover:

  • Declaring and initializing arrays.
  • Accessing array elements.
  • The deep relationship between arrays and pointers.
  • Handling multi-dimensional arrays.
  • C strings: declaration, initialization, and common string manipulation functions.

6.1 Arrays

An array is a collection of elements of the same data type, stored at contiguous memory locations. Each element can be accessed using an index.

6.1.1 Declaring Arrays

To declare an array, you specify the data type of its elements, the array name, and its size (the number of elements it can hold).

Syntax:

dataType arrayName[arraySize];
  • dataType: The type of data the array will store (e.g., int, float, char).
  • arrayName: The name of the array.
  • arraySize: An integer constant expression that specifies the number of elements the array can hold.

Examples:

int scores[5];           // An array named 'scores' that can hold 5 integers
float temperatures[7];   // An array for 7 float temperatures
char name[20];           // An array of 20 chars: room for a 19-character string plus the null terminator

Variable Length Arrays (VLAs) (C99 feature): C99 introduced arrays whose size is determined at runtime. Such an array must be a local (automatic) variable; it cannot be static or extern.

#include <stdio.h>

int main() {
    int size;
    printf("Enter array size: ");
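    if (scanf("%d", &size) != 1 || size <= 0) { // Reject non-numeric or non-positive input
        printf("Invalid size.\n");
        return 1;
    }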

    int dynamic_array[size]; // VLA: size determined at runtime (C99)
    printf("Dynamic array of size %d created.\n", size);

    // VLAs became an optional feature in C11 and remain optional in C23.
    // Dynamic memory allocation with malloc is generally preferred for runtime-sized arrays.
    return 0;
}

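Note: VLAs are convenient, but their use is often discouraged in modern C. A VLA is typically allocated on the stack, so a large or unchecked size can overflow it, and the program has no way to detect that failure. Dynamic memory allocation using malloc (which we’ll cover in the next chapter) reports failure by returning NULL and gives you control over how long the memory lives.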

6.1.2 Initializing Arrays

You can initialize an array when you declare it.

Method 1: Initialize all elements explicitly

int numbers[5] = {10, 20, 30, 40, 50}; // All 5 elements initialized

Method 2: Partial initialization

If you provide fewer initializers than the array size, the remaining elements are automatically initialized to 0 (for numerical types) or NULL (for pointers).

int data[5] = {10, 20}; // data[0]=10, data[1]=20, data[2]=0, data[3]=0, data[4]=0

Method 3: Let the compiler determine size

If you omit the size, the compiler will calculate it based on the number of initializers.

int grades[] = {85, 90, 78, 92}; // Compiler sets size to 4

6.1.3 Accessing Array Elements

Array elements are accessed using their index, which is an integer indicating the element’s position. C arrays are zero-indexed, meaning the first element is at index 0, the second at index 1, and so on. For an array of size N, the valid indices range from 0 to N-1.

Syntax:

arrayName[index]

Example:

#include <stdio.h>

int main() {
    int scores[] = {85, 90, 78, 92, 65}; // Size is 5 (indices 0 to 4)

    printf("Scores:\n");
    printf("First score (index 0): %d\n", scores[0]);   // Output: 85
    printf("Third score (index 2): %d\n", scores[2]);   // Output: 78
    printf("Last score (index 4): %d\n", scores[4]);    // Output: 65

    // Modifying an element
    scores[1] = 95; // Change the second score from 90 to 95
    printf("Modified second score: %d\n", scores[1]); // Output: 95

    // Iterating through an array using a for loop
    printf("\nAll scores:\n");
    for (int i = 0; i < 5; i++) { // Loop from index 0 to 4
        printf("scores[%d] = %d\n", i, scores[i]);
    }

    // Accessing out of bounds (DANGER!)
    // scores[5] = 100; // This is out of bounds and leads to UNDEFINED BEHAVIOR
    //                  // The compiler might not warn you, but it's a serious bug.

    return 0;
}

Important: C does not perform bounds checking on array access. Accessing an array with an index outside its valid range (e.g., scores[5] for a 5-element array) is undefined behavior: the program may read or overwrite memory that doesn’t belong to the array, crash, or appear to work correctly. Out-of-bounds writes (buffer overflows) are a common source of bugs and security vulnerabilities. It’s your responsibility as a C programmer to ensure array accesses are within bounds.

6.2 Arrays and Pointers: An Intimate Relationship

In C, arrays and pointers are very closely related, although they are not the same thing. In most expressions, an array’s name “decays”, that is, it is implicitly converted into a pointer to its first element.

  • arrayName (without brackets) evaluates to the address of its first element (&arrayName[0]), except when it is the operand of sizeof or the & operator.
  • Therefore, you can assign an array name to a pointer to the array’s element type.

Example:

#include <stdio.h>

int main() {
    int numbers[] = {10, 20, 30, 40, 50};
    int *ptr_numbers;

    ptr_numbers = numbers; // ptr_numbers now points to numbers[0]
                           // Equivalent to: ptr_numbers = &numbers[0];

    printf("Address of numbers[0]: %p\n", (void *)&numbers[0]); // %p expects a void *
    printf("Value of ptr_numbers (address of first element): %p\n", (void *)ptr_numbers);
    printf("Value at *ptr_numbers: %d\n", *ptr_numbers); // Dereferences to numbers[0] (10)

    // Accessing array elements using pointer arithmetic
    printf("numbers[0] via pointer arithmetic: %d\n", *(ptr_numbers + 0)); // Same as *ptr_numbers
    printf("numbers[1] via pointer arithmetic: %d\n", *(ptr_numbers + 1));
    printf("numbers[2] via pointer arithmetic: %d\n", *(ptr_numbers + 2));

    // Accessing elements using array-style indexing on the pointer
    printf("ptr_numbers[3] = %d\n", ptr_numbers[3]); // This syntax also works!

    return 0;
}

This close relationship means you can use pointer arithmetic (*(ptr + i)) interchangeably with array indexing (array[i]). In fact, C defines array[i] to mean *(array + i), so the two forms always refer to the same element.

When an array is passed to a function, it is always passed as a pointer to its first element. The function does not receive a copy of the entire array.

#include <stdio.h>

// Function that takes an array (actually a pointer to its first element)
void print_array(int arr[], int size) { // 'arr[]' is syntactic sugar for 'int* arr'
    printf("Elements of the array:\n");
    for (int i = 0; i < size; i++) {
        printf("%d ", arr[i]);
    }
    printf("\n");
}

// Function that takes a pointer directly
void print_array_ptr(int *ptr_arr, int size) {
    printf("Elements of the array (via pointer):\n");
    for (int i = 0; i < size; i++) {
        printf("%d ", *(ptr_arr + i)); // Using pointer arithmetic
    }
    printf("\n");
}


int main() {
    int my_data[] = {100, 200, 300, 400, 500};
    int num_elements = sizeof(my_data) / sizeof(my_data[0]); // Calculate array size

    print_array(my_data, num_elements);
    print_array_ptr(my_data, num_elements);

    return 0;
}

6.3 Multi-dimensional Arrays

Arrays can have more than one dimension, allowing you to represent tables or matrices. The most common is a two-dimensional array, often called an array of arrays.

Syntax (2D Array):

dataType arrayName[rows][columns];

Example:

#include <stdio.h>

int main() {
    // A 2x3 integer array (2 rows, 3 columns)
    int matrix[2][3] = {
        {1, 2, 3}, // Row 0
        {4, 5, 6}  // Row 1
    };

    printf("Matrix elements:\n");
    for (int i = 0; i < 2; i++) { // Loop through rows
        for (int j = 0; j < 3; j++) { // Loop through columns
            printf("%d ", matrix[i][j]);
        }
        printf("\n");
    }

    // Accessing an element
    printf("Element at matrix[0][1]: %d\n", matrix[0][1]); // Output: 2

    // Modifying an element
    matrix[1][0] = 7;
    printf("Modified element at matrix[1][0]: %d\n", matrix[1][0]); // Output: 7

    return 0;
}

Multi-dimensional arrays are stored in row-major order in memory (all elements of the first row, then all elements of the second row, and so on). In most expressions, the name matrix (without indices) decays to a pointer to its first row, which is itself an array of 3 ints. In other words, matrix evaluates to &matrix[0], whose type is int (*)[3], and matrix + 1 evaluates to &matrix[1].

6.4 Strings in C

In C, a string is fundamentally a sequence of characters stored in a char array, terminated by a special null character (\0). The null character marks the end of the string.

6.4.1 Declaring and Initializing Strings

There are several ways to declare and initialize strings:

Method 1: Character array with explicit null terminator

char my_string[6] = {'H', 'e', 'l', 'l', 'o', '\0'};

This is cumbersome.

Method 2: Character array with string literal

char greeting[6] = "Hello"; // Compiler automatically adds '\0'
char another_greeting[] = "World"; // Compiler determines size (6 bytes for "World" + '\0')

Important: Ensure the array size is large enough to hold all characters plus the null terminator. “Hello” has 5 characters, so greeting needs at least a size of 6.

Method 3: Character pointer (string literal)

char *name = "C Programming"; // 'name' points to a string literal

Here, "C Programming" is a string literal, which is typically stored in read-only memory. name is a pointer that holds the address of the first character of this literal. You must not modify the characters of a string literal. Attempting to do so leads to undefined behavior (often a crash). This is a common point of confusion for beginners. Declaring the pointer as const char * lets the compiler reject such attempts.

If you need a modifiable string, use a char array:

char modifiable_name[20] = "John Doe"; // Modifiable
// modifiable_name[0] = 'j'; // This is allowed

6.4.2 Common String Functions (<string.h>)

C provides a standard header, <string.h>, with functions to manipulate strings.

Function | Description | Example
strlen(const char* s) | Returns the length of the string (excluding \0). | strlen("hello") returns 5.
strcpy(char* dest, const char* src) | Copies src string to dest. Unsafe if dest is too small. | char buf[10]; strcpy(buf, "hi");
strncpy(char* dest, const char* src, size_t n) | Copies at most n characters from src to dest. Does NOT guarantee null termination. | char buf[10]; strncpy(buf, "long_string", sizeof(buf) - 1); buf[sizeof(buf) - 1] = '\0';
strcat(char* dest, const char* src) | Concatenates (appends) src to dest. Unsafe if dest is too small. | char s1[20]="Hello"; strcat(s1, " World");
strncat(char* dest, const char* src, size_t n) | Appends at most n characters from src to dest. Guarantees null termination. | char s1[20]="Hello"; strncat(s1, " World", 5);
strcmp(const char* s1, const char* s2) | Compares s1 and s2. Returns 0 if equal, <0 if s1 < s2, >0 if s1 > s2. | strcmp("apple", "banana") returns <0.
strncmp(const char* s1, const char* s2, size_t n) | Compares at most n characters of s1 and s2. | strncmp("apple", "apricot", 2) returns 0.
sprintf(char* buffer, const char* format, ...) | Prints formatted output to a string buffer. Declared in <stdio.h>, not <string.h>. Unsafe if buffer is too small. | char buf[50]; sprintf(buf, "Age: %d", 30);

Security Note: strcpy and strcat are inherently unsafe if the destination buffer is not large enough, leading to buffer overflows. strncpy and strncat accept a length limit, but remember that strncpy may leave the destination without a null terminator, so add one yourself. Where available, prefer snprintf or the non-standard strlcpy/strlcat (provided on some systems), and in every case make sure the destination buffer is large enough.

Code Example: Arrays and Strings

#include <stdio.h>
#include <string.h> // For string functions like strlen, strcpy, strcmp

int main() {
    printf("--- Array Example ---\n");
    int numbers[5] = {1, 2, 3, 4, 5};
    int i;

    printf("Original numbers: ");
    for (i = 0; i < 5; i++) {
        printf("%d ", numbers[i]);
    }
    printf("\n");

    // Modify array elements
    numbers[2] = 10;
    printf("Numbers after modification: ");
    for (i = 0; i < 5; i++) {
        printf("%d ", numbers[i]);
    }
    printf("\n");

    // Array name as pointer
    int *ptr_nums = numbers;
    printf("Using pointer to access numbers[0]: %d\n", *ptr_nums);
    printf("Using pointer to access numbers[3]: %d\n", *(ptr_nums + 3));

    printf("\n--- Multi-dimensional Array Example ---\n");
    char board[3][3] = {
        {'X', 'O', 'X'},
        {'O', 'X', 'O'},
        {'X', 'O', 'X'}
    };

    printf("Tic-Tac-Toe Board:\n");
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++) {
            printf("%c ", board[r][c]);
        }
        printf("\n");
    }

    printf("\n--- String Examples ---\n");

    // String declared as a character array (modifiable)
    char city[20] = "New York";
    printf("Original city: %s\n", city); // %s for strings
    printf("Length of city: %zu\n", strlen(city));

    // Copying a string
    char destination[20];
    strcpy(destination, city); // Unsafe for varying string lengths
    printf("Copied city: %s\n", destination);

    // Concatenating strings
    char greeting[30] = "Hello";
    strcat(greeting, ", World!"); // Unsafe
    printf("Concatenated string: %s\n", greeting);

    // Safer concatenation using strncat (ensure destination buffer has space for null terminator)
    char buffer[50] = "Safe ";
    char source[] = "concatenation example.";
    // buffer size is 50, 'Safe ' is 5 chars + 1 null = 6. Remaining space = 44.
    // We want to add 'source' (22 chars + 1 null = 23).
    // Max characters to append should be (total_buffer_size - strlen(current_content) - 1 for null)
    strncat(buffer, source, sizeof(buffer) - strlen(buffer) - 1);
    printf("Safer concat: %s\n", buffer);

    // Comparing strings
    char s1[] = "apple";
    char s2[] = "banana";
    char s3[] = "apple";

    if (strcmp(s1, s2) < 0) {
        printf("'%s' comes before '%s'\n", s1, s2);
    } else if (strcmp(s1, s2) > 0) {
        printf("'%s' comes after '%s'\n", s1, s2);
    } else {
        printf("'%s' is equal to '%s'\n", s1, s2);
    }

    if (strcmp(s1, s3) == 0) {
        printf("'%s' is equal to '%s'\n", s1, s3);
    }

    // String literal (read-only)
    const char *course = "C Programming Course"; // Use const char* for string literals
    printf("Course: %s\n", course);
    // course[0] = 'X'; // ERROR: Attempt to modify a string literal

    // Using snprintf for safe string formatting (C99 and later)
    char info_buffer[100];
    int age = 30;
    double height = 1.75;
    snprintf(info_buffer, sizeof(info_buffer), "City: %s, Age: %d, Height: %.2f", city, age, height);
    printf("Formatted info: %s\n", info_buffer);


    return 0;
}

Compile and Run:

gcc arrays_strings.c -o arrays_strings
./arrays_strings

Exercise 6.1: Array Manipulation

Write a C program that:

  1. Declares an integer array numbers of size 10.
  2. Initializes the array with values from 1 to 10.
  3. Prints all elements of the array.
  4. Calculates and prints the sum of all elements in the array.
  5. Finds and prints the maximum value in the array.
  6. Reverses the order of elements in the array (e.g., 1 2 3 4 5 becomes 5 4 3 2 1) and prints the reversed array. You might need a temporary variable for swapping.

Exercise 6.2: Palindrome Checker (Mini-Challenge)

Write a C program that prompts the user to enter a word or phrase (without spaces for simplicity) and checks if it’s a palindrome. A palindrome reads the same forward and backward (e.g., “madam”, “racecar”).

Instructions:

  1. Use a char array to store the input string (e.g., char word[100];).
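  2. Read the input using scanf("%99s", word);. The width 99 stops scanf from writing past the end of the 100-element array and leaves room for the null terminator.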
  3. Calculate the length of the string using strlen().
  4. Use a loop and two index variables (one starting from the beginning, one from the end) to compare characters.
  5. Print whether the word is a palindrome or not.

Example Input/Output:

Enter a word: level
'level' is a palindrome.

Enter a word: hello
'hello' is not a palindrome.

You now have a solid understanding of how to work with collections of data using arrays and the special handling required for strings in C. The intimate relationship between arrays and pointers is a cornerstone of C programming. In the next chapter, we’ll delve deeper into memory by exploring dynamic memory allocation, allowing your programs to request and release memory during runtime.