Chapter 8: Structures, Unions, and Enums: Custom Data Types

So far, we’ve worked with basic data types such as int, float, and char, along with arrays of them. But real-world data is often more complex and needs a way to group different kinds of information together. For instance, a Student might have a name (string), an ID (integer), and a GPA (float).

C provides tools to define your own custom data types:

  • Structures (struct): Allow you to group heterogeneous (different types) data items under a single name.
  • Unions (union): Similar to structures, but all members share the same memory location, allowing you to store different data types at different times in the same space.
  • Enumerations (enum): Provide a way to create named integer constants, improving code readability.

This chapter will teach you how to define, declare, and use these powerful constructs.

8.1 Structures (struct)

A struct is a user-defined data type that allows you to combine items of different data types under a single name. Think of it as a blueprint for creating objects that have a specific set of characteristics (members).

8.1.1 Defining a Structure

You define a structure using the struct keyword, followed by a tag name (optional, but good practice) and a list of member variables enclosed in curly braces.

Syntax:

struct struct_tag {
    dataType member1;
    dataType member2;
    // ...
}; // Don't forget the semicolon!

Example:

// Define a structure named 'Point'
struct Point {
    int x; // Member for x-coordinate
    int y; // Member for y-coordinate
};

// Define a structure named 'Student'
struct Student {
    int id;
    char name[50]; // Character array for name
    float gpa;
};

This definition does not allocate memory; it merely describes the template for a Point or Student variable.

8.1.2 Declaring Structure Variables

Once a structure is defined, you can declare variables of that structure type.

Syntax:

struct struct_tag variable_name;

Example:

#include <stdio.h>
#include <string.h> // For strcpy

// Structure definitions (typically at global scope or in a header file)
struct Point {
    int x;
    int y;
};

struct Student {
    int id;
    char name[50];
    float gpa;
};

int main() {
    // Declare variables of type struct Point
    struct Point p1;
    struct Point p2;

    // Declare variables of type struct Student
    struct Student student1;
    struct Student student2;

    return 0;
}

8.1.3 Accessing Structure Members

You access individual members of a structure variable using the dot operator (.).

Syntax:

variable_name.member_name

Example (continuing from above):

// ... inside main function ...

    // Access and assign values to members of p1
    p1.x = 10;
    p1.y = 20;

    // Access and print members of p1
    printf("Point p1: (%d, %d)\n", p1.x, p1.y); // Output: Point p1: (10, 20)

    // Assign values to members of student1
    student1.id = 101;
    strcpy(student1.name, "Alice Wonderland"); // Arrays cannot be assigned with '='; copy the string with strcpy
    student1.gpa = 3.85f;

    // Access and print members of student1
    printf("Student 1: ID %d, Name: %s, GPA: %.2f\n",
           student1.id, student1.name, student1.gpa);
    // Output: Student 1: ID 101, Name: Alice Wonderland, GPA: 3.85

8.1.4 Initializing Structures

You can initialize structure variables at the time of declaration using curly braces, similar to arrays.

struct Point p3 = {30, 40}; // Order matters for initialization

struct Student student3 = {102, "Bob The Builder", 3.10f};

// Designated initializers (C99 and later)
struct Point p4 = {.y = 50, .x = 60}; // Order doesn't matter, more readable

8.1.5 Structures and Pointers (-> operator)

When you have a pointer to a structure, you cannot use the dot operator directly. Instead, you use the arrow operator (->) to access its members.

Syntax:

pointer_to_struct->member_name

This is equivalent to (*pointer_to_struct).member_name.

Example:

#include <stdio.h>
#include <string.h>
#include <stdlib.h> // For malloc, free

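struct Point {
    int x;
    int y;
};

// Needed for the dynamic allocation part of the example
struct Student {
    int id;
    char name[50];
    float gpa;
};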

int main() {
    struct Point origin = {0, 0};
    struct Point *ptr_to_origin; // Declare a pointer to a Point struct

    ptr_to_origin = &origin; // Make pointer point to 'origin'

    // Access members using the arrow operator
    printf("Origin via pointer: (%d, %d)\n", ptr_to_origin->x, ptr_to_origin->y); // Output: (0, 0)

    // Modify members via pointer
    ptr_to_origin->x = 5;
    ptr_to_origin->y = 5;
    printf("Modified origin via pointer: (%d, %d)\n", origin.x, origin.y); // Output: (5, 5)

    // Dynamic allocation of a struct
    struct Student *new_student = (struct Student *) malloc(sizeof(struct Student));
    if (new_student == NULL) { /* handle error */ return 1; }

    new_student->id = 201;
    strcpy(new_student->name, "Charlie Chaplin");
    new_student->gpa = 3.99f;

    printf("Dynamically allocated student: ID %d, Name: %s, GPA: %.2f\n",
           new_student->id, new_student->name, new_student->gpa);

    free(new_student);
    new_student = NULL;

    return 0;
}

8.1.6 typedef for Structures

Typing struct struct_tag repeatedly can be tedious. The typedef keyword allows you to create an alias (a new name) for an existing data type. It’s very commonly used with structures to make code cleaner.

Syntax:

typedef existing_type new_type_name;

Example:

// Method 1: Define struct then typedef
struct Person_struct {
    char name[50];
    int age;
};
typedef struct Person_struct Person; // Now you can just use 'Person' instead of 'struct Person_struct'

// Method 2: Combine definition and typedef
typedef struct {
    float length;
    float width;
} Rectangle; // 'Rectangle' is now a type name

int main() {
    Person p;
    p.age = 30; // No need for 'struct Person_struct p;'

    Rectangle r;
    r.length = 10.0f; // The struct has no tag, so 'Rectangle' is its only name

    return 0;
}

Throughout the rest of this guide, we’ll generally use typedef for structures for brevity and readability.

8.1.7 Nested Structures

Structures can contain other structures as members.

Example:

#include <stdio.h>
#include <string.h>

typedef struct {
    int day;
    int month;
    int year;
} Date;

typedef struct {
    char street[100];
    char city[50];
    char zipcode[10];
} Address;

typedef struct {
    char name[100];
    Date dob;       // Nested Date structure
    Address home_address; // Nested Address structure
} Employee;

int main() {
    Employee emp1;
    strcpy(emp1.name, "Dr. Alice Smith");
    emp1.dob.day = 15;
    emp1.dob.month = 6;
    emp1.dob.year = 1990;
    strcpy(emp1.home_address.street, "123 Main St");
    strcpy(emp1.home_address.city, "Anytown");
    strcpy(emp1.home_address.zipcode, "12345");

    printf("Employee Name: %s\n", emp1.name);
    printf("DOB: %d/%d/%d\n", emp1.dob.day, emp1.dob.month, emp1.dob.year);
    printf("Address: %s, %s %s\n",
           emp1.home_address.street, emp1.home_address.city, emp1.home_address.zipcode);

    return 0;
}

8.2 Unions (union)

A union is a special data type whose members all occupy the same memory location. This means a union can hold only one of its members’ values at any given time. A union is at least as large as its largest member; the compiler may add padding so that the union satisfies the alignment requirements of all its members.

Unions are useful for:

  • Memory optimization: When you know only one piece of data will be active at a time.
  • Interpreting the same memory in different ways: Treating the same set of bytes as an int at one point and a float at another.

8.2.1 Defining and Declaring Unions

The syntax is similar to structures, just using the union keyword.

Syntax:

union union_tag {
    dataType member1;
    dataType member2;
    // ...
}; // Don't forget the semicolon!

Example:

#include <stdio.h>

union Data {
    int i;
    float f;
    char str[20];
};

int main() {
    union Data data;

    printf("Size of union Data: %zu bytes\n", sizeof(data)); // At least sizeof(data.str); typically 20

    data.i = 10;
    printf("data.i: %d\n", data.i); // Output: 10
    // At this point, data.f and data.str contain garbage because data.i overwrote them.

    data.f = 220.5f;
    printf("data.f: %.1f\n", data.f); // Output: 220.5
    // Now data.i and data.str are overwritten.

    snprintf(data.str, sizeof(data.str), "C Programming"); // snprintf never writes past the end of data.str
    printf("data.str: %s\n", data.str); // Output: C Programming
    // Now data.i and data.f are overwritten.

    // CAUTION: Reading a member other than the one last written does not return that member's
    // old value. C reinterprets whatever bytes are stored now (here, the string's characters).
    printf("Trying to read data.i AFTER data.str was set: %d\n", data.i); // Meaningless value
    printf("Trying to read data.f AFTER data.str was set: %.1f\n", data.f); // Meaningless value

    return 0;
}

The values printed for data.i and data.f after data.str is set are meaningless: they are the string's bytes reinterpreted as an int and as a float. This demonstrates that only one member is “active” at a time.

8.2.2 Common Use Case: Tagged Union

Unions are often used in conjunction with a structure that contains an enum (a “tag”) to indicate which member of the union is currently active. This is safer and makes code more robust.

Example:

#include <stdio.h>
#include <string.h>
#include <stdlib.h> // For malloc, free

// Define an enumeration for the type of data stored
typedef enum {
    INT_TYPE,
    FLOAT_TYPE,
    STRING_TYPE
} DataType;

// Define a union to hold different types of values
typedef union {
    int i_val;
    float f_val;
    char s_val[50];
} Value;

// Define a structure to combine the type tag and the union
typedef struct {
    DataType type;
    Value data; // The union member
} GenericItem;

// Function to print a GenericItem safely
void print_item(const GenericItem *item) {
    switch (item->type) {
        case INT_TYPE:
            printf("Integer: %d\n", item->data.i_val);
            break;
        case FLOAT_TYPE:
            printf("Float: %.2f\n", item->data.f_val);
            break;
        case STRING_TYPE:
            printf("String: %s\n", item->data.s_val);
            break;
        default:
            printf("Unknown type.\n");
            break;
    }
}

int main() {
    GenericItem item1;
    item1.type = INT_TYPE;
    item1.data.i_val = 123;
    print_item(&item1); // Output: Integer: 123

    GenericItem item2;
    item2.type = FLOAT_TYPE;
    item2.data.f_val = 45.67f;
    print_item(&item2); // Output: Float: 45.67

    GenericItem item3;
    item3.type = STRING_TYPE;
    strcpy(item3.data.s_val, "Hello from union!");
    print_item(&item3); // Output: String: Hello from union!

    // Example of allocating a GenericItem dynamically
    GenericItem *dyn_item = (GenericItem *) malloc(sizeof(GenericItem));
    if (dyn_item == NULL) { /* handle error */ return 1; }

    dyn_item->type = STRING_TYPE;
    strcpy(dyn_item->data.s_val, "Dynamic Union");
    print_item(dyn_item);

    free(dyn_item);
    dyn_item = NULL;

    return 0;
}

8.3 Enumerations (enum)

An enum (enumeration) is a user-defined data type that consists of a set of named integer constants. It makes your code more readable and maintainable by replacing “magic numbers” with meaningful names.

8.3.1 Defining and Declaring Enums

Syntax:

enum enum_tag {
    CONSTANT1,
    CONSTANT2 = value, // You can assign explicit values
    CONSTANT3,
    // ...
}; // Don't forget the semicolon!

  • By default, the first constant is assigned 0, the second 1, and so on.
  • If you assign a value to a constant, each following constant without an explicit value is one more than the constant before it.

Example:

#include <stdio.h>

// Default values: RED=0, GREEN=1, BLUE=2
enum Color {
    RED,
    GREEN,
    BLUE
};

// Explicit values: SUNDAY=1, MONDAY=2, TUESDAY=3 ...
enum Day {
    SUNDAY = 1,
    MONDAY, // MONDAY will be 2
    TUESDAY,
    WEDNESDAY,
    THURSDAY,
    FRIDAY,
    SATURDAY
};

// Values that jump: APPLE=10, ORANGE=20, GRAPE=21
enum Fruit {
    APPLE = 10,
    ORANGE = 20,
    GRAPE
};

int main() {
    enum Color chosen_color = GREEN;
    enum Day today = WEDNESDAY;

    printf("Color RED has value: %d\n", RED); // Output: 0
    printf("Chosen color (GREEN) has value: %d\n", chosen_color); // Output: 1

    printf("Day SUNDAY has value: %d\n", SUNDAY); // Output: 1
    printf("Today (WEDNESDAY) has value: %d\n", today); // Output: 4

    printf("Fruit GRAPE has value: %d\n", GRAPE); // Output: 21

    // You can use enums in switch statements for better readability
    switch (chosen_color) {
        case RED:   printf("It's red.\n"); break;
        case GREEN: printf("It's green.\n"); break;
        case BLUE:  printf("It's blue.\n"); break;
        default:    printf("Unknown color.\n"); break;
    }

    return 0;
}

8.3.2 typedef for Enums

Similar to structures, you can use typedef with enums to create cleaner code.

typedef enum {
    SMALL,
    MEDIUM,
    LARGE
} Size; // Now you can just use 'Size'

int main() {
    Size product_size = MEDIUM;
    printf("Product size: %d\n", product_size); // Output: 1
    return 0;
}

8.4 Structure Padding and Alignment (Brief Overview)

When you define a structure, you might expect its total size to be the sum of the sizes of its individual members. However, due to memory alignment rules, the compiler might insert padding (empty bytes) between members or at the end of the structure.

  • Alignment: Processors often access memory more efficiently if data is aligned on certain address boundaries (e.g., an int might need to start at an address that’s a multiple of 4).
  • Padding: The compiler adds unused bytes so that each member is aligned correctly and so that every element in an array of structures is aligned as well.

This means sizeof(struct MyStruct) might be greater than the sum of the sizes of its members.

Example:

#include <stdio.h>

struct Example1 {
    char c;    // 1 byte
    int i;     // 4 bytes
    char c2;   // 1 byte
}; // Expected: 1+4+1=6. Actual might be 12.

struct Example2 {
    char c;    // 1 byte
    char c2;   // 1 byte
    int i;     // 4 bytes
}; // Expected: 1+1+4=6. Actual might be 8.

int main() {
    printf("Size of Example1: %zu bytes\n", sizeof(struct Example1));
    printf("Size of Example2: %zu bytes\n", sizeof(struct Example2));
    return 0;
}

Output (example on a typical 64-bit system):

Size of Example1: 12 bytes
Size of Example2: 8 bytes

Notice how changing the order of members can affect the total size due to padding. We’ll delve into this more deeply in an advanced topics chapter. For now, be aware that sizeof a struct can be surprising.

Exercise 8.1: Student Information System

Define a structure Student that stores the following information:

  • student_id (integer)
  • name (character array, size 50)
  • age (integer)
  • gpa (float)
  • enrollment_date (a nested Date structure, as defined in the chapter example)

Then, in your main function:

  1. Declare two Student variables.
  2. Initialize their data (one using direct member access, one using designated initializers if you prefer).
  3. Declare a pointer to a Student and dynamically allocate memory for one Student using malloc.
  4. Initialize the dynamically allocated student’s data using the arrow operator (->).
  5. Print the details of all three students (two static, one dynamic).
  6. Free the dynamically allocated memory.

Exercise 8.2: Product Inventory (Mini-Challenge with Enum & Struct)

Imagine you’re building a simple inventory system.

  1. Define an enum called ProductCategory with members like ELECTRONICS, FOOD, CLOTHING, BOOKS.
  2. Define a struct called Product that contains:
    • product_id (integer)
    • name (char array, size 100)
    • price (double)
    • quantity_in_stock (integer)
    • category (of type ProductCategory)
  3. In your main function:
    • Declare two Product variables (e.g., laptop, apple).
    • Initialize their details, including assigning appropriate ProductCategory enum values.
    • Print out the details of both products, including their category.
    • Bonus: Write a function void display_product_info(const Product *p) that takes a pointer to a Product and prints its details neatly.

Example output:

Product ID: 1001
Name: Laptop
Price: $1200.50
Quantity: 15
Category: ELECTRONICS

Product ID: 2005
Name: Gala Apple
Price: $1.99
Quantity: 500
Category: FOOD

You’ve now expanded your C programming toolkit significantly by learning how to create custom data types with structures, unions, and enumerations. These constructs are fundamental for organizing complex data, optimizing memory usage, and writing more readable code. In the next chapter, we’ll shift our focus to interacting with files, allowing your programs to read from and write to external storage.