Data Structures - Linked Lists

What is a linked list data structure?
    - A programmatic method of organizing a collection or set of data objects
    - The data objects in the set are connected or "chained" together
    - Each data object contains a reference to the next data object in the chain
    - The list has a beginning and an ending point

 

In our discussion, the term “data objects” refers to instances of a data model, defined using structures or, as we’ll see later, classes (in C++), that are declared or allocated to store data used in a program. In the example below, 10 “Particle” data objects are allocated.

Why do we need linked lists?
How are they related to object-oriented (OO) programming?
    - Recall with OO, we're beginning to think about programs as containing:
            1)  a data model(s)
            2)  functions to operate on a data model(s)
    - We instantiate (declare) data objects according to our data models

Example

// Data Model to define our data objects
struct Particle
{
    int id;
    float pos[3];
};

int main()
{
    // Declare 10 data objects of type Particle
    Particle p[10];
    ....
}

    - Oftentimes, our program will require a collection or set of data objects
    - We might need to perform some operation on each of the objects in the set
    - We might need to organize (i.e. sort) the set according to some criteria
    - Oftentimes, the number of objects in a set is not known beforehand
    - A common example is reading multiple records (or lines) from a file
    - We don’t know ahead of time (when we write the program) how many lines are in the file
    - We can’t declare an array of objects because we don’t know what the size should be
    - Instead, we create (allocate) objects at run-time (as the program is running)
    - In this way, our collection of data objects grows dynamically
    - Whenever a new object is needed, one is created on demand by the program
 

Pseudo-code example

main()
{
    OpenFile();

    while( there is still a record in the file )
    {
        // Dynamic object creation  
        CreateDataObject();

        ReadRecord();

        FillDataObject();

        // Grow the list  
        AddDataObjectToList();
    }

    CloseFile();
}
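
The pseudo-code above can be sketched in C++ as follows. This is only a minimal sketch under some assumptions: the names Line, ReadAllLines, CountLines and FreeLines are made up for illustration, each line of input is treated as one record, and the input comes from any std::istream rather than a specific file. The next pointer it relies on is explained in the sections that follow.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical data model: one object per line of input
struct Line
{
    std::string text;   // contents of the line
    Line *next;         // pointer to the next object, NULL at the end
};

// Reads every line from 'in', creating one object per line on demand.
// Returns the first object in the list, or NULL if there were no lines.
Line *ReadAllLines( std::istream &in )
{
    Line *head = NULL;
    Line *tail = NULL;
    std::string text;

    while( std::getline( in, text ) )      // there is still a record
    {
        Line *obj = new Line;              // dynamic object creation
        obj->text = text;                  // fill data object
        obj->next = NULL;

        if( head == NULL )                 // grow the list
            head = obj;
        else
            tail->next = obj;
        tail = obj;
    }
    return head;
}

// Counts the objects in the list by following the next pointers
int CountLines( const Line *head )
{
    int count = 0;
    for( const Line *p = head; p != NULL; p = p->next )
        ++count;
    return count;
}

// Releases every object that ReadAllLines created
void FreeLines( Line *head )
{
    while( head != NULL )
    {
        Line *next = head->next;   // save the link before deleting
        delete head;
        head = next;
    }
}
```

Passing a std::istringstream (or a std::ifstream) to ReadAllLines builds a list whose length is decided only while the program runs; FreeLines should be called once the list is no longer needed.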

 

 

How do we implement a linked list?

-          Let’s use the following diagram to represent a data object allocated in memory

-          On the left is the value representing some memory address of the data object

-          Inside the memory block are the values for each variable in the data model

-          The address represents the memory location of the first variable in the model

 

-          For example, we can visualize a declared Particle data object as follows

struct Particle        // Data model definition
{
    int id;
    float pos[3];
};

int main()
{
    // Data object
    Particle p;

    p.id = 10;
    p.pos[0] = 0.0;
    p.pos[1] = 1.0;
    p.pos[2] = 2.0;
}
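
The claim that an object's address is the location of its first variable can be checked with a short sketch. The helper names StartsAtFirstMember and OffsetOfPos are hypothetical and exist only for this illustration; the exact offset of pos depends on the compiler and platform, so no particular value is assumed.

```cpp
#include <cstddef>

struct Particle        // Data model definition (same as above)
{
    int id;
    float pos[3];
};

// Returns true when the object's address equals the address of its first member
bool StartsAtFirstMember( const Particle &p )
{
    const void *objectAddr = &p;
    const void *firstAddr  = &p.id;
    return objectAddr == firstAddr;
}

// Byte offset of the pos member from the start of the object
std::size_t OffsetOfPos()
{
    return offsetof( Particle, pos );
}
```

For a simple structure like Particle, StartsAtFirstMember returns true, and pos sits at least sizeof(int) bytes past the start of the object.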

 

 

 

-          To create linked lists, we begin by adding a reference to another object to our data model

-          The reference is actually a pointer that stores the address of another object

-          The object being referred to is the “next” object in the list

-          Below we’ve added a pointer named next to our data model

-          We add storage to our diagram to hold the address of the next object

-          Notice also how we initialize the pointer to NULL (no memory address)

-          This is very important! - We don’t want it pointing to some unknown memory location

-          Segmentation faults are common when a program dereferences a pointer that holds an unknown address

 

#include <cstddef>   // defines NULL

struct Particle    // Data model definition
{
    int id;
    float pos[3];

    Particle *next;   // Pointer to next object
};

int main()
{
    // Data object
    Particle p;

    p.id = 10;
    p.pos[0] = 0.0;
    p.pos[1] = 1.0;
    p.pos[2] = 2.0;
    p.next = NULL;    // Not linked to any object yet
}
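
One practical benefit of initializing next to NULL is that code can test the pointer before following it. A minimal sketch, assuming a hypothetical helper NextId and an arbitrary return value of -1 to mean "no next object":

```cpp
#include <cstddef>

struct Particle
{
    int id;
    float pos[3];
    Particle *next;   // Pointer to next object
};

// Returns the id of the object after p, or -1 when there is no next object.
// The value -1 is an arbitrary "none" marker chosen for this sketch.
int NextId( const Particle &p )
{
    if( p.next == NULL )
        return -1;        // nothing to follow, so do not dereference
    return p.next->id;
}
```

If next had been left uninitialized, the test against NULL would be meaningless and p.next->id could touch an unknown memory location.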

 

 

 

 

-          In use, our next pointer will store the address of the next object

-          The next object in the list also contains a next pointer to refer to the next object

-          This process continues until the end of the list of objects is reached

-          In this way, a “chain” of objects is created where each object refers to the next 

 

 

-          Note how we assign the pointer in the last object in the list to NULL

-          This is the way we designate that it is the last object in the list

-          Let’s look at an example of how we can link these objects together

-          In this simple example, we have 3 objects we link together with the next pointer

 

#include <iostream>

using namespace std;

int main()
{
    struct Record
    {
        int id;
        Record *next;   // pointer to the next Record
    };

    Record n1, n2, n3;

    n1.id = 0;
    n1.next = &n2;

    n2.id = 1;
    n2.next = &n3;

    n3.id = 2;
    n3.next = NULL;

    cout << "n1.next->id: " << n1.next->id << endl;
    cout << "n2.next->id: " << n2.next->id << endl;
    cout << "n1.next->next->id: " << n1.next->next->id << endl;
}

Output
n1.next->id: 1
n2.next->id: 2
n1.next->next->id: 2
 

Notes
    - The "data model" is implemented as a C data structure
    - We use a structure pointer as our reference to point to the next element in the set
    - We declare 3 data objects of this data model type
    - We link the objects together using the next pointer member in the structure
    - . and -> have the same precedence and associate left to right, so n1.next->next->id is evaluated as ((n1.next)->next)->id
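
The notes above can be illustrated with a short sketch that follows the chain in a loop instead of spelling out each link. SumIds and ThirdId are hypothetical helper names used only here; Record is the same data model as in the example.

```cpp
#include <cstddef>

struct Record
{
    int id;
    Record *next;   // pointer to the next Record
};

// Adds up the id of every Record reachable from 'first'
int SumIds( const Record *first )
{
    int sum = 0;
    for( const Record *r = first; r != NULL; r = r->next )
        sum += r->id;
    return sum;
}

// Same as n.next->next->id, with the left-to-right grouping made explicit
int ThirdId( const Record &n )
{
    return ( ( n.next )->next )->id;
}
```

With the three objects linked as in the example (ids 0, 1 and 2), SumIds(&n1) visits all three objects and ThirdId(n1) yields the same value as n1.next->next->id.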


How do we use a linked list?
    - The simple example above chains three objects that were declared in the source code, so their number was fixed at compile time

    - In other words, the number of objects was known at the time the program was written

    - In real applications, linked lists are used to chain an unknown number of objects

    - Unlike the example above, objects are allocated and chained dynamically at run time

    - To create and use a linked list effectively, we must be able to perform such tasks as

-          Maintaining a beginning and ending point for the list

-          Adding objects to the list

-          Looping through the list to perform some operation on each object

-          Removing objects from the list

-          Inserting objects into the list
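
The tasks listed above could be sketched as a handful of small functions. This is one possible arrangement, not the only one: the List structure and the function names Append, InsertAfter, Remove and Count are assumptions made for this illustration, and Record is reduced to an id and a next pointer.

```cpp
#include <cstddef>

struct Record
{
    int id;
    Record *next;
};

// Hypothetical helper type: keeps the beginning and ending points together
struct List
{
    Record *head;
    Record *tail;
};

// Adds r to the end of the list
void Append( List &list, Record *r )
{
    r->next = NULL;
    if( list.head == NULL )
        list.head = r;
    else
        list.tail->next = r;
    list.tail = r;
}

// Inserts r directly after prev, which must already be in the list
void InsertAfter( List &list, Record *prev, Record *r )
{
    r->next = prev->next;
    prev->next = r;
    if( list.tail == prev )
        list.tail = r;
}

// Unlinks r from the list; returns false if r was not found.
// The caller still owns r and decides whether to delete it.
bool Remove( List &list, Record *r )
{
    Record *prev = NULL;
    for( Record *cur = list.head; cur != NULL; prev = cur, cur = cur->next )
    {
        if( cur != r )
            continue;
        if( prev == NULL )
            list.head = cur->next;      // removing the first object
        else
            prev->next = cur->next;     // bypass the removed object
        if( list.tail == cur )
            list.tail = prev;           // removing the last object
        cur->next = NULL;
        return true;
    }
    return false;
}

// Looping through the list: counts the objects
int Count( const List &list )
{
    int n = 0;
    for( const Record *cur = list.head; cur != NULL; cur = cur->next )
        ++n;
    return n;
}
```

Note that inserting and removing only change a few next pointers; no objects need to be moved in memory, which is one reason linked lists suit collections that change size.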

 

    - Let’s begin to expand our pseudo-code example above to read lines (records) from a file

    - In this example, the number of lines in the file is unknown at the time of writing

    - Note how we are using our Data Modeling methods to design the program

    - Specifically, we design our data model and then operations on the data model

 

#include <string>

using namespace std;

// Data model definition
struct Record
{
    int id;             // stores Record id
    string name;        // name of object
    Record *next;       // points to next object
};

// Operations on the data model
Record *ReadRecord();
void PrintRecord(Record *);

 

int main()
{
    Record *head, *tail, *newp, *tmp;

    head = tail = newp = tmp = NULL;

    OpenFile();

    while( newp = ReadRecord() )
    {
        // Add object to the list
        if( head == NULL )
        {
            // Beginning of the list
            head = newp;

            // Current record
            tail = newp;
        }
        else
        {
            // Previous record reference to new record
            tail->next = newp;

            // Current record
            tail = newp;
        }
    }

    // End of the list (tail is still NULL if no records were read)
    if( tail != NULL )
        tail->next = NULL;

    // Loop through the list
    for( tmp=head; tmp!=NULL; tmp=tmp->next )
    {
        PrintRecord( tmp );
    }
}

Record *ReadRecord()
{
    Record *tmp = NULL;

    if( /* records exist */ )
    {
        // Dynamically create object
        tmp = new Record;

        // Store information in created object ...
        tmp->id = . . .
        tmp->name = . . .
    }

    // Return record
    return tmp;
}

void PrintRecord( Record *r )
{
    // Print record contents 
}

 

 

 

Linked List code analysis

    - Let’s examine the linked list code above a bit more closely

    - Pointers are created to store addresses of the objects created dynamically

    - head and tail will store object addresses at the beginning and end of the list

    - newp and tmp will be used as temporary storage for object addresses

    - All pointers are initialized to NULL

 

Record *head, *tail, *newp, *tmp;

head = tail = newp = tmp = NULL;

 

    - Next, we would open up a file to begin reading one line at a time

 

OpenFile();       // pseudo function

 

    - We then begin a loop to read a line until all lines have been read

 

while( newp = ReadRecord() )

 

 

    - The function ReadRecord reads a line from a file and returns an address

           

- If a line exists in the file, an object is created on the heap

- We use new to create an object of size Record on the memory heap

 

// Dynamically create object 
tmp = new Record;

 

- We can store information read from the file into our new object

- This step is marked with a comment in the code (completing it is part of the project assignment)

- We use the tmp pointer to access the object and store information (i.e. tmp->id = ...)

 

      // Store information in object (tmp->id = ...)

 

            - Finally, we return the address of this new object to the calling function

 

// Return record 
return tmp;

 

- If a line does not exist, we simply return NULL

- This test is also marked with a comment (completing it is part of the project assignment)

 

if( /* no more records exist */ )
{
    return NULL;
}
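
As an illustration of the steps just described, here is one way ReadRecord might look once the commented parts are filled in. It is a sketch under stated assumptions, not the intended assignment solution: the input format (one record per line, an integer id followed by a single-word name) is invented for this example, and the function takes a std::istream parameter, which differs from the parameterless declaration used in these notes.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

struct Record
{
    int id;             // stores Record id
    std::string name;   // name of object
    Record *next;       // points to next object
};

// Assumed input format: one record per line, "<id> <name>".
// Returns a new Record, or NULL when no further record can be read.
Record *ReadRecord( std::istream &in )
{
    int id;
    std::string name;

    if( !( in >> id >> name ) )
        return NULL;                // no more records exist

    Record *tmp = new Record;       // dynamically create object
    tmp->id = id;                   // store information in object
    tmp->name = name;
    tmp->next = NULL;               // not yet linked to anything
    return tmp;                     // return record
}
```

Each successful call returns the address of a freshly created object, and the first failed read returns NULL, which is the condition the while loop in main relies on.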

 

    - The address returned from ReadRecord is stored in the temporary pointer, newp

 

while( newp = ReadRecord() )

 

    - When newp is NULL, all lines in the file have been read and the loop ends

    - If it contains a valid address, we enter the while loop to chain the object pointed to by newp

    - If this address represents the first record, we assign it to our head pointer

    - We can tell it’s the first if the beginning of our list (head) is still NULL

    - The address is also assigned to the end of the list (tail) since it’s the only object

 

if( head == NULL )
{
    // Beginning of the list 
    head = newp;

           

    // Current record 
    tail = newp;
}

   

    - If this address does not represent the first record created (head is not NULL)

    - We assign it to the next pointer of the last object created

    - The address of the last object created is stored in the tail pointer

 

else
{
    tail->next = newp;

 

    - By assigning this address in this manner, we create a link in the chain of objects

    - We also update our tail pointer to point to this object to represent the last object

          
    tail = newp;
 }

 

    - This process of chaining newly created objects continues until ReadRecord returns NULL

    - We then terminate our linked list by assigning NULL to the next pointer of our last object

    - This is performed after our while loop is completed

 

// End of the list (tail is still NULL if no records were read)
if( tail != NULL )
    tail->next = NULL;

 

 

    - We now have a linked list of stored data objects

    - Pointers to the beginning and end of the list (head,tail) are used to access the list

    - We start with the object pointed to by head and follow the chain of pointers to the end

    - This is the process we use in the for loop below to access each of the objects

    - In this example, we are simply calling a function to print information in each object

    - Note the interesting use of pointers in the for loop to cycle through the objects

 

// Loop through the list  
for( tmp=head; tmp!=NULL; tmp=tmp->next )
{
    PrintRecord( tmp );
}

    - Note finally that access to an object is sequential, beginning with the first object

    - Reaching an object near the end of the list means following every next pointer before it, so access time grows with the length of the list

    - We’ll look later at data structures that provide different methods of storage and access
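
The cost of sequential access can be made visible with a small sketch that counts how many next pointers are followed to reach a given position. NthRecord and FreeList are hypothetical helpers, not part of the program above; FreeList is included because objects created with new are not released automatically.

```cpp
#include <cstddef>
#include <string>

struct Record
{
    int id;
    std::string name;
    Record *next;
};

// Returns the object at position 'index' (0 = head), or NULL if the list is shorter.
// 'index' is assumed to be non-negative.
// 'steps' receives the number of next pointers that were followed.
Record *NthRecord( Record *head, int index, int &steps )
{
    steps = 0;
    Record *cur = head;
    while( cur != NULL && steps < index )
    {
        cur = cur->next;
        ++steps;
    }
    return cur;
}

// Releases every object in the list; the objects must have been created with new
void FreeList( Record *head )
{
    while( head != NULL )
    {
        Record *next = head->next;   // save the link before the object is destroyed
        delete head;
        head = next;
    }
}
```

Reaching the head takes zero steps, while reaching the object at position k takes k steps, which is the behavior that motivates the alternative data structures mentioned above.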