Linked Lists

Python, like most modern languages, allows programmers to create their own structures for organizing data.  We have already touched on this when we discussed objects and classes.  Now we are going to look at a very popular and useful user-defined data structure: the linked list.   (It is important to understand that in this case "user-defined" refers to the programmer who is using the language, as opposed to the person who actually buys or otherwise obtains and uses the program after it is written.)

Of course Python has a built-in list structure which lets us add and remove elements as we choose.  In most languages the built-in data types do not have this flexibility: when a data structure is first used, the program must specify its size.  The data structure cannot be expanded later to accommodate more data.  Data structures that can grow and shrink are called dynamic.  Our goal with the linked list structure is to create a general-purpose dynamic data structure.

We will first describe the list as an abstract data type, then we will discuss how to implement it in Python. 

Definition: a linked list is a collection of data objects, such that either
    - the collection is empty
or
    - there is a designated first object
    - there is a designated last object (which can be the same as the first object)
    - each object has a designated successor object, except for the last object, which has as its successor None, which is a special object containing no data
    - each object exept the first object is the successor of exactly one object
    - the first object is the successor of no object

Accessing the objects in the list can only be done sequentially, by starting at the first object and following the successor connections.

The objects in the collection can be reorganized as long as these requirements are met after the reorganization.

New objects can be added to the collection.  A new object can be added to the beginning by making the current first the successor of the new object, then designating the new object as the first.  A new object can be added at the end by making it the successor of the current last, and designating the new object as last.  A new object can be added in the middle of the list by making it the successor of some object X, and making the  successor of the new object be the original successor of X.

Objects can be removed from the collection.  After each removal, first, last, and successors must be updated if needed.


We will discuss how to implement a linked list in Python using classes and objects.  However there is an alternative implementation: see if you can work out how to implement a linked list using a dictionary.

First we define a class to represent the objects that we will store in our list.  In doing so we will give each object the ability to point to its successor.  To explain how this works, consider the following Python fragment

class Thing(object):
    def __init(self,value=0):
          self.value = value

# end of definition for class Thing

# main program

some_thing = Thing(23)
print some_thing.value


When this executes, the line some_thing = Thing(23) causes quite a lot to happen.  First, a chunk of memory is found, and set up to look like a Thing.  Next, the value 23 is passed to the new Thing, which stores it in its own attribute called value.  Finally, the address in memory of the new Thing is stored in the variable some_thing.

The next line print some_thing.value works like this.  "something" is recognized as an address.  Python looks at that address and finds a Thing.   That Thing has an attribute called value so that is the value that gets printed.

The key point is that some_thing is just a variable, and we use it to store the address of an object that resides somehere in memeory.  Do you see where this is going?  When we create the objects for our linked list, each one can contain an attribute that can store the address of the current object's successor.

class Node(object):                  # the "things" that make up the linked list are traditionally named Nodes

    def __init__(self, value = 0, next = None):
       self.value = value
       self.next = next

# main program
node_a = Node(15)
node_b = Node(12)
node_c = Node(99)

node_a.next = node_b
node_b.next = node_c

Voila!  We have created a very rudimentary linked list: node_a points to a Node which contains an attribute called next, which points to another Node (the one that node_b points to)  which in turn contains a next attribute that points to the third Node

At this point we can explore some of the clever things we can do with these address variables.  For example,

print node_a.next.next.value

This is interpreted as "find the Node that node_a points to, follow its next to another Node, follow that Node's next to another Node, and print that Node's value attribute"

if node_a.next == node_b:
    print "they point to the same thing"
else:
    print "they point to different things"

The point of this example is to drive home the idea that node_a.next and node_b are both address variables (pointers).  When we compare them we are strictly comparing the addresses they refer to, not the contents of the Nodes that are at those addresses, nor the addresses of the two pointers themselves.  For example, consider this

node_d = Node(57)
node_e = Node(57)
if node_d == node_e:
    print "They are equal"
else:
    print "Not equal"

At this point you should not be surprised to see that the output is "Not equal" - node_d points to the first of the two new Nodes, and node_e points to the second.  Even though these Nodes contain the same value (57) they sit in different locations in memory, so node_d and node_e contain different addresses.

So we now know how to link Nodes together in a list.  To bring together all the bits and pieces that define a linked list we create another class (with a very imaginative name):

class Linked_List(object):

    def __init__(self):
          self.head = None
          self.tail = None
          self.length = 0


# We usually define our classes so that the mechanisms of the implemented operations are hidden -
# a programmer using this class does not need to know how  the linked list is set up - for example
# the programmer should not need to know that the values are stored in objects called Nodes.  All
# the programmer using the Linked_List class needs to know are the functions that are used to
# add and remove values, search the list, etc.  Let's look at a few of these operations.

    def prepend(self, value):   # add a new value at the beginning of the list
       new_node = Node(value)      # create a node containing the new value
       if self.length == 0:      # the list is empty
          self.head = new_node
          self.tail = new_node
       else:
          new_node.next = self.head
          self.head = new_node
       self.length += 1

    def append(self,value):   # add a new value at the end of the list
       new_node = Node(value)
       if self.length == 0:
          self.head = new_node
          self.tail = new_node
       else:
          self.tail.next = new_node
          self.tail = new_node
       self.length += 1

    def print_list(self):
       current = self.head
       while current != None:
          print current.value,
          current = current.next
       print
   


Please refer to the demo code for several other standard operations on linked lists, almost all of which we went through in class.  You should make sure you understand how they work, and be able to write functions to perform these operations when asked.