OCR Computing A-Level Revision

Data Structures & Data Representation

Dynamic Data Structures (3.5.a, 3.5.b)

The size of a "dynamic" data structure changes as data is added and removed. However, it is more complex to program because the size is not fixed. All dynamic data structures can be stored within static data structures, as a hard disk (or other physical storage medium) is effectively a massive static data structure.

Arrays

Arrays are always static. However, it is possible to make a (static) array "appear" dynamic. Space is reserved for potential new items within the array. The more space that is reserved, the more new records can be added, but the less efficiently the data is stored.

Linked Lists

Every item in a linked list has a pointer. The pointer "points" to the next entry in the list. A pointer known as a "terminator" represents the end of the list. This means that the data does ` Qnot have to be stored in a "block", but can be scattered. It also allows the data (but not the structure) to be changed after the structure is created.

Stacks

A stack is similar to an array, but data is only added (or pushed) at one end. It is removed (or popped) from the same end of the structure. A stack is a "last in first out" structure. The top of the stack can also be viewed by peeking at it.

To add (push) data to a stack, the stack has to be checked to ensure it is not full. Similarly, when removing data, the stack has to be checked to ensure it is not empty.

Queues

A queue is similar to an array, but data is only added at one end (the tail). It is removed from the other end (the head). A queue is a "first in first out".

To add data to a queue (enqueue), the queue is first checked to confirm it is not full (if it is then an error is reported). Then, the tail pointer is pointed to the location of the new item. To remove data from a queue (dequeue), the queue is first checked to confirm it is not empty (if it is then an error is reported). Then, the data is removed, and the head pointer is moved to the next item.

Trees

Trees are incredibly powerful data structures. Trees consist of nodes, and each node can have up to two nodes coming off it. The "top" node is known as the root node. Nodes are connected using pointers.

A diagram of a tree data structure

When creating a tree structure, the root node is chosen. Data which comes before the root (in our chosen order) goes to the left, and data which is after the root goes to the right. This process is repeated until a "null" pointer is found - the new node is inserted here with "null" pointers.

The subtrees within a tree data structure

To travel through (traverse) a binary tree (in order), we visit the left subtree, and traverse that. We then visit the root, and then traverse the right subtree. This process is iterative.

Searching (3.5.c)

Serial searching is unavoidable when data is in a random order. The search starts at the first item and moves through the list until either the desired item is found or the end of the list is reached.

Binary Searching

Binary searching is substantially quicker (and exponentially quicker, the longer the list) than serial searching, but it relies on the list being in order.

In a binary search, the middle value is compared to the search value. If this value is the search string, then the search ends. Otherwise, the list is split into two sublists (a list of larger values and a list of smaller values). This process is iterative.

Merging Data Files (3.5.d)

To merge two (ordered) data files into one (ordered) data file:

This process is iterative.

Sorting (3.5.e)

Insertion Sort

The insertion sort "steps" through the list, inserting each item in term into the correct position (relative to the "sorted" items before it). The insertion sort is simple to implement, and generally quicker for small datasets than a quick sort.

Quick Sort

The quicksort selects an item at random (called the pivot). It then creates two new lists, one with all the items less than the pivot, and one with all the items greater than the pivot. This process is repeated until all the "sublists" have only one item.