Wyo Java Ch. 18 Lecture Notes
Objective #1: Understand the basics of sorting.
- When you rearrange data and put it into a certain kind
of order, you are sorting the data. You can sort data alphabetically,
numerically, and in other ways. Often you need to sort data before you use
searching algorithms to find a particular piece of data.
- There are a number of different sorting algorithms that are widely used by programmers. Each algorithm has its own advantages and disadvantages. The sorting algorithms include the sequential sort, the insertion sort, the bubble sort, the Shell sort, the quick sort, and the merge sort.
- The key field is the field upon which the data is sorted.
- A key value is a specific value that is stored within the key field.
- The input size is the number of elements in a list that will eventually be sorted.
- There are two basic approaches to sorting data, the incremental approach and the divide and conquer approach. Using the incremental approach, one sorts the whole list at once using loops usually. The divide and conquer approach splits the list up into parts and sorts each part separately. Then this approach manages to join the sorted parts together into a large sorted list.
Objective #2: Understand the selection sort.
- The selection sort is an incremental one.
- Every key value is examined starting at the beginning of the list. Assuming that you are sorting the list in ascending order, you use a temporary variable to "remember" the position of the largest key value. By the time you have examined every key value, you swap the key value that was the largest with the last key value in the list. Next, you repeat the process again from the beginning of the list, however, you will not need to compare anything to the new last key value in the list since you know it is the largest.
- This algorithm uses nested loops and is easy to code. However, it is quite inefficient since it continues processing even if the list is already sorted. Whether the original data is close to being sorted or not, this algorithm takes quite awhile since a lot of loop iterations and comparisons must be made.
- The Big Oh time performance of a selection sort is O(n2) in
the worst case. Because there is no early exit, it is also O(n2) in
the best case.
- Characteristics
- two nested loops
- relatively efficient for small lists of less than 100 elements
- many comparisons
- only
1 exchange (swap) at the end of each pass
- non-adjacent elements are compared
- no boolean variable is used
to allow for an early exit
- a "temp variable" is used as
a marker
Objective #3: Understand the insertion sort.
- The insertion sort is incremental in
nature. Mr. Minich also calls this sort, a poker-hand sort.
- Two lists are used throughout this algorithm. One list is sorted and the other is unsorted. The sorted list begins with one key value. Successively, one key value from the unsorted list is placed into the proper order with the smaller, sorted list. As more key values are placed into the sorted list, that list becomes larger.
- This is similar to the way a person usually organizes a hand of playing cards.
- The insertion sort is relatively quick for small lists
that are close to being sorted.
- The Big Oh time performance of an insertion sort is O(n2) in
the worst case. Because there is an early exit, it is also O(n) in
the best case where the data is already in order.
- Characteristics
- two nested loops
- boolean variable allows inner loop
to exit early saving a bit of time here and there
- compares non-adjacent elements
- no exchanges (swaps), rather many elements must shift
- a lot of comparisons are made slowing down this algorithm
- very efficient when data is close to being in order
- very inefficient when data is really out of order
Objective #4: Understand the merge sort.
- The merge sort is a divide and conquer algorithm.
- The whole list is divided into lists that consist of one element a piece. Then, every two adjacent lists are merged into one larger sorted list. The process continues until one sorted
list remains.
- The Big Oh time performance of a Mergesort is between O(n log n) in the best case and the worst case.
- Characteristics:
- recursive, divide and conquer algorithm
- takes lots of space (i.e. memory) since each of the log n stack frames contain a separate array variable.
Objective #5: Understand the quicksort.
- This is an AB topic only so only students in the AP Data Structures course need to study this objective.
- The Quicksort is a divide and conquer
algorithm and is more efficient than incremental sorts. It can be difficult
to code though since it uses recursion or stacks.
- The original list is partitioned into two lists. One of the lists contains elements that are greater than the first original element. The second list contains elements that are less than or equal to the first original element. Then, each of these two partitions are partitioned using the same algorithm. Eventually, partitions will have only one element each. When this happens, that single-item partition is considered to be sorted and only partitions with multiple-items are partitioned. When every partition consists of only a one element, the partitions are placed into order to make a fully sorted list.
- The Quicksort is horribly slow when the original data is mostly in order.
The partitions are imbalanced in this case. It is paradoxically beneficial
to shuffle the data first before applying the quicksort especially if you
think the data may be partially sorted.
- The Big Oh time performance of a Quicksort
is between
O(n log n) in the best case and O(n2) in
the worst case.
- Characteristics:
- recursive, divide and conquer algorithm
- on average, this is the fastest sorting algorithm
Objective #6:
Trace and apply the sequential search algorithm.
- Although there are more efficient ways to search for data, a sequential search (aka linear search) is a good choice to use when the amount of data to be searched is small.
- You simply check each element of an array position by position until you find the one that you are looking for.
- In any search, the item upon which the search is based is called the key and the field being searched is called the key field.
- The sequential search algorithm is relatively easy to code.
- Unfortunately the time performance of the sequential search
algorithm is O(n) in its worst case since the target key may be found in
the "last" position
that is probed.
- A boolean variable can be used to exit early from an indeterminate while loop when the target key is found. Also, a boolean value can be returned from a sequential search method to indicate whether or not the target key was found or not. Or, the position within an array where the target is found could be returned. An object variable can be passed to a search method and assigned the target object if it is found.
Objective #7: Trace and apply the binary search algorithm.
Objective #8: Trace and apply a hash algorithm.
- This is an AB topic only
so only students in the AP Data Structures course need to study this
objective. There are more lecture notes about hashing in Ch. 20.
- A lookup table (also known as a key-to-address
transformation)
can be used to allow very fast searches. Data elements are placed into a
lookup table by converting some characteristic of each element into a specific
position of the lookup table. Often the lookup table is a one or two-dimensional
array.
- Sometimes, there is no good algorithm or function for mapping each different
data element to a different position of the array. That is, two or more elements
both map to the same position of the array. This problem is called a collision.
If that happens, then there is no easy way to guarantee that someone can
find
the
element
they
are searching
for. They may find one of the other elements that mapped to the same array
location.
- Hashing is a search algorithm that improves
upon lookup tables by finding a way to minimize and resolve collisions. A hash
function is used to convert a
data element to a hash value. The hash function is essentially
an assignment statement. For example, to hash a bunch of words into a hash
table, you could use a hash function that finds the sum of the Unicode values
for each letter in the word. The hash value is used to find a position for
the element in a hash
table.
Like a lookup table, a hash table is simply a data structure such as a one
or two-dimensional array. But it could also be
an array of
linked lists or a linked list of linked lists among other kinds of data structures
that we haven't studied. The key difference though between using hashing
and using a lookup table it that the hashing algorithm deals with collisions
that occur.
- One way to handle collisions is to use a hash table data structure
that allows two or more elements to both occupy the same position. By using
a two-dimensional array where the row represents the initial hash values,
you can place the original element into column 0 and the colliding element
into column 1. A third element that collides to the same hash table row
could be placed in column 3 and so one. This method for dealing with collisions
is called chaining. Each row number in the two-dimensional array is called
a bucket. Instead of a two-dimensional array, the hash table could be a
one-dimensional
array with a linked list or a binary search tree as a bucket attached to
each row of the array.
- Another way to handle handle collisions is called probing. When a second
element hashes to an already occupied hash table position, a probing function
is used to find a new slot in the hash table. A real simple probing function
would be to increment the position by one and place the second element into
the next position in the hash table array. If that position is also occupied
then the probing function could be applied again to send the element to the
next position that is hopefully empty. A poor probing function such as this
one (i.e. incrementing by one) leads to clustering with lots of data elements
are settling into the same area of a hash table. It's better to balance the
elements out throughout the table so quadratic probing is sometimes used.
Where the probing function is index + n ^ 2 for each successive n.
- A big disadvantage to lookup tables and hashing is the
large amount of memory that may be necessary. The size of the one or two-dimensional
array used as the lookup or hash table may have to be very large to accomodate
a poor hash function. If their is not a lot of data to put into the lookup
or hash table, then the table will be very sparse.
- The time performance for a hash is O(1) in the
best case and O(n) in its worst-case.
Objective #9: Apply Big-O notation to sorting & searching
algorithms.
- This is an AB topic only so only students
in the AP Data Structures course need to study this objective.
- Big-O notation is a method of classifying algorithms
with regard to the time and space that they require. The O refers to "on
the order of".
- Formal definition of big-O: Suppose there exists some function f(n) for all n > 0 such that the number of operations required by an algorithm for an input of size n is less than or equal to some constant C multiplied by f(n) for all but finitely many n. The algorithm is considered to be an O[f(n)] algorithm relative to the number of operations it requires to execute.
- This algorithm is classified as O[f(n)].
- There is some point of intersection between the graph of the algorithm and the function f(n) where the function "bounds" the algorithm. That is, the function will lie above the algorithm for any given point n.
- If n is really small, it does not really matter what algorithm is used to sort or search since the time difference will be negligible.
- Common big-O categories:
| O(1) |
constant |
push & pop stack operations; insert & remove queue operations |
| O(log n) |
logarithmic |
binary search; insert & find for binary search tree; insert & remove for a heap |
| O(n) |
linear |
traversal of a list or a tree; sequential search; finding max or min element of a list |
| O(n log n) |
"n log n" |
quicksort, mergesort |
| O(n2) |
quadratic |
selection sort; insertion sort; bubble sort |
| O(n3) |
cubic |
|
| O(an) |
exponential |
recursive Fibonacci; Towers of Hanoi |
The linear, quadratic, and cubic algorithms are also
called polynomial algorithms.
- Since it is true for any a, b > 0 and a, b <> 1:
log2 n = log10 n / log10 2
then
log10 n = C log2 n
where C is a constant. Therefore, O(log2 n) is really the same as O(log n).
- O(1) < O(log n) < O(n) < O(n log n) < O(n2) < O(n3) < O(an)
- Divide and conquer algorithms like the quicksort, mergesort, & the binary search are very efficient since they have logarithmic growth.
- If an algorithm is O(n2) then it is also O(na) where a > 2 since the na bounds n2.
- To determine an algorithms big-O category, it is most useful to analyze its loops.
- In analyzing the loops of an algorithm, you must come up with some function f(n) where C * f(n) parallels (or "bounds") the actual number of operations within the algorithm as closely as possible. The constant C is known as the constant of proportionality and ultimately makes no difference when comparing two functions with the same order of magnitude. The order of magnitude is the degree of the function (the largest exponent of n) or a logarithmic or exponent term.
- In any derived function f(n), one term will be the dominant
term. A dominating function then is a function f(n) with the dominant term of the original derived function. For example, if you come up with the function f(n) = n2 + n log 2 n
+ 5, the dominating function is n2 since n2 is the
dominant term in f(n).
Examples: n dominates log n, n log n dominates n, n3 dominates n2 and a2 dominates nm
- When an outer loop iterates from 1 to n-1 and a nested loop iterates from 0 to i-1 (where i is the loop variable in the outer loop), the inner loop will iterate a total of 1 + 2 + 3 + .... + (n - 1) times. This is equal to n(n - 1)/2 which is the number of terms multiplied by the average of the first and last terms. Since n2 ends up being the dominant term in the polynomial, this explains the O(n2) category of several of the algorithms we will be analyzing.
- See this big-O sorting algorithm summary
- Here is a graph of the common big-O categories courtesy
of http://www.cs.odu.edu/~toida/nerzic/content/function/growth.html. Note
that in order to "fit" all of the functions on the graph, the vertical
time axis is marked in a
way that each marking is double the previous marking.
Objective #10: Understand the bubble & Shell sorts.
- This topic is not covered on the AP exam and it may not be covered on regular tests. See the instructor with any questions.
- The bubble sort is an incremental sort which is usually faster than the insertion and selection sorts.
- A bubble sort works similarly to the release of CO2 in carbonated soda.
- Beginning at one end of a list, adjacent key values are compared. Assuming that you are sorting the list into ascending order, these two key values would be swapped if the first
was larger than the second. Next you compare the larger of the two to the next adjacent key value in the list, swapping if necessary. By the time that you compare the last two key values in the list,
you know that the largest key value from the whole list will be in the last position. Using a loop, this process is continued (comparing adjacent key values) from the beginning of the list. However,
you will not need to compare anything to the last key value in the list since you know it is the largest.
- A Boolean variable is used in a bubble sort to "remember" if any swaps are made on a particular sweep of the list. If a swap is made the variable might
be set to TRUE. At the beginning of each successive sweep, the variable is set to FALSE. When a complete sweep of the loop has been made without any swaps, the variable will still be FALSE.
- The use of the Boolean variable causes this sort to only sweep the list one extra time after it has been fully sorted. This makes the bubble sort more efficient than a number of
other incremental sorts.
- The Big Oh time performance of a bubble sort is O(n2) in the worst case. Because there is an early exit, it is also O(n) in the best case when the data is already in order.
- Characteristics:
- two nested loops
- adjacent elements are compared
- many exchanges (i.e. swaps)
- boolean variable allows outer loop to exit early possibly saving time
- very inefficient when data is really out of order
- very efficient when data is close to being in order due to the early exit
- The Shell sort is an incremental sort which was named after it's inventor, D. L. Shell. It is sometimes called the comb sort.
- This sorting algorithm is similar to the bubble sort but instead of comparing and swapping adjacent elements, elements that are a certain distance away from each other are compared
and swapped. The certain distance can be called the gap size and might initially be set to half the input size.
- After each sweep, the gap is decreased until eventually adjacent elements are compared. A Boolean variable is used as it is in the bubble sort to add efficiency. It works the fastest
when the successive gap sizes are relatively prime to each other.
- The Shell sort is much faster than other incremental sorts since very large and very small key values are quickly moved to the appropriate end of the list.
- The Big Oh time performance of a Shell sort is between O(n) and O(n2) but closer to O(n2) in the worst case. Because there is no early exit, it is also O(n(log
n)2) in the best case.
Objective #11: Understand the radix sort
- This topic is not covered on the AP exam and it may not be covered on regular tests. See the instructor with any questions.
- A radix sort is an extremely fast sort but it requires special conditions. It uses a lookup or a hash table (see notes below).
- The Big Oh time performance for a radix sort is O(n) which is much faster than the other sorting algorithms that we've studied but the radix sort uses lots and lots of memory and
it only works under certain conditions.
- One requirement for a radix sort is that the data elements must be able to broken down into parts and the range of those values has limits. For example, numbers could be sorted with
a radix sort since each number can be separated into one of ten digits (0 through 9). Words could also be sorted this way since each word could be separated into letters (a through z). But objects
such as Fish objects may not be able to be sorted in this way. You can compare the weights of two fish as numbers but there is probably no upper limit to the weight of a fish. So Fish objects would
have to be sorted with one of the algorithms above that use comparisons (i.e. if statements) to compare pairs of fish.