CMIS 241 DE Study chapter 3, about unsorted and sorted lists classes and binary search. You can go lightly over the implementation details of the book's code. A list is an abstract data type that is a sequence of items. Sequence implies that each list item has an item before it and after it in the list (except for the first and last elements). The number of items in the list is the list's length or size. A partially filled array (i.e. filled from the beginning) is a kind of list, but a list is not an array. A list can be implemented with an array (it could also be implemented with a linked data structure [chapter 5]). Java has some built-in list classes. They are part of the Collections Framework. ArrayList is one of them. An ArrayList is a list of any kind of and any mix of objects (primitive type values have to be "wrapped" in its corresponding wrapper class, e.g. an int has to be turned in to a Integer). (Usually the items in the list are the same type, but they don't have to be.) A collection is a group of objects...C++ uses the word container... Incredibly easy to use. And efficient and very reliable. No one will create their own list class. The ArrayList grows and shrinks automatically as needed. //How to use an ArrayList: //1. declare and instantiate an ArrayList object: // ArrayList myArrayList = new ArrayList(); //2. use the methods on it: // myArrayList.add("a string object"); // myArrayList.add(anyObject); // if (myArrayList.contains(someObject))... The Collections class has some static methods to perform various algorithms on collections such as ArrayLists. Collections.sort (myArrayList) Collections.reverse(myArrayList) Collections.shuffle(myArrayList) randomly permute (mix up) the items Collections.min (myArrayList) Collections.max (myArrayList) A static method (also known as a class method) is a method that is not part of any particular instantiated object. It typically provides a utility for general use. It is typically invoked via the class name instead of an object reference. Math.sqrt(), JOptionPane.showInputDialog(), System.out.println() are static methods Run the ArrayListTest.java to understand ArrayList. A list that's not in any order can not be efficiently searched. Basically you have to look at all the items until either you find what you're looking for or you get to the end of the list. This is called sequential search. With small lists the time it takes is irrelevant, but many applications have huge lists that must be searched a lot. Say you've got one million items, and they need to be searched one thousand times every second. The searching is going to take up a lot of CPU time, whatever the mega/giga hertz. However, if the list is in sorted order, and it's stored in an array (i.e. the list is implemented with an array), you can take advantage of the power of array indexing to search very efficiently. Because arrays are laid out sequentially in memory, element access via an index is just a simple memory address calculation. Any element of an array can be accessed in the same time as any other. You're not limited to sequential access, you can access the elements in any order or pattern. The binary search repeatedly ("recursively" as we'll see in chapter 7) accesses the middle element of (sub)arrays that are half the size of the (sub)array just previously looked at. This allows searching in essentially no time at all. Pages 174-180 show the algorithm. The contains() method of the ArrayList class does sequential search: myArrayList.contains(someObject) returns boolean The indexOf() method of the ArrayList class also does sequential search: myArrayList.indexOf(someObject) returns index of where found... The Collections class has a static method for doing binary search on a sorted ArrayList (if it's not sorted the answer is undefined, meaning junk): Collections.binarySearch(myArrayList,someObject) returns index of where found, or negative number if not found Run the SearchArrayListTest.java program to get a feel for the efficiency difference between sequential and binary search. Binary search is vastly less time-consuming, how much so is discussed below. The drawback of binary search is that the list must be in sorted order... Analyzing the efficiency of algorithms is an important topic in computer science. The discussion in the book is a good introduction to it. Once an algorithm is correct, we can worry about its efficiency. Timings are too dependent on CPU, operating system, programming language, compiler and are not mathematical enough so they are not used. Basically we are just counting the number of times some fundamental operation of the algorithm executes. This operation will be the unit of work. For searches and sorts, the operation is a comparison of two values. For numeric algorithms, typically a multiplication and/or division is the "countable" operation. We want an algebraic function (one of those f(x) things) that relates the input size (or problem size, or amount of data) to the number of operations executed (x is the input size). The big-O notation allows us to disregard multiplicative factors and lesser terms in the actual analyzed function and just talk about the dominant term. Most practical algorithms are in one of the "bins" (functions) of figure 3.13 and table 3.3. Note that the last row with N=256 is still a small value, most real-world programs deal with much larger amounts of data, hence the critical need to find efficient algorithms (i.e. ones with smaller rate of growth function), if indeed there can be any (some problems require at least a minimum amount of work in order to be solved). The general discussion is then made concrete by applying it to the member functions of the sorted and unsorted list classes. Note in particular the difference between linear (sequential) search and binary search. Sequential search takes time proportional to the number of items in the list, it is an O(N) "order n" algorithm. Binary search takes time proportional to the logarithm of the number of items in the list, it is an O(lg N) "order log n" algorithm. lg means log to the base 2 instead of the more common log to the base 10. N O(N) O(lg N) -------- ---- ------ 100 100 7 1000 1000 10 1000000 1000000 20 billion billion 30 trillion trillion 40 Moral: an O(lg N) algorithm takes almost no time to run. By using a cleverer algorithm (and a sorted list on which to operate), searching can be done vastly more quickly.