1 13-AlgorithmsSoftware


Previous: 12-Strings.html

**"Simplicity is the key to brilliance." **
    **- Bruce Lee**

1.1 Screencasts

1.2 Algorithms

https://en.wikipedia.org/wiki/Algorithm

1.2.1 Evaluating algorithms

https://en.wikipedia.org/wiki/Time_complexity
https://en.wikipedia.org/wiki/Computational_complexity_theory
https://en.wikipedia.org/wiki/Computational_complexity
https://en.wikipedia.org/wiki/Asymptotic_analysis

How do we measure the efficiency of an algorithm?
1. Empirical comparison (run example programs) - Pros and cons?
2. Asymptotic algorithm analysis (proofs) - Pros and cons?

Examples:
* Searching through an array
* Sorting an array

What impacts the efficiency of an algorithm?
* For most algorithms, running time depends on “size” of the input.
* Time cost is expressed as T(n), for some function T, on input size n.
* Draw this.

Ask/Discuss:
* Do all problems have a parametrically varying size of input?

1.2.2 Rate of growth?

What is rate of growth?
How does T increase with n?

1.2.2.1 Max

We refer back to a program from our first real class: 01-Computation.html
13-AlgorithmsSoftware/find_max.py
13-AlgorithmsSoftware/find_max_cfg.svg

#!/usr/bin/python3
# -*- coding: utf-8 -*-

from typing import List

def find_max(L: List[int]) -> int:
    maximum = L[0]  # start from the first element; avoids shadowing the built-in max() and handles all-negative lists
    for x in L:
        if x > maximum:
            maximum = x
    return maximum

print(find_max([11, 23, 58, 31, 56, 77, 43, 12, 65, 19]))

Define a constant, c, as the amount of time required to compare two integers in the above function find_max; then:
T(n) = c * n
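
We can also check this empirically, as suggested in 1.2.1. The sketch below is illustrative (the input sizes, random data, and timeit usage are assumptions, not part of the course code): each 10x increase in n should produce roughly a 10x increase in measured time.

~~~
# Illustrative timing sketch (assumed setup, not the course's code).
import random
import timeit
from typing import List

def find_max(L: List[int]) -> int:
    maximum = L[0]
    for x in L:
        if x > maximum:
            maximum = x
    return maximum

for n in [1_000, 10_000, 100_000, 1_000_000]:
    data = [random.randint(0, 10**9) for _ in range(n)]
    seconds = timeit.timeit(lambda: find_max(data), number=10)
    print(f"n = {n:>9}: {seconds:.4f} s")  # time should scale ~10x per row
~~~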

1.2.2.2 Nested loops

# python-style
sum = 0
for i in range(n):
    for j in range(n):
        sum = sum + 1

# c-style
sum = 0
for i = 1; i <= n; i = i + 1
    for j = 1; j <= n; j = j + 1
        sum = sum + 1

We can assume that incrementing takes constant time, call this time c2, and thus:
T(n) = c2 * n^2

1.2.2.3 Rates of growth

13-AlgorithmsSoftware/algorithm_comlexity.png

Rates of growth

Complexity class | Name
O(1) | constant
O(log log n) | double log
O(log n) | logarithmic
O(n) | linear
O(n log n) | quasi-linear
O(n^2) | quadratic
O(n^c) | polynomial
O(c^n) | exponential
O(n!) | factorial

Rates of growth
13-AlgorithmsSoftware/growthRates.png
Discuss:
* Small input sizes don't necessarily produce the same ordering of algorithms as large sizes!
* Being fast on large inputs does not necessarily mean being fast on small inputs.
* This actually matters for some algorithms or implementations, e.g., a server that processes lots of small batches (see the sketch below).
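
A tiny sketch of that crossover effect (both cost functions below are hypothetical, chosen only for illustration): the asymptotically "worse" quadratic algorithm is actually cheaper until n reaches the crossover at n = 100.

~~~
# Hypothetical cost functions for illustration only.
def t_linear(n: int) -> int:
    return 100 * n  # linear algorithm with a big constant factor

def t_quadratic(n: int) -> int:
    return n * n    # quadratic algorithm with a small constant factor

for n in [1, 10, 50, 100, 200, 1000]:
    a, b = t_linear(n), t_quadratic(n)
    cheaper = "tie" if a == b else ("linear" if a < b else "quadratic")
    print(f"n = {n:>4}: 100n = {a:>7}  n^2 = {b:>7}  cheaper: {cheaper}")
~~~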
Mention: search versus sort trade-off for:
* Your inbox
* Google’s search implementation

1.2.2.4 Linear search

How does T increase with n for a linear search?

https://en.wikipedia.org/wiki/Linear_search

13-AlgorithmsSoftware/alg_linear_search.py
13-AlgorithmsSoftware/alg_linear_search_cfg.svg

#!/usr/bin/python3
# -*- coding: utf-8 -*-

from typing import List

def linear_search(numbers: List[int], key: int) -> int:
    for i, number in enumerate(numbers):
        if number == key:
            return i
    return -1

def main() -> None:
    numbers: List[int] = []
    numbers.append(2)
    numbers.append(4)
    numbers.append(7)
    numbers.append(10)
    numbers.append(11)
    numbers.append(32)
    numbers.append(45)
    numbers.append(87)

    print("NUMBERS: ")
    print(numbers)

    print("Enter a value: ")
    key = int(input())
    print("\n")

    # This takes O(n)
    keyIndex = linear_search(numbers, key)

    if keyIndex == -1:
        print(key, "was not found")
    else:
        print("Found", key, "at index", keyIndex)

if __name__ == "__main__":
    main()

A constant number of simple operations plus one for() loop over n elements:
T(n) = c * n

1.2.3 Best, Worst, Average Cases

1.3 Definitions

Don’t worry if you don’t remember all the details of this.
We’ll cover it again in Data Structures, and again in Algorithms, and again, and again…

1.3.1 Big-Oh (O) upper bound

https://en.wikipedia.org/wiki/Big_O_notation
* Definition:
* For T(n) a non-negatively valued function, T(n) is in the set O(f(n)) if there exist two positive constants c and n0 such that T(n) <= c * f(n) for all n > n0.
* Meaning:
* For all data sets big enough (i.e., n > n0), the algorithm always executes in less than c * f(n) steps in {best, average, worst} case.

For example, an algorithm may be in O(n^2) in the {best, average, worst} case.

In the image below, everywhere to the right of n0 (the dashed vertical line), the lower curve, f(n), is <= the upper curve, c * g(n); thus f(n) is in O(g(n)):

13-AlgorithmsSoftware/graphsO.png

If visiting and examining one value in the array requires cs steps, where cs is a positive number, and if the value we search for has equal probability of appearing in any position in the array, then in the average case T(n) = cs * n / 2.
For all values of n > 1, cs * n/2 <= cs * n.
Therefore, by the definition, T(n) is in O(n), for n0 = 1 and c = cs.

1.3.1.2 A common mix-up

Big-Oh notation indicates an upper bound, and is NOT the same as worst case.

1.3.1.3 Big-Oh (O) upper bound

Note: this definition uses set notation concepts: https://en.wikipedia.org/wiki/Set-builder_notation

1.3.2 Big-Omega (lower bound)

https://en.wikipedia.org/wiki/Big_Omega_function
* Omega(g(n)) = {T(n) : there exist positive constants c and n0 such that 0 <= c * g(n) <= T(n) for all n >= n0}
* g(n) is an asymptotic lower bound for T(n)
* Right plot below is Omega
13-AlgorithmsSoftware/graphs.png

1.3.3 Big-Theta (double-sided)
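
* Theta(g(n)) = {T(n) : there exist positive constants c1, c2, n0 such that 0 <= c1 * g(n) <= T(n) <= c2 * g(n) for all n >= n0}
* g(n) is an asymptotically tight bound for T(n): T(n) is in Theta(g(n)) exactly when it is in both O(g(n)) and Omega(g(n))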

1.3.4 Little-oh (o)

https://en.wikipedia.org/wiki/Big_O_notation#Little-o_notation
* o(g(n)) = {T(n) : for any positive constant c > 0, there exists a constant n0 > 0 such that 0 <= T(n) < c * g(n) for all n >= n0}
* g(n) is an upper bound for T(n) that may or may not be asymptotically tight.

1.3.5 Little-omega
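
* omega(g(n)) = {T(n) : for any positive constant c > 0, there exists a constant n0 > 0 such that 0 <= c * g(n) < T(n) for all n >= n0}
* g(n) is a lower bound for T(n) that is not asymptotically tight (the mirror of little-oh)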

++++++++++++++++++
Cahoot-13.1
https://mst.instructure.com/courses/58101/quizzes/56221

**"There can be economy only where there is efficiency."**
    - Benjamin Disraeli

1.4 Analyzing programs

1.4.1 Rules to help simplify

If some function f(n) is an upper bound for your cost function, then any upper bound for f(n) is also an upper bound for your cost function.

1.4.1.1 Ignore lower order terms

Why?

1.4.1.2 Constants are discarded

You can ignore any multiplicative constants in equations when using big-Oh notation.

Why??

1.4.1.3 Combinations: sum

Given two parts of a program run in sequence (whether two statements or two sections of code), you need consider only the more expensive part.

Why??

1.4.1.4 Combinations: product

If some action is repeated some number of times, and each repetition has the same cost, then the total cost is the cost of the action multiplied by the number of times that the action takes place.

1.4.1.5 Polynomials

1.4.1.6 Log

1.5 Complexity

**"Simplicity is a great virtue,**
**but it requires hard work to achieve it,**
**and education to appreciate it.**
**And to make matters worse:**
**complexity sells better."**
    - https://en.wikipedia.org/wiki/Edsger_W._Dijkstra

++++++++++++++++++
Cahoot-13.2
https://mst.instructure.com/courses/58101/quizzes/56222

1.6 Classifying functions

How do we determine the order or growth rate of our code?

1.6.0.1 for() loops

Python style
~~~
sum = 0
for i in range(n):
    sum = sum + n
~~~

C-style
~~~
sum = 0
for i = 1; i <= n; i = i + 1
    sum = sum + n
~~~

1.6.0.2 Nested for() loops

Python style
~~~
k = 0
for i in range(n):
    for j in range(n):
        k = k + 1
~~~

C-style
~~~
k = 0
for i = 0; i < n; i = i + 1
    for j = 0; j < n; j = j + 1
        k = k + 1
~~~

Ask/Discuss: Are doubly nested for loops always n^2?

1.6.0.3 Consecutive statements

Python-style
~~~
for i in range(n):
    a[i] = 0

for i in range(n):
    for j in range(n):
        a[i] = a[i] + a[j] + i + j
~~~

C-style
~~~
for i = 0; i < n; i = i + 1
    a[i] = 0

for i = 0; i < n; i = i + 1
    for j = 0; j < n; j = j + 1
        a[i] = a[i] + a[j] + i + j
~~~

1.6.0.4 if/else

1.7 Practice

1.7.1 Basics

1.7.1.1 Assignment?

a = b

Theta(?)

1.7.1.2 Simple for loop?

Go line-by-line, from the inside out.

Python style
~~~
sum = 0
for i in range(n):
    sum = sum + n
~~~

C-style
~~~
sum = 0
for i = 1; i <= n; i = i + 1
    sum = sum + n
~~~

Theta(?)

1.7.1.3 Mess of for loops?

Python-style
~~~
sum = 0
for i in range(n):
    for j in range(i):
        sum = sum + 1

for k in range(n):
    A[k] = k
~~~

C-style
~~~
sum = 0
for i = 1; i <= n; i = i + 1
    for j = 1; j <= i; j = j + 1
        sum = sum + 1

for k = 0; k < n; k = k + 1
    A[k] = k
~~~

\(\sum_{i=1}^n i = n(n+1)/2\)

Theta(?)

1.7.2 Logarithms

https://en.wikipedia.org/wiki/Logarithm

1.7.2.1 Logarithm review

CS is mostly \(\log_2 x\)

Why?
13-AlgorithmsSoftware/Logarithm_plots.png
* Logarithm is the inverse operation to exponentiation
* \(\log_b x = y\)
* \(b^y=x\)
* \(b^{\log_b x}=x\)
* Example:
* \(\log_2 64 = 6\)
* \(2^6=64\)
* \(2^{\log_{2} 64}=64\)
* \(\log_2 x\)
* intersects x-axis at 1 and passes through the points with coordinates (2, 1), (4, 2), and (8, 3), e.g.,
* \(\log_2 8 = 3\)
* \(2^3=8\)

1.7.3 Log cheatsheet

13-AlgorithmsSoftware/logs.png
* \(\log (nm) = \log n + \log m\)
* \(\log (\frac{n}{m}) = \log n - \log m\)
* \(\log(n^r) = r \log n\)
* \(\log_a n = \frac{\log_b n}{\log_b a}\) (base switch)
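
These identities are easy to sanity-check numerically. A throwaway sketch (the constants are arbitrary choices):

~~~
# Numeric sanity check of the log identities above (values arbitrary).
import math

n, m, r, a, b = 8.0, 32.0, 3.0, 2.0, 10.0

assert math.isclose(math.log(n * m), math.log(n) + math.log(m))
assert math.isclose(math.log(n / m), math.log(n) - math.log(m))
assert math.isclose(math.log(n ** r), r * math.log(n))
# base switch: log_a(n) == log_b(n) / log_b(a)
assert math.isclose(math.log(n, a), math.log(n, b) / math.log(a, b))
print("all four identities hold numerically")
~~~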

Log general rule for algorithm analysis
* Algorithm is in O(log2 n) if it takes constant, O(1), time to cut the problem size by a fraction (which is usually 1/2).
* If constant time is required to merely reduce the problem by a constant amount, such as to make the problem smaller by 1, then the algorithm is in O(n)
* Caveat: often, with an input of size n, an algorithm must take Omega(n) time just to read its input.
* The O(log2 n) classification often assumes input is pre-read…
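
A minimal sketch of the halving rule (illustrative code, not from the lecture): count the constant-time steps each strategy needs to shrink a problem of size n down to 1.

~~~
def steps_halving(n: int) -> int:
    # Constant work per step cuts the problem in half: O(log2 n) steps.
    steps = 0
    while n > 1:
        n = n // 2
        steps += 1
    return steps

def steps_decrement(n: int) -> int:
    # Constant work per step shrinks the problem by 1: O(n) steps.
    steps = 0
    while n > 1:
        n = n - 1
        steps += 1
    return steps

for n in [16, 1024, 1_000_000]:
    print(n, steps_halving(n), steps_decrement(n))
~~~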

++++++++++++++++++
Cahoot-13.3
https://mst.instructure.com/courses/58101/quizzes/56223

++++++++++++++++++
Cahoot-13.4
https://mst.instructure.com/courses/58101/quizzes/56224

"**Simplicity is the ultimate sophistication"**
- Source debated: Leonardo da Vinci? Clare Boothe Luce? Leonard Thiessen? Elizabeth Hillyer? William Gaddis? Eleanor All?

1.7.3.1 Binary search

https://en.wikipedia.org/wiki/Binary_search

13-AlgorithmsSoftware/alg_binary_search.py
13-AlgorithmsSoftware/alg_binary_search_cfg.svg

#!/usr/bin/python3
# -*- coding: utf-8 -*-

from typing import List

def binary_search(numbers: List[int], key: int) -> int:
    low: int = 0
    high: int = len(numbers) - 1  # highest valid index, not len(numbers)
    mid = (high + low) // 2

    while high >= low and numbers[mid] != key:
        mid = (high + low) // 2

        if numbers[mid] < key:
            low = mid + 1   # discard the lower half
        elif numbers[mid] > key:
            high = mid - 1  # discard the upper half

    if numbers[mid] != key:
        mid = -1  # key was not in the list

    return mid

def main() -> None:
    numbers: List[int] = []
    numbers.append(2)
    numbers.append(4)
    numbers.append(7)
    numbers.append(10)
    numbers.append(11)
    numbers.append(32)
    numbers.append(45)
    numbers.append(87)

    print("NUMBERS: ")
    print(numbers)

    print("Enter a value: ")
    key = int(input())
    print("\n")

    # Sorting takes O(n log n); binary search requires a sorted list
    numbers.sort()

    # This takes O(log n)
    keyIndex = binary_search(numbers, key)

    if keyIndex == -1:
        print(key, "was not found")
    else:
        print("Found", key, "at index", keyIndex)

if __name__ == "__main__":
    main()

Binary search
13-AlgorithmsSoftware/binarySearch.png
* Each iteration of the loop body takes constant time, Theta(1)
* The loop starts with high - low = n - 1 and finishes with high - low <= -1
* Each iteration, high - low must be at least halved from its previous value
* Number of iterations is at most log2(n - 1) + 2
* For example, if high - low = 128, then the maximum values of high - low after each iteration are 64, 32, 16, 8, 4, 2, 1, 0, -1
* This search is in Theta(log2 n)

An alternative way to think about this problem is recursively:
* T(n) = T(n / 2) + 1 for n > 1, with T(1) = 1
* which solves to T(n) in Theta(log2 n)
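
That recurrence maps directly onto a recursive implementation. A sketch (this recursive variant is illustrative, not the lecture's version): each call does constant work, then recurses on half the range.

~~~
from typing import List

def binary_search_rec(numbers: List[int], key: int, low: int, high: int) -> int:
    if low > high:
        return -1                # empty range: key is absent
    mid = (low + high) // 2      # the constant "+ 1" work per call
    if numbers[mid] == key:
        return mid
    elif numbers[mid] < key:
        return binary_search_rec(numbers, key, mid + 1, high)  # T(n/2)
    else:
        return binary_search_rec(numbers, key, low, mid - 1)   # T(n/2)

print(binary_search_rec([2, 4, 7, 10, 11, 32, 45, 87], 11, 0, 7))  # prints 4
~~~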

1.7.3.2 Exponentiation

https://en.wikipedia.org/wiki/Exponentiation
Computing b^n involves how many multiplications?
13-AlgorithmsSoftware/alg_exp.py
13-AlgorithmsSoftware/alg_exp_cfg.svg

#!/usr/bin/python3
# -*- coding: utf-8 -*-

print("Enter base, hit enter, then exponent")
base: int = int(input())
exponent: int = int(input())
total: int = 1

for counter in range(exponent):  # one multiplication per pass
    total = total * base

print("Base", base, "to the power of", exponent, "is", total)

Theta(?)

1.7.4 Nested loops always n^2?

Python-style
~~~
sum1 = 0
k = 1
while k <= n:        # do log n times
    j = 1
    while j <= n:    # do n times
        sum1 = sum1 + 1
        j = j + 1
    k = k * 2

sum2 = 0
k = 1
while k <= n:        # do log n times
    j = 1
    while j <= k:    # do k times
        sum2 = sum2 + 1
        j = j + 1
    k = k * 2
~~~

C-style
~~~
sum1 = 0
for k = 1; k <= n; k = k * 2       // do log n times
    for j = 1; j <= n; j = j + 1   // do n times
        sum1 = sum1 + 1

sum2 = 0
for k = 1; k <= n; k = k * 2       // do log n times
    for j = 1; j <= k; j = j + 1   // do k times
        sum2 = sum2 + 1
~~~
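
One way to convince yourself is to count the inner-loop iterations empirically. An illustrative sketch mirroring the loops above: sum1 grows like n * log2(n), while sum2's passes cost 1 + 2 + 4 + ... + n <= 2n, so it is only Theta(n).

~~~
def count_sum1(n: int) -> int:
    sum1 = 0
    k = 1
    while k <= n:            # log2(n) + 1 passes
        for j in range(n):   # n inner iterations per pass
            sum1 = sum1 + 1
        k = k * 2
    return sum1

def count_sum2(n: int) -> int:
    sum2 = 0
    k = 1
    while k <= n:            # log2(n) + 1 passes
        for j in range(k):   # only k inner iterations per pass
            sum2 = sum2 + 1
        k = k * 2
    return sum2

for n in [16, 256, 4096]:
    print(n, count_sum1(n), count_sum2(n))  # ~n*log2(n) versus ~2n
~~~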

1.8 Space complexity

https://en.wikipedia.org/wiki/Space_complexity
* What are the space requirements for an array of n integers?
* Discuss: mapping social networks for disease-spread tracing. How do we represent this?
* To define binary connectivity between all elements and all other elements, we can use a fully connected matrix:
13-AlgorithmsSoftware/sparse.png
* dots imply connectivity
* What is Theta(?) here for space complexity?
* Can we do better for this kind of binary connectivity? (see the sketch below)
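
A sketch of the trade-off (the 5-person network and its 4 friendships are hypothetical): the fully connected matrix always costs Theta(n^2) cells, while an adjacency list costs only Theta(n + e) entries for e connections.

~~~
n = 5                                     # hypothetical people
edges = [(0, 1), (0, 2), (1, 2), (3, 4)]  # hypothetical friendships

# Fully connected matrix: Theta(n^2) cells, however sparse the network.
matrix = [[False] * n for _ in range(n)]
for u, v in edges:
    matrix[u][v] = matrix[v][u] = True

# Adjacency list: Theta(n + e) entries, small when the matrix is sparse.
adjacency = {i: [] for i in range(n)}
for u, v in edges:
    adjacency[u].append(v)
    adjacency[v].append(u)

print(sum(len(row) for row in matrix))                # 25 cells (n^2)
print(sum(len(nbrs) for nbrs in adjacency.values()))  # 8 entries (2e)
~~~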

Discuss: space-time trade-off
* Memoization, dynamic programming (simpler than we make it out to be); see the sketch below.
* History of memory costs: memory is now relatively cheaper, compared to processing power, than it used to be.
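
A minimal memoization sketch (illustrative; functools.lru_cache is one standard way in Python to spend memory to save time):

~~~
from functools import lru_cache

@lru_cache(maxsize=None)  # spend space on a cache of results...
def fib(n: int) -> int:
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(100))  # ...to turn exponential time into linear time
~~~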

**"A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away."**
    - Antoine de Saint-Exupery

1.9 Big picture

13-AlgorithmsSoftware/curves.png

Array sorting algorithms for example:
13-AlgorithmsSoftware/arraySorting.png

And data structures:
13-AlgorithmsSoftware/DStable.png

1.9.1 Algorithms as technology

Would you rather have a faster algorithm or a faster computer?
13-AlgorithmsSoftware/tech.png
Table columns in the image: Growth rate | old computer | 10x faster computer | Delta | ratio

1.9.2 Problem versus algorithms

1.10 Conclusions

13-AlgorithmsSoftware/joke-algorithm-is.jpg
Particularly in the case of academic publications…

Remember to https://en.wikipedia.org/wiki/KISS_principle

Next: 14-Recursion.html