1 10-DebuggingTesting

Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it.
- Brian W. Kernighan

1.1 Training and expertise

Like skill in an athletic or musical discipline, debugging is a trainable skill, and there is more or less no limit to how much you can improve if you take an ever-learning approach!
It is a common area where young students who know a programming language well think they can compete, but they are trounced by the experts who’ve been coding for many more years, and not just by a little bit.
The ability to implement computing solutions to complex problems follows a similar learning curve and pattern over expertise and years of experience.
With equal knowledge of a programming language, the same problem that might take you 3,000 lines and a week to solve might take an expert 50 lines and an hour.
These between-the-lines skills are one of my primary goals for you to take away in this class (not just python syntax, which is less critical than the debugging skills).
- Don’t ignore my debugging tips below; I see that frequently, and it doesn’t bode well for your success in coding…
- I have seen a lot of student mistakes over the years, and enumerate some of the most common below:

1.2 Screencasts

Password for the Vimeo videos is in Zulip chat.
https://vimeo.com/517294395
Tip: If anyone wants to speed up the lecture videos a little, inspect the page, go to the browser console, and paste this in:
document.querySelector('video').playbackRate = 1.2

1.3 Reading

https://automatetheboringstuff.com/2e/chapter11/
http://inventwithpython.com/invent4thed/chapter6.html
https://python.swaroopch.com/problem_solving.html
http://greenteapress.com/thinkpython2/thinkpython2.pdf (each chapter has a section on debugging)

1.4 Troubleshooting in general

https://en.wikipedia.org/wiki/Troubleshooting
* A basic principle in troubleshooting is to start from the simplest and most probable possible problems first.
* This is illustrated by the old saying “When you see hoof prints, look for horses, not zebras”, or to use another maxim, use the KISS principle
* https://en.wikipedia.org/wiki/KISS_principle
* This principle results in the common complaint about help desks or manuals, that they sometimes first ask:
* “Is it plugged in, and does that receptacle have power?”
* Always check the simple things first, before calling for help.
* A troubleshooter could check each component in a system one by one, substituting known good components for each potentially suspect one.
* However, this process of “serial substitution” can be considered degenerate when components are substituted without regard to a hypothesis concerning how their failure could result in the symptoms being diagnosed.
* Randomly changing your code is even worse!
* Simple and intermediate systems can be characterized by lists or trees of dependencies among their components or subsystems.
* More complex systems can contain cyclical dependencies or interactions (feedback loops), and require graph models.
* Such systems are less amenable to “bisection” troubleshooting techniques.

1.4.1 Half-splitting / binary search / bisection

Efficient methodical troubleshooting starts with a clear understanding of the expected behavior of the system and the symptoms being observed.
The troubleshooter forms hypotheses on potential causes, and devises (or perhaps references a standardized checklist of) tests, to eliminate these prospective causes.
This approach is often called “divide and conquer”.
One common strategy used by troubleshooters is to check for frequently encountered or easily tested conditions first.
- For example, checking to ensure that a printer’s light is on and that its cable is firmly seated at both ends.
- This is often referred to as “milking the front panel” or “searching for your keys under the spotlight”.
Another is to “bisect” the system
- For example in a network printing system, checking to see if the job reached the server to determine whether a problem exists in the subsystems “towards” the user’s end or “towards” the device.
- This technique can be particularly efficient in systems with long chains of serialized dependencies or interactions among its components.
- It is the application of a binary search across the range of dependencies.
- It is often referred to as “half-splitting”.
We’ll go over binary search through ordered containers soon!
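The half-splitting idea can be sketched in a few lines of Python. This is only a minimal illustration, not one of the course files: the names `first_broken_stage` and `is_ok_up_to` are made up, and it assumes a single fault in a serial chain, so that every checkpoint before the fault looks good and every checkpoint at or after it looks bad.

```python
from typing import Callable


def first_broken_stage(n_stages: int, is_ok_up_to: Callable[[int], bool]) -> int:
    """
    Return the index of the first broken stage in a serial chain.
    is_ok_up_to(k) reports whether the output after stage k still looks good.
    Assumes at least one stage is broken, and that everything downstream
    of the broken stage also looks bad.
    """
    low, high = 0, n_stages - 1
    checks = 0
    while low < high:
        middle = (low + high) // 2
        checks += 1
        if is_ok_up_to(middle):
            low = middle + 1  # fault is further "towards" the device
        else:
            high = middle  # fault is at or before the middle
    print(f"found stage {low} after {checks} checks")
    return low


# Hypothetical 16-stage chain in which stage 11 is the broken one
broken = 11
found = first_broken_stage(16, lambda k: k < broken)
```

Each check throws away half of the remaining suspects, so 16 stages need only 4 checks instead of up to 15 one-by-one checks.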

1.4.2 Hypothesis testing

The general scientific method of abduction and hypothesis testing can be used to find bugs!
https://en.wikipedia.org/wiki/Abductive_reasoning
https://en.wikipedia.org/wiki/Scientific_method
https://en.wikipedia.org/wiki/Experiment

1.5 General programming errors (software bugs)

https://en.wikipedia.org/wiki/Software_bug
A software bug is an error, flaw, failure, or fault in a computer program or system that causes it to produce an incorrect or unexpected result, or to behave in unintended ways, often but not always crashing.

1.5.1 General types of errors

Programming errors come in several forms, and vary by language type:
https://en.wikipedia.org/wiki/Software_bug#Types

1.5.1.1 Arithmetic

Division by zero.
Arithmetic overflow or underflow.
Loss of arithmetic precision, due to rounding or numerically unstable algorithms.
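
Here is a small sketch of how each of these shows up in Python specifically (other languages behave differently, for example by wrapping integers around silently). The variable names are just for illustration.

```python
import math

# Division by zero raises an exception in Python
try:
    1 / 0
except ZeroDivisionError as error:
    print("caught:", error)

# Loss of precision: 0.1 and 0.2 have no exact binary representation
total = 0.1 + 0.2
print(total == 0.3)  # False
print(math.isclose(total, 0.3))  # True

# Overflow: Python integers grow as needed, but floats do not
big_int = 2 ** 1000  # fine, just a very long integer
try:
    math.exp(1000.0)
except OverflowError as error:
    print("caught:", error)

# Underflow: a float result too small to represent silently becomes 0.0
tiny = 1e-200 * 1e-200
print(tiny)
```

Note that the precision and underflow cases do not crash at all, which is why they tend to surface later as logic errors.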

1.5.1.2 Compile-time / syntax errors (won’t run at all)

For compiled languages (C, C++, Rust, etc.), these are caught by the compiler.
For interpreted languages (Python, Lua, Ruby, Julia, Bash, etc.), these are the analogue of compile-time errors, known as interpreter or syntax errors.
A compile-time error results from the programmer’s misuse of the language.
A syntax error is a common compile-time error.
The compiler can only translate a program if the program is syntactically correct;
- otherwise, the compilation fails and you will not be able to run your program.
Syntax refers to the structure of your program and the rules about that structure.
E.g.,
- missing ; or ) or }
- mis-spelled keyword
- invalid construct in the language itself
- use of the wrong operator, such as performing assignment instead of equality test.
  - For example, in some languages x=5 will set the value of x to 5 while x==5 will check whether x is currently 5 or some other number.
Interpreted languages (e.g., Python) have many syntax error types, and can easily catch these before/at runtime.
Compiled languages can catch syntax errors too, also before testing/running begins.
- A compiler ensures that the structural rules of the language are not violated.
- It can detect, for example, the malformed assignment statement and the use of a variable before its declaration.
In general, compilers catch more types of errors earlier (at this stage) than interpreters, which let errors through to later stages of processing.
These are easy to find!
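
To see that Python checks the syntax of the whole file before running any of it, you can hand some source text to the built-in `compile()` function. This is a small sketch; the source strings and variable names are made up for the demonstration.

```python
# Source text with an unclosed parenthesis on its second line
broken_source = "print('this line never runs')\nanswer = (37 / 2\n"

try:
    compile(broken_source, "<demo>", "exec")
    caught_syntax_error = False
except SyntaxError as error:
    caught_syntax_error = True
    print("SyntaxError reported near line", error.lineno)

# In Python, assignment (=) inside an if condition is rejected outright,
# rather than silently assigning as it would in some other languages
try:
    compile("if x = 5:\n    pass\n", "<demo>", "exec")
    caught_assignment = False
except SyntaxError:
    caught_assignment = True

print(caught_syntax_error, caught_assignment)
```

Nothing from `broken_source` is ever executed, not even the valid first line, because the error is found while the text is still being translated.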

1.5.1.3 Run-time errors (crashes)

Run-time errors do not appear until you run the program.
A program may not run to completion but instead terminate with an error.
We commonly say the program “crashed.”
Examples include:
- Mismatch of data type (usually only produces run-time error in interpreted programs).
- Array index out of range.
- A number divided by zero.
- An incompatible value input
Some error types that would be caught by a compiler (such as use of a variable before its declaration) are NOT caught at an early stage in interpreted languages like python, and instead produce a run-time crash.
These are usually medium-difficulty to find.
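
A minimal sketch of that last point: the misspelled name below is perfectly valid syntax, so Python only complains when that particular line actually executes. The function `describe` is a made-up example.

```python
def describe(number: int) -> str:
    if number < 0:
        # Bug: 'lable' is a misspelling, so this name was never assigned
        return lable
    label = "non-negative"
    return label


print(describe(5))  # works, because the buggy line is never reached

try:
    describe(-5)
    crashed = False
except NameError as error:
    crashed = True
    print("crashed only at run time:", error)
```

This is why testing only the "happy path" can leave run-time errors hiding in branches you never exercised.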

1.5.1.4 Logic errors and semantics (strange outputs)

If there is a logical error in your program, it may compile and run successfully, and the computer might not generate any error messages, but it will produce incorrect output.
The problem is that the program you wrote is not the program you wanted to write.
The meaning of the program (its semantics) is wrong.
Errors that escape compiler detection (run-time errors and logic errors) are commonly called bugs.
Since the pre-processor and compiler are unable to detect these problems, such bugs are the major source of frustration for developers.
The frustration often arises, because in complex programs, the bugs sometimes only reveal themselves in certain situations that are difficult to reproduce exactly during testing.
These come in infinite variety, with common examples:
- Infinite loops and infinite recursion.
- Off-by-one error, counting one too many or too few when looping (the most common bug)!!!
These types of errors are more common and troublesome in dynamically typed interpreted scripting languages like python, because the compiler does not “mommy” you by checking all the types of errors possible to check, letting them slip through and produce logic errors!
- Next class, we’ll talk more about the new feature of python, “type hinting”, which elegantly addresses that issue in python.
These are often hard to find, and harder in interpreted languages!
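
As a quick illustration of an off-by-one error, here is a hypothetical pair of functions that are supposed to add up the first n values of a list. The buggy one runs without any error message and simply returns the wrong number.

```python
from typing import List


def sum_first_n_buggy(values: List[int], n: int) -> int:
    total = 0
    for i in range(1, n):  # Bug: skips index 0, so only n - 1 values are added
        total += values[i]
    return total


def sum_first_n(values: List[int], n: int) -> int:
    total = 0
    for i in range(n):  # indices 0 .. n - 1
        total += values[i]
    return total


data = [10, 20, 30, 40]
print(sum_first_n_buggy(data, 3))  # 50: no crash, just a wrong answer
print(sum_first_n(data, 3))  # 60
```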

1.5.1.5 Resource errors / Environment issues

In some compiled languages (C, C++, assembly/asm) attempting to do low-level stuff:
* Null pointer de-reference.
* Using an uninitialized variable.
* Using an otherwise valid instruction on the wrong data type (see packed decimal/binary coded decimal).
* Access violations.
* Resource leaks, where a finite system resource (such as memory or file handles) becomes exhausted by repeated allocation without release.
* Buffer overflow, in which a program tries to store data past the end of allocated storage. This may or may not lead to an access violation or storage violation.
* These are known as security bugs.
* Excessive recursion which, though logically valid, causes stack overflow.
* Use-after-free error, where a pointer is used after the system has freed the memory it references.
* Double free error (double delete a memory allocation).
* These can be hard to find, but don’t have to be!
* Languages like https://www.rust-lang.org/ actually pre-check much of this, functionally reducing or eliminating such issues, in addition to pre-checking all the features normally checked by a compiled language, producing a security-safe language!

1.5.2 Practice on error types

++++++++++++++++++
Cahoot-10.1
What kind of error is this?
https://mst.instructure.com/courses/58101/quizzes/56033

10-DebuggingTesting/debug_error_00.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

def division(numerator: float, denominator: float) -> float:
    answer = numerator / denominator
    return answer

print(division(37, 0))

++++++++++++++++++
Cahoot-10.2
What kind of error is this?
https://mst.instructure.com/courses/58101/quizzes/56034

10-DebuggingTesting/debug_error_01.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

def division(numerator: float, denominator: float) -> float:
    answer = (numerator / denominator
    return answer

print(division(37, 2))

++++++++++++++++++
Cahoot-10.3
What kind of error is this?
https://mst.instructure.com/courses/58101/quizzes/56035

10-DebuggingTesting/debug_error_02.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

def division(numerator: float, denominator: float) -> float:
    answer = numerator * denominator
    return answer

print(division(37, 2))

++++++++++++++++++
Cahoot-10.4
What kind of error is this?
https://mst.instructure.com/courses/58101/quizzes/56036

10-DebuggingTesting/debug_error_03.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

from typing import List

small_array: List[int] = [0, 1, 2]
small_array[3] = 3

Show this one: $ ipython3 debug_error_03.py

1.6 Python error types

Python itself has error types (called exceptions)
https://docs.python.org/3/library/exceptions.html
10-DebuggingTesting/exception_hierarchy.png
10-DebuggingTesting/debug_exceptions.py (in spyder3)

#!/usr/bin/python3
# -*- coding: utf-8 -*-

while True print('Hello world'):

10 * (1 / 0)

'2' + 2

Print('hello world')

print 'hello world'

print(hello world)

hello world

d = {1: 1, 2: 2}
d[3]

li = [1, 2, 3]
li[4]

f = open('nonexistantfile.txt', 'r')

# Note: check BEFORE the line python whines about:
if 1 == 1:
    print('hey'
    print('another')

The full tree of python3’s built-in exceptions is shown in 10-DebuggingTesting/exception_hierarchy.png.

Tips for reading exceptions and finding little errors:
https://realpython.com/python-traceback/ (how to read basic errors!)
* Read them from the bottom up!
* Read them entirely!!
* Check it out in spyder3 (which uses the ipython3 terminal)
* Run in ipython3 instead of python3
* Syntax errors are often auto-pinpointed just AFTER they actually occur, so look before it tells you.
* Use auto-highlighting in your IDE, select ([{ by clicking at the end of the line!

1.7 Debugging

https://en.wikipedia.org/wiki/Debugging

Coming up with diagnostically informative input test cases can be a challenge, but is one of the key skills in learning to debug.
Often, you may want to try input cases near boundary conditions, for example, near the last expected iterations, where you may have one too many or one too few iterations actually executed.
Further, you may want to try a progressive range of inputs in increasing value or size, to see at which point the output starts to be incorrect.
We briefly told you that using output statements to debug was helpful.
You may have some idea where your bug is, or no idea at all.
For a program that is crashing, several simplified methods can help you use print statements more effectively:
- If you have no idea where a bug is in your program, then inserting an output statement in the middle might be a good idea.
  - If it prints to the screen, then put another output statement in the middle of the second half of the program.
  - If it did not print to the screen, then move the output statement to the middle of the previously tested half.
We will cover a search method soon called binary search, which implements a similar algorithm.
If you have some idea where your bug is, then inserting an output statement nearby is a good idea.
- If it prints to the screen, move the statement slightly later, if it does not, then move it slightly earlier.
For a program that is not crashing, but is outputting wrong values, a similar approach can be taken.
- Instead of determining whether the output statement printed to the screen, determine where the value first deviates from its expected value.
- To do this, one can either take a binary search approach (above), target a suspect area or loop, or start at the beginning and work one’s way down through the program.
- You would insert output statements in progressively changing locations to step-wise pinpoint where these values differ from expected.
Put print statements just before and after for/while/if/input statements, and include your loop control variables!
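
For example, a loop instrumented this way might look like the sketch below. The function `running_total` is a made-up example; the point is the placement of the print statements and what they include.

```python
from typing import List


def running_total(values: List[int]) -> int:
    total = 0
    print(f"before loop: values={values}, total={total}")
    for i in range(len(values)):
        total += values[i]
        # Include the loop control variable and the value being tracked
        print(f"  in loop: i={i}, values[i]={values[i]}, total={total}")
    print(f"after loop: total={total}")
    return total


result = running_total([3, 2, 1])
```

If the "after loop" line never appears, the problem is inside the loop; if a total looks wrong, the first "in loop" line with an unexpected value tells you which iteration to stare at.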

1.7.0.1 Debugging as diagnosis

One of the most important skills you should acquire as a programmer is debugging.
Debugging is like detective work, in that you are confronted with clues and you have to infer the processes and events that lead to the results you see.
Debugging is also like an experimental science, in that once you have an idea what is going wrong, you modify your program and try again.
If your hypothesis was correct, then you can predict the result of the modification, and you take a step closer to a working program.
If your hypothesis was wrong, you have to come up with a new one.

1.7.1 Programming process

Programming itself can be seen as a process of gradually debugging a program until it does what you want.
You should always start with a working program that does something, and make small modifications, debugging them as you go, so that you always have a working program!!!!!!!!!!!!!!!!!!!!!!
Note: read the line above again…

1.7.2 Don’t randomly change your code

Debugging is accomplished by gathering data until you understand the cause of the problem.
Debugging is accomplished by comparing the data that you have to what you know the data from a working system should look like.
Do not change your code haphazardly trying to track down a bug.
This is like a scientist who changes more than one variable at a time.
- It is also why natural correlation studies are usually not confidently interpretable
It makes the observed behavior much more difficult to interpret, and you tend to introduce new bugs.
Don’t just change code and “hope” you’ll fix the problem!
Instead, make the bug reproducible, then use methodical “Hypothesis Testing”:

1.7.3 Hypothesis testing

While(bug)
    Ask, what is the simplest input that produces the bug?
    Identify assumptions that you made about program operation that could be false.
    Ask yourself "How does the outcome of this test/change guide me toward finding the problem?"

1.7.4 General principles

Clues to what is wrong in your code exist in the values of your variables and the flow of control.
If your code was working a minute ago, but now it doesn’t, what was the last thing you changed???
- Change one thing at a time, then test!!
- Test your code as you go, rather than all at once.
If you find some wrong code which does not seem to be related to the bug you were tracking, fix the wrong code anyway.
- The wrong code can be related to, or can obscure, the bug in a way you had not imagined.
Debugging depends on an objective and reasoned approach, not on frantic shotgun or blind changes!!
It depends on overall perspective and understanding of the workings of your code.
Be systematic, sequential, and sloooow!

1.7.4.1 Basics

Normally the first step in debugging is to attempt to reproduce the problem!
- Reproduce it repeatedly, thoroughly, and under varied conditions.
- Attempt to reproduce it in simpler conditions, progressively narrowing down on the simplest condition that shows the bug!!!!!!!!!!!!!!!!!!
  - Cut the little snippet out, and run it in its own ipython3 terminal.
  - This can be a non-trivial task, for example as with parallel processes or some unusual software bugs.
Also, specific user environment and usage history can make it difficult to reproduce the problem.
After the bug is reproduced, the input of the program may need to be simplified to make it easier to debug.
For example, a bug in a compiler can make it crash when parsing some large source file.
However, after simplification of the test case, only a few lines from the original source file can be sufficient to reproduce the same crash.
Such simplification can be made manually, using a divide-and-conquer approach.
The programmer will try to remove some parts of original test case and check if the problem still exists.
When debugging the problem in a GUI, the programmer can try to skip some user interaction from the original problem description and check if remaining actions are sufficient for bugs to appear.
After the test case is sufficiently simplified, a programmer can use a debugger tool to examine program states (values of variables, plus the call stack) and track down the origin of the problem(s).
Alternatively, tracing can be used.
- In simple cases, tracing is just a few print statements, which output the values of variables at certain points of program execution.
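
The divide-and-conquer simplification described above can be automated in a rough way. The sketch below is a greedy simplification of my own for illustration, not a full delta-debugging implementation; `shrink_input` and `crashes` are hypothetical names, and `crashes` stands in for "run the program on this input and see whether the bug still appears".

```python
from typing import Callable, List


def shrink_input(data: List[int], still_fails: Callable[[List[int]], bool]) -> List[int]:
    """
    Greedy reducer: repeatedly try deleting chunks of the input, keeping any
    deletion after which the failure still reproduces.
    """
    chunk = len(data) // 2
    while chunk >= 1:
        start = 0
        while start < len(data):
            candidate = data[:start] + data[start + chunk:]
            if candidate and still_fails(candidate):
                data = candidate  # the chunk was irrelevant; keep it removed
            else:
                start += chunk  # the chunk is needed; move past it
        chunk //= 2
    return data


def crashes(values: List[int]) -> bool:
    # Hypothetical bug: fails whenever a negative number is present
    return any(v < 0 for v in values)


big_input = [5, 8, 1, 9, -4, 7, 3, 6]
small_input = shrink_input(big_input, crashes)
print(small_input)
```

The smaller the reproducing input, the fewer places the bug has left to hide when you start stepping through with a debugger or print statements.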

1.7.4.2 Debugging patterns

https://en.wikipedia.org/wiki/Debugging_patterns

A debugging pattern describes a generic set of steps to rectify or correct a bug within a software system.
It is a solution to a recurring problem that is related to a particular bug or type of bug in a specific context.
A bug pattern is a particular type of pattern.
Some examples of debugging patterns include:

Eliminate Noise Bug Pattern
* Isolate and expose a particular bug by eliminating all other noise in the system.
* This enables you to concentrate on finding the real issue.

Recurring Bug Pattern
* Expose a bug via a unit test.
* Run that unit test as part of a standard build from that moment on.
* This ensures that the bug will not recur.

Time Specific Bug Pattern
* Expose the bug by writing a test that runs continuously and fails when an expected error occurs.
* This is useful for transient bugs.

1.7.5 Techniques

Interactive debugging (below)
Print debugging
- (or tracing) is the act of watching (live or recorded) trace statements, or print statements, that indicate the flow of execution of a process.
- This is sometimes called printf debugging, due to the use of the printf function in C.
- This kind of debugging was turned on by the command TRON in the original versions of the novice-oriented BASIC programming language.
- TRON stood for, “Trace On.” TRON caused the line numbers of each BASIC command line to print as the program ran.
Remote debugging
- is the process of debugging a program running on a system different from the debugger.
- To start remote debugging, a debugger connects to a remote system over a communications link such as a local area network.
- The debugger can then control the execution of the program on the remote system and retrieve information about its state.
Post-mortem debugging
- is debugging of the program after it has already crashed.
- Related techniques often include various tracing techniques and/or analysis of memory dump (or core dump) of the crashed process.
- The dump of the process could be obtained automatically by the system (for example, when the process has terminated due to an un-handled exception), or by a programmer-inserted instruction, or manually by the interactive user.
“Wolf fence” algorithm:
- Edward Gauss described this simple but very useful and now famous algorithm in a 1982 article for Communications of the ACM as follows:
- “There’s one wolf in Alaska; how do you find it? First build a fence down the middle of the state, wait for the wolf to howl, determine which side of the fence it is on. Repeat process on that side only, until you get to the point where you can see the wolf.”
- This is implemented e.g. in the Git version control system as the command git bisect, which uses the above algorithm to determine which commit introduced a particular bug.
- Recall bisect and binary search?
Causality tracking:
- There are automatic techniques to track the cause effect chains in the computation.
- Those techniques can be tailored for specific bugs, such as null pointer de-references.
Shotgun debugging
- can be defined as: A process of making relatively un-directed changes to software in the hope that a bug will be perturbed out of existence.
- Using the approach of trying several possible solutions of hardware or software problem at the same time, in the hope that one of the solutions (typically source code modifications) will work.
- Shotgun debugging has a relatively low success rate and can be very time consuming, except when used as an attempt to work around programming language features that one may be using improperly.
- When combined with domain expertise and a strong intuition for the underlying codebase, it can be a good starting point to gut-solve a buggy piece of code a few times before formally researching the corresponding error message.
  - It should be a starting point only!
- When used in this way, it may be a valuable technique that is faster than browsing through the Internet searching a particular error message every time.

1.7.6 Software debuggers

Better known as “tracing” code.

1.7.6.1 Simple step-through software!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

03-IntroPython.html illustrates several good ones for python!

Step-through debuggers like the Spyder3 debugger and pudb3 exist for many languages, which many people are amazed to discover.
In C++ for example, you don’t normally get to step through your code, see your run-time variables, etc., during normal run-time. It just runs in a black box, taking input, and producing output virtually instantly.
In C++, the gdb debugger provides this functionality
- cgdb, qtcreator, kdevelop all provide nice front-ends (qtcreator the best!).
In Python, the debuggers: pudb, Spyder, and IDLE all provide step-through debugger capabilities.
Such debuggers are not only helpful for finding bugs, but also for teaching how programs follow a strict sequential “trace” of the code.
- Being able to trace like this in your head is very important, and using such a debugger is a great way to learn that skill.
Other languages have similar debuggers:
- https://en.wikipedia.org/wiki/List_of_debuggers
- https://en.wikipedia.org/wiki/Comparison_of_debuggers
Fun fact: these are handy for reverse engineering, and cracking DRM on software: https://en.wikipedia.org/wiki/Software_cracking

Note: Actually trace the code I give you in these!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
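
To get a feel for what a step-through debugger is doing for you, here is a rough sketch of a line tracer built on Python's `sys.settrace` hook. It is only a toy for illustration (the helper `trace_lines` and the example `count_down` are made up, and the real debuggers named above do far more), but it shows the strict sequential trace with the variables at each step.

```python
import sys
from typing import Any, Callable, Dict, List, Tuple


def trace_lines(func: Callable[..., Any], *args: Any) -> Tuple[Any, List[Tuple[int, Dict[str, Any]]]]:
    """
    Run func(*args), recording (line offset within func, copy of locals)
    each time a line of func is about to execute.
    """
    events: List[Tuple[int, Dict[str, Any]]] = []
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            offset = frame.f_lineno - code.co_firstlineno
            events.append((offset, dict(frame.f_locals)))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore whatever was there before
    return result, events


def count_down(n: int) -> int:
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps


result, events = trace_lines(count_down, 2)
for offset, local_vars in events:
    print(f"line +{offset}: {local_vars}")
```

Before running it, try predicting the printed sequence by tracing `count_down(2)` in your head, then compare.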

1.7.6.2 Active automated bug finding software

Some bug-finding software actually finds bugs for you.
For example, this study I did: 10-DebuggingTesting/DMSVIVA_2018.pdf
It does not work on all programs though.
This software requires that:
1. your program is good enough to produce correct output sometimes, and not sometimes, and
2. that you actually find some inputs that produce correct outputs and some inputs that produce incorrect outputs.
Once you have done that, the experimental debugger can give you a hint as to which line the bug may be on.
Other advanced approaches exist:
- https://en.wikipedia.org/wiki/Automatic_bug_fixing
- https://en.wikipedia.org/wiki/Software_testing#Automated_testing

1.8 Heisenbugs

https://en.wikipedia.org/wiki/Heisenbug
:)
When you get to languages like C++, the very act of printing to find a bug, may hide your bug!
Oh the uncertainty!

1.9 Software testing

https://en.wikipedia.org/wiki/Software_testing

There are many approaches available in software testing.
Reviews, walk-throughs, or inspections are referred to as static testing, pre-processing, or linting, whereas executing programmed code with a given set of test cases is referred to as dynamic testing.

1.9.1 Static, dynamic, passive, active

Static testing
- often implicit, like proofreading, plus when programming tools/text editors check source code structure or compilers (pre-compilers) check syntax and data flow as static program analysis.
Dynamic testing
- takes place when the program itself is run.
- Dynamic testing may begin before the program is 100% complete in order to test particular sections of code, and is applied to discrete functions or modules.
- Typical techniques for these are either using stubs/drivers or execution from a debugger environment.
Passive testing
- means verifying the system behavior without any interaction with the software product.
- In contrast to active testing, testers do not provide any test data but look at system logs and traces.
- They mine for patterns and specific behavior in order to make some kind of decisions. This is related to offline run-time verification and log analysis.

1.9.2 Unit, integration, and validation testing

Unit testing tests little code chunks
Integration testing tests bigger aggregate chunks
Validation testing tests for fulfilling purpose, often by humans

1.9.2.1 1. Unit testing

https://en.wikipedia.org/wiki/Unit_testing
* Unit testing is a software testing method by which individual units of source code, sets of one or more computer program modules together with associated control data, usage procedures, and operating procedures, are tested to determine whether they are fit for use.
* Unit tests are typically automated tests written and run by software developers to ensure that a section of an application (known as the “unit”) meets its design and behaves as intended.
* The unit is typically a function and its return value or changes produced.
* To isolate issues that may arise, each test case should be tested independently.

Each test should be testing one (and only one) small unit of your software (typically a function)!
A single test is typically broken up into:
- Bootstrapping your test
- Expected behavior
- Actual behavior (running the code unit that is being tested)
- Comparing expected to actual behavior
- If actual behavior does not match expected behavior,
  - print out an informative message describing the problem
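
A minimal sketch of a single test laid out in exactly those parts is below. The unit under test, `absolute_difference`, is a made-up example function, and the test follows the same plain-function style used elsewhere in these notes rather than any particular framework.

```python
def absolute_difference(a: int, b: int) -> int:
    """The unit under test (a hypothetical example function)."""
    return abs(a - b)


def absolute_difference_test() -> bool:
    # Bootstrap: set up the inputs
    a, b = 3, 10
    # Expected behavior
    expected = 7
    # Actual behavior: run the unit being tested
    actual = absolute_difference(a, b)
    # Compare expected to actual
    if actual != expected:
        print(f"absolute_difference({a}, {b}) returned {actual}, expected {expected}")
        return False
    return True


print("passed" if absolute_difference_test() else "failed")
```

The failure message prints the inputs, the actual value, and the expected value, so whoever sees it does not need to re-run anything to know what went wrong.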

1.9.2.1.1 Flaws

Testing will not catch every error in the program, because it cannot evaluate every execution path in any but the most trivial programs.
This problem is a super-set of the halting problem, which is undecidable.
The same is true for unit testing.
Additionally, unit testing, by definition, only tests the functionality of the units themselves.
Therefore, it will not catch integration errors, or broader system-level errors (such as functions performed across multiple units, or non-functional test areas such as performance).

1.9.2.1.2 Frameworks

Different pre-built testing frameworks exist in many languages:
https://en.wikipedia.org/wiki/List_of_unit_testing_frameworks

For example, in Python:
https://en.wikipedia.org/wiki/List_of_unit_testing_frameworks#Python

One really neat one we will cover later this semester is:
https://en.wikipedia.org/wiki/Doctest
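
As a small preview, a doctest is just an example interpreter session pasted into a docstring. The function `shift_letter` below is a made-up example (not the Caesar code from class), and `doctest.run_docstring_examples` is one of several ways to run the examples; `doctest.testmod()` is the more common choice for a whole file.

```python
import doctest


def shift_letter(letter: str, key: int) -> str:
    """
    Shift one uppercase letter forward by key positions, wrapping at Z.

    >>> shift_letter("A", 1)
    'B'
    >>> shift_letter("Z", 1)
    'A'
    """
    alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    return alphabet[(alphabet.index(letter) + key) % len(alphabet)]


# Runs the examples in the docstring; it prints a report only for failures
doctest.run_docstring_examples(shift_letter, {"shift_letter": shift_letter})
```

Silence means every example produced exactly the output written in the docstring.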

1.9.2.2 2. Integration testing

https://en.wikipedia.org/wiki/Integration_testing
* Integration testing (sometimes called integration and testing, abbreviated I&T) is the phase in software testing in which individual software modules are combined and tested as a group.
* Integration testing is conducted to evaluate the compliance of a system or component with specified functional requirements.
* It occurs after unit testing and before validation testing.

1.9.2.3 3. Validation testing

https://en.wikipedia.org/wiki/Software_verification_and_validation
In software project management, software testing, and software engineering, verification and validation (V&V) is the process of checking that a software system meets specifications and that it fulfills its intended purpose.
It may also be referred to as software quality control.

1.9.3 Development strategies

https://en.wikipedia.org/wiki/Test-driven_development (I suggest reading this one)

Test-driven development (TDD) is a software development process that relies on the repetition of a very short development cycle:
- Requirements are turned into very specific test cases, then the software is improved so that the tests pass.
- This is opposed to software development that allows software to be added that is not proven to meet requirements.
TDD emphasizes writing tests first that guide the development of your code.
Don’t write your code first!
Instead, start by writing tests
Progression of development: Fail, Pass, Repeat
- Scaffold your code:
  - Define interface (i.e. header files),
  - Return dummy values that do nothing and cause your test to fail
- Implement a single test: bootstrap, expected behavior, actual behavior, comparison
- Run your tests (it should fail)
- Go back to code and fix it
- Run your tests (it should pass)
- Move on to next test
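
One pass through the Fail, Pass cycle can be sketched as follows. All names here are hypothetical, and the stub and the finished function are kept side by side only so that both stages can be run in one file; in practice you would edit the same function in place.

```python
def is_even_stub(number: int) -> bool:
    """Step 1: scaffold with a dummy return value so the test can run."""
    return False


def is_even(number: int) -> bool:
    """Step 2: the real implementation, written after the test failed."""
    return number % 2 == 0


def is_even_test(candidate) -> bool:
    """The test is written first and reused unchanged for both versions."""
    cases = [(0, True), (1, False), (2, True), (7, False)]
    for value, expected in cases:
        actual = candidate(value)
        if actual != expected:
            print(f"{candidate.__name__}({value}) returned {actual}, expected {expected}")
            return False
    return True


print("stub passes:", is_even_test(is_even_stub))  # fails first
print("implementation passes:", is_even_test(is_even))  # then passes
```

Seeing the test fail against the stub matters: it shows that the test is actually capable of detecting a wrong answer.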

1.10 Examples of debugging

1.10.1 Example 1: sorting bug

The program given below is supposed to sort a 1D array of integers into ascending order.
For example, {3, 2, 1} should become {1, 2, 3}.

10-DebuggingTesting/debug_example1.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

from typing import List

# line 1
swapped: int = 1
a: List[int] = [3, 2, 1]

while swapped == 1:  # line 2
    swapped = 0  # line 3
    for i in range(1, len(a)):  # line 4
        if a[i - 1] > a[i]:  # line 5
            a[i] = a[i - 1]  # line 6
            a[i - 1] = a[i]  # line 7
            swapped = 1  # line 8
            # line 9
        # line 10
    # line 11
# line 12

Control flow graph for the code
10-DebuggingTesting/pasted_image.png
The code contains a bug (i.e., it does not produce correct results for some inputs).
Given below are the results for 5 test cases including a code trace for each test case:

Via graph analysis, we determine the bug is in lines 6-9 of the code.
Namely, the code isn’t swapping a[i] and a[i-1] correctly.
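
The contents of the fix file are not reproduced in these notes, so the following is only one possible fix, sketched as a function (`bubble_sort` is my own name for it). The original lines 6 and 7 overwrite a[i] before its old value has been saved, so both positions end up holding the same number; swapping both in one tuple assignment avoids that.

```python
from typing import List


def bubble_sort(values: List[int]) -> List[int]:
    a = list(values)  # work on a copy
    swapped = True
    while swapped:
        swapped = False
        for i in range(1, len(a)):
            if a[i - 1] > a[i]:
                # Swap both elements in one step, so neither value is
                # overwritten before it has been copied
                a[i - 1], a[i] = a[i], a[i - 1]
                swapped = True
    return a


print(bubble_sort([3, 2, 1]))  # [1, 2, 3]
```

An equally valid fix is to save a[i] in a temporary variable before overwriting it.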

10-DebuggingTesting/debug_example1_fix.py

1.10.2 Example 2: convert to binary bug

The program given below is supposed to output the binary representation of a specified positive decimal number.
- Note: this is different than the version we had you implement before.
For example, the output for the decimal number 39 should be 100111.

10-DebuggingTesting/debug_example2.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

# line 1
n: int = 10
previous_power: int = 0
m: int = 0

gap: int = 0
total: int = 0

while n > 0:  # line 2
    m = 0  # line 3

    if (n % 2) != 0:  # line 4
        total = 1  # line 5
    # line 6
    else:
        total = 2  # line 7
    # line 8
    while (total * 2) <= n:  # line 9
        total = total * 2  # line 10
        m += 1  # line 11
    # line 12

    for gap in range(1, previous_power - m):  # line 13
        print(0)  # line 14
    # line 15

    print(1)  # line 16
    n = n - total  # line 17
    previous_power = m  # line 18
# line 19

for trailingZero in range(m, 0, -1):  # line 20
    print(0)  # line 21
# line 22

print("\n")  # line 23
# line 24

Control flow graph for the code:
10-DebuggingTesting/pasted_image002.png

The code contains a bug (i.e., it does not produce correct results for some inputs).
Given below are the results for 5 test cases including a code trace for each test case:

Via graph analysis, we determine the bug is in lines 7-8 of the code.
Namely, total should never be set to 2; it should always be set to 1.

10-DebuggingTesting/debug_example2_fix.py
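
The fixed file is not reproduced here either. As a sketch of the fix described above (total always starts at 1), here is the same algorithm wrapped in a hypothetical function to_binary. It is an assumption of this sketch, not part of the original program, that the digits are collected into a string instead of printed one per call, which lets you compare the result against Python's built-in format(n, "b").

```python
from typing import List


def to_binary(n: int) -> str:
    """Binary representation of a positive integer, highest bit first."""
    digits: List[str] = []
    previous_power = 0
    m = 0
    while n > 0:
        # Find the largest power of two (2**m) that fits in n.
        m = 0
        total = 1  # the fix: always start from 2**0
        while total * 2 <= n:
            total *= 2
            m += 1
        # Zeros between the previous 1 bit and this one.
        for _ in range(1, previous_power - m):
            digits.append("0")
        digits.append("1")
        n -= total
        previous_power = m
    # Zeros below the lowest 1 bit.
    for _ in range(m, 0, -1):
        digits.append("0")
    return "".join(digits)


if __name__ == "__main__":
    print(to_binary(39))
```

Checking against a trusted reference over a whole range of inputs, rather than a handful of hand-picked values, is a cheap way to catch this kind of off-by-one bug.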

1.10.3 Example 3: Caesar cipher unit tests (no bug)

Here are some unit tests for the Caesar functions we wrote last time:
10-DebuggingTesting/debug_unit_tests.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

from typing import List
import random

caesar_encoding: str = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

def key_gen() -> int:
    """
    Generates one Caesar key
    """
    # Correct
    key = random.randint(1, 26)

    # Bug
    # key = random.randint(0, 27)
    return key

def key_gen_test() -> bool:
    # What else could we test here?
    # What about randomness?
    # Will this test always work?
    # Does it guarantee correctness?
    for _ in range(300):
        key = key_gen()  # draw once, so both bounds are checked on the same key
        if (key > 26) or (key < 1):
            print("key_gen_test() failed")
            return False
    return True

def str_to_num_arr(message: str) -> List[int]:
    """
    Translates a string into a Caesar-encoded List
    """
    arr: List[int] = []
    for character in message:
        arr.append(caesar_encoding.find(character.upper()))
    return arr

def str_to_num_arr_test() -> bool:
    # TODO
    return True

def num_arr_to_str(encoded_arr: List[int]) -> str:
    """
    Translates a Caesar encoded list back into a string
    """
    plaintext: List[str] = []
    for counter, encoded_char in enumerate(encoded_arr):
        plaintext.append(caesar_encoding[encoded_char])
    plaintext_string = "".join(plaintext)
    return plaintext_string

def num_arr_to_str_test() -> bool:
    # TODO
    return True

def translate(encoded_arr: List[int], mode: int, key: int) -> List[int]:
    """
    Encrypts or decrypts a Caesar-encoded List of ints
    """
    translated: List[int] = []
    for encoded_char in encoded_arr:
        if mode == 1:
            # 27 is the symbol set size (# letters in alphabet)
            # Note the space added above (bug from last time)!
            translated.append((encoded_char + key) % 27)
        else:
            translated.append((encoded_char - key) % 27)
    return translated

def translate_test() -> bool:
    # TODO
    return True

def encrypt_test() -> bool:
    for character in caesar_encoding:
        for key in range(1, 27):  # 1..26, the same range key_gen() produces
            # Encrypt
            message_arr = str_to_num_arr(character)
            message_arr = translate(message_arr, 1, key)
            message = num_arr_to_str(message_arr)

            # Decrypt
            message_arr = str_to_num_arr(message)
            message_arr = translate(message_arr, 0, key)
            message = num_arr_to_str(message_arr)

            # check
            if message != character:
                print("encrypt_test() failed")
                return False
    return True

def run_tests() -> bool:
    """
    Note: `and` short-circuits (it stops at the first failing test),
    so you should put the simple tests first,
    and the more aggregate/complicated tests last.
    """
    return (
        key_gen_test()
        and str_to_num_arr_test()
        and num_arr_to_str_test()
        and translate_test()
        and encrypt_test()
    )

# Only runs main if tests pass:
if __name__ == "__main__" and run_tests():
    message: str = input("\nEnter your message, in English:\n")
    gen_key: str = input("Want to generate a key? (y/n)")

    if gen_key == "y":
        ok: int = 0
        while ok == 0:
            key = key_gen()
            print(f"The key is {key}. Is it ok with you (1-yes, 0-no, make another): ")
            ok = int(input())
    else:
        key = int(input("What is your key (1-26)?"))

    print("\nYour Caesar key is: ")
    print(key)
    print("\n Share this with your partner. Don't tell anyone else\n")

    print("\nEnter 1 for encryption, and 0 for decryption: ")
    mode: int = int(input())
    message_arr = str_to_num_arr(message)
    message_arr = translate(message_arr, mode, key)
    message = num_arr_to_str(message_arr)
    print(message)

Ask in class: How would we complete the # TODO tests above?
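
One possible way to fill in the TODO tests, as a sketch rather than the official answer: compare each function against a few input/output pairs worked out by hand, including edge cases such as the empty message, lowercase input, the space symbol, and wrap-around at both ends of the symbol set. The three functions are restated in condensed form only so that this block runs on its own; the test cases are my own choices.

```python
from typing import List

caesar_encoding: str = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "


def str_to_num_arr(message: str) -> List[int]:
    return [caesar_encoding.find(ch.upper()) for ch in message]


def num_arr_to_str(encoded_arr: List[int]) -> str:
    return "".join(caesar_encoding[n] for n in encoded_arr)


def translate(encoded_arr: List[int], mode: int, key: int) -> List[int]:
    sign = 1 if mode == 1 else -1
    return [(n + sign * key) % 27 for n in encoded_arr]


def str_to_num_arr_test() -> bool:
    # Hand-computed pairs: empty, single letter, lowercase, and the space.
    cases = [("", []), ("A", [0]), ("abc", [0, 1, 2]), ("Z ", [25, 26])]
    for message, expected in cases:
        if str_to_num_arr(message) != expected:
            print("str_to_num_arr_test() failed")
            return False
    return True


def num_arr_to_str_test() -> bool:
    cases = [([], ""), ([0], "A"), ([7, 8], "HI"), ([25, 26], "Z ")]
    for encoded, expected in cases:
        if num_arr_to_str(encoded) != expected:
            print("num_arr_to_str_test() failed")
            return False
    return True


def translate_test() -> bool:
    # (input, mode, key, expected), with wrap-around in both directions.
    cases = [
        ([0], 1, 1, [1]),  # A encrypts to B
        ([26], 1, 1, [0]),  # space wraps around to A
        ([0], 0, 1, [26]),  # A decrypts to space
        ([25, 26], 1, 3, [1, 2]),
    ]
    for encoded, mode, key, expected in cases:
        if translate(encoded, mode, key) != expected:
            print("translate_test() failed")
            return False
    return True


if __name__ == "__main__":
    print(str_to_num_arr_test() and num_arr_to_str_test() and translate_test())
```

A case worth discussing: str_to_num_arr("!") returns [-1], because find() returns -1 for a character that is not in the symbol set. What should the program do with that?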

You should write tests like this for every single programming assignment!

1.10.4 Example 4: hangman game (no overt bug)

A children’s game, covered in the book chapters below:
https://en.wikipedia.org/wiki/Hangman_(game)
http://inventwithpython.com/invent4thed/chapter7.html
http://inventwithpython.com/invent4thed/chapter8.html
http://inventwithpython.com/invent4thed/chapter9.html

10-DebuggingTesting/hangman.py (no typing)
10-DebuggingTesting/hangman_cfg.svg

#!/usr/bin/python3
# -*- coding: utf-8 -*-

# TODO next time, I'll show you how to auto-type code you download,
# like this I downloaded from one of the free books in this class:

import random

HANGMAN_PICS = [
    """
  +---+
      |
      |
      |
     ===""",
    """
  +---+
  O   |
      |
      |
     ===""",
    """
  +---+
  O   |
  |   |
      |
     ===""",
    """
  +---+
  O   |
 /|   |
      |
     ===""",
    """
  +---+
  O   |
 /|\\  |
      |
     ===""",
    """
  +---+
  O   |
 /|\\  |
 /    |
     ===""",
    """
  +---+
  O   |
 /|\\  |
 / \  |
     ===""",
]
words = "ant baboon badger bat bear beaver camel cat clam cobra cougar coyote crow deer dog donkey duck eagle ferret fox frog goat goose hawk lion lizard llama mole monkey moose mouse mule newt otter owl panda parrot pigeon python rabbit ram rat raven rhino salmon seal shark sheep skunk sloth snake spider stork swan tiger toad trout turkey turtle weasel whale wolf wombat zebra".split()

def getRandomWord(wordList):
    # This function returns a random string from the passed list of strings.
    wordIndex = random.randint(0, len(wordList) - 1)
    return wordList[wordIndex]

def displayBoard(missedLetters, correctLetters, secretWord):
    print(HANGMAN_PICS[len(missedLetters)])
    print()

    print("Missed letters:", end=" ")
    for letter in missedLetters:
        print(letter, end=" ")
    print()

    blanks = "_" * len(secretWord)

    # replace blanks with correctly guessed letters
    for i in range(len(secretWord)):
        if secretWord[i] in correctLetters:
            blanks = blanks[:i] + secretWord[i] + blanks[i + 1 :]

    # show the secret word with spaces in between each letter
    for letter in blanks:
        print(letter, end=" ")
    print()

def getGuess(alreadyGuessed):
    # Returns the letter the player entered.
    # This function makes sure the player entered a single letter,
    # and not something else.
    while True:
        print("Guess a letter.")
        guess = input()
        guess = guess.lower()
        if len(guess) != 1:
            print("Please enter a single letter.")
        elif guess in alreadyGuessed:
            print("You have already guessed that letter. Choose again.")
        elif guess not in "abcdefghijklmnopqrstuvwxyz":
            print("Please enter a LETTER.")
        else:
            return guess

def playAgain():
    # This function returns True if the player wants to play again;
    # otherwise, it returns False.
    print("Do you want to play again? (yes or no)")
    return input().lower().startswith("y")

def main():
    print("H A N G M A N")
    missedLetters = ""
    correctLetters = ""
    secretWord = getRandomWord(words)
    gameIsDone = False

    while True:
        displayBoard(missedLetters, correctLetters, secretWord)

        # Let the player enter a letter.
        guess = getGuess(missedLetters + correctLetters)

        if guess in secretWord:
            correctLetters = correctLetters + guess

            # Check if the player has won.
            foundAllLetters = True
            for i in range(len(secretWord)):
                if secretWord[i] not in correctLetters:
                    foundAllLetters = False
                    break
            if foundAllLetters:
                print('Yes! The secret word is "' + secretWord + '"! You have won!')
                gameIsDone = True
        else:
            missedLetters = missedLetters + guess

            # Check if player has guessed too many times and lost.
            if len(missedLetters) == len(HANGMAN_PICS) - 1:
                displayBoard(missedLetters, correctLetters, secretWord)
                print(
                    "You have run out of guesses!\nAfter "
                    + str(len(missedLetters))
                    + " missed guesses and "
                    + str(len(correctLetters))
                    + ' correct guesses, the word was "'
                    + secretWord
                    + '"'
                )
                gameIsDone = True

        # Ask the player if they want to play again (but only if the game is done).
        if gameIsDone:
            if playAgain():
                missedLetters = ""
                correctLetters = ""
                gameIsDone = False
                secretWord = getRandomWord(words)
            else:
                break

if __name__ == "__main__":
    main()

How would you design unit tests for the functions in this game?
How would you design integration tests for the main in this game?
How could you check to make sure there are no superficial type-conflicts (a type of bug)?
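
As a starting point for the first question, here is a sketch of unit tests for two of the functions. The difficulty is that getGuess() reads from the keyboard, so the test replaces input() with scripted answers using unittest.mock.patch from the standard library, and silences the prompts with contextlib.redirect_stdout. The two game functions are copied from above so the block runs on its own; the test function names and the scripted inputs are hypothetical. For the third question, one option (an assumption on my part, not something shown above) is to add type hints and run a static type checker such as mypy.

```python
import io
import random
from contextlib import redirect_stdout
from unittest.mock import patch


def getRandomWord(wordList):
    wordIndex = random.randint(0, len(wordList) - 1)
    return wordList[wordIndex]


def getGuess(alreadyGuessed):
    while True:
        print("Guess a letter.")
        guess = input()
        guess = guess.lower()
        if len(guess) != 1:
            print("Please enter a single letter.")
        elif guess in alreadyGuessed:
            print("You have already guessed that letter. Choose again.")
        elif guess not in "abcdefghijklmnopqrstuvwxyz":
            print("Please enter a LETTER.")
        else:
            return guess


def getRandomWord_test() -> bool:
    # Every draw must come from the list we passed in.
    sample = ["ant", "bat", "cat"]
    return all(getRandomWord(sample) in sample for _ in range(100))


def getGuess_test() -> bool:
    # Scripted keystrokes: too long, already guessed, not a letter, then valid.
    scripted = ["ab", "a", "7", "Z"]
    with patch("builtins.input", side_effect=scripted):
        with redirect_stdout(io.StringIO()):
            result = getGuess("a")
    # The three bad inputs are rejected and the uppercase Z is lowercased.
    return result == "z"


if __name__ == "__main__":
    print(getRandomWord_test() and getGuess_test())
```

The same patching idea extends to an integration test of main(): script a whole game's worth of inputs, capture the output, and check for the win or lose message.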

1.11 Conclusions

Remember:
10-DebuggingTesting/planning_coding.jpg
…that was sarcastic, and in case you didn’t notice, reverse it!