1 18-InputOutput



On this page, we cover the Linux pipe, Standard I.O. redirection, and Python file handling!
18-InputOutput/penguin_pipe.jpg

1.1 Screencasts

1.2 Program-internal function input and output (I.O.)

Within a function, you can access other variables in your program in a variety of ways:
18-InputOutput/funcio.png
For example in python:

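some_global = 0

def procedure(input_parameter_value, thing_by_reference):
    global some_global
    print(input_parameter_value)                    # output via standard out
    some_global = input_parameter_value             # output via a global variable
    thing_by_reference[1] = input_parameter_value   # output via a mutable argument
    return input_parameter_value                    # output via the return value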

++++++++++++++ Discussion question:
What are some advantages of shells over dedicated non-shell programming languages?

Note:
We deal with the heap (and stack) later, in C and C++.

1.3 Standard in and out (standard stream IO)

This is actually an OS/Shell topic (bigger than just python).
Command line programs use standard input, output, and error, to interact with the user.
Graphical User Interface programs generally do not use this type of IO.

https://en.wikipedia.org/wiki/Standard_streams
https://ryanstutorials.net/linuxtutorial/piping.php
https://www.informit.com/articles/article.aspx?p=2854374&seqNum=5
https://catonmat.net/bash-one-liners-explained-part-three
https://ss64.com/bash/syntax-redirection.html
https://www.w3resource.com/linux-system-administration/i-o-redirection.php

From within your choice of shell,
you can trade data between processes running within the operating system,
as well as the keyboard, monitor/screen, and files on disk.
The keyboard is used to enter a command,
and respond to an interactive program/process,
and the process spawned by the command prints standard output and error to the monitor/screen:
18-InputOutput/stdio00.jpg

After a program/process is run,
input from keyboard can be interactively sent to the process,
and output from process sent to the monitor:
18-InputOutput/stdio01.jpg

Two output streams exist,
one for general output,
the other for error output:
18-InputOutput/stdio04.jpg
While these both head to the screen for your viewing,
a process can write to files as well (more to come later).

The streams are numbered pseudo-file devices in Linux:
18-InputOutput/stdio05.png
Programs/processes can interact with these pseudo-file input/output devices.
18-InputOutput/standard-streams.png
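
In Python, the two output streams are available as sys.stdout and sys.stderr (streams 1 and 2 in the numbering above). A minimal sketch of writing to each; the function name report is made up for illustration:

```python
import sys


def report(message, error=None):
    """Print normal output to stdout and, if given, an error to stderr."""
    print(message)                      # goes to standard output (stream 1)
    if error is not None:
        print(error, file=sys.stderr)   # goes to standard error (stream 2)


report("all good")
report("partial result", error="warning: something went wrong")
```

On a terminal both lines show up on the screen, but a shell can still separate them because they travel on different streams.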

A command line utility also has an exit code,
which communicates its success or failure (by convention, 0 means success):
18-InputOutput/std_in_out_err.png
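
The exit code can also be inspected from Python. The sketch below assumes the child program is just a Python one-liner run with the current interpreter; the helper name exit_code_of is hypothetical:

```python
import subprocess
import sys


def exit_code_of(python_source):
    """Run a short Python snippet in a child process and return its exit code."""
    completed = subprocess.run([sys.executable, "-c", python_source])
    return completed.returncode


print(exit_code_of("pass"))                     # 0 means success, by convention
print(exit_code_of("import sys; sys.exit(3)"))  # non-zero signals failure
```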

There are many inputs and outputs for a process within an operating system:
18-InputOutput/processio.png

+++++++++++ Cahoot-18a-1

1.3.1 Redirection

The shell manipulates standard IO via redirection, using the special characters:
< and >

One can intercept standard output before it hits the default monitor,
and instead redirect it to a file:
18-InputOutput/stdio02.jpg
For example

#!/bin/bash

ls
ls >file.txt
cat file.txt
less file.txt

# This overwrites the above file
echo hey >file.txt
less file.txt

# This appends to the above file
ls >>file.txt
less file.txt

# This produces an error
ls nonexistentfile

# Re-directing std-err
ls nonexistentfile 2>stderr.txt
less stderr.txt

# What shows up here, when file.txt exists?
ls nonexistentfile file.txt 2>stderr_only.txt
less stderr_only.txt

# Send both std-err and std-out to a file:
ls nonexistentfile file.txt >stderr_and_stdout.txt 2>&1
less stderr_and_stdout.txt

# Alternative, to do the same:
ls nonexistentfile file.txt &>stderr_and_stdout.txt
less stderr_and_stdout.txt
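
The same kind of redirection can be requested when launching a process from Python. This is a rough sketch, not the shell's own mechanism: the child program CHILD is a stand-in one-liner, and run_redirected is a made-up helper name.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A stand-in child program that writes one line to each output stream.
CHILD = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"


def run_redirected(path):
    """Roughly `command >path 2>&1`: both streams end up in the same file."""
    with open(path, "w") as outfile:
        subprocess.run([sys.executable, "-c", CHILD],
                       stdout=outfile, stderr=subprocess.STDOUT)
    return Path(path).read_text()


with tempfile.TemporaryDirectory() as tmp:
    print(run_redirected(Path(tmp) / "stderr_and_stdout.txt"))
```

Both lines land in the file; their relative order depends on how the child buffers each stream, so it is best not to rely on it.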

One can trick a program into thinking that a file’s contents are being typed into its interactive standard input:
18-InputOutput/stdio03.jpg
For example:

#!/bin/bash

read realtyping
# type something
echo $realtyping

# Write hey into a file, to use as input below
echo hey >temp.txt

# Feed temp.txt into read's standard input
read trickedread <temp.txt
echo $trickedread

Or with python
18-InputOutput/std_io_00_echo.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

print(input("Type something: "))

#!/bin/bash

python3 std_io_00_echo.py
# type something, watch it echo

# Capture output while still typing
python3 std_io_00_echo.py >capturedout.txt
# type something
less capturedout.txt

# Feed input, print to std-out
python3 std_io_00_echo.py <capturedout.txt
less capturedout.txt

# Feed input and capture output
python3 std_io_00_echo.py <capturedout.txt >metacapturedoutput.txt
less metacapturedoutput.txt
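
Feeding input and capturing output can also be done from a parent Python process. In this sketch, ECHO is a one-line stand-in for std_io_00_echo.py, and run_with_input is a hypothetical helper:

```python
import subprocess
import sys

# Stand-in for std_io_00_echo.py: echo one line read from standard input.
ECHO = 'print(input("Type something: "))'


def run_with_input(text):
    """Roughly `python3 echo.py <file >captured`: feed stdin, capture stdout."""
    completed = subprocess.run([sys.executable, "-c", ECHO],
                               input=text, capture_output=True, text=True)
    return completed.stdout


print(run_with_input("hey\n"))
```

The child cannot tell whether its input came from a keyboard, a file, or the parent process.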

+++++++++++ Cahoot-18a-2

1.3.2 Pipes

https://en.wikipedia.org/wiki/Pipeline_(Unix)
One can send data from one program’s standard-out to another program’s standard-in.
The stdout/stdin arrows at right hand side are pipes:
18-InputOutput/pipe.jpg
The command process1 | process2 intercepts stdout from the first process and feeds it into stdin for the second process:
18-InputOutput/pipe.png
For example:

#!/bin/bash

ls | less
echo "hey there you" | grep --color=auto hey
echo "hey there you" | grep --color=auto there
ls | sort | less

# hash a string with a pipe...
echo smoke | sha256sum

These tools allow you to blaze through data on your filesystem!

Another example:
18-InputOutput/pipe-example.JPG

#!/bin/bash

ls -l | grep '^d' | sort | less

# or a sorted list of the python files in the directory
ls | grep '\.py$' | sort | less
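
A pipe can be built by hand in Python too. The sketch below assumes two stand-in programs (PRODUCER and SORTER, both Python one-liners) in place of real command line tools; pipeline is a made-up name:

```python
import subprocess
import sys

# Stand-ins for two command line programs, written as Python one-liners.
PRODUCER = "print('pear'); print('apple'); print('fig')"
SORTER = "import sys; sys.stdout.writelines(sorted(sys.stdin))"


def pipeline():
    """Roughly `producer | sorter`: stdout of one process feeds stdin of the next."""
    first = subprocess.Popen([sys.executable, "-c", PRODUCER],
                             stdout=subprocess.PIPE)
    second = subprocess.Popen([sys.executable, "-c", SORTER],
                              stdin=first.stdout, stdout=subprocess.PIPE, text=True)
    first.stdout.close()          # the parent no longer needs its copy of the pipe
    output, _ = second.communicate()
    first.wait()
    return output


print(pipeline())
```

In everyday use the shell's | is far shorter; this only shows what the shell arranges on your behalf.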

+++++++++++ Cahoot-18a-3

1.3.3 Summary of stdio, redirection, and piping

18-InputOutput/pipe-example1.png

1.4 Python internal standard I.O.

While this is an OS/Shell topic,
there are some neat python tricks,
like overwriting sys.stdin and sys.stdout in your program.
See code for examples: 18-InputOutput/std_io_01_stdio.py
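
As a minimal sketch of that trick (not the contents of std_io_01_stdio.py, which is not shown here), the standard streams can be swapped for in-memory text streams; greet and run_with_fake_streams are hypothetical names:

```python
import io
import sys


def greet():
    """Read a name from standard input and write a greeting to standard output."""
    name = input()
    print(f"hello {name}")


def run_with_fake_streams(typed_text):
    """Temporarily replace sys.stdin and sys.stdout with in-memory text streams."""
    real_stdin, real_stdout = sys.stdin, sys.stdout
    sys.stdin, sys.stdout = io.StringIO(typed_text), io.StringIO()
    try:
        greet()
        return sys.stdout.getvalue()
    finally:
        sys.stdin, sys.stdout = real_stdin, real_stdout   # always restore


print(run_with_fake_streams("penguin\n"))
```

The try/finally matters: if the real streams are not restored, later prints silently disappear into the fake one.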

1.4.1 Subprocess std-io from within python

https://docs.python.org/3/library/subprocess.html
The subprocess module allows running various system shell commands from within python.
See code for examples:
18-InputOutput/std_io_01_stdio.py

1.4.2 Mock/patch

Some unit testing frameworks allow you to simulate standard input and output (but overwriting sys.stdin and sys.stdout is usually better/easier):
19-TestingFrameworks.html
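
For comparison, here is a small sketch using the standard library's unittest.mock; the function under test, ask_and_greet, is invented for the example:

```python
import io
from unittest import mock


def ask_and_greet():
    """Prompt for a name on standard input and print a greeting."""
    name = input("Name: ")
    print(f"hello {name}")


def test_ask_and_greet():
    # Replace input() with a canned answer and capture what gets printed.
    with mock.patch("builtins.input", return_value="penguin"), \
         mock.patch("sys.stdout", new_callable=io.StringIO) as fake_out:
        ask_and_greet()
    assert fake_out.getvalue() == "hello penguin\n"


test_ask_and_greet()
```

The patches are undone automatically when the with block ends, so no manual restore is needed.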

1.4.3 Un-buffered non-blocking keyboard input

What if you want python to use the characters you type in, but without hitting enter?
* See code for a pure-python example: 18-InputOutput/std_io_02_getch.py
* or use ncurses:
* https://docs.python.org/3/library/curses.html
* https://docs.python.org/3/howto/curses.html

1.4.4 Conclusions

Command line programs employ standard IO,
and GUI (graphical user interface) programs generally do not.

18-InputOutput/zipfile.jpg

1.5 Direct python file I.O.

How do we read and write files in python?
This type of input can work for BOTH a command line program, and a GUI (graphical user interface) program.

1.5.1 File types

Files come in two major forms (with lots of sub-types):
1. Text (un-structured text, structured text, etc.)
2. Binary (At least some of the file is actual byte-based data.)

1.5.2 1. Text

1.5.2.1 ASCII and Unicode

18-InputOutput/ASCII_table1.png
https://home.unicode.org/
https://en.wikipedia.org/wiki/Unicode
https://en.wikipedia.org/wiki/UTF-8 (ASCII-compatible)

1.5.2.2 Special characters

18-InputOutput/escape.png

1.5.2.3 Newlines

https://en.wikipedia.org/wiki/Newline
18-InputOutput/eol.png
Watch out when moving files and source code from OS to OS!
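
One way to normalize line endings is to work on raw bytes. This is only a sketch of the idea and may differ from what file_io_01_dos2unix.py does; dos_to_unix is a made-up name:

```python
def dos_to_unix(data: bytes) -> bytes:
    """Convert Windows (CRLF) line endings to Unix (LF) line endings."""
    return data.replace(b"\r\n", b"\n")


windows_text = b"first line\r\nsecond line\r\n"
print(dos_to_unix(windows_text))
```

Working in bytes avoids Python's own newline translation, which happens when a file is opened in text mode.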

1.5.3 2. Binary files

https://en.wikipedia.org/wiki/Binary_file
For example, ghex /usr/bin/bash:
18-InputOutput/binary-file.png
Binary files typically contain bytes,
that are intended to be interpreted as something other than text characters,
for example image data or executable code.
Compiled computer programs are typical examples;
indeed, compiled applications are sometimes referred to,
particularly by programmers, as binaries.
Binary files can also contain
images, sounds, compressed versions of other files, etc.;
in short, any type of file content whatsoever.
Even text files can be read in binary mode,
but then you have to interpret the bytes.
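
To illustrate, the sketch below writes a small UTF-8 file to a temporary directory and reads it back both ways; the file name and the helper read_both_ways are assumptions made for the example:

```python
import tempfile
from pathlib import Path


def read_both_ways(path):
    """Return the same file's contents as decoded text and as raw bytes."""
    with open(path, "r", encoding="utf-8") as text_file:
        as_text = text_file.read()
    with open(path, "rb") as binary_file:
        as_bytes = binary_file.read()
    return as_text, as_bytes


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes("café\n".encode("utf-8"))
    text, raw = read_both_ways(sample)
    print(repr(text))                    # the decoded characters
    print(raw)                           # the underlying bytes
    print(raw.decode("utf-8") == text)   # interpreting the bytes yourself
```

The accented character is one character in text mode but two bytes in binary mode, which is why the encoding has to be known.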

1.5.4 Python-specific file handling

What is the syntax and best practice for reading and writing files in python?

1.5.4.1 Files and code for this section

18-InputOutput/file_io_00_python_io.py
18-InputOutput/mbox-short.txt
18-InputOutput/data.txt
18-InputOutput/file_io_01_dos2unix.py

1.5.4.2 Reading

http://scipy-lectures.org/intro/language/io.html
https://automatetheboringstuff.com/2e/chapter9/
https://automatetheboringstuff.com/2e/chapter10/
https://books.trinket.io/pfe/07-files.html
https://docs.python.org/3.5/library/functions.html#open
https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files
https://python.swaroopch.com/io.html (general, other stuff too)
https://realpython.com/read-write-files-python/ (good)
https://realpython.com/working-with-files-in-python/ (extensive examples)
https://www.python-course.eu/python3_file_management.php
https://www.tutorialspoint.com/python3/python_files_io.htm

1.5.4.3 with (context managers)

How does the “context manager” with work, and why should you use it?
https://book.pythontips.com/en/latest/context_managers.html
https://jeffknupp.com/blog/2016/03/07/python-with-context-managers/
https://en.wikibooks.org/wiki/Python_Programming/Context_Managers
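
A short sketch of both sides of with: using it on a file, and writing a tiny context manager by hand. The names announced and write_then_check are invented for illustration:

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def announced(label, log):
    """A tiny hand-written context manager that records entry and exit."""
    log.append(f"enter {label}")
    try:
        yield
    finally:
        log.append(f"exit {label}")   # runs even if the body raises


def write_then_check(path):
    """Write with `with`, then confirm the file was closed on leaving the block."""
    with open(path, "w") as handle:
        handle.write("hello\n")
    return handle.closed


with tempfile.TemporaryDirectory() as tmp:
    print(write_then_check(Path(tmp) / "demo.txt"))

events = []
with announced("block", events):
    events.append("inside")
print(events)
```

The main reason to use with for files is the guaranteed cleanup: the file is closed even when the body raises an exception.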

1.5.5 Special types of files

These are a couple common file types.

1.5.5.1 Code and file demo

18-InputOutput/file_io_00_python_io.py
18-InputOutput/employee_birthday.csv

1.5.5.2 csv (a type of text file)

A comma-separated values (CSV) file is a delimited text file that uses a comma to separate values. Each line of the file is a data record. Each record consists of one or more fields, separated by commas. The use of the comma as a field separator is the source of the name for this file format. A CSV file typically stores tabular data (numbers and text) in plain text, in which case each line will have the same number of fields.
https://en.wikipedia.org/wiki/Comma-separated_values
https://tools.ietf.org/html/rfc4180
https://docs.python.org/3/library/csv.html
https://realpython.com/lessons/reading-csvs-pythons-csv-module/
https://realpython.com/python-csv/
https://automatetheboringstuff.com/2e/chapter16/
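
A minimal sketch with the standard csv module, using in-memory text instead of a real file. The sample rows are made up and are not the contents of employee_birthday.csv:

```python
import csv
import io


def parse_csv(text):
    """Parse CSV text with a header row into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def to_csv(rows, fieldnames):
    """Serialize a list of dictionaries back to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample data; a quoted field may itself contain a comma.
sample = 'name,department\n"Doe, John",Accounting\nErica,IT\n'
rows = parse_csv(sample)
print(rows[0]["name"])
print(to_csv(rows, ["name", "department"]) == sample)
```

The quoted field is the reason to prefer the csv module over splitting on commas by hand. When reading or writing real files, the csv documentation recommends opening them with newline="".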

1.5.5.3 pickle (binary and/or text file?)

Use pickle if you want to save python objects to a file, “preserving” or pickling them for later use.
https://en.wikipedia.org/wiki/Serialization#Pickle
https://docs.python.org/3/library/pickle.html
https://wiki.python.org/moin/UsingPickle
And more from the other links above.
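
A small round-trip sketch; round_trip is a hypothetical helper and the sample object is arbitrary:

```python
import pickle


def round_trip(obj):
    """Pickle an object to bytes, then restore a copy from those bytes."""
    data = pickle.dumps(obj)          # a bytes object, so files need mode "wb"/"rb"
    return data, pickle.loads(data)


original = {"name": "penguin", "scores": [1, 2, 3], "pair": (4.5, None)}
data, restored = round_trip(original)
print(type(data).__name__)
print(restored == original)
```

Since pickle.dumps returns bytes, a pickle file is opened in binary mode. As the pickle documentation warns, only unpickle data that you trust.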

1.5.6 Conclusions

Command line programs and GUI programs can both employ file IO.

Remember the shells we talked about? 02-GitLinuxBash.html
18-InputOutput/shells.png

1.6 Standard library modules for file-processing and system interaction

1.6.1 Conclusions

Command line programs and GUI programs can both employ system IO modules for their interaction with the operating system.

1.7 Command line arguments

ls          # is the base program, list
ls -a       # specifies the all option
ls -l       # specifies the long option
ls -a -l    # both options together
ls -al      # same as above, both
ls -al *.py # lists all python files in directory
man ls      # ls was actually an argument to man

1.7.1 Python command line arguments

The user’s shell splits the command-line string into words before passing them as command-line arguments to Python, which assigns them to sys.argv:
18-InputOutput/argument_parser.png
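
The standard shlex module can approximate this word splitting, which helps when predicting what ends up in sys.argv. It is only an approximation: a real shell also expands globs and variables before the program starts. The helper name split_like_a_shell is made up:

```python
import shlex


def split_like_a_shell(command_line):
    """Approximate how a POSIX shell splits a command line into words."""
    return shlex.split(command_line)


print(split_like_a_shell("python3 script.py -n 3 'two words' plain"))
```

The quoted 'two words' arrives as a single argument, without its quotes.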

There are many alternatives for writing programs that accept command line arguments upon launch:

Python-native:
1. Manually using sys.argv
2. Python native argparse/getopt.

Third-party:
3. Docopt module
4. Google’s Fire module
5. Click module

1.7.1.1 0. Overview reading

https://cli-guide.readthedocs.io/en/latest/index.html (incomplete, but pretty good intro to command line arguments).
https://pythonprogramminglanguage.com/command-line-arguments/
https://realpython.com/comparing-python-command-line-parsing-libraries-argparse-docopt-click/

1.7.1.2 1. Manual sys.argv

https://docs.python.org/3/library/sys.html

“The list of command line arguments passed to a Python script.
argv[0] is the script name (it is operating system dependent whether this is a full pathname or not).
If the command was executed using the -c command line option to the interpreter,
then argv[0] is set to the string ‘-c’.
If no script name was passed to the Python interpreter,
then argv[0] is the empty string.”
In short, sys.argv is a list[str] with sys.argv[0]="<yourscript.py>"
and, for n > 0, sys.argv[n]="<arg_n>".

Code:
18-InputOutput/arg_io_00_args.py
18-InputOutput/arg_io_01_manual.py

Notes:
This is what you use for 1-off simple stuff, not really production code.
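
A minimal sketch of the manual approach (not the contents of arg_io_01_manual.py). The function takes the argument list as a parameter so it can be tried without a real command line; the script name sum.py is made up:

```python
import sys


def main(argv):
    """Sum the numeric arguments; argv[0] is the script name and is skipped."""
    args = argv[1:]
    if not args:
        print(f"usage: {argv[0]} NUMBER [NUMBER ...]", file=sys.stderr)
        return 2                      # non-zero exit code signals misuse
    print(sum(float(arg) for arg in args))
    return 0


# In a real script this would be: sys.exit(main(sys.argv))
main(["sum.py", "1", "2.5"])
```

Every argument arrives as a string, so conversion and error handling are up to you, which is where the manual approach stops scaling.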

1.7.1.3 2. Python native argparse/getopt

Reading:

https://docs.python.org/3/howto/argparse.html
https://docs.python.org/3/library/argparse.html
https://docs.python.org/3/library/getopt.html

https://www.tutorialspoint.com/python3/python_command_line_arguments.htm
https://realpython.com/command-line-interfaces-python-argparse/

Code:
18-InputOutput/arg_io_02_parse.py

Notes:
While argparse is standard python, and it can get the job done, it is pretty mediocre, imo.
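
For reference, a small argparse sketch (not the contents of arg_io_02_parse.py). The program, its word argument, and its --count and --upper options are all invented for the example:

```python
import argparse


def build_parser():
    """Build a small parser with one positional argument and two options."""
    parser = argparse.ArgumentParser(description="Repeat a word.")
    parser.add_argument("word", help="the word to repeat")
    parser.add_argument("-n", "--count", type=int, default=1,
                        help="how many times to repeat (default: 1)")
    parser.add_argument("-u", "--upper", action="store_true",
                        help="convert the word to upper case")
    return parser


def main(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and return the resulting text."""
    args = build_parser().parse_args(argv)
    word = args.word.upper() if args.upper else args.word
    return " ".join([word] * args.count)


print(main(["hey", "-n", "3", "--upper"]))
```

Compared with the manual approach, type conversion, defaults, and a generated --help message come for free.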

1.7.1.4 3. Docopt

http://docopt.org/ (watch video here)
https://github.com/docopt/docopt (clone repo and show docopt/examples/ during lecture)

Notes:
This one is the coolest/best for real command line argument handling, imo!
It also has ports for many other languages, so you’d theoretically only need to learn one paradigm for writing command line arguments across languages:
(E.g., C++, Lua, Ruby, Scala, R, and many more)

The docopt python module is stable and working, but is also stagnant.
It has some active forks that you can track down on Github if you’re interested.
Learning docopt is actually a great way to learn the POSIX standard for command line arguments anyway!
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html

and GNU extensions to POSIX conventions:
https://www.gnu.org/prep/standards/standards.html#Command_002dLine-Interfaces
https://www.gnu.org/software/libc/manual/html_node/Argument-Syntax.html
If anything, docopt becomes a readme standard, in addition to an argument standard, which is also a good outcome.

1.7.1.5 4. Fire

https://github.com/google/python-fire
https://clay-atlas.com/us/blog/2021/04/05/python-en-fire-package-command-line-terminal/

Google’s python-fire is cool in a similar way to docopt!
It’s not posix-compatible.
It can be used to expose native python to shell with minimal modification!
Is this its primary purpose?
Maybe primarily more for exposing python in trivial/small examples,
but not for real production command line applications?

1.7.1.6 5. Click

https://click.palletsprojects.com/
https://github.com/pallets/click
https://pypi.org/project/click/
Meh, why the extra complexity?
Mediocre, imo, though it works.

1.7.2 Conclusions

In my opinion, if you care to efficiently write programs with command line arguments,
then learn:
basic sys.argv usage,
docopt, and
fire

+++++++++++ Cahoot-18d-1
