Algorithms and Programs
Contents
Algorithms and Programs¶
Our goal in this text is to understand what an algorithm is an how it works. We have selected C as the language because it has much simplier cost model than many other languages. It will be easier to answer questions like “how much memory will this code use?” Once you understand how to answer these questions in C, the concepts will easily transfer to other languages.
What is an Algorithm?¶
An algorithm is a list of steps you can follow to complete a task. A program is an algorithm that has been written in a programming language. A programming language is a language that can be understood by both humans and computers. The resulting program is something that can be run by a machine. These two concepts are often combined, but they are distinctly different.
A program has lots of specifics and limitations. It is limited by the hardware it is being run on. It needs to deal with specifications like programming language syntax and processor architecture.
An algorithm is a more general high level concept. It doesn’t even need to be related to computers at all. For example, the Euclidean algorithm was first described in 300 BC. The easiest recorded algorithms were written by the Babylonians in 2500 BC. These algorithms were written for computers, but rather humans to follow.
We can analyze both algorithms and programs. What we look for in each case will be slightly different, but in both cases we are trying to find the best solution. We will look at the algorithms and how we implement them as programs. If we can be sure an algorithm is optimal, any implementation will be asymptotically optimal. We will define asymptotically optimal formally later. For now, we will just call it “reasonably close to optimal”.
There are many algorithms you can’t program. You cannot run Overwatch on an Atari 2600, even if you had all the algorithms. It doesn’t have the hardware needed to accomplish the task. That doesn’t mean these problems are computationally impossible. It just means they can’t be implemented on this specific computer.
There must be limits to what an algorithm can do. Problems that no computer can solve, regardless of hardware. Alan Turing came up with 4 principles that make an algorithm effectively calculable [Copeland, 2020].
The algorithm is set out in terms of a finite number of exact instructions.
The algorithm will, if carried out without error, produce the desired result in a finite number of steps.
The algorithm can be carried out by a human being unaided by any machinery save paper and pencil.
The algorithm demands no insight or ingenuity on the part of the human being carrying it out.
Turing set out the idea that if all these requirements are met, then the algorithm can be implemented. If you can’t meet these requirements, then no computer can solve the problem. (Classical Computer, he didn’t know about quantum computer or molecular computation which might bend the rules.)
Alan Turing summed this up in a single sentence.
LCMs [logical computing machines: Turing’s expression for Turing machines] can do anything that could be described as “rule of thumb” or “purely mechanical”. [Turing, 1948]
All classic computers and programming language still fall within Turing’s guidelines. We aren’t even sure if any new methods of computation break them or not. It is possible, but not proven conclusively.
We will start to look at computation from the perspective of designing algorithms and then translating them into programs. We will use C as our medium for implementing programs, but we are really studing the algorithms. The algorithms are universally interesting in any programming language.
Programmers work in High Level Languages and let a computer convert it to machine code. This may be done by an interpreter or a compiler. The machine code is the actually code that is run on the computer processor, normally in binary.
Note
Machine Code: The details of how code actually runs on the operating system can be reasonably complex. Machine code is binary code that can be executed directly on the processor/operating system targeted. Exactly what the binary looks like is determined by the target OS/processor and programming language used.
Compiled vs Interpreted Languages¶
There are two broad classes of programming languages. Some languages are interpreted while others are compiled.
Compiled Languages¶
A compiled programming language requires a program called a compiler. This program converts the source code of the program into machine code. The source code is the plain text version of the program we write as programmers.
Once source code has been compiled, it can be run directly. There are a number of advantages to compiled languages. The compiled programming is called an executable. These executables can be distributed without the source code. This allows the program to be distributed without leaking any trade secrets found in the source code.
The compiler also gets to run independent of the program execution. This means it probably isn’t very important the compiler runs fast. It can take its time looking for optimizations and improvements to the code. All the work the compiler does will pay off later. Every time the executable runs, it will be faster because of the work of the compiler.
The compiler can also do analysis of the code. It can find and alert the programmer to bugs or issues that can be fixed before the code is ever run.
There are also disadvantages to compiled code. The programmer writing in the compiled language needs to wait for compilation before testing the code This can become a drag on the programmers time.
Organizations working with compiled code also need to deal with managing two sets of files. The executable files and the source code files. This can lead to additional management overhead.
Compiled languages also tend to be harder to make cross-platform. Code compiled for 32-bit linux won’t run on a 64-bit windows computer.
Interpreted Languages¶
Interpreted Languages use a program called an interpreter. The interpreter reads the source code and executes it at the same time. This can give the programmer more freedom. The source code and the executable file are one in the same. The programmer can change the source code and execute it immediately.
Interpreted Languages tend to have many advantages for the programmer. The same code can be run on any Operating System. The interpreter is the only thing that needs to be changed. The interpreter can also request memory as needed. That means the programmer doesn’t have to worry about exactly what memory needs to be allocated.
The interpreter can also introduce some problems. The interpreter has to run fast. It cannot spend much time analyzing the code before it starts executing. This limits the optimzations that can be applied ahead of time. The code also can’t be written for a specific target machine, it can’t take advantage of a special hardware feature. The code needs to run on any machine with an interpreter available. The programs limited ability to manage memory can also become a problem. If the program needs to be lightweight the interpreter might add to much overhead.
Trade-Offs¶
If there was one best solution, everyone would use it. Sadly, there are always trade-offs. It is up to the programmer, team, or company to weight the advantages and disadvantages. Then decide what is most important to them.
In the most broad sense, the two biggest differences come down to execution speed vs programming speed.
Compiled languages tend to execute more efficiently, but require more programmer effort.
Interpreted languages tend to execute slower, but require less programmer effort.
There are many other nuanced differences. These are just two of the biggest.
High Level Languages¶
An Algorithm can be implemented in any programming langauge. The difference in implementation will come down to syntax. The syntax of a language is the specific set of command/symbols/instructions used in that language.
There are a basic features that we expect in to be available in any language. When we design algorithms we assume these features are all available.
Variables¶
We need a way to store values into the computer’s RAM. A variable is a way to create an area of memory to store a value. We can change the value of a variable, which is where it gets it’s name.
In many languages, we need to declare the static type of the variable. We tell the computer how much memory and of what kind the variable needs. Other languages are dynamically typed. The compiler/interpreter figures out the type as the code executes. A dynamically typed language needs to do more work during execution because it has no idea what memory is needed ahead of time.
Three common types of variables are integers, floats, and strings. A float is a number with decimal places. A string is an ordered list of characters.
Some examples of variable declarations are shown in multiple languages below.
C
int num = 17;
float dec = 25.37;
char* word = "Example";
Go
var num int = 17
var dec float64 = 25.37
var word string = "Example"
Python 3
num = 17
dec = 25.37
word = "Example"
C++
int num = 17;
float dec = 25.37;
string word = "Example";
Java
int num = 17;
double dec = 25.37;
String word = "Example";
Basic Arithmetic¶
Every language must be able to support some basic arithmetic operations. If we can’t do math, we won’t be able to get very far. Although it is possible to make a Turing Complete language without arithmetic, it would be a very difficult language to program in. You would need to rebuild basic arithmetic yourself.
C
num = num + 10; //Addition
num = num - 2; //Subtraction
num = num / 10; //Division (normally integer by default)
num = num * 2; //Multiplication
num = num % 5; //Remainder
Go
num = num + 10 //Addition
num = num - 2 //Subtraction
num = num / 10 //Division (normally integer by default)
num = num * 2 //Multiplication
num = num % 5 //Remainder
Python 3
num = num + 10 #Addition
num = num - 2 #Subtraction
num = num // 10 #Integer Division in Python
num = num * 2 #Multiplication
num = num % 5 #Remainder
C++
num = num + 10; //Addition
num = num - 2; //Subtraction
num = num / 10; //Division (normally integer by default)
num = num * 2; //Multiplication
num = num % 5; //Remainder
Java
num = num + 10; //Addition
num = num - 2; //Subtraction
num = num / 10; //Division (normally integer by default)
num = num * 2; //Multiplication
num = num % 5; //Remainder
Loops¶
Loops allow us to execute the same code repeatedly. Loop are not a requirement, we can also create iteration using recursion. It is a common feature of most programming languages.
C
int total = 0;
int i = 0;
while( i < 10 )
{
total = total + (2*i + 1);
i = i + 1;
}
Go
//Sum up 10 odd numbers
var total int = 0
var i int = 0
for i < 10 {
total = total + (2*i + 1)
i = i + 1
}
Python 3
total = 0
i=0
while i < 10:
total = total + (2*i+1)
i = i + 1
C++
int total = 0;
int i = 0;
while( i < 10 )
{
total = total + (2*i + 1);
i = i + 1;
}
Java
int total = 0;
int i = 0;
while( i < 10 ){
total = total + (2*i + 1);
i = i + 1;
}
Conditionals¶
A program needs to be able to make decisions. This is normally called an if statement or a conditional. All languages must have some way to choose between multiple paths. Normally, we can make decisions based on a Boolean value. A Boolean value is a value that is either true or false.
C
int value = 22;
if(value%2==0){
printf("I am even\n");
}else{
printf("I am odd.\n");
}
Go
var value int = 22
if value%2 == 0 {
fmt.Println("I am even.")
} else {
fmt.Println("I am odd.")
}
Python 3
value = 22
if value%2==0:
print("I am even.")
else:
print("I am odd.")
C++
int value = 22;
if(value%2 == 0){
cout << "I am even." << endl;
} else {
cout << "I am odd." << endl;
}
Java
int value = 22;
if(value%2 == 0){
System.out.println("I am even.");
} else {
System.out.println("I am odd.");
}
Functions¶
Breaking code up into functions is crucial to reusability. We could rewrite the same code in a dozen places, but it would be terrible programming. Creating functions allows us to break large complex algorithms into smaller managable pieces.
Functions also allow for recursion. Without functions in a programming language, we could not create recursion.
C
//Function prototype
int factorial(int n);
//Function Definition
int factorial(int n){
if(n==0){
return 1;
}else{
return n*factorial(n-1);
}
}
Go
func factorial(n int) int {
if n == 0 {
return 1
} else {
return n * factorial(n-1)
}
}
Python 3
def factorial(n):
if n==0:
return 1
else:
return n*factorial(n-1)
C++
int factorial(int n){
if(n == 0){
return 1;
} else {
return n * factorial(n-1);
}
}
Java
public static int factorial(int n)
{
if(n==0){
return 1;
}else{
return n*factorial(n-1);
}
}
Arrays/Lists¶
One of the most basic Data Structures is the array. A collection of values that are all stored next to each other. This allows us to get or change a value by knowing it’s position. Every language has some basic data structure for storing a collection of values.
C
int* primes = malloc(6*sizeof(int));
primes[0]=2;
primes[1]=3;
primes[2]=5;
primes[3]=7;
primes[4]=11;
primes[5]=13;
for(int k = 0; k < 6; k++){
printf("%d\n",primes[k]);
}
free(primes);
Go
var primes []int = []int{2, 3, 5, 7, 11, 13}
for k := 0; k < len(primes); k++ {
fmt.Println(primes[k])
}
Python 3
primes = [2, 3, 5, 7, 11 ,13]
for k in range(0,len(primes)):
print(primes[k])
C++
int* primes = new int[6];
primes[0]=2;
primes[1]=3;
primes[2]=5;
primes[3]=7;
primes[4]=11;
primes[5]=13;
for(int k = 0; k < 6; k++){
cout << primes[k] << endl;
}
Java
int[] primes= new int[6];
primes[0]=2;
primes[1]=3;
primes[2]=5;
primes[3]=7;
primes[4]=11;
primes[5]=13;
for(int k = 0; k < primes.length; k++){
System.out.println(primes[k]);
}
Pointers¶
Many languages allow pointers. A pointer is just the memory address that a variable is stored at. All languages have a kind of pointer, but not all languages allow the programmer to access them directly.
Access to pointers can be a great efficiency tool, but it can also be dangerous. The programmer is directly effecting memory addresses.
C
int* myValue = malloc(sizeof(int));
*myValue = 9;
printf("Pointer: %p\n",myValue);
printf("Value: %d\n",*myValue);
free(myValue);
Go
var myValue int = 10
var myValuePointer *int = &myValue
fmt.Println(myValuePointer)
Python 3
#Python doesn't let us use the pointer address
#But we can see the ID to tell if two values point to the
#Same place
myValue = 10
myValueID = id(myValue)
print(myValueID)
C++
int myValue = 10;
int* myValuePointer = &myValue;
cout << myValuePointer << endl;
Java
//In Java we can't get a pointer
//Everything is always a reference
Data Structures¶
To build our own Data Structures we need to be able to collect together different values. A structure is just a collection of variables stored together in memory. A class is a collection of variables and functions, using in object oriented programing. A class with only variables and no functions is a structure by another name.
Note
A function that is part of a class is refered to as a method.
C
struct Person{
char* firstName;
char* lastName;
char* email;
};
Go
//Person is a struct that store info about a person.
type Person struct {
firstName string
lastName string
email string
}
Python 3
class Person:
firstName = ""
lastName = ""
email = ""
C++
struct Person {
string firstName;
string lastName;
string email;
};
Java
class Person {
public String firstName;
public String lastName;
public String email;
}
Input/Output¶
To make a useful program, we need to communicate with a user. Someone needs to use our program. They need to be able to give the program inputs so it can do computations. They also need to be able to see the results of the computations. This means we need ways it input and output text.
This is commonly one of the most unique parts of every language. Exactly how input/output works can be varied.
C
char c = getchar();
printf("Character was %c\n",c);
Go
reader := bufio.NewReader(os.Stdin)
fmt.Println("Hello! What is your name?")
text, _ := reader.ReadString('\n')
fmt.Println(text)
Python 3
print("Hello! What is your name?")
text = input()
print(text)
C++
string text;
cout << "Hello! What is your name?" << endl;
cin >> text;
cout << text << endl;
Java
BufferedReader reader =
new BufferedReader(new InputStreamReader(System.in));
System.out.println("Hello! What is your name?");
String text = reader.readLine();
System.out.println(text);
Libraries¶
A programing language almost never has every feature that everyone could ever need. Instead, programming languages can be extended by including libraries. We can load other code into our program and use it to solve problems.
C
#include "stdlib.h"
#include "stdio.h"
Go
import (
"bufio"
"fmt"
"os"
)
Python 3
import sys
import math
C++
#include <iostream>
#include <string>
using namespace std;
Java
import java.lang.Math;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
Psuedocode Algorithms¶
When algorithms are published, they are frequently written in Psuedocode. This code has all the basic features of any language, but no specific syntax rules. It ends up being a mix between plain english and real code. The goal is to provide all the details the reader needs to implement the algorithm, without getting caught up in the specifics of any one language.
You should never think of an algorithm as being tied to a specific language. The algorithm itself is always language agnostic. We just need to pick a language to translate our algorithm into something the machine can understand.
Although we will be primarily using C as our implementation language you can see the features are univeral. The ideas and concepts can be applied to any language. Any psuedocode that uses the basic features of high level programming languages can be implemented in any language.