Recap¶

Karën Fort (CC BY-NC-SA) -- 2024

This notebook corresponds to your fifth lecture and comes back on what we've already seen, going into more details and (hopefully) answering some of your questions.

It is inspired from different sources:

  • Loic Grobol's material
  • W3School tutorial

About strings¶

  • slicing
  • formatting
  • escape characters

Slicing¶

Returns a range of characters

In [11]:
my_string = "This is the end, my friend"

print(my_string[2:5])             # Th[is ] the next "i" is excluded (5th character)
is 
In [15]:
my_string = "This is the end, my friend"

print(my_string[:6])              # from the start to 6th position (excluded)
This i
In [16]:
my_string = "This is the end, my friend"

print(my_string[6:])              # from the 6th position to the end
s the end, my friend
In [18]:
my_string = "This is the end, my friend"

print(my_string[-6:-2])          # from the 6th position from the end, to the 2nd position from the end (not included)
frie

🥳 What's weird here?

Formatting¶

F-String was introduced in Python 3.6, it allows to format strings in a powerful way:

In [20]:
age = 51

confession = f"I am {age} year old"  # no need to cast an int into a string, Python does it for you!

print(confession)
I am 51 year old
In [22]:
age = 51

confession = f"I was born in {2024-age}"  # you can even add some instructions here!

print(confession)
I was born in 1973

Escape characters¶

\ is used to note characters which cannot be used (illegal) or cannot be seen (line break) in a string:

In [38]:
my_string = "So you are a \"feminist\"? I need to write a \\" 
    
print(my_string)
So you are a "feminist"? I need to write a \
In [37]:
my_string = "Winter is coming \nFind dragons!"  # \n means a new line 
print(my_string)

my_string = "Winter is coming \t Find dragons!"  # \t means a tabulation 
print(my_string)
Winter is coming 
Find dragons!
Winter is coming 	 Find dragons!

About encodings¶

  • in strings
  • in files

To learn more about encodings, refer to your course with B. Guillaume or read this (FR) or this (EN).

Since Python 3.0, all strings are by default in Unicode (UTF-8). You already know that you can also use Unicode character for variable names.

In [42]:
print("It is possible to write in different scripts like Маша or منزل, so it's Unicode")
It is possible to write in different scripts like Маша or منزل, so it's Unicode

🥳 ord returns the Unicode code of a character. chr does the opposite. Translate [233, 112, 97, 116, 97, 110, 116, 32, 33] into a readable string:

In [41]:
to_translate = [233, 112, 97, 116, 97, 110, 116, 32, 33]
for char in to_translate:
    print(chr(char))
é
p
a
t
a
n
t
 
!

Concerning files, it's a bit more complex:

  • by default, files are opened using the locale defined in your operating system
  • it is safer to specify the encoding with the appropriate parameter: f = open(fname, encoding="utf-8")
  • beware when you work in groups: check that you save your file in the proper encoding (be especially careful on Windows)
  • check you locale
In [46]:
import locale

locale.getdefaultlocale()
Out[46]:
('fr_FR', 'UTF-8')

About loops¶

  • on range and other iterables
  • break and continue
  • loops in loops (nested loops)

Iterables¶

on range allows to loop with for a specified number of times using what is called an iterable:

In [9]:
my_list = ["Karën","Maxime","Fanny"]

for x in range(len(my_list)):       # range can also be very precise: range(1,2)
    print(x)                        # Note the value of x
0
1
2

Break and continue¶

Break stops the loop before it's finished.

In [3]:
fruits = ["apple", "banana", "cherry"]

for x in fruits:
    print("Print before break: "+x)
    if x == "banana":
        break
        
for x in fruits:
    if x == "banana":
        break
    print("Print after break: "+x) 
Print before break: apple
Print before break: banana
Print after break: apple

continue stops the loops and continue to the next iteration:

In [5]:
fruits = ["apple", "banana", "cherry"]
for x in fruits:
    if x == "banana":
        continue
    print(x)
apple
cherry

Nested loops¶

Sometimes, we need to add a loop into a loop. The "inner loop" will be executed one time for each iteration of the "outer loop":

In [1]:
# ex of the combination of two lists
adj = ["red", "big", "tasty"]
fruits = ["apple", "banana", "cherry"]

for x in adj:                         # rank 1: for red
    for y in fruits:                  # associate each fruit
        print(x, y) 
red apple
red banana
red cherry
big apple
big banana
big cherry
tasty apple
tasty banana
tasty cherry

This is not really a problem, except that it can get heavy if the lists are big (think about a book or the Olympic game dataset).

But how heavy?

A little bit about complexity¶

Nested loops are known to be $O(n^2)$ (quadratic complexity). What does that mean?

[I used a lot this post for this part.]

Space complexity refers to the amount of memory required to execute a program.

Time complexity refers to the amount of time required to execute a program. It is expressed as O(). To compute it:

  • remove constants
  • only consider the highest complexity
In [2]:
def sum_list(lst): # K1
  total = 0        # K2
  for num in lst:  # n (n = len(lst)
    total += num   # K3
    return total   # K4

Here, the total complexity should be:

K1 + K2 + (k * n) + K4

But, as Ks are constants, we are left with k*n.

Therefore this piece of code is O(n), with n being the size of our list.

If we get back to our previous example of nested loop:

In [3]:
adj = ["red", "big", "tasty"]          # K1
fruits = ["apple", "banana", "cherry"] # K2

for x in adj:                          # K3
    for y in fruits:                   # K4
        print(x, y)                    # size of adj * size of fruits * K5
red apple
red banana
red cherry
big apple
big banana
big cherry
tasty apple
tasty banana
tasty cherry

In this case, the complexity is:

size of adj * size of fruits

Let's say the maximum size of a list is $n$ (here they are equal, of course) and we end up with the maximum complexity of $n*n$, i.e. $O(n^2)$.

Typical time complexities:

Name $O()$
Constant $O(c)$
Linear $O(n)$
Quadratic $O(n^2)$
Cubic $O(n^3)$
Exponential $O(2^n)$
In [1]:
# This was adapted from https://gist.github.com/niftycode/917bb33be4e91a10b1fffd020f987f94 with the author's approval
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

'''
big-o-notation.py:
Version: 0.2
Python 3.6
Date created: 22/03/2017
'''

# Big-O   	Name
# 1   		Constant
# log(n)  	Logarithmic REMOVED
# n   		Linear
# nlog(n) 	Log Linear REMOVED
# n^2 		Quadratic
# n^3 		Cubic
# 2^n 		Exponential

import numpy as np
import matplotlib.pyplot as plt


# Stylesheets defined in Matplotlib
plt.style.use('bmh')

# Set up runtime comparisons
n = np.linspace(1, 10, 1000)
#labels = ['Constant', 'Logarithmic', 'Linear', 'Log Linear', 'Quadratic', 'Cubic', 'Exponential']
#big_o = [np.ones(n.shape), np.log(n), n, n * np.log(n), n**2, n**3, 2**n]
labels = ['Constant', 'Linear', 'Quadratic', 'Cubic', 'Exponential']
big_o = [np.ones(n.shape),  n,  n**2, n**3, 2**n]

# Plot setup
plt.figure(figsize=(12, 10))
plt.ylim(0, 50)

for i in range(len(big_o)):
    plt.plot(n, big_o[i], label=labels[i])

plt.legend(loc=0)
plt.ylabel('Relative Runtime')
plt.xlabel('Input Size')
#plt.savefig('big-o-notation.png')
Out[1]:
Text(0.5, 0, 'Input Size')

What happens if it's too complex (note that this notion evolves with time)?

🥳 copy the follwing piece of code and try it on 30, then on 60

In [15]:
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)
Out[15]:
832040

Why not tracking the time it takes?

In [17]:
from time import perf_counter

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

start = perf_counter()
fibonacci(35)
stop = perf_counter()
print("Elapsed time during the whole program in seconds:",stop-start)
Elapsed time during the whole program in seconds: 6.956649056999595

About Lists¶

  • comprehension

Shorter syntax to create a new list (typically a subset) from an existing list:

In [4]:
teachers = ["Karën", "Maxime", "Hee-Soo", "Fanny", "Armelle"]

teachers_with_a = [teacher for teacher in teachers if "a" in teacher] # returns a list

print(teachers_with_a) 
['Karën', 'Maxime', 'Fanny']
In [9]:
# you can even do this:
my_list = [teacher if teacher != "Maxime" else "Olivier" for teacher in teachers] 

🥳 What does it do? Write it in another (longer) syntax.

In [8]:
# TODO code me!