cs107-lecture-examples

Example codes used during Harvard CS107 lectures
git clone https://git.0xfab.ch/cs107-lecture-examples.git

python_1.md (15527B)


      1 ---
      2 jupyter:
      3   jupytext:
      4     formats: ipynb,md
      5     text_representation:
      6       extension: .md
      7       format_name: markdown
      8       format_version: '1.3'
      9       jupytext_version: 1.13.8
     10   kernelspec:
     11     display_name: Python 3
     12     language: python
     13     name: python3
     14 ---
     15 
     16 # Introductory Python
     17 
     18 
The main topic for today's lecture is Python and some of its basic
functionality.  We will cover the basics of
     21 
     22 * using Python as a calculator
     23 * `print` statements
     24 * the list concept
     25 * opening and reading from files
     26 * dictionaries
     27 * strings
     28 
     29 I will show you some very basic examples and you will put them all together in a
     30 small script for your exercise.  The exercise is displayed at the top of this
     31 notebook.  If you already know how to do it, then just write up your script now.
     32 However, you may need some guidance.  You will find such guidance throughout the
     33 rest of the notebook.
     34 
     35 
     36 ## Important, Useful Libraries
     37 
     38 
     39 You should always try to use existing technologies to accomplish your goals
     40 whenever possible.  For example, don't write your own function to compute the
     41 square root of a number.  That would be really hard and your implementation
     42 would most likely not be very efficient.  Instead, use built-in functionality or
     43 functionality from a nice library such as `numpy`
     44 ([NUMericalPYthon](http://www.numpy.org/)).
     45 
     46 > NumPy is the fundamental package for scientific computing with Python. It
     47 > contains among other things:
     48 >
     49 > * a powerful N-dimensional array object 
     50 > * sophisticated (broadcasting) functions 
     51 > * tools for integrating C/C++ and Fortran code 
     52 > * useful linear algebra, Fourier transform, and random number capabilities 
     53 >
     54 > Besides its obvious scientific uses, NumPy can also be used as an efficient
     55 > multi-dimensional container of generic data. Arbitrary data-types can be
     56 > defined. This allows NumPy to seamlessly and speedily integrate with a wide
     57 > variety of databases.
     58 
     59 To import libraries into your Python application, do the following:
     60 
     61 ```python
# The %... is an IPython thing, and is not part of the Python language.
     63 # In this case we're just telling the plotting library to draw things on
     64 # the notebook, instead of on a separate window.
     65 %matplotlib inline 
     66 # the line above prepares IPython notebook for working with matplotlib
     67 
     68 import numpy as np # imports a fast numerical programming library
     69 import scipy as sp #imports stats functions, amongst other things
     70 import matplotlib as mpl # this actually imports matplotlib
     71 import matplotlib.cm as cm #allows us easy access to colormaps
     72 import matplotlib.pyplot as plt #sets up plotting under plt
     73 import pandas as pd #lets us handle data as dataframes
     74 #sets up pandas table display
     75 pd.set_option('display.width', 500)
     76 pd.set_option('display.max_columns', 100)
     77 pd.set_option('display.notebook_repr_html', True)
     78 ```
     79 
     80 The way to understand these imports is as follows: _import the library `library`
     81 with the alias `lib`_ where `library` could be `numpy` or `matplotlib` or
     82 whatever you want and `lib` is the alias used to refer to that library in our
     83 code.  Using this flow, we can call methods like `plt.plot()` instead of
     84 `matplotlib.pyplot.plot()`.  It makes life easier.
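
Here is a minimal sketch of the aliasing idea.  It uses the standard-library
`math` module only so that it runs without any third-party packages; the alias
`m` is an illustrative choice, not a convention.

```python
import math        # plain import: names are reached through the full module name
import math as m   # aliased import: `m` refers to the very same module object

# Both names point at the same module, so the calls are interchangeable.
print(math.sqrt(16.0))
print(m.sqrt(16.0))
print(m is math)

# A single name can also be imported directly from a module.
from math import sqrt
root = sqrt(16.0)
print(root)
```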
     85 
     86 
     87 **NOTE:** It is not necessary to import _all_ of these libraries all of the
     88 time.  You should only import the ones you really need.  I listed a bunch above
     89 to give you a sampling of what's available.
     90 
     91 **NOTE:** DO NOT include `%matplotlib inline` in your Python scripts unless
     92 you're working in the Jupyter notebook.
     93 
     94 
     95 At the end of this course, someone should be able to `import
     96 your_kinetics_library` to use the kinetics library that you are about to start
     97 writing.
     98 
     99 
    100 ## The Very Basics
    101 
    102 
    103 We'll fly through this part because you should already know it.  If you don't
    104 understand something, please Google it and/or refer to the [Python
    105 Tutorial](https://docs.python.org/3/tutorial/).  I do not want to recreate the
    106 Python tutorial here; instead, I'll just summarize a few important ideas from
    107 Python.  We'll give more details a little later on how some of these language
    108 features work.
    109 
Another very helpful resource that explains the basics below (and a few additional
topics) can be found here:
[https://learnxinyminutes.com/docs/python/](https://learnxinyminutes.com/docs/python/).
    113 
    114 
    115 ### Calculating
    116 
    117 
    118 We can tell the type of a number or variable by using the `type` function.
    119 
    120 ```python
    121 type(3), type(3.0)
    122 ```
    123 
Remember, every value in Python has a type. Python is a strongly typed
language. It is also a dynamic language, in the sense that types are determined at
run-time rather than at "compile" time, as in a language like C. This makes it
slower: the interpreter cannot choose an optimal storage layout in advance
because, when the program starts, it doesn't know what a variable will point to.
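
A small sketch of what "dynamic" and "strong" mean in practice.  The variable
names are just for illustration.

```python
x = 3                    # x currently refers to an int
kind_before = type(x)
x = 'three'              # the same name can be rebound to a str at run-time
kind_after = type(x)
print(kind_before, kind_after)

# Strong typing: Python refuses to silently mix a str and an int.
try:
    result = '3' + 4
except TypeError as err:
    result = None
    print('TypeError:', err)

# An explicit conversion states the intent.
total = int('3') + 4
print(total)
```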
    129 
    130 
    131 All the usual calculations can be done in Python.
    132 
    133 ```python
    134 2.0 + 4.0 # Adding two floats
    135 ```
    136 
    137 ```python
    138 2 + 4     # Adding two ints
    139 ```
    140 
    141 ```python
    142 1.0 / 3.0 # Dividing two floats
    143 ```
    144 
    145 ```python
    146 1 / 3     # Dividing two ints
    147 ```
    148 
Note that in Python 2, dividing two ints performed integer division rather than
producing a float.  In Python 3, `/` always returns a float.  Now, if you want
integer division you have to use the `//` operator.
    152 
    153 ```python
    154 1 // 3    # Integer division
    155 ```
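
One detail worth knowing: `//` is _floor_ division, so it rounds toward negative
infinity rather than toward zero.  A short sketch with a few illustrative values:

```python
quotient = 7 // 2         # floor division of two ints
remainder = 7 % 2         # remainder that goes with the floor division
neg_quotient = -7 // 2    # rounds down (toward negative infinity), not toward zero
float_floor = 7.0 // 2    # with a float operand the result is a float
q, r = divmod(7, 2)       # quotient and remainder in one call

print(quotient, remainder, neg_quotient, float_floor, q, r)
```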
    156 
    157 ```python
    158 2**5      # Powers
    159 ```
    160 
    161 ```python
    162 3 * 5     # Multiplication
    163 ```
    164 
    165 #### More advanced operations
    166 
    167 We can use `numpy` to do some more advanced operations.
    168 
    169 ```python
    170 np.pi * np.exp(2.0) + np.tanh(1.0) - np.sqrt(100.0)
    171 ```
    172 
    173 Notice that I am always writing my floats with a decimal point.  You don't
    174 really need to do that in Python because Python will automatically convert
    175 between types.  For example:
    176 
    177 ```python
    178 type(np.pi * np.exp(2.0) + np.tanh(1.0) - np.sqrt(100.0)), type(np.pi * np.exp(2) + np.tanh(1) - np.sqrt(100))
    179 ```
    180 
    181 However, I like to make the types as explicit as I can so there's no confusion.
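
To make the automatic conversion concrete, here is a sketch using plain
arithmetic (no `numpy` needed).  The variable names are only for illustration.

```python
mixed = 2 + 4.0       # int + float: the int is promoted and the result is a float
ints_only = 2 + 4     # int + int stays an int
true_div = 4 / 2      # `/` returns a float even when the result is a whole number
print(type(mixed), type(ints_only), type(true_div))

# Explicit conversions are available when you want to control the type yourself.
as_int = int(3.9)     # truncates toward zero
as_float = float(3)
print(as_int, as_float)
```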
    182 
    183 
    184 ### `print`
    185 
    186 
    187 The `print` function is the basic way to write information out to the screen.  I
    188 will briefly review the new form of the `print` function.  In Python 2, `print`
    189 was a `statement` rather than a `function`.
    190 
    191 ```python
    192 print('Good morning!  Today we are doing Python!')                                  # Basic print
    193 print(3.0)                                                                          # Print a float
print('{} is a nice, transcendental number'.format(np.pi))                          # Print just one number
    195 print('{} is nice and so is {}'.format('Eric', 'Sarah'))                            # Print with two arguments
    196 print('{0:20.16f}...: it goes on forever but {1} is just an int.'.format(np.pi, 3)) # Print with formatting in argument 0
    197 ```
    198 
    199 Here are some additional resources for the `print` function and formatting:
    200 * [7. Input and Output](https://docs.python.org/3/tutorial/inputoutput.html)
    201 * [Formatted Output](https://www.python-course.eu/python3_formatted_output.php)
    202 * [`Print` function](https://docs.python.org/3/library/functions.html#print)
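
The examples above use `str.format`.  Since Python 3.6 there are also formatted
string literals (f-strings), which this lecture does not otherwise use.  The
sketch below shows that the two styles produce the same text; it uses `math.pi`
instead of `np.pi` only to stay self-contained.

```python
import math

name_a, name_b = 'Eric', 'Sarah'
line_format = '{} is nice and so is {}'.format(name_a, name_b)
line_fstring = f'{name_a} is nice and so is {name_b}'   # expressions go inside the braces
print(line_fstring)

# The same format specification works in both styles: width 8, 4 decimal places.
pi_format = '{0:8.4f}'.format(math.pi)
pi_fstring = f'{math.pi:8.4f}'
print(pi_fstring)
```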
    203 
    204 
    205 ### Variables
    206 
    207 
    208 We'll have more to say about variables in Python later.  For now, here's how you
    209 store them syntactically:
    210 
    211 ```python
    212 a = 1.0
    213 b = -1.0
    214 c = -1.0
    215 x = (1.0 + np.sqrt(5.0)) / 2.0
    216 val = a * x**2.0 + b * x + c
    217 print('{0}x^2 + {1}x + {2} = {3}'.format(a, b, c, val))
    218 ```
    219 
    220 Python has this nice feature where you can assign more than one variable all on
    221 one line.  It's called the multiple assignment statement.
    222 
    223 ```python
    224 a, b, c = 1.0, -1.0, -1.0
    225 x = (1.0 + np.sqrt(5.0)) / 2.0
    226 val = a * x**2.0 + b * x + c
    227 print('{0}x^2 + {1}x + {2} = {3}'.format(a, b, c, val))
    228 ```
    229 
    230 Looks a little cleaner now.
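
The same mechanism (tuple unpacking) supports a few other handy patterns.  A
short sketch with made-up values:

```python
a, b = 1.0, -1.0
a, b = b, a                    # swap: the right side is evaluated first, then unpacked
print(a, b)

point = (3, 4)
px, py = point                 # unpack an existing tuple into two names

first, *rest = [2, 3, 5, 7]    # a starred name collects whatever is left over
print(first, rest)
```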
    231 
    232 
    233 ### Lists and `for` loops
    234 
    235 
    236 Lists are central to Python.  Many things behave like lists.  For now, we'll
    237 just look at how to create them and do basic operations with them.  I will not
    238 go through all the details.  Please refer to
    239 [Lists](https://docs.python.org/3/tutorial/introduction.html#lists) for
    240 additional examples.
    241 
    242 ```python
    243 primes = [2, 3, 5, 7, 11, 13]     # A list of primes
more_primes = primes + [17, 19]   # List concatenation
    245 print('First few primes are: {primes}'.format(primes=primes))
    246 print('Here are the primes up to the number 20: {}'.format(more_primes))
    247 ```
    248 
Notice that Python knows the type of `primes`.
    250 
    251 ```python
    252 print('primes is of type {}'.format(type(primes)))
    253 ```
    254 
    255 The `len` function can provide the number of elements in the list.
    256 
    257 ```python
print('There are {} prime numbers less than or equal to 20.'.format(len(more_primes)))
    259 ```
    260 
Now that we know what a list is, we can discuss `for` loops in Python.  The
`for` loop iterates over an iterable such as a list.  For example:
    263 
    264 ```python
    265 for p in more_primes:
    266     print(p)
    267 ```
    268 
A useful iterable (but not a list!) is the object returned by the `range` function.
    270 
    271 ```python
    272 print(range(10))
    273 print(type(range(10)))
    274 ```
    275 
It's not a list anymore (it used to be in Python 2); it is a lazy sequence that
computes its values on demand.  It can still be indexed and sliced (slicing is
covered below, and slicing a `range` gives another `range`).  You can use it in
`for` loops, which is where it finds most of its use.
    279 
    280 ```python
    281 for n in range(10):
    282     print(n)
    283 ```
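
`range` also accepts a start, a stop, and a step.  A quick sketch, converting to
a list just so the values can be seen:

```python
evens = list(range(0, 10, 2))        # start at 0, stop before 10, step by 2
countdown = list(range(5, 0, -1))    # a negative step counts down
n_items = len(range(10))             # len works without building a list

print(evens)
print(countdown)
print(n_items)
```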
    284 
    285 There is something called a _list comprehension_ in Python.  List comprehensions
    286 are just a way to transform one list into another list.
    287 
    288 ```python
    289 not_all_primes = [p // 3 for p in more_primes]
    290 print('The new list is {}'.format(not_all_primes))
    291 ```
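
A comprehension can also filter with an `if` clause.  The sketch below redefines
`more_primes` so that it runs on its own, and shows the explicit loop that a
comprehension replaces.

```python
more_primes = [2, 3, 5, 7, 11, 13, 17, 19]

squares = [p**2 for p in more_primes]               # transform every element
big_primes = [p for p in more_primes if p > 10]     # keep only some elements

# The first comprehension is equivalent to this explicit loop.
squares_loop = []
for p in more_primes:
    squares_loop.append(p**2)

print(squares)
print(big_primes)
```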
    292 
    293 We can also count the number of each element in the list.  There are a number of
    294 ways of doing this, but one convenient way is to use the `collections` library.
    295 
    296 ```python
    297 import collections
    298 how_many = collections.Counter(not_all_primes)
    299 print(how_many)
    300 print(type(how_many))
    301 ```
    302 
    303 We see that there are 2 ones, 1 two, 1 three, etc.
    304 
We can even list the elements from most to least common, together with how many
times each one occurs, using the `most_common` method.
    307 
    308 ```python
    309 how_many_list = how_many.most_common()
    310 print(how_many_list)
    311 print(type(how_many_list))
    312 ```
    313 
    314 We see that the result is a list of tuples with the most common element of our
    315 original list (`not_all_primes`) displayed first.  We want the most common
    316 element of our original list, so we just access the first element using a simple
    317 index.
    318 
    319 ```python
    320 most_common = how_many_list[0]
    321 print(most_common)
    322 print(type(most_common))
    323 ```
    324 
    325 We're almost there.  We recall the first element of this tuple is the value from
    326 our original list and the second element in the tuple is its frequency.  We're
    327 finally ready to get our result!
    328 
    329 ```python
    330 print('The number {} is the most common value in our list.'.format(most_common[0]))
    331 print('It occurs {} times.'.format(most_common[1]))
    332 ```
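
If only the top entry is needed, `most_common` accepts a count.  A self-contained
sketch, using the same values that `not_all_primes` holds above:

```python
import collections

values = [0, 1, 1, 2, 3, 4, 5, 6]
how_many = collections.Counter(values)

# most_common(1) returns a one-element list; unpack its (value, count) tuple.
top_value, top_count = how_many.most_common(1)[0]
print('{} occurs {} times'.format(top_value, top_count))

# A Counter behaves like a dictionary; elements never seen have a count of zero.
print(how_many[1], how_many[99])
```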
    333 
    334 List indexing is also very important.  It can also do much more than what we did
    335 above.
    336 
    337 ```python
    338 print(primes[2])   # print the 3rd entry 
    339 print(primes[2:5]) # print the 3rd to 5th entries
    340 print(primes[-1])  # print the last entry
print(primes[-3:]) # print the last three entries
    342 ```
    343 
    344 Other types of slices and indexing can be done as well.  I leave it to you to
    345 look this up as you need it.  It is a **very** useful thing to know.
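
To get you started, here is a sketch of a few more slices.  The general form is
`list[start:stop:step]`, where any of the three parts may be omitted.

```python
primes = [2, 3, 5, 7, 11, 13]

first_three = primes[:3]       # from the beginning up to (not including) index 3
every_other = primes[::2]      # every second entry
reversed_copy = primes[::-1]   # a reversed copy; the original list is unchanged
middle = primes[1:-1]          # everything except the first and last entries

print(first_three, every_other, reversed_copy, middle)
```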
    346 
    347 
    348 Two convenient built-in functions are `enumerate` and `zip`.  You may find
    349 various uses for them.
    350 
    351 * `enumerate` gives a representation of a list of tuples with each tuple of the
    352   form `(index, value)`.  This provides an easy way to access the `index` of the
    353   value in the `list`.
    354 * `zip` takes elements from each list and puts them together into a
    355   representation of a list of tuples.  This provides a nice way to aggregate
    356   lists.
    357 
    358 
    359 We'll make two lists for the following examples:
    360 
    361 ```python
    362 species = ['H2', 'O2', 'OH', 'H2O', 'H2O2']
    363 species_names = ['Hydrogen', 'Oxygen', 'Hydroxyl', 'Water', 'Hydrogen Peroxide']
    364 ```
    365 
    366 #### `enumerate` example
    367 
    368 ```python
    369 print(enumerate(species)) 
    370 ```
    371 
Notice that `enumerate()` just returns an iterator object.  To actually see
what's in the iterator object, we need to convert the iterator object to a list:
    374 
    375 ```python
    376 print(list(enumerate(species)))
    377 ```
    378 
    379 We see that we have a list of tuples (in the form `(index, value)` where `index`
    380 starts from 0).  Here's just one way that this might be used:
    381 
    382 ```python
    383 for i, s in enumerate(species):
    384     print('{species} is species {ind}'.format(species=s, ind=i+1))
    385 ```
    386 
What happened is that the `for` loop iterated over the iterable (here
`enumerate`).  Each item is an `(index, value)` tuple, which the loop unpacks:
the first loop variable (`i`) receives the first entry of the tuple and the
second loop variable (`s`) receives the second entry.
    391 
    392 
    393 #### `zip` example
    394 
    395 
    396 Let's see how `zip` works.  We'll aggregate the `species` and `species_names`
    397 lists.
    398 
    399 ```python
    400 print(zip(species, species_names))
    401 print(list(zip(species, species_names)))
    402 ```
    403 
    404 ```python
    405 for s, name in zip(species, species_names):
    406     print('{specie} is called {name}'.format(specie=s, name=name))
    407 ```
    408 
    409 We see that this worked in a similar way to `enumerate`.
    410 
    411 
    412 Finally, you will sometimes see `enumerate` and `zip` used together.
    413 
    414 ```python
    415 for n, (s, name) in enumerate(zip(species, species_names), 1):
    416     print('Species {ind} is {specie} and it is called {name}.'.format(ind=n, specie=s, name=name))
    417 ```
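
Two things about `zip` that are easy to miss: it stops at the shortest input, and
its pairs can be fed straight into `dict`.  A sketch with deliberately mismatched
lists (shortened versions of the ones above):

```python
species = ['H2', 'O2', 'OH']
species_names = ['Hydrogen', 'Oxygen']   # deliberately one entry short

# zip silently stops when the shorter list runs out, so 'OH' is dropped.
pairs = list(zip(species, species_names))
print(pairs)

# Each (key, value) pair becomes one dictionary entry.
lookup = dict(zip(species, species_names))
print(lookup)
```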
    418 
    419 ### Opening Files
    420 
    421 
    422 There are a variety of ways to open files in Python.  We'll see a bunch as the
    423 semester progresses.  Today, we'll focus on opening and reading text files.
    424 
    425 ```python
    426 species_file = open("species.txt") # Open the file
species_text = species_file.read() # Read the whole file into one string
    428 species_tokens = species_text.split() # Split the string and separate based on white spaces
    429 species_file.close()               # Close the file!
    430 ```
    431 
    432 ```python
    433 print(species_tokens)
    434 print(type(species_tokens))
    435 ```
    436 
    437 Notice that we get a list of strings.
    438 
    439 
    440 Here's a better way to open a file.  The `close` operation is handled
    441 automatically for us.
    442 
    443 ```python
    444 with open('species.txt') as species_file:
    445     species_text = species_file.read()
    446     species_tokens = species_text.split()
    447 ```
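
Files can also be read one line at a time by looping over the file object.  The
sketch below writes its own small file first, so it does not depend on
`species.txt`; the file name `species_demo.txt` and its contents are made up for
the demonstration.

```python
import os
import tempfile

# Create a throwaway file with hypothetical contents.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'species_demo.txt')
with open(path, 'w') as out_file:
    out_file.write('H2 O2\nOH H2O\n')

# Iterating over a file object yields one line at a time (newline included).
lines = []
with open(path) as in_file:
    for line in in_file:
        lines.append(line.strip())   # strip removes the trailing newline

tokens = ' '.join(lines).split()
print(lines)
print(tokens)
print(in_file.closed)                # the with-block has already closed the file

# Clean up the throwaway file and directory.
os.remove(path)
os.rmdir(tmp_dir)
```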
    448 
    449 ### Dictionaries
    450 
    451 
    452 Dictionaries are extremely important in Python.  For particular details on
    453 dictionaries refer to
    454 [Dictionaries](https://docs.python.org/3/tutorial/datastructures.html#dictionaries).
    455 From that tutorial we have a few comments on dictionaries:
    456 
    457 > Unlike sequences, which are indexed by a range of numbers, dictionaries are
    458 > indexed by keys, which can be any immutable type; strings and numbers can
    459 > always be keys.
    460 >
    461 > It is best to think of a dictionary as an unordered set of key: value pairs,
    462 > with the requirement that the keys are unique (within one dictionary). A pair
    463 > of braces creates an empty dictionary: {}. Placing a comma-separated list of
    464 > key:value pairs within the braces adds initial key:value pairs to the
    465 > dictionary; this is also the way dictionaries are written on output.
    466 >
    467 > The main operations on a dictionary are storing a value with some key and
    468 > extracting the value given the key.
    469 
    470 
    471 Let's create a chemical species dictionary.
    472 
    473 ```python
    474 species_dict = {'H2':'Hydrogen', 'O2':'Oxygen', 'OH':'Hydroxyl', 'H2O':'Water', 'H2O2':'Hydrogen Peroxide'}
    475 print(species_dict)
    476 ```
    477 
    478 The entries to the left of the colon are the keys and the entries to the right
    479 of the colon are the values.  To access a value we just reference the key.
    480 
    481 ```python
    482 print(species_dict['H2'])
    483 ```
    484 
    485 Pretty cool!
    486 
    487 Suppose we want to add another species to our dictionary.  No problem!
    488 
    489 ```python
    490 species_dict['H'] = 'Atomic Hydrogen'
    491 print(species_dict)
    492 print(species_dict['H'])
    493 ```
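
A few more everyday dictionary operations, sketched on a smaller copy of the
dictionary so the block runs on its own:

```python
species_dict = {'H2': 'Hydrogen', 'O2': 'Oxygen'}

has_h2 = 'H2' in species_dict                  # membership tests look at the keys
missing = species_dict.get('N2', 'unknown')    # get returns a default instead of raising KeyError
print(has_h2, missing)

# items() yields (key, value) pairs, which the loop unpacks.
for formula, name in species_dict.items():
    print('{} -> {}'.format(formula, name))

removed = species_dict.pop('O2')               # remove a key and return its value
print(removed, species_dict)
```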
    494 
Why should we use dictionaries at all?  Clearly they're very convenient.  But
they're also fast.  See [How to Think Like a Computer Scientist: Learning with
Python 3, Chapter 20:
Dictionaries](http://openbookproject.net/thinkcs/python/english3e/dictionaries.html)
for a decent explanation.
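
If you want to see the speed difference yourself, here is a rough sketch using
the standard-library `timeit` module.  The sizes are arbitrary and the timings
will vary from machine to machine, so treat the numbers as an illustration
only.

```python
import timeit

n = 100_000
keys = [str(i) for i in range(n)]
as_list = keys
as_dict = {k: None for k in keys}
target = keys[-1]    # the last element is the worst case for a list scan

# A list checks elements one by one; a dictionary hashes the key instead.
list_time = timeit.timeit(lambda: target in as_list, number=100)
dict_time = timeit.timeit(lambda: target in as_dict, number=100)

print('list lookup: {:.6f} s'.format(list_time))
print('dict lookup: {:.6f} s'.format(dict_time))
```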