class: big, middle

# Engineering 1020: Introduction to Programming

.title[
.lecture[Lecture \22\:]
.title[Dictionaries]
]

.footer[[/lecture/22/](/lecture/22/)]

---

# Previously:

### Ordered collections:

--

* lists (flexible, mutable)
* tuples (simple, immutable)
* arrays (fixed-size, homogeneous, high-performance)

--

### Unordered collections:

--

sets

---

# Today:

### Dictionaries

* sophisticated _associative_ collection
* semantics
* syntax
* usage

---

# Finding things in collections

```python
def find(needle, haystack):
    for value in haystack:
        if value == needle:
            print('Found it!')
            return

    print("Didn't find it.")

find('Jon', ['Alice', 'Bob', 'Charlie', 'Diana'])
find('Diana', ['Alice', 'Bob', 'Charlie', 'Diana'])
```

--

### How long does this take?

--

### How about with 1 M values?

--

100 M?

--

1 B?

---

# Naming things in collections

--

```python
sid = int(input('Enter student ID> '))
index = None

for i, s in enumerate(something_that_returns_students()):
    if s.id == sid:
        index = i

print('Found student at index:', index)
```

--

### What is the "name" for the student in this collection?

--

```python
print(students[i])      # print student i
students[i].name()      # get student i's name
students[i] = ...       # set to another student
```

---

# That's odd...

### Does it matter whether you're at index 12 or 93?

--

* doesn't matter whether you registered first, tenth or last

--

* "Student in seat 4" not a meaningful way to refer to you!

--

### What's a better way to refer to you?

--

* name

--

* student ID

---

# Another type of collection?

### Sometimes the _order_ of things doesn't matter

--

### Sometimes we need a sensible _name_ for things

--

> There are only two hard things in Computer Science: cache invalidation and naming things.
— Phil Karlton

--

.centered[
<img src="hard-problems-tweet.png" width="900"/>
]

---

# Python dictionary

### A _dictionary_ holds _named_ values

--

* Python type: `dict`

--

* it's an _unordered_ collection

--

* every _value_ in a dictionary also has a _key_ (a name)

--

* can look up values **by key**

--

* can iterate over keys, values **or both**

---

# Dictionary syntax

### Create a dictionary:

```python
students = {
    200125805: 'Jonathan Anderson',
    202412345: 'Somebody New',
}
```

--

* enclosed in curly _braces_ (not brackets or parentheses)

--

* comma-separated _items_

--

* each item has a _key_ and a _value_

---

# Dictionary values

### Can use any type for values:

--

.floatleft[
```python
csf_rooms = {
    2111: "Alice Faisal",
    2112: "Computer lab",
    # ...
    4101: "Workstations",
    4103: "VISOR lab",
    # ...
    4123: "Jonathan Anderson",
    # ...
}
```
]

--

.floatleft[
```python
course_averages = {
    1010: 57.5,
    1020: 67.6,
    1030: 78.7,
    1040: 74.9,
}
```
]

--

.floatleft[
```python
grades = {
    200125805: [90, 98],
    # ...
}
```
]

---

# Dictionary keys

### Can use _many_ types for keys:

--

.floatleft[
```python
populations = {
    'CBS': 24_848,
    'Corner Brook': 19_886,
    'Gander': 11_054,
    'Grand Falls-Windsor': 13_725,
    'Mount Pearl': 24_284,
    'Paradise': 17_695,
    "St. John's": 106_172,
}
```
]

.footnote[
Population data is a bit stale: it's as of the
[2011 census](https://www12.statcan.gc.ca/census-recensement/2011/dp-pd/hlt-fst/pd-pl/Table-Tableau.cfm?LANG=Eng&T=302&SR=1&S=3&O=D&RPP=9999&PR=10).
]

--

.floatleft[
```python
checkers = {
    (0, 0): 'red',
    (0, 2): 'red',
    # ...
    (7, 1): 'black',
    (7, 3): 'black',
}
```
]

--

.floatleft[
```python
an_error = {
    [0, 0]: 'whut',
}
```
]

---

# Valid dictionary keys

### `TypeError: unhashable type: 'list'` — ???
--

### Keys must be [_hashable_](https://docs.python.org/3/glossary.html#term-hashable)

--

### Python's _immutable_ containers are hashable

--

* tuples are OK (if their elements are hashable), lists are not

--

* strings are OK, arrays of characters are not

---

# Indexing

### Can access individual elements just like indexing:

```python
s = students[200125805]             # looks a lot like a list or array
p = populations['Gander']           # well that's new!
populations["St. John's"] += 1      # congratulations to the new parents?
```

---

# Iterating over dictionary keys and values

--

### By default, you iterate over _keys_:

```python
for sid in students:
    print(sid, ':', students[sid])

for city in populations.keys():     # this does the same thing
    print(city, ':', populations[city])
```

--

### Can also iterate over values

```python
for pop in populations.values():
    print(pop)      # but we don't know which city we're referring to
```

---

# Iterating over dictionary items

### Can also iterate over _items_ (key, value tuples)

--

```python
for sid, student in students.items():
    print(sid, ':', student)

for city, pop in populations.items():
    print(city, ':', pop)
```

--

### Ordering _may not_ be preserved*

.footnote[
* _Fine print (not on the exam): Python 3.7+ preserves insertion order in the
`dict` type, but many Python packages that interact with `dict` don't assume
that ordering will be preserved, so they may not work to preserve it in the
data you import or export._
]

???

Depending on the version of Python and other factors, iterating over a
dictionary might not give you the items in the order you inserted them, and
the order might differ from one run of the program to the next.

---

# Using dictionaries

### Helpful when _name_ more important than _order_

--

### Allow very fast search by key

* no need to look at all 40 B records!

--

* how?
details will come later (ECE 4400 or equivalent)

--

### Basis for lots of code in Python packages you may use

---

# Example with pandas*

.footnote[
* See the [rest of the code](https://gist.github.com/trombonehero/b7b2ec2667dab2bf3bb09399984a8046) as well as the [data it operates on](https://docs.google.com/spreadsheets/d/1vyBnGK-3c5e9wsL5e278NR1G3uTqdUH6wOwCRzgUb4I)]

--

<img src="covid.png" align="right" width="380" alt="COVID-19 cases in Newfoundland and Labrador today"/>

--

[pandas](https://pandas.pydata.org) frames behave like dicts:

```python
import pandas as pd

#
# Read data and compute some summaries:
#
data = pd.read_csv('covid-data.csv')
data['Total'] = data['New'].cumsum()
data['Deceased'] = data['Deaths'].cumsum()
data['Recovered'] = data['Recoveries'].cumsum()

# ...
```

---

# Summary:

### Dictionaries

* sophisticated _associative_ collection
* semantics
* syntax
* usage

---

class: big, middle

(here endeth the lesson)
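
---

# Appendix: list search vs. dictionary lookup

### A rough sketch, not part of the lecture code

The "fast search by key" claim can be illustrated by counting comparisons.
Everything here is made up for illustration: the ID range, the
`names_by_id` dictionary and the `find_in_list` helper are hypothetical.

```python
# Hypothetical data: 100,000 made-up student IDs and matching names.
student_ids = list(range(200_000_000, 200_100_000))
names_by_id = {sid: f'Student {sid}' for sid in student_ids}


def find_in_list(needle, haystack):
    """Linear search: return how many values were compared, or None."""
    for comparisons, value in enumerate(haystack, start=1):
        if value == needle:
            return comparisons
    return None     # not found


last = student_ids[-1]

# The list search walks past every earlier ID before reaching the last one.
comparisons = find_in_list(last, student_ids)
print('List search compared', comparisons, 'values')

# The dictionary looks the key up directly, with no loop in our code.
print('Dictionary lookup:', names_by_id[last])

# `in` and `.get()` check for a key without raising KeyError.
print(12345 in names_by_id)
print(names_by_id.get(12345, 'no such student'))
```

The list search does work proportional to the size of the list, while the
dictionary lookup typically does not grow with the number of items.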