class: big, middle # Engineering 1020: Introduction to Programming .title[ .lecture[Lecture \15\:] .title[Testing] ] .footer[[/lecture/15/](/lecture/15/)] --- # Today ### What is correctness? ### How can we know what's right? ### What can we do to check our code's correctness? --- # Correctness -- ### When our code _always_ does the right thing -- ### No mistakes! -- * "one mistake in a million isn't so bad, is it?" -- * processor execute **billions** of instructions per second! -- * how about about [1 in 9 billion](https://en.wikipedia.org/wiki/Pentium_FDIV_bug)? ??? This is a very high bar to clear! --- # Checking correctness -- ### Ultimately: mathematical _proof_ -- <img src="https://i.ytimg.com/vi/AdakDMYu4lM/hqdefault.jpg" alt="First slide of Matthew Brecknell's Intro to seL4 proofs talk" class="floatright" width="375"/> * possible in some circumstances -- * often difficult in practice ??? Even some of the highest-assurance systems (e.g., [seL4](https://sel4.systems/Info/FAQ/proof.pml)) have a lot of caveats around their proofs of correctness. Their authors really have constructed mathematical proofs of correctness, but as they say, such proofs "prove exactly what you have stated, not necessarily what you mean or what you want". So, while this is a worthwhile activity, and research continues to push new boundaries, most code **can't be proven correct**. This is for one of two reasons: * the code is **beyond the complexity that we can handle proving** or * the code is **not correct**! -- * area of active research! -- ### What's the next best thing? --- # Testing -- <img src="testing.png" alt="Testing triangle: code, specification and test" class="floatright" width="700"/> ### Code ### Specifications ### Tests --- layout: true # Testing <img src="testing.png" alt="Testing triangle: code, specification and test" class="floatright" width="400"/> --- ### Code * what you've worked so hard on * hopefully does what we want... -- * ... but how can we tell? ??? Few things are more frustrating than when something doesn't work, you send it in to get fixed and you get the response, "it works for me!" In my last car, the air conditioning often wouldn't turn on. I would bring it into the shop and say, "it's hot outside, the aircon doesn't blow cold air, and my family is really hot all of the time." When the mechanic turned the car on, however, it "worked for them". So was the air conditioning working or not? -- ##### "It works on my computer" -- ##### "It works perfectly for me" --- ### Specification * precise description of requirements -- * can never capture everything we _want_ -- * attempts to describe all _required_ behaviour -- * like any model, should _correspond_ with the real world ??? Mathematical models are **precise** and **reproducible**, which is very helpful for testing. However, if they don't **correspond** to the real world, they're not very **useful**! It's easy to write a specification that you can live up to where that specification isn't terribly meaningful in the real world... just ask the staff who have to wear winter coats to work in a new building where the HVAC systems have been checked for conformance with the specification! --- ### Test * a set of _inputs_ and _expected outputs_ -- * based on the specification -- * can be **checked independently** against the specification --- layout: false # Boundaries <img src="https://naeye.net/wp-content/uploads/2021/01/water_phase_diagram-900x782.jpg" alt="Phase diagram of water" width="500" class="floatright"/> ### Limitations of testing -- How long would it take to test $2^{64}$ possible inputs? -- How should we test the following? ```python def phase(pressure, temp): """Find the phase of water at the given pressure and temperature. """ # ... ``` ??? We usually can't test **exhaustively**, i.e., checking **every single possible** input. --- # Example* Given a specific year as an input, a function should return true if the provided year is a leap year and false if it is not. A year is a leap year if: * the year is divisible by 4; * and the year is not divisible by 100; * unless it is divisible by 400 (when it **is** a leap year) .footnote[ * From _Software Testing: From Theory to Practice_, MaurĂcio Aniche, 2020. https://sttp.site ] --- # Summary ### What is correctness? ### How can we know what's right? ### What can we do to check our code's correctness? --- class: big, middle (here endeth the lesson)