Write Python functions to compute measures of central tendancy.

Tue, 5 Mar 2024 18:00

Write Python functions to compute measures of central tendancy.

There are (at least!) three ways to measure central tendancy of a set of values: the mean, the median and the mode. In this assignment, you will write three Python functions, one to calculate each of them.

Background

Mean

We are very familiar with the arithmetic mean, having seen it several times already in this course:

\[ \frac{1}{n} \sum_{i=0}^{n} X \]

where $X$ is a set of values and $n$ is the number of values in $X$. However, there is also another kind of mean called the geometric mean. It works just like the arithmetic mean, but with values multiplied together instead of added together and then taking the $n^\textrm{th}$ root of the product rather than dividing by $n$:

\[ \sqrt[\large{n}]{ \prod_{i=0}^{n} X} \]

Median

The median of a set of numbers is the middle value if you were to sort them all from smallest to largest (or vice-versa). This can give very different results from the mean, tending to ignore the effect of extreme outliers. For example, if the incomes in a province are mostly in the range of CAD 40k/year but there is one super-wealthy individual making CAD 1B/year, the mean will be affected by the outlier but the median will not.

Mode

The mode of a set of numbers is the most common number in that set. For example, in the set $S = {{ 1, 1, 1, 2, 5, 3, 9 }}$, the most common number is 1, even though it is neither the mean nor the median value.

Tip

Update / clarification:

The statistical mode of a set of values may not be unique. You may have two modes (a bimodal distribution) or more than two modes (a multimodal distribution). So, it’s a fair question to ask: "how should my code handle these situations?"

The answer for this assignment is that the mode function should return any valid mode. If there are two more more modes, just pick one arbitrarily (the first one, the last one, etc.). I won’t consider any valid mode to be incorrect.

Requirements

After working through a few examples of the concepts above, implement the following functions in Python. Each function should accept a parameter that represents a sequence of values (which may be a list or another iterable). For all three functions, if an empty list is passed as an argument, the function should return None.

  1. mean should calculate the mean, as described above

    • in addition to the parameter for input values, this function should also have a parameter called geometric with a default argument of False

    • if the argument True is passed to geometric, the mean function should calculate the geometric mean instead of the arithmetic mean

  2. median should calculate the median, as described above

  3. mode should calculate the mode, as described above

Define these three functions in a file called central.py and submit to Gradescope. As always, please don’t hesitate to contact me if you have questions! Also as always, remember that assignments are individual work.

Bonus: Geothmetic Meandian

As per XKCD 2435, I will assign bonus points if you also submit a file called meandian.py with a function meandian that computes the "geothmetic meandian":

geothmetic meandian 2x

This function, like those above, should take a list (or other iterable) as input.