name: title
class: middle, center, dark

# Describing and plotting data (Part 1)

---
class: light

# Here are a lot of numbers

<div class=rtable>
<table> <tbody>
<tr> <td style="text-align:right;"> 52 </td> <td style="text-align:right;"> -23 </td> <td style="text-align:right;"> 57 </td> <td style="text-align:right;"> 91 </td> <td style="text-align:right;"> -42 </td> <td style="text-align:right;"> 34 </td> <td style="text-align:right;"> -59 </td> <td style="text-align:right;"> -50 </td> <td style="text-align:right;"> -80 </td> <td style="text-align:right;"> -71 </td> <td style="text-align:right;"> 35 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 57 </td> <td style="text-align:right;"> 49 </td> <td style="text-align:right;"> -14 </td> </tr>
<tr> <td style="text-align:right;"> 100 </td> <td style="text-align:right;"> -48 </td> <td style="text-align:right;"> -49 </td> <td style="text-align:right;"> -25 </td> <td style="text-align:right;"> -75 </td> <td style="text-align:right;"> 81 </td> <td style="text-align:right;"> -69 </td> <td style="text-align:right;"> -5 </td> <td style="text-align:right;"> 79 </td> <td style="text-align:right;"> -85 </td> <td style="text-align:right;"> -5 </td> <td style="text-align:right;"> -69 </td> <td style="text-align:right;"> 98 </td> <td style="text-align:right;"> -11 </td> <td style="text-align:right;"> 89 </td> <td style="text-align:right;"> -24 </td> </tr>
<tr> <td style="text-align:right;"> -55 </td> <td style="text-align:right;"> -14 </td> <td style="text-align:right;"> -51 </td> <td style="text-align:right;"> 49 </td> <td style="text-align:right;"> 74 </td> <td style="text-align:right;"> -71 </td> <td style="text-align:right;"> 91 </td> <td style="text-align:right;"> 77 </td> <td style="text-align:right;"> 68 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> 13 </td> <td style="text-align:right;"> -81 </td> <td style="text-align:right;"> 21
</td> <td style="text-align:right;"> 86 </td> <td style="text-align:right;"> -32 </td> <td style="text-align:right;"> 15 </td> </tr> <tr> <td style="text-align:right;"> 65 </td> <td style="text-align:right;"> -22 </td> <td style="text-align:right;"> 85 </td> <td style="text-align:right;"> -57 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 54 </td> <td style="text-align:right;"> 100 </td> <td style="text-align:right;"> 76 </td> <td style="text-align:right;"> -11 </td> <td style="text-align:right;"> 83 </td> <td style="text-align:right;"> -60 </td> <td style="text-align:right;"> 74 </td> <td style="text-align:right;"> -61 </td> <td style="text-align:right;"> 30 </td> <td style="text-align:right;"> 93 </td> <td style="text-align:right;"> -53 </td> </tr> <tr> <td style="text-align:right;"> 90 </td> <td style="text-align:right;"> 90 </td> <td style="text-align:right;"> -68 </td> <td style="text-align:right;"> 51 </td> <td style="text-align:right;"> 85 </td> <td style="text-align:right;"> -58 </td> <td style="text-align:right;"> -56 </td> <td style="text-align:right;"> 38 </td> <td style="text-align:right;"> -34 </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> 66 </td> <td style="text-align:right;"> -52 </td> <td style="text-align:right;"> 14 </td> <td style="text-align:right;"> -10 </td> <td style="text-align:right;"> -34 </td> <td style="text-align:right;"> -42 </td> </tr> <tr> <td style="text-align:right;"> 99 </td> <td style="text-align:right;"> 24 </td> <td style="text-align:right;"> -30 </td> <td style="text-align:right;"> -1 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 46 </td> <td style="text-align:right;"> -11 </td> <td style="text-align:right;"> 15 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> 67 </td> <td style="text-align:right;"> -17 </td> <td style="text-align:right;"> -48 </td> <td 
style="text-align:right;"> 36 </td> <td style="text-align:right;"> -62 </td> <td style="text-align:right;"> -86 </td> </tr> <tr> <td style="text-align:right;"> -24 </td> <td style="text-align:right;"> -28 </td> <td style="text-align:right;"> -9 </td> <td style="text-align:right;"> -13 </td> <td style="text-align:right;"> 19 </td> <td style="text-align:right;"> -3 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 90 </td> <td style="text-align:right;"> -63 </td> <td style="text-align:right;"> -28 </td> <td style="text-align:right;"> -18 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> 92 </td> <td style="text-align:right;"> 28 </td> <td style="text-align:right;"> -94 </td> <td style="text-align:right;"> -25 </td> </tr> <tr> <td style="text-align:right;"> 26 </td> <td style="text-align:right;"> 93 </td> <td style="text-align:right;"> 21 </td> <td style="text-align:right;"> 39 </td> <td style="text-align:right;"> -90 </td> <td style="text-align:right;"> 62 </td> <td style="text-align:right;"> -19 </td> <td style="text-align:right;"> 36 </td> <td style="text-align:right;"> 14 </td> <td style="text-align:right;"> -27 </td> <td style="text-align:right;"> -67 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> -19 </td> <td style="text-align:right;"> -46 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> 48 </td> </tr> <tr> <td style="text-align:right;"> -45 </td> <td style="text-align:right;"> 98 </td> <td style="text-align:right;"> -56 </td> <td style="text-align:right;"> -48 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> 98 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> -32 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> 68 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> -99 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> 
66 </td> <td style="text-align:right;"> 65 </td> <td style="text-align:right;"> -80 </td> </tr>
<tr> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 57 </td> <td style="text-align:right;"> -49 </td> <td style="text-align:right;"> 92 </td> <td style="text-align:right;"> 65 </td> <td style="text-align:right;"> -54 </td> <td style="text-align:right;"> -95 </td> <td style="text-align:right;"> -73 </td> <td style="text-align:right;"> -61 </td> <td style="text-align:right;"> -71 </td> <td style="text-align:right;"> -61 </td> <td style="text-align:right;"> 70 </td> <td style="text-align:right;"> 52 </td> <td style="text-align:right;"> -1 </td> <td style="text-align:right;"> 8 </td> </tr>
</tbody> </table>
</div>

---
class: light

# What can we say about them?

We can see they aren't all the same. Not much else, really. Looking at a bunch of numbers is hard work.

<div class=rtable>
<table> <tbody>
<tr> <td style="text-align:right;"> 52 </td> <td style="text-align:right;"> -23 </td> <td style="text-align:right;"> 57 </td> <td style="text-align:right;"> 91 </td> <td style="text-align:right;"> -42 </td> <td style="text-align:right;"> 34 </td> <td style="text-align:right;"> -59 </td> <td style="text-align:right;"> -50 </td> <td style="text-align:right;"> -80 </td> <td style="text-align:right;"> -71 </td> <td style="text-align:right;"> 35 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 57 </td> <td style="text-align:right;"> 49 </td> <td style="text-align:right;"> -14 </td> </tr>
<tr> <td style="text-align:right;"> 100 </td> <td style="text-align:right;"> -48 </td> <td style="text-align:right;"> -49 </td> <td style="text-align:right;"> -25 </td> <td style="text-align:right;"> -75 </td> <td style="text-align:right;"> 81 </td> <td style="text-align:right;"> -69 </td> <td style="text-align:right;"> -5 </td> <td style="text-align:right;"> 79 </td> <td
style="text-align:right;"> -85 </td> <td style="text-align:right;"> -5 </td> <td style="text-align:right;"> -69 </td> <td style="text-align:right;"> 98 </td> <td style="text-align:right;"> -11 </td> <td style="text-align:right;"> 89 </td> <td style="text-align:right;"> -24 </td> </tr> <tr> <td style="text-align:right;"> -55 </td> <td style="text-align:right;"> -14 </td> <td style="text-align:right;"> -51 </td> <td style="text-align:right;"> 49 </td> <td style="text-align:right;"> 74 </td> <td style="text-align:right;"> -71 </td> <td style="text-align:right;"> 91 </td> <td style="text-align:right;"> 77 </td> <td style="text-align:right;"> 68 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> 13 </td> <td style="text-align:right;"> -81 </td> <td style="text-align:right;"> 21 </td> <td style="text-align:right;"> 86 </td> <td style="text-align:right;"> -32 </td> <td style="text-align:right;"> 15 </td> </tr> <tr> <td style="text-align:right;"> 65 </td> <td style="text-align:right;"> -22 </td> <td style="text-align:right;"> 85 </td> <td style="text-align:right;"> -57 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 54 </td> <td style="text-align:right;"> 100 </td> <td style="text-align:right;"> 76 </td> <td style="text-align:right;"> -11 </td> <td style="text-align:right;"> 83 </td> <td style="text-align:right;"> -60 </td> <td style="text-align:right;"> 74 </td> <td style="text-align:right;"> -61 </td> <td style="text-align:right;"> 30 </td> <td style="text-align:right;"> 93 </td> <td style="text-align:right;"> -53 </td> </tr> <tr> <td style="text-align:right;"> 90 </td> <td style="text-align:right;"> 90 </td> <td style="text-align:right;"> -68 </td> <td style="text-align:right;"> 51 </td> <td style="text-align:right;"> 85 </td> <td style="text-align:right;"> -58 </td> <td style="text-align:right;"> -56 </td> <td style="text-align:right;"> 38 </td> <td style="text-align:right;"> -34 </td> <td style="text-align:right;"> 
10 </td> <td style="text-align:right;"> 66 </td> <td style="text-align:right;"> -52 </td> <td style="text-align:right;"> 14 </td> <td style="text-align:right;"> -10 </td> <td style="text-align:right;"> -34 </td> <td style="text-align:right;"> -42 </td> </tr> <tr> <td style="text-align:right;"> 99 </td> <td style="text-align:right;"> 24 </td> <td style="text-align:right;"> -30 </td> <td style="text-align:right;"> -1 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 46 </td> <td style="text-align:right;"> -11 </td> <td style="text-align:right;"> 15 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> 67 </td> <td style="text-align:right;"> -17 </td> <td style="text-align:right;"> -48 </td> <td style="text-align:right;"> 36 </td> <td style="text-align:right;"> -62 </td> <td style="text-align:right;"> -86 </td> </tr> <tr> <td style="text-align:right;"> -24 </td> <td style="text-align:right;"> -28 </td> <td style="text-align:right;"> -9 </td> <td style="text-align:right;"> -13 </td> <td style="text-align:right;"> 19 </td> <td style="text-align:right;"> -3 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 90 </td> <td style="text-align:right;"> -63 </td> <td style="text-align:right;"> -28 </td> <td style="text-align:right;"> -18 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> 92 </td> <td style="text-align:right;"> 28 </td> <td style="text-align:right;"> -94 </td> <td style="text-align:right;"> -25 </td> </tr> <tr> <td style="text-align:right;"> 26 </td> <td style="text-align:right;"> 93 </td> <td style="text-align:right;"> 21 </td> <td style="text-align:right;"> 39 </td> <td style="text-align:right;"> -90 </td> <td style="text-align:right;"> 62 </td> <td style="text-align:right;"> -19 </td> <td style="text-align:right;"> 36 </td> <td style="text-align:right;"> 14 </td> <td style="text-align:right;"> -27 </td> <td 
style="text-align:right;"> -67 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> -19 </td> <td style="text-align:right;"> -46 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> 48 </td> </tr>
<tr> <td style="text-align:right;"> -45 </td> <td style="text-align:right;"> 98 </td> <td style="text-align:right;"> -56 </td> <td style="text-align:right;"> -48 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> 98 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> -32 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> 68 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> -99 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> 66 </td> <td style="text-align:right;"> 65 </td> <td style="text-align:right;"> -80 </td> </tr>
<tr> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 57 </td> <td style="text-align:right;"> -49 </td> <td style="text-align:right;"> 92 </td> <td style="text-align:right;"> 65 </td> <td style="text-align:right;"> -54 </td> <td style="text-align:right;"> -95 </td> <td style="text-align:right;"> -73 </td> <td style="text-align:right;"> -61 </td> <td style="text-align:right;"> -71 </td> <td style="text-align:right;"> -61 </td> <td style="text-align:right;"> 70 </td> <td style="text-align:right;"> 52 </td> <td style="text-align:right;"> -1 </td> <td style="text-align:right;"> 8 </td> </tr>
</tbody> </table>
</div>

---
class: light

# Summary numbers

It would be nice to reduce the big set of numbers down to a few numbers that we can look at and make sense of.

**Sameness (Central Tendency)**

- What are all the numbers close to?

**Differentness (Variance)**

- How different are the numbers?
---
class: light

# Descriptive Statistics

- Give us summaries of big sets of numbers
- Useful single numbers to look at
- They tell us about patterns of sameness and differentness

---
class: light, center, middle, clear

# Graph the numbers to get a better look

---
class: light

# Dot plot (unordered)

Graphing the numbers gives a quick and dirty sense of what they are like. Here are 200 numbers presented as dots

<img src="2-Descriptives_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" />

---
class: light

# Dot plot (ordered)

Sorting the numbers from smallest to largest

<img src="2-Descriptives_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" />

---
class: light

# Histograms

Histograms count up the numbers inside specific ranges

<img src="2-Descriptives_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />

---
class: light

# Histograms

Bars show you which bins have more or fewer numbers in their range

<img src="2-Descriptives_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />

---
class: light

# So what are these numbers like?

What single number would you say best describes most of these numbers?

<img src="2-Descriptives_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />

---
class: light

# Question

Is the red or blue value a better summary of all the numbers?

<img src="2-Descriptives_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" />

---
class: light, center, middle, clear

# Measures of Central Tendency

---
class: light

# Central Tendency

1. **Central tendency** should describe what most of the data is like

--

2. We want our summary number to be most like the other numbers. We want it to be a **representative value**

--

3. There are **multiple measures** of central tendency with different properties

--

4. Some work better than others depending on the data

---
class: dark, center, middle, clear

# Mode

---
class: light

# Mode

The mode is the single most frequently occurring number

> 1 1 2 2 3 4 5 6 7 7 7 7 7

- The mode is 7 because 7 happens the most
- Find the mode by counting the occurrences of each number; the mode is the most frequently occurring number
- If there is a tie, then you have two or three or more modes (depends on how many different numbers tie)

---
class: light

# Finding the Mode in Python

We make 25 numbers. How do we get Python to find the mode?

```python
import numpy as np

# 25 random integers from 1 to 10 (randint's upper bound is exclusive)
a = np.random.randint(1, 10 + 1, 25)
# counts[k] is how many times the value k appears in a
counts = np.bincount(a)
# position of the largest count = most frequent value (first one if tied)
mode = np.argmax(counts)
mode, counts[mode]
```

---
class: light

# Custom function for the mode in Python

You can always write your own function for the mode. This one is called `my_mode`

```python
def my_mode(array):
    # count how often each non-negative integer appears in array
    counts = np.bincount(array)
    # most frequent value (the first one if there is a tie)
    mode = np.argmax(counts)
    return mode, counts[mode]

a = np.random.randint(1, 10 + 1, 25)
my_mode(a)
```

---
class: light

# Thinking about the mode

When should we use the mode?

Appropriate for many datasets; for nominal (or ordinal) data, it may be one of the few reasonable descriptors

---
class: pink, center, middle, clear

# Median

---
class: light

# Median

The median is the middle number

> 1 1 2 2 3 4 **5** 6 7 7 7 7 7

- The median is 5 because it is the middle number
- Find the median by ordering the numbers from smallest to largest, then take the number in the middle

---
class: light

# Median (even number of numbers)

If there is an even number of numbers, find the two in the middle and take the value halfway between them

> 1 2 3 **4** **5** 6 7 8

- The median is 4.5 because 4.5 is halfway between the two middle numbers

---
class: light

# Finding the Median in Python

Put some numbers in a variable, then use `np.median()`.

```python
a = np.random.randint(1, 10 + 1, 12)
np.median(a)
```

---
class: light

# Thinking about the median

When would the median be a good thing to know?

Suitable for many datasets, and makes sense for ordinal data.
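
The rule for the median (sort, then take the middle value, or the value halfway between the two middle values when the count is even) can also be sketched by hand. This is only an illustration: `my_median` is a hypothetical helper name, not a NumPy function, and `np.median` is used just to compare results on the two examples from the slides.

```python
import numpy as np

def my_median(values):
    """Median by hand (hypothetical helper): sort, then take the middle."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # odd count: there is exactly one middle number
        return ordered[middle]
    # even count: halfway between the two middle numbers
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_example = [1, 1, 2, 2, 3, 4, 5, 6, 7, 7, 7, 7, 7]
even_example = [1, 2, 3, 4, 5, 6, 7, 8]

print(my_median(odd_example), np.median(odd_example))
print(my_median(even_example), np.median(even_example))
```

Both versions agree: the median is 5 for the 13 numbers and 4.5 for the 8 numbers.
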
More robust to outliers than the mean

---
class: light, center, middle, clear

# Mean

---
class: light

# Mean

The mean (also called the average) is the sum of the numbers divided by the number of numbers

`\(\text{Mean} = \frac{\text{sum of numbers}}{\text{number of numbers}}\)`

> 1 1 2 2 3 4 5 6 7 7 7 7 7

- Sum = 1+1+2+2+3+4+5+6+7+7+7+7+7 = 59
- Number of numbers = 13
- Mean = 59/13 = 4.538462

---
class: light

# Mean

`\(\text{Mean} = \bar{X} = \frac{\sum_{i=1}^{i=N}{x_i}}{N}\)`

- `\(\bar{X}\)` ("X bar") symbolizes the mean
- `\(\sum_{i=1}^{i=N}{x_i}\)` is summation notation
- `\(x\)` = all the numbers (1,2,3,4...)
- `\(i\)` = an index that runs from the first number in x to the last
- `\(N\)` = the number of numbers
- `\(\sum\)` = instruction to add up the numbers

---
class: light

# Summation example

`\(x = [4,7,9]\)`

`\(\sum_{i=1}^{i=N}{x_i} = x_{i=1} + x_{i=2} + x_{i=3} = 4 + 7 + 9 = 20\)`

---
class: light

# Mean in a table

<table>
<thead> <tr> <th style="text-align:left;"> index </th> <th style="text-align:left;"> x </th> </tr> </thead>
<tbody>
<tr> <td style="text-align:left;"> 1 </td> <td style="text-align:left;"> 4 </td> </tr>
<tr> <td style="text-align:left;"> 2 </td> <td style="text-align:left;"> 7 </td> </tr>
<tr> <td style="text-align:left;"> 3 </td> <td style="text-align:left;"> 2 </td> </tr>
<tr> <td style="text-align:left;"> 4 </td> <td style="text-align:left;"> 9 </td> </tr>
<tr> <td style="text-align:left;"> 5 </td> <td style="text-align:left;"> 8 </td> </tr>
<tr> <td style="text-align:left;"> Sum </td> <td style="text-align:left;"> 30 </td> </tr>
<tr> <td style="text-align:left;"> N </td> <td style="text-align:left;"> 5 </td> </tr>
<tr> <td style="text-align:left;"> Mean </td> <td style="text-align:left;"> 6 </td> </tr>
</tbody> </table>

---
class: light

# The mean equally divides the sum

<table> <thead> <tr> <th style="text-align:left;"> index </th> <th style="text-align:left;"> x </th> <th style="text-align:left;"> equal_parts </th> </tr>
</thead> <tbody>
<tr> <td style="text-align:left;"> 1 </td> <td style="text-align:left;"> 4 </td> <td style="text-align:left;"> 6 </td> </tr>
<tr> <td style="text-align:left;"> 2 </td> <td style="text-align:left;"> 7 </td> <td style="text-align:left;"> 6 </td> </tr>
<tr> <td style="text-align:left;"> 3 </td> <td style="text-align:left;"> 2 </td> <td style="text-align:left;"> 6 </td> </tr>
<tr> <td style="text-align:left;"> 4 </td> <td style="text-align:left;"> 9 </td> <td style="text-align:left;"> 6 </td> </tr>
<tr> <td style="text-align:left;"> 5 </td> <td style="text-align:left;"> 8 </td> <td style="text-align:left;"> 6 </td> </tr>
<tr> <td style="text-align:left;"> Sum </td> <td style="text-align:left;"> 30 </td> <td style="text-align:left;"> 30 </td> </tr>
<tr> <td style="text-align:left;"> N </td> <td style="text-align:left;"> 5 </td> <td style="text-align:left;"> 5 </td> </tr>
<tr> <td style="text-align:left;"> Mean </td> <td style="text-align:left;"> 6 </td> <td style="text-align:left;"> 6 </td> </tr>
</tbody> </table>

---
class: light

# The mean is the balancing point

.pull-left[

- deviation = score minus mean
- sum of deviations will always equal zero

]

.pull-right[
<table> <thead> <tr> <th style="text-align:left;"> index </th> <th style="text-align:left;"> x </th> <th style="text-align:left;"> deviations </th> </tr> </thead> <tbody>
<tr> <td style="text-align:left;"> 1 </td> <td style="text-align:left;"> 4 </td> <td style="text-align:left;"> -2 </td> </tr>
<tr> <td style="text-align:left;"> 2 </td> <td style="text-align:left;"> 7 </td> <td style="text-align:left;"> 1 </td> </tr>
<tr> <td style="text-align:left;"> 3 </td> <td style="text-align:left;"> 2 </td> <td style="text-align:left;"> -4 </td> </tr>
<tr> <td style="text-align:left;"> 4 </td> <td style="text-align:left;"> 9 </td> <td style="text-align:left;"> 3 </td> </tr>
<tr> <td style="text-align:left;"> 5 </td> <td style="text-align:left;"> 8 </td> <td style="text-align:left;"> 2 </td> </tr>
<tr>
<td style="text-align:left;"> Sum </td> <td style="text-align:left;"> 30 </td> <td style="text-align:left;"> 0 </td> </tr>
<tr> <td style="text-align:left;"> N </td> <td style="text-align:left;"> 5 </td> <td style="text-align:left;"> 5 </td> </tr>
<tr> <td style="text-align:left;"> Mean </td> <td style="text-align:left;"> 6 </td> <td style="text-align:left;"> 0 </td> </tr>
</tbody> </table>
]

---
class: light

# Finding the Mean in Python

Use the `np.mean()` function

```python
# make some numbers
a = np.random.randint(1, 10 + 1, 12)
np.mean(a)
```

---
class: light

# np.sum() and .size

- `np.sum()` sums up the numbers
- `.size` counts up the number of numbers in the variable

```python
a = np.random.randint(1, 10 + 1, 12)
np.sum(a)
```

```python
a.size
```

---
class: light

# Mean = np.sum() / .size

```python
a = np.random.randint(1, 10 + 1, 12)
np.sum(a) / a.size
```

---
class: light

# Thinking about the Mean

When would the mean be a good thing to know?

Most appropriate for interval and ratio data, but sensitive to outliers.

---
class: light, center, middle, clear

# Do descriptive statistics for central tendency actually describe the data?

## It depends on the data

---
class: light

# Histogram shape: Bell-Shaped

Mean (Red), Median (Green), Mode (Blue)

<img src="2-Descriptives_files/figure-html/unnamed-chunk-24-1.png" width="450px" style="display: block; margin: auto;" />

---
class: light

# Right-skewed

Mean (Red), Median (Green), Mode (Blue)

<img src="2-Descriptives_files/figure-html/unnamed-chunk-25-1.png" width="450px" style="display: block; margin: auto;" />

---
class: light

# Outliers

Outliers are really big or really small values that are unusual compared to the rest of the data

<img src="2-Descriptives_files/figure-html/unnamed-chunk-26-1.png" width="400px" style="display: block; margin: auto;" />

---
class: light

# Mean, Median, and outliers

The mean is influenced by outliers; the median is barely affected.
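
A small sketch makes the difference concrete. The `scores` values below are made up for illustration (they are not the data behind the plots); the outlier of 2000 is chosen to match the big number discussed in these slides.

```python
import numpy as np

# made-up scores, for illustration only
scores = np.array([2, 3, 3, 4, 5, 5, 6])
# the same scores plus one very large outlier
with_outlier = np.append(scores, 2000)

print("mean:  ", np.mean(scores), "->", np.mean(with_outlier))
print("median:", np.median(scores), "->", np.median(with_outlier))
```

Here the mean jumps from 4.0 to 253.5, while the median only moves from 4.0 to 4.5.
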
Mean (Red), Median (Green)

<img src="2-Descriptives_files/figure-html/unnamed-chunk-27-1.png" width="400px" style="display: block; margin: auto;" />

---
class: light

# Zooming in

The big number (2000) makes the mean really big, because it is included in the sum.

<img src="2-Descriptives_files/figure-html/unnamed-chunk-28-1.png" width="400px" style="display: block; margin: auto;" />

---
class: pink, center, middle, clear

# Always plot your data

---
class: light

# Big ideas

1. Descriptive statistics help us reduce a large pile of numbers to a few numbers that "describe the data"

--

2. Mode, median, and mean are descriptives for central tendency in the data (meant to represent what most of the numbers are like)

--

3. Measures of central tendency can be "off" by quite a bit depending on the shape of the data, so we need to look at the data to see if they are appropriate

---
class: light

Thanks to Todd Gureckis and Matt Crump for the slides.