// Numbas version: exam_results_page_options {"name": "Statistics - Frequency Tables, Measures of central tendency and Spread", "extensions": ["stats"], "custom_part_types": [], "resources": [], "navigation": {"allowregen": true, "showfrontpage": false, "preventleave": false, "typeendtoleave": false}, "question_groups": [{"pickingStrategy": "all-ordered", "questions": [{"variables": {"median": {"group": "Final data", "name": "median", "templateType": "anything", "definition": "median(a)", "description": ""}, "cf": {"group": "Final data", "name": "cf", "templateType": "anything", "definition": "[freq[0],freq[0]+freq[1],freq[0]+freq[1]+freq[2],freq[0]+freq[1]+freq[2]+freq[3],freq[0]+freq[1]+freq[2]+freq[3]+freq[4],freq[0]+freq[1]+freq[2]+freq[3]+freq[4]+freq[5],freq[0]+freq[1]+freq[2]+freq[3]+freq[4]+freq[5]+freq[6]]", "description": ""}, "mode": {"group": "Final data", "name": "mode", "templateType": "anything", "definition": "m[0]", "description": ""}, "a": {"group": "Final data", "name": "a", "templateType": "anything", "definition": "if(len(modea1) = 1, a1, if(len(modea2) = 1, a2, a3))", "description": ""}, "modea1": {"group": "Ungrouped variables", "name": "modea1", "templateType": "anything", "definition": "mode(a1)", "description": ""}, "as": {"group": "Final data", "name": "as", "templateType": "anything", "definition": "sort(a)", "description": ""}, "m": {"group": "Final data", "name": "m", "templateType": "anything", "definition": "mode(a)", "description": ""}, "a3": {"group": "Ungrouped variables", "name": "a3", "templateType": "anything", "definition": "shuffle(repeat(0, 7) + repeat(1, 10) + repeat(2, 5) + repeat(3, 4) + repeat(4, 2) + repeat(5,1) + repeat(6,1))", "description": ""}, "a2": {"group": "Ungrouped variables", "name": "a2", "templateType": "anything", "definition": "shuffle(repeat(random(0..1), 13) + 0 + 2 + 2 + repeat(random(2..3), 10) + repeat(random(4..5), 3) + random(0..6))", "description": ""}, "mean": {"group": "Final data", "name": 
"mean", "templateType": "anything", "definition": "mean(a)", "description": ""}, "scores": {"group": "Final data", "name": "scores", "templateType": "anything", "definition": "[0,1,2,3,4,5,6]", "description": ""}, "modea3": {"group": "Ungrouped variables", "name": "modea3", "templateType": "anything", "definition": "mode(a3)", "description": ""}, "modea2": {"group": "Ungrouped variables", "name": "modea2", "templateType": "anything", "definition": "mode(a2)", "description": ""}, "freq": {"group": "Final data", "name": "freq", "templateType": "anything", "definition": "map(\nlen(filter(x=j,x,a)),\nj, 0..6)", "description": ""}, "fx": {"group": "Final data", "name": "fx", "templateType": "anything", "definition": "scores[0]*freq[0]+scores[1]*freq[1]+scores[2]*freq[2]+scores[3]*freq[3]+scores[4]*freq[4]+scores[5]*freq[5]+scores[6]*freq[6]", "description": ""}, "a1": {"group": "Ungrouped variables", "name": "a1", "templateType": "anything", "definition": "shuffle(repeat(random(0..1), 13) + 0 + 2 + 2 + repeat(random(2..3), 10) + repeat(random(4..5), 3) + random(0..6))", "description": ""}}, "statement": "
Once we have gathered data, two questions naturally arise:
\nFrequency distribution tables allow us to organise data quickly and to determine the measures of central tendency and the measures of spread easily. These measures give us valuable insight into our data set.
\n\n\n30 randomly chosen students were asked how many siblings they have. These are their responses:
\n$\\var{a[0]}$ | \n$\\var{a[1]}$ | \n$\\var{a[2]}$ | \n$\\var{a[3]}$ | \n$\\var{a[4]}$ | \n$\\var{a[5]}$ | \n$\\var{a[6]}$ | \n$\\var{a[7]}$ | \n$\\var{a[8]}$ | \n$\\var{a[9]}$ | \n
$\\var{a[10]}$ | \n$\\var{a[11]}$ | \n$\\var{a[12]}$ | \n$\\var{a[13]}$ | \n$\\var{a[14]}$ | \n$\\var{a[15]}$ | \n$\\var{a[16]}$ | \n$\\var{a[17]}$ | \n$\\var{a[18]}$ | \n$\\var{a[19]}$ | \n
$\\var{a[20]}$ | \n$\\var{a[21]}$ | \n$\\var{a[22]}$ | \n$\\var{a[23]}$ | \n$\\var{a[24]}$ | \n$\\var{a[25]}$ | \n$\\var{a[26]}$ | \n$\\var{a[27]}$ | \n$\\var{a[28]}$ | \n$\\var{a[29]}$ | \n
Given a table of data, complete a frequency distribution table and use it to calculate the mean, mode, median and range.
"}, "name": "Statistics - Frequency Tables, Measures of central tendency and Spread", "advice": "Organising the data in a frequency table helps to make mistakes less likely when calculating statistics from our data, summarising the responses all in one place with fewer numbers.
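\nEach row of the frequency column, $f$, gives the number of students with the corresponding number of siblings, $x$. The $fx$ column is the number of siblings multiplied by its frequency, and the $cf$ column is the cumulative frequency, that is, the running total of the $f$ column.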
\nNumber of siblings | \nf | \nfx | \ncf | \n
---|---|---|---|
$0$ | \n$\\var{freq[0]}$ | \n$\\simplify{{freq[0]*scores[0]}}$ | \n$\\var{cf[0]}$ | \n
$1$ | \n$\\var{freq[1]}$ | \n$\\simplify{{freq[1]*scores[1]}}$ | \n$\\var{cf[1]}$ | \n
$2$ | \n$\\var{freq[2]}$ | \n$\\simplify{{freq[2]*scores[2]}}$ | \n$\\var{cf[2]}$ | \n
$3$ | \n$\\var{freq[3]}$ | \n$\\simplify{{freq[3]*scores[3]}}$ | \n$\\var{cf[3]}$ | \n
$4$ | \n$\\var{freq[4]}$ | \n$\\simplify{{freq[4]*scores[4]}}$ | \n$\\var{cf[4]}$ | \n
$5$ | \n$\\var{freq[5]}$ | \n$\\simplify{{freq[5]*scores[5]}}$ | \n$\\var{cf[5]}$ | \n
$6$ | \n$\\var{freq[6]}$ | \n$\\simplify{{freq[6]*scores[6]}}$ | \n$\\var{cf[6]}$ | \n
Total | \n$30$ | \n$\\var{fx}$ | \n$30$ | \n
Always remember to check whether your frequency column adds up to the total (here, it is $30$) to make sure you have not left out any responses.
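As a rough illustration, the same table can be built programmatically. The Python sketch below assumes a hypothetical fixed sample of 30 responses (not the randomised data in this question); the names responses, scores, freq, fx and cf are chosen here purely for illustration.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical sample of 30 responses (an assumption, not the question's data).
responses = [0] * 7 + [1] * 10 + [2] * 5 + [3] * 4 + [4] * 2 + [5] + [6]

counts = Counter(responses)
scores = list(range(7))                      # possible numbers of siblings, 0 to 6
freq = [counts[x] for x in scores]           # f column
fx = [x * f for x, f in zip(scores, freq)]   # fx column
cf = list(accumulate(freq))                  # cf column: running total of f

for row in zip(scores, freq, fx, cf):
    print(row)

# The frequency column must add up to the sample size.
assert sum(freq) == len(responses)
```

The final check mirrors the advice above: if the frequencies do not add up to the number of responses, something has been left out.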
\nThe mean number of siblings is the total number of siblings, $\\sum x$, divided by the number of students in the sample, $n$.
\n\\begin{align}
\\sum x &= 0 \\times \\var{freq[0]} + 1 \\times \\var{freq[1]} + 2 \\times \\var{freq[2]} + 3 \\times \\var{freq[3]} + 4 \\times \\var{freq[4]} + 5 \\times \\var{freq[5]} + 6 \\times \\var{freq[6]}
\\\\
&= 0 + \\var{1*freq[1]} + \\var{2*freq[2]} + \\var{3*freq[3]} + \\var{4*freq[4]} + \\var{5*freq[5]} + \\var{6*freq[6]} \\\\&= \\var{sum(a)} \\text{.}
\\end{align}
The total number of students $n$ is $30$.
\nTherefore the mean is
\n\\begin{align}
\\bar{x} &= \\frac{\\sum x}{n} \\\\
&= \\frac{\\var{sum(a)}}{30} \\\\
&= \\var{mean} \\text{.}
\\end{align}
Rounding the answer to 2 decimal places, we get $\\var{precround(mean, 2)}$.
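The same calculation can be sketched in Python. The frequencies below are a hypothetical example (an assumption, not the randomised values in this question), and the variable names are illustrative only.

```python
scores = [0, 1, 2, 3, 4, 5, 6]
freq = [7, 10, 5, 4, 2, 1, 1]   # hypothetical frequencies, one per score

n = sum(freq)                                       # number of students
total = sum(x * f for x, f in zip(scores, freq))    # sum of the fx column
mean = total / n

print(n, total, round(mean, 2))
```

Working from the table means adding seven products rather than thirty separate responses.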
\nThe mode is the value with the highest frequency. Here, the mode is $\\var{mode}$ siblings, with frequency $\\var{freq[mode]}$.
\nThe median is the \"middle\" value in the sample, when arranged in numerical order.
\nSince $n = 30$ is even, there are two middle values in this data (the $15$th and $16$th). Because the rows of the table are already in numerical order, we can count down from the top of the table, using the cumulative frequency column, until we locate the rows where these middle values lie.
\nHere, the $15$th value lies in the row $\\var{as[14]}$ and the $16$th value lies in the row $\\var{as[15]}$.
\nIf these two values are equal, the median is that common value; otherwise the median is their mean. In either case we can calculate
\n\\begin{align} \\frac{\\var{as[14]} + \\var{as[15]}}{2} &= \\frac{\\var{as[14] + as[15]}}{2} \\\\&= \\var{median} \\text{.} \\end{align}
\nThis is the median for this data.
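Locating the middle values with the cumulative frequency column can be sketched in Python as follows. The frequencies are a hypothetical example (an assumption, not this question's data), and value_at is a helper name introduced here for illustration.

```python
from bisect import bisect_left
from itertools import accumulate

scores = [0, 1, 2, 3, 4, 5, 6]
freq = [7, 10, 5, 4, 2, 1, 1]   # hypothetical frequencies
cf = list(accumulate(freq))     # cumulative frequencies
n = cf[-1]

def value_at(position):
    # Score in the first row whose cumulative frequency reaches
    # the given position (positions are counted from 1).
    return scores[bisect_left(cf, position)]

if n % 2 == 0:
    # Even sample size: average the two middle values.
    median = (value_at(n // 2) + value_at(n // 2 + 1)) / 2
else:
    # Odd sample size: take the single middle value.
    median = value_at((n + 1) // 2)

print(median)
```

For these frequencies both the 15th and 16th values fall in the row for 1 sibling, so the median is 1.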
\n\nRange
\nThe range gives us an idea of the spread of the data and is simply the difference between the largest score and the smallest score. It can easily be found from our table by looking for the largest score with a non-zero frequency and subtracting the smallest score with a non-zero frequency.
\n\nRange $= \\var{max(a)} - \\var{min(a)} = \\simplify{{max(a)-min(a)}}$
\n\nAdding an outlier
\n\nAn outlier is a score that is \"much\" smaller or \"much\" larger than the majority of the other scores in a data set. Exactly what we mean by \"much\" will be looked at in later years. We will now adjust our frequency distribution table to include the extra score of $19$.
\n\nNumber of siblings | \nf | \nfx | \ncf | \n
---|---|---|---|
$0$ | \n$\\var{freq[0]}$ | \n$\\simplify{{freq[0]*scores[0]}}$ | \n$\\var{cf[0]}$ | \n
$1$ | \n$\\var{freq[1]}$ | \n$\\simplify{{freq[1]*scores[1]}}$ | \n$\\var{cf[1]}$ | \n
$2$ | \n$\\var{freq[2]}$ | \n$\\simplify{{freq[2]*scores[2]}}$ | \n$\\var{cf[2]}$ | \n
$3$ | \n$\\var{freq[3]}$ | \n$\\simplify{{freq[3]*scores[3]}}$ | \n$\\var{cf[3]}$ | \n
$4$ | \n$\\var{freq[4]}$ | \n$\\simplify{{freq[4]*scores[4]}}$ | \n$\\var{cf[4]}$ | \n
$5$ | \n$\\var{freq[5]}$ | \n$\\simplify{{freq[5]*scores[5]}}$ | \n$\\var{cf[5]}$ | \n
$6$ | \n$\\var{freq[6]}$ | \n$\\simplify{{freq[6]*scores[6]}}$ | \n$\\var{cf[6]}$ | \n
$19$ | \n$1$ | \n$19$ | \n$31$ | \n
Total | \n$31$ | \n$\\simplify{{fx+19}}$ | \n$31$ | \n
Mean $= \\frac{\\simplify{{fx+19}}}{31} = \\var{precround((fx+19)/31,2)}$ (rounded to 2 decimal places)
\nMode $= \\var{mode}$
\nSince there are now $31$ scores, the median is the $16$th score, so:
\nMedian $= \\var{median(a+19)}$
\nRange $= 19 - \\var{min(a)} = \\var{19-min(a)}$
\n\nSo we can see that, of the three measures of central tendency, the mean is the most sensitive to outliers: it changed the most. This is why we have more than one way to measure central tendency; sometimes one is better suited than the others. Similarly, in future years we will introduce other measures of spread, as the range is also highly sensitive to outliers.
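The effect of the outlier can be checked with a short Python sketch using the standard statistics module. The sample below is hypothetical (an assumption, not the randomised data in this question), and summary is a helper name introduced for illustration.

```python
from statistics import mean, median, mode

# Hypothetical sample of 30 responses, then the same sample with the outlier.
responses = [0] * 7 + [1] * 10 + [2] * 5 + [3] * 4 + [4] * 2 + [5] + [6]
with_outlier = responses + [19]

def summary(data):
    # Collect the four measures discussed above for one data set.
    return {
        'mean': round(mean(data), 2),
        'mode': mode(data),
        'median': median(data),
        'range': max(data) - min(data),
    }

before = summary(responses)
after = summary(with_outlier)
print(before)
print(after)
```

For this sample the mean moves from 1.7 to 2.26 and the range from 6 to 19, while the mode and median both stay at 1.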
", "preamble": {"js": "", "css": ""}, "rulesets": {}, "functions": {}, "variablesTest": {"condition": "", "maxRuns": "1000"}, "parts": [{"showCorrectAnswer": true, "sortAnswers": false, "extendBaseMarkingAlgorithm": true, "customMarkingAlgorithm": "", "type": "gapfill", "scripts": {}, "showFeedbackIcon": true, "variableReplacementStrategy": "originalfirst", "variableReplacements": [], "marks": 0, "prompt": "Complete the following frequency table:
\nNumber of siblings | \nf | \nfx | \ncf | \n
---|---|---|---|
$0$ | \n[[0]] | \n[[7]] | \n[[15]] | \n
$1$ | \n[[1]] | \n[[8]] | \n[[16]] | \n
$2$ | \n[[2]] | \n[[9]] | \n[[17]] | \n
$3$ | \n[[3]] | \n[[10]] | \n[[18]] | \n
$4$ | \n[[4]] | \n[[11]] | \n[[19]] | \n
$5$ | \n[[5]] | \n[[12]] | \n[[20]] | \n
$6$ | \n[[6]] | \n[[13]] | \n[[21]] | \n
Total | \n$30$ | \n[[14]] | \n$30$ | \n
Find the mean, mode and median for this data.
\nMean = [[0]]
\nMode = [[1]]
\nMedian = [[2]]
", "gaps": [{"minValue": "mean", "showCorrectAnswer": true, "showPrecisionHint": true, "extendBaseMarkingAlgorithm": true, "customMarkingAlgorithm": "", "scripts": {}, "mustBeReducedPC": 0, "allowFractions": false, "variableReplacements": [], "marks": 1, "precisionMessage": "You have not given your answer to the correct precision.", "precisionPartialCredit": 0, "notationStyles": ["plain", "en", "si-en"], "precision": "2", "precisionType": "dp", "type": "numberentry", "unitTests": [], "strictPrecision": false, "showFeedbackIcon": true, "correctAnswerStyle": "plain", "maxValue": "mean", "correctAnswerFraction": false, "mustBeReduced": false, "variableReplacementStrategy": "originalfirst"}, {"minValue": "mode", "showCorrectAnswer": true, "mustBeReduced": false, "extendBaseMarkingAlgorithm": true, "customMarkingAlgorithm": "", "type": "numberentry", "scripts": {}, "showFeedbackIcon": true, "variableReplacementStrategy": "originalfirst", "maxValue": "mode", "correctAnswerFraction": false, "mustBeReducedPC": 0, "variableReplacements": [], "marks": 1, "allowFractions": false, "unitTests": [], "notationStyles": ["plain", "en", "si-en"], "correctAnswerStyle": "plain"}, {"minValue": "median", "showCorrectAnswer": true, "mustBeReduced": false, "extendBaseMarkingAlgorithm": true, "customMarkingAlgorithm": "", "type": "numberentry", "scripts": {}, "showFeedbackIcon": true, "variableReplacementStrategy": "originalfirst", "maxValue": "median", "correctAnswerFraction": false, "mustBeReducedPC": 0, "variableReplacements": [], "marks": 1, "allowFractions": false, "unitTests": [], "notationStyles": ["plain", "en", "si-en"], "correctAnswerStyle": "plain"}], "unitTests": []}, {"showCorrectAnswer": true, "sortAnswers": false, "extendBaseMarkingAlgorithm": true, "customMarkingAlgorithm": "", "type": "gapfill", "scripts": {}, "showFeedbackIcon": true, "variableReplacementStrategy": "originalfirst", "variableReplacements": [], "marks": 0, "prompt": "Range=[[0]]
", "gaps": [{"minValue": "max(a)-min(a)", "showCorrectAnswer": true, "mustBeReduced": false, "extendBaseMarkingAlgorithm": true, "customMarkingAlgorithm": "", "type": "numberentry", "scripts": {}, "showFeedbackIcon": true, "variableReplacementStrategy": "originalfirst", "maxValue": "max(a)-min(a)", "correctAnswerFraction": false, "mustBeReducedPC": 0, "variableReplacements": [], "marks": 1, "allowFractions": false, "unitTests": [], "notationStyles": ["plain", "en", "si-en"], "correctAnswerStyle": "plain"}], "unitTests": []}, {"showCorrectAnswer": true, "sortAnswers": false, "extendBaseMarkingAlgorithm": true, "customMarkingAlgorithm": "", "type": "gapfill", "scripts": {}, "showFeedbackIcon": true, "variableReplacementStrategy": "originalfirst", "variableReplacements": [], "marks": 0, "prompt": "If we added a score of 19 to the set of data at the top, calculate the following:
\nMean = [[0]] (to 2 decimal places)
\nMode = [[1]]
\nMedian = [[2]]
\nRange = [[3]]
", "gaps": [{"minValue": "precround(mean(a+19),2)", "showCorrectAnswer": true, "mustBeReduced": false, "extendBaseMarkingAlgorithm": true, "customMarkingAlgorithm": "", "type": "numberentry", "scripts": {}, "showFeedbackIcon": true, "variableReplacementStrategy": "originalfirst", "maxValue": "precround(mean(a+19),2)", "correctAnswerFraction": false, "mustBeReducedPC": 0, "variableReplacements": [], "marks": 1, "allowFractions": false, "unitTests": [], "notationStyles": ["plain", "en", "si-en"], "correctAnswerStyle": "plain"}, {"minValue": "mode", "showCorrectAnswer": true, "mustBeReduced": false, "extendBaseMarkingAlgorithm": true, "customMarkingAlgorithm": "", "type": "numberentry", "scripts": {}, "showFeedbackIcon": true, "variableReplacementStrategy": "originalfirst", "maxValue": "mode", "correctAnswerFraction": false, "mustBeReducedPC": 0, "variableReplacements": [], "marks": 1, "allowFractions": false, "unitTests": [], "notationStyles": ["plain", "en", "si-en"], "correctAnswerStyle": "plain"}, {"minValue": "median(a+19)", "showCorrectAnswer": true, "mustBeReduced": false, "extendBaseMarkingAlgorithm": true, "customMarkingAlgorithm": "", "type": "numberentry", "scripts": {}, "showFeedbackIcon": true, "variableReplacementStrategy": "originalfirst", "maxValue": "median(a+19)", "correctAnswerFraction": false, "mustBeReducedPC": 0, "variableReplacements": [], "marks": 1, "allowFractions": false, "unitTests": [], "notationStyles": ["plain", "en", "si-en"], "correctAnswerStyle": "plain"}, {"minValue": "max(a+19)-min(a+19)", "showCorrectAnswer": true, "mustBeReduced": false, "extendBaseMarkingAlgorithm": true, "customMarkingAlgorithm": "", "type": "numberentry", "scripts": {}, "showFeedbackIcon": true, "variableReplacementStrategy": "originalfirst", "maxValue": "max(a+19)-min(a+19)", "correctAnswerFraction": false, "mustBeReducedPC": 0, "variableReplacements": [], "marks": 1, "allowFractions": false, "unitTests": [], "notationStyles": ["plain", "en", "si-en"], 
"correctAnswerStyle": "plain"}], "unitTests": []}], "type": "question", "contributors": [{"name": "Christian Lawson-Perfect", "profile_url": "https://numbas.mathcentre.ac.uk/accounts/profile/7/"}, {"name": "Stanislav Duris", "profile_url": "https://numbas.mathcentre.ac.uk/accounts/profile/1590/"}, {"name": "Paul Hancock", "profile_url": "https://numbas.mathcentre.ac.uk/accounts/profile/1738/"}]}]}], "contributors": [{"name": "Christian Lawson-Perfect", "profile_url": "https://numbas.mathcentre.ac.uk/accounts/profile/7/"}, {"name": "Stanislav Duris", "profile_url": "https://numbas.mathcentre.ac.uk/accounts/profile/1590/"}, {"name": "Paul Hancock", "profile_url": "https://numbas.mathcentre.ac.uk/accounts/profile/1738/"}]}