// Numbas version: exam_results_page_options
{"name": "Quadratic regression", "extensions": ["stats", "jsxgraph"], "custom_part_types": [], "resources": [], "navigation": {"allowregen": true, "showfrontpage": false, "preventleave": false, "typeendtoleave": false}, "question_groups": [{"pickingStrategy": "all-ordered", "questions": [{"functions": {"regfun": {"definition": "\nvar div = Numbas.extensions.jsxgraph.makeBoard('400px','400px',\n{boundingBox:[-15,maxy,maxx,-400],\n axis:false,\n showNavigation:false,\n grid:false,\n });\n\n var board = div.board; \nvar xaxis = board.create('line',[[0,0],[1,0]], { strokeColor: 'black', fixed: true,name:\"Advertising spend\",withLabel:false});\nvar xticks = board.create('ticks',[xaxis,10],{\n drawLabels: true,\n label: {offset: [-4, -10]},\n minorTicks: 0\n});\n\n// create the y-axis\nvar yaxis = board.create('line',[[0,0],[0,1]], { strokeColor: 'black', fixed: true,name:\"Revenue\",withLabel:false });\nvar yticks = board.create('ticks',[yaxis,500],{\ndrawLabels: true,\nlabel: {offset: [-20, 0]},\nminorTicks: 0\n});\nfor (j=0;j
$\\displaystyle SPXY=\\sum xy - \\frac{(\\sum x)\\times (\\sum y)}{\\var{n}}=\\var{sxy}-\\frac{\\var{t[0]}\\times \\var{t[1]}}{\\var{n}}=\\var{spxy}$
\n$\\displaystyle SSX=\\sum x^2 - \\frac{(\\sum x)^2}{\\var{n}}=\\var{ssq[0]}- \\frac{\\var{t[0]}^2}{\\var{n}}=\\var{ss[0]}$
\nHence $\\displaystyle a=\\frac{\\var{spxy}}{\\var{ss[0]}}=\\var{spxy/ss[0]}$.
\n\nThen $\\displaystyle b = \\frac{1}{\\var{n}}\\left[\\sum y-a \\sum x\\right]=\\frac{1}{\\var{n}}\\left[\\var{t[1]}-\\var{atrue}\\times\\var{t[0]}\\right]=\\var{btrue}$
\n", "rulesets": {"std": ["all", "fractionNumbers", "!collectNumbers", "!noLeadingMinus"]}, "parts": [{"stepsPenalty": 0, "prompt": "{regfun(r1,r2,max(r1)+30,max(r2)+30,rsquared,sumr,n,1)}
\nFirst we perform a linear regression $y=ax+b$ to find the equation of the line fitted as shown in the above diagram.
\nYou have to calculate the coefficients $a$ and $b$.
\nIf you input your line (before submitting), it will be drawn on the above graph so that you can see how it compares with the fitted line.
\nNote carefully that the line drawn only gives a rough guide to the fitted line and you should always calculate $a$ and $b$ to ensure you obtain the required result.
\nInput your line in the form $y=ax+b$, with the values of $a$ and $b$ accurate to 3 decimal places:
\n$y=\\;$[[0]].
\nClick on Show steps if you want more information on calculating $a$ and $b$. You will not lose any marks by doing so.
\n\n", "marks": 0, "gaps": [{"expectedvariablenames": [], "checkingaccuracy": 0.001, "vsetrange": [0, 1], "showpreview": true, "vsetrangepoints": 5, "showCorrectAnswer": true, "answersimplification": "all,!noLeadingMinus", "scripts": {}, "answer": "{a}*x+{b}", "marks": 1, "checkvariablenames": false, "checkingtype": "absdiff", "type": "jme"}], "showCorrectAnswer": true, "scripts": {}, "steps": [{"type": "information", "showCorrectAnswer": true, "scripts": {}, "prompt": "
To find $a$ and $b$ you first find $\\displaystyle a = \\frac{SPXY}{SSX}$ where:
\n$\\displaystyle SPXY=\\sum xy - \\frac{(\\sum x)\\times (\\sum y)}{\\var{n}}$
\n$\\displaystyle SSX=\\sum x^2 - \\frac{(\\sum x)^2}{\\var{n}}$
\nThen $\\displaystyle b = \\frac{1}{\\var{n}}\\left[\\sum y-a \\sum x\\right]$
\nNote that it is advisable to do all your calculations to at least 5 decimal places in order to ensure that $a$ and $b$ are accurate to 3 decimal places at the end of the calculations. In particular, when you calculate $b$ using $a$, the value of $a$ should be accurate to at least 5 decimal places.
\n\n", "marks": 0}], "type": "gapfill"}, {"type": "information", "showCorrectAnswer": true, "scripts": {}, "prompt": "
Next we fit the data by using quadratic regression; that is, we look for a function of the form $y=ax^2+bx+c$ with constants $a,\\;b,\\;c$ that best fit the data.
\nYou are not expected to find this quadratic; it is displayed below.
\nNote that the SSE, a measure of how well the quadratic regression fits the data, is also displayed. Compare this with the SSE for the linear regression, which is shown as well.
\n{regfun(r1,r2,max(r1)+30,max(r2)+30,rsquared,sumr,n,2)}
\nYou can increase the degree of the polynomial fitting the data by moving the slider; the updated SSE then measures how well the new polynomial fits. Note, however, that there is little sense in raising the degree if the SSE shows only marginal gains in fit. The rule is that you should always go for the simplest model!
", "marks": 0}], "statement": "It is believed that there is a quadratic relationship between the amount a company spends on advertising ( $X$, in thousands of pounds) and their revenue ( $Y$, also in thousands of pounds).
\nTo investigate, $X$ and $Y$ were recorded for $\\var{n}$ randomly selected companies in the U.K.
\n\n\n\n\n\n\n\nYour first task is to find the equation of the fitted linear regression line, which is shown as the black line. You are given the information below in order to complete the task.
\n$X$ | \n$\\sum x=\\;\\var{t[0]}$ | \n$\\sum x^2=\\;\\var{ssq[0]}$ | \n
---|---|---|
$Y$ | \n$\\sum y=\\;\\var{t[1]}$ | \n$\\sum y^2=\\;\\var{ssq[1]}$ | \n
You are also given $\\sum xy = \\var{sxy}$.
\n\n", "variable_groups": [], "progress": "ready", "preamble": {"css": "", "js": ""}, "variables": {"ch": {"definition": "random(0..n-1)", "templateType": "anything", "group": "Ungrouped variables", "name": "ch", "description": ""}, "atrue": {"definition": "spxy/ss[0]", "templateType": "anything", "group": "Ungrouped variables", "name": "atrue", "description": ""}, "b1": {"definition": "random(2..5#0.1)", "templateType": "anything", "group": "Ungrouped variables", "name": "b1", "description": ""}, "sxy": {"definition": "sum(map(r1[x]*r2[x],x,0..n-1))", "templateType": "anything", "group": "Ungrouped variables", "name": "sxy", "description": ""}, "res": {"definition": "map(precround(r2[x]-(b+a*r1[x]),2),x,0..n-1)", "templateType": "anything", "group": "Ungrouped variables", "name": "res", "description": ""}, "spxy": {"definition": "sxy-t[0]*t[1]/n", "templateType": "anything", "group": "Ungrouped variables", "name": "spxy", "description": ""}, "ls": {"definition": "precround(b+a*sc,2)", "templateType": "anything", "group": "Ungrouped variables", "name": "ls", "description": ""}, "tol": {"definition": 0.001, "templateType": "anything", "group": "Ungrouped variables", "name": "tol", "description": ""}, "a": {"definition": "precround(atrue,3)", "templateType": "anything", "group": "Ungrouped variables", "name": "a", "description": ""}, "btrue": {"definition": "1/n*(t[1]-spxy/ss[0]*t[0])", "templateType": "anything", "group": "Ungrouped variables", "name": "btrue", "description": ""}, "ssq": {"definition": "[sum(map(x^2,x,r1)),sum(map(x^2,x,r2))]", "templateType": "anything", "group": "Ungrouped variables", "name": "ssq", "description": ""}, "sumr": {"definition": "round(sum(map(res[x]^2,x,0..n-1)))", "templateType": "anything", "group": "Ungrouped variables", "name": "sumr", "description": ""}, "a1": {"definition": "random(2..5#0.5)", "templateType": "anything", "group": "Ungrouped variables", "name": "a1", "description": ""}, "c1": {"definition": "random(0.1..1#0.1)", 
"templateType": "anything", "group": "Ungrouped variables", "name": "c1", "description": ""}, "tsqovern": {"definition": "[t[0]^2/n,t[1]^2/n]", "templateType": "anything", "group": "Ungrouped variables", "name": "tsqovern", "description": ""}, "b": {"definition": "precround(btrue,3)", "templateType": "anything", "group": "Ungrouped variables", "name": "b", "description": ""}, "obj": {"definition": "['A','B','C','D','E','F','G','H']", "templateType": "anything", "group": "Ungrouped variables", "name": "obj", "description": ""}, "r1": {"definition": "repeat(round(normalsample(50,10)),n)", "templateType": "anything", "group": "Ungrouped variables", "name": "r1", "description": ""}, "r2": {"definition": "map(round((a1+b1*x+c1*x^2+normalsample(0,20))),x,r1)", "templateType": "anything", "group": "Ungrouped variables", "name": "r2", "description": ""}, "ss": {"definition": "[ssq[0]-t[0]^2/n,ssq[1]-t[1]^2/n]", "templateType": "anything", "group": "Ungrouped variables", "name": "ss", "description": ""}, "n": {"definition": "random(20..35)", "templateType": "anything", "group": "Ungrouped variables", "name": "n", "description": ""}, "t": {"definition": "[sum(r1),sum(r2)]", "templateType": "anything", "group": "Ungrouped variables", "name": "t", "description": ""}, "sc": {"definition": "r1[ch]", "templateType": "anything", "group": "Ungrouped variables", "name": "sc", "description": ""}, "rsquared": {"definition": "precround(spxy^2/(ss[0]*ss[1]),3)", "templateType": "anything", "group": "Ungrouped variables", "name": "rsquared", "description": ""}}, "metadata": {"notes": "10/02/2014:
\nCreated. Based on the Numbas question JSXGraph line of best fit plot 5
\n14/02/2014:
\nImproved display and modified so that $20 \\le n \\le 35$. Also sharpened up the error term to be N(0,20) rather than N(0,100) (see variable r2) so that the quadratic regression would clearly be best.
", "description": "The data is fitted by linear and quadratic regression. First, find a linear regression equation for the $n$ data points, $20 \\le n \\le 35$.
\nStudents are then shown that the quadratic regression is often a better fit, as measured by SSE. They can also experiment with fitting polynomials of higher degree.
\n", "licence": "Creative Commons Attribution 4.0 International"}, "type": "question", "showQuestionGroupNames": false, "question_groups": [{"name": "", "pickingStrategy": "all-ordered", "pickQuestions": 0, "questions": []}], "contributors": [{"name": "Bill Foster", "profile_url": "https://numbas.mathcentre.ac.uk/accounts/profile/6/"}]}]}], "contributors": [{"name": "Bill Foster", "profile_url": "https://numbas.mathcentre.ac.uk/accounts/profile/6/"}]}