

{"id":322,"date":"2020-07-30T14:41:00","date_gmt":"2020-07-30T18:41:00","guid":{"rendered":"https:\/\/sites.temple.edu\/klugman\/?p=322"},"modified":"2020-07-30T16:29:41","modified_gmt":"2020-07-30T20:29:41","slug":"oh-pisa","status":"publish","type":"post","link":"https:\/\/sites.temple.edu\/klugman\/2020\/07\/30\/oh-pisa\/","title":{"rendered":"Oh PISA"},"content":{"rendered":"<figure id=\"attachment_323\" aria-describedby=\"caption-attachment-323\" style=\"width: 300px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/sites.temple.edu\/klugman\/files\/2020\/07\/adjuster-1200-1200-675-675-crop-000000.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-323 size-medium\" src=\"https:\/\/sites.temple.edu\/klugman\/files\/2020\/07\/adjuster-1200-1200-675-675-crop-000000-300x169.jpg\" alt=\"\" width=\"300\" height=\"169\" srcset=\"https:\/\/sites.temple.edu\/klugman\/files\/2020\/07\/adjuster-1200-1200-675-675-crop-000000-300x169.jpg 300w, https:\/\/sites.temple.edu\/klugman\/files\/2020\/07\/adjuster-1200-1200-675-675-crop-000000-1024x576.jpg 1024w, https:\/\/sites.temple.edu\/klugman\/files\/2020\/07\/adjuster-1200-1200-675-675-crop-000000-768x432.jpg 768w, https:\/\/sites.temple.edu\/klugman\/files\/2020\/07\/adjuster-1200-1200-675-675-crop-000000.jpg 1200w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><figcaption id=\"caption-attachment-323\" class=\"wp-caption-text\">Trying not to go for something obvious here<\/figcaption><\/figure>\n<p>In <em>Damned Lies and Statistics<\/em>, Joel Best argues that consumers of statistics need to especially scrutinize international comparisons because there are so many opportunities to mix up apples and oranges (I have <a href=\"https:\/\/sites.temple.edu\/klugman\/2020\/07\/15\/quantifying-police-killings-part-2\/\">discussed<\/a> this with regard to the conceptual definitions used to quantify police-related deaths in different countries).\u00a0 \u00a0One of Best&#8217;s examples was international 
comparisons of test scores; he pointed out that sampling strategies vary across countries and that differences in countries&#8217; performance could often be chalked up to how broad their samples were.\u00a0 In particular, countries with comprehensive secondary school systems (like the United States, where all, or nearly all, adolescents are exposed to an academic-focused curriculum) would sample from the entire population of schools, while countries with &#8220;streaming&#8221; systems (as in Germany), where some adolescents go to academic high schools while others go to more vocationally oriented schools, would sample from the academic high schools only.\u00a0 This would stack the deck against countries like the United States.<\/p>\n<p><em>Damned Lies and Statistics<\/em> came out in 2001, and the international testing comparisons Best talked about have been supplanted by the Programme for International Student Assessment (PISA), run by the OECD.\u00a0 When I teach Best in my statistics class, I show students the general sampling <a href=\"https:\/\/www.oecd.org\/pisa\/sitedocument\/PISA-2015-Technical-Report-Chapter-4-Sample-Design.pdf\">strategy<\/a> of PISA:<\/p>\n<p><em>The desired base PISA target population in each country consisted of 15-year-old students attending educational institutions located within the country. 
This meant that countries were to include (i) 15-year-olds enrolled full-time in educational institutions, (ii) 15-year-olds enrolled in educational institutions who attended on only a part-time basis, (iii) students in vocational training types of programmes, or any other related type of educational programmes&#8230;<\/em><\/p>\n<p>Sure, there were some <a href=\"https:\/\/www.brookings.edu\/research\/lessons-from-the-pisa-shanghai-controversy\/\">problems<\/a> with China, but what are you going to do?\u00a0 Surely PISA must be good for comparing democratic countries, right?<\/p>\n<p>Well, no.\u00a0 A team of UCL educational researchers headed by Jake Anders has <a href=\"https:\/\/doi.org\/10.1007\/s11092-020-09329-5\">analyzed<\/a> the 2015 Canadian sample for PISA, and their analysis raises questions about the quality of comparisons involving Canada, which posts very high average scores on PISA.<\/p>\n<p>Their article is nice for walking the reader through the sampling strategy of PISA countries.<\/p>\n<ul>\n<li>First you have to talk about sample exclusions&#8211;what part of the population are you trying to generalize to, and what part are you not trying to generalize to?\u00a0 As shown above, PISA is trying to get at 15-year-olds in any kind of educational institution.\u00a0 In Canada, that covers 96 percent of 15-year-olds, so you are dropping 4 percent right off the bat there (the Anders article has a really nice table comparing Canada&#8217;s figures to other countries including the United States, where 95% of 15-year-olds are in educational institutions). 
<a href=\"https:\/\/sites.temple.edu\/klugman\/files\/2020\/07\/anders-tab2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-324 size-full\" src=\"https:\/\/sites.temple.edu\/klugman\/files\/2020\/07\/anders-tab2.png\" alt=\"\" width=\"544\" height=\"787\" srcset=\"https:\/\/sites.temple.edu\/klugman\/files\/2020\/07\/anders-tab2.png 544w, https:\/\/sites.temple.edu\/klugman\/files\/2020\/07\/anders-tab2-207x300.png 207w\" sizes=\"auto, (max-width: 544px) 100vw, 544px\" \/><\/a><\/li>\n<li>Not mentioned above is that PISA lets countries exclude students from their target population based on special needs (although PISA caps this at 5% of students).\u00a0 In 2015 Canada breached this cap&#8211;Anders et al. note that Canada &#8220;has one of the highest rates of student exclusions (7.5%).&#8221; So now Canada&#8217;s sample is supposed to cover 88.8 percent [96% * (1 - 0.075)] of Canadian 15-year-olds.<\/li>\n<li>PISA countries are in charge of their own sampling, and I did not realize how much discretion they have.\u00a0 With nationally representative samples, you really need to do stratified sampling (and probably clustered sampling as well, which Anders et al. do not get into).\u00a0 Stratified sampling means countries divide schools into strata based on combinations of variables and sample from each of their strata to ensure a representative sample.\u00a0 Countries choose their own stratifying variables (!), and in Canada these are &#8220;province, language, and school size,&#8221; which seems fine to me.<\/li>\n<li>We have not even talked about school and student non-compliance, and here is where things get really messy.\u00a0 Canada selected 1008 schools to participate in the 2015 PISA, and 30% refused.\u00a0 What countries can do is try to recruit &#8220;replacement schools&#8221; that are similar to refusing schools based on the stratifying variables as well as another set of variables (which Anders et al. 
refer to as &#8220;implicit&#8221; stratifying variables).\u00a0 Canada was able to recruit 23 replacement schools, but it is not clear that the variables Canada used to implicitly stratify schools were that meaningful for test scores&#8211;meaning it is possible that the replacement schools are very different from the originally selected schools in unobserved ways.\u00a0 It is only 2% of the sample, but this problem of using meaningless variables to gauge the representativeness of the sample will come up again.<\/li>\n<li>Anders et al. point out that at 70%, Canada is <strong>fourth-worst among OECD countries<\/strong> in terms of the response rate of initial schools (beaten out by the Netherlands, New Zealand, and USA).\u00a0 In terms of overall response rate (after including the replacement schools), Canada&#8217;s 72 percent is the worst (the US, at 83 percent, still looks very bad relative to other OECD countries, most of which are at 95 percent or above).<\/li>\n<li>PISA requires countries with low initial response rates between 65% and 85% to do a non-response bias analysis (NRBA).\u00a0 Countries below 65% (like the Netherlands) are supposed to be excluded, but in this case, they were not.\u00a0 PISA does not report the details of these NRBAs, but Anders et al. tracked down Canada&#8217;s province-specific NRBAs and found they were pretty superficial and suffered from the problem of using a handful of variables to show that non-responding schools are similar to responding schools (although in the case of Quebec there were significant differences between refusing and complying schools).<\/li>\n<li>Now we get into pupil non-response, and again, Canada is among the worst in this regard, with 81% of students in complying schools taking the PISA tests (the other two countries with worse or comparable rates are Austria at 71% and Australia also at 81%).\u00a0 We know that non-participating students tend to do worse on the tests, and one way to get around this is to 
weight student participants such that those with characteristics similar to non-participants are weighted more.\u00a0 But again, we get into this issue where Canada uses weights based on variables that do not really matter for test scores (the stratifying variables plus &#8220;urbanisation, source of school funding, and [school level]&#8221;).<\/li>\n<li>I am not sure how Anders et al. calculate this, but all told, Canada&#8217;s sample is really only representative of 53% of Canadian students (although I wonder if they meant to say 53% of Canadian 15-year-olds).<\/li>\n<li>Anders et al. do some simulations for reading scores, assuming that non-participant students on average would perform worse on the PISA tests than participant students.\u00a0 If we assume that the non-participating students do moderately worse on the PISA instrument (say, at the 40th or 35th percentile), Canada&#8217;s reading scores are still better than average but are not at the &#8220;super-star&#8221; levels it enjoys with its reported performance.\u00a0 If we assume the non-participants do substantially worse (say, at the 30th percentile), Canada&#8217;s mean PISA reading scores take a serious dive and Canada starts looking more like an average country.<\/li>\n<\/ul>\n<p>The one thing that really sticks out here&#8211;and also with Tom Loveless&#8217;s <a href=\"https:\/\/www.brookings.edu\/research\/lessons-from-the-pisa-shanghai-controversy\/\">discussion<\/a> of PISA and China&#8211;is that PISA&#8217;s behavior is not consistent with the dispassionate collection and analysis of data.\u00a0 They have opened the door to countries (especially wealthy ones) fudging their data and they do not really seem to care.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In Damned Lies and Statistics, Joel Best argues that consumers of statistics need to especially scrutinize international comparisons because there are so many opportunities to mix up apples and oranges (I have discussed this 
with regard to the conceptual definitions used to quantify police-related deaths in different countries).\u00a0 \u00a0One of Best&#8217;s examples was international comparisons &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/sites.temple.edu\/klugman\/2020\/07\/30\/oh-pisa\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Oh PISA&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1802,"featured_media":323,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[14,9,8],"class_list":["post-322","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized","tag-canada","tag-international-comparisons","tag-sampling","entry"],"_links":{"self":[{"href":"https:\/\/sites.temple.edu\/klugman\/wp-json\/wp\/v2\/posts\/322","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sites.temple.edu\/klugman\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sites.temple.edu\/klugman\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sites.temple.edu\/klugman\/wp-json\/wp\/v2\/users\/1802"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.temple.edu\/klugman\/wp-json\/wp\/v2\/comments?post=322"}],"version-history":[{"count":0,"href":"https:\/\/sites.temple.edu\/klugman\/wp-json\/wp\/v2\/posts\/322\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sites.temple.edu\/klugman\/wp-json\/wp\/v2\/media\/323"}],"wp:attachment":[{"href":"https:\/\/sites.temple.edu\/klugman\/wp-json\/wp\/v2\/media?parent=322"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sites.temple.edu\/klugman\/wp-json\/wp\/v2\/categories?post=322"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sites.temple.edu\/klugman\/wp-json\/wp\/v2\/tags?post=322"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}