Interpreting international comparisons in academic achievement

December 9, 2010

Hong Kong was one of the top three 'scorers' in the OECD rankings

OECD (2010) OECD Programme for International Student Assessment (PISA). Paris: OECD

The media, worldwide, have been having a field day interpreting the latest OECD PISA results, which primarily measure the reading, math and science performance of 15-year-olds in 2009. (A much easier topic for hackneyed responses than WikiLeaks, for instance.)

New York Times: ‘The results of the survey were … called “a wake-up call” by the U.S. education secretary.’

The Times of London: ‘UK schools tumble down international table’.

The Globe and Mail, Canada: ‘How Canada is becoming outclassed in school.’ Kate Hammer, the Globe and Mail reporter, states:

‘After nearly a decade of leading international test scores in reading, writing and mathematics, Canada is finding that it’s no longer lonely at the top. Korea and China are leaving us in the chalk dust while our smaller provinces are dragging down the scores. Measured against 65 other countries, Canada places fifth overall in reading, seventh in science and eighth in mathematics.’

So you would think that Canada is in big trouble here. Well, far from it. What has caused the fuss is that Shanghai, Korea, Hong Kong, Singapore and Finland scored slightly higher than Canada. In particular, these countries and cities have made rapid improvements in their scores since 2000, and the fact that three largely Chinese cities have the highest scores has freaked out the Western media.

It’s a good job journalists don’t have to take the reading test, since level five in reading is defined as follows:

‘Tasks at this level that involve retrieving information require the reader to locate and organise several pieces of deeply embedded information, inferring which information in the text is relevant. Reflective tasks require critical evaluation or hypothesis, drawing on specialised knowledge. Both interpretative and reflective tasks require a full and detailed understanding of a text whose content or form is unfamiliar.’

I would expect something more from reporters than just taking the OECD handout, quickly scanning it, and jumping to conclusions that are not really justified by even a cursory look at the data. So here are some critical questions that need to be asked about this study.

One thing that the OECD has not yet done is to describe its methodology in detail, in particular the sample size in each country or city state surveyed (this is coming in June 2011). We are told that the reading tests involved 470,000 15-year-olds in 65 countries, representing 26 million students. This gives a sampling fraction of 1.8% (yes, between 1 and 2 out of every 100 15-year-olds). This can still be a reliable sample, depending on how it is drawn. For instance, so long as students are chosen randomly, taking one student from each of many schools would provide a more reliable sample than taking the same number of students all from one school. (Pick the ‘wrong’ or an atypical school, and you’ve immediately biased the sample.) We can also calculate the average sample size across the 65 countries: about 7,230 students. However, the participants include very large countries, such as the USA (around 300 million people) and Mexico (over 100 million), and very small countries or even cities. Finland, for instance, has a population of 5 million. The OECD reports that sample sizes ranged from 4,410 students in Iceland to 38,250 students in Mexico. The OECD report (Volume 1, p. 25) states:

‘the selection of samples was monitored internationally and adhered to rigorous standards for the participation rate, both among schools selected by the international contractor and among students within these schools.’

Nevertheless, given systems as diverse as those of the USA and Mexico, how a sample of 1.8% is drawn would be critical. In a country with a less diverse system, the sample selection may not matter as much. Similarly, in a small country, with a smaller sample, any sampling bias will have a large effect on the results.
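To make the sampling point concrete, here is a rough sketch in Python. The first part simply re-does the arithmetic on the figures quoted above. The second part is a toy simulation with entirely made-up numbers (200 hypothetical schools, invented score distributions, nothing taken from PISA data), comparing how much an estimated average wobbles when 50 students all come from one school versus one student from each of 50 schools. It says nothing about how the OECD actually drew its samples.

```python
import random
import statistics

# Part 1: the arithmetic on the figures quoted in the post.
students_tested = 470_000
students_represented = 26_000_000
countries = 65

sample_fraction = students_tested / students_represented
mean_sample_per_country = students_tested / countries
print(f"sampling fraction: {sample_fraction:.1%}")
print(f"average sample per country: {mean_sample_per_country:,.0f}")

# Part 2: a toy simulation (all numbers are invented for illustration).
def make_schools(rng, n_schools=200, pupils_per_school=100):
    """Build hypothetical schools whose average scores differ from each other."""
    schools = []
    for _ in range(n_schools):
        school_mean = rng.gauss(500, 50)  # schools vary around 500
        schools.append([rng.gauss(school_mean, 80) for _ in range(pupils_per_school)])
    return schools

def one_school_sample(rng, schools, n):
    """Take all n students from a single randomly chosen school."""
    return rng.sample(rng.choice(schools), n)

def many_schools_sample(rng, schools, n):
    """Take one randomly chosen student from each of n different schools."""
    return [rng.choice(school) for school in rng.sample(schools, n)]

def spread_of_estimates(rng, schools, sampler, n=50, trials=500):
    """Repeat the sampling many times; return the std. deviation of the sample means."""
    estimates = [statistics.mean(sampler(rng, schools, n)) for _ in range(trials)]
    return statistics.stdev(estimates)

rng = random.Random(2009)  # fixed seed so the run is repeatable
schools = make_schools(rng)
clustered = spread_of_estimates(rng, schools, one_school_sample)
dispersed = spread_of_estimates(rng, schools, many_schools_sample)
print(f"spread of estimates, all from one school: {clustered:.1f}")
print(f"spread of estimates, one per school:      {dispersed:.1f}")
```

Under these assumptions the estimates from the one-school samples swing far more widely from run to run than those spread across many schools, which is the sense in which the second design is the more reliable one.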

Why is the sample so important? Well, there is something in the maths tests in particular that needs very careful examination (a journalist might say that there is something fishy here, but that would be too strong). What has shaken the Western media most of all is the scores of Shanghai students in mathematics. The average student in the Shanghai sample performed at the top two levels of the math test. The OECD has a strict rule that exclusions from the sample, for cases such as defined mental disabilities, must be less than 2%. Yet over 50% scored at the two highest of the six levels of math performance. Their scores are considerably higher on average than those of the next country (Singapore), which is in turn statistically ahead of the following one (Hong Kong). The Shanghai scores look so ‘perfect’ that they need some kind of detailed explanation. It would be useful to know how many Shanghai students took the test, how and which schools were sampled, and who marked the tests. I am not of course suggesting that there was cheating of any kind (what, in China?!), but if you stick your head so far above the parapet, you will get noticed.

The second point to make is that there is a statistical phenomenon known as regression to the mean. Shanghai, of course, is not typical of China. It is the country’s richest city and a key centre in its education system. In comparison, large countries such as the USA and Mexico will have large variations in their schools and populations. The greater the variety, the closer to the mean you will score on any standardized item. This can be seen in Canada, which was ‘pulled down’ by its smaller provinces. So when ‘city states’ such as Shanghai and Hong Kong are compared with large and sprawling countries such as the USA, Mexico and Russia, we are not comparing like with like.
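The averaging effect in this argument can be sketched with a few lines of Python. Everything here is hypothetical: 30 invented regions with invented average scores and student numbers, not PISA data. The sketch compares the score of a large, varied country taken as a whole with the score of its single best region, which is roughly the position a ‘city state’ is in when it is entered on its own.

```python
import random

rng = random.Random(65)  # fixed seed so the run is repeatable

# Hypothetical average scores for 30 regions of one large, varied country.
regions = [rng.gauss(500, 40) for _ in range(30)]
# Hypothetical student numbers per region, used as weights.
students = [rng.randint(5_000, 200_000) for _ in range(30)]

# The country's overall score is the student-weighted average of its regions.
country_mean = sum(m * n for m, n in zip(regions, students)) / sum(students)
best_region = max(regions)
worst_region = min(regions)

print(f"country as a whole:  {country_mean:.0f}")
print(f"best single region:  {best_region:.0f}")
print(f"worst single region: {worst_region:.0f}")
```

However the invented numbers fall, the country as a whole must land somewhere between its worst and best regions, so its strongest region, entered separately, will always outscore it.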

None of this should detract from the huge progress made by city ‘states’ such as Shanghai, Hong Kong, and Singapore. Korea, a large country, has made incredible progress since 2000, as has Finland.

However, to say that Canada is now ‘outclassed’ in these tests is a cheap shot and not true. Once you are at or near the top of the scores on tests, there is nowhere else to go but down. You can only wait to be caught by those who have more room to grow. (It was also reported that Canada did less well in 2009 than in 2000, but a careful reading of the OECD report shows that the difference was not statistically significant.) Canada, despite its diversity, has a very even performance, with smaller differences between the very top and very bottom scores than most countries. This suggests that its teachers and schools are doing a fantastic job.

We should in fact be celebrating that many countries are improving their education systems. I believe that this will over time make the world a better place. If it acts as a wake up call and leads to more improvement in our own system, all the better. But let us at least have accurate and responsible reporting about a process that leaves a lot of unasked questions.

Update

Ideally, I need more time to go through the PISA reports (there are six volumes), but I will wait until the detailed methodology and the digital learning scores are published in June. In any case, this site is focused more on post-secondary education. However, because of the growing trend towards standardized testing of learning outcomes for comparative purposes at a national level, I think it is important for post-secondary institutions to watch very carefully what is happening with the PISA publications.

To be fair to journalists – poor things – I would commend to you today’s article by Jeffrey Simpson in the Globe and Mail: Canada is not becoming outclassed (which I saw after I had written my post). Although he does not question the actual data, he makes similar points to mine about Canada’s relative performance, in strong contradiction of his colleague Kate Hammer’s earlier report.

He also makes a comment – that I deliberately avoided – that one reason for the better results is that the Asian countries focus more on rote learning. Now in fact PISA states that its tests at the higher levels require more than rote learning, and certainly you should not be able to score highly at the top level of reading, math and science scores by relying on memory alone. Examples of the tests are provided by PISA and you should read them and make your own judgement on this.

My view is that Asian students are probably pushed harder by both parents and the school system than many students in Western countries. They do better because they work harder at what the school requires them to do. What PISA does not attempt to measure is breadth of learning, or whether students who score very highly on standardized tests have the range of other skills, such as sporting, social and artistic skills, that are not measured by the OECD.

The most absurd suggestion that I saw in the media was the statement by British Prime Minister David Cameron, who suggested Britain should import the tests and systems used by China. In fact, Britain’s educational policy for schools over the last 20 years has been to move to more and more state control of the curriculum and teaching methods, yet it still dropped significantly in the world PISA rankings. As the OECD report said, educational performance depends as much on parents and teachers as it does on government policies.

My final comment is that standardized tests are a cornerstone of the right wing agenda referred to in an earlier post as the ‘dysfunctionality narrative’: state schools suck and the private sector would do better. The irony here is that the ‘best’ PISA performance comes from Shanghai, Hong Kong, Finland, and Singapore, all of which have a long history of state-controlled education. So, Sarah Palin, be careful what you wish for.

See also: Mortishead, C. (2010) Education rankings: The new schoolyard bully, Globe and Mail, December 8

Bradshaw, J. (2010) Brains alone won’t get you into med school, Globe and Mail, December 13, p. A11

3 COMMENTS

Howard December 11, 2010 At 10:32 am

Hey Tony;
You’re making all good methodological points, but I would add one on validity. Economic competition is the subtext to this discussion. I would like to see a criterion correlation between test scores and economic success (or even item analysis). I think there are quite a few unstated assumptions in most analyses, even in academic discussions. I would like to see more creativity supported throughout life. There is at least a good theoretical link between creativity and economic success. More here as it relates to these tests. http://howardjohnson.edublogs.org/2010/12/09/some-factors-that-support-creativity-and-innovation/

Sui Fai John Mak December 11, 2010 At 3:46 pm

Hi Tony,
Thanks for sharing the information and your views on this. Here is my response http://suifaijohnmak.wordpress.com/2010/12/11/plenk2010-academic-achievement-personalization-of-education-and-learning/

I think there is a big difference between the education systems of Canada, Hong Kong and China (represented by Shanghai here), and so when it comes to assessment based on tests, it could be quite difficult to interpret such outcomes. As I was educated in Hong Kong, I realise that tests and examinations are the cornerstones of success in getting a place in the best Colleges (High Schools) and Universities, or getting a job. With that in mind, academic achievement could be a yardstick of success in study, or even employment, and that could be a strong motivating factor for some of the students. This is just my experience and perspective, and so I think there are many other factors behind the better performance in the tests. I am not sure how much difference there is between the East (say, the education systems of Hong Kong and China) and the West (say, the Canadian education system), but surely the emphasis may not be the same. From my limited experience of interacting with Canadian educators, I think Canadian education is encouraging more open online learning. How about assessment such as tests or examinations? Are they considered as important as they are in the East, in places like Hong Kong and China?
John

What is more important than grades? « Tony Bates December 14, 2010 At 2:46 pm

[…] an earlier posting on the OECD’s PISA tests, there was some discussion about what was NOT measured in standardized tests of reading, science […]
