Saturday, June 08, 2013

Re: Hacking Into The Indian Education System

A heavy week at work kept me from reading, until today, a piece I found rather interesting: the post by Cornell grad Debarghya Das on anomalies in the ICSE / ISC exam results, on Quora: http://deedy.quora.com/Hacking-into-the-Indian-Education-System?ref=fb

The time it took me to get around to reading it was enough for the issue to snowball, with national newspapers carrying articles related to it, and the arguments Debarghya has made being discussed threadbare on various fora. The more statistically inclined members of my policy class at Takshashila dissected the analysis in ways I couldn't even begin to follow, before the ever-reliable Krupakar took pity on us lesser mortals and put up a more readable refutation on his Education Policy blog, Pratyaya:
http://pratyaya.nationalinterest.in/2013/06/05/the-bogus-claims-of-hacking-indian-education-system-and-marks-tampering/

There's no faster way of being drawn into a debate than reading the comment threads on posts like this. Also, no more masochistic way, because you'll never come out of a debate-in-comments satisfied. So I'm taking the easy way out, and putting my take in this blog post instead.

Quick Disclaimer: I disagree with Debarghya's conclusions, or at least with the language he's used. That's why I use "proof" in quotes. Regardless, I admire the effort he's taken acquiring the data and playing with it. This *is* a cynical comment (congenital defect, this cynicism; nothing I can do about it), but it's not aimed at disparaging him - it's not an "any idiot (myself included) could have done that too" post.

That said, my observations:

1. There are SERIOUS privacy issues, mostly because student and school names were accessible, but it would be a stretch to say this action was illegal. What Debarghya did is no different from you entering your classmates' roll numbers (and there's always a guy in your class who will). Writing a program to do what a human can do (but would find tedious) isn't hacking, it's... well, just programming, you know?

2. "Storm in a Teacup".

Am absolutely baffled that anyone finds "proof" of marks being tampered / modified / moderated / adjusted / rounded / normalised / forged / something Edvard Munch dreamed up after a whiff of paint thinner to be controversial, or even news. As near as I can tell (rudimentary, rusty stats exposure) the point of contention seems to be that the marks do not follow a normal distribution. From the little I remember, this would require that all scores (0-100) be equally possible. Again, am I missing something, or does ANYONE actually expect marks to be allotted in such a way - at "random", in the technical sense of the word?
(http://www.merriam-webster.com/dictionary/random - adj, definition 2.b.)

Test design matters. I'm reasonably certain I could make a test where practically no one scores below 35, just by making sure that at least 35 items were of a low enough difficulty, e.g. p>0.95. Whether it would be a good test would depend on its purpose. If the point was to make a test where only by returning a blank paper could you get a failing mark (<35), then it would be admirably effective.
(Is that the point of the ICSE/ISC/CBSE/what-have-you? I couldn't say, but does "No Child Left Behind" sound familiar? On which note, do watch http://www.youtube.com/watch?v=wX78iKhInsc )
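
A quick toy simulation of that claim - the item difficulties are made up for illustration, nothing from any real paper, and each student is assumed to answer each item independently with the item's p-value as their success probability:

    import random

    # Hypothetical 100-item test: 35 very easy items (97% of test-takers
    # answer each correctly) and 65 items of middling difficulty (60%).
    # These p-values are illustrative assumptions, not ICSE data.
    EASY_ITEMS, EASY_P = 35, 0.97
    HARD_ITEMS, HARD_P = 65, 0.60
    STUDENTS = 10_000

    def simulate_score():
        easy = sum(random.random() < EASY_P for _ in range(EASY_ITEMS))
        hard = sum(random.random() < HARD_P for _ in range(HARD_ITEMS))
        return easy + hard

    scores = [simulate_score() for _ in range(STUDENTS)]
    below_35 = sum(s < 35 for s in scores)
    print(below_35, "of", STUDENTS, "students scored below 35")
    # Expected score is about 0.97*35 + 0.60*65 = 73, with a standard
    # deviation near 4, so a sub-35 score is astronomically unlikely.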

Similarly, why on earth are grace marks - of which there is "proof" in this data - a revelation? On a Mumbai University marksheet, if you scored just below the passing mark and were upgraded, there's a little asterisk, which will show (overleaf) under which ordinance this grace was granted. When I was doing my BA, one could actively petition for some grace marks to upgrade the overall division (i.e. pass to second / second to first class) awarded.
(At least up to 2012, in fact, this rule was still in place - see here: http://www.mu.ac.in/4.101.pdf)

So the CISCE doesn't flag it in that manner, but do we even disapprove of the practice? What do those marks mean, in any case? In terms of intelligence, potential or achievement, what is the difference between a 34 and a 35? It's not so much that the Pass/Fail cut-off is arbitrary - to some extent, any such binary decision will involve leaving out some who could have made it - but that one barely knows what to make of the entire marking scheme. It's a scale without uniform units, which makes comparison a challenge - at which point, bring on the cliches. "It's not as if a child who scores 70 is twice as good as a child who scores 35", etc. Come result time, you will read ad nauseam that "marks are not representative of a student's true ability."

(Again, Mumbai University taught me this long before independent proof was forthcoming. We have a long-standing joke, in the Faculty of Law, that one's marks reflect more the marital satisfaction of the poor schmuck stranded correcting your bunch of papers than anything you actually wrote. I've failed in subjects I knew intimately well, and sailed through papers where I didn't even know 40 marks worth of matter. Again, the point is not the anecdote, but that many Indian readers will relate to it. We're inclined to take marks with a pinch of salt, and with good reason.)

To reiterate: the purpose of the test matters. The sole purpose of board exam marks today is securing admission to a college/course of one's choice, which has already given us paradoxical academic inflation. After all, what does a 100% mean if everyone's getting it? Boards are forced to invent new tricks each time to go one-up on each other, in a culture of competitive largesse. Needless to say, this only makes college admissions that much harder - and there's a valuable lesson right there, which is that you can't wish away a lack of quality infrastructure on the ground just by fiddling with the numbers.

In an ideal world, every child could learn at their own pace, and assessments would give us meaningful information about each child. In a slightly less-perfect world, marking would be fair and comparable for children across the country and exams would be tests of maximum performance returning a credible measure of achievement at the time of the test.
(Note that the IB/IGCSE systems claim to offer exactly this - flexibility and globally comparable evaluation / benchmarks. Guess where all the rich kids study now?)

What we *have* is a world where India flunks one round of PISA and chickens out of the next.
(See here: http://articles.timesofindia.indiatimes.com/2013-06-01/news/39674556_1_international-student-assessment-economic-cooperation-evaluation)

And we're worried that no one in the ICSE got 34?

7 comments:

  1. I agree that putting this into the category of 'hacking' is somewhat sensationalist (and rather lazy), but there are a couple of points you just hand-waved and skimmed over:

    1. *One* of the problems with the distribution that the original author points out is that the distribution was expected to be normal. Fair enough, it's not a 'random' sample in the strictest sense of the word (certainly not iid samples). But that is far from the biggest oddity. The big surprise is that not one candidate achieved certain scores. Not very few. Not p>0.95. Not a single one. And while we are at it, 34 isn't the only score that no one obtained. No one obtained 36, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 56, 57, 59, 61, 63, 65, 67, 68, 70, 71, 73, 75, 77, 79, 81, 82, 84, 85, 87, 89, 91, or 93 either. THAT is certainly surprising, and to my statistician head, good enough as 'proof' that the marks allocated are not raw marks. EVERYONE scoring 87 was given a grace mark and upgraded to 88? Unlikely. (And that is the minimum amount of 'tampering' I can infer from the stats; it's likely to be much larger.)

    2. Yes, it's not a huge concern if marks are somewhat modified/normalized/... while remaining in the general neighbourhood of the raw score. But the problem is that this isn't disclosed to the test-takers. The lack of transparency in something as important as these exams is a concern. Is it the biggest problem plaguing Indian education? Certainly not. But don't discard it as a non-issue.
    Exactly how much weightage to give this in the face of other, so-called more important problems is certainly up to the individual, but contending that it is too trivial to be a problem at all is hardly justified.

    PS: I've *assumed* that the modifications leave the marks largely unchanged, because there is no evidence to the contrary, but it could just as easily be a high-order modification and we would just have the board's word for it. That would force one to look at the issue in a new light, no?

    Replies
    1. In a word, nope.

      I am, as I've said, not a statistician by training. But Krupakar's post, which I quoted, looks at some ways those results could arise. See also the 7th June post at http://rq.nationalinterest.in/ for a more direct answer: the distribution is a result of the board following a form of relative grading.

      Let's agree that this manipulation of "raw" scores by the board is what is causing the result. If you have a problem with this - you see it as tampering, not normalisation - then the conceptual objection is to it happening at all, not to the magnitude of the manipulation of any one score!

      I agree that the lack of transparency is the real issue, but even there the conceptual objection would be to the scope for arbitrary action that comes with this secrecy, i.e. still not to the manipulation of raw scores per se, but to situations where particular students benefit while others with the same "raw" scores do not.

      So. "it could just as easily be a high-order modification and we would just have the board's word for it. That would force one to look at the issue in a new light, no?"

      Nope. One would still either take issue with the manipulation, or not. And the concern would only be in the context of college admissions.

      PS: I know more scores than 34 were missing. 34 was used to highlight the "grace marks" (non)issue - it is an acknowledged practice, and not a controversial one at all. Graphic or numerical proof of it doesn't change that in the slightest.
      (And, as with grace marks, so with manipulation in general.)

  2. “As near as I can tell (rudimentary, rusty stats exposure) the point of contention seems to be that the marks do not follow a normal distribution. From the little I remember, this would require that all scores (0-100) be equally possible. Again, am I missing something, or does ANYONE actually expect marks to be allotted in such a way - at "random", in the technical sense of the word?”

    Very, very wrong. A normal distribution doesn't require equal probability of all marks. He claims that the scores should follow something akin to normal simply because of this: most students will get average marks; some students will be below, some above. Of course, this isn't exactly true either. But he is right about one thing - across the spectrum, the graph should be close to smooth, probably peaking somewhere in the midrange.

    The startling and primary point of the whole exercise, for me, is simply this: a huge number of the marks attainable beyond passing have not been attained. Take a look at the data:

    36, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 56, 57, 59, 61, 63, 65, 67, 68, 70, 71, 73, 75, 77, 79, 81, 82, 84, 85, 87, 89, 91, 93.

    While I can understand promoting from 39 to 40, where's 41? Why is 69 included but 70 not? It doesn't really seem to follow a pattern that I can discern, and I've been looking at them for a while now.

    So, the question then is this: you have a sample space of 66 marks (35-100). Assume all marks are equally probable (they aren't; marks in the 60s, 70s and 80s are very likely more common than those in the late 30s, 40s and 90s). Taking roughly 30 of the 66 as unattained, the probability of any one person not getting any of these marks = 36/66 (they get something in the remaining 36 marks of the set). Probability of x people not getting them = (36/66)^x. For 100 students that comes out to about 4.7e-27. And that is why this is evidence of some really weird tampering.
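
    For anyone who wants to verify that figure, here is the same back-of-envelope in Python, keeping the simplifying assumptions above (roughly 30 of the 66 marks from 35-100 unattained, every mark equally likely, students independent):

        # Sanity check of the number quoted above, under admittedly
        # unrealistic equal-probability assumptions.
        total_marks = 66                  # integer marks 35..100
        attainable = 36                   # 66 minus ~30 "missing" marks
        students = 100

        p_one = attainable / total_marks  # P(one student avoids every gap)
        p_all = p_one ** students         # P(all 100 students avoid them)
        print(p_all)                      # about 4.74e-27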

  3. Your definition of normal is completely wrong.

    This thing is huge because of this: 30 or so of the 66 possible marks are not attained. Assuming equal probability of all (which isn't strictly true, but still illustrative), the probability that 100 students do not get any of these numbers is about 4.7408534e-27. For the sample he got? That's just weird.

    Something's fishy in Denmark. Well, fishier.

    Replies
    1. This comment has been removed by the author.

    2. Abhishek, thanks for refining/framing the (im)possibility of such a scoring pattern occurring naturally. I don't think my point came across, though. Let me elaborate:

      Yes, on a typical test with a large-enough average population, most people should score in the mid-range, some doing better, some worse. This should tend to give us a smooth curve, peaking in the mid-range. (I meant that this is the pattern you can expect if all scores are possible - not if all are equally probable, as you correctly point out.)

      Fiddle with those assumptions - typical test, large average population - and you can see some different results (skewed or multimodal distributions, for instance). Fiddle with the assumption that all scores are possible, and you could see some gaps (e.g. anyone scoring 30-34 automatically gets 35, so 30-34 are impossible). Still, none of this should give you 30-odd of the 66 possible scores missing. So their absence from the data does make it (as you calculate) vastly unlikely that what we're seeing are pristine, raw scores.

      So, yes, these scores have likely been through some kind of process, which has led to the elimination of these missing points. (My friend Karthik describes what that process could be on his blog Resident Quant @ rq.nationalinterest.in)
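
      To make "some kind of process" concrete, here's a toy sketch in Python - an invented rounding rule of my own, emphatically not the CISCE's actual procedure - showing how any many-to-few mapping punches exactly these kinds of holes in the distribution:

          import random

          random.seed(1)

          # Toy raw scores: 10,000 students, 100 one-mark items, each
          # answered correctly with probability 0.6 (pure assumption).
          raw = [sum(random.random() < 0.6 for _ in range(100))
                 for _ in range(10_000)]

          # Invented post-processing rule: sub-35 scores round up to 35,
          # and above 36 only every other mark is attainable.
          allowed = [35] + list(range(36, 101, 2))

          def process(score):
              # Map a raw score to the nearest allowed mark at or above it.
              return min(m for m in allowed if m >= score)

          final = [process(s) for s in raw]
          gaps = sorted(set(range(35, 101)) - set(final))
          print("marks no one obtained:", gaps)
          # Odd marks above 36 vanish by construction; extreme marks
          # vanish simply because no student scored near them. Holes in
          # the histogram are proof of processing, not of foul play.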

      My point was: so what? When did ANYONE think the scores they were seeing were pristine, raw scores - and why would they think that? I agree there is proof of processing - but how is that proof of something "fishy"?

      In other words, we agree on the math, but disagree on the value judgment inherent in calling this "tampering". I disagree because I do not attach any particular sanctity to the raw scores in the first place: they are meaningless unless put in the context of the performance of other students, which the CISCE is doing. Even then, they are meaningless except in the context of admissions to higher education, so unless you can show that someone lost out on an opportunity that was otherwise theirs because their scores were processed, I see neither victim nor crime.

      This cheese is Swiss, not Danish. It's meant to have those holes, and it always did smell funny.

  4. Siddharth Agrawal (08/06/2013, 22:50)

    I totally agree with the arguments provided. I would also add that I recently read that these people are now thinking of deflating the internals if your final scores are low. Doing that would defeat the purpose of having an internal marking scheme, which ensures that one bad day doesn't ruin your subject.
    Also, marks in Boards have absolutely no meaning. All that the colleges these days want is donation money. A child who ends up preferring the offline mode of admission to secure a seat has to shell out a minimum of 4 lakh rupees to do so. Being in MU, you would be well aware of that scheme. All this is sending our system to the dogs.
