Guest Post: The Aadhaar Judgment and Reality – I: On Uniqueness

(As we discussed yesterday, the Majority judgment in Aadhaar is founded on a set of factual assumptions. We will, therefore, be running a series of four guest posts by Anand Venkat focusing solely on the factual claims that underlie the judgment. The first of the four-part series is on the claim that biometric authentication is “unique identification”.)

How do you arrive at scientific truth? There are two distinct approaches: via positiva, or via negativa. Modern science proceeds mostly through via negativa, and a philosophy of falsifiability. A statement which is not falsifiable through observation, evidence or logic is just a conjecture. Prescribed methodologies and peer reviews are used to examine evidence about a scientific statement to ascertain how truthful it is within the boundaries of the experiment.

Why is this important and relevant for an analysis of the Aadhaar judgement?

It turns out that the majority opinion that upheld the constitutional validity of the Aadhaar project did so on the basis of a fundamental misunderstanding not only of the science behind biometrics, but also of science itself.

This post will constrain itself to the factual and technical aspects of the Aadhaar judgement and specifically about biometric uniqueness.

I am Unique (vs) My biometrics are unique

The statement “I am Unique” needs careful examination. If I lose both my hands in an unfortunate accident, am I still unique?

What if I lose my eyes? Am I still Unique?

Why am I still unique, even after losing my eyes and arms? Where does my uniqueness come from? Does it come from parts of my body? Or does it come from the sum total of all my experiences, my genetic lineage, my social relationships and hence from the space in my mind, which exists independent of anything else?

This question is important because the Majority opinion conflates two independent concepts in its opening line, Identity and Identification, and thus asserts that “Aadhaar is Unique Identity” and that being unique makes you the only one.

The basic assumption behind the opinion follows the logical thought process as described below:

  • Every human being is unique.
  • Every human being has a set of biometrics which, when considered together, is unique and does not exist in any other human, born before or after, for all time.
  • A technological solution (Aadhaar) can hence be devised to create a unique pattern for any human across all time, which can be mapped to a number.

The first scientific problem that the above reasoning presents is the issue of proof. If a statement is presented as true across all time and as applicable to every human ever born or yet to be born, how can it be proven “via positiva”? Clearly the only way to make a positive proof of the above statement would be to collect the biometrics of every human ever born or yet to be born, and show through empirical data that they are unique. That would of course be an impossible exercise, which the UIDAI (or anyone else) would be foolish to attempt.

One other approach for “via positiva” is however possible. An experiment could be designed to establish that specific fact for a small sample of the population, and a mathematical formalism could be used to extrapolate it to the whole population, across all time.

The extrapolation approach, however, can’t be certain about the fact, since it uses small samples to arrive at conclusions for the whole population across all time, and hence must always be qualified with “error rates” or “confidence intervals”. In other words, all via positiva approaches that use extrapolation yield “probabilistic truth” and not “deterministic truth”.
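To make the idea of a “confidence interval” concrete, here is a minimal sketch of how an error rate measured on a sample is reported as a range rather than a single number. It uses the standard Wilson score interval; the function name and the pilot figures (7 errors in 20,000 trials) are hypothetical and are not taken from any UIDAI study.

```python
import math


def wilson_interval(errors, trials, z=1.96):
    """Wilson score interval for an error rate estimated from a sample.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = errors / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)
    )
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical pilot: 7 errors observed in 20,000 trials.
low, high = wilson_interval(7, 20_000)
print(f"observed rate {7 / 20_000:.4%}, interval {low:.4%} to {high:.4%}")
```

The point of the sketch is only that the honest output of a sample-based experiment is an interval, and that the interval narrows as the sample grows but never collapses to a certainty.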

The scientific issues then become:

  • Are these “confidence intervals” overstated or understated?
  • Are the mathematical formalisms used for extrapolation accurate?

Since these questions are crucial, we need to consider the evidence presented to the court by both the Petitioners and the Respondents, and also analyze how the court came to the specific conclusion in paragraph 55 that “When it comes to obtaining Aadhaar card, there is no possibility of obtaining duplicate card”.

Legal pronouncements and Mathematical theorems

There are two specific methods in science to prove facts.

  • Empirical evidence (Not all swans are black)
  • Mathematical proof (using formal logic)

Let us now examine how the majority can claim with complete certainty “there is no possibility of obtaining duplicate card”.

The petitioners provided empirical evidence to the court that the UIDAI itself had acknowledged, as early as 2012 (Page 4), that 0.035% of duplicate enrollments will have “more than one Aadhaar number”. Further, the Planning Commission in its report had acknowledged that 34,015 Aadhaar numbers were detected as biometric duplicates.

The Planning Commission report had this Q&A, which is reproduced here in full (sic):

Question:  Does UIDAI assure 100% duplicate free database? Will there be no duplicate aadhaar numbers?

Answer: Biometric matching systems or de-duplication systems are essentially based on pattern matching and can be designed to achieve an accuracy of more than 99%. Higher the quality of biometric capture, lesser the probability of a duplicate being generated. However UIDAI aims for inclusiveness so that failure to enroll is negligible. Therefore generation of duplicate aadhaar number cannot be ruled out totally.

The UIDAI’s reply to an RTI request in 2016 is even more damning. It acknowledged that 1.69 lakh duplicate Aadhaar numbers were cancelled.
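To see what a rate like the 0.035% quoted above means in absolute numbers, here is a minimal arithmetic sketch. The number of repeat-enrolment attempts is a hypothetical input chosen for illustration; neither it nor the function name comes from the UIDAI.

```python
def expected_missed_duplicates(duplicate_attempts, miss_rate=0.00035):
    """Expected number of duplicate enrolments that pass de-duplication.

    miss_rate defaults to the 0.035% figure quoted in the post;
    duplicate_attempts is a hypothetical input, not an official figure.
    """
    if duplicate_attempts < 0:
        raise ValueError("duplicate_attempts must be non-negative")
    if not 0 <= miss_rate <= 1:
        raise ValueError("miss_rate must be between 0 and 1")
    return duplicate_attempts * miss_rate


# Hypothetical: one crore (10 million) repeat-enrolment attempts.
attempts = 10_000_000
print(f"{expected_missed_duplicates(attempts):,.0f} expected second Aadhaar numbers")
```

Even a rate that looks negligible as a percentage yields a count that is plainly not zero once the number of attempts is large.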

The petitioners further provided, for the court’s perusal, a mathematical proof written by Hans Varghese Mathews of CIS India and published in EPW after extensive peer review. While the UIDAI claimed that its error rate will be fixed across all time at 0.057% (Page 4), Hans’ paper found that false positives increase over time, will touch 1% when 100 crores have enrolled, and will keep increasing thereafter.
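One simple way to see why a false positive rate cannot stay fixed as enrolment grows is sketched below. It is not a reproduction of the paper’s derivation. It assumes that each comparison between a new enrollee and an existing record is independent and has the same small chance of a false match. The value of ZETA is hypothetical, chosen only so that the curve passes near 1% at 100 crore; it is not a figure from the paper or from the UIDAI.

```python
import math


def false_match_probability(enrolled, pair_match_prob):
    """Chance that a new enrollee falsely matches at least one existing record.

    Assumes independent comparisons with a fixed per-pair probability,
    i.e. 1 - (1 - pair_match_prob) ** enrolled, computed in a numerically
    stable way for very small probabilities.
    """
    return -math.expm1(enrolled * math.log1p(-pair_match_prob))


ZETA = 1e-11  # hypothetical per-pair false-match probability (assumption)

for enrolled in (10**6, 10**8, 10**9, 2 * 10**9):
    rate = false_match_probability(enrolled, ZETA)
    print(f"{enrolled:>13,} enrolled -> {rate:.4%} chance of a false match")
```

Under these assumptions the rate rises with every additional enrolment, because each new applicant is compared against a larger database.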

Further empirical evidence of the accuracy of Hans’ predictions was provided by the UIDAI itself and is summarized in the table below.

[Table not reproduced]

The specific question put to the UIDAI by Mr. Shyam Divan during oral argument, and the response provided, are as follows:

What are the total number of biometric De-duplication rejections that have taken place till date? In case an enrolment is rejected either for: (a) duplicate enrolment and (b) other technical reason under Regulation 14 of the Aadhaar Enrolment Regulations, what happens to the data packet that contains the stored biometric and demographic information?
Ans.: The total number of biometric de-duplication rejections that have taken place are 6.91 crores as on March 21, 2018. These figures do not pertain to the number of unique individuals who have been denied Aadhaar enrolment resulting in no Aadhaar issued to them. This figure merely pertains to the number of applications which have been identified by the Aadhaar de-duplication system as having matching biometrics to an existing Aadhaar number holder. The biometric de-duplication system is designed to identify as duplicate those cases where any one of the biometrics (ten fingers and two irises) match. However, very often it is found that all the biometrics match. It is highly improbable for the biometrics to match unless the same person has applied again.

Let us examine the UIDAI’s response here very carefully. It uses the phrase “highly improbable”, which to an untrained eye reads the same as “impossible”. However, the word “probable” has a precise scientific meaning and is usually accompanied by a numerical figure.

For instance, to clear a sample question paper in CBSE 10th standard mathematics, a student has not only to understand “probable”, but also to learn how to compute the number attached to the term, and get it right from first principles.

[Image: CBSE sample question paper]

How was the UIDAI allowed to get away without qualifying the word “improbable” by attaching a number, when even a school-going child attempting a 10th standard mathematics question would be denied marks for a similar answer? More importantly, why did the majority choose not to engage with the “probability” argument at all?

While it is merely baffling that the majority chose not to engage with the empirical evidence and mathematical formulations provided by the petitioners, which buttress each other, it is stunning that it asserted that a specific scientific statement is true even when the respondents (Union of India) explicitly said in multiple forums that it is not.

And, most importantly, there can be no argument that these issues were not put to the Court. Not only are they on the oral record, but they find explicit acknowledgment in Justice Chandrachud’s dissenting opinion, where the constitutional arguments are grounded in the acknowledgment that biometric authentication is a fallible science.

Conclusion

The Aadhaar project is, at its heart, a biometric technological regime, and the mathematical theories and empirical evidence that lie beneath it require deep engagement to arrive at a correct factual understanding of its perceived failures or successes.

When generations of people’s lives are at stake, the least that the majority could have done is to understand the science and mathematics behind biometric de-duplication through careful engagement with the evidence. It is disappointing that it chose to make up its own facts and then believe them.

In terms of the magnitude of error, “there is no possibility of obtaining duplicate card” comes very close to the church declaring that “The Sun moves around the earth” because it feared the consequences of the scientific truth and could not bring itself to face the simple fact that its understanding of the world had irrevocably changed.

Scientific facts however do not change or become false, because a constitutional court declares them so, just like the church’s pronouncement did not change the basic fact that “Earth moves around the sun”.

The majority’s refusal to engage and face inconvenient scientific facts is a recurring theme in the Aadhaar judgement and further posts will point out these areas in great detail.

(Editor’s Note: It may be argued that this is attacking a straw-man: nothing in the world is “certain”. The point, however, is that this is how the Majority chooses to frame the issue. And this is no accident: the Majority uses the language of certainty to evade engaging in the hard constitutional enquiries about necessity and proportionality – something that becomes evident when we see how the dissenting opinion engages with these issues. Yes, the Majority could have said that biometric authentication is fallible, but that – all things considered – it is necessary and proportionate in this case. We would then be having a different argument today. But the Majority didn’t say that. It used the language of impossibility (of duplicates) and “unparalleled” accuracy, and shut out the constitutional enquiry on that basis. The Majority judgment, therefore, must stand or fall on the hill that it has chosen to die on.)

12 thoughts on “Guest Post: The Aadhaar Judgment and Reality – I: On Uniqueness”

  1. There’s another thingy too: the SC stipulated that Aadhaar not be linked with Bank account numbers. However, it allowed the mapping of Aadhaar to PAN numbers.

    Now, the Income-tax dept regularly receives data about TDS from those who cut tax at source: employers, banks, others.

    When Banks cut TDS against term deposits or even savings accounts, they perforce have to record the account numbers along with the PAN. Check the Form 26as with incomtaxefiling.gov.in for more clarity.

    Since the Income-tax dept already has Aadhar mapped to PAN and PAN mapped to bank account numbers, it’d be a simple SQL query for someone in there to list out bank data, PAN, Aadhar for every account number with them.

    Perhaps such data has some value in the market place of the deep web?

    Of course, this is not to say that such an industry already does not exist.

    • They can already run this SQL query. What extra data will Aadhaar provide to them if it’s not used anywhere else? The judgement has drastically reduced the number of places where you need it. The poor and marginalized are still highly affected and it’s not even a one time exercise for them.

  2. The factual basis of the judgment is related to what issues were raised and how the issues were framed. The framing of issues also shows what issues were left out in the arguments or were not considered by the Court.

    A more critical analysis of the judgment is needed. One that goes beyond the judgment and also critiques the limitations of the petitions before the Court, the unsatisfactory nature of oral arguments, the unsatisfactory and incomplete manner in which facts were placed before the Court, the limitations of the Court process in such a complex case, how facts were “established” in Court or were assumed based upon pleadings, etc.

    An extremely flawed process has resulted in a flawed and very dangerous outcome.

  3. The article could have been written better. While it started off fairly rigorously, the ending conflates false positive and false negative rates.

    The majority judgement is only concerned with people trying to fraudulently obtain multiple identities. So bringing up the issue with false positives is not that relevant since false positive rates only have a second order effect on one’s chance of obtaining duplicate Aadhaar. To reduce manual verification load, UIDAI might be reducing the effectiveness of the de-dup process and thus allowing more duplicates to go through this stage resulting in an increase of false negatives. This also fits with their plan of enrolling as many people as possible before the SC delivered its verdict. The RTI figures about cancellation of Aadhaar due to duplicates is interesting. The number is issued only after the de-dup stage. So why were so many not detected at this stage and how were they detected later? The answer could have shed light on the actual false negative rate. Was this explored during the hearings?

    The bit about UIDAI’s “improbable” reply is also just being nitpicky. They’ve already stated elsewhere that they believe it’s 0.035%. Was this not pointed out to the bench?

    Comparing the statement about impossibility of obtaining duplicate Aadhaar with the bit about geocentric theory is hyperbole. They could have said that there is very little chance of obtaining duplicates and proceeded without much change in their arguments. They would be correct if using one’s own biometrics was the only way of obtaining duplicates (the 0.035% false negative rate). Manual review of false positives, synthetic biometrics, cracked enrollment software, backdoors in code provided by foreign companies all provide avenues for fraud at scale. Even if UIDAI can detect 99.965% of duplicate biometrics the percentage of fraudulent enrollments it can detect is much lesser. Did the bench just ignore how incompetent UIDAI is in actually curbing fraudulent enrollments?

    The majority didn’t buy the black money chant and they wouldn’t have bought the deduplication chant if more technical arguments had been made (Pandey’s PPT gave an opening for this). Forgive me if I am wrong and the bench just ignored such arguments.

    • There are two distinct problems:
      1. What UIDAI knew and told outside (which is of course a lot less than what has happened)
      2. What the majority assumed, making up its own version of “biometrics are unique”.

      My criticism is mostly about (2). If you start up with the proposition that biometrics are unique and have no chance of duplicates, then it is obvious that false rejections are also implausible.

      Every single possible technical argument about how biometrics are not unique was made and recorded in DYC’s dissent. So it is not for lack of arguing, but it is the majority’s lack of engagement.

      On the issue of conflating false +ves and false -ves, notice that the mathematical theory is mostly the same. Hans’ paper is about finding limits to equations and those equations hold largely for false negatives as well. So an assertion that the error rates are linear and will stay constant irrespective of size, is an absurdity by itself, once Hans’ paper came out.

      That is what the majority chose to ignore.

      • Yes, the majority did say that. But we can give them some slack and say they really mean “very low chance” for whatever value of “very low” that they deem will suffice for the proportionality tests. Basically, they are saying that it’s much more difficult to obtain a duplicate Aadhaar when compared to other ids. They would be right if one only had one’s own biometrics. As pointed out earlier, other means to obtain duplicates exist and in fact UIDAI has no way of detecting them.

        Coming to Hans’ paper, in what way can you use it for false negatives? To summarize, he assumes that there is a fixed probability (zeta) that the biometrics of two random distinct people match. He then proceeds to estimate this zeta using the trials conducted by UIDAI. The aim of the paper is to calculate the probability that a new enrollee’s biometric matches with any one of the already enrolled people (a false positive). Naturally, this increases with the number of enrolled people since you are not just making one comparison (zeta) but a lot of comparisons.

        Just like zeta, the probability that two biometrics obtained from the same person don’t match is some fixed number, call it p.

        Let’s say there are k people who are already enrolled and one among them is you. If you try enrolling again, what is your chance of success? If there are no manual reviews, you will succeed if you don’t match with yourself and you don’t falsely match with the other k-1 people. So it’s p*(1-zeta)^(k-1). If you have already succeeded n times and you want to try for one more, the probability is p^n*(1-zeta)^(k-n). So the success rate goes on decreasing.

        If manual review is allowed, your chance will be better since the reviewer might manually approve issue of Aadhaar when there’s a false positive. However, this is never better than p.

        So your chance of success doesn’t change with number of enrollees.

        However, biometrics aren’t static and can change with time. So if you wait long enough you will probably succeed. Is UIDAI going to mandate frequent biometric updation to negate this? What is the cost of 100 crore people doing this every so often? I doubt such a mandate will pass any proportionality test. The only other way is to have frequent biometric authentication and use this for updation. They can use this on the poor people obtaining subsidies. There’s no other scheme which is allowed and which will require frequent biometric authentication. It is in fact very disappointing to see that they didn’t ask for an issual of smart cards or something so that poor and marginalized people don’t have to suffer every time they want their subsidies. Even if we assume that deduplication is possible, it could have been a one time exercise.

  4. Haven’t been able to go through it yet but just wondering whether there was a definitive ruling on the burden of proof question in the application of the proportionality test. Was it supposed to go the standard way: petitioners establish violation after which state is to show reasonableness of restriction on all three prongs? Or are the petitioners even required to make a prima facie case of lack of necessity and availability of alternatives?

  5. The SC has already struck down capture of metadata, usage by private entities and non-cognisance of complaints by individuals.

    Deduplication is the only chant that the SC has bought into. If we can dismantle this in a review petition, only then can we hope to get Aadhaar declared unconstitutional.

    We need to show that there’s plenty of avenues to obtain multiple identities. I’ll try to go through the DYC judgement and see what was covered.

    • Even if we are unsuccessful in convincing the court that the deduplication capabilities of UIDAI are not that great, we can make sure that the only purpose of Aadhaar is deduplication.

      CIDR should contain nothing more than a photo, fingerprints and iris scan and all demographic data should be deleted. The claim is that demographic data can easily be faked and UIDAI has itself said that it doesn’t validate any of it. So it shouldn’t be used for any deduplication purpose. It can’t be used for judging false positives either. If the end result is that a lot of people can’t get Aadhaar then we have proof that the whole exercise was a huge waste of time.

      The whole authentication service business should be dismantled and instead a smart card displaying nothing but a photo should be issued. So while enrolling, only photo, fingerprints and iris scan are taken and an enrollment number is given. The status should be trackable via a website and the smart card should be COLLECTABLE from the enrollment office after the enrollment is successful. (Some might want SMS/email notification and delivery. This should be voluntary and the data should be deleted after the user confirms delivery.)

      The smart card’s only purpose will be to “sign” any/all other id’s. This way other id’s also get “deduplicated”. (For example, if we have to deduplicate PAN cards, we can ask the smart card to sign some string/number which is static over all people. The output can be recorded in the PAN database. Different smart cards will give different outputs and no smart card can give two different outputs.)

      A signed id can now be mandated.

      If the change in biometrics is a problem, the only solution is updation every few years. Just like passports, signed ids can also expire. There is a huge cost and it needs to be compared to any savings. Hopefully, in the future, the situation will have changed and we can just dismantle the biometric database.

      Remember that its up to the government to show that no methods other than the current form of Aadhaar will work. This is a purely technical argument and the majority seems to have neglected it. A lot of time needs to be spent on these technical arguments and we need to make sure the judges understand them. Aadhaar doesn’t pass the proportionality test and only legal arguments aren’t going to help us show it.

      If one accepts collection of biometrics for passport/DL one could accept this collection for “deduplication”. Of course, many people, including me, have issues with collection of biometrics for any purpose but that may not be a battle we can win right now.

  6. Good article, no doubt. Respects for your logic and analysis.

    But, the lead of the analysis is misplaced, stating that Aadhaar project does not know the science behind biometrics, but also about science itself; while UIDAI aim is “to create with the objective to issue Unique Identification numbers (UID), named as “Aadhaar”, to all residents of India that is (a) robust enough to eliminate duplicate and fake identities, and (b) can be verified and authenticated in an easy, cost-effective way”.

    Linking a socio-political tool for scientific scrutiny is overdoing and inappropriate.
