Technological Design Choices

Research and Writing by Divyank Katira

The use of digital technologies to identify individuals, to subsequently authenticate their identity, and to allow authorisation on their behalf is a common practice in emerging national ID schemes. We describe principles for the appropriate use of digital technologies in ID systems, outline common technical architectures that have emerged in their design, and summarise some of the key characteristics of these digital technologies.

1

Appropriate Use of Technology

In this section, we present principles to achieve privacy, security, and inclusivity.

1.1

Digital technologies can supplement existing manual processes but not entirely replace them

Despite a rapid rate of advancement in recent decades, the technologies that are used to compose digital ID systems still suffer from reliability issues. Faults in software and hardware systems, gaps in network connectivity between them, and their inability to accurately and adequately represent people’s identities can all lead to exclusion of individuals from benefits and services. For instance, a study found that about one-fifth of transactions in India’s Aadhaar-enabled Payments System failed due to technical reasons (17.03% due to biometric mismatch, 3.71% due to other technical issues).1 Even large technology companies with well-funded engineering and operations teams only guarantee 99.99% uptime for their cloud offerings, which roughly translates to one hour of downtime in a year.2 While not all of these faults will lead to denial of access to benefits or services, their outcome can be particularly grave in cases where they prevent access to essential services such as food distribution or healthcare. For this reason, it is necessary to have robust manual processes in place for when technological systems fail.
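The arithmetic behind these figures can be checked with a short sketch. The function names are illustrative and a 365-day year is assumed; nothing here comes from the cited study or from any provider's SLA tooling.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def annual_downtime_minutes(uptime_percent: float) -> float:
    """Minutes per year a service can be down while still meeting its uptime guarantee."""
    downtime_fraction = 1 - uptime_percent / 100
    return downtime_fraction * HOURS_PER_YEAR * 60


def combined_failure_rate(*rates_percent: float) -> float:
    """Add up failure rates (in percent) that are attributed to separate causes."""
    return sum(rates_percent)


if __name__ == "__main__":
    # A 99.99% guarantee allows a little under one hour of downtime per year.
    print(f"{annual_downtime_minutes(99.99):.1f} minutes of permitted downtime per year")
    # Biometric mismatches plus other technical issues, as reported in the text.
    print(f"{combined_failure_rate(17.03, 3.71):.2f}% of transactions failed")
```

Under these assumptions, a 99.99% guarantee permits roughly 53 minutes of downtime per year, and the two reported failure causes add up to 20.74% of transactions.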

In addition to being susceptible to unintentional faults, technological solutions are also susceptible to cyberattacks. Having well-tested human-operated processes to fall back upon also makes identity infrastructure and the services that rely on it less vulnerable to cybersecurity threats.

1.2

Storing biometric data in central repositories is ill-informed and not sustainable

Another trend that we have observed in national digital ID systems is the use of biometric information, such as fingerprint or iris scans, to identify individuals and then authenticate their identity. In comparison to knowledge and possession factors, biometric factors allow for quick, cost-effective and convenient authentication, as they do not require users to remember a secret password or present costly physical tokens such as smartcards or security keys at the point of authentication. They also serve as highly accurate identification factors for most people. But from a cybersecurity perspective, biometrics are weak authentication factors. They are immutable and, in most cases, publicly visible. This makes them prone to forgery and impossible to change if the database used to store them is breached. To understand this, we can compare them to passwords, which have served as a de facto authentication factor on the web for the past two decades. Over this time, at least 8.4 billion passwords have been leaked through successive data breaches.3 Without the invention of the mythical unhackable database, it is likely that the use of centralised biometric authentication will yield similar results over time. Leaks of biometric information are also more severe than leaks of passwords, as biometrics cannot be reset after a leak. As such, the appropriate use of biometrics is limited to local authentication, i.e. when the storage and matching of biometrics takes place on an end-user device or credential such as a cellular phone, smartcard, or security key.

1.3

Foundational ID systems must ensure separation of responsibilities

Unlike Functional ID systems, which are designed and built for a specific purpose, Foundational ID systems are general-purpose systems that can be used for many different purposes. They are responsible for conducting the processes of identification, authentication, and authorisation. However, the ease with which such systems allow other processes to incorporate these mechanisms creates a potential for their misuse or overuse.

As an example, we can look at Nigeria’s National Electronic Identity Card. It is a Foundational ID system that allows for the creation of ‘applets’ which enable its use for different purposes. One such applet is e-Transport, which allows the use of the ID card as a payment system for travel through public transportation.4 Through such use, the National Identification Number, which is an identity-linked identifier, is collected and linked to an individual’s movements as they travel through public transport, a purpose that should not require any identification documents. This is a case of misuse of identification functionality in a Foundational ID system. Since the identification and authentication mechanisms are intertwined and rely on the same identifier, the system unnecessarily identifies individuals and creates a log of their movements, when its goal here should only be to authenticate an individual and check whether they have loaded sufficient funds onto their card to make the journey. A workaround to this issue is to use a unique, pseudonymous identifier, which is not linked to a person’s identity, for each different purpose that the ID card is used for. The Web Authentication standard supports such pseudonymous authentication by creating a separate credential for each service that an individual registers with.
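One way to picture per-purpose pseudonymous identifiers is the sketch below. It is an illustration under stated assumptions, not a description of the Nigerian card or of how the Web Authentication standard works internally: it assumes a random secret held only on the card and derives one identifier per purpose with HMAC-SHA256. The names `new_card_secret` and `pseudonym_for` are hypothetical.

```python
import hashlib
import hmac
import secrets


def new_card_secret() -> bytes:
    """Generate a random secret that would be stored only on the card (assumption)."""
    return secrets.token_bytes(32)


def pseudonym_for(card_secret: bytes, purpose: str) -> str:
    """Derive an identifier that is stable for one purpose and unlinkable across purposes."""
    digest = hmac.new(card_secret, purpose.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


if __name__ == "__main__":
    secret = new_card_secret()
    transport_id = pseudonym_for(secret, "e-transport")
    library_id = pseudonym_for(secret, "library")
    # The same purpose always yields the same identifier, so a fare balance can be tracked.
    print(transport_id == pseudonym_for(secret, "e-transport"))
    # Different purposes yield different identifiers, so records cannot be joined on them.
    print(transport_id != library_id)
```

With such a scheme, a transport operator could check a fare balance against a stable identifier without ever seeing the National Identification Number, and identifiers from different services could not be matched without the card secret.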

Another case of conflation of responsibilities in a Foundational ID system was seen in India’s Aadhaar ID program. Here, there were multiple instances where the process of authentication was used as a proxy for authorisation, leading to actions being taken on an individual’s behalf without their informed consent:

  • A telecom operator which also operates a payment service mistook authentication of individuals, which was required for KYC purposes, as authorisation to open an account on its payment service. This led to its users’ subsidy payments being silently redirected to this new account which they did not even know existed.5
  • Some individuals who used Aadhaar to authenticate themselves to receive vaccinations as part of the COVID-19 immunisation drive were also enrolled in a Unique Health ID program, without their consent.6

Foundational ID systems, owing to their expansive scope, should be carefully designed and strictly regulated through both technical and legal means to prevent such abuse. They must ensure separation of the responsibilities of identification, authentication and authorisation and only use them where necessary.
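The separation between authentication and authorisation can be sketched in a few lines. This is a minimal illustration with hypothetical names (`Session`, `ConsentRequired`, `perform`), not a model of Aadhaar or of any real system: a successful authentication yields a session that carries no permissions, and every action must be checked against consent that was granted explicitly for that purpose.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class ConsentRequired(Exception):
    """Raised when an action is attempted without explicit authorisation."""


@dataclass
class Session:
    """Result of a successful authentication: it proves who is present, nothing more."""

    subject: str
    consented_purposes: set[str] = field(default_factory=set)

    def grant(self, purpose: str) -> None:
        """Record the individual's explicit consent for one named purpose."""
        self.consented_purposes.add(purpose)


def perform(session: Session, purpose: str) -> str:
    """Carry out an action only if the individual authorised that specific purpose."""
    if purpose not in session.consented_purposes:
        raise ConsentRequired(f"{session.subject} has not authorised '{purpose}'")
    return f"{purpose} performed for {session.subject}"


if __name__ == "__main__":
    session = Session(subject="user-1")
    session.grant("kyc-verification")
    print(perform(session, "kyc-verification"))
    try:
        perform(session, "open-payment-account")
    except ConsentRequired as error:
        print(f"Refused: {error}")
```

In this sketch, authenticating for KYC does not by itself permit opening a payment account; that action is refused until consent for it is recorded separately.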

1.4

Seamless public-private interoperability increases data sharing and collection

In all of the identity systems we have encountered, Governments are the primary identity providers. From civil registration to driver’s licenses and voter IDs, there are many legitimate purposes for a state to identify its citizens. Even private-sector entities that take up the role of identity provider, such as the GOV.UK Verify platform7 or various commercial identity verification services,8,9 still rely on verification of Government-issued IDs as the source of identity attestation. Such document verification is done either manually or automatically, and both processes have drawbacks: manual verification is labour-intensive, automated verification is expensive, and both are prone to error.

To streamline the sharing and verification of identity data, several industrial actors10 are attempting to develop systems and protocols through which Governments can digitally issue identity credentials to individuals. These credentials can subsequently be shared with relying parties, who can then verify them. In addition to bringing down identity verification costs for the industry and increasing accuracy of the data collected, such systems would also benefit the individuals using them as they would no longer have to carry around multiple physical documents to prove their identity.

However, enabling streamlined access to sensitive identity data, with unprecedented levels of accuracy, will likely encourage industrial actors, who have historically collected data under meaningless, coercive notions of consent11, to collect and retain even more private information. The adoption of such technologies must be carefully considered, and governed by strong regulatory and technical barriers to prevent unfettered commercial access to this previously inaccessible identity information.

2

Typology of Information Architectures

Over time, identity systems have evolved into three distinct informational models, namely, centralised, federated, and decentralised systems. Given the wide range of uses to which these systems have been put in both the public and private sectors, these categories sometimes overlap and take on different meanings in different contexts. In this section, we attempt to explain the various meanings of these terms in the context of digital ID systems by analysing three systems that are representative of these informational models: India’s Aadhaar ID system12, with its “Central Identities Data Repository”, is a centralised model; Canada’s Sign-in Partner service13 is a federated model where financial institutions play the role of Identity Provider; and the W3C standards14,15 proposed by organisations under the Decentralized Identity Foundation16 represent a typical decentralised model, which aims to create a marketplace of Identity Providers and Relying Parties for quick and efficient data sharing.

We define the axes17 of centralisation/federation/decentralisation of information in digital ID systems as follows:

2.1

Source of identity or trust

Any identification, authentication, or authorisation operation in a digital ID system can be thought of as a transaction between an individual, an identity provider, and a relying party.

In a centralised system, there is a single identity provider that serves as a source of identity or trust for one or more relying parties.

A federated system allows an individual to choose a single identity provider from a set of choices. These can be of two types:

  • Many (identity providers) to one (relying party), for example, a website that allows Google and Facebook login.
  • Many (identity providers) to many (relying parties), for example, Canada and UK’s ID systems that allow online access to government services. Transactions in such systems are typically mediated by a ‘broker’ for ease of management, so that not every relying party needs to learn about every identity provider; they simply trust the broker.

The proposed ‘decentralised’ or ‘self-sovereign’ ID systems also provide support for many identity providers with many relying parties, with even multiple identity providers participating in a single transaction (for added assurance). They have defined open standards with the hope of spawning a network or ecosystem of independent vendors, identity providers, and relying parties that can all interoperate with ease, for efficient data collection and sharing.

CENTRALISED SYSTEM

A single IDENTITY PROVIDER serves as a source of identity or trust for one or more RELYING PARTIES.

FEDERATED SYSTEM

An INDIVIDUAL can choose a single IDENTITY PROVIDER from a set of choices.

DECENTRALISED SYSTEM

A network or ecosystem of INDEPENDENT VENDORS, IDENTITY PROVIDERS, and RELYING PARTIES that can interoperate with ease, for efficient data collection and sharing.

2.2

Storage

Another important informational aspect of digital ID systems is where the vast amount of sensitive personal data and metadata handled by these systems resides. This is distributed among:

  • Identity providers, who must necessarily store such data as they are responsible for issuing credentials.
  • Relying parties, which may verify and discard such information or store it indefinitely, depending on their privacy policies and other incentives. It is important to note that the information architecture of the ID system does not impact the data collection or retention policies of relying parties.
  • Other intermediaries, such as brokers (in federated systems) and network operators (in decentralised systems). These typically only store metadata (such as who accessed what service, when, and where), but this can also be sensitive, particularly if it is not de-identified or if there is a risk of re-identification.

In a centralised system, there is a large central identity provider that stores the private information of all participating individuals and metadata relating to usage of the ID. This forms a lucrative target for data breaches and presents a privacy risk as the operator of the system has insight into all activity within it.

A federated system distributes this risk to some extent by having multiple identity providers, each of which will only store the information and metadata of a subset of users. Brokered federated systems, however, present a central trusted point through which all data passes (but is not necessarily stored) and which has insight into all activity (metadata) within the system.

CENTRALISED SYSTEM

A large central IDENTITY PROVIDER stores the private information of all participating individuals.

FEDERATED SYSTEM

Multiple identity providers store the information and metadata of a subset of users.

DECENTRALISED SYSTEM

Data is distributed among multiple identity providers. An IDENTITY PROVIDER issues IDENTITY CREDENTIALS to the INDIVIDUAL, who shares them with RELYING PARTIES, who can consult a DECENTRALISED STORAGE NETWORK to verify their authenticity.

Like federated systems, proposed ‘decentralised’ or ‘self-sovereign’ ID systems also distribute data among multiple identity providers. Additionally, in place of a central trusted broker, in this model an identity provider issues identity credentials to the individual, who stores them on their own device or on a cloud storage solution provided by vendors of such systems. The individual, in turn, shares a credential with relying parties, who can consult a decentralised storage network (typically a blockchain) to verify its authenticity. In this model, the decentralised storage network stores pseudonymised metadata about affiliations of individuals to identity providers (IdPs) and relying parties (RPs), and its operators can potentially18 glean insight into usage activity from this metadata.
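The issuance and verification flow described above can be sketched as follows. This is a deliberately simplified toy: the `Registry` class stands in for the decentralised storage network, and anchoring a SHA-256 digest of the credential is an assumption made for brevity. Systems built on the W3C standards typically rely on public-key signatures rather than on published credential digests, and all names below are hypothetical.

```python
from __future__ import annotations

import hashlib
import json


def credential_digest(credential: dict) -> str:
    """Hash a credential in a canonical JSON form so that equal contents give equal digests."""
    canonical = json.dumps(credential, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Registry:
    """Toy stand-in for the decentralised storage network: it holds digests, not personal data."""

    def __init__(self) -> None:
        self._digests: set[str] = set()

    def anchor(self, digest: str) -> None:
        self._digests.add(digest)

    def contains(self, digest: str) -> bool:
        return digest in self._digests


def issue(registry: Registry, credential: dict) -> dict:
    """Identity provider: anchor the digest, then hand the credential to the individual."""
    registry.anchor(credential_digest(credential))
    return credential


def verify(registry: Registry, presented: dict) -> bool:
    """Relying party: check the presented credential against the registry, not the issuer."""
    return registry.contains(credential_digest(presented))


if __name__ == "__main__":
    registry = Registry()
    held_by_individual = issue(registry, {"issuer": "idp-1", "over_18": True})
    print(verify(registry, held_by_individual))
    tampered = dict(held_by_individual, over_18=False)
    print(verify(registry, tampered))
```

The point of the sketch is the shape of the exchange: the relying party never contacts the identity provider at verification time, and the shared registry holds no identity attributes.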

Storage of Sensitive Data Across Identity Systems

WHO SEES/STORES SENSITIVE DATA | IDENTITY PROVIDER | BROKER/VENDOR | DECENTRALISED STORAGE NETWORK | INDIVIDUAL’S DEVICE/CLOUD STORAGE PROVIDED BY VENDOR | RELYING PARTY
CENTRALISED | Single large IdP | N/A | N/A | N/A |
FEDERATED | Multiple smaller IdPs | N/A | N/A | N/A |
BROKERED FEDERATED | Multiple smaller IdPs | Metadata | N/A | N/A |
DECENTRALISED/SELF-SOVEREIGN ID | Multiple smaller IdPs | N/A | Metadata | |

2.3

Control and fault tolerance

In a centralised ID system, a single large identity provider is tasked with the responsibility of issuing credentials and attesting the identity of all participants. This forms a single point of failure that could go offline (say, due to technical failure) leaving users with little recourse.

A federated system distributes this responsibility among multiple identity providers, providing redundancy and some resilience to technical failure.

In the proposed ‘decentralised’ or ‘self-sovereign’ ID systems, the individual holds their own credentials or delegates this responsibility to the vendors of these systems. This allows the credential to be used even when the issuing identity provider is unavailable or offline, as the relying party can verify its authenticity by consulting a decentralised storage network.

Additionally, while identity providers have the power to unilaterally and arbitrarily revoke credentials in all three of the models described above, the decentralised model stores a tamper-resistant record of credentials in its storage network (usually a public or permissioned blockchain), which provides some accountability in the face of abuse of power by an identity provider.

CENTRALISED SYSTEM

A single large IDENTITY PROVIDER is tasked with issuing credentials and attesting the identity of all participants.

FEDERATED SYSTEM

The responsibility for issuing credentials and attesting identity is distributed among multiple identity providers.

DECENTRALISED SYSTEM

Also consists of multiple identity providers, but the INDIVIDUAL holds their own credentials, once issued, or delegates storage to system vendors.

2.4

Risk of re-centralisation

Decentralisation is something that needs to be actively managed and maintained. Along each of the axes described above — source of identity, storage, and control — a decentralised or federated system can always regress to a more centralised one:

  • Source of identity: One or two popular identity providers could emerge, effectively resembling a centralised system.
  • Storage: The market could converge to a few popular vendors for storing credentials, creating a honeypot of sensitive data similar to a centralised system.
  • Control: The decentralised storage network (usually a blockchain) must be maintained by many disparate operators. If a single operator controls a majority of nodes in the network, it would resemble a regular database controlled by a single entity, negating the tamper-resistance guarantees it provides.
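The control risk in the last point can be monitored with a simple concentration check. The sketch below is illustrative only: it assumes a list naming the operator of each node is available, and it uses a 50% share as the threshold, following the "majority of nodes" condition described above. The function names are hypothetical.

```python
from __future__ import annotations

from collections import Counter


def largest_operator_share(node_operators: list[str]) -> tuple[str, float]:
    """Return the operator running the most nodes and its share of all nodes."""
    if not node_operators:
        raise ValueError("at least one node is required")
    operator, count = Counter(node_operators).most_common(1)[0]
    return operator, count / len(node_operators)


def is_recentralised(node_operators: list[str], threshold: float = 0.5) -> bool:
    """True when a single operator controls more than `threshold` of the nodes."""
    _, share = largest_operator_share(node_operators)
    return share > threshold


if __name__ == "__main__":
    healthy = ["op-a", "op-b", "op-c", "op-d"]
    captured = ["op-a", "op-a", "op-a", "op-b", "op-c"]
    print(is_recentralised(healthy))
    print(is_recentralised(captured))
```

The same kind of share calculation could be applied to the other two axes, for example the share of credentials issued by the most popular identity provider or stored by the most popular vendor.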

3

Credentials, Identification and Authentication Factors

3.1

Choice of identification factors

Each factor is rated on Privacy, Accuracy, and Cost.

Biometric Factors
This refers to the use of physiological features to identify individuals.

  • Privacy: Low. The immutable nature of biometrics makes it hard to place meaningful limits on their future use. Some biometric factors, such as face and gait recognition, can be deployed without an individual’s consent.
  • Accuracy: High. Biometrics are highly accurate identification factors. However, solely relying on them leads to exclusion as they are never fully accurate.
  • Cost: High. They require the use of dedicated hardware and software.

Document Verification
If the people being identified possess pre-existing identity documents, these can be used to identify them. Verification is either done manually or through a computer-assisted process.

  • Privacy: Medium. The use of existing identity documents minimizes the amount of additional identifying information that is collected.
  • Accuracy: Medium to high. Both human and computer-assisted verification processes are prone to error. The use of security features such as holograms and microprinting can improve accuracy.
  • Cost: Medium to high. Manual processes are labour-intensive and computer-assisted processes require dedicated hardware and software.

Pseudonymous Identifiers
Individuals can be identified by identifiers that are not linked to their identities, such as email ID, phone number or a public key.

  • Privacy: High. Allows individuals to transact anonymously.
  • Accuracy: N/A
  • Cost: Low




Ratings shown are relative to each other

3.2

Choice of identity artifacts

Each artifact is rated on Security and Cost.

QR Code
A Quick Response (QR) code allows for convenient scanning of identity information encoded within it.

  • Security: Low. They can be copied with ease.
  • Cost: Low. These are cheap to issue and can be printed on a piece of paper.

Microchip-based Cards (Smart Cards)
Identity attributes are encoded into an Integrated Circuit chip that is embedded into a physical document.

  • Security: Medium to high. The chips are typically secured by a second factor, such as a PIN.
  • Cost: Medium. Costlier than paper-based ID documents.

Contactless Cards
These are similar to smart cards but can be scanned over a small distance.

  • Security: Low. They can be remotely accessed. Their use is limited to low-risk scenarios for convenient access.
  • Cost: Medium. Cost is similar to smart cards.

Security Keys
A security key is a thumb drive shaped device with an embedded chip for storage of identity attributes.

  • Security: Medium to high. These are typically used as a second factor.
  • Cost: Medium. Cost is similar to smart cards and contactless cards.

Smart Card or Security Key with Biometrics
A fingerprint scanner is embedded into a smart card or a security key.

  • Security: High. Integrating possession, biometric, and knowledge factors into a single device makes them highly secure.
  • Cost: High. Costlier than regular smart cards or security keys.

Smartphones & Computers
A virtual credential can be issued that can be stored on a smartphone or a computer.

  • Security: Medium. Security properties are similar to smart cards and security keys. However, they are connected to the internet, making them more vulnerable.
  • Cost: High. This is only cost-effective if the intended users of the ID system already possess these devices.




Ratings shown are relative to each other

3.3

Choice of authentication factors

Each factor is rated on Security, Privacy, and Cost.

Biometrics (Centralised)
Biometric information is used to authenticate individuals. Centralised refers to storing many biometrics in a central database.

  • Security: Low. Biometrics are immutable and, in most cases, publicly visible. This makes them prone to forgery and impossible to change in case of a breach.
  • Privacy: Low. Storing biometrics on a large central database makes them vulnerable to breach.
  • Cost: Low to medium. A small number of biometric readers are required at points of authentication.

Biometrics (Local)
This refers to matching and storing of biometric information on the end-user device performing authentication, such as a smartphone or security key.

  • Security: Medium. In this method, biometrics are typically used as a secondary factor.
  • Privacy: High. Biometric information does not leave the device under the control of the individual.
  • Cost: High. The individual being authenticated must possess a device capable of biometric authentication.

Document Verification
A pre-existing identity document is used to authenticate an individual. Verification can be done either manually or through a computer-assisted process.

  • Security: Medium to high. Both human and computer-assisted verification processes are susceptible to forgery. The use of security features such as holograms and microprinting can improve security.
  • Privacy: Medium. The physical document may have personal information printed on it, potentially revealing more data than is required for the purpose of authentication.
  • Cost: High. Manual verification is labour-intensive and automated verification requires dedicated hardware and software.

Passwords
An individual is authenticated on the basis of a secret piece of information.

  • Security: Medium. This method’s reliance on individuals to choose secure passwords and not re-use or share them weakens its security.
  • Privacy: High. Allows individuals to be authenticated anonymously.
  • Cost: Low. Passwords are highly cost-effective authentication mechanisms.

One-Time Passwords (SMS)
An OTP is sent to an individual over SMS.

  • Security: Low. SMS is not considered a secure communication method.
  • Privacy: Medium. The individual’s phone number needs to be disclosed for authentication.
  • Cost: Medium. A cellular connection is required.

One-Time Passwords (App-based)
In this method, after an initial registration phase, an OTP is automatically generated

  • Security: High. Typically used as a secondary factor, this is considered a security best-practice.
  • Privacy: High
  • Cost: Medium. A cellular connection is not needed during authentication, but a smartphone or other device is required.

Physical ID Artifacts
If one has been issued, it can also be used for authentication.

  • Security: Low to high. Depends on the artifact chosen (see above).
  • Privacy: Medium to high. The physical artifact sometimes contains personal information printed on it, revealing more data than is required for authentication.
  • Cost: Low to high. Depends on the artifact chosen (see above).




Ratings shown are relative to each other
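To illustrate why app-based one-time passwords need no cellular connection at the point of authentication, the sketch below generates codes locally from a shared secret and the current time. The text does not name a specific algorithm; this example assumes the widely used time-based OTP scheme (TOTP, RFC 6238, built on HOTP, RFC 4226), and the secret shown is the published RFC test value rather than anything from a real deployment.

```python
from __future__ import annotations

import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) for a given counter value."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: the low nibble selects 4 bytes
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the current time window."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


if __name__ == "__main__":
    shared_secret = b"12345678901234567890"  # RFC test secret, agreed at registration
    # The device and the verifier compute the same code independently, with no network.
    print(totp(shared_secret))
```

The secret is exchanged once during the registration phase; afterwards the device and the verifier each derive the code on their own, which is why only a smartphone or similar device is required.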

Notes


1 Padmanabhan Balasubramanian et al., “Fintech For The Poor: Do Technological Failures Deter Financial Inclusion?”, SSRN (2021), https://ssrn.com/abstract=3840021.
2 “Compute Engine Service Level Agreement (SLA)”, Google Cloud, last accessed November 12, 2021. https://cloud.google.com/compute/sla; “Amazon Compute Service Level Agreement”, Amazon Web Services, last accessed November 12, 2021. https://aws.amazon.com/compute/sla/
3 Lance Whitney, “Billions of passwords leaked online from past data breaches”, TechRepublic, June 9, 2021, https://www.techrepublic.com/article/billions-of-passwords-leaked-online-from-past-data-breaches/.
4 “Mapping Digital Identity Systems: Nigeria”, Digital Identities: Design and Uses, November 03, 2020. https://digitalid.design/research-maps/nigeria.html.
5 Anand Venkatanarayanan and Srikanth Lakshmanan, “Aadhaar Mess: How Airtel Pulled Off Its Rs 190 Crore Magic Trick”, The Wire, December 21, 2017, https://thewire.in/banking/airtel-aadhaar-uidai.
6 Mehab Qureshi, “Govt Created Health IDs Without Consent, Say Vaccinated Indians”, The Quint, June 9, 2021, https://www.thequint.com/tech-and-auto/govt-created-uhid-without-consent-say-vaccinated-indians#read-more.
7 “Mapping Digital Identity Systems: UK”, Digital Identities: Design and Uses, July 10, 2019. https://digitalid.design/research-maps/uk.html.
8 “Document Verification”, Onfido, last accessed November 12, 2021. https://onfido.com/solutions/document-verification/.
9 “The easiest way to verify identities”, Stripe, last accessed November 12, 2021. https://stripe.com/identity.
10 Decentralized Identity Foundation, last accessed November 12, 2021. https://identity.foundation/.
11 “A Critique of Consent in Information Privacy”, The Centre for Internet and Society, last accessed December 7, 2021. https://cis-india.org/internet-governance/blog/a-critique-of-consent-in-information-privacy.
12 “Mapping Digital Identity Systems: India”, Digital Identities: Design and Uses, October 13, 2020. https://digitalid.design/research-maps/india.html.
13 Furkan Alaca and Paul C. Van Oorschot, “Comparative analysis and framework evaluating web single sign-on systems”, ACM Computing Surveys (CSUR) 53.5 (2020): 1-34.
14 “Decentralized Identifiers (DIDs) v1.0”, World Wide Web Consortium, last accessed November 12, 2021. https://www.w3.org/TR/did-core/.
15 “Verifiable Credentials Data Model v1.1”, World Wide Web Consortium, last accessed November 12, 2021. https://www.w3.org/TR/vc-data-model/.
16 Decentralized Identity Foundation, last accessed November 12, 2021. https://identity.foundation/.
17 This term was used by Vitalik Buterin, who defined the axes of decentralisation for software. https://medium.com/@VitalikButerin/the-meaning-of-decentralization-a0c92b76a274.