
Blockchain started as the backbone for cryptocurrencies and emerged as one of the most relevant technologies of the last decade. Today blockchain has numerous fields of application, one of which is identity management. Identity management is the process of managing users’ identities and providing access to technology resources. This process is gaining importance due to the growing number of digital identities that we deal with. As an increasing number of people spend time online and as more activities have drifted toward the digital world due to the recent pandemic, the notion of identity has slid from the physical world to the digital world as well. This new idea of identity is interesting and complex at the same time, and it represents the digital fingerprint of a person in the digital space. Who you are with respect to how others see you is a relevant question that we want to address when giving a definition of digital identity.

The first modern identity system can be traced back to Napoleon, when a first version of the ID card was invented to track the workforce. Fast forward to the 1950s, when a magnetic stripe with data storage capabilities was embedded into the card, and to a decade later, when it was secured to plastic cards: the notion of identity started to take the shape we know today. The last step in this innovation was the chip, which eventually manifested itself as a smart card with a microprocessor and memory. We have lately entered the digital (Internet) era, where digital identities are intended to be an alternative to appearing in person with paper documents.

Recently, there has been a significant effort to define the term “identity” from a legal, political, and social perspective. For example, [1] defines identity as a map to a unique set of characteristics or as “unchanging physical traits of the person that reflect someone else’s perceptions”. In 2016, [3] provided a relational definition of identity as a means to keep track of things that are related to the mind of the observer. More broadly, an identity has been defined as a collection of data that is directly tied to a person and to personally identifiable information from official credentials. This information can prove who you are.

Building on what we have said so far, we can give a first conceptual idea of what a digital identity is. A digital identity is a collection of data about you that is available online. In other words, it is information about an entity used by computer systems to represent an external agent, such as a person or an organization [15]. The term digital results from the widespread use of identity information to represent people in computer systems. These digital identities include the entire set of information generated by a person’s online activity, encompassing birth date, social security number and purchasing history, and are often tied to individuals’ civil or national identities.

The legal and social effects of digital identities are complex and challenging. However, they are a consequence of the increased use of computers and the need to provide computers with information that can be used to identify external agents. Consider for a moment all the pages that are online and represent you: social media, your personal pages, other pages. Websites frequently require registration before they provide services, leaving us with an array of fragmented identities. These identities have attributes sprinkled across sites, databases and authoritative bodies. This leaves us with an abundance of complications.

Complications arise because many different systems adopt their own way of authenticating users, which makes credentials difficult to manage and maintain, especially when it comes to our passwords. As complications increase, individuals may adopt weaker passwords that are easily compromised using a set of pre-provisioned credentials [14]. The idea behind the digital identity is to put all information in one spot, like an e-portfolio, to make it easier for people to obtain certified information about you.

A problem in the digital space is knowing whom one is interacting with. Currently there is no way to precisely determine the identity of a person in digital space, even though there are some features associated with that person. Hence the need for a unified identity system. Here, we provide a formal definition of digital identity by clarifying the difference between a digital identity and an identifier.

A digital identity is a set of attributes that are directly tied to personally identifiable information from official credentials. Digital identities may evolve over time as a result of interactions with other people; in the same way, attributes can be modified to suit these interactions. An attribute is a property of an individual. Attributes fall into one of three categories: 1) assigned, 2) accumulated or 3) inherent. The assigned and the accumulated attributes may change over time. The former reflect relationships with other bodies, for example emails, address, marital status, etc. The latter are acquired over time, such as health records, language, currency, etc. Finally, the inherent traits are those inherited from birth, for example fingerprints.

As discussed in [9], there are no problems with trust, as the architecture has been greatly simplified and the service provider acts as identity provider, credentials provider and service provider as well. The service provider protects client privacy and implements user registration procedures. The security level depends on the mechanisms implemented by the service provider.

Suppose that a ∈ A is an attribute within the set of attributes A, and that u denotes a user in the set of possible registered users U. We can define the function f that relates users and their attributes as follows:

Definition 1. f : A × U → V is the function that, for each attribute a and user u, returns the corresponding value f(a, u) of that attribute.

Here, × denotes the Cartesian product of A and U, and V is the set of all possible values for those attributes.
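As an illustration of Definition 1, the function f can be sketched as a lookup table keyed by (attribute, user) pairs. This is a minimal sketch, not part of the formalization in the cited works: the attribute names, user names and values are hypothetical, and treating an undefined pair as an error is an assumption of this example.

```python
from typing import Dict, Tuple

# Hypothetical sets A (attributes) and U (registered users).
A = {"email", "language", "marital_status"}
U = {"alice", "bob"}

# Hypothetical table holding the value in V for each known (a, u) pair.
_VALUES: Dict[Tuple[str, str], str] = {
    ("email", "alice"): "alice@example.org",
    ("language", "alice"): "Italian",
    ("email", "bob"): "bob@example.org",
}


def f(a: str, u: str) -> str:
    """Return the value of attribute a for user u (an element of V)."""
    if a not in A:
        raise KeyError(f"unknown attribute: {a}")
    if u not in U:
        raise KeyError(f"unknown user: {u}")
    # Assumption of this sketch: a pair without a recorded value raises KeyError.
    return _VALUES[(a, u)]
```

For instance, `f("email", "alice")` looks up the pair `("email", "alice")` in the table and returns the stored value.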

By contrast, an identifier is a data string used to uniquely identify a person. This identifier is usually assigned by a third party and must respect two properties: 1) uniqueness and 2) singularity. The former means that no two persons share the same identifier; the latter means that no one has more than one identifier.
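The two properties can be checked mechanically on a list of assignments. The following is a minimal sketch under the assumption that assignments are given as (identifier, person) pairs; the function name and the sample data are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def check_identifiers(assignments: Iterable[Tuple[str, str]]) -> Tuple[bool, bool]:
    """Return (unique, singular) for a collection of (identifier, person) pairs.

    unique:   no identifier is assigned to two different persons.
    singular: no person holds two different identifiers.
    """
    owner_of: Dict[str, str] = {}       # identifier -> first person seen
    identifier_of: Dict[str, str] = {}  # person -> first identifier seen
    unique = True
    singular = True
    for identifier, person in assignments:
        if owner_of.setdefault(identifier, person) != person:
            unique = False
        if identifier_of.setdefault(person, identifier) != identifier:
            singular = False
    return unique, singular
```

An assignment such as `[("id-1", "alice"), ("id-1", "bob")]` violates uniqueness, while `[("id-1", "alice"), ("id-2", "alice")]` violates singularity.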

The work of Ferdous in 2014 [8] was one of the first to provide a mathematical approach by discriminating between partial and total identity. Clarifying the difference between these two terms is relevant for the next sections. A partial identity is composed of many different attributes and is valid within a specific context or domain. For simplicity, we consider the domain as a subset of an organization, without referring here to a specific category. For example, the identifier attribute of a user is used to identify that user within a company, and it only refers to that context. Hence, following the formalization in [7], if we denote by d ∈ D the specific domain, by id the identifier of the user u, and by n the number of attributes of the user u, we may refer to a partial identity as:

Definition 2. partIdent = { (id, v_id), (a_1, v_1), (a_2, v_2), (a_3, v_3), ..., (a_n, v_n) } is the set of pairs formed by each attribute a_k and its value v_k, plus the identifier id and its corresponding value v_id.

Similarly, we can define a total, or whole, identity as the collection of all partial identities in all domains:

Definition 3. totIdent = ∪ { (d, partIdent) | d ∈ D } is the union, over all domains d ∈ D, of the pairs formed by a domain d and the partial identity partIdent that is valid in d.
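Definitions 2 and 3 translate naturally into set-valued data structures. The sketch below is only illustrative: the helper names `partial_identity` and `total_identity`, the domain names and the attribute values are hypothetical and do not come from [7] or [8].

```python
from typing import Dict, FrozenSet, Set, Tuple

# A partial identity is a set of (attribute, value) pairs, including ("id", value).
PartialIdentity = FrozenSet[Tuple[str, str]]


def partial_identity(identifier_value: str, attributes: Dict[str, str]) -> PartialIdentity:
    """Build the set of pairs of Definition 2 for one domain."""
    pairs = {("id", identifier_value)}
    pairs.update(attributes.items())
    return frozenset(pairs)


def total_identity(partials: Dict[str, PartialIdentity]) -> Set[Tuple[str, PartialIdentity]]:
    """Build the set of (domain, partial identity) pairs of Definition 3."""
    return {(domain, partial) for domain, partial in partials.items()}


# Hypothetical user with one partial identity per domain.
work = partial_identity("emp-042", {"email": "alice@corp.example"})
university = partial_identity("s-7", {"email": "alice@uni.example", "language": "Italian"})
whole = total_identity({"company": work, "university": university})
```

Frozen sets are used so that a partial identity can itself be an element of the total identity, mirroring the nesting in Definition 3.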

Having a digital identity does not mean that we are in control of our information. One important thing is to rethink the way we do identity. For example, if we buy a flight ticket, that information is linked to our digital identity, which is in turn linked to a database holding a profile with all the information about us and our family. Everything gets recorded and becomes part of that identity. If we then walk into a bar and want to buy a beer, the bartender might ask us to prove that we are over 21 and ask to check our driving licence. The licence includes more information than just our birth date. That information can be cross-referenced with other information; for example, he could obtain our credit history, wellness information or location information from the carrier.

The real problem is that when we give up too much information, it is really easy to cross-reference this data, figure out who people are and build an avatar of you that could be used by somebody else for their own benefit. Part of the solution is Self-Sovereign Identity (SSI), which gives control of private information back to the individual. Returning to our example of the bartender, he might just ask whether we are old enough to consume alcohol. This is done with a cryptographic proof that demonstrates that we are over 21 without having to hand over our licence and all the information on it. We do not need to provide our birth date but only to answer a binary question, and our Self-Sovereign ID answers that question. The benefit of Self-Sovereign Identity is that it does not link digital identities with other information. Observers just know that somebody bought a beer and somebody bought a flight ticket. They cannot put those two together, which is much more secure than the way we do things today. We explain the concept of Self-Sovereign Identity in more detail in Section 3.
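The bartender scenario can be approximated with salted-hash selective disclosure: an issuer signs only the digests of the individual claims, and the holder later reveals just the claims it chooses. This is a simplified sketch and not a zero-knowledge proof. The claim names, the key and all function names are hypothetical, and the HMAC with a shared key is an assumption standing in for a public-key signature that a real system would use.

```python
import hashlib
import hmac
import json
import secrets

# Assumption: stand-in for the issuer's signing key (a real issuer would use
# a public-key signature so verifiers never hold the secret).
ISSUER_KEY = b"demo-issuer-key"


def _digest(name, value, salt):
    """Hash one salted claim so it can be signed without being revealed."""
    return hashlib.sha256(json.dumps([salt, name, value]).encode()).hexdigest()


def _sign(digests):
    return hmac.new(ISSUER_KEY, json.dumps(digests).encode(), hashlib.sha256).hexdigest()


def issue(claims):
    """Issuer side: return a credential holding only digests, plus the salts."""
    salts = {name: secrets.token_hex(16) for name in claims}
    digests = sorted(_digest(n, v, salts[n]) for n, v in claims.items())
    return {"digests": digests, "signature": _sign(digests)}, salts


def present(credential, claims, salts, reveal):
    """Holder side: disclose only the claims listed in reveal."""
    disclosed = {name: (claims[name], salts[name]) for name in reveal}
    return {"credential": credential, "disclosed": disclosed}


def verify(presentation):
    """Verifier side: return the disclosed claims, or None if a check fails."""
    credential = presentation["credential"]
    if not hmac.compare_digest(_sign(credential["digests"]), credential["signature"]):
        return None
    verified = {}
    for name, (value, salt) in presentation["disclosed"].items():
        if _digest(name, value, salt) not in credential["digests"]:
            return None
        verified[name] = value
    return verified


# Hypothetical licence data: only the over_21 claim is shown to the bartender.
claims = {"birth_date": "1990-05-17", "licence_no": "X123", "over_21": True}
credential, salts = issue(claims)
presentation = present(credential, claims, salts, ["over_21"])
```

In this sketch the verifier learns the value of `over_21` and nothing else about the licence, because the undisclosed claims appear only as salted digests.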

In 2016 Christopher Allen gave a first definition of Self-Sovereign Identity by describing the identity models as an evolutionary path. The principles he presented in [2] are still considered by most as a reasonable starting point for getting involved in Self-Sovereign Identity (SSI). We believe that four complementary categories of identity models can be used to define the sets of features for digital identities, as highlighted in [7]. In this section, we discuss the first three of them, and we provide a complete definition of SSI in Section 3.

The centralized model was first introduced in online services [12] to deliver service-specific resources. Here, the service provider allocates identities and credentials to users and distributes them separately to everyone, as described in [10]. Every person needs to register an account with each service. This model reflects the way the Internet grew in the 90s, when Certification Authorities (CAs) provided a way to prove the validity of IP addresses. These CAs constituted a centralized point of failure and concentrated power in a single entity, which could revoke a valid identity or even confirm false identities. To make things worse, as described in Allen’s post [2], organizations created further subsets of subsidiaries, which contributed to splitting identities further.

In this scenario, there are only two parties involved, namely 1) the provider of the service, which provides credentials (username and password) to 2) users who wish to benefit from the service. For this reason, each service handles a set of partial identities with their corresponding credentials. As we said earlier, this creates a bottleneck and concentrates power in a single spot. This kind of model is also known as the siloed model [13] because, to interact with a company, you need to open an account, and credentials are never shared between organizations (they remain siloed). Typically, organizations store your data along with other customers’ data in a central repository controlled by the company. This identity management system requires each user to identify herself separately to each service provider.
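The siloed behaviour can be made concrete with a small sketch in which every service provider keeps its own credential store. The class name, the account data and the choice of PBKDF2 for password hashing are assumptions of this example, not details taken from [12] or [13].

```python
import hashlib
import hmac
import os


class ServiceProvider:
    """Hypothetical service provider that also acts as its own identity provider."""

    def __init__(self, name):
        self.name = name
        self._accounts = {}  # username -> (salt, password hash), private to this silo

    @staticmethod
    def _hash(password, salt):
        # Assumption: PBKDF2-HMAC-SHA256 as the password hashing scheme.
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def register(self, username, password):
        """Create a partial identity that exists only inside this service."""
        if username in self._accounts:
            raise ValueError(f"{username} is already registered with {self.name}")
        salt = os.urandom(16)
        self._accounts[username] = (salt, self._hash(password, salt))

    def authenticate(self, username, password):
        """Check credentials against this service's own store only."""
        record = self._accounts.get(username)
        if record is None:
            return False
        salt, stored = record
        return hmac.compare_digest(stored, self._hash(password, salt))


# Hypothetical usage: an account opened with the shop is unknown to the bank.
shop = ServiceProvider("shop")
bank = ServiceProvider("bank")
shop.register("alice", "s3cret-shop")
```

Because the stores are never shared, `bank.authenticate("alice", "s3cret-shop")` fails until the same user registers again with the bank, which is exactly the fragmentation described above.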

As discussed in [12], there are no problems with trust, as the architecture has been greatly simplified and the service provider acts both as identity provider and as credentials provider. The service provider protects client privacy and implements user registration procedures. The security level depends on the mechanisms implemented by the service provider.

This model works fine as long as the number of services and users is limited, but becomes problematic when the number of users and the number of service providers they transact with increase rapidly. The real owner of your digital identity is the organization or institution. Furthermore, a central repository (as in this case) constitutes a honey pot for hackers, and the failure to protect personal information may cause distress to users. If a credential is somehow compromised, this will result in authentication failure or identity theft. Malicious users may also erase a person’s digital identity, which might have taken a while to build.

The federated model was introduced to solve the problems of centralization. Federation is an agreement between parties that allows one party to leverage the existing infrastructure of another to authenticate users or, more precisely, a way to connect identity management systems together. Here, the identity provider and the service provider are two different entities, which are entitled to share details about their users within the same federation by means of shared technologies and standards. Each service provider issues its own set of credentials and shares partial identities with the identity provider, so that users can access the service by logging in to the identity provider. We can also see this arrangement as a legal and technical bond between the service provider and the identity provider. Typically, each federated domain behaves like a silo domain, where each individual holds a different set of credentials for each service provider she registers with. Authentication and identification take place inside the federation.

To become familiar with this model, we should first understand the concept of claim-based identity. A claim is a statement that a service provider uses to obtain identity information about a user whom another application has authenticated. Claims are typically delivered to the application within a security token, which is used to transfer identity information between the identity provider and the service provider. A security token contains a complete set of claims for a particular user and is issued by the Security Token Service operated by the identity provider. In this approach, trust is explicit.

Think of it as an internal employee portal with various intranet links to timesheets, insurance information, company directory, etc. Instead of having employees log in with their credentials on every single site, the portal can authenticate them with the other intranet sites using a single username and password.

As in the previous model, the authentication process requires trust among entities (users, identity provider and service provider). It is a two-step procedure: first, the user authenticates herself to the identity provider; then, an indirect access path that does not require any new authentication redirects the user to the service provider to consume the service [10]. The tricky part of this model is to organize and link partial identities (which have been created under different service providers) to the same individual, so that a user, by connecting to the identity provider, can get access to every service.
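The two-step procedure and the role of the security token can be sketched as follows. This is a simplified illustration under stated assumptions: the token format, the claim names, the user store and the function names are hypothetical, and the shared HMAC key stands in for the trust relationship that a real federation would establish with standardized protocols and public-key signatures.

```python
import base64
import hashlib
import hmac
import json

# Assumption: a secret shared inside the federation, standing in for real signing keys.
FEDERATION_KEY = b"shared-federation-secret"

# Hypothetical identity provider user store (plaintext passwords only for brevity).
_USERS = {"alice": "correct horse"}


def _sign(payload: bytes) -> str:
    return hmac.new(FEDERATION_KEY, payload, hashlib.sha256).hexdigest()


def idp_login(username, password):
    """Step 1: the identity provider authenticates the user and issues a token."""
    if _USERS.get(username) != password:
        return None
    claims = {"sub": username, "role": "student"}  # hypothetical claim set
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    return payload.decode() + "." + _sign(payload)


def sp_access(token):
    """Step 2: the service provider accepts the token without a new login."""
    payload, _, signature = token.partition(".")
    if not hmac.compare_digest(_sign(payload.encode()), signature):
        return None  # the token was not issued inside the federation
    return json.loads(base64.urlsafe_b64decode(payload))


token = idp_login("alice", "correct horse")
claims_seen_by_service = sp_access(token)
```

The service provider never sees the password: it relies only on the claims carried by the token and on the signature of the issuing identity provider.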

The IT administrator is in charge of binding the identity provider and the service provider through a contract that can be updated whenever necessary. When the contract ends, the two parties involved may leave the federation. [6] describes the management of the federation model as a four-step process that comprises: 1) association, where the identity provider and the service provider share the contract; 2) provisioning and 3) maintenance, where the user benefits from the service and the policy is updated; and finally 4) revocation, where the two entities are decoupled.
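The four-step process can be read as a small state machine. The sketch below is one possible reading, not the model defined in [6]: the strict ordering of the phases, the possibility of repeating the maintenance phase and all names are assumptions of this example.

```python
from enum import Enum


class Phase(Enum):
    ASSOCIATION = 1
    PROVISIONING = 2
    MAINTENANCE = 3
    REVOCATION = 4


# Assumption: phases follow each other in order, and maintenance may repeat
# every time the policy is updated.
_ALLOWED = {
    Phase.ASSOCIATION: {Phase.PROVISIONING},
    Phase.PROVISIONING: {Phase.MAINTENANCE},
    Phase.MAINTENANCE: {Phase.MAINTENANCE, Phase.REVOCATION},
    Phase.REVOCATION: set(),
}


class Federation:
    """Hypothetical contract between one identity provider and one service provider."""

    def __init__(self, identity_provider, service_provider):
        self.identity_provider = identity_provider
        self.service_provider = service_provider
        self.phase = Phase.ASSOCIATION  # the contract has just been shared

    def advance(self, target):
        """Move to the target phase if the transition is allowed."""
        if target not in _ALLOWED[self.phase]:
            raise ValueError(f"cannot move from {self.phase.name} to {target.name}")
        self.phase = target


# Hypothetical usage: a university identity provider federated with a library service.
federation = Federation("university-idp", "library")
federation.advance(Phase.PROVISIONING)
federation.advance(Phase.MAINTENANCE)
```

Once a federation reaches the revocation phase, no further transition is accepted, which reflects the decoupling of the two entities.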

Federation is commonly seen in large businesses, where single sign-on mechanisms allow a user to access multiple internal services, providing a degree of portability to a centralized identity. An example of this model is the university network, where we usually see one identity provider and many service providers, such as email, library, printing, etc. The identity provider keeps track of students’ usernames and passwords, and by logging in to one service (for example email), students gain access to all other services.

This model has several advantages. As we mentioned earlier, the identity provider does not have the burden of providing the service, but just has to manage partial identities. Consequently, service providers can focus on the service, guaranteeing higher scalability and a standard approach to improving security and privacy. Finally, users do not have to deal with many passwords, which improves the overall security of the system. Despite these advantages, the identity provider still holds significant power, and if an account is compromised, it might result in greater damage than in the centralized model.

Example. We provide here a real-life analogy for a case in which digital identity is useful. Think of the airport check-in procedure. When you get to the airport, you first check in at a counter and present your passport. After verifying that your passport matches you and your photo, and confirming that you have actually paid for the ticket, the agent prints a boarding pass with all the relevant information about you. Now you can head to the security checkpoint and reach the boarding gate by presenting your boarding pass. A boarding pass is a token containing a set of claims about you. The agent does not need to verify your documents again because the claims on the boarding pass were issued by a trusted authority.

To solve the problems of usability and scalability, a different paradigm emerged, putting users at the center of their identities for a better user experience. The term user-centric refers to technology that ensures that a user is placed back in control of her digital identity [5]. This paradigm shifts the focus from the service provider’s perspective to the user’s.

Technically, this model is similar to the previous one, with a number of service providers and one identity provider in charge of managing users’ partial identities, but with a major difference: there is no need to define trust among entities because the concept of trust is intrinsically decentralized. Hence, a service provider does not need to bind itself into a federation, whence the name open-trusted model. Whenever an individual tries to access a service provider, her request is forwarded to the identity provider, which is in charge of authenticating the user and, in turn, releases a profile of the user to the service provider, where an authorization decision is taken based on her grants.

This model was first introduced by OpenID, which quickly became successful, except for the fact that a user’s identity could be taken away at any moment by the registering entity. A few years later, Facebook learned the lesson and provided a better interface, but it still does not let users choose which service provider to adopt. Furthermore, Facebook could arbitrarily close accounts, with the risk of compromising access to other web services.

Unlike the identity models described in the previous section, the goal of Self-Sovereign Identity (SSI) is to put individuals fully in control of their digital identity. It is important to understand that when we talk about individual sovereignty, we are talking not only about technology, but also about a series of principles. Most importantly, the user has control over what is disclosed, to whom, and how it is used. Two important principles come along with that: 1) the right to be forgotten (deleted): even if I am using a service that is collecting tons of information about me, I should be able to tell it to delete that information; and 2) the right to move my information to another service, together with my entire social graph.

SSI is part of the inevitable paradigm shift towards the decentralisation of trust and enhancement of privacy in computer systems and beyond.

Most of the efforts to define SSI enumerate its guiding principles, similarly to what Kim Cameron did in [4] for the concept of identity. The first to provide a principles-shaped definition was C. Allen, who proposed in [2] the Ten Principles of Self-Sovereign Identity, which remain the most popular take on SSI. J. Andrieu proposed in [3] the Core Characteristics of Sovereign Identity, focusing on a definition beyond technology. The Sovrin Foundation instead proposed in [9] a more technical perspective on digital identity ecosystems with the Principles of SSI.

We use Allen’s principles to outline in detail the properties that must be satisfied by an SSI ecosystem, expanding the description with the principles from Sovrin and Andrieu.

1) Existence. Behind a digital identity there can be any kind of user (not only humans) whose life is not tied to the system being used. In other words, users do not start to exist because they are represented digitally; their existence is independent of their possibly multiple partial identities. Anyone in an SSI ecosystem can autonomously create claims about themselves, with the further ability to provide verifiable proofs of authenticity. Claims about other users can be created too, but it is up to the claim subject to accept them. The intrinsic purpose of Self-Sovereign Identity is to ensure that the public and accessible representations of a user are guaranteed to be a result of their will.

2) Control. Identity control is in the user’s hands, i.e., users can autonomously update their social graph and their digital identity. Previously accepted claims that cease to be meritorious or useful can be rejected; this applies not only to claims made by other users, but also to claims users made about themselves. The consequence of this control principle is that any party that needs to verify something about an identity should refer to its latest version in the ecosystem.

3) Access. Users can easily and freely access all the data about their digital identity. This principle does not give users free access to other individuals’ private information, but only to their own set of information, unless they are specifically authorized. Once users have access to the data, they must be able to fully understand and manipulate what they retrieved.

4) Transparency. Transparency concerns all layers in the identity network. The system and algorithms must be open source. Sharing policies empower users and stakeholders and provide them with access to the information they need to understand the rules and contribute to the decision process.

5) Persistence. The idea behind this principle is that digital identities last as long as users wish. Data that represents personal digital information must outlast the issuer. An identity can be rebuilt from scratch using the available claims and credentials. DLTs provide a solution, although we are aware that other solutions are possible as well. In fact, blockchains store data in a secure way using cryptographic proofs, so that it is computationally hard to revert the encrypted information. The history of updates is persistent too, for auditability and non-repudiation.

6) Portability. The platform and all linked information are transportable. This means that users are free to choose their favourite network for their identities, without being tied to a certain platform. Multiple ways to move identities are provided, with support for different protocols. Transportable identities make censorship harder because the freedom to switch to another provider is guaranteed.

7) Interoperability. SSI allows interactions between different systems, even outside of SSI, using common standards. An identity is not valid only on the platform where it was created; it can be reused on other platforms. Hence, technological limitations should not compromise the ability of the system to function properly in different regions.

8) Consent. Digital identities leverage the information shared across platforms. As the number of identity platforms has increased, interoperability has become a distinguishing factor in guaranteeing a continuous identity service and, with it, the ability of platforms to share users' data. This principle states that users must always provide explicit consent to the sharing of their data. Basically, everything that exists in SSI can be considered valid by other entities only when the user has given consent.

9) Minimalization. Users want to minimize the amount of information that they share while still remaining verifiable. In general, individuals can consent to let others see part of their information and obfuscate the rest by using a technology such as zero-knowledge proofs. Companies may ask you for a lot of information that they do not actually need because they want to gather information about you. For example, users can prove that they are over 21 without actually sharing their birth date. This prevents companies from collecting large amounts of data and creating a profile for each of their customers.

10) Protection. This is related to control. If I have control of my private key, I can also protect my information, and if a conflict happens, the network must act to protect the rights of individuals. In the same way, [12] declares that the SSI ecosystem shall not discriminate against individuals. Decentralization comes in handy from this point of view. The system must not rely on a centralized group of individuals or servers and must instead protect the fundamental identity needs.

1. Abelson, H., Lessig, L., Covell, P., Gordon, S., Hochberger, A., Kovacs, J., et al.: Digital identity in cyberspace. White Paper Submitted for 6.805/Law of Cyberspace: Social Protocols (1998)

2. Allen, C.: The path to self-sovereign identity. [online] Life with Alacrity blog (2016)

3. Andrieu, J.: A technology-free definition of self-sovereign identity. Rebooting the Web of Trust III (October), 2–5 (2016)

4. Cameron, K.: The laws of identity. Microsoft Corp 12, 8–11 (2005)

5. El Maliki, T., Seigneur, J.M.: User-centric mobile identity management services. In: SECURWARE International Conference. Citeseer (2007)

6. Ferdous, M., et al.: User-controlled identity management systems using mobile devices. Ph.D. thesis, University of Glasgow (2015)

7. Ferdous, M.S., Chowdhury, F., Alassafi, M.O.: In search of self-sovereign identity leveraging blockchain technology. IEEE Access 7, 103059–103079 (2019)

8. Ferdous, M.S., Norman, G., Poet, R.: Mathematical modelling of identity, identity management and other related topics. In: Proceedings of the 7th International Conference on Security of Information and Networks. pp. 9–16 (2014)

9. Foundation, S.: Principles of SSI (2020)

10. Grüner, A., Mühle, A., Gayvoronskaya, T., Meinel, C.: A comparative analysis of trust requirements in decentralized identity management. In: International Conference on Advanced Information Networking and Applications. pp. 200–213. Springer (2019)

11. Hoofnagle, C.J., van der Sloot, B., Borgesius, F.Z.: The European Union general data protection regulation: what it is and what it means. Information & Communications Technology Law 28(1), 65–98 (2019)

12. Jøsang, A., Fabre, J., Hay, B., Dalziel, J., Pope, S.: Trust requirements in identity management. In: Proceedings of the 2005 Australasian workshop on Grid computing and e-research-Volume 44. pp. 99–108. Citeseer (2005)

13. Suriadi, S., Foo, E., Jøsang, A.: A user-centric federated single sign-on system. Journal of Network and Computer Applications 32(2), 388–401 (2009)

14. Wang, D., Wang, P.: Offline dictionary attack on password authentication schemes using smart cards. In: Information security, pp. 221–237. Springer (2015)

15. Williams, S.A., Fleming, S.C., Lundqvist, K.O., Parslow, P.N.: Understanding your digital identity. Learning Exchange 1(1) (2010)

Toward Self-Sovereign Identity

1. Introduction

1.1 A formal definition of digital identity

1.2 The need for more control

2. Identity models

2.1 Centralized

2.2 Federated

2.3 User-centric

3. Definition of SSI

3.1 Principles of SSI

References