Jacob Ideskog — CTO @ Curity.io
Ever since the OAuth 2.0 specification was finalized, we have dealt with the limitations of the Resource Owner Password Credentials (ROPC) flow. It was intended to support legacy applications that did not have a browser available, or as a last resort for other legacy use cases. The authors of the specification clearly signal that this flow is not recommended and should be avoided.
At Curity, we agree with this sentiment. The ROPC flow is an anti-pattern that conflicts with the spirit of the OAuth specification as a whole: the client should not be involved in user authentication. OpenID Connect does not even include the ROPC flow as a compatible way of obtaining tokens, so the client cannot rely on any identity data from the server.
The solution has always been to go all in and implement (preferably) the Authorization Code Flow, which in a mobile application means opening a browser (system, tab or view) and waiting for the browser to call back to the application after authentication and authorization have been performed. This is the RFC 8252 idea.
The underlying notion that the client cannot be trusted with seeing any input that the user provides to the identity system is based on a few assumptions:
- The client application is a third-party application, not one developed by the same organization that owns the APIs and the Authorization Server.
- The client is a hostile environment. This could be a public (non-confidential) client on a mobile device or a Single Page Application (SPA) on the web.
- The security model of OAuth is built around the fact that redirects are protected by the browser, so we can safely(ish) assume that the user will be sent to the intended application after authorization is done.
The first assumption has proven to be the exception rather than the rule. Most organizations that deploy OAuth do so to support their own API architecture used by their own client applications. They are in full control of the development and maintenance of all parts of the system. So, the applications are not third-party; they are first-party.
This creates a lot of friction between app developers and security operations. Why are we hindering our own app developers from developing an award-winning user experience based on assumptions made for third-party developers? I’ve personally argued with developers many times, trying to convince them of the greater good in following the standard to the letter, leaving UX as a second-priority feature.
If point number one is void, that still leaves two and three. The mobile application is still a hostile environment. All mobile apps must be considered public clients from an OAuth perspective unless measures are taken to make them confidential. This has proven to be difficult using the standards alone. Several techniques can and should be used, such as following the Native Apps guidelines (implementing PKCE) and bootstrapping clients using Dynamic Client Registration. However, the main issue often remains: how to confidently assert that the application is not a clone or repackaged version of the original application.
This brings us to point three. The only way the standard provides to assert that we issue tokens to the intended application is to redirect to the application when authorization is done, using deep links or universal links. This is protected by the operating system and, combined with PKCE, provides a secure way to pass tokens to the application.
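For readers who have not implemented PKCE, here is a minimal sketch of the S256 method from RFC 7636. The function names are our own choice for illustration; only the encoding and hashing rules come from the RFC.

```python
import base64
import hashlib
import secrets


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method."""
    # 32 random bytes encode to a 43-character verifier, the minimum length allowed.
    code_verifier = b64url(secrets.token_bytes(32))
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return code_verifier, b64url(digest)


def challenge_matches(code_verifier: str, code_challenge: str) -> bool:
    """The comparison an authorization server makes at the token endpoint."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return secrets.compare_digest(b64url(digest), code_challenge)
```

The client sends the challenge with the authorization request and keeps the verifier to itself until it redeems the code, so an app that merely intercepts the redirect cannot exchange the code for tokens.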
All this means that we are stuck with the browser. Even if ROPC were to be used instead, and the organization would be satisfied with the limitations of only allowing username/password as the authentication mechanism, the client is still public. As such, additional measures need to be taken to securely transform it into a confidential client.
The door is wide open for phishing apps to use the same mechanism to get hold of users’ credentials, and even obtain access tokens and operate on the user’s behalf.
Enter Authentication APIs
Looking at any authentication flow out in the wild, you will find one thing in common: Authentication is a state machine, moving the user through a number of steps to assert the user’s identity with some level of confidence.
It turns out that this lends itself particularly well to a hypermedia API, which is hardly surprising since the web itself is a hypermedia API. The hypermedia API can guide the consuming client through the entire flow, letting it render screens as necessary. Each response from the server contains the links and actions that can be taken by the user at that particular stage in the login state machine.
From a UX perspective, using a hypermedia-driven authentication API would enable our dear app developer to build award-winning applications. This is the only way to achieve a true native user experience, stop fighting with each other and finally become friends :)
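To illustrate the idea, here is a small sketch of a client that knows nothing about the order of the login steps and simply follows the actions each response offers. Everything in it is hypothetical: the paths, the `type` and `actions` fields, and the in-memory `fake_server` stand in for a real API and are not taken from any actual product.

```python
# Hypothetical responses: each one lists the actions available in that state.
RESPONSES = {
    "/authenticate": {
        "type": "authentication-step",
        "actions": [{"name": "password", "href": "/authenticate/password",
                     "fields": ["username", "password"]}],
    },
    "/authenticate/password": {
        "type": "authentication-step",
        "actions": [{"name": "otp", "href": "/authenticate/otp",
                     "fields": ["code"]}],
    },
    "/authenticate/otp": {"type": "authentication-complete", "actions": []},
}


def fake_server(href, form=None):
    """Stand-in for an HTTP call; a real server would validate the form."""
    return RESPONSES[href]


def run_flow(start_href, ask_user, send=fake_server):
    """Follow server-provided actions until the server offers none."""
    visited = [start_href]
    response = send(start_href)
    while response["actions"]:
        action = response["actions"][0]  # a real client lets the user choose
        form = {field: ask_user(field) for field in action["fields"]}
        response = send(action["href"], form)
        visited.append(action["href"])
    return response["type"], visited
```

Because the server decides which step comes next, adding a second factor or reordering the steps in this sketch only changes `RESPONSES`; `run_flow` stays the same.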
So how can we achieve this, and still maintain or improve the security level we have gotten accustomed to with OAuth?
- We must turn the public client securely into a trusted confidential client.
- We must assert that the client application is in fact the application that we intend it to be (e.g. the one the organization published on the app store or is running on our own domain).
Compared to a mobile app, the implementation looks slightly different for an SPA that wants to integrate the authentication flow directly. For web, we must assert that the application is served on the domain we expect. This was done using redirects previously, but must now be attested using other techniques. For mobile, we must obtain evidence that the application has not been tampered with and remains intact on a non-rooted system.
Once those attestations have been made, we can generate key material on the client side that, combined with Demonstrating Proof of Possession (DPoP), can secure communication with the authentication API.
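As a rough sketch of what such a proof contains, the code below assembles the header and claims that RFC 9449 defines for a DPoP proof JWT. The signing itself is delegated to a `sign` callable, which is an assumption on our part: in a real app it would be backed by a private key held in the platform key store, and the function name and parameters here are illustrative only.

```python
import base64
import json
import secrets
import time
from typing import Callable, Optional
from urllib.parse import urlsplit


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_dpop_proof(method: str, url: str, public_jwk: dict,
                     sign: Callable[[bytes], bytes],
                     alg: str = "ES256", now: Optional[int] = None) -> str:
    """Assemble a DPoP proof; `sign` is a hypothetical signer for the client key."""
    # htu is the request URI without its query and fragment parts.
    htu = urlsplit(url)._replace(query="", fragment="").geturl()
    header = {"typ": "dpop+jwt", "alg": alg, "jwk": public_jwk}
    claims = {
        "jti": secrets.token_urlsafe(16),  # unique identifier per proof
        "htm": method,                     # HTTP method of the request
        "htu": htu,
        "iat": int(time.time()) if now is None else now,
    }
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    # The signer must return the signature bytes in the JWS format for `alg`.
    signature = sign(signing_input.encode("ascii"))
    return f"{signing_input}.{b64url(signature)}"
```

The resulting string would be sent in the `DPoP` request header. Since each proof is bound to one method and URL and carries a fresh `jti`, a captured proof is of little use for a different request.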
Given that the authentication is a hypermedia API, we can start with a regular code flow (or any other flow, such as CIBA) and indicate to the server that our client accepts resources of the media type of the API. The client will know how to present screens for the new media type, and can guide the user through the flow. Each step is protected using DPoP after the initial attestation.
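In practice, the start of the flow could look something like the sketch below: an ordinary authorization request whose `Accept` header asks for the hypermedia representation instead of HTML. The media type name, the endpoint, and the client values are all made up for illustration; the query parameters are the standard OAuth and PKCE ones.

```python
from urllib.parse import urlencode

# Hypothetical media type; a real API would define and document its own.
HYPERMEDIA_MEDIA_TYPE = "application/vnd.example.auth+json"


def build_authorization_request(authorize_endpoint, client_id, redirect_uri,
                                code_challenge, state):
    """Return (url, headers) for a code flow that asks for hypermedia responses."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
        "state": state,
    })
    # The Accept header is the only difference from a browser-based request.
    headers = {"Accept": HYPERMEDIA_MEDIA_TYPE}
    return f"{authorize_endpoint}?{query}", headers
```

A client that does not send this header would get the usual HTML login page, which is what keeps the browser-based flow available as a fallback in this sketch.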
With this, we can provide a complete authentication flow with advanced multi-factor capabilities in a completely API-driven manner.
So, what about third-party applications? The problem remains that we do not want the third-party application to be able to intercept the input from the end user. This can be solved by forcing third-party applications to continue to use the browser. This sucks, though, because they want to create award-winning apps too! A better option is to move to more modern authentication methods, such as identity verification apps (e.g., BankID or Duo). With these, the context is switched to a safe environment when the user authenticates, and then back to the client after their identity is verified. It is even possible to switch accepted media types in the middle of the flow, telling the client to open a browser. This powerful characteristic of hypermedia allows for federated authentication methods, such as Google, generic SAML, or OpenID Connect. The client can break out of the flow in the middle, and then resume once the browser closes and focus is restored.
Further on point three, requiring Strong Customer Authentication (SCA) with third-party apps addresses authentication but not consent. For consent, the hypermedia API must allow the consent to be digitally signed or verified out of band. To ensure that the third-party application can’t repeat the process without the user involved, the authorization of that application must be verified by the same sort of verification application or some variation. With a hypermedia API, obtaining this proof of consent is just another state in the automaton. Rendering it client-side and waiting for out-of-band consent verification is more of the same.
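One way a client might handle such a mid-flow switch is to look at the media type of each response and decide whether to keep rendering natively or hand over to a browser. This is only a guess at a possible mechanism, not a description of how any particular API signals it, and the media type name is again hypothetical.

```python
# Hypothetical media type for the native, hypermedia-driven steps.
HYPERMEDIA_MEDIA_TYPE = "application/vnd.example.auth+json"


def next_step(content_type: str) -> str:
    """Decide how to continue based on a response's Content-Type header."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == HYPERMEDIA_MEDIA_TYPE:
        return "render-native"
    if media_type == "text/html":
        # For example a federated login; resume once the browser hands back control.
        return "open-browser"
    raise ValueError(f"unsupported media type: {media_type}")
```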
Performing user authentication means guiding the user through a state machine. The web is exceptionally good at this, with each link or action representing a state transition. A properly secured hypermedia API is the natural evolution for authentication into the mobile world.
At Curity, we have released our first beta version of a complete hypermedia authentication API. It proves not only that this is possible, but also that it is safe and will enable customers to build blue-ribbon apps that improve UX. We have stopped fighting the mobile app devs, and provided them with an API that will enable them to securely access APIs. We believe that this is the future of authentication. We have joined the OpenID Foundation’s Financial-grade API (FAPI) working group, and are beginning work to standardize not only the start and end of the flow, which uses OAuth and OpenID Connect, but the entire process. We are doing this because we believe that OAuth and OpenID Connect need an additional API specification for authentication, and that hypermedia is the way. We hope that you’ll check out our API and join us in this work!