Webview credential type

Background

The webview credential type is intended foremost to allow Forms servers to support SAML 2.0, OpenID Connect, and other authentication / federation protocols where the user agent is normally a browser. However, the design is intended to be as generic as possible to allow for other uses. In particular, it is desired that a self-contained static page could be used for a webview step, provided it has JS code that understands the webview page contract specified in this document.

The term webview was chosen to reflect the fact that many Forms clients are not browsers and so will need to instantiate a webview control or specialized browser, typically supplied by the OS or platform, to support this credential type. A mechanism is therefore needed by which the web page will indicate completion of the browser-based step so that Common Forms flow can resume. The mechanism also needs to work for the case where a standard browser is running a JavaScript Single Page App which contains the Forms client (aka the Logon SPA). Another scenario is Chrome OS which allows privileged Chrome apps to trigger web authentication flows in the OS context, in order to reuse an Identity Provider SSO session created during logon to the Chrome OS device.

Basic Model

The simplest use of webview would be to display a self-contained web page on an arbitrary server that has a link or button to trigger the completion action directly.

Webview sequence diagram

The main constraint is that the web page must be able to invoke the completion mechanism, which will be indicated by parameters attached to the first request by the Forms Client.

In federated authentication scenarios, the web page (which would be part of the Forms Server) would direct the client to the Identity Provider with an authentication request, for instance by using JavaScript to auto-submit a hidden form containing a SAML AuthnRequest document. After authentication, which may be interactive or silent, the Identity Provider would direct the client back to a pre-defined receiving URL on the Forms Server with the authentication response, such as a signed SAML token. The receiving URL would serve a page that invokes the completion mechanism.

See the illustrations below for detailed examples of this flow.

Overview of client and server processing

The following diagram summarizes the processing done by client and server to perform a web authentication flow. The steps happen in numerical order; steps 3-6 involve the webview pages.

Diagram showing processing done by client and server

Notes on each step:

  1. Webview credential type is the same for browsers and native apps using a webview control.

  2. In general the server should gracefully handle the case when the client doesn’t support webview, e.g. by offering an alternative auth method (based purely on Forms) or by sending a form with a user message to explain that logon is not supported by that client.

  3. Clients that can cache the Forms conversation state by themselves should do so; others (typically just plain browsers) will need the webview pages to cache the state until step 7. In general a web client also needs to tell the server how to return control at completion (i.e. specify the return page URL and method).

  4. If performing a web authentication flow such as SAML or OIDC, the first webview page will need to preserve any state and completion information provided in step 3, for recovery in step 6. SAML 2.0 and OIDC both provide a request variable for this purpose (RelayState and state respectively), if the server wishes to not maintain state itself.

  5. Webview pages, including any served as part of a web auth flow, are rendered essentially as normal with full control of the browser or webview control. Native apps need to monitor for the exit action, potentially by registering a handler before each window navigation or by injecting JS code into each page that is loaded.

  6. Once the server has received a response to the web auth flow and is ready to return to Forms flow, it constructs a completion page based on any explicit or implied state and completion information from step 4, containing response information derived from the web auth flow that needs to be provided to the Forms Server. Depending on load balancing operation, the Forms session may be on a different server so true processing of the response may need to be delayed until step 8.

  7. The completion page either triggers the exit action expected by a native app client, or navigates the browser window to the browser client page specified in step 3. Depending on the browser client app and the size of response data, the return page may be the Logon SPA or an intermediate service that stores the response before reloading the Logon SPA.

  8. The Forms Server receives the desired response as determined by the webview completion page. Normal Forms processing continues.
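As an illustration of the monitoring mentioned in step 5, the following JavaScript sketch shows how a native client might test each navigation target against the base URI it supplied in _ri (the URI intercept form of the completion action defined in the contract later in this document). The function name, the returned object shape, and the citrixwv://exit scheme used in the usage note are hypothetical, and percent-encoding of the result value is an assumption rather than part of the contract.

```javascript
// Marker the completion page appends to the base URI given in _ri.
const EXIT_MARKER = "#resumeForms:_result=";

// Returns { exit: false } for ordinary navigations, or
// { exit: true, result } when the target is the completion URL.
// Assumption: the completion page percent-encodes the result value.
function matchExitNavigation(targetUrl, baseUri) {
  const prefix = baseUri + EXIT_MARKER;
  if (!targetUrl.startsWith(prefix)) {
    return { exit: false };
  }
  const raw = targetUrl.slice(prefix.length);
  try {
    return { exit: true, result: decodeURIComponent(raw) };
  } catch (err) {
    // Malformed escape sequence: hand back the raw text rather than fail.
    return { exit: true, result: raw };
  }
}
```

A client that registered citrixwv://exit as its base URI would call matchExitNavigation(target, "citrixwv://exit") from its navigation handler and close the webview when exit is true.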

Webview contract

Contract Specification

The contract has three parts: the Forms parameters for the credential type webview, the parameters added to the start page request by the Forms Client, and the completion page mechanisms for returning control to the Forms Client based on the start page request parameters.

Webview credential parameters

The webview credential accepts two input parameters, specified by the Forms Server. These are represented as new sub-elements of the Credential form element:

Parameter              Value
StartUrl               URL of webview start page
[optional] PostData    Form-encoded post data. If present this implies the StartUrl method is POST if the client supports it, otherwise it is GET with PostData added to StartUrl as query parameters

A distinct XML namespace and a container element are used to avoid potential confusion with other elements that may be defined for new credential types, including third-party ones that can be added via native client plugin mechanisms. The XML formatting is as follows:

<wv:WebView xmlns:wv="http://citrix.com/authentication/response/webview/1">
  <wv:StartUrl>https://sf.net/web/StartUrl?param1=v1&amp;param2=v2</wv:StartUrl>
  <wv:PostData>param3=v3&amp;param4=v4</wv:PostData>
</wv:WebView>

Forms Servers should not combine webview with any other credentials in the same form, and the credential should be marked as hidden (by having an empty Input element and Label Type set to none). Forms Clients should in any case ignore the Input and Label elements for a webview credential, and should reject a form that includes any other credentials, including another webview.
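The client-side rule above can be sketched as a small check in JavaScript. This is only an illustration: the function name and the input shape (an array of { id, type } objects taken from the Credential elements of a parsed form) are hypothetical, and treating entries of type none as carrying no credential is an assumption.

```javascript
// requirements: [{ id: "samlResponse", type: "webview" }, ...]
// Assumption: entries of type "none" (labels, buttons) are not credentials.
function classifyForm(requirements) {
  const credentials = requirements.filter((r) => r.type && r.type !== "none");
  const webviews = credentials.filter((r) => r.type === "webview");
  if (webviews.length === 0) {
    // No webview credential: process as an ordinary form.
    return { kind: "ordinary" };
  }
  if (credentials.length > 1) {
    // A webview must be the only credential, so a second webview
    // or any other credential type causes rejection.
    return { kind: "rejected", reason: "webview must be the only credential in the form" };
  }
  return { kind: "webview", id: webviews[0].id };
}
```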

StartUrl, and/or PostData if used, can contain input parameters from the Forms Server to the webview start page, as appropriate to the semantics of the webview operation, as long as the query or post parameter names do not clash with the ones that may be added by the Forms Client. For instance, the name of the user might be established by the Forms Server before a SAML authentication request is made, and should then be specified as the Subject element of the AuthnRequest document generated by the Authentication Request Service page. It is permitted to provide input data both as StartUrl query parameters and as post data, if desired. Clients that don’t support POST must merge the post data into StartUrl as additional query parameters.
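For a client that cannot issue POST, the merge described above might look like the following JavaScript sketch. The function name is hypothetical. Because PostData is already form-encoded, it is appended unchanged.

```javascript
// Appends form-encoded PostData to StartUrl as extra query parameters.
function mergePostDataIntoStartUrl(startUrl, postData) {
  if (!postData) {
    return startUrl;
  }
  // Keep any #fragment at the very end of the resulting URL.
  const hashIndex = startUrl.indexOf("#");
  const base = hashIndex === -1 ? startUrl : startUrl.slice(0, hashIndex);
  const fragment = hashIndex === -1 ? "" : startUrl.slice(hashIndex);
  let separator = "?";
  if (base.includes("?")) {
    // StartUrl may already have query parameters.
    separator = base.endsWith("?") || base.endsWith("&") ? "" : "&";
  }
  return base + separator + postData + fragment;
}
```

With the sample values shown earlier, the merged URL is https://sf.net/web/StartUrl?param1=v1&param2=v2&param3=v3&param4=v4.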

Note:

GET is recommended, unless there is a compelling need to use POST. The main reason to consider POST is if the total size of input data to the start page (including the state context information that will be attached by some clients) could exceed the Internet Explorer URL path length limit of 2048 characters. However, the start page must then be prepared to accept GET or POST, and to process the parameters in either the URL or the post body (or both).

The state context information which will be attached by browser clients includes information supplied by the Forms Server itself (StateContext and the Credential ID), whose size is therefore known. It also includes a return URL that will be supplied by the Forms Client, whose precise length cannot in general be known in advance by the Forms Server. However, to enable servers to make allowance for it, Forms Clients should use no more than 256 characters for the return URL.
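Since the start page must accept parameters in the URL, the post body, or both, one way to handle this is to read both sources into a single lookup. The JavaScript sketch below uses the standard URLSearchParams class. The function name is hypothetical, and letting a body value override a query value of the same name is an assumption, since the contract does not define precedence.

```javascript
// queryString: the part of the request URL after "?" (may be empty).
// postBody: a form-encoded request body (empty for GET).
function collectStartParams(queryString, postBody) {
  const params = new Map();
  for (const source of [queryString || "", postBody || ""]) {
    for (const [name, value] of new URLSearchParams(source)) {
      // Assumption: the later source (the body) wins on a name clash.
      params.set(name, value);
    }
  }
  return params;
}
```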

Sample form to initiate SAML authentication with a pre-established username:

<AuthenticateResponse xmlns="http://citrix.com/authentication/response/1">
  <Status>success</Status>
  <Result>more-info</Result>
  <StateContext/>
  <AuthenticationRequirements>
    <PostBack>/Citrix/Authentication/SAML_Forms</PostBack>
    <CancelPostBack>/Citrix/Authentication/SAML_Forms/Cancel</CancelPostBack>
    <CancelButtonText>Cancel</CancelButtonText>
    <Requirements>
      <Requirement>
        <Credential>
          <ID>samlResponse</ID>
          <Type>webview</Type>
          <wv:WebView xmlns:wv="http://citrix.com/authentication/response/webview/1">
            <wv:StartUrl>https://sf.net/SAML/StartUrl?user=fred@acme.net</wv:StartUrl>
          </wv:WebView>
        </Credential>
        <Label>
          <Type>none</Type>
        </Label>
        <Input/>
      </Requirement>
    </Requirements>
  </AuthenticationRequirements>
</AuthenticateResponse>

Webview start page request parameters

The following parameters may be used by Forms Clients to convey state information and/or completion mechanism information to the webview start page, which must be used by the webview completion step. The webview start page has an obligation to safely remember these parameters if it is not also the completion page, e.g. by encrypting potentially sensitive information and applying integrity protection to detect tampering. It may also be necessary to apply replay detection, depending on the semantics of the webview operation.
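As a minimal sketch of the integrity protection mentioned above, a start page running on Node.js could seal the parameters with an HMAC before handing them to the web authentication flow, and verify the seal on the completion page. The function names and the payload.tag layout are hypothetical. This example detects tampering only: it does not encrypt the values and does not provide replay detection, both of which would need to be added where required.

```javascript
const crypto = require("crypto");

// Serializes the parameters and appends an HMAC-SHA256 tag.
function sealState(params, key) {
  const payload = Buffer.from(JSON.stringify(params), "utf8").toString("base64url");
  const tag = crypto.createHmac("sha256", key).update(payload).digest("base64url");
  return payload + "." + tag;
}

// Returns the original parameters, or null if the value was altered
// or was sealed with a different key.
function openState(sealed, key) {
  const dot = sealed.lastIndexOf(".");
  if (dot === -1) {
    return null;
  }
  const payload = sealed.slice(0, dot);
  const tag = Buffer.from(sealed.slice(dot + 1), "base64url");
  const expected = crypto.createHmac("sha256", key).update(payload).digest();
  // timingSafeEqual requires equal lengths, so check that first.
  if (tag.length !== expected.length || !crypto.timingSafeEqual(tag, expected)) {
    return null;
  }
  return JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
}
```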

The parameters are added by the Forms Client as form variables to PostData if the webview start method is POST, otherwise they are added as query parameters to StartUrl. Note that StartUrl may already have query parameters.

Browser clients:

Parameter         Value
_cx               Value from the StateContext element
_id               ID of the webview form requirement
_rt               Return URL for reloading the logon page
[optional] _ps    true to use POST to invoke the return URL
[optional] _hf    Value of the Forms Client’s URL hash fragment
[optional] _pb    Value from the PostBack element
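The following JavaScript sketch shows one way a browser client might attach these parameters to the start request. The function name and the form and client object shapes are hypothetical, and the conditions under which a client sends the optional parameters are assumptions. The parameter names and the GET/POST rule come from the contract above.

```javascript
// form: { startUrl, postData, stateContext, credentialId, postBack }
// client: { returnUrl, returnByPost, hashFragment, includePostBack, supportsPost }
function buildStartRequest(form, client) {
  const added = new URLSearchParams();
  added.set("_cx", form.stateContext);
  added.set("_id", form.credentialId);
  added.set("_rt", client.returnUrl);
  if (client.returnByPost) added.set("_ps", "true");
  if (client.hashFragment) added.set("_hf", client.hashFragment);
  if (client.includePostBack) added.set("_pb", form.postBack);
  const extra = added.toString(); // form-encoded

  if (form.postData && client.supportsPost) {
    // POST: the parameters travel as additional form variables.
    return { method: "POST", url: form.startUrl, body: form.postData + "&" + extra };
  }
  // GET: PostData (if any) and the parameters become query parameters.
  let url = form.startUrl;
  for (const part of [form.postData, extra]) {
    if (part) {
      url += (url.includes("?") ? "&" : "?") + part;
    }
  }
  return { method: "GET", url: url };
}
```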

The return URL and #fragment combined should be no more than 256 characters. This leaves at least 1777 characters available for StateContext, the ID name, the postback URL if needed, and the rest of the StartUrl path. Allowance should be made for URL encoding of StateContext (or URL-safe Base64 encoding should be used for StateContext).
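A Forms Server could use the budget above to decide whether GET is safe for a given form. The JavaScript sketch below is a rough estimate only: the function name and the form shape are hypothetical, and it assumes that the 256-character allowance covers the return URL and fragment in their encoded form.

```javascript
const URL_LENGTH_LIMIT = 2048;  // Internet Explorer limit cited in this document
const RETURN_URL_BUDGET = 256;  // allowance for the return URL plus #fragment

// form: { startUrl, stateContext, credentialId, postBack }
function fitsInGet(form) {
  let length = form.startUrl.length;
  length += "&_cx=".length + encodeURIComponent(form.stateContext).length;
  length += "&_id=".length + encodeURIComponent(form.credentialId).length;
  if (form.postBack) {
    length += "&_pb=".length + encodeURIComponent(form.postBack).length;
  }
  // Space reserved for what the client adds: _rt, _hf, and _ps=true.
  length += "&_rt=".length + "&_hf=".length + RETURN_URL_BUDGET + "&_ps=true".length;
  return length <= URL_LENGTH_LIMIT;
}
```

If fitsInGet returns false, the server would specify PostData so that capable clients use POST.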

Depending on how the web authentication protocol works, and whether the Forms Server completely consumes the authentication response when first presented during the webview rendering steps, the webview start page may need to be pre-configured to have a whitelist of trusted return URLs for _rt.
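A check of _rt against such a list might be sketched as follows in JavaScript, using the standard URL class. The function name and the exact-match policy (same origin and same path, HTTPS only) are assumptions chosen for the example, not requirements of the contract.

```javascript
// trustedUrls: pre-configured absolute URLs of known logon pages.
function isTrustedReturnUrl(returnUrl, trustedUrls) {
  let candidate;
  try {
    candidate = new URL(returnUrl);
  } catch (err) {
    return false; // not an absolute, well-formed URL
  }
  // Assumption: only HTTPS return URLs are acceptable.
  if (candidate.protocol !== "https:") {
    return false;
  }
  return trustedUrls.some((entry) => {
    const trusted = new URL(entry);
    return candidate.origin === trusted.origin && candidate.pathname === trusted.pathname;
  });
}
```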

Native clients:

Parameter         Value
[optional] _ri    Base URI to set as window.location target

Webview completion action

This is the action that must be invoked by the webview completion page, to restore state information where necessary and to supply the webview response value to the Forms Client. The completion action is determined by the parameters added to the webview start request. If no parameters are added, the native client case is assumed.

The completion actions are illustrated using the following values:

  • _cx=foo
  • _id=bar
  • _rt=url
  • _hf=frag
  • _ri=uri
  • _pb=pburl
  • Credential value=blah

Browser clients (_cx was supplied):

  • If _ps was false or not present:

    • Set window.location to url#resumeForms:_cx=foo&_hf=frag&_pb=pburl&bar=blah.
  • If _ps was true:

    • Cause auto-post to url with _cx=foo&_hf=frag&_pb=pburl&bar=blah as the body.

Native clients:

  • If _ri was not specified:

    • Call window.external.citrixExitWebview('blah').
  • If _ri was specified:

    • Set window.location to uri#resumeForms:_result=blah.

If using the URI intercept mechanism, a native client has the freedom to use a custom protocol scheme, or to specify a base URL with a query parameter name appended etc. It is expected that a native client is capable of maintaining the Forms conversation state by itself.
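The selection rules above can be summarized in a JavaScript sketch that computes which action a completion page should take. The function name and the returned descriptor are hypothetical. Percent-encoding of names and values, and omitting _hf and _pb when they were not supplied, are assumptions that the contract text does not spell out.

```javascript
// startParams: the parameters recovered from the start request,
// e.g. { _cx, _id, _rt, _ps, _hf, _pb } or { _ri } or {}.
function completionAction(startParams, credentialValue) {
  const { _cx, _id, _rt, _ps, _hf, _pb, _ri } = startParams;
  if (_cx !== undefined) {
    // Browser client: return state and credential to the return URL.
    const fields = [["_cx", _cx]];
    if (_hf !== undefined) fields.push(["_hf", _hf]);
    if (_pb !== undefined) fields.push(["_pb", _pb]);
    fields.push([_id, credentialValue]);
    const data = fields
      .map(([name, value]) => encodeURIComponent(name) + "=" + encodeURIComponent(value))
      .join("&");
    if (_ps === "true") {
      return { kind: "post", url: _rt, body: data };
    }
    return { kind: "navigate", url: _rt + "#resumeForms:" + data };
  }
  if (_ri !== undefined) {
    // Native client using the URI intercept mechanism.
    return { kind: "navigate", url: _ri + "#resumeForms:_result=" + encodeURIComponent(credentialValue) };
  }
  // Native client default: window.external.citrixExitWebview(value).
  return { kind: "external", argument: credentialValue };
}
```

In the page itself, a navigate action would be applied by assigning url to window.location, a post action by auto-submitting a form, and an external action by calling window.external.citrixExitWebview with the argument.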

Implementation Considerations and Guidance

Design constraints

The constraints that arise from the various implications noted in the analysis section at the end of this document are as follows.

  1. Forms Servers must cope with Forms Clients where Forms conversation cookies are shared with the webview, and clients where they are not shared. This applies to browser based clients as well as native application clients.

    • I.e. web interactions with webview pages might be load balanced to a different StoreFront server than is used for Forms authentication, so the server may need to allow for transfer of a response via the webview or browser during completion.

    • Webview pages should not interfere with or depend on the cookies used by the Forms Server, or vice-versa.

  2. Forms Servers must allow for clients that only support GET on the initial request.

    • I.e. Forms Servers should use GET if possible, but must allow for URL length limits that restrict the amount of data that can be passed by some clients in the initial request. (In practice only Internet Explorer and Windows clients using the IE webview control are affected, with a path length limit of 2048 characters.)

    • If the URL length limit is too constraining, it would be appropriate to specify POST for the initial request with the understanding that Chrome apps and potentially other clients will convert this to GET. The start page must therefore cope with both verbs.

  3. Forms Servers must allow for clients that use GET to a client-specific URL as the webview completion action.

    • I.e. Forms Servers must allow for URL length limits that restrict the amount of data that can be passed to some Forms Clients as completion data. In practice only Internet Explorer is affected, with a path limit of 2048 characters but with the ability for a fragment portion that extends to 4096 characters.

    • If the URL length limit is too constraining, it will be necessary to hold completion data on the server, at least temporarily.

  4. Webview pages must not reflect client input without proper sanitization.
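Constraint 4 applies in particular to a completion page that auto-posts to the return URL, because the return URL and state values originate from the client. The JavaScript sketch below renders such a page with every reflected value HTML-escaped. The function names and the page layout are hypothetical. Escaping alone does not make an arbitrary return URL safe, so the URL is assumed to have been validated separately.

```javascript
// Escapes the characters that are significant in HTML text and attributes.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// fields: { name: value, ... } to send as the post body.
// Assumption: returnUrl has already been checked against trusted URLs.
function renderAutoPostPage(returnUrl, fields) {
  const inputs = Object.entries(fields).map(
    ([name, value]) =>
      '      <input type="hidden" name="' + escapeHtml(name) + '" value="' + escapeHtml(value) + '">'
  );
  return [
    "<!DOCTYPE html>",
    "<html>",
    '  <body onload="document.forms[0].submit()">',
    '    <form method="post" action="' + escapeHtml(returnUrl) + '">',
    ...inputs,
    "    </form>",
    "  </body>",
    "</html>",
  ].join("\n");
}
```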

Cancel logon from webview

There isn’t a mechanism for the webview completion page to explicitly instruct the Forms Client to cancel the logon attempt (in contrast to normal Forms, which would typically include a Cancel button for this purpose). This keeps the contract as simple as possible, and recognizes that normally it is only third-party web pages, such as the logon pages from an Identity Provider, that will be displayed to the user for interaction. The webview start and completion pages that are aware of the Forms process are normally transient pages that immediately trigger the next action without rendering any UI.

However, for cases where the webview start/completion page is designed for user interaction, the Forms Server and webview page should use a convention whereby a particular credential value (e.g. blank) indicates cancellation, and the completion page can then show a button or other UI element that triggers completion with that response. The user may also be able to trigger cancel directly by closing the authentication dialog, in the case of some native clients.

Logon SPA Restart

When the Forms Client is a browser-based app, exiting the webview must be done so that the browser ends up navigating back to the web UI address that will reload the Forms Client’s logon page and restore its URL #fragment value. Directing the browser window to an appropriate URL is the only available exit action.

In practice the Logon SPA for Citrix servers is an HTML5 offline page so will be restarted with GET, with the resumption information packed in the URL ready to be found by the JS Forms Client code. Internet Explorer imposes an overall path size limit of about 4KB when using the #fragment part of the URL to hold the resumption data.
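On the receiving side, the JS Forms Client code has to recognize and unpack the resumption data when the logon page reloads. A minimal JavaScript sketch, assuming the values were percent-encoded by the completion page, is shown below. The function name and the returned object shape are hypothetical.

```javascript
const RESUME_PREFIX = "#resumeForms:";
const RESERVED_NAMES = new Set(["_cx", "_hf", "_pb"]);

// hash: the value of window.location.hash after the logon page reloads.
// Returns null when the page was loaded normally rather than resumed.
function parseResumeFragment(hash) {
  if (!hash.startsWith(RESUME_PREFIX)) {
    return null;
  }
  const fields = new URLSearchParams(hash.slice(RESUME_PREFIX.length));
  const stateContext = fields.get("_cx");
  if (stateContext === null) {
    return null;
  }
  const resumed = {
    stateContext: stateContext,
    hashFragment: fields.get("_hf"), // original #fragment to restore, or null
    postBack: fields.get("_pb"),     // PostBack URL if it was passed, or null
    credentials: new Map(),
  };
  for (const [name, value] of fields) {
    // Any other field is a credential ID with its response value.
    if (!RESERVED_NAMES.has(name)) {
      resumed.credentials.set(name, value);
    }
  }
  return resumed;
}
```

The Logon SPA would then restore its own #fragment from hashFragment and submit the credential values, together with the state context, to the Forms Server.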

The following example assumes the submit and cancel URLs for the Forms Server are proxied by the Web UI Server using static endpoints, which can be read from the Web UI Server configuration. This corresponds to the case where the Web UI Server and Forms Server are simply different endpoints on the same server.

SAML sequence diagram

Error handling and federated logout

There are some considerations that Forms Servers need to bear in mind regarding error handling and logout, when the webview credential is being used to support a federated authentication protocol like SAML 2.0.

If the login to an IdP itself fails, this will be invisible to the Forms Server, and probably not detectable by the Forms Client, even if the client can monitor activity inside the webview control. In the browser case, the user will be left on the IdP site with whatever UI is displayed in that circumstance, which is safe. In the native app case, the user may be given a means to externally close the webview control which would trigger cancel for the Forms conversation. As IdP logon has not occurred in either case, no further cleanup is required. This does mean that if the Forms Server wanted to offer alternatives to the IdP for logon, then this needs to be done using a selection step that comes before the webview form.

If a SAML token is issued it means that IdP logon was successful; however the token may still be rejected by the service provider for many possible reasons, e.g. because signature verification fails, or the token is improperly constructed, or the corresponding service provider account does not exist or is locked out, or the account has logon time restrictions that require access be denied at that time. In these cases the Forms Server must consider carefully how to respond, since there is a valid IdP session which may allow access to other resources, or potentially to the Forms Server itself if the failure reason is transient (as in the out-of-hours case).

A similar circumstance arises when login to the service provider (i.e. the Forms Server) has succeeded and the user later logs out of this server. The IdP session may still exist, in which case it may allow restarting the service provider session by silently issuing a new federation token. (The service provider could be configured to request that the IdP enforce a fresh logon, but that is an unusual configuration since it defeats SSO, which is one of the main benefits of using an IdP.)

In both circumstances, the requirement is to ensure that the user is aware of how things have been left, and to offer the ability (or explain how) to ensure the device is fully logged out before the user leaves.

The appropriate behavior is different for a native client, where the IdP session is private to the client app, versus the browser case where the session is potentially available for SSO to other service providers. The Forms Server can distinguish these cases by tracking whether the “return URL” completion action mechanism was used (which has to be used by browser clients). This requires the completion page to modify the state context parameter, e.g. to add a marker which will be inspected by the Forms Server. (That is safe because the Forms Server and the webview start/completion pages need to be coordinated anyway.)

In the first case it would be appropriate to terminate the IdP session or at least destroy the client’s copy of the session cookies. This could be achieved by triggering a second webview step to initiate the IdP explicit logout flow, or by ensuring all of the webview control cookies are deleted. The latter mechanism would require the Forms Client to be aware of a distinct error code, and even then it might not be possible to easily clear cookies without the client app itself exiting. It would therefore be preferable for the Forms Server to drive the logout step directly. This would also be appropriate for a scenario where the Forms Server is configured to offer other authentication options if the federated method failed.

In the second case, it might be inappropriate for the Forms Server to terminate the IdP session, as this will stop SSO to other apps, which may be exactly what the user is expecting. Therefore the user interface should notify the user that they are not fully logged out, with instructions on how the user can ensure that full logout is achieved. Where possible, this message page should include a link that will trigger logout from the IdP (and other service provider sessions that were linked to the IdP session); otherwise it should state that the user is not logged out until they close the browser, or even log out of the endpoint or reboot it. (It is not uncommon now for browsers to have session restore features that preserve session cookies across a browser restart.)
