HTML5 Mastery: Web Security

Security is a topic that tends to come up only from time to time. It is rarely treated as an issue from the beginning, but once something bad happens it is usually the first thing to be blamed. Software is complex, the humans programming the machine are far from perfect, and users may not follow best practices either. So how can we create a secure system?

The web is one of the most insecure places possible. Computers with potential security risks connected to each other. Servers that can receive arbitrary data. Clients that execute code from unknown sources. While we cannot control the security of the servers, we have to do something to protect the clients. Even though JavaScript could be considered a secure scripting language, the code for any ActiveX, Flash or Silverlight plugins is definitely not. Furthermore, even though JavaScript itself is sandboxed, it might be used in such a way that the user will trigger insecure actions.

In this tutorial we will see the web security model in action. We will go into best practices and general guidelines for building secure web applications. We’ll learn what the cross-origin resource sharing policy is and how we can control it. Finally we will also talk about sandboxing (external) content.

Security Guidelines

One of the most important guidelines has nothing to do with HTML directly: use HTTPS! The relation to HTML lies, of course, in the distribution of our hypertext documents. Nevertheless, we have to realize that using HTTPS for the transport of our document and using HTTPS for its resources are two different things. We should always check that every embedded resource really uses the https:// scheme.
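Such a check can be automated to some degree. The following is only a rough sketch: the helper name findInsecureResources and the list of tags are our own assumptions, and a regular expression is merely an approximation of real HTML parsing.

```javascript
// Hypothetical helper: collect URLs of embedded resources that still use http://.
// Only a few resource-loading tags are considered; plain <a> links are ignored.
function findInsecureResources(html) {
  const pattern = /<(?:img|script|iframe|link|audio|video|source|embed)\b[^>]*?\b(?:src|href)\s*=\s*["'](http:\/\/[^"']+)["']/gi;
  const insecure = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    insecure.push(match[1]); // the captured URL
  }
  return insecure;
}
```

Running it over the markup of a page before deployment gives a quick list of candidates to switch to https://.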

Another important guideline is related to user-defined content. Once we allow users to enter data in a form, we need to be careful. Not only do we need to make sure that the web server is protected against common attacks, such as SQL injection, we also need to make sure that stored data is not used within executable code without caution. For instance, we should escape strings so that they cannot contain any HTML markup. HTML alone is not malicious, but it can trigger script execution or resource fetching. One way to allow users to write HTML that is placed on the output page without modification is to white-list certain tags and attributes. All other elements are escaped.
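Escaping can be sketched in a few lines. The function name escapeHtml is our own choice, and the sketch covers only the plain escaping case; white-listing tags and attributes needs a real HTML parser and is not shown here.

```javascript
// Characters that carry meaning in HTML, mapped to their entity form.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

// Replace every special character so the text is displayed, not interpreted.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}
```

User input passed through such a function is rendered literally, so an injected <script> or onerror attribute ends up as visible text.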

Our JavaScript code should also minimize its exposure to, and trust in, third-party libraries. Of course we use the immediately invoked function expression (IIFE) to prevent the global context from being polluted; however, another reason is to avoid leaking (possibly crucial) internal state, which could then be changed by other scripts, deliberately or by accident.
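As a small illustration, consider a counter whose state should not be reachable from outside. The name counter and its methods are purely hypothetical.

```javascript
// The IIFE runs once; only the frozen object it returns is visible outside.
const counter = (function () {
  'use strict';
  let count = 0; // private state, not reachable by other scripts

  return Object.freeze({
    increment() {
      count += 1;
      return count;
    },
    current() {
      return count;
    }
  });
})();
```

Other scripts can call counter.increment(), but they can neither read nor overwrite count directly, and freezing the returned object keeps them from replacing its methods.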

For our own code it is certainly good practice to rely on 'use strict'; and the benefits it brings. Nevertheless, constraining the running script does not prevent us from using APIs with potentially corrupted data. Cookies and content in the localStorage may be changed or viewed by the user or by other programs, depending on conditions that we cannot influence. We should therefore always implement some sanity checks that help us detect integrity flaws as early as possible.
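A sanity check of this kind could look as follows. This is a sketch under assumptions: the settings object with a theme and a fontSize, the allowed ranges, and the function name parseStoredSettings are all invented for the example.

```javascript
// Hypothetical defaults used whenever the stored data cannot be trusted.
const DEFAULT_SETTINGS = Object.freeze({ theme: 'light', fontSize: 14 });

// Parse a string read from storage (e.g. localStorage.getItem('settings'))
// and accept only values that match what we would have written ourselves.
function parseStoredSettings(raw) {
  'use strict';
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    return { ...DEFAULT_SETTINGS }; // not even valid JSON
  }
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    return { ...DEFAULT_SETTINGS };
  }
  const theme = parsed.theme === 'dark' || parsed.theme === 'light'
    ? parsed.theme
    : DEFAULT_SETTINGS.theme;
  const fontSize = Number.isInteger(parsed.fontSize) && parsed.fontSize >= 8 && parsed.fontSize <= 72
    ? parsed.fontSize
    : DEFAULT_SETTINGS.fontSize;
  return { theme, fontSize };
}
```

The important design choice is that nothing from storage is passed on unchecked; unknown keys are dropped and invalid values fall back to defaults.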

Finally we should make sure to only use trusted third-party resources. Using scripts from other servers within our site makes it possible to mutate the page or breach the privacy of our users.

Cross-Origin Resource Sharing

The concept of cross-origin resource sharing (CORS) is simple. The browser does not allow the embedding of special resources from different origins unless the origins explicitly allow it. Special resources can be, e.g., web fonts or anything requested via an XMLHttpRequest. Cross-origin AJAX requests are forbidden by default because of their ability to perform advanced requests that introduce many scripting security issues.

The origin is basically defined by the combination of protocol, host and port. Therefore http://a.com is different from https://a.com, which is different from https://a.com:8080. They are all different from http://b.com.
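The comparison can be reproduced with the standard URL object, which exposes the origin of a parsed address. The helper names originOf and isSameOrigin are our own.

```javascript
// The origin property combines protocol, host and port
// (default ports such as 80 for http are left out).
function originOf(url) {
  return new URL(url).origin;
}

// Two addresses share an origin only if all three parts match.
function isSameOrigin(a, b) {
  return originOf(a) === originOf(b);
}
```

With these helpers, isSameOrigin('http://a.com', 'https://a.com') is false because the protocol differs, while two paths on http://a.com compare as equal.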

Clients can be allowed to use resources by including a certain header in the response. The browser then determines if the current website is allowed to use the resource or not. The origin is usually determined via the domain of the current website.

Let’s have a look at an illustrative example. In the following we assume that our page is located at foo.com. We request JSON data from a page hosted at bar.com. For the JSON request we use the XMLHttpRequest as shown below.
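A minimal sketch of such a request could look like this. The helper name requestJson, its callbacks, and the optional constructor argument (which only exists to make the function testable outside a browser) are assumptions for illustration; the URL http://bar.com/api is the one used in this example.

```javascript
// Request JSON from another origin. In a browser the last argument can be omitted.
function requestJson(url, onSuccess, onError, XhrCtor = globalThis.XMLHttpRequest) {
  const xhr = new XhrCtor();
  xhr.open('GET', url, true); // asynchronous GET request

  xhr.onload = function () {
    if (xhr.status < 200 || xhr.status >= 300) {
      onError(new Error('Unexpected HTTP status ' + xhr.status));
      return;
    }
    let data;
    try {
      data = JSON.parse(xhr.responseText);
    } catch (err) {
      onError(err); // the response was not valid JSON
      return;
    }
    onSuccess(data);
  };

  xhr.onerror = function () {
    // Also reached when the browser blocks a response without CORS headers.
    onError(new Error('Network or CORS failure'));
  };

  xhr.send();
  return xhr;
}

// Usage in a page served from foo.com:
// requestJson('http://bar.com/api', (data) => console.log(data), (err) => console.error(err));
```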

The browser already anticipates the possibility of a CORS-enabled response by adding the Origin header to the request. For a page served from http://foo.com, the header would read Origin: http://foo.com.

Now the server has to deliver the right response. Not only do we want the correct JSON to be transported, but even more importantly we require CORS-specific headers. It is possible to use wildcards. For instance, a response carrying the header Access-Control-Allow-Origin: * grants the right to use the requested resource to any origin.
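How a server might decide on this header can be sketched as a plain function. The function name corsHeaders and the idea of passing in a list of allowed origins are assumptions; only the Origin and Access-Control-Allow-Origin headers are part of CORS itself.

```javascript
// Decide which CORS headers to add to a response.
// requestOrigin: value of the request's Origin header (may be undefined)
// allowedOrigins: either the wildcard '*' or an array of origins we trust
function corsHeaders(requestOrigin, allowedOrigins) {
  if (allowedOrigins === '*') {
    return { 'Access-Control-Allow-Origin': '*' };
  }
  if (requestOrigin && allowedOrigins.includes(requestOrigin)) {
    return {
      'Access-Control-Allow-Origin': requestOrigin,
      'Vary': 'Origin' // the response differs per origin, so caches must know
    };
  }
  return {}; // no CORS headers: the browser will refuse to expose the response
}
```

Echoing back a single trusted origin is more restrictive than the wildcard and is usually the better choice when only known sites should read the data.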

CORS can also be used as an alternative to the JSONP workaround. JSONP uses scripts to make cross-origin AJAX requests resulting in JSON responses. Before CORS, cross-domain calls were prohibited in general, but including scripts from different domains was always acceptable. In most APIs a JSONP response was triggered by providing a special query parameter, naming the callback function.

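Suppose the call to http://bar.com/api results in a JSON response containing a single object.

The JSONP call to, e.g., http://bar.com/api?jsonp=setResult would instead give us a script consisting of a single call to setResult, with that same object as its argument.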
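The wrapping done by the server can be sketched as follows. The payload with a single value property is a made-up stand-in, and the function name toJsonp is hypothetical; the callback name setResult is the one from the example call.

```javascript
// Wrap JSON data in a call to the callback named by the query parameter.
function toJsonp(callbackName, data) {
  // Only accept plain identifiers; otherwise the parameter could inject code.
  if (!/^[A-Za-z_$][\w$]*$/.test(callbackName)) {
    throw new Error('Invalid callback name');
  }
  return callbackName + '(' + JSON.stringify(data) + ');';
}

// Hypothetical payload standing in for the real API response.
const jsonpBody = toJsonp('setResult', { value: 42 });
// jsonpBody is now the script text: setResult({"value":42});
```

On the client, a global function setResult has to exist before the script element pointing at the JSONP URL is added to the page.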

Since the JSONP response is loaded by a <script> element, the GET request method is implied. There is no way to use a different one. CORS gives us much more freedom in that area: we can also choose other request methods and set further request parameters, such as headers. This is all possible because we can freely use the standardized XMLHttpRequest object.

An ideal solution would only fall back to JSONP for legacy browsers, while embracing CORS for more recent ones. This will prevent many cross-site scripting (XSS) issues originating from compromised external sites. Embedding scripts from external pages has always been a risky business. An even better fallback would be to relay the JSON request through our own server to the target machine. This way we talk to our (trusted) server, which gets the response from the target, checks its sanity and returns the (valid) result.

Sandboxing Flags

Every document comes with its own window. Accessing this window is mostly proxied to the window of the current browsing context, which drives the tab that we see. The browsing context is created with several options, such as the parent context, the creator, and the initial page. Along with these options, security flags are set. The flags set up the capabilities and restrictions of the context. In effect it is possible to prevent certain behavior, such as running scripts or opening new tabs.

How can we set the security flags of a new context? We set them via attributes on the elements that create a new context. Currently only the iframe element has such an attribute, even though standard frames would also fit that description. However, standard frames are considered deprecated and are therefore not well supported. Even though the HTML5 standard mentions them, they should not be used anymore.

There are a bunch of flags available. The most important ones are:

  • allow-top-navigation (allows changing the top context)
  • allow-plugins (enables embed, object, …)
  • allow-same-origin (content from the same origin may be accessed)
  • allow-forms (forms can be submitted)
  • allow-popups (popups / new contexts won’t be blocked)
  • allow-pointer-lock (enables the pointer-lock API)
  • allow-scripts (allows script execution)

The <iframe> uses the sandbox attribute to enter the sandboxed mode. Without this attribute, the frame behaves as we know it and none of these restrictions apply. With an empty sandbox attribute, all restrictions apply. Each of the flags listed above then re-enables one specific feature.

Let’s take a closer look at the allow-same-origin flag. By default, the policy for an iframe is quite relaxed. If we do not specify the sandbox attribute, then embedded pages from the same domain are allowed to read, e.g., cookies or the browser’s local storage for the given domain, while pages from other domains are not. Of course other risks exist as well, which is why we usually want to supply the sandbox attribute.

The following picture shows the default behavior in a simple diagram. While the second iframe is allowed to access the previously stated content, the first one isn’t. The reason is the different domain, highlighted in yellow.

[Figure: IFrame Standard Origin]

So how does this picture change with the sandbox attribute? Content from a different domain is still prohibited. Therefore we only look at the additional possibilities for content from the same origin. In a sandbox without the allow-same-origin flag, even content from the same origin is treated as if it came from a different origin.

[Figure: IFrame Sandbox Origin]

There are more features that can be controlled or set for contexts, but these are unfortunately transported via their own attributes. A great example is the allowfullscreen attribute. Again it is only available on an iframe. In principle it allows applications to go into full-screen mode.

Furthermore, we should note the seamless attribute for an <iframe> element. This attribute applies the styling of the parent document to the contained document, although browser support for it has been poor and it was later dropped from the HTML standard. It was designed to work in conjunction with the srcdoc attribute, which allows us to provide the source of an iframe directly, without requiring HTTP requests or using a data URI. Combined with the sandbox attribute, this lets us display user content very easily without worrying about script execution.

An example to display user content in a sandboxed environment is shown below.
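One way to produce such markup is sketched here as a string-building function. The names sandboxedFrame, escapeAttribute and KNOWN_FLAGS are assumptions; the flags are the ones listed earlier, and only the sandbox and srcdoc attributes are used.

```javascript
// Flags from the list above; anything else is rejected as a typo.
const KNOWN_FLAGS = [
  'allow-top-navigation', 'allow-plugins', 'allow-same-origin', 'allow-forms',
  'allow-popups', 'allow-pointer-lock', 'allow-scripts'
];

// Inside a double-quoted attribute only & and " need escaping.
function escapeAttribute(value) {
  return String(value).replace(/&/g, '&amp;').replace(/"/g, '&quot;');
}

// Build an iframe that renders userHtml with all restrictions active,
// except for the features explicitly re-enabled through flags.
function sandboxedFrame(userHtml, flags = []) {
  for (const flag of flags) {
    if (!KNOWN_FLAGS.includes(flag)) {
      throw new Error('Unknown sandbox flag: ' + flag);
    }
  }
  return '<iframe sandbox="' + flags.join(' ') + '" srcdoc="' +
    escapeAttribute(userHtml) + '"></iframe>';
}
```

Called without flags, the function yields an empty sandbox attribute, so scripts inside the user content are not executed.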

Sandboxed iframes can be used for many tasks. For instance, if we want to evaluate scripts safely, we could pass the content to evaluate to a special handler sitting inside an iframe. The handler would call the eval function. The inline frame is sandboxed to only allow scripting, nothing else. No popups can be opened, navigation cannot be used, and our whole DOM is separated.

The exact implementation also requires HTML5 messaging between documents. We send the evaluation request with the postMessage method and listen for the response via the message event.
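The handler inside the frame could be sketched as follows, assuming the frame was created with sandbox="allow-scripts". The function name handleEvalRequest and the shape of the reply object are our own choices; postMessage and the message event are the standard messaging API.

```javascript
// Runs inside the sandboxed frame: evaluate the received code and reply.
// The evaluate parameter defaults to an indirect (global-scope) eval.
function handleEvalRequest(event, evaluate = (code) => (0, eval)(code)) {
  let reply;
  try {
    reply = { ok: true, result: String(evaluate(event.data)) };
  } catch (err) {
    reply = { ok: false, error: String(err) };
  }
  // Answer only the window that asked, addressed to its origin.
  event.source.postMessage(reply, event.origin);
  return reply;
}

// Register the handler when running in a browser frame.
if (typeof window !== 'undefined' && typeof window.addEventListener === 'function') {
  window.addEventListener('message', (event) => handleEvalRequest(event));
}

// In the parent page (frame is the iframe element):
// frame.contentWindow.postMessage('1 + 2', '*');
// window.addEventListener('message', (e) => console.log(e.data));
```

The parent should additionally check that a received message really comes from its own frame, for instance by comparing the event's source with frame.contentWindow.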

We can also restrict what the current page itself may load and execute by using a content security policy (CSP). This policy is usually transported via the page’s HTTP headers, but HTML gives us the ability to set it in a <meta> tag.

An example looks as follows. We need to use the right http-equiv value. The content is a series of definitions specific for the page.
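As a sketch, the tag can be generated from a small description of the policy. The helper names buildCsp and cspMetaTag and the chosen sources (including cdn.example.com) are assumptions for illustration; Content-Security-Policy is the http-equiv value, and default-src, img-src and script-src are standard definitions.

```javascript
// Turn { name: [sources] } into the text of a policy.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => name + ' ' + sources.join(' '))
    .join('; ');
}

// Keywords such as 'self' use single quotes, so the attribute can use double quotes.
function cspMetaTag(directives) {
  return '<meta http-equiv="Content-Security-Policy" content="' +
    buildCsp(directives) + '">';
}

// Hypothetical policy: same-origin by default, images from anywhere,
// scripts from our own origin and one external host.
const exampleTag = cspMetaTag({
  'default-src': ["'self'"],
  'img-src': ['*'],
  'script-src': ["'self'", 'https://cdn.example.com']
});
```

The resulting tag belongs in the head of the document, before any resource that the policy should govern.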

The different definitions (called directives) also accept wildcards. The exact purpose and syntax depend strongly on the directive being used.

Conclusion

Web security is possible, even though it is hard to achieve and depends on a lot of (often impossible to control) external parameters. If we can minimize these external parameters, such as inserted content or scripts, then we are usually in good shape. Most attack vectors can only be used from scripts.

We have seen that (inline) frames and hyperlinks can be adjusted with sandboxing flags. Our own resources should only be deployed with CORS in mind.
