Preventing XSS in ASP.NET

Many website security problems come from trusting the user too much. Most users of your web application will only do what they need to do, but a curious or malicious user will often want to push the edges of access. At those edges, security holes often appear in your application. I've written about preventing two common types of vulnerabilities, SQL Injection and Cross Site Request Forgery, in ASP.NET apps before. This article looks at preventing Cross Site Scripting, a third common type of vulnerability in websites.

While a modern framework does much to make these attacks more difficult, I believe we should first have an understanding of the ways an app is vulnerable to an attack. First, let's look at what Cross Site Scripting is and how it can be exploited.

What Is Cross Site Scripting

Cross Site Scripting (often abbreviated as XSS) allows the injection of malicious scripts into an otherwise trusted website. This injection happens without the user's knowledge. The injected script is executed as though it came from the original website. The malicious script can thus access any resources of the hosted website the user would have access to, such as cookies or session tokens.

The opening for a cross site scripting attack comes when a web application displays input from users or outside resources without properly validating or encoding it. In most cross site scripting attacks, the attacker attempts to inject JavaScript into the webpage of a trusted server. The attacker can also attempt to inject HTML, Flash or anything else the browser will execute. No matter the script, the goal remains to get the browser to execute code of the attacker's choice.

A Persisted XSS Attack

There are three categories of cross site scripting attacks, divided by the method of injection and the method of preventing the attack. In the first type of attack, the script is permanently stored on the target server, and thus it is called a persisted cross site scripting attack. This attack attempts to embed the malicious script into something such as a forum post stored in a database, or a seemingly benign field such as the home page entry in a user's profile. With the script persisted, every visitor to the site who views the post, message, or otherwise compromised item becomes a potential victim of the attack.

Attackers attempting this type of attack generally target comment fields, forums, social media and other fields where somewhat arbitrary end user input is expected and is a normal part of the application. The attacker can include the script in a forum post in an otherwise valid part of a conversation. Each time someone views the post, the script will be executed.

A Reflected XSS Attack

In the second type of cross site scripting attack, known as reflected cross site scripting, the attacker delivers the injected script to the vulnerable site so that it is immediately returned to the user. Common methods of doing this target pages where user input becomes part of the output of a page. A search page that displays the search terms back to the user could provide an outlet for this attack. Unlike a persisted attack, the injected script in the user's input is never stored by the web application.

DOM Based Attacks

The third type of cross site scripting attack occurs entirely in the browser. It works by manipulating the browser's internal model of the webpage, known as the DOM, so these are referred to as DOM based attacks. They again allow the attacker to execute malicious code, but here content returned by the server is manipulated into executable JavaScript by the webpage itself.

Ultimately, a cross site scripting attack is a cross site scripting attack, no matter how it is delivered. Since the injected code comes from an otherwise trusted server, it can often execute under the site's permissions. It therefore can act as though it were native code on the website.

A successful cross site scripting attack could allow access to cookies on a webpage. These cookies can contain sensitive information including session identifiers that would allow the attacker to impersonate the attacked user. The attack can also change HTML content on a page to display a fake login form and steal the user's login credentials. The attacker could examine and send any content of the page allowing capture of sensitive information such as account numbers. A more advanced attack could, in effect, install a key logger sending any information entered into the webpage to an attacker.

Protecting From Cross Site Scripting Attacks

Mitigating cross site scripting requires not trusting any input from a user or any other external source. The web application must treat this data as potentially dangerous, no matter the source. Let's look at a few methods specific to ASP.NET to prevent these attacks using components built into the framework and freely available libraries.

Validate All Input

The web application should validate any input before using it, just as with other injection attacks such as SQL Injection. Preferably, the application validates this input against a white list of acceptable values. The validation removes any unexpected components of the input or replaces them with an encoded value. A blacklisting method, which only removes a list of known unwanted characters, can be used, but is more vulnerable to new attack methods.

If we know a value should always be an integer, we can validate the input using code such as:

int memberId;
if (!int.TryParse(externalValue, out memberId)) {
   return RedirectToAction("InputError");
}

If the framework cannot parse the previously retrieved externalValue as an integer, the code redirects to a page that would display an error. Otherwise we know that memberId contains an integer value. This process also works with other basic types. Some more common types also provide methods to validate the information. The .NET Uri class contains a method IsWellFormedUriString that can validate a URL. This would allow validation that a user's homepage entry contains a valid URL before display.

var userHomePage = userRecord["homepage"];
if (!Uri.IsWellFormedUriString(userHomePage, UriKind.Absolute))
{
  Model.homepage = "none";
}
else
{
  Model.homepage = Html.Encode(userHomePage);
}
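
One caveat with this check: a well-formed absolute URI is not limited to http and https, so a value using another scheme, such as javascript:, may still pass. A minimal sketch of a stricter check follows, assuming only http and https links are wanted. The HomepageValidation class and IsSafeHomepage method are hypothetical names for illustration, not part of the framework.

```csharp
using System;

public static class HomepageValidation
{
    // Hypothetical helper: accepts only absolute URLs that use http or https.
    public static bool IsSafeHomepage(string candidate)
    {
        Uri parsed;

        // TryCreate returns false for null, empty, or malformed input.
        if (!Uri.TryCreate(candidate, UriKind.Absolute, out parsed))
        {
            return false;
        }

        // Reject every scheme except the two we expect for a homepage link.
        return parsed.Scheme == Uri.UriSchemeHttp
            || parsed.Scheme == Uri.UriSchemeHttps;
    }
}
```

A value that fails this check can be handled the same way as above, by replacing it with a placeholder such as "none". The value should still be HTML encoded before display.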

Other and more complex data types need more complex validation. Validation of a credit card number field could remove any characters in the string that are not digits. Validation of more complex strings could need regular expressions. Validation of a class may need more complex checks as well.
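
As a rough sketch of these two ideas, the code below strips non-digit characters from a card number entry and checks a string against a regular expression. The InputCleaning class, its method names, and the user name rule (3 to 20 ASCII letters, digits, or underscores) are assumptions made for illustration; a real application would choose rules that match its own data.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class InputCleaning
{
    // Hypothetical rule: 3 to 20 ASCII letters, digits, or underscores.
    // \A and \z anchor the match to the whole string.
    private static readonly Regex UserNamePattern =
        new Regex(@"\A[A-Za-z0-9_]{3,20}\z");

    // Keeps only the ASCII digits 0-9 and drops everything else.
    public static string DigitsOnly(string input)
    {
        if (input == null)
        {
            return string.Empty;
        }
        return new string(input.Where(c => c >= '0' && c <= '9').ToArray());
    }

    // White list check: the whole string must match the allowed pattern.
    public static bool IsValidUserName(string input)
    {
        return input != null && UserNamePattern.IsMatch(input);
    }
}
```

Both methods follow the white list approach: they describe what is allowed and discard or reject everything else, rather than listing known bad characters.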

ASP.NET Request Validation

ASP.NET provides effective protection against reflected attacks using request validation. If ASP.NET detects markup or code in a request, it throws an HttpRequestValidationException with a message such as "A potentially dangerous Request.Form value was detected from the client" and stops the processing of the request.

While this protection is valuable, there are times that you need to allow these values in a request. A common example comes in allowing rich text input in a form. In these cases, unfortunately, request validation is too often turned off for the entire site. A better solution turns off this validation only where needed. In earlier versions of ASP.NET, adding validateRequest="false" to the Page directive in Web Forms would turn the validation off for a page. In ASP.NET MVC, adding the [ValidateInput(false)] attribute to a controller action turns off validation for that action, while adding the [AllowHtml] attribute to a model property turns off validation for that property only.

ASP.NET 4.0 changed request validation in several ways. This and later versions of the framework do validation early in the HTTP request. The validation also applies to all ASP.NET requests and not just .aspx page requests. This includes custom HTTP modules too. Pages that rely on the original behavior can revert to the older method by setting the requestValidationMode attribute in the web.config file to version 2.0.

<httpRuntime requestValidationMode="2.0" />

Even better is to disable this only for the pages where it is needed, using this syntax in the web.config file:

<location path="novalidationpage.aspx">
  <system.web>
    <httpRuntime requestValidationMode="2.0" />
  </system.web>
</location>

ASP.NET 4.5 added the ability to defer validation until the data is actually requested. Setting the requestValidationMode attribute in your web.config file to version 4.5 activates this new behavior.

<httpRuntime requestValidationMode="4.5" />

ASP.NET 4.5 also added the HttpRequest.Unvalidated property. Using this property allows easier access to the unvalidated form value where needed. By combining delayed validation and the Unvalidated property, you can access the unvalidated values when needed, but protect other form inputs.

Encoding HTML

Before displaying outside data on a webpage, that data should be HTML encoded so the browser displays it as text instead of processing it as markup. As an example, take an ASP.NET page written so a message can be passed for display, such as a status update. An application could use this page to show the user that their account had been created without errors. The URL for this page would normally look similar to http://appname/placeorder/Account+Created. The resulting page shows the message to the user with a field, such as:

<%= Html.Label("Message", Model.message) %>

... and displays the message from the URL on the page.

If we change the URL call to http://appname/placeorder/<script>alert('hello!');</script>, we now get something different: the browser executes the script and shows an alert box.

The script could be anything, of course, and not just the harmless alert box that appears here. Request Validation would catch the above example and throw an exception before display. If it is turned off, though, encoding the output prevents the attack.

ASP.NET makes it easy to encode data in order to prevent attacks. Early versions of MVC using the Web Forms view syntax often contained code such as this, which did not encode HTML:

<p id="status"><%= status %></p>

You had to manually encode the output so that any HTML would be converted into a display format. For example, the < character becomes the string &lt;. The Html.Encode function provides this conversion. The safer form of code thus becomes:

<p id="status"><%= Html.Encode(status) %></p>
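
To see the conversion outside of a view, here is a small sketch that encodes the script from the earlier URL example. It uses System.Net.WebUtility.HtmlEncode, a framework method that performs HTML encoding without needing an MVC view, in place of the Html.Encode helper. The EncodingDemo class and EncodeStatus method are hypothetical names.

```csharp
using System;
using System.Net;

public static class EncodingDemo
{
    // Returns the HTML encoded form of an untrusted status message.
    public static string EncodeStatus(string status)
    {
        // HtmlEncode replaces markup characters such as < and > with
        // the entities &lt; and &gt;, so the browser shows them as text.
        return WebUtility.HtmlEncode(status);
    }

    public static void Main()
    {
        string attack = "<script>alert('hello!');</script>";

        // The printed value begins with &lt;script&gt; and contains
        // no raw angle brackets.
        Console.WriteLine(EncodeStatus(attack));
    }
}
```

Once encoded, the browser renders the script as visible text on the page instead of running it.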

ASP.NET MVC later introduced a syntax for doing this in one step by replacing <%= with <%: so the code can be shortened to:

<p id="status"><%: status %></p>

Using the Razor view engine, all output is HTML encoded unless you specifically use a method to not encode it. In Razor, the code equivalent to the above becomes:

<p id="status">@status</p>

Razor automatically handles HTML encoding of whatever the string status contains. In a case where you need to render the raw data, you can use the Html.Raw() method. To display the result without encoding, we can use:

<p id="status">@Html.Raw(status)</p>

In this example, the above code would make our application vulnerable again. There are few circumstances where you should not encode output. If you do disable this feature on a field, you must take extra care to ensure the data is sanitized before display. Fortunately, there is a library that helps with this, while also doing more to protect your application from cross site scripting.

AntiXSS Library

If you're writing an ASP.NET application, you should use the AntiXSS Library for ASP.NET. From the project's website, "AntiXSS provides a myriad of encoding functions for user input, including HTML, HTML attributes, XML, CSS and JavaScript."

The library contains methods focused on sanitizing outside data based on the intended use of that data. These methods use the preferred white list based approach. This means that encoded data meant for an HTML attribute can be sanitized to contain only valid data for an HTML attribute. The traditional ASP.NET HtmlEncode methods use the black listing approach, which only encodes certain potentially dangerous characters.
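
To illustrate the difference in approach, here is a simplified sketch of white list encoding. It is not the AntiXSS implementation. The WhiteListEncoding class and its small safe list are assumptions made for illustration, and the real library's safe lists are far more extensive. Every character outside the safe list is replaced with a numeric character reference.

```csharp
using System.Text;

public static class WhiteListEncoding
{
    // Hypothetical safe list: ASCII letters, digits, and a little punctuation.
    private static bool IsSafe(char c)
    {
        return (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9')
            || c == ' ' || c == '.' || c == ',' || c == '-' || c == '_';
    }

    // Encodes everything that is not on the safe list as &#NNN;.
    public static string EncodeForHtml(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return string.Empty;
        }

        var result = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (IsSafe(c))
            {
                result.Append(c);
                continue;
            }

            // Combine surrogate pairs into one code point so the reference
            // names the real character. Unpaired surrogates are not treated
            // specially in this sketch.
            int codePoint = c;
            if (char.IsSurrogatePair(input, i))
            {
                codePoint = char.ConvertToUtf32(input, i);
                i++;
            }
            result.Append("&#").Append(codePoint).Append(';');
        }
        return result.ToString();
    }
}
```

With this sketch, the input <b> comes back as &#60;b&#62;. A character the author never thought about is encoded by default, which is the advantage of a white list over a list of known dangerous characters.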

Microsoft began including core routines from this library in ASP.NET 4.5 in a new System.Web.Security.AntiXss namespace. You can also set up the framework to use these AntiXSS methods in place of the built-in encoding routines. You do this by setting the encoderType attribute of httpRuntime in the web.config file for the application:

<httpRuntime ... encoderType="System.Web.Security.AntiXss.AntiXssEncoder,System.Web, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a" />

If your application does any significant display of outside data, then the use of AntiXSS will do much to protect your application from cross site scripting. If using ASP.NET 4.5, then changing your application to use the new AntiXSS methods for default encoding provides even more protection for your web application.

In Summary

Preventing cross site scripting is harder than it initially seems. OWASP lists over 80 vectors that can be targeted using cross site scripting attacks. That organization also lists these vulnerabilities as third in their 2013 list of top ten web vulnerabilities.

If you do not ensure that all outside data brought into your application is properly escaped, or do not validate input before placing it on an output page, you leave your web application vulnerable to cross site scripting. In ASP.NET, you can guard against this by:

Validating all external input to your application before displaying it on a webpage.
Using Request Validation everywhere that your application doesn't specifically need to turn it off, such as a form allowing rich HTML input. If you must allow unvalidated information, leave validation on everywhere else in your application.
Encoding HTML before displaying external data on a webpage.
Using the AntiXSS based methods included in ASP.NET 4.5, or the AntiXSS library for older versions of ASP.NET.
