Preventing Code Injection

Often, websites seem to exist primarily to put something into a database in order to pull it out later. While other database methods, such as NoSQL, have gained popularity in recent years, data for many websites still resides in the traditional SQL database. This data often consists of valuable personal information such as credit card numbers and other personal information of interest to identity thieves and criminals. Hackers therefore always look to get this data. One of the most common targets of these attacks is the SQL databases that lie behind many web applications through a process of SQL Injection.

An injection attack works by getting the application to pass untrusted input on to the interpreter. Recently, 40,000 customer records being taken from Bell Canada are a result of an SQL Injection attack. In late 2013, hackers stole over $100,000 from a California based ISP using SQL injection.

The Open Web Application Security Project (OWASP) chose the injection attack as the number one application security risk in their 2013 top ten, based on its prevalence and the risk to the attacked web systems. Unfortunately, it also held the number one position in the previous report from 2010. So what is an SQL Injection attack? In this tutorial I'll discuss how they work and what you can do to protect your application from these attacks.

What Is an Injection Attack?

Any interpreted system behind a web server can be the target of an injection attack. The most common targets are the SQL database servers behind many websites. SQL injection doesn't directly result from weaknesses in the database, but uses openings in the application to allow the attacker to execute statements of the attacker's choice on the server. An SQL injection attack gets the database to share more information than the application is designed to provide. Let's look at an SQL database call that you could write in ASP.NET.

SqlCommand command = new SqlCommand("SELECT * FROM userdata WHERE UserId = " + id);
SqlDataReader reader = command.ExecuteReader();

What's wrong with this code? Perhaps nothing. The issue is that id string we're using. Where does that value come from? If we're generating it internally or from a trusted source, then this code might work without problems. However, if we're getting the value from a user and using without modification, we've just opened ourselves up to SQL injection.

Let's take a common case where we're getting the parameter to look up as part of the URL. Take the URL http://www.example.com/user/details?id=123. In ASP.NET using C# we can get that passed value using this code:

string id = Request.QueryString["id"];

This code combined with the call above leaves us open to an attack. If the user passes a user id such as 123 as expected, everything works fine. However, nothing we've done here ensures this is the case. Let's say the attacker attempts to access the URL http://www.example.com/user/details?id=0; SELECT * FROM userdata.

Instead of just passing a value as expected, we've given a value and then added a semicolon, which terminates an SQL statement. It then adds a second SQL statement that the attacker would like to run. In this case it would return all records in the userdata table. Depending on the rest of the code in our page, it might return an error or perhaps display every record in the database to the attacker. Even an error can be used with carefully constructed queries to build a view of the database. Worse, imagine the attacker goes to a URL of http://www.example.com/user/details?id=0; DROP TABLE userdata. Now all of your users are lost.

In this example, the attacker can run any code they want on the database server. If the account the database calls run under has full control of the database, a too common scenario, then dropping tables and deleting records makes an easy way to bring down a site. Even if the attacker can only read and write data in the database, then puling data is only a matter of patience and care.

Take a simple database named "Products" consisting of three columns. The first column holds the product id, the second holds the product name, and the third holds the product's price. For our normal query we'll attempt to find every product containing widget in the name. The SQL to do this in the same pattern we've show would look like this:

string sql =  “SELECT * FROM Products WHERE productname LIKE '%” + searchterm + “%'”;

A typical website to access this would look like http://www.example.com/product?search=widget. Here the web page will simply cycle through each returned record and display it on the screen. In this case we see our widget product.

| productid |productname | price |
| -----------
|    1      | Widget     | 100.00 |

Let's change our query to http://www.example.com/product?search=widget' OR 1=1;-- and execute the same query. see something more. In fact we'll see every record in the table.

|productid | productname | price |
|---
| 1        | Widget      | 100.00 |
| 2        | Thingy      | 50.00 |
| 3        | Boxy        | 125.00 |

We've tricked the interpreter into running SQL code of our choice. The resulting SQL executed will be:

SELECT * FROM Products WHERE productname LIKE '%widget' OR 1=1;—%'

The result will be two statements:

SELECT * FROM Products WHERE productname LIKE '%widget' OR 1=1;
—%'

By adding the 1=1, which is always true, the where clause will be true for every row in the table and the resulting query will return every row. The -- at the start of the second statement turns the rest of the SQL statement into a comment which prevents the error message we'd otherwise see.

Changes to the display cannot prevent this statement. A patient attacker can use even nothing more than the fact a value is returned or not and carefully constructed queries to slowly map out your database and possibly retrieve data even if they only see an error message. There are tools such as sqlmap to automate the process.

Mitigating SQL Injection

You prevent SQL injection by preventing untrusted input from getting to the SQL database or other interpreter. Any input from outside the system should be considered untrusted. Even data from other partner systems must be considered untrusted as you have no way to guarantee the other system does not suffer from security problems that would allow insertion of arbitrary data then passed on to your application.

Going back to our earlier example, if we know that the id parameter should always be an integer, we can attempt to convert it to an integer and show an error if this fails. An ASP.NET MVC application would do this using code similar to this:

int quantity;    
if (!int.TryParse(Request.QueryString["qty"], out quantity)) {
   return RedirectToAction("Invalid");
}

This will attempt to convert the string to an integer. If the conversion fails, the code redirects to an action that will display an invalid message.

This can be prevented if we're looking for an integer parameter. It wouldn't help if we're expecting text such as in the earlier product search. The preferred technique in this case is to use regular expressions or string replacements to allow only needed characters in the passed value.

This can be done by whitelisting, the process of removing any characters other than a specified set, or blacklisting, the process of removing an members of a specified set from the string. Whitelisting is more reliable since you specify only allowed characters. To remove anything other than letters and numbers in a string we can use code such as:

Regex regEx = new Regex("[^a-zA-Z0-9 -]");
string filteredString = regEx(originalString, "");

You will need to evaluate any input respectively. Some database queries might need more specialized attention. Take characters that can be meaningful in an SQL command but also could be a valid character in a database call. For example the single quote character ' is used to start and finish a string in SQL, but could also be part of a person's name such as O'Conner. In this case replacing the single quote ' with consecutive single quotes '' can eliminate the problem.

Stored Procedures

Stored procedures are often seen as the fix for this problem and they can be part of the solution. However a poorly written stored procedure will not save you. Take this stored procedure that includes code similar to what we've already seen to build a query:

ALTER PROCEDURE [dbo].[SearchProducts]
    @searchterm VARCHAR(50) = ''
AS
BEGIN
DECLARE @query VARCHAR(100)
SET @query = 'SELECT * FROM Products WHERE productname LIKE ''%' + @searchterm + '%''';
EXEC(@query)
END

The string concatenation here is the problem. Let's try a simple attempt where we attempt to set an always true condition to show all rows in the table and pass in the same `widget' OR 1=1;--`` as the query we saw earlier. The result is also the same:

| productid | productname | price |
| --
| 1         | Widget      | 100.00 |
| 2         |Thingy       | 50.00 |
| 3         | Boxy        | 125.00 |

If untrusted data is passed in, we have the same result as if the call were created in our code. That the string concatenation takes place inside a stored procedure instead of in our code provides no protection.

Parameterization

The next piece of the puzzle to protect from injection attacks comes in using parameterization. Building SQL queries by concatenating strings and then passing the finished code doesn't give the database any idea of what part of the string is a parameter and what is part of the command. We can help protect against attacks by creating SQL calls in a way that keeps statements and values distinct.

We can rewrite the stored procedure shown earlier to use parameters and produce a safer call. Instead of concatenating the % characters that represent wildcards, we'll create a new string adding these characters and then pass this new string as a parameter to the SQL statement. The new stored procedure looks like this:

ALTER PROCEDURE [dbo].[SearchProductsFixed]
    @searchterm NVARCHAR(50) = ''
AS
BEGIN
    DECLARE @query NVARCHAR(100)
    DECLARE @msearch NVARCHAR(55)

    SET @msearch = '%' + @searchterm + '%'
    SET @query = 'SELECT * FROM Products WHERE productname LIKE @search'
    EXEC sp_executesql @query, N'@search VARCHAR(55)', @msearch
END

Executing this stored procedure with just the word widget works as expected.

| productid | productname |price
| ----
| 1         | Widget      |100.00

And if we pass with our parameter widget'' OR 1=1;-- results in nothing being returned showing we're no longer vulnerable to this attack.

Parameterization doesn't require stored procedures. You can also take advantage of it with queries built within code. Here's a short segment in C# to connect and run a query against Microsoft SQL server using a parameter.

const string sql = "SELECT * FROM Products WHERE productname LIKE @CategoryID";

var connString = WebConfigurationManager.ConnectionStrings["ProductDatabase"].ConnectionString;

using (var conn = new SqlConnection(connString))
{
   var command = new SqlCommand(sql, conn);
   command.Parameters.Add("@searchterm", SqlDbType.NVarChar).Value = string.Format("%{0}%", searchTerm);
   command.Connection.Open();
   SqlDataReader reader = command.ExecuteReader();
   while (reader.NextResult())
   {
       // Code to execute on each line goes here
   }
}

Least Privilege

Up to this point, I've shown how to mitigate database attacks. Another important layer of defense, minimizes the damage causes in case the attacker gets past the other defenses. The concept of least privilege gives that a module of code which, in this case our database calls, should only have access to the information and resources needed for its purposes.

When a database command runs, it does so under the rights of a user account. We gain security by giving the account the database calls run under only the rights to do things it normally needs to do. If the calls from a database should only be reading data from a table, then only give the account select rights to the table and not insert or delete. If there are specific tables it does need to update, say an orders table, then give it insert and update to that table, but not other table such as one containing user information.

Canceling orders can be handled through a separate account used only on the page doing that task. This would prevent the deletion of an order from the table elsewhere more difficult. Using a separate database account for the administrative functions of the site, with the necessary greater rights than the one the public uses can do much to prevent damage from a user finding an open injection attack.

This won't prevent all attacks. It would do nothing to prevent returning extra results such as the earlier example showing the entire contents of a table. It would prevent attacks from updating or deleting data.

Conclusion

SQL injection is the most dangerous attack, especially when taking into account how vulnerable websites are to it and how much potential this type of attack has at causing a lot of damage. Here I've described SQL injection attacks and demonstrated the damage one can do. Fortunately it's not that difficult to protect your web projects from this vulnerability by following a few simple rules.

Never trust any outside data brought into your application. It should be validated against a whitelist of valid inputs before being processed any further. This can mean ensuring that an integer parameter is actually an integer or a date is a valid date value. Also validate text to only include the characters the parameter would need. For a text search you can often allow only letters and numbers and filter out punctuation that could be problematic such as the equals sign or a semicolon.

Use parameterization and avoid string concatenation when creating SQL calls. Stored procedures are not a panacea as they can be vulnerable too if simple string concatenation is used. Parameterization avoids many of the problems of string concatenation.

The code accessing the database should be run with the least privileges needed to complete the tasks needed. In few circumstances should the database calls used by the web application need to make changes to the database structure such as dropping or altering tables. You can add an extra layer of protection by running separate parts of the website under different accounts. The database account used for normal user actions likely has no reason to modify the table containing the roles or rights of users. Running the administrative parts of the site under a more privileged account and the end user section under a less privileged one can do much to mitigate the chances of code that slips through from causing other issues.

HIGHLIGHTS OF THE DAY