Escaping the Unsecure

Part of my job at WordPress VIP is code review. It takes up a lot of my time, and it’s critical work. Before code reaches our servers, it needs to be vetted and peer reviewed to confirm that it’s fast, safe, and secure. Every developer makes mistakes (even VIP developers), so we peer review everything.

An important part of this is escaping. Most of the time escaping is easy: you escape late and escape often. For example:

<a href="<?php echo esc_url( get_permalink() ); ?>">
    <?php echo esc_html( $title ); ?>
</a>

This is a very basic example. Escaping functions usually map onto obvious contexts:

  • Escape URLs with esc_url
  • Escape HTML text with esc_html
  • Escape attributes and classes with esc_attr
  • Escape JavaScript with esc_js
  • and so on

That’s a lot of calls to esc_* littered about your code though! Why don’t we escape once and reuse the output? For example:

$name = esc_html( $name );
echo $name;
... etc ...
echo $name;

The problem here is that tracking what has and hasn’t been escaped becomes a lot harder. You also have to trust that nothing changes the $name variable after it’s been escaped, otherwise the escaping was for nothing. Consistently escaping at the moment of output makes it clear when you should be escaping and whether you have. It also prevents double escaping, which can sometimes be dangerous.
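As a rough illustration of the "nothing is changed after escaping" problem, here is a minimal sketch in plain PHP. The demo_* functions are hypothetical names, and demo_esc_html uses htmlspecialchars as a stand-in for esc_html so the snippet runs outside WordPress; it is not WordPress's implementation.

```php
<?php
// Assumption: a plain-PHP stand-in for esc_html(), used only so this runs outside WordPress.
function demo_esc_html( string $text ): string {
    return htmlspecialchars( $text, ENT_QUOTES, 'UTF-8' );
}

// Escape once, reuse later: the value is modified after it was escaped.
function demo_escape_early( string $name, string $suffix ): string {
    $name  = demo_esc_html( $name );
    $name .= $suffix; // appended later and never escaped
    return '<p>' . $name . '</p>';
}

// Escape late: whatever the value has become, it is escaped at the moment of output.
function demo_escape_late( string $name, string $suffix ): string {
    $name .= $suffix;
    return '<p>' . demo_esc_html( $name ) . '</p>';
}
```

With a suffix such as '<script>x</script>', the first function lets the raw tag through, while the second outputs it as harmless text.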

But sometimes the answer isn’t obvious, and sometimes you need to go one step further. For example, how would you escape this?

// echo out our custom post meta HTML content
echo apply_filters( 'the_content', $custom_post_meta );

Here $custom_post_meta could contain an attack script placed by a hacker, or a broken JavaScript fragment. The naive way to escape this would be:

// echo out our custom post meta content
echo wp_kses_post( apply_filters( 'the_content', $custom_post_meta ) );

This fixes the security problem, but it introduces a new one. The ‘the_content’ filter processes shortcodes and oEmbeds, some of which add scripts, and those scripts are now stripped out and broken. The content still needs to be escaped, but we also want working YouTube videos and embeds, so what do we do?

The solution is to escape and sanitize beforehand:

// echo out our custom post meta HTML content
echo apply_filters( 'the_content', wp_kses_post( $custom_post_meta ) );

Now wp_kses_post filters out any dangerous HTML beforehand, so that the filter adds safe scripts to safe content. If we don’t do this, we’ll be adding safe scripts to dangerous content.
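The effect of the ordering can be sketched in plain PHP. Everything here is an assumption for illustration: demo_sanitize uses strip_tags as a crude stand-in for wp_kses_post (it does not filter attributes, so it is not a real sanitizer), and demo_content_filter with its [demo_video] shortcode is a hypothetical stand-in for the ‘the_content’ filter.

```php
<?php
// Assumption: crude stand-in for wp_kses_post(). Keeps a few tags and strips the rest.
// strip_tags() does not filter attributes, so do not use this as a real sanitizer.
function demo_sanitize( string $html ): string {
    return strip_tags( $html, '<p><a><strong><em>' );
}

// Assumption: hypothetical stand-in for the 'the_content' filter, expanding a
// made-up [demo_video] shortcode into trusted embed markup.
function demo_content_filter( string $html ): string {
    return str_replace(
        '[demo_video]',
        '<iframe src="https://example.com/embed"></iframe>',
        $html
    );
}

$meta = '<p>Watch this: [demo_video]</p><script>alert("attack")</script>';

// Sanitizing after the filter removes the attack, but also removes the embed.
$sanitized_after = demo_sanitize( demo_content_filter( $meta ) );

// Sanitizing before the filter removes the attack and leaves the embed intact.
$sanitized_before = demo_content_filter( demo_sanitize( $meta ) );
```

In both orders the script tag is gone, but only the second keeps the embed that the filter added.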

Another scenario is when you have an HTML block stored in an option and need to output it. Let’s say we’ve been good and escaped our output:

echo wp_kses_post( $content );

But what if the value of $content is this?

<pre>hello<pre>

We’re going to have problems: now we have two pre-formatted sections nested inside each other with no closing tags! We can solve this in the backend, though, by validating the data and notifying the user of HTML validation errors.

Here is some code I once used to do exactly this:

View the code on Gist.

Here I’ve loaded the content into a PHP DOM object and used its internal parsing to catch errors. These are then shown to the user in red to highlight that a problem occurred and that they need to review what they’ve entered.
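The original gist is not reproduced here, so the following is only a minimal sketch of the same idea, not the code I used. The function name demo_html_errors is hypothetical. It relies on PHP's DOMDocument and the libxml error functions, and which problems get reported depends on the libxml version installed, so it may not flag every kind of broken markup.

```php
<?php
// Hypothetical helper: parse an HTML fragment and return any parser errors as strings.
function demo_html_errors( string $html ): array {
    if ( trim( $html ) === '' ) {
        return array(); // nothing to validate
    }

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors( true );
    libxml_clear_errors();

    $dom = new DOMDocument();
    $dom->loadHTML( $html );

    $messages = array();
    foreach ( libxml_get_errors() as $error ) {
        $messages[] = sprintf( 'Line %d: %s', $error->line, trim( $error->message ) );
    }

    // Restore the previous error handling state.
    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    return $messages;
}
```

The returned messages could then be shown next to the field when the option is saved, for example in an admin notice, so the user can fix the markup before it is ever output.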

In conclusion, you should always escape and validate.

Sanitize Early

Escape Late

Escape Often
