Escaper


Overview

Websites and web applications are vulnerable to XSS attacks. PHP provides escaping functions, but in some contexts they are not sufficient or appropriate. Phalcon\Html\Escaper provides contextual escaping and is written in Zephir, so it adds minimal overhead when escaping different kinds of text.

This component is designed around the XSS (Cross-Site Scripting) Prevention Cheat Sheet published by OWASP. It also relies on mbstring to support almost any charset.

Starting with v5.12.2, Phalcon\Html\Escaper is a façade over five per-context escapers, each living in the Phalcon\Html\Escaper namespace:

Class                                    Used by
Phalcon\Html\Escaper\HtmlEscaper         html()
Phalcon\Html\Escaper\AttributeEscaper    attributes()
Phalcon\Html\Escaper\CssEscaper          css()
Phalcon\Html\Escaper\JsEscaper           js()
Phalcon\Html\Escaper\UrlEscaper          url()

All five extend Phalcon\Html\Escaper\AbstractEscaper, which carries the encoding/flags/double-encode state. The façade exposes getXxxEscaper() and setXxxEscaper() accessors so individual contexts can be swapped without subclassing the façade itself. The legacy setEncoding(), setFlags(), and setDoubleEncode() setters fan out to all sub-escapers automatically, so existing code keeps working.

To illustrate how this component works and why it is important, consider the following example:

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

$title = '</title><script>alert(1)</script>';
echo $escaper->html($title);
// &lt;/title&gt;&lt;script&gt;alert(1)&lt;/script&gt;

$css = ';`(';
echo $escaper->css($css);
// \3b \60 \28 

$fontName = 'Verdana"</style>';
echo $escaper->css($fontName);
// Verdana\22 \3c \2f style\3e

$js = "';</script>Hello";
echo $escaper->js($js);
// \x27\x3b\x3c\x2fscript\x3eHello

HTML

You can escape text before printing it to your views using html(). Without escaping you could potentially echo unsafe data in your HTML output.

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

$title = '</title><script>alert(1)</script>';
echo $escaper->html($title);
// &lt;/title&gt;&lt;script&gt;alert(1)&lt;/script&gt;

HTML syntax:

<?php echo $this->escaper->html($title); ?>

Volt syntax:

{{ title | escape }}

HTML Attributes

Escaping attributes is different from escaping HTML content. The escaper works by changing every non-alphanumeric character to a safe format, and it uses htmlspecialchars internally. This kind of escaping is intended for simple attributes and excludes complex ones such as href or url. To escape attributes, use the attributes() method. This method has been renamed: the old method escapeHtmlAttr() emits a @deprecated warning and will be removed in the future.

The method also accepts an array as a parameter. The keys are the attribute names and the values are attribute values. If a value is boolean (true/false) then the attribute will have no value:

['disabled' => true] -> 'disabled'

The resulting string will have attribute pairs separated by a space.

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

$attr = '"><h1>Hello</table';
echo $escaper->attributes($attr);
// &#x22;&#x3e;&#x3c;h1&#x3e;Hello&#x3c;&#x2f;table

HTML syntax:

<?php echo $this->escaper->attributes($attr); ?>

Volt syntax:

{{ attr | escape_attr }}
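
To make the array form described in this section concrete, the sketch below mimics that behavior in plain PHP. It is not Phalcon's implementation: renderAttributes() is a hypothetical helper, htmlspecialchars() stands in for the attribute escaper, and omitting attributes whose value is false is an assumption.

```php
<?php

/**
 * Hypothetical helper: renders an attribute array as a string.
 * Assumption: true renders the bare name, false omits the attribute.
 */
function renderAttributes(array $attributes): string
{
    $parts = [];

    foreach ($attributes as $name => $value) {
        if ($value === true) {
            // Boolean attribute: name only, no value.
            $parts[] = $name;
        } elseif ($value !== false) {
            // Regular attribute: escape the value and wrap it in quotes.
            $escaped = htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
            $parts[] = $name . '="' . $escaped . '"';
        }
    }

    // Attribute pairs are separated by a single space.
    return implode(' ', $parts);
}

echo renderAttributes(['class' => 'btn "primary"', 'disabled' => true]);
// class="btn &quot;primary&quot;" disabled
```

With the real component, the equivalent call would be attributes() with the same array.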

URLs

url() can be used to escape the values of attributes such as href or url. This method has been renamed: the old method escapeUrl() emits a @deprecated warning and will be removed in the future.

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

$url = '"><script>alert(1)</script><a href="#';
echo $escaper->url($url);
// %22%3E%3Cscript%3Ealert%281%29%3C%2Fscript%3E%3Ca%20href%3D%22%23

HTML syntax:

<?php echo $this->escaper->url($url); ?>
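
The percent-encoded output shown in this section is the same string PHP's rawurlencode() produces for that input. The sketch below uses rawurlencode() as a stand-in to show where URL escaping belongs when building a link. The $base and $term names and the example.com address are illustrative only.

```php
<?php

$base = 'https://example.com/search';
$term = '"><script>alert(1)</script><a href="#';

// Percent-encode only the untrusted value, not the whole URL.
$href = $base . '?q=' . rawurlencode($term);

echo $href;
// https://example.com/search?q=%22%3E%3Cscript%3Ealert%281%29%3C%2Fscript%3E%3Ca%20href%3D%22%23
```

Escaping the entire URL instead of the value would also encode the scheme and path separators, which is why the encoding is applied to the query value alone.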

CSS

CSS identifiers/values can be escaped by using css(). This method has been renamed. The old method escapeCss() will be removed in the future and emits a @deprecated warning.

When the input is an empty string, or contains only the null codepoint, css() returns an empty string instead of false.

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

$css = '"><script>alert(1)</script><a href="#';
echo $escaper->css($css);
// \22 \3e \3c script\3e alert\28 1\29 \3c \2f script\3e \3c a\20 href\3d \22 \23 

HTML syntax:

<?php echo $this->escaper->css($css); ?>

Volt syntax:

{{ css | escape_css }}
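
The output format used for CSS (a backslash, the hexadecimal code point, and a terminating space) can be illustrated with a few lines of plain PHP. This is a simplified sketch for reading the examples in this section, not Phalcon's implementation; cssEscapeSketch() is a hypothetical name, and treating only ASCII letters and digits as safe is an assumption.

```php
<?php

/**
 * Hypothetical helper illustrating the "\HEX " output format.
 * Every character other than ASCII letters and digits becomes a backslash,
 * its code point in hexadecimal, and a terminating space.
 */
function cssEscapeSketch(string $input): string
{
    $result = preg_replace_callback(
        '/[^a-zA-Z0-9]/u',
        static function (array $match): string {
            return '\\' . dechex(mb_ord($match[0], 'UTF-8')) . ' ';
        },
        $input
    );

    // preg_replace_callback() returns null on invalid UTF-8 input.
    return $result ?? '';
}

echo cssEscapeSketch('Verdana"</style>');
// Verdana\22 \3c \2f style\3e (followed by a trailing space)
```

The terminating space matters: without it, a hexadecimal digit that follows the escape would be read as part of the code point.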

JavaScript

Content printed into JavaScript code must be properly escaped. js() helps with this task. This method has been renamed: the old method escapeJs() emits a @deprecated warning and will be removed in the future.

When the input is an empty string, or contains only the null codepoint, js() returns an empty string instead of false.

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

$js = "'; alert(100); var x='";
echo $escaper->js($js);
// \x27; alert(100); var x\x3d\x27

HTML syntax:

<?php echo $this->escaper->js($js); ?>

Volt syntax:

{{ js | escape_js }}

Encoding

Phalcon\Html\Escaper also offers methods for working with the encoding of the text to be escaped.

detectEncoding()

Detects the character encoding of a string to be handled by an encoder. It applies special handling for chr(172) and chr(128) to chr(159), which mb_detect_encoding fails to detect. The method returns a string with the detected encoding, or null.

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

echo $escaper->detectEncoding('ḂḃĊċḊḋḞḟĠġṀṁ'); // UTF-8
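
The general idea of encoding detection can be sketched with mbstring alone. The example below is a simplified stand-in, not the algorithm detectEncoding() uses: detectEncodingSketch() is a hypothetical helper, it relies on mb_check_encoding() rather than mb_detect_encoding(), and the candidate list is an assumption.

```php
<?php

/**
 * Hypothetical helper: returns the first candidate encoding in which the
 * input is valid, or null when none of the candidates matches.
 */
function detectEncodingSketch(string $input, array $candidates = ['UTF-8', 'ISO-8859-1']): ?string
{
    foreach ($candidates as $charset) {
        if (mb_check_encoding($input, $charset)) {
            return $charset;
        }
    }

    return null;
}

echo detectEncodingSketch('ḂḃĊċḊḋḞḟĠġṀṁ'), PHP_EOL; // UTF-8
echo detectEncodingSketch("caf\xE9"), PHP_EOL;       // ISO-8859-1
```

The order of the candidates matters, because single-byte encodings such as ISO-8859-1 accept any byte sequence and must therefore be tried last.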

getEncoding()

Returns the internal encoding used by the escaper.

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

echo $escaper->getEncoding();

normalizeEncoding()

Utility method that normalizes a string's encoding to UTF-32.

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

echo $escaper->normalizeEncoding('ḂḃĊċḊḋḞḟĠġṀṁ');  
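
A UTF-32 string is not meant to be read on screen, so echoing it directly is rarely useful. The sketch below uses mb_convert_encoding() as a stand-in for the conversion (an assumption about how the normalization behaves) and prints a hexadecimal dump instead.

```php
<?php

$text = 'ḂḃĊċ';

// Convert from UTF-8 to UTF-32: every code point takes a fixed 4 bytes.
$utf32 = mb_convert_encoding($text, 'UTF-32', 'UTF-8');

// Print a hexadecimal dump rather than the raw bytes.
echo bin2hex($utf32), PHP_EOL;

// Converting back restores the original text.
$restored = mb_convert_encoding($utf32, 'UTF-8', 'UTF-32');
var_dump($restored === $text);
```

The fixed width is what makes UTF-32 convenient for escapers that need to inspect input one code point at a time.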

setEncoding()

Sets the encoding to be used by the escaper.

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

$escaper->setEncoding('utf-8');

echo $escaper->getEncoding(); // 'utf-8'

setDoubleEncode()

Enables or disables double encoding (enabled by default).

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

$escaper->setDoubleEncode(false);
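
The effect of double encoding is easiest to see with htmlspecialchars() directly, whose fourth parameter carries the same meaning. The sketch below assumes the escaper's setting behaves like that parameter; the sample input is illustrative.

```php
<?php

$input = 'Tom &amp; Jerry <b>';

// Double encoding on (the default): the existing entity is encoded again.
$on = htmlspecialchars($input, ENT_QUOTES, 'UTF-8', true);

// Double encoding off: existing entities are left untouched.
$off = htmlspecialchars($input, ENT_QUOTES, 'UTF-8', false);

echo $on, PHP_EOL;  // Tom &amp;amp; Jerry &lt;b&gt;
echo $off, PHP_EOL; // Tom &amp; Jerry &lt;b&gt;
```

Turning double encoding off is mainly useful when the input may already contain entities, for example text that was escaped once before being stored.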

setFlags(int $flags)

You can set the quote type to be used by the escaper. This method has been renamed: the old method setHtmlQuoteType() emits a @deprecated warning and will be removed in the future.

The value passed is one of the constants that htmlspecialchars accepts, or several of them combined with the bitwise OR operator:
- ENT_COMPAT
- ENT_QUOTES
- ENT_NOQUOTES
- ENT_IGNORE
- ENT_SUBSTITUTE
- ENT_DISALLOWED
- ENT_HTML401
- ENT_XML1
- ENT_XHTML
- ENT_HTML5

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

$escaper->setFlags(ENT_XHTML);
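
Because these are the htmlspecialchars constants, their effect can be previewed with htmlspecialchars() itself. The sketch below shows how the quote-related flags change the output for the same input; the sample string is illustrative.

```php
<?php

$input = "It's \"quoted\"";

// ENT_COMPAT: double quotes only.
$compat = htmlspecialchars($input, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES: both quote types; the entity used depends on the document type.
$html401 = htmlspecialchars($input, ENT_QUOTES | ENT_HTML401, 'UTF-8');
$html5   = htmlspecialchars($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// ENT_NOQUOTES: quotes are left as they are.
$noQuotes = htmlspecialchars($input, ENT_NOQUOTES, 'UTF-8');

echo $compat, PHP_EOL;   // It's &quot;quoted&quot;
echo $html401, PHP_EOL;  // It&#039;s &quot;quoted&quot;
echo $html5, PHP_EOL;    // It&apos;s &quot;quoted&quot;
echo $noQuotes, PHP_EOL; // It's "quoted"
```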

Per-Context Escapers

Each method on Phalcon\Html\Escaper delegates to a dedicated sub-escaper. You can read the current sub-escaper, replace it with a custom subclass, or use it directly without going through the façade.

Accessing a Sub-Escaper

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

$htmlEscaper = $escaper->getHtmlEscaper();
$attrEscaper = $escaper->getAttributeEscaper();
$cssEscaper  = $escaper->getCssEscaper();
$jsEscaper   = $escaper->getJsEscaper();
$urlEscaper  = $escaper->getUrlEscaper();

echo $htmlEscaper->escape('</title><script>alert(1)</script>');
// &lt;/title&gt;&lt;script&gt;alert(1)&lt;/script&gt;

Swapping a Sub-Escaper

You can replace any single sub-escaper without subclassing the façade. This is useful for adding logging, alternative algorithms, or custom flag handling for one context only.

<?php

namespace MyApp\Escaper;

use Phalcon\Html\Escaper\HtmlEscaper as PhalconHtmlEscaper;

class LoggingHtmlEscaper extends PhalconHtmlEscaper
{
    public function escape(string $input): string
    {
        error_log('Escaping HTML: ' . $input);

        return parent::escape($input);
    }
}

<?php

use MyApp\Escaper\LoggingHtmlEscaper;
use Phalcon\Html\Escaper;

$escaper = new Escaper();
$escaper->setHtmlEscaper(new LoggingHtmlEscaper());

echo $escaper->html('<b>hello</b>');
// Logs the input and returns &lt;b&gt;hello&lt;/b&gt;

The available accessor pairs are:

Sub-escaper         Getter                   Setter
HtmlEscaper         getHtmlEscaper()         setHtmlEscaper()
AttributeEscaper    getAttributeEscaper()    setAttributeEscaper()
CssEscaper          getCssEscaper()          setCssEscaper()
JsEscaper           getJsEscaper()           setJsEscaper()
UrlEscaper          getUrlEscaper()          setUrlEscaper()

Shared Configuration

Calling setEncoding(), setFlags(), or setDoubleEncode() on the façade fans the value out to every sub-escaper, so the change is applied uniformly:

<?php

use Phalcon\Html\Escaper;

$escaper = new Escaper();

$escaper->setEncoding('utf-8');
$escaper->setFlags(ENT_QUOTES | ENT_HTML5);
$escaper->setDoubleEncode(false);

// All five sub-escapers now share these settings.

Direct Use Without the Façade

Each sub-escaper can be used standalone if you do not need the façade aggregation:

<?php

use Phalcon\Html\Escaper\AttributeEscaper;
use Phalcon\Html\Escaper\CssEscaper;
use Phalcon\Html\Escaper\HtmlEscaper;
use Phalcon\Html\Escaper\JsEscaper;
use Phalcon\Html\Escaper\UrlEscaper;

$html = (new HtmlEscaper())->escape('<b>hi</b>');
$attr = (new AttributeEscaper())->escape('"><h1>Hi');
$css  = (new CssEscaper())->escape(';`(');
$js   = (new JsEscaper())->escape("'; alert(1)");
$url  = (new UrlEscaper())->escape('"><script>alert(1)</script>');

Exceptions

Any exceptions thrown in the Escaper component will be of type Phalcon\Html\Escaper\Exception. It is thrown when the data supplied to the component is not valid. You can use these exceptions to selectively catch exceptions thrown only from this component.

<?php

use Phalcon\Html\Escaper;
use Phalcon\Html\Escaper\Exception;
use Phalcon\Mvc\Controller;

/**
 * @property Escaper $escaper
 */
class IndexController extends Controller
{
    public function indexAction()
    {
        try {
            echo $this->escaper->normalizeEncoding('ḂḃĊċḊḋḞḟĠġṀṁ');  
        } catch (Exception $ex) {
            echo $ex->getMessage();
        }
    }
}

Dependency Injection

If you use the Phalcon\Di\FactoryDefault container, the Phalcon\Html\Escaper is already registered for you with the name escaper.

An example of the registration of the service as well as accessing it is below:

<?php

use Phalcon\Di\Di;
use Phalcon\Html\Escaper;

$container = new Di();

$container->set(
    'escaper',
    function () {
        return new Escaper();
    }
);

You can now use the component in a controller (or in any component that implements Phalcon\Di\Injectable):

<?php

namespace MyApp;

use Phalcon\Html\Escaper;
use Phalcon\Mvc\Controller;

/**
 * Invoices controller
 *
 * @property Escaper $escaper
 */
class InvoicesController extends Controller
{
    public function indexAction()
    {

    }

    public function saveAction()
    {
        echo $this->escaper->html('The post was correctly saved!');
    }
}

Custom

Phalcon also offers the Phalcon\Html\Escaper\EscaperInterface which can be implemented in a custom class. The class can offer the escaper functionality you require.

<?php

namespace MyApp\Escaper;

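use Phalcon\Html\Escaper;
use Phalcon\Html\Escaper\EscaperInterface;

class Custom implements EscaperInterface
{
    // Illustrative skeleton: each method delegates to the stock escaper.
    // Replace any body with your own logic.
    private Escaper $escaper;

    public function __construct()
    {
        $this->escaper = new Escaper();
    }

    public function css(string $css): string { return $this->escaper->css($css); }

    public function html(string $text): string { return $this->escaper->html($text); }

    public function attributes(string $text): string { return $this->escaper->attributes($text); }

    public function js(string $js): string { return $this->escaper->js($js); }

    public function url(string $url): string { return $this->escaper->url($url); }

    public function getEncoding(): string { return $this->escaper->getEncoding(); }

    public function setEncoding(string $encoding): void { $this->escaper->setEncoding($encoding); }

    public function setFlags(int $flags): void { $this->escaper->setFlags($flags); }
}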