HTML5 Mastery: Scoping Rules

One of the most important concepts of HTML5 is the unified parsing model. Earlier versions of the HTML specification left implementers a lot of freedom in how to handle certain scenarios. The resulting divergence was among the most important reasons why many websites were interpreted differently across browsers. With HTML5 the error handling has been specified in great detail, leaving no room for interpretation.

In this first article of the “HTML5 Mastery” series, we will have a look at the error handling algorithms specified in the standard. We will see that some of these potential errors are actually accepted behavior, which makes them usable even in popular websites. The main part of this article discusses the scoping rules, which explain most of the observed behavior because they decide which algorithms are triggered. We start with a practical example that shows implied closing tags.

Real-World Examples

Two of the most used abilities of a standardized HTML5 parser are the automatic construction of a valid document structure and the insertion of implied end tags. We start with the latter. A really nice example to illustrate this feature is the construction of a simple list. The following code snippet shows an unordered list with three items.

<ul>
<li>First item
<li>Second item
<li>Third item
</ul>

Even though we omit the closing tag (</li>), the page displays the right thing. The correct rendering is only possible if the DOM tree has been constructed correctly from the given source. The DOM tree is the representation of the DOM nodes in a tree hierarchy. A DOM node can be an element, a comment, text or some other construct, which is introduced in other parts of this series.

The DOM tree hierarchy starts at the so-called root and shows the root’s child nodes. A root is the first node of a tree. It has no parent element. Some nodes, such as elements, can have child nodes of their own. The tree gives us information about the actual constructed DOM, whereas the source just proposes a construction.

The HTML5 parser ensures that the omitted end tags are inserted before new list items are added. This fits with our intuition. Naturally we feel that the given list has to have three items. Before HTML5 we could actually end up with a single item that contained text and another item, which again contained text and an item with text.

In the browser we see the following rendering (left side). We can also inspect the DOM tree to check if the parser handled the scenario correctly (right side).

Even though Opera (here in version 31) has been used to make the screenshots above, the behavior is browser independent as it was specified in the HTML5 standard.

By looking at the source code of some very popular sites we will possibly see some oddities. For instance the error page displayed by Google contains the following markup.

<!DOCTYPE html>
<html lang=en>
  <meta charset=utf-8>
  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
  <title>Error 404 (Not Found)!!1</title>
  <style>/* ... */</style>
  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>
  <p><b>404.</b> <ins>That’s an error.</ins>
  <p>The requested URL <code>/error</code> was not found on this server.  <ins>That’s all we know.</ins>

Even without specifying a <head> or <body> element, both elements are constructed. The paragraphs also have implied closing tags. Apparently paragraphs are treated similarly to list items: they do not occur nested, at least when constructed from source by an HTML5 parser.

Therefore the browser generates the tree on the right, which will render the layout on the left.

One of the most important lines in the preceding code is the declaration of the correct HTML5 document type. Otherwise we might end up in quirks mode, which is essentially equivalent to undefined behavior in terms of cross-browser development.

Implied End Tags

The behavior we’ve seen in the previous section relies on the generation of implied end tags. The mechanism for generating implied end tags works by closing the current node while the current node is one of the following elements:

<dd> or <dt> or <li>
<option> or <optgroup>
<rp> or <rt>
<p>

This makes sense for all of these tags. A nested paragraph is not well-defined. The same holds for a list item placed directly inside another list item, or an option inside another option. These elements only make sense within another container.
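The mechanism can be sketched in a few lines of Python. This is only a minimal model, assuming the parser state is reduced to a list of tag names (a simplified stack of open elements); the function name and the `except_for` parameter are illustrative choices, not an API of any real parser.

```python
# Tag names listed above whose end tag the parser may imply.
IMPLIED_END_TAGS = {"dd", "dt", "li", "option", "optgroup", "rp", "rt", "p"}


def generate_implied_end_tags(open_elements, except_for=None):
    """Close the current node while it is one of the listed elements.

    open_elements: list of tag names; the last entry is the current node.
    except_for: optional tag name that must stay open.
    """
    while open_elements:
        current = open_elements[-1]
        if current not in IMPLIED_END_TAGS or current == except_for:
            break
        open_elements.pop()  # this is the implied end tag
    return open_elements


# Hypothetical parser state inside "<dl><dd><p>text".
stack = ["html", "body", "dl", "dd", "p"]
print(generate_implied_end_tags(stack))  # ['html', 'body', 'dl']
```

The loop stops at the first element that is not in the set, which is why the surrounding container (here the `dl`) stays open.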

An element with an implied end tag is not the same as a self-closing element. The HTML standard defines elements such as <source>, <img> or <input> as void elements, which behave as if they were self-closing. Even though they are elements, they are not supposed to have children. The parser closes them immediately. While XML denotes these elements with a trailing slash, HTML encourages omitting the slash.

HTML Scopes

There are many scenarios where scoping applies. In principle, once the HTML5 parser encounters certain tags during tree construction, it checks whether a particular element is in a certain scope. If that is the case, further actions may follow. Otherwise the current tag is usually ignored.

The process of determining whether an element is in a given scope starts at the current element, i.e. the most recently opened element that has not been closed yet. If this element is the one we are looking for, it is in scope. If the element belongs to a special subset of elements specified by the scope, the search stops and the element we are looking for is not in scope. Otherwise we continue the search at the parent element of the current element.

These subsets can be divided into five groups. Each scope selects one of these groups to specify its exclusion elements. The five groups are named:

General
List item
Button
Table
Select

The list item and button groups also contain all elements of the general group. The table group only contains <html> and the table element itself. The select group includes all elements except <optgroup> and <option>.
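The lookup described above can be sketched as follows. The members of the general group are not listed in this article. The set below is an assumption based on my reading of the HTML Standard, and it leaves out the MathML and SVG entries. The `template` entries are likewise taken from the current standard rather than from the text above. Open elements are modeled as plain tag names.

```python
# Assumed members of the general group (HTML elements only).
GENERAL = {"applet", "caption", "html", "table", "td", "th",
           "marquee", "object", "template"}

# Exclusion elements per scope, following the description above.
SCOPES = {
    "general": GENERAL,
    "list item": GENERAL | {"ol", "ul"},
    "button": GENERAL | {"button"},
    "table": {"html", "table", "template"},
}


def has_element_in_scope(open_elements, target, scope="general"):
    """Walk from the current node towards the root, looking for `target`."""
    for tag in reversed(open_elements):
        if tag == target:
            return True
        if scope == "select":
            # The select group excludes everything but these two elements.
            if tag not in ("optgroup", "option"):
                return False
        elif tag in SCOPES[scope]:
            return False
    return False


print(has_element_in_scope(["html", "body", "p", "b"], "p", "button"))          # True
print(has_element_in_scope(["html", "body", "p", "button", "i"], "p", "button"))  # False
```

In the second call the search hits the `button` element before it reaches the paragraph, so the paragraph is not in button scope.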

Scoping Examples

The select group nicely illustrates what we get from these scoping rules. If we, for example, close a select element, the parser checks whether there is a <select> element in select scope. If this is not the case, the closing tag is ignored. Otherwise we close all elements until we’ve actually closed the select node.

Let’s look at some code. What tree will be generated?

<select><optgroup><option>First</select>

In fact, this was too easy. Our intuition tells us that the </select> closes the option and option group as well. This is correct. We are actually restricted to only using option and option group elements within a select element. This is supervised by the parser. So how about now?

<p><optgroup><option>First</select>

Well, there are two major differences here. First, we are not entering a select element in the first place. Therefore <optgroup> and <option> will not be constrained by the parser. They can also take arbitrary elements. Second, closing the select element will be ignored. The origin of this behavior lies in the applied scoping rule.
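Both snippets can be replayed with a small sketch of the rule as described in this section. It assumes the open elements are tracked as a list of tag names and ignores the insertion modes of a real parser, so it only models the outcome, not the full algorithm. The function names are made up for illustration.

```python
def has_select_in_select_scope(open_elements):
    """Only <optgroup> and <option> may sit between the current node and <select>."""
    for tag in reversed(open_elements):
        if tag == "select":
            return True
        if tag not in ("optgroup", "option"):
            return False
    return False


def handle_select_end_tag(open_elements):
    """Return True if </select> closed something, False if it was ignored."""
    if not has_select_in_select_scope(open_elements):
        return False  # the closing tag is ignored
    while open_elements.pop() != "select":
        pass  # closes <option> and <optgroup> on the way
    return True


first = ["html", "body", "select", "optgroup", "option"]
second = ["html", "body", "p", "optgroup", "option"]
print(handle_select_end_tag(first), first)    # True ['html', 'body']
print(handle_select_end_tag(second), second)  # False, stack unchanged
```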

Let’s now consider some behavior that may influence your markup design. What will the constructed DOM tree for the following snippet look like?

<p>I am from <address>Germany</address>.</p>

Nothing easier than that, is there? Well, not so fast. The markup may seem legal at first sight. After all, it’s just semantically declaring an address inside, right? Unfortunately not. The <address> element is also considered a block, like a paragraph. This block has nothing to do with CSS, so we can’t change the behavior by using a different display declaration.

We already see that the constructed DOM makes the difference (right). The rendering just follows (left side).

This behavior applies to quite a few elements. The specification reads:

A start tag whose tag name is one of: “address”, “article”, “aside”, “blockquote”, “center”, “details”, “dir”, “div”, “dl”, “fieldset”, “figcaption”, “figure”, “footer”, “header”, “hgroup”, “menu”, “nav”, “ol”, “p”, “section”, “summary”, “ul” […]

All these elements check whether there is a paragraph element in button scope. If this is the case, the paragraph is closed before the new element is inserted.
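A rough sketch of this check, applied to the address example: the tag list is copied from the specification excerpt above, while the barrier set for the button scope is an assumption (HTML elements only, based on my reading of the standard). The helper names are hypothetical.

```python
# Start tags from the specification excerpt quoted above.
BLOCK_START_TAGS = {
    "address", "article", "aside", "blockquote", "center", "details", "dir",
    "div", "dl", "fieldset", "figcaption", "figure", "footer", "header",
    "hgroup", "menu", "nav", "ol", "p", "section", "summary", "ul",
}

# Assumed exclusion elements of the button scope (HTML elements only).
BUTTON_SCOPE_BARRIERS = {
    "applet", "caption", "html", "table", "td", "th",
    "marquee", "object", "template", "button",
}


def has_p_in_button_scope(open_elements):
    for tag in reversed(open_elements):
        if tag == "p":
            return True
        if tag in BUTTON_SCOPE_BARRIERS:
            return False
    return False


def insert_start_tag(open_elements, tag):
    """Insert an element, closing an open paragraph first if required."""
    if tag in BLOCK_START_TAGS and has_p_in_button_scope(open_elements):
        while open_elements.pop() != "p":
            pass  # everything opened inside the paragraph is closed as well
    open_elements.append(tag)
    return open_elements


# "<p>I am from <address>": the paragraph is closed before <address> opens.
print(insert_start_tag(["html", "body", "p"], "address"))  # ['html', 'body', 'address']
```

As a consequence the `address` element becomes a sibling of the paragraph instead of a child, and the final `</p>` no longer has an open paragraph to close.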

There are many other places where the context / ancestor elements actually matter. This all makes HTML snippets highly non-local. However, for now we will look at another error handling topic.

Reformatting and Tables

Let’s start with a simple question. What does the DOM tree for the following code look like?

<p>1<b>2<i>3</b>4</i>5</p>

It is certainly not very hard to figure out the resulting tree until 3, but then the problems begin. There is no really good argument for what should happen next. This is the reason why browser vendors decided to introduce an algorithm that describes how reformatting works.

First, the closing bold tag implies the end for all included tags. In our example this affects italics. But since we did not close the italics tag, we have to treat it specially. After all inner tags have been closed, we need to open new ones for the implicitly closed formatting tags. A formatting tag is a normal tag that corresponds to an element, which has (historical) consequences for the formatting of the contained text.

The following picture shows the result. The tree on the left is the construction until 3, while the tree on the right illustrates the full picture after the code snippet.

In times of CSS the reconstruction of formatting elements can be considered outdated, but it is far from being obsolete. It is not directly triggered when elements are closed, but rather when new text is inserted.
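The idea can be modeled with a heavily simplified sketch. It is not the algorithm from the specification: formatting elements are tracked by tag name only, attributes are ignored, and the token format is invented for this example. It does show the two steps described above, closing the inner tags and reopening them once new text arrives.

```python
FORMATTING = {"b", "i", "em", "strong", "u"}


def parse(tokens):
    """Build a tree of (tag, children) tuples from (kind, value) tokens."""
    root = ("root", [])
    stack = [root]  # open elements; the last entry is the current node
    active = []     # formatting tags that were opened and not explicitly closed

    def open_tags():
        return [node[0] for node in stack]

    def push(tag):
        node = (tag, [])
        stack[-1][1].append(node)
        stack.append(node)

    for kind, value in tokens:
        if kind == "start":
            push(value)
            if value in FORMATTING:
                active.append(value)
        elif kind == "text":
            # Reopen formatting elements that were closed only implicitly.
            for tag in active:
                if tag not in open_tags():
                    push(tag)
            stack[-1][1].append(value)
        elif kind == "end" and value in open_tags():
            while stack.pop()[0] != value:
                pass  # inner elements are closed implicitly
            if value in active:
                active.remove(value)
    return root


# Tokens for: <p>1<b>2<i>3</b>4</i>5</p>
tokens = [("start", "p"), ("text", "1"), ("start", "b"), ("text", "2"),
          ("start", "i"), ("text", "3"), ("end", "b"), ("text", "4"),
          ("end", "i"), ("text", "5"), ("end", "p")]
print(parse(tokens))
```

For the snippet above the sketch yields a paragraph containing 1, a bold element with 2 and an italic 3, a second italic element with 4, and finally the plain text 5.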

Another interesting topic is the connection of formatting to table elements. Error handling in tables is quite strange. No one seems to get it right intuitively. Historically it was developed by many vendors with different philosophies.

Let’s consider an example again. How will the DOM tree look for the following snippet?

<table><b><tr><td>aaa</td></tr>bbb</table>ccc

Only table section elements are allowed in a <table>. As previously seen with core HTML elements (such as <head> or <body>), the parser usually takes care of this. For example, if a row is appended directly to a table, the <tbody> section is inserted automatically to take care of the row.

The natural thing therefore is to insert the bold formatting element into the <body>, in front of the table. But, since we specified bold formatting for text, we still need to have it for any text (outside of the table). Therefore the first step is to go on and produce a table (after the inserted <b> element) with some (normal) text.

After closing the row we encounter some more text. This text is within the table, but it has not been placed in a cell. It therefore has to be appended prior to the table like the bold element. Here the bold formatting is still active, i.e. the text won’t be appended in a standalone way, but needs to be wrapped in a <b> tag.

Finally we have some text outside of the table. We can correctly guess that this text has to come after the table in the DOM tree, but the key question is: Should it be formatted? The only correct answer is yes, it should. Why? The reason is that we never closed the bold tag. Since the bold tag has been shifted from the table to the body, it is still open.
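To make the outcome easier to follow, here is the tree that a conforming parser is expected to build for this snippet, written out as Python data and printed as an indented outline. The tuple representation and the `outline` helper are only illustrative; the `<html>` and `<head>` nodes are left out for brevity.

```python
# Expected body content for: <table><b><tr><td>aaa</td></tr>bbb</table>ccc
TREE = ("body", [
    ("b", []),                     # the <b> moved in front of the table
    ("b", ["bbb"]),                # stray table text, wrapped in a new <b>
    ("table", [("tbody", [("tr", [("td", ["aaa"])])])]),
    ("b", ["ccc"]),                # text after the table is still bold
])


def outline(node, depth=0):
    """Return the tree as a list of indented lines."""
    if isinstance(node, str):
        return ["  " * depth + repr(node)]
    tag, children = node
    lines = ["  " * depth + tag]
    for child in children:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(TREE)))
```

Note the automatically inserted `tbody` and the three separate bold elements, all of which stem from the single unclosed `<b>` in the source.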

The markup in the preceding example cannot be considered readable, intuitive or wanted at all. The behavior we’ve seen is an error recovery model at work, nothing that should be desired. I chose the example to illustrate the problem of nesting incompatible tags and the solution applied by the parser. Most of the time such problems occur due to copy/paste or version control merge errors.

Conclusion

Knowing about the HTML5 parser internals is important. This knowledge leads to the right minification rules, which may save many bytes and some parsing time. We now also have a basic understanding of the error handling performed by the HTML parser.

The scoping rules are always apparent in the HTML5 parsing process. We deal with an extremely stateful process that does not take the elements as equal or unknown. Instead we have a huge set of special behavior that is highly dependent on the current scope, which is defined by the considered element.

We have also seen that guessing the right tree is not always as intuitive as it should be. In general we should try to avoid such problematic areas.
