HTML5 Mastery: Fragments

There are several types of DOM nodes, such as Document, Element, and Text, all of which implement the general Node interface. One of the more interesting, yet so far rarely used, node types is the DocumentFragment. It is basically a special container for nodes.

A DocumentFragment node is treated specially in many DOM algorithms. In this article we will see some of the API methods that are designed for use in conjunction with the DocumentFragment. We will also see that the concept of node containers is important for other modern web technologies, such as the <template> element or the whole shadow DOM API. But before we start we should have a quick look at fragment parsing, which is not directly related to the DocumentFragment.

Fragment Parsing

An HTML5 parser can be used for more than just parsing a complete document. It can also be used for parsing a part of a document, called a fragment. Setting properties such as innerHTML or outerHTML will trigger fragment parsing. Fragment parsing works similarly to regular parsing, with a few exceptions. The biggest difference is the need for a contextual root.

The fragment that is being parsed is likely placed as the child of some element, which may or may not have additional ancestors. This information is crucial to determine the current parsing mode, which depends on the current tree’s hierarchy. Additionally, fragment parsing will not trigger script execution, for security reasons.

We may therefore use code like the following, but we won’t see the additional output. The script execution won’t be triggered.

var foo = document.querySelector('#foo');
foo.innerHTML = '<b>Hallo World!</b><script>alert("Hi.");</script>';
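The role of the contextual root can be made visible with a small experiment. The sketch below uses a hypothetical helper, parseInContext, which parses the same markup as the content of different elements. The helper name and the sample markup are assumptions for illustration, and the document is passed in as a parameter so that the function does not depend on a global.

```javascript
// Hypothetical helper: parse `markup` as the content of a freshly created
// element and return how the result is serialized again.
function parseInContext(doc, contextTag, markup) {
  var context = doc.createElement(contextTag);
  context.innerHTML = markup; // fragment parsing with `context` as the root
  return context.innerHTML;
}

// Usage in a browser:
// parseInContext(document, 'tr', '<td>cell</td>');
// parseInContext(document, 'div', '<td>cell</td>');
```

Inside a <tr> the cell should survive, while inside a <div> the parser is expected to drop the stray <td> tags and keep only the text, because a table cell is not allowed in that context.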

Using fragment parsing is a simple way to reduce DOM operations. Instead of creating, changing and appending nodes, which all involve context switches and therefore DOM operations, we work exclusively on constructing a string, which is then evaluated and handled by the parser. Hence we only have a single DOM operation, or just a few. The disadvantage of this method is that we require the parser and more work in JavaScript. The key question is: What is more time-consuming? Are the various DOM operations more expensive than all the required JavaScript string manipulations, or is it the other way round?

Clearly this depends on the case. It also depends heavily on the browser, especially on how fast its JavaScript engine is. For a particular scenario, Grgur Grisogono compared the performance of several methods. In his results, a higher value means more operations and is therefore desired.

Even though browsers are faster these days, the relative behavior is still valid. This should motivate us to search for better solutions and learn more about the DocumentFragment.
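As a baseline for such comparisons, the string-oriented approach can be sketched as follows. The helper names escapeHtml and buildListMarkup, as well as the list markup itself, are assumptions made for this illustration and do not come from the benchmark mentioned above.

```javascript
// Hypothetical helper: escape characters that are significant in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: build the markup for a whole list as one string.
function buildListMarkup(items) {
  var rows = items.map(function (item) {
    return '<li>' + escapeHtml(item) + '</li>';
  });
  return '<ul>' + rows.join('') + '</ul>';
}

// Usage in a browser: one assignment triggers a single fragment parse.
// document.querySelector('#foo').innerHTML = buildListMarkup(['a', 'b & c']);
```

All the work happens in JavaScript strings, and the DOM is only touched once, when the finished string is assigned.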

Aggregate DOM Operations

The idea behind the DocumentFragment node is simple: a container for Node objects. When a DocumentFragment is appended, it is expanded to append only the contents of the container, not the container itself. When a deep copy of a DocumentFragment is requested, its content is cloned as well. The container itself will never be attached to another node, even though it has to have an owner, which is the document that created the fragment.

Creating a DocumentFragment works as follows:

var fragment = document.createDocumentFragment();

From this point on, fragment behaves exactly like any other DOM parent node. We can attach nodes, remove nodes or access existing nodes. The option for running CSS queries using querySelector and querySelectorAll is available. Most importantly, as already mentioned, we can clone the node using cloneNode().
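A minimal sketch of how a fragment can aggregate several insertions into one is shown below. The helper name appendItems and the idea of filling a list are assumptions for illustration. The document is passed in as a parameter to keep the function free of globals.

```javascript
// Hypothetical helper: collect new <li> elements in a fragment and attach
// them to the list with a single appendChild call.
function appendItems(doc, list, items) {
  var fragment = doc.createDocumentFragment();

  items.forEach(function (text) {
    var li = doc.createElement('li');
    li.textContent = text;
    fragment.appendChild(li); // not part of the document tree yet
  });

  // Appending the fragment moves its children into the list;
  // the fragment itself stays outside the tree and ends up empty.
  list.appendChild(fragment);
  return list;
}

// Usage in a browser:
// appendItems(document, document.querySelector('#list'), ['one', 'two']);
```

Only the last appendChild call touches the live document, regardless of how many items are added.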

Templating in HTML

If document fragments are so great, why not use them for templating? Well, a DocumentFragment cannot be constructed in plain HTML, as the concept is only exposed via the DOM API. It is therefore only possible to create containers in JavaScript. This limits their usefulness considerably. Right now the most popular approach is still text-oriented. We start by placing our template in a pseudo <script> element. The element is pseudo because the type attribute is set to an unrecognized MIME type. This way nothing will be executed, but the text content of the element will use different parsing rules.

The image above shows the tokenization states. The parsing rules for script tags are special, since parsing will take place with a special tokenization state. HTML knows five tokenization states, but the fifth one, Plaintext, is not of great interest for us. The Rawtext state is very similar to Script, which leaves us with three states to explain.

Let’s consider an example. We use three elements that are good representatives of the three remaining states. The <div> element is, like many others, in the parsed characters (PCData) regime. The <textarea> element uses RCData, as does, for example, the <title> element. Even closer to raw characters is the Rawtext state, which could be represented by a <style> element. There are subtle differences regarding escaping between the Rawtext and Script states. However, we will treat them as equivalent in the following discussion.

var example = '<br>me & you > them';
var types = ["div", "textarea", "script"];

types.forEach(function (type) {
  var foo = document.createElement(type);
  foo.innerHTML = example;
  console.log(foo.innerHTML);
});

We might expect the output to be the same in all three cases. Even if we know that there are differences, it is not obvious what they look like:

<br>me &amp; you &gt; them
&lt;br&gt;me &amp; you &gt; them
<br>me & you > them

Only the last one matches the input string perfectly. Hence we have a clear winner. So this explains why <script> elements are so popular for transporting the templating fragments. But here is where the funny part starts. Most templating engines will create a function from the string, which takes a model and spits out a list of generated DOM nodes for the view. Some may already bind the values depending on the model. The important part is the node generation, which is mostly string-oriented, at least during the first iteration.
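The string-oriented approach can be illustrated with a deliberately tiny engine. This is a sketch under stated assumptions, not how any particular templating library works: the function name compileTemplate and the curly-brace placeholder syntax are chosen for illustration only, and values are inserted without escaping.

```javascript
// Hypothetical minimal engine: turns a template string into a function
// that maps a model object to a markup string.
function compileTemplate(source) {
  return function (model) {
    return source.replace(/\{(\w+)\}/g, function (match, key) {
      // Leave unknown placeholders untouched.
      return Object.prototype.hasOwnProperty.call(model, key)
        ? String(model[key])
        : match;
    });
  };
}

// In a browser the source could be read from a <script> element with an
// unrecognized type, for example via its textContent property.
var render = compileTemplate('<div class="comment">{comment}</div>');
var markup = render({ comment: 'Great!' });
// markup is '<div class="comment">Great!</div>'
```

The result is still a string, so it has to go through fragment parsing before any nodes exist.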

The W3C recognized the situation and reacted by introducing the <template> element. The element can be understood as a DocumentFragment carrier. Since the DocumentFragment does not participate directly in the DOM tree, it is attached to a node via a property. Using the element is as easy as the following example:

<template>
  <img src="{src}" alt="{alt}">
  <div class="comment">{comment}</div>
</template>

In the DOM we won’t see any children of this element. All children have been attached to the contained DocumentFragment instance, which can be accessed via the content property.

Let’s get these children:

var fragment = document.querySelector('template').content;
var img = fragment.querySelector('img');
var comments = fragment.querySelectorAll('.comment');

The text has been placed in curly braces to indicate our intention of treating it as placeholders. There is no built-in mechanism for filling them in automatically.

Let’s create a function to return the instantiated nodes for us. We tailor the code for the previous example.

function createNodes (model) {
	var fragment = document.querySelector('template').content;
	var instance = fragment.cloneNode(true); // deep cloning!
	var img = instance.querySelector('img');
	img.setAttribute('src', model.src);
	img.setAttribute('alt', model.alt);
	var div = instance.querySelector('div');
	div.textContent = model.comment;
	return instance;
}

Generalization is possible by iterating over all attributes of elements and child nodes, replacing text that matches a predefined structure in attributes and text nodes. Finally the instantiated nodes can be appended somewhere:

var nodes = createNodes({ 
	src: 'image.png',
	alt: 'Image',
	comment: 'Great!'
});
document.querySelector('#comments').appendChild(nodes);
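The generalization mentioned above, iterating over attributes and text nodes, might be sketched as follows. The helper names substitute and fillPlaceholders are assumptions for illustration, and the {key} placeholder pattern simply mirrors the convention used in the template example.

```javascript
// Hypothetical helper: replace {key} placeholders in a string.
function substitute(text, model) {
  return text.replace(/\{(\w+)\}/g, function (match, key) {
    return Object.prototype.hasOwnProperty.call(model, key)
      ? String(model[key])
      : match;
  });
}

// Hypothetical helper: walk a cloned fragment and fill placeholders in
// attribute values and text nodes.
function fillPlaceholders(node, model) {
  var ELEMENT_NODE = 1;
  var TEXT_NODE = 3;

  if (node.nodeType === TEXT_NODE) {
    node.nodeValue = substitute(node.nodeValue, model);
  } else if (node.nodeType === ELEMENT_NODE) {
    Array.prototype.forEach.call(node.attributes, function (attr) {
      attr.value = substitute(attr.value, model);
    });
  }

  // Fragments, elements and text nodes all expose childNodes.
  Array.prototype.forEach.call(node.childNodes, function (child) {
    fillPlaceholders(child, model);
  });

  return node;
}

// Usage in a browser:
// var template = document.querySelector('template');
// var instance = fillPlaceholders(template.content.cloneNode(true), model);
```

Because the values end up in attribute values and text nodes rather than in parsed markup, they are treated as plain text.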

There are three important aspects of the <template> element:

It triggers a different parsing mode. It is therefore more than just some element.
Its children won’t be attached to the DOM, but to a DocumentFragment accessible via content.
We have to make a deep copy of the fragment before we can use it.

Finally, a document fragment is so useful that it can even be used to make small parts of websites re-usable and more flexible.

The Shadow DOM

In recent years the demand for web components has exploded. Many of the front-end frameworks try to mimic a kind of web component structure. It is required, however, to have real DOM support, even though polyfills are certainly possible. The Polymer project is a good example of great polyfills, showcasing what could be done with web components.

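What the shadow DOM allows us to do is to attach a DocumentFragment to an Element. Not every element qualifies: custom elements and a fixed set of built-in elements, such as <div> and <span>, can host one. There are three constraints: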

The DocumentFragment has to be special—it has to be a ShadowRoot.
Every Element can only have one ShadowRoot, or none of course.
The contents of the ShadowRoot are separated from the original DOM.

These constraints have consequences.

One consequence of attaching a ShadowRoot to an element is that the element’s own children are not rendered directly any more; instead the content within the shadow DOM is rendered. The content is scoped, however, which means that it may follow its own styling rules. Also the whole event handling process is a little bit different.

As a result, another new concept has been introduced: slots. We can define slots in our shadow DOM, which are filled with nodes from the element that hosts the ShadowRoot. It seems obvious that creating custom elements that carry the shadow DOM is a good idea. The whole custom elements specification is a reaction to that.

So how can we use the shadow DOM? Let’s do some JavaScript to reveal the API. We start with the following HTML fragment:

<div id="shadow-dialog">
	<span slot="header">
		My header title
	</span>
	<div slot="content">
		<strong>Some very important content</strong>
	</div>
</div>

At this point everything behaves as usual. Here is where our JavaScript skills are demanded:

var context = document.querySelector('#shadow-dialog');
var root = context.attachShadow({ mode: 'open' });
var headerSlot = document.createElement('slot');
headerSlot.name = 'header';
root.appendChild(headerSlot);
var contentSlot = document.createElement('slot');
contentSlot.name = 'content';
root.appendChild(contentSlot);

Up to here we have not gained much. We started with some elements, and we are back with them. Effectively the composed DOM tree would look as follows:

<div id="shadow-dialog">
	<slot name="header">
		<span slot="header">
			My header title
		</span>
	</slot>
	<slot name="content">
		<div slot="content">
			<strong>Some very important content</strong>
		</div>
	</slot>
</div>

By default, all child nodes of the host element that carry no slot attribute are assigned to the default slot, if there is one. A default slot does not have a name. So what have we gained? We integrated some transparent elements. Congratulations! But even more importantly, our original markup does not have to change in order to change the structure, attributes or layout of our composed DOM tree. We only need to change which elements are appended to the shadow root, and that’s it. We have essentially modularized the front-end.
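To illustrate how the composed tree can change without touching the markup, here is a sketch of an alternative shadow root for the same host. It is meant to replace the earlier snippet, not to run in addition to it, since a host can only carry one shadow root. The helper name attachDialogShadow, the <header> wrapper and the style rule are assumptions for illustration.

```javascript
// Hypothetical helper: attach a shadow root that wraps the header slot,
// adds a scoped style and provides a default slot.
function attachDialogShadow(doc, host) {
  var root = host.attachShadow({ mode: 'open' });

  var style = doc.createElement('style');
  // This rule only applies to elements inside this shadow root.
  style.textContent = 'header { font-weight: bold; }';
  root.appendChild(style);

  var header = doc.createElement('header');
  var headerSlot = doc.createElement('slot');
  headerSlot.name = 'header';
  header.appendChild(headerSlot);
  root.appendChild(header);

  var contentSlot = doc.createElement('slot');
  contentSlot.name = 'content';
  root.appendChild(contentSlot);

  // A slot without a name is the default slot; it receives host children
  // that carry no slot attribute.
  root.appendChild(doc.createElement('slot'));

  return root;
}

// Usage in a browser:
// attachDialogShadow(document, document.querySelector('#shadow-dialog'));
```

The host markup stays exactly as before; only the shadow tree decides where the slotted nodes appear and how they are styled.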

Now some people may think that we had similar techniques already on the server-side. And of course some client-side frameworks also try to aggregate code like this. There are some key differences, however. First, we have the browser’s full support (if implemented). Second, the sandboxing makes rendering with specific rules for that module easy—no clash with existing CSS rules. It is essentially guaranteed that the module works on every page. No more debugging to see where the CSS rules interfere with each other. Finally, we produce much nicer code. It’s easy to generate and cheap to transport, and we can expect even better performance.

Conclusion

The DocumentFragment is a useful helper that has the ability to reduce the number of DOM operations drastically. It is also an important cornerstone of modern technologies, especially in the web components area. It has already generated two really outstanding technologies: the <template> element and ShadowRoot. While the former simplifies templating a lot, giving us a nice performance speedup and an elegant way to transport pre-generated nodes, the latter is the foundation for web components.

Is it really worth knowing about the DocumentFragment? Probably not yet. If we’re writing a framework or library then it is definitely a must, but most users will be happy to know that it is very likely already used in their favorite front-end library, such as jQuery, Angular, and others. They all use the DocumentFragment in one or more places to overcome potential performance hits. Is a virtual DOM faster than the real one? Yes, of course, but it may not be as fast without using a DocumentFragment to aggregate multiple operations.
