Data URLs and QueryPath: How to embed images into XML or HTML

Aug 26 2010

QueryPath 2.1 is adding support for writing files directly into documents using data URLs. What this means is that you can encode images or other files and embed them straight into your HTML or XML.

Here's a simple example from the QueryPath 2.1 unit tests:

<?php
$xml = '<?xml version="1.0"?><root><item/></root>';
qp($xml, 'item')->dataURL('secret', 'Hi!', 'text/plain');
?>

The above will generate an XML fragment that looks like this:

<?xml version="1.0"?>
<root>
  <item secret="data:text/plain;base64,SGkh"/>
</root>

The important part there is the attribute secret="data:text/plain;base64,SGkh". This attribute contains an embedded text document with the contents Hi!. What we've done is encode the data as base64 and inject it as a document inside the XML.
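
To make the format less mysterious, here is a minimal sketch of how such a value can be put together in plain PHP. The buildDataUrl() helper is a made-up name for illustration. It is not part of QueryPath and is not necessarily how dataURL() is implemented internally.

```php
<?php
// Hypothetical helper (not part of QueryPath): build a base64 data URL
// from a string of raw data and a MIME type.
function buildDataUrl(string $data, string $mime = 'application/octet-stream'): string
{
    // Format: data:<mime type>;base64,<base64-encoded payload>
    return 'data:' . $mime . ';base64,' . base64_encode($data);
}

echo buildDataUrl('Hi!', 'text/plain'), "\n";
// Prints: data:text/plain;base64,SGkh
```

The three bytes of Hi! become the four base64 characters SGkh, which is exactly the value in the secret attribute.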

Sure, that's novel... but what would we want to use that for? How about adding images directly into a document?

Inserting images into HTML or XML using QueryPath's dataURL method

Here's a script that injects an image directly into the source of an HTML document.

<?php
// Start with QueryPath's built-in HTML document stub.
qp(QueryPath::XHTML_STUB, 'body')
  // Add an image tag and then select it. 
  ->append('<img/>')
  ->children('img')
  // Inject an image straight into the document.
  ->dataURL('src', 'file:///Users/mbutcher/Desktop/tinystar.gif','image/gif')
  ->writeHTML('./TestQP.html');
?>

The script above writes the image straight into the document. Web browsers (most of them, at least... sorry, IE 6) can then render the image. In Safari, for example, the image displays as usual, and the Developer toolbar shows the actual embedded data.

The "magic" is all done by the new dataURL() method added in QueryPath 2.1 Alpha 2. It basically works by taking an attribute name, a file or string to encode, and a MIME type. From these, it generates an embedded data URL. Interestingly, the dataURL() method can also be used to extract and decode embedded data.
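
For the curious, going in the other direction is mostly a matter of splitting the URL apart and base64-decoding the payload. Below is a rough sketch in plain PHP. The parseDataUrl() function and the shape of its return value are assumptions made for this example, not QueryPath's actual code, and it only handles the base64 form of data URLs.

```php
<?php
// Hypothetical helper (not QueryPath's code): split a base64 data URL
// into its MIME type and decoded payload. Returns null for anything else.
function parseDataUrl(string $url): ?array
{
    // Capture the MIME type and the base64 payload.
    if (!preg_match('#^data:([^;,]+);base64,(.*)$#s', $url, $m)) {
        return null;
    }
    // Strict mode makes base64_decode() return false on invalid input.
    $decoded = base64_decode($m[2], true);
    if ($decoded === false) {
        return null;
    }
    return ['mime' => $m[1], 'data' => $decoded];
}

$parts = parseDataUrl('data:text/plain;base64,SGkh');
echo $parts['mime'], ': ', $parts['data'], "\n";
// Prints: text/plain: Hi!
```

Once decoded, the payload could be written back out to a regular file with file_put_contents().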

Uses for Data URLs and QueryPath

So how might you use this? Here are a few ideas:

  • Embed small images directly into an HTML document to save repeated network connections.
  • Combine multiple files into one for convenient network transmission or archiving.
  • Retrieve embedded content using QueryPath, and then translate it back into a non-embedded version.

Note that data URLs tend to work best with smallish chunks of content. Base64 encoding also makes the embedded data about a third larger than the original file. In fact, I hear that some browsers (namely, IE 8) impose an upper limit (32k) on the size of a data URL.
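
If you want to check up front whether a file will fit under a limit like that, the arithmetic is simple: base64 turns every 3 bytes of input into 4 characters of output, rounding up. Here is a small sketch. The dataUrlLength() helper is hypothetical, and the 32 * 1024 figure is just my reading of the 32k limit mentioned above.

```php
<?php
// Hypothetical helper: length in bytes of the base64 data URL that would be
// produced for $byteCount bytes of raw data with the given MIME type.
function dataUrlLength(int $byteCount, string $mime): int
{
    // Base64 emits 4 characters per 3 input bytes, padding the last group.
    $base64Length = 4 * intdiv($byteCount + 2, 3);
    return strlen('data:' . $mime . ';base64,') + $base64Length;
}

$limit = 32 * 1024; // Assumed limit in bytes, based on the 32k figure.

foreach ([24000, 25000] as $bytes) {
    $length = dataUrlLength($bytes, 'image/gif');
    echo $bytes, ' bytes -> ', $length, ' characters: ',
        ($length <= $limit ? 'fits' : 'too big'), "\n";
}
// Prints:
// 24000 bytes -> 32022 characters: fits
// 25000 bytes -> 33358 characters: too big
```

So under these assumptions, a GIF much beyond 24 KB on disk would already be over the line once encoded.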


