querypath

06 May

QueryPath interview with SDR News's Andy McCaskey

in apache, jquery, querypath

Today, SDR News posted Andy McCaskey's full interview with me. In the interview, we talk about the speed of the web, changing development methodologies, and QueryPath.

This interview was filmed at CMS Expo in Evanston.

06 May

QueryPath and HTML: The Basics

in html, php, querypath

QueryPath can be used to work with XML or HTML. Here, I will introduce the typical tasks one performs when working with HTML documents.

We will look at the following:

  • Loading HTML documents
  • Modifying documents
  • Sending a document to the web browser
  • Creating new documents from scratch

Loading Documents

The first common task we will look at is loading an existing document. In most cases, we will be loading documents straight from the file system. Sometimes we may load them from a string of existing HTML, too. Here, we will look at each.

Loading from a file

To load a document from a file, all we need is the path to the file. Let's say we have an HTML document located at /var/www/html/index.html. Here's how we can load that file using QueryPath:

<?php
require 'QueryPath/QueryPath.php';
 
qp('/var/www/html/index.html');
?>

The code above will load the file from the file system. The last line of code will create a new QueryPath object that wraps the content of index.html. A little later, we will build on this example.

Loading from a string of HTML

Often, HTML is generated on the fly and then sent to the web browser. QueryPath can be used as a filter for altering such dynamically generated HTML content.

For example, consider a case where we have a string, $html, that contains some HTML. Here's how we would load that string:

<?php
require 'QueryPath/QueryPath.php';
 
$html = '<html>
<head>
  <title>Existing HTML</title>
</head>
<body>
  <h1>The title</h1>
</body>
</html>';
 
qp($html);
?>

In the example above, $html holds an entire HTML document. Note that the document is not technically correct -- it is missing a document type declaration. While QueryPath is strict about XML formatting, it is much more forgiving of HTML markup.

Again, the last line of the code above creates a new QueryPath object, this time wrapping the contents of the string.

It is important to note that in the above two examples, both used the qp() function to build the document. In fact, both used the function with the same signature: qp($string). QueryPath is "smart enough" to determine whether a string is an HTML document, an XML document, or a path to a file in the filesystem.

Tip: In some cases, you can get HTML from the output buffer (see the 'ob' functions in PHP) and then pass the markup on to QueryPath. In this way, you can do post-processing on data that has been output from the application already.
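To make the tip concrete, here is a minimal sketch of the output-buffer approach. It uses PHP's standard ob_start() and ob_get_clean() functions; the printed markup is only a stand-in for whatever the application would normally output, and the new title text is made up for illustration.

```php
<?php
require 'QueryPath/QueryPath.php';

// Start capturing everything the application prints.
ob_start();

// Stand-in for the application's normal output (hypothetical markup).
print '<html><head><title>Draft</title></head><body><h1>Hello</h1></body></html>';

// Stop capturing and fetch the buffered markup as a string.
$markup = ob_get_clean();

// Post-process the captured HTML, then send the result to the browser.
qp($markup, 'title')->text('Final title')->writeHTML();
?>
```

Because the buffer is emptied by ob_get_clean(), only the processed document reaches the browser.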

Modifying documents

Let's build on our last example to see how QueryPath can be used to process existing HTML.

In our new example, we will change the title (in the document head) and add a new paragraph beneath the h1.

<?php
require 'QueryPath/QueryPath.php';
 
$html = '<html>
<head>
  <title>Existing HTML</title>
</head>
<body>
  <h1>The title</h1>
</body>
</html>';
 
$qp = qp($html, 'title') // Load doc and find <title>
  ->text('A new title') // Set the new title
  ->top()  // Go back to the top of the document
  ->find('h1') // Find the <h1>
  ->after('<p>This is the new paragraph</p>'); // Add a new paragraph after.
?>

The example above is considerably more dense than our previous examples. Again, it begins with a string of HTML. But this time, when qp() is called, a query is passed in as the second argument. title will instruct QueryPath to search for any title elements. It will, as we can see, find one: The title inside of the document head.

The second line of our QueryPath chain will set the text (text()) of the title to A new title.

The third line will navigate back to the top of the document. This is necessary because we are not going to do any more manipulation of the head. We want to start looking for new content from the top of the document.

Next, we need to find our H1 tag. This is done with find('h1'). At this point, the QueryPath object is pointing to the h1 tag. We want to add some content after this tag.

The final step of the QueryPath chain adds a new paragraph after the h1 tag: after('<p>This is the new paragraph</p>'). QueryPath's after() method is one of the dozen or so tools for inserting or updating content in a document. Check out the article at IBM DeveloperWorks for an overview of the other methods.
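As an aside, here is a rough sketch of two of those other insertion methods, before() and append(). I am assuming they behave like their jQuery counterparts (before() adds a sibling ahead of the selected element, append() adds a last child); the paragraph text is made up for the example.

```php
<?php
require 'QueryPath/QueryPath.php';

$html = '<html>
<head>
  <title>Existing HTML</title>
</head>
<body>
  <h1>The title</h1>
</body>
</html>';

$qp = qp($html, 'h1') // Load doc and find <h1>
  ->before('<p>Placed before the heading</p>') // Insert a sibling ahead of <h1>
  ->top() // Go back to the top of the document
  ->find('body') // Find the <body>
  ->append('<p>Last child of the body</p>'); // Add as the final child of <body>

// Print the whole document to check the result.
print $qp->top()->html();
?>
```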

Here, we've seen two methods, text() and after(), that can be used to modify the document. Next, let's see how to get the results of our modification.

Sending the results to a browser

Again, let's just continue on from our previous example.

At any point, we can get the current state of the HTML using the html() method. For example, we can do something like this:

<?php
require 'QueryPath/QueryPath.php';
 
$html = '<html>
<head>
  <title>Existing HTML</title>
</head>
<body>
  <h1>The title</h1>
</body>
</html>';
 
$content = qp($html)->html();
 
?>

Now, $content will be a string that should look basically the same as $html (except that it will be cleaned up by QueryPath).

The html() method always works from the local context, though. So if we wanted to get just the body of the above HTML, we could do this:

<?php
require 'QueryPath/QueryPath.php';
 
$html = '<html>
<head>
  <title>Existing HTML</title>
</head>
<body>
  <h1>The title</h1>
</body>
</html>';
 
print qp($html, 'body')->html();
 
?>

This would output the following:

  <body>
    <h1>The title</h1>
  </body>

See how the html() method is only grabbing the contents that are currently selected? Since we queried for body, only the body is shown.

Most of the time, though, we are more interested in printing the entire document. The clumsy way of doing this is to do something like this:

<?php
require 'QueryPath/QueryPath.php';
 
$html = '<html>
<head>
  <title>Existing HTML</title>
</head>
<body>
  <h1>The title</h1>
</body>
</html>';
 
print qp($html, 'title')->text('New title')->top()->html();
 
?>

There are three steps here (all on one line) involved in getting the entire HTML document to print.

First, there is an explicit print statement. Second, there is a call to the top() method to get us back to the top of the document. Third, there is the call to html() to get the HTML string.

The above can be further condensed using a different method, writeHTML(). This basically bundles the three steps above into just one step. Thus, we could rewrite the above like this:

<?php
require 'QueryPath/QueryPath.php';
 
$html = '<html>
<head>
  <title>Existing HTML</title>
</head>
<body>
  <h1>The title</h1>
</body>
</html>';
 
qp($html, 'title')->text('New title')->writeHTML();
 
?>

As a result of running this code, the following document would be shipped to the browser:

  <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
  <html>
  <head><title>New title</title></head>
  <body><h1>The title</h1></body>
  </html>

(Notice that the doctype has been automatically added.)

Creating new documents

The last thing we will look at is creating a new HTML document with QueryPath. Really, there are two ways to do this. You can either create the entire document from scratch, or you can use a built-in HTML stub document.

Building from scratch

The first way of creating a new HTML document is to build the document from scratch. We've actually seen this method already in our string handling examples above. There, we created a document as a string and then passed it into QueryPath:

<?php
require 'QueryPath/QueryPath.php';
 
$html = '<html>
<head>
  <title>Existing HTML</title>
</head>
<body>
  <h1>The title</h1>
</body>
</html>';
 
qp($html);
 
?>

Building documents that way is always acceptable. Should you so choose, you can even build it up in an even more piecemeal fashion:

<?php
require 'QueryPath/QueryPath.php';
 
qp()->append('<html>')->children()->append('<head/><body/>'); // etc.
 
?>

This, however, is not a terribly efficient method of document building, and is generally only useful in rare cases.

Building from a stub

The easiest method is to use the HTML stub document included with QueryPath. This stub provides a skeleton XHTML document that you can then build using the methods we talked about above.

Here is a quick example:

<?php
require 'QueryPath/QueryPath.php';
 
qp(QueryPath::HTML_STUB, 'title')->text('New title')->writeHTML();
?>

This example creates a new stub document, sets the title, and writes the output. Here is what the output looks like:

  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
       <meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta>
     <title>New title</title>
    </head>
    <body></body>
    </html>

You may notice that here even more work is done for you. The content type is set, as is the doctype. This last method is the quickest way for you to author HTML documents.

Conclusion

We have quickly covered the basics of using QueryPath to work with HTML. We have looked at loading existing documents, changing documents, writing documents to the web browser, and creating documents from scratch.

Of course, QueryPath can be used for many other things. For more information, head over to http://querypath.org. And check back here for more articles like this one.

05 May

CMS Expo Sessions

in cms, cmsexpo, javascript, querypath

SDR News interviewed about a dozen or so speakers at the CMS Expo conference. These are slowly being made available at the SDR News website.

The primary QueryPath interview is not currently posted. However, the lightning talk that I gave just before the keynote is available.

The audio is not very good. Here's hoping that the audio for the QueryPath session is better.

28 Apr

QueryPath News

in drupal, openamplify, php, querypath, screencast

This week is "QueryPath week".

IBM developerWorks has published Get to know the QueryPath PHP library: A fast, easy way to work with XML and HTML. The article discusses the design of the QueryPath library and walks through a simple Twitter search application.

The folks at OpenAmplify have started tweeting about my screencast using QueryPath with OpenAmplify. I was gonna save the screencast for CMS Expo, but here it is. (If you are coming to CMS Expo, I might let you play with it.)

This short screencast presents a quick module I put together that uses QueryPath to retrieve and process web service information. The cornerstone of the application is the OpenAmplify web service, which provides lexical analysis of a text, returning information that can be used, well, to build stuff like this.

28 Apr

Introduction to QueryPath

in php, querypath, screencast

I recently recorded a screencast for QueryPath.org introducing QueryPath development.

This video provides a short walk-through of QueryPath's core features.

17 Apr

CMS Expo Presentations

in cmsexpo, javascript, jquery, presentation, querypath

At the end of April, I will be presenting two sessions at the CMS Expo.

Session 1: JavaScript and jQuery

This session will begin with a survey of JavaScript usage. We will then cover jQuery in some detail. From there we will move on to a more general discussion of how CMS systems can benefit from JavaScript integration. The last part of the session will cover some of the new and exciting features in recent browser development, and explore how those are changing the way CMS systems will interact with clients.

Session 2: QueryPath

This session will introduce the QueryPath library. We will see why a library like QueryPath is necessary, what it does, and how it works. I will be showing demonstrations of tools that can be built (quickly) in QueryPath, including Twitter integration, Amazon and SPARQL queries, and an as-yet-unveiled mashup featuring an exciting new web service.

16 Mar

Debugging your PHP Code: XDebug on MAMP with TextMate and MacGDBp Support

in macgdbp, pecl, php, querypath, textmate, xdebug

As I see it, there are two major drawbacks to the otherwise-spectacular MAMP (Mac OS, Apache, MySQL, PHP) package (three if you count the funky directory structuring):

  1. The .h files are all missing, so PECL doesn't work very well.
  2. There is no debugger.

The first issue is covered elsewhere. In this article, I will address the second by explaining how to set up XDebug on MAMP. XDebug is one of the two popular PHP debugging engines (with ZendDB being the other).

In this article we will cover the following:

  • Getting and installing XDebug
  • Using XDebug for basic stack tracing
  • Integrating XDebug with TextMate
  • Configuring XDebug and MacGDBp for client/server debugging
  • Using MacGDBp

11 Mar

Drupal QueryPath 1.1 Module Released

in drupal, php, querypath


Today the QueryPath 1.1 Drupal module has been released. This module provides Drupal developers with access to the QueryPath library. In addition, it integrates Drupal's database library with QueryPath, making it possible to execute Drupal database queries and feed the results directly into QueryPath.

11 Mar

Twitterpated, TweetStock updated

in php, querypath, tweetstock, twitter, twitterpated

Today I uploaded a new version of Twitterpated to the QueryPath.org server. This new version is smaller in (network) size, and has a few minor bug fixes. Also released along with it is TweetStock. TweetStock is a Twitter search client for the iPhone. It's actually a very simple overlay on the official Twitter search page. Check it out at http://querypath.org/tweetstock

Both of these technologies are built on the QueryPath library (http://querypath.org).

Twitterpated

in iphone, javascript, php, querypath, twitterpated

Twitterpated is a web-based Twitter client for the iPhone. It works on any version of the iPhone or iPod touch. To use Twitterpated, simply open Safari on your iPhone and enter the URL http://querypath.org/tweet. You will be asked to log in. Use your normal Twitter login and password. This information is used only to connect to Twitter. Twitterpated does not keep its own account database.

Twitterpated has the following features: