php

26 Mar

Loading Drupal Nodes into MongoDB with Drush

in drupal, mongodb, php, programming

To do some prototyping, I wanted to load all 32k of our Drupal nodes into MongoDB. At first, the thought of doing this seemed daunting. Then I realized that with Drush I could use a very simple script to perform an entire migration.

The result: With a 14-line PHP script, I transferred all of the nodes (CCK, taxonomy, and all) without a glitch.

Read on for the full explanation.

05 Mar

MongoDB: 5 Things Every PHP Developer Should Know About MongoDB

in mongodb, php, programming

2010 will be remembered as the year SQL died; the year relational databases were moved off of the front line; the year that developers discovered that they no longer had to force every single object into a tabular structure in order to persist the data.

2010 is the year of the document database. While momentum has been steadily building over the last seven years or so, there is now a broad range of stable document databases, from cloud-based offerings by Amazon and Google to a number of Open Source tools, most notably CouchDB and MongoDB.

So what is MongoDB? Here are five things every PHP developer should know about it:

  1. MongoDB is a stand-alone server
  2. It is document based, not table-based
  3. It is schemaless
  4. You don't need to learn another query language
  5. It has great PHP support

Read on to learn a little about each of these.
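To make points 2 through 4 a little more concrete, here is a minimal sketch in plain PHP. It does not talk to a MongoDB server. The `matches()` and `find_docs()` helpers are hypothetical stand-ins that only handle top-level equality, and the sample documents are invented for illustration. The point is just the shape of the data: documents are nested arrays, two documents need not share fields, and a query is itself an array rather than a string in a separate query language.

```php
<?php
// Two documents in one collection; they do not share the same fields.
$docs = array(
  array('title' => 'Intro to Drupal', 'tags' => array('drupal', 'php')),
  array('title' => 'RIS parsing', 'author' => array('name' => 'Example Author')),
);

// A query is an array of field => value pairs.
$query = array('title' => 'RIS parsing');

// Hypothetical helper: true if every queried field equals the document's value.
function matches(array $doc, array $query) {
  foreach ($query as $field => $value) {
    if (!array_key_exists($field, $doc) || $doc[$field] !== $value) {
      return false;
    }
  }
  return true;
}

// Hypothetical helper: return all documents that match the query.
function find_docs(array $docs, array $query) {
  $found = array();
  foreach ($docs as $doc) {
    if (matches($doc, $query)) {
      $found[] = $doc;
    }
  }
  return $found;
}

$found = find_docs($docs, $query);
```

With the real PHP driver, an array shaped like `$query` is what you would hand to the collection's find method; the matching itself happens on the server.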

05 Mar

Tek-X Webcast: A Developer's Intro to Drupal

in drupal, php

On March 12, 2010, I will be online with the folks from Tek-X giving a webcast on A Developer's Intro to Drupal. If you're just getting your feet wet with Drupal and are still a little confused about hooks, modules, themes, nodes, or even why Drupal isn't (fully) Object-Oriented, then this session is for you.

16 Feb

Using BetterAWStats in Drupal

in drupal, php, system administration

Our current environment uses AWStats to analyze our HTTP server log files and build reports. Because it has privileged access to our data, and because it is open source, we can glean more information out of it than we could from proprietary hosted analytics platforms.

It turns out that there is a PHP front-end to AWStats (called BetterAWStats) that comes complete with a Drupal module. Here, I explain how we've installed and configured this module to get our AWStats data imported into our Atrium server.

15 Feb

A QueryPath script for checking on a sitemap

in php, programming, querypath, sitemap, xml

I've been tuning our sitemap during the last few months, and one thing I needed was a quick tool to check on the effectiveness of various sitemap generation strategies.

To do this, I wrote a quick QueryPath script (see a full-sized image of the output). The script is explained below.

The code is pretty straightforward. It simply retrieves a URL, parses the sitemap contents, and then sorts them. Finally, it displays the top 100 entries. I've tested it on sitemaps with over 20,000 items. While it is a little slow on such a large document, it works fine.

#!/usr/bin/env php
<?php
// Requires the QueryPath library on the include path.
require 'QueryPath/QueryPath.php';
 
// Maximum number of entries to print.
define('MAX_ITEMS', 100);
 
$sitemap = 'http://example.com/sitemap.xml';
 
$urls = array();
print "Parsing sitemap...\n";
 
try {
  // Select the <loc> element of every <url> entry.
  $qp = qp($sitemap, ':root>url>loc');
  $size = $qp->size();
  $max = min($size, MAX_ITEMS);
  printf("Found %d entries; printing top %d\n\n", $size, $max);
 
  foreach ($qp as $url) {
    $loc = $url->text();
    // The priority is a following sibling of <loc>.
    $score = $url->nextAll('priority')->text();
    $urls[$loc] = (float) $score;
  }
} catch (Exception $e) {
  // Without a parsed sitemap there is nothing to sort or print.
  print $e->getMessage() . "\n";
  exit(1);
}
 
// Sort by score, highest first, keeping the URL keys.
arsort($urls);
 
$filter = "%d: %0.5f  %s\n";
 
$i = 0;
foreach ($urls as $uri => $score) {
  if ($i++ == $max) break;
  printf($filter, $i, $score, $uri);
}
?>

Basically, the script above simply fetches all of the URLs out of the sitemap, and then sorts them by their corresponding score. Only the top MAX_ITEMS (100) are shown.
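If QueryPath is not installed, the same fetch-and-sort idea can be sketched with PHP's built-in SimpleXML. This is an alternative illustration, not the script above: the `sitemap_scores()` function and the inline sample sitemap are assumptions made so the example runs on its own, and it falls back to 0.5 (the default priority in the sitemap protocol) when an entry has no priority element.

```php
<?php
// Hypothetical inline sitemap, standing in for a fetched sitemap.xml.
$xml = <<<XML
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/a</loc><priority>0.5</priority></url>
  <url><loc>http://example.com/b</loc><priority>0.9</priority></url>
  <url><loc>http://example.com/c</loc></url>
</urlset>
XML;

// Return an array of URL => priority, sorted with the highest priority first.
function sitemap_scores($xml) {
  $doc = simplexml_load_string($xml);
  $scores = array();
  foreach ($doc->url as $url) {
    // Entries without a <priority> element get the protocol default of 0.5.
    $scores[(string) $url->loc] = isset($url->priority)
      ? (float) (string) $url->priority
      : 0.5;
  }
  arsort($scores);
  return $scores;
}

// Keep only the top two entries and print them with their scores.
$top = array_slice(sitemap_scores($xml), 0, 2, true);
foreach ($top as $uri => $score) {
  printf("%0.5f  %s\n", $score, $uri);
}
```

For a live sitemap, `simplexml_load_file()` accepts a URL in place of the inline string, assuming URL wrappers are enabled in your PHP configuration.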

06 Feb

LibRIS: A PHP library for RIS parsing and writing

in lantern, php, programming, reference, ris

LibRIS is a library for parsing and writing RIS data.

Learn all about it at the official GitHub repository.

RIS is a data file format for handling reference metadata for scholarly resources. It is used by Reference Manager, EndNote, and other such tools. For that reason, it is broadly supported by online scholar-centered sites.

This library provides a simple interface for parsing and writing RIS data for bibliography management.

30 Jan

OS X: Installing MongoDB and the PHP Mongo Driver

in mac, mongodb, os x, php

MongoDB is a full-featured object database. Since it is fast, versatile, and schema-less, you can develop a very complex data storage layer without an ORM, and without any tedious coding. For this reason, I have been investigating MongoDB as a storage layer for PHP. Here's how to set up an environment on OS X Snow Leopard.

In this post we'll do the following:

  • Install MongoDB
  • Add some initial data to MongoDB
  • Install the PHP PECL driver for MongoDB
  • Write a short PHP Script that uses MongoDB
  • Shut down the MongoDB server

27 Jan

OpenAmplify Drupal Series: Part 2 - Building a Mini Portal

in drupal, openamplify, php, programming

The second installment in my three-part series on Drupal and OpenAmplify has been published on their community site. If you missed the first part, you may want to start there. Part three, coming soon, will cover the API, and will focus on development instead of configuration.

In part two, I walk through the process of building a "mini portal" by taking semantic information returned from an OpenAmplify analysis of a node, and using that information in conjunction with other web services. For this demonstration, I released a new version of the module, and added support for Shopping.Com and Bloglines, both of which can return some impressively rich content.

19 Jan

QueryPath on WebMonkey

in php, programming, querypath

It just came to my attention that a WebMonkey article (Parsing HTML? There's an App for That) from a few months ago suggested using QueryPath as an alternative to attempting to parse HTML by hand.

Appropriately, last week I wrote a QueryPath script to analyze a site and extract all links so that I could feed them to Siege and simulate something like a real load against a server. It's nice to be able to easily extract data from HTML.
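As a rough illustration of that kind of link extraction, here is a minimal sketch that uses PHP's built-in DOMDocument rather than QueryPath, so it runs without any extra library. It is not the script I used; the `extract_links()` function and the sample HTML are invented for this example. It prints one URL per line, which is the layout Siege expects in a URL file.

```php
<?php
// Return the unique href values of all <a> elements, in document order.
function extract_links($html) {
  $doc = new DOMDocument();
  // Suppress parser warnings triggered by imperfect real-world markup.
  @$doc->loadHTML($html);

  $links = array();
  foreach ($doc->getElementsByTagName('a') as $a) {
    $href = $a->getAttribute('href');
    // Skip anchors that have no href (named anchors, for example).
    if ($href !== '') {
      $links[$href] = true;
    }
  }
  return array_keys($links);
}

// Hypothetical page content, standing in for a fetched document.
$html = '<html><body>'
  . '<a href="http://example.com/one">One</a>'
  . '<a href="http://example.com/two">Two</a>'
  . '<a href="http://example.com/one">One again</a>'
  . '<a name="top">No href</a>'
  . '</body></html>';

$links = extract_links($html);
print implode("\n", $links) . "\n";
```

Relative links would need to be resolved against the site's base URL before being handed to a load tester; the sketch sidesteps that by using absolute URLs in the sample.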

06 Jan

Acquia Webinar: "Playing Nicely with Others"

in acquia, php, programming, querypath, web services

In our webinar Playing Nicely With Others: Integrating Drupal with Third-Party Data, Ken, George, Larry, and I talk about integrating various web services with Drupal. We talk about SOAP, content importing, digital asset management systems, and QueryPath (surprisingly, I'm not the one plugging QueryPath in this vid).

Thanks to Acquia for doing a fantastic job putting together their webinar series.