Tek-X Webcast: A Developer's Intro to Drupal

March 5, 2010

On March 12, 2010, I will be online with the folks from Tek-X giving a webcast on A Developer's Intro to Drupal. If you're just getting your feet wet with Drupal and are still a little confused about hooks, modules, themes, nodes, or even why Drupal isn't (fully) Object-Oriented, then this session...

Using BetterAWStats in Drupal

February 16, 2010

Our current environment uses AWStats to analyze our HTTP server log files and build reports. Because it has privileged access to our data, and because it is open source, we can glean more information out of it than we could from proprietary hosted analytics platforms. It turns out that there is a PHP front-end to AWStats (called BetterAWStats) that comes complete with a Drupal module. Here, I explain how we've installed and configured this module to get our AWStats data imported into our Atrium server.

## Local Analytics

Google Analytics is all well and good, but it's also nice to have a local analytics package that can be tuned and tweaked to mine more data. In the last few weeks, I've created an environment that fetches log files from our different frontline webservers, merges the logs together, and runs awstats on the results. This gives us a nice method for analyzing logs locally.

BetterAWStats is a PHP frontend to AWStats that comes complete with a Drupal module. We are using it instead of the standard awstats.pl CGI script. And we have it configured to work inside of our Atrium server, which we use for ticket management, documentation, and team communication.

## Installing BetterAWStats

Once the module is downloaded and placed in the module directory, copy over the built-in AWStats icons to the module. (If you want, I guess you could create your own icon sets instead.)

### Copy Icons

On Debian, the icons are in `/usr/share/awstats/icon`. So we do this:

~~~
cp -a /usr/share/awstats/icon/* /var/www/VHOST/sites/all/modules/betterawstats/icons/
~~~

Replace the second path with whatever path goes to your Drupal install.

### Enable and Configure the Module

Next, log in as an administrator to Atrium. Go to the module configuration page and enable *BAW Statistics*. From there, you can go to the configuration page and enter the relevant paths. You will need four paths (Debian paths are shown below):
  • The data directory where AWStats stores its statistical data. (/var/lib/awstats)
  • The library directory where the AWStats tools are located. (/usr/share/awstats/lib)
  • The language directory (/usr/share/awstats/lang)
  • The icon directory. If you followed the instructions above, this should be /sites/all/modules/betterawstats/icons

Save the configuration and you should be good to go.
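If the module reports missing data after saving, a quick sanity check is to confirm that each configured directory exists and is readable. The following is a minimal sketch, not part of the module: `check_baw_paths` is a hypothetical helper name, and the commented-out call assumes the Debian paths listed above plus the VHOST docroot from the earlier `cp` command.

```shell
#!/bin/sh
# Sketch: verify that the directories entered on the BAW Statistics
# configuration page exist and are readable by the current user.
# check_baw_paths is a hypothetical helper, not something the module ships.

check_baw_paths() {
    status=0
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -r "$dir" ]; then
            echo "ok:      $dir"
        else
            echo "missing: $dir"
            status=1
        fi
    done
    # Non-zero if at least one directory was missing or unreadable.
    return "$status"
}

# Example call with the Debian paths from the list above (adjust VHOST):
# check_baw_paths /var/lib/awstats /usr/share/awstats/lib \
#     /usr/share/awstats/lang \
#     /var/www/VHOST/sites/all/modules/betterawstats/icons
```

For a meaningful result, run it as the user your web server runs as, since that is the account that has to read these directories.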

Extra Configuration Settings

In addition to the "normal" stuff, I have found three other configuration steps I've had to take.

  • Fix vhosts configurations
  • Hard-code hostname in settings.php
  • Hack the module's map generator

All three steps are described below.

Make sure you are configured to use virtual host stat files

By default, our Debian package of awstats stores all configuration data in /etc/awstats/awstats.conf and all of the statistics information in /var/lib/awstats/awstatsMOYEAR.txt, where MOYEAR is replaced by a two-digit month code and a four-digit year code (e.g. 022010 for Feb, 2010).

However, BetterAWStats requires that AWStats be configured to use the virtual host setup. The data file must be named according to this convention:

awstatsMOYEAR.DOMAIN.txt

So, for example, I needed a file called /var/lib/awstats/awstats022010.www.spine-health.com.txt
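To make the naming convention concrete, here is a small sketch that builds the expected file name from a domain and a month/year code. `baw_data_file` is a hypothetical helper written for illustration; the default data directory is the Debian path mentioned above.

```shell
#!/bin/sh
# Sketch: build the data file path that follows the
# awstatsMOYEAR.DOMAIN.txt convention described above.
# baw_data_file is a hypothetical helper name.

baw_data_file() {
    domain=$1
    moyear=${2:-$(date +%m%Y)}        # defaults to the current month, e.g. 022010
    data_dir=${3:-/var/lib/awstats}   # Debian default from this post
    printf '%s/awstats%s.%s.txt\n' "$data_dir" "$moyear" "$domain"
}

# baw_data_file www.spine-health.com 022010
# prints /var/lib/awstats/awstats022010.www.spine-health.com.txt
```

Checking for that file with `ls` is a quick way to see whether AWStats is already writing per-domain data.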

To do this, rename your main configuration file from /etc/awstats/awstats.conf to /etc/awstats/awstats.DOMAIN.conf, where DOMAIN is the fully qualified domain name (e.g. /etc/awstats/awstats.www.spine-health.com.conf).

The next time awstats.pl runs, it will store the data in /var/lib/awstats/awstats022010.www.spine-health.com.txt.

(Note that you can move the awstatsMOYEAR.txt file to awstatsMOYEAR.DOMAIN.txt on your own, and preserve any existing history.)
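The two renames described above (the configuration file and the existing history files) can be sketched as a single function. This is an illustration under stated assumptions rather than a tested migration tool: `migrate_awstats_to_vhost` is a hypothetical name, the defaults are the Debian paths from this post, and the directories are parameters so you can try it on copies first.

```shell
#!/bin/sh
# Sketch: switch an AWStats setup to per-domain file names, keeping history.
# migrate_awstats_to_vhost is a hypothetical helper name.

migrate_awstats_to_vhost() {
    domain=$1
    conf_dir=${2:-/etc/awstats}
    data_dir=${3:-/var/lib/awstats}

    # awstats.conf -> awstats.DOMAIN.conf
    if [ -f "$conf_dir/awstats.conf" ]; then
        mv -- "$conf_dir/awstats.conf" "$conf_dir/awstats.$domain.conf"
    fi

    # awstatsMOYEAR.txt -> awstatsMOYEAR.DOMAIN.txt
    # The pattern matches exactly six digits followed by .txt, so files
    # that already carry a domain are left alone.
    for old in "$data_dir"/awstats[0-9][0-9][0-9][0-9][0-9][0-9].txt; do
        [ -f "$old" ] || continue   # nothing matched; the pattern stayed literal
        new="${old%.txt}.$domain.txt"
        if [ -e "$new" ]; then
            echo "skipping $old: $new already exists" >&2
            continue
        fi
        mv -- "$old" "$new"
    done
}

# Example (run as a user allowed to write to both directories):
# migrate_awstats_to_vhost www.spine-health.com
```

The function refuses to overwrite an existing per-domain file, so it is safe to re-run after awstats.pl has already started writing data under the new name.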

Hard-code one of the system variables

We are running stats for a domain other than the domain that our Atrium installation is using. Since we want stats for http://www.spine-health.com, we have to override the default configuration variable in our settings.php file:

$conf['bawstats_defsite'] = 'www.spine-health.com';

Since BetterAWStats stores data in the $_SESSION, I have to log out and back in again for this change to take effect.

Hacking the map generator

As it ships, the bawstats Drupal module cannot render its IP distribution map. The reason for this is that it attempts to execute code outside of Drupal, yet use (unloaded) Drupal functions.

To fix this, I had to make two changes to /sites/all/modules/betterawstats/modules/render_map.inc.php:

Line 78:

if ($module == 'drupal') {
    // NOTE: drupal_get_path() does not exist.
    #$img_url = drupal_get_path('module', 'bawstats');
    $img_url = '/sites/all/modules/betterawstats';
} else {
    $img_url = $BAW_CONF['site_url'];
}

Line 101:

if ($module == 'drupal') {
    // AGAIN, drupal_get_path() is not defined.
    # $BAW_CONF['site_path'] = drupal_get_path('module', 'bawstats');
    $BAW_CONF['site_path'] = '../';
} else {
    include_once("./../config.php");
}

That seems to have fixed the map.

(I did file a bug on this, and it's possible that the issue is already fixed.)

Viewing AWStats Data in Drupal

Now the stats should be available in the Reports section of Drupal. You should be able to get there with a URL like http://example.com/admin/reports/bawstats

Why does Nginx return 499 errors?

February 16, 2010

I noticed something unexpected in my nginx logs today: There were a bunch of 499 HTTP codes in the access log. Oddly, these didn't show up in Google Analytics and there were no corresponding errors in the error log, but they did show up in my AWStats. What's the deal?

The answer is pretty simple: Nginx...

Large MySQL Imports with GoDaddy: How to get your database imported

February 15, 2010

Every once in a while, I have some project that requires working with one of GoDaddy's servers. By far, the biggest frustration for me when dealing with GoDaddy is getting MySQL databases uploaded. I've tried all kinds of crazy tricks, from exporting MySQL databases in "bite sized chunks" to writing...

A QueryPath script for checking on a sitemap

February 15, 2010

I've been tuning our sitemap during the last few months, and one thing I needed was a quick tool to check on the effectiveness of various sitemap generation strategies.

To do this, I wrote a quick QueryPath script (see a full-sized image of the output). The script is explained below.

The code is...

5 Differences: Moving from XML Sitemap module to Google's Sitemap Generators

February 15, 2010

For a large site that I maintain, we recently disabled the XML Sitemap module (we're using the 1.x branch) and switched to the Google Sitemap Generators tool (the Python one). We have noticed a few unsurprising things, and a few very surprising things.

We identified five big differences (all positive...

Downtime-free Drupal Migration

February 11, 2010

In January we migrated a Drupal site that routinely has 40k+ hits per day. We moved the site from servers in the Pacific Northwest to a datacenter in Virginia. As if that wasn't enough, we moved the servers from Apache to Nginx, as well. But what makes this remarkable to me is that we managed to pull...

Google Scholar and RefMan: Configuring Scholar to give downloadable RIS references

February 7, 2010

Did you know that you can configure Google Scholar to provide RIS download links?

RIS is an industry-standard format for importing and exporting bibliography information. Recently I posted a PHP library for working with RIS files. I wanted to find a good search tool that would allow me to find articles...

LibRIS: A PHP library for RIS parsing and writing

February 6, 2010

LibRIS is a library for parsing and writing RIS data.

Learn all about it at the official GitHub repository.

RIS is a data file format for handling reference metadata for scholarly resources. It is used by Reference Manager, EndNote, and other such tools. For that reason, it is broadly supported...

Nginx, tcp_nopush, sendfile, and memcache: The right configuration?

February 1, 2010

Tuning Nginx ("engine-X") seems to be something of a black art. Today, I looked closely at the tcp_nopush, sendfile, and keepalive_requests settings for pages rendered from PHP as a FastCGI, and memcached content. We discovered that with a little careful tuning, we could shave off as much as 200-400...