07 Feb

Google Scholar and RefMan: Configuring Scholar to give downloadable RIS references

in google, ris

Did you know that you can configure Google Scholar to provide RIS download links?

RIS is an industry-standard format for importing and exporting bibliography information. Recently I posted a PHP library for working with RIS files. I wanted to find a good search tool that would allow me to find articles, and then download them into Lantern (a project I will release soon).

RefMan is a popular tool that also uses the RIS format. So to enable RIS downloads, simply tell Google Scholar to provide RefMan support.

Here are the steps to do this:

  1. Go to Google Scholar (http://scholar.google.com).
  2. Click Scholar Preferences next to the Search button.
  3. Scroll to the bottom of the configuration screen to Bibliography Manager and choose RefMan.

Here's a screenshot showing the last step.
[Screenshot: Bibliography Manager Settings]

Once you have saved those preferences, every article in your search results should have an Import into RefMan link next to it.

06 Feb

LibRIS: A PHP library for RIS parsing and writing

in lantern, php, programming, reference, ris

LibRIS is a library for parsing and writing RIS data.

Learn all about it at the official GitHub repository.

RIS is a data file format for handling reference metadata for scholarly resources. It is used by Reference Manager, EndNote, and other such tools. For that reason, it is broadly supported by online scholar-centered sites.

This library provides a simple interface for parsing and writing RIS data for bibliography management.
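
For readers who have not seen the format, here is a minimal sketch of what a RIS record looks like. Each line is a two-character tag, two spaces, a hyphen, a space, and a value; a record opens with a TY (type) tag and closes with an ER tag. The field values and the sample.ris file name are made up for illustration, and the shell commands below are not part of LibRIS.

```shell
#!/bin/sh
# Write a made-up RIS record to sample.ris (all field values are illustrative).
cat > sample.ris <<'EOF'
TY  - JOUR
AU  - Doe, Jane
TI  - An Example Article Title
JO  - Journal of Examples
PY  - 2010
ER  - 
EOF

# Count the records in the file: every record starts with a TY tag.
grep -c '^TY  - ' sample.ris
```

A file exported from a reference manager would typically contain many such records, one after another.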

01 Feb

Nginx, tcp_nopush, sendfile, and memcache: The right configuration?

in linux, nginx, system administration

Tuning Nginx ("engine-X") seems to be something of a black art. Today, I looked closely at the tcp_nopush, sendfile, and keepalive_requests settings, both for pages rendered by PHP over FastCGI and for content served from memcached. We discovered that with a little careful tuning, we could shave off as much as 200-400 msec per request.

I have been working on several speed improvements on the Condition Centers at Spine-Health.com. Initially, these pages were taking upwards of 3.5 seconds just to render the HTML. Through a series of optimizations that I will document in another article, we have the conditions page rendering in around 100 msec now.

Before we get going, let me mention a few details of our system:

  • We are running CentOS 5.3 (roughly equivalent to RHEL 5.3)
  • We are running Nginx 0.6, which is behind the current stable release, but is the latest in the Fedora EPEL repositories that we use.
  • Since these settings make use of low-level kernel facilities (like the TCP_CORK socket option), other platforms may differ.

30 Jan

OS X: Installing MongoDB and the PHP Mongo Driver

in mac, mongodb, os x, php

MongoDB is a full-featured object database. Since it is fast, versatile, and schema-less, you can develop a very complex data storage layer without an ORM, and without any tedious coding. For this reason, I have been investigating MongoDB as a storage layer for PHP. Here's how to set up an environment on OS X Snow Leopard.

In this post we'll do the following:

  • Install MongoDB
  • Add some initial data to MongoDB
  • Install the PHP PECL driver for MongoDB
  • Write a short PHP Script that uses MongoDB
  • Shut down the MongoDB server

27 Jan

OpenAmplify Drupal Series: Part 2 - Building a Mini Portal

in drupal, openamplify, php, programming

The second part of my three-part series on Drupal and OpenAmplify has been published on their community site. If you missed the first part, you may want to start there. Part three, coming soon, will cover the API, and will focus on development instead of configuration.
[Image: Part 2]
In part two, I walk through the process of building a "mini portal" by taking semantic information returned from an OpenAmplify analysis of a node, and using that information in conjunction with other web services. For this demonstration, I released a new version of the module, and added support for Shopping.Com and Bloglines, both of which can return some impressively rich content.

19 Jan

QueryPath on WebMonkey

in php, programming, querypath

It just came to my attention that a WebMonkey article (Parsing HTML? There's an App for That) from a few months ago suggested using QueryPath as an alternative to attempting to parse HTML by hand.

[Image: Webmonkey on QueryPath]

Appropriately, last week I wrote a QueryPath script to analyze a site and extract all links so that I could feed them to Siege and simulate something like a real load against a server. It's nice to be able to easily extract data from HTML.

15 Jan

OS X: Using curl instead of wget

in bash, curl, mac, os x, wget

OS X does not come with wget, a command-line tool for retrieving websites. For a while, I grumbled about this. I knew that curl was installed, but I hadn't ever used curl from the command line. But once I tried it out, I realized that for my needs, curl is just as good as wget... and I don't have to install anything extra to get it.

Here's how to use curl to fetch a remote URL:

$ curl -OL http://spine-health.com/index.php
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 34646    0 34646    0     0  15314      0 --:--:--  0:00:02 --:--:-- 17767

This will download the remote file and store it locally in index.php. If you leave off the -O, it will write the file to standard output (your terminal, usually).

curl: Remote file name has no length!

Fetching from a URL's root can behave differently. If you perform the same command as above, but pointing to the base URL, you will get an error:

$ curl -OL http://spine-health.com/
curl: Remote file name has no length!
curl: try 'curl --help' or 'curl --manual' for more information

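What's going on here? The error is not terribly informative on this point. The problem is that curl doesn't know where to write the output file: -O names the local file after the last segment of the URL's path, and a URL that ends in a slash has no such segment. A better command here is something like this: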

$ curl -L http://spine-health.com/ > out.html

This time, the retrieved data will be written to the out.html file.
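
If you fetch a mix of URLs, the choice between the two forms can be scripted. What follows is only a sketch: fetch is a hypothetical helper name, index.html is an arbitrary default, and it uses curl's -o option (write to a named file) instead of shell redirection. The -s and -S flags hide the progress meter while still reporting errors.

```shell
#!/bin/sh
# fetch URL [FALLBACK]
# Hypothetical helper: save URL under its remote file name when it has one,
# otherwise under FALLBACK (default: index.html).
fetch() {
  url=$1
  fallback=${2:-index.html}
  rest=${url#*://}   # drop the scheme, if there is one
  case "$rest" in
    */)  curl -sSL -o "$fallback" "$url" ;;  # ends in a slash: no file name
    */*) curl -sSL -O "$url" ;;              # path ends in a usable file name
    *)   curl -sSL -o "$fallback" "$url" ;;  # bare host name: no file name
  esac
}
```

With this in place, fetch http://spine-health.com/index.php saves to index.php, while fetch http://spine-health.com/ out.html saves to out.html. URLs with query strings are passed to -O unchanged, so the saved name will include the query string.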

Like wget, curl has many options. You can read the man page for details, or you can get a quick summary with the usual curl --help command.

13 Jan

OpenAmplify Drupal Series: Part 1 - The Amplify Module

in drupal, openamplify, programming

Over at OpenAmplify's Community site, they are running Part 1 of a three-part series I've written about using OpenAmplify with Drupal.
[Image: Open Amplify]
The first part covers the basics of using Acquia Drupal and the Amplify module to perform semantic analysis of your content.

13 Jan

Nagios: Fixing "error: Could not stat() command file" (on Debian)

in Debian, linux, monitoring, nagios

Nagios is a network monitoring tool. I use it to track web servers, mail servers, and whatever else I have running on the LAN and on the Internet.

One common configuration issue is getting the Service Commands menu to work correctly. By default, it is visible in the UI, but disabled on the server backend. And on Debian, not all of the steps to enable it are particularly evident from the docs. Often, one will receive the cryptic error Could not stat() command file pointing to /var/lib/nagios3/rw/nagios.cmd. This can be fixed without too much fuss.
[Screenshot: Nagios Service Commands]

06 Jan

Acquia Webinar: "Playing Nicely with Others"

in acquia, php, programming, querypath, web services

In our webinar Playing Nicely With Others: Integrating Drupal with Third-Party Data, Ken, George, Larry, and I talk about integrating various web services with Drupal. We talk about SOAP, content importing, digital asset management systems, and QueryPath (surprisingly, I'm not the one plugging QueryPath in this vid).

Thanks to Acquia for doing a fantastic job putting together their webinar series.