Python on Mac: Using the xattr functions to get extended attributes

Feb 9 2009

Have you ever noticed that when you copy files from your Mac to, say, a Linux or Windows server, you sometimes end up with a bunch of extra files whose names begin with ._? Those are metadata files (Apple calls the format AppleDouble). They store extended attributes and other Mac-specific metadata for the files they sit next to. Why don't they appear on Mac OS X? Because on OS X, extended attributes are stored in the file system itself, not in auxiliary hidden files. In this article, I am going to cover how you can use Python on your Mac to access those extended attributes.
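
To make the pairing concrete, here is a small sketch (not part of any library) that matches ._ files to the files they describe, given a plain list of names such as the one os.listdir() would return on the server. The function name sidecar_pairs is made up for illustration. It relies only on the naming convention: the metadata for a file called name is kept in ._name alongside it.

```python
def sidecar_pairs(names):
    """Map each '._name' metadata file to the regular file it describes.

    Only the naming convention is checked; file contents are never read.
    Metadata files whose original is missing from the listing map to None.
    """
    present = set(names)
    pairs = {}
    for name in names:
        if name.startswith('._') and len(name) > 2:
            original = name[2:]
            pairs[name] = original if original in present else None
    return pairs


# Sample listing, invented for this example.
listing = ['._test.php', 'test.php', 'notes.txt', '._gone.txt']
print(sorted(sidecar_pairs(listing).items()))
```

An entry that maps to None is an orphan: the metadata file was copied, but the file it belongs to is not in the listing.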

OS X's HFS+ file system supports file-level metadata stored in extended attributes. Essentially, extended attributes are maintained as a list of key/value pairs, and C programs reach them through the xattr family of system calls (getxattr, setxattr, listxattr, and removexattr). While poking around in the Python interpreter, I was surprised to discover that the version of Python that shipped with my Mac has an xattr library, too.

Recently, I read that TextMate stores its bookmarks (which appear as little stars in a file's margin) in extended attributes. It seemed to me that it might be handy to extract this and other metadata from files so that, perhaps, external shell scripts could use that information. (Yes, I have a plan... no, I'm not going to talk about it right now.)

So I decided to do a little fetching of that data. Turns out, this is a trivially easy task. From a Python interpreter, we can see all of the attributes attached to a file. I'm starting with test.php, a simple text file with bookmarks on the first and fourth lines.

The first thing to do is list the extended attributes for my test.php file:

Python 2.5.1 (r251:54863, Feb  4 2008, 21:48:13) 
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import xattr
>>> xattr.listxattr('./test.php')
(u'com.macromates.bookmarked_lines', u'com.macromates.caret')
>>> 

There are only two lines of code above. First, we import the xattr library. Second, we run xattr.listxattr('./test.php'), which simply lists all of the extended attributes on ./test.php.

There are two extended attributes:

  • com.macromates.bookmarked_lines
  • com.macromates.caret

(Recall that the leading u in the output marks each string as Unicode.)
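
The names above look like reverse-DNS identifiers, with com.macromates being TextMate's prefix. If a file carries attributes from several applications, one simple way to pick out the ones you care about is to filter on that prefix. The sketch below works on any sequence of names, such as the tuple listxattr() returned. The helper name names_with_prefix is hypothetical, and the com.apple.FinderInfo entry in the sample data is there only to show something being filtered out.

```python
def names_with_prefix(names, prefix):
    """Return the attribute names that start with prefix, in their original order."""
    return [name for name in names if name.startswith(prefix)]


# Sample data: the two names from the session above plus one unrelated name.
names = ('com.macromates.bookmarked_lines',
         'com.macromates.caret',
         'com.apple.FinderInfo')

for name in names_with_prefix(names, 'com.macromates.'):
    print(name)
```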

Of these two attributes, the first is surely the one we are interested in. We can get that attribute using (surprise!) getxattr().

Python 2.5.1 (r251:54863, Feb  4 2008, 21:48:13) 
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import xattr
>>> xattr.listxattr('./test.php')
(u'com.macromates.bookmarked_lines', u'com.macromates.caret')
>>> xattr.getxattr('./test.php','com.macromates.bookmarked_lines')
'(\n    0,\n    3\n)'

The last line of code is the only thing new: xattr.getxattr('./test.php','com.macromates.bookmarked_lines'). The getxattr() function takes two arguments:

  • The file to examine
  • The name of the attribute

It returns the value of the attribute as a plain string. As far as I can tell, the value can be just about anything. The bookmarked lines attribute returned text that looks like a tuple or list, but it is still just a string: note the quotes and the \n escapes in the output above. Other attributes I've looked at are binary data. It just depends on what the programmer decides to put in the attribute value.
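
Since the bookmark value comes back as text rather than as a Python list, a script has to parse it before it can use the numbers. Here is a minimal sketch, assuming the value always looks like the sample above (a parenthesized, comma-separated run of integers). It is not a general parser for whatever format TextMate uses, and the function name parse_bookmarked_lines is made up. The indexes appear to be zero-based, since bookmarks on the first and fourth lines were stored as 0 and 3.

```python
import re


def parse_bookmarked_lines(raw):
    """Return the line indexes stored in a bookmarked_lines value as ints."""
    if isinstance(raw, bytes):
        # On current Python versions the value arrives as bytes;
        # assume the text inside is ASCII/UTF-8.
        raw = raw.decode('utf-8')
    return [int(number) for number in re.findall(r'\d+', raw)]


raw_value = '(\n    0,\n    3\n)'        # the value shown in the session above
indexes = parse_bookmarked_lines(raw_value)
print(indexes)                           # [0, 3]
print([i + 1 for i in indexes])          # [1, 4]: the first and fourth lines
```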

There are a few other xattr functions you may find useful:

  • removexattr(): Removes an extended attribute from a file.
  • setxattr(): Adds an extended attribute name/value pair to a file.
  • xattr(): Wraps a file in a dict-like interface.
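
As a rough sketch of how the first two fit together, the helper below writes an attribute, reads it back, and removes it again. It assumes the signatures setxattr(path, name, value) and removexattr(path, name), by analogy with the getxattr() call shown earlier, so check help(xattr) on your system before relying on it. The attribute name com.example.note and the helper roundtrip_note are invented for illustration.

```python
try:
    import xattr            # present on the Mac used in this article
except ImportError:         # elsewhere, pass in a compatible object instead
    xattr = None

NOTE_ATTR = 'com.example.note'   # hypothetical attribute name


def roundtrip_note(path, text, xa=None):
    """Set NOTE_ATTR on path, read it back, remove it, and return what was read."""
    xa = xa or xattr
    if xa is None:
        raise RuntimeError('the xattr module is not available')
    xa.setxattr(path, NOTE_ATTR, text.encode('utf-8'))   # store raw bytes
    try:
        return xa.getxattr(path, NOTE_ATTR)
    finally:
        xa.removexattr(path, NOTE_ATTR)                   # clean up either way
```

On a Mac, roundtrip_note('./test.php', 'reviewed') should hand back the stored value and leave the file's attribute list as it was before the call.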

I think the last one is sorta cool. Here's what it looks like in action:

>>> attrs = xattr.xattr('./test.php')
>>> for name in attrs:
...   print name
...   print attrs.get(name)
... 
com.macromates.bookmarked_lines
(
    0,
    3
)
com.macromates.caret
{
    column = 0;
    line = 5;
}
>>> 

In the sample above, attrs = xattr.xattr('./test.php') assigns attrs a new dict-like object wrapping the extended attributes. From there, we can loop over the data just like we would do with any dict.
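
Because the wrapper behaves like a dict, it is easy to copy everything into an ordinary dict for later use. The sketch below does that and decodes each value to text along the way. The name snapshot is hypothetical, and a plain dict stands in for xattr.xattr('./test.php') so the example runs on any machine. The sample values are reconstructed from the printed output, so their exact whitespace is an assumption.

```python
def snapshot(attrs):
    """Copy a dict-like set of attributes into a plain dict of name -> text."""
    result = {}
    for name in attrs:
        value = attrs.get(name)
        if isinstance(value, bytes):
            # Some attributes hold binary data. Undecodable bytes are replaced
            # rather than raising, which is fine for display but lossy.
            value = value.decode('utf-8', 'replace')
        result[name] = value
    return result


# Stand-in for xattr.xattr('./test.php'); on a Mac, pass the real object.
stand_in = {
    'com.macromates.bookmarked_lines': b'(\n    0,\n    3\n)',
    'com.macromates.caret': b'{\n    column = 0;\n    line = 5;\n}',
}
for name, text in sorted(snapshot(stand_in).items()):
    print(name)
    print(text)
```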

In our loop, we print the name, and then the value. Iterating over two extended attributes, we first get:

com.macromates.bookmarked_lines
(
    0,
    3
)

(This is the bookmark attribute we looked at before.)

Then we also get:

com.macromates.caret
{
    column = 0;
    line = 5;
}

This attribute indicates where the cursor is positioned in the saved document.
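
The caret value is also plain text, so a script that wants the numbers has to pull them out. Here is a minimal sketch, assuming the value always has the key = number; shape printed above. It is a pattern match on that sample and not a real parser for the format, and the name parse_caret is made up. I don't know whether TextMate counts the caret's line and column from zero, so the numbers are returned unchanged.

```python
import re


def parse_caret(raw):
    """Turn '{ column = 0; line = 5; }' style text into {'column': 0, 'line': 5}."""
    if isinstance(raw, bytes):
        raw = raw.decode('utf-8')        # assume ASCII/UTF-8 text
    pairs = re.findall(r'(\w+)\s*=\s*(\d+)\s*;', raw)
    return dict((key, int(value)) for key, value in pairs)


caret = parse_caret('{\n    column = 0;\n    line = 5;\n}')
print(sorted(caret.items()))             # [('column', 0), ('line', 5)]
```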

So there you have it. With just a few lines of Python code, you can conveniently extract extended attribute information from Mac OS X files.