worrbase

Extended Attributes

2015-12-30

While working on support for FreeBSD's extended attributes in Python, I tried to be conscious of other implementations of extended attributes in different operating systems. That way I wouldn't inadvertently cause the same problem that I was trying to fix: reliance on a particular API's semantics.

What are extended attributes?

To put it very simply, extended attributes are metadata that are attached to files. Typically, they're key/value pairs that the filesystem associates with a particular file on the filesystem, though that doesn't always have to be the case.

How they're implemented depends on both the filesystem and the operating system. This means that implementations on the same filesystem (UFS, for example) can be completely incompatible across operating systems (Solaris and FreeBSD).

Extended attributes are not mandated by any standard. The tooling and APIs are quite different across operating systems and some operating systems (OpenBSD, HP-UX) don't implement them at all. Because support is non-standard and spotty, it's rare to see them used in cross-platform software. I'd be super-interested in seeing some counter-examples to this.

Extended attributes are sometimes namespaced. That is to say, there exists some top-level grouping of attributes. Other than the top-level namespace, there usually isn't hierarchy to attributes, other than any arbitrary user-defined hierarchy. Namespaces are usually system and user, although this isn't necessarily consistent, as we'll see. Extended attributes under the system namespace are only modifiable by root (and sometimes only queriable by root).

Linux

ssize_t getxattr(const char *path, const char *name,
                 void *value, size_t size);
ssize_t lgetxattr(const char *path, const char *name,
                 void *value, size_t size);
ssize_t fgetxattr(int fd, const char *name,
                 void *value, size_t size);

ssize_t listxattr(const char *path, char *list, size_t size);
ssize_t llistxattr(const char *path, char *list, size_t size);
ssize_t flistxattr(int fd, char *list, size_t size);

int removexattr(const char *path, const char *name);
int lremovexattr(const char *path, const char *name);
int fremovexattr(int fd, const char *name);

int setxattr(const char *path, const char *name,
              const void *value, size_t size, int flags);
int lsetxattr(const char *path, const char *name,
              const void *value, size_t size, int flags);
int fsetxattr(int fd, const char *name,
              const void *value, size_t size, int flags);

The Linux API is actually a fairly nice one, and for the rest of this post I'm going to use it as my point of comparison. The return values of the getxattr and listxattr functions are the total size of the attribute, not the size of the data returned. This lends itself to a nice idiom for checking whether or not truncation occurred:

char buf[BUFSIZ];
ssize_t res;

res = getxattr("/home/worr/foo", "user.foo", buf, sizeof(buf));
if (res == -1) {
    /* error occurred; errno says why (ERANGE: buf was too small) */
} else if ((size_t)res > sizeof(buf)) {
    /* truncation occurred */
}

If size is zero in a call to getxattr or listxattr (value can then be NULL), the size of the buffer required to hold the contents of the EA will be returned. This allows you to query the amount of space required to hold the return value, allocate it, and then call the function again to populate the buffer. That's unfortunately racy, since the attribute can change between the two calls, so it's preferable to call with a reasonably sized buffer and then realloc if truncation occurs.
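The call-then-grow approach might look something like the sketch below. It is Linux-only, and read_xattr and the 256-byte starting size are names and numbers I made up for illustration. The Linux man page documents ERANGE for a buffer that is too small, so the sketch treats both that error and an oversized return value as "grow and retry".

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Read an extended attribute into a malloc'd buffer, growing it on demand.
 * Returns NULL on error; on success stores the value length in *len.
 * The caller frees the result. (Hypothetical helper, not a library call.) */
void *read_xattr(const char *path, const char *name, size_t *len)
{
    size_t cap = 256;            /* arbitrary starting guess */
    char *buf = NULL;

    for (;;) {
        char *tmp = realloc(buf, cap);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;

        ssize_t res = getxattr(path, name, buf, cap);
        if (res >= 0 && (size_t)res <= cap) {
            *len = (size_t)res;
            return buf;          /* the whole value fit */
        }
        if (res == -1 && errno != ERANGE) {
            free(buf);
            return NULL;         /* a real error, e.g. ENOENT or ENODATA */
        }
        /* Too small: either res reports the full size or errno is ERANGE.
         * Grow and try again; the attribute may have changed in between. */
        cap = (res > 0 && (size_t)res > cap) ? (size_t)res : cap * 2;
    }
}
```

Because the loop re-checks the result every time, an attribute that grows between calls just costs another iteration instead of a silently truncated value.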

Linux extended attributes are namespaced, and the namespace is specified as part of the attribute name. Namespaces are separated from attribute names by a "." (as in user.foo). Currently, Linux supports the common system and user namespaces, as well as security and trusted.

It's important to note that listxattr will never return EPERM. If there are EAs that the current user cannot access, they just won't be returned.

The attribute list returned by listxattr is NUL-delimited (each name ends with a '\0' byte), and all of the attribute names returned by listxattr are fully-qualified.
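Walking that list is mostly pointer arithmetic. Here's a minimal sketch that works on any buffer in this format; each_xattr_name and its callback signature are my own invention for the example, not part of any xattr API.

```c
#include <stddef.h>
#include <string.h>

/* Call fn(name, ctx) for every name in a NUL-delimited list of `len` bytes,
 * such as the one filled in by listxattr. fn may be NULL to just count.
 * Returns the number of complete names found. */
size_t each_xattr_name(const char *list, size_t len,
                       void (*fn)(const char *name, void *ctx), void *ctx)
{
    size_t count = 0;
    size_t off = 0;

    while (off < len) {
        const char *end = memchr(list + off, '\0', len - off);
        if (end == NULL)
            break;               /* unterminated tail: stop rather than overrun */
        if (fn != NULL)
            fn(list + off, ctx);
        count++;
        off = (size_t)(end - list) + 1;   /* skip the name and its NUL */
    }
    return count;
}
```

The len argument would be the return value of a successful listxattr call, not the capacity of the buffer.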

AIX

It seems funny that I'm going to talk about AIX's interface right after Linux's, but that's largely because it's...almost exactly the same.

ssize_t getea(const char *path, const char *name,
        void *value, size_t size);
ssize_t fgetea(int filedes, const char *name, void *value, size_t size);
ssize_t lgetea(const char *path, const char *name,
        void *value, size_t size);

ssize_t listea(const char *path, char *list, size_t size);
ssize_t flistea (int filedes, char *list, size_t size);
ssize_t llistea (const char *path, char *list, size_t size);

int removeea(const char *path, const char *name);
int fremoveea(int filedes, const char *name);
int lremoveea(const char *path, const char *name);

int setea(const char *path, const char *name,
        void *value, size_t size, int flags);
int fsetea(int filedes, const char *name,
        void *value, size_t size, int flags);
int lsetea(const char *path, const char *name,
        void *value, size_t size, int flags);

Just like Linux, getea and listea return the size of the actual attribute value, which makes checking for truncation super easy. They also support getting called with a zero size, which will just return the size of the list or attribute value without writing any data to value.

The only key difference is that they use the character 0xF8 to separate the namespace from the attribute name. So querying for system attributes involves querying the name 0xF8SYSTEM0xF8attr.
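Building such a name by hand is fiddly, so a small helper is handy. This is a sketch based only on the naming scheme described above; aix_ea_name and AIX_EA_SEP are hypothetical names of mine, not something AIX ships.

```c
#include <stdio.h>
#include <string.h>

#define AIX_EA_SEP '\xF8'   /* separator byte described above */

/* Write "<0xF8>NAMESPACE<0xF8>attr" into out as a NUL-terminated string.
 * Returns 0 on success, -1 if out is too small. */
int aix_ea_name(char *out, size_t outlen, const char *ns, const char *attr)
{
    int n = snprintf(out, outlen, "%c%s%c%s",
                     AIX_EA_SEP, ns, AIX_EA_SEP, attr);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

The result can then be passed as the name argument to getea and friends.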

There's also the statea family of functions, which will fill in a struct stat64x, but that's of little consequence to us here.

FreeBSD / NetBSD

ssize_t
extattr_get_fd(int fd, int attrnamespace, const char *attrname, void *data, size_t nbytes);
ssize_t
extattr_get_file(const char *path, int attrnamespace, const char *attrname, void *data, size_t nbytes);
ssize_t
extattr_get_link(const char *path, int attrnamespace, const char *attrname, void *data, size_t nbytes);

int
extattr_set_fd(int fd, int attrnamespace, const char *attrname, const void *data, size_t nbytes);
int
extattr_set_file(const char *path, int attrnamespace, const char *attrname, const void *data, size_t nbytes);
int
extattr_set_link(const char *path, int attrnamespace, const char *attrname, const void *data, size_t nbytes);

int
extattr_delete_fd(int fd, int attrnamespace, const char *attrname);
int
extattr_delete_file(const char *path, int attrnamespace, const char *attrname);
int
extattr_delete_link(const char *path, int attrnamespace, const char *attrname);

ssize_t
extattr_list_fd(int fd, int attrnamespace, void *data, size_t nbytes);
ssize_t
extattr_list_file(const char *path, int attrnamespace, void *data, size_t nbytes);
ssize_t
extattr_list_link(const char *path, int attrnamespace, void *data, size_t nbytes);

FreeBSD and NetBSD both use the same functions for the extended attribute calls. The most obvious difference is that the attribute namespace is no longer part of the attribute name. Each namespace is defined as a constant, and must be passed separately.
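If you're porting code that uses Linux-style names, the translation could be sketched like this. split_linux_name is a hypothetical helper; the EXTATTR_NAMESPACE_* constants come from <sys/extattr.h> on the BSDs, and the fallback values below are stand-ins I've assumed so the sketch compiles elsewhere.

```c
#include <string.h>

#if defined(__FreeBSD__) || defined(__NetBSD__)
#include <sys/types.h>
#include <sys/extattr.h>
#else
/* Stand-in values (an assumption) so the sketch builds on other systems. */
#define EXTATTR_NAMESPACE_USER   1
#define EXTATTR_NAMESPACE_SYSTEM 2
#endif

/* Split "user.foo" into a namespace constant and the bare name "foo".
 * Returns 0 on success, -1 if the namespace is missing or unsupported. */
int split_linux_name(const char *full, int *attrnamespace,
                     const char **attrname)
{
    const char *dot = strchr(full, '.');
    if (dot == NULL || dot[1] == '\0')
        return -1;

    size_t nslen = (size_t)(dot - full);
    if (nslen == 4 && strncmp(full, "user", 4) == 0)
        *attrnamespace = EXTATTR_NAMESPACE_USER;
    else if (nslen == 6 && strncmp(full, "system", 6) == 0)
        *attrnamespace = EXTATTR_NAMESPACE_SYSTEM;
    else
        return -1;               /* e.g. security or trusted: no equivalent */

    *attrname = dot + 1;         /* points into full, no copy made */
    return 0;
}
```

Only the first dot is treated as the separator, so any user-defined hierarchy in the rest of the name survives intact.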

Almost seemingly as a result of this difference, extattr_list can now error with EPERM, rather than hiding the attribute names that the caller doesn't have access to.

The other, more annoying difference is the return value of the extattr_get and extattr_list functions. Rather than behaving like Linux, AIX or OS X, they instead return the number of bytes written, making truncation detection harder. This basically requires that you make two calls if you want to ensure that no truncation will occur.
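Here's a rough sketch of that two-call pattern. To keep it buildable off the BSDs, the real call is hidden behind a hypothetical callback type, ea_get_fn, which is assumed to behave like extattr_get_file with the path, namespace and name already bound. mem_get is an in-memory stand-in for trying it out; none of these names come from a real API.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical getter: with data == NULL it returns the attribute's size,
 * otherwise it returns the number of bytes written into data. */
typedef ssize_t (*ea_get_fn)(void *ctx, void *data, size_t nbytes);

/* Two-call read: ask for the size, allocate, then fetch. Returns NULL on
 * error. Still racy: the value can change between the two calls. */
void *ea_read_two_call(ea_get_fn get, void *ctx, size_t *len)
{
    ssize_t need = get(ctx, NULL, 0);
    if (need < 0)
        return NULL;

    char *buf = malloc((size_t)need + 1);   /* +1 avoids a zero-byte malloc */
    if (buf == NULL)
        return NULL;

    ssize_t got = get(ctx, buf, (size_t)need);
    if (got < 0) {
        free(buf);
        return NULL;
    }
    *len = (size_t)got;
    return buf;
}

/* In-memory stand-in for a real getter. */
struct mem_attr { const char *data; size_t size; };

ssize_t mem_get(void *ctx, void *data, size_t nbytes)
{
    const struct mem_attr *a = ctx;
    if (data == NULL)
        return (ssize_t)a->size;
    size_t n = nbytes < a->size ? nbytes : a->size;
    memcpy(data, a->data, n);
    return (ssize_t)n;
}
```

On FreeBSD, the callback would presumably be a thin wrapper that forwards to extattr_get_file with the arguments stored in ctx.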

OS X

ssize_t
getxattr(const char *path, const char *name, void *value, size_t size, u_int32_t position, int options);
ssize_t
fgetxattr(int fd, const char *name, void *value, size_t size, u_int32_t position, int options);

ssize_t
listxattr(const char *path, char *namebuf, size_t size, int options);
ssize_t
flistxattr(int fd, char *namebuf, size_t size, int options);

int
removexattr(const char *path, const char *name, int options);
int
fremovexattr(int fd, const char *name, int options);

int
setxattr(const char *path, const char *name, void *value, size_t size, u_int32_t position, int options);
int
fsetxattr(int fd, const char *name, void *value, size_t size, u_int32_t position, int options);

OS X differs in a few ways. Notably, their functions all take an options arg. Rather than calling an entirely different function to avoid following symlinks, you pass XATTR_NOFOLLOW in options.
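A portable wrapper has to paper over that difference. A minimal sketch covering just Linux and OS X, with getxattr_nofollow being a name I made up:

```c
#include <sys/types.h>
#include <sys/xattr.h>

/* Read an attribute from a symlink itself rather than from its target. */
ssize_t getxattr_nofollow(const char *path, const char *name,
                          void *value, size_t size)
{
#if defined(__APPLE__)
    /* position 0 reads from the start; XATTR_NOFOLLOW skips traversal */
    return getxattr(path, name, value, size, 0, XATTR_NOFOLLOW);
#else
    /* Linux spells the same thing as a separate function. */
    return lgetxattr(path, name, value, size);
#endif
}
```

The same shape would apply to the list, set and remove calls.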

Another, fairly curious difference is the position argument that's part of the prototype for getxattr. To really get a handle on this, we're going to dive into the wonderful world of forks.

Forks

Forks are kind of like having multiple datastreams for the same file. The data that we typically think of being stored in a file is dumped into one fork (in the case of Mac OS, the data fork) and metadata, resources or any other type of data could exist in other forks, wholly independent.

On Mac OS filesystems (MFS, HFS, HFS+), each file could have at least a resource fork for the purpose of storing resources about a given file. This was used for things like splitting up icons that Finder would use to represent a file, or for separating presentation and content of text documents.

HFS+ (maybe HFS too? I'm not sure) allowed for any number of named forks.

Back to OS X

Extended attributes on OS X are actually just named forks. The extended attribute API wholly supplanted the old resource manager API. To ensure that applications could seek to arbitrary points in a fork, OS X's extended attribute API includes a position argument.

getxattr is similar to Linux, in that it returns the size of the attribute's data, not just the number of bytes read. This makes truncation detection pretty easy.

It is worth noting that extended attribute names in OS X are not namespaced in any special way.

Solaris

Solaris gets weird. Solaris is probably closest to OS X in its implementation of extended attributes, in that extended attributes are just named forks. However, Solaris includes only one specialized function call to deal with extended attributes.

int attropen (const char *path, const char *attrpath, int oflag, ...);
/* the varargs can include a mode argument of type mode_t */

But even this isn't required, since you can get the same results from using a combination of open and openat:

int fd = open(path, O_RDONLY);
int attrfd = openat(fd, attrpath, oflag|O_XATTR, mode);
close(fd);

From there, all of the *at functions can be used to operate on extended attributes with some restrictions:

  • no links between attribute space and non-attribute space
  • no renames between attribute space and non-attribute space
  • only regular files are allowed - no dirs, symlinks or devices

Otherwise, extended attributes are treated like regular files.
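Since an attribute is just a file here, reading one is ordinary file I/O. The following is an untested sketch that reuses the open/openat pair from above and uses fstat to size the buffer; solaris_read_attr is a hypothetical name, and on systems that don't define O_XATTR it simply fails with ENOTSUP.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Read a whole extended attribute by treating it as a file.
 * Returns a malloc'd buffer and stores its length in *len, or NULL on error. */
void *solaris_read_attr(const char *path, const char *attrpath, size_t *len)
{
#ifdef O_XATTR
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return NULL;
    int attrfd = openat(fd, attrpath, O_RDONLY | O_XATTR);
    close(fd);
    if (attrfd == -1)
        return NULL;

    struct stat st;
    if (fstat(attrfd, &st) == -1) {
        close(attrfd);
        return NULL;
    }

    size_t size = (size_t)st.st_size;
    char *buf = malloc(size + 1);
    size_t off = 0;
    while (buf != NULL && off < size) {
        ssize_t n = read(attrfd, buf + off, size - off);
        if (n < 0) {             /* read error: give up */
            free(buf);
            buf = NULL;
        } else if (n == 0) {
            break;               /* attribute shrank underneath us */
        } else {
            off += (size_t)n;
        }
    }
    close(attrfd);
    if (buf != NULL)
        *len = off;
    return buf;
#else
    (void)path; (void)attrpath; (void)len;
    errno = ENOTSUP;             /* no O_XATTR on this platform */
    return NULL;
#endif
}
```

Note that the size from fstat is only a snapshot, so the same race as the two-call pattern elsewhere applies.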

This sucks

This is awful when trying to expose a generic, cross-platform API for extended attributes; the only one that I've found is written for Perl. I had to add support for FreeBSD in Go, Python and Rust, and none of these deal with Solaris or AIX! Adding FreeBSD support was pretty rough, largely since implementors assume that every OS has a Linux-compatible API.

No OS has a Linux-compatible extended attribute API

  • Are attributes namespaced? Are namespaces strings? Are they int constants?
  • Are they named forks? What happens if I need to seek?
  • How big can the data be? How do we check for truncation?
  • Error conditions differ radically

Honestly, I wonder if this contributes to the lack of cross-platform apps that use extended attributes. They're super useful in any case where it's necessary to track metadata about files without having to keep track of it in a separate database. That's honestly fraught with peril anyway, since you're dependent on the name of the file (or whatever identifier you use in your db) staying constant across renames, deletes, etc.

Where to go from here?

A C wrapper lib around all of these implementations would be nice, but there are some obvious trade-offs that need to be made.

The way that I've done this in Python and Rust has been to:

  • Assume Linux-like namespaces, and translate accordingly. If there aren't namespaces in your OS's implementation, then just make the namespace part of the attribute name.
  • Make two calls to get the size of the extended attribute. This works across AIX, Linux and OS X. Solaris will have to use fstatat to get the size. Unfortunately, race conditions abound.
  • When listing extended attributes, ignore EPERM for system-level attributes

Maybe when I get some time, I'll start working on one.

Finally: please, please stop assuming that the whole world is Linux.

Sources