A trap when using stat(2) via Python's ctypes on macOS

I was trying to use getmntinfo(3) on macOS today to get information about mounted file systems. getmntinfo hands back an array of struct statfs, one per mounted file system, containing the desired information. I wasn’t just trying to use it from C, though. That would be too easy. No, I was calling getmntinfo from Python using its ctypes library.

In order to use getmntinfo via ctypes I needed to describe the data types involved to ctypes, particularly struct statfs.

struct statfs is documented in statfs(2), which shows two different versions of the structure: one used when _DARWIN_FEATURE_64_BIT_INODE is defined and one used when it is not. For reference, I read the man page for stat(2) as saying that, barring user intervention with other macros, _DARWIN_FEATURE_64_BIT_INODE will be turned on for macOS 10.6 and later. For context, macOS (then “OS X”) 10.6 was released in August 2009, and I am writing this almost ten years later in 2019. I am running macOS 10.13.

I whipped up some code with the help of ctypeslib like so:

import collections
import ctypes
import ctypes.util

# structs courtesy ctypeslib2.  Generated from /usr/include/sys/mount.h
# on 10.13.6.


class struct_fsid(ctypes.Structure):
    _pack_ = True  # source:False
    _fields_ = [("val", ctypes.c_int32 * 2)]


class struct_statfs(ctypes.Structure):
    _pack_ = True  # source:False
    _fields_ = [
        ("f_bsize", ctypes.c_uint32),
        ("f_iosize", ctypes.c_int32),
        ("f_blocks", ctypes.c_uint64),
        ("f_bfree", ctypes.c_uint64),
        ("f_bavail", ctypes.c_uint64),
        ("f_files", ctypes.c_uint64),
        ("f_ffree", ctypes.c_uint64),
        ("f_fsid", struct_fsid),
        ("f_owner", ctypes.c_uint32),
        ("f_type", ctypes.c_uint32),
        ("f_flags", ctypes.c_uint32),
        ("f_fssubtype", ctypes.c_uint32),
        ("f_fstypename", ctypes.c_char * 16),
        ("f_mntonname", ctypes.c_char * 1024),
        ("f_mntfromname", ctypes.c_char * 1024),
        ("f_reserved", ctypes.c_uint32 * 8),
    ]


Mount = collections.namedtuple("Mount", "src dst fs_type")


def get_mount_info():
    libc = ctypes.cdll.LoadLibrary(ctypes.util.find_library("libc"))
    getmntinfo = libc["getmntinfo"]
    getmntinfo.argtypes = [
        ctypes.POINTER(ctypes.POINTER(struct_statfs)),
        ctypes.c_int,
    ]
    getmntinfo.restype = ctypes.c_int
    mounts = ctypes.POINTER(struct_statfs)()
    num_mounts = getmntinfo(ctypes.byref(mounts), 0)
    if num_mounts <= 0:
        raise Exception("invalid num_mounts=%r" % (num_mounts,))
    mount_objs = []
    for i in range(num_mounts):
        mount = mounts[i]
        mount_objs.append(
            Mount(mount.f_mntfromname, mount.f_mntonname, mount.f_fstypename)
        )
    return mount_objs

Quite mysteriously, this didn’t work. I would end up with garbage in the statfs structs I got back. What ever was going on?
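When a ctypes structure comes back full of garbage, one cheap thing to rule out is the Python-side layout. The sketch below is not part of my original code. It restates the same fields under the hypothetical names FsId and StatFs64, and uses a hypothetical helper field_offsets to print the total size and a couple of offsets. I left _pack_ out on the assumption that it makes no difference here, since every field already sits on its natural alignment.

```python
import ctypes


class FsId(ctypes.Structure):
    # Hypothetical name; mirrors struct_fsid above.
    _fields_ = [("val", ctypes.c_int32 * 2)]


class StatFs64(ctypes.Structure):
    # Hypothetical name; same fields, in the same order, as struct_statfs above.
    _fields_ = [
        ("f_bsize", ctypes.c_uint32),
        ("f_iosize", ctypes.c_int32),
        ("f_blocks", ctypes.c_uint64),
        ("f_bfree", ctypes.c_uint64),
        ("f_bavail", ctypes.c_uint64),
        ("f_files", ctypes.c_uint64),
        ("f_ffree", ctypes.c_uint64),
        ("f_fsid", FsId),
        ("f_owner", ctypes.c_uint32),
        ("f_type", ctypes.c_uint32),
        ("f_flags", ctypes.c_uint32),
        ("f_fssubtype", ctypes.c_uint32),
        ("f_fstypename", ctypes.c_char * 16),
        ("f_mntonname", ctypes.c_char * 1024),
        ("f_mntfromname", ctypes.c_char * 1024),
        ("f_reserved", ctypes.c_uint32 * 8),
    ]


def field_offsets(struct_type):
    """Map each field name of a ctypes.Structure subclass to its byte offset."""
    return {
        field[0]: getattr(struct_type, field[0]).offset
        for field in struct_type._fields_
    }


offsets = field_offsets(StatFs64)
print(ctypes.sizeof(StatFs64))   # 2168
print(offsets["f_fstypename"])   # 72
print(offsets["f_mntonname"])    # 88
```

If a small C program printing sizeof(struct statfs) and offsetof(struct statfs, f_mntonname) on the same machine agrees with these numbers, the structure definition is probably fine and the problem lies elsewhere.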

Well, let’s read stat(2) some more, with some emphasis added by me:

_DARWIN_FEATURE_64_BIT_INODE

In order to accommodate advanced capabilities of newer file systems, the struct stat, struct statfs, and struct dirent data structures were updated in Mac OSX 10.5. […] On platforms that existed before these updates were available, ABI compatibility is achieved by providing two implementations for related functions: one using the legacy data structures and one using the updated data structures. Variants which make use of the newer structures have their symbols suffixed with $INODE64. These $INODE64 suffixes are automatically appended by the compiler tool-chain and should not be used directly.

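Yes, so although my Python binary was built with 64-bit inode support, that support was slipped in at compile time: the tool-chain rewrote each reference to getmntinfo into a reference to getmntinfo$INODE64. ctypes looks symbols up by name at run time, so nothing rewrites the name for me. Naïvely asking for getmntinfo gets the old, pre-64-bit-inode version of the function, which returns a shorter struct statfs, and reading that through my 64-bit-inode struct definition produces garbage.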
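To see which variants a library actually exports, something like the following rough sketch can help. The helper name exported_variants is made up for illustration. It relies only on the fact that indexing a ctypes library object with a name that cannot be resolved raises AttributeError.

```python
import ctypes
import sys


def exported_variants(lib, base_name, suffixes=("$INODE64", "")):
    """Return the names base_name + suffix that `lib` can resolve, in order."""
    found = []
    for suffix in suffixes:
        name = base_name + suffix
        try:
            lib[name]  # ctypes resolves the symbol here, by its literal name
        except (AttributeError, KeyError):
            continue
        found.append(name)
    return found


if sys.platform == "darwin":
    # CDLL(None) searches the libraries already loaded into this process.
    libc = ctypes.CDLL(None)
    print(exported_variants(libc, "getmntinfo"))
```

On a system that exports both symbols the list has two entries, which is the whole trap: the plain name resolves without complaint, it just isn’t the function the compiler would have picked.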

So, naturally, here’s how I modified my code:

def get_mount_info():
    libc = ctypes.cdll.LoadLibrary(ctypes.util.find_library("libc"))
    try:
        getmntinfo = libc["getmntinfo$INODE64"]
    except (KeyError, AttributeError):
        # HOPE THIS DOESN'T RETURN A DIFFERENT STATFS STRUCT.
        getmntinfo = libc["getmntinfo"]
    getmntinfo.argtypes = [
        ctypes.POINTER(ctypes.POINTER(struct_statfs)),
        ctypes.c_int,
    ]
    getmntinfo.restype = ctypes.c_int
    # ...

I don’t ever intend to run this code on macOS < 10.6 (or < 10.13, for that matter), so I try to use getmntinfo$INODE64 if I can, regular ol’ getmntinfo otherwise, preparing for the day when Apple removes this ABI compatibility trick.