Storage track

Linux’s storage stack spans many layers, from filesystems through the VFS layer to low-level device handling. Many of these layers have very active communities and are therefore the subject of much productive discussion. Nevertheless, some aspects of storage do not always get the attention that they require. Here to help us give these aspects the love and attention that they deserve is Matthew Wilcox, who has been a Linux kernel hacker for more than ten years, most recently with Intel’s Open Source Technology Centre.

The first topic is “Evaluating Linux storage APIs for use in QEMU/KVM” by Anthony Liguori. Anthony’s work with the virtio-pci Linux kernel module (which provides virtual I/O support for guest operating systems running on QEMU/KVM) exposed some shortcomings of the current userspace storage APIs used by QEMU and KVM. Cleaning up this API would be very helpful, both for simplicity and for more efficient virtualization. We hope that this discussion will help Linux further burnish its “green” credentials, but with full performance and reduced complexity.

The second topic is “Linux Data De-Duplication” by Mingming Cao. At first glance, the large and growing capacities of disk drives would make any de-duplication a waste of CPU time. However, solid-state disks are not quite so large, and de-duplication can reduce the amount of memory required for buffer cache, especially when running multiple similar operating systems on the same system. Mingming will describe different approaches to de-duplication, including some that have been discussed within the btrfs community. Please bring your ideas and experiences!

The third and final topic is “Locking issues on Clustering File Systems” by Coly Li. In contrast to the first two topics, which involve pushing more workloads onto a single system, Coly is working on clustering many systems together to work on a single problem. Clusters require special coordination, which is often provided by a distributed lock manager (DLM, as in the fs/dlm facility in Linux) and a cluster filesystem (such as OCFS2). These coordination facilities bring their own costs, including lock-mastering expense, lock communication cost, DLM compatibility between fs/dlm and OCFS2, deadlock detection, and so on.

Proposals for this track

* Evaluating Linux storage APIs for use in QEMU/KVM

Discussing limitations of current userspace storage APIs for use in QEMU/KVM.
Storage 06/11/2009
Anthony Liguori

* Linux Data de-duplication

Data de-duplication is an effective way to reduce large storage needs by eliminating redundant data, and it is a highly demanded feature for sharing OS images under virtualization and for efficient data storage backups. Adding data de-duplication support to Linux filesystems would be really valuable, but the feature is quite challenging too. How to get it right? What's the performance impact? Block level or file level? On-the-fly data de-duplication in the filesystem, or background de-duplication from userspace?
Storage 06/11/2009
Mingming Cao
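
To make the "block level" option in the proposal above concrete, here is a minimal userspace sketch of de-duplication by content hashing. It is only an illustration under stated assumptions, not how any Linux filesystem implements the feature: the 4096-byte block size, the use of SHA-256, and the function names `dedup_blocks` and `rebuild` are all hypothetical choices made for this example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for this example


def dedup_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep one copy of each distinct block.

    Returns (store, layout): store maps a content digest to the block bytes,
    and layout lists the digests in the order needed to rebuild the data.
    """
    store = {}
    layout = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        # Only the first block with a given digest is stored; later
        # duplicates just add another reference in the layout.
        store.setdefault(digest, block)
        layout.append(digest)
    return store, layout


def rebuild(store, layout) -> bytes:
    """Reassemble the original data from the block store and the layout."""
    return b"".join(store[digest] for digest in layout)


if __name__ == "__main__":
    # Four blocks of input, but only two distinct block contents.
    data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
    store, layout = dedup_blocks(data)
    print(len(layout), "blocks referenced,", len(store), "blocks stored")
    assert rebuild(store, layout) == data
```

A file-level variant would hash whole files instead of blocks, which is cheaper but misses files that differ only slightly. An in-filesystem implementation would also have to decide whether to trust the digest alone or compare block contents byte for byte on a match, which is part of the performance question raised above.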

* Locking issues on Clustering File Systems

Open discussion of locking issues in clustering file systems, especially those associated with the fs/dlm code.
Storage 06/18/2009
Mark Fasheh

* Migrating Data from Old Hardware to New Hardware

This talk will focus on some of the challenges in migrating data from old, potentially failing hardware to new hardware: dealing quickly with I/O errors, optimizing the list of files to move, and handling failures during migration.
Storage 06/11/2009
Ric Wheeler

* On predicting predictors: hacking archive formats for fun and prophecy

We aim to inform you about the archive formats you use every day. We will include an in-depth look at the tar, ar, cpio, gzip, bzip2, and deb formats, as well as the internals of the Git object store. Armed with this information, we will show you a practical application: removing the redundancy between files in version control and distributions of source and binaries.
Storage 06/22/2009
Josh Triplett, Jamey Sharp

* Proportional IO Controller

The Proportional IO controller distributes disk time to tasks/cgroups in proportion to their assigned weights. It leverages the existing cgroup infrastructure for task grouping and supports hierarchical specification of weights.
Storage 06/23/2009
Divyesh Shah, Nauman Rafique, Vivek Goyal
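
The arithmetic behind proportional, hierarchical weights can be sketched in a few lines. This is a toy model of the idea only, not the controller's code or its cgroup interface: the nested-dictionary layout, the `weight` and `children` keys, and the function name `distribute` are assumptions made for this example.

```python
def distribute(groups, time_ms, prefix=""):
    """Split time_ms among sibling groups in proportion to their weights.

    groups maps a group name to {"weight": int, "children": {...}}, where
    "children" is optional. A group with children passes its whole share
    down to them, again in proportion to the children's weights.
    Returns a dict mapping each leaf group's path to its disk time in ms.
    """
    total_weight = sum(group["weight"] for group in groups.values())
    result = {}
    for name, group in groups.items():
        share = time_ms * group["weight"] / total_weight
        path = f"{prefix}/{name}"
        children = group.get("children")
        if children:
            result.update(distribute(children, share, path))
        else:
            result[path] = share
    return result


if __name__ == "__main__":
    # Hypothetical hierarchy: "a" has twice the weight of "b", and within
    # "a" the child "y" has three times the weight of "x".
    tree = {
        "a": {"weight": 200, "children": {"x": {"weight": 100},
                                          "y": {"weight": 300}}},
        "b": {"weight": 100},
    }
    for path, ms in distribute(tree, 300.0).items():
        print(path, ms)
```

With 300 ms of disk time to hand out, group `a` receives 200 ms and `b` receives 100 ms; the 200 ms given to `a` is then split 50 ms to `x` and 150 ms to `y`. Weights only matter relative to siblings at the same level, which is what makes the hierarchical specification composable.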