Monday, March 16, 2009

Storage: Capsule and FlashDB

Capsule: An Energy-Optimized Object Storage System for Memory-Constrained Sensor Devices
Mathur et al.

FlashDB: Dynamic Self-Tuning Database for NAND Flash
Nath and Kansal

The topic of today's discussion was flash storage in wireless sensor networks. The class started by discussing some higher-level issues about storage on motes and about flash storage in general before digging into the specifics of the assigned papers. First, one person questioned the assumption that mote applications are write-intensive. His argument was that motes tend to have low duty cycles, so there is not too much data written after all. In response, one student suggested that the write-intensive classification is based on the read-write ratio instead of basing it on the frequency of writes.

Then, one person questioned the need for advanced storage primitives, such as queues or files, since in practice, most applications use only simple append-only logs. However, some motes store their configuration as files on flash, and filesystem provides a familiar storage abstraction for programmers. If motes supported a filesystem recognized by PC's, such as FAT, we would be able to just pull out an SD card and plug it to a laptop in order to access the data. However, implementing FAT on a mote can be quite challenging, so the network could have few more capable, FAT-enabled motes, which would gather data from other simpler motes.

We were also wondering why we would need a lot of storage on a mote. For example, if the application sampled two sensors every 20 minutes, producing 4 bytes each time, a 2 GB disk would be filled in approximately 20,000 years. Even in the case of data-intensive applications, it would take at least several months in order to fill the disk. Another concern that we discussed was pulling out data once it was collected, since it is effectively impossible to stream several gigabytes of data using the radio. Instead, the users should be able to send queries to motes and retrieve only the data they actually need, which justifies the need for indexes and structured storage in general.

Then, the topic of discussion changed to flash storage and its properties. Someone pointed out that a mote is inherently incapable of efficiently performing updates in place. In order to do so, it must read the entire block to memory before erasing it, which requires more memory than it is found on typical motes. The alternative is to copy the entire block elsewhere in the flash, erase the original block, and then copy it back while performing the requested changes on fly.

In flash, it is possible to turn 0's into 1's without having to erase the entire block. One person suggested that this could be leveraged to change already stored pointers to null pointers if we represent them as FFFF instead of 0000. Another student suggested that the application can leave some of the written data as 0‘s and update it in place later. However, we realized that this would not work, because the flash controllers do not expose the interface for performing such operations.

Finally, our attention turned to Capsule, the first paper assigned for today. The system uses log-structured storage, which seems to be the only reasonable storage model for motes. This approach works well for write-intensive applications at the cost of slow reads. For example, accessing the head of a queue requires the application to traverse it all the way from back to the front. On personal computers, this performance issue is addressed by using large buffers, but on motes, we do not have the luxury to do so.

Most of the remaining discussion of the paper revolved around log cleaning and compaction. Because of the log-based approach, the flash storage can quickly fill up with irrelevant (deleted) data. But in order to use reclaim the space, the log must be compacted due to the must-erase-before-write nature of flash. Compaction, however, takes a lot of energy and time. One of our concerns was that Capsule might start this process even if the client application could be in the middle of a high-rate data capture. Another concern was that when there is nothing to be compacted, the storage component would attempt to do so anyway and just waste energy. Finally, someone pointed out that this scheme effectively allows only 1 GB of storage on a 2 GB device, since it has to leave enough space for compaction.

Towards the end of the class period, we shifted our attention to FlashDB. One of the first observations was that using the B+ tree nodes in the disk modes requires in-place updates, which would not work on typical motes. We highlighted FlashDB's ability to switch the mode of operation online - instead of just determining the appropriate mode before deployment.

We were wondering why we would need the online self-tuning capability until one student suggested an example, in which a mote samples data during the day and streams it to the base station during the night. Ideally, the B+ tree would be in the write-intensive (log) mode during the day and in the read-intensive (disk) mode during the night. However, we were not convinced that FlashDB would make this switch early enough, especially since this dynamics was not evaluated in the paper. Perhaps it would be beneficial if the client application could disable self-tuning and perform it manually.