Sunday, March 29, 2009

Dealing With Sensor Data

A Macroscope in the Redwoods
Tolle et al.

Werner-Allen et al.

Today we discussed two papers which involved environmental monitoring. Although we've seen papers which did this sort of monitoring before (Great Duck Island), both of these papers brought something new to the discussion, in that they tackled using WSN as scientific instruments, with a focus on the collected data.

The first paper we discussed was "Macroscope." Opinion was split over the contributions of this work. Some argued that the real deployment in the redwoods gives a good overview of the capabilities of current WSN for data collection, grounding the field. WSN are here! Data valuable to biologists, that had previously been unattained, was now available. "Macroscope" provided a look into what useful data could be the result of such a deployment, where the problems for future consideration were, and calibration issues.  

On the other hand, some argued that there was nothing altogether too impressive here. Previous works had preformed environmental monitoring, and other fields had data sets far, far larger than what was considered in this paper. It was proposed that perhaps particle physicists, astronomers, and some others may get a good laugh out of the relatively tiny data sets from environmental monitoring.

In "Macroscope," the data yield from the network was far less than ideal, and far less than the data loggers. Someone wondered if it wouldn't be better for a static deployment to just run cables, given the unreliability of wireless communication, and given the need for scientists to miss as few sensor readings as possible. We discussed a few problems with wired connections, including deployment issues, and the weight of the cables being cumbersome for the tiny motes. Someone suggested that WSN could hopefully be deployed by a non technical person, just by switching the nodes on, and letting the network form itself, rather than have to worry about physically connecting things properly.

One point of discussion was in regards to the need for the radio network at all. "Macroscope" points out that they used data both from the on board flash data loggers, and from the network. However, had the logs been reset at deployment time, there would have not been any need for the network, so why bother? Could environmental monitoring be an application where long existing data loggers are more appropriate? Someone made the point that the network allows us to do time synchronization for sensor reading timestamps, without power hungry GPS. Additionally, the data from some applications, such as seismological event monitoring, may have real time value.

This motivated a more general discussion about the potential misuse of motes, where other platforms were more appropriate (not saying that this was the case with "Macroscope"). One impassioned individual argued that there may be a temptation to solve an interesting technical problem for motes, for an application where motes aren't the most practical solution. It may indeed be out of balance to stick a mote on an expensive, power hungry sensor.

As the discussion segued to "Volcano," someone suggested that the success or failure of a particular WSN should be data oriented. Indeed, "Volcano" focused on how well a WSN could replace previously used systems for data collection, using data fidelity and yield as its metrics.

In one sense, this work was less successful than "Macroscope." Between software problems and power outages, the network was down for a significant portion of its deployment, making conclusions about performance potentially troublesome. Additionally, a bug in the time sync protocol made data analysis very difficult. Luckily, a sensor nearby to the deployment was used for "ground truth" to successfully recover as much data as possible. The paper stressed the many problems could occur in deployment which were not apparent in lab testing (such as high velocity rocks destroying an antenna).

One person said that this paper presented itself as "We deployed a broken system, and this is how we extracted useful lessons," rather than "Here's our volcano monitoring system, and here was the performance," as the latter presentation may not have been accepted into a top conference, due to the various problems encountered during deployment.

Someone, perhaps somewhat cynically, suggested that if the system had been used to monitor a forest, rather than a volcano, it would not have been accepted.

Throughout the discussion, we considered which of the papers today, if either, we might include in a cs263 hall of fame. Nobody seemed ready to make a strong commitment to either paper, nor did anybody suggest any of the other papers we read this semester. It seemed hard to directly compare the quality of the contributions of one paper with those of another, and even more so to say that one paper is better, by whatever metric, than most of the rest.