Thursday, March 5, 2009

Time sync and localization: FTSP and Radio Interferometric Localization

The Flooding Time Synchronization Protocol
Maroti et al.

Radio Interferometric Geolocation
Maroti et al.

Today's discussion was a little different from previous classes
because the two papers weren't on the same topic. So discussion was
largely split into two parts.

Part I: FTSP.

We started off by throwing out some thoughts about our likes and
dislikes. Under the "likes" column, people seemed to agree that FTSP
was a generally useful protocol in that it solved a practical problem
well (especially since it achieves errors on the order of tens of
microseconds). Also, we particularly appreciated the authors'
extensive breakdown of the many (ten!) sources of synchronization
error, the magnitudes of each, and their characteristics. The
exposition made it much easier to understand the problem space and
what problem FTSP was trying to solve.

Under the "dislikes" column, people seemed to have some reservations
about the evaluation section since there was no direct evaluation
alongside other synchronization protocols, just comparisons to their
published performance numbers. Along these lines, there were also a
few questions raised about the metrics used to measure FTSP's
performance (we'll get to that a little later).

Root election came up as a discussion topic. It caught our attention
mainly because the root election process was reported to take so long
to converge (17 mins). There was some confusion about what happens
during convergence, when multiple nodes may simultaneously consider
themselves the root. The election algorithm maintains time consistency
when a root relinquishes its status to another node with a lower ID,
but only if the algorithm has previously stabilized and the nodes are
more or less in sync. However, while the system is bootstrapping, clusters in
disjoint parts of the routing tree may have inconsistent times. It
would have been nice if the authors had included some analysis about
how the system behaves during bootstrapping.
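
To make the multi-root transient easier to picture, here is a rough
sketch (in Python, not the authors' code) of the election rule as we
understood it: a node promotes itself to root if it hasn't heard from
any root for a while, and defers to any root with a lower ID. The
field names and the ROOT_TIMEOUT constant are our own illustrative
choices, not values from the paper.

    ROOT_TIMEOUT = 8  # sync periods without hearing a root before self-promotion

    class Node:
        def __init__(self, node_id):
            self.node_id = node_id
            self.root_id = node_id      # optimistically assume we are the root
            self.heartbeats_missed = 0  # periods since we last heard any root

        def on_sync_message(self, msg_root_id):
            # Accept messages from the current root or from any root with a
            # lower ID; this is what eventually collapses multiple would-be
            # roots down to a single one.
            if msg_root_id <= self.root_id:
                self.root_id = msg_root_id
                self.heartbeats_missed = 0

        def on_sync_period(self):
            # If we haven't heard from the current root in a while, promote
            # ourselves; during bootstrapping many nodes may do this at once,
            # which is exactly the multi-root transient discussed above.
            self.heartbeats_missed += 1
            if self.heartbeats_missed >= ROOT_TIMEOUT:
                self.root_id = self.node_id

    # e.g., two nodes that both boot as root; once node 7 hears node 3, it defers:
    a, b = Node(3), Node(7)
    b.on_sync_message(a.root_id)
    print(a.root_id, b.root_id)   # -> 3 3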

The talk about root election brought up a more fundamental question
about whether a root was needed at all. Would it instead be possible
for nodes to arrive, in a distributed way, at a global consensus time?
Recent work by the SYRAH group on DESYNC, a distributed consensus
protocol modeled on pulse-coupled oscillators, might serve as a
primitive for doing just this.

As we delved deeper into the paper, one major issue that came up was
the authors' evaluation metric of average pairwise error. Since the
whole point of this protocol was to institute a global clock amongst
all the nodes, it would seem more prudent to use a metric that
measured each node's deviation from the global clock. Additionally,
the calculation of the average error per hop seemed strange, since it
simply takes the all-pairs average error and divides it by the network
diameter.
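
To make the distinction between the two metrics concrete, here is a
small sketch on made-up clock readings; the numbers and the three-hop
"diameter" are invented for illustration.

    from itertools import combinations

    # Hypothetical snapshot of four node clocks (microseconds); "reference"
    # stands in for the ground-truth global time at the instant of polling.
    reference = 1_000_000
    clocks = {"A": 1_000_004, "B": 999_998, "C": 1_000_011, "D": 1_000_001}

    # Metric used in the paper (as we read it): average error over all node pairs.
    pair_errors = [abs(clocks[a] - clocks[b]) for a, b in combinations(clocks, 2)]
    avg_pairwise_error = sum(pair_errors) / len(pair_errors)

    # Metric we would have preferred: each node's deviation from the global clock.
    deviations = [abs(t - reference) for t in clocks.values()]
    avg_deviation = sum(deviations) / len(deviations)
    max_deviation = max(deviations)

    # The per-hop figure divides the pairwise average by the network diameter.
    network_diameter = 3  # hops, made up for this example
    avg_error_per_hop = avg_pairwise_error / network_diameter

    print(f"avg pairwise error: {avg_pairwise_error:.1f} us")
    print(f"avg deviation from global clock: {avg_deviation:.1f} us "
          f"(max {max_deviation} us)")
    print(f"avg error per hop: {avg_error_per_hop:.1f} us")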

One alternative to FTSP we considered was a protocol that used time
deltas. The idea would be that each node would measure its time
difference with its parent and that the root could figure out the
clock offset of a node by simply summing the deltas along the path to
that node. The drawback to this is that only the root has a concept
of the global time and that this would make it hard to coordinate
nodes (e.g., "at 10:24:56.000 execute command X"). One could conceivably
send a command from the root like "in 00:10:00.000 execute command X"
and each node would adjust the time according to its own clock
offset. However, this would only work if the time for the message to
propagate to the deepest node were smaller than the clock error we are
willing to tolerate at the nodes along the path. Since we're talking
about microseconds, the delta method may not be feasible.
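
Here is a quick sketch of the delta scheme we were imagining; the tree
and the per-hop offsets are invented for illustration (offsets are
"child clock minus parent clock" in microseconds).

    parent = {"B": "A", "C": "A", "D": "B", "E": "D"}   # "A" is the root
    delta_to_parent = {"B": 12, "C": -5, "D": 7, "E": -3}

    def offset_from_root(node):
        """Sum the per-hop deltas along the path from `node` up to the root."""
        total = 0
        while node in parent:
            total += delta_to_parent[node]
            node = parent[node]
        return total

    # The root can now translate its own clock into any node's local clock...
    print(offset_from_root("E"))   # 12 + 7 + (-3) = 16

    # ...but no node other than the root can do this, which is the coordination
    # drawback raised above: a broadcast like "in 10 minutes, do X" still arrives
    # at each node only after an uncompensated propagation delay.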

Part II: Radio Interferometric Geolocation

We started off the discussion of the second paper by again throwing
out our "likes" and "dislikes" about the paper, although there weren't
many "dislikes" to speak of. In particular, people seemed to like the
novelty of the idea and were pleasantly surprised that it was actually
possible to implement. Everyone also seemed to enjoy the fact that
the paper started off with some theoretical underpinnings and then
steadily progressed into a very "engineering" kind of paper
(e.g., dealing with the inaccuracy of the radios). It was striking that
nothing about this paper was naive -- it makes careful allowances for
real-world problems that are not immediately obvious.

One of the highlights of this paper was the method used to defend
against inaccurate radio crystals. The problem they were trying to
solve was that each radio behaved somewhat differently -- when set to
emit at a particular frequency, the emitted signal could be off by up
to 2kHz from nominal. So the authors built in an automatic calibration
step that involved three nodes: nodes A and B transmitted at close
frequencies f1 and f2. Node B was held fixed at f2, and f1 was varied
until the frequency difference observed at a third node C was
minimized. This was pretty clever!
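
To make the idea concrete, here is a rough sketch of that calibration
loop; the crystal offsets, tuning step, and search width are all made
up, and observed_beat_at_C stands in for an actual measurement.

    def observed_beat_at_C(f1_setting, f2_setting):
        # Placeholder for a real measurement: each radio emits some hundreds
        # of Hz (up to ~2 kHz) away from its nominal setting, and node C
        # measures |actual_f1 - actual_f2|.  The offsets below are made up.
        A_ERROR, B_ERROR = 1400.0, -600.0   # per-crystal offsets (Hz)
        return abs((f1_setting + A_ERROR) - (f2_setting + B_ERROR))

    def calibrate(f2_setting, tuning_step=50.0, search_width=4000.0):
        """Return the f1 setting that minimizes the beat frequency seen at C."""
        candidates = []
        f1 = f2_setting - search_width
        while f1 <= f2_setting + search_width:
            candidates.append((observed_beat_at_C(f1, f2_setting), f1))
            f1 += tuning_step
        return min(candidates)[1]

    best_f1 = calibrate(f2_setting=400_000_000.0)
    print("calibrated f1 setting:", best_f1)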

Another clever thing done in this paper was in the measurement of the
frequency and phase offset. Note that what they are sampling is NOT
the carrier frequency. If it were, then with a carrier frequency of
approximately 400MHz, the Nyquist sampling rate would have to be at
least 800MHz -- clearly impossible with an ADC that samples at 9kHz
max. So it was neat that the authors sampled the beat frequency
instead: because the two carrier frequencies were close, the beat
frequency was low and easily sampled at 9kHz.
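
A quick back-of-the-envelope check of the sampling argument (the
specific f1/f2 values are illustrative; the ~400MHz carrier and the
9kHz ADC limit come from the discussion above):

    f1 = 400_000_000.0        # Hz, carrier of the first transmitter
    f2 = 400_000_350.0        # Hz, second transmitter tuned ~350 Hz away

    nyquist_for_carrier = 2 * max(f1, f2)   # ~800 MHz: hopeless for a 9 kHz ADC
    beat = abs(f1 - f2)                     # 350 Hz
    nyquist_for_beat = 2 * beat             # 700 Hz: comfortably under 9 kHz

    print(f"sampling rate needed for carrier: {nyquist_for_carrier/1e6:.0f} MHz")
    print(f"sampling rate needed for beat:    {nyquist_for_beat:.0f} Hz")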

There were some practical considerations that hampered this method.
In particular, the 80 mins it took to localize the nodes seemed a
little onerous, and would be impossible if nodes were at all mobile. Also, the
experiments were performed in a flat outdoor field, presumably with no
signal interference from other sources. Since this method directly
relies on the radio signal, a lack of interference is necessary. This
assumption, however, is not all that realistic. Still, we marveled at
the sheer novelty of the method.