Linus sequence

The sequence composed of 1s and 2s obtained by starting with the number 1, and picking subsequent elements to avoid repeating the longest possible substring. The first few terms are 1, 2, 1, 1, 2, 2, 1, 2, 1, 1, 2, 1, 2, 2, … (Sloane’s A006345). The Sally sequence gives the length of the run that was avoided. (From Wolfram’s MathWorld)
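A minimal sketch of how the terms can be generated, assuming that "avoid repeating the longest possible substring" means choosing the next element so that the longest suffix immediately preceded by a copy of itself is as short as possible. The function names are made up for this illustration. The second list returned is the length of the run that was avoided at each step, i.e. the Sally sequence from its second term onward.

```python
def longest_doubled_suffix(seq):
    """Length k of the longest suffix that is immediately preceded by a copy of itself."""
    best = 0
    for k in range(1, len(seq) // 2 + 1):
        if seq[-k:] == seq[-2 * k:-k]:
            best = k
    return best


def linus(n):
    """Return the first n terms (n >= 1) and the run length avoided at each later step."""
    terms, avoided = [1], []
    while len(terms) < n:
        # Repeat length that each candidate next term would create.
        repeat = {c: longest_doubled_suffix(terms + [c]) for c in (1, 2)}
        choice = min((1, 2), key=lambda c: repeat[c])
        terms.append(choice)
        # The avoided run is the one the rejected candidate would have produced.
        avoided.append(repeat[3 - choice])
    return terms, avoided


terms, avoided = linus(14)
print(terms)    # first 14 terms, matching the list quoted above
print(avoided)  # lengths of the runs that were avoided
```

The two candidates can never tie on a nonzero length, since that would require the same earlier position to hold both a 1 and a 2, so the `min` is unambiguous.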

Published on May 29, 2008 at 9:06 pm

Commercial localization in shopping malls using GNURadio

I came across a company named Path Intelligence that sells equipment to passively track consumers’ locations inside shopping centers, malls, etc. using their cellphone signals, ostensibly for the purpose of studying consumer behavior. Apparently, their ‘equipment’ is built around the GNU Radio platform and uses triangulation-based schemes for localization. What is interesting to me is that:

(1) They manage to use signals from a cellphone even when the phone is not in a voice call. (There must be some beacons!?)
(2) They manage to localize multiple phones simultaneously.
(3) The company website claims an accuracy of 1-2 meters.

It appears that early adopters of their solution are in the UK. A large mall can be covered by 20 of their ‘boxes’.
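I don't know how Path Intelligence actually computes positions, so the following is only a toy illustration of the general idea of locating a transmitter from several receivers. It swaps in trilateration (ranges to three boxes at known positions) for whatever triangulation scheme they really use. The function name, the anchor coordinates and the noise-free ranges are all assumptions made for the example.

```python
import math


def trilaterate(anchors, distances):
    """Estimate (x, y) from three known receiver positions and measured ranges."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    # Subtracting the first circle equation from the other two leaves a 2x2 linear system.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 + y2**2 - x1**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 + y3**2 - x1**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("receivers are collinear; position is not unique")
    # Cramer's rule for the 2x2 system.
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


# Hypothetical layout: three boxes in a 10 m x 10 m area, phone at (3, 4).
boxes = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
phone = (3.0, 4.0)
ranges = [math.dist(phone, b) for b in boxes]
print(trilaterate(boxes, ranges))
```

With real measurements the ranges would be noisy, and one would use more than three boxes with a least-squares fit, which is presumably where the claimed 1-2 meter accuracy would be won or lost.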

The Privacy issue:
“The Information Commissioner’s Office (ICO) expressed cautious approval of the technology, which does not identify the owner of the phone but rather the handset’s IMEI code – a unique number given to every device so that the network can recognise it.”

So they don’t just track the raw signal but also demodulate the IMEI code. A reverse lookup may reveal IMEI -> phone number -> identity, but would probably require the cooperation of the carrier. Unlike MAC addresses in 802.11 networks, the IMEI cannot be changed easily. From a privacy perspective, the IMEI therefore appears to be a really bad thing (it is unencrypted), as it could potentially allow one to be spied upon (location traces, etc.). There are various other privacy issues, especially when such a technology is put into effect without having shoppers sign disclosure forms. The page on Slashdot has many interesting user comments.
Some links are here, here and here.

Here is an email from Toby Oliver, the owner and CEO of Path Intelligence explaining the technology in an effort to allay concerns.

Here is a particularly insightful comment from Slashdot that brings out the ‘value’ of location information rather than worrying about privacy:

“My shop usage data have great financial value (otherwise the shops wouldn’t pay to install surveillance systems) and the shop’s surveillance is involuntary – I am not given a choice whether to allow them track me or not, except if I avoid transmitting wireless signals while near their shop. As the data collection is not voluntary and my shop usage data have financial value, I demand payment from shops using this system. I want a share of my shop usage data’s financial value.”

I think this makes a good case for researching PHY-layer based location privacy. Having simple omni transmitters is equivalent to relinquishing one’s privacy as well as volunteering usage data for free.

Published on May 28, 2008 at 1:56 pm

Deanonymizing data

Anonymity is a big deal these days. And it should be, because the proliferation of personal computing devices with multiple radio interfaces places individual privacy in question. Consider the problem of de-anonymizing the Netflix database released for the Netflix prize project. A recent paper from U. Texas showed that the records released as part of the database were not anonymous at all and given a little bit of side information, allowed easy identification of individual records in a simple way.
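To make the idea of re-identification from side information concrete, here is a deliberately naive sketch. It is not the algorithm from the paper; it just counts how many known (movie, rating) pairs agree with each "anonymous" record and reports the best match along with the runner-up score. The record layout, the function name and the toy data are all invented for the illustration.

```python
def best_match(records, side_info):
    """Return (record_id, score, runner_up_score) for the record that best fits side_info.

    records:   {record_id: {movie: rating}}  (the "anonymized" database)
    side_info: {movie: rating}               (what the attacker already knows)
    """
    scores = {
        rid: sum(1 for movie, rating in side_info.items() if ratings.get(movie) == rating)
        for rid, ratings in records.items()
    }
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_id, best_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    return best_id, best_score, runner_up


# Toy database with three records and two ratings learned from elsewhere.
records = {
    "user_a": {"m1": 5, "m2": 3, "m3": 4},
    "user_b": {"m1": 5, "m4": 2},
    "user_c": {"m2": 3, "m3": 1, "m5": 4},
}
side_info = {"m2": 3, "m3": 4}
print(best_match(records, side_info))
```

A large gap between the best score and the runner-up is what makes a match convincing. In a sparse database, a handful of ratings can be enough to open up such a gap.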

It is clear that some information must be removed from a database or a set of trajectories in order to prevent re-identification. But WHAT part of the information to remove is not so clear! I am pretty sure information theory must have something useful to say about this problem. In particular, the Information Bottleneck Method of Tishby et al. might be a useful place to look for answers. Pending job for the summer.

Published on May 16, 2008 at 6:08 pm

Wireless Localization – problems and challenges

Wireless Localization
=============

1. The localization information must be given to the right ‘people’ (at the right time) – this relates to security and privacy issues

2. It is important to think carefully about the roles and players in any localization system to avoid future engineering blunders in terms of security, privacy, correct flow of information, economic incentives, etc.

3. A number of legal/social issues exist: do the owners of a ‘space’ (e.g. a college campus) have the right to know what wireless devices are in that space?

4. These issues are important to consider from an engineering perspective, even though they may be left to ‘lawyers’ later, so that we can provide ‘knobs’ or ‘controls’ that allow flexible functionality to be implemented.

5. Players/roles: Users, Network operators, Space owners, Govt., Application (incl. app. service provider) –> Who gets what information is important!

6. Key distinctions: Algorithm and PHY-layer measurements

7. Collecting training data and updating it from time to time is a big problem because it is costly. So if we can come up with a method that avoids this, that would be great!

Future challenges:

8. Defining contracts between the players
9. Leveraging existing communication infrastructure
10. Improving the PHY layer – cheap ways to get better PHY-layer information (time, angle, RTT, RSS, etc.)
11. Connecting the ‘islands’ -> Interfacing different localization technologies/systems

Others:

1. The economics of wireless localization / network localization

2. Bootstrapping localization using non-fixed infrastructure – i.e. using clients themselves for localizing other clients to get a relative map of locations.

Published on May 14, 2008 at 9:36 pm

Dealing with an active interferer in secret-key agreement

I gave a talk at the 3rd Rutgers-Helsinki PhD student workshop today and got some useful feedback from Marco Gruteser. He came up with the following attack: What if Eve transmits a pulse signal that momentarily causes the received signal at both Alice and Bob to go above the threshold level q_+? This would give Eve a way of forcing Alice and Bob to generate certain bits at certain instants of time. How can this be avoided? There seems to be a mounting pile of active attacks that I need to address. Perhaps I should consider working on ‘active attacks in secret-key agreement’. Some of the active attacks are clearly protocol-specific (Eve inserts a message of some sort that appears in the protocol) and some are purely at the physical layer, of the type suggested by Marco, for example. It would be interesting to study what is possible and what is impossible from the point of view of an adversary messing things up.
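A toy sketch of the attack, under assumptions that go beyond what is written above: Alice and Bob each quantize their received signal with an upper threshold q_+ (here `q_plus`) and a lower threshold (`q_minus`, my assumption), output a 1 above `q_plus`, a 0 below `q_minus`, and discard everything in between. Eve's pulse is modeled as a fixed amplitude added to both received signals at the instants she chooses. All names and sample values are made up.

```python
def quantize(samples, q_plus, q_minus):
    """Map each sample to 1 (above q_plus), 0 (below q_minus) or None (discarded)."""
    bits = []
    for s in samples:
        if s > q_plus:
            bits.append(1)
        elif s < q_minus:
            bits.append(0)
        else:
            bits.append(None)
    return bits


def inject_pulse(samples, instants, amplitude):
    """Model Eve adding a pulse of the given amplitude at the chosen sample instants."""
    return [s + amplitude if i in instants else s for i, s in enumerate(samples)]


q_plus, q_minus = 1.0, -1.0
alice = [0.2, -1.3, 0.1, 1.5, -0.4]
bob = [0.3, -1.1, -0.2, 1.4, -0.5]

print(quantize(alice, q_plus, q_minus))  # bits without interference
eve_instants = {0, 2}
alice_attacked = inject_pulse(alice, eve_instants, 2.0)
bob_attacked = inject_pulse(bob, eve_instants, 2.0)
print(quantize(alice_attacked, q_plus, q_minus))  # instants 0 and 2 are now forced to 1
print(quantize(bob_attacked, q_plus, q_minus))
```

In this model Eve not only forces bits that she knows, she also turns instants that would have been discarded into "valid" key bits, which suggests that any defense has to detect the pulse rather than just re-tune the thresholds.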

Published on May 7, 2008 at 8:12 pm

Delay helps enhance PHY-layer spoofing detection

The standard technique for employing the physical layer to detect a spoofing attack is to construct a hypothesis test that tests some characteristic(s) of the received signal against the recent history of received symbols. Using a likelihood ratio test, the problem is transformed into a simple comparison of a test statistic with a suitably tuned threshold. However, the approach suffers from poor ROC performance – that is, it results in high false alarm probabilities for required detection probabilities, especially if the transmitter is mobile.

Intuitively, this problem arises because we have only one symbol to base the decision on – the most recent one. If instead, we were able to tolerate a delay, by creating an out-going queue of received messages, the amount of information available to make the decision could be increased. This would help lower the false alarm rate for any given detection rate.
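The intuition can be checked on a very simplified model, which is my own assumption and not the test used in the spoofing-detection literature: suppose the per-symbol test statistic is Gaussian with unit variance, has mean 0 for the legitimate transmitter and mean `shift` for a spoofer, and that the `n_samples` symbols held in the queue are independent and get averaged. For a fixed detection probability, the false alarm probability is then Q(Q^{-1}(P_D) + shift * sqrt(n_samples)), where Q is the Gaussian tail function.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def false_alarm_rate(p_detect, shift, n_samples):
    """False-alarm probability at a fixed detection probability when n_samples are averaged."""
    # Q^{-1}(p_detect), with Q(x) = 1 - Phi(x).
    q_inv = _STD_NORMAL.inv_cdf(1.0 - p_detect)
    # Averaging n_samples scales the separation between the hypotheses by sqrt(n_samples).
    return 1.0 - _STD_NORMAL.cdf(q_inv + shift * n_samples ** 0.5)


# Hypothetical operating point: 90% detection, one standard deviation of separation.
for n in (1, 4, 16):
    print(n, false_alarm_rate(0.9, 1.0, n))
```

The independence assumption is the weak point here. With a mobile transmitter the channel itself drifts while the queue fills up, so the real gain from waiting will be smaller than this model suggests.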

Allowing a delay before declaring an authentication failure has another advantage over a declaration based on a single bad received symbol. The latter approach allows Eve to continue masquerading as the legitimate transmitter in the event of a missed detection. This is because the test statistic is based only on the most recent received symbol [See Xiao et al.], so a single missed detection ensures that Eve goes undetected.

Published on May 6, 2008 at 8:13 pm

Cellular carriers provide location data without warrant

Here is a story I just saw on Slashdot about how cellphone companies give away location information to the police without a warrant for persons reported missing (which means they log location information of course):

Published at 5:19 pm

Bootstrapping Authentication and Confidentiality

A famous chicken-and-egg problem that arises in any secret-key generation system is that extracting identical strings of bits by communication over a public channel requires the channel to be authenticated. However, the availability of an authenticated channel implies that the two users attempting to extract a key (to be used for encryption) already share a key (the authentication key!) – therefore the purpose of extracting a key is defeated.

In a seminal paper by Witsenhausen, it is shown that given correlated random variables X and Y possessed by Alice and Bob respectively, not even a single bit of secret information can be reliably extracted from X and Y without having Alice and Bob exchange a message. This is a problem because any messages exchanged over a public channel would ostensibly require Alice and Bob to have an authenticated channel available to them if they are to avoid an active attack by Eve. This implication (the requirement of an authenticated channel) has so far been taken for granted. On a wireless channel, having exchanged a series of probes in a TDD fashion, Alice and Bob have built up some statistics for what the received signal from the other user must look like. If Alice sends Bob a quick message in order to enable extraction of identical bit strings, then the history of the past few messages can be used in a hypothesis test to determine (with some false alarm probability) whether it was sent by Alice or inserted by Eve.

A bigger problem for secret-key generation seems to be finding an application that would require two entities to establish an encrypted channel between themselves without requiring an authentication by a trusted third party. That is, at the beginning of the protocol, how do Alice and Bob know whether they are sending probes to the right entity and not a malicious intruder? In other words, the authentication afforded by the wireless channel at the PHY layer is only good for maintaining authentication but not for guaranteeing authentication at the start. It seems like good old certificates will be necessary for that (?) The only scenario I can think of where guaranteed authentication isn’t an issue at the start is when entities are simply talking to other (unknown) entities in an ad-hoc environment.

Published at 2:29 am

Security is inherently a cross-layer problem

Does it make sense to implement a security protocol at one layer alone? Consider the OSI layered communication model. If a security suite is implemented at the application layer alone, it cannot possibly address attacks mounted on a lower layer, can it? This is because the application layer, by definition of the layered network architecture, is oblivious to the existence of lower layers. The logical next step is to propose that each layer have its own security suite. But is this a good approach? That is, does it effectively address the set of threats at the various layers, or does it leave loopholes open to be exploited?

I feel a cross-layered approach to security is essential. Building security into a communication system as an afterthought and in a layer-by-layer fashion is sloppy engineering!

Another point to ponder is whether securing a lower layer has any (provable) effect on the vulnerability of higher layers. If it does, then it might serve us well to focus on a bottom-up approach to security.

Published on May 5, 2008 at 11:06 pm

Location Privacy

I attended some interesting talks at the 3rd Rutgers-Helsinki PhD Student Workshop on Spontaneous and Pervasive Networking (phew!) today. In particular, a talk by Marco Gruteser reviewing location privacy for various applications caught my eye. Location-based services have been much talked about and are expected to take off (any moment now) in a big way. This introduces the problem of preserving user privacy. Marco talked about an interesting class of problems that deal with preserving the privacy of location traces. The idea is as follows: Each mobile client periodically transmits its location to a central server, which then forwards this information on to an application service provider (ASP) that provides some location-based service to the user. (Think DASH!) However, the user wishes to conceal its identity from the ASP. Hence we would like to make it impossible (or very hard) to infer the user’s identity from observing location updates.

Marco’s group has proposed a centralized processing architecture for solving the problem, wherein the location updates from all users are first ‘anonymized’ by a central server (call it the ‘location broker’) using what can be termed various signal processing techniques, such as dropping a few samples, shifting the time stamps a bit, etc. Note that the degree of location privacy granted to a user is a coupled function of the information about other users that is revealed by the location broker to the ASP.
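A rough sketch of the two operations mentioned (dropping samples and shifting time stamps) applied to a single trace. This is not the algorithm from Marco's group, which I only saw in a talk. The function name, the drop probability and the jitter range are placeholders, and a real broker would presumably pick what to drop by looking at all users' traces together rather than at random.

```python
import random


def anonymize_trace(trace, drop_prob=0.2, max_jitter=5.0, seed=None):
    """Randomly drop samples and jitter the time stamps of a list of (t, x, y) updates."""
    rng = random.Random(seed)
    out = []
    for t, x, y in trace:
        if rng.random() < drop_prob:
            continue  # drop this location update entirely
        # Shift the time stamp by a small random amount; the position is left untouched.
        out.append((t + rng.uniform(-max_jitter, max_jitter), x, y))
    # Keep the released trace in time order after jittering.
    out.sort(key=lambda sample: sample[0])
    return out


# Hypothetical trace: one update every 10 seconds along a straight line.
trace = [(10.0 * i, float(i), 2.0 * i) for i in range(20)]
released = anonymize_trace(trace, seed=1)
print(len(trace), len(released))
```

Because each trace is processed in isolation here, the sketch says nothing about the coupling between users noted above, which is exactly the part that makes the real problem interesting.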

An interesting extension of the problem would be to engineer a system wherein the availability of a location broker cannot be guaranteed and users wish to solve the problem themselves, i.e. a distributed architecture for the problem of location privacy / privacy of location traces. Ostensibly, this would require some cooperation / message passing between the mobile clients because, intuitively, the degree of anonymity enjoyed by a user is a function of the user density in its surroundings. Something to think about...

Published at 10:14 pm