How to resist data mining offline?
October 15, 2009 8:13 AM

PrivacyFilter: Can you think of examples of offline anti-data mining behavior? We can encrypt our activities, use Tor, etc., online, but how do people try to stymie data mining in daily life? I'm thinking of groups sharing a "loyalty card" (like these guys) to break up the patterns of their shopping habits, and putting incorrect information into surveys and forms to make things harder to correlate (in theory), tack-in-the-shoe style. Other examples? (And, if you know more about data mining, do these methods actually accomplish anything?)
posted by finnb to Technology (23 answers total) 7 users marked this as a favorite
This is pretty low-tech, but whenever I get asked for any personal information by a retailer (other than a zip code for credit card verification), I either refuse to give it to them or I make something up.
posted by Inspector.Gadget at 8:27 AM on October 15, 2009

I would distinguish strategies that avoid data mining from those that aim to subvert it. The easiest option would be to decline to use loyalty cards and to refuse to provide personal information in other circumstances. I think accepting the benefits of a loyalty card (points for free stuff?) while rendering worthless the data that you are meant to be exchanging for those benefits is kind of a dick move. I understand where you're coming from, but I wouldn't feel right screwing up their data unless I was being compelled to provide personal information. Plus, by providing fake information you're missing the opportunity to voice your disapproval...
posted by onshi at 8:36 AM on October 15, 2009 [2 favorites]

The loyalty card swap would only potentially work if you always paid in cash. If I were to build a data warehouse, I'd certainly try to tie the loyalty card to a method of purchase: last four digits of a card, hash of the entire number, whatever the law would allow. That way I can at least get gender-related information (from the card's registered name, or even the address), or, if I share my data with other shops/industries, I can drill down on your habits much further.

Also, if I share the data widely enough, there's a possibility I can tie it back to a personal identifier such as an SSN, and then use that to find other instances of your address being mentioned.

With that information I could give a weight to how much I believe certain information, and if the addresses I found elsewhere won, I'd essentially overwrite the data that's stored in the card itself.
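A rough sketch of what that linking and weighting could look like. Everything here is made up for illustration: the transactions, the trust weights, and the helper names `card_key` and `best_address` are assumptions, not how any real retailer does it.

```python
import hashlib
from collections import defaultdict


def card_key(card_number: str) -> str:
    """Hash a payment card number so the raw number is not stored (hypothetical scheme)."""
    return hashlib.sha256(card_number.encode("utf-8")).hexdigest()


# Hypothetical transactions: (loyalty card id, payment card number)
transactions = [
    ("LOYALTY-A", "4111111111111111"),
    ("LOYALTY-B", "4111111111111111"),  # swapped loyalty card, same payment card
    ("LOYALTY-A", "5500000000000004"),
]

# Group loyalty cards by the payment card used alongside them.
cards_by_payer = defaultdict(set)
for loyalty_id, card_number in transactions:
    cards_by_payer[card_key(card_number)].add(loyalty_id)


def best_address(sightings):
    """Pick the address with the highest total weight across all sources."""
    totals = defaultdict(float)
    for address, weight in sightings:
        totals[address] += weight
    return max(totals, key=totals.get)


# Hypothetical address sightings: (address, how much that source is trusted)
sightings = [
    ("1 Fake St", 0.2),    # what the loyalty card form says
    ("22 Real Ave", 0.5),  # address seen in one shared dataset
    ("22 Real Ave", 0.4),  # address seen in another shared dataset
]
chosen = best_address(sightings)
```

The point is that a swapped loyalty card still shows up under the same payment card hash, and the address on the form loses out once other sources outweigh it.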

At least a few years ago, the data warehouses surrounding loyalty cards were some of the biggest, if not the biggest, commercially run data warehouses in the world. I'd be surprised if that has changed. Obviously I'd be far more wary in the United States; European data protection laws tend to negate some of the risk of company-to-company data exchanges.

I'll leave you with a nice quote:

"As every man goes through life he fills in a number of forms for the record, each containing a number of questions... There are thus hundreds of little threads radiating from every man, millions of threads in all. If these threads were suddenly to become visible, the whole sky would look like a spider's web, and if they materialized as rubber bands, buses, trams and even people would all lose the ability to move, and the wind would be unable to carry torn-up newspapers or autumn leaves along the streets of the city. They are not visible, they are not material, but every man is constantly aware of their existence.... Each man, permanently aware of his own invisible threads, naturally develops a respect for the people who manipulate the threads."

Alexander Solzhenitsyn (1968)
posted by Static Vagabond at 8:39 AM on October 15, 2009 [4 favorites]

I always make up a phone number when asked at the counter. Also, when I fill out any sort of membership / bonus thing, I don't use my real name, sometimes just initials or a funny name.

Onshi: I think it is a dick move for companies to only provide benefits to loyal customers by demanding all sorts of invasive information about themselves and their purchasing habits. It's an even worse move when loyal customers' personal information is then sold to other companies.
posted by RajahKing at 8:42 AM on October 15, 2009 [2 favorites]

I go out of my way to pay cash for everything.
posted by JohnnyGunn at 8:45 AM on October 15, 2009

RajahKing - As Gandhi never would have said, trading a dick move for a dick move makes the whole world blind. Or something like that?

Subverting something large-scale like Air Miles wouldn't bother me, but I had independent or otherwise less sophisticated loyalty schemes such as those run by small businesses in my area in mind. The data from a well-considered loyalty program can make a huge difference to small businesses, which not only improves the odds that such places can survive but could also result in a much improved customer experience. In these cases, either protesting or (better still) engaging with the proprietors to push for a system that maintains your privacy while still providing useful data to inform retail strategy strikes me as a much more fruitful approach.
posted by onshi at 8:51 AM on October 15, 2009 [1 favorite]

Response by poster: Awesome -- thank you all so far.

I should have mentioned that I'm an academic, and I write a lot about how people adopt and adapt and resist technologies (and their ideas of technologies). So I'm setting aside ethical considerations and even practicality for now to cast a wide net and see what kind of practices are in play here. Those considerations are really important, of course -- they're a key part of the conversation around these new systems -- but I'm just starting to dig into this stuff (this is literally day one) and beginning with what people do, or think about doing.

And the point about distinguishing between avoiding data mining (refusal, blocking moves) and trying to subvert it is well taken. The latter is particularly interesting, not least because it plays into people's ideas of how dm'ing works and how they can interfere.

Thanks as well for that fantastic Solzhenitsyn quote! I've never seen that so beautifully expressed before.
posted by finnb at 8:56 AM on October 15, 2009

When they ask for my phone number, with area code, I always say "Three one four, one five nine, two six five four." Nobody ever bats an eye. Cash, just about anywhere I can get away with it. Follow the spammers — anywhere there are spammers, you'll find ways to game authentication. History is full of examples of registration lists abused by, well, anyone who can get ahold of them, really.

And then I'll turn it over to Hodgemeyer: "Give incorrect information ... everywhere. And never use your real name."
posted by adipocere at 8:56 AM on October 15, 2009 [2 favorites]

Response by poster: Sorry, should have previewed -- onshi, you make a really good point about the different parties and their scales of operation. It's often a more case-by-case decision than a blanket act.

And the thing about spammers is great. I wrote my dissertation on the history of spam, and somehow never thought in the current project about all that work they do to automate ginning up identities, gaming CAPTCHA and trying to create the appearance of legitimacy for false profiles. A great example of a (particularly unethical) set of methods partially applicable to this. Hmmmmm!
posted by finnb at 9:02 AM on October 15, 2009

Unless there is a specific need for the store to contact you (like a special order), there is never any need for them to take any contact info from you, no matter what they claim.

I think it is a dick move for companies to only provide benefits to loyal customers by demanding all sorts of invasive information about themselves and their purchasing habits. It's an even worse move when loyal customers' personal information is then sold to other companies.

There was a great comment here about a year ago from someone who ran a supermarket branch, who basically said that they accumulate more data than they can deal with at the individual level, and that it's thrown out once they run their aggregation tools on it.

The primary usefulness of tying purchases to individuals is not, in other words, to turn it into some kind of sellable data, but to identify buying trends to optimize inventory. Remember, the retail operations that tend to have loyalty programs are extremely low-margin, high-volume. They really don't have the time or resources for the nefarious stuff you're imagining.
posted by mkultra at 9:02 AM on October 15, 2009

The triplet {date of birth, age, zip code} resolves to a very few people. Therefore, I lie about at least one of them or just refuse to give the information.
posted by jet_silver at 9:16 AM on October 15, 2009

They really don't have the time or resources for the nefarious stuff you're imagining.

But plenty of other entities do. And many of them have plenty of money, and plenty of smart people to think of new things to do with the data.

I ALWAYS lie when I have the opportunity to do so. I think it's even better than refusing to answer -- when you refuse, you're just being a dick and you needlessly add hassle to some other peon's life. When you lie, you (may be) damaging the data of the people you really hope to get at by refusing to answer.

I asked a smart friend who mines data for a living, and he said that lying probably does reduce dataset value. (I guess the countervailing consideration is that if they get an immutable data key that you _can't_ lie about, such as a credit card number, then they can ignore the other stuff, or even detect that you're a habitual liar.)
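A minimal sketch of that "habitual liar" check, assuming the miner has some stable key (say, a hashed card number) alongside whatever you told the cashier. The records and the function name `inconsistent_keys` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical records: (stable key such as a card hash, self-reported phone number)
records = [
    ("card-1", "314-159-2654"),
    ("card-1", "555-0100"),
    ("card-1", "555-0199"),
    ("card-2", "555-0123"),
    ("card-2", "555-0123"),
]


def inconsistent_keys(rows, max_distinct=1):
    """Return keys whose self-reported values disagree more than max_distinct allows."""
    seen = defaultdict(set)
    for key, value in rows:
        seen[key].add(value)
    return sorted(key for key, values in seen.items() if len(values) > max_distinct)


flagged = inconsistent_keys(records)
```

Once a key is flagged, the self-reported fields for it can simply be ignored, which is why lying only helps when there is nothing stable to hang the records on.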
posted by spacewrench at 9:17 AM on October 15, 2009 [2 favorites]

Response by poster: The thread mkultra mentioned is here -- on loyalty cards -- with SpecialK's answers being the really interesting ones in terms of data storage.

(The American Express profiling case is interesting here as well, to take a different business on a very different scale of operations: using shopping habits to do risk analysis for lowering credit limits. The parallel example the NYT mentions for this sort of analysis is kind of hilarious, with "Marriage counselors, tire retreading and repair shops, bars and nightclubs, pool halls, pawnshops and massage parlors" being signs of a bad credit risk according to CompuCredit.)
posted by finnb at 9:22 AM on October 15, 2009

Regularly using the same loyalty card allows the stores to gather data on the habits of a purchaser, even if they can't link that to you as an individual.

I don't bother with carrying these cards, and generally ask the person behind or in front of me in line if I can use theirs. At many stores the cashiers will put in a number if you do not have a card.

As to places that ask for a phone number or zip code, I don't understand the need to make one up. Nothing terrible happens if you say "I would prefer not to give out that information" in a friendly way. Occasionally the cashier experiences a moment of clashing confusion at this, which I like to think is a good thing -- then I suggest that they put in their own phone number if they must have one.

do these methods actually accomplish anything?

What do you want to accomplish? I see two different possible goals: keeping your individual information private (e.g., preventing your health insurer from seeing what you buy at the grocery store), and preventing industry from doing large-scale analysis of buying habits across a population. The latter is probably difficult to impossible at this point. There are probably demographic categories of 'people sharing shopping cards', 'people not using a card', and 'people paying cash at a store without cards that attempts to track by credit card info instead', each with its own known buying habits.

For individual privacy, pay cash and pay attention to what information you give out. Pay special attention to this if you are buying health care that might affect your ability to get insurance later on (which is nearly anything).

In the long run, with the advances in facial recognition and drops in the price of processing power and disk space, I wouldn't be surprised if eventually a store (or the police, or the repressive government you disagree with, or your health insurer) can tell who you are when you walk in the door, compare your photo to archives crawled years ago when you had your Flickr account linked to your MetaFilter account, see that you've just asked a question about why your arm hurts, and show an ad for Tylenol while selling the data to your health insurer.
posted by yohko at 9:27 AM on October 15, 2009 [1 favorite]

Unless I want to get mail from a company for some reason, I give them my address as 1060 West Addison, Chicago IL 60613.
posted by ROU_Xenophobe at 9:44 AM on October 15, 2009

I don't have it anymore, but for a decade or so I frequently gave out my fax number to these types of folks. Since I used that number only to send faxes, every time it rang I knew it was from one of them, which never ceased to be momentarily amusing.

Petty, I know, but it WAS my number. I didn't lie!
posted by Pufferish at 10:47 AM on October 15, 2009

How about patronizing businesses that do not practice the sort of data mining that one might dislike? In Denver, Colorado USA, Albertsons and Wal-Mart sell groceries and they don't have shopper cards. Safeway sells groceries and has shopper cards. So I prefer Albertsons. (They used to have shopper cards but then they decided to get rid of them.)
posted by massysett at 11:21 AM on October 15, 2009

When asked for a phone number, I say, "I don't have a phone." This usually stuns the person into complete silence. If they still press that they "must" enter something, I say, "Please enter your own phone number, since I do not have one." This has worked every time for me, and I've been doing it for about 10 years now.

For the loyalty cards, I don't submit the application. The card still works, at least at CVS drug store, and Randalls and Kroger grocery stores.
posted by Houstonian at 11:33 AM on October 15, 2009

One tip I remember reading in 2600 was to add an apartment initial to your address. Assuming you live in a house, you can add a different letter after the street number: 2442K, 2442J, etc. This way your mail will still get to you, and if you keep track of which letter you gave to whom, you can track who leaked your address. Not exactly complete resistance, but it's a good method for tracking exactly how junk mail arrives in your mailbox.
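The bookkeeping for this can be tiny. A sketch, assuming a street number of 2442 as in the example above; the log entries and the function name `leak_source` are made up:

```python
import re

# Hypothetical log of which suffix letter was handed to which organization.
suffix_given_to = {
    "J": "Magazine subscription",
    "K": "Grocery loyalty card",
}


def leak_source(address_on_envelope, street_number="2442"):
    """Return who was given the suffix letter found after the street number, if any."""
    pattern = rf"\s*{re.escape(street_number)}([A-Za-z])\b"
    match = re.match(pattern, address_on_envelope)
    if match is None:
        return None  # no suffix letter: nothing to trace
    return suffix_given_to.get(match.group(1).upper())


source = leak_source("2442K Example St")
```

Junk mail addressed to "2442K" then points back at whoever was handed the K.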
posted by Locobot at 11:44 AM on October 15, 2009

@jet_silver: you mean {date of birth, gender, zip code} since dob and age are basically the same thing. As discussed here.
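For a feel of why the combination matters more than any single field, here is a toy calculation on invented records. The data and the function name `unique_fraction` are assumptions for illustration only, and real populations will behave differently.

```python
from collections import Counter

# Made-up records: (date of birth, gender, zip code)
people = [
    ("1975-03-02", "F", "60613"),
    ("1975-03-02", "M", "60613"),
    ("1980-11-30", "F", "60613"),
    ("1980-11-30", "F", "60613"),
    ("1990-07-14", "M", "80202"),
]


def unique_fraction(rows, columns):
    """Fraction of rows that are the only one with their combination of the given columns."""
    keys = [tuple(row[i] for i in columns) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


zip_only = unique_fraction(people, [2])
full_triplet = unique_fraction(people, [0, 1, 2])
```

In this toy set, zip code alone singles out one person in five, while the full triplet singles out three in five. Changing any one of the three fields moves you into a different (possibly larger) bucket.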
posted by BrokenEnglish at 1:23 PM on October 15, 2009

I also pay only with cash for ~98% of things I buy. On the other hand, I'm taking money out of ATMs, so those movements can easily be tracked.
posted by beerbajay at 3:59 AM on October 16, 2009

This really isn't even a privacy issue: if they can see you, you are not in private. This sort of data mining doesn't care about you; it cares about the aggregate.

And please don't lie about phone numbers and addresses. I assume I can thank one of you for the addition of my address to the "Dairy Goat Journal" mailing list.
posted by gjc at 5:38 AM on October 16, 2009

Mod note: few comments removed - OP is not asking for your personal opinions, please don't turn this thread into an argument, thanks.
posted by jessamyn (staff) at 7:25 AM on October 16, 2009

This thread is closed to new comments.