Help Me Trace Leakers
September 22, 2014 4:23 PM

I need to send a confidential email to a few hundred people, and I expect "leaks". No biggie, but I'd like to weed the recipient list before proceeding with this discreet project (nothing illegal; just best kept hushed until the appropriate time). My idea is to tweak punctuation, word choice, and spacing to make each version unique, so leakers (who'd likely cut/paste from the email) can be identified. Are there apps to make the job easier? Scripts? Suggestions? I could do this by hand, keeping a table to catalog the alterations, but it'd take hours.
posted by Quisp Lover to Computers & Internet (20 answers total) 7 users marked this as a favorite
 
that's called a canary trap. the most subtle and effective version would be handmade.
posted by bruce at 4:34 PM on September 22, 2014 [1 favorite]


I'm pretty sure this won't work, because there's no reason to expect you'd have the "paper trail" of the unique versions. You might never find out that the information was leaked, or if you found out for sure that there was a leak, you might never find the smoking gun forwarded email. It's also possible that the information could be leaked by someone who rephrases it or passes it along in a different format.

My suggestion to prevent leaks is to distribute the information in a PDF that would be watermarked for each recipient, or simply to not share this information via email. I work in a highly confidential environment, and those are our usual tactics for especially sensitive information. We do a lot more physical distribution on paper than is common for businesses nowadays because it's much harder for a hard copy to get around than for someone to forward an email.
posted by Sara C. at 4:36 PM on September 22, 2014 [3 favorites]


Best answer: Option 1: A few hundred (~500) people can be uniquely identified with just nine bits of information. You need a script that, given a piece of text that looks like this:

Hi[,|.] I want to [discuss|talk about] my [super-secret|brand-new] project with [you|you all].

spits out each possible variation once. Then you mail-merge them. Is that something you can write? It's pretty simple, but I don't know of one off the top of my head.
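
For what it's worth, here is a minimal Python sketch of such a script. The `[a|b]` slot syntax is taken from the sample line above; the names `SLOT` and `expand` are made up for illustration, and it assumes the text contains no other square brackets with a `|` inside them.

```python
import itertools
import re

# Matches one bracketed choice group, e.g. "[discuss|talk about]".
SLOT = re.compile(r"\[([^\[\]]*\|[^\[\]]*)\]")

def expand(template):
    """Return every variation of the template, picking one choice per slot."""
    # With one capture group, re.split alternates: literal, group, literal, ...
    parts = SLOT.split(template)
    literals = parts[0::2]
    choices = [group.split("|") for group in parts[1::2]]
    variations = []
    for combo in itertools.product(*choices):
        pieces = [literals[0]]
        for pick, literal in zip(combo, literals[1:]):
            pieces.append(pick)
            pieces.append(literal)
        variations.append("".join(pieces))
    return variations

template = ("Hi[,|.] I want to [discuss|talk about] my "
            "[super-secret|brand-new] project with [you|you all].")
versions = expand(template)
for number, text in enumerate(versions):
    print(number, text)
```

The sample sentence has four two-way slots, so it yields only 16 versions; covering ~500 recipients would take nine such slots (2^9 = 512). Keep the numbered output as your lookup table for the mail merge.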

(But really, I think Sara C has the solution - don't try to push water uphill).
posted by Leon at 4:45 PM on September 22, 2014 [10 favorites]


Ya, unless you can get access to the "original", I'm not sure this would do you any good. Email is not confidential (unless you attach something which is).

If you really think you want to do this, you pretty much have to do at least a little bit by hand because a lot of automatic watermarking alternatives (synonym substitution) alter the semantics of the text.

One easy way to do this, which is akin to the spreadsheet method, is to number your recipients (or recipient groups) and then figure out log2(num recipients) features in the text to alter. Mark those features in the text, and then run a script to produce all the variations. You probably want redundancy of the features so snippets could still identify a smallish subset.
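
A rough Python sketch of that numbering scheme, under some assumptions: every feature has exactly two variants (0 or 1), and the redundancy is done by simply repeating the same bit pattern `copies` times through the text. The function names are hypothetical, not from any existing tool.

```python
def bits_needed(num_recipients):
    """Number of two-way features needed to tell every recipient apart."""
    return max(1, (num_recipients - 1).bit_length())

def fingerprint(recipient, num_bits, copies=2):
    """Variant (0 or 1) to use at each marked feature for this recipient.

    The recipient's number is written in binary, lowest bit first, and the
    whole pattern is repeated `copies` times for redundancy.
    """
    bits = [(recipient >> i) & 1 for i in range(num_bits)]
    return bits * copies

def candidates(observed, num_recipients, num_bits, copies=2):
    """Recipients consistent with a leaked snippet.

    `observed` maps feature position -> variant seen in the snippet;
    features that did not appear in the snippet are just left out.
    """
    matches = []
    for r in range(num_recipients):
        marks = fingerprint(r, num_bits, copies)
        if all(marks[pos] == variant for pos, variant in observed.items()):
            matches.append(r)
    return matches

num_bits = bits_needed(500)  # 9 features cover up to 512 recipients
# A snippet that only shows features 9..14 (part of the second copy):
snippet = {9: 0, 10: 0, 11: 1, 12: 1, 13: 0, 14: 1}
suspects = candidates(snippet, 500, num_bits)
print(suspects)
```

In this example the partial snippet still cuts 500 recipients down to eight, which is the "smallish subset" idea; a snippet covering all nine features identifies exactly one.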
posted by smidgen at 4:48 PM on September 22, 2014


Are you concerned about leaking THIS email, or leaks in general, and this email is only one of the potential issues?

If the latter, divide your group into a small set of buckets (e.g. five). Reduce your changes to irreducible facts that will not vary (e.g. not just a word choice), such as the date of an event or a numeric figure. Something that can't just be re-written or ignored. One change per bucket.

If you find a leak, now you know which bucket it came out of. Then you can narrow your hunt to just the people in that bucket. Send just those people emails that vary with the bucket scheme. Lather, rinse, repeat.
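
To give a feel for how many rounds this takes, here is a small Python sketch. It assumes a single leaker who leaks every round and that you notice each leak, which is optimistic; the function names and the `person-N` labels are made up for the example.

```python
def split_into_buckets(people, num_buckets=5):
    """Deal people round-robin into at most num_buckets non-empty groups."""
    buckets = [people[i::num_buckets] for i in range(num_buckets)]
    return [bucket for bucket in buckets if bucket]

def rounds_to_isolate(num_people, num_buckets=5):
    """Send-and-wait rounds until the leaking bucket holds one person."""
    rounds = 0
    while num_people > 1:
        num_people = -(-num_people // num_buckets)  # ceiling division
        rounds += 1
    return rounds

def simulate(people, leaker, num_buckets=5):
    """Replay the hunt, keeping only the bucket the leak came out of."""
    suspects, rounds = list(people), 0
    while len(suspects) > 1:
        buckets = split_into_buckets(suspects, num_buckets)
        suspects = next(b for b in buckets if leaker in b)
        rounds += 1
    return suspects, rounds

people = [f"person-{n}" for n in range(500)]
print(rounds_to_isolate(len(people)))
print(simulate(people, "person-123"))
```

With five buckets, 500 people shrink to 100, then 20, then 4, then 1, so the hunt needs four separate confidential emails and four detected leaks.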

That said -- virtually anything you try can be defeated. You have to start with that assumption. Ultimately, this will likely be a pointless endeavor.
posted by Cool Papa Bell at 4:50 PM on September 22, 2014 [1 favorite]


On preview: that is, what Leon wrote, much more clearly than I did. :-)
posted by smidgen at 4:51 PM on September 22, 2014


If the recipients find out that you've done this, they are going to know that you don't trust them, and next time, you'll get more leaks, because hey, they don't have a trusting relationship with you, why not leak it? It would be better IMHO to include the same text for everyone along the lines of "I'm sure you understand the need for confidentiality on this and I therefore ask that you don't forward or discuss this email until after the embargo date"

If you must do it, then you could use Outlook to do a mail merge, and include a unique sentence for each recipient as one of the merge fields.
posted by girlgenius at 4:52 PM on September 22, 2014 [1 favorite]


The big problem I see here is that it relies on you being able to see the exact text of the email that was leaked. Typically journalists (or whoever, really) are more concerned about the actual content (or information) in the message than the text of the message itself, so it seems unlikely that making superficial changes to the text that don't affect the content would work to identify anyone. (That said, if someone does just post the email to pastebin or whatever, then they deserve what they get.)

The only way for this strategy to reliably work would be to distribute different content, which seems unfeasible if you want to be viewed as a reliable source of information. I'd say to make sure to remind people about the confidentiality of the information, or to find a different way to distribute the information, potentially involving lawyers and the signing of non-disclosure agreements.
posted by Aleyn at 5:37 PM on September 22, 2014 [1 favorite]


If you do vary the punctuation and phrasing, you need to do it in an especially quotable bit of the email. So you might want to embellish the prose a bit.
posted by ryanrs at 6:14 PM on September 22, 2014 [1 favorite]


Best answer: If the email gives a time for the event, you can alter that slightly for each version (6pm, 6.15pm, 6.30pm, etc.), since this is a draft to set up a project, not an announcement. Or include a URL that links to a page about the project and have several versions of the url so you can look at the site traffic and see what got hit.
posted by viggorlijah at 6:49 PM on September 22, 2014 [9 favorites]


Response by poster: include a URL that links to a page about the project and have several versions of the url so you can look at the site traffic and see what got hit.


Bingo. Thanks.
posted by Quisp Lover at 7:12 PM on September 22, 2014 [3 favorites]


How savvy are you expecting your leakers to be? If you have a document to attach (I am thinking a pdf), it isn't hard to embed information. It is just time consuming.

The link method works well, but can look very obvious. A savvy leaker will copy the result page, so there won't be a huge discrepancy.
posted by troytroy at 7:24 PM on September 22, 2014


The link idea will only work if you have every authorized user's IP address, or if everyone is local to one place, would under no circumstances be visiting the link from outside your local area, and would not be leaking the information to other locals. It also puts confidential information right out there on the internet practically asking for someone to find it even if nobody from your email group is untrustworthy. It doesn't make sense to share something in the most public way possible in the attempt to keep it private.

Also, it assumes the leaker will copy and paste the link from the email you sent, and will leak the information by sharing a link rather than communicating it in another way. If I click that link and then text my friend "the secret meeting is on X date at Y location!" you'd never know I was the leaker.
posted by Sara C. at 7:32 PM on September 22, 2014


Mailchimp will let you do that, seeing who clicked on what links, but you'd have to be VERY sure of your subscriber list to not get marked as a spammer.

Sara C, what I meant was that if she hears that everyone is saying "Oh the event is at 6.15pm" (or whatever detail she has), or spreading the XYZ-code link, then she can backtrack to that person.
posted by viggorlijah at 7:50 PM on September 22, 2014


Response by poster: Sara C-

It works if I personalize a link for each recipient (which is way easier than personalizing the email itself). All I need to watch for is several pings on any one page (above a reasonable threshold to account for one person viewing on multiple devices or from both work and home).
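
A minimal Python sketch of the personalized-link idea, with several assumptions flagged: the base address `https://example.com/p/` is a placeholder, "pings" are counted as distinct visitor IDs (for example IP addresses pulled from your web server log), and the threshold of 4 is an arbitrary stand-in for "reasonable". The function names are hypothetical.

```python
import secrets

BASE_URL = "https://example.com/p/"  # placeholder address, not a real page

def make_links(recipients):
    """Map a random, hard-to-guess token to each recipient."""
    return {secrets.token_urlsafe(8): person for person in recipients}

def flag_leaks(hits, token_owner, threshold=4):
    """Report recipients whose page drew more distinct visitors than expected.

    `hits` is an iterable of (token, visitor_id) pairs taken from the log.
    Hits on tokens that were never issued are ignored.
    """
    visitors = {}
    for token, visitor in hits:
        if token in token_owner:
            visitors.setdefault(token, set()).add(visitor)
    return {
        token_owner[token]: len(seen)
        for token, seen in visitors.items()
        if len(seen) > threshold
    }

links = make_links(["alice", "bob"])
for token, person in links.items():
    print(person, BASE_URL + token)
```

Each recipient's mail-merge field gets `BASE_URL + token`; later you feed the logged (token, visitor) pairs to `flag_leaks`. Keep in mind that shared office networks can make several people look like one visitor, and one person on mobile data can look like several.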
posted by Quisp Lover at 6:51 AM on September 23, 2014 [2 favorites]


Use Bananatag to track your emails.
posted by Mac-Expert at 7:57 AM on September 23, 2014


Response by poster: Mac-Expert, Bananatag apparently gives info about when/how recipients view my email. I don't see how that's helpful here. Am I missing something?
posted by Quisp Lover at 8:51 AM on September 23, 2014


It will give you tracking options similar to Mailchimp's, but this time you are sending the email from your own account and not an email marketing platform.
posted by Mac-Expert at 9:30 AM on September 23, 2014


It works if I personalize a link for each recipient (which is way easier than personalizing the email itself). All I need to watch for is several pings on any one page (above a reasonable threshold to account for one person viewing on multiple devices or from both work and home).

A link to a publicly accessible web page, no login needed, does not give the impression you are trying to keep a secret. This might encourage "leaks", because it seems like it's already public information.

Also, if people are viewing the page on a shared computer, other people are more likely to come across it without even meaning to than they would be with an email.
posted by yohko at 11:32 AM on September 23, 2014 [1 favorite]


Also, unless your recipients are using a proxy server to view the web page it might be easy for someone else to find out the web page address, and the person you would be blaming for being the leaker would never even know.
posted by yohko at 11:36 AM on September 23, 2014

