

".cPr" instead of ".com" - what gives?
October 26, 2011 12:48 PM

The user's actual email address ends with ".com", but it is POSTed to the application as ".cPr". This has happened with multiple users. Any ideas on what would cause that?

I have a small, free web-based application. As part of a recovery-type feature, users can (optionally) enter their own email addresses. For some reason, I've been getting multiple cases where a user's email domain ends up as "gmail.cPr", "yahoo.cPr" or "[whateverdomain].cPr".

I've also noticed that ".edu" gets changed to ".ePr" as well.

I've gone through my code and I can't see how this is being mangled on my app's end. From looking at the timestamps and mangled addresses, it is possible that these users are in a single location.

Could there be something on the user's computer (perhaps an alternate keyboard layout or language setting or who-knows-what???) that would generate these weird changes?

Has anyone heard of this before?

FWIW, the app uses the Django framework.
posted by foggy out there now to Computers & Internet (14 answers total) 2 users marked this as a favorite
 
Doubtful it's a layout or keyboard issue, since by that theory "du" and "om" should come through as different strings. Without looking at your code it's difficult to say, but my first suspect would be JavaScript.
posted by pyro979 at 1:01 PM on October 26, 2011 [1 favorite]


I'll also bet that you're doing something fishy in the javascript. Are you doing client-side validation of the email address?
posted by le morte de bea arthur at 1:06 PM on October 26, 2011


Are you storing the responses in a database? Any chance the field isn't long enough to hold these email addresses? I don't know about the "Pr" part, but if it were an encoding issue or something like that, it would be unusual for "om" and "du" to come out exactly the same.
posted by missmagenta at 1:08 PM on October 26, 2011


It's also possible they're being truncated and mangled by something else (like JavaScript). I would start by comparing the lengths of the affected email addresses. If they're different lengths, you can at least rule out truncation at a fixed field length.
posted by missmagenta at 1:10 PM on October 26, 2011
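
The length comparison suggested above can be sketched in a few lines of Python. This is only an illustration: the sample addresses are hypothetical values modeled on the patterns described in the thread, and `summarize` is a made-up helper name, not part of the poster's app.

```python
def summarize(addresses):
    """Return (address, length, last label of the domain) for each address."""
    rows = []
    for addr in addresses:
        domain = addr.rsplit("@", 1)[-1]     # text after the last "@"
        suffix = domain.rsplit(".", 1)[-1]   # text after the last "."
        rows.append((addr, len(addr), suffix))
    return rows


# Hypothetical sample values; real ones would come from the database.
samples = ["alice@gmail.cPr", "bob.longer.name@yahoo.cPr", "carol@example.ePr"]

for addr, length, suffix in summarize(samples):
    print(f"{length:3d}  {suffix:4s}  {addr}")
```

If the mangled addresses turn out to have different total lengths, truncation at a fixed column width becomes an unlikely explanation.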


There's no javascript on the form. No validation except on the server. Just a simple web form.

Here's the form line for the email.

<p><label for="id_email">Email</label>: <input type="text" name="email" id="id_email" /> (optional)</p>

Also, the users seem to figure out that they've entered it incorrectly. They eventually correct it to [domain].com.

I'm stumped.

Email addresses are stored in a database, but these addresses are way shorter than the limit.
posted by foggy out there now at 1:11 PM on October 26, 2011


You say that the timestamps make it look like they may have all come from the same location. Is there a chance that these are automated registrations being performed by a bot?

You aren't doing any JavaScript handling of the email addresses, but maybe the bot is when it harvests them before registering.
posted by 256 at 1:32 PM on October 26, 2011


Maybe users in that location are using a non-English keyboard and they're accidentally entering a character that looks like an o but isn't (e.g. Greek small letter omicron: ο, Cyrillic small letter O: о), then it's getting mangled by some non-Unicode-aware code?

Try entering "foo@bar.cȏm" and see how it comes through.
posted by The Tensor at 1:47 PM on October 26, 2011
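
One way to check the look-alike-character theory is to scan submitted values for anything outside ASCII. The sketch below uses only Python's standard `unicodedata` module; `non_ascii_chars` is a hypothetical helper name, and the inputs are test strings, not data from the app.

```python
import unicodedata


def non_ascii_chars(text):
    """Return (index, character, Unicode name) for every non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


print(non_ascii_chars("foo@bar.com"))       # plain ASCII input
print(non_ascii_chars("foo@bar.c\u043em"))  # Cyrillic small letter o in place of "o"
```

An empty result for the stored ".cPr" values would suggest the characters reaching the server are ordinary ASCII, which would point away from this theory.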


I looked at the server logs again. The latest event today came from a university. The email addresses are all gmail.cPr, yahoo.cPr or [universitydomain].ePr. This university's domain appears incorrectly in our database numerous times over the last year; however, it isn't the only one.

Judging from the content that these people uploaded into my web app, it appears to be part of a school project or assignment. It is entirely possible that they were working from a school's computer lab, or over the school's network.

All of the students' email addresses were incorrect the first time they created a [widget]. Then a few minutes later they created a new [widget] and it was added correctly.
posted by foggy out there now at 1:49 PM on October 26, 2011


Try entering "foo@bar.cȏm" and see how it comes through.

"foo@bar.cȏm" is submitted to the server, but Django returns a validation error with a polite note to the user. Nothing was added to the database. Expected behavior with that test.
posted by foggy out there now at 1:55 PM on October 26, 2011


disclaimer: I don't know Django, but I'm a very error-prone beginning coder who has strange things happen all the time.

If this were my code, I'd look for any command or function that had Pr at the beginning of it (Print? Project? Process?) and backtrack to see if there's some missing quote or extra bracket that would cause the email address to truncate and the first characters of that word to be added.
posted by ladygypsy at 2:15 PM on October 26, 2011 [2 favorites]


Is the manglement in the webserver's access log for the request? If it's not, then it happens after there, else before.

Any chance the form has some characters in it when you first load the page, but that they are cleaning it up on a second submit?

Is this publicly facing where we can look at your source?
posted by Mad_Carew at 2:23 PM on October 26, 2011


I'm with ladygypsy - I'd start by grepping through the whole codebase looking for "Pr"

grep -R "Pr" *

If you're lucky, that'll give you a couple dozen starting points and from there, you can continue to narrow it down to find some wonky code run amok.
posted by chrisamiller at 3:39 PM on October 26, 2011


What happens if someone leaves off the .com? Like if I go to your form and just type in "myemail@gmail"?

(This used to be a much bigger problem back in the AOL days, because AOL encouraged users to confuse their usernames with their email address, under the rubric that it was "easier.")
posted by ErikaB at 5:54 PM on October 26, 2011


I would try to find a pattern in the submissions, either IP, user agent, or maybe some sort of proxy server. I've seen proxy servers that mangle data they process (I assume this was a bug), "download accelerators" that click links on pages at random, and so forth. (And don't underestimate the weirdness of the things bots do.)
posted by fogster at 6:03 PM on October 26, 2011
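
The pattern hunt suggested above could start with a simple tally of mangled submissions per IP address and per user agent. This is a rough sketch under stated assumptions: the records are hypothetical placeholders (documentation-range IPs, invented user-agent strings), and "ends with Pr" is only a heuristic taken from the examples in this thread.

```python
from collections import Counter

# Hypothetical records: (ip, user_agent, email). Real data would come from
# the server logs joined with the stored addresses.
submissions = [
    ("192.0.2.10", "BrowserA/1.0", "a@gmail.cPr"),
    ("192.0.2.10", "BrowserA/1.0", "b@yahoo.cPr"),
    ("198.51.100.7", "BrowserB/2.0", "c@gmail.com"),
    ("192.0.2.11", "BrowserA/1.0", "d@example.ePr"),
]


def is_mangled(email):
    """Heuristic from the thread: affected addresses end in 'Pr'."""
    return email.endswith("Pr")


def count_mangled_by(records, key_index):
    """Count mangled submissions grouped by the field at key_index."""
    return Counter(r[key_index] for r in records if is_mangled(r[2]))


by_ip = count_mangled_by(submissions, 0)
by_agent = count_mangled_by(submissions, 1)
print(by_ip.most_common())
print(by_agent.most_common())
```

A cluster around one network range or one user agent would support the idea of a shared proxy or lab setup rather than a bug in the app.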

