October 1, 2015 12:20 PM

What are these data:application... items ABP is reporting?

Over the past year or so, I've noticed a new sort of item showing up on websites whenever I look at AdBlock's blockable items list for the page, and I have no idea what they are or why they exist.

The addresses for these items are like this...
data:application/javascript;base64, or data:image/png;base64, followed by a really long run of jumbled characters.

What are these?

Googling hasn't provided any understandable (to me) explanation of what they are, only that a lot of developers are using them on websites. To a non-coder like myself, something loading on a website with "application" in the name sounds a bit ominous. Normally, I can block them and a site still works just fine, but the main link in this FPP doesn't work at all unless I allow a data:application item load.

posted by Thorzdad to Computers & Internet (6 answers total)

Normally JavaScript and images are in separate files from the HTML for a web page.

For very small JavaScript files or images, it is now possible to embed them in the HTML itself. This saves an extra trip to the server.

It also makes AdBlock Plus's job harder since ad-related images or JavaScript are right there inside the page.

It's possible that nefarious advertisers are embedding images and JavaScript in HTML specifically to avoid AdBlock Plus, although I know nothing about that.

The application/javascript part is nothing to worry about; it's just the standard label (the MIME type) for JavaScript, which is a key part of nearly every modern web page.
posted by miyabo at 12:32 PM on October 1, 2015 [1 favorite]

Traditionally a page would refer to resources like scripts and images by linking to a separate file on some server (maybe not even the same server), and your browser would have to go and open a new connection to get each of them. Instead of a link to an external resource (<img src="">), what you're seeing is the actual contents of that file embedded in-line in the HTML itself. Doing that has a lot of advantages for performance, reliability, and even privacy, but it also makes ad blocking harder (which is a bonus for some and a downside for others). It also means some hassle for the host, who has to fetch and update the ad content themselves.
posted by aubilenon at 12:35 PM on October 1, 2015
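
As an illustration of the two answers above, here is a minimal Python sketch of how a small file's contents can be turned into one of these data: addresses. The helper name to_data_url, the one-line script, and the example.com address are made up for the example; real sites generate these with their own tooling.

```python
import base64

def to_data_url(payload: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data: URL with the given MIME type."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

# A tiny script that would normally live in its own .js file (hypothetical).
script_url = to_data_url(b"console.log('hi');", "application/javascript")

# Traditional: the browser makes a second request to fetch the file.
external_tag = '<script src="https://example.com/hello.js"></script>'

# Inline: the file's contents travel inside the HTML itself.
inline_tag = f'<script src="{script_url}"></script>'

print(external_tag)
print(inline_tag)
```

The "really long run of jumbled characters" in the question is the base64 text after the comma: the file's bytes rewritten using only letters, digits, and a few symbols so they can sit safely inside the page.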

This isn't such a great analogy, but in case the above explanations don't entirely clarify it... remember how back in the 1900s, because of the difference in cost between black-and-white printing and color printing, sometimes you would come across a book that had all of the photographs together in a short series of glossy pages in the middle of the book, and in the rest of the book, in the narrative black-and-white text, you'd have notes like [SEE FIGURE 4 ON PAGE 163] in place of where the author actually wanted the photographs to appear?

The above is sort of how the code of web pages was originally required to work, and this is a new† technology that's kind of like the color pictures being printed directly in-line with the text where they belong, as it were, with the various implications that miyabo and aubilenon describe.

† Not very new really, as the Wikipedia page mentions the rules for how it would work were first proposed in 1998 and it's taken this long for all the browser vendors to implement it and all the people publishing web sites to start using it extensively.
posted by XMLicious at 12:58 PM on October 1, 2015 [2 favorites]

For what it's worth: These days, almost every website can be described as an "application." The distinction between a "site" that just presents data and an "application" that processes data is largely moot, because even sites that just present data usually do so by using application code rather than just simple static HTML.
posted by Tomorrowful at 1:26 PM on October 1, 2015

Tomorrowful: That may be true, but it isn't the correct explanation for this particular use of the word application.

These are data URLs, instead of the way more common http URLs. Data URLs start with the MIME type of their contents. MIME types have a top-level type followed by a subtype (e.g., image/png). The top-level type comes from a fixed set (application, audio, example, image, message, model, multipart, text, video). The spec defines the application top-level type as "some other kind of data, typically either uninterpreted binary data or information to be processed by an application," which means "everything else," which means it doesn't mean much. Zip files, spreadsheets, and PDFs would also be tagged as application/<whatever>, if you could think of a good reason to embed them in a web page (please don't).
posted by aubilenon at 2:35 PM on October 1, 2015
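
To make the structure described above concrete, here is a rough Python sketch that splits a data URL into its top-level type, subtype, and decoded contents. The function name parse_data_url is invented for the example, and it only handles the simple cases; it is not a full implementation of the spec.

```python
import base64
from urllib.parse import unquote_to_bytes

def parse_data_url(url: str):
    """Split a data: URL into (top_level_type, subtype, payload_bytes)."""
    if not url.startswith("data:"):
        raise ValueError("not a data: URL")
    # Everything before the first comma describes the data; the rest is the data.
    header, _, body = url[len("data:"):].partition(",")
    # header looks like "image/png;base64" or "text/plain;charset=utf-8"
    parts = header.split(";")
    mime_type = parts[0] or "text/plain"  # default when the type is left out
    is_base64 = "base64" in parts[1:]
    top_level, _, subtype = mime_type.partition("/")
    payload = base64.b64decode(body) if is_base64 else unquote_to_bytes(body)
    return top_level, subtype, payload

print(parse_data_url("data:application/javascript;base64,YWxlcnQoMSk7"))
```

Here "application" comes back as nothing more than the first half of the label; whether the contents get run as a script depends on the subtype and on how the page uses the URL.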

Tomorrowful: That's not completely true of what's downloaded to the browser, though. I use uMatrix to browse with javascript, cookies, CSS, frames, and nearly everything else turned off, and apart from video websites (which I use youtube-dl for) I've only encountered a handful of sites where nothing at all is visible. The comment sections frequently don't work because they're outsourced, and of course things you'd actually think of as an application, like email, won't work. But sites that are mostly text or images risk making their core content inaccessible to the types of browsers that disabled people use, and causing other problems, if they try to generate it entirely with client-side scripts.

(There's no fundamental reason that this state of affairs is going to last; it just hasn't gotten that bad yet if you understand what's going on and don't want to execute code locally from every site you visit and all its advertisers. Most of the things I click on from Metafilter's front page, for example, I can read without loosening my restrictions. All on a professional white background, too!)
posted by XMLicious at 2:40 PM on October 1, 2015 [1 favorite]
