How to algorithmically identify pretty pictures?
November 1, 2013 6:20 AM

I would like to be able to automatically sort a large amount of images based on pretty or interesting pictures vs ugly or boring pictures (not resolution or jpeg compression or any more 'objective' measure).

I understand this is subjective, but wonder if any work has been done with machine learning or something similar. This is for an online service with lots of users and pictures, so mechanical turking is not really an option.

If there's any actual research or libraries, great! If there's no actual work, any ideas you have on how to go forward are more than welcome. Python is my language of choice.
posted by signal to Computers & Internet (18 answers total) 1 user marked this as a favorite
 
If I were to tackle this project, I would base it on the Golden Ratio and the rule of thirds. Both can be reduced to quantitative measures that computers like. How you apply them would be up to you, though.

Good luck!
posted by jillithd at 6:37 AM on November 1, 2013 [1 favorite]
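To make jillithd's suggestion concrete, here is a rough sketch of a rule-of-thirds score in Python (the OP's language of choice). The "subject" is approximated by the intensity centroid, which is a crude stand-in for a real saliency detector; the function name and normalization are illustrative inventions, not anything standard:

```python
import numpy as np

def thirds_score(gray):
    """Score how close the image's intensity mass sits to a
    rule-of-thirds power point (0 = far away, 1 = dead on).
    `gray` is a 2-D array of pixel intensities."""
    h, w = gray.shape
    total = gray.sum()
    if total == 0:
        return 0.0
    ys, xs = np.mgrid[0:h, 0:w]
    cy = (ys * gray).sum() / total   # intensity centroid (crude "subject")
    cx = (xs * gray).sum() / total
    # the four rule-of-thirds intersections
    points = [(h * a, w * b) for a in (1/3, 2/3) for b in (1/3, 2/3)]
    d = min(np.hypot(cy - py, cx - px) for py, px in points)
    # normalize by the largest possible distance (the image diagonal)
    return 1.0 - d / np.hypot(h, w)

# A bright blob near an upper-left power point vs. one dead center:
img = np.zeros((90, 90))
img[28:32, 28:32] = 1.0          # near the (30, 30) intersection
centered = np.zeros((90, 90))
centered[43:47, 43:47] = 1.0     # middle of the frame
```

A composed photo should then outscore a bulls-eye snapshot; a real system would swap the centroid for face or saliency detection.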


The place to start with this is by getting a PhD in machine learning from MIT. There is no useful existing work in this area. It is something nobody currently knows how to do.
posted by tylerkaraszewski at 6:57 AM on November 1, 2013 [4 favorites]


Flickr lets you explore by interestingness. They use a fair bit of non-image data to compute interestingness, though: "Where the clickthroughs are coming from; who comments on it and when; who marks it as a favorite; its tags ..."

This paper used Flickr "interestingness" (whether a photo was deemed interesting by Flickr or not) as one of the criteria in constructing its training set:

Miriam Redi and Bernard Merialdo. 2012. Where is the beauty?: retrieving appealing VideoScenes by learning Flickr-based graded judgments. In Proceedings of the 20th ACM international conference on Multimedia (MM '12). ACM, New York, NY, USA, 1363-1364. DOI=10.1145/2393347.2396486 http://doi.acm.org/10.1145/2393347.2396486

(The Flickr API lets you retrieve interesting photos, through flickr.interestingness.getList, and flickr.photos.search has options to return search results sorted in order of interestingness.)
posted by needled at 7:04 AM on November 1, 2013
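The Flickr methods needled mentions can be called through the plain REST endpoint with nothing but the standard library. A sketch, assuming you substitute a real API key from Flickr (the helper names are my own):

```python
import json
import urllib.parse
import urllib.request

FLICKR_REST = "https://api.flickr.com/services/rest/"

def interestingness_url(api_key, per_page=10):
    """Build the request URL for flickr.interestingness.getList."""
    params = {
        "method": "flickr.interestingness.getList",
        "api_key": api_key,
        "per_page": per_page,
        "format": "json",
        "nojsoncallback": 1,
    }
    return FLICKR_REST + "?" + urllib.parse.urlencode(params)

def interesting_photos(api_key, per_page=10):
    """Fetch the current interestingness list; returns the photo dicts."""
    with urllib.request.urlopen(interestingness_url(api_key, per_page)) as resp:
        return json.load(resp)["photos"]["photo"]
```

The same pattern works for flickr.photos.search with a sort-by-interestingness option, which could seed a labeled training set.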


I would define a few measures that you can extract from each image that you think might correlate to beauty/interest. Roughly, I would throw out some vague concepts like color variation, repeated patterns, straight lines, sparseness vs. denseness, etc. (or, following jillithd, appearances of the golden ratio and the rule of thirds). Each of those measures can be evaluated as a big matrix or as a single number, as you like.

I would then feed all of those measures into a neural network: take a large set of pictures that you, or someone else, has rated as beautiful/ugly or boring/exciting (either binary or on a scale), and train the network on them.

Not sure how well it would work out. Deciding on and then creating those measures will be the hardest part I would imagine.
posted by molecicco at 7:10 AM on November 1, 2013
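A minimal sketch of molecicco's pipeline. The three extractors are crude guesses at "color variation", "denseness", and brightness, and the trainer is a tiny logistic regression standing in for a real neural network; every name here is illustrative, not from the thread:

```python
import numpy as np

def features(img):
    """Crude one-number stand-ins for the measures above.
    `img` is an H x W x 3 float array in [0, 1]."""
    color_variation = img.std(axis=(0, 1)).mean()   # per-channel spread
    gray = img.mean(axis=2)
    gy, gx = np.gradient(gray)
    density = np.hypot(gx, gy).mean()               # edge "busyness"
    brightness = gray.mean()
    return np.array([color_variation, density, brightness])

def train_logistic(X, y, lr=0.5, steps=2000):
    """Gradient-descent logistic regression -- the simplest trainable
    stand-in for a neural net, to show the train-on-ratings loop."""
    w = np.zeros(X.shape[1] + 1)
    Xb = np.hstack([X, np.ones((len(X), 1))])       # bias column
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (1.0 / (1.0 + np.exp(-Xb @ w)) > 0.5).astype(int)
```

With ratings as 0/1 labels, `train_logistic(np.array([features(i) for i in imgs]), labels)` gives weights you can apply to unseen images with `predict`.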


Everpix claims to do this, picking highlights out of large numbers of photos. Not surprisingly, they don't describe their algorithms.

Easier than "interesting" would be to deprecate poor photos. For example, it should be relatively easy to identify photos that are out of focus, underexposed, or overexposed. I have been told that it is relatively easy to tell if someone in a photo has their eyes closed or open, so you could promote group photos in which everyone has their eyes open and deprecate photos in which some people have their eyes closed.
posted by alms at 7:18 AM on November 1, 2013 [1 favorite]


BTW, my personal experience of Everpix is that the photos they choose for highlights are indeed good photos.
posted by alms at 7:18 AM on November 1, 2013


Pretty is subjective, so for starters, you'd have to build a robust training set.
posted by oceanjesse at 7:25 AM on November 1, 2013


If I were tasked with this, I would start by separating images of people, landscapes, and so on.

Here is an example of considerations that are relevant to one category but not another:

With human beauty, there is a strong universal preference for averageness. The more average the face (for example, a composite formed from a large number of faces), the more ideal it is perceived to be (link).

With natural beauty, there is a strong universal preference for certain types of landscapes. Regardless of the landscapes we grew up with, we seem to enjoy vistas that look like the Pleistocene savannas where we evolved (link).

Once you've established these categories, addressing beauty parameters of each seems more manageable.
posted by rada at 7:32 AM on November 1, 2013


All of the above suggestions would result in sets of images that look "similar to one another", not "pretty" or "interesting".

Machine learning has not progressed anywhere near the point where what you're asking for is possible.
posted by ook at 7:47 AM on November 1, 2013 [1 favorite]


The ACQUINE engine seems to do what you want. Unfortunately the paper describing it doesn't give enough detail to reproduce it, and the online demonstration appears to be dead. You might be able to find similar algorithms by searching Google Scholar for "acquine engine" and reading papers that cite the one linked at the website above.
posted by logicpunk at 7:52 AM on November 1, 2013


Best answer: The ACQUINE algorithm is described in more detail here. Looks like they attempted an approach more or less like what molecicco suggested above: "interesting" is defined as "colorful + in focus + rule of thirds + properly exposed" etc, toss it into a neural net and use human ratings to train it. The conclusion notes that "certain extracted features did not show a significant correlation with aesthetics".
posted by ook at 8:18 AM on November 1, 2013 [1 favorite]


I really, really think you should work with your users here. Figure out some way to get your users to rate pictures, and use those ratings to sort them.
posted by zscore at 9:04 AM on November 1, 2013


Best answer: 1. Develop a theory of visual beauty.
2. Select an appropriate type of classifier for this theory, and train it somehow.
3. Apply the classifier to images.
(4. Yes, if you were to get here, you would certainly profit.)

Note that 1,2 are unsolved and entirely non-trivial; the combination of them even spans many fields. If I had to take a stab at this* I would use human judgments as a partial proxy for 1: (i) select a specific genre so as to minimize the number of features you might be working with (e.g. outdoor scenes), (ii) present many examples of this genre to many subjects on mechanical turk to get ratings of "beauty" on a simple 1-dimensional scale, (iii) take a list of potentially relevant features from work on visual perception in that genre (e.g. start here or maybe here), possibly starting with the simplest brute force features like broad image statistics. Then (iv) build a classifier (a possible starting point google search) that uses these features to try to match the human judgments about the images. Then, hopefully (very hopefully), the classifier will generalize to novel images. It won't, which would lead to iterative development/research (especially in terms of trying to figure out what the right features are, and generalizing across types of image).

* Seriously, this would be an ambitious research agenda, not a feasible practical project.
posted by advil at 10:32 AM on November 1, 2013
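Step (iv) of advil's plan, and the generalization worry that follows it, can be sketched with a plain least-squares fit and a hold-out split. The correlation between predicted and actual ratings on images never seen during fitting is the honest number to watch; the helper names are my own:

```python
import numpy as np

def fit_ratings(X, y):
    """Least-squares fit of 1-D 'beauty' ratings to image features
    (with a bias term) -- the simplest possible classifier stand-in."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def holdout_corr(X, y, train_frac=0.7, seed=0):
    """Fit on one split of the rated images, then correlate the
    model's predictions with the human ratings on the held-out rest.
    A high number here is the 'hopefully it generalizes' part."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    cut = int(train_frac * len(y))
    tr, te = idx[:cut], idx[cut:]
    w = fit_ratings(X[tr], y[tr])
    Xb = np.hstack([X[te], np.ones((len(te), 1))])
    return np.corrcoef(Xb @ w, y[te])[0, 1]
```

When the held-out correlation collapses, that is the signal to revisit the feature list, which matches advil's prediction about where the iteration happens.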


Neural network training is subject to surprising results. Back in the day when neural networks were something of a fad in computer circles, there was a project in the DOD to try to use a neural network to evaluate aerial photos to look for Warsaw Pact armor. They created a large library of aerial photos that included Warsaw Pact armor and another library that was NATO, and trained the network (they thought) to tell them apart.

But when they tried to use it to evaluate new photos, the results were terrible. Someone dug into the neural network internals and found the problem: all the NATO pictures had been captured on sunny days, and all the Warsaw Pact photos on cloudy days. The neural network had come to the conclusion that NATO tanks had shadows and Warsaw Pact tanks didn't.

Which was true for the photo sets used to train it, but not a very useful result.

I think it's unlikely that the OP's project can be done algorithmically, because I don't see any way to come up with a rigorous definition of "attractive", and without that you're nowhere.
posted by Chocolate Pickle at 10:43 AM on November 1, 2013


Best answer: I think the other replies here are too pessimistic. The important thing to consider is that the user experience matters for most software projects. A key part of the user experience is the set of expectations created around the feature, and the way the user interacts with it. Those expectations and that interaction can be shaped and managed. With proper shaping of user expectations and the right interaction design, an "artificial intelligence" based feature can succeed even when the algorithm generates a >50% false positive rate and a false negative rate of 90% or more.

As proof, I give you the Google. In the early days, and even today, most users consider most searches reasonably successful if one of the results is useful. That is a false-positive rate of 90%, but it works, because the summaries presented allow the user's brain to quickly find the most relevant results. Moreover, if Google successfully delivers one or more relevant results, the user is unlikely to notice if it missed 5-10 pages that were equally or more relevant.

So, in your case, you need to position the feature in such a way that users are delighted by your successes and understanding of your failures. Part of the solution is to show the users multiple candidates for beautiful images at a time, at a sufficient size that they can pick out the truly beautiful images at a glance. Another part is to tune the results with the understanding that a high proportion of false negatives and false positives is OK as long as the rate of true positives is high enough.

As for algorithms, I have no idea. I am sure there are plenty of cookbook feature-extraction algorithms. I'd put together a training dataset of beautiful images, bland images, and ugly images and then run them all through all the feature extractors. Then I'd start exploring, looking for obvious patterns with clustering, and/or pushing things through something like a random forest with various weightings.

For what it's worth, check out the Python scikit-learn and scikit-image packages.
posted by Good Brain at 1:47 PM on November 1, 2013
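Good Brain's random-forest suggestion, sketched with scikit-learn. The features and rating rule in the demo are fabricated purely for illustration; in practice the feature matrix would come from scikit-image extractors run over the photo library:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_forest(features, labels, seed=0):
    """Fit a random forest on already-extracted image features.
    Labels could be e.g. 0 = ugly, 1 = bland, 2 = beautiful."""
    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    clf.fit(features, labels)
    return clf

# Toy demo with made-up data: features 0 and 1 drive the pretend rating.
rng = np.random.default_rng(0)
X = rng.random((300, 4))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)
clf = train_forest(X, y)
# clf.feature_importances_ shows which measures matter -- a cheap way
# to do the "exploring, looking for obvious patterns" step.
```

The importances make the exploration concrete: extractors that contribute nothing can be dropped before investing in fancier ones.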


Best answer: You mention in your question that prettiness is subjective, and therein lies the problem. Computers don't do subjective. They can sometimes fake it if you give them enough data to train on (e.g. give them a bunch of pictures that humans have rated as interesting or not, and then feed that data into a classifier), but it's still an approximation, and any pictures that don't fit the model the training generated would still be overlooked.

If you want something that is possible to implement, find some sort of objective criterion to pick out. An easy one would be the prevalence of a given color, or whether the image has high contrast (which can be accomplished with some histogram analysis). A slightly harder one would be whether there are sharp edges in the picture (you'd probably use a Fourier transform to figure that one out). Or, if you do pick harder classifications, pick ones that *have* had a lot of work put into them (like facial or object recognition) so you can hopefully find a solution where a lot of the legwork has already been done for you.

You say you have a lot of users; I suspect you'll get more useful results by adding a mechanism that allows them to rate or like pictures and then sorting by popularity or rating. This has its own issues, but again, you've run headlong into some pretty tough machine learning territory.
posted by Aleyn at 6:16 PM on November 1, 2013
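Two of those objective criteria, sketched in numpy: robust contrast from the intensity histogram, and "sharp edges" measured as the fraction of spectral energy away from the lowest frequencies, per the Fourier-transform idea. The percentile choices and the low-frequency radius are arbitrary knobs, not established values:

```python
import numpy as np

def contrast(gray):
    """Robust contrast: spread between the 5th and 95th percentiles
    of the intensity histogram (outlier pixels are ignored)."""
    lo, hi = np.percentile(gray, [5, 95])
    return hi - lo

def high_freq_ratio(gray):
    """Fraction of spectral energy outside the lowest frequencies.
    Images with sharp edges carry more energy out there."""
    f = np.fft.fftshift(np.fft.fft2(gray))
    power = np.abs(f) ** 2
    h, w = gray.shape
    cy, cx = h // 2, w // 2
    r = min(h, w) // 8                      # "low frequency" radius
    ys, xs = np.ogrid[0:h, 0:w]
    low = (ys - cy) ** 2 + (xs - cx) ** 2 <= r ** 2
    return power[~low].sum() / power.sum()
```

A checkerboard (all edges) scores far higher on `high_freq_ratio` than a smooth gradient, which is the behavior you'd use to separate crisp photos from mush.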


OkCupid did a post a few years ago about their user data on attractiveness and image attributes. Not sure if this is too narrow a focus (dating profiles) but it's interesting.
posted by aspen1984 at 11:38 PM on November 1, 2013


Is there a user behavior that you could use as a signal for interesting photos? e.g. can you have an upvote button, or get them to play a "game" where you show them two results and they click on the one that is more interesting?

If so, you might be able to train an estimator that doesn't use the photos' characteristics at all; it just relies on showing users unknown photos some of the time in order to test out which ones are interesting.

You could possibly try to gamify the whole experience, let users earn points when the images they upvote get lots of upvotes, which would encourage users to do the exploration of content for you.
posted by vegetableagony at 11:42 AM on November 14, 2013
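One way to turn the two-photo "game" into a ranking is chess-style Elo scoring of each click. This is my own suggestion for implementing vegetableagony's idea; the starting rating and K-factor are conventional Elo defaults, not anything from the thread:

```python
def elo_update(winner, loser, ratings, k=32):
    """Record one 'this photo beat that photo' click, Elo-style.
    Photos start at 1000; repeated comparisons sort the whole
    collection by how often each photo wins against strong opponents."""
    ra = ratings.setdefault(winner, 1000.0)
    rb = ratings.setdefault(loser, 1000.0)
    expected = 1.0 / (1.0 + 10 ** ((rb - ra) / 400))  # winner's win prob
    ratings[winner] = ra + k * (1 - expected)
    ratings[loser] = rb - k * (1 - expected)
```

Sorting `ratings` descending then gives the "most interesting" feed, and showing the occasional unrated photo against a known mid-rated one does the exploration vegetableagony describes.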


This thread is closed to new comments.