Filed under 'image processing':

Multicolr Search Lab, from Idée, Inc.


[via Legion]

Thanks to Garrett over at the new(-ish) Harvardian-run blog Legion for pointing out this seriously awesome use of Flickr and image-matching algorithms. The tool lets you search Flickr for photos that are predominantly one color, or a small palette of colors. The most beautiful results come from clicking just one or two colors: the individual pictures might not be spectacular, but seeing them all arranged on one page is magical, as the screenshot above might attest.
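I have no idea how Idée actually does the matching, but the basic idea of "find photos that are mostly this color" is easy to sketch. The toy version below is my own guess at the simplest possible approach, not their algorithm: an image is just a list of RGB pixels, and its score is the fraction of pixels that fall within some distance of any color in the query palette. The function names and the tolerance value are made up for illustration.

```python
# Toy sketch: score an image by how much of it sits near a set of query colors.
# Images are plain lists of (r, g, b) tuples. This is NOT Idée's actual method.

def color_distance(a, b):
    """Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def palette_score(pixels, palette, tolerance=60.0):
    """Fraction of pixels within `tolerance` of at least one palette color."""
    if not pixels:
        return 0.0
    hits = sum(
        1 for p in pixels
        if any(color_distance(p, q) <= tolerance for q in palette)
    )
    return hits / len(pixels)


def rank_by_palette(images, palette, tolerance=60.0):
    """Return image names sorted from best palette match to worst."""
    scored = [
        (palette_score(pixels, palette, tolerance), name)
        for name, pixels in images.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _score, name in scored]


# Two made-up "photos": one mostly orange, one mostly green.
images = {
    "sunset": [(250, 120, 30)] * 9 + [(20, 20, 20)],
    "forest": [(30, 140, 40)] * 10,
}
ranking = rank_by_palette(images, palette=[(255, 128, 0)])
```

Searching for orange puts the "sunset" photo first. A real system would presumably work in a perceptual color space and index millions of images ahead of time rather than scanning pixels per query, but the scoring idea is the same.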

This reminds me of 80 Million Tiny Images, a project out of MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL). Only instead of sophisticated image-matching algorithms (a specialty of Idée Inc., the developers of the Multicolr app), the MIT researchers build on a map of semantic relationships between words. Basically, they take tens of thousands of English nouns and put them in a grid so that adjacent words have similar meanings. Then they fill each grid cell with a thumbnail that is the average of the top hits returned when that word is searched on Google Images and other engines. The resulting color mosaic is pretty awe-inspiring, and has some interesting patterns. (I rambled quite a bit about this already.)
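The "average of the top hits" step is simple enough to write down. Here is a minimal sketch, assuming every thumbnail has already been resized to the same dimensions and is stored as rows of RGB tuples; the function name is mine, and this is only the plain pixel-wise mean, not the researchers' actual code.

```python
def average_images(images):
    """Pixel-wise arithmetic mean of same-sized RGB images.

    Each image is a list of rows; each row is a list of (r, g, b) tuples.
    """
    if not images:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all images must share the same dimensions")
    n = len(images)
    # For every pixel position and channel, average across all images.
    return [
        [
            tuple(round(sum(img[y][x][c] for img in images) / n) for c in range(3))
            for x in range(width)
        ]
        for y in range(height)
    ]


# Two tiny 1x2 "thumbnails" for the same word.
first = [[(0, 0, 0), (100, 200, 50)]]
second = [[(200, 100, 0), (100, 0, 150)]]
tile = average_images([first, second])
```

If the hits for a word tend to look alike (same dominant color, same rough composition), the mean keeps that structure; if they don't, everything washes out toward a muddy blob.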

It’s interesting that while the theory of image processing can get pretty involved, successful commercial applications of these algorithms tend to be built around simple, elegant ideas that get refined progressively (think Google PageRank). For example, procedures for color-matching (among other visual parameters, like shape and texture) have been around forever, but only now are we seeing companies with the resources and design sensibilities to parlay this into useful and attractive products like the one above. I think what’s new is the access we now have to the huge datasets on online image banks like Flickr or professional photo services, and the broadband infrastructure to crawl these databases so extensively.

This is a cool example of the fact that, while technological improvements in computing and networking are necessarily quantitative (more data storage and bandwidth, with an occasional overhaul in architecture), second-order changes in how we use that technology continue to be qualitative, dramatic, and mind-boggling. It doesn’t matter that we’re using image-processing concepts that computer scientists gave us 20 years ago. The fact that such a simple idea can be put through enough hours of seemingly inelegant number-crunching to produce something like the above is itself a profound and novel achievement. </gushing at the incomprehensible greatness of computers>

80 Million Tiny Images

Tiny Images Screenshot

Thanks to C.J. for sending me the following incredibly cool link: 80 Million Tiny Images. It’s a mosaic of millions of online images corresponding to nouns in the English language, and the spatial arrangement of the images in the mosaic reflects their semantic relationship to each other, i.e., closer images represent words that are closer in meaning. From the page (which also contains a link to the research paper):

Each of the tiles in the mosaic is an arithmetic average of images relating to one of 53,463 nouns. The images for each word were obtained using Google’s Image Search and other engines. A total of 7,527,697 images were used, each tile being the average of 140 images. The average reveals the dominant visual characteristics of each word. For some, the average turns out to be a recognizable image; for others the average is a colored blob. The list of nouns was obtained from Wordnet, a database compiled by lexicographers which records the semantic relationship between words. Using this database, we extract a tree-structured semantic hierarchy which we use to arrange tiles within the poster. We tessellate the poster using the hierarchy so that the proximity of two tiles is given by their semantic distance.
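To get a feel for how a tree-structured hierarchy can drive the layout, here is a rough sketch of one way to do it. This is my own guess (a simple treemap-style split), not the layout from the paper, and the nested-list "hierarchy" is a made-up stand-in for Wordnet: each group of related words gets a rectangle, which is divided among its children in proportion to how many words they contain, so words in the same branch end up next to each other.

```python
def count_leaves(tree):
    """Number of leaf words under a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return 1
    return sum(count_leaves(child) for child in tree)


def tessellate(tree, x=0.0, y=0.0, w=1.0, h=1.0, horizontal=True):
    """Assign each leaf word a rectangle (x, y, w, h) inside the given area.

    Siblings split their parent's rectangle in proportion to their leaf
    counts, alternating the split direction at each level of the tree.
    """
    if isinstance(tree, str):
        return {tree: (x, y, w, h)}
    total = count_leaves(tree)
    layout = {}
    offset = 0.0  # fraction of the parent already handed out
    for child in tree:
        share = count_leaves(child) / total
        if horizontal:
            layout.update(tessellate(child, x + offset * w, y, share * w, h, False))
        else:
            layout.update(tessellate(child, x, y + offset * h, w, share * h, True))
        offset += share
    return layout


# Hypothetical two-branch hierarchy: animals on one side, trees on the other.
hierarchy = [["dog", "cat"], ["oak", "pine"]]
layout = tessellate(hierarchy)
```

Here "dog" and "cat" share the left half of the poster and "oak" and "pine" share the right half, which is the kind of proximity-follows-meaning arrangement the quote describes.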

The most interesting thing about the result is the remarkable degree of color agreement they achieve. Even though each tile is the average of roughly 140 photos of the same thing, the end result is often surprisingly recognizable, and nearby tiles tend to share a color scheme. The overall mosaic, rather than appearing as a wash of meaningless color noise, has some fairly uniform blobs on it because semantically related words sit next to each other. The one they give on the website (composed of about 7.5 million images, it seems, not 80 million) almost looks like a hunched-over figure of a person. I wonder what the full 80 million images put together would look like.

The real question is, when can I get this as a wall poster to put up in my room?