First look at Cuil: search results are polluted with unrelated images

After the recent announcement of another Google search competitor, Cuil, I finally gave it a short (and unscientific) test run today. I could not try it earlier because the Cuil servers were reportedly out of service due to an unexpectedly high number of visitors in the first day(s).

My first impressions were mixed. My results were not as bad as others reported, but also not as good as I expected (from any web search engine). I don’t want to talk about the search result quality in this post however – I want to try Cuil a bit longer before coming to a first conclusion – but I’d like to point out something interesting that I discovered while doing random searches with Cuil.

Observation 1: Cuil presents search results together with foreign images, i.e. a web page http://yourwebsite.com/ is shown together with images from http://differentwebsite.com/.

It’s not only that a web page and its snippet (a short excerpt of some of the web page’s text) and the “foreign” image are shown next to each other – which might just have been a web design bug – but the “foreign” image also links to the original web page, giving the impression that said image really belongs to the web page and originates from there.

Observation 2: Cuil hyperlinks those foreign images to point to the original search results, i.e. readers will get the impression that those images have been extracted from the search result pages directly.

For example, I searched Cuil for “Safer Internet Project”, which is both the name of a project I’m working on and a commonly used name for a related European initiative to create a more secure Internet environment. Have a look at the screenshot. I cleaned it up a bit to make it more readable (Cuil’s 3-column layout makes for some huge screenshots, so I kept only the center column and snipped one search result out of it to make the screenshot smaller), but other than that it’s 100% Cuil.

Screenshot: Cuil search results are presented with unrelated, foreign images (July 2008)

You can see that three of my own web pages are returned by Cuil when searching for “Safer Internet Project”. You can also see that Cuil embeds three images into the snippets of my pages, none of which were made by me, are linked by me, or have any relation whatsoever to my website. For example, the first image links to http://www.michael-noll.com/wiki/Safer_Internet_Project, the second image to http://www.michael-noll.com/. To repeat, none of these images have any relation to my website. Oh, and in case you were wondering: it does not matter whether “Safer Search” is turned on or off.

Since Cuil hosts the thumbnails on its own server infrastructure, namely cuilimg.com (link for the first image thumbnail), it’s also difficult to trace back where the foreign images originally come from. The thumbnails are also quite small, which makes it even harder to recognize details. I think I can read “In [unidentifiable word] High’s computer lab, students videoconference with Singapore College” in the second image, but I wasn’t able to find the image at its actual home location on the Internet.

More fun

Browsing through the search result list for “Safer Internet Project”, I also discovered the following.

Observation 3: Cuil presents the “same” foreign images for several different search results, i.e. images from http://differentwebsite.com/ will be shown next to both http://yourwebsite.com/ and http://mywebsite.com/.

Examine the next two screenshots:

Screenshot: Same foreign images shown for different websites in search results (1)

Screenshot: Same foreign images shown for different websites in search results (2)

For example, in screenshot (1) you can see a link to the resume of “Kenneth Ray, Internet Project Manager / Business Analyst” next to a picture of soldiers jumping out of a military helicopter during what looks like the Vietnam War. In screenshot (2), you can see the same helicopter picture next to Richard Swetenham’s website about European news in the area of the Internet, information society and information content (he’s working for the European Commission). The same goes for the other two highlighted pictures, which seem to “float” from search result to search result.

Please also note that almost all images in the two screenshots are duplicates!
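
If you want to check this kind of duplication yourself without staring at screenshots, here is a minimal Python sketch. It assumes you have already noted down, by hand or otherwise, which thumbnails appear next to which result pages. The function name, the thumbnail labels, and the sample data are all hypothetical; this is not anything Cuil provides and not how I produced the screenshots.

```python
from collections import defaultdict


def shared_thumbnails(results):
    """Return thumbnails that appear next to more than one result page.

    `results` is a list of (page_url, [thumbnail_id, ...]) pairs.
    The return value maps each shared thumbnail to the sorted list
    of pages it was shown next to.
    """
    pages_by_thumb = defaultdict(set)
    for page_url, thumbnail_ids in results:
        for thumb in thumbnail_ids:
            pages_by_thumb[thumb].add(page_url)
    # Keep only thumbnails that were shown for at least two different pages.
    return {
        thumb: sorted(pages)
        for thumb, pages in pages_by_thumb.items()
        if len(pages) > 1
    }


# Hypothetical sample data using the placeholder sites from this post;
# the thumbnail labels are made up for illustration.
results = [
    ("http://yourwebsite.com/", ["thumb-helicopter", "thumb-lab"]),
    ("http://mywebsite.com/", ["thumb-helicopter"]),
    ("http://differentwebsite.com/", ["thumb-other"]),
]

shared = shared_thumbnails(results)
for thumb, pages in shared.items():
    print(thumb, "appears next to:", ", ".join(pages))
```

Any thumbnail that shows up in the output is one that “floats” between unrelated search results, just like the helicopter picture above.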

Now what does this mean for us?

Well, for you and me, this means that there’s a chance that your web page will be “polluted” by others’ images. Mind some porn pictures next to your company’s website? Guess not.

For Cuil itself, my observations suggest they have a serious bug somewhere, whether it’s in the indexing/processing part, the search result view, or wherever.

I’ll contact Cuil if possible and notify them of this strange behavior, which I think is not what they intended. Also, this post is not a rant against Cuil (hey, we all make mistakes). On the contrary, I wish them all the best with their search engine. After all, we need more competition in the search market, and in particular, I really like Cuil’s privacy policy MUCH more than Google’s, Yahoo’s, or Microsoft’s. So good luck, guys!