Tag Archives: Social Search

Yahoo Gives Up on Social Search

In an interview that strangely made headlines only in Indian tech blogs, Yahoo Research Labs’ Chief Prabhakar Raghavan declared that Yahoo will not replace its search with Bing. OK, the Yahoo-Microsoft deal is not really off, but the deal details turn out to imply that Yahoo will only use Microsoft search technology as the backend, and keep building its own smart front-end to it that will make use of Yahoo’s content assets. Raghavan says:

“Yahoo will not use Bing. Bing is a branded search engine that Microsoft is building on top of its search back-end and we will build our own search front-end on that same Microsoft back-end. It (using Bing) is not the case, at least as envisioned at the moment”

This actually makes perfect sense. Stop spending tons of resources on crawling and ranking in a futile war with Google, and focus on building the user experience over it, leveraging Yahoo’s advantage – content. Raghavan mentions scenarios that sound a lot like Yahoo shortcuts (that’s really old news) as one example of how to deliver a more complete experience over commodity search results.

The article then goes on to discuss the second focus for Yahoo, social applications, and mentions Microsoft’s tie-up with Facebook for access to the social graph. Raghavan is quoted as saying:

“Social networks are not just a place to hang out, but to get things done. It predates the web.. I’m not sure where the sweet spot is, we’re still doing research on it”

Also makes perfect sense. With Google as a common enemy, and Microsoft a Facebook partner, Yahoo may be better positioned to deliver social applications that leverage the de-facto standard of the Facebook graph, rather than push its own failed networks.

So why is my post title suggesting what it’s suggesting??

There is one catch in sub-contracting your search results: you are now limited in what you can do with search ranking. The best you can do is re-rank the set of results Microsoft’s technology supplied you with before presenting it to the user. As I’ve pointed out in the past when talking about Delver’s technology, social (graph-based) search is a game that cannot be played by re-ranking, since it’s a classic long tail problem: the results that matter socially are rarely among the globally popular ones the back-end returns in the first place. So when you can’t interfere with how search results are ranked, you also can’t deliver true social search, as Google recently did. One less social application Yahoo can build…
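A toy illustration of that catch, as a minimal sketch only. Everything here is hypothetical: `backend_top_k` stands in for whatever a back-end hands over, and the URLs and rank positions are made up. It is not how Yahoo, Microsoft or Delver actually rank anything.

```python
def backend_top_k(ranked_urls, k):
    """Hypothetical back-end: hands over only its own top k results."""
    return ranked_urls[:k]


def social_rerank(candidates, friend_urls):
    """Move friends' pages to the front; sorted() is stable, so the
    back-end's order is kept within each group."""
    return sorted(candidates, key=lambda url: url not in friend_urls)


# Made-up global ranking: popular pages first, a friend's post deep in the tail.
global_ranking = [f"popular.example/{i}" for i in range(1000)]
global_ranking.insert(700, "friend-blog.example/some-post")
friend_urls = {"friend-blog.example/some-post"}

candidates = backend_top_k(global_ranking, k=50)
reranked = social_rerank(candidates, friend_urls)

# The friend's post never reached the front-end, so no re-ranking can show it.
friend_surfaced = any(url in friend_urls for url in reranked)
print(friend_surfaced)
```

The re-ranker itself works fine when a friend's page happens to be in the candidate set; the problem is that long-tail pages almost never are.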

Google Nails Down Social Search

Google’s Social Search is doing the walk, all the rest are just doing the talk. As soon as I activated the Social Search experiment, my next search yielded a social result. No setting up, showing how I am connected to that result (including friends of friends), showing as part of the standard web results…

Contrast this with Microsoft’s poor attempt at “social search” by indexing tweets and status messages and showing them regardless of the actual searcher (example search; you’ve got to be on the “United States” locale on Bing to see it).

Then also contrast it with Facebook’s announcement back in August of its implementation of searching within friends’ posts – a less grandiose announcement that nonetheless delivered a far more social experience than Bing’s. Still, it’s a very limited experience and far from being a true information source for any serious search need.

So how does Google overcome the main obstacle: collecting your connections?

Google relies on its own sources and on open sources it can obtain by crawling the social graph. That is the true reason why Facebook is not part of Google’s graph (no XFN/FOAF marking on Facebook’s public pages). Google may be counting on Facebook’s inevitable opening up, and with Gmail’s rising popularity it becomes a reasonable alternative even for Facebook users like me.
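To show what “XFN marking” gives a crawler to work with, here is a minimal sketch that pulls relationship links out of a public page using only Python’s standard library. The set of `rel` values is a subset of the XFN vocabulary, the sample page and URLs are made up, and this is an illustration of the markup rather than a description of Google’s crawler.

```python
from html.parser import HTMLParser

# A subset of the XFN rel values; the full vocabulary has more.
XFN_VALUES = {"friend", "acquaintance", "contact", "met",
              "co-worker", "colleague", "me"}


class XFNLinkParser(HTMLParser):
    """Collects (href, XFN rel values) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.edges = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        rels = set((attr.get("rel") or "").lower().split())
        xfn = rels & XFN_VALUES
        if href and xfn:
            self.edges.append((href, sorted(xfn)))


def extract_xfn_edges(html):
    """Return the outgoing social-graph edges declared on one page."""
    parser = XFNLinkParser()
    parser.feed(html)
    return parser.edges


# Made-up page: one XFN link, one ordinary link.
sample_page = (
    '<a href="http://alice.example/" rel="friend met">Alice</a>'
    '<a href="http://shop.example/" rel="nofollow">Shop</a>'
)
edges = extract_xfn_edges(sample_page)
print(edges)
```

A page with no such `rel` attributes yields no edges at all, which is the situation described above for Facebook’s public pages.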

Sadly, all this great news gave zero credit to Delver, where it all happened first.

The Broken Web

Dave Winer recently pointed out two trends that pose risk to user-created content on the web:

  • Over-reliance on url-shorteners. Fueled by twitter’s laconic style, more and more links to content are created using an indirection via url shortener services such as bit.ly and tr.im. The collapse of such a service may turn tons of links into broken links in an instant.
  • Centralized conversation platforms. Shifting the conversation away from their blogs, influential content publishers chose to center on platforms such as twitter and FriendFeed. Besides the increased noise inherent to lifestreaming, there is increased risk in making your contributions (and having your readers contribute back) on a site run by a private company with no real commitment to its users.

In the past two weeks both these risks materialized to some extent. The url-shortener service tr.im shut down, and that 404-iceberg was avoided at the last minute by the owners’ decision to open-source it. Then Facebook acquired FriendFeed, and their PR said:

“…FriendFeed.com will continue to operate normally for the time being as the teams determine the longer term plans for the product.”

Hmm, right… So Scoble’s blog still loves him, and is probably a safer publishing venue.

But why is this such a big deal anyway?

We tend to forget how much we have invested into such services until they break down (as was the case with ma.gnolia). The web’s strength is in storing, and being able to search, the content produced by millions of earthlings. When large amounts of content or links are this fragile, the impact is significant. Especially for social search, that content could be vital (OK, perhaps except for that part about what you had for breakfast).

As always with such issues, the best solution is decentralization. For url shorteners, the ‘shortlink’ protocol was already suggested for site-maintained shorteners, and WordPress has already implemented it. My blog is already enabled, try http://wp.me/plBAi-8Q. And then content decentralization is in our hands. Think about it the next time you post your thoughts into twitter rather than in your blog…
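For the curious, a site-maintained short URL is advertised in the page itself with a `<link rel="shortlink">` tag, so a client can pick it up instead of minting a third-party short URL. A minimal sketch of that lookup, using only Python’s standard library; the sample page is made up, and only the wp.me address comes from this post:

```python
from html.parser import HTMLParser


class ShortlinkParser(HTMLParser):
    """Remembers the first <link rel="shortlink" href="..."> it sees."""

    def __init__(self):
        super().__init__()
        self.shortlink = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.shortlink is not None:
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "shortlink" in rels and attr.get("href"):
            self.shortlink = attr["href"]


def find_shortlink(html):
    """Return the page's own short URL, or None if it does not declare one."""
    parser = ShortlinkParser()
    parser.feed(html)
    return parser.shortlink


# Made-up page head for illustration.
sample_page = (
    "<html><head><title>The Broken Web</title>"
    '<link rel="shortlink" href="http://wp.me/plBAi-8Q" />'
    "</head><body>...</body></html>"
)
print(find_shortlink(sample_page))
```

When this returns None, a client would have to fall back to an external shortener, which is exactly the dependency discussed above.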

Mechanical Hype, revisited

As I wrote previously, I really like the idea behind Aardvark (previously known as Mechanical Zoo) and it’s a great social Q&A tool, but it simply is not “social search” (and unlike TechCrunch, RWW realize that). The Aardvark team still pushes with that terminology, I guess for a good reason given the financial climate, and disperses more of it in a white paper. Once they actually start searching in their aggregated Q&A repository to provide you with an available answer without bothering your network – that would become more of a search solution, rather than Q&A.

Having played with the product a bit, I also see an inherent flaw in the social premise here. Aardvark provides me with answers from friends, or friends-of-friends. Now, it’s more likely I’ll get answers from friends-of-friends, as there are simply a lot more of them. However, these would be people who don’t know me, and will not provide a personal answer that is tailored to my own individual needs.

Still, it’s a great way to make new friends. Not kidding – Aardvark strongly drives conversations, as Danny Sullivan also pointed out, and since this friend-of-friend was the one who responded to my question, I’d feel more comfortable discussing further. Presumably Aardvark will also track this, and practically add this person to my direct social graph.


Update: Max Ventilla of Aardvark commented in my previous post that indexing your graph and finding the right person to answer your query has, in fact, the ingredients of social search. He has a point there, but still that search ends in finding a person, not information, so it’s more of a people search. Still, I agree that in executing this task, the varkers face similar difficulties to those we faced in Delver, albeit on a much smaller scale.

IBM IR seminar talk on Socially Connected Search

I had the pleasure today of presenting Delver in a talk I gave at IBM Haifa Research Labs IR seminar. My slides are over here.

The seminar’s focus this year was on social search, and there were quite a few other talks I found very interesting; I’ll blog about those later on too. One of the positive surprises for me was the amount of work carried out at IBM-HRL on social/web 2.0 tools such as SONAR. Impressive social product work for a non-consumer player; I plan to read more of their published work on that.

Social Search, or Search Socially?

An interesting paper at the Computer-Human Interaction conference CSCW08 described social search in terms of the entire searching process, from consulting with friends on what keywords to use, to sharing the search outcome. The research was based on interviews on Mechanical Turk asking for respondents’ recent search experiences, and concluded with some practical suggestions. After watching the presentation slides, I also exchanged some thoughts with one of the authors, Brynn Evans.

Mechanical Hype

I like the idea behind Aardvark. I’m also happy for Mechanical Zoo for securing their future just in time. But to call an IM bot that sends a question to your network social search – well, that’s almost as hype-ish as labeling an advanced bookmarking service as Semantic Web. Search is about smartly tapping a mass of information, where the problem is in finding the right needle in an existing haystack. Aardvark is simply a Q&A site smartly superimposed on the social network.

And nevertheless a clever concept, wishing them success.

Gmailizing blogs

When I first started using gmail, I was shocked: “What? no folders??…” I couldn’t figure out those funny labels, and searching my emails instead seemed a strange idea. Nowadays, when I have to locate an old email, I pray that it’s on gmail and not in my Outlook (even with Vista’s improved search).

The dilemma between search and browse paradigms runs through many software user interfaces, and was especially emphasized with Google’s focus on search in their products. In some areas, such as finding web sites, the search paradigm has indisputably won and the once-king Yahoo! Directory barely has a stub article in Wikipedia. In others, such as news, search is a rarely used service, and a portal-like browse interface rules.

But in reality these are complementary paradigms, rather than competing. Browsing is excellent when the data fits a clear and sufficiently granular taxonomy, shared by the author and reader, and unstructured searching fits into all the other cases (and in some cases, like web search, that’s all there is). Oh, and one more difference: search is A LOT easier. Just stuff all the text into strong index machines, and give the user the ubiquitous search box.

With gmail I wouldn’t think twice before moving an email to the archive, I have no doubt I’ll find it when needed, and all the hassle of managing folders is gone. A blog is no different. You have an author communicating a heap of knowledge to readers, and instead of sorting it for future reference in tags and categories (the complete opposite of “…a clear and sufficiently granular taxonomy…“) they should be gmailized – stuff them in an index and search.
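To show how little “stuff them in an index and search” asks of the author, here is a minimal sketch of a tiny inverted index over blog posts. The class name, post ids and sample texts are all made up for illustration, and a real engine does far more (ranking, stemming, phrase queries); the point is only that no categorizing is needed up front.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class PostIndex:
    """A toy inverted index: term -> ids of the posts containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, post_id, text):
        for term in tokenize(text):
            self.postings[term].add(post_id)

    def search(self, query):
        """Return ids of posts that contain every term in the query."""
        terms = tokenize(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)


index = PostIndex()
index.add("why-blog", "Why blog now? Social search changes who reads a post.")
index.add("ford-mp3", "How to plug an mp3 player into a Ford Focus audio system")

print(index.search("mp3 ford"))
print(index.search("folders"))
```

No tags or categories were assigned to either post, yet the second one is found by whichever of its own words the reader happens to remember.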

Ah, you say, just embed a blog search box. Sure, but I have dozens of blogs I want to search in. So use some blog search aggregator, you suggest. But I don’t want to get results from all the blogs out there, just from those I care about. Well, then, guess you’ll need to build yourself a custom search… or just use Delver. Knowing that in a few years every major search engine will integrate social features, I can carelessly blog about anything my social circle could find useful (say, how to plug an mp3 player into the audio system of an Israeli leasing-level Ford Focus), without bothering about categorizing with the perfect keywords (hint: there aren’t any). In fact, I think I’ll skip categories altogether in this blog, and just use tags for a nifty tag cloud 🙂

(crossposted on the Delver Blog)

Why blog? why now??

If a blog post is published and no one is around to read it, does it make a difference?…

That’s the thought that kept me from opening a blog all these blogosphere years. Bloggers write for others to read, but in a time of information overload, when I can hardly read just the few feeds I need for work, I’d have a very hard time keeping up with reading all the stuff friends write. So why bother?

But then I started working at Delver, and one day it dawned on me. I was waiting for my turn to speak at IAAI-08, and while listening to some very interesting talks I had this tingling of “…I could blog about my thoughts on that!“, when I suddenly realized that this is what will change with real social search. Suppose I indeed blogged about insights from IAAI on the creative uses of Wikipedia as a generator of NLP datasets: the chances of that being helpful to a friend or colleague, at that moment, could be slim, and a post or two later that post fades into oblivion. However, if that friend could find this socially-relevant post on-demand, just when needed – now that’s a different story. That’s pretty much what gmail did to email categorizing – but that’s a subject for a post on its own.

So what is this blog about? Depends on which of my alter egoz takes over, but it’s safe to say web search is always there, one way or the other. There, that’s general enough so I won’t need to re-edit this post as my blog evolves to discuss marine biology. Here goes!