Tag Archives: Yahoo

Microsoft Israel ReCon 2014

Microsoft Israel R&D Center held their first Recommendations Technology conference today, ReCon. With an interesting agenda and a location that’s just across the street from my office, I could not skip this one… here are some impressions from talks I found worth mentioning.

The first keynote speaker was Joseph Sirosh, who leads the Cloud Machine Learning team at Microsoft, having recently joined from Amazon. Sirosh may have aimed low, not knowing what his audience would be like, but as a keynote this was quite a disappointing talk, full of simplistic statements and buzzwords. I guess he lost me when he stated quite decisively that the big difference about putting your service on the cloud is that it means it will get better the more people use it. Yeah.

Still, there were also some interesting observations he pointed out, worth mentioning:

  • If you’re running a personalization service, benchmarking against most popular items (i.e. Top sellers for commerce) is the best non-personalized option. Might sound trivial, but when coming from an 8-year Amazon VP, that’s a good validation
  • “You get what you measure”: what you choose to measure is what you’re optimizing, make sure it’s indeed your weakest links and the parts you want to improve
  • Improvement depends on being able to run a large number of experiments, especially when you’re in a good position already (the higher you are, the lower your gains, and the more experiments you’ll need to run to keep gaining)
  • When running these large numbers of experiments, good collaboration and knowledge sharing become critical, so different people don’t end up running the same experiments without knowing of each other’s past results
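
To make the first point concrete, here is a minimal sketch of a "most popular" baseline, the non-personalized yardstick a personalization service would be measured against. The purchase log and the function name are made up for illustration; nothing here comes from the talk itself.

```python
from collections import Counter

def most_popular(purchases, k=3):
    """Return the k items that appear in the most purchase events."""
    counts = Counter(item for _, item in purchases)
    return [item for item, _ in counts.most_common(k)]

# Hypothetical (user, item) purchase log.
purchases = [
    ("u1", "book"), ("u2", "book"), ("u3", "lamp"),
    ("u1", "lamp"), ("u2", "pen"), ("u3", "book"),
]

# The same list is served to every user, regardless of history.
top = most_popular(purchases, k=2)
print(top)
```

Any personalized recommender that cannot beat this list on the metric you chose to measure is probably not worth its complexity.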

Elad Yom-Tov from Microsoft Research described work his team did on enhancing Collaborative Filtering using browse logs. They experimented with adding users’ browse logs (visited URLs) and search queries to the CF matrix in various ways, to help bootstrap users with little data and to better identify short-term (recent) intent for these users.

An interesting observation they reached was that using the raw search queries as matrix columns worked better than trying to generalize or categorize them, although intuitively one would expect generalization to help by reducing the sparsity of such otherwise very long-tail attributes. It seems that the potential gain in reducing sparsity is offset by the loss of specificity and granularity of the original queries.
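
As a rough illustration of what "adding queries to the CF matrix" could look like, the sketch below stores each user as a sparse row with both item columns and raw-query columns, then compares users by cosine similarity. The data, the binary weights and the function names are my own assumptions, not the representation the team actually used.

```python
from math import sqrt

def build_rows(item_events, query_events):
    """Map each user to a sparse row holding item columns and raw-query columns."""
    rows = {}
    for user, item in item_events:
        rows.setdefault(user, {})[("item", item)] = 1.0
    for user, query in query_events:
        rows.setdefault(user, {})[("query", query)] = 1.0
    return rows

def cosine(a, b):
    """Cosine similarity between two sparse rows stored as dicts."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical events: carol has no item history at all, only a search query.
item_events = [("alice", "camera"), ("alice", "tripod"), ("bob", "camera")]
query_events = [("bob", "dslr tripod reviews"), ("carol", "dslr tripod reviews")]
rows = build_rows(item_events, query_events)

# The shared raw query is the only thing linking carol to bob.
sim_carol_bob = cosine(rows["carol"], rows["bob"])
sim_carol_alice = cosine(rows["carol"], rows["alice"])
print(sim_carol_bob, sim_carol_alice)
```

With item columns alone, carol would have an empty row and no neighbours; the query column is what gives the cold-start user a foothold.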


Another related talk which outlined an interesting way to augment CF was by Haggai Roitman of IBM Research. Haggai suggested the feature of “user uniqueness” –  to what extent the user follows the crowd or deliberately looks for the esoteric choices, as a valuable signal in recommendations. This uniqueness would then determine whether to serve the user with results that are primarily popularity-based (e.g. CF) or personalized (e.g. content-based), or a mix of the two.
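
One simple way to picture this is a blend weight per user. In the sketch below, uniqueness is defined as one minus the average popularity of the items a user consumed, and that value decides how much weight the personalized score gets. Both the definition and the linear blend are assumptions of mine for illustration; the talk may well have defined uniqueness differently.

```python
def uniqueness(user_items, popularity):
    """1 minus the mean popularity (each in [0, 1]) of the user's items."""
    if not user_items:
        return 0.0  # no history: fall back to following the crowd
    mean_pop = sum(popularity[i] for i in user_items) / len(user_items)
    return 1.0 - mean_pop

def blended_score(u, popularity_score, personalized_score):
    """Linear mix: unique users lean on the personalized score."""
    return (1.0 - u) * popularity_score + u * personalized_score

# Hypothetical normalized popularity values.
popularity = {"blockbuster": 0.9, "hit": 0.7, "indie": 0.1, "obscure": 0.1}

crowd_follower = uniqueness(["blockbuster", "hit"], popularity)
niche_seeker = uniqueness(["indie", "obscure"], popularity)
print(crowd_follower, niche_seeker)
```

A candidate item scored 0.8 by popularity and 0.2 by the personalized model would rank high for the crowd follower and low for the niche seeker.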

The second keynote was by Ronny Lempel of Yahoo! Labs in Haifa. Ronny talked about multi-user devices, in particular smart TVs, and how recommendations should take into account the user that is currently in front of the device (although this information is not readily available). The heuristic his team used was that the audience usually doesn’t change between consecutive programs watched, and so using the last program as context for recommending the next program will help model that unknown audience.

Their results indeed showed a significant improvement in recommendation effectiveness when using this context. Another interesting observation was that using a random item from the history, rather than the last one, actually made the recommendations perform worse than no context at all. That’s an interesting result, as it validates the assumption that approximating the right audience is valuable: if you make recommendations to the parent watching in the evening based on the children’s watched programs in the afternoon, you are likely to do worse than with no such context at all.
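
The "last program as context" heuristic can be sketched with nothing more than counts of consecutive viewings. This is a toy stand-in with invented viewing sessions, not the model Ronny's team built; it only shows why the most recent program carries the audience signal.

```python
from collections import Counter, defaultdict

def fit_transitions(sessions):
    """Count how often program b was watched immediately after program a."""
    transitions = defaultdict(Counter)
    for programs in sessions:
        for prev, nxt in zip(programs, programs[1:]):
            transitions[prev][nxt] += 1
    return transitions

def recommend_next(transitions, last_program, k=1):
    """Recommend the programs that most often followed the last one watched."""
    followers = transitions.get(last_program, Counter())
    return [program for program, _ in followers.most_common(k)]

# Hypothetical viewing sessions from a single shared TV.
sessions = [
    ["cartoon_a", "cartoon_b", "news", "drama"],
    ["cartoon_a", "cartoon_b"],
    ["news", "drama", "talk_show"],
]
transitions = fit_transitions(sessions)

print(recommend_next(transitions, "cartoon_a"))
print(recommend_next(transitions, "news"))
```

Conditioning on a random program from the history instead would mix the afternoon cartoons into the evening recommendations, which is exactly the failure described above.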


The final presentation was by Microsoft’s Hadas Bitran, who presented and demonstrated Windows Phone’s Cortana. Microsoft go out of their way to describe Cortana as friendly and non-creepy, and yet the introductory Microsoft video that Hadas presented somehow managed to include a scary robot (from Halo, I presume), dramatic music, and Cortana saying “Now learning about you”. Yep, not creepy at all.

Hadas did present Cortana’s context-keeping session, which looks pretty cool: questions she asked that related to previous questions and answers were followed through nicely by Cortana (all in a controlled demo, of course). Interestingly, this even seemed to work too well, as after getting Cortana’s list of suggested restaurants Hadas asked Cortana to schedule a spec review, and Cortana insisted again and again on booking a table at the restaurant instead… Nevertheless, I can say the demo actually made the option of buying a Windows Phone pass through my mind, so it does do the job.

All in all, it was an interesting and well-organized conference, with a good mix of academia and industry, a good match to IBM’s workshops. Let’s have many more of these!

Yahoo Gives Up on Social Search

In an interview that strangely made headlines only in Indian tech blogs, Yahoo Research Labs’ Chief Prabhakar Raghavan declared that Yahoo will not replace its search with Bing. OK, the Yahoo-Microsoft deal is not really off, but the deal details turn out to imply that Yahoo will only use Microsoft search technology as the backend, and keep building its own smart front-end to it that will make use of Yahoo’s content assets. Raghavan says:

“Yahoo will not use Bing. Bing is a branded search engine that Microsoft is building on top of its search back-end and we will build our own search front-end on that same Microsoft back-end. It (using Bing) is not the case, at least as envisioned at the moment”

This actually makes perfect sense. Stop spending tons of resources on crawling and ranking in a futile war with Google, and focus on building the user experience over it, leveraging Yahoo’s advantage – content. Raghavan mentions scenarios that sound a lot like Yahoo shortcuts (that’s really old news) as one example of how to deliver a more complete experience over commodity search results.

The article then goes on to discuss the second focus for Yahoo, social applications, and mentions Microsoft’s tie-up with Facebook for access to social graph. Raghavan is quoted as saying:

“Social networks are not just a place to hang out, but to get things done. It predates the web.. I’m not sure where the sweet spot is, we’re still doing research on it”

Also makes perfect sense. With Google as a common enemy, and Microsoft a Facebook partner, Yahoo may be better positioned to deliver social applications that leverage the de-facto standard of Facebook graph, rather than push its own failed networks.

So why is my post title suggesting what it’s suggesting??

There is one catch in sub-contracting your search results: you are now limited in what you can do in search ranking. The best you can do is re-rank the set of results Microsoft’s technology supplied you with before presenting it to the user. As I’ve pointed out in the past when talking about Delver’s technology, social (graph-based) search is a game that cannot be played by reranking, since it’s a classic long tail problem. So when you can’t interfere with how search results are ranked, you also can’t deliver true social search, as Google recently did. One less social application Yahoo can build…
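
A toy example shows why reranking alone falls short. Here the front-end only ever sees the top-k results of a backend that ranks by global relevance, and the one document that matters socially sits deep in the long tail. All names, scores and the cutoff are invented for illustration.

```python
def rerank(backend_results, social_score, k=3):
    """Re-order only the top-k results the backend chose to return."""
    top_k = backend_results[:k]
    return sorted(top_k, key=lambda doc: social_score.get(doc, 0.0), reverse=True)

# Hypothetical backend ranking by global relevance; the friend's post is in the tail.
backend_results = ["popular_1", "popular_2", "popular_3", "popular_4", "friend_blog_post"]

# Hypothetical social-graph scores for the current user.
social_score = {"friend_blog_post": 1.0, "popular_2": 0.3}

visible = rerank(backend_results, social_score, k=3)
print(visible)
```

However strong the social signal is, the friend's post never makes it into the candidate set, so no amount of reordering can surface it. That takes influence over the ranking itself.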

IBM IR Seminar Highlights (part 1)

Yesterday’s seminar at IBM Haifa Research Labs was also packed with some very interesting talks, covering a wide range of social aspects of IR and NLP.

Mor Naaman of Rutgers University and formerly at Yahoo! Research gave an excellent talk on using social inputs to improve the experience of multimedia search. The general theme was about discovering metadata for a given multimedia concept from web 2.0 sites, then using those to cluster potential results and choose representative ones.

In one application, this approach was used to identify “representative” photos of a certain landmark, say the Golden Gate Bridge; see World Explorer for an illustration. So first, you’d find all Flickr photos geotagged and/or Flickr-tagged by the location and name of the bridge (or any given landmark). Next, image processing (SIFT) is applied to those images to cluster them into subsets that are likely to be of the same section and/or perspective of the bridge. Finally, relations between the images in each cluster are formed based on the visual relation, and link analysis is employed to find a “canonical view”. The result is what we see on the right sidebar in World Explorer, and is described in this WWW’08 paper.
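
The final link-analysis step might look something like the sketch below: a PageRank-style power iteration over a weighted graph of visually similar images within one cluster. The talk only said "link analysis", so PageRank is my assumption, and the similarity weights are invented numbers standing in for counts of matched SIFT features.

```python
def canonical_view(similarity, iterations=50, damping=0.85):
    """Pick the image ranked highest by a PageRank-style power iteration."""
    nodes = list(similarity)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            incoming = 0.0
            for m in nodes:
                out_weight = sum(similarity[m].values())
                if out_weight and n in similarity[m]:
                    # m passes on its rank in proportion to edge weight.
                    incoming += rank[m] * similarity[m][n] / out_weight
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return max(rank, key=rank.get)

# Hypothetical symmetric visual-similarity weights inside one cluster.
similarity = {
    "img_a": {"img_b": 0.9, "img_c": 0.8},
    "img_b": {"img_a": 0.9, "img_c": 0.2},
    "img_c": {"img_a": 0.8, "img_b": 0.2},
}

print(canonical_view(similarity))
```

The image that is strongly similar to most others in its cluster ends up with the highest rank, which matches the intuition of a "canonical" shot.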

[Update: Mor commented that the content-based analysis part is not yet deployed in World Explorer. Thanks Mor!]


Another example applied this approach to concerts on YouTube, and the purpose was to find good clips of the concert itself, rather than videos discussing it etc. Metadata describing the event (say, an Iron Maiden concert) was collected from both YouTube and sites such as Upcoming.org, and Audio Fingerprinting was employed to detect overlapping video sections, as it’s quite likely the concert itself would have the most overlap. Note that in both cases, the image/audio processing is a heavy task, and applying it only to a small subset filtered by social tags makes the work involved more feasible.
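
The overlap idea can be sketched by treating each clip as a set of fingerprint hashes and counting how many hashes it shares with the other clips. Real audio fingerprinting is far more involved (and time-aligned); the integer hashes and clip names below are placeholders I made up.

```python
from itertools import combinations

def overlap_scores(fingerprints):
    """For each clip, sum the fingerprint hashes it shares with every other clip."""
    scores = {clip: 0 for clip in fingerprints}
    for a, b in combinations(fingerprints, 2):
        shared = len(fingerprints[a] & fingerprints[b])
        scores[a] += shared
        scores[b] += shared
    return scores

# Hypothetical fingerprint hashes per clip, already filtered by social tags.
fingerprints = {
    "concert_clip_1": {101, 102, 103, 104},
    "concert_clip_2": {103, 104, 105},
    "fan_review": {900, 901},
}

scores = overlap_scores(fingerprints)
print(scores)
```

Clips recorded at the concert itself share audio and score high, while a video that merely discusses the concert shares nothing. The pairwise comparison is quadratic in the number of clips, which is why filtering by tags first matters so much.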

I’ll talk about the keynote (by Prof. Ben Shneiderman) in another post, this one is already way too long… Here are soundbites from some other talks:

Emil Ismalon of Collarity referred to personalized search (e.g. Google’s) as a form of overfitting: it trains itself only on my own history, so it never lets me learn anything new. That, of course, served as a motivation for community-based personalization.

Ido Guy of IBM talked about research they did comparing social networks extracted from public and private sources. The bottom line is that some forms of social relations are stronger, representing collaboration (working on projects together, co-authoring papers or patents), and others are weaker, being more about socializing activities (friending/following on SN, commenting on blogs etc.). Of course, that would be relevant for the enterprise social graph, not necessarily personal life…

Daphne Raban of Haifa University summarized her (empirical) research into motivations of participants in Q&A sites. The main bottom lines were: 1) money was less important to people who participate very often, but it’s a catalyst, 2) being rewarded with gratitude and conversation is the main factor driving people to become more frequent participants, and 3) in quality comparison, paid results ranked highest, free community results (Yahoo! Answers) ranked close, and unpaid single experts ranked lowest.

If you liked my blog, you’d like this post. Trust me!

One of the sites that most impressed me when I first started browsing the web was called MovieCritic.com. You would rate a few movies you saw, then it would predict whether you’d like a new movie. It would even let you find one that matches both your taste and your girlfriend’s. Pure magic, for that time. For me that was the first demonstration of what we can achieve with the web as a medium.

MovieCritic has been dead for a few years now, but recommender systems are everywhere. Netflix runs one of the most successful commercial implementations (Amazon being another classic example, “People who bought this book…”), and two years ago they challenged researchers to come up with a system that would perform 10% better than their own in predicting users’ ratings. The best-performing team so far almost got there, and today I attended a talk at the Technion by Yehuda Koren, one of the team members and a researcher at the Yahoo! Research Haifa lab.

Most methods follow the neighborhood-based model – find an item’s neighbours (in some representation), and predict based on their rating. This may be done in a user-user matching (find users like this user, then check their rating) or item-item (find items like the rated item, then predict based on how the user rated those items). One of the interesting approaches proposed by Koren’s team represented both users and movies in the same space, then looked for similarity in this unified space.
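
For readers who have not seen the neighborhood model before, here is a bare-bones item-item version: similarity between two movies is the cosine of their ratings over users who rated both, and the prediction is a similarity-weighted average of the user's own ratings. The ratings are invented and plain cosine is the simplest choice I could pick; this is not the method Koren's team used.

```python
from math import sqrt

def item_similarity(ratings, i, j):
    """Cosine similarity between two items over users who rated both."""
    common = [u for u in ratings if i in ratings[u] and j in ratings[u]]
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[u][j] for u in common)
    norm_i = sqrt(sum(ratings[u][i] ** 2 for u in common))
    norm_j = sqrt(sum(ratings[u][j] ** 2 for u in common))
    return dot / (norm_i * norm_j)  # ratings are 1..5, so norms are non-zero

def predict(ratings, user, target):
    """Similarity-weighted average of the user's ratings on other items."""
    num = den = 0.0
    for item, rating in ratings[user].items():
        if item == target:
            continue
        sim = item_similarity(ratings, item, target)
        num += sim * rating
        den += abs(sim)
    return num / den if den else None

# Hypothetical 1-5 star ratings; cat has not rated "aliens" yet.
ratings = {
    "ann": {"alien": 5, "aliens": 5, "amelie": 1},
    "ben": {"alien": 4, "aliens": 5, "amelie": 2},
    "cat": {"alien": 5, "amelie": 1},
}

pred = predict(ratings, "cat", "aliens")
print(round(pred, 2))
```

Raw cosine over positive ratings is a blunt instrument, since every pair of items comes out fairly similar; practical systems typically center ratings around user or item means first, which is one reason this stays a sketch.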

The most striking finding for me, however, was that winning strategies did not use anything from the movie’s “content” features. Genre, director, actors, length, etc. – all these did not produce any additional value beyond the plain statistical analysis and correlation of ratings and users, and are therefore not used at all. In fact, Koren claims that knowing that a certain user is a Tom Hanks fan makes no difference, we will infer this from the recommendations anyway (assuming there are enough of them of course).

I find that almost sad… Not being able to intelligently reason over the underlying logic exposed by AI software is a tremendous drawback in my eyes, even if the overall prediction score is better. Telling the user “you may want to watch this movie because A and B and C” can lead to greater user satisfaction, to an understanding of even the incorrect predictions, and possibly to a feedback cycle. Doing away with it is like showing web search results without keyword highlighting, with no visible cue for the user as to why this result was returned (“…trust me, I know what’s the right answer for you!”).