Microsoft Israel ReCon 2015 (or: got to start blogging more often…)

Yes, two consecutive posts on the same annual event are not a good sign for my virtual activity level… point taken.

So 2 weeks ago, Microsoft Israel held its second ReCon conference on Recommendations and Personalization, turning its fine 2014 start into a tradition worth waiting for. This time it was more condensed than last year (good move!) and just as interesting. So here are three highlights I found worth reporting:

Uri Barash of the hosting team gave the first keynote on Cortana integration in Windows 10, talking about the challenges and principles used. Microsoft places great emphasis on the user’s trust, hence Cortana does not use any interests that are not explicitly written in Cortana’s notebook and validated by the user. If indeed correct, that’s somewhat surprising, as it limits the recommendation quality and, moreover, the discovery experience of picking up potential interests from the user’s activity. I’d still presume that all these implicit interests are probably used behind the scenes, to optimize the content from explicit interests.

IBM Haifa Research Labs have been doing work for some years now on enterprise social networks, and mining connections and knowledge from such networks. At ReCon this year, Roy Levin presented a paper to be published in SIGIR’15, titled “Islands in the Stream: A Study of Item Recommendation within an Enterprise Social Stream“. In the paper, they discuss a feature for a personalized newsfeed included in IBM’s enterprise social network “IBM Connections”, and provide some background and the personalized ranking logic for the feed items.

They then move on to describe a survey they conducted among users of the product, to analyze their opinions on specific items recommended for them in their newsfeed, similar to Facebook’s newsfeed surveys. Through these surveys, the IBM researchers attempted to identify correlations between various feed item factors, such as post and author popularity, post personalization score, how surprising an item may be to a user and how likely a user is to want such serendipity, etc. The actual findings are in the paper, but what may actually be even more interesting is the deep dissection in the paper of the internal workings of the ranking model.

Another interesting talk was by Roy Sasson, Chief Data Scientist at Outbrain. Roy delivered a fascinating talk about learning from lack of signals. He began with an outline of general measurement pitfalls, demonstrating them on Outbrain widgets when analyzing low numbers of clicks on recommended items. Was the widget visible to the user? Where was it positioned in the page (areas of blindness)? What items were next to the analyzed item? Were they clicked? And so on.

Roy then proceeded to talk about what we may actually be able to learn from lack of sharing to social networks. We all know that content that gets shared a lot on social networks is considered viral, driving a lot of discussion and engagement. But what about content that gets practically no sharing at all? and more precisely, what kind of content gets a lot of views, but no sharing? Well, if you hadn’t guessed already, that will likely be content users are very interested to see, but would not admit to it, namely provocative and adult material. So in a way, leveraging this reverse correlation helped Outbrain automatically identify porn and other sensitive material. This was then not used to filter all of this content out – after all, users do want to view it… but it was used to make sure that the recommendation strip includes only 1-2 such items so they don’t take over the widget, making it seem like this is all Outbrain has to offer. Smart use of data indeed.
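To make the idea concrete, here is a minimal sketch of how such a signal might be used. It is not Outbrain's implementation: the share-rate threshold, the minimum view count, the cap of two items and all the names below are my own illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    views: int
    shares: int
    score: float  # hypothetical relevance score from the recommender


def is_sensitive(item, min_views=10_000, max_share_rate=0.0005):
    """Flag items that are viewed a lot but almost never shared (assumed thresholds)."""
    if item.views < min_views:
        return False  # too little data to say anything
    return item.shares / item.views <= max_share_rate


def fill_widget(candidates, slots=6, max_sensitive=2):
    """Pick the top-scoring items, letting in at most `max_sensitive` flagged ones."""
    chosen, sensitive_count = [], 0
    for item in sorted(candidates, key=lambda i: i.score, reverse=True):
        if len(chosen) == slots:
            break
        if is_sensitive(item):
            if sensitive_count == max_sensitive:
                continue  # the cap on flagged items is reached, skip this one
            sensitive_count += 1
        chosen.append(item)
    return chosen
```

Note that flagged items are not filtered out, only capped, which matches the point of the talk: users do want to see them, they just shouldn't take over the strip.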


Microsoft Israel ReCon 2014

Microsoft Israel R&D Center held their first Recommendations Technology conference today, ReCon. With an interesting agenda and a location that’s just across the street from my office, I could not skip this one… here are some impressions from talks I found worth mentioning.

The first keynote speaker was Joseph Sirosh, who leads the Cloud Machine Learning team at Microsoft, recently joining from Amazon. Sirosh may have aimed low, not knowing what his audience would be like, but as a keynote this was quite a disappointing talk, full of simplistic statements and buzzwords. I guess he lost me when he stated quite decisively that the big difference about putting your service on the cloud is that it means it will get better the more people use it. Yeah.

Still, there were also some interesting observations he pointed out, worth mentioning:

  • If you’re running a personalization service, benchmarking against most popular items (i.e. Top sellers for commerce) is the best non-personalized option. Might sound trivial, but when coming from an 8-year Amazon VP, that’s a good validation
  • “You get what you measure”: what you choose to measure is what you’re optimizing, make sure it’s indeed your weakest links and the parts you want to improve
  • Improvement depends on being able to run a large number of experiments, especially when you’re in a good position already (the higher you are, the lower your gains, and the more experiments you’ll need to run to keep gaining)
  • When running these large numbers of experiments, good collaboration and knowledge sharing becomes critical, so different people don’t end up running the same experiments without knowing of each other’s past results

Elad Yom-Tov from Microsoft Research described work his team did on enhancing Collaborative Filtering using browse logs. They experimented with adding user browse logs (visited URLs) and search queries to the CF matrix in various ways, to help bootstrap users with little data and to better identify short-term (recent) intent for these users.

An interesting observation they reached was that using the raw search queries as matrix columns worked better than trying to generalize or categorize them, although intuitively one would expect generalization to help by reducing the sparsity of such otherwise very long-tail attributes. It seems that the potential gain in reducing sparsity is offset by the loss of specificity and granularity of the original queries.
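A toy sketch of that trade-off, under my own assumptions (this is not the method from the talk; the `item:`/`query:` column naming, the category mapping and the use of cosine similarity are just for illustration): each user is a sparse row of the CF matrix, and queries are added either raw or mapped to a category.

```python
import math


def user_vector(items, queries, generalize=None):
    """Build one sparse CF-matrix row: consumed items plus query columns.

    `generalize` is an optional, hypothetical mapping from raw query to category.
    """
    row = {f"item:{i}": 1.0 for i in items}
    for q in queries:
        column = generalize.get(q, q) if generalize else q
        row[f"query:{column}"] = 1.0
    return row


def cosine(a, b):
    """Cosine similarity between two sparse rows stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


categories = {"thai green curry recipe": "food", "vegan protein powder": "food"}
raw_a = user_vector([], ["thai green curry recipe"])
raw_b = user_vector([], ["vegan protein powder"])
cat_a = user_vector([], ["thai green curry recipe"], categories)
cat_b = user_vector([], ["vegan protein powder"], categories)

print(cosine(raw_a, raw_b))  # 0.0: raw queries keep the two users distinct
print(cosine(cat_a, cat_b))  # 1.0: categories make them look identical
```

With categories the matrix is denser, but two users with quite different intents collapse into the same row, which is the loss of specificity mentioned above.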


Another related talk, which outlined an interesting way to augment CF, was by Haggai Roitman of IBM Research. Haggai suggested the feature of “user uniqueness” – to what extent the user follows the crowd or deliberately looks for the esoteric choices – as a valuable signal in recommendations. This uniqueness would then determine whether to serve the user with results that are primarily popularity-based (e.g. CF) or personalized (e.g. content-based), or a mix of the two.
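One simple way to picture this, as a sketch only: the talk did not spell out a formula in my notes, so the definition of uniqueness below (one minus the mean popularity of the user's items) and the linear blend are my own assumptions.

```python
def uniqueness(user_items, popularity):
    """Return 1.0 for purely esoteric picks, 0.0 for purely mainstream ones.

    `popularity` maps item -> share of users who consumed it, in the range 0..1.
    """
    if not user_items:
        return 0.0  # no history: assume the user follows the crowd
    mean_pop = sum(popularity.get(i, 0.0) for i in user_items) / len(user_items)
    return 1.0 - mean_pop


def blended_score(u, crowd_score, personal_score):
    """Lean on the personalized score more as uniqueness `u` grows."""
    return (1.0 - u) * crowd_score + u * personal_score
```

A user with `u` close to 0 effectively gets the crowd-based ranking, a user with `u` close to 1 gets the content-based one, and everyone in between gets a mix.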

The second keynote was by Ronny Lempel of Yahoo! Labs in Haifa. Ronny talked about multi-user devices, in particular smart TVs, and how recommendations should take into account the user that is currently in front of the device (although this information is not readily available). The heuristic his team used was that the audience usually doesn’t change in consecutive programs watched, and so using the last program as context for recommending the next program will help model that unknown audience.

Their results indeed showed a significant improvement in recommendation effectiveness when using this context. Another interesting observation was that using a random item from the history, rather than the last one, actually made the recommendations perform worse than no context at all. That’s an interesting result, as it validates the assumption that approximating the right audience is valuable, and if you make recommendations to the parent watching in the evening based on the children’s watched programs in the afternoon, you are likely to make it worse than no such context at all.
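The heuristic can be illustrated with a very simple first-order model. This is just a sketch with hypothetical names, counting which program tends to follow which; the model presented in the talk was surely more elaborate.

```python
from collections import Counter, defaultdict


def build_transitions(histories):
    """Count how often program B was watched right after program A."""
    transitions = defaultdict(Counter)
    for history in histories:
        for current, following in zip(history, history[1:]):
            transitions[current][following] += 1
    return transitions


def recommend_next(transitions, last_program, k=3):
    """Use only the most recent program as a proxy for who is watching now."""
    followers = transitions.get(last_program, Counter())
    return [program for program, _ in followers.most_common(k)]
```

For example, with histories such as `["cartoon_a", "cartoon_b", "news"]`, calling `recommend_next(transitions, "cartoon_a")` suggests what followed that cartoon on the same device, regardless of what the parents watched the evening before.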


The final presentation was by Microsoft’s Hadas Bitran, who presented and demonstrated Windows Phone’s Cortana. Microsoft go out of their way to describe Cortana as friendly and non-creepy, and yet the introductory Microsoft video that Hadas presented somehow managed to include a scary robot (from Halo, I presume), dramatic music, and Cortana saying “Now learning about you”. Yep, not creepy at all.

Hadas did present Cortana’s context-keeping session, which looks pretty cool: questions she asked that related to previous questions and answers were followed through nicely by Cortana (all in a controlled demo, of course). Interestingly, this even seemed to work too well, as after getting Cortana’s list of suggested restaurants Hadas asked Cortana to schedule a spec review, and Cortana insisted again and again on booking a table at the restaurant instead… nevertheless, I can say the demo actually made the option of buying a Windows Phone pass through my mind, so it does do the job.

All in all, it was an interesting and well-organized conference, with a good mix of academia and industry, a good match to IBM’s workshops. Let’s have many more of these!

The Great Managers Balancing Act

With so many approaches to management – and of software development in particular – there are plenty of authors who write about it. I don’t intend to join that fray. Personally, I enjoy the “What” much more than the “How”, but recently this piece of insight dawned on me.

To be helpful, a good middle manager does one of two things:

  1. Up: Make decisions and be held accountable for their outcome.
  2. Down: Remove obstacles from his team’s path.

Where it gets interesting is where #1 and #2 collide, and how this manager deals with it. Great managers find the right balance. Mediocre managers can only handle this by screwing one at the expense of the other.

For example, a certain middle manager gets some directive handed down from above, while the team is already at full capacity. Rather than trading off another highly prioritized task and facing a tough time with higher management, he prefers to push the requirement down to his team, to try and “make an extra effort.” He even considers it his decision, so he feels that he lives up to #1. But sadly for his team, not only did he not remove obstacles, he also just added more.

Alternatively, such managers try to execute #2 and help their team by making the tough decisions that remove an obstacle. But because they do not realize they’re the ones held accountable for these decisions, they prefer not to communicate them upward in order to keep their political standing, thus violating #1. This eventually results in the team losing credibility and being considered lower-execution, despite all their hard work.

Of course, how to successfully balance #1 and #2 and still keep your job and sanity as a manager is a separate topic, one I’ll leave to the management experts to discuss…

"life is a great balancing act..."

Mining Wikipedia, or: How I Learned to Stop Worrying and Love Statistical Algorithms

I took my first AI course during my first degree, in the early 90’s. Back then it was all about expert systems, genetic algorithms, search/planning (A* anyone?). It all seemed clear, smart, intuitive, intelligent…

Then, by the time I got to my second degree in the late 00’s, the AI world had changed a lot. Statistical approaches took over by storm, and massive amounts of data seemed to trump intuition and smart heuristics anytime.

It took me a while to adjust, I admit, but by the time I completed my thesis I came to appreciate the power of big data. I now can better see this as an evolution, with heuristics and intuitions driving how we choose to analyze and process the data, even if afterwards it’s all “just” number-crunching.

So on this note, I gave a talk today at work on the topic of extracting semantic knowledge from Wikipedia, relating also to our work on ESA and to this being an illustration of the above. Enjoy!


The secret to Facebook’s growth?

This blog has recently also been attacked by the wondrous Facebook profile spam comments (I kept two specimens here and here, but deleted many dozens more in the past weeks). At first, I was amused at this new type of spam comments, but after running a few searches I felt more disgrace for being so late to the party, seeing mentions of these from more than a year ago.

So what’s the deal with these comments? They usually don’t include any links, are not selling anything, and some are really good comments. If you look at the above two, you’ll have a very hard time figuring out that they are not real comments. It looks like some spammers harvest comments from legit blogs, and then classify your post to find the most similar comment to stick on it. What is the motivation?

I don’t have the answers myself, but two thoughts:


  1. One spam fighting blog claims that the motivation is to establish the credibility of these accounts, so that they can later be used to sell likes on Facebook itself. The plot thickens…
  2. I’ve never seen an account repeating. The amount of fake FB accounts being created is probably huge. How much of Facebook’s recent continued growth is attributed to such fake accounts? nothing you would hear about in Facebook’s earnings calls.



Amazon, Apple, and Application Platforms

Apple is known for keeping a bustling legal department. Steve Jobs reportedly swore to “destroy Android“, the results of which Samsung has felt very well.

But Apple has more enemies to fight. It holds a complicated relationship with Amazon, who now produces the second best-selling tablet after the iPad, claiming it already owns 22% of the US tablet market. That’s a lot of iPads that Apple isn’t selling, and so it readies its own iPad Mini in response.

A less familiar front in this battle is Apple’s “False Advertising” suit against Amazon with regard to the latter’s use of “App Store” for its Android-based application market. Amazon’s response ridiculed this claim, but this does raise the question – what exactly is Amazon’s app store all about?

Amazon’s Kindle store is one strange beast. Kindle apps are in fact re-purposed Android apps, with some added functionality. However, Amazon took care to clearly differentiate the Kindle’s UX and app store from the general Android market. So what is the justification for developing an extra Kindle app?

Every application development platform has its unique core capabilities, which developers can leverage for their own application. Developers get to apply their creative ideas on these assets, while the platform owner enjoys increased engagement for their users, with apps taking these capabilities to places the platform did not even imagine. Facebook’s application platform revolved around the social graph, a unique and very valuable data asset, and Apple provided access to the iPhone’s unique (at the time) features such as its accelerometers and gyroscope, GPS and camera.

Visiting the Amazon Kindle SDK site shows where Amazon feels it has the advantage: 1-click purchasing. This patented Amazon feature (a patent which Apple has actually licensed) can appeal to application developers who feel their application has premium features worth paying for, if only the payment was frictionless. Initial results seemed to validate that, and show excellent revenue per user on Amazon’s platform.

And so, Amazon’s platform says a lot about where Amazon feels its strength lies with the Kindle. Unlike Apple, Amazon builds its success in the tablets market on selling content, much less than selling devices. Hence, expect Kindle to continue beating the iPad on price even when iPad mini launches.

Out of Context

Sponsored Stories are a brilliant advertising model by Facebook. Just like AdWords in 2000, it’s an example of a model that leverages the core value of the company for advertising, without compromising that value’s authenticity. If your friends liked Starbucks, it was of their own free will and in a public forum, so having Starbucks pay to show this more prominently and to other users can only make sense.

So why is it, then, that a simple amusing case of a 55-gallon drum of lubricant made so many bad headlines for Facebook?

And Facebook has more fronts to fight in its battles for transformation into a revenue-driven company. Timeline may be great for brands, but it’s a magnet for popular revolt. Besides resenting the no-alternative approach Facebook took, why are users so upset about the actual Timeline view, which is surely more visually appealing than the boring wall?

I find the answer to both relates to context.


For the Sponsored Stories it seems pretty clear. “Yes, I linked to a 55-gallon lubricant product, but I did so as a joke”. Well then, Sentiment Analysis still has a long way to go with sarcasm, despite some recent advances right here at the Hebrew University. Sarcasm is one extreme example, but that missing context could even just be that you’re no longer a fan of that company you liked a month ago, and just didn’t get to unlike it yet.

And what about Timeline? isn’t it great that all your previous statuses and photos are there, organized along your timeline and telling your story? well, it is, but only if you care to ensure that it tells the story that you really want to tell. The context of that story may depend on where we were, what we were up to at the time, who our friends were… some of this may not even be possible to reconstruct in the Timeline.

In addition, we are used to our stories dropping off the cliff of the page fold and disappearing into oblivion, so we don’t really care to update them or remove those we don’t feel so proud of anymore. Suddenly, they come back to haunt us with Timeline, and we have to scramble to adjust.

And in a final associative thought: the tiled UX of Timeline does remind me of the Pinterest-mania that has taken hold of every new social curation site. So why does this look so much fun on Pinterest? Context again. Pinterest has none of it, it’s a pure fun/discovery experience, each tile is independent and you’re not really trying to follow up a thread, or cover all that you’ve missed since your last visit. For a social network though, that would be, well, out of context.