ChatGPT will kill Google Queries, not Results

It has been a very long time – too long – since Search was disrupted. So it was only appropriate for the hype cycle to reach disproportionate levels as both Microsoft and Google integrated Large Language Models, namely ChatGPT and Bard, into their search experience. "Experience" is the key word here, since LLMs have been part of the search backend for quite some time now, just behind the scenes.

By now, we’ve all heard the arguments against these chat-like responses as direct search results. The models are trained to return a single, well-articulated piece of text that purports to provide an answer, not just a list of results. Current models were trained to respond confidently, with less emphasis on accuracy, which clearly shows when they are put to actual use. Sure, it’s fun to get a random recipe idea, but getting the wrong information about a medical condition is a totally different story.

So we are likely to see more effort invested in providing explainability and credibility, and in training the model to project the appropriate confidence based on sources and domain. The end result may be an actual answer for some queries, and for others more of a summary of “what’s out there”, but in all cases there will likely be a reference to the sources, letting searchers decide whether they trust the response, or still need to drill into classic links to validate it.

This raises the question, then – is this truly a step function compared to what we already have today?

A week ago, I was on the hunt for a restaurant to visit with a friend from abroad. That friend had a very specific desire – to dine at a fish restaurant that also serves hummus. Simple enough, isn’t it? Asking Google for “fish restaurants in tel aviv that also serve hummus” quickly showed how very much it is not. Google simply failed to understand me. I got plenty of suggestions, some serving fish and some serving hummus, but no guarantee of both. I had to painstakingly check them out one by one, and most offered either one or the other. I kept refining the query over and over as my frustration grew.

With the hype still fresh in my mind, I headed over to ChatGPT:

Great. That’s not much help, is it? I asked for a fish restaurant and got a hummus restaurant. For such lack of understanding, I could have stuck with Google. Let’s give it one last try before giving up…

That, right there, was my ‘Aha’ moment.

This result, and the validation it included, was precisely what I was looking for. ChatGPT’s ability to take both pieces of context and combine them in a way that reflected back to me what information it was providing made all the difference.

This difference is not obvious. Almost all of the examples in those launch events would have worked just as well with keywords. The primary examples in Google’s Bard announcement post (beyond the “James Webb” fiasco) were “is the piano or guitar easier to learn, and how much practice does each need?” and “what are the best constellations to look for when stargazing?”. But take any of these as a regular Google query, and you will get a decent result snippet from a trusted source, as well as a list of very relevant links. At least there you know where the answer is coming from, and can decide whether to trust it or not!


Left: Bard results from announcement post. Right: current Google results for the same query

In fact, Bing’s announcement post included better examples – ones that would work, but would not be optimal as classic search queries – such as “My anniversary is coming up in September, help me plan a trip somewhere fun in Europe, leaving from London” (“leaving from London” is not handled well in a search query), or “Will the Ikea Klippan loveseat fit into my 2019 Honda Odyssey?” (plenty of related search results, but not for this exact Ikea piece).

The strength of the new language models is their ability to understand a much larger context. When Google started applying BERT to its query understanding, that was a significant step in the right direction, moving further away from what its VP of Search described as “keyword-ese” – writing queries that are not natural, but that searchers imagine will convey the right meaning. A query he used there was “brazil traveler to usa need a visa”, which previously gave results for US travelers to Brazil – a perfect example of how looking only at keywords (the “Bag of Words” approach) fails when the entire context is not examined.
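To make the bag-of-words failure concrete, here is a tiny illustrative Python sketch (my own toy example, not any search engine’s actual pipeline) showing how the two opposite intents collapse into the exact same representation:

```python
from collections import Counter

def bag_of_words(query: str) -> Counter:
    """Reduce a query to unordered word counts, discarding word order."""
    return Counter(query.lower().split())

# Two opposite travel intents produce identical bags of words:
q1 = "brazil traveler to usa need a visa"
q2 = "usa traveler to brazil need a visa"
print(bag_of_words(q1) == bag_of_words(q2))  # True -- direction of travel is lost
```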

I am a veteran search user; I am still cognizant of these constraints when I formulate a search query. That is why I find myself puzzled when my younger daughter enters a free-form question into Google rather than translating it into carefully selected keywords, as I do. Of course, that should be the natural interface; it just doesn’t work well enough yet. That is not just a technical limitation – human language is difficult. It is complex, ambiguous, and above all, highly dependent on context.

New language models can enable the query understanding modules in search engines to better understand these more complex intents. First, they will do a much better job of understanding keywords in context. Second, they will provide reflection; the restaurant example demonstrates how simply reflecting the intent back to users, enabling them to validate that what they got is truly what they meant, goes a long way toward compensating for the mistakes that NLP models will continue to make. And finally, the interactive nature – the ability to reformulate the query in response to this reflection by simply commenting on what should change – will make today’s broken experience feel more like a natural part of a conversation. All of these will finally get us closer to that natural human interface, as the younger cohort of users so rightfully expects.

What Job Postings Can Tell Us about Data Product Managers

Data product managers are a hot commodity in the job market these days. More and more companies realize the importance of data for their business and the need for building a strategy and a clear roadmap around data. Product managers are naturally part of that effort, and so the demand for Data Product Managers is on the rise.

Recently I happened to get into discussions that got me thinking more about what practically differentiates Data PMs from any other PM. Is it just about their product domains? Are there specific new skills? Is it perhaps the tools being used? Maybe even their academic background?

Being a data enthusiast myself, it only made sense to turn to actual data to answer these questions. And so the journey begins…

THE DATA

First things first – data. At first, I considered using LinkedIn profiles of actual Data PMs as a dataset. By comparing profiles of Data PMs to non-Data PMs, one would expect the main differences to stand out. However, even leaving the scraping and privacy aspects aside, the self-written role descriptions on LinkedIn are too personalized and not easily comparable.

And so I turned to job descriptions. These are written with a clear focus on what the role requires, are better structured, and are also easier to obtain. There was still no available dataset to use, but this seemed like an easier problem to solve. Indeed.com provides a pretty good aggregation of links to job postings, which I could follow to scrape the content. I could also easily select companies that offer both Data PM and non-Data PM positions, hopefully yielding a better signal-to-noise ratio, though admittedly this may introduce a slight bias toward larger companies (those having two or more PM openings).

The resulting dataset includes 100 recently posted job postings by 50 different companies, where for each company one position was for a Data PM and the other for a non-Data PM. For this purpose, I’ve defined a “Data PM” posting as having the word “Data” in the title; this seemed like a reasonable starting point, but I’d welcome any feedback on what such a definition may have missed in the bigger picture.

DATA PROCESSING

An interesting exploration step to take as we get started is to look at the role titles we actually got in this sample. Taking the 50 Data PM postings and removing all the standard PM title keywords (including “Senior”, “Associate”, “Product Manager”, “Product Management” etc.) yields a distribution of title fragments that indicates what these data products seem to be about. As the graph below shows, around two thirds of the postings focus on a short list of general names: Data Science, Data Platform, Data Products, or simply Data. The “Other” third is made up of a long list of more specific terms from all across the data pipeline – Data Integration, Ingestion, Modeling, Foundation, Analysis, Strategy and more. It therefore seems safe to assume that in most cases Data PMs own the entire data domain in their organization, end to end.

Next, we’ll process the job posting text itself. The general approach I took was to compare the two groups – the 50 Data postings against the 50 non-Data postings – and look for the features that best separate these two classes. We’ll start by tokenizing, removing stopwords and stemming the terms (all using nltk), then parse the resulting texts to extract n-grams and their frequency in each class. For each n-gram, we’ll register the number of postings it was found in, the ratio of Data to non-Data counts, and the Information Gain measure when using this n-gram to separate the two classes. We’ll also remove low-count n-grams (matching less than 5% of postings), as well as ones with zero or very low information gain.
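For the curious, here is a condensed sketch of that pipeline – an approximation rather than the exact script I ran, with the postings lists as placeholders – using nltk and a standard information-gain formulation:

```python
import math
from collections import Counter

from nltk import ngrams, word_tokenize
from nltk.corpus import stopwords        # requires nltk.download("stopwords")
from nltk.stem import PorterStemmer      # and nltk.download("punkt")

stemmer = PorterStemmer()
stop_words = set(stopwords.words("english"))

def posting_terms(text, max_n=3):
    """Tokenize, drop stopwords, stem, and emit the unique 1..max_n-grams."""
    tokens = [stemmer.stem(t) for t in word_tokenize(text.lower())
              if t.isalpha() and t not in stop_words]
    return {" ".join(g) for n in range(1, max_n + 1) for g in ngrams(tokens, n)}

def entropy(pos, neg):
    """Binary class entropy, in bits."""
    if pos == 0 or neg == 0:
        return 0.0
    p = pos / (pos + neg)
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def information_gain(df_data, df_non, n_data, n_non):
    """Entropy reduction from splitting postings on n-gram presence."""
    total = n_data + n_non
    present = df_data + df_non
    absent = total - present
    cond = (present / total) * entropy(df_data, df_non) \
         + (absent / total) * entropy(n_data - df_data, n_non - df_non)
    return entropy(n_data, n_non) - cond

data_postings = ["..."]     # placeholder: the 50 scraped Data PM descriptions
nondata_postings = ["..."]  # placeholder: the 50 scraped non-Data descriptions

df_data, df_non = Counter(), Counter()
for text in data_postings:
    df_data.update(posting_terms(text))
for text in nondata_postings:
    df_non.update(posting_terms(text))

n_data, n_non = len(data_postings), len(nondata_postings)
min_df = 0.05 * (n_data + n_non)          # drop low-count n-grams (<5% of postings)
scores = {t: information_gain(df_data[t], df_non[t], n_data, n_non)
          for t in set(df_data) | set(df_non)
          if df_data[t] + df_non[t] >= min_df}
top_terms = sorted(scores, key=scores.get, reverse=True)[:10]
```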

ANALYSIS AND FINDINGS

Now we’re finally ready to see the actual results… so what are the keywords and key phrases that differentiate a Data PM from other PMs?

Top 10 terms by their Information Gain

The above table shows the top terms by their information gain, i.e. how well they separate the two groups. Examining the full list indicates that organizations view Data PMs as professionals who build data platforms, work with teams of data scientists and data engineers, and derive or work with data models. It’s worth noting that the term data in itself is no longer unique to Data PMs, appearing frequently in non-Data PM job descriptions as well (hence the low “Data Ratio” value), while other top terms and phrases appear in a small number of postings, but when they do – they are highly informative.

When we look for names of data tools, we find very few that made it high up the list. SQL stands out, mentioned in 16 Data postings vs. 3 non-Data; Tableau appears in 9 Data postings vs. 1 non-Data posting; Python appears in only 6 postings, but all of them are for Data PMs. Combined, these terms cover about 40% of the Data postings, illustrating the expectation that many Data PMs be able to access and manipulate data themselves, from basic SQL to actual coding.

Flipping the list around to a high ratio of non-Data to Data postings, we can learn which terms strongly predict non-Data postings. Not surprisingly, we find user-facing keywords such as engage, delight and experience, but more interestingly there are quite a few classic product skills and terms, such as product backlog, product definition, portfolio and launch. This can be read as an indication that Data PMs are assumed to be experienced PMs who have already mastered the product management basics, so the posting focuses on the data-specific aspects.

What about degree requirements? In general, degree requirements, whether undergraduate or graduate (as well as MBA), do not seem to have any particular significance for Data PMs, showing zero information gain. Statistics, on the other hand, whether as a degree or just as background, is a clear attribute of Data postings, mentioned in 30% of them versus only 2% of non-Data postings.

While this small dataset may not be large enough to be a truly representative sample, it does give an interesting snapshot of how the job market views the role of a Data PM right now. If you have any further insights or comments, I’d love to hear your thoughts in the comments!

Amazon Go isn’t about no-checkout. Here’s what we missed.

After long anticipation and a whole year of delay, Amazon Go was finally launched to the general public with much fanfare three weeks ago. Well, that is if you call opening one small store on the ground floor of your giant corporate building – a “launch”.

The prototype Amazon Go store at Day One, Seattle. By SounderBruce – CC BY-SA 4.0


The move has reignited the debate about what impact the new technology will have on the 2.3M cashiers in the U.S., whose jobs might be eliminated, and on the retail industry as a whole.

But the real question runs deeper. It’s very clear that operating an Amazon Go store comes at a major cost. If all the cost saving is the paychecks of a few cashiers, but the price to pay is installing and maintaining a very large number of advanced sensors – not to mention the initial cost of developing the technology – the bottom line is clearly not profitable.

Furthermore, if all Amazon wants is to remove the need for cashiers, self-checkout has existed for quite some time and is likely much cheaper to install. Walmart took care to announce that it expanded its “scan and go” app in advance of the Amazon Go launch – yet another alternative. Why is it so critical for Amazon to eliminate the explicit checkout step altogether?

Is it all perhaps just a publicity stunt? Amazon is not known for pulling such stunts; when it launches something big, it’s usually to truly make a strategic move in that direction.

To better understand Amazon’s motivations, we need to go back to how we, the target audience, have been looking at this story, and in particular – the user data story.

If no cashier or RFID reader scans your (virtual) shopping bag, it inherently means that Amazon needs a different way to know what you put in it. Using cameras means it needs even more than that – it needs to know which item you picked up and only looked at, and which item you eventually decided to purchase.

So, Amazon Go means Amazon should be watching every move you make in the store. In fact, it means Amazon must watch everything you do, or else the whole concept will not work. This requirement that we accept so naturally, this seemingly so obvious conclusion… this is the heart of Amazon’s strategy.

Just think about it: You walk into an Amazon store; your body and face are being scanned and tracked to the point that you can be perfectly recognized; your every move is being tracked and monitored, and all are mapped to a personally identifiable profile tied to your credit card, managed by a powerful private corporation. In any other context, this would trigger a firestorm of privacy and security charges, but for Amazon Go – well, that’s what it takes to deliver on its promise, isn’t it?

What does Amazon gain from this data?

What’s fascinating to notice is that this data enables transferring an entire stack of technologies and methodologies from online to offline, from the app or site to brick-and-mortar – a dramatic gain. Think about the parallels to what we already came to expect online – browse sessions, recommendations, abandoned cart flows… That item you were considering? Amazon now knows you considered it. Data scientists all over the retail world would love to get their hands on such physical, in-store behavioral data.

For now, the technology may be limited to groceries as a first step. But we could expect Amazon to work to expand it – rather than to more locations – to further verticals. Just think of the personalized recommendations and subscription services such a technology could drive in high-end wine stores, as one example.

One indication that Amazon is truly after the data rather than the stores themselves would be Amazon licensing Go to other retailers or small players. This would immediately position it as a data broker. In any case, retailers have yet another good reason to keep a close eye on Amazon’s disruptive moves.

The Data-Product-Scientist-Manager

What’s the difference between Machine Learning, Artificial Intelligence, Deep Learning and Data Science? The huge buzz around these concepts in recent years makes it seem as if they could be used interchangeably.

Several months ago, I gave a meetup talk, going through a history of machine-learning algorithms through the prism of learning algorithms for game playing (video, in Hebrew). As the talk did not require prior knowledge, I began with a quick intro to artificial intelligence (AI) and machine learning (ML) –

  • AI is the science of building machines that mimic what we humans perceive as intelligent behavior
  • ML is a branch within AI, concerned with algorithms that learn their “intelligence” from data (rather than explicit coding)
  • Deep Learning is a very particular method within ML, which uses artificial neural networks and huge volumes of data

At the time of building the talk, I was struggling with making a reference to data science (DS) as well, to help my audience make sense of all the current hype terms.

But is DS a discipline within AI? Within ML? Is it not simply a fancy name for Statistics? I ended up leaving it out.

A few days ago, I stumbled upon an episode of the excellent ML podcast Talking Machines, where Neil Lawrence made an observation that put it all into place for me. Lawrence posited that DS arises from the new streams of data we collect in this era, which are generated in huge volumes out of sensors and interactions, and its mission is to extract value from these. In other words – “Here, we have all of this data, what can we do with it?”

This may seem like a petty technicality, but it makes all the difference. In classic experimentation, scientists would make a hypothesis, collect data to validate or invalidate it, and then run the needed statistical analysis.

With DS, there is no such prior hypothesis. So the core of DS becomes coming up with these hypotheses, and then validating them right there and then, in the data we already have. But whose job is it to come up with hypotheses?

There are several relevant roles out there to consider:

  • Data (and business) analysts have a strong understanding of how to wrangle and query data, and will run analyses (either on demand or on their own initiative) against clear business objectives. But their role and mindset are not about finding new objectives, or disruptive new ways to reach them
  • Data and ML engineers build the technology and libraries on which the data is collected and crunched. They love to see their systems used to generate powerful insights and capabilities, but see themselves as the infrastructure for generating and validating these hypotheses, rather than as the users
  • Data scientists apply their strong statistics and ML skills to the above data infrastructure to build models, enabling new capabilities and user-value out of validated hypotheses. But a model is not built in a vacuum: they need a clear mission, derived from a validated hypothesis (or even a yet-to-be-validated one)
  • Product managers are the classic hypothesis-creator types. They analyze the market, meet with customers, dive into analytics and business data, and then create product hypotheses, collected into roadmaps. But they hardly use the above “big” data infrastructure for generating hypotheses, mostly due to gaps in technical know-how

What we need for data to be fully leveraged is a new role, a hybrid of the latter two. The data science product manager is a data scientist with the instincts and user-centric thinking of the product manager, or a product manager with the data exploration intuitions of a data scientist. Which skills will this require?

  • Strong data instincts, the ability and desire to explore data both assisted and unassisted, applying intuition to identify ad-hoc patterns and trends
  • User-centric thinking, seeing the users and real-life scenarios behind the data, almost like Neo in “The Matrix”
  • Technical acumen, though not necessarily coding. Today’s DS and ML tools are becoming more and more commoditized, and require less and less writing from scratch
  • Very strong prioritization capabilities – creating hypotheses from data may be easy, almost too easy. Hence the need to further explore only the most promising ones, turning them into a potential roadmap.
  • Ability to work closely with the data team and “speak their language” to quickly validate, understand the productization cost, and estimate ROI for a large list of such hypotheses

While this role could still be fulfilled by a strong partnership between two individuals working in tandem (a PM and a data scientist), it is clear that a single individual possessing all of these skills will achieve results far more efficiently. Indeed, as a quick search on LinkedIn shows, the combined role is emerging, and demand for it is exploding.

“Alexa, add voice shopping to my to-do list”

Amazon is promoting voice shopping as part of its deals for Prime Day next week. Shoppers will get a $10 credit just for making their first voice purchase from a list of “Alexa Deals”, items that are already greatly discounted. That’s a major incentive just to push consumers into something that should actually be a great benefit – effortless, simple, zero-click shopping. Why does Amazon have to go through so much trouble to get shoppers to use something that’s supposedly so helpful?

To understand the answer, it’s worthwhile to first understand how valuable voice shopping is to Amazon. In all demos and videos for the various Alexa devices, voice shopping is positioned as the perfect tool for spontaneous, instant purchases, such as “Alexa, I need toilet paper / diapers / milk / dog food / …” That easily explains why you need to be an Amazon Prime subscriber to use voice shopping – and getting Prime into every household is a cornerstone of Amazon’s business strategy.

In addition, Alexa orders are fulfilled by 1-Click payment, yet another highly valuable Amazon tool. Amazon also guarantees free returns for Alexa purchases, just in case you’re concerned about getting your order wrong. Now, combine all of these together and you can see how voice shopping is built to create a habit of shopping as a frictionless, casual activity. That is probably also why the current offer does not apply to voice shopping from within Amazon’s app, where the long process of launching the app and reaching its voice search ruins the spontaneity.

And yet – shoppers are not convinced. On last year’s Prime Day, a similar promotion offered by Amazon drove on average one voice order per second. This may sound like a lot, but ~85K orders are still a tiny fraction of the ~50M total orders consumers placed on Amazon that day. This year Amazon raised the incentive even further, which indicates there is still much convincing to do. Why is that?

Mute Button by Rob Albright @ Flickr (CC)

For starters, Amazon’s Alexa devices were never built to be shopping-only. Usage surveys consistently show that most users prefer to use the Alexa assistant to ask questions, play music, and even set timers, far more than to shop. This does not mean that Amazon has done a bad job, quite the contrary. Voice shopping may not be much of a habit initially, and getting used to voice-controlling other useful skills helps build habit and trust. Problem is, when you focus on non-shopping, you also get judged by it. That’s how Amazon gets headlines such as “Google Assistant is light-years ahead of Amazon’s Alexa”, with popular benchmarks measuring it by search, question answering and conversational AI – fields where Google has historically invested orders of magnitude more than Amazon. The upcoming HomePod by Apple is expected to complicate Amazon’s standing even further, with Apple poised to control the slot of the sophisticated, music-focused, high-end smart home device.

The “How it works” page for the Prime Day Alexa deals hints at other issues customers have with shopping in particular. The explanations aim to reassure that no unintended purchases will take place (triggered by your kids, or even your TV), and that if an imperfect voice interaction got you the wrong product, returns are free for all Alexa purchases. These may sound like solved issues, but keep in mind that the negative (and often unjustified) coverage of unintended purchases has led countless Echo owners to set a passcode on ordering, which is actually a major setback for the frictionless zero-click purchasing Amazon is after.

But most importantly, voice-only search interfaces have not yet advanced to support interactions more complex than simple context-less pattern recognition. It’s no accident that the most common purchase flows Alexa supports revolve around re-ordering, where the item is a known item and no search actually takes place. This means that using Alexa for shopping may work well only for simple pantry purchases, and only assuming you already made such purchases in the past. Google, on the other hand, is better positioned than Amazon in this respect, having a more sophisticated conversational infrastructure. It even enables external developers to build powerful and context-aware Google Assistant apps using tools such as api.ai (for a quick comparison of these developer platforms, see here).

So what might Amazon be doing to make voice shopping more successful?

Re-ordering items is the perfect beginner use-case, being the equivalent of “known item” searches. Amazon may work on expanding the scope of such cases, identifying additional recurring purchase types that can be optimized. These play well with other recent moves by Amazon, such as around grocery shopping and fulfillment.

Shopping lists are a relatively popular Alexa feature (as they are on Google Home), but owner testimonials suggest that most users use them for offline shopping. Amazon is likely working to identify more opportunities for driving online purchases from these lists.

Voice interfaces have so far focused mainly on returning a single result, yielding an “I’m Feeling Lucky” interaction. Using data from non-voice interactions, Amazon could build a more interactive script, one that could guide users through more complex decisions. An interesting case study here has been eBay with its “ShopBot” chatbot, though transitioning to voice-only control remains a UX challenge.

And finally – it’s worth noting that in the absence of an item in the purchase history (or if the user declines it), Alexa recommends products from what Amazon calls “Amazon’s Choice”, which are “highly rated, well-priced products”, as quoted from this help page. This feature is in fact a powerful business tool, pushing vendors to compete for this lucrative slot. In the more distant future, users may trust Alexa to the point of just taking its word for it and assuming this is the best product for them. That will place a huge lever in Amazon’s hands in its relationships with brands and vendors, and it’s very likely that other retailers and brands will fight for similar control, raising the stakes on voice search interfaces even more.

Feeling Lucky Is the Future of Search

If you visit the Google homepage on your desktop, you’ll see a rare, prehistoric specimen – one that most Google users don’t see the point of: the “I’m Feeling Lucky” button.

Google has already removed it from most of its interfaces, and even here it only serves as a teaser for various Google nitwit projects. And yet the way things are going, the “Feeling Lucky” ghost may just come back to life – and with a vengeance.


In the early years, the “I’m Feeling Lucky” button was Google’s way of boldly stating “Our results are so great, you can just skip the result list and head straight to destination #1”. It was a nice, humorous touch, but one that never really caught on as users’ needs grew more complex and less obvious. In fact, it lost Google quite a lot of money, since skipping the result list also meant users saw fewer sponsored results – Google’s main source of income. But usability testing showed that users really liked seeing the button, so Google kept it around for a while.

But there’s another interface rising up that prides itself on returning the first search result without showing you the list. Did you already guess what it is?


Almost every demo of a new personal assistant product includes questions being answered by the bot tapping into a search engine. The demos make sure to use simple, single-answer cases, like “Who is the governor of California?” That’s extremely neat, and would have been regarded as science fiction not so many decades ago. Amazing work on query parsing and entity extraction from search results has led to great results on this type of query, and the quality of the query understanding, and of the resulting answers, is usually outstanding.


However, these are just some of the searches we want our bots to run. As we get more and more comfortable with this new interface, we will not want to limit ourselves to one type of query. If you want an answer to “Give me a good recipe for sweet potato pie” or “Which Chinese restaurants are open in the area right now?”, you need a lot more than a single answer. You need verbosity, you need to be able to refine – which stretches the limits of how we perceive conversational interfaces today.

Part of the problem is that it’s difficult for users to understand the limits of conversational interfaces, especially when bot creators pretend that there are no such limits. Another problem lies in the fact that a natural language interface may simply be a lousy choice for some interaction types, and imposing it on them will only frustrate users.

There is a whole new paradigm of user interaction waiting to be invented to support non-trivial search-and-refine through conversation – for all of those many cases where a short exchange and a single result will not do. We will need to find a way to flip between vocal and visual, manage a seamless thread across devices and screen-based apps, and make digital assistants keep context at a much higher level.

Until then, I guess we’ll continue hoping that we’re feeling lucky.



Learning to Play

Ever since I took my first course in Artificial Intelligence, I have been fascinated by the idea of AI in its classical meaning – teaching machines to perform tasks deemed by us humans as requiring intelligence.

Recently, I gave a talk at my company on some of the intriguing instances of one of these tasks – learning to play (and win!) games. I often found the human stories behind the scenes even more fascinating than the algorithms themselves, and those stories were the focus of this talk. It was really fun both to assemble and to deliver, so I wanted to capture these stories in this blog post, accompanying the embedded slides below.


So let’s get started!

a humble start

Game playing is a fantastic AI task, one that researchers were always excited about. Just like a toddler being taught to swing a baseball bat by an excited parent, the algorithm gets clear rules, a measurable goal and training input. But above all, testing the result involves the fun act of playing against the opponent you yourself have created, just like a proud parent. What a great way to do AI research!

As we go way back in the AI time machine, the first known implementation of an AI game was in 1950. Josef Kates was a young Jewish Austrian engineer whose family fled the Nazis’ rise to power and ended up in Canada. Kates worked on radar and vacuum tube design at a company named Rogers Majestic, and later developed his own patented tube, which he called the Additron. While waiting for the patent to be registered, he wanted to demonstrate the power of his invention at a local technology fair, so he built a machine that could play Tic-Tac-Toe, calling it “Bertie the Brain”.


Comedian Danny Kaye pleased after “beating” Bertie the Brain during the fair

“Bertie the Brain” was a huge success at the fair. Kates made sure to adjust its level of difficulty so players could occasionally beat it, and visitors lined up to play. Nevertheless, at the end of the fair it was dismantled and forgotten. Unfortunately for Kates, the Additron took a very long time to go through patenting, and by the time it was approved, technology had already moved on to transistors.

The algorithms pioneered and used in those early days were based on the Minimax method – constructing a tree of all possible moves by the player and opponent, and evaluating each position’s proximity to a winning one. At every turn, the algorithm assumes best play: the computer picks the move with the MAXimal value, while the opponent picks its own maximum, which is the computer’s MINimal value. In this way, the algorithm could calculate as far into the future as time allowed.
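The whole idea fits in a few lines of Python; this is a generic sketch assuming a hypothetical game-state object with moves(), play(), is_terminal() and score() helpers:

```python
def minimax(state, depth, maximizing):
    """Score a position by exploring the move tree to a given depth.

    `state` is a hypothetical game-state object exposing is_terminal(),
    score() (heuristic proximity to a win, from the computer's viewpoint),
    moves(), and play(move), which returns the resulting state.
    """
    if depth == 0 or state.is_terminal():
        return state.score()
    if maximizing:  # the computer picks its MAXimal-value move...
        return max(minimax(state.play(m), depth - 1, False) for m in state.moves())
    # ...while the opponent's best play yields the computer's MINimal value
    return min(minimax(state.play(m), depth - 1, True) for m in state.moves())
```

Bertie could skip the depth cutoff entirely: with only 765 positions, the full game tree fits in memory.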

With only 765 unique board positions in Tic-Tac-Toe, the game was small enough for all positions and moves to be calculated in advance, making Bertie unbeatable. AI researchers call such a game “solved”. In fact, perfect play will always end in a draw, and if you watched the 1983 movie “WarGames” with Matthew Broderick, you’ll recall how this fact saved the world from nuclear annihilation…

advance to world-class wins

So if Tic-Tac-Toe is too simple, how about a more complex game such as checkers?

Checkers has, well, slightly more board positions: with 5×10²⁰ of them, it was a much more challenging AI task. The best-known checkers program, even if not the first, was the one written by Arthur Samuel at IBM. Samuel’s checkers player was a true classic, and for several decades it was considered the best that could be achieved. It still used Minimax, but expanded its repository of board positions from actual games played, often against itself, thus becoming a true learning algorithm. However, it never reached the level of beating master human players.


In 1989, a group of researchers – led by Jonathan Schaeffer from the University of Alberta – set out to use advances in computing and break that glass ceiling with a new program called Chinook. I had the privilege of attending a fascinating talk by Schaeffer at the Technion 10 years ago, and the blog post I wrote subsequently summarizes the full story. That story has fascinating twists and touching human tributes in it, but it ends with machines being the clear winners – and with AI researchers declaring the game of checkers as solved as well.

The obvious next challenge in our journey would be what’s considered the ultimate game of intelligence – chess. Played on the same board as checkers but with far more complex moves, chess has approximately 10¹²⁰ possible games – dwarfing the checkers numbers by some hundred orders of magnitude. A famous chess-playing machine was The Turk, designed and constructed in Austria by Wolfgang von Kempelen as early as 1770. The Turk was a wonder of its age, beating experienced chess players and even Napoleon Bonaparte. It was a hoax, of course, cleverly hiding a human sitting inside it, but the huge interest it generated was a symbol of the great intelligence attributed to playing the game.

The huge search space in which Minimax had to be applied made early chess programs extremely weak against humans. Even with the introduction of minimax tree-pruning methods such as Alpha-Beta pruning, it seemed like no algorithmic tuning would produce a breakthrough. As the decades passed, though, more powerful computers enabled faster computations and larger space to hold billions of possible board positions. This culminated in the famous 1996 duel between IBM’s Deep Blue chess-playing computer – already capable of evaluating 200 million positions per second – and the world champion at the time, Garry Kasparov. Despite losing a game to the supercomputer, Kasparov won the tournament easily, 4–2. IBM went on to further improve Deep Blue and invited Kasparov to a re-match the following year. Kasparov won the first game easily, and was so confident as a result that he lost the next game, a loss he blamed on cheating by IBM. The match ended 3.5–2.5 to Deep Blue, a sensational first win for a machine over a presiding world champion.
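The Alpha-Beta pruning mentioned above returns the same value as plain Minimax while skipping branches that provably cannot affect the outcome; a sketch, using the same hypothetical state interface as before:

```python
def alphabeta(state, depth, alpha, beta, maximizing):
    """Minimax with alpha-beta pruning over the same hypothetical state API."""
    if depth == 0 or state.is_terminal():
        return state.score()
    if maximizing:
        value = float("-inf")
        for m in state.moves():
            value = max(value, alphabeta(state.play(m), depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the opponent will never allow this line; stop exploring
        return value
    value = float("inf")
    for m in state.moves():
        value = min(value, alphabeta(state.play(m), depth - 1, alpha, beta, True))
        beta = min(beta, value)
        if beta <= alpha:
            break  # the computer already has a better option elsewhere
    return value
```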

from brute force to TRUE learning

The shared practice connecting all the work we saw so far – from Bertie the Brain to Deep Blue – was to feed huge amounts of knowledge into the software, so that it could outdo the human player by sheer computing power and the board positions stored in its vast memory. This enabled algorithms such as Minimax to process enormous numbers of positions, apply the human-defined heuristics to them, and find the winning moves.

Let’s recall the toddler from the start of our journey. Is this how humans learn? Would we truly consider this artificial intelligence?

If we want to emulate true intelligence, what we’d really like to build are algorithms that learn by themselves. They will watch examples and learn from them; they will build their own heuristics; they will infer the domain knowledge rather than have it fed into them.

In 2014, a small London start-up named DeepMind Technologies, founded less than three years earlier, was acquired by Google for the staggering sum of $600 million, before it had released even a single product to the market. In fact, reporters struggled to explain what DeepMind was doing at all.

The hints at what attracted Google to DeepMind lie in a paper its team published in December 2013. The paper, presented at NIPS 2013, was titled “Playing Atari with Deep Reinforcement Learning“. It was about playing games, but unlike anything before: a generic system learning to play games without being given any knowledge – nothing but the screen and the location of the score display on it. You could equate it to a human who had never played Pac-Man taking the controls, hitting them in all directions, watching the score, gradually figuring out how to play like a pro – and then doing the same for many other games, all using the same method. Sounds human? This was the technology Google was after.

Watching DeepMind play Atari Breakout (seen in this video) is like magic. The algorithm starts out moving randomly, barely hitting the ball once in many attempts. After an hour of training, it plays at an impressive pro level. Then it even learns the classic trick that any Breakout player eventually masters – tunneling the ball to the top so that it knocks bricks off with little effort. The beauty of it all was that the exact same system mastered several other games with no custom optimizations – only the raw screen input and an indication of where the score is, nothing else. There was no Minimax running, no feeding of grandmaster move books or human-crafted heuristic functions. It was generic deep-learning neural networks, using reinforcement learning to look at a series of moves and their score outcome, and to uncover the winning patterns all by itself. Pure magic.
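The flavor of that reinforcement-learning loop can be conveyed by its simplest, tabular form – Q-learning; DeepMind’s leap was replacing the table with a deep network fed raw pixels (an illustrative sketch, not DeepMind’s code):

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1  # learning rate, discount, exploration rate
Q = defaultdict(float)                  # (state, action) -> expected future score

def choose_action(state, actions):
    """Mostly exploit the best known action, occasionally explore at random."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def learn(state, action, reward, next_state, actions):
    """Nudge Q toward the observed reward plus the discounted best future value."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```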

AI Building games

For the last part of the talk, I deviated to a related topic, walking through a wonderful series of blog posts I stumbled upon called “Machine Learning is Fun!”, in which the author, Adam Geitgey, walks through basic concepts in Machine Learning. In part two, he describes how Recurrent Neural Networks (RNNs) can be trained to learn and generate patterns. The simplest example we all know and appreciate (or sometimes not…) is the predictive-text feature of mobile keyboards, where the system attempts to predict which word we are trying to type – the cause of so many great texting gaffes.
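That prediction task can be illustrated with something far simpler than an RNN – a toy bigram frequency model (illustrative only):

```python
from collections import Counter, defaultdict

def train_bigrams(words):
    """Count, for each word, which words tend to follow it."""
    model = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += 1
    return model

def suggest(model, word, k=3):
    """Suggest the k most frequent continuations, keyboard-style."""
    return [w for w, _ in model[word].most_common(k)]

model = train_bigrams("the sun also rises and the sun also sets".split())
print(suggest(model, "also"))  # ['rises', 'sets']
```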

Moving to more elaborate examples, Geitgey fed an RNN implementation with a Hemingway book (“The Sun Also Rises”), training it recurrently on the book’s text and then having it generate texts of its own in the book’s style. It starts out with incomprehensible strings of characters, but these gradually take the form of words and sentences, to the point where the sentences almost make sense and retain Hemingway’s typically curt dialogue style.

Geitgey then takes this system and applies it to none other than Super Mario Maker, a version of Super Mario that allows players to build levels of their own. He transforms game levels into text streams and feeds them into the learning system. Here too, the system spits out nonsense at first, but gradually learns the basic rules and eventually generates actual playable levels. I’m no expert on Super Mario so I couldn’t tell, but I showed it to my son and he said it’s a great level that he would be happy to play. That’s intelligent enough for me!



So Long, and Thanks for All the Links


Prismatic is shutting down its app.

I’ve been fascinated by algorithmic approaches to information overload for quite some time now. It seemed like one of those places where the Web changed everything, and now we need technology to kick in and make our lives so much easier.

Prismatic was one of the more promising attempts I’ve seen, and I was a user ever since its launch back in 2012. Every time I opened it, it never failed to find me real gems, especially given the tiny setup it required when I first signed up. Prismatic included explicit feedback controls, but it seemed to excel at using my implicit feedback, which is not trivial at all for a mobile product.

Flipboard is likely the best alternative out there right now, and its excellent onboarding experience helped me get started quickly with a detailed list of topics to follow. With reasonable ad-powered revenue, which Prismatic seemed to shun for whatever reason, it is also less likely to shut down anytime soon. Still, Prismatic did a much better job than Flipboard at surfacing high-quality, long-tail, non-mainstream sources; let’s hope Flipboard keeps improving to get there.

It seems, though, that news personalization is not such a strong selling point. Recently, Apple moved from a purely personalized play for its Apple News app to adding curated top stories as well, as its view counts disappointed publishers. In my own experience, even the supposedly personalized feed was mostly made up of 3-4 mainstream sources anyway. Let’s hope this is not where information overload is leading us back to. Democratizing news and getting a balanced and diverse range of opinions and sources is a huge social step forward that the Web and social media have given us. Let’s not go backwards.

Marketing the Cloud

IBM made some news a couple of days ago, announcing that consumers can now use Watson to find the season’s best gifts. A quick browse through the app, which is actually just a wrapper around a small dedicated website, shows nothing out of the ordinary – Apple Watch, Televisions, Star Wars, Headphones, Legos… not much supercomputing needed. No wonder coverage turned sour after the initial hype. So what was IBM thinking?

Rewind the buzz machines one week back. Google stunned tech media by announcing it was open-sourcing its core AI framework, TensorFlow. The splashes were high: “massive potential”, “Machine Learning breakthrough”, “game changer”… but after a few days the critics were out, with Quorans pointing at the library’s slowness, and even Google-fanboy researchers wondering – what exactly is TensorFlow useful for?

Nevertheless, within three days, Microsoft quickly announced its own open-source Machine Learning toolkit, DMTK. The Register was quick to mock the move, saying “Google released some of its code last week. Redmond’s (co-incidental?) response is pretty basic: there’s a framework, and two algorithms”…

So what is the nature of all these recent PR-like moves?

marketing-cloud

There is one high-profit business shared by all of these companies: cloud computing. Amazon leads the pack in revenue, using the cash flow from its cloud business to offset losses from its aggressive ecommerce pricing, while Microsoft and Google are assumed to come next with growing cloud businesses. Google even goes as far as predicting that its cloud revenue will surpass ads revenue in five years. It is the industry’s gold rush era.

But first, companies such as Microsoft, Google and IBM need to convince corporations to hand their business to them, rather than to Amazon. Hence they have to create as much “smart” buzz for themselves as possible, so that executives in these organizations, already fatigued by the big-data buzzwords, will say: “We must work with them! Look, they know their way around all this machine-learning-big-data-artificial-intelligence stuff!!”

So the next time you hear some uber-smart announcement from one of these companies that feels like too much hot air, don’t look for too much strategy; instead, just look up to the cloud.

Thoughts on Plus – Revisited

Two weeks ago, Google decided to decouple Google+ from the rest of the Google products and to no longer require a G+ login when using them (e.g. YouTube), in effect starting to gradually relieve it of its misery. Mashable published an excellent analysis of the entire history of the project, and of the hubris demonstrated by Vic Gundotra, the Google exec who led it.

Bradley Horowitz, who conceived Google+ along with Gundotra and is now the one overseeing the transition, laid out the official Google story in a G+ blog post. He talked of the double mission Google assigned to the project – to become a unifying platform, as well as a product in its own right. A heavy burden to carry, as in many cases these two missions would surely conflict with each other and mess up the user experience, as they did. Horowitz also explains what G+ should have focused on, and now will: “…helping millions of users around the world connect around the interest they love…”

Well, unfortunately Horowitz seems not to be a regular reader of Alteregozi 🙂 Had he read this post, published exactly 4 years ago right here, perhaps G+ would have had more of a differentiation, and a chance.