Netflix is a subscription-based DVD rental service that lets its customers browse thousands of movies online and then delivers them through the mail. With so many movie choices available, and to stay competitive with brick-and-mortar rental establishments, Netflix has improved customer satisfaction through numerous user interface improvements, movie distribution centers, streaming video, and its ability to suggest movies based on individual preferences to deliver a personalized experience. So what does a Netflix movie suggestion have to do with the effectiveness of a Social Networking Site (SNS) like Facebook or Twitter? Surprisingly, the ability to suggest movies you might like may not be all that different from the way we use an SNS as an efficient information gathering and interpersonal connection tool.
Besides the entertainment value, we do know that SNSs are described as efficient mechanisms for connecting people and information together to form relationships and generate conversation. They are useful for finding and gathering the information we are interested in quickly and with little effort, and this lower ‘transaction cost’ makes them very appealing. Scientifically speaking, what is it that social networking sites do that is unique in the world of Web 2.0 tools and makes them better at providing the kind of news and information we want to see? After all, mainstream tools like RSS feeds, discussion forums, and blogs, along with commercial offerings like Google, Amazon, and network news, have been providing us this kind of information, and they all work fairly well.
But there is a subtle but important difference in the way information is gathered and presented to us, which might be explained by something as seemingly unrelated as a Netflix movie recommendation. Two tightly coupled forces are at work: our preferences and our resistance to change. Over time, we develop affinities for certain routines and things, like a favorite way to make coffee or a particular brand of coffee itself. We are creatures of habit, sticking to similar themes and products that we have a preference for. When attending a multiple-day class or conference, we tend to sit in the same place as the previous day. I’ve observed that people tend to use the same piece of gym equipment, park in the same spot, or take ownership of a particular corner in the shared work refrigerator. In some cases a sense of ‘ownership’ may be at work, and the routine provides us comfort and control over a few facets of our lives. Since preferences are developed over time, altering them takes either a long time or a significant event, but it does happen. Netflix is particularly elegant in that it takes our established preferences, based on how well we rate particular movies, and uses them to suggest others that we may like. In doing so, it may expose us to movies that we had forgotten about or had no idea ever existed. These suggestions may be on the fringe of our established preferences or interests, but the mere fact that we are exposed to something outside our routine is the key.
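To make the idea of suggesting movies from ratings concrete, here is a minimal sketch. It assumes a simple "people who rate like me" approach, which is my own illustration and not Netflix's actual algorithm; the function names, the 1 to 5 star scale, and the sample ratings are all made up.

```python
def similarity(a, b):
    """Agreement between two users on the movies both have rated (0.0 to 1.0)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    mean_gap = sum(abs(a[m] - b[m]) for m in shared) / len(shared)
    return 1.0 - mean_gap / 4.0  # 4 is the largest gap on an assumed 1-5 scale


def suggest(target, others):
    """Rank movies the target has not rated by similarity-weighted average stars."""
    scores, weights = {}, {}
    for ratings in others:
        w = similarity(target, ratings)
        if w <= 0:
            continue  # ignore users with no overlap or total disagreement
        for movie, stars in ratings.items():
            if movie not in target:
                scores[movie] = scores.get(movie, 0.0) + w * stars
                weights[movie] = weights.get(movie, 0.0) + w
    ranked = {m: scores[m] / weights[m] for m in scores}
    return sorted(ranked, key=ranked.get, reverse=True)


# Hypothetical ratings, for illustration only.
me = {"The Wizard of Oz": 5, "Fight Club": 2}
others = [
    {"The Wizard of Oz": 5, "Fight Club": 1, "Chicago": 5, "Martial Arts Movie": 1},
    {"The Wizard of Oz": 1, "Fight Club": 5, "Chicago": 2, "Martial Arts Movie": 5},
]
suggestions = suggest(me, others)
```

The first user rates much like I do, so that user's five stars for Chicago count for far more than the second user's two stars, and Chicago lands at the top of the list even though I never went looking for it.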
Besides what the Netflix algorithm recommends, the company has added an extra layer that bridges the gap into the concept of an SNS by allowing users to add “friends”, see each other’s movie queues, and obtain suggestions based on favorable ratings. In essence, it’s a double-pronged approach to a better user experience in finding movies we might like, and it’s very effective. What’s happening here is that our transaction cost to change, to explore, has been lowered to the point where we are comfortable enough to take a chance on something new. After all, it’s just a movie; there is no long-term commitment and little cost associated with adding it to the movie queue. But it’s the connection to how an SNS operates that is so important to understanding why social networks have an edge over the other mainstream Web 2.0 tools. First, let’s take a look at how some of these tools work, then get back to the link between Netflix and the SNS.
It’s not hard for Amazon to suggest coffee and running products to me, because I shop for those products more frequently than others. Perhaps it is also a little creepy that it remembers what I bought five years ago and suggests I look at similar items. Because we are creatures of habit, these systems are designed to help us push the boundaries of our preferences, to explore items on the fringe of our interests and known world. These could be lateral movements across multiple interconnected subjects, or digging deeper into narrower slices of a particular niche we find ourselves in. They are designed to intelligently present choices that may fill unconscious voids and gaps or become serendipitous agents of need. More importantly, they find ways of drawing us in, keeping us on the system, and getting us to spend money with each “add to cart” click. Sometimes I wonder what my shopping and buying over time would look like if I were able to map it out, a digital representation of the material things I looked at.
But how can these systems anticipate our behavior, what we know and like? In fancy terms, it’s known as a statistical inference engine. When Google provides suggested search terms in the box, it uses a large data set of prior usage history based on the collective behavior of other people looking for similar things. I’ve used Google as a spell checker because it understands the most common mistakes people make and suggests the most likely alternative. The same is true of Amazon, which also makes predictions about what I want to look at and buy. There is the feature that tells me what other people looked at, what other people bought, and the bundling of “frequently bought together” items to create a “package deal” of related products. Another ‘helpful’ feature lets one know, for example, that 75% of the people who viewed a particular digital camera also purchased it, and breaks down the percentages for the similar products the rest bought instead. It’s great that three out of four people bought a specific digital camera, and it seems like my search is over, or is it? Is this a great product, or another display of groupthink, where comfort in the majority decision means I do not have to think because other people have done it for me and I will be satisfied with the results? This sounds like the reasoning of a self-fulfilling prophecy, but quality in terms of product, usability, and service is more often attributable to products that sell in larger quantities. Sometimes a negative review reflects a bad battery shipped with the device, not necessarily warranting the one-star rating given. Other times glowing five-star reviews may have been too subjectively kind. It turns out that buying online can be easy, or difficult if one cannot decide what to get and whom to trust.
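The “75% also purchased it” figure is, at heart, just counting. The sketch below is a rough illustration of that arithmetic under my own assumptions: the session records, item names, and the `also_bought_shares` function are hypothetical, and a real store's system is certainly more elaborate.

```python
from collections import Counter


def also_bought_shares(viewed_item, sessions):
    """For shoppers who viewed `viewed_item`, return the share who bought each item."""
    relevant = [s for s in sessions if viewed_item in s["viewed"]]
    if not relevant:
        return {}
    counts = Counter(item for s in relevant for item in s["bought"])
    return {item: n / len(relevant) for item, n in counts.items()}


# Hypothetical shopping sessions, for illustration only.
sessions = [
    {"viewed": {"camera_a", "camera_b"}, "bought": {"camera_a"}},
    {"viewed": {"camera_a"}, "bought": {"camera_a"}},
    {"viewed": {"camera_a", "camera_c"}, "bought": {"camera_a"}},
    {"viewed": {"camera_a", "camera_b"}, "bought": {"camera_b"}},
    {"viewed": {"camera_d"}, "bought": {"camera_d"}},
]
shares = also_bought_shares("camera_a", sessions)
```

Of the four shoppers who looked at camera_a, three bought it and one bought camera_b, which yields the 75% and 25% split. Note that the number says nothing about whether those three were happy with their purchase.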
Models based on ‘popularity’ work in the broader sense, but are not so accurate when looking for news, information, and topics that are more specialized. Some sites use “recent tags” or “trending topics” to quickly give the viewer a sense of what the latest buzz or flavor of the hour might be. News sites like CNN and Fox list their most-read and most-emailed stories, and often I find that I am not interested. In the case of breaking news, like an earthquake, these stories filter up to the top quickly and the model succeeds in delivering content I want. Otherwise, I go to specific areas or RSS feeds for more targeted content. Another interesting model is the awesomely geeky website Slashdot, whose tagline is “News for nerds. Stuff that matters” and which is full of specialized information actively posted and commented on by a vibrant community. This community has created its own wake, dubbed the “Slashdot effect”, where unsuspecting niche websites and information stores have had their servers overrun and crashed by an unexpected volume of visits generated by the site. Slashdot, while an interesting news and discussion forum, is not where I would want to ask my digital camera questions, nor does it afford me the same level of conversational access to the people I am connected to. I suppose that some people are perfectly happy in there and it serves a niche community well, but let’s face it: it’s not for the masses. However, movie rentals and SNSs are for the masses, so let’s dig into what makes them so effective.
Netflix held a contest to find a better algorithm for predicting how a user would rate a movie based on that user’s prior ratings of others. It awarded “progress prizes” to teams that demonstrated improvement, and a grand prize of one million U.S. dollars was awarded in 2009 for beating Netflix’s own predictive engine by 10%. Given that there are thousands of movies we have never heard of, this is a nice way of helping users find movies that they would like to see, improving the overall value of the system. We are complex creatures, full of what might look like walking contradictions to an algorithm. I smile every time Netflix identifies a martial arts movie with lots of explosions for my wife; I wonder what data got into the system to cause that. Even my own movie preferences might appear peculiar, because I would rank The Wizard of Oz (1939) as one of my top five favorite movies of all time, but I generally can’t stand musicals. Strangely enough, I found that I liked the movie Chicago (2002) as well, but it was not my choice to watch it in the first place. If I cannot explain in simple terms why I really like these two movies, how does an algorithm mathematically determine what I did not know I would like in the first place? We will have to examine the paper from the winners, The BigChaos Solution to the Netflix Grand Prize.
Wrapping one’s head around the mathematics presented in the Big Chaos paper, the result of three years of study and competition, is quite a challenge, so I didn’t go there. Instead, there are some concepts and interesting statements afoot that have complex and profound implications for the world of SNS. While some of the effects observed seem simple, almost “no-duh” in aspect, they form strong foundations for developing a holistic understanding of the fickle undercurrents involved in movie preference. During the first year, the “…biggest discovery was the binary information, accounting for the fact that people do not select movies for rating at random.” This is what I meant by a “no-duh” moment. People do not rent or rate movies at random because they browse and select them based on their preferences. If I did not enjoy musicals, it would be hard to find any in the genre that I’ve rated. Also, I do not have to change or explore new territories to rate what I like in the areas I’m interested in. The second year focused on time-based, or ‘temporal’, effects seen in the data. They observed small changes over longer periods of time, but the short-term effect, “…especially the one day effect, was very strong.” They were uncertain whether multiple users on the same account or a person’s mood on any particular day was the cause of the one day effect, but the former would explain why Netflix attributes Fight Club to my wife: we’ve most likely gotten our account profiles and ratings mixed up at one time or another. As for changing moods, a perfect case in point was my one-day rating and addition of several seasons of MacGyver to our queue during a nostalgic ‘favorite childhood television shows’ moment.
These strange behaviors obviously did not fit a linear model well, so the team went the route of a nonlinear model, more akin to the way our brain works. Biological systems are complex, adaptive, and contain an astounding number of connections to process information. For the 2008 progress prize, the Big Chaos team implemented an artificial neural network to blend a set of independent predictors, with some success, but not yet enough to win the top prize. What they realized is that “… training and optimizing the predictors individually is not optimal. Best blending results are achieved when the whole ensemble has the right tradeoff between diversity and accuracy.” In the pursuit of accuracy, the network lost the glimmer or hint of the subtle interconnectedness of the various parameters that might only be found, even if subconsciously understood, in the human mind. Improvements were made by going back and tuning the parameters in sequence, and finally the single neural network was replaced by multiple ones to maximize diversity with “…different subsets of predictors and different blending methods.” This blending approach using diverse nonlinear sets was critical to their success in claiming the million-dollar prize, and it is the secret link between a movie recommendation and explaining the value of social networking.
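The tradeoff between diversity and accuracy can be seen in a toy example. This is only a sketch under my own assumptions: it swaps in a plain equal-weight average for the team's neural network blending, and the ratings and the three predictors are invented for illustration.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def blend(*predictors):
    """Equal-weight average of several predictors (a stand-in for real blending)."""
    return [sum(values) / len(values) for values in zip(*predictors)]


# Made-up ratings and predictions, for illustration only.
actual = [4, 3, 4, 2, 4, 2]
pred_a = [5, 2, 5, 1, 5, 1]              # always off by one star
pred_b = [3, 4, 3, 3, 5, 1]              # also off by one, mostly in the opposite direction
pred_c = [4.5, 2.5, 4.5, 1.5, 4.5, 1.5]  # more accurate, but errs the same way as pred_a

diverse_blend = rmse(blend(pred_a, pred_b), actual)  # two mediocre but different predictors
similar_blend = rmse(blend(pred_a, pred_c), actual)  # one mediocre, one good, same mistakes
```

On their own, pred_a and pred_b each miss by a full star, yet blending them beats the blend that includes the more accurate pred_c, because their mistakes mostly cancel out while the mistakes of pred_a and pred_c pile up in the same direction.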
Big Chaos in many ways reproduces the result of our own behavior when using social networking tools. If we break down our own social networks, we have a diverse collection of friends and networks, developed over time, that forms an optimal collection of thought and information sources based on our preferences. Artificial neural networks are an attempt to mimic the behavior of the human mind, but they are a very poor substitute for the real thing. In this respect, each of our connections is a diverse neural network that independently optimizes news and information, and every friend, fan page, or group appears in our news feed because we chose to put it there. We adjust this mix over time as we subscribe, friend, and follow others, and if the information is annoying or irrelevant, it’s easy to hide or unsubscribe. These relationships essentially carry an independently tuned ‘predictor’ of our preferences. Just as with favorite movie categories, we know some friends are better at delivering information we like, so we pay more attention to them. Some people or pages are great at focused information, which comes in handy when one needs help, for example, understanding how a neural network works.
It’s the commonality between friends, a subtle connectedness that establishes a basic predictor of relevance to us, that makes social networks so effective. Creative solutions to problems have arisen from fresh perspectives and different approaches, and connecting with and garnering input from so many sources has never been easier than with things like Facebook and Twitter. What we have created with our networks is a stream of information we would most likely be interested in. I’ve had a friend announce they were going to see a movie, so I asked for a report to see if I would like it too. Because I have a sense of their interests and preferences, their input carries greater weight in my decision-making than some advertisement about how I could spend my time and money on a Saturday matinee. While I use a movie as an example, the subject could be anything from work, academics, or a hobby to some other personal interest, and perhaps even information on a digital camera.
There is something about social networking and the way it works that must make it unique among Web 2.0 tools. It’s very interesting that Big Chaos began to represent preference by using information similar to the behavior of a social network, but that doesn’t answer the ‘what’s so special’ question. We are, after all, social creatures, though some are more introverted than others in normal settings. Behind the computer screen, introverts may excel at sharing and connecting more freely, bringing more conversation to the party, while the true extroverts can stay connected with 1000 of their closest friends much more easily. In either case, the system helps me find and deepen relationships with people I know, and it could help me find lots of friends that I’ve barely met. What is happening here is that both behaviors are enabled by the system, which gets back to this idea of lowering the transaction cost of change. In Netflix, we saw that this was easy because it’s something small like a movie, with no long-term commitment. We’ve all seen movies that we did not like; it’s not really a big deal because we know that the penalty for taking the risk is small.
For people who like change, social networking allows for countless hours of searching, connecting, exploring, and finding new things. Finding out what other people are doing may lead to new ideas and adventures. For people with an apprehension about change, the behavior requires no modification because social networking systems enable them as well. Choice and information are presented quickly and efficiently, and there is no penalty for taking a quick peek at something online, notwithstanding the ever-present malware links. Change outside the realm of our experience often appears as a secondary event, even a trivial one, because it is non-intrusive and nonthreatening. Sites like Facebook do not present choices at random because they make direct or indirect use of friend data to present more probable choices, borrowing from either a collection of neural networks or one’s own behavior. It’s nice because we are rewarded by a system that allows us to maintain our bad habits, yet circumvents them in its machinery with little conscious thought on our part as we participate. That is the real secret, and it is no wonder these things are so useful and fun to me: I can have my cake and eat it too!