Saturday, June 24, 2017

Two decades of Amazon.com recommendations

IEEE Internet Computing just celebrated its 20th anniversary.

On its 20th anniversary, the editorial board created its first ever “The Test of Time” award. I'm honored to say they gave it to our 2003 article, "Amazon.com Recommendations: Item-to-Item Collaborative Filtering", which continues to be accessed, cited, and used in industry and research many years after its original publication.

In addition, for the 20th anniversary issue of IEEE Internet Computing, we wrote a new article, “Two Decades of Recommender Systems at Amazon.com”. Some excerpts:

For two decades now, Amazon.com has been building a store for every customer. Each person who comes to Amazon.com sees it differently ... It's as if you walked into a store and the shelves started rearranging themselves, with what you might want moving to the front, and what you're unlikely to be interested in shuffling further away.

Amazon.com launched item-based collaborative filtering in 1998, enabling recommendations at a previously unseen scale for millions of customers and a catalog of millions of items. Since we wrote about the algorithm in IEEE Internet Computing in 2003, it has seen widespread use across the Web, including YouTube, Netflix, and many others.

The algorithm's success has been from its simplicity, scalability, and often surprising and useful recommendations, as well as desirable properties such as updating immediately based on new information about a customer and being able to explain why it recommended something in a way that's easily understandable.

What was described in our 2003 IEEE Internet Computing article has faced many challenges and seen much development over the years ... We describe some of the updates, improvements, and adaptations for item-based collaborative filtering, and offer our view on what the future holds for collaborative filtering, recommender systems, and personalization.

....

What does the future hold for recommendations? ... Discovery should be like talking with a friend who knows you, knows what you like, works with you at every step, and anticipates your needs.

Recommendations and personalization live in the sea of data we all create as we move through the world, including what we find, what we discover, and what we love ... Intelligent computer algorithms leveraging collective human intelligence ... Computers helping people help other people.

The field remains wide open. An experience for every customer ... offering surprise and delight ... is a vision none have fully realized. Much opportunity remains to add intelligence and personalization to every part of every system, creating experiences that seem like a friend that knows you, what you like, and what others like, and understands what options are out there for you.
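
For readers who haven't seen item-based collaborative filtering before, here is a minimal sketch of the general idea in Python. It is an illustration only, not the production algorithm and not necessarily the exact method in the article. It assumes purchases fit in a small in-memory dictionary and uses cosine similarity over co-purchases as one common similarity measure. The names `item_similarities` and `recommend` and the sample data are made up for this example.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(purchases):
    """Cosine similarity between items, based on which customers bought both.

    purchases: dict mapping a customer id to the set of item ids they bought.
    Returns a dict mapping item -> {other_item: similarity}.
    """
    buyers = defaultdict(int)    # item -> number of customers who bought it
    together = defaultdict(int)  # (item_a, item_b) -> customers who bought both

    for items in purchases.values():
        for item in items:
            buyers[item] += 1
        for a, b in combinations(sorted(items), 2):
            together[(a, b)] += 1

    similar = defaultdict(dict)
    for (a, b), both in together.items():
        score = both / sqrt(buyers[a] * buyers[b])
        similar[a][b] = score
        similar[b][a] = score
    return similar


def recommend(similar, owned, top_n=3):
    """Rank items the customer does not own by summed similarity to owned items."""
    scores = defaultdict(float)
    for item in owned:
        for other, score in similar.get(item, {}).items():
            if other not in owned:
                scores[other] += score
    # Highest score first; ties broken by item id so the order is stable.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


# Hypothetical sample data, invented for this sketch.
purchases = {
    "alice": {"book", "lamp"},
    "bob": {"book", "lamp", "pen"},
    "carol": {"book", "pen"},
    "dave": {"lamp"},
}
similar = item_similarities(purchases)
print(recommend(similar, {"lamp"}))
```

In this sketch the similarity table is computed ahead of time, so a recommendation is just a lookup over the items a customer already has. Because each score traces back to specific owned items, it is also easy to say why something was recommended.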

Sunday, June 11, 2017

Quick links

Some of the tech news I found interesting lately, and you might too:
  • Jeff Bezos: "Many decisions are reversible, two-way doors. Those decisions can use a light-weight process. For those, so what if you’re wrong? .... If you’re good at course correcting, being wrong may be less costly than you think, whereas being slow is going to be expensive for sure." ([1])

  • Jeff Bezos: "I would say, a lot of the value that we’re getting from machine learning is actually happening beneath the surface. It is things like improved search results. Improved product recommendations for customers. Improved forecasting for inventory management. Literally hundreds of other things beneath the surface." ([1])

  • A good summary of Mary Meeker's 2017 report. A key highlight is saturation in smartphones and internet usage. ([1])

  • New Google AI incubator: "Investment arm aimed squarely on artificial intelligence ... will operate almost like an incubator with a shared workspace for AI startups and mentorship" ([1] [2])

  • Lots of good labeled data (reliable ground truth) is the key to success with AI ([1] [2] [3] [4])

  • AI in the real world is a lot harder than ideal conditions in part because you see crazy things like robots getting attacked by humans ([1] [2])

  • "The Google [Chrome] ad-blocker will block all advertising on sites that have a certain number of 'unacceptable ads,' according to The Wall Street Journal. That includes ads that have pop-ups, auto-playing video, and 'prestitial' count-down ads that delay the display of content." ([1])

  • Nice ACM Queue article from Google SREs on availability as a combination of subservice reliability, rapid recovery, and setting expectations ([1])

  • "Designing a [software] library to reduce cognitive load is still the exception, not the rule" ([1] [2])

  • A lesson for bigger companies: invest for the long term in your researchers, who are often working a few years ahead of what you need now ([1])

  • Wow: "The Melt’s blundering trajectory is instructive ... Entrepreneurs frequently embark on these missions with vast sums of money and a deep belief in technology’s power to solve all problems — which is not always a formula for success .... They were all good people, and they all wanted good things. They just didn’t know anything about running restaurants." ([1])

  • "The once-hot social network was built on the idea that people would enjoy having anonymous conversations with people close by. That’s a fantastic concept until you remember that anonymous internet person and by definition near you are scary as hell in practice." ([1])

  • Great teardown of the Juicero, includes some excellent business advice on iterative development and testing your ideas on real customers ([1] [2])

  • "When the US government discovers a vulnerability ... it can keep it secret and use it offensively ... or it can alert the software vendor and see that the vulnerability is patched, protecting the country ... Every offensive weapon is a (potential) chink in our defense." ([1])

  • On spear phishing attacks: "By a careful design and timing of a message, it should be possible to make virtually any person click" ([1] [2])

  • Schneier on forging voices: "I don't think we're ready for this. We use people's voices to authenticate them all the time, in all sorts of different ways." ([1])

  • Facebook says, "We have had to expand our security focus... to include more subtle and insidious forms of misuse, including attempts to manipulate civic discourse and deceive people" ([1] [2])

  • Remarkable and concerning that this is possible: "By accessing accelerometer and gyroscope sensors, the Web-hosted JavaScript measures subtle changes in a phone's angle, rotation, movement speed, and similar characteristics. The data, in turn, can reveal sensitive information about the phone and its user ... [including] the keystrokes being entered" ([1])

  • Nice high level description here of the difference between what Apple and Google are doing for privacy-preserving machine learning. In brief, Apple adds noise to the data to preserve privacy, while Google learns on the device and then sends the model updates back (much like parameter servers in deep learning). The truth is they're probably both doing both, but it's still a good thing to think about. ([1])

  • Using battery backup to optimize gas power plants by skipping the expensive part of running gas turbines, which is sitting in standby because of lengthy startup times. It's easy and practical, a nice example of low hanging fruit with major impact. ([1])

  • Good data on the projected costs of energy sources ([1])

  • Good data on the newspaper industry. There's a curious spike in ad revenue from 1980-2000 that isn't matched by subscriptions. ([1])

  • Jeff Bezos is making journalism profitable: "The Post has said that it was profitable last year — and not through cost-cutting ... The Post has gone on a hiring spree. It has hired hundreds of reporters and editors and has more than tripled its technology staff ... third straight year of double-digit revenue growth ... 'You have to be great at technology. You have to be great at monetization. But one thing I think we’re proving is that if you are, great journalism can be profitable.'" ([1])

  • How Google took over the classroom, great article, but misses that the failure of iPads was a big piece of this ([1] [2])

  • Duolingo's excellent efforts to help people learn English, which can be a tool for economic or educational advancement ([1])

  • Amazon Web Services cuts prices again, remarkable ([1])

  • Almost all cloud workloads right now are not cloud optimized: customers mostly moved a system built for fixed hardware resources to the cloud and then let it run idle a lot rather than redesigning it to take advantage of dynamic scaling ([1])

  • Latest version of Google Earth is impressive, definitely worth trying ([1])

  • Brent Smith and I received the first ever IEEE Internet Computing Test of Time award for our 2003 paper on Amazon's recommender system. In a new article for the IEEE Internet Computing 20th Anniversary Issue, we look back at the last two decades. ([1])

  • A virtual reality game that succeeds by taking into account both what VR can do well and what it can't to create a fully immersive experience ([1])

  • Somehow, I missed that Chris Sacca is retiring. Amazing career and influence he had, and impressive to decide to go in an entirely new direction now. ([1])

  • In a Stack Overflow survey of what software engineers care about, who they work with, what they are doing, and what they learn matter far more than salary. In the top five items, three are about who you work with and what you learn, one is benefits, and one is commute. But the benefits are complicated -- it's not salary, stock, and bonus -- the top items are all things related to work environment and commute, vacation, and health care. ([1])

  • Great interview with the CEO of Coursera: "Humility and the ability to listen well are the big things I look for ... If you want to understand people, you need to hear them ... [Also have] ambitious goals to lift the organization up and everybody with it. Setting goals that are ambitious but also achievable is an important skill." ([1])

  • Great quote from Jeff: "At Amazon, we've had a lot of inventions that we were very excited about, and customers didn't care at all. And believe me, those inventions were not disruptive in any way. The only thing that's disruptive is customer adoption." ([1])

  • Nice line in Dan Ariely's book Payoff: "If you really want to demotivate people, shredding their work is the way to go, but ... you can get almost all the way there simply by ignoring their efforts." ([1])

  • Xkcd points out minor changes in methodology yield radical changes in data visualizations of most unusually popular activity in a location ([1])

  • Xkcd on machine learning, disturbingly close to reality ([1])

  • Xkcd on hard problems ([1])

  • Xkcd on survivorship bias ([1])

  • Xkcd on unhelpful code reviews ([1])

  • Very funny that Burger King ran an ad with "OK, Google" and it works. Once again Xkcd was hilariously prescient about this. ([1] [2])

  • SMBC comic on Bayesian inference: "Given his low priors..." ([1])

  • SMBC comic: "Then it occurred to me, hey, I've got like a sample size of one here, and it's not double blind." ([1])

  • SMBC comic on behavioral economics ([1])

  • SMBC comic: "Wait, are you going to turn my life's work into a joke about butts or something?" ([1])

Sunday, April 30, 2017

All Crunchzilla tutorials now open source

All the code is now available for all the Crunchzilla coding tutorials.

Code Monster, Code Maven, and Game Maven from Crunchzilla have been used by hundreds of thousands of people around the world to experiment with learning to write computer programs.

There have been many requests to make them available in languages other than English.

By open sourcing the Crunchzilla tutorials, I hope three things might happen:

  • Translations: I hope others are able to take the content and translate part or all of it into languages other than English for use in more classrooms around the world.

  • New lessons: New tutorials might teach programming games, working through puzzles or math problems, or perhaps a more traditional computer science curriculum aligned with a particular lesson plan.

  • Entirely new tutorials: Some of the ideas and techniques -- including the step-by-step learn-by-doing style, live code, informative error messages, and avoiding infinite loops in students' code -- might be useful for others.

The code was designed to be all static, so you can easily create your own version just by editing the files and then putting all the files together on your own server. There is a single JSON file that contains all the lesson content.

If you use the code for anything that helps children learn, I'd love to hear about it (please e-mail me at greg@crunchzilla.com).

Sunday, April 02, 2017

Quick links

A carefully picked list of some of the tech news I enjoyed recently:
  • So, you know that prototype we showed you? Turns out AI in real world conditions is hard. ([1] [2] [3])

  • Artificial intelligence expert Yann LeCun says, "There have been, on the face of it, impressive demonstrations, [but] those are not as impressive as they look ... They don't have common sense ... One of the things we really want to do is get machines to acquire the very large number of facts that represent the constraints of the real world just by observing it through video or other channels. That’s what would allow them to acquire common sense, in the end." ([1])

  • Genetic algorithms and neural networks are back. It feels like the 1990s all over again. ([1])

  • Bringing more novices to AI now is the way to get more experts and advances later ([1])

  • Nice results from focusing on errors that matter to people, the perceived quality of the system by humans, not theoretical accuracy ([1] [2])

  • Success often comes from trying many things: "Start ... with a hazy intuition or vision ... After a lot of trial and error they get closer and closer to discovering what their idea is ... Seeking novelty instead of objectives is risky — not every interesting thread will pay off — but ... the potential payoffs are higher." ([1])

  • Research includes people able to do things no one else can, including having data or compute at the frontier beyond what anyone else has done before ([1] [2])

  • 6.3M virtual reality headsets sold in 2016, but almost all so far just the cheap toys where you slot your smartphone in to use as the screen ([1] [2])

  • "Total [tablet] sales sinking 15.6%, year on year, with sales of 174.8M units in 2016 compared to 2015's 207.2M" ([1])

  • For the first time, more people in the US using Netflix than a DVR: "54 percent of US adults reporting they have Netflix in their households compared to the 53 percent of US adults that have DVR" ([1])

  • The Economist: "Amazon’s heady valuation resembles a self-fulfilling prophecy. The company will be able to keep spending, and its spending will keep making it more powerful" ([1])

  • "What has surprised AWS as the cloud has evolved ... I don’t think in our wildest dreams we ever thought we’d have a six- to seven-year head start" ([1])

  • ... and that is true in retail for Amazon as well ([1] [2] [3])

  • "Yahoo is perhaps most famous for destroying all of its best social properties. From its hideous deformations of Flickr and neglect of Upcoming to its starvation of Delicious and torment of GeoCities users, the company excelled at buying great things and turning them into unusable parodies of themselves. Execs seemed to profoundly misunderstand why people used the sites they bought." ([1])

  • "Google will account for 78 percent of search ad revenue in 2017, while Facebook will get 39 percent of display ad revenue. Everyone else ... is fighting over the scraps." ([1])

  • Culture is created by what you publicly reward, not what you say ([1] [2] [3])

  • "The problem with bad processes is that they institutionalize inefficiency. They ensure that things will be done the wrong way, over and over and over again" ([1] [2])

  • "Burnout begins when a worker feels overwhelmed for a sustained period of time, then apathetic and ultimately numb .... Workers who used to take the lead on projects grow taciturn during meetings. Top performers start coming in late, leaving early and watch their careers stall ... Burnout is claiming victims at work, and companies aren’t ready to cope" ([1])

  • A lot of companies have merely medium data, not big data: "Hundreds of enterprises were hugely disappointed by their useless 2 to 10TB Hadoop clusters ... Their data works better in other technologies" ([1])

  • Lack of incentives leads to poor Internet of Things security ([1])

  • As JavaScript ages, it repeats many of the problems of the past: "Using data from over 133K websites, we show that 37% of them include at least one library with a known vulnerability" ([1])

  • "What are some things you wish you knew when you started programming?" ([1] [2])

  • Many Xkcd comics are both funny and prescient, and this one on encryption seems particularly relevant right now ([1])

  • Xkcd comic on friends that have an Amazon Echo ([1])

  • SMBC comic on "existential sort". Don't miss the hovertext: "Also, any list can be immediately sorted by just pretty much being fine with it the way it is." ([1])

Saturday, April 01, 2017

Book review: Radical Candor

Kim Scott's book Radical Candor just came out. It's a good read on managing, focused on people. I'd recommend it if you are a manager or help others manage people.

I'd summarize it by saying it takes a teaching and mentoring approach to management, very much of the school that managers primarily exist to help the people on their team. The advice is both practical and actionable, with specific advice for running 1:1s and meetings, and it focuses on how to encourage conversations where people strive to improve themselves as well as helping others.

Some carefully selected quotes from the book:

"It seems obvious that good bosses must care personally about the people who report directly to them ... And yet ... "

"It turns out that when people trust you and believe you care about them, they are much more likely to accept and act on your praise and criticism, tell you what they really think about what you are doing well and, more importantly, not doing so well, engage in this same behavior with one another ... embrace their role on the team, and focus on getting results"

"When you're the boss, it's awkward to ask your direct reports to tell you frankly what they think of your performance, even more awkward for them than it is for you. To help, I [ask] ... 'Is there anything I could do or stop doing that would make it easier to work with me?' ... It is essential that you ... commit to sticking with the conversation until you have a genuine response. One technique is to count to six before saying anything else, forcing them to endure the silence. The goal is not to bully but to insist on a candid discussion ... Then listen with the intent to understand ... Once you've asked your question and embraced the discomfort and understood the criticism, you have to follow up by showing that you welcome it. You have to reward the candor if you want to get more of it ... Make a change as soon as possible ... show you're trying."

"If you can absorb the blows, the members of your team are more likely to be good bosses to their employees when they have them ... The rewards of watching people you care about flourish and then help others flourish."

"The ultimate goal of Radical Candor is to achieve results collaboratively that you could never achieve individually ... A culture of guidance ... An exemplary team ... self-correcting quality whereby most problems are solved before you are even aware of them ... Don't start by bossing people. They'll just hate you. Start by listening to them."

Sunday, February 26, 2017

More quick links

Some of the tech news I found interesting lately, and you might too:
  • "In addition to making our systems more intelligent, we have to make them more intelligible too ... AI systems to augment human capabilities ... A human-centered approach is more important than ever." ([1])

  • "Understanding the brain is a fascinating problem but ... separate from the goal of AI which is solving problems ... We don’t need to duplicate humans ... We want humans and machines to partner and do something that they cannot do on their own." ([1])

  • "Machine learning and reasoning to help doctors to understand patient outcomes -- in advance of poor outcomes ... a great deal of low-hanging fruit where even today’s AI technologies are well positioned to help ... error detection, alerting, and decision support ... could save hundreds of thousands of lives per year" ([1] [2])

  • "Google's first entirely on-device ML technology ... machine intelligence ... run on your personal phone or smartwatch" ([1])

  • Accelerometers and heart rate monitors in earbuds, clever and avoids the need for a separate wearable ([1])

  • On Google's business: "Mobile search and YouTube were the main drivers of Google’s strong performance ... Google’s market share ... is above 90 percent on mobile devices" ([1] [2] [3])

  • "AI is the next platform for Facebook right now. The company is quietly approaching this initiative with the same urgency as its previous Web-to-mobile pivot." ([1])

  • "Microsoft formed a new 5,000-person engineering and research team to focus on artificial intelligence products" ([1])

  • Qi Lu leaves Microsoft for Baidu, and Jan Pedersen leaves Microsoft for Twitter. ([1] [2])

  • Not sure how well known this is: "Facebook collects information about pages [you] visit that contain Facebook sharing buttons ... And in case that wasn’t enough, Facebook also buys data about its users’ mortgages, car ownership and shopping habits from some of the biggest commercial data brokers. Facebook uses all this data to offer marketers a chance to target ads to increasingly specific groups of people. Indeed, we found Facebook offers advertisers more than 1,300 categories for ad targeting — everything from people whose property size is less than .26 acres to households with exactly seven credit cards." ([1])

  • Interesting example for the news industry: "Doubling down on traditional journalism and investing heavily in new ways to deliver it, through smartphone apps, voice-activated speakers and e-readers. The Post’s digital effort has become the envy of the industry, with as many as 80 software engineers, developers and others working alongside reporters and editors to present the news in real time." ([1])

  • "Bezos has worked to create a culture at Amazon that’s hospitable to experimentation ... developing products customers will actually want to pay for ... experiments start small and grow over time ... a small team to experiment with the idea and find out if it’s viable ... if a team succeeds in smaller challenges, it’s given more resources and a larger challenge to tackle ... prioritize launching early over everything else ... learn as quickly as possible whether an idea that sounds good on paper is actually a good idea in the real world ... getting a product into the hands of paying customers as quickly as possible and taking their feedback seriously ... avoids wasting years working on products that don’t serve the needs of real customers." ([1])

  • New direction for the cloud, just small pieces of code running somewhere (you don't care where) and data stored somewhere (you don't care where), all auto scaled ([1] [2])

  • "Many failed ideas have been resuscitated and rebranded as successful products and services, owned and managed by people other than their originators. Behind almost every popular app or website today lie numerous shadow versions that have been sloughed away by time. Yet recognition of the group nature of the enterprise would undermine a myth that legitimizes the consolidation of profit, for the most part, among a small group of people." ([1])

  • For those of us tracking virtual reality: "While Facebook does not provide sales figures for the $599 Oculus Rift headset ... analysts believe they are slow. One research firm ... estimated the company sold only about 355,000 by the end of last year." ([1] [2] [3])

  • A surprising level of detail here on what software development is like inside of Google. I agree with most of it, and highly recommend reading at least Section 2. ([1] [2])

  • Great blog post summarizing NIPS 2016. Highlights are what wins Kaggle competitions, why deep learning works, latest twiddles to deep learning and reinforcement learning, why dialogs (chat) still don't work, and that Baidu has products whose only value is in the data they collect (not direct revenue, just the explore part of explore/exploit, learning how to be more effective). ([1])

  • Ease of use is badly underrated: "Using TensorFlow makes me feel like I’m not smart enough to use TensorFlow; whereas using Keras makes me feel like neural networks are easier than I realized." ([1])

  • New paper by Geoff Hinton and Jeff Dean, essentially a very large ensemble of neural networks with sparsity enforced to minimize the computational cost ([1])

  • Thoughtful comments on engineering management ([1])

  • Different people we work with in tech tend to have different ideas of what it means to get things done ([1])

  • "People with different backgrounds bring new information. Simply interacting with individuals who are different forces group members to prepare better, to anticipate alternative viewpoints and to expect that reaching consensus will take effort." ([1])

  • Meetings are expensive -- a 10 person meeting for an hour costs a few thousand dollars -- and people hate meetings too. Some good recurring themes here are to keep meetings small and short, write a tight agenda ahead of time, stay off your laptop and phone, and try to finish early. ([1])

  • Disappointing game theory tidbit of the day, the Joy of Destruction game shows people enjoy causing harm when they can do it without consequences ([1] [2])

  • Great data visualizations from 538, not just eye candy but convey information quickly ([1])

  • "Tesla has 1.3 billion miles of car-driving data thanks to its Autopilot-equipped vehicles that are already on the road before competitors in Detroit and Silicon Valley can roll self-driving cars off the lot. It’s a massive competitive advantage." ([1])

  • Fun details on laying undersea internet cables from Amazon Web Services Distinguished Engineer James Hamilton ([1])

  • "All future wars will begin as cyberwars" ([1])

  • Impressive plans from China's space program, probes on the far side of the moon and on Mars in the next four years ([1])

  • For those interested in education, MIT's popular and excellent Scratch has published a dataset of how people learn computational thinking ([1])

  • What Code.org has achieved is very impressive: "Trained 50,000 new K-12 computer science teachers ... More than 20 million lines of code have been written by ... more than one million K-12 students ... we expect to dramatically change the demographics of AP Computer Science this year" ([1])

  • Funny article from The Onion on having too many browser tabs open ([1])

  • SMBC comic on the universe as A/B testing ([1])

  • SMBC comic on behavioral economics and anchoring ([1])

  • SMBC comic: "The wise man was put to death in the most mathematically insulting way possible" ([1])

  • Xkcd comic on what phones are, random emotional stimuli to replace boredom with anxiety ([1])

  • Xkcd comic on being an overoptimizer ([1])