I have been absent in sending these the past few weeks and am changing topics, but bear with me, as I feel I’ve been remiss of late in not tackling the slew of machine learning articles that have been published. As well, given that AI and machine learning are among Gartner’s Top Ten for 2017, it seems time to revisit. First, though, what is machine learning?
If you remember back, I did a dive into artificial intelligence last spring, mostly because I got overly curious myself. Machine learning and artificial intelligence go hand in hand, and in fact machine learning is a subset of AI. It provides computers with the ability to learn without being explicitly programmed and focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data.
The process of machine learning is similar to that of data mining; both systems search through data to look for patterns. However, instead of extracting data for human comprehension, as in data mining applications, machine learning uses the data to detect patterns and adjust program actions accordingly. That said, there are some who argue that machine learning is not a subset of AI, but the only kind of AI there is. Perhaps we’ll evolve to that point, but there are still enough examples of artificial intelligence out there that don’t leverage machine learning that, in my mind, it’s still meaningful to distinguish between the two.
So why has machine learning gained so much momentum in the past few years? Two factors stand out: data availability and computational power.
Today, the amount of digital data being generated is enormous, thanks to smart devices and the Internet of Things (see previous posts). That data can be analyzed to make intelligent decisions based on patterns, and machine learning helps do exactly that. As well, Moore’s law has ensured that current hardware can reliably store and analyze massive data sets and perform a massive number of computations in a reasonable amount of time. This allows us to build complex machine learning models with billions of parameters, something that wasn’t possible a decade ago.
Machine learning evolved from the study of pattern recognition and explores the study and construction of algorithms that can learn from and make predictions on data; such algorithms move beyond strictly static program instructions by making data-driven predictions or decisions. It is a method of data analysis that automates analytical model building, allowing computers to find hidden insights without being explicitly programmed where to look. And it is everywhere. For example:
Financial services: Banks and other businesses in the financial industry use machine learning technology for two key purposes: to identify important insights in data and to prevent fraud.
Government: Government agencies such as public safety and utilities have a particular need for machine learning since they have multiple sources of data that can be mined for insights. Analyzing sensor data, for example, identifies ways to increase efficiency and save money. Machine learning can also help detect fraud and minimize identity theft.
Health care: Machine learning is a fast-growing trend in the health care industry, thanks to wearable devices and sensors that can use data to assess a patient’s health in real time.
Marketing and sales: Websites recommending items you might like based on previous purchases are using machine learning to analyze your buying history and promote other items you’d be interested in. This is also where you most often experience machine learning in an obvious way: Amazon’s predictive engine is one of the best examples out there of how machine learning can enhance a consumer’s experience, with decades of data from millions of users to pull from.
Oil and gas: Finding new energy sources, analyzing minerals in the ground, predicting refinery sensor failure, or streamlining oil distribution to make it more efficient and cost-effective; the number of machine learning use cases for this industry is vast – and still expanding.
Transportation: Analyzing data to identify patterns and trends is key to the transportation industry, which relies on making routes more efficient and predicting potential problems to increase profitability. The data analysis and modeling aspects of machine learning are important tools for delivery companies, public transportation and other transportation organizations. As well, with autonomous vehicles coming more into the mainstream (you can get an autonomous Uber in San Francisco now), machine learning will have an even greater impact. Let’s just hope we don’t end up with Johnny Cabs.
So, those are the obvious areas, and perhaps many of you are rolling your eyes at this point wondering when I’ll get to the good stuff. If you can bear with me a little longer, we’ll get to the articles I mentioned … or you can always scroll to the end.
Some of the base-level requirements for creating a good machine learning system include data preparation capabilities, the quality of algorithms, basic and advanced (duh), automation and iterative processes, scalability, and ensemble modeling. Wait, what’s that last one? It’s the process of running two or more related but different analytical models and then synthesizing the results into a single score or spread in order to improve the accuracy of predictive analytics and data mining applications. OK, so let’s clarify a few other terms that’ll help later on: in machine learning, a target is called a label, while in statistics a target is called a dependent variable. Right. Great, not too confusing. But wait: a variable in statistics is called a feature in machine learning, and a transformation in statistics is called feature creation in machine learning. Oh, and for those of you who’ve not had statistics in a while or never had it (or when you took it, it was a bit of a joke in your grad program), the Khan Academy is a great place to go get an overview, or you can try Stanford. And that doesn’t even touch on fit, overfitting, and underfitting as they relate to machine learning. There’s also a great visualization of machine learning from R2D3.
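To make ensemble modeling a bit more concrete, here’s a minimal sketch in Python. The two toy “models” and the blending weights are entirely hypothetical, but the shape of the idea is the one described above: run two related but different models and synthesize their outputs into a single score.

```python
def model_a(x):
    # Toy model 1: a simple linear scorer (hypothetical).
    return 0.8 * x

def model_b(x):
    # Toy model 2: a threshold-based scorer (hypothetical).
    return 1.0 if x > 0.5 else 0.0

def ensemble_score(x, weights=(0.6, 0.4)):
    """Blend two related-but-different models into one score."""
    wa, wb = weights
    return wa * model_a(x) + wb * model_b(x)

print(ensemble_score(0.9))  # a single blended score from both models
```

In practice the component models would be trained predictors and the weights tuned on held-out data, but the synthesis step is just this: combine several imperfect opinions into one, usually more accurate, answer.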
There are a variety of machine learning methods out there that are employed today: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.
Supervised learning algorithms are trained using labeled examples, that is, inputs where the desired output is known. For example, a piece of equipment could have data points labeled either “F” (failed) or “R” (runs). The learning algorithm receives a set of inputs along with the corresponding correct outputs, and it learns by comparing its actual output with the correct outputs to find errors. It then modifies the model accordingly. Through methods like classification, regression, prediction and gradient boosting, supervised learning uses patterns to predict the values of the label on additional unlabeled data. Supervised learning is commonly used in applications where historical data predicts likely future events. For example, it can anticipate when credit card transactions are likely to be fraudulent or which insurance customer is likely to file a claim.
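Here’s a toy version of that failed/runs equipment example. I’ve picked a 1-nearest-neighbor classifier purely for brevity (it’s one of the simplest supervised methods, not one the paragraph prescribes), and the temperature readings and labels are made up.

```python
# Labeled training examples: (sensor reading, label), where "F" = failed
# and "R" = runs. Hypothetical data for illustration only.
training_data = [
    (98.0, "F"), (95.5, "F"), (97.2, "F"),   # high readings -> failed
    (71.0, "R"), (69.5, "R"), (73.3, "R"),   # normal readings -> runs
]

def predict(reading):
    """Label a new, unlabeled reading by its closest labeled example."""
    _, label = min(training_data, key=lambda pair: abs(pair[0] - reading))
    return label

print(predict(96.0))  # "F"
print(predict(70.0))  # "R"
```

The essential supervised-learning loop is all here: known input/output pairs go in, and the model uses those patterns to predict labels for data it hasn’t seen.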
Unsupervised learning is used against data that has no historical labels. The system is not told the “right answer.” The algorithm must figure out what is being shown. The goal is to explore the data and find some structure within. Unsupervised learning works well on transactional data. For example, it can identify segments of customers with similar attributes who can then be treated similarly in marketing campaigns. Or it can find the main attributes that separate customer segments from each other. Popular techniques include self-organizing maps, nearest-neighbor mapping, k-means clustering and singular value decomposition. These algorithms are also used to segment text topics, recommend items and identify data outliers.
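Since k-means clustering is on that list of techniques, here’s a minimal from-scratch sketch of it: no labels, no right answers, just structure discovered in the data. The one-dimensional “spend” values and the choice of k=2 are hypothetical.

```python
def kmeans_1d(points, k=2, iterations=10):
    """Naive 1-D k-means: assign points to nearest centroid, recompute."""
    centroids = points[:k]  # naive initialization: first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Unlabeled customer spend values; the algorithm finds the two segments.
spend = [12.0, 14.5, 11.0, 210.0, 205.5, 198.0]
centroids, clusters = kmeans_1d(spend)
print(centroids)  # [12.5, 204.5] for this data
```

The algorithm was never told there are “low spenders” and “high spenders”; it found those segments itself, which is exactly the customer-segmentation use case described above.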
Semi-supervised learning is used for the same applications as supervised learning. But it uses both labeled and unlabeled data for training – typically a small amount of labeled data with a large amount of unlabeled data (because unlabeled data is less expensive and takes less effort to acquire). This type of learning can be used with methods such as classification, regression and prediction. Semi-supervised learning is useful when the cost associated with labeling is too high to allow for a fully labeled training process. Early examples of this include identifying a person’s face on a webcam.
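One common semi-supervised recipe is self-training: a model trained on the small labeled set labels the unlabeled points it’s most confident about, then treats those as labeled too. Here’s a minimal sketch under simplifying assumptions of my own: nearest-neighbor as the base model, “confidence” approximated by distance to the nearest labeled point, and made-up 1-D data.

```python
labeled = [(1.0, "A"), (9.0, "B")]        # small, expensive hand-labeled set
unlabeled = [1.5, 2.0, 8.5, 8.0, 5.2]     # larger, cheap unlabeled set

def nearest_label(x, data):
    """Base model: label x by its closest labeled example."""
    _, label = min(data, key=lambda pair: abs(pair[0] - x))
    return label

# Self-training loop: repeatedly adopt the unlabeled point closest to any
# labeled point (our confidence proxy), label it, and treat it as labeled.
while unlabeled:
    x = min(unlabeled, key=lambda u: min(abs(u - lx) for lx, _ in labeled))
    labeled.append((x, nearest_label(x, labeled)))
    unlabeled.remove(x)

print(sorted(labeled))
```

Starting from just two labels, the labels propagate outward through the unlabeled data, which is the whole appeal when hand-labeling is expensive.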
Reinforcement learning is often used for robotics, gaming and navigation. With reinforcement learning, the algorithm discovers through trial and error which actions yield the greatest rewards. This type of learning has three primary components: the agent (the learner or decision maker), the environment (everything the agent interacts with) and actions (what the agent can do). The objective is for the agent to choose actions that maximize the expected reward over a given amount of time. The agent will reach the goal much faster by following a good policy. So the goal in reinforcement learning is to learn the best policy.
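A toy sketch of those three components. The environment’s rewards are hypothetical, and I’ve simplified the exploration policy to a fixed try-everything phase followed by greedy exploitation (a stand-in for the epsilon-greedy schemes real agents use), so the run is deterministic.

```python
rewards = {"left": 1.0, "right": 5.0}  # the environment (hidden from the agent)
values = {"left": 0.0, "right": 0.0}   # the agent's learned value estimates
actions = list(rewards)                # what the agent can do

for step in range(100):
    if step < 10:
        action = actions[step % len(actions)]   # explore: try each action in turn
    else:
        action = max(values, key=values.get)    # exploit: best-known action
    reward = rewards[action]                    # the environment responds
    # Nudge the estimate toward the observed reward (learning rate 0.1).
    values[action] += 0.1 * (reward - values[action])

best = max(values, key=values.get)
print(best)  # the learned policy settles on "right"
```

Through nothing but trial, error, and reward, the agent’s estimates converge on the action that maximizes expected reward, which is the “good policy” the paragraph describes.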
But that brings us to the topic of deep learning. Wait, are we talking about a subset of a subset now? Why, yes. Yes we are.
Deep learning combines advances in computing power and special types of neural networks to learn complicated patterns in large amounts of data. Deep learning techniques are currently state of the art for identifying objects in images and words in sounds. Researchers are now looking to apply these successes in pattern recognition to more complex tasks such as automatic language translation, medical diagnoses and numerous other important social and business problems. Algorithmia has a great blog about why Deep Learning matters for a more detailed look, but the long and the short of it is that deep learning trains a computer to perform human-like tasks, such as recognizing speech, identifying images or making predictions. Instead of organizing data to run through predefined equations, deep learning sets up basic parameters about the data and trains the computer to learn on its own by recognizing patterns using many layers of processing.
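That “many layers of processing” idea is easier to see in code. Here’s a forward pass through a tiny two-layer network: each layer applies weighted sums and a nonlinearity to the previous layer’s output. The weights below are fixed, hypothetical values; in a real network they’d be learned from data, and there would be far more layers and units.

```python
def relu(v):
    # The nonlinearity: negative signals are zeroed out.
    return [max(0.0, x) for x in v]

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sums plus bias, then ReLU."""
    return relu([
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ])

x = [1.0, 0.5]                                        # raw input signal
h = layer(x, [[0.6, -0.4], [0.3, 0.8]], [0.0, 0.1])   # hidden layer
y = layer(h, [[1.0, -1.0]], [0.0])                    # output layer
print(y)  # [0.0]
```

Each layer transforms the previous layer’s representation, so stacking them lets the network build up increasingly abstract patterns, which is the core of how deep networks recognize images and speech.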
So what are some of the emerging trends in machine learning? Three came out of the Machine Learning / Artificial Intelligence Summit last summer in Seattle: data flywheels, the algorithm economy, and cloud-hosted intelligence.
Digital data and cloud storage follow Moore’s law: the world’s data doubles every two years, while the cost of storing that data declines at roughly the same rate. This abundance of data enables richer features and better machine learning models to be created. In the world of intelligent applications, data will be king, and the services that can generate the highest-quality data will have an unfair advantage from their data flywheel: more data leading to better models, leading to a better user experience, leading to more users, leading to more data. Feel free to review the flywheel concept, but it’s an apt analogy here.
Next, all the data in the world isn’t very useful if you can’t leverage it. Algorithms are how you efficiently scale the manual management of business processes. This creates an algorithm economy, where algorithm marketplaces function as the global meeting place for researchers, engineers, and organizations to create, share, and remix algorithmic intelligence at scale. As composable building blocks, algorithms can be stacked together to manipulate data and extract key insights. In the algorithm economy, state-of-the-art research is turned into functional, running code and made available for others to use. The intelligent app stack illustrates the abstraction layers, which form the building blocks needed to create intelligent apps.
Last is cloud-hosted intelligence. For a company to discover insights about its business, the only scalable approach is to use algorithmic machine intelligence to iteratively learn from its data. Historically, that’s been an expensive upfront investment with no guarantee of a significant return. However, with more data becoming available and the cost to store it dropping, machine learning is starting to move to the cloud, where a scalable web service is an API call away. Data scientists will no longer need to manage infrastructure or implement custom code. The systems will scale for them, generating new models on the fly and delivering faster, more accurate results.
OK, I think that’s enough of a dive for our purposes today. Now to the news: first is a set of articles from Harvard Business Review that explore current trends. These include the simple economics of machine intelligence, where HBR explores how machine learning will impact prediction in the production of goods and services and how it will impact the value of other inputs; then there’s what every manager should know about machine learning; and, last, how to make your company machine learning ready.
Alex Hern, from The Guardian, went so far as to give machine learning a go himself last summer. You can read about his experience here. Before we move on to other, still tech-related news, I thought a few of you might be interested in how machine learning is impacting healthcare. Forbes, HuffPost, VentureRadar, and MedCity News have their own takes on that topic. Oh, and The Atlantic has a great article on searching for lost knowledge in the age of machines, while Medium explores the public policy implications of AI and the New York Times Magazine has a long piece on “The Great AI Awakening.”
I included Magic Leap in the roundup a few weeks back, and already the luster is beginning to dull, as The Verge notes that they are “way behind” in their VR device development. The concept is stunning, but the execution may be years away and is far behind Microsoft’s HoloLens. Speaking of Microsoft, my former employer has rounded up 17 predictions from 17 researchers for 2017 and 2027. Fast Company also recently published a brief history of the most important economic theory in tech, based on an HBR article by W. Brian Arthur back in 1996. One of the best quotes from the interview is when Arthur was asked what kind of CEO can best take advantage of increasing returns that might exist: “You need an awareness of the ecology you are in. If you think of different firms and products as being different species, then you have to be very aware of how that entire network of different companies operates, even if they are quite peripheral to you.”
It seems fitting this week to revisit a TED talk by a buddy of mine from a few months ago: how computers are learning to be creative. Blaise Aguera y Arcas is one of the more brilliant guys I’ve ever worked with, and to boot he is a renaissance musician and an avid evangelist for advancing technology. In this talk, he discusses his work with deep neural networks for machine perception and distributed learning and shows how neural nets trained to recognize images can be run in reverse, to generate them.