Defining Machine Learning, Predictive Analytics, and Data Science — Without the Mumbo Jumbo
by John Bates
posted on 03-16-2016
If you walk into any conference room in the world, it is likely you will hear people talking about “opening the kimono” in a “play for transparency” so they can “dig deep into Big Data.” You may walk away thinking that what you just heard did not make any sense. Honestly, it is entirely possible that what you heard did not actually make any sense. Often, people oversimplify things, use too many buzzwords, or simply do not understand the material well enough to explain it without using a whole lot of mumbo jumbo. That is why we want to level set and explain the difference between data science, machine learning, and predictive analytics in terms that anyone can understand.
Let us start at the beginning with data science. Data science is the broad umbrella under which all other types of analysis fit. Data cleansing, data manipulation, and choosing the right type of analysis are all key pieces of the data-science foundation. Unless this groundwork is done well, the results of your analysis cannot be trusted to be accurate.
Communication and domain knowledge are crucial traits in data scientists. Data scientists need to understand how the underlying data is created, what the business objectives and goals are, how the data and the predictions will be applied, and how to interpret the data through data storytelling and engaging visualizations. This is why it is important not just to hire a data scientist but to hire the data scientist who is right for your brand.
At its core, predictive analytics uses statistics either to estimate what behavior a customer is likely to exhibit or to forecast future outcomes of the business. You may hear people say that predictive analytics is always probabilistic in nature, because it tells us the probability of something happening. Predictive analytics helps us understand what is likely to happen in the future based on what has happened in the past. While this may seem like some type of new-age voodoo, predictive analytics has been used for years. For instance, your credit score is calculated using predictive analytics. Based on a predetermined model that includes data about how you have behaved in the past, your credit score predicts how creditworthy you are likely to be in the future.
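To make the credit score example a little more concrete, here is a minimal sketch in Python of how a model can turn past behavior into a probability. Everything in it is an assumption made for illustration: the two inputs (`late_payments` and `years_of_history`), the hand-picked weights, and the function name `repayment_probability` are hypothetical, and this is not how any real credit bureau calculates a score.

```python
import math

# Hypothetical weights, picked by hand for illustration only.
# A real model would estimate them from historical data.
WEIGHTS = {"late_payments": -0.8, "years_of_history": 0.3}
BIAS = 0.5


def repayment_probability(late_payments, years_of_history):
    """Return an estimated probability (between 0 and 1) of repaying on time."""
    # Combine the past-behavior inputs into a single raw score.
    raw_score = (
        BIAS
        + WEIGHTS["late_payments"] * late_payments
        + WEIGHTS["years_of_history"] * years_of_history
    )
    # The logistic function squeezes any raw score into the 0-to-1 range,
    # so the result can be read as a probability.
    return 1.0 / (1.0 + math.exp(-raw_score))


if __name__ == "__main__":
    # A long, clean history versus a short history with several late payments.
    print(repayment_probability(late_payments=0, years_of_history=10))
    print(repayment_probability(late_payments=5, years_of_history=1))
```

The point of the sketch is the shape of the idea rather than the numbers: past behavior goes in, and what comes out is a probability, not a certainty.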
When you are dealing with analytics that have to process truly Big Data (think terabytes or petabytes of data), it is unreasonable to expect a human, or a technology that cannot run in parallel and scale out, to do it. Not even if you have the best data scientist in all the land. This is where it makes sense to bring in machines to help process and analyze mass quantities of data.
From there, these tools can also address the “learning” piece of machine learning. Once you have predicted what you think is going to happen, and you begin to receive feedback on what actually did happen, your model can update itself and become even better at predicting which customers might take which actions. This means that predictive analytics, when it is dynamic, drives machine learning, so that your model keeps becoming more accurate as new feedback arrives.
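The feedback loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `OnlineModel`, the single made-up input (whether a customer opened the last email), and the learning rate are all hypothetical, and the update rule shown (one small gradient step of logistic regression per observed outcome) is just one common way to let a model learn from feedback.

```python
import math


class OnlineModel:
    """A tiny model that adjusts itself every time it sees a real outcome."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        """Return the estimated probability that the customer takes the action."""
        raw_score = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-raw_score))

    def update(self, features, outcome):
        """Nudge the model toward what actually happened (outcome is 0 or 1)."""
        error = outcome - self.predict(features)
        self.bias += self.learning_rate * error
        self.weights = [
            w + self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]


if __name__ == "__main__":
    model = OnlineModel(n_features=1)
    print("before feedback:", model.predict([1.0]))

    # Made-up feedback: customers who opened the last email (1.0) bought,
    # customers who did not (0.0) did not buy.
    for _ in range(300):
        model.update([1.0], outcome=1)
        model.update([0.0], outcome=0)

    print("opened email:", model.predict([1.0]))
    print("did not open:", model.predict([0.0]))
```

Before any feedback the model has no opinion and predicts 50/50 for everyone. Each observed outcome moves its predictions a little closer to what actually happened, which is the "learning" in machine learning.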
These analytics tools are used every day in ways that can help your brand better understand and connect with your target audience. To do that, though, it is important for everyone in the room to understand exactly what you are talking about when you discuss various analysis techniques. Having everyone on the same page can save tons of time, money, and headaches.