In the simplest terms, Data Science is the transformation of data into actionable insights. Raw data, or data that has not been processed for use,¹ must be processed and transformed to produce information. That information is then analyzed to develop knowledge. The end goal is for the knowledge to drive action.
The emerging field of data science sits at the junction of social science and statistics, information and computer science, and design. Many of the tech-centric breakthroughs we consider commonplace today began as little more than a heap of unstructured data.
That unstructured data is run through an algorithm.
An algorithm is a set of steps a computer goes through to accomplish a task. Algorithms help you analyze huge amounts of data or help you select intelligently from a vast number of possible decisions. An algorithm must solve a problem and do it efficiently.²
Did you know?
- Amazon has an algorithm to recommend items for you to buy
- Netflix has an algorithm to recommend movies you will enjoy
- Spotify has an algorithm to recommend your favorite music
- Gmail has an algorithm to determine if an email message is junk or not
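To make "a set of steps" concrete, here is a minimal sketch of a recommendation algorithm. It is not how Amazon, Netflix, or Spotify actually work; it only assumes that shoppers with overlapping purchase histories tend to want similar items. The shopper names, the items, and the `recommend` function are all hypothetical.

```python
from collections import Counter

# Hypothetical purchase histories; names and items are invented for illustration.
purchases = {
    "ana": {"tent", "stove", "lantern"},
    "ben": {"tent", "stove", "boots"},
    "chloe": {"tent", "lantern"},
    "dev": {"novel", "bookmark"},
}


def recommend(user, histories, top_n=2):
    """Suggest items the user lacks, weighted by overlap with other shoppers."""
    owned = histories[user]
    scores = Counter()
    for other, items in histories.items():
        if other == user:
            continue
        # Step 1: measure how similar this shopper is to the user.
        overlap = len(owned & items)
        # Step 2: ignore shoppers with nothing in common.
        if overlap == 0:
            continue
        # Step 3: credit every item the user does not already own.
        for item in items - owned:
            scores[item] += overlap
    # Step 4: rank by score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


print(recommend("chloe", purchases))  # prints ['stove', 'boots']
```

Each numbered comment is one step of the algorithm. A production recommender would use far more data and a more careful similarity measure, but the shape is the same: data in, a ranked decision out.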
Traditional Data Analytics examines raw data with the purpose of drawing conclusions about that information. What differentiates data science from traditional data analytics is the focus on future outcomes, decision making, and the reduction of human intervention in the process. Traditional analytics describes what happened in the past and diagnoses why it happened. Data Science extends this process, analyzing data to predict what will happen next and to prescribe what action should be taken.
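The difference between describing, predicting, and prescribing can be sketched in a few lines. The example below assumes a made-up series of monthly sales and an assumed inventory level. It uses a straight-line least-squares trend as the predictive step, which is only one of many possible forecasting techniques.

```python
from statistics import mean

# Hypothetical monthly unit sales; the figures are invented for illustration.
months = [1, 2, 3, 4, 5, 6]
sales = [100, 110, 120, 130, 140, 150]

# Descriptive: summarize what already happened.
average_sales = mean(sales)

# Predictive: fit a straight-line trend by ordinary least squares.
x_bar, y_bar = mean(months), mean(sales)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(months, sales)) / sum(
    (x - x_bar) ** 2 for x in months
)
intercept = y_bar - slope * x_bar
forecast = intercept + slope * 7  # projected sales for month 7

# Prescriptive: turn the forecast into a recommended action.
stock_on_hand = 145  # assumed inventory level
action = "reorder" if forecast > stock_on_hand else "hold"

print(average_sales, forecast, action)
```

The first calculation is what traditional analytics would report. The second and third are where data science adds a forward-looking estimate and a suggested decision.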
Data Science is the intersection of Computer Science, Math and Statistics, and Subject Matter Expertise.
- Machine Learning, a type of artificial intelligence (AI) that allows software applications to become more accurate in predicting outcomes without being explicitly programmed,⁴ is the intersection of Computer Science and Math and Statistics. To use a machine learning model, all you need to know are the input variables and how to interpret the output. Machine learning algorithms are created using various techniques, and programming them is the job of a Data Scientist.
- Traditional Research is the intersection of Math and Statistics, and Subject Matter Expertise. This is the kind of research that is not dependent on any technology.
- Traditional Software is the intersection between Subject Matter Expertise and Computer Science. There are many software vendors in this space attempting to make data analytics easier for the subject matter experts who lack computer science skills.
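As an illustration of "without being explicitly programmed", the sketch below classifies messages by finding the most similar labeled example (a 1-nearest-neighbor classifier) instead of following hand-written rules. The two input variables, the training examples, and the `predict` function are assumptions made for this example. Real junk-mail filters are considerably more sophisticated.

```python
import math

# Hypothetical labeled examples: (links in message, exclamation marks) -> label.
training_data = [
    ((0, 0), "not junk"),
    ((1, 0), "not junk"),
    ((0, 1), "not junk"),
    ((5, 7), "junk"),
    ((6, 4), "junk"),
    ((4, 6), "junk"),
]


def predict(features, examples):
    """Return the label of the training example closest to `features`."""
    _, nearest_label = min(
        examples, key=lambda example: math.dist(features, example[0])
    )
    return nearest_label


print(predict((5, 5), training_data))  # prints junk
print(predict((1, 1), training_data))  # prints not junk
```

No rule such as "more than three links means junk" appears anywhere in the code. The behavior comes entirely from the examples, so adding more labeled messages changes the predictions without changing the program. The caller only needs to supply the input variables and interpret the returned label.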
A person who has mastered all three disciplines is called a Data Scientist. Since very few individuals can realistically master such diverse domains, these individuals are often called Unicorns. The unicorn is a mythical creature that some believe exists but that no one has actually seen.
Although it may once have been possible to know everything, the current breadth and depth of knowledge is such that it is no longer possible. Experts believe that the last person to know everything was born in the mid-nineteenth century (circa 1870). Since then, it is agreed, the body of "known knowledge" has become so large that it is no longer possible for one person to know everything.
Therefore, successful Data Science should be considered a team sport. Teams are formed to address a specific business problem. Building a successful team starts with having the essential skills. Roles don’t always tell the whole story. Focus on the skills required for the project. The ability to identify valuable skills and partner them effectively will also enable more successful data science efforts.