You have probably heard a lot of talk about data science lately. New terms such as machine learning, deep learning, and artificial intelligence have started showing up in your social feeds and in articles, and many of them come attached to big, fresh topics: social media platforms and companies using this technology to improve their profits, or governments depending on it to gather intelligence. You may also have heard people call data science the sexiest job of the 21st century, a description that first saw the light in Harvard Business Review in 2012. Or you may have just stumbled upon a media outlet that presented the term as some new futuristic technology with awesome applications (which is true, to be honest!). But really, what is data science?
What is data science?
To answer this question in a simple way, let’s begin with what you would find on Google. Wikipedia will answer you with the following definition:
Data science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured, similar to data mining.
Dictionary talk as it is, this doesn’t give you much insight, does it?
Well, to put it in a few words, data science is the art of manipulating data and understanding it. You can look at it this way: modern technologies generate so much data about our personal and professional everyday lives that we can extract more information and knowledge from it than you would imagine.
Let’s say some detective (Sherlock!) is visiting your house. The more information you have lying around, the more conclusions our detective will come up with. A grocery receipt left on the kitchen counter and the general cleanliness of the place may tell him about your health and lifestyle. Some of your kids’ stuff in their playroom and the pictures all over the house may lead him to make assumptions about your relatives and friends, and so on. Now imagine this at a larger scale, with machines that can compute and find patterns far faster than humans, and with the huge amount of data available in today’s world of big data.
That’s it, you guessed it! The number of accurate conclusions that can be extracted is enormous.
So what can be done with Data science?
The answer to this question is fairly easy. Think of the ads, pages, publications, and even friends that are suggested to you on Facebook, Instagram, or other social media. Think of the movies suggested on Netflix, the products on Amazon or eBay, the songs on Spotify, or the results of a Google search: they all come from recommendation systems developed by each company. All of these things were made possible by data science applications, and these technologies have been polished so well that it sometimes scares you!
I mean, how many times have you talked with a friend about some topic, or visited some website, and then logged in to your Facebook account, and there it is: an ad for the very thing you were just chatting about or looking at!
As shown above, the applications of data science in the tech industry are almost endless. Since 2014, the "social explosion" year, the data science field has changed a great deal too. Data became widely available, and with cloud solutions, computational power stopped being a problem too.
Consequently, research has kept pouring in ever since, in almost all areas, especially within the computer science field itself.
And even though data science is applied mainly in the tech sector and the computer science field, and is used primarily by data-driven companies, there are plenty of other fields that benefit from it.
Fields of applications of data science
Health care is a major example of that. A lot of research is being done using data science techniques to solve complicated problems, such as diagnosing various forms of cancer and detecting the symptoms of diseases that are common but hard to spot. Agriculture is also a field that profits from advances in data science.
Here, techniques are used to detect which specific plants need irrigation or fertilizing, and which plants are more at risk from pests or weeds.
In the same fashion, more fields have scored a win by using data science, such as transport, industry, and security (which has benefited big time, for example from fraud detection in banks and from spam detection). So you see, it’s not an exaggeration to call it the sexiest job of the 21st century.
Okay cool! So, what is it that a data scientist does exactly?
A data scientist is a curious person who is able to identify relevant questions about a certain problem. First comes the data. Either it is explicitly given, or the data scientist has to fetch it: collect it from a multitude of different sources and organize the information. The resulting collections of data are usually referred to as datasets.
Then, this person has to apply adequate tools and techniques to clean and pre-process the data, meaning taking out empty and irrelevant values.
Then, it’s time to use a range of statistical and mathematical skills to analyze this data and answer some questions. What follows is using those answers to drive strategic decision-making in an organization or to help advance some scientific research. The results should be presented and visualized so that they can be communicated to the concerned parties, which can be either the stakeholders of a company or a scientific community.
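The steps above (get the data, clean it, analyze it) can be sketched in a few lines of Python. This is only a toy illustration, not how any particular company does it: the grocery records, the field names, and the helper functions clean and average_price_by_item are all made up for the example, and it uses nothing beyond Python's standard library.

```python
from statistics import mean

# Hypothetical raw dataset: a few grocery receipt lines (invented for this example).
raw_records = [
    {"customer": "A", "item": "apples", "price": "2.50"},
    {"customer": "A", "item": "chips", "price": ""},       # empty price
    {"customer": "B", "item": "apples", "price": "3.00"},
    {"customer": "B", "item": None, "price": "1.20"},      # missing item name
    {"customer": "C", "item": "chips", "price": "1.50"},
    {"customer": "C", "item": "apples", "price": "n/a"},   # unusable price
    {"customer": "C", "item": "chips", "price": "2.50"},
]


def clean(records):
    """Cleaning step: drop records with a missing item or an unusable price."""
    cleaned = []
    for record in records:
        if not record.get("item"):
            continue  # no item name, nothing to analyze
        try:
            price = float(record["price"])
        except (TypeError, ValueError):
            continue  # empty or non-numeric price
        cleaned.append(
            {"customer": record["customer"], "item": record["item"], "price": price}
        )
    return cleaned


def average_price_by_item(records):
    """Analysis step: answer 'what is the average price paid for each item?'."""
    prices = {}
    for record in records:
        prices.setdefault(record["item"], []).append(record["price"])
    return {item: mean(values) for item, values in prices.items()}


dataset = clean(raw_records)
summary = average_price_by_item(dataset)

# Communication step: present the result in a readable form.
for item, average in sorted(summary.items()):
    print(f"{item}: {average:.2f}")
```

In real projects the same three steps are usually done with dedicated libraries and on far larger datasets, but the shape of the work stays the same.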
The data scientist should also be a great storyteller, stay curious, and be able to make notable observations. Because let’s face it, when you are looking at a huge amount of data that can run to millions or billions of lines, the interesting parts are not that obvious.
Summary
To summarize, data science is:
“The ability to take data, to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it”.
That’s it, that was data science in a nutshell!
Of course, you may wonder about all those fancy words you are starting to hear everywhere, such as machine learning, cognitive science, etc. Well, those are just some branches of data science, as it is a really large field of study that now encompasses many older fields such as business intelligence, data mining, and statistics, and that overlaps with many newer fields such as big data. If you are curious to know more about some of the amazing applications of data science in the real world nowadays, you can check this amazing article.