Have you ever come across an article saying something along the lines of:
“Everyone’s calling themselves a data scientist these days. The ability to manipulate data and use Python libraries doesn’t make you a data scientist. Knowledge of linear algebra and statistics doesn’t make you a data scientist. The ability to use data to derive business value doesn’t make you a data scientist. Stop calling yourself a data scientist. You are no scientist.”
After reading around five articles making the same point, I found myself wondering:
“If that’s the case, who is a data scientist? What qualifications does a person need in order to call themselves a data scientist?”
It seemed as though the authors of these articles were so caught up in their frustration over who shouldn’t be calling themselves data scientists that they forgot to mention who should be.
Not one of these writers mentioned the qualifications a “real data scientist” should possess. They also conveniently failed to mention the everyday job of a “real data scientist.”
So… who is a real data scientist?
I spent an entire day reading definitions of the term “data science.”
Let me share my findings with you.
Wikipedia defines data science as “an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data, and apply knowledge and actionable insights from data across a broad range of application domains.”
The Oxford Dictionary defines a data scientist as “a person employed to analyse and interpret complex digital data, such as the usage statistics of a website, especially in order to assist a business in its decision-making.”
DataRobot defines data science as “the field of study that combines domain expertise, programming skills, and knowledge of mathematics and statistics to extract meaningful insights from data.”
After reading these definitions, I took a look at some data science job listings. What skills were companies looking for in a data scientist?
One of the first job listings I came across was from a Fortune 500 company, and I have attached a screenshot of the job description here:
Most data science job descriptions were similar to the one above.
So if the definition and job description of a data scientist match a person’s skillset, why don’t we just allow them to call themselves a data scientist?
Data science is a very loose term.
It is used to describe so many different roles, and the term has become so vague that it’s almost meaningless.
I’ve worked with data science team leaders at multinational companies who came from non-technical backgrounds.
People who do the work of a statistician often refer to themselves as data scientists, and so do people who do data engineering or analytics-related work.
The term is just so broad that it encompasses the job scope of many different people. And that’s okay.
Who should call themselves a data scientist?
My job title has the words “data science” in it, and my skillset is exactly as described above.
If you work with data on a regular basis to derive insights and create business value, you are (by definition) a data scientist.
The ability to clean and manipulate data, create data pipelines, and build predictive models does qualify you to call yourself a data scientist.
That is what data science is now widely accepted to mean.
The term data science is so broad now that it defines the job scope of millions of people around the world.
And honestly, it doesn’t make sense to obsess about terms.
Unless the term “data scientist” magically gets replaced with a different word overnight, people aren’t going to stop calling themselves data scientists.