The difference between data scientists and data engineers

Despite the growing demand, many people still don’t understand the difference between a data scientist and a data engineer. Here’s how the roles differ.


Growing demand for both data engineers and data scientists is pushing salaries even higher, making the two positions among the most highly paid in the sector. In the UK, the average salary for both data engineers and data scientists with just a few years’ experience is well over £50K, with senior roles often in the low six figures.

However, many people (including some hiring managers) still don’t understand the important differences between data scientists and data engineers, or what skills to expect from job candidates. In this article, I’ll explain how data scientists and data engineers differ, and why most companies really need staff in both positions on their data team.

Data engineer vs data scientist

What is a data engineer?

Data engineers are essentially “devops” for data science teams. They architect, develop, and maintain the systems data scientists need. They handle everything from creating data pipelines that move data from one system to another, to deploying machine learning models so they can be used in production.

Data engineers ensure that data scientists can focus on doing what they’re best at, instead of trying to be part-time data engineers, and performing tasks that are outside their comfort zone and their area of specialist knowledge.

They’re experts in handling raw data and cleaning it so it’s in the right format for the data scientists to use in their machine learning models. They’re also experienced in setting up the data infrastructure, data warehouses, and big data or cloud platforms that let stakeholders use the machine learning or artificial intelligence models in production.
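As a rough illustration of the kind of cleaning step described here, the sketch below parses a handful of raw rows and drops any that are corrupt or incomplete. The field names (`order_date`, `amount`) and the sample rows are made up for the example and don't come from any real system.

```python
from datetime import datetime


def clean_records(raw_rows):
    """Return rows with a parseable date and a numeric amount; skip the rest."""
    cleaned = []
    for row in raw_rows:
        try:
            # Hypothetical fields: an ISO-style date string and a numeric amount.
            order_date = datetime.strptime(row["order_date"].strip(), "%Y-%m-%d").date()
            amount = float(row["amount"])
        except (KeyError, ValueError, AttributeError, TypeError):
            # Missing field, unparseable date, or non-numeric amount: drop the row.
            continue
        cleaned.append({"order_date": order_date, "amount": amount})
    return cleaned


# Invented sample data, purely for illustration.
raw_rows = [
    {"order_date": "2021-03-01", "amount": "19.99"},
    {"order_date": " 2021-03-02 ", "amount": "5"},
    {"order_date": "not a date", "amount": "7.50"},
    {"order_date": "2021-03-04", "amount": ""},
    {"amount": "3.00"},
]

cleaned_rows = clean_records(raw_rows)
print(cleaned_rows)
```

In a real pipeline, the rejected rows would usually be logged or written to a separate location rather than silently dropped, so that the source of the corruption can be investigated.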

What is a data scientist?

Data scientists understand business data, mine data for insights, and create statistical and predictive models to help improve the businesses in which they work through in-depth data analysis. They often move into the field from data analyst positions, but use more advanced statistical techniques, including ML, deep learning, and AI.

Even though most data scientists are expected to perform some data engineering tasks, the two roles are quite different, and the things each needs to know and do are miles apart. Some companies, especially those with smaller data teams, will expect their data scientists to have some data engineering skills.

However, as the technologies used in the two fields are very different and very fast moving, it’s unusual to find a single individual who is truly an expert in both areas. These so-called “full stack” data scientists are quite a rarity.

Typical responsibilities of data scientists vs data engineers

The typical responsibilities for data scientists and data engineers vary according to the size and maturity of the company, and the size of its data science team. In smaller companies, data scientists may have to handle data engineering tasks themselves, so will need to be generalists or work as full stack data scientists.

As businesses and teams get larger, responsibilities tend to be split between data engineers and data scientists, and staff tend to hold more specialised roles. Bringing in engineers to support the scientists should mean the team are better able to focus on their specialisms, and the data team generates better overall results and is more productive.

| Data Scientist | Data Engineer |
| --- | --- |
| Understanding business performance | Creating a big data architecture |
| Mining data for insights | Handling data ingestion |
| Removing outliers from data | Creating data workflows |
| Creating statistical models | Maintaining cloud infrastructures |
| Creating predictive models | Detecting and handling corrupt data |
| In-depth data analysis | Preparing data for access |
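To make one of the data scientist tasks listed above more concrete, here is a minimal sketch of removing outliers. It uses the common 1.5 × IQR rule, which is one of several possible approaches and is an assumption of this example, not something the responsibilities above prescribe. The sample values are invented.

```python
from statistics import quantiles


def remove_outliers_iqr(values, k=1.5):
    """Keep values within k * IQR of the first and third quartiles."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Invented sample data: six similar readings and one obvious outlier.
readings = [10, 12, 11, 13, 12, 11, 95]
filtered = remove_outliers_iqr(readings)
print(filtered)
```

Whether a value like this should be removed, capped, or kept depends on the business context, which is part of why this task tends to sit with the data scientist instead of the data engineer.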


Common technologies used by data scientists

Data scientists tend to have a quantitative background, often coming from mathematics or science-based disciplines, or from data analyst roles. These data-driven individuals are unlike the average programmer or software engineer and have a sought-after mix of maths, stats, data visualisation, programming, and domain knowledge.

As data science is a relatively new discipline, at least by its current name (it was previously known as data mining), relatively few have a specific qualification in data science. However, Master’s degrees in related fields are common, and PhDs are considered a requirement for some roles.

Even entry level data scientist roles command salaries over £30K in the UK, rising to around £50K with a few years’ experience, and with senior practitioner roles topping £100K and head, director, and chief roles going even higher.

| Area | Technologies |
| --- | --- |
| Programming languages | Generally Python or R |
| Operating systems | Usually Linux, sometimes Windows or Mac |
| Containerisation | Docker |
| Development tools | PyCharm, Visual Studio, Jupyter Lab |
| Data manipulation | Pandas, NumPy, SciPy |
| Data visualisation | Matplotlib, Seaborn, Bokeh |
| Deep learning | TensorFlow, Keras, PyTorch |
| Machine learning | scikit-learn, XGBoost |
| Techniques | Classification, regression, NLP, deep learning, unsupervised learning |
| BI tools | Apache Superset, Tableau, Google Data Studio |
| Databases | BigQuery, RDS, Azure SQL, Google Cloud SQL, MySQL, PostgreSQL, SQL Server, Hive SQL |


Common technologies used by data engineers

The data engineer usually has a broader mix of skills. The strong mathematics background and analytical personality traits of data scientists are less important here, but problem-solving skills are just as crucial, and the field is just as challenging because the pace of change is rapid and a wide breadth of technologies is used.

Data engineers don’t really need to know the business or have domain knowledge, nor do they need to understand the mathematics, statistics, or complexities of the models. Instead, they specialise in working together with the data scientists to ensure they have access to real time data, that it’s in the right format, and that the predictions from their models can be passed back to stakeholders using APIs or other data pipelines.

Entry level salaries for data engineering roles are around £30K and go to £50-60K with three or four years’ experience. Senior roles are higher still, but there are fewer head, director, or chief roles, as department leads tend to come from data science rather than engineering in general.

| Area | Technologies |
| --- | --- |
| Programming languages | Generally Python and Bash |
| Operating systems | Linux |
| Databases | BigQuery, RDS, Azure SQL, Google Cloud SQL, MySQL, PostgreSQL, SQL Server, Hive SQL |
| NoSQL systems | Redis, MongoDB |
| Data processing | Spark, Hive, PySpark |
| Workflows | Airflow, Cloud Composer, cron, Luigi |
| Cloud infrastructure | AWS, Google Cloud, Microsoft Azure |
| Data storage | AWS S3, Google Cloud Storage, Azure Blob Storage, Apache Druid |
| Containerisation | Docker, Kubernetes |

If you’re only running a small data science team, you may be able to get away with data scientists who have some data engineering skills. However, as the field of data engineering, big data, data pipelines, and cloud technologies is a different specialism entirely, the most efficient teams usually include a mixture of data engineers and data scientists.

How do salaries differ?

Based on my UK data science jobs dataset, which was scraped from the Reed.co.uk jobs site in early 2021, data scientists are still commanding higher salaries on average than data engineers, despite reports stating the opposite.

The mean salary for data scientist roles was £55K, compared with just £49.9K for data engineer roles. Senior data engineers can expect around £68K, while senior data scientist roles pay a few thousand less, likely due to current demand for candidates.

Lead roles tend to attract salaries around £70-90K for data science, or around £80K for data engineers. Principal-level staff (the top level for non-managerial practitioners) are averaging £100K. Heads and directors are around £100-120K. Chief roles are rare, but typically command over £120K.
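For readers curious how per-role means like these can be calculated from a set of job adverts, here is a minimal sketch. The rows below are invented placeholders, not values from the Reed.co.uk dataset, and the `title` and `salary` field names are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Placeholder adverts for illustration only; not real salary data.
job_ads = [
    {"title": "Data Scientist", "salary": 40000},
    {"title": "Data Scientist", "salary": 60000},
    {"title": "Data Engineer", "salary": 30000},
    {"title": "Data Engineer", "salary": 50000},
]


def mean_salary_by_role(ads):
    """Group adverts by job title and return the mean salary for each."""
    grouped = defaultdict(list)
    for ad in ads:
        grouped[ad["title"]].append(ad["salary"])
    return {role: mean(salaries) for role, salaries in grouped.items()}


salary_means = mean_salary_by_role(job_ads)
print(salary_means)
```

Real job adverts are messier than this: titles need normalising (for example, separating senior, lead, and principal roles), and salary ranges need converting to a single figure before a mean is meaningful.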

Matt Clarke, Monday, March 08, 2021

Matt Clarke Matt is a Digital Director who uses data science to help in his work. He has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.
