Growing demand for data engineers and data scientists is pushing salaries ever higher, making the two positions among the most highly paid in the sector. In the UK, the average salary for both data engineers and data scientists with just a few years’ experience is well over £50K, with senior roles often in the low six figures.
However, many people (including some hiring managers) still don’t understand the important differences between data scientists and data engineers, or what skills to expect from job candidates. In this article, I’ll explain how data scientists and data engineers differ, and why most companies really need staff in both positions on their data team.
Data engineers are essentially “devops” for data science teams. They architect, develop, and maintain the systems data scientists need. They handle everything from creating data pipelines that move data from one system to another, to deploying machine learning models so they can be used in production.
Data engineers ensure that data scientists can focus on doing what they’re best at, instead of trying to be part-time data engineers, and performing tasks that are outside their comfort zone and their area of specialist knowledge.
They’re experts in handling raw data and cleaning it so that it’s in the right format for the data scientists to use in their machine learning models. They’re also experienced in setting up the data infrastructure, data warehouses, and big data or cloud platforms that let stakeholders use the machine learning or artificial intelligence models in production.
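As a rough illustration of the cleaning step described above, here is a minimal sketch using only the Python standard library. The column names (`order_id`, `country`, `amount`), the sample rows, and the `clean_rows` helper are all invented for this example; a real pipeline would more likely use Pandas or Spark and its own validation rules.

```python
import csv
import io

# Hypothetical raw export: stray whitespace, inconsistent casing,
# a non-numeric amount, and a missing ID. All values are made up.
RAW_CSV = """order_id,country,amount
1001, gb ,19.99
1002,US,not_a_number
1003,us,5.00
,GB,12.50
"""


def clean_rows(raw_text):
    """Split raw CSV text into (clean, rejected) lists of rows."""
    clean, rejected = [], []
    for row in csv.DictReader(io.StringIO(raw_text)):
        try:
            record = {
                "order_id": int(row["order_id"]),
                "country": row["country"].strip().upper(),
                "amount": float(row["amount"]),
            }
        except ValueError:
            # Set corrupt rows aside for inspection instead of dropping them silently.
            rejected.append(row)
            continue
        clean.append(record)
    return clean, rejected


clean, rejected = clean_rows(RAW_CSV)
print(f"{len(clean)} clean rows, {len(rejected)} rejected rows")
```

Keeping the rejected rows, rather than discarding them, is a deliberate choice in this sketch: it lets the engineer trace corrupt data back to its source.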
Data scientists understand business data, mine data for insights, and create statistical and predictive models to help improve the businesses in which they work through in-depth data analysis. They often move into the field from data analyst positions, but use more advanced statistical techniques, including ML, deep learning, and AI.
Even though most data scientists are expected to perform some data engineering tasks, the roles of data scientists and data engineers are quite different, and the things they need to know and do are miles apart. Some companies, especially those with smaller data teams, will expect their data scientists to have some data engineering skills.
However, as the technologies used in the two fields are very different and very fast moving, it’s unusual to find a single individual who is truly an expert in both areas. These so-called “full stack” data scientists are quite a rarity.
The typical responsibilities for data scientists and data engineers vary according to the size and maturity of the company, and the size of its data science team. In smaller companies, data scientists may have to handle data engineering tasks themselves, so will need to be generalists or work as full stack data scientists.
As businesses and teams get larger, responsibilities tend to be split between data engineers and data scientists, and staff tend to hold more specialised roles. Bringing in engineers to support the scientists should mean team members are better able to focus on their specialisms, so the data team generates better overall results and is more productive.
Data Scientist | Data Engineer |
---|---|
Understanding business performance | Creating a big data architecture |
Mining data for insights | Handling data ingestion |
Removing outliers from data | Creating data workflows |
Creating statistical models | Maintaining cloud infrastructures |
Creating predictive models | Detecting and handling corrupt data |
In-depth data analysis | Preparing data for access |
Picture by Jefferson Santos, Unsplash.
Data scientists tend to have a quantitative background, often coming from mathematics or other science-based disciplines, or from data analyst roles. These data-driven individuals are unlike the average programmer or software engineer and have a sought-after mix of maths, stats, data visualisation, programming, and domain knowledge.
As data science is a relatively new discipline, at least by its current name (it was previously known as data mining), relatively few practitioners have a specific qualification in data science. However, Master’s degrees in related fields are common, and PhDs are considered a requirement for some roles.
Even entry-level data scientist roles command salaries over £30K in the UK, rising to around £50K with a few years’ experience, with senior practitioner roles topping £100K and head, director, and chief roles going even higher.
Area | Technologies |
---|---|
Programming languages | Generally Python or R |
Operating systems | Usually Linux, sometimes Windows or Mac |
Containerisation | Docker |
Development tools | PyCharm, Visual Studio, Jupyter Lab |
Data manipulation | Pandas, NumPy, SciPy |
Data visualisation | Matplotlib, Seaborn, Bokeh |
Deep learning | TensorFlow, Keras, PyTorch |
Machine learning | scikit-learn, XGBoost |
Techniques | Classification, regression, NLP, deep learning, unsupervised learning |
BI tools | Apache Superset, Tableau, Google Data Studio |
Databases | BigQuery, RDS, Azure SQL, Google Cloud SQL, MySQL, PostgreSQL, SQL Server, Hive SQL |
Picture by Nubelson Fernandes, Unsplash.
The data engineer usually has a broader mix of skills. The strong mathematics background and analytical personality traits of data scientists are less important here, but problem-solving skills are just as crucial, and the field is just as challenging because the pace of change is rapid and a wide range of technologies is used.
Data engineers don’t really need to know the business or have domain knowledge, nor do they need to understand the mathematics, statistics, or complexities of the models. Instead, they specialise in working together with the data scientists to ensure they have access to real-time data, that it’s in the right format, and that the predictions from their models can be passed back to stakeholders using APIs or other data pipelines.
Entry-level salaries for data engineering roles are around £30K and rise to £50-60K with three or four years’ experience. Senior roles are higher still, but there are fewer head, director, or chief roles, as department leads generally tend to come from data science rather than engineering.
Area | Technologies |
---|---|
Programming languages | Generally Python and Bash |
Operating systems | Linux |
Databases | BigQuery, RDS, Azure SQL, Google Cloud SQL, MySQL, PostgreSQL, SQL Server, Hive SQL |
NoSQL systems | Redis, MongoDB |
Data processing | Spark, Hive, PySpark |
Workflows | Airflow, Cloud Composer, cron, Luigi |
Cloud infrastructure | AWS, Google Cloud, Microsoft Azure |
Data storage | AWS S3, Google Cloud Storage, Azure Blob Storage, Apache Druid |
Containerisation | Docker, Kubernetes |
If you’re only running a small data science team, you may be able to get away with data scientists who have some data engineering skills. However, as data engineering, with its big data, data pipelines, and cloud technologies, is a different specialism entirely, the most efficient teams usually include a mixture of data engineers and data scientists.
Based on my UK data science jobs dataset, which was scraped from the Reed.co.uk jobs site in early 2021, data scientists are still commanding higher salaries than data engineers, despite reports stating the opposite.
The mean salary for data scientist roles was £55K, compared with just £49.9K for data engineer roles. Senior data engineers can expect around £68K, while senior data scientist roles pay a few thousand less, likely due to current demand for candidates.
Lead roles tend to attract salaries around £70-90K for data science, or around £80K for data engineers. Principal-level staff (the top level for non-managerial practitioners) are averaging £100K. Heads and directors are around £100-120K. Chief roles are rare, but typically command over £120K.
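For readers who want to run a similar comparison on their own job adverts, here is a minimal sketch of how mean salaries by role could be computed. It is not the code behind the figures above. The advert rows are invented placeholders, the field names (`title`, `salary_min`, `salary_max`) are assumptions, and using the midpoint of the advertised range as the salary is one simplification among several possible.

```python
from collections import defaultdict
from statistics import mean

# Invented placeholder adverts, NOT rows from the Reed.co.uk dataset.
ADVERTS = [
    {"title": "Senior Data Scientist", "salary_min": 60000, "salary_max": 80000},
    {"title": "Data Engineer", "salary_min": 40000, "salary_max": 50000},
    {"title": "Data Scientist", "salary_min": 45000, "salary_max": 55000},
    {"title": "Senior Data Engineer", "salary_min": 65000, "salary_max": 75000},
    {"title": "Office Manager", "salary_min": 25000, "salary_max": 30000},
]


def role_of(title):
    """Map a job title to a role, or None if it is neither role."""
    lowered = title.lower()
    if "data engineer" in lowered:
        return "data engineer"
    if "data scientist" in lowered:
        return "data scientist"
    return None


def mean_salary_by_role(adverts):
    """Average the midpoint of each advertised salary range per role."""
    salaries = defaultdict(list)
    for advert in adverts:
        role = role_of(advert["title"])
        if role is None:
            continue  # Ignore adverts for unrelated jobs.
        midpoint = (advert["salary_min"] + advert["salary_max"]) / 2
        salaries[role].append(midpoint)
    return {role: mean(values) for role, values in salaries.items()}


averages = mean_salary_by_role(ADVERTS)
for role, salary in sorted(averages.items()):
    print(f"{role}: £{salary:,.0f}")
```

The same grouping could be extended to seniority levels (senior, lead, principal) by adding further keyword checks to `role_of`.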
Matt Clarke, Monday, March 08, 2021