Data science courses for budding data scientists and data engineers

If you are considering learning data science, there are now hundreds of online courses available to help you develop the skills you'll need for a career in the field.

Picture by Ketut Sabiyanto, Pexels.

If you want to change careers and move into data science or data mining, as either a data scientist or a data engineer, or simply improve your skills, there are now hundreds of courses available. They can help you understand the concepts, learn the relevant programming languages, and undertake data analysis and predictive analysis, even if you have no previous knowledge or experience in the subject.

Advances in data science mean that many companies want to use it for decision making, and it’s well documented that there’s both high demand for staff in the sector, and very high salaries. As a result, many people are looking to get into the field. While the machine learning, AI, and big data areas are deeply specialised, the barrier to entry for data analysis and statistical analysis roles is a bit lower. There are courses available to cover everything.

Here’s our round-up of some of the available data science courses to help you learn the skills you’ll need to undertake your own projects and break into the field, whichever career path you choose. It covers free online training, in-depth e-learning courses, and highly rated degree and Master’s programmes from universities around the world.

Types of data science course

There are four main styles of data science courses available. Each has its own benefits, and which you choose will depend on how much time you want to invest, how motivated you are, how much previous experience you have, what career path you want to follow and how much money you’re willing to spend on learning new skills in data science.

Picture by Peter Gombos, Unsplash.

Short online data science courses

There are dozens of short online courses available to teach you data science skills. These courses typically combine some basic theoretical material with hands-on practical programming and usually aim to tackle a small topic. These short courses can usually be completed in a few hours through online learning environments, and they often don’t require previous computer science training. They’re the ideal first step in learning data science.

Unlike the more structured and lengthier online courses, these short courses can be undertaken at your own pace, in your own time, whenever you have the time or inclination to learn a new skill. If you want to learn data science and have no previous computer science experience or basic programming skills, these courses are often a great way to get started.

My personal recommendation for short courses would be Datacamp. Its online courses provide an interactive experience that mixes video and audio information with PowerPoint-style slides and interactive coding sessions to test your understanding. They cover both Python programming and R programming, and touch upon a wide range of techniques, each with instruction, tutorials, and practical assessment.

Datacamp’s courses cover a wide range of topics from machine learning and statistical analysis to data visualization, and even include some real-world projects to let you practice applying your new skills. The other neat thing about Datacamp is that you can take the first lesson in a four-hour course totally free of charge. Here’s a full list of Datacamp’s courses.

Other excellent free online courses include the ones offered by Kaggle, which helped me get started when I was first learning. These offer a well-rounded introduction to various common subjects in data science, from supervised to unsupervised learning.

Datacamp’s courses are superb.

In-depth online data science courses

While Datacamp and providers of other short online courses do offer skill tracks where you can combine multiple short courses and cover a topic in greater depth, there are also many much more academic courses available online, many of which are provided by top universities through online learning platforms such as Udemy and Coursera.

These longer courses are obviously a bigger commitment than the short four-hour courses offered by providers like Datacamp. They’re typically also much more expensive, and often have a schedule to which you may need to stick.

Among the more in-depth course programmes, Coursera and Udemy courses probably have the best reputation in the data science community. They cover a very wide range of subjects, from analytics, statistics and machine learning through to very specific sub-branches of natural language processing.

As a general rule, these more in-depth courses will focus more on the theoretical background of data science, and often include more mathematics and statistics than the shorter courses. Not all of them require maths or statistics skills, and they’re usually very well taught.

Picture by Pixabay, Pexels.

Data science degrees

Next up is a Bachelor’s degree in data science. Top universities around the world have added courses to serve this popular field in recent years. These are typically three-year university courses and are a much larger financial and time investment than the usual online courses. You won’t be learning at your own pace either - you’ll be expected to keep up with your fellow students. Expect to pay anywhere from £6K to £9K per year for a data science degree course. Sadly, these courses aren’t free…

Bachelor’s degrees in data science can be taken in-situ at universities alongside other students, or remotely via online learning platforms. This means you could feasibly undertake a degree programme at a well-respected university on the other side of the world, and there’s no reason to limit your search to universities in your own country.

They arguably provide the best overall grounding in data science, as they focus on giving you a good understanding of the underlying mathematics, statistics, machine learning algorithms, neural networks, deep learning, linear models, natural language processing, and other computer science concepts. Many data science roles do demand a degree in a related field, but actual data science degrees are relatively new. You may not be up against candidates who hold one, so having a data science degree could give you the edge.

However, although it will teach you many skills, a degree course can lack some of the depth of the other course formats, and you may be taught fewer practical skills than you would gain from learning online through a more practical and less theoretical course structure.

Universities often aren’t as agile or reactive as the online learning providers, so you may find that their course syllabuses focus on a programming language or software applications that no longer have the industry demand that they did when the course syllabus was created. While I’m not knocking R programming or SPSS (they’re both excellent), there are now more career opportunities elsewhere, so look for Python programming if you can.


Data science Master’s degrees

Master’s degrees come in two main forms: those designed for computer science graduates who wish to specialise in a specific field, such as machine learning or artificial intelligence, and those designed to act as conversion courses for new students without prior experience.

Master’s degrees typically take one year to complete when studied full-time at university, but can be extended to two or three-year programmes when taken part-time. Many students, including me, have taken a Master’s alongside their day job. (Trust me, this is quite an undertaking time-wise.)

Pay close attention to the course syllabus and compare how course modules differ between universities. Many courses offered at degree and Master’s degree level share the same basic modules between subjects, so you could find that your degree programme is actually almost identical to a regular computer science degree, with only one or two modules added on data science to allow the university to name it accordingly.


Picture by KOBU Agency, Unsplash.

Developing your practical data science skills

One common criticism of taking courses online is that the problems they present you with aren’t really like those you’ll experience in the real world. Rather than being given a cleaned data set that is in the right format for your machine learning problem, in a business setting you’ll need the skills to create a dataset from raw data, clean it, and put it in the right format for your model.
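To make that concrete, here is a minimal sketch of the kind of clean-up raw business data tends to need, using pandas. The order data, the column names, and the clean_orders helper are all invented for illustration, and real data will need its own rules.

```python
import io

import pandas as pd

# Hypothetical raw export (invented for this example): stray whitespace,
# inconsistent casing, a duplicated row, a missing total, and numbers
# stored as text with thousands separators.
RAW_CSV = """order_id,customer,country,order_total
1001, Alice ,uk,"1,250.00"
1002,Bob,UK,89.50
1002,Bob,UK,89.50
1003,carol,us,
1004,Dave,US,310.00
"""


def clean_orders(raw_csv: str) -> pd.DataFrame:
    """Turn the raw text export into a tidy, model-ready DataFrame."""
    # Read everything as text first so nothing is silently misparsed.
    df = pd.read_csv(io.StringIO(raw_csv), dtype=str)

    # Remove exact duplicate rows.
    df = df.drop_duplicates()

    # Normalise the text columns.
    df["customer"] = df["customer"].str.strip().str.title()
    df["country"] = df["country"].str.strip().str.upper()

    # Strip thousands separators, then convert to numbers.
    # Values that cannot be parsed become NaN.
    totals = df["order_total"].str.replace(",", "", regex=False)
    df["order_total"] = pd.to_numeric(totals, errors="coerce")

    # Drop rows with no usable total (one possible policy among several).
    df = df.dropna(subset=["order_total"])

    df["order_id"] = df["order_id"].astype(int)
    return df.reset_index(drop=True)


orders = clean_orders(RAW_CSV)
print(orders)
```

Dropping rows with a missing total is only one option. Depending on the project, you might impute the value or flag the row for review instead.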

Writing your own code will be a lot harder than simply filling in the missing blank in some pre-written code, or answering a multiple-choice question. Therefore, data science courses can sometimes lull you into a false sense of security by making you think you know what you’re doing, when you may in fact struggle to apply the data science process to an actual business project.

The best way to learn data science and develop your practical skills in data mining, data analysis, machine learning, deep learning, or statistics is to build applications using real data, explore the data, build your own models, and come up with your own actionable insights. There are several resources to help you do this. However, my personal favourites are Datacamp projects and Kaggle competitions.

Datacamp projects

Datacamp projects provide a gentle introduction to undertaking work outside the web-based learning experience. They offer a range of structured data science tutorials you can undertake to help you make the transition from the online experience to making data-driven decisions on real-world data.

These short practical tasks come with raw data, guidance, and a basic structure to help you get started and apply your new skills. They’re all written and presented by experts in their field and are of an excellent overall quality. I’ve undertaken many of them and absolutely love them.

They’re a great way to quickly understand a topic and some of the project ideas are really fascinating - they go way beyond the common Iris or Titanic dataset problems every data scientist starts with.

Kaggle competitions

Kaggle is a data science community owned by Google. It provides a free data science course, community-provided datasets, online Jupyter notebooks, and an environment where you can compete with other data scientists to analyse a dataset and solve a machine learning problem - all for free. If you’re good, you can even win substantial cash prizes.

What data science skills do top companies want?

Before deciding which data science courses to take, it’s worth considering what skills companies are looking for in data scientists when they advertise their job vacancies.

Git is an essential skill in most data science roles. (Picture: Yancy Min, Unsplash.)

Basic skills for data scientists

  • Programming languages: There are two main programming languages companies will be looking for - Python and R. While the R programming language remains a brilliant language for data analysis and statistics, Python has become the most popular programming language in recent years, so I’d personally opt for learning that. Solid Python coding skills are going to be a pre-requisite for many jobs, so don’t overlook them.
  • Git: Pretty much every data science team will be using Git for version control, so it’s vital that you can demonstrate that you can use Git to manage your projects and collaborate with others. Here’s a great Git course to get you started. The first lesson is available free.
  • Agile: Agile is a project management framework that is widely used by data science teams. You should expect Agile (or a derivative of it) to be used to manage your team’s work, so it’s worth building an understanding of it before you apply. It’s pretty easy to pick up.
  • Data analysis: Data cleansing, data manipulation, and data analysis will be a major part of your work as a data scientist, even if you specialise in a specific branch, such as statistical analysis, deep learning, big data analytics, or artificial intelligence. Pandas (and to a lesser extent NumPy) are the main packages you’ll need to master to explore and analyze data and guide the decision-making process.
  • Data visualization: One key responsibility of a data scientist is helping others understand the data, so the ability to visualize it with graphs, charts, or plots is helpful. Matplotlib and Seaborn are worth studying.
  • Scikit-learn: The scikit-learn Python package is probably the number one framework you’ll use for actual machine learning tasks that go beyond the initial Exploratory Data Analysis (EDA) process. To get most jobs, you’ll need to show that you understand statistics, can create regression and classification models, and can use common techniques such as model selection, feature selection, dimensionality reduction, and hyperparameter tuning.
  • Domain knowledge: While it can be learned on the job, one thing that can give you the edge in data science is domain knowledge. Without an understanding of the business domain you’re going to be limited in what you can achieve - that’s why my articles on data science always focus on this aspect so much.
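As a rough illustration of the scikit-learn skills listed above (a classification model, model selection, and hyperparameter tuning), here is a minimal sketch using the Iris dataset that ships with scikit-learn. The choice of logistic regression, the grid of C values, and the train/test split are assumptions made purely for the example, not a recommended recipe.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load the small, built-in Iris classification dataset.
X, y = load_iris(return_X_y=True)

# Hold back a test set so the final score uses data the model has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Chain scaling and the classifier so both are fitted inside each CV fold.
pipeline = Pipeline(
    [
        ("scale", StandardScaler()),
        ("model", LogisticRegression(max_iter=1000)),
    ]
)

# Hyperparameter tuning: try a few regularisation strengths (illustrative values)
# and pick the best one using 5-fold cross-validation on the training data.
param_grid = {"model__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# Evaluate the best pipeline found on the held-out test set.
test_accuracy = search.score(X_test, y_test)
print("Best parameters:", search.best_params_)
print("Test accuracy:", round(test_accuracy, 3))
```

Putting the scaler inside the pipeline means it is re-fitted on each cross-validation fold, which avoids leaking information from the validation data into training.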

Picture by Anete Lusina, Pexels.

How can I improve my chances of getting a data science job?

If you want to get a data science job, you’ll benefit from showcasing some of your work to show potential employers what you’re capable of. As a hiring manager myself, I can tell you that we would rarely, if ever, employ data scientists or software engineers without first seeing how well they can write code.

Even for business analytics or big data analytics positions, we’d want to see evidence that a candidate has the natural curiosity to explore and analyse data to extract hidden patterns, so showing some examples of your skills is highly recommended to make it easier for hiring managers to select you.

Show your domain knowledge

One thing I would recommend is that you build your domain knowledge in the field in which you want to get a data science role by undertaking practical projects using real-world data. This can set you apart from other candidates.

I work in e-commerce and marketing, so I spend lots of my spare time outside work finding or creating interesting business analytics datasets and using my data science and machine learning skills to try to tackle new projects I’ve not had time to try at work.

There are loads of interesting data sets available to practice your skills with. Check out Kaggle and GitHub for some inspiration, or check out my guide to datasets for e-commerce and marketing data science projects if this is the field you want to enter.

Create a portfolio

Creating a portfolio of your data analysis and machine learning projects can show employers that you have more than just a basic understanding and can actually apply your expertise to solving real problems. GitHub makes it easy to share any Jupyter notebooks you may have created, and allows you to create a free website showcasing your work.

Since this requires you to use the Git version control system, which most data science employers will be looking for, creating a GitHub site is also a good way to demonstrate that you can use Git. While you can just post repositories of your best data science code on GitHub, you can also create more stylish, professional-looking websites for free using GitHub Pages.

For added bonus points, if you’re more technically minded, you may even want to use a static site builder, such as Jekyll, to create a website from your Git repository and host it as a fully-fledged website using a more sophisticated platform such as Netlify. In fact, that’s exactly what I did with this website, and amazingly, it was all free.

Matt Clarke, Sunday, May 30, 2021

Matt Clarke Matt is a Digital Director who uses data science to help in his work. He has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.
