If you want to change careers and move into data science or data mining, as either a data scientist or a data engineer, or simply improve your skills, there are now hundreds of courses available. They can help you understand the concepts, learn the relevant programming languages, and undertake data analysis and predictive analysis, even if you have no previous knowledge or experience in the subject.
Advances in data science mean that many companies want to use it for decision making, and it’s well documented that there’s both high demand for staff in the sector, and very high salaries. As a result, many people are looking to get into the field. While the machine learning, AI, and big data areas are deeply specialised, the barrier to entry for data analysis and statistical analysis roles is a bit lower. There are courses available to cover everything.
Here’s our round-up of some of the available data science courses to help you learn the data science skills you’ll require to undertake your own data science projects and break into the field, whichever career path you choose. It covers free online training, in-depth e-learning courses, and highly rated degree and Master’s programmes from universities around the world.
There are four main styles of data science courses available. Each has its own benefits, and which you choose will depend on how much time you want to invest, how motivated you are, how much previous experience you have, what career path you want to follow and how much money you’re willing to spend on learning new skills in data science.
Picture by Peter Gombos, Unsplash.
There are dozens of short online courses available to teach you data science skills. These courses typically combine some basic theoretical material with hands-on practical programming and usually aim to tackle a small topic. These short courses can usually be completed in a few hours through online learning environments, and they often don’t require previous computer science training. They’re the ideal first step in learning data science.
Unlike the more structured and lengthier online courses, these short courses can be undertaken at your own pace, in your own time, whenever you have the time or inclination to learn a new skill. If you want to learn data science and have no previous computer science experience or basic programming skills, these courses are often a great way to get started.
My personal recommendation for short courses would be Datacamp. Its online courses provide an interactive experience that mixes video and audio information with PowerPoint-style slides and interactive coding sessions to test your understanding. They cover both Python programming and R programming, and touch upon a wide range of techniques, each with instruction, tutorials, and practical assessment.
Datacamp’s courses cover a wide range of topics from machine learning and statistical analysis to data visualization, and even include some real-world projects to let you practice applying your new skills. The other neat thing about Datacamp is that you can take the first lesson in a four-hour course totally free of charge. Here’s a full list of Datacamp’s courses.
Other excellent free online courses include the ones offered by Kaggle, which helped me get started when I was first learning. These offer a well-rounded introduction to various common subjects in data science, from supervised to unsupervised learning.
Datacamp’s courses are superb.
While Datacamp and providers of other short online courses do offer skill tracks where you can combine multiple short courses and cover a topic in greater depth, there are also many much more academic courses available online, many of which are provided by top universities through online learning platforms such as Udemy and Coursera.
These longer courses are obviously a bigger commitment than the short four-hour courses offered by providers like Datacamp. They’re typically also much more expensive, and often have a schedule to which you may need to stick.
Coursera and Udemy courses probably have the best reputation in the data science community in terms of the more in-depth course programmes. They cover a very wide range of subjects from analytics, statistics and machine learning through to very specific sub-branches of natural language processing.
As a general rule, these more in-depth courses will focus more on the theoretical background of data science, and often include more mathematics and statistics than the shorter courses. Not all of them require maths or statistics skills, and they’re usually very well taught.
Picture by Pixabay, Pexels.
Next up is a Bachelor’s degree in data science. Top universities around the world have added courses to serve this popular field in recent years. These are typically three-year university courses and are a much larger financial and time investment than the usual online courses, and you won’t be learning at your own pace - you’ll be expected to keep up with your fellow students. Expect to pay anywhere from £6K to £9K per year for a data science degree course. Sadly, these courses aren’t free…
Bachelor’s degrees in data science can be taken in-situ at universities alongside other students, or remotely via online learning platforms. This means you could feasibly undertake a degree programme at a well-respected university on the other side of the world, and there’s no reason to limit your search to universities in your own country.
They arguably provide the best overall grounding in data science, as they focus on giving you a good understanding of the underlying mathematics, statistics, machine learning algorithms, neural networks, deep learning, linear models, natural language processing, and other computer science concepts. Many data science roles do demand a degree in a related field, but actual data science degrees are relatively new. You may therefore not be up against many candidates who hold one, so having a data science degree could give you the edge.
However, although it will teach you many skills, a degree course can lack some of the depth of the other course formats, and you may be taught fewer practical skills than you would gain from a more practical and less theoretical online course.
Universities often aren’t as agile or reactive as the online learning providers, so you may find that their course syllabuses focus on a programming language or software applications that no longer have the industry demand that they did when the course syllabus was created. While I’m not knocking R programming or SPSS (they’re both excellent), there are now more career opportunities elsewhere, so look for Python programming if you can.
Master’s degrees come in two main forms: those designed for computer science graduates who wish to specialise in a specific field, such as machine learning or artificial intelligence, and those designed to act as conversion courses for new students without prior experience.
Master’s degrees typically take one year to complete when studied full-time at university, but can be extended to two or three-year programmes when taken part-time. Many students, including me, have taken a Master’s alongside their day job. (Trust me, this is quite an undertaking time-wise.)
Pay close attention to the course syllabus and compare how course modules differ between universities. Many courses offered at degree and Master’s degree level share the same basic modules between subjects, so you could find that your degree programme is actually almost identical to a regular computer science degree, with only one or two data science modules added to allow the university to name it accordingly.
Picture by KOBU Agency, Unsplash.
One common criticism of taking courses online is that the problems they present you with aren’t really like those you’ll experience in the real world. Rather than being given a cleaned data set that is in the right format for your machine learning problem, in a business setting you’ll need the skills to create a dataset from raw data, clean it, and put it in the right format for your model.
Writing your own code from scratch will be a lot harder than simply filling in the missing blank in some pre-written code, or answering a multiple-choice question. Therefore, data science courses can sometimes lull you into a false sense of security by making you think you know what you’re doing, when you may in fact struggle to apply the data science process to an actual business project.
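To give a flavour of the raw-data work described above, here is a minimal sketch of cleaning a messy export before it goes anywhere near a model. Everything in it is an assumption made for illustration: the `RAW_CSV` sample, the column names, the two date formats, and the choice to fill missing revenue with the median. It uses only the Python standard library so it runs anywhere; on a real project you would more likely reach for a library such as pandas.

```python
import csv
import io
from datetime import datetime
from statistics import median

# Hypothetical raw export: stray whitespace, inconsistent casing,
# a duplicated row, a missing value, and mixed date formats.
RAW_CSV = """order_id,customer,order_date,revenue
1001, Alice ,2021-05-01,25.50
1002,BOB,01/05/2021,
1002,BOB,01/05/2021,
1003,carol,2021-05-03,40.00
"""


def parse_date(value):
    """Try each assumed date format in turn; return None if none match."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value.strip(), fmt).date()
        except ValueError:
            continue
    return None


def clean_orders(raw_text):
    """Turn raw CSV text into a list of tidy, de-duplicated order dicts."""
    rows, seen = [], set()
    for record in csv.DictReader(io.StringIO(raw_text)):
        order_id = int(record["order_id"])
        if order_id in seen:  # skip repeated order IDs, keeping the first
            continue
        seen.add(order_id)
        revenue = record["revenue"].strip()
        rows.append({
            "order_id": order_id,
            "customer": record["customer"].strip().title(),
            "order_date": parse_date(record["order_date"]),
            "revenue": float(revenue) if revenue else None,
        })

    # Assumption: fill missing revenue with the median of the known values.
    known = [row["revenue"] for row in rows if row["revenue"] is not None]
    if known:
        fill_value = median(known)
        for row in rows:
            if row["revenue"] is None:
                row["revenue"] = fill_value
    return rows


orders = clean_orders(RAW_CSV)
for order in orders:
    print(order)
```

Even on four rows, the sketch has to make several judgment calls (which duplicate to keep, how to treat a missing value, which date formats to accept), and these are exactly the decisions a pre-cleaned course dataset hides from you.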
The best way to actually learn data science and develop your practical skills in data science, data mining, data analysis, machine learning, deep learning, or statistics, is to actually build applications using real data, explore the data, build your own models, and come up with your own actionable insights. There are several resources to help you do this. However, my personal favourites are Datacamp projects and Kaggle competitions.
Datacamp projects provide a gentle introduction to undertaking work outside the web-based learning experience. They offer a range of structured data science tutorials you can undertake to help you make the transition from the online experience to making data-driven decisions on real-world data.
These short practical tasks come with raw data and a set of guidance and have a basic structure to help you get started and apply your new skills. They’re all written and presented by experts in their field and are of an excellent overall quality. I’ve undertaken many of them and absolutely love them.
They’re a great way to quickly understand a topic and some of the project ideas are really fascinating - they go way beyond the common Iris or Titanic dataset problems every data scientist starts with.
Kaggle is a data science community owned by Google. It provides a free data science course, community-provided datasets, online Jupyter notebooks, and an environment where you can compete with other data scientists to analyse a dataset and solve a machine learning problem, all for free. If you’re good, you can even win substantial cash prizes.
Before considering what data science courses to undertake when you start to study data science, it’s worth considering what skills companies are looking for in data scientists when they advertise their job vacancies.
Git is an essential skill in most data science roles. (Picture: Yancy Min, Unsplash.)
Picture by Anete Lusina, Pexels.
If you want to get a data science job then you’ll benefit from showcasing some of your work to show potential employers what you’re capable of. As a hiring manager myself, I can tell you that we would rarely, if ever, employ data scientists or software engineers without first seeing how well they can write code.
Even for business analytics or big data analytics positions, we’d want to see evidence that a candidate has the natural curiosity to explore and analyse data to extract hidden patterns, so showing some examples of your skills is highly recommended to make it easier for hiring managers to select you.
One thing I would recommend is that you build your domain knowledge in the field in which you want to get a data science role by undertaking practical projects using real-world data. This can set you apart from other candidates.
I work in e-commerce and marketing, so I spend lots of my spare time outside work finding or creating interesting business analytics datasets and using my data science and machine learning skills to try to tackle new projects I’ve not had time to try at work.
There are loads of interesting data sets available to practice your skills with. Check out Kaggle and GitHub for some inspiration, or check out my guide to datasets for e-commerce and marketing data science projects if this is the field you want to enter.
Creating a portfolio of your data analysis and machine learning projects can show employers that you have more than just a basic understanding and can actually apply your expertise to solving real problems. GitHub makes it easy to share any Jupyter notebooks you may have created, and allows you to create a free website showcasing your work.
Since this requires you to use the Git version control system, which most data science employers will be looking for, creating a GitHub site is also a good way to demonstrate that you can use Git too. While you can just post repositories of your best data science code on GitHub, you can also create more stylish websites using GitHub Pages for free, which look much more professional.
For added bonus points, if you’re more technically minded, you may even want to use a static site builder, such as Jekyll, to create a website from your Git repository and host it as a fully-fledged website using a more sophisticated platform such as Netlify. In fact, that’s exactly what I did with this website, and amazingly, it was all free.
Matt Clarke, Sunday, May 30, 2021