Natural Language Processing

28 articles tagged Natural Language Processing

How to use spaCy for noun phrase extraction

Noun phrase extraction is a Natural Language Processing technique that can be used to identify and extract noun phrases from text. Noun phrases are phrases that function grammatically as nouns...

How to use spaCy EntityRuler for custom Named Entity Recognition

spaCy’s EntityRuler is one of several rule-based matching components that extend the core functionality of the package. It’s really useful for the creation of custom...

How to do custom Named Entity Recognition in Pandas using spaCy

As I showed in my previous tutorial on named entity recognition in spaCy, the EntityRuler lets you customise spaCy’s default NER model to create your own...

How to use spaCy for POS tagging in Pandas

spaCy is one of the most popular Python packages for Natural Language Processing. Alongside the Natural Language Toolkit (NLTK), spaCy provides a huge range of functionality for a wide variety...

How to transcribe YouTube videos with OpenAI Whisper

OpenAI Whisper is a new open source automatic speech recognition (ASR) model from OpenAI, the organisation that has also brought us the incredible GPT-3 language models. Like GPT-3, it’s...

How to use NLTK for POS tagging in Pandas

The Natural Language Toolkit (NLTK) is a powerful Python package for performing a wide range of common NLP tasks, including Part of Speech tagging, or POS tagging for short.

How to create a fake review detection model

Fake reviews seem to be everywhere these days, leaving customers unsure about which products or businesses are actually any good. Whether you’re shopping on Amazon, checking out a restaurant on...

How to perform tokenization in NLP with NLTK and Python

Tokenization is a Natural Language Processing technique that breaks a piece of text, such as a sentence, into a list of distinct words or other tokens. It’s a crucial first step in...
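As a rough illustration of the idea, here is a minimal sketch of word tokenization using only Python's standard library. It is not NLTK's tokenizer: the `tokenize` function and its regular expression are assumptions made for this example, and they handle far fewer edge cases than a real tokenizer would.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens, dropping punctuation.

    Hypothetical helper for illustration only; this is a simple regex
    stand-in, not the tokenizer that NLTK provides.
    """
    # Match runs of letters or digits, optionally followed by an
    # apostrophe and more letters (so "doesn't" stays as one token).
    return re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)?", text.lower())


tokens = tokenize("Tokenization breaks text into tokens, doesn't it?")
print(tokens)
```

The result is a plain Python list of strings, which is the form most downstream steps, such as counting or tagging, expect as input.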

How to create a Naive Bayes text classification model using scikit-learn

Naive Bayes classifiers are commonly used for machine learning text classification problems, such as predicting the sentiment of a tweet, identifying the language of a piece of text, or categorising...

How to use CountVectorizer for n-gram analysis

CountVectorizer is a scikit-learn class that uses count vectorization to convert a collection of text documents to a matrix of token counts. Given a corpus of text documents, such as...
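To make "a matrix of token counts" concrete, here is a minimal standard-library sketch of count vectorization over n-grams. It does not use scikit-learn and does not reproduce CountVectorizer's own tokenization or sparse output: the `count_matrix` function, its regex, and the sample corpus are all assumptions made for this example.

```python
import re
from collections import Counter


def count_matrix(documents, n=1):
    """Return (vocabulary, matrix) of n-gram counts per document.

    Hypothetical helper for illustration only; a simplified stand-in
    for count vectorization, not scikit-learn's implementation.
    """
    per_doc = []
    for doc in documents:
        # Naive tokenization: lowercase runs of letters or digits.
        tokens = re.findall(r"[a-z0-9]+", doc.lower())
        # Slide a window of n tokens across the document.
        grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        per_doc.append(Counter(grams))
    # One column per distinct n-gram, in alphabetical order.
    vocabulary = sorted(set().union(*per_doc))
    # One row per document; missing n-grams count as zero.
    matrix = [[counts[term] for term in vocabulary] for counts in per_doc]
    return vocabulary, matrix


corpus = ["the cat sat on the mat", "the cat ate the fish"]
vocabulary, matrix = count_matrix(corpus, n=2)
for term, column in zip(vocabulary, zip(*matrix)):
    print(term, column)
```

Each row of `matrix` corresponds to one document and each column to one n-gram in `vocabulary`, so a bigram such as "the cat" that appears in both sample documents has a non-zero count in both rows.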

How to create content recommendations using TF-IDF

After work, when I’m not learning about data science, practising data science, or writing about data science, I like to browse classic car auction sites looking for cars I can’t...

How to classify customer support tickets using Naive Bayes

In ecommerce, customer service staff are among the busiest people in the organisation, handling hundreds of tasks every day, often simultaneously. However, CS managers can get so bogged down...

How to auto-generate meta descriptions with EcommerceTools

Meta descriptions are strings of text added to the head of an HTML document to describe its content to search engines and their users, and are of critical importance...

How to machine translate product descriptions

Whether you’re analysing content written in other languages using Natural Language Processing, or you want to assist your content team by translating their writing into other languages, machine translation software...

How to classify customer service emails with Bart MNLI

Zero-shot learning, or ZSL, is a machine learning process commonly used for Natural Language Processing that allows you to generate predictions on unseen data without the need to train a...

How to auto-generate product summaries using deep learning

Several years ago, in one of my first Ecommerce Director roles, I worked with the ex-Myprotein founder to launch sports nutrition brand GoNutrition. As a “bootstrapped” startup, we were low...

How to assess product copy using EQA models

In ecommerce, writing good product copy is both an art and a science. Not only does product copy need to be written in the correct tone and style for your...

How to create a product matching model using XGBoost

Product matching, or data matching, is a computational technique employing Natural Language Processing and machine learning that aims to identify identical products being sold on different websites, where product names...

How to create a Naive Bayes product classification model

Assigning products to the right categories is crucial to allowing customers to find what they’re looking for, so product classification models are commonly used by online marketplaces to ensure that...

How to preprocess text for NLP in four easy steps

There’s often a lot of repetition in data science projects. In tasks that utilise Natural Language Processing (or NLP), for example, you’ll always need to preprocess your text to...

How to detect sarcasm using machine learning

I love sarcasm, but unfortunately I have a shaky ability to detect it in the voices of others, and an aptitude for mistaking serious comments for sarcasm and then inappropriately...

How to detect fake news with machine learning

Long before Donald Trump erroneously applied it to mean “news that he didn’t agree with”, the term “fake news” referred to disinformation and misleading editorial content. In recent years, it’s...

How to speed up the NLP text annotation process

When you’re building a Natural Language Processing model, the text annotation process is the most laborious and the most expensive part for your business. While you can use tools...

How to annotate training data for NLP models using Doccano

Whether you’re performing product attribute extraction, named entity recognition, product matching, product categorisation, review sentiment analysis, or you are sorting and prioritising customer support tickets, NLP models can be extremely...

How to use NLP to identify what drives customer satisfaction

While some people might naively interpret it as negativity, I think one of the best ways you can improve an ecommerce business is to focus on the stuff you’re not...

A quick guide to Product Attribute Extraction models

Product attributes, such as size, weight, wattage, or colour, are critical in ecommerce as they help customers find and select the right product for their needs. However, obtaining, adding, and...

How to use Natural Language Understanding models

Hugging Face Transformers is a collection of state-of-the-art (SOTA) natural language processing models provided by the Hugging Face group. Basically, Hugging Face take the latest models covered in current natural...

How to create a neural network for sentiment analysis

Sentiment analysis, or opinion mining, is a form of emotion AI that uses natural language processing and computational linguistics to analyse text and infer its sentiment. Sentiment analysis has loads...