How to create a fake review detection model

Learn how to build a fake review detection model using TfidfVectorizer and a range of machine learning algorithms for text classification.


Fake reviews seem to be everywhere these days, leaving customers unsure about which products or businesses are actually any good. Whether you’re shopping on Amazon, checking out a restaurant on Tripadvisor, or reading about a potential employer on Glassdoor, there’s always a risk that the reviews you’re reading are fake.

In this project, I’ll explain some background behind fake review detection, look at some of the features that models use to identify fake reviews and opinion spam, and build a basic fake review detection model that uses TfidfVectorizer and a range of machine learning algorithms to identify fake reviews from real ones.

Types of fake review

One of the reasons why fake reviews can be so hard to spot is that they come in various forms. Broadly, there are two main types: human-generated fake reviews and computer-generated fake reviews. However, both types can be positive or negative in sentiment, and can be aimed at increasing or decreasing the overall rating, or at boosting the total number of reviews to help add credibility to the score.

With the rise of more sophisticated AI models, such as GPT-2 and GPT-3, computer-generated fake reviews are also likely to become harder to detect. Previous research on reviews generated with GPT-2 (Salminen et al., 2022) shows that they can be detected with reasonably good accuracy by machine learning models, and that models outperform humans at the task. However, little has been published so far on the detection of GPT-3 generated reviews.

Broadly speaking, the types of fake review you’ll encounter usually fall into one of the four following groups:

  • Computer-generated reviews: AI text generation models, such as GPT-2, GPT-3, and various other transformers, can be used to create fake computer-generated reviews. Data science researchers, such as Salminen et al. (2022), have shown how these AI reviews can be created and detected using machine learning.
  • Human-generated reviews via review farms: Fake reviews can be purchased in bulk from review farms, which advertise their services on Facebook and other sites. He, Hollenbeck, and Proserpio (2022) studied these fake review providers and found that fake reviews were purchased for a wide variety of products sold on Amazon, including those with many reviews and high average ratings. They found that after using fake review services, firms saw their share of one-star reviews increase significantly, and suggested that review manipulation was most popular for low-quality products.
  • Human-generated fake negative reviews: These are reviews written by disgruntled customers, former staff, or competitors who want to drag down a product or business's overall review score by flooding it with malicious negative reviews.
  • Human-generated fake positive reviews: These are rife on all review platforms that allow them, whether they're posted by ecommerce retailers, Amazon marketplace sellers, restaurants, or HR departments trying to bury the reviews of disgruntled former staff who've slated the company on Glassdoor.

How to spot fake reviews

Researchers who’ve examined fake reviews in detail have identified a wide range of potential features that can help humans and models tell a fake review from a real one. The review text itself is generally the most important feature, since fake reviews often use similar language, especially if they’re written by the same person, company, or review farm.

However, there are also a wide range of non-text features that can be used to detect fake reviews. If you’re interested in understanding fake reviews from the other side, check out the paper from Theodoros Lappas at the 2012 International Conference on Application of Natural Language to Information Systems. It’s written from the attacker’s perspective and looks at the various means used to evade detection and make fake reviews look legit.

These are the most commonly used features you’ll see in papers on fake review detection:

  • Review length: The number of words in the review may indicate whether it's real or fake.
  • Sentiment: Fake reviews are often more polarised in their sentiment, being either very positive or very negative.
  • Helpfulness: When review helpfulness is a metric on the review platform, there may be a correlation between fake reviews and lower helpfulness scores.
  • Reviews per user: Since some reviews may be generated by bots, the user who posted the review may be newly registered, having created their fake account for the sole purpose of leaving a fake review. The number of reviews per user can therefore be a useful feature for models.
  • Verified reviews: Some review platforms, such as Trustpilot, don't require you to prove that you've purchased a product or service from a company in order to leave a review, so they're at greater risk of fake reviews. If a review is "verified", it shows that it was made after the retailer requested it following a tracked purchase by the customer.
  • Stealth: According to Lappas (2012), "stealth measures the ability of the review to blend in with the corpus." If a review is written in a completely different style to other reviews it may stand out, so fake reviews may attempt to look like the business's other reviews.
  • Coherence: Sometimes you may spot reviews that give a very low score but include text that praises the product or service. Lappas (2012) calls this "coherence", and says it "evaluates whether the assigned rating is in accordance with the opinions expressed in the review's text."
  • Readability: Several machine learning researchers have identified that readability can be used to detect some fake reviews. For example, Lappas (2012) used Flesch Reading Ease (FRE), which other authors have also incorporated into their models (see the sketch after this list for one way to compute it).
  • Review text: The review text is the single most widely-used feature in fake review detection models. NLP techniques such as TF-IDF and count vectorization are used to encode the text into a Bag of Words, and text classification techniques, such as Naive Bayes, are then used to classify whether the review is fake or not.
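
To make a few of these concrete, here’s a minimal sketch showing how some of these features could be computed in Python. It assumes the textstat package for Flesch Reading Ease and NLTK’s VADER analyser for sentiment; both are illustrative choices rather than the tools used in any particular paper.

import nltk
import textstat  # assumed installed via pip3 install textstat
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download('vader_lexicon')

review = "Love this! Well made, sturdy, and very comfortable. I love it!"

# Review length: the number of words in the review
review_length = len(review.split())

# Sentiment: VADER's compound score runs from -1 (very negative) to 1 (very positive)
sentiment = SentimentIntensityAnalyzer().polarity_scores(review)['compound']

# Readability: Flesch Reading Ease, as used by Lappas (2012)
readability = textstat.flesch_reading_ease(review)

print(review_length, sentiment, readability)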

Fake review detection models

Over the past decade or so, machine learning researchers have created a number of different fake review detection models to try to distinguish fake reviews from real reviews in a range of datasets comprising both human-generated and computer-generated fake reviews.

The vast majority of these opinion spam detection models use text-classification algorithms and NLP preprocessing techniques, such as count vectorization and TF-IDF. However, some also incorporate additional non-text features.

Several techniques have been used to identify this opinion spam, but they generally take one of three approaches:

  1. Models that use solely text-based features, such as the content of the review.
  2. Models that use solely non-text features, such as reading ease, review scores, and the number of prior reviews.
  3. Models that use a combination of text-based and non-text based features.

Previous studies have examined using solely text features (such as the review itself) and text features alongside others (such as the review score). For example, Jindal and Liu (2008), who analysed 10 million Amazon reviews to detect opinion or review spam, achieved an accuracy score of 63% with text alone, which increased to 78% when all features were included.
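
As a rough illustration of the third approach, the sketch below shows how text and non-text features could sit side by side in scikit-learn using a ColumnTransformer. The dataframe and its column names here are hypothetical, and SGDClassifier is just a placeholder classifier; the point is simply that the TF-IDF matrix and the numeric features end up combined in a single feature space.

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

# Hypothetical toy dataframe with one text column and two numeric columns
reviews_df = pd.DataFrame({
    'text': ['love it, works great', 'terrible, broke in a week', 'best purchase ever'],
    'rating': [5.0, 1.0, 5.0],
    'review_length': [4, 5, 3],
    'target': [1, 0, 1],
})

# TF-IDF encodes the text; the numeric features pass through unchanged
features = ColumnTransformer([
    ('text', TfidfVectorizer(), 'text'),  # a single column name, not a list
    ('numeric', 'passthrough', ['rating', 'review_length']),
])

model = Pipeline([('features', features), ('clf', SGDClassifier())])
model.fit(reviews_df[['text', 'rating', 'review_length']], reviews_df['target'])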

Building our model

To keep things simple, we’re going to build a fake review detection model based solely on the review text itself, and we’ll focus on detecting fake reviews generated with GPT-2 in a dataset that contains both GPT-2 computer-generated reviews and human-generated reviews. We’ll be using the dataset from the recent Salminen et al. (2022) paper on this topic.

Install the packages

As we’re using a wide range of machine learning algorithms, there’s a possibility you might need to install some packages from PyPI. I’m using the NVIDIA Data Science Stack Docker container, which includes most of these already, so I only needed to install lightgbm and catboost. Any others you don’t have can be installed via pip.

!pip3 install lightgbm
!pip3 install catboost

Import the packages

Next, import the Python packages below. We’ll be using some NLTK packages for preprocessing our text data, the TfidfVectorizer from scikit-learn to prepare our data using the TF-IDF algorithm, plus a range of different classification algorithms, including XGBoost, LightGBM, CatBoost, and various scikit-learn implementations, such as Multinomial Naive Bayes, Decision Trees, Random Forests, and various others.

import time
import pandas as pd
import numpy as np
import nltk
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.metrics import accuracy_score
from sklearn.metrics import f1_score
from sklearn.metrics import roc_auc_score
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import classification_report
from sklearn.svm import LinearSVC
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import ExtraTreeClassifier
from sklearn.linear_model import RidgeClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.naive_bayes import BernoulliNB
from sklearn.naive_bayes import MultinomialNB
pd.set_option('display.max_colwidth', None)

Load the data

Next you need to find and import a fake reviews dataset with which to train your supervised learning classification model. I’ve used the fake reviews dataset created by Salminen et al. (2022) for their recent paper in the Journal of Retailing and Consumer Services, which used machine learning to detect fake reviews.

The dataset includes computer-generated fake reviews, which have the label CG, and original human-written reviews, which have the label OR. The fake reviews tagged CG were generated using the GPT-2 model, while the OR reviews are genuine human-written Amazon reviews. This dataset, of course, overlooks other types of fake review and is restricted to computer-generated versus human-generated.

df = pd.read_csv('fake reviews dataset.csv', names=['category', 'rating', 'label', 'text'])
df.head()
category rating label text
0 category rating label text_
1 Home_and_Kitchen_5 5.0 CG Love this! Well made, sturdy, and very comfortable. I love it!Very pretty
2 Home_and_Kitchen_5 5.0 CG love it, a great upgrade from the original. I've had mine for a couple of years
3 Home_and_Kitchen_5 5.0 CG This pillow saved my back. I love the look and feel of this pillow.
4 Home_and_Kitchen_5 1.0 CG Missing information on how to use it, but it is a great product for the price! I

Check for data imbalance

To understand the target variable, label, we’ll use value_counts() to return the number of each class present. This shows that the data have already been perfectly balanced and contain equal numbers of the two classes, so there’s no need to use SMOTE or other techniques for handling imbalanced classification problems. Note the stray row with the value "label": because we passed names= to read_csv(), the file’s header row was read in as a row of data, which is also why row 0 above contains the column names.

df['label'].value_counts()
CG       20216
OR       20216
label        1
Name: label, dtype: int64
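
If you’d rather remove that stray header row before modelling, one option is to keep only the rows with a valid label, as in the snippet below. I’ve left it in place for the rest of this walkthrough, so the outputs that follow still include it.

# Optional: drop the header row that read_csv() pulled in as data
df = df[df['label'].isin(['CG', 'OR'])].reset_index(drop=True)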

Prepare the data

Next, we’ll do a quick tidy of the data and strip out any \n newline characters present, converting them to spaces using str.replace(). This will stop words being joined together and give us cleaner text before we proceed to the next steps. We’ll also create a new column called target to hold our target variable, assigning 1 to the CG fake reviews and 0 to the real OR reviews.

df['text'] = df['text'].str.replace('\n', ' ')
df['target'] = np.where(df['label']=='CG', 1, 0)
df['target'].value_counts()
0    20217
1    20216
Name: target, dtype: int64

Create features from punctuation

We’ll be creating a Bag of Words based model, so we first need to do some NLP preprocessing on our text. Punctuation will get stripped out during tokenization in the next step, but it’s sometimes a useful indicator, so we’ll do some string replacement on the text column first to convert common punctuation marks to words that will be retained.

def punctuation_to_features(df, column):
    """Identify punctuation within a column and convert to a text representation.
    
    Args:
        df (object): Pandas dataframe.
        column (string): Name of column containing text. 
        
    Returns:
        df[column]: Original column with punctuation converted to text, 
                    i.e. "Wow!" > "Wow exclamation"
    
    """
    
    # Use str.replace() with regex=False for literal substring replacement;
    # Series.replace() alone only matches whole cell values, so it wouldn't
    # touch punctuation embedded within the text.
    df[column] = df[column].str.replace('!', ' exclamation ', regex=False)
    df[column] = df[column].str.replace('?', ' question ', regex=False)
    df[column] = df[column].str.replace('\'', ' quotation ', regex=False)
    df[column] = df[column].str.replace('\"', ' quotation ', regex=False)
    
    return df[column]
df['text'] = punctuation_to_features(df, 'text')

Tokenize the data

Next we need to take our text column, which is currently stored as a string, and turn it into a Python list of words using a process called tokenization. The NLTK package includes a handy function called word_tokenize() that can be used to perform this task. We’ll run the function on our Pandas column using the Pandas apply() function and then return the list of tokenized data in a new column called tokenized.

nltk.download('punkt');
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!

def tokenize(column):
    """Tokenize a text string and return a list of alphabetic tokens.
    
    Args:
        column: A text value from a Pandas dataframe column (i.e. a row of df['text']).
    
    Returns:
        tokens (list): Tokenized list, i.e. [love, great, upgrade]
    
    """
    
    # Keep only alphabetic tokens, discarding numbers and stray punctuation
    tokens = nltk.word_tokenize(column)
    return [w for w in tokens if w.isalpha()]
df['tokenized'] = df.apply(lambda x: tokenize(x['text']), axis=1)
df.head()
category rating label text target tokenized
0 category rating label text_ 0 []
1 Home_and_Kitchen_5 5.0 CG Love this! Well made, sturdy, and very comfortable. I love it!Very pretty 1 [Love, this, Well, made, sturdy, and, very, comfortable, I, love, it, Very, pretty]
2 Home_and_Kitchen_5 5.0 CG love it, a great upgrade from the original. I've had mine for a couple of years 1 [love, it, a, great, upgrade, from, the, original, I, had, mine, for, a, couple, of, years]
3 Home_and_Kitchen_5 5.0 CG This pillow saved my back. I love the look and feel of this pillow. 1 [This, pillow, saved, my, back, I, love, the, look, and, feel, of, this, pillow]
4 Home_and_Kitchen_5 1.0 CG Missing information on how to use it, but it is a great product for the price! I 1 [Missing, information, on, how, to, use, it, but, it, is, a, great, product, for, the, price, I]

Stopword removal

Next we’ll use an NLP preprocessing technique called stopword removal. Stopword removal, as the name suggests, removes “stop words”: words used so commonly that they’re essentially meaningless to most models. Removing them can improve model speed and, sometimes, accuracy, though rarely by very much.

We’ll use the stopwords library from NLTK, set this to english, and then loop through the words in the tokenized list and return it to a new column called stopwords_removed so we can observe the changes.

nltk.download('stopwords');
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
def remove_stopwords(tokenized_column):
    """Return a list of tokens with English stopwords removed. 
    
    Args:
        tokenized_column: Pandas dataframe column of tokenized data from tokenize().
    
    Returns:
        tokens (list): Tokenized list with stopwords removed.
    
    """
    
    stops = set(stopwords.words("english"))
    return [word for word in tokenized_column if word not in stops]
df['stopwords_removed'] = df.apply(lambda x: remove_stopwords(x['tokenized']), axis=1)
df.head()
category rating label text target tokenized stopwords_removed
0 category rating label text_ 0 [] []
1 Home_and_Kitchen_5 5.0 CG Love this! Well made, sturdy, and very comfortable. I love it!Very pretty 1 [Love, this, Well, made, sturdy, and, very, comfortable, I, love, it, Very, pretty] [Love, Well, made, sturdy, comfortable, I, love, Very, pretty]
2 Home_and_Kitchen_5 5.0 CG love it, a great upgrade from the original. I've had mine for a couple of years 1 [love, it, a, great, upgrade, from, the, original, I, had, mine, for, a, couple, of, years] [love, great, upgrade, original, I, mine, couple, years]
3 Home_and_Kitchen_5 5.0 CG This pillow saved my back. I love the look and feel of this pillow. 1 [This, pillow, saved, my, back, I, love, the, look, and, feel, of, this, pillow] [This, pillow, saved, back, I, love, look, feel, pillow]
4 Home_and_Kitchen_5 1.0 CG Missing information on how to use it, but it is a great product for the price! I 1 [Missing, information, on, how, to, use, it, but, it, is, a, great, product, for, the, price, I] [Missing, information, use, great, product, price, I]

Apply Porter stemming

The final NLP preprocessing step we’ll take is to apply Porter stemming, using NLTK’s PorterStemmer. Stemming is a technique similar to lemmatization that converts each word to its root or stemmed form, so “comfortable” becomes “comfort”, “information” becomes “inform”, and so on. This can often help model performance. At the same time, we’ll also convert the words to lower case and return the data in a column called porter_stemmed.

def apply_stemming(tokenized_column):
    """Return a list of tokens with Porter stemming applied.
    
    Args:
        tokenized_column: Pandas dataframe column of tokenized data with stopwords removed.
    
    Returns:
        tokens (list): Tokenized list with words Porter stemmed and lower cased.
    
    """
    
    stemmer = PorterStemmer() 
    return [stemmer.stem(word).lower() for word in tokenized_column]
df['porter_stemmed'] = df.apply(lambda x: apply_stemming(x['stopwords_removed']), axis=1)
df.head()
category rating label text target tokenized stopwords_removed porter_stemmed
0 category rating label text_ 0 [] [] []
1 Home_and_Kitchen_5 5.0 CG Love this! Well made, sturdy, and very comfortable. I love it!Very pretty 1 [Love, this, Well, made, sturdy, and, very, comfortable, I, love, it, Very, pretty] [Love, Well, made, sturdy, comfortable, I, love, Very, pretty] [love, well, made, sturdi, comfort, i, love, veri, pretti]
2 Home_and_Kitchen_5 5.0 CG love it, a great upgrade from the original. I've had mine for a couple of years 1 [love, it, a, great, upgrade, from, the, original, I, had, mine, for, a, couple, of, years] [love, great, upgrade, original, I, mine, couple, years] [love, great, upgrad, origin, i, mine, coupl, year]
3 Home_and_Kitchen_5 5.0 CG This pillow saved my back. I love the look and feel of this pillow. 1 [This, pillow, saved, my, back, I, love, the, look, and, feel, of, this, pillow] [This, pillow, saved, back, I, love, look, feel, pillow] [thi, pillow, save, back, i, love, look, feel, pillow]
4 Home_and_Kitchen_5 1.0 CG Missing information on how to use it, but it is a great product for the price! I 1 [Missing, information, on, how, to, use, it, but, it, is, a, great, product, for, the, price, I] [Missing, information, use, great, product, price, I] [miss, inform, use, great, product, price, i]

Rejoin words

Finally, we need to take our preprocessed porter_stemmed data and rejoin the words back into a string. To do that we’ll create a function called rejoin_words() that uses join(), then use apply() to run it on each row in the dataframe, returning the data to a new column called all_text.

def rejoin_words(tokenized_column):
    """Rejoin a list of tokens into a single space-separated string."""
    return " ".join(tokenized_column)
df['all_text'] = df.apply(lambda x: rejoin_words(x['porter_stemmed']), axis=1)
df[['all_text']].head()
all_text
0
1 love well made sturdi comfort i love veri pretti
2 love great upgrad origin i mine coupl year
3 thi pillow save back i love look feel pillow
4 miss inform use great product price i

Create training and test data

Now the data has been preprocessed, it’s ready to be split into the training and test datasets we can use in our machine learning models. We’ll only be using one column of data here: the all_text data we preprocessed above. We’ll assign the target column to y, so we’ll be training our model to predict that class value, and split the data in the usual manner using train_test_split().

X = df['all_text']
y = df['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1, shuffle=True)

Run the model selection process

To find the best text classification model for our data we’ll create a model selection process that uses scikit-learn pipelines. First, we’ll create a Python dictionary mapping the name of each classifier to an instance of it, covering a range of different models including XGBClassifier, CatBoostClassifier, RandomForestClassifier, DecisionTreeClassifier, MultinomialNB, and many others.

classifiers = {}
classifiers.update({"XGBClassifier": XGBClassifier(eval_metric='logloss',
                                                   objective='binary:logistic',
                                                   )})
classifiers.update({"CatBoostClassifier": CatBoostClassifier(silent=True)})
classifiers.update({"LinearSVC": LinearSVC()})
classifiers.update({"MultinomialNB": MultinomialNB()})
classifiers.update({"LGBMClassifier": LGBMClassifier()})
classifiers.update({"RandomForestClassifier": RandomForestClassifier()})
classifiers.update({"DecisionTreeClassifier": DecisionTreeClassifier()})
classifiers.update({"ExtraTreeClassifier": ExtraTreeClassifier()})
classifiers.update({"AdaBoostClassifier": AdaBoostClassifier()})
classifiers.update({"KNeighborsClassifier": KNeighborsClassifier()})
classifiers.update({"RidgeClassifier": RidgeClassifier()})
classifiers.update({"SGDClassifier": SGDClassifier()})
classifiers.update({"BaggingClassifier": BaggingClassifier()})
classifiers.update({"BernoulliNB": BernoulliNB()})

Next we’ll define a Pandas dataframe in which we’ll store the results of each model tested. We’ll loop through our dictionary of classifiers, then run a scikit-learn pipeline that uses the TfidfVectorizer to encode the data and then applies the classifier. We’ll use five-fold cross-validation to test each algorithm, return the ROC/AUC score, and save it back to the dataframe. This will take quite a while to run, depending on the speed of your data science workstation.

# DataFrame.append() was removed in pandas 2.0, so collect the results
# in a list and build the dataframe at the end instead
rows = []

for key in classifiers:
    
    start_time = time.time()
    pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("clf", classifiers[key])])
    cv = cross_val_score(pipeline, X, y, cv=5, scoring='roc_auc')

    rows.append({'model': key,
                 'run_time': round((time.time() - start_time) / 60, 2),
                 'roc_auc': cv.mean(),
                 'roc_auc_std': cv.std(),
                 })
    
df_models = pd.DataFrame(rows).sort_values(by='roc_auc', ascending=False)
df_models
model run_time roc_auc roc_auc_std
11 SGDClassifier 0.07 0.925416 0.009072
1 CatBoostClassifier 9.76 0.922656 0.010213
2 LinearSVC 0.08 0.922426 0.012557
10 RidgeClassifier 0.08 0.922291 0.013186
4 LGBMClassifier 0.34 0.918359 0.010686
0 XGBClassifier 0.55 0.915800 0.010972
5 RandomForestClassifier 3.47 0.910889 0.013901
3 MultinomialNB 0.07 0.901766 0.019865
12 BaggingClassifier 7.94 0.854603 0.012566
8 AdaBoostClassifier 0.49 0.844673 0.019730
13 BernoulliNB 0.06 0.828504 0.020328
6 DecisionTreeClassifier 1.05 0.737933 0.009911
9 KNeighborsClassifier 0.62 0.705052 0.034773
7 ExtraTreeClassifier 0.12 0.660250 0.012657

Assess the selected model

Our model selection process above identified the SGDClassifier as the top performer. This is a scikit-learn implementation of a linear classifier trained using Stochastic Gradient Descent (SGD). We ran it on its default settings, so we’ll now fit it to our data and see what scores we get; we may be able to increase performance slightly through hyperparameter tuning. Without cross-validation, we get an accuracy of 87%, which is similar to what the authors of the paper achieved with their model.

bundled_pipeline = Pipeline([("tfidf", TfidfVectorizer()), 
                             ("clf", SGDClassifier())
                            ])
bundled_pipeline.fit(X_train, y_train)
y_pred = bundled_pipeline.predict(X_test)

# Use new variable names so we don't shadow the sklearn metric functions
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
roc_auc = roc_auc_score(y_test, y_pred)

print('Accuracy:', accuracy)
print('Precision:', precision)
print('Recall:', recall)
print('ROC/AUC:', roc_auc)
Accuracy: 0.8694971145919208
Precision: 0.8909774436090225
Recall: 0.8465660009741841
ROC/AUC: 0.8698581135334829
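
We imported classification_report earlier but haven’t used it yet; it gives a handy per-class breakdown of precision, recall, and F1 for the same predictions:

# Per-class breakdown for the fitted pipeline (target 1 = CG, 0 = OR)
print(classification_report(y_test, y_pred, target_names=['OR (real)', 'CG (fake)']))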

Next steps

We got decent accuracy and a pretty good ROC/AUC score from our simple untuned model using solely text-based features. I intentionally skipped some steps that would be worth doing if you need a more accurate model and have more time to invest. Hyperparameter tuning would likely improve model performance a fraction more.

We could also use GridSearchCV to test different data preprocessing techniques. For example, how does CountVectorizer perform versus TfidfVectorizer? You could also look at how to incorporate other dense features into the sparse feature matrix you pass to the model, such as the review score or the Flesch Kincaid Reading Ease score, or perhaps try ensembling.
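
As a starting point, here’s a minimal sketch of what that grid search could look like, reusing the training data from above and comparing the two vectorizers alongside a couple of SGDClassifier settings. The parameter grid here is illustrative rather than tuned.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV

pipeline = Pipeline([("vec", TfidfVectorizer()), ("clf", SGDClassifier())])

param_grid = {
    "vec": [TfidfVectorizer(), CountVectorizer()],  # swap the whole vectorizer step
    "vec__ngram_range": [(1, 1), (1, 2)],           # unigrams vs unigrams + bigrams
    "clf__alpha": [1e-4, 1e-3],                     # regularisation strength
}

grid = GridSearchCV(pipeline, param_grid, cv=5, scoring="roc_auc", n_jobs=-1)
grid.fit(X_train, y_train)

print(grid.best_params_, grid.best_score_)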

Further reading

  • He, S., Hollenbeck, B. and Proserpio, D., 2022. The market for fake reviews. Marketing Science.
  • Jindal, N. and Liu, B., 2008. Opinion spam and analysis. In Proceedings of the 2008 International Conference on Web Search and Data Mining (pp. 219-230).
  • Lappas, T., 2012. Fake reviews: The malicious perspective. In International Conference on Application of Natural Language to Information Systems (pp. 23-34). Springer, Berlin, Heidelberg.
  • Ott, M., Choi, Y., Cardie, C. and Hancock, J.T., 2011. Finding deceptive opinion spam by any stretch of the imagination. arXiv preprint arXiv:1107.4557.
  • Salminen, J., Kandpal, C., Kamel, A.M., Jung, S. and Jansen, B.J., 2022. Creating and detecting fake reviews of online products. Journal of Retailing and Consumer Services, 64, 102771.

Matt Clarke, Tuesday, September 06, 2022

Matt Clarke is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.