How to calculate customer experience metrics in Python

Customer experience metrics and customer satisfaction metrics drive customer retention, so it's vital that you measure them. Here's how it's done in Python.

Picture by Johann Walter Bantz, Unsplash.

Customers are expensive to acquire but generate more and more profit as time goes on. Providing you nurture them, treat them kindly, and apologise and fix any mistakes that occur, they’ll remember you favourably and come back when they need more stuff.

If you treat customers badly, mess up frequently, don’t apologise for your mistakes, and don’t attempt to appease them when things go wrong, they could slate you on the review platforms or social media, and never come back. Your profits then won’t grow anywhere near as fast as they could.

Loads of things happen during the customer experience that could cause customers confusion or require them to seek support. They might not be able to select the right product because they lack the knowledge or information, there might be a technical or usability issue with the site, or they might need to request a return, refund, or exchange.

Any contact should be as quick and easy for the customer as possible, and the company should aim to reduce the probability of other customers needing to make contact for the same issue again in the future.

Since customer satisfaction is directly tied to retention and future customer equity, it’s important that you measure the customer experience and take steps to improve it. Here is a range of powerful metrics you can use to monitor customer satisfaction, along with the code you need to calculate them in Python.

Customer Satisfaction Score (CSAT)

The Customer Satisfaction Score or CSAT is a simple metric that can be used to measure how happy customers are with either your business or a specific product. It’s also a great way to benchmark customer satisfaction within the whole market.

The CSAT score represents the percentage of “positive” responses over the total responses received. There’s no specific guidance on what counts as “positive”, but I usually go with 4 and 5 stars.

You can fetch your star ratings for products and services by using the Feefo Python API, or by scraping Trustpilot reviews or Google reviews.

You can also use Natural Language Processing via sentiment analysis modeling to determine whether product reviews, customer emails, support tickets, chat transcripts, social media messages, or call transcripts were positive or negative and then examine the causes behind the negative scores to help you improve.

def csat(total_responses, positive_responses):
    """Return the Customer Satisfaction or CSAT score for a period.

    Args:
        total_responses (int): Total number of responses received within the period.
        positive_responses (int): Total number of positive responses received within the period.

    Returns:
        Percentage (float) of positive responses received.
    """

    return (positive_responses / total_responses) * 100
print('CSAT:', csat(total_responses=1000, positive_responses=900))
CSAT: 90.0
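
If you're starting from raw star ratings rather than pre-counted totals, the counting step could look something like the sketch below. It assumes ratings are integers from 1 to 5 and that 4 and 5 stars count as positive, as described above. The function name `csat_from_ratings` and its `positive_threshold` parameter are made up for illustration and don't come from any library.

```python
def csat_from_ratings(ratings, positive_threshold=4):
    """Return the CSAT score (percentage) from a list of star ratings.

    Assumes ratings are integers from 1 to 5 and that any rating at or
    above positive_threshold counts as a positive response.
    """
    if not ratings:
        raise ValueError("At least one rating is required to calculate CSAT.")

    # Count the responses that meet the "positive" threshold
    positive_responses = sum(1 for rating in ratings if rating >= positive_threshold)

    return (positive_responses / len(ratings)) * 100


# Illustrative data: six of these eight ratings are 4 or 5 stars
ratings = [5, 4, 3, 5, 1, 4, 5, 4]
print('CSAT:', csat_from_ratings(ratings))  # CSAT: 75.0
```

Making the threshold a parameter means you can see how sensitive the score is to your definition of "positive".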

Net Promoter Score (NPS)

Net Promoter Score or NPS has become the industry standard metric for measuring customer satisfaction in ecommerce. It asks customers a simple question: “How likely is it that you would recommend this company, product, or service to a friend or colleague?” Customers answer with a score from 0 to 10 (some surveys use 1 to 10 instead), depending on how they feel.

Customers who rated the company between 0 and 6 out of 10 are considered detractors, and were unhappy. Customers who rated the company 9 or 10 out of 10 are considered promoters, and were happy. The ones who rated the company a 7 or 8 are known as passives, because they’re a bit indifferent to the service. Passives are neither added nor subtracted in the NPS calculation, but they still count towards the total number of respondents. The Net Promoter Score is the percentage of respondents who are promoters minus the percentage who are detractors.

Over the years, my ecommerce teams have got great results from using NPS. By focusing on the issues causing the detractors they ensure they fix the stuff that annoys customers the most. Then, by focusing on the passives, who don’t technically matter to the score, they identify the areas where small improvements might turn these into promoters.

By using Natural Language Processing to examine customer satisfaction drivers, you can even identify the things that are most likely to be linked with detractors, passives, and promoters, so you can focus your attention on the right things.

def nps(total_promoters, total_detractors, total_respondents):
    """Return the Net Promoter Score (NPS) for a period.

    Args:
        total_promoters (int): Total number of promoters (9 or 10 out of 10) within the period.
        total_detractors (int): Total number of detractors (0 to 6 out of 10) within the period.
        total_respondents (int): Total number of responses within the period.

    Returns:
        NPS score (float) based on the percentage of promoters minus the percentage of detractors.
    """

    return ((total_promoters * 100) / total_respondents) - ((total_detractors * 100) / total_respondents)
print('NPS:', nps(total_promoters=900, 
                  total_detractors=100, 
                  total_respondents=1200))
NPS: 66.66666666666667
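
If you have the raw survey scores rather than counts, a rough way to classify respondents and calculate the score is sketched below. It assumes a 0 to 10 scale with the cut-offs described above (0 to 6 detractors, 7 and 8 passives, 9 and 10 promoters). The name `nps_from_scores` is just an illustrative choice.

```python
def nps_from_scores(scores):
    """Return the Net Promoter Score from a list of 0-10 survey scores.

    Assumes the cut-offs described above: 0-6 are detractors,
    7-8 are passives, and 9-10 are promoters.
    """
    if not scores:
        raise ValueError("At least one score is required to calculate NPS.")

    promoters = sum(1 for score in scores if score >= 9)
    detractors = sum(1 for score in scores if score <= 6)

    # Passives are not counted in the numerator, but they remain
    # in the denominator via len(scores)
    return ((promoters - detractors) / len(scores)) * 100


# Illustrative data: 5 promoters, 2 passives, 1 detractor
scores = [10, 9, 9, 8, 7, 3, 10, 9]
print('NPS:', nps_from_scores(scores))  # NPS: 50.0
```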

Retention rate

Although retention rate is typically regarded as a marketing metric, it’s arguably one that requires team effort from the entire business, including operations and customer support. Ultimately, if something goes wrong with a customer’s experience on the site, it’s up to the CS team to try to keep them happy and prevent them from churning.

CS teams therefore contribute significantly to retention, particularly when it comes to service recovery, which can often turn things around completely and turn someone who received a lacklustre and forgettable experience into a promoter of the brand who not only comes back, but also tells others how great your service is. Retention is simply the inverse of churn, so you can choose either one to measure your performance.

Calculating (or defining) retention is actually much, much harder than you might imagine. In ecommerce settings, it often requires a predictive model to be calculated properly, but you can use repurchase rate as an approximate surrogate for the metric.

def retention_rate(customers_repurchasing_current_period,
                   customers_purchasing_previous_period):
    """Return the retention rate of customers acquired in one period who repurchased in another.

    Args:
        customers_repurchasing_current_period (int): The number of customers acquired in p1, who reordered in p2.
        customers_purchasing_previous_period (int): The number of customers who placed their first order in p1.

    Returns:
        retention_rate (float): Percentage of customers acquired in p1 who repurchased in p2.
    """

    return (customers_repurchasing_current_period / customers_purchasing_previous_period) * 100
print('Retention rate:', retention_rate(customers_repurchasing_current_period=1000,
                                           customers_purchasing_previous_period=4000))
Retention rate: 25.0
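
To get the two counts from order data, you could compare the customers acquired in one period with the customers who ordered in the next. The sketch below assumes you already have a list of customer IDs for each period. The IDs and the function name `repurchase_rate` are hypothetical, and this is the approximate repurchase-rate approach rather than a predictive model.

```python
def repurchase_rate(acquired_p1, purchasers_p2):
    """Return the percentage of customers acquired in p1 who ordered again in p2.

    Args:
        acquired_p1 (iterable): IDs of customers who placed their first order in p1.
        purchasers_p2 (iterable): IDs of customers who placed an order in p2.
    """
    acquired = set(acquired_p1)
    if not acquired:
        raise ValueError("No customers were acquired in the first period.")

    # Set intersection keeps only the p1 customers who also ordered in p2,
    # and counts each customer once however many orders they placed
    retained = acquired & set(purchasers_p2)

    return (len(retained) / len(acquired)) * 100


# Illustrative customer IDs
acquired_p1 = ['C001', 'C002', 'C003', 'C004']
purchasers_p2 = ['C002', 'C005', 'C004', 'C002']
print('Repurchase rate:', repurchase_rate(acquired_p1, purchasers_p2))  # Repurchase rate: 50.0
```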

Ticket to Order Ratio

Ticket to Order Ratio is one of my favourite metrics for customer service teams, since it encourages cross-departmental collaboration, the lack of which can be a huge frustration in companies that operate within silos. Ticket to Order Ratio compares the volume of tickets (i.e. customer support chats, emails, or support portal tickets) to the volume of orders.

A high number suggests the CS team is dealing with lots of tickets. This may be because the site content is missing crucial information, because there are issues with the customer experience which are causing confusion, or because the CS team isn’t operating efficiently, so it takes customers several requests to get a resolution.

A low number implies the site is working well, customers can place orders easily, content is good and covers all the key things customers need to know to place their order, and suggests the CS team is collaborating with the ecommerce team, acting as their eyes and ears, so they can fix issues and boost conversion rate. It’s also a really important one for scalability within CS teams and can focus them on working smarter, so they become less busy!

def ticket_to_order_ratio(total_tickets, total_orders):
    """Return the ratio of tickets to orders, expressed as a percentage.

    Args:
        total_tickets (int): Total chats, emails, or tickets in the period.
        total_orders (int): Total orders in the period.

    Returns:
        Tickets as a percentage of orders (float), i.e. tickets per 100 orders.
    """

    return (total_tickets / total_orders) * 100
print('Ticket to Order Ratio:', round(ticket_to_order_ratio(total_tickets=200, 
                                                             total_orders=3000), 2))
Ticket to Order Ratio: 6.67
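
To see which causes are driving the ratio, you could split it by ticket type. The sketch below assumes each ticket has already been labelled with a type. The labels and the function name `tickets_per_100_orders_by_type` are made up for illustration.

```python
from collections import Counter


def tickets_per_100_orders_by_type(ticket_types, total_orders):
    """Return the number of tickets per 100 orders for each ticket type.

    Args:
        ticket_types (list of str): One label per ticket received in the period.
        total_orders (int): Total orders in the period.
    """
    if total_orders <= 0:
        raise ValueError("total_orders must be greater than zero.")

    counts = Counter(ticket_types)

    # most_common() lists the largest ticket types first
    return {ticket_type: (count * 100) / total_orders
            for ticket_type, count in counts.most_common()}


# Illustrative labels: 8 tickets against 200 orders
ticket_types = ['delivery', 'returns', 'delivery', 'product info',
                'delivery', 'returns', 'delivery', 'product info']
print(tickets_per_100_orders_by_type(ticket_types, total_orders=200))
```

The ticket types with the highest values are usually the best place to start when deciding what to fix on the site.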

Average Tickets to Resolve

One of the most frustrating things for customers is the effort it can take to get an issue resolved, and this is often correlated with churn. Customers, especially those who ordered online, usually want to have their issue handled online via an automated form, ideally without the need to call a number and sit on hold for ages, or wait for a live chat operator to become available.

Once they’ve submitted their ticket, by whatever means, they want a rapid response that doesn’t require any further effort on their part. The Average Tickets to Resolve metric measures this customer effort, and the labour that goes into supporting customers with their issues.

It’s obviously tied to the Ticket to Order Ratio, but can give you more insight into the causes, especially if you can examine the tickets by type (perhaps using NLP). The aim should be to get this number as close to 1 as possible, so any issues customers have are resolved with the minimum amount of back and forth. First Contact Resolution Rate is a very similar metric and measures the percentage of issues resolved in a single contact.

def average_tickets_to_resolve(total_tickets, total_resolutions):
    """Return the average number of tickets required to resolve an issue.

    Args:
        total_tickets (int): Total chats, emails, or tickets in the period.
        total_resolutions (int): Total issues resolved in the period.

    Returns:
        Average number of tickets it takes to resolve an issue.
    """

    return total_tickets / total_resolutions
print('Average Tickets to Resolve:', round(average_tickets_to_resolve(total_tickets=650, 
                                                                      total_resolutions=300), 2))
Average Tickets to Resolve: 2.17
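
First Contact Resolution Rate can be calculated in a similar way if you know how many contacts each resolved issue needed. Here's a minimal sketch, assuming you have a list holding one contact count per resolved issue. The data and the function name `first_contact_resolution_rate` are illustrative only.

```python
def first_contact_resolution_rate(contacts_per_issue):
    """Return the percentage of resolved issues that needed only one contact.

    Args:
        contacts_per_issue (list of int): Number of contacts (chats, emails,
            or tickets) it took to resolve each issue in the period.
    """
    if not contacts_per_issue:
        raise ValueError("At least one resolved issue is required.")

    resolved_first_time = sum(1 for contacts in contacts_per_issue if contacts == 1)

    return (resolved_first_time / len(contacts_per_issue)) * 100


# Illustrative data: five of these eight issues were resolved in one contact
contacts_per_issue = [1, 1, 3, 2, 1, 1, 4, 1]
print('First Contact Resolution Rate:', first_contact_resolution_rate(contacts_per_issue))
print('Average Tickets to Resolve:', sum(contacts_per_issue) / len(contacts_per_issue))
```

With this data the two lines print 62.5 and 1.75. The same list gives you both metrics, so you can track them side by side.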

Average Time to Resolve

Nicely following on from Average Tickets to Resolve is Average Time to Resolve, which looks at the average time it takes to handle a query rather than the number of tickets. Since it’s time-based, it’s harder to calculate than the other metrics. However, many customer service portals or help desks provide it as a standard metric.

To calculate this metric, you’ll likely need to use Pandas to calculate the time to resolve each ticket, then calculate the overall average for the period you want to examine. Crucially, it’s not just about the time it takes to get back to a customer, but the time it takes to actually resolve their query, which is linked to the quality of support provided, the team’s speed of action, and the number of tickets needed to solve the problem.

While some CS teams will complain that it’s not a fair metric because some queries arrive out of hours, heatmap analysis of ticket times often shows that the live chat or other systems are not being manned when customers need them most, so a change in the rota may be needed. It’s also a good one to encourage cross-department collaboration, since the ecommerce and development teams may be able to fix certain issues that cause tickets in the first place.

from datetime import datetime


def time_to_resolve(time_received, time_resolved):
    """Return the time taken to resolve an issue.

    Args:
        time_received (str): When the ticket was received, formatted as 'YYYY-MM-DD HH:MM:SS'.
        time_resolved (str): When the ticket was resolved, formatted as 'YYYY-MM-DD HH:MM:SS'.

    Returns:
        Time taken to resolve the issue in hours (float).
    """

    time_received = datetime.strptime(time_received, "%Y-%m-%d %H:%M:%S")
    time_resolved = datetime.strptime(time_resolved, "%Y-%m-%d %H:%M:%S")

    # total_seconds() includes whole days, whereas .seconds alone would drop them
    hours = ((time_resolved - time_received).total_seconds() / 60) / 60

    return hours
print('Time to Resolve:', time_to_resolve(time_received='2021-04-16 10:01:28', 
                                          time_resolved='2021-04-17 09:04:45'))
Time to Resolve: 23.05472222222222
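
To turn the per-ticket figure into an average for a period, you can loop over all the tickets resolved in that period. The sketch below uses only the standard library rather than Pandas, to keep it self-contained. It assumes each ticket is a pair of timestamp strings in the 'YYYY-MM-DD HH:MM:SS' format. The ticket data and the function name `average_time_to_resolve` are made up for illustration.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def average_time_to_resolve(tickets):
    """Return the average time to resolve, in hours, across a list of tickets.

    Args:
        tickets (list of tuple): (time_received, time_resolved) string pairs.
    """
    if not tickets:
        raise ValueError("At least one resolved ticket is required.")

    total_hours = 0.0
    for time_received, time_resolved in tickets:
        received = datetime.strptime(time_received, TIMESTAMP_FORMAT)
        resolved = datetime.strptime(time_resolved, TIMESTAMP_FORMAT)
        # total_seconds() counts tickets open for more than a day in full
        total_hours += (resolved - received).total_seconds() / 3600

    return total_hours / len(tickets)


# Illustrative tickets taking 2, 48, and 4 hours to resolve
tickets = [
    ('2021-04-16 10:00:00', '2021-04-16 12:00:00'),
    ('2021-04-16 09:00:00', '2021-04-18 09:00:00'),
    ('2021-04-17 14:00:00', '2021-04-17 18:00:00'),
]
print('Average Time to Resolve:', average_time_to_resolve(tickets))  # Average Time to Resolve: 18.0
```

A single slow ticket can pull the mean up a lot, as the 48-hour ticket does here, so it may be worth looking at the median as well.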

Service Level

Although it’s arguably an operations management metric, Service Level is ultimately correlated with customer satisfaction, so is often reported alongside customer satisfaction metrics, even if it is within the sole control of your warehouse and procurement teams.

Service Level measures the percentage of orders taken during the period that could actually be shipped. Nothing annoys customers more than visiting a website, seeing that a product is advertised as in stock, placing an order, and then receiving a notification from the company that it wasn’t in stock.

A low service level (below 95% is considered poor) costs companies money. The warehouse team need to liaise with customer services to make contact with the customer, calls or emails need to be made, customers may need to be placated with free gifts or free delivery, and the accounts team may need to provide a refund.

def service_level(orders_received, orders_delivered):
    """Return the inventory management service level metric, based on the percentage of received orders delivered.

    Args:
        orders_received (int): Orders received within the period.
        orders_delivered (int): Orders successfully delivered within the period.

    Returns:
        Percentage (float) of orders received that were delivered within the period.
    """

    return (orders_delivered / orders_received) * 100
print('Service level:', service_level(orders_received=2000, 
                                      orders_delivered=1900))
Service level: 95.0

Matt Clarke, Saturday, March 13, 2021

Matt Clarke is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.