How to calculate marketing metrics in Python

Learn how to calculate marketing metrics such as CPM, CPC, conversion rate, ROMI, ROI, ROAS, CPO, CPA, and the Lin-Rodnitzky Ratio using Python.

Marketers can be just as obsessive about data as data scientists, so there is an abundance of well-researched marketing metrics available for analysing marketing performance. Most of the commonly used marketing metrics are actually very easy to calculate, and there are only a handful for which you’ll really require sophisticated models in order to generate accurate results.

The downside with marketing metrics is that there are loads of initialisms, such as CPA, CPO, CPR, CPD, CPT, and many of them have been used more than once to refer to different metrics. Here’s a quick guide to some of the most useful marketing metrics to help you understand them as a data scientist, plus the Python code you need to implement them in your work.

Cost per Mille (CPM)

Cost per Mille (CPM) is effectively the cost per thousand (since “mille” is Latin for “one thousand”) and is a metric most commonly used for benchmarking advertising costs, whether online “display” advertising, ads on TV, on the radio, or in magazines or other formats.

This advertising industry standard metric allows comparisons of the cost of reaching customers across different advertising channels or formats. Many advertisers intentionally obfuscate their advertising costs when providing their “rate card” showing advertising costs, so converting these to CPM is a common task when assessing whether to advertise in any given place.

While it is equivalent to cost per thousand, I’d recommend sticking with convention rather than calling it CPT, since another metric - Cost per Transaction - shares the same initialism.

def cpm(total_cost, total_recipients):
    """Return the CPM (or Cost per Mille) based on the marketing cost per 1000 customers.

    Args:
        total_cost (float): Total cost of marketing.
        total_recipients (int): Total number of marketing recipients.

    Returns:
        cpm (float) as total cost of marketing per 1000 customers.
    """

    return (total_cost / total_recipients) * 1000
print('CPM:', cpm(total_cost=14000, total_recipients=20000))
CPM: 700.0
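
To illustrate the rate card comparison mentioned above, here's a minimal sketch that converts a few quotes to CPM so they can be compared like for like. The channel names, costs, and audience sizes are made up purely for illustration.

```python
def cpm(total_cost, total_recipients):
    """Return the cost per thousand recipients."""
    return (total_cost / total_recipients) * 1000


# Hypothetical rate card quotes: (channel, total cost, audience reached)
rate_cards = [
    ('Magazine full page', 4500, 60000),
    ('Local radio spot', 1200, 150000),
    ('Display banner', 800, 400000),
]

# Convert every quote to CPM, then sort so the cheapest reach comes first
cpm_by_channel = sorted(
    ((channel, cpm(cost, audience)) for channel, cost, audience in rate_cards),
    key=lambda row: row[1],
)

for channel, value in cpm_by_channel:
    print(f'{channel}: {value:.2f}')
```

A low CPM only tells you the reach is cheap, not that the audience is any good, so treat it as a starting point for the comparison.
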
Cost per Click (CPC)

While online display advertising uses CPM to show the price charged to serve your ad a thousand times, CPC refers to the Cost per Click. With Google Ads, Bing Ads, and Facebook Ads, you pay per click (hence the name “PPC advertising”), with clicks costing anything from less than a pound to many pounds, depending on a range of factors.

The advertising platforms will report the average cost you’re paying per click for each keyword, ad group, and ad campaign. However, you’ll often need to calculate this yourself if analysing time series data or drilling down into campaign data. It’s dead easy to calculate - just divide the cost of the ads by the total number of clicks.

def cpc(total_cost, total_clicks):
    """Return the CPC (Cost per Click).

    Args:
        total_cost (float): Total cost of marketing.
        total_clicks (int): Total number of clicks.

    Returns:
        cpc (float) as total cost per click
    """

    return total_cost / total_clicks
print('CPC:', cpc(total_cost=1500, total_clicks=1000))
CPC: 1.5
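
For the time series case mentioned above, the same division can be applied to a whole Pandas column at once. This is a rough sketch using an invented DataFrame; the column names `date`, `cost`, and `clicks` are assumptions, so swap in whatever your export uses.

```python
import pandas as pd

# Hypothetical daily campaign data
df = pd.DataFrame({
    'date': pd.to_datetime(['2021-03-01', '2021-03-02',
                            '2021-03-03', '2021-03-04']),
    'cost': [120.0, 95.0, 0.0, 150.0],
    'clicks': [100, 50, 0, 120],
})

# Divide cost by clicks row by row. Days with zero clicks are masked to NaN
# first, so they show up as missing values rather than as infinity.
df['cpc'] = df['cost'] / df['clicks'].where(df['clicks'] > 0)

print(df[['date', 'cpc']])
```

Masking the zero-click days is a design choice; you may prefer to drop those rows or fill them with zero, depending on how the figures will be charted.
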
Conversion Rate (CR)

Conversion Rate (CR) is used everywhere in marketing and ecommerce to measure a whole range of things. Stating the obvious, it simply represents the number of customers who “converted” as a percentage of the total number in the population. “Converted” can be anything.

It’s used both as a metric to examine overall ecommerce performance (i.e. total_transactions / total_sessions), and conversion at product level, campaign level, and keyword level. The marketer’s aim, of course, is to optimise their advertising and the site so that a higher percentage of customers convert and the rate goes up.

def conversion_rate(total_conversions, total_actions):
    """Return the conversion rate (CR) for an action.

    Args:
        total_conversions (int): Total number of conversions.
        total_actions (int): Total number of actions.

    Returns:
        conversion rate (float) percentage
    """

    return (total_conversions / total_actions) * 100
print('Conversion rate:', conversion_rate(total_conversions=10, total_actions=1000))
Conversion rate: 1.0
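
Since conversion rate is often calculated at campaign or keyword level, here's a small sketch that applies it to several campaigns at once. The campaign names and figures are invented, and returning `None` for a campaign with no sessions is my own assumption about how you'd want to handle a division by zero.

```python
def conversion_rate(total_conversions, total_actions):
    """Return the conversion rate as a percentage, or None if there were no actions."""
    if total_actions == 0:
        return None
    return (total_conversions / total_actions) * 100


# Hypothetical campaigns: name -> (sessions, transactions)
campaigns = {
    'brand': (2000, 90),
    'generic': (5000, 60),
    'new_campaign': (0, 0),
}

rates = {
    name: conversion_rate(transactions, sessions)
    for name, (sessions, transactions) in campaigns.items()
}

for name, rate in rates.items():
    print(name, 'no data' if rate is None else f'{rate:.2f}%')
```
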
Return on Marketing Investment (ROMI)

Related to conversion rate is Return on Marketing Investment or ROMI. This metric aims to examine the percentage return provided on the amount spent on advertising, and is calculated by subtracting total_marketing_costs from total_revenue for a period, dividing the result by total_marketing_costs, and multiplying by 100.

For example, if you spent £1000 on marketing and generated £2000 in revenue, your ROMI would be 100%. If you spent £1000 on marketing and generated £3000, your ROMI would be 200%. Importantly, ROMI only looks at marketing costs, so doesn’t usually include the cost of picking, packing, or shipping each order. Therefore, the ROMI needs to be high to be profitable.

ROMI is what I call a “supporting metric” and you shouldn’t use it to measure overall performance. Optimising solely for ROMI can reduce the profit generated. A 1000% ROMI on low revenue still equates to bugger all, while a 600% ROMI on a metric shit tonne is a lot…

It’s handy for assessing which ads are burning profit and which are not, but it shouldn’t be used on its own, and especially not for monitoring the overall performance of your advertising.

def romi(total_revenue, total_marketing_costs):
    """Return the Return on Marketing Investment (ROMI).

    Args:
        total_revenue (float): Total revenue generated.
        total_marketing_costs (float): Total marketing costs

    Returns:
        Return on Marketing Investment (ROMI) as a percentage (float).
    """

    return ((total_revenue - total_marketing_costs) / total_marketing_costs) * 100
print('ROMI:', romi(total_revenue=2000, total_marketing_costs=1000))
ROMI: 100.0
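
To put numbers on the “supporting metric” point above, this sketch compares two made-up ads. The revenue and cost figures are purely illustrative, and the absolute return here is simply revenue minus marketing cost, ignoring any other costs.

```python
def romi(total_revenue, total_marketing_costs):
    """Return the Return on Marketing Investment as a percentage."""
    return ((total_revenue - total_marketing_costs) / total_marketing_costs) * 100


# Hypothetical ads: name -> (revenue, marketing cost)
ads = {
    'small_ad': (1100, 100),
    'large_ad': (70000, 10000),
}

summary = {
    name: {'romi': romi(revenue, cost), 'return': revenue - cost}
    for name, (revenue, cost) in ads.items()
}

for name, values in summary.items():
    print(f"{name}: ROMI {values['romi']:.0f}%, return {values['return']}")
```

The small ad has the better ROMI (1000% against 600%), but the large ad brings in sixty times more money after marketing costs, which is why ROMI shouldn't be read on its own.
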
Return on Investment (ROI)

Return on Investment or ROI is commonly used, but is still calculated in various ways, depending on the cost data available. Instead of looking solely at the return on the marketing investment, as ROMI does, ROI also includes the additional costs that are incurred when handling an order, such as the picking, packing, and delivery cost, and the cost of the goods themselves.

Few operations managers are able to provide detailed breakdowns of these costs on a per order basis, so it’s common to use an average cost (say, £2 per order) which is designed to cover the operational costs incurred. Some businesses don’t do it this way, and instead embed those costs within the product price, so you’ll need to check how it works within your business.

Taxes are also often included, so you get a percentage figure back which tells you the “actual” (or close to it) return that you achieved from your campaign. Obviously, this one’s a critical metric, and one for which the responsibility for improvement is shared across multiple departments, since it’s affected by the advertising quality, advertising costs, site performance, and operational costs.

def roi(total_revenue, total_marketing_costs, total_other_costs):
    """Return the Return on Investment (ROI).

    Args:
        total_revenue (float): Total revenue generated.
        total_marketing_costs (float): Total marketing costs
        total_other_costs (float): Total other costs

    Returns:
        Return on Investment (ROI) as a percentage (float).
    """

    total_costs = total_marketing_costs + total_other_costs
    return ((total_revenue - total_costs) / total_costs) * 100
print('ROI:', roi(total_revenue=2000, total_marketing_costs=1000, total_other_costs=1000))
ROI: 0.0
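
If you only have an average operational cost per order, as described above, one way to build `total_other_costs` is sketched below. The £2 handling cost comes from the example in the text; the order count, cost of goods, and marketing spend are invented for illustration.

```python
def roi(total_revenue, total_marketing_costs, total_other_costs):
    """Return the Return on Investment as a percentage."""
    total_costs = total_marketing_costs + total_other_costs
    return ((total_revenue - total_costs) / total_costs) * 100


total_orders = 100
avg_handling_cost_per_order = 2.0  # assumed flat pick, pack, and ship cost
cost_of_goods = 900.0              # hypothetical cost of the goods sold

# Other costs = per-order handling cost for every order, plus cost of goods
total_other_costs = (total_orders * avg_handling_cost_per_order) + cost_of_goods

result = roi(total_revenue=2000,
             total_marketing_costs=500,
             total_other_costs=total_other_costs)
print('ROI:', result)
```
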
Return on Advertising Spend (ROAS)

Return on Advertising Spend (or ROAS) is another commonly used measure for examining the return that advertising or marketing is generating for a business. While it’s arguably similar to ROMI, it’s often preferred by stakeholders in reports, because it equates to the revenue returned for each pound spent on advertising.

Like ROMI, ROAS doesn’t include the additional costs incurred, including the cost of the goods themselves, so the number needs to be set fairly high in order to be profitable for the business. Analysis will be required to determine the optimal ROAS target, since like ROMI, optimising for ROAS can either burn money or limit growth.

The strategy used will ultimately depend on whether the business wants to make a profit on the first order, or plans to try to recover acquisition costs from retention and CLV.

def roas(total_revenue, total_marketing_costs):
    """Return the Return on Advertising Spend or ROAS.

    Args:
        total_revenue (float): Total revenue generated.
        total_marketing_costs (float): Total marketing costs

    Returns:
        Return on Advertising Spend or ROAS (float).
    """

    return total_revenue / total_marketing_costs
print('ROAS:', roas(total_revenue=2000, total_marketing_costs=1000))
ROAS: 2.0
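
Because ROAS and ROMI are built from the same two inputs, you can convert one into the other. From the formulas used in this article, ROMI = (ROAS - 1) * 100. The helper name `romi_from_roas` below is my own, added only to show the relationship.

```python
def roas(total_revenue, total_marketing_costs):
    """Return the revenue generated per unit of marketing spend."""
    return total_revenue / total_marketing_costs


def romi_from_roas(roas_value):
    """Convert a ROAS figure into the equivalent ROMI percentage."""
    # (revenue - cost) / cost is the same as (revenue / cost) - 1
    return (roas_value - 1) * 100


for revenue in (2000, 3000):
    value = roas(total_revenue=revenue, total_marketing_costs=1000)
    print(f'ROAS {value} is a ROMI of {romi_from_roas(value)}%')
```

So a ROAS of 1.0 means you only got your advertising spend back in revenue, which is a ROMI of 0%.
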
Cost per Order (CPO)

Cost per Order (CPO), sometimes called Cost per Transaction (CPT), measures the cost of generating each transaction and is simply the total cost divided by the total transactions received. CPO can be used on the business as a whole, often incorporating advertising costs, delivery costs, and internal cross charge costs, or on specific advertising channels.

There are numerous derivatives of this metric that you may see used, or wish to use yourself. Cost per Sale (CPS) typically covers the costs spent on generating a sale via marketing or advertising, and excludes operating costs and taxes. Similarly, Cost per Lead (CPL) shows the marketing costs that went into getting each lead, whether it’s from filling in a form or making a phone call to the sales team.

Cost per Registration (CPR) shows the cost of generating each site registration, and CPD shows the Cost per Download. You can devise your own for whatever metric you wish to examine.

def cpo(total_cost, total_transactions):
    """Return the CPO (Cost per Order).

    Args:
        total_cost (float): Total cost of marketing.
        total_transactions (int): Total number of transactions.

    Returns:
        cpo (float) as total cost per transaction
    """

    return total_cost / total_transactions
print('CPO:', cpo(total_cost=14000, total_transactions=1000))
CPO: 14.0
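
Since CPS, CPL, CPR, and CPD all share the same arithmetic, a single generic helper can cover the lot. This is just a sketch: `cost_per` is a name I've made up, and the event counts are illustrative.

```python
def cost_per(total_cost, total_events):
    """Return the cost per event, e.g. per sale, lead, registration, or download."""
    if total_events == 0:
        raise ValueError('total_events must be greater than zero')
    return total_cost / total_events


marketing_cost = 5000

# Hypothetical event counts for the same period
events = {'CPS': 250, 'CPL': 400, 'CPR': 1000, 'CPD': 2000}

results = {metric: cost_per(marketing_cost, count)
           for metric, count in events.items()}

for metric, value in results.items():
    print(f'{metric}: {value}')
```
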
Cost per Acquisition (CPA)

Cost per Acquisition or CPA (not to be confused with the other CPA - Cost per Action) is similar to Cost per Sale (CPS), but instead of looking at the marketing costs that went into obtaining each sale, it examines the marketing cost per new customer acquisition.

New customer acquisitions in ecommerce aren’t always clear cut, particularly in B2B ecommerce, where multiple staff at one company may share an account. You’ll likely want to identify which order number each customer is placing in order to make this easier to calculate. You can do that in Pandas using the code below.

def get_cumulative_count(df, group, count_column, sort_column):
    """Get the cumulative count of a column based on a GroupBy.

    Args:
        df (DataFrame): Pandas DataFrame.
        group (str): Column to group by.
        count_column (str): Column to count.
        sort_column (str): Column to sort by.

    Returns:
        Series containing the cumulative count, starting at 0 for the first row in each group.

    Usage:
        df['running_total'] = get_cumulative_count(df, 'customer_id', 'order_id', 'date_created')
    """

    # Sort first so the count follows the order in which rows were created
    df = df.sort_values(by=sort_column, ascending=True)
    # cumcount() numbers rows from 0, so a customer's first order gets 0
    return df.groupby([group])[count_column].cumcount()
df_orders['order_number'] = get_cumulative_count(df_orders, 
                                                 'customer_id',
                                                 'order_id', 
                                                 'date_created')
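
To see what that produces, here's a self-contained sketch that runs the function on a tiny, invented `df_orders`. The `is_first_order` column is my own addition, relying on the fact that `cumcount()` starts at 0.

```python
import pandas as pd


def get_cumulative_count(df, group, count_column, sort_column):
    """Return a zero-based running count of rows within each group."""
    df = df.sort_values(by=sort_column, ascending=True)
    return df.groupby([group])[count_column].cumcount()


# Hypothetical orders, deliberately not in date order
df_orders = pd.DataFrame({
    'order_id': [101, 102, 103, 104, 105],
    'customer_id': ['A', 'B', 'A', 'A', 'B'],
    'date_created': pd.to_datetime(['2021-01-05', '2021-01-03', '2021-01-01',
                                    '2021-02-10', '2021-02-01']),
})

# The result is aligned back to the original rows via the DataFrame index
df_orders['order_number'] = get_cumulative_count(
    df_orders, 'customer_id', 'order_id', 'date_created')

# An order_number of 0 marks the customer's first order, i.e. an acquisition
df_orders['is_first_order'] = df_orders['order_number'] == 0

print(df_orders)
```
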

To calculate Cost per Acquisition you would normally take all the marketing costs for a channel or business and then divide them by the number of customers acquired. Obviously, this is slightly flawed, because it also includes the costs incurred in serving returning customers, but it should give you an approximation.

def cpa(total_cost, total_acquisitions):
    """Return the CPA (Cost per Acquisition).

    Args:
        total_cost (float): Total cost of marketing.
        total_acquisitions (int): Total number of acquisitions.

    Returns:
        cpa (float) as total cost per acquisition
    """

    return total_cost / total_acquisitions
print('CPA:', cpa(total_cost=1500, total_acquisitions=10))
CPA: 150.0
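
The fiddly part of CPA is usually counting the acquisitions. One possible approach, sketched below on invented data, is to find each customer's first order date and count how many fall inside the period you're reporting on. The column names, the February 2021 window, and the £1500 spend are all assumptions for the example.

```python
import pandas as pd


def cpa(total_cost, total_acquisitions):
    """Return the cost per new customer acquired."""
    return total_cost / total_acquisitions


# Hypothetical order history
df_orders = pd.DataFrame({
    'customer_id': ['A', 'A', 'B', 'C', 'C', 'D'],
    'date_created': pd.to_datetime(['2021-01-10', '2021-02-04', '2021-02-11',
                                    '2021-02-15', '2021-02-20', '2021-02-27']),
})

# Date of each customer's first ever order
first_orders = df_orders.groupby('customer_id')['date_created'].min()

# Count the customers whose first order fell within February 2021
start, end = pd.Timestamp('2021-02-01'), pd.Timestamp('2021-03-01')
new_customers = int(((first_orders >= start) & (first_orders < end)).sum())

february_cpa = cpa(total_cost=1500, total_acquisitions=new_customers)
print('New customers:', new_customers)
print('CPA:', february_cpa)
```

Customer A ordered in February but was acquired in January, so they are excluded from the count.
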
Lin-Rodnitzky Ratio

The Lin-Rodnitzky Ratio is perhaps less widely used than the rest. It’s a specific metric designed for assessing the management style (or deliberate management strategy) of paid search advertising accounts. As explained above, different marketing strategies are used to generate sales, and marketers will likely need to adjust ROAS and ROMI targets over time to continue growing at the preferred rate. The Lin-Rodnitzky Ratio is designed to show that.

It is calculated by obtaining the average cost per conversion for all paid search advertising queries (via Google Ads, Bing Ads, or Facebook Ads), and then dividing it by the average cost per conversion for only those queries that generated at least one conversion.

It’s supposed to show whether the account is being managed conservatively, aggressively, or is being totally mismanaged. Conservative paid search account management comes when the account gets its traffic mostly from the keywords that generate good conversion rates (such as the popular products, and the brand terms). However, this means testing is reduced, and the account may fail to generate sales from other keywords.

Aggressive paid search account management means the company is spending loads on non-converting keywords, and needs to rein things in. Mismanagement occurs when the Lin-Rodnitzky Ratio is very high. I’ve not used it on enough accounts to say whether it’s completely reliable, but it’s becoming a popular one with those who audit PPC accounts.

def lin_rodnitzky_ratio(avg_cost_per_conversion_all_queries,
                        avg_cost_per_conversion_queries_with_one_conversion_or_more):
    """Return the Lin-Rodnitzky Ratio describing the quality of paid search
    account management.

    Args:
        avg_cost_per_conversion_all_queries (float): Average cost per conversion
         on the whole PPC account.
        avg_cost_per_conversion_queries_with_one_conversion_or_more (float): Average
         cost per conversion for only those queries with one or more conversions.

    Returns:
        Lin-Rodnitzky Ratio (float).

        1.0 to 1.5 - Account is too conservatively managed.
        1.5 to 2.0 - Account is well-managed.
        2.0 to 2.5 - Account is too aggressively managed.
        2.5 or more - Account is being mismanaged.
    """

    return avg_cost_per_conversion_all_queries / \
    avg_cost_per_conversion_queries_with_one_conversion_or_more
print('Lin-Rodnitzky Ratio:', lin_rodnitzky_ratio(
    avg_cost_per_conversion_all_queries=77.61,
    avg_cost_per_conversion_queries_with_one_conversion_or_more=58.72))
Lin-Rodnitzky Ratio: 1.3216961852861036
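
If you're starting from a search query report rather than two pre-calculated averages, the ratio can be derived as sketched below. The query data is invented, the function names are my own, and treating each band's lower bound as inclusive is an assumption, since the ranges in the docstring above overlap at their edges.

```python
def lin_rodnitzky_from_queries(queries):
    """Return the Lin-Rodnitzky Ratio from a list of (cost, conversions) tuples."""
    total_conversions = sum(conversions for _, conversions in queries)
    if total_conversions == 0:
        raise ValueError('The account has no conversions')

    total_cost = sum(cost for cost, _ in queries)
    converting_cost = sum(cost for cost, conversions in queries if conversions > 0)

    # Average cost per conversion across every query in the account
    cpa_all = total_cost / total_conversions
    # Average cost per conversion counting only spend on converting queries
    cpa_converting = converting_cost / total_conversions
    return cpa_all / cpa_converting


def describe_ratio(ratio):
    """Map a ratio to the bands listed in the docstring above."""
    if ratio < 1.5:
        return 'too conservative'
    if ratio < 2.0:
        return 'well-managed'
    if ratio < 2.5:
        return 'too aggressive'
    return 'mismanaged'


# Hypothetical search queries: (cost, conversions)
queries = [(300, 10), (200, 5), (250, 0), (100, 0)]

ratio = lin_rodnitzky_from_queries(queries)
print(f'Lin-Rodnitzky Ratio: {ratio:.2f} ({describe_ratio(ratio)})')
```

Applied to the earlier example, a ratio of about 1.32 would fall into the "too conservative" band.
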

Matt Clarke, Saturday, March 13, 2021

Matt is a Digital Director who uses data science to help in his work. He has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.
