Non-contractual Customer Lifetime Value, estimated probabilistically with the Beta-Geometric/Negative Binomial Distribution (BG/NBD) model

We assume an online service business whose customers continuously purchase our services.

For such a service business we generate the customer transactions ourselves instead of reusing one of the few available public datasets.

We pick a Gamma distribution for the tenure of our 10,000 customers, with shape, scale = 4, 11 (the draws are then doubled in the code below).

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('darkgrid')
sns.set(rc={"figure.figsize":(10, 8)})

rng = np.random.default_rng(2022)

shape, scale = 4, 11
tenure = rng.gamma(shape, scale, 10000) * 2
plt.hist(tenure, density=True, edgecolor='black', bins=50)
plt.title('Histogram - Customer Tenure')
plt.xlabel('Tenure (days)')
plt.ylabel('Density')
plt.show()

png

Using the Beta Geometric/Negative Binomial Distribution (BG/NBD) Model

The BG/NBD model is based on the Pareto/NBD model. Unlike the Pareto/NBD model, the beta-geometric (BG) model assumes that customers die (drop out) only immediately after a transaction, whereas the Pareto/NBD models a dropout that can occur at any point in time.

What else is assumed?

  1. While active, the number of transactions made by a customer follows a Poisson process with transaction rate $ \lambda $, which essentially means the time between transactions is exponentially distributed.

  2. Heterogeneity in $ \lambda $ across customers is gamma distributed.

  3. A customer becomes inactive after any transaction with probability $ p $. The point in time at which a customer dies is therefore distributed (shifted) geometrically across transactions.

  4. Heterogeneity in $ p $ across customers follows a beta distribution.
  5. The transaction rate $ \lambda $ and the dropout probability $ p $ are independent across customers.

    Note: $ \lambda $ and $ p $ are both unobserved.
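Taken together, these assumptions describe a simple generative story, which can be sketched directly (a minimal simulation sketch, not part of the lifetimes API; the parameter values are the CDNOW-style numbers used later):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_customer(r, alpha, a, b, T):
    """Simulate one customer's repeat purchases under the BG/NBD generative story."""
    lam = rng.gamma(r, 1.0 / alpha)      # individual transaction rate ~ Gamma(r, rate=alpha)
    p = rng.beta(a, b)                   # individual dropout probability ~ Beta(a, b)
    t, purchases = 0.0, 0
    while True:
        t += rng.exponential(1.0 / lam)  # exponential waiting time between purchases
        if t > T:                        # observation window of length T ends
            break
        purchases += 1
        if rng.random() < p:             # dies immediately after a transaction
            break
    return purchases

n = simulate_customer(r=0.25, alpha=4.5, a=0.8, b=2.4, T=90)
```

Running this many times over customers with heterogeneous tenures produces exactly the kind of transaction counts the model is later fitted to.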

All customers are assumed to be active in this model, so it makes sense to apply it to a cohort of customers who have just made their first purchase.

Let’s generate some daily transactional data for a cohort of fresh customers:

%%time
from faker import Faker
import pandas as pd
fake = Faker(['it_IT', 'en_UK', 'fr_FR', 'de_DE', 'uk_UA'])
newcols = fake.simple_profile().keys()

# note: appending row by row with .loc is slow; a list comprehension
# (pd.DataFrame(fake.simple_profile() for _ in range(10000))) is much faster
profiles = pd.DataFrame(columns=list(newcols))
for i in range(10000):
    profiles.loc[i] = fake.simple_profile()

profiles['customer_id'] = profiles.index   
first_column = profiles.pop('customer_id')
profiles.insert(0, 'customer_id', first_column)
profiles.head()
CPU times: total: 17.2 s
Wall time: 17.3 s
customer_id username name sex address mail birthdate
0 0 abramstey Nuran Trupp-Lachmann F Ritterallee 5/3\n23598 Staffelstein babett14@hotmail.de 1952-02-10
1 1 pichonalex Marcel Leroy M 69, rue de Ollivier\n25378 GoncalvesBourg francoismartin@club-internet.fr 1966-12-13
2 2 le-gallemmanuelle Monique Roche F 85, avenue de Thierry\n67649 Maillet madeleineleroy@dbmail.com 1945-04-22
3 3 jakob73 Nuray Martin B.Sc. F Holtallee 7\n19875 Ahaus hgeisler@yahoo.de 1909-09-26
4 4 shvachkanazar Тетяна Дергач F набережна Лемківська, 2, селище Марʼяна, 101472 leontii90@email.ua 1936-08-12
# model params rounded from the CDNOW sample in the paper, p. 281
params = dict()

params['r'] = 0.25
params['alpha'] = 4.5

params['a'] = 0.8
params['b'] = 2.4

params
{'r': 0.25, 'alpha': 4.5, 'a': 0.8, 'b': 2.4}
observation_period_end='2021-12-31'
%%time
from lifetimes.generate_data import beta_geometric_nbd_model_transactional_data

transactions = beta_geometric_nbd_model_transactional_data(tenure, params['r'], params['alpha'], params['a'], params['b'], 
                                                  observation_period_end=observation_period_end, freq='D', size=10000)
transactions.shape
CPU times: total: 1min 52s
Wall time: 1min 52s





(29664, 2)
trans_df = transactions.merge(profiles, left_on='customer_id', right_on='customer_id')
trans_df.head()
customer_id date username name sex address mail birthdate
0 0 2021-04-20 21:30:50.384044800 abramstey Nuran Trupp-Lachmann F Ritterallee 5/3\n23598 Staffelstein babett14@hotmail.de 1952-02-10
1 0 2021-08-26 06:44:27.043411199 abramstey Nuran Trupp-Lachmann F Ritterallee 5/3\n23598 Staffelstein babett14@hotmail.de 1952-02-10
2 0 2021-10-21 23:25:51.052166400 abramstey Nuran Trupp-Lachmann F Ritterallee 5/3\n23598 Staffelstein babett14@hotmail.de 1952-02-10
3 1 2021-06-10 07:05:42.827625600 pichonalex Marcel Leroy M 69, rue de Ollivier\n25378 GoncalvesBourg francoismartin@club-internet.fr 1966-12-13
4 1 2021-12-11 21:52:01.636838400 pichonalex Marcel Leroy M 69, rue de Ollivier\n25378 GoncalvesBourg francoismartin@club-internet.fr 1966-12-13

Aggregate summary data, analogous to RFM segmentation (Recency, Frequency, Monetary), from the transactional data just generated:

from lifetimes.utils import summary_data_from_transaction_data

summary = summary_data_from_transaction_data(trans_df, 'customer_id', 'date', observation_period_end=observation_period_end)
summary = pd.concat([profiles, summary], axis=1)
summary
customer_id username name sex address mail birthdate frequency recency T
0 0 abramstey Nuran Trupp-Lachmann F Ritterallee 5/3\n23598 Staffelstein babett14@hotmail.de 1952-02-10 2.0 184.0 255.0
1 1 pichonalex Marcel Leroy M 69, rue de Ollivier\n25378 GoncalvesBourg francoismartin@club-internet.fr 1966-12-13 1.0 184.0 204.0
2 2 le-gallemmanuelle Monique Roche F 85, avenue de Thierry\n67649 Maillet madeleineleroy@dbmail.com 1945-04-22 3.0 98.0 99.0
3 3 jakob73 Nuray Martin B.Sc. F Holtallee 7\n19875 Ahaus hgeisler@yahoo.de 1909-09-26 0.0 0.0 15.0
4 4 shvachkanazar Тетяна Дергач F набережна Лемківська, 2, селище Марʼяна, 101472 leontii90@email.ua 1936-08-12 1.0 45.0 109.0
... ... ... ... ... ... ... ... ... ... ...
9995 9995 iarynahavrylyshyn Олег Рудько M набережна Маркіяна Шашкевича, 517, хутір Данил... khavrylenko@ukr.net 1988-07-07 0.0 0.0 40.0
9996 9996 gisbertriehl Claire Koch-Anders F Zorbachring 7\n70870 Burgdorf tomas93@hotmail.de 2014-08-13 0.0 0.0 89.0
9997 9997 marta63 Остап Ейбоженко M вулиця Бруно Шульца, 915, хутір Ганна, 76375 vdovychenkobohuslav@meta.ua 1976-07-24 0.0 0.0 100.0
9998 9998 opowell Chelsea Poole F Flat 82\nBryan passage\nNorth Luke\nWF1N 0AL rsmart@hotmail.co.uk 1916-10-24 1.0 47.0 101.0
9999 9999 havrylotsymbaliuk пан Устим Цибуленко M вулиця Василя Симоненка, 628, село Леон, 85627 venedykt92@ukr.net 1975-02-06 2.0 44.0 75.0

10000 rows × 10 columns
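For reference, the three fields produced by `summary_data_from_transaction_data` can be recomputed by hand. The sketch below mirrors how lifetimes defines them, in days (a simplified reimplementation assuming daily frequency and a single observation-period end):

```python
import pandas as pd

def rfm_summary(df, id_col, date_col, observation_period_end):
    """Recompute frequency / recency / T the way lifetimes defines them (in days)."""
    end = pd.to_datetime(observation_period_end)
    dates = pd.to_datetime(df[date_col]).dt.floor('D')  # bin timestamps to days first
    g = df.assign(_d=dates).groupby(id_col)['_d']
    first, last, n_days = g.min(), g.max(), g.nunique()
    return pd.DataFrame({
        'frequency': n_days - 1,            # repeat purchase days, not total purchases
        'recency': (last - first).dt.days,  # customer age at the last purchase
        'T': (end - first).dt.days,         # customer age at the end of observation
    })
```

Note that `frequency` counts repeat purchase days (first purchase excluded), which is why one-time buyers show frequency 0 in the table above.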

"""from lifetimes.generate_data import beta_geometric_nbd_model
#lifetimes.generate_data.beta_geometric_nbd_model(T, r, alpha, a, b, size=1)
#Generate artificial data according to the BG/NBD model.

df = beta_geometric_nbd_model(tenure, params['r'], params['alpha'], params['a'], params['b'], size=10000)
"""
"""Parameters:	
T (array_like) – The length of time observing new customers.
alpha, a, b (r,) – Parameters in the model. See [1]_
size (int, optional) – The number of customers to generate
Returns:	
DataFrame – With index as customer_ids and the following columns: ‘frequency’, ‘recency’, ‘T’, ‘lambda’, ‘p’, ‘alive’, ‘customer_id’"""
'Parameters:\t\nT (array_like) – The length of time observing new customers.\nalpha, a, b (r,) – Parameters in the model. See [1]_\nsize (int, optional) – The number of customers to generate\nReturns:\t\nDataFrame – With index as customer_ids and the following columns: ‘frequency’, ‘recency’, ‘T’, ‘lambda’, ‘p’, ‘alive’, ‘customer_id’'
from lifetimes import BetaGeoFitter
bgf = BetaGeoFitter(penalizer_coef=0)
bgf.fit(summary['frequency'], summary['recency'], summary['T'])
bgf.summary
coef se(coef) lower 95% bound upper 95% bound
r 0.289300 0.005818 0.277897 0.300704
alpha 6.476316 0.213714 6.057437 6.895194
a 0.790777 0.053952 0.685031 0.896524
b 2.530743 0.231390 2.077220 2.984267
from lifetimes.plotting import plot_frequency_recency_matrix
sns.set(rc={"figure.figsize":(10, 8)})
sns.set_style('dark')
plot_frequency_recency_matrix(bgf, cmap='plasma')
plt.show()

png

from lifetimes.plotting import plot_probability_alive_matrix

plot_probability_alive_matrix(bgf, cmap='plasma')
plt.show()

png

Customer ranking

Let us identify the Top 5 customers by expected purchases within the next seven days (t = 7), based on their transaction history.

t = 7
ppcolname = 'predicted_purchases_' + str(t)
summary[ppcolname] = bgf.conditional_expected_number_of_purchases_up_to_time(t, summary['frequency'],summary['recency'], summary['T'])
summary.sort_values(by=ppcolname, ascending=False).head(5)
customer_id username name sex address mail birthdate frequency recency T predicted_purchases_14 predicted_purchases_30 predicted_purchases_7 monetary_value_average Expected_cond_average_revenue error_rev
2910 2910 carlypalmer Mrs. Joan Heath M Studio 41t\nAbigail port\nEast Jayne\nPO5X 1DL sally36@yahoo.co.uk 1908-01-15 33.0 55.0 55.0 6.851758 14.839702 3.557546 38.734677 38.622364 -0.112313
3129 3129 jsontag Zeynep Steckel F Biengasse 720\n46808 Ansbach peukertmeike@aol.de 1976-05-28 20.0 30.0 31.0 6.324447 13.947335 3.344909 31.671466 31.566594 -0.104872
1677 1677 bohodarokhrimenko Болеслав Дубас M вулиця Шота Руставелі, 248, місто Лариса, 11399 artemtymchuk@gmail.com 1974-01-16 19.0 34.0 34.0 5.749406 12.524436 3.029490 16.370829 16.441952 0.071123
3149 3149 ujones Kieran Jennings F Flat 3\nRoger port\nKieranton\nSR3 0HP abigailphillips@gmail.com 2012-02-22 34.0 71.0 72.0 5.554099 12.087839 2.862532 7.135654 7.237132 0.101478
7500 7500 elombardi Dott. Melania Toldo F Incrocio Bragaglia 3\nCostanzi sardo, 94137 Tr... cgagliano@tim.it 2016-02-24 23.0 47.0 47.0 5.412396 11.756576 2.821701 12.182490 12.282481 0.099991
t = 30
ppcolname = 'predicted_purchases_' + str(t)
summary[ppcolname] = bgf.conditional_expected_number_of_purchases_up_to_time(t, summary['frequency'],summary['recency'], summary['T'])
sorted_summary = summary.sort_values(by=ppcolname, ascending=False)
sorted_summary.head()
customer_id username name sex address mail birthdate frequency recency T predicted_purchases_14 predicted_purchases_30 predicted_purchases_7 monetary_value_average Expected_cond_average_revenue error_rev
2910 2910 carlypalmer Mrs. Joan Heath M Studio 41t\nAbigail port\nEast Jayne\nPO5X 1DL sally36@yahoo.co.uk 1908-01-15 33.0 55.0 55.0 6.851758 13.594318 3.557546 38.734677 38.622364 -0.112313
3129 3129 jsontag Zeynep Steckel F Biengasse 720\n46808 Ansbach peukertmeike@aol.de 1976-05-28 20.0 30.0 31.0 6.324447 12.149527 3.344909 31.671466 31.566594 -0.104872
3149 3149 ujones Kieran Jennings F Flat 3\nRoger port\nKieranton\nSR3 0HP abigailphillips@gmail.com 2012-02-22 34.0 71.0 72.0 5.554099 11.173048 2.862532 7.135654 7.237132 0.101478
1677 1677 bohodarokhrimenko Болеслав Дубас M вулиця Шота Руставелі, 248, місто Лариса, 11399 artemtymchuk@gmail.com 1974-01-16 19.0 34.0 34.0 5.749406 11.114488 3.029490 16.370829 16.441952 0.071123
7500 7500 elombardi Dott. Melania Toldo F Incrocio Bragaglia 3\nCostanzi sardo, 94137 Tr... cgagliano@tim.it 2016-02-24 23.0 47.0 47.0 5.412396 10.659459 2.821701 12.182490 12.282481 0.099991

Model fit assessment

from lifetimes.plotting import plot_period_transactions
plot_period_transactions(bgf)
plt.show()

png

calibration_period_end='2021-10-10'
from lifetimes.utils import calibration_and_holdout_data

summary_cal_holdout = calibration_and_holdout_data(trans_df, 'customer_id', 'date',
                                        calibration_period_end=calibration_period_end,
                                        observation_period_end=observation_period_end )
summary_cal_holdout.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4998 entries, 0 to 9998
Data columns (total 5 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   frequency_cal      4998 non-null   float64
 1   recency_cal        4998 non-null   float64
 2   T_cal              4998 non-null   float64
 3   frequency_holdout  4998 non-null   float64
 4   duration_holdout   4998 non-null   float64
dtypes: float64(5)
memory usage: 363.3 KB
from lifetimes.plotting import plot_calibration_purchases_vs_holdout_purchases

bgf = BetaGeoFitter(penalizer_coef=0.001)
bgf.fit(summary_cal_holdout['frequency_cal'], summary_cal_holdout['recency_cal'], summary_cal_holdout['T_cal'])
plot_calibration_purchases_vs_holdout_purchases(bgf, summary_cal_holdout)
plt.show()

png

from lifetimes.plotting import plot_cumulative_transactions

bgf.fit(summary_cal_holdout['frequency_cal'], summary_cal_holdout['recency_cal'], summary_cal_holdout['T_cal'])
plot_cumulative_transactions(bgf, trans_df, 'date', 'customer_id', 280, 140);

png

from lifetimes.plotting import plot_incremental_transactions

plot_incremental_transactions(bgf, trans_df, 'date', 'customer_id', 280, 140);

png

customer_X = sorted_summary[200:201]
customer_X
customer_id username name sex address mail birthdate frequency recency T predicted_purchases_14 predicted_purchases_30 predicted_purchases_7 monetary_value_average Expected_cond_average_revenue error_rev
8403 8403 trubinguenther Ing. Hanno Ullmann M Gudegasse 8/4\n87671 Grevenbroich riehlmarlen@yahoo.de 1909-03-31 7.0 44.0 46.0 1.601542 3.179439 0.831359 18.652597 18.769815 0.117218
t = 30  # predict number of purchases in the next t periods

individual = summary.iloc[customer_X.customer_id]
bgf.predict(t, individual['frequency'], individual['recency'], individual['T'])
8403    3.374978
dtype: float64
from lifetimes.plotting import plot_history_alive

days_since_birth = 61 # of this customer X

sp_trans = trans_df.loc[trans_df['customer_id'] == customer_X.customer_id.iloc[0]]
plot_history_alive(bgf, days_since_birth, sp_trans, 'date', title='History of P_alive of customer_X=' + customer_X.username.iloc[0])
plt.show()

png

sp_trans
customer_id date username name sex address mail birthdate
25086 8403 2021-11-15 11:51:07.696598400 trubinguenther Ing. Hanno Ullmann M Gudegasse 8/4\n87671 Grevenbroich riehlmarlen@yahoo.de 1909-03-31
25087 8403 2021-11-16 21:17:22.970515200 trubinguenther Ing. Hanno Ullmann M Gudegasse 8/4\n87671 Grevenbroich riehlmarlen@yahoo.de 1909-03-31
25088 8403 2021-11-24 04:46:07.064832 trubinguenther Ing. Hanno Ullmann M Gudegasse 8/4\n87671 Grevenbroich riehlmarlen@yahoo.de 1909-03-31
25089 8403 2021-12-02 19:17:03.481641600 trubinguenther Ing. Hanno Ullmann M Gudegasse 8/4\n87671 Grevenbroich riehlmarlen@yahoo.de 1909-03-31
25090 8403 2021-12-13 04:24:15.312902400 trubinguenther Ing. Hanno Ullmann M Gudegasse 8/4\n87671 Grevenbroich riehlmarlen@yahoo.de 1909-03-31
25091 8403 2021-12-15 16:12:24.811142400 trubinguenther Ing. Hanno Ullmann M Gudegasse 8/4\n87671 Grevenbroich riehlmarlen@yahoo.de 1909-03-31
25092 8403 2021-12-23 23:41:56.826096 trubinguenther Ing. Hanno Ullmann M Gudegasse 8/4\n87671 Grevenbroich riehlmarlen@yahoo.de 1909-03-31
25093 8403 2021-12-29 12:19:29.010403200 trubinguenther Ing. Hanno Ullmann M Gudegasse 8/4\n87671 Grevenbroich riehlmarlen@yahoo.de 1909-03-31
no_transactions = trans_df.groupby('customer_id').count().sort_values('date', ascending=False)['date']
no_transactions 
customer_id
2019    88
5280    87
985     75
7987    66
7892    57
        ..
1722     1
5653     1
5651     1
5650     1
6404     1
Name: date, Length: 10000, dtype: int64
df3 = summary[summary['frequency'] > 0].copy()  # .copy() avoids SettingWithCopyWarning
df3['prob_alive'] = bgf.conditional_probability_alive(df3['frequency'], df3['recency'], df3['T'])
sns.distplot(df3['prob_alive']);
plt.xlim(0,1)
plt.suptitle('Distribution of Alive Probabilities')
plt.show()

df3['churn'] = ['churned' if p < .1 else 'not churned' for p in df3['prob_alive']]
sns.countplot(df3['churn']);
plt.suptitle('Grouping of alive probabilities')
plt.show()

print('Grouping of alive probabilities')
df3.loc[(df3['prob_alive'] >= .1) & (df3['prob_alive'] < .2), 'churn'] = "high risk"
df3['churn'].value_counts()

png

png

Grouping of alive probabilities





not churned    3841
churned         772
high risk       589
Name: churn, dtype: int64

Modeling monetary transaction values with the Gamma-Gamma model

## monetary value data generation

shape, scale = 3, 3.  # Gamma(3, 3): mean = 9, std = 3*sqrt(3); draws are doubled below
monetary_value_average = rng.gamma(shape, scale, 10000) * 2
plt.hist(monetary_value_average, density=True, edgecolor='black', bins=50)
plt.show()
plt.show()

png

summary['monetary_value_average'] = monetary_value_average

returning_customers_summary = summary[summary['frequency']>0]
returning_customers_summary.head()
customer_id username name sex address mail birthdate frequency recency T predicted_purchases_14 predicted_purchases_30 predicted_purchases_7 monetary_value_average Expected_cond_average_revenue error_rev
0 0 abramstey Nuran Trupp-Lachmann F Ritterallee 5/3\n23598 Staffelstein babett14@hotmail.de 1952-02-10 2.0 184.0 255.0 0.082734 0.174773 0.041632 32.613560 31.565017 -1.048543
1 1 pichonalex Marcel Leroy M 69, rue de Ollivier\n25378 GoncalvesBourg francoismartin@club-internet.fr 1966-12-13 1.0 184.0 204.0 0.062412 0.131735 0.031418 16.727437 17.774840 1.047403
2 2 le-gallemmanuelle Monique Roche F 85, avenue de Thierry\n67649 Maillet madeleineleroy@dbmail.com 1945-04-22 3.0 98.0 99.0 0.357496 0.738755 0.181780 8.692225 9.658282 0.966057
4 4 shvachkanazar Тетяна Дергач F набережна Лемківська, 2, селище Марʼяна, 101472 leontii90@email.ua 1936-08-12 1.0 45.0 109.0 0.080884 0.168804 0.040935 14.807318 16.211255 1.403937
5 5 petro07 Алла Рябець F набережна Дністровська, 843, хутір Ілля, 106205 sviatoslavadurdynets@ukr.net 2002-08-12 1.0 6.0 34.0 0.172044 0.345416 0.088817 25.410872 24.845904 -0.564968

Important assumption for the Gamma-Gamma model: the correlation between monetary value and purchase frequency must be close to zero.

As the check below shows a correlation of about −0.001, this assumption is met and we can train the model and start analysing.

returning_customers_summary[['monetary_value_average', 'frequency']].corr()
monetary_value_average frequency
monetary_value_average 1.00000 -0.00149
frequency -0.00149 1.00000
from lifetimes import GammaGammaFitter

ggf = GammaGammaFitter(penalizer_coef = 0.002)
ggf.fit(returning_customers_summary['frequency'], returning_customers_summary['monetary_value_average'])

<lifetimes.GammaGammaFitter: fitted with 5202 subjects, p: 5.53, q: 2.26, v: 5.10>
ggf.summary
coef se(coef) lower 95% bound upper 95% bound
p 5.532698 0.115701 5.305924 5.759472
q 2.261587 0.044056 2.175237 2.347938
v 5.100490 0.125329 4.854845 5.346135
summary['Expected_cond_average_revenue'] = ggf.conditional_expected_average_profit(summary['frequency'], summary['monetary_value_average'])
summary['Expected_cond_average_revenue'].describe()
count    10000.000000
mean        20.413614
std          7.062209
min          1.499528
25%         16.409391
50%         22.368232
75%         22.368232
max         75.235044
Name: Expected_cond_average_revenue, dtype: float64
# MAPE
from sklearn.metrics import mean_absolute_percentage_error

summary["error_rev"] = summary['Expected_cond_average_revenue'] - summary['monetary_value_average']

mape = mean_absolute_percentage_error(summary['monetary_value_average'], summary['Expected_cond_average_revenue'])  # y_true first, then y_pred
print("MAPE of predicted revenues:", f'{mape:.3f}')

MAPE of predicted revenues: 0.244
summary['Expected_cond_average_revenue'].head(20)
0     31.565017
1     17.774840
2      9.658282
3     22.368232
4     16.211255
5     24.845904
6     22.368232
7     22.368232
8     22.368232
9      8.200775
10    23.078353
11     6.811015
12     9.628732
13    22.368232
14    22.368232
15    22.368232
16    19.063189
17    29.568306
18    34.173798
19    12.974695
Name: Expected_cond_average_revenue, dtype: float64
print("Expected conditional average profit: %s vs. Average profit: %s" % (
    ggf.conditional_expected_average_profit(
        summary['frequency'],
        summary['monetary_value_average']
    ).mean(),
    summary[summary['frequency']>0]['monetary_value_average'].mean()
))
Expected conditional average profit: 20.41361394230678 vs. Average profit: 18.131239279823276

Calculate the Customer Lifetime Value, discounted (DCF) with a monthly discount rate (lifetimes' discount_rate is per month, so 0.006 corresponds to roughly 7.4% annually)

# refit the BG model 
bgf.fit(summary['frequency'], summary['recency'], summary['T'])

# modelling CLV
summary['clv'] = ggf.customer_lifetime_value(
    bgf,
    summary['frequency'],
    summary['recency'],
    summary['T'],
    summary['monetary_value_average'],
    time=12, # horizon in months
    discount_rate=0.006 # monthly discount rate
)
summary['clv'].head(10)
0   52.30111846
1   20.98403156
2   57.38228431
3   47.29754080
4   21.46389456
5   52.60841130
6   14.31632335
7   27.28834943
8   14.72406602
9   15.56113740
Name: clv, dtype: float64
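The discounting that `customer_lifetime_value` applies can be illustrated with a stripped-down DCF sketch. The purchase forecasts and average spend below are made-up illustrative numbers; lifetimes itself derives each month's expected purchases from the fitted BG/NBD model in 30-day steps:

```python
def discounted_clv(expected_purchases_per_month, expected_avg_value,
                   monthly_discount_rate=0.006):
    """Present value of each month's expected revenue, discounted month by month."""
    return sum(
        n * expected_avg_value / (1 + monthly_discount_rate) ** (i + 1)
        for i, n in enumerate(expected_purchases_per_month)
    )

# hypothetical customer: ~0.3 expected purchases per month at ~20 per purchase
clv = discounted_clv([0.3] * 12, 20.0)
```

With a 0.6% monthly rate the discounting is mild over one year; it matters much more for longer horizons or higher rates.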
# describe the distribution
pd.options.display.float_format = '{:.8f}'.format
summary['clv'].describe()

count   10000.00000000
mean       56.73016752
std       142.07256046
min         0.00000003
25%        13.83895220
50%        21.00412658
75%        39.37786598
max      3623.81967049
Name: clv, dtype: float64
ax = sns.distplot(summary['clv'], kde=False, rug=True)
ax.set_yscale('log')
plt.suptitle('Distribution of Customer Lifetime Value vs. logarithmic density')

plt.show()

png

Although the 75th percentile is under 40, a few high-variance customers with four-digit CLVs pull the arithmetic mean well above that number.

It makes sense to segment these different customer types and approach each segment differently. The features engineered in this BG/NBD workflow can be used for such a customer segmentation.
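As a sketch of such a segmentation (assuming scikit-learn is available; the toy frame stands in for the summary DataFrame and its column names), cluster customers on standardized BG/NBD features with k-means:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def segment_customers(frame, features, n_clusters=4):
    """Cluster customers on standardized features; returns an integer label per row."""
    X = StandardScaler().fit_transform(frame[features])  # put features on a common scale
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)

# toy frame standing in for the summary DataFrame built above
rng = np.random.default_rng(1)
toy = pd.DataFrame({'frequency': rng.poisson(2, 200).astype(float),
                    'recency': rng.uniform(0, 100, 200),
                    'clv': rng.gamma(2.0, 20.0, 200)})
labels = segment_customers(toy, ['frequency', 'recency', 'clv'])
```

Inspecting per-cluster means of frequency, recency, and CLV then gives interpretable segments (e.g. high-value loyal vs. lapsed one-time buyers).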

Compare with the original summary df and look at all these new columns:

summary.columns
Index(['customer_id', 'username', 'name', 'sex', 'address', 'mail',
       'birthdate', 'frequency', 'recency', 'T', 'predicted_purchases_14',
       'predicted_purchases_30', 'predicted_purchases_7',
       'monetary_value_average', 'Expected_cond_average_revenue', 'error_rev',
       'clv'],
      dtype='object')

References

[1] Fader, Peter S., Bruce G.S. Hardie, and Ka Lok Lee (2005), “‘Counting Your Customers’ the Easy Way: An Alternative to the Pareto/NBD Model,” Marketing Science, 24 (2), 275-284.