Example 3: Mortality Simulation

We use this example to demonstrate a few additional features of pyprotolinc:

Use of a custom state model
Use of a customized product
(Use of another standard table, still a todo)

A Custom State Model

The following model is in fact part of pyprotolinc (pyprotolinc.models.model_mortality.MortalityStates) but since pyprotolinc is meant to be used (also) as a library it supports the integration of user-provided state models. We start by declaring an IntEnum containing the states.

[1]:

from enum import IntEnum, unique

import numpy as np
from pyprotolinc.results import CfNames
from pyprotolinc.results import ProbabilityVolumeResults
from pyprotolinc.models.state_models import states_model, AbstractStateModel


@states_model
@unique
class MortalityStates2(AbstractStateModel):
    ACTIVE = 0      # the "alive state"
    DEATH = 1
    LAPSED = 2
    MATURED = 3

    @classmethod
    def to_std_outputs(cls):
        return {
            ProbabilityVolumeResults.VOL_ACTIVE: cls.ACTIVE,
            ProbabilityVolumeResults.VOL_DEATH: cls.DEATH,
            ProbabilityVolumeResults.VOL_LAPSED: cls.LAPSED,
            ProbabilityVolumeResults.VOL_MATURED: cls.MATURED,

            ProbabilityVolumeResults.MV_ACTIVE_DEATH: (cls.ACTIVE, cls.DEATH),
            ProbabilityVolumeResults.MV_ACT_LAPSED: (cls.ACTIVE, cls.LAPSED),
            ProbabilityVolumeResults.MV_ACT_MATURED: (cls.ACTIVE, cls.MATURED),
        }

Note that in the last statement above the model is registered with pyprotolinc. The class method to_std_outputs is required to map the states in the current state model to the standard output model. We parametrize which state belongs to which volume vector (=summed head count probability) in the standard output format and which state transition corresponds with the volume movements.

Configuration

Importing the configuration from the current working directory we see that the above state model is selected there:

[2]:

from pyprotolinc.main import get_config_from_file, project_cashflows
run_config = get_config_from_file(config_file='config.yml')
print(run_config.state_model_name)

MortalityStates2

Product Definition

With these tools at hand we can now come to the product definition. Note that the product references the above state model in a class variable.

[3]:

from pyprotolinc.product import AbstractProduct, register
from pyprotolinc.product import calc_term_end_indicator, calc_term_start_indicator, calc_terminal_months, calc_maturity_transition_indicator


@register
class Product_MortalityTerm2(AbstractProduct):
    """ Simple product that pays out on death."""

    STATES_MODEL = MortalityStates2
    PRODUCT_NAMES = ("TERM2", )

    def __init__(self, portfolio):
        self.portfolio = portfolio
        self.length = len(self.portfolio)

        # monthly sum insured (=annuity per year) as an (n, 1)-array
        self.sum_insured_per_month = self.portfolio.sum_insured[:, None] / 12.0

        self.year_last_month, self.last_month = calc_terminal_months(self.portfolio.df_portfolio)

    def get_bom_payments(self, time_axis):
        """ Return the 'conditional payments', i.e. those payments that are due if an
            insured is in the corresponding state at the given time. """

        multiplier_term_end = calc_term_end_indicator(time_axis,
                                                      self.year_last_month,
                                                      self.last_month)
        multiplier_term_start = calc_term_start_indicator(time_axis,
                                                          self.portfolio.policy_inception_yr,
                                                          self.portfolio.policy_inception_month
                                                          )
        multiplier_term = multiplier_term_end * multiplier_term_start

        return {
            self.STATES_MODEL.ACTIVE: [
                (CfNames.PREMIUM,
                 0.0005 * multiplier_term * self.sum_insured_per_month * 12.0
                 )
            ]
        }

    def get_state_transition_payments(self, time_axis):
        # a flat mortality benefit in this product

        multiplier_term_end = calc_term_end_indicator(time_axis,
                                                      self.year_last_month,
                                                      self.last_month)
        multiplier_term_start = calc_term_start_indicator(time_axis,
                                                          self.portfolio.policy_inception_yr,
                                                          self.portfolio.policy_inception_month
                                                          )
        multiplier_term = multiplier_term_end * multiplier_term_start
        return {
            (self.STATES_MODEL.ACTIVE, self.STATES_MODEL.DEATH): [
                (CfNames.DEATH_PAYMENT,
                 -multiplier_term * self.sum_insured_per_month * 12.0
                 )
             ]
        }

    def contractual_state_transitions(self, time_axis):
        """ This method returns a datastructure which encodes
            when and for which records contractual state transitions
            are due.

            Returns: Iterable consisting of three-tuples where
              - first member = from-state
              - sencond member = to-state
              - third member is a binary matrix of the structure "insured x time"
                where a "1" represents a contractual move.
        """
        # for the mortality term product there is only the transition
        # ACTIVE -> MATURED
        return [
            (self.STATES_MODEL.ACTIVE,
             self.STATES_MODEL.MATURED,
             calc_maturity_transition_indicator(time_axis, self.year_last_month, self.last_month)
             )
        ]

The annotation register will register the product under the name “TERM2”. For a more in depth understanding of the structure of the product please refer to the concepts section of the documentation. In short, each product definition must provide methods for:

payments due at the beginning of a months when in a certain state (get_bom_payments)
payments due at the end of the months when a certain state transitons occurs (get_state_transition_payments)
state transitions that do not originate from biometric the projection assumptions (contractual_state_transitions)

The following example demonstrates the principle.

[4]:

# load a portfolio
from pyprotolinc.portfolio import Portfolio
portfolio = Portfolio("portfolio/portfolio_small.xlsx", states_model=MortalityStates2)
portfolio.df_portfolio

INFO - 2023-03-26 12:32:17,917 - pyprotolinc.portfolio - Reading portfolio data from file portfolio/portfolio_small.xlsx.

[4]:

	DATE_PORTFOLIO	ID	DATE_OF_BIRTH	DATE_START_OF_COVER	SUM_INSURED	CURRENT_STATUS	SEX	PRODUCT	PRODUCT_PARAMETERS	SMOKERSTATUS	RESERVING_RATE	DATE_OF_DISABLEMENT
0	2021-12-31	1	1976-04-23	2022-01-01	120000	ACTIVE	m	TERM2	10	S	0.04	NaT
1	2021-12-31	2	1962-09-01	2015-10-01	100000	ACTIVE	m	TERM2	10	N	0.03	2017-10-01

The portfolio shows two insureds in state active at 2021-12-31 both having the PRODUCT set as “TERM2” and the PRODUCT_PARAMETERS set as 10. The latter parameter is meant to indicate the duration of the policy in years.

To test the product definition we still need a time-axis object. We create one with in total 150 months for this test.

[5]:

from pyprotolinc.runner import TimeAxis
time_axis = TimeAxis(portfolio.portfolio_date, 150)

Now we can test the product.

[6]:

prod = Product_MortalityTerm2(portfolio)
bom_pay = prod.get_bom_payments(time_axis)

# this outputs the premium vectors (index 0 is the first and only
# payment type parametrized when in in active state above)
bom_pay[prod.STATES_MODEL.ACTIVE][0]

[6]:

(<CfNames.PREMIUM: 0>,
 array([[ 0., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60.,
         60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60.,
         60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60.,
         60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60.,
         60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60.,
         60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60.,
         60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60.,
         60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60.,
         60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60., 60.,
         60., 60., 60., 60.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [50., 50., 50., 50., 50., 50., 50., 50., 50., 50., 50., 50., 50.,
         50., 50., 50., 50., 50., 50., 50., 50., 50., 50., 50., 50., 50.,
         50., 50., 50., 50., 50., 50., 50., 50., 50., 50., 50., 50., 50.,
         50., 50., 50., 50., 50., 50., 50.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]]))

We see that there are 120 times “60” for the first insured and 46 times “50” for the second. Since the TimeAxis starts in December 2021 this mean that for insured #1 there are up to 120 payments (each month for ten years) to be made in the future and 45 for insured #2. This is explained as follows when looking at the portfolio:

[7]:

portfolio.df_portfolio[["DATE_PORTFOLIO", "DATE_START_OF_COVER", "PRODUCT_PARAMETERS"]]

[7]:

	DATE_PORTFOLIO	DATE_START_OF_COVER	PRODUCT_PARAMETERS
0	2021-12-31	2022-01-01	10
1	2021-12-31	2015-10-01	10

For insured #1 (row 0) the case is clear: The cover starts only on January 1st and the full 10 years of the term are in the future, hence there must be 120 payments (conditional on being active). For the insured in the second row the cover started in October 2015. That mean that at the end of December 2021 the policy exists for already three months and six years, i.e. for 75 months. Hence there remain 120 - 75 = 45 months.

Assumptions

Let’s have a look at the assumptions next.

[8]:

with open('mortality_assumptions_simple.yml', 'r') as f:
    print(f.read())


assumptions_spec:

  be:
    # active -> death
    - [0, 1, ["Scalar", 0.0015]]

    # active -> lapse
    - [0, 2, ["Scalar", 0.05]]

  res:
    # active -> death
    - [0, 1, ["Scalar", 0.0015]]

    # active -> lapse
    - [0, 2, ["Scalar", 0.0]]

The state transition (0, 1) is the death while (0, 2) corresponds with lapse. We use simple assumptions for now. To work with the DAV2008T table we need to download it first by running pyprotolinc download_dav_tables on the command shell.

Run

Now we run our custom state model and term:

[9]:

project_cashflows(run_config);

INFO - 2023-03-26 12:32:18,458 - pyprotolinc.main - Multistate run with config: {'working_directory': WindowsPath('D:/programming/pyprotolinc/examples/03_mortality'), 'model_name': 'GenericMultiState', 'years_to_simulate': 121, 'portfolio_path': 'D:\\programming\\pyprotolinc\\examples\\03_mortality\\portfolio/portfolio_small.xlsx', 'assumptions_path': 'mortality_assumptions_simple.yml', 'steps_per_month': 1, 'state_model_name': 'MortalityStates2', 'timestep_duration': 0.08333333333333333, 'outfile': 'results/ncf_out_generic.csv', 'portfolio_cache': 'D:\\programming\\pyprotolinc\\examples\\03_mortality\\portfolio/portfolio_cache', 'profile_out_dir': 'D:\\programming\\pyprotolinc\\examples\\03_mortality\\.', 'portfolio_chunk_size': 1024, 'use_multicore': False, 'kernel_engine': 'PY', 'max_age': 120}
INFO - 2023-03-26 12:32:18,468 - pyprotolinc.portfolio - Porfolio loaded from cache
INFO - 2023-03-26 12:32:18,469 - pyprotolinc.portfolio - Portolio rows: 2
DEBUG - 2023-03-26 12:32:18,472 - pyprotolinc.portfolio - Splitting portfolio for product TERM2.
DEBUG - 2023-03-26 12:32:18,476 - pyprotolinc.portfolio - Initializing portfolio from dataframe
DEBUG - 2023-03-26 12:32:18,485 - pyprotolinc.portfolio - Initializing portfolio from dataframe
INFO - 2023-03-26 12:32:18,495 - pyprotolinc.main - Executions in single process for 2 units
INFO - 2023-03-26 12:32:18,497 - pyprotolinc.main - Projecting subportfolio 1 / 2 with Python engine
DEBUG - 2023-03-26 12:32:18,498 - pyprotolinc.runner - Creating a <Projector> object for chunk 1 of 2
DEBUG - 2023-03-26 12:32:18,499 - pyprotolinc.runner - Starting the simulation for chunk 1 of 2
INFO - 2023-03-26 12:32:18,512 - root - Runner for chunk 1: Early termination in 10/2026
DEBUG - 2023-03-26 12:32:18,513 - pyprotolinc.runner - Starting backwards loop to calculate the reserves for chunk 1 of 2
INFO - 2023-03-26 12:32:18,558 - pyprotolinc.main - Projecting subportfolio 2 / 2 with Python engine
DEBUG - 2023-03-26 12:32:18,559 - pyprotolinc.runner - Creating a <Projector> object for chunk 2 of 2
DEBUG - 2023-03-26 12:32:18,561 - pyprotolinc.runner - Starting the simulation for chunk 2 of 2
INFO - 2023-03-26 12:32:18,588 - root - Runner for chunk 2: Early termination in 1/2033
DEBUG - 2023-03-26 12:32:18,589 - pyprotolinc.runner - Starting backwards loop to calculate the reserves for chunk 2 of 2
DEBUG - 2023-03-26 12:32:18,634 - pyprotolinc.main - Combining results from subportfolios
INFO - 2023-03-26 12:32:18,637 - pyprotolinc.results - Exporting NCF to results/ncf_out_generic.csv
INFO - 2023-03-26 12:32:18,665 - pyprotolinc.main - Elapsed time 0.2 seconds.

Let’s inspect the result.

[10]:

import pandas as pd
pd.read_csv("results/ncf_out_generic.csv", index_col=0).head()

[10]:

	YEAR	QUARTER	MONTH	PREMIUM	DEATH_PAYMENT	RESERVE_BOM(ACTIVE)	...	MV_ACT_LAPSED
0	2021	4	12	0.000000	0.000000	0.000000	...	0.000000
1	2022	1	1	110.000000	-27.500000	-6044.408230	...	0.008333
2	2022	1	2	109.527917	-27.381979	-5955.167198	...	0.008298
3	2022	1	3	109.057859	-27.264465	-5866.393392	...	0.008262
4	2022	2	4	108.589819	-27.147455	-5778.084686	...	0.008227

5 rows × 29 columns

The premium of 110 can be explained as follows: It is obtained as 0.005 * 220000. The factor 0.0005 is introduced in the function get_bom_payments and 220000 is the sum insured of the full portfolio, i.e. both policies together.

Note that the reserves are negative indicating the high profitability.

Re-Run with DAV2008T

In order to rerun with other assumptions we can simple swap out the assumptions_path attribute of the run_config object:

[11]:

run_config.assumptions_path = 'mortality_assumptions_simple_dav2008t.yml'

The corresponding file looks as follows:

[12]:

with open(run_config.assumptions_path, 'r') as f:
    print(f.read())


assumptions_spec:

  be:
    # active -> death
    - [0, 1, ["DAV2008T", "estimate_type:BE",
              "base_directory:tables/Germany_Endowments_DAV2008T"]]

    # active -> lapse
    - [0, 2, ["Scalar", 0.05]]

  res:
    # active -> death
    - [0, 1, ["DAV2008T", "estimate_type:LOADED",
              "base_directory:tables/Germany_Endowments_DAV2008T"]]

    # active -> lapse
    - [0, 2, ["Scalar", 0.0]]

As one can see we have parametrized the mortality assumption by the DAV2008T tables and we are assuming no lapse for the reserve calculations.

[13]:

pd.DataFrame(project_cashflows(run_config)).head()

INFO - 2023-03-26 12:32:18,780 - pyprotolinc.main - Multistate run with config: {'working_directory': WindowsPath('D:/programming/pyprotolinc/examples/03_mortality'), 'model_name': 'GenericMultiState', 'years_to_simulate': 121, 'portfolio_path': 'D:\\programming\\pyprotolinc\\examples\\03_mortality\\portfolio/portfolio_small.xlsx', 'assumptions_path': 'mortality_assumptions_simple_dav2008t.yml', 'steps_per_month': 1, 'state_model_name': 'MortalityStates2', 'timestep_duration': 0.08333333333333333, 'outfile': 'results/ncf_out_generic.csv', 'portfolio_cache': 'D:\\programming\\pyprotolinc\\examples\\03_mortality\\portfolio/portfolio_cache', 'profile_out_dir': 'D:\\programming\\pyprotolinc\\examples\\03_mortality\\.', 'portfolio_chunk_size': 1024, 'use_multicore': False, 'kernel_engine': 'PY', 'max_age': 120}

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [13], in <cell line: 1>()
----> 1 pd.DataFrame(project_cashflows(run_config)).head()

File D:\programming\pyprotolinc\src\pyprotolinc\main.py:102, in project_cashflows(run_config, df_portfolio_overwrite, assumption_wrapper, export_to_file)
     99 rows_for_state_recorder: Optional[tuple[int]] = None  # (0, 1, 2)
    100 num_timesteps = run_config.years_to_simulate * 12 * run_config.steps_per_month
--> 102 model = create_model(get_model_by_name(run_config.model_name), run_config.state_model_name, run_config.assumptions_path, assumption_wrapper)
    104 if df_portfolio_overwrite is None:
    105     if run_config.portfolio_path is None:

File D:\programming\pyprotolinc\src\pyprotolinc\main.py:66, in create_model(model_class, state_model_name, assumptions_file, assumption_wrapper_opt)
     64 elif assumption_wrapper_opt is None and assumptions_file is not None:
     65     loader = AssumptionsLoaderFromConfig(assumptions_file, len(state_model_class))
---> 66     assumption_wrapper = loader.load()
     67 elif assumption_wrapper_opt is not None:
     68     assumption_wrapper = assumption_wrapper_opt

File D:\programming\pyprotolinc\src\pyprotolinc\assumptions\iohelpers.py:166, in AssumptionsLoaderFromConfig.load(self)
    162 """ Read the file and convert into an AssumptionSetWrapper. """
    164 model_builder = AssumptionSetWrapper(self._states_dimension)
--> 166 self._process_be_or_res(self.assumptions_spec["be"], AssumptionType.BE, model_builder)
    167 self._process_be_or_res(self.assumptions_spec["res"], AssumptionType.RES, model_builder)
    168 return model_builder

File D:\programming\pyprotolinc\src\pyprotolinc\assumptions\iohelpers.py:232, in AssumptionsLoaderFromConfig._process_be_or_res(self, assumptions_spec, be_or_res, model_builder)
    229     dav2008t = DAV2008T(kwargs["base_directory"])
    230     del kwargs["base_directory"]
--> 232     model_builder.add_transition(be_or_res, spec[0], spec[1], dav2008t.rates_provider(**kwargs))
    234 elif spec[2][0] == "ScalarAssumptionsTable" or spec[2][0].upper() == "FLAT" or spec[2][0].upper() == "SCALAR":
    235     # assume that the next parameter is a float
    236     const_rate = float(spec[2][1])

File D:\programming\pyprotolinc\src\pyprotolinc\assumptions\dav2008t.py:92, in DAV2008T.rates_provider(self, estimate_type)
     89 estimate_type = estimate_type.upper()
     91 if estimate_type == "BE":
---> 92     return StandardRateProvider(self.be_rates, (Age, Gender, SmokerStatus), offsets=(0, 0, 0))
     93 elif estimate_type == "LOADED":
     94     return StandardRateProvider(self.res_rates, (Age, Gender, SmokerStatus), offsets=(0, 0, 0))

TypeError: __init__() got an unexpected keyword argument 'offsets'

Calculating the ratio of the claims in this and the previous projection we find that the latest claims are 2.68 times as high as the previous ones:

-73.7 / -27.5 = 2.68

[ ]:

assert 1 == 0

Looking at the portfolio we find that we have:

a male smoker born in 1976 (being, say, 45 years at year-end 2021) and
a male non-smoker born in 1962 being 59 years at year-end 2021

We check the mortality rates we would expect to be applied:

a64fb71a89c84029b01a95a71fa366b4

We can validate that these are the rate that are used by pyprotolinc.

[ ]:

from pyprotolinc.assumptions.dav2008t import DAV2008T
dav2008t = DAV2008T(base_directory="tables/Germany_Endowments_DAV2008T")
dav2008_provider = dav2008t.rates_provider(estimate_type="BE")
rates = dav2008_provider.get_rates(age=portfolio.initial_ages // 12,
                                   smokerstatus=portfolio.smokerstatus,
                                   gender=portfolio.gender)
rates

To demonstrate that this explains the increase in the claims level we calculated the sum-insured weighted average rate and divide it by 0.0015:

[ ]:

rates.dot(portfolio.sum_insured) / portfolio.sum_insured.sum() / 0.0015

So the increase in claims in indeed explained by the higher mortality rates in the DAV2008 table compared to our previous simple assumption.