class: center, middle, inverse, title-slide

# Claims reserving in general insurance with R and Keras

## kevinykuo.com/talk/2018/07/user-reserving/
arXiv:1804.09253
### Kevin Kuo
@kevinykuo
### July 2018

---
class: inverse, center, middle

# Insurance claims

---

# Fender bender!

What happens in the life of a claim?

.pull-left[
]

.pull-right[
- Oops! <br />
- Call in claim to agent or file via app. Claims adjuster sets case reserves (approx. how much she expects the claim to cost eventually).
- Damages paid for <br /><br />
- Move on with life
]

---

# Workers' comp

What happens in the life of a (long-tailed) claim?

.pull-left[
]

.pull-right[
- Exposure to asbestos at shipyard <br />
- Mesothelioma diagnosis (this could be decades later), claim filed. <br />
- Claim settled out of court after months (or more) of litigation <br />
- Move on with life
]

---

# The loss reserving exercise

<br /><br />

.full-width[.content-box-blue[.Large[Basically, figure out what we gotta pay in the future due to claims.]]]

---

# What the actuary sees

.large[We're going to look at a Shiny app at [https://davidjhindley.com/shiny/claimsreserving/](https://davidjhindley.com/shiny/claimsreserving/).

See also the ChainLadder package: [https://github.com/mages/ChainLadder](https://github.com/mages/ChainLadder).
]

---
class: inverse, center, middle

# Applying neural networks

---

# Illustrative triangle

```r
library(tidyverse)

data <- insurance::schedule_p %>%
  filter(lob == "private_passenger_auto",
         calendar_year < 1994,
         group_code == "43") %>%
  select(accident_year, development_lag, incremental_paid_loss)

data %>%
  spread(development_lag, incremental_paid_loss)
```

```
## # A tibble: 6 x 7
##   accident_year   `1`   `2`   `3`   `4`   `5`   `6`
##           <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1          1988   133   200    98   139    45     0
## 2          1989   934   812   619   214   184    NA
## 3          1990  2030  2834  2016  1207    NA    NA
## 4          1991  4537  6990  3596    NA    NA    NA
## 5          1992  7564  8497    NA    NA    NA    NA
## 6          1993  8343    NA    NA    NA    NA    NA
```

---

# Treat this as a predictive modeling problem

.large[
Each cell of the triangle is a row in the modeling dataset. We just need to come up with some predictors.
]

```
## # A tibble: 8 x 4
##   accident_year development_lag incremental_paid_loss predictors
##           <int>           <int>                 <dbl> <chr>
## 1          1988               1                   133 ?!?!?!?!?!?
## 2          1989               1                   934 ?!?!?!?!?!?
## 3          1990               1                  2030 ?!?!?!?!?!?
## 4          1991               1                  4537 ?!?!?!?!?!?
## 5          1992               1                  7564 ?!?!?!?!?!?
## 6          1993               1                  8343 ?!?!?!?!?!?
## 7          1988               2                   200 ?!?!?!?!?!?
## 8          1989               2                   812 ?!?!?!?!?!?
```

.large[Then we can do something like]

```r
crazy_AI_algorithm(incremental_paid_loss ~ predictors, data = data)
```

---

# Data

.large[
Schedule P data from [http://www.casact.org/research/index.cfm?fa=loss_reserves_data](http://www.casact.org/research/index.cfm?fa=loss_reserves_data).
]

.full-width[.content-box-green[.large[10 accident years (1988-1997) of paid and incurred losses, with 10 development lags, from a bunch of companies and lines of business.]]]

---

# Response and predictors

.Large[Let's talk about our response variable and predictors!]

---

# Response

.Large[
- **Response: incremental paid losses and total claims outstanding**

We're gonna predict both paid loss and claims o/s in the same model, ain't that cool?!
]

---

# Predictors

.Large[
- Response: incremental paid losses and total claims outstanding 👍
- **Predictors:**
  - **Time series of paid losses and case reserves**
]

---

# Predictors

.full-width[.content-box-green[.large[Let's see what we mean by "time series of paid losses".]]]

---

# Predictors

.large[Basically, for each cell in the triangle, we take the experience for the AY up to the previous calendar year. For example, for AY 1988 we have:]

```
## # A tibble: 6 x 3
##   development_lag incremental_paid_loss paid_history
##             <int>                 <dbl> <chr>
## 1               1                   133 ""
## 2               2                   200 133
## 3               3                    98 133, 200
## 4               4                   139 133, 200, 98
## 5               5                    45 133, 200, 98, 139
## 6               6                     0 133, 200, 98, 139, 45
```

---

# Predictors

.Large[
- Response: incremental paid losses and total claims outstanding 👍
- **Predictors:**
  - Time series of paid losses and case reserves 👍
  - **Company (because we're using data from all companies simultaneously)**
]

---

# Predictors

.large[
Company codes are indexed as integers. For example, if we have 50 companies, the first company would be `0`, the second would be `1`, and the last one would be `49`.
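A minimal sketch of this encoding (the company codes below are made up for illustration; the paper's actual preprocessing may differ):

```r
# Hypothetical sketch: integer-encode company codes for the embedding layer.
company_codes <- c("43", "337", "715", "43")

# Assign each distinct code a 0-based index, as Keras embeddings expect
code_levels   <- sort(unique(company_codes))
company_index <- match(company_codes, code_levels) - 1L

company_index  # one integer per row of the modeling dataset
```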
]

---

# Predictors

.Large[
- Response: incremental paid losses and total claims outstanding 👍
- Predictors:
  - Time series of paid losses and case reserves along accident year 👍
  - Company (because we're using data from all companies simultaneously) 👍

Now that we've gone through the response and predictors, let's talk about the neural network itself!
]

---

# Architecture

Looks fancy, but it's just a neural network!

<img src="img/nn1.png" width="25%" style="display: block; margin: auto;" />

---

# Embedding layer

.large[
The embedding layer maps each company code index to a fixed-length vector. While the dimension stays the same, the actual values for each company are *learned* by the network during training, in order to optimize our objective.

For example, if the specified length is 5, company #2 might get mapped to `c(0.4, 1.2, -3.7, 3.3, 0.2)`.

We can think of this representation as a proxy for characteristics of the companies that are not captured by the time series data input, e.g. size of book, case reserving philosophy, etc.
]

---

# Neural network for sequences

.large[
Just like a vanilla feedforward neural network, except we feed the sequential input... in sequence.
]

<img src="img/rnn.png" width="50%" style="display: block; margin: auto;" />

---

# Helping the RNN remember

.large[
Obligatory complicated recurrent network figure! (Don't worry about the details.)

The gated recurrent unit (GRU) is an architecture that helps the network "remember" stuff from a long time ago.

<img src="img/gru.png" width="50%" style="display: block; margin: auto;" />
]

---

# Putting it all together

Again, we're really just applying a bunch of functions, one after another, to our input data.

<img src="img/nn1.png" width="25%" style="display: block; margin: auto;" />

---

# Easily implemented using R + Keras

The model itself is under 30 lines of code!

<img src="img/model-code.png" width="50%" />

---

# Some results

Sample results from the company with the most data in the dataset...
<img src="img/ppauto-results.png" width="55%" style="display: block; margin: auto;" />

---

# Some results

Workers' comp

<img src="img/wkcomp-results.png" width="55%" style="display: block; margin: auto;" />

---

# Benchmarking

Let's define a couple of metrics to bench this new approach against existing methods:

`$$RMSPE_l = \sqrt{\frac{1}{|\mathcal{C}_l|}\sum_{C\in\mathcal{C}_l}\left(\frac{\widehat{UL}_C - UL_C}{UL_C}\right)^2}$$`

and

`$$MAPE_l = \frac{1}{|\mathcal{C}_l|}\sum_{C\in\mathcal{C}_l}\left|\frac{\widehat{UL}_C - UL_C}{UL_C}\right|,$$`

where `\(\mathcal{C}_l\)` is the set of companies in line of business `\(l\)`, and `\(\widehat{UL}_C\)` and `\(UL_C\)` are the predicted and actual cumulative ultimate losses, respectively, for company `\(C\)`.

---

# Benchmarking

Results for other methods taken from [http://www.casact.org/pubs/monographs/index.cfm?fa=meyers-monograph01](http://www.casact.org/pubs/monographs/index.cfm?fa=meyers-monograph01).

<img src="img/comparison_table.png" width="70%" style="display: block; margin: auto;" />

---

# Discussion

.Large[Neural networks aren't too shabby at doing some basic reserving work.]

<br />

.full-width[.content-box-purple[.Large[But this is just the beginning!]]]

---

# Discussion

.large[
Future work?

- Prediction intervals for reserve variability.
- Claims-level analytics, where we can take into account things like adjusters' notes and images.
- Policy-level analytics, towards a holistic approach to pricing + reserving.
- Interpretability.

Slides and link to repo: [kevinykuo.com/talk/2018/07/user-reserving/](http://kevinykuo.com/talk/2018/07/user-reserving/)

Paper: [arXiv:1804.09253](https://arxiv.org/abs/1804.09253)
]
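---

# Benchmarking in code

.large[As a quick sketch (the vector names and numbers here are made up, not from the paper), the two benchmarking metrics are one-liners in R:]

```r
# RMSPE and MAPE over the companies in one line of business,
# given predicted (ul_hat) and actual (ul) ultimate losses
rmspe <- function(ul_hat, ul) sqrt(mean(((ul_hat - ul) / ul)^2))
mape  <- function(ul_hat, ul) mean(abs((ul_hat - ul) / ul))

ul     <- c(1000, 2500, 400)  # actual ultimates (made up)
ul_hat <- c(1100, 2400, 380)  # predicted ultimates (made up)

rmspe(ul_hat, ul)
mape(ul_hat, ul)
```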