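Insurance Studies (《保险研究》) 20191006: "A Comparative Study on Measuring Variable Importance in Auto Insurance Pricing: Based on Ensemble Learning and Generalized Linear Regression" (ZHANG Bi-yi, XIAO Yu-gu, ZENG Yu-zhe)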

[CLC number] F840.65; F224.7 [Document code] A [Article ID] 1004-3306(2019)10-0073-11 DOI: 10.13497/j.cnki.is.2019.10.006

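[Abstract] Many risk factors affect losses in auto insurance, including driver-related, vehicle-related, territorial, and policy-attribute factors. Insurers typically use these factors to classify individual risks, both as a basis for pricing and to support interdepartmental communication, business selection, and market segmentation. Identifying the importance of risk factors therefore matters greatly for improving the quality of the auto insurance business as a whole. Machine learning algorithms have been increasingly applied to auto insurance loss prediction in recent years, but existing research has focused mainly on predictive accuracy and lacks a systematic, in-depth study of risk factor importance measurement. To address this, the paper applies two ensemble learning methods (random forest and XGBoost) to eight auto insurance data sets and compares their consistency with the generalized linear regression model in measuring the importance of claim-frequency risk factors. The results show that the two ensemble learning methods not only improve predictive accuracy but also provide fairly consistent measures of risk factor importance.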

[Keywords] auto insurance; machine learning; variable importance; random forest; XGBoost

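[Funding] This work was supported by the Major Project of the Key Research Base of Humanities and Social Sciences of the Ministry of Education (16JJD910001), "Research on Actuarial Statistical Models and Risk Management Based on Big Data", and by the Renmin University of China 2019 "Special Guiding Funds for Building World-Class Universities (Disciplines) and Characteristic Development in Central Universities".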

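[About the authors] ZHANG Bi-yi, master's student, School of Statistics, Renmin University of China, E-mail: zhang-biyi@163.com; XIAO Yu-gu, associate professor, Department of Risk Management and Actuarial Science, School of Statistics, Renmin University of China; ZENG Yu-zhe, master's student, School of Statistics, Renmin University of China.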


A Comparative Study on Measuring Variable Importance in Auto Insurance Pricing—Based on Ensemble Learning and Generalized Linear Regression

ZHANG Bi-yi, XIAO Yu-gu, ZENG Yu-zhe

Abstract: In the auto insurance business, many risk factors affect vehicle losses, such as driver-related factors, vehicle-related factors, geography, and policy information. Insurance companies often use these risk factors to classify individual risks, both as the basis for auto insurance pricing and to support inter-department communication, business selection, and market segmentation. Identifying the importance of risk factors is therefore essential to improving the quality of the auto insurance business as a whole. In recent years, machine learning algorithms have been increasingly applied to the prediction of vehicle losses. However, these studies mainly consider the accuracy of loss prediction, and a systematic, in-depth study of risk factor importance measurement is lacking. To this end, this paper applies two ensemble learning methods (random forest and XGBoost) to eight auto insurance data sets and compares them with the generalized linear model in terms of the consistency of importance measurement for claim frequency risk factors. The results show that these two methods can not only improve prediction accuracy but also provide fairly consistent risk factor importance measurements.

Key words: auto insurance; machine learning; variable importance measurement; Random Forest; XGBoost