Application of Machine Learning in Automobile Insurance

By Matt Swarbrick

Paper Type: Free Essay | Subject: Computer Science
Wordcount: 3148 words | Published: 23 Sep 2019


Background Origin

Although the concept of insurance took shape long before automobiles existed, the world’s first car insurance policy was not issued until 1897, after years of uninsured driving. Because casualties from car accidents are relatively simple to appraise while the tortfeasor’s solvency is uncertain, car insurance was subject to government regulation from the start. As mass production of cars in the 20th century brought high rates of collisions and fatalities, the governments of Massachusetts and Connecticut made the first move. Modern insurance policies originate from the financial responsibility requirements of Connecticut’s 1925 law and from compulsory insurance as a prerequisite for vehicle registration.

Industry background

In the U.S., the automobile insurance industry is regulated at the state level. Except for New Hampshire, every state now has compulsory auto liability insurance laws, which means that every driver is guaranteed access to coverage, whether in the shared or the voluntary market. In the voluntary (private) market, car owners usually have sound finances, a good educational background and a clean driving history, so insurers and policyholders are free to choose each other. The shared market, by contrast, serves drivers with worse-than-average driving records. In recent years, the shared market has been shrinking because of the emergence of a nonstandard segment in the voluntary market.

Nowadays, many people buy car insurance through direct-response channels. Auto insurers generally run their own pricing algorithms, which draw on demographic, geographic, car-related and driving-record information to differentiate prices. According to data from Statista, young men tend to pay more than young women, and drivers in Michigan pay the most for auto insurance.

There are two main types of auto coverage: liability insurance, which pays for the policyholder’s legal responsibility to others for bodily injury or property damage, and collision and comprehensive insurance, which compensates for damage to the policyholder’s own car. Of the two, liability insurance is the larger contributor to incurred losses.

Companies in the domain

Of the hundreds of automobile insurers, State Farm, Geico, Progressive and Allstate together make up 50% of the US auto insurance market. State Farm holds the largest share at 18.2%, with more than $41 billion in direct premiums written as of 2017. High-volume companies benefit from a complete ecosystem that integrates financial strength and human resources, so they can often offer better customer service and more stable claim settlement.

Problems in the domain and how ML impact it

For most modern car insurers, the major problems to address are reducing the cost of customer acquisition, reducing insurance fraud and simplifying the claim process.

In the early stage of development, automobile insurers relied on traditional intermediary agents and 4S shops to acquire new clients and produce quotes, at the cost of surrendering part of their profits as commissions. P2P intelligent car insurance emerged in response, with the help of machine learning. Based on information provided by potential policyholders, an online quotation system can produce an appropriate quote. The algorithm behind the system, which essentially builds a decision tree, takes profitability and the customer’s ability to pay into consideration to optimize pricing. The system saves not only the intermediary’s brokerage fee but also the time spent negotiating and bargaining. Despite the ease and simplicity of the system, self-reported information can be unreliable. That is where another application of ML comes into play. Snapshot became part of Progressive’s usage-based car insurance plan in 2011. The plug-in device, installed under the steering wheel, records individual driving data, most of which cannot be captured from demographic information, and sends it back to the insurer for customized quotation.

Insurance fraud is a prevalent issue in all lines of insurance business and is especially severe in the automobile segment. Fraud not only causes capital losses for insurers and indirectly raises premiums for other policyholders but, even worse, puts lives in danger through intentional accidents. Machine learning is a game changer in detecting fraudulent behavior. For example, the artificial-intelligence-based software solution developed by Azati in 2016 is widely used by automobile insurers to investigate suspicious claims. The system is reported to cut the suspicious-claim rate by 50% and to increase fraud detection accuracy five-fold. Trained on historical insurance fraud records, the system can identify patterns in fraudulent claims and flag similar behavior. It can be further improved by adding more features and enlarging the sample size, which improves accuracy and reveals correlations between features and behavior that manual review cannot find.

Traditionally, when an insurance claim is filed, the damage is estimated manually, with the evaluation given by body shops, which are very likely to overcharge. To improve claim settlement efficiency and ensure fair valuation, ML-supported tools such as AI Review and AI FNOL Triage are applied to authenticate damage and offer intelligent solutions. By comparing the written descriptions and pictures of new claims against a database of settled claims, these AI tools can recognize damaged parts, predict the repair operation needed and offer multiple options for settlement. The technology greatly eases the burden on claims staff, freeing their time for more complicated claim investigations while meeting clients’ urgent needs. Other widely used ML applications for claim prediction include tools built with R/Shiny that predict individual insurance claims up to 10 years into the future, based on a simulated status model and a payment model.

General trends of auto insurance

According to research conducted by KPMG, the traditional auto insurance business will shrink by over 70% by 2050. Three major transitions are currently under way in the auto business. First, with the driving-safety revolution, autonomous technology makes cars inherently safer, reducing car accidents by an estimated 90% by 2050 and making most comprehensive car insurance products redundant. Second, on-demand transportation is changing people’s travel habits. Personal car insurance products may be replaced by commercial ones, eroding a large part of insurers’ profits. Finally, auto manufacturers taking on driving risk and the related liability as part of their primary business may change the dominant type of car insurance from accident liability insurance to product liability insurance. As car owners consider changing how they travel, insurers should make corresponding shifts in their products, their tactics and, of course, their ML models in order to survive.

Business background of GEICO

As one of the auto insurers in the voluntary market, GEICO started offering auto insurance in 1936 exclusively to government employees and later expanded its customer base. It is now the third-largest auto insurance writer in the US and serves customers primarily via the internet and telephone. To make up for deficiencies such as the lack of face-to-face agents to build customer-representative relationships, GEICO continuously works on improving its intelligent pricing, online service and claim settlement systems. That is why we chose GEICO for further analysis of how tools using ML technology are applied in its business and how they can better serve the company in the future.

Problems: Geico

Insurance fraud

1. Introduction

Here comes the first question: how can fraudulent claims be picked out accurately from a huge volume of claims? Fraud has long been a critical issue in the car insurance industry. According to the Insurance Research Council, automobile claim fraud and buildup added between $5.6 billion and $7.7 billion in excess payments in the U.S. in 2012 (Corum, 2015). As a major insurance provider in the U.S., Geico is a major victim of fraud as well.

2. User

The users of this model would be Geico’s automobile adjusters, whose responsibility is to determine whether to settle claims. Their goal is to distinguish fraudulent claims from genuine ones efficiently and effectively with the help of this model. According to Geico’s job description, adjusters are expected to hold a high school education (Geico). We therefore infer that most users may know nothing about the machine learning behind our tool.

3. The tool

The goal of the tool is to identify fraud as accurately as possible, which aligns with the users’ goal. Considering the users’ technical background, the input of the tool will be the policyholder’s information, which is collected in advance, the description of the claim, the weather on the day, the traffic in the area at the time, and any other information needed. Only the description of the claim needs to be entered manually; the other information is matched up from the database automatically after the input string is analyzed.

The output of the tool contains just one value denoting the predicted probability of fraud and one corresponding suggestion. The suggestion is chosen from ‘Low risk of fraud’, ‘Need further investigation’ and ‘High risk of fraud’ based on that value. The only thing the user has to do is decide whether to approve, reject or further investigate the claim. For example, suppose a policyholder with a reckless driving style reports a hit-and-run accident on a country road at midnight, and the traffic information shows that there were almost no vehicles in the area at that time. The output of the tool would be ‘Possibility of fraud: 90%, high risk of fraud’.
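The mapping from the predicted value to the three suggestions can be sketched in a few lines of Python. The essay does not state where the cut-offs lie, so the thresholds of 0.3 and 0.7 and the function name `fraud_suggestion` are assumptions made purely for illustration.

```python
def fraud_suggestion(probability, low=0.3, high=0.7):
    """Turn a predicted fraud probability into the adjuster-facing message.

    The cut-offs `low` and `high` are illustrative assumptions; in practice
    they would be tuned against the cost of missed fraud and false alarms.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < low:
        label = "low risk of fraud"
    elif probability < high:
        label = "need further investigation"
    else:
        label = "high risk of fraud"
    # Format the probability as a whole-number percentage, e.g. 0.9 -> "90%".
    return f"Possibility of fraud: {probability:.0%}, {label}"


message = fraud_suggestion(0.9)
```

Keeping the thresholds as parameters lets the adjusting team widen or narrow the "further investigation" band without touching the model itself.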

The tool derives this value from a support vector machine (SVM) model. First, we use one part of the National Insurance Crime Database to train the model. Then we validate the model on another part of that database. Finally, the optimized tool is ready to be applied to Geico’s database.
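The train-then-validate workflow described above can be illustrated with a minimal linear SVM written in plain Python. This is only a sketch under stated assumptions: the two numeric features and the synthetic, well-separated data are invented stand-ins for real claim records, the 150/50 split is arbitrary, and the optimizer is simple sub-gradient descent on the regularised hinge loss rather than whatever solver a production system would use.

```python
import random


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=50, seed=0):
    """Fit w, b by sub-gradient descent on the L2-regularised hinge loss.

    Labels in y must be +1 (fraud) or -1 (genuine).
    """
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # The regularisation term shrinks w on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # The hinge loss contributes only when the margin is violated.
            if y[i] * score < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(w, b, x):
    """Return +1 (fraud) or -1 (genuine) from the sign of the SVM score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def accuracy(w, b, X, y):
    hits = sum(predict(w, b, x) == label for x, label in zip(X, y))
    return hits / len(X)


def make_synthetic_claims(n, seed=1):
    """Synthetic two-feature claims: NOT real data, for illustration only."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.choice([1, -1])
        centre = 2.0 * label
        X.append([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)])
        y.append(label)
    return X, y


X, y = make_synthetic_claims(200)
# One part of the data trains the model, another part validates it.
X_train, y_train = X[:150], y[:150]
X_valid, y_valid = X[150:], y[150:]
w, b = train_linear_svm(X_train, y_train)
validation_accuracy = accuracy(w, b, X_valid, y_valid)
```

Note that an SVM yields a signed score rather than a probability, so reporting a "possibility of fraud" would need an additional calibration step (Platt scaling is one common choice) on top of a model like this one.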

4. Evaluation

We believe that the tool will greatly increase the efficiency of claim adjusting. Imagine yourself as an adjuster: faced with hundreds of claims every day, you cannot possibly identify every fraudulent one. With this tool, adjusters do not need to waste time investigating claims from reliable policyholders. We hope that the tool will have an initial accuracy of 70%, increasing as more data is fed into it. When it matures, it should detect more than 95% of frauds. The time and cost of developing this model would be affordable; we believe it can mature within one month. A group of Geico’s internal technical experts will take responsibility for maintaining the tool.

Quoting

1. Introduction

Our second question is how to price insurance quotes more accurately and in a more customized way. The traditional way to price motor insurance is to use financial models, which depend on statistical sampling of past performance to forecast future outcomes. However, artificial intelligence allows predictive analytics based on real events, in real time, using large datasets rather than samples, and competitors such as Progressive are already using this technology in their pricing. To succeed in the market, Geico will have to move rapidly from pricing based on the likely behavior of categories to pricing based on the actual behavior of individuals. Combining sensors with machine learning algorithms will enable Geico to use predictive analytics based on data collected from its drivers.

2. User

The users of this model would be Geico’s pricing analysts, whose responsibility is to decide the quote for each client. Client information is currently collected through an online survey covering the personal profile, vehicle information, usage and so on. But to price quotes more accurately and individually, we need more directly relevant factors that capture clients’ driving behavior, which a survey cannot describe. For example, drivers who tend to rush through yellow lights are more likely to cause accidents and file claims, yet this behavior is hard to learn from any online survey. Only with sensors on personal and commercial vehicles can pricing analysts obtain real-time big data on clients’ driving behavior, predict their future claims and so make more precise quotes.

3. Tool

The goal of this model is to help pricing analysts to detect which clients are safer drivers who deserve lower quotes and which clients are dangerous drivers who need to pay higher premiums for their risky behavior.

The input of our model is the monitored driving data collected by sensors installed in clients’ vehicles. The data will include a number of elements of interest to underwriters: miles driven; time of day; where the vehicle is driven (Global Positioning System or GPS); rapid acceleration; hard braking; hard cornering; and airbag deployment.
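As a rough illustration of how such sensor data might be turned into model inputs, the sketch below aggregates per-trip records into mileage-normalised features. The `Trip` record layout, the field names and the choice of "events per 100 miles" as the normalisation are all assumptions for illustration; the essay does not specify a data format, and GPS location is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    """One monitored trip (hypothetical record layout)."""
    miles: float
    night: bool                # crude stand-in for "time of day"
    hard_brakes: int
    rapid_accelerations: int
    hard_corners: int
    airbag_deployed: bool = False


def driving_features(trips):
    """Aggregate trips into per-driver features, normalised by mileage."""
    total_miles = sum(t.miles for t in trips)
    if total_miles <= 0:
        raise ValueError("no mileage recorded")
    per_100 = 100.0 / total_miles
    return {
        "total_miles": total_miles,
        "night_mile_share": sum(t.miles for t in trips if t.night) / total_miles,
        "hard_brakes_per_100mi": sum(t.hard_brakes for t in trips) * per_100,
        "rapid_accels_per_100mi": sum(t.rapid_accelerations for t in trips) * per_100,
        "hard_corners_per_100mi": sum(t.hard_corners for t in trips) * per_100,
        "airbag_deployments": sum(1 for t in trips if t.airbag_deployed),
    }


example_trips = [
    Trip(miles=60.0, night=False, hard_brakes=3, rapid_accelerations=1, hard_corners=0),
    Trip(miles=40.0, night=True, hard_brakes=1, rapid_accelerations=1, hard_corners=2),
]
example_features = driving_features(example_trips)
```

Normalising event counts by distance keeps a high-mileage but careful driver from looking riskier than a low-mileage driver who brakes hard on every trip.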

The output of our model has two parts: the assessment and the suggested quote range for each client. We would measure each client’s risky behavior and express the assessment as a score between 0 and 100. Those who score 60 to 100 are “safe drivers”, while those who score 0 to 59 are “dangerous drivers”. The suggested quote range would be the original quote from the traditional pricing method plus the confidence interval of the predicted quote.
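The two outputs can be sketched as follows. The 60-point cut-off comes from the text; how the confidence interval is combined with the traditional quote is not spelled out, so the code takes one possible reading, in which the interval is a (lower, upper) adjustment in dollars added to the base quote. The function names and the example figures are hypothetical.

```python
def classify_driver(score):
    """Label a 0-100 driving score using the 60-point cut-off from the text."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    return "safe driver" if score >= 60 else "dangerous driver"


def suggested_quote_range(base_quote, adjustment_interval):
    """Return (low, high) quotes.

    Assumed reading: `adjustment_interval` is the confidence interval of the
    model's predicted adjustment, added to the traditionally priced quote.
    """
    low, high = adjustment_interval
    if low > high:
        raise ValueError("interval bounds are out of order")
    return (base_quote + low, base_quote + high)


label = classify_driver(72)
quote_range = suggested_quote_range(1200.0, (-150.0, 50.0))
```

Under this reading, a driver with a traditionally priced quote of $1,200 and a predicted adjustment between -$150 and +$50 would be offered a quote somewhere between $1,050 and $1,250.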

We would use thousands of miles’ worth of monitored driving data collected by sensors, together with the claims these clients reported and the settlement results, as our database for modeling. As with the fraud tool, the value is obtained from an SVM model.

4. Evaluation

Pros:

· Increase affordability for lower-risk drivers, many of whom are also lower-income drivers.

· Give consumers the ability to control their premium costs by incenting them to reduce miles driven and adopt safer driving habits. Fewer miles and safer driving also aid in reducing accidents, congestion, and vehicle emissions, which benefits both Geico and society.

· The additional data can also be used by insurers to refine or differentiate Geico’s policies.

Cons:

· Raise privacy concerns

· Utilizing sensors can be costly and resource intensive to the insurer.

Claims settlement

1. Introduction

According to a survey conducted by GFC, Geico ranks 10th among the top 10 auto insurance companies of 2019. Consumers enjoy broad coverage, including breakdown coverage and Auto Repair Xpress. But one problem has always drawn criticism: Geico has the longest and most difficult claim process of these ten companies.

Claim settlement efficiency is of the greatest importance to customer satisfaction. Today, most claims are settled through a time-consuming manual process. A claim for a tiny scratch may take days or even months to settle, which significantly lowers satisfaction. What is more, the process is expensive because it requires many staff members to check claims and follow up. Most accidents, such as scratches, are not serious, and the repair cost is usually below $1000. A tool that helps identify the seriousness of each claim and settle it would raise satisfaction considerably.

2. User

The users of this tool would be Geico customers who need to file a claim. They are anxious, nervous and even angry because they have just been involved in an accident. They need to report the situation immediately, and the current claim process is as follows. The customer takes a photo of the accident along with photos of the VIN and the surroundings, then reports the accident to the Geico service center and describes the problem in detail. The service center contacts the local subsidiary, and a local staff member calls back, suggests a nearby repair store where the customer can have the damage evaluated, and then follows up on the case. The process has two bottlenecks. First, people are reluctant to communicate by phone; today many of us are more comfortable communicating through a screen, which is less stressful and lets us describe the problem more clearly. Second, the processing time can be long because every step is done manually. It can take over 20 minutes for the service center to identify the problem, call the local subsidiary and report the nearest repair store. Given how anxious and nervous the client already is, this delay naturally makes them unhappy.

3. Tool

The goal of this model is to help the user identify minor claims quickly and shorten the waiting time. The tool automatically identifies the damage, checks the credibility of the claim and provides the locations of several nearby repair stores. It works with a medium-sized database containing photos of different types of minor car damage, car insurance information, estimated repair costs and the locations of all repair stores that cooperate with Geico.

Considering customers’ technical background, we decided to make the tool as user-friendly as possible. As input, customers only need to provide a photo of the damage, a photo of the VIN and their location. The model then analyzes the damage photo, compares it with photos in the database and identifies the level of damage. It also compares the car photo and VIN photo with the car insurance information in the database to check the credibility of the claim. Finally, it analyzes the location of the accident and provides the locations of nearby repair stores.
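The last step, suggesting nearby repair stores, is the simplest to illustrate. The sketch below ranks stores by great-circle (haversine) distance from the accident location. The store names and coordinates are made up for illustration, and a real system would more likely rank by driving distance or travel time than by straight-line distance.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest_stores(accident, stores, k=3):
    """Return the names of the k stores closest to the accident location.

    `accident` is a (lat, lon) pair; each store is a (name, lat, lon) tuple.
    """
    lat, lon = accident
    ranked = sorted(stores, key=lambda s: haversine_miles(lat, lon, s[1], s[2]))
    return [name for name, _, _ in ranked[:k]]


# Hypothetical partner stores, for illustration only.
partner_stores = [
    ("Store C", 30.27, -97.74),
    ("Store A", 29.56, -95.10),
    ("Store B", 29.38, -94.90),
]
suggestions = nearest_stores((29.55, -95.09), partner_stores, k=2)
```

Returning several candidates rather than a single store matches the tool's stated goal of offering the customer more than one nearby option.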

The output of this tool has three parts: repair information, credibility information and damage information. In the interface, users see only the repair information, which is the nearby store, and the damage information, which indicates the type of damage and the estimated repair cost. The credibility information is handled in the background and is verified by the fraud model we designed in the first part.

4. Evaluation

Pros:

· Shorten the waiting time of a claim;

· Customers need not communicate with insurance staff in person;

· Reduce the labor cost needed in the service center;

· Simplify the process of reporting a claim;

· Claims would be automatically stored and are easy to track.

Cons:

· Difficult to classify each type of damage at the beginning;

· Need a lot of data of each type of damage and repair cost;

· Because the model is not perfectly accurate, some damage may be misclassified;

· The reduction in labor cost may not offset the cost of building and training this model.

Interface

Fraud detection system interface:

Claim settlement system interface:

Bibliography

Corum, D. (2015, February 3). Insurance Research Council Finds That Fraud and Buildup Add Up to $7.7 Billion in Excess Payments for Auto Injury Claims. Retrieved from https://www.insurancefraud.org/downloads/InsuranceResearchCouncil02-15.pdf
Geico. (n.d.). Auto Damage Adjuster Trainee – Clearlake, Texas City, Pearland. Retrieved from https://geico.referrals.selectminds.com/jobs/auto-damage-adjuster-trainee-clearlake-texas-city-pearland-9269

Matt Swarbrick

Matt holds a BA and MA certificate from Cambridge, and is a subject-matter expert in Business and Management. Matt also writes about subjects like Finance, Economics and Computing/ICT.
