Files
awesome-awesomeness/html/creditmodeling.html
2024-04-20 19:22:54 +02:00

370 lines
21 KiB
HTML
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
<h1 id="awesome-credit-modeling-awesome">Awesome Credit Modeling <a
href="https://github.com/sindresorhus/awesome"><img
src="https://awesome.re/badge-flat.svg" alt="Awesome" /></a></h1>
<p><a href="http://makeapullrequest.com"><img
src="https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square"
alt="PRs Welcome" /></a></p>
<blockquote>
<p>A growing collection of awesome papers, articles and various
resources on credit scoring and credit risk modeling.</p>
</blockquote>
<p>Credit scoring is the term used to describe formal statistical
methods used for classifying applicants for credit into risk classes.
Lenders use such classifications to assess an applicants
creditworthiness and probability of default.</p>
<h2 id="contents">Contents</h2>
<ul>
<li><a href="#introduction">Introduction</a></li>
<li><a href="#credit-scoring">Credit Scoring</a></li>
<li><a href="#institutional-credit-risk">Institutional Credit
Risk</a></li>
<li><a href="#peer-to-peer-lending">Peer-to-Peer Lending</a></li>
<li><a href="#sample-selection">Sample Selection</a></li>
<li><a href="#feature-selection">Feature Selection</a></li>
<li><a href="#model-explainability">Model Explainability</a></li>
</ul>
<h2 id="introduction">Introduction</h2>
<ul>
<li><p><a href="https://www.jstor.org/stable/2983268">Statistical
Classification Methods in Consumer Credit Scoring: A Review</a> -
Classic introduction and review of the subject of credit
scoring.</p></li>
<li><p><a href="https://www.jstor.org/stable/40540227">Consumer Finance:
Challenges for Operational Research</a> - Reviews the development of
credit scoring (the way of assessing risk in consumer finance) and what
is meant by a credit score. Outlines 10 challenges for Operational
Research to support modelling in consumer finance.</p></li>
<li><p><a
href="https://www.slideshare.net/YvanDeMunck/machine-learning-in-credit-risk-modeling-a-james-white-paper">Machine
Learning in Credit Risk Modeling</a> - James (formerly CrowdProcess) is
a now-defunct online credit risk management startup that provided risk
management tools to financial institutions. This whitepaper offers an
overview of machine learning applications in the field of credit risk
modeling.</p></li>
<li><p><a
href="https://www.tandfonline.com/doi/abs/10.1080/03085140601089846">Lending
by numbers: credit scoring and the constitution of risk within American
consumer credit</a> - Examines how statistical credit-scoring
technologies became applied by lenders to the problem of controlling
levels of default within American consumer credit. Explores their
perceived methodological, procedural and temporal risks.</p></li>
<li><p><a href="https://ieeexplore.ieee.org/document/6069610">Machine
Learning in Financial Crisis Prediction: A Survey</a> - Reviews 130
journal papers from the period between 1995 and 2010, focusing on the
development of state-of-the-art machine-learning techniques for
bankruptcy prediction and credit score modeling. Also presents their
current achievements and limitations.</p></li>
<li><p><a href="https://www.bis.org/publ/work887.pdf">Fintech and big
tech credit: a new database</a> - This Working Paper by the Bank of
International Settlements, while not as focused on credit risk, maps the
conditions for and niches occupied by alternative credit, be it provided
by fintechs or big tech companies.</p></li>
</ul>
<h2 id="credit-scoring">Credit Scoring</h2>
<ul>
<li><p><a
href="https://www.sciencedirect.com/science/article/abs/pii/S0377221715004208">Benchmarking
state-of-the-art classification algorithms for credit scoring: An update
of research</a> - There have been several advancements in scorecard
development, including novel learning methods, performance measures and
techniques to reliably compare different classifiers, which the credit
scoring literature does not reflect. This paper compares several novel
classification algorithms to the state-of-the-art in credit scoring. In
addition, the extent to which the assessment of alternative scorecards
differs across established and novel indicators of predictive accuracy
is examined.</p></li>
<li><p><a
href="https://www.sciencedirect.com/science/article/abs/pii/S1876735416300101">Classification
methods applied to credit scoring: Systematic review and overall
comparison</a> - The need for controlling and effectively managing
credit risk has led financial institutions to excel in improving
techniques designed for this purpose, resulting in the development of
various quantitative models by financial institutions and consulting
companies. Hence, the growing number of academic studies about credit
scoring shows a variety of classification methods applied to
discriminate good and bad borrowers. This paper aims to present a
systematic literature review relating theory and application of binary
classification techniques for credit scoring financial analysis. The
general results show the use and importance of the main techniques for
credit rating, as well as some of the scientific paradigm changes
throughout the years.</p></li>
<li><p><a
href="https://projecteuclid.org/euclid.ss/1149600839">Classifier
Technology and the Illusion of Progress</a> - A great many tools have
been developed for supervised classification, ranging from early methods
such as linear discriminant analysis through to modern developments such
as neural networks and support vector machines. A large number of
comparative studies have been conducted in attempts to establish the
relative superiority of these methods. This paper argues that these
comparisons often fail to take into account important aspects of real
problems, so that the apparent superiority of more sophisticated methods
may be something of an illusion. In particular, simple methods typically
yield performance almost as good as more sophisticated methods, to the
extent that the difference in performance may be swamped by other
sources of uncertainty that generally are not considered in the
classical supervised classification paradigm.</p></li>
<li><p><a
href="https://dl.acm.org/doi/10.1007/s10462-015-9434-x">Financial credit
risk assessment: a recent review</a> - Summarizes the traditional
statistical models and state-of-the-art intelligent methods for
financial distress forecasting, with emphasis on the most recent
achievements.</p></li>
<li><p><a
href="https://www.tandfonline.com/doi/abs/10.1057/palgrave.jors.2601932">Good
practice in retail credit scorecard assessment</a> - In retail banking,
predictive statistical models called scorecards are used to assign
customers to classes, and hence to appropriate actions or interventions.
Such assignments are made on the basis of whether a customers predicted
score is above or below a given threshold. The predictive power of such
scorecards gradually deteriorates over time, so that performance needs
to be monitored. Common performance measures used in the retail banking
sector include the Gini coefficient, the KolmogorovSmirnov statistic,
the mean difference, and the information value. However, all of these
measures use irrelevant information about the magnitude of scores, and
fail to use crucial information relating to numbers misclassified. The
result is that such measures can sometimes be seriously misleading,
resulting in poor quality decisions being made, and mistaken actions
being taken.</p></li>
<li><p><a
href="https://link.springer.com/article/10.1057/jors.2012.145">A
literature review on the application of evolutionary computing to credit
scoring</a> - The aim of this paper is to summarize the most recent
developments in the application of evolutionary algorithms to credit
scoring by means of a thorough review of scientific articles published
during the period 20002012.</p></li>
<li><p><a
href="https://fbj.springeropen.com/articles/10.1186/s43093-020-00041-w">Machine
learning predictivity applied to consumer creditworthiness</a> -
Analyzes the adequacy of borrowers classification models using a
Brazilian banks loan database, exploring machine learning techniques,
and comparing their predictive accuracy with a benchmark based on a
Logistic Regression model. Comparisons are based on usual classification
performance metrics.</p></li>
<li><p><a
href="https://alo.mit.edu/wp-content/uploads/2015/06/Household-behaviorConsumer-credit-riskCredit-card-borrowingMachine-learningNonparametric-estimation.pdf">Consumer
credit-risk models via machine-learning algorithms</a> - The authors
apply machine-learning techniques to construct nonlinear nonparametric
forecasting models of consumer credit risk. They are able to construct
out-of-sample forecasts that significantly improve the classification
rates of credit-card-holder delinquencies and defaults.</p></li>
<li><p><a
href="https://ieeexplore.ieee.org/document/7033125">Example-Dependent
Cost-Sensitive Logistic Regression for Credit Scoring</a> - Several
real-world classification problems are example-dependent cost-sensitive
in nature, where the costs due to misclassification vary between
examples. Credit scoring is a typical example of cost-sensitive
classification. However, it is usually treated using methods that do not
take into account the real financial costs associated with the lending
business.</p></li>
<li><p><a
href="https://www.sciencedirect.com/science/article/abs/pii/S0957417414005119">Credit
scoring using the clustered support vector machine</a> - Introduces the
use of the clustered support vector machine (CSVM) for credit scorecard
development. This recently designed algorithm addresses some of the
limitations associated with traditional nonlinear support vector machine
(SVM) based methods for classification. Specifically, it is well known
that as historical credit scoring datasets get large, these nonlinear
approaches, while highly accurate, become computationally expensive. The
CSVM can achieve comparable levels of classification performance while
remaining relatively cheap computationally.</p></li>
<li><p><a
href="https://www.sciencedirect.com/science/article/abs/pii/S0957417416306947">A
comparative study on base classifiers in ensemble methods for credit
scoring</a> - In the last years, the application of artificial
intelligence methods on credit risk assessment has meant an improvement
over classic methods. Recent works show that ensembles of classifiers
achieve the better results for this kind of tasks.</p></li>
<li><p><a
href="https://www.sciencedirect.com/science/article/abs/pii/S0957417409008847">Multiple
classifier application to credit risk assessment</a> - (<a
href="https://www.sciencedirect.com/science/article/pii/S0957417410012364">Corrigendum</a>)
- This paper explores the predicted behaviour of five classifiers for
different types of noise in terms of credit risk prediction accuracy,
and how such accuracy could be improved by using classifier
ensembles.</p></li>
<li><p><a
href="https://www.sciencedirect.com/science/article/abs/pii/S0377221706011866">Recent
developments in consumer credit risk assessment</a> - The riskiness of
lending to a credit applicant is usually estimated using a logistic
regression model though researchers have considered many other types of
classifier, but data quality issues may prevent these laboratory based
results from being achieved in practice. The training of a classifier on
a sample of accepted applicants rather than on a sample representative
of the applicant population seems not to result in bias though it does
result in difficulties in setting the cut off.</p></li>
<li><p><a
href="https://www.sciencedirect.com/science/article/abs/pii/S0169207000000340">A
survey of credit and behavioural scoring: forecasting financial risk of
lending to consumers</a> - Surveys the techniques used — both
statistical and operational research based — to help organisations
decide whether or not to grant credit to consumers. It also discusses
the need to incorporate economic conditions into the scoring systems and
the way the systems could change from estimating the probability of a
consumer defaulting to estimating the profit a consumer will bring to
the lending organisation.</p></li>
<li><p><a
href="https://www.sciencedirect.com/science/article/abs/pii/S0957417407006719">The
comparisons of data mining techniques for the predictive accuracy of
probability of default of credit card clients</a> - This research
compares the predictive accuracy of probability of default among six
data mining methods. From the perspective of risk management, the result
of predictive accuracy of the estimated probability of default will be
more valuable than the binary result of classification.</p></li>
<li><p><a href="https://arxiv.org/abs/2005.14658">Super-App Behavioral
Patterns in Credit Risk Models: Financial, Statistical and Regulatory
Implications</a> - Presents the impact of alternative data that
originates from an app-based marketplace, in contrast to traditional
bureau data, upon credit scoring models. These alternative data sources
have shown themselves to be immensely powerful in predicting borrower
behavior in segments traditionally underserved by banks and financial
institutions. At the same time alternative data must be carefully
validated to overcome regulatory hurdles across diverse
jurisdictions.</p></li>
<li><p><a
href="https://www.sciencedirect.com/science/article/pii/S2405918822000095">Credit
scoring methods: Latest trends and points to consider</a> - “(…) This
article aims at providing a systemic review of the most recent
(20162021) articles, identifying trends in credit scoring using a fixed
set of questions. The survey methodology and questionnaire align with
previous similar research that analyses articles on credit scoring
published in 19912015. We seek to compare our results with previous
periods and highlight some of the recent best practices in the field
that might be useful for future researchers.”</p></li>
</ul>
<h2 id="institutional-credit-risk">Institutional Credit Risk</h2>
<ul>
<li><p><a
href="https://www.federalreserve.gov/publications/2017-september-availability-of-credit-to-small-businesses.htm">Availability
of Credit to Small Businesses</a> - Section 2227 of the Economic Growth
and Regulatory Paperwork Reduction Act of 1996 requires that, every five
years, the Board of Governors of the Federal Reserve System submit a
report to the Congress detailing the extent of small business lending by
all creditors. The most recent one is dated September, 2017.</p></li>
<li><p><a href="https://muse.jhu.edu/article/181124">Credit Scoring and
the Availability, Price, and Risk of Small Business Credit</a> - Finds
that small business credit scoring is associated with expanded
quantities, higher averages prices, and greater average risk levels for
small business credits under $100,000, after controlling for bank size
and other differences across banks.</p></li>
<li><p><a
href="https://link.springer.com/article/10.1023/A:1008699112516">Credit
Risk Assessment Using Statistical and Machine Learning: Basic
Methodology and Risk Modeling Applications</a> - An important ingredient
to accomplish the goal of a more efficient use of resources through risk
modeling is to find accurate predictors of individual risk in the credit
portfolios of institutions. In this context the authors make a
comparative analysis of different statistical and machine learning
modeling methods of classification on a mortgage loan dataset with the
motivation to understand their limitations and potential.</p></li>
<li><p><a
href="https://link.springer.com/article/10.1007/s11009-008-9078-2">Random
Survival Forests Models for SME Credit Risk Measurement</a> - Extends
the existing literature on empirical research in the field of credit
risk default for Small Medium Enterprizes (SMEs), proposing a
non-parametric approach based on Random Survival Forests (RSF) and
comparing its performance with a standard logit model.</p></li>
<li><p><a href="https://arxiv.org/abs/2004.08204">Modeling Institutional
Credit Risk with Financial News</a> - Current work in downgrade risk
modeling depends on multiple variations of quantitative measures
provided by third-party rating agencies and risk management consultancy
companies. There has been a wide push into using alternative sources of
data, such as financial news, earnings call transcripts, or social media
content, to possibly gain a competitive edge in the industry. This paper
proposes a predictive downgrade model using solely news data represented
by neural network embeddings.</p></li>
<li><p><a href="https://ieeexplore.ieee.org/document/935101">Bankruptcy
prediction for credit risk using neural networks: A survey and new
results</a> - The prediction of corporate bankruptcies is an important
and widely studied topic since it can have significant impact on bank
lending decisions and profitability. This work reviews the topic of
bankruptcy prediction, with emphasis on neural-network (NN) models and
develops an NN bankruptcy prediction model, proposing novel indicators
for the NN system.</p></li>
</ul>
<h2 id="peer-to-peer-lending">Peer-to-Peer Lending</h2>
<ul>
<li><a
href="https://www.tandfonline.com/doi/abs/10.1080/08982112.2019.1655159">Network
based credit risk models</a> - Peer-to-Peer lending platforms may lead
to cost reduction, and to an improved user experience. These
improvements may come at the price of inaccurate credit risk
measurements. The authors propose to augment traditional credit scoring
methods with “alternative data” that consist of centrality measures
derived from similarity networks among borrowers, deduced from their
financial ratios.</li>
</ul>
<h2 id="sample-selection">Sample Selection</h2>
<ul>
<li><p><a
href="https://econpapers.repec.org/paper/drmwpaper/2016-10.htm">Reject
inference in application scorecards: evidence from France</a> - Good
introduction and discussion on the topic.</p></li>
<li><p><a
href="https://www.sciencedirect.com/science/article/abs/pii/S0377221706011969">Reject
inference, augmentation, and sample selection</a> - In-depth
discussion.</p></li>
<li><p><a
href="http://www.research.lancs.ac.uk/portal/en/publications/instance-sampling-in-credit-scoring-an-empirical-study-of-sample-size-and-balancing(89b83914-c7f2-499a-8fa1-844d6cb6004d).html">Instance
sampling in credit scoring: An empirical study of sample size and
balancing</a> - Discusses the traditional sampling conventions in credit
modeling and argues that using larger samples provides a significant
increase in accuracy across algorithms.</p></li>
</ul>
<h2 id="feature-selection">Feature Selection</h2>
<ul>
<li><p><a
href="https://www.sciencedirect.com/science/article/pii/S0167923619300570">A
multi-objective approach for profit-driven feature selection in credit
scoring</a> - In credit scoring, feature selection aims at removing
irrelevant data to improve the performance and interpretability of the
scorecard. Standard techniques treat feature selection as a
single-objective task and rely on statistical criteria such as
correlation. Recent studies suggest that using profit-based indicators
may improve the quality of scoring models for businesses.</p></li>
<li><p><a
href="https://link.springer.com/article/10.1057/palgrave.jors.2601976">Data
mining feature selection for credit scoring models</a> - The features
used may have an important effect on the performance of credit scoring
models. The process of choosing the best set of features for credit
scoring models is usually unsystematic and dominated by somewhat
arbitrary trial. This paper presents an empirical study of four machine
learning feature selection methods.</p></li>
<li><p><a
href="https://www.sciencedirect.com/science/article/abs/pii/S0957417409010719">Combination
of feature selection approaches with SVM in credit scoring</a> - An
effective classificatory model in credit scoring will objectively help
managers who rely on intuitive experience. This study proposes four
approaches using the SVM (support vector machine) classifier for feature
selection that retain sufficient information for classification
purposes.</p></li>
</ul>
<h2 id="model-explainability">Model Explainability</h2>
<ul>
<li><p><a
href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3506274">Explainable
Machine learning in Credit Risk Management</a> - Proposes an explainable
AI model that can be used in credit risk management and, in particular,
in measuring the risks that arise when credit is borrowed employing
credit scoring platforms.</p></li>
<li><p><a
href="https://www.bankofengland.co.uk/working-paper/2019/machine-learning-explainability-in-finance-an-application-to-default-risk-analysis">Machine
learning explainability in finance: an application to default risk
analysis</a> - This Staff Working Paper from the Bank of England
proposes a framework for addressing the black box problem present in
some Machine Learning (ML) applications.</p></li>
<li><p><a
href="https://www.sciencedirect.com/science/article/pii/S2405918817300648">Regulatory
learning: How to supervise machine learning models? An application to
credit scoring</a> - The arrival of Big Data strategies is threatening
the latest trends in financial regulation related to the simplification
of models and the enhancement of the comparability of approaches chosen
by financial institutions. Indeed, the intrinsic dynamic philosophy of
Big Data strategies is almost incompatible with the current legal and
regulatory framework as illustrated in this paper. Besides, the model
selection may also evolve dynamically forcing both practitioners and
regulators to develop libraries of models, strategies allowing to switch
from one to the other as well as supervising approaches allowing
financial institutions to innovate in a risk mitigated
environment.</p></li>
</ul>