The recent financial crisis in the US has generated concerns about credit contagion, that is, how the deterioration of one borrower’s ability to honour his/her debt obligations can affect the ability of other borrowers to repay. After the housing credit boom of the mid-2000s, the housing downturn of the late 2000s saw a dramatic increase in the number of mortgage borrowers who defaulted on their debt obligations. The aim of this research is to explore how integrating the geographical locations of US mortgage borrowers into credit risk models can improve the predictive accuracy of credit risk assessments, thereby potentially decreasing the losses on mortgage loans that strongly contributed to the recent financial crisis. This research proposes a regression model for estimating the propensity of US mortgage holders to default on their loans that combines the effects of the neighbours’ characteristics with a method that remains accurate for rare events. The main advantage of this proposal is an increase in the number of accurately forecasted mortgage defaults. We also expect that the conventional model without neighbours’ characteristics will underestimate the risk created by relaxing lending standards.


Basel II is the international framework for the assessment of international banks’ capital adequacy. The significant innovation of this regulatory framework (Basel Committee on Banking Supervision, 2005) is the greater use of risk assessments provided by banks’ internal systems as inputs to capital calculations. In this context, statistical and mathematical models have been widely employed to estimate the probability that a borrower fails to fulfil his/her debt obligation, known as the probability of default (PD). These models are called credit scoring models.

An important application of such credit scoring models is the mortgage market, due to its substantial growth over recent decades. The collapse of the mortgage market and its subsequent role in triggering the 2008 financial crisis lend special importance to understanding the key drivers behind the increase in defaulted mortgages. The value of defaulted mortgages in the US in 2009 was about $1.2 trillion.

The literature on mortgage default has emphasised the role of house prices as well as home equity accumulation for the default decision. Spatial dependence refers to the tendency for an observation associated with a location to be dependent on observations at other locations. Whilst existing studies establish the importance of modelling mortgage risk, the risk associated with the influence of spatial dependence is under-explored.


Only recently have models emerged that include geographical interdependence in credit risk analysis for mortgage data. Professor R. Kelley Pace at Louisiana State University finds that allowing for spatial dependence greatly improves the predictive accuracy of credit risk models. The main drawback of his model is that it is built under the assumption that the numbers of defaulted and non-defaulted mortgages are almost the same. In a non-spatial context, it has been shown that the PD is underestimated if the proportion of defaults in the sample is quite low (i.e. lower than 5 per cent), as in the dataset on mortgage loans; there, the estimate can be corrected using the Generalised Extreme Value (GEV) regression model. The application of spatial econometric models to credit risk raises new difficulties that cannot be dealt with using standard econometric models.
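To illustrate why a skewed response curve can matter for rare defaults, the following minimal sketch compares a GEV-type response with the symmetric logistic curve. It assumes the response takes the form of the GEV distribution function, exp(-(1 + tau*eta)^(-1/tau)), where eta is the linear predictor and tau is the shape parameter. The function names and the values of tau and eta are illustrative assumptions and are not taken from this project.

```python
import math


def gev_response(eta: float, tau: float) -> float:
    """Assumed GEV-type response: exp(-(1 + tau*eta)^(-1/tau))."""
    if abs(tau) < 1e-12:
        # The limit tau -> 0 is the Gumbel distribution function.
        return math.exp(-math.exp(-eta))
    base = 1.0 + tau * eta
    if base <= 0.0:
        # Outside the support the distribution function is 0 (tau > 0) or 1 (tau < 0).
        return 0.0 if tau > 0 else 1.0
    return math.exp(-(base ** (-1.0 / tau)))


def logistic_response(eta: float) -> float:
    """Symmetric logistic response, shown for comparison."""
    return 1.0 / (1.0 + math.exp(-eta))


if __name__ == "__main__":
    # Hypothetical linear-predictor values and shape parameter.
    for eta in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(eta, logistic_response(eta), gev_response(eta, -0.25))
```

Unlike the logistic curve, which is symmetric around a probability of 0.5, the GEV-type curve approaches 0 and 1 at different rates, and the shape parameter controls how strong that asymmetry is.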

This project addresses these challenges by proposing a spatial choice model that is accurate in classifying binary rare events and can handle large sample sizes. The proposed approach is based on a skewed and flexible distribution of the error terms, given by the GEV random variable. The tail behaviour of the error distribution is determined by the rarity of the event in the sample, i.e. more unbalanced samples are associated with higher skewness of the error distribution. If the dependent variable at each spatial location is binary, but the underlying latent variable is continuous, evaluating the likelihood function involves the integral of a truncated multivariate distribution whose dimension equals the sample size. For large sample sizes, this becomes a difficult computational problem. To overcome this drawback, an estimation procedure is proposed in which each observation located in space may depend upon only a small number of neighbours.
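The idea that each observation depends on only a small number of neighbours can be sketched with a sparse, row-normalised k-nearest-neighbour weights structure. This is an assumption made for illustration: the text above does not say how neighbours are chosen or weighted, and the names knn_weights and spatial_lag, the coordinates and the value of k are all hypothetical.

```python
import math


def knn_weights(coords, k):
    """Row-normalised k-nearest-neighbour weights, stored sparsely as dicts.

    Brute-force distances are used for clarity; a dataset with hundreds of
    thousands of loans would need a spatial index instead.
    """
    weights = {}
    for i, (xi, yi) in enumerate(coords):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        # Each of the retained neighbours receives an equal share of the weight.
        weights[i] = {j: 1.0 / len(neighbours) for j in neighbours}
    return weights


def spatial_lag(weights, values):
    """Weighted average of the neighbours' values for every observation."""
    return [
        sum(w * values[j] for j, w in weights[i].items())
        for i in range(len(values))
    ]


if __name__ == "__main__":
    # Hypothetical property coordinates and default indicators (1 = default).
    coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
    defaults = [1, 0, 0, 1]
    w = knn_weights(coords, k=2)
    print(spatial_lag(w, defaults))
```

Because every row holds at most k non-zero entries, storage grows linearly with the number of mortgages rather than quadratically, which is the property that makes large samples manageable.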

Preliminary Findings

The model proposed in this project and its competitors are applied to data on 282,366 mortgages from 2009-2010 in Clark County, one of the areas with the largest concentration of subprime mortgages in the US. The default rate is 2.7% in 2009 and increases to 5.54% in 2009-10. The empirical results confirm that the main advantage of the proposed model lies in its superior performance in classifying potentially defaulted mortgages for different default rates in the sample. Another strength of this approach is that it provides more reliable estimates of the probabilities of repayment than classic alternatives. The empirical analysis also shows that spatial dependence has an important impact on model fit. The disturbances in the model can be spatially dependent because location-related variables can be omitted and because nearby properties show similar values for those omitted variables. Furthermore, this research shows that the proposed model provides a more accurate estimate of the official measure of credit risk, known as Value-at-Risk (VaR). As the VaR is central to the determination of capital requirements, financial institutions could provide more reliable risk and capital measurements using the model proposed in this project.


The adoption of the model proposed in this project to analyse mortgage decisions can lead to some significant insights. Conventional models that ignore neighbourhood effects can overestimate the probability of mortgage repayment. This is because a borrower has a higher propensity to repay, holding other things constant, when his/her neighbours also have a high propensity to repay. Therefore, the model suggested in this project can improve the internal assessments of financial institutions when they are evaluating mortgage decisions. It can also provide accurate evaluations of risk generated by relaxing mortgage underwriting standards, which occurred during the 2008/2009 financial crisis.

Raffaella Calabrese

Funding for this project was awarded by the RSA as part of the Membership Research Grant Scheme. Further information on this award can be found at