A Comparative Machine Learning Survival Models Analysis for Predicting Time to Bank Failure in the US (2001-2023)

Diego Vallarino

doi:10.58567/jea03010007

Open Access Journal Article

by Diego Vallarino ^{a,*}

^{a} Independent Researcher, Spain

^{*} Author to whom correspondence should be addressed.

JEA 2024 3(1):50; https://doi.org/10.58567/jea03010007

Received: 11 June 2023 / Accepted: 9 July 2023 / Published Online: 15 March 2024

Abstract

This study investigates the time to bank failure in the US between 2001 and April 2023, based on data collected from the Federal Deposit Insurance Corporation's report "Bank Failures in Brief - Summary 2001 through 2023". The dataset includes 564 bank failures and several variables that may be related to the likelihood of such events: asset amount, deposit amount, ADR, deposit level, asset level, inflation rate, short-term interest rates, bank reserves, and GDP growth rate. We explore the efficacy of machine learning survival models in predicting time to bank failure and compare the performance of different models. Our findings shed light on the factors that may influence the probability of bank failure over time and provide insights for improving risk management practices in the banking industry.

Keywords: Bank bankruptcy; survival analysis; stratified hazard model; survival machine learning models

Copyright: © 2024 by Vallarino. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) (Creative Commons Attribution 4.0 International License). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

APA Style

Vallarino, D. (2024). A Comparative Machine Learning Survival Models Analysis for Predicting Time to Bank Failure in the US (2001-2023).  Journal of Economic Analysis, 3(1), 50. doi:10.58567/jea03010007

1. Introduction

The forecast of company failure is important in both economics and society. Bankruptcies cause a breach in the business environment's stability, making estimating the sustainability of partners, clients, and financial institutions a particularly difficult and crucial problem for business players.

There is now a large number of bankruptcy prediction models (see M.A. Aziz et al., 2006; H.A. Alaka et al., 2017). However, virtually all of them are classification based: they estimate the posterior probability that a given business will fail based on its financial features, and the time to failure is not explicitly considered. For example, if a classification model is built on data collected one year prior to failure, its output is the posterior probability that a given business will fail within one year. Decisions based on this probability may not be made in time to avert a failure that occurs in much less than a year.

Survival analysis, on the other hand, is concerned with the time at which the event of interest occurs. Despite its prevalence in the medical and engineering disciplines, survival analysis is seldom used to forecast financial failure. In their assessment of bankruptcy prediction models, Aziz and Dar (2006) included 12 kinds of classification models (ranging from discriminant analysis and logit to case-based reasoning, neural networks, and rough sets), but did not discuss survival analysis. According to that publication, the most often used approaches are multiple discriminant analysis and logistic regression; these two models account for more than half of the papers assessed. A 2018 study by H.A. Alaka et al. identified eight common techniques, including two statistical approaches (multiple discriminant analysis and logistic regression) and six machine learning models.

As a result, we may infer that survival analysis is not a primary focus of financial failure prediction research. Our research aims to assess the usefulness of survival analysis (SA) for bankruptcy prediction. Like classification approaches, SA models fall into two types: statistical and machine learning based. Statistical SA models first appeared in the early 1970s, whereas machine learning SA models are the outcome of more recent research. A large body of research finds that machine learning models outperform statistical models in classification and regression tasks, particularly in classification-based bankruptcy prediction (see F. Barboza et al., 2017). Several articles report similar findings on the superiority of machine learning techniques in different areas of survival analysis.

Despite these findings, most authors of bankruptcy prediction approaches, especially those using SA, rely on the most basic statistical models (see A. Beretta et al., 2018; R.C. Cox et al., 2017).

In this paper, we compare several machine learning survival models and provide an economic interpretation of the results. Our analysis focuses on the performance of different models in predicting time to bank failure using a set of relevant variables. Specifically, we compare the predictive power of several machine learning survival models, including the Kernel SVM, DeepSurv, Random Survival Forest, and MTLR models. To compare the different machine learning algorithms, we use the concordance index (C-index).

Our goal is to identify which model provides the most accurate and informative predictions of time to bank failures and to interpret the economic significance of the model's results. To do so, we consider the significance and magnitude of the estimated coefficients for each variable in the model and compare these results to economic theory and intuition.

By analyzing the results of our model comparison and the economic interpretation of these results, we hope to provide insights into the factors that contribute to bank failures and to provide a better understanding of how different models can be used to predict these failures.

The remainder of the paper is structured as follows. Section 2 gives a theoretical perspective on the statistical models used to analyze the probability of bank failure and shows how the statistical tools for studying bank bankruptcy have advanced. One of the new questions that arises in bank failure analysis concerns the probability of survival: understanding the time until bankruptcy becomes critical. That section therefore closes with a short survey of papers that use survival analysis to address financial collapse. Section 3 describes the empirical analysis, including the models used, the data source, and the evaluation metrics.

We then present the results of the analysis, including a comparison of the different models, and we discuss the economic perspective of our findings. Finally, we conclude the paper by summarizing the key findings and their implications for future research and policymaking.

Overall, our study contributes to the literature on the use of survival analysis in finance and provides insights into the factors that drive financial collapses, which can help policymakers to design more effective regulations to prevent such events from occurring in the future.

2. Theoretical perspective

Signals reflecting a company's operational state may reveal symptoms of financial difficulty, which can subsequently be incorporated into prediction models. Financial ratios were the earliest predictors used to forecast bankruptcy, and they have been the most important piece of information in financial distress prediction for decades.

Market-based information may provide a timelier forecast: on the premise of efficient markets, the market price incorporates all expectations about the future. Corporate governance (Li, Crook, Andreeva, & Tang, 2021), corporate efficiency (Li, Crook, & Andreeva, 2017; Paradi, Asmild, & Simak, 2004), external resource considerations, and macroeconomic conditions (Duffie, Saita, & Wang, 2007) are further elements to examine.

Furthermore, in recent years, unstructured data has received a lot of attention in business research. Mai, Tian, Lee, and Ma (2019) utilized textual data, and other work has used image data created from financial documents to forecast company bankruptcy, with convolutional neural networks performing the information extraction. Bankruptcy and financial distress prediction research has used statistical analysis and data mining approaches to improve decision-making tools (Yang, You, & Ji, 2011). Multiple discriminant analysis (MDA) was the first such technique to be applied and was subsequently expanded upon by others.

Later, logistic regression (logit) displaced the Z-score because it yields probabilistic outputs, which became a Basel II criterion. Machine learning algorithms have been appearing in the literature since the last decade of the twentieth century. Jabeur (2023) and Lacher, Coats, Sharma, and Fant (1995) employed neural networks to classify bankrupt and non-bankrupt listed companies.

There are other innovative algorithms, including genetic algorithms (Back, Laitinen, & Sere, 1996), rough sets (Dimitras, Slowinski, Susmaga, & Zopounidis, 1999; Li, Wang, & Deng, 2008), decision trees (Geng, Bose, & Chen, 2015), support vector machines (Hua, Wang, Xu, Zhang, & Liang, 2007), and many hybrid and ensemble models such as those of Choi, Son, & Kim (2018) and Sun et al. (2011).

Another kind of algorithm is mathematical programming. Data envelopment analysis (DEA) is a nonparametric approach for comparing companies and calculating relative efficiency based on the distance to the optimal frontier. Cielen, Peeters, & Vanhoof (2004), Li, Crook, & Andreeva (2014), and others have used DEA to forecast bankruptcy and financial difficulty. Altman, Marco, and Varetto (1994) and Verikas, Kalsyte, Bacauskiene, and Gelzinis (2007), among others, compared the merits of several models. While the preceding strategies treat financial distress as a classification problem, a survival analysis methodology is concerned with the event's timing as well as its occurrence.

Survival analysis can also accommodate time-varying variables and censoring, making it preferable to static classification methods. Lane, Looney, and Wansley (1986) were the first to use the Cox proportional hazard model to forecast bank collapse. Cox proportional hazard models were also employed to forecast the failure of Finnish industrial and retail enterprises, although they were shown to be somewhat inferior to both discriminant and logit analysis.

Shumway (2001) developed a discrete-time bankruptcy hazard model that incorporates both accounting and market data. The discrete hazard model was used by Carling, Pan, Ariyan, Narayan, and Truini (2007), Leong, Nguyen, Meredith, et al. (2008), and Leonardis & Rocci (2009), among others, because of its benefits for parameter estimation and its fit with the type of variables that companies report regularly.

When compared with discriminant analysis and logistic regression in terms of prediction accuracy, the Cox model has been found to be similar at equal misclassification costs but worse at adapting to higher Type I error costs. Survival analysis has also produced good results for troubled enterprises in Indonesia. Recurrent event data are often used in medical research, most notably in the study of epilepsy, asthma, heart attacks, and hospital stays. Within-subject correlation is a key feature of recurrent event data, in which one event raises or reduces the chance of subsequent occurrences.

Traditional statistical methods, such as logistic regression and Cox proportional hazards regression, either ignore the presence of recurring events or fail to account for within-subject correlation, resulting in incorrect estimates of standard errors and a departure from the original research question (Twisk, Smidt, & de Vente, 2005). Many approaches for analyzing recurring events that incorporate all available information and within-subject correlations have been proposed. Marginal intensity approaches, based on various definitions of risk sets, allow all cases to be at risk for each repeated event (Wei, Lin, & Weissfeld, 1989), whereas conditional intensity models are estimated in elapsed time or gap time, and cases are designated at risk for the kth repeated event only after experiencing the (k-1)th event (Prentice, Williams, & Peterson, 1981).

The recurring events in the Andersen-Gill (AG) model are considered to be ordered but to have an equal chance of happening. The Prentice, Williams, and Peterson (PWP) model assumes that a subject is not at risk for a future event until the preceding event has occurred. Even though there is considerable literature on modeling recurrent events using the PWP model in the fields of medicine (e.g., Ejoku, Odhiambo, & Chaba, 2020; Peña, Slate, & González, 2007), consumer behavior (Bijwaard, Franses, & Paap, 2006), and product or equipment reliability (e.g., Jiang, Landers, & Reed Rhoads, 2006), there are just a few studies in corporate finance.

Parker, Peters, and Turetsky (2005), for example, employed the Cox and PWP models to examine the influence of corporate governance characteristics on the recurrent going-concern evaluations performed by auditors on failing enterprises. The PWP model has also been used to examine insurers' recurrent rating shifts and, in the context of corporate loans, to investigate the factors influencing debt contract renegotiations between banks and European corporations.

Zhou et al. (2022) employed Cox analysis to investigate financial distress. They constructed three alternative models, each with its own set of variables, in order to determine, via a single type of survival model, which elements best explain financial distress. These Cox models were not compared with machine learning survival algorithms.

3. Empirical Analysis

In this section, we present the empirical analysis of the risk of bank failure in the United States. We analyze all 564 bank failures that occurred between 2001 and April 2023, as reported in the "Bank Failures in Brief - Summary 2001 through 2023" by the Federal Deposit Insurance Corporation (FDIC). The study of bank failures is of great importance in finance and economics, as it has significant implications for financial stability and the broader economy. In this section, we describe the models used in our analysis, the data source, and the evaluation metrics.

3.1. Models

3.1.1. Cox Proportional Hazards Model (coxph)

The Cox proportional hazards model is a widely used semi-parametric model in survival analysis. It assumes that the hazard function can be represented as the product of a time-dependent baseline hazard function, shared by all units, and a time-independent function of the covariates. Mathematically, the model can be represented as:

h(t | x) = h_{0}(t) \exp(β^{T} x)

where h(t | x) is the hazard function at time t for covariate values x, h_{0}(t) is the baseline hazard function, β is a vector of regression coefficients, and \exp(β^{T} x) is the relative hazard. For a single covariate x_{j}, \exp(β_{j}) is the hazard ratio, which represents the multiplicative change in hazard associated with a unit change in that covariate.
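As a small numeric illustration of the proportional hazards form above, the following Python sketch evaluates the relative hazard \exp(β^{T} x) for two hypothetical banks. The coefficient and covariate values are invented for illustration and are not estimates from this study.

```python
import math

def relative_hazard(beta, x):
    """Return exp(beta^T x), the multiplier applied to the baseline hazard h0(t)."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))

# Hypothetical coefficients for two standardized covariates (not fitted values).
beta = [0.5, -0.2]
bank_a = [1.0, 0.0]   # one unit higher on the first covariate
bank_b = [0.0, 0.0]   # reference bank: all covariates at zero

# Under proportional hazards the ratio is constant in t, because h0(t) cancels.
hazard_ratio = relative_hazard(beta, bank_a) / relative_hazard(beta, bank_b)
print(round(hazard_ratio, 4))
```

Because the two banks differ by one unit in the first covariate only, the ratio equals \exp(β_{1}), which is the usual reading of a Cox coefficient.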

3.1.2. Multi-Task Logistic Regression (MTLR)

Multi-task logistic regression is a machine learning method that can be used for survival analysis. It is a multi-output learning algorithm that can predict the probability of an event occurring at different time points. Mathematically, the model can be represented as:

h(t | x) = \exp\left(\sum_{k=1}^{K} \sum_{j=1}^{p} β_{kj} x_{kj}\right)

where h(t | x) is the hazard rate for an individual with covariates x, β_{kj} are the regression coefficients for the kth characteristic of the jth group, and x_{kj} is the kth feature of the jth group.

3.1.3. Kernel Support Vector Machine (Kernel SVM)

Kernel support vector machines are a popular machine learning method for survival analysis. They can handle non-linear relationships between covariates and outcomes by projecting the data into a higher-dimensional space using a kernel function. The model can be represented as:

f(x) = \operatorname{sign}\left(\sum_{i=1}^{n} α_{i} y_{i} K(x_{i}, x) + b\right)

where K(x_{i}, x) is a kernel function that measures the similarity between the feature vectors x_{i} and x, y_{i} is the class label of the i-th instance, α_{i} are the weights of the support vectors, and b is the bias.
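The decision function above can be evaluated directly once the support vectors, weights, and bias are known. The sketch below is a minimal illustration in Python using a Gaussian (RBF) kernel; the kernel choice, the gamma value, and the two toy support vectors are assumptions made for the example, not settings reported in this study.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def decision(x, support_vectors, alphas, labels, b=0.0, kernel=rbf_kernel):
    """Evaluate sign(sum_i alpha_i * y_i * K(x_i, x) + b)."""
    score = sum(
        a * y * kernel(sv, x)
        for sv, a, y in zip(support_vectors, alphas, labels)
    ) + b
    return 1 if score >= 0 else -1

# Toy support vectors: one per class, with equal weights (hypothetical values).
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
alphas = [1.0, 1.0]
labels = [-1, 1]

print(decision([1.8, 2.1], support_vectors, alphas, labels))
print(decision([0.1, -0.2], support_vectors, alphas, labels))
```

Each query point is assigned the label of the support vector it is most similar to under the kernel, which is how the kernel lets the model capture non-linear boundaries.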

3.1.4. Random Survival Forest

Random survival forests are an extension of random forests for survival analysis. They use an ensemble of decision trees to predict the survival function. The model can be represented as:

h(t | x) = \frac{1}{B} \sum_{b=1}^{B} h_{b}(t | x)

where h_{b}(t | x) is the hazard rate for an individual with covariates x in the b-th decision tree and B is the number of trees in the random forest.
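The averaging step in the ensemble formula above is simple to state in code. The following Python sketch averages per-tree estimates evaluated on a shared time grid; the three "trees" are hypothetical lists of numbers, since growing actual survival trees is outside the scope of this illustration.

```python
def ensemble_average(per_tree_estimates):
    """Average B per-tree curves that are evaluated on the same time grid."""
    n_trees = len(per_tree_estimates)
    n_times = len(per_tree_estimates[0])
    return [
        sum(tree[k] for tree in per_tree_estimates) / n_trees
        for k in range(n_times)
    ]

# Hypothetical per-tree hazard estimates h_b(t | x) at three time points, B = 3.
trees = [
    [0.10, 0.20, 0.30],
    [0.20, 0.20, 0.50],
    [0.30, 0.50, 0.40],
]

forest_estimate = ensemble_average(trees)
print([round(v, 2) for v in forest_estimate])
```

Averaging over trees grown on different bootstrap samples reduces the variance of any single tree's estimate.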

3.1.5. DeepSurv

DeepSurv is a deep learning model for survival analysis. It uses a neural network with a flexible architecture to predict the survival function. The model can be represented as:

h(t | x) = \exp\left(\sum_{i=1}^{p} β_{i} f_{i}(x) + g(h_{θ}(x))\right)

where h(t | x) is the hazard rate for an individual with covariates x, β_{i} are the regression coefficients for the input features f_{i}(x), g(\cdot) is a non-linear function that transforms the output features, and h_{θ}(x) is a neural network with parameters θ.
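To make the role of the network term concrete, the sketch below implements a tiny feed-forward network in plain Python that maps covariates to a scalar log-risk score, which is then exponentiated. It illustrates only the neural network part of the expression above and omits the linear term; the architecture (one hidden tanh layer) and all parameter values are hypothetical.

```python
import math

def network_log_risk(x, hidden_weights, hidden_biases, output_weights, output_bias=0.0):
    """One hidden tanh layer followed by a linear output: returns a scalar log-risk."""
    hidden = [
        math.tanh(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias

# Hypothetical parameters for a network with 2 inputs and 2 hidden units.
hidden_weights = [[0.8, -0.3], [0.1, 0.5]]
hidden_biases = [0.0, 0.0]
output_weights = [1.2, -0.7]

baseline = network_log_risk([0.0, 0.0], hidden_weights, hidden_biases, output_weights)
stressed = network_log_risk([1.0, 0.0], hidden_weights, hidden_biases, output_weights)

# Exponentiating the log-risk gives the multiplier applied to the hazard.
print(math.exp(baseline))
print(math.exp(stressed) > math.exp(baseline))
```

In a trained model the parameters would be learned from data rather than fixed by hand; the point here is only that the network output enters the hazard through the exponential.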

3.2. Data

In this analysis, we examine data on all 564 bank failures that occurred between 2001 and April 2023, as reported by the Federal Deposit Insurance Corporation (FDIC) in the "Bank Failures in Brief - Summary 2001 through 2023". The dataset contains information on several variables that may be related to the probability of bank failure. These variables include asset amount, deposit amount, ADR, deposit level, asset level, inflation rate, short-term interest rates, bank reserves, and GDP growth rate. [1] The database does not contain censored data.

Asset (Millions): The amount of assets a bank owns can be a good indicator of its financial strength and ability to withstand a crisis. A bank with a large amount of assets is less likely to fail.

Deposit (Millions): The amount of deposits a bank holds is another indicator of its financial strength, since it represents the confidence of depositors in the bank. A bank with a large amount of deposits is less likely to fail.

ADR: The level of ADR (loan to deposit ratio) can be a good indicator of a bank's exposure to credit risk. A bank with a high ADR level could be at higher risk of bankruptcy in a recession or financial crisis.

Deposit Level: The level of deposits in relation to the size of the bank can be an indicator of the financial strength of the bank. A bank with a high deposit-to-size ratio is less likely to fail.

Asset Level: The level of assets in relation to the size of the bank can be an indicator of the financial strength of the bank. A bank with a high asset-to-size ratio is less likely to fail.

Inflation: The rate of inflation can affect the financial strength of a bank. High inflation can increase the risk of loan defaults, which would increase the probability of bank failure.

FFRate: The short-term interest rate set by the Federal Reserve can affect the financial strength of a bank. A high interest rate increases borrowing costs and reduces the number of loans that can be made, which can increase the likelihood of bank failure.

BanksRes: Bank reserves are funds that banks hold to cover potential losses on their loans and other assets. The higher a bank's reserves, the greater its ability to withstand a financial crisis and the lower its probability of failure.

GDP1pch: Represents the annual percentage change in the real Gross Domestic Product (GDP) of the United States relative to the previous year. The higher the GDP rate, the greater the bank's ability to survive based on the operation and health of its activity.

3.3. Metrics

3.3.1. C-Index

The C-index (concordance index) generalizes the area under the receiver operating characteristic curve to time-to-event data. It is a widely used metric in survival analysis and medical research for assessing the performance of predictive models that estimate the likelihood of an event occurring over a given time period.

The C-index is computed from the ranking of predicted risk across the units in a dataset. It is the proportion of comparable pairs in which the unit with the higher predicted risk experienced the event before the unit with the lower predicted risk. In other words, it assesses a predictive model's capacity to rank units in order of their likelihood of experiencing the event of interest.

The C-index ranges from 0 to 1, with 0.5 representing random prediction and 1 indicating perfect prediction. In medical research, a C-index value of 0.7 or above is considered satisfactory performance for a prediction model. The formula for the C-index is given below; for non-censored data, δ_{j} = 1 for every unit.

C\text{-index} = \frac{\sum_{i,j} 1_{T_{j} < T_{i}} \cdot 1_{η_{j} > η_{i}} \cdot δ_{j}}{\sum_{i,j} 1_{T_{j} < T_{i}} \cdot δ_{j}}

where

η_{i} is the risk score of unit i;

1_{T_{j} < T_{i}} = 1 if T_{j} < T_{i}, else 0;

1_{η_{j} > η_{i}} = 1 if η_{j} > η_{i}, else 0;

δ_{j} indicates whether the event for unit j was observed (δ_{j} = 1) or censored (δ_{j} = 0).
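The pairwise definition of the C-index can be translated almost line by line into code. The following Python sketch is a direct, quadratic-time transcription of the formula; the function name and the toy data are illustrative assumptions, and tied risk scores receive no credit because the formula uses a strict inequality.

```python
def concordance_index(times, risk_scores, events):
    """Share of comparable pairs (T_j < T_i, delta_j = 1) with eta_j > eta_i."""
    concordant = 0
    comparable = 0
    n = len(times)
    for j in range(n):
        if not events[j]:
            continue  # a pair is comparable only if the earlier unit had the event
        for i in range(n):
            if times[j] < times[i]:
                comparable += 1
                if risk_scores[j] > risk_scores[i]:
                    concordant += 1
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy example with no censoring: four units, one mis-ranked pair.
times = [5, 10, 15, 20]
risk_scores = [0.9, 0.7, 0.8, 0.1]
events = [1, 1, 1, 1]

c_index = concordance_index(times, risk_scores, events)
print(round(c_index, 4))
```

Of the six comparable pairs in the toy data, only the pair with times 10 and 15 is ranked in the wrong order, so the result is 5/6.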

4. Results

The Kaplan-Meier curve displays the survival probability over time for a group of banks. The x-axis shows the time, and the y-axis displays the survival probability. At the start of the observation period, all banks are assumed to be "alive," represented by the value of 1. Over time, some banks may "die," meaning they fail, and their survival probability decreases.

The table below shows the evolution of the risk of bank failures over time. At time 1.07, there were 395 banks in the sample, and one bank failed. This translates to a survival probability of 0.99747 (i.e., (395 - 1)/395). The survival probability at time 4.07 is 0.99494 (i.e., 393/395), indicating that two banks in total had failed by that time.

The survival probability decreases as time progresses. At time 42.37, 382 banks remained at risk, with 13 banks having already failed over the observation period. The survival probability at that time was 0.96456. This means that the cumulative share of failed banks rose between times 1.07 and 42.37. After that time, the survival probability continued to decrease rapidly, suggesting that there was an increase in the risk of bank failures during that period.
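The first survival probabilities quoted above can be reproduced with the Kaplan-Meier product-limit rule, which multiplies, at each failure time, the previous survival probability by the fraction of at-risk units that survive. The Python sketch below assumes no censoring (as stated for this dataset) and uses only the first two failure times and the sample size of 395 mentioned in the text; the function name is illustrative.

```python
def kaplan_meier(event_times, n_at_risk):
    """Product-limit estimate for uncensored data; returns (time, survival) steps."""
    survival = 1.0
    remaining = n_at_risk
    curve = []
    for t in sorted(set(event_times)):
        failures = event_times.count(t)
        survival *= (remaining - failures) / remaining
        remaining -= failures
        curve.append((t, survival))
    return curve

# First two failure times from the text; 395 banks at risk at the start.
curve = kaplan_meier([1.07, 4.07], n_at_risk=395)
for t, s in curve:
    print(t, round(s, 5))
```

With one failure at each of the two times, the estimates are 394/395 and 393/395, matching the values 0.99747 and 0.99494 in the text.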

Between 120 and 150 months, there is a significant drop in the survival probability from 0.4000 to 0.1038. This indicates a much higher risk of failure during this time period and could be indicative of some event or factor that increases the risk of failure during this time.

It's important to note that the analysis does not provide any insight into the cause of the bank failures, and further investigation would be necessary to determine the reasons behind the increase in the risk of bank failures.

4.1. Model comparison

We analyzed the performance of different machine learning survival models in predicting bank failures using a set of relevant variables. The dataset was first divided into a training set and a testing set. The code randomly selects 70% of the rows from the data frame df and assigns them to data.train; the train_index variable stores the numeric row indices of data.train. The remaining rows, which constitute 30% of the original data, are assigned to data.test. This separation allows a model to be trained on the training set and evaluated on the testing set to assess its effectiveness and generalization capabilities. The concordance index (C-index) was used to compare the predictive power of the different models.
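The split described above can be sketched as follows. The identifiers df, data.train, data.test, and train_index suggest the original code was written in another language; this is a Python analogue using only the standard library, with the random seed, the rounding rule, and the stand-in rows all being assumptions for illustration.

```python
import random

def train_test_split(rows, train_fraction=0.7, seed=0):
    """Randomly assign a fraction of the rows to training and the rest to testing."""
    rng = random.Random(seed)                      # fixed seed for reproducibility
    n_train = round(train_fraction * len(rows))
    train_index = sorted(rng.sample(range(len(rows)), n_train))
    chosen = set(train_index)
    data_train = [rows[i] for i in train_index]
    data_test = [row for i, row in enumerate(rows) if i not in chosen]
    return data_train, data_test, train_index

# Stand-in for the data frame: one placeholder row per bank failure.
rows = list(range(564))
data_train, data_test, train_index = train_test_split(rows)
print(len(data_train), len(data_test))
```

With 564 rows and a 70% training fraction, this rule yields 395 training rows and 169 testing rows, and no row appears in both sets.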

The model with the highest C-index, 0.985, was DeepSurv, indicating that it performed best in predicting bank failures using the selected variables. The Random Survival Forest model had the second-highest C-index, 0.798, followed by MTLR with a C-index of 0.741. The Cox model had a C-index of 0.666, while the Kernel SVM model had the lowest C-index, 0.571.

These results suggest that the DeepSurv model was the most effective in predicting bank failures, followed by the Random Survival Forest and MTLR models. This information is valuable for banks and regulatory agencies in predicting the likelihood of bank failures and taking necessary actions to mitigate the risks.

The variables analyzed in the study, including asset amount, deposit amount, ADR, deposit level, asset level, inflation rate, short-term interest rates, bank reserves, and GDP growth rate, can provide insights into the factors that contribute to bank failures. By understanding these variables, banks and regulatory agencies can take measures to reduce the likelihood of bank failures.

These results highlight the potential of ensemble machine learning survival models in predicting bank failures and provide insights into the factors that contribute to these failures. This information can be used to improve the stability of the banking system and reduce the risk of financial crises.

4.2. Economic perspective

4.2.1. Matrix Analysis

The relative weights matrix, represented in the figure below, can be useful in understanding how regulators and analysts assess a bank's risk of failure and which factors they consider most important at different times. However, it is also important to note that these weights may change over time as markets and the economy evolve, and that different regulators and analysts may have slightly different approaches to assessing bankruptcy risk.

In this analysis, it can be seen that at the beginning of the weighting matrix, a relatively high weight is given to the “FFRate” because interest rate fluctuations can have a large impact on a bank's income and expenses, especially in terms of loans and deposits. Also, interest rate changes can signal changes in the broader economy, which can affect the financial health of banks.

However, as time progresses in the weight matrix, it is observed that the "FFRate" loses weight compared to other variables, such as "Inflation" and "BanksRes". This may be due to a number of factors, including the increasing importance of other risk factors such as inflation and a bank's ability to maintain adequate reserves. It may also reflect a heightened awareness on the part of regulators and analysts that the interest rate alone is not enough to assess a bank's risk of failure and that multiple factors need to be considered.

Toward the end of the weight matrix, the results suggest that the sizes of a bank's assets and deposits are important factors in reducing the probability of bankruptcy, while a high level of indebtedness and a low level of reserves increase the probability of bankruptcy. In addition, inflation appears to be a protective factor against bank failures. [2]

4.2.2. 2008 Financial Crisis

It is interesting to note that during the period from 100 to 150, which coincided with the 2008 financial crisis, the survival rate of banks fell from 0.88861 to 0.10380. This suggests that the financial crisis had a significant impact on the ability of banks to remain solvent.

Furthermore, it is important to note that the survival rate continued to decline after the crisis period, albeit at a slower rate. This could be indicative of the aftermath of the crisis, such as the economic downturn that followed and its lingering effects on the economy and the banking sector. Overall, these results underline the importance of considering the economic context and external events when analyzing the financial health of banks.

We can combine the information from the survival table and the weights matrix to better understand the variables that drive failure after period 150. From the survival table, we can see that the survival of banks decreases significantly after period 150. This may indicate that the variables with a greater weight in the weight matrix after period 150 have a greater impact on bank failures.

In the weights matrix, we can see that the variables "BanksRes" and "Inflation" have a relatively high weight after period 150. This suggests that these variables may be more important in predicting bank failures after the financial crisis of 2008. Post-crisis, regulators may have placed more emphasis on the importance of adequate capital buffers and the ability of banks to remain solvent in an environment of rising inflation. Therefore, these factors may have been more important in predicting bank failures in the post-crisis period.

4.2.3. Including the data from GDP1pch

The variables with the greatest weight in predicting bank failure are Asset (Millions), ADR, Deposit Level and Deposit (Millions), in that order, all of them with negative weights, which means that as these variables increase, the probability of bank failure decreases.

The GDP1pch variable has the lowest weight of all, but it is also negative, which suggests that a decrease in economic growth increases the probability of bank failure. The other variables with negative weights are Inflation, FFRate, and BanksRes, indicating that high inflation, high interest rates, and low bank reserves also increase the probability of bank failure.

As for the Asset Level and Deposit Level variables, although they have positive weights, their weights are very low compared to other variables, so their effect in predicting bank failure is probably limited.

In summary, the results showed that banks with lower assets, lower deposit levels, high inflation rates, high interest rates, low bank reserves, and low economic growth are more likely to fail.

5. Conclusion

The evolution of risk over time was used to analyze bank failures, and it showed a substantial decline in survival probability, particularly between 120 and 150 months, coinciding with the years 2009, 2010 and 2011, after the global financial crisis, which originated in the US.

According to the relative weights matrix study, interest rate changes, inflation, bank reserves, and the amount of assets and deposits were all critical variables in determining a bank's risk of failure. It is worth noting that during the 2008 financial crisis, the survival rate of banks declined dramatically, implying a severe effect on their capacity to stay viable. The data offered in this research may be utilized to enhance banking system stability and lessen the likelihood of financial crises.

When several machine learning survival models were compared in forecasting time to bank failure using a collection of relevant characteristics, the DeepSurv model was found to be the most successful, followed by the Random Survival Forest and MTLR models. The study's variables, which included asset amount, deposit amount, ADR, deposit level, asset level, inflation rate, short-term interest rates, bank reserves, and GDP growth rate, may give insight into the causes that lead to bank failures. Banks and regulatory bodies may lower the chance of bank failures by understanding these characteristics.

The initial high weight allocated to the "FFRate" variable in the weight matrix is one striking discovery. This highlights the substantial influence that interest rate swings have on a bank's revenue and costs, notably in terms of loans and deposits. Furthermore, fluctuations in interest rates may serve as indications of larger economic movements, which can have an impact on banks' financial stability.

However, as the study moves through the weight matrix, the weight allocated to "FFRate" decreases in comparison to other factors such as "Inflation" and "BanksRes." This trend might be linked to a number of causes, including the growing prominence of other risk variables such as inflation and a bank's capacity to keep enough reserves. It also implies that measuring a bank's risk of collapse requires taking many elements into account rather than relying on interest rates alone.

Toward the end of the weight matrix, the results suggest that the size of a bank's assets and deposits plays a critical role in reducing the likelihood of failure. A high degree of indebtedness and minimal reserves, on the other hand, increase the chance of failure. Furthermore, inflation appears as a protective factor against bank failures, emphasizing its significance in financial stability.

The analysis yields an important insight about the 2008 financial crisis. During this period, which spanned from month 100 to month 150, the survival rate of banks fell from 0.88861 to 0.10380. This demonstrates the significant effect of the crisis on banks' capacity to remain solvent. It is worth noting that the survival rate continued to fall following the crisis, but at a reduced pace. This reduction may be symptomatic of the continuing effects of the economic slump that followed the crisis, indicating that external events continue to have an impact on the banking industry.
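The decline described above can be restated as a conditional survival probability. The following sketch is illustrative only: it takes the Kaplan-Meier estimates reported in Table 1 and computes S(150)/S(100) together with an average hazard per period. The helper names are hypothetical and are not part of the original analysis.

```python
import math

# Kaplan-Meier survival estimates copied from Table 1 (time point -> survival).
SURVIVAL = {50: 0.9646, 100: 0.8886, 120: 0.4000, 150: 0.1038, 180: 0.0405, 230: 0.0127}


def conditional_survival(surv, start, end):
    """P(T > end | T > start) = S(end) / S(start)."""
    if end < start:
        raise ValueError("end must not precede start")
    return surv[end] / surv[start]


def average_hazard(surv, start, end):
    """Average hazard per period on (start, end]: -ln(S(end) / S(start)) / (end - start)."""
    return -math.log(conditional_survival(surv, start, end)) / (end - start)


crisis_cond = conditional_survival(SURVIVAL, 100, 150)
crisis_hazard = average_hazard(SURVIVAL, 100, 150)
post_hazard = average_hazard(SURVIVAL, 150, 180)

print(f"S(150 | T > 100)       = {crisis_cond:.4f}")
print(f"avg hazard, 100 to 150 = {crisis_hazard:.4f} per period")
print(f"avg hazard, 150 to 180 = {post_hazard:.4f} per period")
```

With the Table 1 values, the conditional survival from period 100 to period 150 is roughly 0.117, and the average hazard between periods 150 and 180 is lower than between periods 100 and 150, in line with the reduced pace noted above.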

When the insights from the survival table and the weight matrix are combined, a clearer picture emerges of the elements that contribute to failure beyond period 150. The survival table shows a considerable decline in bank survival throughout this time period, indicating that factors with higher weights in the weight matrix had a stronger influence. The weight matrix, in particular, shows unusually large weights allocated to "BanksRes" and "Inflation" after the 150th period. In the post-2008 financial crisis environment, these indicators may be increasingly crucial in forecasting bank failures. During this period, regulators may have put a greater focus on the need for appropriate capital buffers and on banks' capacity to remain solvent in the face of growing inflation.

One limitation of our research is the possibility of irregular activities in the investigated banks, as well as a lack of control over their management. Our research is based on official Federal Deposit Insurance Corporation (FDIC) figures, which may not give precise information on particular banks with skewed accounting indicators or irregular activities. As researchers, we do not have direct access to establish the degree of irregular activity inside the studied banks. This issue is inherent in dealing with publicly accessible data sources and might limit our investigation. We acknowledge this issue explicitly and stress the need to take it into account when interpreting our results. By recognizing this limitation, we aim to remain transparent and give readers a clear understanding of the limits imposed by the data sources used in our study.

Based on the findings of the preceding analysis, we suggest the following next steps for further research.

Extend the study to include a wider sample of banks or financial institutions, as well as a longer time span of observation. This might lead to a better understanding of the elements that contribute to bank failures and increase the prediction models' accuracy.

Consider using information on the banks' management practices, corporate governance, or social responsibility efforts as additional data sources or variables in the research. This might give a more comprehensive perspective of the variables influencing bank failures and a more sophisticated understanding of the link between these factors and bankruptcy risk.

Funding Statement

This research received no external funding.

Acknowledgment

The author thanks the anonymous referees for their comments and the editor for their effort.

Declaration of Competing Interest

The author claims that the manuscript is completely original. The author also declares no conflict of interest.

Notes

U.S. Inflation Rate 1960-2023 from World Bank; Federal Funds Effective Rate from FED; Liabilities and Capital: Other Factors Draining Reserve Balances: Reserve Balances with Federal Reserve Banks: Week Average from FED; Real Gross Domestic Product from U.S. Bureau of Economic Analysis.

Inflation could help banks generate more income on the same assets, increasing their ability to maintain adequate reserves and avoid bankruptcy. However, it is important to note that excessive inflation can also be detrimental to banks and the economy in general.

References

Altman, E. I. (1968). Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. The Journal of Finance, 23(4), 589–609. https://doi.org/10.2307/2978933 [Google Scholar ][Crossref]
Altman, E. I., Marco, G., & Varetto, F. (1994). Corporate distress diag- nosis: comparisons using linear discriminant analysis and neural networks (the Italian experience). Journal of Banking & Finance, 18(3), 505–529. https://doi.org/10.1016/0378-4266(94)90007-8 [Google Scholar ][Crossref]
Andersen, P. K., & Gill, R. D. (1982). Cox’s regression model for counting processes: A large sample study. The Annals of Statistics, 10, 1100–1120. [Google Scholar ]
Ascher, H. (1983). Regression analysis of repairable systems reliability. In Electronic systems effectiveness and life cycle costing (pp. 119–133). Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-642-82014-4_8 [Google Scholar ][Crossref]
Back, B., Laitinen, T., & Sere, K. (1996). Neural networks and ge- netic algorithms for bankruptcy predictions. Expert Systems with Applications, 11(4), 407–413. https://doi.org/10.1016/S0957-4174(96)00055-3 [Google Scholar ][Crossref]
Bai, C., Liu, Q., Lu, J., Song, F. M., & Zhang, J. (2004). Corporate governance and market valuation in China. Journal of Comparative Economics, 32(4), 599–616. https://doi.org/10.1016/j.jce.2004.07.00 [Google Scholar ][Crossref]
Balcaen, S., & Ooghe, H. (2006). 35 Years of studies on business failure: an overview of the classical statistical methodologies and their related problems. The British Accounting Review, 38(1), 63–93. https://doi.org/10.1016/j.bar.2005.09.001 [Google Scholar ][Crossref]
Beaver, W. H. (1966). Financial ratios as predictors of failure. Journal of Accounting Research, 4, 71–111. https://doi.org/10.2307/2490171 [Google Scholar ][Crossref]
Bharath, S. T., & Shumway, T. (2008). Forecasting default with the merton distance to default model. Review of Financial Studies, 21(3), 1339–1369. https://doi.org/10.1093/rfs/hhn044 [Google Scholar ][Crossref]
Bijwaard, G. E., Franses, P. H., & Paap, R. (2006). Modeling purchases as repeated events. Journal of Business & Economic Statistics, 24, 487–502. ttps://doi.org/10.1198/073500106000000242 [Google Scholar ][Crossref]
Bonfim, D. (2009). Credit risk drivers: evaluating the contribution of firm level information and of macroeconomic dynamics. Journal of Banking & Finance, 33(2), 281–299. https://doi.org/10.1016/j.jbankfin.2008.08.00 [Google Scholar ][Crossref]
Box-Steffensmeier, J. M., & Boef, S. D. (2006). Repeated events survival models: the conditional frailty model. Statistics in Medicine, 25(20), 3518–3533. https://doi.org/10.1002/sim.2434 [Google Scholar ][Crossref]
Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Monthly Weather Review, 78(1), 1–3. [Google Scholar ]
Cai, J., & Schaubel, D. E. (2003). Analysis of recurrent event data. Handbook of Statistics, 23, 603–623. https://doi.org/10.1016/S0169-7161(03)23034-0 [Google Scholar ][Crossref]
Carling, T., Pan, D., Ariyan, S., Narayan, D., & Truini, C. (2007). Diagnosis and treatment of interval sentinel lymph nodes in patients with cutaneous melanoma. Plastic and Reconstructive Surgery, 119(3), 907–913. [Google Scholar ]
Chang, S.-H., & Wang, M.-C. (1999). Conditional regression analysis for recurrence time data. Journal of the American Statistical Association, 94(448), 1221–1230. https://doi.org/10.1080/01621459.1999.10473875 [Google Scholar ][Crossref]
Chava, S., & Jarrow, R. A. (2004). Bankruptcy prediction with industry effects. Review of Finance, 8(4), 537–569. https://doi.org/10.1093/rof/8.4.537 [Google Scholar ][Crossref]
Choi, H., Son, H., & Kim, C. (2018). Predicting financial distress of contractors in the construction industry using ensemble learning. Expert Systems with Applications, 110, 1–10. https://doi.org/10.1016/j.eswa.2018.05.026 [Google Scholar ][Crossref]
Choodari-Oskooei, B., Royston, P., & Parmar, M. K. (2012). A simulation study of predictive ability measures in a survival model II: ex- plained randomness and predictive accuracy. Statistics in Medicine, 31(23), 2644–2659. https://doi.org/10.1002/sim.4242 [Google Scholar ][Crossref]
Cielen, A., Peeters, L., & Vanhoof, K. (2004). Bankruptcy prediction using a data envelopment analysis. European Journal of Operational Research, 154(2), 526–532. https://doi.org/10.1016/S0377-2217(03)00186-3 [Google Scholar ][Crossref]
Clayton, D. (1994). Some approaches to the analysis of recurrent event data. Statistical Methods in Medical Research, 3(3), 244–262. https://doi.org/10.1177/096228029400300304 [Google Scholar ][Crossref]
Deakin, E. B. (1972). A discriminant analysis of predictors of business failure. Journal of Accounting Research, 3(3), 167–179. https://doi.org/10.2307/2490225 [Google Scholar ][Crossref]
Dimitras, A. I., Slowinski, R., Susmaga, R., & Zopounidis, C. (1999). Business failure prediction using rough sets. European Journal of Operational Research, 114(2), 263–280. https://doi.org/10.1016/S0377-2217(98)00255-0 [Google Scholar ][Crossref]
du Jardin, P. (2017). Dynamics of firm financial evolution and bankruptcy prediction. Expert Systems with Applications, 75, 25–43. https://doi.org/10.1016/j.eswa.2017.01.016 [Google Scholar ][Crossref]
Duffie, D., Saita, L., & Wang, K. (2007). Multi-period corporate default prediction with stochastic covariates. Journal of Financial Economics, 83(3), 635–665. https://doi.org/10.1016/j.jfineco.2005.10.011 [Google Scholar ][Crossref]
Edmister, R. O. (1972). An empirical test of financial ratio analy- sis for small business failure prediction. Journal of Financial and Quantitative Analysis, 7(2), 1477–1493. doi:10.2307/2329929 [Google Scholar ][Crossref]
Ejoku, J., Odhiambo, C., & Chaba, L. (2020). Analysis of recurrent events with associated informative censoring: Application to HIV data. International Journal of Statistics in Medical Research, 9(21). [Google Scholar ]
Geng, R., Bose, I., & Chen, X. (2015). Prediction of financial distress: an empirical study of listed Chinese companies using data mining. European Journal of Operational Research, 241(1), 236–247. https://doi.org/10.1016/j.ejor.2014.08.016 [Google Scholar ][Crossref]
Gepp, A., & Kumar, K. (2008). The role of survival analysis in financial distress prediction. International Research Journal of Finance and Economics, 16(16), 13–34. [Google Scholar ]
Gilson, S. C. (1989). Management turnover and financial distress. Journal of Financial Economics, 25(2), 241–262. https://doi.org/10.1016/0304-405X(89)90083-4 [Google Scholar ][Crossref]
Godlewski, C. J. (2015). The dynamics of bank debt renegotiation in Europe: A survival analysis approach. Economic Modelling, 49, 19–31. https://doi.org/10.1016/j.econmod.2015.03.017 [Google Scholar ][Crossref]
Gönen, M., & Heller, G. (2005). Concordance probability and discrimi- natory power in proportional hazards regression. Biometrika, 92(4), 965–970. https://doi.org/10.1093/biomet/92.4.965 [Google Scholar ][Crossref]
Graf, E., Schmoor, C., Sauerbrei, W., & Schumacher, M. (1999). As- sessment and comparison of prognostic classification schemes for survival data. Statistics in Medicine, 18(17–18), 2529–2545. https://doi.org/10.1002/(SICI)1097-0258(19990915/30)18:17/18<2529::AID-SIM274>3.0.CO;2-5 [Google Scholar ][Crossref]
Graf, E., & Schumacher, M. (1995). An investigation on measures of explained variation in survival analysis. Journal of the Royal Statistical Society. Series D, 44(4), 497–507. https://doi.org/10.2307/2348898 [Google Scholar ][Crossref]
Harrell Jr, F. E., Lee, K. L., & Mark, D. B. (1996). Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statistics in Medicine, 15(4), 361–387. https://doi.org/10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4 [Google Scholar ][Crossref]
Henriques, I. C., Sobreiro, V. A., Kimura, H., & Mariano, E. B. (2020). Two-stage DEA in banks: Terminological controversies and future directions. Expert Systems with Applications, 161, 113632. https://doi.org/10.1016/j.eswa.2020.113632 [Google Scholar ][Crossref]
Hosaka, T. (2019). Bankruptcy prediction using imaged financial ratios and convolutional neural networks. Expert Systems with Applications, 117, 287–299. https://doi.org/10.1016/j.eswa.2018.09.039 [Google Scholar ][Crossref]
Hu, Y. C., & Ansell, J. (2007). Measuring retail company performance using credit scoring techniques. European Journal of Operational Research, 183(3), 1595–1606. https://doi.org/10.1016/j.ejor.2006.09.101 [Google Scholar ][Crossref]
Hu, D., & Zheng, H. (2015). Does ownership structure affect the degree of corporate financial distress in China? Journal of Accounting in Emerging Economies, 5(1), 35–50. https://doi.org/10.1108/JAEE-09-2011-0037 [Google Scholar ][Crossref]
Hua, Z., Wang, Y., Xu, X., Zhang, B., & Liang, L. (2007). Predicting corporate financial distress based on integration of support vector machine and logistic regression. Expert Systems with Applications, 33(2), 434–440. https://doi.org/10.1016/j.eswa.2006.05.006 [Google Scholar ][Crossref]
Jiang, Y., & Jones, S. (2018). Corporate distress prediction in China: a machine learning approach. Account Finance, 58, 1063–1109. https://doi.org/10.1111/acfi.12432 [Google Scholar ][Crossref]
Jiang, S. T., Landers, T. L., & Reed Rhoads, T. (2006). Proportional intensity models robustness with overhaul intervals. Quality and Reliability Engineering International, 22(3), 251–263. https://doi.org/10.1002/qre.713 [Google Scholar ][Crossref]
John, T. A. (1993). Accounting measures of corporate liquidity, leverage, and costs of financial distress. Financial Management, 22(3), 91–100. https://doi.org/10.2307/3665930 [Google Scholar ][Crossref]
Kahl, M. (2002). Economic distress, financial distress, and dynamic liquidation. The Journal of Finance, 57(1), 135–168. https://doi.org/10.1111/1540-6261.00418 [Google Scholar ][Crossref]
Kam, A., Citron, D., & Muradoglu, G. (2010). Financial distress resolution in China – two case studies. Qualitative Research in Financial Markets, 2(2), 46–79. https://doi.org/10.1108/17554171011053667 [Google Scholar ][Crossref]
Kaplan, E. L., & Meier, P. (1958). Nonparametric estimation from in- complete observations. Journal of the American Statistical Association, 53(282), 457–481. https://doi.org/10.1080/01621459.1958.10501452 [Google Scholar ][Crossref]
Kim, M., Ma, S., & Zhou, Y. (2016). Survival prediction of distressed firms: evidence from the Chinese special treatment firms. Journal of the Asia Pacific Economy, 21(3), 418–443. https://doi.org/ 10.1080/13547860.2016.1176645 [Google Scholar ][Crossref]
Kristanti, Farida Titik, & Herwany, Aldrin (2017). Corporate governance, financial ratios, political risk and financial distress: A survival analysis. Accounting and Finance Review, 2(2), 26–34. [Google Scholar ]
Kuhnen, C. M., & Melzer, B. T. (2018). Noncognitive abilities and financial delinquency: the role of self-efficacy in avoiding financial distress. The Journal of Finance, 73(6), 2837–2869. https://doi.org/10.1111/jofi.12724 [Google Scholar ][Crossref]
Kumar, P. R., & Ravi, V. (2007). Bankruptcy prediction in banks and firms via statistical and intelligent techniques – a review. European Journal of Operational Research, 180(1), 1–28. https://doi.org/10.1016/j.ejor.2006.08.043 [Google Scholar ][Crossref]
Lacher, R. C., Coats, P. K., Sharma, S. C., & Fant, L. F. (1995). A neural network for classifying the financial health of a firm. European Journal of Operational Research, 85(1), 53–65. https://doi.org/10.1016/0377-2217(93)E0274-2 [Google Scholar ][Crossref]
Lane, W. R., Looney, S. W., & Wansley, J. W. (1986). An application of the cox proportional hazards model to bank failure. Journal of Banking & Finance, 10(4), 511–531. https://doi.org/10.1016/S0378-4266(86)80003-6 [Google Scholar ][Crossref]
Lee, M. C. (2014). Business bankruptcy prediction based on survival analysis approach. International Journal of Computer Science & Information Technology, 6(2), 103. https://doi.org/10.5121/ijcsit.2014.6207 [Google Scholar ][Crossref]
Leonardis, D. D., & Rocci, R. (2008). Assessing the default risk by means of a discrete – time survival analysis approach. Applied Stochastic Models in Business and Industry, 24(4), 291–306. https://doi.org/10.1002/asmb.705 [Google Scholar ][Crossref]
Leong, R. W., Nguyen, N., Meredith, C. G., et al. (2008). In vivo confocal endomicroscopy in the diagnosis and evaluation of celiac disease. Gastroenterology, 135(6), 1870–1876. https://doi.org/10.1053/j.gastro.2008.08.054 [Google Scholar ][Crossref]
Li, Z., Crook, J., & Andreeva, G. (2014). Chinese companies distress prediction: an application of data envelopment analysis. Journal of the Operational Research Society, 65(3), 466–479. https://doi.org/10.1057/jors.2013.67 [Google Scholar ][Crossref]
Li, Z., Crook, J., & Andreeva, G. (2017). Dynamic prediction of financial distress using Malmquist DEA. Expert Systems with Applications, 80, 94–106. https://doi.org/10.1016/j.eswa.2017.03.017 [Google Scholar ][Crossref]
Li, Z., Crook, J., Andreeva, G., & Tang, Y. (2021). Predicting the risk of fi- nancial distress using corporate governance measures. Pacific-Basin Finance Journal, 68, Article 101334. https://doi.org/10.1016/j.pacfin.2020.101334 [Google Scholar ][Crossref]
Li, H., Wang, Z., & Deng, X. (2008). Ownership, independent direc- tors, agency costs and financial distress: evidence from Chinese listed companies. Corporate Governance: The International Journal of Business in Society, 8(5), 622–636. https://doi.org/10.1108/14720700810913287 [Google Scholar ][Crossref]
Luoma, M., & Laitinen, E. K. (1991). Survival analysis as a tool for company failure prediction. Omega, 19(6), 673–678. https://doi.org/10.1016/0305-0483(91)90015-L [Google Scholar ][Crossref]
Mai, F., Tian, S., Lee, C., & Ma, L. (2019). Deep learning models for bankruptcy prediction using textual disclosures. European Journal of Operational Research, 274(2), 743–758. https://doi.org/10.1016/j.ejor.2018.10.024 [Google Scholar ][Crossref]
Mantel, N. (1966). Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemother Rep, 50(3), 163–170. [Google Scholar ]
Martin, D. (1977). Early warning of bank failure: a logit regression approach. Journal of Banking & Finance, 1(3), 249–276. https://doi.org/10.1016/0378-4266(77)90022-X [Google Scholar ][Crossref]
Merton, R. C. (1974). On the pricing of corporate debt: the risk structure of interest rates. The Journal of Finance, 29(2), 449–470. https://doi.org/10.2307/2978814 [Google Scholar ][Crossref]
Min, J. H., & Lee, Y. C. (2005). Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters. Expert Systems with Applications, 28(4), 603–614. https://doi.org/10.1016/j.eswa.2004.12.008 [Google Scholar ][Crossref]
Moulton, L. H., & Dibley, M. J. (1997). Multivariate time-to-event models for studies of recurrent childhood diseases. International Journal of Epidemiology, 26(6), 1334–1339. https://doi.org/10.1093/ije/26.6.1334 [Google Scholar ][Crossref]
Ohlson, J. A. (1980). Financial ratios and the probabilistic prediction of bankruptcy. Journal of Accounting Research, 18(1), 109–131. https://doi.org/10.2307/2490395 [Google Scholar ][Crossref]
O’Neill, H. M. (1986). Turnaround and recovery: What strategy do you need? Long Range Planning, 19(1), 80–88. https://doi.org/10.1016/0024-6301(86)90131-7 [Google Scholar ][Crossref]
Paradi, J. C., Asmild, M., & Simak, P. C. (2004). Using DEA and worst practice DEA in credit risk evaluation. Journal of Productivity Analysis, 21(2), 153–165. https://doi.org/10.1023/B:PROD.0000016870.47060.0b [Google Scholar ][Crossref]
Parker, S., Peters, G. F., & Turetsky, H. F. (2005). Corporate governance factors and auditor going concern assessments. Review of Accounting and Finance, 4(3), 5–29. https://doi.org/10.1108/eb043428 [Google Scholar ][Crossref]
Peña, E. A., Slate, E. H., & González, J. R. (2007). Semiparametric inference for a general class of models for recurrent events. Journal of Statistical Planning and Inference, 137(6), 1727–1747. https://doi.org/10.1016/j.jspi.2006.05.004 [Google Scholar ][Crossref]
Pfennig, A., Schlattmann, P., Alda, M., Grof, P., Glenn, T., Müller- Oerlinghausen, B., et al. (2010). Influence of atypical features on the quality of prophylactic effectiveness of long-term lithium treatment in bipolar disorders. Bipolar Disorders, 12(4), 390–396. https://doi.org/10.1111/j.1399-5618.2010.00826.x [Google Scholar ][Crossref]
Platt, H. D., & Platt, M. B. (2002). Predicting corporate financial distress: Reflections on choice-based sample bias. Journal of Economics and Finance, 26(2), 184–199. https://doi.org/10.1007/BF02755985 [Google Scholar ][Crossref]
Prentice, R. L., Williams, B. J., & Peterson, A. V. (1981). On the regression analysis of multivariate failure time data. Biometrika, 68, 373–389. https://doi.org/10.1093/biomet/68.2.373 [Google Scholar ][Crossref]
Rahman, M. S., Ambler, G., Choodari-Oskooei, B., & Omar, R. Z. (2017). Review and evaluation of performance measures for survival pre- diction models in external validation settings. BMC Medical Research Methodology, 17(1), 60. https://doi.org/10.1186/s12874-017-0336-2 [Google Scholar ][Crossref]
Schemper, M., & Stare, J. (1996). Explained variation in survival analysis. Statistics in Medicine, 15(19), 1999–2012. https://doi.org/10.1002/(SICI)1097-0258(19961015)15:19<1999::AID-SIM353>3.0.CO;2-D [Google Scholar ][Crossref]
Shumway, T. (2001). Forecasting bankruptcy more accurately: a simple hazard model. Journal of Business, 74(1), 101–124. https://www.jstor.org/stable/10.1086/209665 [Google Scholar ]
Sun, J., Jia, M. Y., & Li, H. (2011). Adaboost ensemble for finan- cial distress prediction: an empirical comparison with data from Chinese listed companies. Expert Systems with Applications, 38(8), 9305–9312. https://doi.org/10.1016/j.eswa.2011.01.042 [Google Scholar ][Crossref]
Tam, K. Y., & Kiang, M. Y. (1992). Managerial applications of neural networks: the case of bank failure predictions. Management Science, 38(7), 926–947. https://doi.org/10.1287/mnsc.38.7.926 [Google Scholar ][Crossref]
Tinoco, M. H., & Wilson, N. (2013). Financial distress and bankruptcy prediction among listed companies using accounting, market and macroeconomic variables. International Review of Financial Analysis, 30, 394–419. https://doi.org/10.1016/j.irfa.2013.02.013 [Google Scholar ][Crossref]
Twisk, J., Smidt, N., & de Vente, W. (2005). Applied analysis of recurrent events: a practical overview. Journal of Epidemiology and Community Health, 59, 706–710. http://dx.doi.org/10.1136/jech.2004.030759 [Google Scholar ][Crossref]
Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B., & Wei, L. J. (2011). On the C-statistics for evaluating overall adequacy of risk predic- tion procedures with censored survival data. Statistics in Medicine, 30(10), 1105–1117. https://doi.org/10.1002/sim.4154 [Google Scholar ][Crossref]
Verikas, A., Kalsyte, Z., Bacauskiene, M., & Gelzinis, A. (2010). Hybrid and ensemble-based soft computing techniques in bankruptcy prediction: a survey. Soft Computing, 14(9), 995–1010. https://doi.org/10.1007/s00500-009-0490-5 [Google Scholar ][Crossref]
Wang, Y. L., & Carson, J. (2010). Macroeconomic factors and insurer rating transitions. Forensic Economics EJournal. Wang, Yuling and Carson, James M., Macroeconomic Factors and Insurer Rating Transitions (February 14, 2010). http://dx.doi.org/10.2139/ssrn.1558456 [Google Scholar ][Crossref]
Wang, Z., & Deng, X. (2006). Corporate governance and financial distress. The Chinese Economy, 39(5), 5–27. https://doi.org/10.2753/CES1097-1475390501 [Google Scholar ][Crossref]
Wang, Z., & Li, H. (2007). Financial distress prediction of Chinese listed companies: a rough set methodology. Chinese Management Studies, 1(2), 93–110. https://doi.org/10.1108/17506140710758008 [Google Scholar ][Crossref]
Wei, L. J., Lin, D. Y., & Weissfeld, L. (1989). Regression analysis of multivariate incomplete failure time data by modelling marginal distributions. Journal of the American Statistical Association, 84(408), 1065–1073. https://doi.org/10.1080/01621459.1989.10478873 [Google Scholar ][Crossref]
Wruck, K. H. (1990). Financial distress, reorganization, and or- ganizational efficiency. Journal of Financial Economics, 27(2), 419–444. https://doi.org/10.1016/0304-405X(90)90063-6 [Google Scholar ][Crossref]
Wu, D., Liang, L., & Yang, Z. (2008). Analysing the financial distress of Chinese public companies using probabilistic neural networks and multivariate discriminate analysis. Socio-Economic Planning Sciences, 42(3), 206–220. https://doi.org/10.1016/j.seps.2006.11.002 [Google Scholar ][Crossref]
Yang, Z., You, W., & Ji, G. (2011). Using partial least squares and support vector machines for bankruptcy prediction. Expert Systems with Applications, 38(7), 8336–8342. https://doi.org/10.1016/j.eswa.2011.01.021 [Google Scholar ][Crossref]
Zhou, F., Fu, L., Li, Z., & Xu, J. (2022). The recurrence of financial distress: A survival analysis. International Journal of Forecasting, 38(3), 1100-1115. https://doi.org/10.1016/j.ijforecast.2021.12.005 [Google Scholar ][Crossref]

Figure 1. Kaplan-Meier survival curve.

Figure 2. Kaplan-Meier survival curve broken down by Deposits Level (1=low - 3=high).

Figure 3. Kaplan-Meier survival curve broken down by Asset Level (1=low - 3=high).

Table 1. Kaplan-Meier survival probabilities (survival) at different time points.

Figure 4. Results from different machine learning models.

Figure 5. Relative weight of each variable based on MTLR model.

Call: survfit(formula = Surv(time, status) ~ 1, data = data.train)
time	n.risk	n.event	survival	std.error	lower 95% CI	upper 95% CI
50	381	14	0.9646	0.00930	0.9465	0.9830
100	353	30	0.8886	0.01583	0.8581	0.9202
120	158	193	0.4000	0.02465	0.3545	0.4514
150	41	117	0.1038	0.01535	0.0777	0.1387
180	16	25	0.0405	0.00992	0.0251	0.0655
230	5	11	0.0127	0.00562	0.0053	0.0302
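The confidence limits in this table can be cross-checked from the survival and std.error columns. The sketch below assumes the limits were computed on the log scale, S * exp(± z * se / S), which is the default `conf.type = "log"` of R's `survfit`; the table itself does not state this, so the choice is an assumption, and the helper name `log_ci` is hypothetical.

```python
import math

# Rows of the table: (time, survival, std.error, lower 95% CI, upper 95% CI).
ROWS = [
    (50, 0.9646, 0.00930, 0.9465, 0.9830),
    (100, 0.8886, 0.01583, 0.8581, 0.9202),
    (120, 0.4000, 0.02465, 0.3545, 0.4514),
    (150, 0.1038, 0.01535, 0.0777, 0.1387),
    (180, 0.0405, 0.00992, 0.0251, 0.0655),
    (230, 0.0127, 0.00562, 0.0053, 0.0302),
]

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def log_ci(surv, se, z=Z_95):
    """Log-scale interval: S * exp(-z * se / S) to S * exp(+z * se / S), capped at 1."""
    half_width = z * se / surv
    lower = surv * math.exp(-half_width)
    upper = min(1.0, surv * math.exp(half_width))
    return lower, upper


max_gap = 0.0  # largest absolute difference from the tabulated limits
for time, surv, se, tab_lower, tab_upper in ROWS:
    lower, upper = log_ci(surv, se)
    max_gap = max(max_gap, abs(lower - tab_lower), abs(upper - tab_upper))
    print(f"t={time:>3}  lower={lower:.4f}  upper={upper:.4f}")

print(f"largest gap to tabulated limits: {max_gap:.5f}")
```

Under this assumption the recomputed limits agree with the tabulated ones to within rounding of the reported standard errors.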
