Customer churn prediction in the software as a service industry
Author | Affiliation
---|---
 | Centre for Applied Research and Development
LT | Centre for Applied Research and Development
 | Centre for Applied Research and Development
 | Centre for Applied Research and Development
 | Centre for Applied Research and Development
Date | Start Page | End Page |
---|---|---|
2023 | 99 | 99 |
In a modern commercial environment where consumers can choose among many alternatives for identical products, customer retention is pivotal to sustainable business success. This research investigates customer churn prediction using a range of machine learning algorithms: logistic regression, support vector machines, decision trees, random forests, and gradient-boosted trees. We use real-world data obtained from a company that offers subscription-based services designed to enhance individuals’ personal development. The dataset includes business-related customer data, such as money spent, the last payment date, and total orders completed, as well as platform usage data, such as the number of activities completed and the time since account creation. Several experiments were conducted to explore feature subsets obtained via “Boruta”, “Boruta Shap”, decision tree feature importance, and correlation coefficient techniques, with the aim of identifying the most promising feature set within different prediction time horizon windows. The trained models were evaluated on multiple performance metrics, including accuracy, precision, recall, and F1 score. The gradient-boosted trees algorithm emerged as the most promising model for predicting customer churn, with an overall accuracy of 95.5%.
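The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary labels where 1 means the customer churned and 0 means the customer was retained. The function name `evaluate_churn_predictions`, the `ChurnMetrics` container, and the sample labels are hypothetical and are not taken from the study's code or data.

```python
from dataclasses import dataclass


@dataclass
class ChurnMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate_churn_predictions(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary churn labels.

    Assumption: 1 = churned (positive class), 0 = retained.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # retained, correctly
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed churners

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return ChurnMetrics(accuracy, precision, recall, f1)


# Hypothetical labels for ten customers, for illustration only.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0]
metrics = evaluate_churn_predictions(y_true, y_pred)
print(metrics)
```

Reporting precision, recall, and F1 alongside accuracy matters when churners are a minority of customers, because a model that predicts "retained" for everyone can still score a high accuracy while catching no churners.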