Machine Learning for inconsistency detection in hospital cesarean bills
International Journal of Development Research
Article History: Received 20th December, 2020; Received in revised form 26th December, 2020; Accepted 11th January, 2021; Published online 28th February, 2021
Copyright © 2021, Ana Carolina Dominicci de Souza et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited
Aims: To identify the Machine Learning models best suited to detecting inconsistencies and irregularities in the final values of bills submitted to a health care plan operator.
Methods: The sample comprised 1,602 medical bills for cesarean hospitalizations from 2015 to 2019, of which 50.20% were inconsistent and 49.80% showed no inconsistencies. Nine variables important to bill auditing were selected. Logistic regression and K-Nearest Neighbors (KNN) were used as the classification algorithms, and the observations were split into training and test sets to verify whether the models performed well in both phases. Root Mean Squared Error (RMSE), the confusion matrix, accuracy, sensitivity, and specificity were calculated, and the Receiver Operating Characteristic (ROC) curve was plotted.
Results: The linear regression model yielded an RMSE of 666.82 in the test phase, a high value indicating that it did not achieve good predictive performance in this study. The KNN algorithm reached 91.20% accuracy and logistic regression reached 91.52% accuracy at a threshold (cutoff) of 0.63, showing good predictive performance for both models, with only a small difference between them.
Conclusions: The results of this study show that only the KNN and logistic regression models are satisfactory tools for classifying inconsistent bills. Logistic regression was nevertheless the better choice, because KNN demands more computational capacity and its processing time would be slow in a real scenario with a larger volume of data. In the future, by adopting the classification models, medical bill auditors could focus on the bills classified as inconsistent instead of reviewing every bill received, making the auditing process more accurate and agile.
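The evaluation measures named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the labels and probabilities are invented toy values, and only the 0.63 cutoff is taken from the abstract. The sketch assumes a classifier has already produced, for each bill, a predicted probability of being inconsistent.

```python
import math


def classify_at_cutoff(probabilities, cutoff=0.63):
    """Label a bill inconsistent (1) when its predicted probability reaches the cutoff."""
    return [1 if p >= cutoff else 0 for p in probabilities]


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn), treating 1 (inconsistent) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate) and specificity (true negative rate).

    Assumes both classes occur in y_true, so no denominator is zero.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def rmse(actual, predicted):
    """Root Mean Squared Error between actual and predicted bill values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Invented toy data, for illustration only (not taken from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
probs = [0.95, 0.80, 0.64, 0.40, 0.70, 0.30, 0.20, 0.62]

y_pred = classify_at_cutoff(probs)  # applies the 0.63 cutoff
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Raising the cutoff generally trades sensitivity for specificity. The ROC curve mentioned in the Methods traces that trade-off across all possible cutoffs.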