The Benefits of Automated Machine Learning in Hospitality: A Step-By-Step Guide and AutoML Tool

Mauro Castelli, Diego Costa Pinto, Saleh Shuqair, Davide Montali, Leonardo Vanneschi

Abstract


This manuscript presents a tool to estimate and predict data accuracy in hospitality by means of automated machine learning (AutoML). It uses the Tree-based Pipeline Optimization Tool (TPOT) as its methodological framework. TPOT is an AutoML framework based on genetic programming, and it is particularly useful for generating classification models, performing regression analysis, and determining the most accurate algorithms and hyperparameters for hospitality data. To demonstrate the tool's practical usefulness, we apply it to a real-world dataset that maps key hospitality variables (customer satisfaction, loyalty) to revenue, and show that the TPOT findings provide further improvement, with up to 93% prediction accuracy on unseen data.
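The search idea behind a genetic-programming AutoML framework can be illustrated with a minimal sketch. The code below is not TPOT and not the authors' pipeline. It is a toy evolutionary search, written with the Python standard library only, that picks between two hypothetical candidate models (a k-nearest-neighbour averager and a one-feature ridge regression) and tunes one hyperparameter for each. The data are synthetic and invented for illustration (a single "satisfaction" score driving "revenue"), and all function names (`make_data`, `fit_predict`, `evolve`, and so on) are assumptions introduced here.

```python
import random


def make_data(n, rng):
    """Synthetic, made-up data: revenue as a noisy linear function of satisfaction."""
    xs = [rng.uniform(1.0, 5.0) for _ in range(n)]
    ys = [20.0 * x + 5.0 + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def fit_predict(candidate, train_x, train_y, test_x):
    """Fit one candidate (algorithm name, hyperparameter) and predict test_x."""
    kind, param = candidate
    if kind == "knn":
        # Average the targets of the k training points closest to x.
        k = min(max(1, int(param)), len(train_x))
        preds = []
        for x in test_x:
            order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
            nearest = order[:k]
            preds.append(sum(train_y[i] for i in nearest) / len(nearest))
        return preds
    # "ridge": one-feature ridge regression with L2 penalty `param`.
    mx = sum(train_x) / len(train_x)
    my = sum(train_y) / len(train_y)
    sxy = sum((x - mx) * (y - my) for x, y in zip(train_x, train_y))
    sxx = sum((x - mx) ** 2 for x in train_x)
    slope = sxy / (sxx + param)
    intercept = my - slope * mx
    return [slope * x + intercept for x in test_x]


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def random_candidate(rng):
    """Draw a random algorithm together with a random hyperparameter."""
    if rng.random() < 0.5:
        return ("knn", rng.randint(1, 15))
    return ("ridge", rng.uniform(0.0, 50.0))


def mutate(candidate, rng):
    """Perturb the hyperparameter, or occasionally switch to a fresh candidate."""
    kind, param = candidate
    if rng.random() < 0.2:
        return random_candidate(rng)
    if kind == "knn":
        return ("knn", max(1, param + rng.choice([-2, -1, 1, 2])))
    return ("ridge", max(0.0, param * rng.uniform(0.5, 1.5)))


def evolve(generations=5, population_size=10, seed=0):
    """Evolve candidates on a validation split, then score the best on unseen data."""
    rng = random.Random(seed)
    xs, ys = make_data(120, rng)
    train_x, train_y = xs[:80], ys[:80]
    val_x, val_y = xs[80:100], ys[80:100]
    test_x, test_y = xs[100:], ys[100:]

    def fitness(candidate):
        return r2(val_y, fit_predict(candidate, train_x, train_y, val_x))

    population = [random_candidate(rng) for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[:max(1, population_size // 2)]  # keep the fitter half
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population_size - len(survivors))]
        population = survivors + children

    best = max(population, key=fitness)
    test_score = r2(test_y, fit_predict(best, train_x, train_y, test_x))
    return best, test_score


if __name__ == "__main__":
    best, score = evolve()
    print("best candidate:", best, "R^2 on unseen data:", round(score, 3))
```

Selection uses a validation split, and the held-out split is touched only once at the end, which mirrors the abstract's emphasis on accuracy for unseen data. A real TPOT run evolves whole preprocessing-plus-model pipelines rather than a single hyperparameter, so this sketch should be read only as an illustration of the selection and mutation loop.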

 

DOI: 10.28991/ESJ-2022-06-06-02



Keywords


Artificial Intelligence; Automated Machine Learning; Behavioral Research; Hospitality.




Copyright (c) 2022 Mauro Castelli, Diego Costa Pinto, Saleh Shuqair, Davide Montali, Leonardo Vanneschi