Estimation of Residential Property Market Price: Comparison of Artificial Neural Networks and Hedonic Pricing Model

Accurate estimation of real estate prices is important not only in the real estate market but also in the banking sector for collateral loans and in the insurance sector for property insurance. The paper focuses on both traditional and advanced methods of real estate valuation, with particular attention to the accuracy of the valuation models. Among the traditional methods, a regression model representing the hedonic approach is used for residential property price estimation. Advanced valuation methods are represented by the artificial neural network, which is one of the soft computing techniques. The results of both methods in residential property market price estimation are compared. The analysis is performed using data on residential properties sold on the real estate market in the city of Nitra in the Slovak Republic. Residential property prices are estimated using a regression pricing model and artificial neural networks trained with the Levenberg-Marquardt learning algorithm, the Bayesian Regularization learning algorithm, and the Scaled Conjugate Gradient learning algorithm. Among the constructed neural networks, the best results are achieved by networks with two hidden layers trained with the Bayesian Regularization learning algorithm. Their performance is compared with the performance of the regression pricing model, and it can be stated that artificial neural networks can considerably improve prediction accuracy in the estimation of residential property market prices.


1-Introduction
Estimating the right price of residential property is essential not only in the real estate market but also for collateral loans and property insurance. It is also significant in business practice for estimating the value of a company's real estate property [1]. Legal persons and institutions can estimate the market price of residential property using traditional or advanced appraisal methods. Pagourtzi et al. (2003) ranked the comparative method, the profits method, the contractor's/cost method, and regression models among the traditional methods [2]. In the comparative method, the price is estimated by comparison with the selling prices of recently sold similar properties in the same market location. However, this is only possible in areas with a developed real estate market. The profits method can be applied to a property that generates revenue; the market value of the property is estimated from the potential cash flow from its ownership. The contractor's/cost method is based on estimating how much it would cost at the time of sale to construct a property similar to the one being valued, taking into account its obsolescence and depreciation. The method assumes that the buyer will not pay more for the property than it would cost to construct a new building equivalent to the existing one. Regression models are often used as valuation methods [3]. In academic research studies, regression models represent the hedonic approach to property appraisal [4][5][6]. It is a quantitative form of the comparative approach. "The methodology of hedonic prices allows us to estimate the value contributed by each of the attributes (physical or otherwise) to a property, and to make predictions about the behavior of the rest of the properties when any of these elements vary" [7]. Doumpos et al. (2020) analyzed automated valuation models for property price estimation [8]. The authors compared linear and nonlinear regression models developed with global, local, and locally weighted schemes. Their results indicate the effectiveness of simple linear models with locally weighted schemes.
In recent decades, soft computing techniques have considerably expanded their applications in a wide range of sectors such as computer engineering, industry, economics, financial markets, and medicine. They have also found application in property appraisal. Thanks to them, real estate agencies, banks, and mortgage institutions can price properties automatically and with high accuracy, which maximizes the profit on the sale of real estate properties and reduces the risk of losses from collateral loans. Park and Bae (2015) analyzed four machine learning algorithms for house price prediction: C4.5, RIPPER (Repeated Incremental Pruning to Produce Error Reduction), Naïve Bayesian, and AdaBoost (Adaptive Boosting) [9]. They applied the machine learning algorithms to a data set that included real estate data, public school ratings, and mortgage rate data. Baldominos et al. (2018) compared another four machine learning algorithms: support vector regression, k-nearest neighbors, ensembles of regression trees, and the multi-layer perceptron [10]. Bin et al. (2020) used machine learning techniques both for the estimation model and for the fusion of multi-source urban data that feeds the estimation model [11]. They proposed a multi-source urban data fusion algorithm to fuse house attributes, human activities, spatial features, and street-view images. A boosted regression tree then estimates property prices using fused metadata and expected levels. Hong et al. (2020) showed better predictive performance of a machine learning-based predictor (a property price predictor based on the Random Forest method) compared to the hedonic pricing model (an ordinary least squares-based property price predictor) [12]. Moreno-Izquierdo et al. (2018) compared the performance of the Artificial Neural Network (ANN) and the hedonic regression model for the price optimization procedure [7]. They estimated the rental price of Airbnb real estate and reported that ANN achieved a considerable improvement in price accuracy. Georgiadis (2018) analyzed the performance of the Spatial Autoregressive (SAR) model, Geographically Weighted Regression (GWR), multiple linear regression (MLR), and ANN in real estate property price estimation [13]. He reported a slightly higher accuracy for the GWR model compared to ANN. Tabales et al. (2013), Kutasi and Badics (2016), and Abidoye and Chan (2018) compared the accuracy of real estate property appraisal using ANN and multiple regression analysis (MRA), which represents the hedonic pricing model [14][15][16]. The reported results show the ability of ANN to achieve higher accuracy in price estimation. Chiarazzo et al. (2014) reported that the environmental quality of the property location was a significant attribute in property appraisal using ANN [17]. Kang et al. (2020) compared forecasting models for real estate auction sale prices developed through a regression model, ANN, and a genetic algorithm [18]. The forecasting model using the genetic algorithm had the best prediction accuracy. The authors reported that appropriate criteria for the grouping process of the genetic algorithm were crucial in increasing the prediction accuracy of the model, and that grouping based on the auction appraisal price was the most efficient. The reviewed research indicates that ANN is a suitable method for estimating market prices of residential property.
The paper aims to compare the predictive ability of the automated valuation model using ANNs and the hedonic pricing model using the regression method. It estimates market prices of properties sold on the real estate market in the city of Nitra in the Slovak Republic. Based on the performance of the developed ANNs, it seeks a suitable learning algorithm to increase the prediction accuracy of the ANN pricing model. It analyzes ANNs trained using the Levenberg-Marquardt (LM) learning algorithm, the Bayesian Regularization (BR) learning algorithm, and the Scaled Conjugate Gradient (SCG) learning algorithm. The paper continues the research published in the conference paper by Stubnova and Urbanikova (2019), which analyzed the use of ANNs in residential property appraisal [19]. The first chapter of the paper provides an overview of published research papers in the field of real estate property appraisal. The second chapter describes the principle of ANNs and the learning algorithms used. The third chapter describes in detail the collection and processing of the data and the research methodology. The fourth chapter summarizes and interprets the results of the study and relates them to previous research studies. In the final chapter, conclusions are drawn from the results.

2-Artificial Neural Networks
Soft computing techniques, including ANN, can process data with imprecisions, uncertainties, and approximations. Using complex algorithms, they can solve complex problems that are difficult to describe accurately by mathematical models [20]. The inspiration for ANN comes from the human brain. A significant similarity lies in the ability of ANN to learn and thereby improve its performance. The basic building unit of the network is a simplified model of a biological neuron. Neurons process information with an activation function. The individual neurons are connected by oriented weighted connections and are organized into layers to transmit the information. Figure 1 shows a multilayer neural network architecture. There are three types of layers: the input layer, the hidden layer, and the output layer. They differ in the sources of their inputs and the use of their outputs. The input layer processes the data of the independent variables that are inputs to the ANN and transmits them to the next network layer. The hidden layer processes the outputs from previous layers and transmits them to the next layer. The output layer processes the outputs of the previous hidden layer and gives the value of the dependent variable as an output [21,22]. Figure 1 shows a feedforward neural network, in which the signal proceeds through directed connections in one direction: forward.
ANNs learn and store acquired knowledge by adjusting the weight values of the connections and the threshold values of the neurons. Several learning rules can be used when training neural networks. The paper compares the performance of the LM learning algorithm, the BR learning algorithm, and the SCG learning algorithm. These are variations on the Backpropagation algorithm. Li et al. (2012) described the Backpropagation algorithm in two steps [23]. In the first step, the operating signal is propagated forward through the network layers. The difference between the real and the expected output is the error signal. In the second step, the error signal is backpropagated through the network; during this backpropagation, the weight values and threshold values are adjusted. The gradient descent method is used to minimize the error signal and therefore optimize the network performance.
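The two steps of the Backpropagation algorithm described above can be sketched in Python with NumPy. This is a minimal illustration and not the MATLAB implementation used in the paper: the function name `train_backprop`, the tanh activation, the single hidden layer, the learning rate, and the toy data are all assumptions made for the example.

```python
import numpy as np

def train_backprop(X, y, hidden=5, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer network with plain gradient descent.

    Returns the mean squared error recorded at the start of every epoch.
    """
    rng = np.random.default_rng(seed)
    n = len(X)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))  # input -> hidden weights
    b1 = np.zeros(hidden)                                   # hidden thresholds (biases)
    W2 = rng.normal(scale=0.5, size=(hidden, 1))            # hidden -> output weights
    b2 = np.zeros(1)                                        # output threshold (bias)
    history = []
    for _ in range(epochs):
        # Step 1: propagate the operating signal forward through the layers.
        h = np.tanh(X @ W1 + b1)
        y_hat = (h @ W2 + b2).ravel()
        err = y_hat - y                      # error signal
        history.append(float(np.mean(err ** 2)))
        # Step 2: propagate the error signal backward and adjust the parameters.
        g_out = (2.0 / n) * err[:, None]     # gradient of the MSE w.r.t. the output
        g_W2 = h.T @ g_out
        g_b2 = g_out.sum(axis=0)
        g_h = (g_out @ W2.T) * (1.0 - h ** 2)  # chain rule through tanh
        g_W1 = X.T @ g_h
        g_b1 = g_h.sum(axis=0)
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return history

# Hypothetical toy data: one input variable, smooth nonlinear target.
X = np.linspace(-1.0, 1.0, 40)[:, None]
y = np.sin(2.0 * X[:, 0])
history = train_backprop(X, y)
```

The list `history` makes it possible to check that the error signal shrinks over the epochs; the LM, BR, and SCG algorithms replace the plain gradient descent update in step 2 with more elaborate update rules.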

2-1-Levenberg-Marquardt Learning Algorithm
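The LM learning algorithm is a combination of the gradient descent method and the Gauss-Newton method. It minimizes a non-linear function numerically [24]. According to Yu and Wilamowski (2011), the learning rule of the LM algorithm is given by:

$$w_{k+1} = w_k - \left(J_k^T J_k + \mu I\right)^{-1} J_k^T e_k \tag{1}$$

where $w_{k+1}$ and $w_k$ are the values of the weight vector $w$ at iterations $k+1$ and $k$; $\mu$ is a combination coefficient with a positive value; $I$ is the identity matrix; $e$ is the vector of training errors defined as $e = y - \hat{y}$, where $y$ are target values and $\hat{y}$ are output values; and $J$ is the Jacobian matrix defined as:

$$J = \begin{bmatrix}
\frac{\partial e_{1,1}}{\partial w_1} & \frac{\partial e_{1,1}}{\partial w_2} & \cdots & \frac{\partial e_{1,1}}{\partial w_N} \\
\vdots & \vdots & & \vdots \\
\frac{\partial e_{1,M}}{\partial w_1} & \frac{\partial e_{1,M}}{\partial w_2} & \cdots & \frac{\partial e_{1,M}}{\partial w_N} \\
\vdots & \vdots & & \vdots \\
\frac{\partial e_{P,1}}{\partial w_1} & \frac{\partial e_{P,1}}{\partial w_2} & \cdots & \frac{\partial e_{P,1}}{\partial w_N} \\
\vdots & \vdots & & \vdots \\
\frac{\partial e_{P,M}}{\partial w_1} & \frac{\partial e_{P,M}}{\partial w_2} & \cdots & \frac{\partial e_{P,M}}{\partial w_N}
\end{bmatrix} \tag{2}$$

where $N$ is the number of weights, $M$ is the number of outputs, and $P$ is the number of patterns [25].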
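As an illustration of the LM update rule, the following Python sketch fits a two-parameter curve instead of a neural network, so that the Jacobian can be written by hand. The model $\hat{y} = w_0 e^{w_1 x}$, the function name `lm_fit`, the starting point, and the simple rule of dividing or multiplying $\mu$ by 10 are assumptions chosen for the example; they are not taken from the paper.

```python
import numpy as np

def lm_fit(x, y, w0, mu=1e-2, iters=100):
    """Fit y_hat = w[0] * exp(w[1] * x) with a basic Levenberg-Marquardt loop."""
    w = np.asarray(w0, dtype=float)

    def errors(w):
        # Training errors e = y - y_hat.
        return y - w[0] * np.exp(w[1] * x)

    def jacobian(w):
        # J[i, n] = d e_i / d w_n, one row per pattern, one column per weight.
        ex = np.exp(w[1] * x)
        return np.column_stack([-ex, -w[0] * x * ex])

    e = errors(w)
    for _ in range(iters):
        J = jacobian(w)
        # Update rule: w_new = w - (J^T J + mu I)^(-1) J^T e
        step = np.linalg.solve(J.T @ J + mu * np.eye(w.size), J.T @ e)
        w_new = w - step
        e_new = errors(w_new)
        if e_new @ e_new < e @ e:
            # Error decreased: accept the step and move toward Gauss-Newton.
            w, e = w_new, e_new
            mu /= 10.0
        else:
            # Error increased: reject the step and move toward gradient descent.
            mu *= 10.0
    return w

# Hypothetical noiseless data generated with w = (2.0, 1.5).
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(1.5 * x)
w_fit = lm_fit(x, y, w0=[1.0, 1.0])
```

A small $\mu$ makes the step close to the Gauss-Newton step, while a large $\mu$ makes it a short step along the negative gradient, which is the sense in which the algorithm combines the two methods.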

2-2-Bayesian Regularization Learning Algorithm
The BR learning algorithm adds an additional term to the commonly used performance function (Equation 3). With this additional term, the performance function (Equation 4) penalizes the weights to improve the generalization ability of the network. The parameters of the performance functions are optimized according to the LM algorithm.

$$F = E_D = \sum_{i=1}^{n} e_i^2 \tag{3}$$

$$F = \beta E_D + \alpha E_W \tag{4}$$

where $\alpha$ and $\beta$ are parameters to optimize, $E_W$ is the sum of squares of the network weights, $n$ is the number of data points, and $e_i = y_i - \hat{y}_i$, where $y_i$ are target values and $\hat{y}_i$ are output values [26,27,28].
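The effect of the additional term can be illustrated with a short Python sketch. It assumes that the data term is the sum of squared errors and that the regularized performance function has the form $\beta E_D + \alpha E_W$; the function name `br_objective` and the numbers are hypothetical, and the automatic optimization of $\alpha$ and $\beta$ is not shown.

```python
import numpy as np

def br_objective(y, y_hat, weights, alpha, beta):
    """Regularized performance value beta * E_D + alpha * E_W.

    E_D is the sum of squared errors and E_W is the sum of squared weights.
    `weights` is a list of arrays, one per layer.
    """
    e = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    E_D = float(np.sum(e ** 2))
    E_W = float(sum(np.sum(np.asarray(w, dtype=float) ** 2) for w in weights))
    return beta * E_D + alpha * E_W

# Hypothetical targets, outputs, and layer weights.
y = [1.0, 2.0, 3.0]
y_hat = [1.5, 2.0, 2.0]
weights = [np.array([1.0, -1.0]), np.array([0.5])]

unregularized = br_objective(y, y_hat, weights, alpha=0.0, beta=1.0)
regularized = br_objective(y, y_hat, weights, alpha=0.1, beta=1.0)
```

With `alpha=0.0` the value reduces to the plain sum of squared errors; a positive `alpha` adds a cost for large weights, so two networks with the same fit are ranked by the size of their weights.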

2-3-Scaled Conjugate Gradient Learning Algorithm
The SCG learning algorithm is based on the conjugate gradient method, which is suitable for large-scale problems. It uses a step-size scaling mechanism to avoid a line search at each iteration, because the line search makes the training process time-consuming. The calculations used in the algorithm are described in detail by Møller (1993) [29].

3-Methodology and Data
The data were obtained from the internet real estate portal TopReality (2019) in the period from Sept. 9, 2019, to Sept. 13, 2019 [30]. The search criteria for published properties were: property type = apartment, location = city of Nitra, bid category = sale. Based on these search criteria, 711 properties were selected. Mislabeled properties that did not meet some of the criteria, duplicate properties, and properties that did not include all monitored parameters were then excluded, which left 256 properties. Table 1 shows the monitored property parameters. The categorical parameters were converted to dummy variables. Table 2 shows descriptive statistics for the categorical parameters. Table 3 shows descriptive statistics for the quantitative variables.
Figure 2 shows the flowchart of the research methodology. The pricing models were developed using the MATLAB R2019b program. The performance of the hedonic pricing model and of ANNs trained with the LM learning algorithm, the BR learning algorithm, and the SCG learning algorithm was compared. In the paper, the regression model represents hedonic pricing models:

$$y_i = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} + \varepsilon_i \tag{5}$$

where $y_i$ is the $i$th observation of the dependent variable, $x_i = (x_{i1}, \dots, x_{ik})$ is the $i$th observation of the $1 \times k$ vector of the independent variables, $\beta = (\beta_0, \beta_1, \dots, \beta_k)^T$ is the $(k+1) \times 1$ vector of the parameters, where $\beta_0$ is the intercept, and $\varepsilon_i$ is the random error for the $i$th observation.
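The conversion of a categorical parameter to dummy variables and the least squares fit of the regression model can be sketched in Python as follows. The helper names `dummy_encode` and `fit_ols`, the parameter names, the category labels, and the synthetic prices are hypothetical; they only illustrate the mechanics and do not reproduce the data set of the paper.

```python
import numpy as np

def dummy_encode(values, drop_first=True):
    """Return (matrix, kept_levels) of 0/1 dummy variables for a categorical list.

    With drop_first=True the first level (in sorted order) is the baseline,
    which avoids collinearity with the intercept.
    """
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels
    matrix = np.array([[1.0 if v == lvl else 0.0 for lvl in kept] for v in values])
    return matrix, kept

def fit_ols(X, y):
    """Least squares estimate of (beta_0, beta_1, ..., beta_k)."""
    design = np.column_stack([np.ones(len(X)), X])  # prepend the intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

# Hypothetical toy listings: floor area in square meters and a condition label.
area = np.array([45.0, 60.0, 75.0, 52.0, 88.0, 67.0])
condition = ["original", "renovated", "new", "original", "new", "renovated"]

D, names = dummy_encode(condition)       # baseline level is "new"
X = np.column_stack([area, D])

# Synthetic prices generated from known coefficients, without noise.
price = 20000.0 + 1000.0 * area - 8000.0 * D[:, 0] - 3000.0 * D[:, 1]
beta = fit_ols(X, price)
```

Because the synthetic prices contain no random error, `beta` recovers the generating coefficients; each dummy coefficient is the price difference of that category relative to the baseline category, with the other variables held fixed.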
The data were divided into a training and a test set at a ratio of 85:15 for the regression model. The $k+1$ parameters $\beta_j$, $j = 0, \dots, k$, were estimated using the least squares method. The stepwise regression method with backward elimination was used to decide which variables to include in the regression model. It starts with a full model containing all independent variables and eliminates one variable at each step. It chooses to eliminate the variable whose removal causes the smallest increase in the residual sum of squares. To do so, it calculates the $F$-test value (compared against the significance level to stay) for each variable and eliminates the variable with the smallest value. The backward elimination terminates when none of the $F$-test values is less than the critical value for elimination, so that all remaining variables in the model meet the criterion to stay [31].
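The backward elimination procedure can be sketched in Python with NumPy. This is an illustrative version based on the partial F statistic of each variable; the function names, the fixed threshold `f_stay=4.0`, and the synthetic data are assumptions for the example and do not correspond to the settings used in the paper.

```python
import numpy as np

def residual_sum_of_squares(X, y):
    """RSS of the least squares fit of y on an intercept and the columns of X."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return float(residuals @ residuals)

def backward_elimination(X, y, f_stay=4.0):
    """Return the column indices of X that remain after backward elimination."""
    n = len(y)
    kept = list(range(X.shape[1]))       # start with the full model
    while kept:
        rss_full = residual_sum_of_squares(X[:, kept], y)
        df_resid = n - len(kept) - 1
        f_values = []
        for j in kept:
            reduced = [c for c in kept if c != j]
            rss_reduced = residual_sum_of_squares(X[:, reduced], y)
            # Partial F: increase in RSS caused by dropping j, relative to the
            # residual mean square of the current model.
            f_values.append((rss_reduced - rss_full) / (rss_full / df_resid))
        weakest = int(np.argmin(f_values))
        if f_values[weakest] >= f_stay:
            break                        # every remaining variable meets the criterion
        kept.pop(weakest)                # eliminate one variable per step
    return kept

# Hypothetical data: only columns 0 and 2 influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=60)
kept = backward_elimination(X, y)
```

The variable with the smallest partial F value is also the one whose removal increases the residual sum of squares the least, so the two descriptions of the elimination step in the text select the same variable.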
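The data were divided into training, validation, and test sets at a ratio of 70:15:15 for ANNs trained with the LM and the SCG learning algorithms. The data were divided into a training and a test set at a ratio of 85:15 for ANNs trained with the BR learning algorithm and for the regression model. The coefficient of determination ($R^2$), root mean square error ($RMSE$), mean absolute error ($MAE$), and mean absolute percentage error ($MAPE$) were used to evaluate the performance of the pricing models:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{6}$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \tag{7}$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{8}$$

$$MAPE = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{9}$$

where $y_i$ are target values, $\hat{y}_i$ are output values, $n$ is the number of data points, and $\bar{y}$ is the mean of target values.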
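The four performance indices can be computed with a few lines of Python. The sketch below uses the standard definitions of the indices; the function name `regression_metrics` and the four example prices are hypothetical.

```python
import numpy as np

def regression_metrics(y, y_hat):
    """Return R2, RMSE, MAE, and MAPE (in percent) for targets y and outputs y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    e = y - y_hat
    ss_res = np.sum(e ** 2)                  # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares around the mean
    return {
        "R2": float(1.0 - ss_res / ss_tot),
        "RMSE": float(np.sqrt(np.mean(e ** 2))),
        "MAE": float(np.mean(np.abs(e))),
        "MAPE": float(100.0 * np.mean(np.abs(e / y))),
    }

# Hypothetical target and estimated prices.
metrics = regression_metrics([100.0, 200.0, 300.0, 400.0],
                             [110.0, 190.0, 310.0, 380.0])
```

In this example the single error of 20 raises the RMSE (about 13.23) above the MAE (12.5), which illustrates that the RMSE reacts more strongly to large deviations than the MAE.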

4-Results and Discussions
Twenty neural networks, four with one hidden layer and sixteen with two hidden layers, were developed and validated for each of the training algorithms. The values 5, 10, 15, and 20 were used as the number of neurons in the hidden layers. Table 4 shows the performance index values for the developed networks. The network trained with the BR learning algorithm with two hidden layers (10 neurons in the first hidden layer and 20 neurons in the second hidden layer) achieved the best values of $R^2$, $MAE$, and $MAPE$. The network trained with the BR algorithm with two hidden layers (15 neurons in both hidden layers) achieved the best value of $RMSE$. The difference in the results can be explained by the higher sensitivity of $RMSE$ to large deviations. It is clear from Table 4 that networks trained with the BR algorithm achieved significantly better results than networks trained with the LM and the SCG learning algorithms. A regression pricing model using the stepwise regression method with backward elimination was developed to provide a comparison with the neural network models. The resulting model, containing 12 variables and an intercept, was constructed in nine steps. Table 5 shows the $F$-test values for the eliminated variables in each step. (10) Table 6 shows the performance index values for the constructed regression model. The best performing ANNs trained with the BR algorithm achieved $MAPE$ values of 3.39% and 3.58%; in contrast, the regression pricing model achieved a $MAPE$ value of 8.41%. Table 7 shows the estimated property prices for seven selected real estate properties from the test set. Based on the results, it can be stated that pricing models based on ANNs achieved a better predictive ability than the pricing model based on the regression method. Better prediction accuracy of ANNs compared to regression pricing models was also reported by Moreno-Izquierdo et al. (2018), Tabales et al. (2013), Kutasi and Badics (2016), and Abidoye and Chan (2018) [7][14][15][16].
These results are also supported by the literature review by Valier (2020), which examined research papers analyzing the accuracy of automated valuation models [32]. ANNs were indicated as more effective and reliable than regression models for the mass valuation of residential properties in 29 research papers, whereas regression models were indicated as more effective than ANNs in only 6 research papers. Among the constructed neural networks, ANNs trained with the BR algorithm achieved the best results. Kayri (2016) analyzed the predictive ability of neural networks trained with the BR and LM algorithms and achieved higher predictive ability using the BR training algorithm [27]. The BR algorithm can approximate the price function well despite the small data set of 256 observations. The ability to merge the training and validation data sets into a single training set is an advantage of the BR algorithm for small data sets [26]. Based on previous research and the results of our work, it can be concluded that an ANN with the BR learning algorithm and two hidden layers is suitable for the estimation of residential property market prices.
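The set of network architectures compared in this section can be enumerated with a short Python sketch. It assumes that the sixteen two-layer networks are all combinations of the four neuron counts in the first and second hidden layer; the variable names are hypothetical.

```python
from itertools import product

NEURON_COUNTS = (5, 10, 15, 20)
ALGORITHMS = ("LM", "BR", "SCG")

# Four networks with one hidden layer.
one_layer = [(n,) for n in NEURON_COUNTS]
# Sixteen networks with two hidden layers (assumed: every pair of neuron counts).
two_layer = list(product(NEURON_COUNTS, repeat=2))

architectures = one_layer + two_layer
# One experiment per combination of training algorithm and architecture.
experiments = [(alg, arch) for alg in ALGORITHMS for arch in architectures]
```

Under this assumption the grid contains 20 architectures per training algorithm and 60 networks in total, including the two best-performing BR configurations, `(10, 20)` and `(15, 15)`.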

5-Conclusion
The paper analyzes the predictive ability of the automated valuation model using ANNs and the hedonic pricing model using the regression method. Regression pricing models are traditional methods of residential property appraisal. With the increasing use of soft computing techniques in various areas, researchers began to explore the possibilities of their use in the valuation of real estate properties. The paper analyzed the ability of ANNs to accurately estimate the market price of residential properties sold on the real estate market in the city of Nitra in the Slovak Republic. Sixty ANNs trained with the LM learning algorithm, the BR learning algorithm, and the SCG learning algorithm were constructed and validated. Based on the comparison of their prediction accuracy, it can be stated that neural networks trained with the BR learning algorithm achieved the best results in the estimation of the market price of residential properties. The ANN trained with the BR learning algorithm comprising two hidden layers (10 neurons in the first hidden layer and 20 neurons in the second hidden layer) achieved the best results in the monitored performance indices $R^2 = 0.9749$, $MAE = 3561.288$, and $MAPE = 3.39\%$. The ANN trained with the BR learning algorithm comprising two hidden layers (15 neurons in both hidden layers) achieved the best result in the monitored performance index $RMSE = 6220.818$. The regression pricing model constructed using the stepwise regression method with backward elimination achieved significantly worse values of the monitored performance indices: $R^2 = 0.7898$, $RMSE = 11454.91$, $MAE = 8516.60$, and $MAPE = 8.41\%$. The results of the analysis indicate the suitability of using ANN in the estimation of the market prices of residential properties. The use of the BR training algorithm, which achieved the highest predictive ability among the training algorithms used, is recommended.

6-Funding
This paper was supported by the University Grant Agency of Constantine the Philosopher University in Nitra UGA no. VII/16/2019.

7-Conflict of Interest
The authors declare that there is no conflict of interest regarding the publication of this manuscript. In addition, the ethical issues, including plagiarism, informed consent, misconduct, data fabrication and/or falsification, double publication and/or submission, and redundancies have been completely observed by the authors.