Modified Weighted Mean Filter to Improve the Baseline Reduction Approach for Emotion Recognition

Participants' emotional reactions are strongly influenced by several factors such as personality traits, intellectual abilities, and gender. Several studies have examined the baseline reduction approach for emotion recognition using electroencephalogram signal patterns containing external and internal interferences, which prevented it from representing participants’ neutral state. Therefore, this study proposes two solutions to overcome this problem. Firstly, it offers a modified weighted mean filter method to eliminate the interference of the electroencephalogram baseline signal. Secondly, it determines an appropriate baseline reduction method to characterize emotional reactions after the smoothing process. Data collected from four scenarios conducted on three datasets was used to reduce the interference and amplitude of the electroencephalogram signals. The result showed that the smoothing process can eliminate interference and lower the signal's amplitude. Based on the three baseline reduction methods, the Relative Difference method is appropriate for characterizing emotional reactions in different electroencephalogram signal patterns and has higher accuracy. Based on testing on the DEAP dataset, these proposed methods achieved accuracies of 97.14, 99.70, and 96.70% for the four categories of emotions, the two categories of arousal, and the two categories of valence, respectively. Furthermore, on the DREAMER dataset, these proposed methods achieved accuracies of 89.71, 97.63, and 96.58% for the four categories of emotions, the two categories of arousal, and the two categories of valence, respectively. Finally, on the AMIGOS dataset, these proposed methods achieved accuracies of 99.59, 98.20, and 99.96% for the four categories of emotions, the two categories of arousal, and the two categories of valence, respectively.

Although several baseline reduction methods have been applied to deep learning, recording the EEG signal without external or internal interference is complex even though the participants are calm [5,9]. Internal disturbances experienced during calm conditions included Electrooculograms (EOG), Electrocardiograms (ECG), Electromyograms (EMG), and emotional reactions [5,13]. External interference is often caused by electrode power lines. This disturbance constantly increases the EEG baseline signal amplitude fluctuation in the long term [14][15][16][17]. It also leads to its inability to represent a neutral condition. On the contrary, Narayana et al. (2019) [6], and Zhuang et al. (2018) [7] stated that neutral states are usually fulfilled if the EEG signal amplitude is lower than the concentration and emotional conditions. It is essential to examine the appropriate method used to reduce the interference encountered when recording the baseline EEG signal in this study. Several approaches have been employed to eliminate interference or artefacts, including regression, Wavelet Transform, Independent Component Analysis (ICA) [18][19][20][21], and Principal Component Analysis (PCA). However, these algorithms focus only on detecting and removing some of them, such as EOG, ECG, and EMG.
Another commonly used approach is smoothing. This process improves data quality by attenuating noise so that the signal becomes smoother. Among the available smoothing techniques, the Mean Filter method reduces interference in EEG signals by smoothing short-term amplitude fluctuations. It has several advantages, such as a low MSE value and a low computational time compared with other methods [22]. However, the Mean Filter method cannot reduce the long-term amplitude of the EEG baseline signal [23,24], which motivates the use of weight values. The weights are derived from the baseline EEG signal amplitude and are used to preserve the signal pattern even though the amplitude is lowered. Considering that the baseline EEG signal has non-stationary characteristics, Z-score normalization can be used both to determine the weights and to normalize the data [25]. Modifying the Mean Filter approach by adding a weight value based on Z-score normalization yields the Modified Weighted Mean Filter (MWMF) method.
The earlier description led to the formulated problem that the baseline EEG signal is disturbed both internally and externally, therefore, it is unable to represent each participant's neutral condition. This led to the proposition of two main contributions, namely (1) Offering a Modified Weighted Mean Filter method to eliminate the interference of the EEG baseline signal. (2) Determining an appropriate baseline reduction method to characterize emotional reactions in different EEG signal patterns using the feature values after smoothing with the MWMF approach. Based on the literature review from 2015 to 2021, the proposed contribution has never been evaluated in previous studies [3].

2-Related Works
The baseline reduction approach has been shown to increase the accuracy of emotional recognition based on EEG signals [4,10]. It works by subtracting the average baseline feature value from the feature value of the experiment EEG signal. This approach aims to produce signals representing different emotional reactions [12]. The emotional and neutral states are represented by the experiment and baseline EEG signals, respectively. The baseline EEG signal was recorded before the participants were given the stimulus medium, when they were expected to be calm and free from emotional reactions [16,26,27]. Neutral conditions have lower amplitude values than meditation, concentration, and emotional states [6,7]. The baseline reduction approach was first investigated in [4]. This process employed the feature values of the experiment EEG signal and the average baseline, both obtained from the extraction procedure using the Differential Entropy (DE) method. This is a suitable approach because it characterizes the EEG signal with respect to both its time series and its frequency [28,29]. The value of the DE is obtained using Equation 1 [3,30]:

$$h(X) = \frac{1}{2}\log\left(2\pi e \sigma^{2}\right) \qquad (1)$$

where $e$ is Euler's number, $\sigma^{2}$ is the variance, and $h$ is the DE for each segment of the EEG signal. The DE feature values from the baseline and experiment EEG signals were used for the reduction process. In [4], the baseline reduction process employs the Difference method, and the result of the reduction is taken as the final feature value of the experiment EEG signal. Furthermore, [4] represented the DE feature values of an experiment EEG signal in a 3D Cube, organized spatially by frequency band and EEG channel. The DE feature values in the 3D Cube are then used as input data for the classification method, for which [4] uses a CNN.
To avoid the loss of feature information from the input data (9 × 9 matrices), the CNN method proposed in [4] uses SAME padding and does not employ pooling. As shown in Figure 1, the 3D Cube input data is convolved four times. The 1st, 2nd, and 3rd convolutions use a 4 × 4 filter with a stride of 1. The 4th convolution uses a 1 × 1 filter, also with a stride of 1. The feature map of the fourth convolution is reshaped and connected to the fully connected hidden layer, each node of which is connected to the output layer (two outputs). Each convolution is activated using a Rectified Linear Unit (ReLU). Based on tests on the DEAP dataset, average accuracies of 89.45% and 90.24% were obtained for recognizing valence and arousal, respectively. The baseline reduction approach thus allows emotional recognition based on EEG signals to achieve higher accuracy [4,8].

Figure 1. The architecture of the CNN method for four emotion classes [4]
Nevertheless, studies on the baseline reduction approach are still being carried out, such as that in [8]. A combination of CNN and LSTM methods was proposed to optimize the extraction and classification processes, with the CNN used for feature extraction and the LSTM for emotion classification. Based on tests on the DEAP dataset, the proposed baseline reduction approach produced accuracies of 90.80% and 91.03% for valence and arousal, respectively [8]. In addition to using a hybrid method for the extraction and classification processes, Cheng et al. (2020) attempted to optimize the baseline reduction approach by performing extraction and classification with the multi-Grained Cascade Forest (gcForest) technique. Based on tests carried out on the DEAP dataset, the proposed approach achieved accuracies of 97.53% and 97.69% for arousal and valence, respectively. On the DREAMER dataset, accuracies of 90.41% and 89.03% were obtained for arousal and valence, respectively [9]. Liu et al. (2020) proposed a Capsule Network method that considers the spatial information of EEG signals in the baseline reduction approach. On the DEAP dataset, the average accuracies for arousal and valence were 98.31% and 97.97%, respectively. On the DREAMER dataset, accuracies of 94.59% and 95.26% were obtained for arousal and valence, respectively [10]. Because these studies identified only two categories of emotions, Zhao et al. (2020) [11] used the baseline reduction approach to identify four categories, namely high arousal and positive valence, high arousal and negative valence, low arousal and positive valence, and low arousal and negative valence. This investigation used the CNN method for the extraction and classification processes. The tests carried out on the DEAP and AMIGOS datasets yielded mean accuracies of 93.53% and 95.86%, respectively, for the four emotional classes.
In addition to the extraction and classification processes, other efforts to optimize the baseline reduction approach were carried out by Wirawan et al. (2021), who examined three baseline reduction methods, namely the Difference, Relative Difference, and Fractional Difference methods [4,12,31]. The Difference method's baseline reduction process was first applied in [4]. The three methods are defined as follows:
• The Difference Method: The baseline reduction process is achieved by subtracting the average DE feature value of the baseline EEG signal from the DE feature value of the experiment EEG signal. The Difference method is obtained using Equation 2:

$$\mathrm{Final\_DE}_{j}(x) = \mathrm{Exper\_DE}_{j}(x) - \mathrm{Base\_Mean}(x) \qquad (2)$$

• The Relative Difference Method: The baseline reduction process is achieved by dividing the DE feature value of the experiment EEG signal by the average DE feature value of the baseline EEG signal [12], using Equation 3:

$$\mathrm{Final\_DE}_{j}(x) = \frac{\mathrm{Exper\_DE}_{j}(x)}{\mathrm{Base\_Mean}(x)} \qquad (3)$$

• The Fractional Difference Method: This method combines the Difference and Relative Difference methods. The average DE feature value of the baseline EEG signal is subtracted from the DE feature value of the experiment EEG signal [12], and the result is then divided by the average DE feature value of the baseline EEG signal, as given in Equation 4:

$$\mathrm{Final\_DE}_{j}(x) = \frac{\mathrm{Exper\_DE}_{j}(x) - \mathrm{Base\_Mean}(x)}{\mathrm{Base\_Mean}(x)} \qquad (4)$$

wherein $\mathrm{Exper\_DE}_{j}(x)$ denotes the DE feature value of the x frequency band at the j-th index of an experiment EEG signal, $\mathrm{Base\_Mean}(x)$ is the baseline EEG signal's average DE feature value for the x frequency band, and $\mathrm{Final\_DE}_{j}(x)$ denotes the final DE feature value of the x frequency band at the j-th index of an experiment EEG signal. The $\mathrm{Base\_Mean}(x)$ calculation is given in Equation 5:

$$\mathrm{Base\_Mean}(x) = \frac{1}{N}\sum_{j=1}^{N} \mathrm{Base\_DE}_{j}(x) \qquad (5)$$

wherein $\mathrm{Base\_DE}_{j}(x)$ defines the DE feature value of the frequency band x at the j-th index of the EEG baseline signals. N denotes the number of DE feature values of EEG baseline signals, while C is the number of channels. $\mathrm{Base\_Mean}(x)$ indicates the average DE feature for the frequency band x on the EEG baseline signal. Based on the classification process carried out with the CNN method and tested on the DEAP dataset, the Difference, Relative Difference, and Fractional Difference methods achieved accuracies of 81.25, 82.16, and 82.10% for arousal, respectively, and 80.61, 81.37, and 81.47% for valence. These three baseline reduction methods produced very similar accuracies. Further study is therefore needed to determine an appropriate baseline reduction approach for characterizing emotional reactions in different EEG signal patterns [12].
Although several reduction methods have been applied to deep learning, recording the baseline EEG signal without external or internal interference is difficult even when participants are calm. The internal disturbances include Electrooculograms (EOG), Electrocardiograms (ECG), Electromyograms (EMG), and emotional reactions. External interference is often caused by the electrode power lines, and it constantly increases the fluctuation of the EEG baseline signal amplitude in the long term [14][15][16][17]. As a result, the signal is unable to represent a neutral condition. Three public EEG datasets, namely DEAP, DREAMER, and AMIGOS, are often used to investigate emotional recognition, particularly in studies that adopt baseline reduction approaches. For participants selected at random from these datasets, the amplitudes of the baseline and experiment EEG signals were the same, as shown in Figures 2 to 4.
The raw EEG signals obtained from the first trial with the first participant are shown in Figure 2. In the DEAP dataset, the duration of each experiment was 63 seconds. The first three seconds are the baseline EEG signals, while seconds 4 to 63 are those of the experiment. Each second contains 128 samples (128 Hz sampling rate). Figure 3 shows the raw EEG signals of the eighth participant during the eighth trial, with an overall duration of 75.9 seconds. In the AMIGOS dataset, the first five seconds are the baseline EEG signals, while the remainder, from second 6 to 75.9, is that of the experiment, again sampled at 128 Hz. Figure 4 shows the raw EEG signals obtained from the tenth participant in the tenth trial, with an overall duration of 72 s. In the DREAMER dataset, the first five seconds are the baseline EEG signals, while seconds 6 to 72 are those of the experiment, also sampled at 128 Hz. In all three cases, the baseline EEG signal amplitude is the same as that of the experiment EEG signal. Both external and internal interferences cause this problem during the recording process [5,9,32]. Internal disturbances include Electrooculograms (EOG), Electrocardiograms (ECG), Electromyograms (EMG), and emotional reactions. The power line current often causes external interference at each electrode [14][15][16]. This constantly increases the fluctuations of the EEG signal amplitude in the long term, so that the baseline is unable to represent neutral conditions [16,17]. Conversely, Narayana et al. (2019) [6] and Zhuang et al. (2018) [7] reported that the neutral state is fulfilled if the EEG signal amplitude is lower than that of the concentration and emotional states.
This study designed a Modified Weighted Mean Filter (MWMF) method to reduce interference and to represent the neutral condition in the baseline EEG signal. This technique is a development of the Mean Filter method in which a weight value is added. The Mean Filter is given by Equation 6:

$$z_{j} = \frac{1}{2n+1}\sum_{i=j-n}^{j+n} x_{i} \qquad (6)$$

where $x_i$ denotes the input value of the EEG signal at the i-th index, $2n+1$ is the window length, and $z_j$ indicates the current output value at the j-th index. The window length significantly affects delay and latency: a longer window increases both, and vice versa [33]. Based on these conditions, the window length parameter used in this study was 1 (n = 1). The Mean Filter method reduces noise in the baseline EEG signal through a smoothing process [22]. Weights are added to this approach so that the baseline EEG signal represents the neutral condition: their purpose is to reduce the signal amplitude, since a neutral state tends to be achieved when the baseline EEG amplitude is lowered. The weight values were generated by normalizing the signal amplitudes, which also preserves the baseline EEG signal pattern even though the amplitude is lowered. The Z-score method was employed to normalize the baseline EEG signals, because it tends to overcome the non-stationary values in the EEG signal. Several critical procedures need to be considered when developing EEG signal-based emotional recognition models, such as feature extraction, representation, and classification [3,34,35]. The determination of these frameworks refers to the research carried out by Wu et al. [4,12]. Secondary datasets, namely DEAP, DREAMER, and AMIGOS, were used to validate the emotional recognition.

3-The Proposed Model
Based on the two contributions proposed in this study, the stages of emotional recognition concerning EEG signals are arranged, as shown in Figure 5.

Figure 5. Stages of emotional recognition based on EEG signals
The model comprises four emotional recognition steps, two of which (shown as blue rectangles) are the contributions of this study. The first is a smoothing approach, namely the Modified Weighted Mean Filter technique, used to eliminate interference and decrease the amplitude of the baseline EEG signals. The second contribution is to determine which of the three reduction methods is appropriate for characterizing emotional reactions from different EEG signal patterns, based on the smoothed signals. Furthermore, this model uses EEG signal data to recognize both four and two emotional classes. This study used three public datasets, namely DEAP, DREAMER, and AMIGOS. These have different characteristics regarding the number of channels used, whether trials were performed individually or in groups, and the trial duration. The use of these three datasets was crucial for validating the proposed model.

3-1-1-Segmentation
This process divides EEG signals into baseline and experiment segments. In the DEAP dataset, the first three seconds are the baseline signals (samples $x_{0} \ldots x_{383}$), while seconds 4 to 63 are those of the experiment (samples $x_{384} \ldots x_{8063}$). In DREAMER and AMIGOS, the first five seconds are the baseline signals, while the remaining seconds are those of the experiment. Figure 6 shows the segmentation process for the DEAP dataset on the Fp1 channel for the first participant [4].

Figure 6. EEG signal segmentation processes on channel Fp1
The segmentation process was performed for each channel during the diverse trials. The DEAP dataset contains 32 channels and 40 trials for each participant. Meanwhile, that of DREAMER includes 14 channels and 18 trials. The AMIGOS dataset consists of 14 channels and 20 trials.
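The split described above can be sketched in a few lines. This is a minimal illustration only: the function name `split_baseline_experiment` and the synthetic signal are assumptions made for the example, while the 128 Hz sampling rate and the baseline durations are the ones stated in the text.

```python
SAMPLING_RATE_HZ = 128  # samples per second, as stated for all three datasets

# Baseline duration in seconds per dataset, as described in the text.
BASELINE_SECONDS = {"DEAP": 3, "DREAMER": 5, "AMIGOS": 5}


def split_baseline_experiment(signal, dataset):
    """Split one channel of one trial into (baseline, experiment) samples."""
    cut = BASELINE_SECONDS[dataset] * SAMPLING_RATE_HZ
    if len(signal) <= cut:
        raise ValueError("signal is shorter than the baseline segment")
    return signal[:cut], signal[cut:]


# Synthetic stand-in for one 63-second DEAP channel (63 * 128 = 8064 samples).
deap_channel = [float(i) for i in range(63 * SAMPLING_RATE_HZ)]
baseline, experiment = split_baseline_experiment(deap_channel, "DEAP")
print(len(baseline), len(experiment))  # 384 7680
```

In practice the same call would be repeated for every channel and trial of a participant.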

3-1-2-Smoothing Baseline Signal (Contribution)
A neutral state is achieved by eliminating disturbances and decreasing the amplitude of the baseline EEG signals, and this process is the main contribution of this study. A Modified Weighted Mean Filter (MWMF) method was proposed for this purpose. Weight values were generated by normalizing the signal amplitudes, which also preserves the baseline EEG signal pattern even though its amplitude is reduced. The Z-score technique was used to normalize the baseline EEG signals and to overcome the non-stationary values. Based on Equation 6, the baseline procedure using the MWMF method consisted of three processes, namely normalization, padding, and smoothing.

• Normalization
The values resulting from the normalization process were converted using an absolute function (ABS). This produces positive weight values and addresses the problem of outliers when smoothing the baseline EEG signal data. Table 1 shows the values obtained using Z-Score normalization for the three-second baseline (384 samples). Equation 7 is used to calculate the weight value determined using the Z-Score normalization method [25,36]:

$$w_{i} = \left|\frac{x_{i} - \bar{x}}{\sigma}\right| \qquad (7)$$

where $\bar{x}$ is the average value of the EEG baseline signal data, $\sigma$ is the standard deviation, $x_i$ is the value of the EEG baseline signal data at the i-th index, and $w_i$ is the weight value at the i-th index. The index i = 0, 1, 2, 3, …, M-1, where M is the number of data points (samples).

• Padding
The padding process is performed by adding a null value (0) at both the beginning and end of the EEG baseline signal data ($x_i$) and the weight value data ($w_i$). This allows the MWMF method to perform the smoothing operation on all baseline EEG signal data ($x_i$) and weight value data ($w_i$). Given that the window length parameter is one (n = 1), a single zero is added at each end of the data. Table 2 shows the padding process for the three-second baseline signal of the DEAP dataset. One second of EEG signal contains 128 samples, so the three-second baseline signal contains 384 samples. After padding, the three-second baseline EEG signal yields $x_i$ and $w_i$ sequences of 386 data points for the DEAP dataset. For the DREAMER and AMIGOS datasets, the five-second baseline yields $x_i$ and $w_i$ sequences of 642 data points.

• Smoothing Stage
After the weight and padding processes were performed, the baseline EEG signal was smoothed using the MWMF method. Table 3 shows the smoothing process for the baseline EEG signal data (DEAP dataset), starting from the 2nd to the 385th data point. Based on Table 3, the smoothing process of the EEG baseline signal using the MWMF method is represented by Equation 8:

$$z_{j} = \sum_{i=j-n}^{j+n} \hat{w}_{i}\, x_{i} \qquad (8)$$

In the MWMF method, j = n, n+1, n+2, …, m+2n, where m is the amount of data. Considering that the total weight needs to satisfy $\sum \hat{w}_{i} = 1$, the Z-Score weight values are normalized within each window as $\hat{w}_{i} = w_{i} / \sum_{k=j-n}^{j+n} w_{k}$ [37].
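The three processes (normalization, padding, smoothing) can be put together in a short sketch. It is one possible reading rather than the authors' implementation: it assumes the weights are the absolute Z-scores, that the weights inside each window are rescaled to sum to one as the text requires, that the population standard deviation is used, and that a window whose weights are all zero falls back to the plain mean. The function names are hypothetical.

```python
import statistics


def zscore_weights(baseline):
    """Absolute Z-score of each sample, used as its non-negative weight."""
    mean = statistics.fmean(baseline)
    std = statistics.pstdev(baseline)  # assumption: population standard deviation
    if std == 0:
        return [0.0] * len(baseline)
    return [abs((x - mean) / std) for x in baseline]


def mwmf(baseline, n=1):
    """Weighted mean filter sketch with half-window n (window of 2n + 1)."""
    weights = zscore_weights(baseline)
    # Zero-pad signal and weights with n values on each side,
    # e.g. 384 samples become 386 for n = 1.
    x = [0.0] * n + list(baseline) + [0.0] * n
    w = [0.0] * n + weights + [0.0] * n
    smoothed = []
    for j in range(n, n + len(baseline)):
        window_x = x[j - n : j + n + 1]
        window_w = w[j - n : j + n + 1]
        total = sum(window_w)
        if total == 0:
            # Assumption: use the unweighted mean when every weight is zero.
            smoothed.append(sum(window_x) / len(window_x))
        else:
            # Dividing by the window total makes the weights sum to one.
            weighted = sum(wi * xi for wi, xi in zip(window_w, window_x))
            smoothed.append(weighted / total)
    return smoothed


toy_baseline = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up values
print(mwmf(toy_baseline))
```

The output has the same length as the input baseline, so the later one-second feature extraction can be applied to it unchanged.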

3-1-3-Decomposition
Decomposition was carried out on the smoothed baseline and experiment EEG signals. A bandpass filter was used to separate both signals into four frequency bands. This method filters the EEG signals according to the low and high cutoff frequencies of each band. Table 4 shows the low and high cutoffs for each frequency band. Decomposition was performed for all channels on the baseline and experiment EEG signals, as shown in Table 5 for the Fp1 channel.

3-2-1-Extraction Features
The extraction process was carried out to obtain the relevant features of the EEG signal. This research employed the Differential Entropy (DE) method, as shown in Equation 1. Feature extraction was performed every second (128 samples) on the smoothed baseline and the experiment EEG signal in each frequency band, as shown in Table 6 for the Fp1 channel. The DEAP dataset produced three DE features for the baseline ($h_{1}(x)$ to $h_{3}(x)$) and 60 for the experiment EEG signal ($h_{4}(x)$ to $h_{63}(x)$) in each channel, frequency band, trial, and participant. The DREAMER and AMIGOS datasets produced five DE features for the baseline EEG signals. The number of DE features from the experiment EEG signal, by contrast, depends on the trial duration for each participant, channel, and frequency band.
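As an illustration of the per-second extraction, the sketch below computes one DE value for each one-second window. It assumes the Gaussian closed form of DE, 0.5 · ln(2πe·σ²), with the natural logarithm and the population variance; the function names and the synthetic signal are hypothetical.

```python
import math
import statistics

SAMPLING_RATE_HZ = 128  # samples per second, as stated in the text


def differential_entropy(segment):
    """DE of one segment under a Gaussian assumption."""
    variance = statistics.pvariance(segment)  # assumption: population variance
    if variance <= 0:
        raise ValueError("DE is undefined for a constant segment")
    return 0.5 * math.log(2 * math.pi * math.e * variance)


def de_per_second(signal, rate=SAMPLING_RATE_HZ):
    """One DE value per full, non-overlapping one-second window."""
    return [
        differential_entropy(signal[start : start + rate])
        for start in range(0, len(signal) - rate + 1, rate)
    ]


# Synthetic three-second "baseline" (384 samples) gives three DE features.
toy_signal = [math.sin(0.1 * i) for i in range(3 * SAMPLING_RATE_HZ)]
print(len(de_per_second(toy_signal)))  # 3
```

Applied to a 60-second DEAP experiment segment of one band-filtered channel, the same function would return 60 values.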

3-2-2-Baseline Reduction (Contribution)
After the feature values of the smoothed baseline and the experiment EEG signals were obtained, a reduction process was performed. Three baseline reduction methods were examined to determine the most appropriate one for describing emotional responses from different EEG signal patterns. The DE features from both signals were used for the baseline reduction. This research reviewed three reduction processes, namely the Difference, Relative Difference, and Fractional Difference methods. As the initial step, the average value of the DE features of the smoothed baseline EEG signal was calculated using Equation 5 for each frequency band, channel, and trial of one participant in the DEAP dataset. Next, Equations 2 to 4 were applied to carry out the baseline reduction for the Difference, Relative Difference, and Fractional Difference methods in each frequency band, as shown in Table 7. Based on Table 7, 60 DE features ($\mathrm{Final\_DE}_{4}$ to $\mathrm{Final\_DE}_{63}$) were generated for one channel (Fp1), frequency band, trial, participant, and baseline reduction method on the DEAP dataset.
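The three reduction methods differ only in how the baseline average is combined with each experiment feature, which a small sketch makes concrete. The function names and the numeric values are made up for illustration; the operations follow the verbal definitions of the Difference, Relative Difference, and Fractional Difference methods.

```python
def baseline_mean(baseline_de):
    """Average DE feature of the baseline segment for one frequency band."""
    return sum(baseline_de) / len(baseline_de)


def difference(exper_de, base_mean):
    """Subtract the baseline average from each experiment feature."""
    return [value - base_mean for value in exper_de]


def relative_difference(exper_de, base_mean):
    """Divide each experiment feature by the baseline average."""
    if base_mean == 0:
        raise ZeroDivisionError("baseline average is zero")
    return [value / base_mean for value in exper_de]


def fractional_difference(exper_de, base_mean):
    """Subtract the baseline average, then divide by it."""
    if base_mean == 0:
        raise ZeroDivisionError("baseline average is zero")
    return [(value - base_mean) / base_mean for value in exper_de]


# Made-up DE values: three baseline seconds, two experiment seconds.
base = baseline_mean([2.0, 2.5, 1.5])
exper = [3.0, 1.0]
print(difference(exper, base))             # [1.0, -1.0]
print(relative_difference(exper, base))    # [1.5, 0.5]
print(fractional_difference(exper, base))  # [0.5, -0.5]
```

Note that the Fractional Difference output equals the Relative Difference output minus one, so the two methods differ only by a constant offset.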

3-2-3-Representation Features
After the baseline reduction process for all frequency bands, EEG channels, and trials for one participant was completed (as shown in Table 7), the feature representation procedure was performed. The features were arranged according to the international 10-20 system. The reduced DE feature of the experiment EEG signal ($\mathrm{Final\_DE}$) was mapped onto a 9 × 9 matrix for each frequency band. This matrix describes the placement position of all channels on the head. The matrices of all frequency bands were then combined using the 3D Cube method. Figure 7 shows the feature representation process for all channels in the DEAP dataset (32 channels).
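A rough sketch of the mapping is given below. The grid coordinates in `CHANNEL_POSITIONS` are hypothetical placeholders for a handful of channels, not the layout of Figure 7, and the band labels are generic stand-ins for the four bands of Table 4. Filling unused cells with zero is also an assumption.

```python
GRID_SIZE = 9
BANDS = ("band_1", "band_2", "band_3", "band_4")  # placeholders for Table 4

# Hypothetical (row, col) cells for a few 10-20 channels, for illustration only.
CHANNEL_POSITIONS = {
    "Fp1": (0, 3),
    "Fp2": (0, 5),
    "Cz": (4, 4),
    "O1": (8, 3),
    "O2": (8, 5),
}


def to_grid(channel_values):
    """Place one value per channel into a 9 x 9 grid; other cells stay 0."""
    grid = [[0.0] * GRID_SIZE for _ in range(GRID_SIZE)]
    for name, value in channel_values.items():
        row, col = CHANNEL_POSITIONS[name]
        grid[row][col] = value
    return grid


def to_cube(band_features):
    """Stack one grid per band into a nested list indexed [band][row][col]."""
    return [to_grid(band_features[band]) for band in BANDS]


# Made-up reduced DE values for one second of one trial.
features = {band: {"Fp1": 1.2, "Cz": 0.8, "O2": 1.1} for band in BANDS}
cube = to_cube(features)
print(len(cube), len(cube[0]), len(cube[0][0]))  # 4 9 9
```

One such cube would be produced per second of the experiment segment and passed to the classifier.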

3-3-Classification
The DE features of the experiment EEG signal represented in the 3D Cube are used as the input data in the classification process. This study employed the Convolutional Neural Network (CNN) method for this procedure, adopting the CNN architecture of Yang et al. (2018), configured for either two or four emotional categories, as shown in Figure 1. The two emotional categories are high and low, for arousal and for valence. The four emotional categories are high arousal and positive valence (HAPV), high arousal and negative valence (HANV), low arousal and negative valence (LANV), and low arousal and positive valence (LAPV). This study uses L2 regularization and the Adam optimizer to minimize the loss. The hyperparameters were a learning rate of 1e-4, 50 epochs, and a batch size of 128.

3-4-Assessment Model
The assessment consisted of two parts, namely the evaluation method and the performance parameters. The evaluation method used in this research was K-fold cross-validation with K = 10. This was carried out on each participant in the three public datasets. The evaluation results were measured using the following parameters: accuracy, precision, recall, and F1 score.
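The evaluation loop and the four performance parameters can be sketched as follows. This is a simplified illustration: the folds are contiguous and unshuffled (the text does not say how samples were assigned to folds), the metrics are shown for the two-class case only, and the function names are hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 score for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


folds = list(k_fold_indices(25, k=10))
print(len(folds))  # 10
print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 1]))
```

For the four-class case, the same precision and recall definitions would be applied per class and then averaged.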

4-Results and Discussion
Based on the tests carried out on the DEAP, DREAMER, and AMIGOS datasets, the smoothing process performed using the Modified Weighted Mean Filter method tends to reduce the interference and the amplitude of the EEG baseline signal. The aim was to represent the neutral condition of the participants. Figure 8 shows the baseline EEG signal patterns before and after smoothing using the MWMF method. The application of the MWMF method was used to prove that the baseline EEG signal can represent a neutral condition. Furthermore, its smoothed features were used to reduce the experiment EEG signals. This process aims to characterize emotional reactions according to the different signal patterns. The validation process was carried out using four scenarios for each dataset. The essence was to evaluate the ability of the MWMF method to represent neutral conditions on the EEG signal and assess its impact on optimizing the baseline reduction approach.
• The first test scenario uses a three-second smoothed baseline EEG feature for the DEAP dataset (with MWMF) and a five-second smoothed baseline EEG feature for the DREAMER and AMIGOS datasets (with MWMF). This baseline signal feature was then used for baseline reduction with the three methods, namely the Difference, Relative Difference, and Fractional Difference methods. In addition, this scenario examined the baseline reduction process using an unsmoothed baseline EEG feature (without MWMF).
• In the second scenario, the smoothing process used the first three seconds of the experiment EEG signal in the DEAP dataset (with MWMF) and the first five seconds in the DREAMER and AMIGOS datasets (with MWMF). This experiment signal feature was then used for baseline reduction with the Difference, Relative Difference, and Fractional Difference methods. In addition, this scenario examined the baseline reduction process using an unsmoothed experiment EEG feature (without MWMF).
• In the third scenario, the smoothing process used the last three seconds of the experiment EEG signal in the DEAP dataset (with MWMF) and the last five seconds in the DREAMER and AMIGOS datasets (with MWMF). This experiment signal feature was then used for baseline reduction with the Difference, Relative Difference, and Fractional Difference methods. In addition, this scenario examined the baseline reduction process using an unsmoothed experiment EEG feature (without MWMF).
• In the fourth scenario, the smoothing process used three seconds from the middle of the experiment EEG signal in the DEAP dataset (with MWMF) and five seconds from the middle of the signal in the DREAMER and AMIGOS datasets (with MWMF). This feature was then used for baseline reduction with the Difference, Relative Difference, and Fractional Difference methods. In addition, this scenario examined the baseline reduction process using an unsmoothed experiment EEG feature (without MWMF).
The total number of validation processes is twenty-four for each dataset (four scenarios, three baseline reduction methods, each with and without MWMF). Figure 9 shows the validation process of the proposed method on the DEAP dataset. Each test result is the average accuracy for the four emotional classes. Additionally, statistical analysis was carried out using the Wilcoxon test to assess the twenty-four validation processes on each dataset. It was used to determine whether the accuracy for the four emotional classes increased or decreased significantly for the three baseline reduction methods with and without the MWMF method. This led to the development of two hypotheses.
• Ho: There is no significant difference in the accuracy of the baseline reduction methods with and without the MWMF approach. This hypothesis is accepted if the 2-tailed value is greater than or equal to the significance level (α = 0.05), expressed as 2-tailed ≥ α.
• Ha: There is a significant difference in the accuracy of the baseline reduction methods with and without the MWMF approach. This hypothesis is accepted if the 2-tailed value is smaller than the significance level (α = 0.05), expressed as 2-tailed < α.
Table 8 shows the average accuracy and Wilcoxon test results for the baseline reduction methods, with and without the MWMF approach, for the three datasets. The validation process was carried out on the DEAP, DREAMER, and AMIGOS datasets to determine the average four-class emotional recognition accuracy with and without the MWMF method, and the two sets of results were compared using the Wilcoxon test. This analysis reports several values supporting the significance test, namely Positive Ranks, Negative Ranks, Ties, and the 2-tailed value. A Positive Rank indicates the number of data points whose accuracy increased when the baseline reduction method was used with MWMF rather than without it (with MWMF > without MWMF). A Negative Rank indicates the number of data points whose accuracy decreased (with MWMF < without MWMF). Ties represent the number of data points whose accuracy did not change (with MWMF = without MWMF). The 2-tailed value (Asymp. sig.) is the significance value obtained by comparing the baseline reduction method with and without MWMF, and it was used to test the hypotheses. Ha was accepted if the 2-tailed value was < 0.05; otherwise, Ho was accepted.
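The quantities reported in Table 8 can be illustrated with a compact Wilcoxon signed-rank sketch. It assumes the large-sample normal approximation without tie or continuity correction, so its p-value may differ slightly from what a statistics package reports; the function name and the per-participant accuracies are made up.

```python
import math


def wilcoxon_signed_rank(with_mwmf, without_mwmf):
    """Rank counts, rank sums, and an approximate two-tailed p-value."""
    diffs = [a - b for a, b in zip(with_mwmf, without_mwmf)]
    nonzero = [d for d in diffs if d != 0]  # ties are excluded from ranking
    n = len(nonzero)
    # Rank |d| from 1..n, giving equal magnitudes their average rank.
    order = sorted(range(n), key=lambda idx: abs(nonzero[idx]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, nonzero) if d < 0)
    if n == 0:
        p_value = 1.0
    else:
        mean = n * (n + 1) / 4
        sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
        z = (min(w_plus, w_minus) - mean) / sd
        p_value = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal tail
    return {
        "positive_ranks": sum(d > 0 for d in diffs),
        "negative_ranks": sum(d < 0 for d in diffs),
        "ties": sum(d == 0 for d in diffs),
        "w_plus": w_plus,
        "w_minus": w_minus,
        "p_two_tailed": p_value,
    }


# Made-up accuracies (%) for four participants.
result = wilcoxon_signed_rank([90, 95, 80, 88], [80, 90, 80, 89])
print(result)
print("accept Ha" if result["p_two_tailed"] < 0.05 else "accept Ho")
```

With so few made-up participants the approximation is crude; it is shown only to make the Positive Ranks, Negative Ranks, Ties, and 2-tailed columns concrete.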
Based on the validation on the DEAP, DREAMER, and AMIGOS datasets, the Relative and Fractional Difference methods tend to produce higher accuracy for four-class emotional recognition when combined with the MWMF method. In all scenarios, Ha was accepted because the 2-tailed value (reported as 0.000) is less than 0.05. All participants experienced increased accuracy in the DEAP and AMIGOS datasets: the Positive Ranks value was 32 for DEAP and 31 for AMIGOS. In the DREAMER dataset, only 1 of the 23 participants experienced a decrease in accuracy (Negative Rank), while the remaining 22 participants experienced increased accuracy. The statistical tests therefore indicate a significant increase in accuracy when the Relative or Fractional Difference method is combined with the MWMF method.
The MWMF method reduces the interference and amplitude of the baseline EEG signal, so that the smoothed signal better represents the neutral condition of the participants [6,7]. As a smoothing process, it can also eliminate interference and decrease the amplitude of the experimental EEG signal. Features extracted from both smoothed signals resulted in a significant increase in accuracy, as shown in Table 8. The baseline reduction approach can use all EEG signal features to represent emotional reactions in different patterns. The feature values from smoothed signals significantly increased the accuracy of emotional recognition when used in the Relative and Fractional Difference methods. With the Difference method, by contrast, the raw EEG signal does not represent emotional reactions in different patterns, because the smoothed features yield low values. The Difference method therefore did not affect the raw EEG signal pattern, as stated in Equation 2. As shown in Figures 10-A and 10-B, the raw EEG signal was unchanged before and after baseline reduction with this method. The resulting pattern was still rough even though the signal had undergone the baseline reduction and smoothing processes. Compared to the Difference method, applying the MWMF approach to the Relative and Fractional Difference techniques yielded a raw EEG signal with a stable pattern. Figures 10-C and 10-D show the reduced pattern obtained with both methods, which is more stable than the signal before reduction shown in Figure 10-A. However, the Fractional Difference method combines the Difference (subtraction) and Relative Difference (division) approaches. Although it increases emotional recognition accuracy when combined with the MWMF method, the effective part of the baseline reduction is the division process (as shown in Equation 9) [12]. Therefore, the Relative Difference approach is more appropriate for baseline reduction.
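The contrast between the three baseline reduction methods can be sketched as follows. The exact forms of Equations 2 and 9 are not reproduced in this section, so the formulas below are assumptions inferred only from the description above: Difference as subtraction, Relative Difference as division, and Fractional Difference as subtraction followed by division. The function names and feature values are hypothetical.

```python
def _check_baseline(baseline_features):
    # Division-based methods are undefined for zero baseline features.
    if any(b == 0 for b in baseline_features):
        raise ValueError("baseline features must be non-zero")


def difference(experiment_features, baseline_features):
    """Assumed Difference form: experiment minus baseline."""
    return [e - b for e, b in zip(experiment_features, baseline_features)]


def relative_difference(experiment_features, baseline_features):
    """Assumed Relative Difference form: experiment divided by baseline."""
    _check_baseline(baseline_features)
    return [e / b for e, b in zip(experiment_features, baseline_features)]


def fractional_difference(experiment_features, baseline_features):
    """Assumed Fractional Difference form: (experiment - baseline) / baseline."""
    _check_baseline(baseline_features)
    return [(e - b) / b for e, b in zip(experiment_features, baseline_features)]


if __name__ == "__main__":
    # Hypothetical feature values for two channels, for illustration only.
    experiment = [4.0, 6.0]
    baseline = [2.0, 3.0]
    print(difference(experiment, baseline))
    print(relative_difference(experiment, baseline))
    print(fractional_difference(experiment, baseline))
```

Under these assumed forms, the Fractional Difference equals the Relative Difference minus one, which is consistent with the observation that the division step carries the effective part of the reduction.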
The findings of this study are compared with the average accuracy of previous studies, particularly those concerning the recognition of the four emotional categories, as shown in Table 9. Based on this comparison, the proposed method produces higher accuracy for the recognition of the four emotional classes than the methods reported by Liu [39][40][41]. Although Zhao et al. (2020) applied a baseline reduction approach and a CNN method, their result was less accurate than the one obtained in the present study [11]. This research also examined two-class recognition (high and low) for arousal and for valence. The average accuracies were then compared with previous research, particularly studies that applied the baseline reduction approach, as shown in Table 10. Based on Table 10, the proposed baseline reduction method tends to produce higher accuracy than earlier baseline reduction methods, including that of Wirawan et al. (2021) [8,12]. Although this study adopted its feature extraction, representation, and classification methods from previous work, applying the MWMF method increases the accuracy of emotional recognition [4]. Despite this high accuracy, on the DEAP dataset the proposed method yielded slightly lower accuracy for valence than the methods of Cheng et al. (2020) and Liu et al. (2020), who proposed the gcForest and the Capsule Network methods, respectively, for emotional classification. These techniques can monitor the deep learning process and take the spatial information in the EEG signal into account. In contrast, the CNN method is unable to process spatial information from EEG signals and requires extensive training [3,10,42]. Future research needs to design a classification method to address this issue.
Because the three public datasets have an imbalanced class distribution, this study also reports the precision, recall, and F1 values of the proposed method in addition to its accuracy, as shown in Table 11. Despite being applied to imbalanced datasets, the method achieved similarly high accuracy, precision, recall, and F1 values. Therefore, the proposed method for emotional recognition based on EEG signals is robust.
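The reason for reporting precision, recall, and F1 alongside accuracy on imbalanced data can be illustrated with a minimal sketch. The function name `imbalance_report` and the labels in the usage example are hypothetical and are not the values in Table 11. The sketch assumes unweighted (macro) averaging over classes, which may differ from the averaging used in the study.

```python
def imbalance_report(y_true, y_pred):
    """Return accuracy, per-class metrics, and macro-averaged metrics."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    per_class = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Undefined ratios are reported as 0.0 by convention in this sketch.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Macro average: every class counts equally, regardless of its size.
    macro = {
        name: sum(m[name] for m in per_class.values()) / len(labels)
        for name in ("precision", "recall", "f1")
    }
    return accuracy, per_class, macro


if __name__ == "__main__":
    # Hypothetical imbalanced labels: six samples of class 0, two of class 1.
    y_true = [0, 0, 0, 0, 0, 0, 1, 1]
    y_pred = [0, 0, 0, 0, 0, 0, 0, 1]
    accuracy, per_class, macro = imbalance_report(y_true, y_pred)
    print(accuracy, per_class, macro)
```

In the hypothetical example, the overall accuracy looks high while the recall of the minority class is much lower, which is the kind of discrepancy that the additional metrics are meant to expose.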

5-Conclusion
The baseline EEG signals used in the reduction approach were disturbed both internally and externally, which prevented them from representing each participant's neutral condition. A neutral condition tends to be achieved if the amplitude of the baseline EEG signal is lower than that of the experimental signal. This research examined a Modified Weighted Mean Filter method for smoothing baseline EEG signals. The same method can also smooth experimental EEG signals, and in both cases it aims to reduce noise and amplitude. Smoothed baseline EEG signals represent neutral states, and their features are used for baseline reduction, which aims to characterize emotional reactions according to different signal patterns. Among the three baseline reduction methods, Relative Difference is the most appropriate for this process. Based on the validation and statistical tests carried out on the DEAP, DREAMER, and AMIGOS datasets, the combination of the Modified Weighted Mean Filter and Relative Difference methods significantly increases the accuracy of emotional recognition. Therefore, it was concluded that applying the Modified Weighted Mean Filter method can optimize the baseline reduction process, particularly when the Relative Difference approach is used.
Although the overall emotional recognition accuracy improved significantly, some issues limited the accuracy on the DEAP and DREAMER datasets. The CNN method cannot represent the spatial relationship between the parts of an object and the whole during classification, which decreased the accuracy of emotional recognition. Therefore, future research should examine classification methods that overcome this problem.