Rapid Seizure Classification Using Feature Extraction and Channel Selection

A seizure is an episode of abnormal electrical activity in the brain; it can be diagnosed by a neurologist and classified from recorded data. Medical data such as EEG signals usually contain many features and attributes that are not important for the classification process, so dimension reduction is an important step for removing irrelevant information. Feature extraction is one technique for dimension reduction; channel selection is another. These techniques speed up classification and improve accuracy. This paper proposes an approach based on extracting EEG features, selecting channels to reduce the computational load, and training a model for classification. Variance is used for channel selection, taking the three channels with the maximum variance. Eleven features are extracted from the selected channels and averaged to form the input to the classifier. Six classifiers are compared to select the most accurate one. The ensemble classifier was the most accurate at detecting seizure cases, achieving 100% sensitivity on the continuous testing set and 97.6% on the random testing set.


Background
Epilepsy is the most common neurological condition affecting humans of all ages. Its unpredictable character makes it a poorly understood neurological disease. An estimated 50 million people in the world have epilepsy, and up to 75% of them live in resource-poor countries with little or no access to medical services or treatment. Epilepsy is characterized by recurrent seizures related to sudden, sporadic neuronal discharges in the brain and can result in other problems, even death [1,2]. Electroencephalography (EEG) is one of the primary modalities routinely used for remote epileptic seizure detection. It provides an inexpensive and noninvasive means of investigating the subtle characteristics of the disease. The seizure is the defining property of epilepsy and is reflected as periods of abnormal activity in the EEG [2,3]. Feature extraction is a critical step in epilepsy detection because an accurate classification model depends on it. It is beneficial to limit the number of inputs to a classifier so that the models require less computation [4]. Many techniques are used for feature extraction, such as Auto-Regressive (AR) modeling [5], Principal Component Analysis (PCA) [6], Empirical Mode Decomposition (EMD) [7], and statistical features [8]. Statistical features have been used for feature extraction in several algorithms [9,10].
Full-channel EEG signals are recorded using electrodes on the scalp, whose number varies from 18 to 23; such a setup is neither wearable nor computationally efficient. Channel selection is used to reduce the amount of calculation, particularly in real-time applications. It aims to decrease the number of channels by discarding those that do not carry discriminative information [11,12].
Several algorithms have used the concept of channel selection. One combines the advantages of feature enhancement and channel selection to improve detector performance [13]. Another evaluates electrode montage reduction by using only nine electrodes instead of all electrodes [14]. A further algorithm selects EEG channels to reduce power consumption in the detection process without affecting accuracy [15]. Variance, difference in variance, entropy, random selection, extra-focal channels, and doctor's choice have also been used as selection criteria and produced results within a valid range [16].
Classification is the step of assigning items to groups or classes based on the similarities between them. This step is essential in the proposed approach to differentiate between the seizure itself (the ictal period) and normal (non-ictal) periods. Classification involves two main steps. In the first step, the dataset is grouped into two classes (seizure and normal) and used to train the model. In the second step, the trained model is used for classification [17].
Several algorithms are used for this task, such as Artificial Neural Network (ANN) [18], Support Vector Machine (SVM) [17,19], Linear Discriminant Analysis (LDA) [20,21], K-nearest neighbors (KNN) [22,23], decision tree [24], ensemble [25], and logistic regression [26]. This paper is arranged as follows: Section II describes the proposed algorithm. The EEG dataset is presented in Section III. Section IV lists the performance evaluation metrics. The results are presented in Section V. Section VI gives the conclusion, and the paper ends with the references.

Proposed Approach
The main aim of this work is to detect epileptic seizures from EEG recordings automatically. The EEG records both ictal and non-ictal states. The proposed approach aims to detect every seizure state; classifying a normal state as a seizure in a limited number of instances is acceptable, since this errs on the side of avoiding complications, whereas the opposite error, classifying a seizure as normal, is not acceptable. The proposed approach relies fundamentally on automatically recognizing epilepsy from a short EEG recording. The proposed method works as follows. The EEG signal is composed of 23 channels generated from electrodes attached to the scalp. Processing all of these channels makes the calculations complex and increases the load on the system. Because of these limitations, the channel selection step is essential. Once the channels are selected, features are extracted from them. The features are then averaged across the selected channels to reduce the number of features.
Finally, the features are used to train the classifier. The trained classifier is then tested on new cases to evaluate its performance.

Variance Channel Selection
The channel selection step is intended to choose the channels most affected by the seizure. A single feature is calculated for all channels, and channels are selected according to this feature. The other features are then calculated only for the selected channels.
A simple criterion for selecting channels for feature extraction and classification is the variance of the EEG signal amplitude, which in its standard form is:
$$\mathrm{Var}_c = \frac{1}{k} \sum_{i=1}^{k} \left( X_c(i) - \mu_c \right)^2$$
where $c$ is the channel, $X_c$ is the training seizure data, $\mu_c$ is the mean of the training seizure data, and $k$ is the number of samples of the training seizure data. Channels are selected based on the highest values of the variance.
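The variance-based selection described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `select_channels_by_variance`, the toy signal values, and the use of the population variance (division by the number of samples) are assumptions made for the example.

```python
from statistics import pvariance


def select_channels_by_variance(signals, n_select=3):
    """Return the indices of the n_select highest-variance channels.

    signals: list of channels, each a list of amplitude samples.
    Also returns the per-channel variances for inspection.
    """
    # Population variance (assumed normalisation) of each channel.
    variances = [pvariance(channel) for channel in signals]
    # Rank channel indices from highest to lowest variance.
    ranked = sorted(range(len(signals)), key=lambda c: variances[c], reverse=True)
    return ranked[:n_select], variances


# Toy 4-channel example (hypothetical values, not CHB-MIT data).
toy_signals = [
    [0.0, 1.0, 0.0, -1.0],   # variance 0.5
    [0.0, 4.0, 0.0, -4.0],   # variance 8.0
    [1.0, 1.0, 1.0, 1.0],    # variance 0.0
    [0.0, 2.0, 0.0, -2.0],   # variance 2.0
]
selected, variances = select_channels_by_variance(toy_signals)
print(selected)  # [1, 3, 0]
```

With 23 recorded channels, the same call would keep the three channels with the largest variance and pass only those on to feature extraction.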

Feature Extraction
In this step, a group of features of the EEG signal is extracted.
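The extraction and averaging steps can be illustrated with a short Python sketch. The eleven features used in this work are not listed in this section, so the four statistical features below (mean, standard deviation, peak-to-peak range, and mean energy) are placeholders chosen only for illustration, and the function names are hypothetical.

```python
from statistics import fmean, pstdev


def extract_features(channel):
    """Illustrative statistical features for one channel.

    These are placeholders, not the eleven features used in the paper.
    """
    return [
        fmean(channel),                     # mean amplitude
        pstdev(channel),                    # standard deviation
        max(channel) - min(channel),        # peak-to-peak range
        fmean([x * x for x in channel]),    # mean signal energy
    ]


def averaged_feature_vector(selected_channels):
    """Average each feature across the selected channels, giving one
    feature vector per EEG sample."""
    per_channel = [extract_features(ch) for ch in selected_channels]
    n = len(per_channel)
    return [sum(values) / n for values in zip(*per_channel)]


# Three selected channels with hypothetical amplitudes.
toy_channels = [
    [1.0, -1.0, 1.0, -1.0],
    [2.0, -2.0, 2.0, -2.0],
    [3.0, -3.0, 3.0, -3.0],
]
vector = averaged_feature_vector(toy_channels)
print(len(vector))  # 4
```

Averaging across the selected channels keeps the classifier input at one value per feature, regardless of how many channels were selected.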

Classification
The averaged extracted features are used in this step to build the classification model. Six different models are created in this work, and the best one is used as the classifier. The classifier models and their characteristics are as follows. Support Vector Machine (SVM): medium speed, medium memory usage, and easy interpretability. K-Nearest Neighbors (KNN): medium speed, medium memory usage, and hard interpretability. Decision Tree: fast, small memory usage, and easy interpretability. Ensemble and Linear Discriminant: fast, small memory usage, and easy interpretability. Logistic Regression: fast, medium memory usage, and easy interpretability.
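As an illustration of how one of these classifier families operates on the averaged feature vectors, the following is a minimal pure-Python sketch of K-Nearest Neighbors. It is not the configuration used in this work: the value k=3, the Euclidean distance, the function name `knn_predict`, and the toy two-feature data are all assumptions.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Classify one feature vector by majority vote among its k nearest
    training vectors, using Euclidean distance."""
    nearest = sorted(
        zip(train_X, train_y), key=lambda pair: dist(pair[0], query)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training set (hypothetical values): 1 = seizure, 0 = normal.
train_X = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3],
           [2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, [2.1, 2.0]))  # 1
print(knn_predict(train_X, train_y, [0.1, 0.1]))  # 0
```

The stored training vectors play the role of the trained model, and each new averaged feature vector is classified by comparison against them.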

Dataset Description
This work is performed using the CHB-MIT EEG dataset, CHB-

Experimental Results
In this work, the model goes through two phases: a training phase and a test phase. In the training phase, the proposed model is trained on a large data set (250 samples). Training is applied to the data after the channel selection step, and the results are estimated for these cases.
In the test phase, the proposed model is evaluated in two cases.
In the first case, the test is carried out on a random dataset (80 samples, each a 10-second period of 2560 points per channel); in the second, it is carried out on a continuous dataset (82 samples). The continuous data are taken from patient number 3. The obtained results are shown in the tables below.
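The sample size quoted above (2560 points per 10-second period) corresponds to 256 points per second. A minimal sketch of cutting a continuous single-channel recording into such samples is given below; the use of non-overlapping windows and the function name `split_into_windows` are assumptions, since the segmentation procedure is not detailed here.

```python
SAMPLING_RATE_HZ = 256   # 2560 points per 10 s, as stated in the text
WINDOW_SECONDS = 10


def split_into_windows(channel, rate=SAMPLING_RATE_HZ, seconds=WINDOW_SECONDS):
    """Split one channel into consecutive non-overlapping windows,
    dropping any incomplete window at the end."""
    size = rate * seconds
    return [channel[i:i + size]
            for i in range(0, len(channel) - size + 1, size)]


# 25 s of dummy samples yields two full 10-second windows.
recording = [0.0] * (25 * SAMPLING_RATE_HZ)
windows = split_into_windows(recording)
print(len(windows), len(windows[0]))  # 2 2560
```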
The proposed algorithm first goes through the channel selection stage. This selection is performed by calculating the variance of all channels, as illustrated in (Table 1). Table 1 shows an example for one sample from patient 1, at hour 21 (from 387 to 397 sec). The algorithm then selects the three channels with the maximum variance and extracts all the other features (11 features) for these selected channels only, as shown in (Table 2). The features extracted from the selected channels are then averaged to minimize the calculations, as in (Table 3) (Table 4) (Figure 2).
As is clear from Figure 2, two extremely abnormal points are noticeable, which affect the accuracy of the model at this stage, as in (Table 5). These points are in patients 4 and 12, hours 5 and 32, at 510 to 520. These points are filtered out of the training data, after which the data are ready for the training phase.
The other testing data set is a continuous dataset from one patient with a connected EEG recording. Its performance results are given in (Table 7) and in graphical form in (Figure 4).
Accuracy is also important, but it is not the best indicator of how well a model detects seizure cases, whether in the training or the testing phase. For example, the KNN model gives a higher accuracy than all other models in the training phase, as shown in (Tables 5 and 6), but its sensitivity is lower than that of the ensemble model. These parameters are also evaluated by MisRate in the same (Ta-
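The difference between accuracy and sensitivity discussed in this section can be made concrete with a small Python sketch. The labels below are hypothetical and do not reproduce any table in this paper; the function name `evaluate` is illustrative, and MisRate is assumed here to be the misclassification rate, that is, one minus accuracy.

```python
def evaluate(y_true, y_pred):
    """Return sensitivity, accuracy and misclassification rate for
    binary labels (1 = seizure, 0 = normal).

    Assumes y_true contains at least one seizure sample.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # detected seizures
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # missed seizures
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "accuracy": accuracy,
        "mis_rate": 1 - accuracy,   # assumed definition of MisRate
    }


# Hypothetical labels: 4 seizure samples and 6 normal samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_a = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]   # misses one seizure
model_b = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]   # two false alarms, no misses

result_a = evaluate(y_true, model_a)
result_b = evaluate(y_true, model_b)
print(result_a)
print(result_b)
```

In this toy case the first model has the higher accuracy but misses a seizure, while the second detects every seizure at the cost of two false alarms, which is the trade-off that makes sensitivity the preferred criterion here.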

Conclusion
Epilepsy patients suffer from many convulsions and complications, so a model that identifies a seizure as quickly as possible is strongly needed to avoid these symptoms. This paper presents an algorithm based on extracted EEG features, channel selection to reduce the computational load, and a trained model for classification. The variance is calculated for all channels and then used to select the three channels with the maximum variance. Eleven features are extracted from the selected channels and averaged to form the input to the classifier in both the training and testing steps. Six classifiers are used in this work: Support Vector Machine, Linear Discriminant Analysis, K-nearest neighbors, decision tree, ensemble, and logistic regression. The classifiers are tested using two sets of data: random data and continuous data. The results showed that the ensemble classifier was the most accurate at detecting seizure cases, achieving 100% sensitivity on the continuous testing data set and the maximum sensitivity, 97.6%, on the random testing set.

Funding
There was no funding organization for this study.

Conflict of Interest
Author Athar Ein-shoka declares that she has no conflict of interest. Author Mohamed Moawad declares that he has no conflict of interest. Author Ahmed Salem declares that he has no conflict of interest. Author Ayman El-sayed declares that he has no conflict of interest.

Ethical Approval
In this study, the cases of epilepsy patients are taken from an offline database on the PhysioNet site.
This article does not contain any studies with human participants or animals performed by any of the authors.