Abstract
Data streams grow rapidly and continuously. Analysis of rapidly changing data streams is difficult because the amount of data increases over time. Individual patient records are a vital resource for health research that benefits society, for example in understanding the association between the human immune system and viruses. Because patient records grow constantly, data reduction techniques are needed to reduce the complexity of the data and the cost of data storage, and to enhance generalization performance. This study uses the concept of data stream mining to predict the effect of antibody features (IgGs) and primary natural killer (NK) cells' cytotoxic activities on RV144 vaccine recipients and to disclose the functional relationship between the immune system and the HIV virus. To apply data stream mining techniques, the data is manipulated to mimic a data stream. We propose a novel instance selection framework that identifies relevant and important instances and yields better results than using the entire data set. The RV144 vaccine data set contains 100 data samples, of which 20 are placebo samples and 80 are vaccine-injected samples. Each data sample has twenty antibody features related to IgG subclass and antigen specificity. To accomplish our goal, the data was randomly divided into four chunks, which were used for sequential random sampling of the data. In addition, a synthetic data set was created and divided into five chunks, similar to the RV144 data set. Each chunk is then added to the database sequentially, one at a time. However, instead of using the entire data set to select samples, we utilised one chunk at a time, and the most relevant and important instances of the incoming samples are selected before a new chunk of data arrives. Therefore, our framework not only reduces the size of the data set but also reduces the cost of storage.
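The chunk-wise procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the selection criterion, so `toy_score` is a hypothetical placeholder for a relevance measure, and the function names, the `keep_per_chunk` parameter, and the randomly generated toy data (100 samples with 20 features, matching only the shape of the RV144 data set) are all assumptions made for illustration.

```python
import random


def split_into_chunks(samples, n_chunks, seed=0):
    """Shuffle the samples and deal them into n_chunks roughly equal chunks."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    # Strided slicing spreads the shuffled samples evenly over the chunks.
    return [shuffled[i::n_chunks] for i in range(n_chunks)]


def select_instances(chunk, score, keep):
    """Keep the `keep` highest-scoring instances of a single chunk."""
    return sorted(chunk, key=score, reverse=True)[:keep]


def stream_select(chunks, score, keep_per_chunk):
    """Process chunks one at a time and store only the selected instances.

    Each chunk is reduced before the next one is read, so the store never
    holds a full chunk's worth of unselected data.
    """
    store = []
    for chunk in chunks:
        store.extend(select_instances(chunk, score, keep_per_chunk))
    return store


def toy_score(sample):
    """Hypothetical relevance score (an assumption, not from the paper):
    distance of the sample's feature mean from 0.5."""
    return abs(sum(sample) / len(sample) - 0.5)


# Toy stand-in data: 100 samples with 20 random features each.
rng = random.Random(42)
data = [[rng.random() for _ in range(20)] for _ in range(100)]

chunks = split_into_chunks(data, n_chunks=4)
selected = stream_select(chunks, toy_score, keep_per_chunk=10)
```

With four chunks of 25 samples and `keep_per_chunk=10`, the store holds 40 instances instead of 100. A real selection criterion would replace `toy_score`.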
Original language | English |
---|---|
Title of host publication | Proceedings of the 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC) |
Publisher | IEEE |
Pages | 003356-003361 |
ISBN (Print) | 9781509018970 |
Publication status | E-pub ahead of print - 9 Feb 2016 |
Event | 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC) - Budapest. Duration: 9 Feb 2016 → … |
Keywords
- reservoirs
- predictive models
- vaccines
- data models
- data mining
- support vector machines
- conferences