A Hybrid Error-Driven Approach to Data Stream Classification
G. Abinaya1, Aditya Subramanian2, Harsh Kumar3, Sanjeev Rao4, Sourav Patra5
1Mrs. G. Abinaya, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Chennai, (Tamil Nadu), India.
2Aditya Subramanian, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Chennai, (Tamil Nadu), India.
3Harsh Kumar, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Chennai, (Tamil Nadu), India.
4Sanjeev Rao, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Chennai, (Tamil Nadu), India.
5Sourav Patra, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Chennai, (Tamil Nadu), India.
Manuscript received on 23 March 2019 | Revised Manuscript received on 30 March 2019 | Manuscript published on 30 March 2019 | PP: 1500-1506 | Volume-7 Issue-6, March 2019 | Retrieval Number: F2647037619/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Challenges in data stream mining include the vast volume of data being mined, the speed at which data arrives, and the presence of concept drift. Traditional data classification assumes prior knowledge of the data set, an assumption that is poorly suited to high-speed data streams. Various methods have therefore been developed specifically for stream data mining, and they handle concept drift during mining with varying degrees of accuracy. Here, a probabilistic queuing model, based on the existing ‘SyncStream’ algorithm, is used to passively detect and account for abrupt concept drifts. In addition, other aspects of the system are tuned for better classification accuracy and throughput.
Keywords: Classification, Data streams, Data mining, Queuing theory, Learning
Scope of the Article: Classification
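The abstract does not spell out the queuing model, so the following is only a minimal illustrative sketch of what passive, error-driven drift handling could look like, not the authors' method and not the SyncStream algorithm. It assumes a bounded queue of labelled prototypes, nearest-neighbour prediction, and a rule that always stores misclassified instances but stores correctly classified ones only with a small probability. The class name `ErrorDrivenQueueClassifier` and the parameters `capacity`, `keep_prob`, and `seed` are hypothetical.

```python
import random
from collections import deque


class ErrorDrivenQueueClassifier:
    """Hypothetical sketch: 1-NN over a bounded FIFO queue of prototypes."""

    def __init__(self, capacity=50, keep_prob=0.1, seed=0):
        # Assumption: a fixed-size queue; the oldest prototype is evicted
        # automatically when a new one is appended to a full queue.
        self.queue = deque(maxlen=capacity)
        # Assumption: chance of storing an instance that was classified correctly.
        self.keep_prob = keep_prob
        self.rng = random.Random(seed)

    def predict(self, x):
        """Return the label of the nearest stored prototype, or None if empty."""
        if not self.queue:
            return None
        # Scan newest first so that exact distance ties favour recent prototypes.
        nearest = min(
            reversed(self.queue),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)),
        )
        return nearest[1]

    def learn(self, x, y):
        """Test-then-train on one instance; return True if it was predicted correctly."""
        correct = self.predict(x) == y
        # Errors always enter the queue; correct instances enter only by chance.
        if not correct or self.rng.random() < self.keep_prob:
            self.queue.append((x, y))
        return correct


# Tiny stream with an abrupt drift: the label near x = 0 flips from "a" to "b".
clf = ErrorDrivenQueueClassifier(capacity=2, keep_prob=0.0)
stream = [((0.0,), "a"), ((0.1,), "a"), ((1.0,), "b"),  # first concept
          ((0.0,), "b"), ((0.1,), "b")]                 # after the drift
results = [clf.learn(x, y) for x, y in stream]
print(results)
```

In this sketch no explicit drift detector is run. The first error after the drift pushes a new prototype into the queue and evicts the stale one, so the next instance from the new concept is classified correctly. A real system would need a far larger queue and a more careful eviction policy.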