Volume 32, Issue 2, September 2017, Pages 345–353
Hind Ra'ad Ibraheem1 and Enas Mohammed Hussein2
1 Computer Science Department, AL-Mustansiriyah University, Iraq
2 Computer Science Department, AL-Mustansiriyah University, Iraq
Original language: English
Copyright © 2017 ISSR Journals. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Data streams have recently emerged in response to the problem of continuously arriving data. Stream data is usually vast in volume, dynamically changing, possibly infinite, and multi-dimensional. Attention to data stream mining is increasing because of its presence in a wide range of real-world applications, such as e-commerce, banking, sensor data, and telecommunication records. Like traditional data mining, data stream mining includes techniques such as classification, clustering, and frequent pattern mining; this paper focuses on classification methods designed to handle data streams. The performance of data stream classification is measured in terms of processing speed, memory usage, and accuracy. A classification algorithm must also meet several requirements in order to work under the assumptions of the streaming setting and be suitable for learning from data streams, so studying the purely theoretical advantages of algorithms is useful and enables new developments. We present a comprehensive survey of state-of-the-art data stream mining algorithms, with a focus on classification because of its ubiquitous usage. The survey identifies mining constraints, proposes a general model for data stream mining, and depicts the relationship between traditional data mining and data stream mining. We also propose a new streaming data classification algorithm based on the Hoeffding tree algorithm, called the Fast Decision Tree Algorithm (FDTA), as an improved method for classifying stream data, and compare the two algorithms on three measures: classification accuracy, memory space, and execution time.
Author Keywords: Data Stream, Data Stream mining, Data Stream classification, Hoeffding tree algorithm, FDTA.
How to Cite this Article
Hind Ra'ad Ibraheem and Enas Mohammed Hussein, “AN IMPROVEMENT CLASSIFICATION ALGORITHM UTILIZING STREAMING DATA,” International Journal of Innovation and Scientific Research, vol. 32, no. 2, pp. 345–353, September 2017.