Abstract:
Time-series data often exhibit regular fluctuations. Statisticians have developed a wide variety of techniques to predict future values of a temporal variable. Most of these approaches forecast numerical values directly; one example is the use of autoregression and moving averages to predict future numerical values.
Our project uses a data mining technique called classification to predict both the occurrence of surges in time series and the expected durations of those surges, rather than the future values that other techniques predict. Such surges can occur in many kinds of time series; examples include energy demand, weather data, and traffic volume. Our chosen technique can be employed to extract meaningful statistics and other useful characteristics of time-series data.
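To illustrate how surge prediction can be framed as a classification problem, the following is a minimal sketch, not the method used in this project. It assumes, purely for illustration, that a surge is a run of consecutive values above a fixed threshold; the function names `find_surges` and `make_examples`, the threshold, the window length, and the sample series are all hypothetical.

```python
def find_surges(series, threshold):
    """Return (start_index, duration) for each run of values above threshold.

    Assumption: a "surge" is a maximal run of consecutive points that
    exceed a fixed threshold. The source does not define surges this way.
    """
    surges = []
    start = None
    for i, value in enumerate(series):
        if value > threshold:
            if start is None:
                start = i  # a new surge begins here
        elif start is not None:
            surges.append((start, i - start))  # the surge ended at i - 1
            start = None
    if start is not None:
        # The series ended while a surge was still in progress.
        surges.append((start, len(series) - start))
    return surges


def make_examples(series, threshold, window):
    """Build (features, occurrence_label, duration_label) training examples.

    features: the `window` values preceding index i
    occurrence_label: 1 if a surge starts at index i, else 0
    duration_label: length of that surge, or 0 if none starts at i
    """
    durations = dict(find_surges(series, threshold))
    examples = []
    for i in range(window, len(series)):
        duration = durations.get(i, 0)
        examples.append((series[i - window:i], 1 if duration else 0, duration))
    return examples


# Hypothetical data, for illustration only.
series = [10, 11, 10, 25, 27, 26, 11, 10, 30, 12]
examples = make_examples(series, threshold=20, window=3)
```

Each example pairs a window of recent values with two labels, so a classifier can be trained on the occurrence label and, separately, on the (discretized) duration label.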
Classifier performance depends greatly on the characteristics of the data to be analyzed, and many classification algorithms are available. For this study, we chose to compare decision trees, support vector machines, and AdaBoost. To evaluate the algorithms on our problem, we compared them using precision and recall. We set the minimum acceptable precision at 60%, with 70% as the preferred target, since a result at that level would be more robust. Our initial experiments yielded a precision of 64%, and our best results reached 77%.
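For reference, precision and recall for a binary "surge / no surge" classifier can be computed as in the sketch below. The label lists are hypothetical and are not the project's data; only the 60% and 70% thresholds come from the text above, and the helper name `precision_recall` is an illustrative choice.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    # Guard against division by zero when there are no predicted
    # or no actual positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Thresholds stated in the abstract.
MIN_PRECISION = 0.60
PREFERRED_PRECISION = 0.70

# Hypothetical labels: 1 = surge, 0 = no surge.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
acceptable = precision >= MIN_PRECISION
preferred = precision >= PREFERRED_PRECISION
```

Precision is the fraction of predicted surges that were real, and recall is the fraction of real surges that were predicted, so the two together show whether a classifier is over- or under-predicting surges.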