To classify time series, we fit a decision tree and then look at its classification performance.
We use the Synthetic Control Chart Time Series dataset. It contains 600 control charts, 100 for each of six pattern classes, synthetically generated by the process described in Alcock and Manolopoulos (1999).
data <- read.table("C:/07 - R Website/dataset/TS/synthetic_control.txt", header = FALSE)
# Data Preparation
pattern100 <- c(rep('Normal', 100),
                rep('Cyclic', 100),
                rep('Increasing trend', 100),
                rep('Decreasing trend', 100),
                rep('Upward shift', 100),
                rep('Downward shift', 100))
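# Create data frame (ctree needs the class label as a factor;
# since R 4.0, data.frame() no longer converts strings automatically)
newdata <- data.frame(data, pattern100 = factor(pattern100))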
# Classification with Decision Tree
library(party)
tree <- ctree(pattern100 ~ ., data = newdata)
# Classification Performance
tab <- table(Predicted = predict(tree, newdata), Actual = newdata$pattern100) # confusion matrix
tab
                  Actual
Predicted          Cyclic Decreasing trend Downward shift Increasing trend
  Cyclic               97                0              3                0
  Decreasing trend      0               99              8                0
  Downward shift        0                1             89                0
  Increasing trend      2                0              0               96
  Normal                1                0              0                0
  Upward shift          0                0              0                4
                  Actual
Predicted          Normal Upward shift
  Cyclic                0            0
  Decreasing trend      0            0
  Downward shift        0            0
  Increasing trend      0            6
  Normal              100            4
  Upward shift          0           90
sum(diag(tab))/sum(tab) # accuracy
[1] 0.9516667
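The confusion matrix can also be summarised per class. The sketch below rebuilds the matrix from the printed output, so it runs without the data file, and computes recall, precision, and the number of missed observations for each pattern. The names `classes`, `recall`, `precision`, `missed`, and `off` are illustrative and not part of the original analysis.

```r
classes <- c("Cyclic", "Decreasing trend", "Downward shift",
             "Increasing trend", "Normal", "Upward shift")

# Confusion matrix copied from the printed output: rows = predicted, columns = actual
tab <- matrix(c(97,  0,  3,  0,   0,  0,
                 0, 99,  8,  0,   0,  0,
                 0,  1, 89,  0,   0,  0,
                 2,  0,  0, 96,   0,  6,
                 1,  0,  0,  0, 100,  4,
                 0,  0,  0,  4,   0, 90),
              nrow = 6, byrow = TRUE,
              dimnames = list(Predicted = classes, Actual = classes))

sum(diag(tab)) / sum(tab)                 # overall accuracy

recall    <- diag(tab) / colSums(tab)     # share of each actual class that was recovered
precision <- diag(tab) / rowSums(tab)     # share of each predicted class that was right
missed    <- colSums(tab) - diag(tab)     # misclassified observations per actual class
round(data.frame(recall, precision, missed), 3)

# Largest single confusion: zero the diagonal and locate the maximum cell
off <- tab
diag(off) <- 0
which(off == max(off), arr.ind = TRUE)
```

Reading the columns (actual classes) rather than the rows (predicted classes) is what tells us how many series of a given pattern were lost.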
The fitted tree (not shown here) has 25 terminal nodes and 49 nodes in total. The confusion matrix above covers six different patterns: the main diagonal holds the correctly classified observations, and the off-diagonal values give the number of misclassified observations. The accuracy is 95.17%. The largest single error is 8 Downward shift series predicted as Decreasing trend. Downward shift is also the most confused class overall, with 11 of its 100 series misclassified, followed by Upward shift with 10. In contrast, all 100 Normal series are classified correctly.
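The accuracy above is measured on the same series that were used to fit the tree, so it is likely optimistic. A minimal sketch of a hold-out check follows, assuming a 70/30 split stratified by class. The helpers `stratified_split` and `holdout_accuracy`, the split fraction, and the seed are assumptions introduced here, not part of the original analysis.

```r
library(party)  # same package used above for ctree()

# Hypothetical helper: sample the same fraction of row indices from every class.
# Note that set.seed() changes the global random number state.
stratified_split <- function(labels, train_frac = 0.7, seed = 1) {
  set.seed(seed)
  idx_by_class <- split(seq_along(labels), labels)
  train_idx <- unlist(lapply(idx_by_class, function(idx) {
    idx[sample.int(length(idx), size = floor(train_frac * length(idx)))]
  }), use.names = FALSE)
  sort(train_idx)
}

# Hypothetical helper: fit ctree on the training rows, score on the held-out rows.
holdout_accuracy <- function(df, response = "pattern100",
                             train_frac = 0.7, seed = 1) {
  df[[response]] <- factor(df[[response]])  # classification needs a factor response
  train_idx <- stratified_split(df[[response]], train_frac, seed)
  train <- df[train_idx, , drop = FALSE]
  test  <- df[-train_idx, , drop = FALSE]

  fit <- ctree(as.formula(paste(response, "~ .")), data = train)
  tab <- table(Predicted = predict(fit, newdata = test),
               Actual = test[[response]])
  list(confusion = tab, accuracy = sum(diag(tab)) / sum(tab))
}
```

With the data frame built earlier, `holdout_accuracy(newdata)` returns the test-set confusion matrix and accuracy. The result depends on the seed, so averaging over several seeds gives a steadier estimate.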