The objective of this research is the monitoring of 'Big Data'. The use of Statistical Process Monitoring (SPM) for large data sets is gaining traction, and new techniques are needed. In practice, SPM is used to monitor a variety of process characteristics, with the goal of quickly detecting systematic deviations in the process.
Every process is subject to variation. This variation may stem from normal causes (inherent in the process) or from special causes (arising from unexpected events). A process is considered to be "in-control" if there is no variation due to special causes. If special causes are influencing the process, the process is referred to as "out-of-control". Special causes have to be detected as soon as possible and, where possible, resolved. The SPM field of study has focused primarily on smaller data sets, since collecting data was, in practice, often expensive and time-consuming. Driven by technological advances, the volume of available data is growing rapidly, and this calls for new methods within the field of SPM.
In practice, processes often depend on multiple related variables. Multivariate control charts have been developed to monitor such processes. However, the use of these charts on high-volume, high-velocity data sets with variables of different data types has received little attention. With the aid of empirical data sets, these methods will be investigated in practice and developed further.
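
To make the idea of a multivariate control chart concrete, the following is a minimal sketch of one well-known chart of this kind, the Hotelling T-squared chart, for two correlated variables. The text above does not say which charts will be studied, so the choice of chart, the synthetic data, the false-alarm rate alpha and all function names are assumptions made purely for illustration. The control limit uses the chi-square approximation, which is only reasonable when the in-control reference sample is large.

```python
import math
import random


def mean_vector(rows):
    """Sample mean of each of the two variables."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)


def covariance(rows, mu):
    """Unbiased sample covariance, returned as (var_x, cov_xy, var_y)."""
    n = len(rows)
    sxx = sum((r[0] - mu[0]) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - mu[1]) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mu[0]) * (r[1] - mu[1]) for r in rows) / (n - 1)
    return (sxx, sxy, syy)


def t_squared(x, mu, cov):
    """Hotelling T-squared statistic (x - mu)' S^-1 (x - mu) for two variables."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Closed-form inverse of a symmetric 2x2 matrix, applied to (dx, dy).
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


# Assumption: synthetic in-control reference data with positive correlation.
random.seed(0)
phase1 = []
for _ in range(500):
    a = random.gauss(0.0, 1.0)
    b = 0.8 * a + 0.6 * random.gauss(0.0, 1.0)
    phase1.append((a, b))

mu = mean_vector(phase1)
cov = covariance(phase1, mu)

# Assumption: false-alarm rate of 0.0027 per observation.
# For two variables the chi-square quantile has the closed form -2 * ln(alpha).
alpha = 0.0027
ucl = -2.0 * math.log(alpha)


def is_out_of_control(x):
    """Signal when the T-squared statistic exceeds the upper control limit."""
    return t_squared(x, mu, cov) > ucl


new_point = (2.0, -2.0)
print("T-squared:", t_squared(new_point, mu, cov))
print("Upper control limit:", ucl)
print("Out of control:", is_out_of_control(new_point))
```

In this synthetic setting each coordinate of the new observation lies roughly two standard deviations from its mean, so separate univariate charts with three-sigma limits would be unlikely to signal. The pair contradicts the positive correlation in the reference data, which is the kind of deviation a multivariate chart is designed to detect. The sketch does not address mixed data types or high-velocity data, which are the open questions raised above.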