Anomaly Detections for Manufacturing Systems Based on Sensor Data—Insights into Two Challenging Real-World Production Settings [original]

sensors

Article

Anomaly Detections for Manufacturing

Systems Based on Sensor Data—Insights into

Two Challenging Real-World Production Settings

Klaus Kammerer 1,*, Burkhard Hoppenstedt 1, Rüdiger Pryss 2, Steffen Stökler 3, Johannes Allgaier 4 and Manfred Reichert 1

1 Institute of Databases and Information System, University of Ulm, 89081 Ulm, Germany;

[email protected] (B.H.); manfr[email protected] (M.R.)

2 Institute of Clinical Epidemiology and Biometry, University of Würzburg, 97080 Würzburg, Germany;

ruediger[email protected]

3 Uhlmann Pac-Systeme GmbH & Co. KG, 88471 Laupheim, Germany; stoekler[email protected]

4 ATR Software GmbH, 89231 Neu-Ulm, Germany; [email protected]

*Correspondence: klaus.kammer[email protected]

Received: 9 October 2019; Accepted: 2 December 2019; Published: 5 December 2019





Abstract:

To build, run, and maintain reliable manufacturing machines, the condition of their

components has to be continuously monitored. When following a fine-grained monitoring of these

machines, challenges emerge pertaining to the (1) feeding procedure of large amounts of sensor

data to downstream processing components and the (2) meaningful analysis of the produced data.

Regarding the latter aspect, manifold purposes are addressed by practitioners and researchers.

Two analyses of real-world datasets that were generated in production settings are discussed in this

paper. More specifically, the analyses had the goals (1) to detect sensor data anomalies for further

analyses of a pharma packaging scenario and (2) to predict unfavorable temperature values of a 3D

printing machine environment. Based on the results of the analyses, it will be shown that a proper

management of machines and their components in industrial manufacturing environments can be

efficiently supported by the detection of anomalies. The latter shall help to better support the technical evangelists of the production companies.

Keywords: anomaly detection; sensor data; machine learning; production machines

1. Introduction

For manufacturing companies, the management of machine failures is becoming increasingly

important. Due to the increasing complexity of the machines, downtimes of any kind can affect the

overall success of a company. Buying a spare system is not an alternative solution as the acquisition

costs normally surpass the benefit of the spare system. In addition, most replacement systems also

require regular maintenance, even if they are not used. That is why companies are looking for new

ways to manage machine failures cost-effectively. Predictive Maintenance is one topical subject, among others, that offers a promising direction for tackling machine failures before they actually occur. However,

the selection of appropriate techniques from the field of Predictive Maintenance is challenging as

numerous aspects have to be considered [

]. In addition to Predictive Maintenance, there are many

other approaches in this context for coping with machine failures such as Condition Monitoring or

Continuous Improvements. Moreover, the trend towards machine learning raises the question of

whether machine failures can be easily predicted. Although technical developments have improved

the possibilities for manufacturing companies to cope with machine breakdowns, their practical

application is still a challenging task for many reasons. On the basis of these considerations, the

Sensors 2019,19, 5370; doi:10.3390/s19245370 www.mdpi.com/journal/sensors


work at hand presents two real-world cases that were carried out in cooperation with manufacturing

companies. For these companies, the detection of system errors is of utmost importance. In this

context, it was shown that the detection of anomalies [

] of a machine is crucial for these companies.

However, the meaningful detection of such anomalies is very complex. Interestingly, so far, many manufacturing companies employ a select group of technical evangelists who are able to detect anomalies only based on their practical experience gained over time. Such experts, in turn, are very expensive. To relieve them from manual decisions, this paper elaborates on how machine

failures of these scenarios can be managed by analyzing sensor data of the production machines.

In particular, the two examples will show that different types of sensor data, as well as detection

techniques, should be considered. The first presented real-world setting is related to pharma packing

machines. The latter machines wrap tablets into individual packaging units (i.e., blisters), and usually

comprise several other components. For example, a product loader component pushes blisters and

leaflets into cartons. This procedure, in turn, is prone to errors. Therefore, the packaging process needs

to be continuously monitored to reduce costly downtimes as well as to comply with federal regulations.

The continuous monitoring procedure, in turn, generates a large amount of sensor data coming from

sensors that are related to the several components of the packaging machine. In addition, the pharma

packing machine can be individualized for each customer, which might lead to many sensor parameter

settings of the same machine type, including all obscured components.

The second presented real-world setting is related to 3D printing machines in the field of

optical products. In contrast to the first example, in the second setting, not only sensor values

from the machine itself are important, but also sensors that measure the environment of the machine.

Note that this fact distinguishes this scenario from the first one. Here, for example, an increase in the

environmental temperature or humidity may have a significant influence on the production process of

the manufacturing machine. Anomalies, such as room temperature that has negative effects on the

machines, should, therefore, be avoided.

Regarding the anomaly detection in general, plenty of algorithms were proposed that address

different use cases [

]. The main objective of these algorithms is to analyze streaming data to develop

models that can be used with an appropriate number of parameters and across applications, i.e., for

machines with different characteristics and purposes. Since, in most practical cases, a very large number of sensor values must be analyzed simultaneously, detailed knowledge of the individual parameters is mostly of secondary importance. Manufacturers need anomaly detection algorithms that require a minimum of parameter settings, are easy to apply, and are suitable for streaming analysis [ ]. Consequently, this type of algorithm should feature mathematical operations with low computational complexity in order to minimize latency in long-running streaming analyses.

Based on the two real-world examples and their different characteristics with respect to predicting anomalies for production machines, this work contributes the following major insights (see also Figure 1):

• For the pharma packaging machine scenario, sensor streaming data was evaluated with the goal to predict anomalies of one particular machine type. More precisely, data from machine-internal sensors (see Figure 1a) is acquired and transmitted to a sensor data processing service. The latter then evaluates the sensor data with a distance profiling method (see Figure 1b). For this use case, it is shown that an efficient anomaly detection method was found that is also able to cope with the huge amount of sensor data of the analyzed packaging machine. Building on this, the pharma packaging company can address other machine types as well.

• For the 3D machine scenario, environmental sensor data was evaluated with the goal to predict anomalies that may affect the production machines. More precisely, temperature sensors (see Figure 1d) send data to a message broker via the Message Queuing Telemetry Transport protocol (MQTT) (see Figure 1c). A machine learning service (see Figure 1e) then predicts temperature values that may have negative effects. If a temperature anomaly is detected, the sensor data processing and the machine learning services notify a machine operator via a service message using the message broker. For this use case, several investigated machine learning approaches are presented that aim to efficiently detect anomalies of temperature values. Furthermore, considerations on different sizes of training and test data are discussed.

• For production companies, the detection of anomalies is becoming increasingly important. As more research is required to get better insights into real-world scenarios and datasets, this work contributes the results of two complex use cases. The selection of appropriate algorithms and their parameterization is still challenging, which can also be seen from the fact that little standard software is offered in this context.

[Figure 1: diagram of the Production Environment with the components Internal Sensors, Sensor Data Processing Pipeline, Environmental Sensors, Broker, Machine Learning Service, Machine, and Machine Operator, grouped into the Machine Data Use Case and the Environmental Data Use Case. Edge labels: transmit machine sensor data, transmit environmental sensor data, publish machine events, publish alerts, notify, start/stop.]

Figure 1. Schematic overview of the presented use cases.

The remainder of this paper is organized as follows. First, related works are discussed in Section 2.

In Section 3, the method of distance profiling is applied to a real-world dataset of a pharma packaging

machine, while Section 4 shows methods to predict anomalies of environmental data of a 3D machine’s

manufacturing room. Section 5 discusses the results and deals with the revealed limitations. Section 6

concludes the work with a summary and outlook.

2. Related Work

Regarding the acquisition of sensor data in manufacturing systems, which is an important

prerequisite of this work, different related works exist. The authors of [

] discuss requirements for

data acquisition of production systems and introduce an architecture based on the Open Platform

Communications Unified Architecture (OPC UA) for data transmission and the precision time protocol

(PTP) for time synchronization [

]. The authors of [

], in turn, provided an overview of methods,

technologies, and exchange protocols to enable dynamic data acquisition of sensor data in industrial

systems. Challenges regarding the representation and transformation of sensor data in cyber-physical

systems are presented in [

]. In the field of semiconductor manufacturing, various case studies exist

to improve manufacturing processes by analysing manufacturing sensor data [

]. Thus, Advanced

Process Control (APC) methods utilize control strategies and analyses to identify machine faults and

their causes [10]. The results are then used to, for example, optimize maintenance schedules [11,12].

In general, different anomaly detection techniques exist, which can be classified into statistical methods, such as Statistical Profiling [ ], Parametric Statistical Modeling [ ], and Machine Learning Approaches [15].

Concerning use cases of anomaly detection in industrial systems, the authors of [ ] introduce

a model-based approach for the prediction of energy consumption in production plants in order to

detect anomalies using the ANODA algorithm [

]. [

] detected anomalies by applying a Bayesian

network and scoring the resulting features according to a scoring model. The authors of [

], in turn,

developed an assistance system for data acquisition, process monitoring, and anomaly detection in

industrial and agricultural processes and evaluated three use cases. Anomaly detection was developed


individually for each use case and is based, for example, also on the distance profiling approach, with

a local outlier factor and PCA-based anomaly detection [19,20].

Regarding stream-based anomaly detection in general, different approaches exist. [ ] conducted

anomaly detection based on an auto-regressive, data-driven model of the considered data stream.

They analyzed data streams of real-world wind speed measurements. The authors of [

], in turn,

presented an approach based on half-space trees, which is a one-class anomaly detection algorithm.

Its advantages with respect to the computational complexity are constant time and memory footprints.

Finally, [

] introduced an anomaly detection method based on the Hierarchical Temporal Memory

(HTM), which is a stream-based sequence memory algorithm. Furthermore, they introduced datasets

containing real-world data streams with labeled anomalies to enable benchmarks for stream-based

anomaly detection.

For time series prediction, plenty of related works exist. In [ ], an algorithm called Ultra Fast

Forest Tree (UFFT) was investigated that examines the behavior of an ensemble of regression trees

on streaming data. The work introduces a hybrid adaptive system for the induction of random

forests from streaming data. The UFFT system is an incremental algorithm that has a constant time complexity for processing each instance, works online, and uses the Hoeffding bound to decide when a split test is installed on a leaf, leading to a decision node [25]. The algorithm uses a continuous data stream, and short-term memory is used during the training phase. This method, in turn, is restricted to binary classification. However, it can be extended to a multi-class classifier by increasing the number of classifiers and building a random forest of binary trees. Although UFFT is not better in prediction, it is significantly faster than the C4.5 algorithm [ ]. However, UFFT is not able to detect small or abrupt concept drifts, unlike the approach of [ ], whose authors investigated drift detection. They present an algorithm that creates Regression Trees (RTs), instead of Random Forest Trees, based on data streams in the presence of concept drifts. RTs are faster in learning, but are prone to outliers, as they do not work as an ensemble. The Fast Incremental Regression Tree-Drift Detection (FIRT-DD) algorithm allows for model adaptation at any time and is able to deal with local concept drifts by adapting locally. By doing so, global model adaptation is avoided to gain efficiency. The change detection algorithm is based on change detection units (CDUs) that monitor the growing process. CDUs require little memory per node and

a small, constant amount of time complexity for each sample.
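To make the split criterion mentioned above more tangible, the following minimal sketch computes the Hoeffding bound in its general textbook form and uses it to decide whether a leaf has seen enough samples to be split. It is not taken from the UFFT or FIRT-DD implementations; the function names, the gain values, and the parameters `value_range`, `delta`, and `n` are illustrative assumptions.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the observed mean
    of n samples deviates from the true mean by at most epsilon."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_best_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split only if the best candidate beats the runner-up by more than epsilon."""
    epsilon = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_best_gain) > epsilon
```

The bound shrinks as more samples arrive, so a leaf that cannot be split confidently after a few samples may be split later without revisiting old data, which is what makes the per-sample cost constant.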

Ref. [ ] defined concept drifts as a change of the underlying joint probability, i.e., a change of P(Y|X), and distinguished between three main approaches to concept drifts: (1) methods that explicitly detect concept drifts, (2) methods that use ensembles of decision models, and (3) methods

that are based on data management using a sliding window (similar to our presented approach).

Fast Incremental Model Trees with Drift Detection (FIMT-DD), the successor of FIRT-DD, is based on randomized model trees and combines them to ensembles, which leads to a random forest regression [ ]. Each leaf and node of a tree is built on randomly and independently chosen attributes. The authors additionally created an Online Regression Forest (ORF), i.e., a forest of ten FIMT-DD trees with a tree depth of five. However, none of the presented algorithms strikingly outperforms the others when comparing a single online tree, including optional splits, with a random forest consisting of ten trees [ ]. While performance, in general, depends on the dataset, empirical analyses showed that a single online option tree with averaging fits best for most of the compared datasets.

Altogether, the discussed works have not yet presented, as done in this paper, the combination of (1) how anomaly detection methods have been integrated into technical settings of real-world production machines and (2) application examples of how anomaly detection of sensor data can be performed in a data-driven manner, without major parameter optimizations.

3. Pharma Packing Use Case—Machine Sensor Data

The following use case presents the application of distance profiling to a real-life dataset of a pharma

packaging machine with the goal to detect anomalies of a machine component. In the first step, relevant

sensor data was collected. Then, the obtained sensor data was processed by a data processing pipeline. Practical results show that distance profiling can be a valuable method to detect anomalies of the pharma packaging machine components.

Use Case Description: Uhlmann Pac-Systeme GmbH & Co. KG is a mechanical engineering company headquartered in Laupheim, Germany. Uhlmann is a supplier of pharmaceutical packaging machines. The blister machines of Uhlmann put tablets into individual packaging units. For example, the offered machines form blisters with individual cavities for tablets from a plastic or aluminum foil strand (see Figure 2). Tablets, in turn, are fed and sorted into the blister cavities. The latter are closed with a cover sheet, while finished blisters are finally punched out by the Uhlmann machines.

Figure 2. Uhlmann blister machine B1440i.

As all Uhlmann packaging machines are used in the pharmaceutical industry, country-specific

laws and regulations must be considered and fulfilled. These legal requirements are related to the

validation of machines, including the provision of detailed documentation of all process steps that are

performed during drug packaging. Notably, every packaging machine delivers sensor data, which can

be continuously monitored in order to detect anomalies. As an important preliminary technical step

for the detection of anomalies, the acquisition of sensor data and the generation of actuator signals

must be provided by programmable logic controllers (PLC) [ ]. Using the example of a product loader, this preliminary step is briefly outlined, as it characterizes the resulting sensor data.

The product loader station of an Uhlmann blister machine loads blisters and leaflets into a carton.

The product loader consists of several sensors and servomotors. Specifically, two functional assemblies

with servomotors are running in parallel, while the sensors of each single servomotor generate four signals that represent

1. the mechanical position (MP) of the assembly (ProductLoader.Position),

2. the difference between a reference value and the actual value of the MP (ProductLoader.Dref),

3. the power consumption of the assembly (ProductLoader.Current), and

4. the reference signal of the assembly (ProductLoader.Reference).

Overall, eight signals per product loader can be acquired. The physical process of product loading

and product releasing is executed in continuous cycles. Note that the packaging performance of

a machine is therefore expressed in cycles per minute. During these cycles, different anomalies may

occur. A machine component can lock, which leads to a production halt. This can be caused, in turn,

by a faulty feeding of the packaging box. Furthermore, wear and tear of the ball bearings can lead

to increased frictional resistance or a complete failure of the ball bearing lubrication. Both anomalies

are reported by the technical evangelists to be detectable by analyzing the power consumption of the

product loader.

Fundamentals: To detect anomalies in time series data, distance profiling can be applied [ ]. A distance profile, in turn, is a vector of the Euclidean distances between a given time series pattern p and every possible subsequence s_i in the respective time series s (see Figure 3).

[Figure 3: the pattern p is compared with every subsequence s_1, ..., s_{n-m+1} of the time series s, yielding the distance profile D = (d_{p,1}, d_{p,2}, d_{p,3}, ..., d_{p,n-m+1}).]

Figure 3. Distance profile for a signal.
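The definition of a distance profile can be illustrated with a direct, unoptimized computation. The following minimal sketch assumes NumPy and plain (not z-normalized) Euclidean distances; the function name `distance_profile` is an illustrative choice and not part of the SDP framework.

```python
import numpy as np


def distance_profile(pattern, series):
    """Euclidean distance between `pattern` and every subsequence of `series`.

    For a series of length n and a pattern of length m, the result has
    n - m + 1 entries, one per possible subsequence start index.
    """
    p = np.asarray(pattern, dtype=float)
    s = np.asarray(series, dtype=float)
    m, n = len(p), len(s)
    if m == 0 or m > n:
        raise ValueError("pattern must be non-empty and not longer than the series")
    profile = np.empty(n - m + 1)
    for i in range(n - m + 1):
        # Compare the pattern with the subsequence starting at index i.
        profile[i] = np.linalg.norm(s[i:i + m] - p)
    return profile
```

Each of the n - m + 1 subsequences costs m operations, so this direct approach needs on the order of n times m operations in total.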

In this paper, Mueen’s Algorithm for Similarity Search (MASS) was used to calculate the distance profile [ ]. When the distance profile is calculated directly, the resulting time complexity is O(nm), where n is the length of the time series and m the length of the pattern. MASS uses a convolution-based method to calculate the distance profile of a pattern in O(n log n). Furthermore,

a z-normalization is applied to the generated distance profile vector during the calculation as well.

If the Euclidean distance between a pattern p and a subsequence s_i is smaller than the value of a threshold, then p and s_i

are considered to be similar.
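The convolution idea behind MASS can be sketched as follows. This is a simplified illustration in NumPy, not the implementation used in the SDP processing node: all sliding dot products are obtained with one FFT-based convolution and then converted into z-normalized Euclidean distances using rolling means and standard deviations. The function names, the handling of constant windows (treated as uncorrelated), and the tolerance `1e-8` are assumptions made for this sketch.

```python
import numpy as np


def sliding_dot_product(pattern, series):
    """Dot product of `pattern` with every subsequence of `series` via FFT."""
    m, n = len(pattern), len(series)
    fft_len = n + m  # long enough to avoid circular wrap-around
    product = np.fft.rfft(series, fft_len) * np.fft.rfft(pattern[::-1], fft_len)
    return np.fft.irfft(product, fft_len)[m - 1:n]


def mass_distance_profile(pattern, series):
    """z-normalized Euclidean distance profile in O(n log n)."""
    p = np.asarray(pattern, dtype=float)
    s = np.asarray(series, dtype=float)
    m, n = len(p), len(s)
    if m < 2 or m > n:
        raise ValueError("pattern length must be in [2, len(series)]")
    mean_p, std_p = p.mean(), p.std()
    if std_p == 0:
        raise ValueError("a constant pattern cannot be z-normalized")

    # Rolling mean and standard deviation of all subsequences via cumulative sums.
    csum = np.concatenate(([0.0], np.cumsum(s)))
    csum_sq = np.concatenate(([0.0], np.cumsum(s * s)))
    mean_s = (csum[m:] - csum[:-m]) / m
    var_s = (csum_sq[m:] - csum_sq[:-m]) / m - mean_s ** 2
    std_s = np.sqrt(np.maximum(var_s, 0.0))

    dot = sliding_dot_product(p, s)
    with np.errstate(divide="ignore", invalid="ignore"):
        corr = (dot - m * mean_p * mean_s) / (m * std_p * std_s)
    # Assumption: (nearly) constant windows are treated as uncorrelated.
    corr = np.where(std_s > 1e-8, corr, 0.0)
    return np.sqrt(2.0 * m * (1.0 - np.clip(corr, -1.0, 1.0)))


def similar_positions(profile, threshold):
    """Start indices of subsequences considered similar to the pattern."""
    return np.flatnonzero(profile < threshold)
```

Because of the z-normalization, a subsequence that differs from the pattern only by an offset or a scaling factor yields a distance close to zero, which makes the threshold comparison independent of the absolute signal level.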

Sensor Data Acquisition: When processing sensor data of the pharma packing machines, different steps have to be performed [

]. First, sensor

data has to be collected from a PLC and transferred to a collection component. We developed a binary

transmission protocol to transfer sensor data directly from the PLC to a collection component via TCP

sockets. The protocol offers different frame types for time synchronization, signal metadata description,

and sensor data point transmission.

Sensor Data Processing: The transferred raw data stream has to be split into windows and pre-processed if it features disturbing noise. After the pre-processing step, the data is processed through fast Fourier transforms and finally stored if no further processing steps are required. In this step, we adjusted the already existing technical development to enable the integration of anomaly detection approaches after the pre-processing step. To enable this, the utilized processing pipeline must provide a high variability for every sensor signal, which means that the pipeline is generally difficult to manage and configure. To address these issues, we have developed a Sensor Data Processing (SDP) framework in C# on Microsoft .NET for collecting, processing, storing, and visualizing raw sensor data

in a continuous processing chain, according to the data stream processing model [

]. The SDP defines

a graph-based processing model, in which processing nodes are connected to each other to handle all

aforementioned and required pipeline steps in a controlled way (see Figure 4).

Figure 4. Schema of the sensor data processing pipeline.
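The graph-based processing model can be illustrated with a small sketch. The SDP framework itself is written in C#; the following Python code is only a conceptual illustration, and the class `Node`, its methods `connect` and `push`, and the example node names are hypothetical and do not reflect the SDP API.

```python
from typing import Any, Callable, List


class Node:
    """A processing node that transforms an item and forwards the result."""

    def __init__(self, name: str, func: Callable[[Any], Any]):
        self.name = name
        self.func = func
        self.successors: List["Node"] = []

    def connect(self, other: "Node") -> "Node":
        """Add an edge to `other` and return it, so that chains can be built."""
        self.successors.append(other)
        return other

    def push(self, item: Any) -> None:
        """Process one item; a result of None ends the chain at this node."""
        result = self.func(item)
        if result is None:
            return
        for node in self.successors:
            node.push(result)


# Example graph: receiver -> preprocess -> store, and preprocess -> peak -> collect.
stored, peaks = [], []
receiver = Node("receiver", lambda raw: [float(v) for v in raw.split(",")])
preprocess = Node("preprocess", lambda w: [v - sum(w) / len(w) for v in w])
store = Node("store", stored.append)
peak = Node("peak", max)
collect = Node("collect", peaks.append)

receiver.connect(preprocess).connect(store)
preprocess.connect(peak).connect(collect)
receiver.push("1,2,3")
```

Because every node only knows its successors, a further step such as an anomaly detection node could be attached after the pre-processing node without changing the other nodes.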

Sensor Data Analysis: The dataset used in this work was generated by an Uhlmann endurance test arrangement in the Uhlmann technical center in Laupheim, Germany. To be more specific,

an isolated product loader station was running in a continuous operation mode. For our practical

evaluation, sensor data were collected over 10 days by the SDP and stored in a MongoDB database.

In total, 1,127,790 timestamps per signal (i.e., time synchronization frames) and 18,736,022,320 data

points (i.e., floating-point numbers representing the measured sensor data) were recorded during the

mentioned time period, including NULL values for the signal gaps. In terms of disk space, the recorded

values correspond to roughly 22 GB. Further note that the dataset comprises a few signal gaps due to

connectivity losses between the PLC and the storage component. Importantly, the dataset contained

records of bearing damage, which eventually led to the failure of the machine. Apart from that,

the dataset can be considered as being healthy, as the endurance test and an evaluation of the log data generated by the test execution system showed no other abnormalities. Figure 5 shows the mechanical position and power consumption values of five arbitrary product loader cycles ( ) of the considered dataset. During a cycle, the mechanical position values increase monotonically, while a product is loaded into a carton (x-axis). The power consumption (y-axis) shows positive values if the product loader is accelerating, while negative values indicate braking actions. When a new cycle starts, the position is set back to zero.

Figure 5. Power consumption (blue) and reference signal (orange) of an assembly.

Distance Profiling: For this practical use case, the SDP processing pipeline was adjusted by changing the receiver node to receive BMTT data and by using a windowing node (see below), a processing node implementing the distance profiling based on MASS, and an output node to publish MASS results to a broker (see Figures 1c and 4). A pre-processing node is not necessary, as the MASS processing node calculates z-normalization and distance profiles simultaneously. As the SDP pipeline processes data streams, windowing had to be applied [ ]. For this purpose, correlation windows are used. The latter are a specialization of session windows [ ]. In contrast to session windows, correlation windows are triggered by one event at the beginning of a new window. In the SDP, correlation windows may also be generated based on other signals. Here, a window trigger is executed based on the master encoder, represented by a saw-tooth signal (see Figure 5). That is, the regular clocking of the master encoder (about 720°) was used to divide the related signals into uniform windows by detecting its falling slope (dataFrame.position[i+1] - dataFrame.position[i] < −500). The MASS algorithm was then applied to the data of the power consumption signal of the product loader, as it is the most expressive signal of the mechanical process and features a very high resolution, i.e., one data point per 2 milliseconds. In contrast, the mechanical position offers only about 1000 different, monotonically increasing values (see Figure 5). Moreover, a comparison between the individual patterns becomes possible. For this purpose, a new counter value is created for each recognized new pattern of a window. In order to classify a new pattern, the Euclidean distance is used to compare the signals of existing patterns with the signals of new patterns. In practice, the counter value increases rapidly at the beginning of an analysis series as well as in the range of any anomaly, and drops below a certain limit thereafter. Note that patterns caused by detected anomalies are gradually considered “normal” by MASS, and their number is therefore smoothed. Conversely, this means that after a large number of detected patterns, it must not be assumed that the anomaly has disappeared.
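The windowing and pattern counting described above can be sketched in a few lines. The sketch is one possible reading of the procedure and not the SDP implementation: the falling-slope condition (a drop of more than 500 between consecutive position values) follows the text, whereas the function names, the `threshold` parameter, the plain Euclidean distance, and the rule that windows of different lengths count as different patterns are assumptions.

```python
import math
from typing import List, Sequence


def split_into_cycles(position: Sequence[float], signal: Sequence[float],
                      drop: float = -500.0) -> List[List[float]]:
    """Cut `signal` into windows at each falling slope of the saw-tooth `position`."""
    windows, start = [], 0
    for i in range(len(position) - 1):
        if position[i + 1] - position[i] < drop:
            windows.append(list(signal[start:i + 1]))
            start = i + 1
    # The remaining samples form the last (possibly incomplete) window.
    windows.append(list(signal[start:]))
    return windows


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equally long windows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def count_patterns(windows: List[List[float]], threshold: float) -> int:
    """Count windows that are not similar to any previously seen pattern."""
    patterns: List[List[float]] = []
    for window in windows:
        known = any(len(p) == len(window) and euclidean(p, window) < threshold
                    for p in patterns)
        if not known:
            patterns.append(window)
    return len(patterns)
```

With a healthy machine, most cycles match an already known pattern and the count stays low; cycles that deviate, for example because of increased friction, are registered as new patterns and raise the count.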

Results: On the given 10-day dataset, the application of MASS with the developed SDP processing node took a total run time of 60 min. As a result, MASS detected fewer than 5 patterns per data point analysis after 22 h in the dataset, i.e., the baseline was reached. Interestingly, about 14 h prior to the bearing damage, the number of detected patterns increased rapidly (see Figure 6). At 10:00 a.m., the number of detected patterns returned to 1 pattern per sample. At 01:30 p.m., the number of detected patterns increased again to 7, whereas at 01:48 p.m., the machine finally stopped working due to bearing damage. Figure 7a shows window-layered plots of the power consumption in the normal condition (few detected patterns), whereas Figure 7b shows the bad condition (many detected patterns). The x-axis represents the reference signal of the product loader in degrees, while the y-axis represents the actual power consumption in milliamperes (mA) of the two assemblies. In Figure 7b, the power consumption shows a higher variance. Interestingly, the idle phase shows the highest variance for power consumption. One reason for this may be the control loop used, as during slow movements, the actual position approaches the reference position with smaller speed changes. A higher variance also means a lower motor precision of the movements than in the normal state.

Figure 6. Number of detected patterns (y-axis) and time (x-axis) of a product loader for the last 14 h collected.

Figure 7. Phase folded plot of (a) a normal situation with fewer detected patterns and (b) a detected anomaly with many patterns.

Summary: The dataset of the product loader component of a pharma packaging machine was analyzed to detect anomalies in the collected dataset. The selection of a proper anomaly detection algorithm is challenging, as many variants of a machine exist and the selection of parameters for each machine requires considerable effort. Therefore, the development of a parameterless algorithm that fits into the setting of the pharma packaging machine should be a major goal. In this paper, a distance profile anomaly detection algorithm was presented and applied. Specifically, the distance profile was calculated with MASS based on the z-normalized Euclidean distance. The latter offers various advantages, e.g., it requires few parameters. For the sensor data acquisition, the developed sensor data processing framework (SDP) was used. The latter is based on a processing pipeline model that is configurable and extensible, i.e., other anomaly detection algorithms can be flexibly integrated. Although it is a rather simple method, MASS showed promising results on the presented real-world dataset. It was possible to detect anomalies in a meaningful manner, i.e., based on the number of detected patterns, 13 h as well as 18 min before the damage. Practically, this means that a technical evangelist of the pharma packaging company can better analyze the data to decide whether bearing damage can be avoided by stopping the machine or replacing components before damage actually occurs. However, it has also been shown that the integration of the anomaly detection method requires considerable technical effort.

4. 3D Printing Machine—Temperature Environment Data

The following use case presents the application of prediction models to a real-life dataset of

a measuring room of 3D printing machines with the goal to predict anomalies of temperature values.

In the first step, relevant sensor data is collected. Then, the obtained sensor data is processed and promising features are selected. Finally, machine learning models predict upcoming values and warn technical machine operators about possible anomalies.

Use Case Description: An industrial company in the field of 3D production machines operates a measuring room equipped with nine temperature sensors and additional sensors for air humidity (%), air pressure (mBar), and airflow (m/s). The room contains an arbitrary number of machines, and their operators enter and leave the room at arbitrary points in time. Machines within the room are allowed to operate within a temperature threshold, which is defined individually for each machine. A temperature control unit tries to keep the temperature in the room between the upper and lower threshold. The challenge of this project is to detect temperature anomalies for the measuring room solely based on the provided environmental data, which does not contain any contextual information about the operated machines, such as the number of operators or the actual number of machines in the room. Furthermore, no standards for acceptable errors were set for the project. The models presented below return a prediction error, on the basis of which the technical evangelists can decide, after a notification, whether the error rate is acceptable or not.

Sensor Data Acquisition:

The measuring room is sending the current sensor information with

a transmission rate of one value per 3 min. Figure 8illustrates the used architecture. Hereby, all values

are combined using the JavaScript Object Notation (JSON). Next, all sent messages are transferred via

the Message Queuing Telemetry Transport protocol (MQTT), which is a publish/subscribe based message

protocol for machine-to-machine (M2M) communication. The machine learning code is implemented

as a separate service using the scikit-learn python library [

]. The scikit service subscribes to the

environmental values and uses them to train a machine learning model. The model is constantly

evaluated and, whenever the predicted temperature value exceeds a defined threshold, the responsible machine operators (i.e., the technical evangelists) are alerted. In contrast to the first use case, the integration of the anomaly detection method is technically easier here. However, the selection of proper algorithms is more challenging than in the first use case.
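As a rough illustration of the message flow described above, the following sketch decodes one JSON payload and checks a predicted temperature against a lower and an upper threshold. The field names, the threshold values, and the helper names `parse_payload` and `should_alert` are assumptions made for illustration only; the message schema is not specified here, and the MQTT client wiring is deliberately omitted.

```python
import json

# Hypothetical thresholds in degrees; in the use case they are defined per machine.
LOWER, UPPER = 20.5, 21.8


def parse_payload(payload: bytes) -> dict:
    """Decode one JSON-encoded sensor message into a dict of floats."""
    raw = json.loads(payload.decode("utf-8"))
    return {key: float(value) for key, value in raw.items()}


def should_alert(predicted_temp: float, lower: float = LOWER, upper: float = UPPER) -> bool:
    """Return True if the predicted temperature leaves the allowed band."""
    return not (lower <= predicted_temp <= upper)


# Example message with assumed field names (one reading every 3 min).
message = b'{"air_humidity": 26.5, "air_pressure": 961.2, "airflow": 0.08, "temp_avg": 21.2}'
reading = parse_payload(message)
```

In a deployed service, a function like `parse_payload` would typically be called from the subscriber callback of the MQTT client, and `should_alert` would be applied to the model output rather than to the raw reading.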


[Figure 8 diagram. Components: MACHINE, BROKER, ML-SERVICE, OPERATOR. Labels: publish(), subscribe(), start/stop, alert().]

Figure 8. Schematic Architecture of the Environmental Use Case.

Sensor Data Processing:

The considered dataset included 115 variables, of which 52 contain only NULL values. The remaining 63 variables also include NULL values, up to 64%. After all NULL values were dropped, 17,454 rows and 15 columns remained. As some attributes are directly calculated from others (i.e., change rates), they are linearly dependent on those attributes and add no explanatory power for the target, so we decided to discard them (see Table 1). Consequently, 13 columns were analyzed. Furthermore, timestamps are used as an index. We created scatter plots of the remaining main variables air_humidity, air_pressure, airflow, and temp_avg to enable visual checks for cross-correlations. For example, the scatter plots in Figure 9 are quadratic, meaning that there seems to be no correlation between the plotted variables. If there is no correlation between the variables, all of them can be included in the model. Otherwise, the information might be redundant.

If we take a closer look at the nine temperature sensors, we can see that they only vary between 21.99 (+), as a global maximum, and 20.29 (−) degrees, as a global minimum. The standard deviation (std) varies between 0.20 for temp2 and 0.31 for temp9. The overall mean is 21.17 degrees, which is the mean of temp_avg (see Table 1). We set the std in ratio to the mean to calculate the coefficient of variation. The lowest value, 0.009, is obtained for temp2 and air_pressure. The highest coefficient of variation is 0.359 for airflow. If we calculate the correlation between temp_avg and the remaining variables, we observe a weak correlation between air_humidity and temp_avg (0.22).
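The coefficient of variation mentioned above is simply the standard deviation divided by the mean. The following minimal sketch recomputes it from the rounded means and standard deviations reported in Table 1; because the inputs are rounded, the results can differ slightly from the values quoted in the text (for example, for airflow). The helper name `coefficient_of_variation` is chosen for illustration.

```python
# Mean and standard deviation as reported in Table 1 (rounded values).
stats = {
    "air_humidity": (26.52, 5.94),
    "air_pressure": (961.21, 9.11),
    "airflow": (0.08, 0.03),
    "temp2": (21.26, 0.20),
    "temp9": (21.10, 0.31),
}


def coefficient_of_variation(mean: float, std: float) -> float:
    """Relative dispersion: standard deviation expressed as a fraction of the mean."""
    return std / mean


cv = {name: coefficient_of_variation(m, s) for name, (m, s) in stats.items()}

# Variable with the largest relative dispersion among those listed.
most_variable = max(cv, key=cv.get)
```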

Table 1. Data as used for the applied machine learning approach, containing 17,454 values. The timestamp is used as an index. Min (−) and Max (+) values among the single temp variables are marked for each column.

Variable       Mean       Std       Min        Q0.25      Q0.50      Q0.75      Max
air_humidity   26.52      5.94      13.8       22.4       25.8       30.1       44.8
air_pressure   961.21     9.11      941        954        962        969        978
airflow        0.08       0.03      0.01       0.06       0.07       0.09       0.25
temp_avg       21.17      0.23      20.57      21.01      21.2       21.34      21.69
temp1          21.23      0.23      20.63(+)   21.07      21.26      21.4       21.99(+)
temp2          21.26      0.20(−)   20.61      21.13      21.28(+)   21.4       21.81
temp3          21.11      0.29      20.35      20.92      21.15      21.32      21.89
temp4          20.94(−)   0.26      20.29(−)   20.77(−)   20.98(−)   21.14(−)   21.55(−)
temp5          21.21      0.23      20.4       21.05      21.24      21.38      21.73
temp6          21.18      0.25      20.47      21.01      21.2       21.37      21.82
temp7          21.33(+)   0.22      20.61      21.19(+)   21.36      21.49(+)   21.88
temp8          21.05      0.24      20.45      20.89      21.08      21.23      21.6
temp9          21.10      0.31(+)   20.31      20.88      21.14      21.34      21.85


Figure 9. Scatter plots of selected features.

Sensor Data Analysis:

Sensors provide data at an interval of three minutes. As explanatory variables, we selected humidity, pressure, airflow, and past temperature values. In the following, a data-driven approach is pursued, since no statement can be made about a global cyclicity of the observed data. To train the model for the prediction, we need to shift the target temp by 10 rows to predict 30 min into the future. If we want to predict 60 min into the future, we need to shift the target temp by 20 rows. For longer or shorter prediction horizons, the shift works accordingly.

We trained three different regressor types, i.e., a regression tree, a random forest, and a multi-layer perceptron (MLP) as a neural network regressor. The advantages of multi-layer perceptrons are the capability to learn non-linear models and to learn models in real time using partial_fit. The advantages of the regression trees and random forests are their efficient training time and their replicability, which eventually enables meaningful predictions by analyzing the resulting tree structure. When using the MLP regressor, too many hidden layers can lead to overfitting. To avoid this, the MLP regressor was gradually supplied with event-driven training data and was designed with 1 hidden layer containing 100 neurons. For the solver function, we used lbfgs, which is an optimizer in the family of quasi-Newton methods. These models were selected as they are common standard models for which well-documented program libraries, i.e., the Python scikit-learn package, exist and whose standard parameterization has been developed through years of community expertise. Recurrent neural networks, for example, are more difficult to train and parameterize and were not considered for these reasons.

As we basically address a time series problem, we did not let the model shuffle the training data and therefore set shuffle to FALSE. For better prediction results with the MLP regressor, it was necessary to scale the data using the StandardScaler implemented in scikit-learn. For parameterization, the default values developed by the community were used as a starting point. From there, the maximum depth parameter for the Decision Tree Regressor was set to 10. Increasing this value leads to overfitting, whereas decreasing it leads to imprecise results, as the tree structure is not distinct enough. The structure of the Random Forest Regressor varied in the number of trees (between 50 and 1000) and depths (between 5 and 10). As the splitting criterion, we used the mean squared error. All parameter settings used can be found in Appendix A.
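The relation between prediction horizon and row shift described in this paragraph can be sketched as follows. This is a minimal illustration under the stated 3 min sampling interval, not the original implementation; the function names `rows_for_horizon` and `make_supervised` are hypothetical, and plain lists stand in for the real feature table.

```python
SAMPLE_INTERVAL_MIN = 3  # one reading every 3 min, as stated in the text


def rows_for_horizon(horizon_min: int) -> int:
    """Number of rows the target must be shifted for a given horizon in minutes."""
    return horizon_min // SAMPLE_INTERVAL_MIN


def make_supervised(features: list, target: list, horizon_min: int):
    """Pair each feature row with the target value `horizon_min` minutes later."""
    shift = rows_for_horizon(horizon_min)
    if shift <= 0 or shift >= len(target):
        raise ValueError("horizon must be positive and shorter than the series")
    X = features[:-shift]  # rows for which a future target exists
    y = target[shift:]     # target shifted towards the past by `shift` rows
    return X, y


# Toy data: 15 rows, i.e., 45 min of readings.
features = list(range(15))
target = list(range(100, 115))
X, y = make_supervised(features, target, horizon_min=30)
```

After the shift, the last rows of the feature table have no known future target yet and are therefore dropped from training; they are the rows for which a prediction is actually requested.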

Training:

After preparing the data, we split the dataset into a training set and a test set. We used two approaches, for which we compared the error rates of the three models. In the first approach, we incrementally trained the model with all data available so far; the training set grew by the same number of rows in each step, whereas the size of the test set stayed fixed at 10 rows, which corresponds to a prediction of 30 min. In the second approach, we kept the size of the training set fixed to the past 120 min, which corresponds to 40 rows of data, and let the model predict the next 30 min. Figure 10 shows the error rates for the three models based on data from the past 120 min (“Train last 40 rows”) and on all available data (“Train all known rows”). Here, “index” refers to the current row of the training set.
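The two training approaches can be expressed as index generators: one with a growing training window and one with a fixed window of 40 rows, each followed by a test window of 10 rows. The sketch below is an assumption-laden illustration; the names `expanding_splits` and `sliding_splits`, the `step` size, and the initial training size are hypothetical and not taken from the original evaluation.

```python
def expanding_splits(n_rows: int, step: int, test_size: int = 10, first_train: int = 40):
    """Yield (train, test) index ranges where the training set grows by `step` rows."""
    end = first_train
    while end + test_size <= n_rows:
        yield range(0, end), range(end, end + test_size)
        end += step


def sliding_splits(n_rows: int, step: int, train_size: int = 40, test_size: int = 10):
    """Yield (train, test) index ranges where only the last `train_size` rows are used."""
    end = train_size
    while end + test_size <= n_rows:
        yield range(end - train_size, end), range(end, end + test_size)
        end += step


expanding = list(expanding_splits(100, step=20))
sliding = list(sliding_splits(100, step=20))
```

Both generators keep the temporal order intact, so no future rows leak into the training window, which matches the decision not to shuffle the data.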

          Train last 40 rows (error rates)    Train all known rows (error rates)
index     Tree      Forest    MLP             Tree      Forest    MLP
90        0.115     0.145     0.135           0.185     0.096     0.108
590       0.13      0.135     0.245           0.085     0.1       0.152
1090      0.11      0.062     0.082           0.05      0.029     0.09
1590      0.04      0.033     0.057           0.039     0.019     0.026
2090      0.02      0.01      0.035           0.014     0.02      0.039
2590      0.11      0.102     0.087           0.123     0.075     0.061
3090      0.055     0.047     0.093           0.036     0.027     0.067
3590      0.055     0.086     0.164           0.054     0.043     0.054
4090      0.03      0.044     0.199           0.044     0.037     0.028
4590      0.05      0.04      0.098           0.14      0.09      0.065
5090      0.05      0.054     0.149           0.04      0.035     0.062
5590      0.055     0.157     0.863           0.055     0.044     0.062
6090      0.055     0.122     0.067           0.078     0.042     0.076
6590      0.095     0.031     0.138           0.044     0.048     0.034
average   0.069     0.076     0.172           0.071     0.050     0.066

Figure 10. Comparison of the three models based on the two training approaches with a 30 min prediction time.

Results:

To compare the error rates over a time period of one day, we visualize the errors for different training sizes and different test sizes, as can be seen in Figure 11. Note that the error rate increases if we increase the prediction time. The amount of training data does not affect the error rates in general, but when the training time is decreased, the variance of the error rate increases. Thus, there is a negative correlation between the variance of the error rate and the training time. Figure 12 shows that the random forest approach adapts to the general pattern of the function and is not heavily affected by a changing behavior between 7 a.m. and 9 a.m. Note that “current” in Figure 12 corresponds to the training set, while “prediction” corresponds to the test set. Based on these insights, the technical evangelist can be provided with a useful and efficient method to monitor the temperature of the 3D printing machines.


Figure 11. Measured temperature and error rate comparison of Random Forest prediction.

Figure 12. Resulting temperature prediction of half an hour.

5. Discussion

All required data processing steps (acquisition, processing, and analysis) are highly related to

the presented use cases. Concerning the acquisition of sensor data, different sensor systems exist.

Usually, they use different data models and provide different connection and communication methods (e.g., synchronous communication or message-oriented communication). However, a common

exchange model between source and processing system should exist. We briefly summarize the

preliminary data processing steps before the anomaly detection methods could be applied. This shows

that the anomaly detection requires challenging technical preliminaries that must be considered as well.

In our first use case, we developed a transmission protocol to stream sensor data between one source (PLC) and one sink (SDP framework), while the second use case transfers sensor data from one source to a message broker via MQTT. Using these approaches, the anomaly detection methods could be efficiently integrated. However, some more valuable insights are briefly mentioned. Although the Open Platform Communications Unified Architecture (OPC UA) is widely used in industrial use cases to transfer data between systems and to call PLC-related functions [ ], it was not sufficient for our first use case due to bigger message sizes. Additionally, OPC UA is not designed to continuously transfer large amounts of sensor data. After the acquisition of sensor data, the latter has to be (1) pre-processed in order to convert it into analysis-compliant data models, (2) reduced in its noise, and (3) normalized for further analyses. Furthermore, for stream-enabled sensor data, windowing had to be applied for the first use case by creating windows depending on the current machine cycle speed: a faster cycle speed leads to a higher window generation frequency. Thus, windows contain the same physical processes of a machine, which enables their comparison with one another. In general, streaming data architectures may follow the kappa or lambda architecture pattern [ ]. The SDP framework follows the kappa architecture, i.e., the collected dataset can be processed by stream replaying. For the second use case, in turn, a service collects MQTT messages and evaluates them continuously with the scikit-learn library.
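Two of the pre-processing steps named above, noise reduction and normalization, can be illustrated with a short sketch. A trailing moving average and z-normalization are used here as common stand-ins; they are assumptions for illustration, as the concrete filters applied in the two use cases are not spelled out in this section.

```python
from statistics import mean, pstdev


def moving_average(values: list, window: int = 3) -> list:
    """Smooth a series with a trailing moving average (simple noise reduction)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def z_normalize(values: list) -> list:
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant series carries no shape information
    return [(v - mu) / sigma for v in values]


smoothed = moving_average([1, 2, 3, 4], window=2)
normalized = z_normalize([1.0, 2.0, 3.0])
```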

5.1. Limitations

For the proposed sensor data analysis using distance profiling, various limitations exist. First of

all, the sensitivity of the threshold for the patterns depends on two limiting factors. First, the warm-up

phase of the packaging machine must be properly considered, as it does not reflect the normal

operation of the machine. For example, lubricants in the considered pharma packaging machine

must heat up, which leads to a reduction of mechanical resistance and, thus, to a reduction of power

consumption, which, in turn, has to be considered for the anomaly detection (contextual anomaly) and

the selected thresholds, respectively. Second, there is no common knowledge about a threshold value

that indicates a useful alarm setting with respect to the number of detected patterns for a particular

machine. Consequently, for the shown results of the distance profiling for the Uhlmann Blister Machine

B1440i, it might be the case that our determined threshold can be optimized. Furthermore, pharma

packaging machines run with different operating speeds during production that should be considered

as well. However, this challenge can be addressed with dynamic time warping [

]. As we analyzed

a dataset with a fixed operation speed, this was not necessary for the scenario shown in the work at

hand. In order to overcome these limitations, the presented concept can be extended by weighted

queries to indicate important sequences within the mechanical insertion process of the product loader.
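To make the remark on varying operating speeds more concrete, the following sketch shows the textbook dynamic programming formulation of dynamic time warping. It is a plain quadratic-time version with an absolute-difference cost, given only as an illustration; it is not the optimized search procedure from the cited literature, and the function name `dtw_distance` is chosen here.

```python
def dtw_distance(a: list, b: list) -> float:
    """Classic O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The same pattern recorded at half speed and at full speed.
slow_cycle = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]
fast_cycle = [0, 1, 2, 1, 0]
same_shape = dtw_distance(slow_cycle, fast_cycle)
```

Because the warping path may repeat samples, the slow and the fast recording of the same pattern align without cost, whereas a point-wise Euclidean comparison would not even be defined for series of different lengths.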

Concerning the second use case, a major limitation refers to the unknown expected time-series

pattern. In contrast to the first use case, the environment of a machine is usually less controlled.

For example, we have not considered cases like the opening of a door by a machine operator. As the

presented approach is mainly data-driven, it is very vulnerable to effects from sources that are not

measured. Furthermore, the number of false alarms, also denoted as false positives, was very high

here. The machine operator wants to be informed if a critical state is nearly reached, but the number of

false alerts should be minimized to avoid irritations. This aspect is therefore subject to further research.

5.2. Summary

Although many technical limitations still accompany the shown results, the insights can provide helpful support for the technical evangelists and machine operators of production companies. Moreover, we have shown that preliminary steps are required to integrate the anomaly detection methods into the existing machine settings. Importantly, the integration approaches were realized in a way that allows for flexible technical changes. To conclude, the detection of anomalies based on sensor data of a manufacturing machine as well as its environment is useful to support the daily work of machine operators and technical evangelists. In addition, this type of support can also be a starting point to train new machine operators more efficiently.

6. Conclusions

In this paper, the collection, processing, and analysis of two real-world data sets from Industry

4.0 scenarios were shown. The first use case focused on sensor data of a pharma packaging machine,

for which the data set of the product loader component was analyzed to detect anomalies. Note that

the selection of a proper detection algorithm is a challenging task as many machine variants exist

and the selection of parameters for each machine requires high effort. Therefore, a parameterless algorithm had to be developed to support machine operators and technical evangelists. Following this, a distance profile anomaly detection algorithm was presented and applied to the real-life dataset of the product loader. Specifically, the distance profile was calculated with MASS, based on the z-Euclidean distance. The latter offers various advantages (e.g., it requires only a few parameters).

Furthermore, we showed the analysis and prediction of temperature values in the second use case.

Here, the evaluation of different machine learning algorithms revealed promising results. Notably, in contrast to the first use case, the data collection differs, as other transmission protocols were used. For example, the frequency of new values was lower than in the first use case, and, therefore, we did not need to use a binary-based transmission protocol, but relied on the standard IoT approach and MQTT.
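For readers less familiar with distance profiles, the following sketch computes one naively: every window of the series is z-normalized and compared with the z-normalized query by Euclidean distance. This is a deliberately simple illustration and not MASS itself, which obtains the same profile much faster via the fast Fourier transform; the function names are chosen here for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def znorm(seq: list) -> list:
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    mu, sigma = mean(seq), pstdev(seq)
    if sigma == 0:
        return [0.0] * len(seq)
    return [(x - mu) / sigma for x in seq]


def distance_profile(query: list, series: list) -> list:
    """Naive z-normalized Euclidean distance of `query` to every window of `series`."""
    m = len(query)
    q = znorm(query)
    profile = []
    for start in range(len(series) - m + 1):
        window = znorm(series[start:start + m])
        profile.append(sqrt(sum((x - y) ** 2 for x, y in zip(q, window))))
    return profile


query = [0, 1, 0]
series = [0, 1, 0, 0, 2, 0, 3, 3, 4]
profile = distance_profile(query, series)
```

Windows whose shape matches the query yield distances near zero regardless of their amplitude, so unusually large values in the profile point to windows that deviate from the expected pattern.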

Altogether, as production machines become more and more complex, the respective companies seek methods to relieve their technical evangelists in the best possible way. The results of the two use cases have shown that this is possible. However, new approaches are still needed to better cope

with the variety of settings, sensors, and production machines [

]. Consequently, our future work

is driven by the provision of data-driven analytical sensor frameworks in the context of Industry 4.0

production scenarios.

Author Contributions:

K.K. and B.H. designed the concept, conceived and wrote the paper. R.P. and M.R.

corrected and supervised this work. S.S. and J.A. did the data curation and provided corrections.

Funding: This research received no external funding.

Conflicts of Interest: The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ANODA ANOmaly Detection Algorithm

FIMT-DD Fast Incremental Model Trees with Drift Detections

FIRT-DD Fast Incremental Regression Tree with Drift Detection

JSON JavaScript Object Notation

HTM Hierarchical Temporal Memory

IIoT Industrial Internet of Things

MASS Mueen’s Algorithm for Similarity Search

MLP Multi-Layer Perceptron

MP Mechanical Position

MQTT Message Queuing Telemetry Transport

M2M Machine-to-Machine

OPC UA Open Platform Communications Unified Architecture

ORF Online Regression Forest

PCA Principal Component Analysis

PLC Programmable Logic Controller

PTP Precision Time Protocol

RT Regression tree

SDP Sensor Data Processing Framework

UFFT Ultra Fast Forest Tree

Appendix A. Parameter Selection for the Evaluation of Environmental Sensor Data

DecisionTreeRegressor(criterion='mse',
                      max_depth=10,
                      max_features=None,
                      max_leaf_nodes=None,
                      min_impurity_decrease=0.0,
                      min_impurity_split=None,
                      min_samples_leaf=1,
                      min_samples_split=2,
                      min_weight_fraction_leaf=0.0,
                      presort=False,
                      random_state=0,
                      splitter='best')

MLPRegressor(activation='relu',
             alpha=0.0001,
             batch_size='auto',
             beta_1=0.9,
             beta_2=0.999,
             early_stopping=False,
             epsilon=1e-08,
             hidden_layer_sizes=(100,),
             learning_rate='constant',
             learning_rate_init=0.001,
             max_iter=200,
             momentum=0.9,
             n_iter_no_change=10,
             nesterovs_momentum=True,
             power_t=0.5,
             random_state=None,
             shuffle=True,
             solver='lbfgs',
             tol=0.0001,
             validation_fraction=0.1,
             verbose=True,
             warm_start=False)

RandomForestRegressor(bootstrap=True,
                      criterion='mse',
                      max_depth=10,
                      max_features='auto',
                      max_leaf_nodes=None,
                      min_impurity_decrease=0.0,
                      min_impurity_split=None,
                      min_samples_leaf=1,
                      min_samples_split=2,
                      min_weight_fraction_leaf=0.0,
                      n_estimators=1000,
                      n_jobs=-1,
                      oob_score=False,
                      random_state=1,
                      verbose=0,
                      warm_start=False)

References

1. Hoppenstedt, B.; Pryss, R.; Stelzer, B.; Meyer-Brötz, F.; Kammerer, K.; Treß, A.; Reichert, M. Techniques and Emerging Trends for State of the Art Equipment Maintenance Systems—A Bibliometric Analysis. Appl. Sci. 2018, 8, 916. [CrossRef]
2. Hawkins, D. Identification of Outliers; Chapman and Hall: London, UK, 1980.
3. Hodge, V.; Austin, J. A survey of Outlier Detection Methodologies. Artif. Intell. Rev. 2004, 22, 85–126. [CrossRef]
4. Knorr, E.; Ng, R. Algorithms for Mining Distance-Based Outliers in Large Datasets. In Proceedings of the 24th International Conference on Very Large Data Bases (VLDB ’98), New York, NY, USA, 24–27 August 1998; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 1998; pp. 392–403.
5. Faltinski, S.; Flatt, H.; Pethig, F.; Kroll, B.; Vodenčarević, A.; Maier, A.; Niggemann, O. Detecting Anomalous Energy Consumptions in Distributed Manufacturing Systems. In Proceedings of the 10th International Conference on Industrial Informatics, Beijing, China, 25–27 July 2012; pp. 358–363.
6. Leitner, S.; Mahnke, W. OPC UA–Service-oriented Architecture for Industrial Applications. ABB Corp. Res. Cent. 2006, 48, 61–66.
7. Schlechtendahl, J.; Keinert, M.; Kretschmer, F.; Lechler, A.; Verl, A. Making Existing Production Systems Industry 4.0-Ready. Prod. Eng. 2015, 9, 143–148. [CrossRef]
8. Jirkovský, V.; Obitko, M.; Mařík, V. Understanding Data Heterogeneity in the Context of Cyber-Physical Systems Integration. IEEE Trans. Ind. Inform. 2016, 13, 660–667. [CrossRef]
9. Moyne, J.; Iskandar, J. Big Data Analytics for Smart Manufacturing: Case Studies in Semiconductor Manufacturing. Processes 2017, 5, 39. [CrossRef]
10. Feng, J.; Jia, X.; Zhu, F.; Moyne, J.; Iskandar, J.; Lee, J. An Online Virtual Metrology Model With Sample Selection for the Tracking of Dynamic Manufacturing Processes With Slow Drift. IEEE Trans. Semicond. Manuf. 2019, 32, 574–582. [CrossRef]
11. Ramirez-Hernandez, J.A.; Crabtree, J.; Yao, X.; Fernandez, E.; Fu, M.C.; Janakiram, M.; Marcus, S.I.; O’Connor, M.; Patel, N. Optimal Preventive Maintenance Scheduling in Semiconductor Manufacturing Systems: Software Tool and Simulation Case Studies. IEEE Trans. Semicond. Manuf. 2010, 23, 477–489. [CrossRef]
12. Ramírez-Hernández, J.A.; Fernandez, E. Optimization of Preventive Maintenance scheduling in semiconductor manufacturing models using a simulation-based Approximate Dynamic Programming approach. In Proceedings of the 49th IEEE Conference on Decision and Control (CDC), Atlanta, GA, USA, 15–17 December 2010; pp. 3944–3949.
13. Hill, D.J.; Minsker, B.S.; Amir, E. Real-time Bayesian Anomaly Detection in Streaming Environmental Data. Water Resour. Res. 2009, 45. [CrossRef]
14. Candès, E.J.; Li, X.; Ma, Y.; Wright, J. Robust Principal Component Analysis? J. ACM 2011, 58, 11. [CrossRef]
15. Nairac, A.; Townsend, N.; Carr, R.; King, S.; Cowley, P.; Tarassenko, L. A System for the Analysis of Jet Engine Vibration Data. Integr. Comput. Aided Eng. 1999, 6, 53–66. [CrossRef]
16. Vodenčarević, A.; Büning, H.K.; Niggemann, O.; Maier, A. Using Behavior Models for Anomaly Detection in Hybrid Systems. In Proceedings of the 2011 XXIII International Symposium on Information, Communication and Automation Technologies, Sarajevo, Bosnia and Herzegovina, 27–29 October 2011; pp. 1–8.
17. Eberhardt, J.S., III; Radano, T.A.; Peterson, B.E. Application of Machine Learned Bayesian Networks to Detection of Anomalies in Complex Systems. U.S. Patent 9,349,103, 24 May 2016.
18. Windmann, S.; Maier, A.; Niggemann, O.; Frey, C.; Bernardi, A.; Gu, Y.; Pfrommer, H.; Steckel, T.; Krüger, M.; Kraus, R. Big Data Analysis of Manufacturing Processes. J. Phys. 2015, 659, 012055. [CrossRef]
19. Phua, C.; Alahakoon, D.; Lee, V.C.S. Minority Report in Fraud Detection: Classification of Skewed Data. SIGKDD Explor. Newsl. 2004, 6, 50–59. [CrossRef]
20. Eickmeyer, J.; Li, P.; Givehchi, O.; Pethig, F.; Niggemann, O. Data Driven Modeling for System-Level Condition Monitoring on Wind Power Plants. In Proceedings of the 26th International Workshop on Principles of Diagnosis, Paris, France, 31 August–3 September 2015.
21. Hill, D.; Minsker, B. Anomaly Detection in Streaming Environmental Sensor Data: A Data-driven Modeling Approach. Environ. Model. Softw. 2010, 25, 1014–1022. [CrossRef]
22. Tan, S.C.; Ting, K.M.; Liu, T.F. Fast Anomaly Detection for Streaming Data. In Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Catalonia, Spain, 16–22 July 2011.
23. Ahmad, S.; Lavin, A.; Purdy, S.; Agha, Z. Unsupervised Real-time Anomaly Detection for Streaming Data. Neurocomputing 2017, 262, 134–147. [CrossRef]
24. Gama, J.; Medas, P.; Rocha, R. Forest Trees for On-line Data. In Proceedings of the ACM Symposium on Applied Computing, Nicosia, Cyprus, 14–17 March 2004; pp. 632–636.
25. Hoeffding, W. Probability Inequalities for Sums of Bounded Random Variables. J. Am. Stat. Assoc. 1963, 58, 13. [CrossRef]
26. Quinlan, J. C4.5: Programs for Machine Learning; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 1993.
27. Ikonomovska, E.; Gama, J.; Sebastião, R.; Gjorgjevik, D. Regression trees from data streams with drift detection. In Proceedings of the International Conference on Discovery Science, Porto, Portugal, 3–5 October 2009; pp. 121–135.
28. Ikonomovska, E.; Gama, J.; Džeroski, S. Online Tree-based Ensembles and Option Trees for Regression on Evolving Data Streams. Neurocomputing 2015, 150, 458–470. [CrossRef]
29. Ikonomovska, E.; Gama, J.; Zenko, B.; Dzeroski, S. Speeding-up Hoeffding-based Regression Trees with Options. In Proceedings of the 28th International Conference on Machine Learning, Washington, DC, USA, 28 June–2 July 2011; pp. 537–544.
30. Bolton, W. Programmable Logic Controllers; Newnes: Oxford, UK, 2015.
31. Yeh, C.C.M.; Zhu, Y.; Ulanova, L.; Begum, N.; Ding, Y.; Dau, H.A.; Silva, D.F.; Mueen, A.; Keogh, E. Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View that Includes Motifs, Discords and Shapelets. In Proceedings of the 16th International Conference on Data Mining (ICDM), Barcelona, Spain, 12–15 December 2016; pp. 1317–1322.
32. Mueen, A.; Viswanathan, K.; Gupta, C.; Keogh, E. The Fastest Similarity Search Algorithm for Time Series Subsequences under Euclidean Distance. 2019. Available online: http://www.cs.unm.edu/~mueen/FastestSimilaritySearch.html (accessed on 15 November 2019).
33. Gama, J.; Gaber, M. Learning from Data Streams: Processing Techniques in Sensor Networks; Springer: Berlin, Germany, 2007.
34. Babcock, B.; Babu, S.; Datar, M.; Motwani, R.; Widom, J. Models and Issues in Data Stream Systems. In Proceedings of the 21st ACM Symposium on Principles of Database Systems (PODS ’02), Madison, WI, USA, 3–5 June 2002; pp. 1–16.
35. Akidau, T.; Bradshaw, R.; Chambers, C.; Chernyak, S.; Fernández-Moctezuma, R.J.; Lax, R.; McVeety, S.; Mills, D.; Perry, F.; Schmidt, E.; et al. The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-scale, Unbounded, Out-of-order Data Processing. Proc. VLDB Endow. 2015, 8, 1792–1803. [CrossRef]
36. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
37. Kreps, J. Questioning the Lambda Architecture. 2014. Available online: https://www.oreilly.com/radar/questioning-the-lambda-architecture/ (accessed on 3 November 2019).
38. Rakthanmanon, T.; Campana, B.; Mueen, A.; Batista, G.; Westover, B.; Zhu, Q.; Zakaria, J.; Keogh, E. Searching and Mining Trillions of Time Series Subsequences under Dynamic Time Warping. In Proceedings of the 18th International Conference on Knowledge Discovery and Data Mining (KDD), Beijing, China, 12–16 August 2012; pp. 262–270.
39. Hoppenstedt, B.; Reichert, M.; Kammerer, K.; Spiliopoulou, M.; Pryss, R. Towards a Hierarchical Approach for Outlier Detection in Industrial Production Settings. In Proceedings of the CEUR Workshops of the EDBT/ICDT 2019 Joint Conference (CEUR-WS.org 2019), Lisbon, Portugal, 26 March 2019.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access

article distributed under the terms and conditions of the Creative Commons Attribution

(CC BY) license (http://creativecommons.org/licenses/by/4.0/).