Efficient Processing of Geospatial mHealth Data Using a Scalable Crowdsensing Platform

sensors

Article

Efficient Processing of Geospatial mHealth Data

Using a Scalable Crowdsensing Platform †

Robin Kraft 1,2,* , Ferdinand Birk 1, Manfred Reichert 1, Aniruddha Deshpande 3,

Winfried Schlee 4, Berthold Langguth 4, Harald Baumeister 2, Thomas Probst 5,

Myra Spiliopoulou 6 and Rüdiger Pryss 7

1Institute of Databases and Information Systems, Ulm University, 89081 Ulm, Germany;

[email protected] (F.B.); manfred.r[email protected] (M.R.)

2Department of Clinical Psychology and Psychotherapy, Ulm University, 89081 Ulm, Germany;

[email protected]

3Department of Speech-Language-Hearing Sciences, Hofstra University, Hempstead, NY 11549, USA;

[email protected]

4Clinic and Policlinic for Psychiatry and Psychotherapy, University of Regensburg,

93053 Regensburg, Germany; [email protected] (W.S.); [email protected] (B.L.)

5Department for Psychotherapy and Biopsychosocial Health, Danube University Krems, 3500 Krems, Austria;

[email protected]

6Department of Technical and Business Information Systems, Otto-von-Guericke-University Magdeburg,

39106 Magdeburg, Germany; [email protected]

7Institute of Clinical Epidemiology and Biometry, University of Würzburg, 97080 Würzburg, Germany;

ruediger[email protected]

*Correspondence: r[email protected]

† This paper is an extended version of the conference paper: Kraft, R.; Birk, F.; Reichert, M.; Deshpande, A.;

Schlee, W.; Langguth, B.; Baumeister, H.; Probst, T.; Spiliopoulou, M.; Pryss, R. Design and Implementation

of a Scalable Crowdsensing Platform for Geospatial Data of Tinnitus Patients. In Proceedings of the 2019

IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS), Cordoba, Spain,

5–7 June 2019; pp. 294–299.

Received: 8 May 2020; Accepted: 16 June 2020; Published: 18 June 2020





Abstract:

Smart sensors and smartphones are becoming increasingly prevalent. Both can be used to

gather environmental data (e.g., noise). Importantly, these devices can be connected to each other as

well as to the Internet to collect large amounts of sensor data, which leads to many new opportunities.

In particular, mobile crowdsensing techniques can be used to capture phenomena of common interest.

Especially valuable insights can be gained if the collected data are additionally related to the time

and place of the measurements. However, many technical solutions still use monolithic backends

that are not capable of processing crowdsensing data in a flexible, efficient, and scalable manner.

In this work, an architectural design was conceived with the goal to manage geospatial data in

challenging crowdsensing healthcare scenarios. It will be shown how the proposed approach can

be used to provide users with an interactive map of environmental noise, allowing tinnitus patients

and other health-conscious people to avoid locations with harmful sound levels. Technically, the

shown approach combines cloud-native applications with Big Data and stream processing concepts.

In general, the presented architectural design shall serve as a foundation to implement practical and

scalable crowdsensing platforms for various healthcare scenarios beyond the addressed use case.

Keywords:

mHealth; crowdsensing; tinnitus; geospatial data; cloud-native; stream processing;

scalability; architectural design

Sensors 2020,20, 3456; doi:10.3390/s20123456 www.mdpi.com/journal/sensors


1. Introduction

Nowadays, smartphones are everyday objects, and their ubiquity has fueled many other trends. For example, as more and more people become health-conscious, smartphones can be used to monitor one’s own health status. Mobile crowdsensing is one technology that is often used in this context. With mobile crowdsensing, a person’s health status can be efficiently monitored. Beyond this, each crowd user also benefits from the measurements of other crowd users, since measured values can be compared among users over time. Especially for users who suffer from a chronic disease or disorder, mobile crowdsensing can be very helpful [–]. For the chronic disorder tinnitus, measuring noise exposure can help each person in a crowd to avoid harmful locations based on the individual measurements and their aggregations. For example, if a defined threshold for the average loudness is exceeded at a certain location, tinnitus patients can be warned to avoid this place and thereby avoid unhealthy noise exposure.
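The threshold check mentioned above can be illustrated with a minimal sketch. The threshold value, the location names, and the function name are assumptions made for illustration only; the paper does not prescribe them. The arithmetic mean of decibel values is used here as a simplification.

```python
from statistics import mean

# Hypothetical warning threshold in dB; the paper does not specify a value.
NOISE_THRESHOLD_DB = 85.0


def locations_to_avoid(measurements, threshold_db=NOISE_THRESHOLD_DB):
    """Return the sorted names of locations whose average loudness exceeds the threshold.

    `measurements` maps a location name to a list of noise readings in dB.
    Locations without any readings are skipped.
    """
    return sorted(
        location
        for location, readings in measurements.items()
        if readings and mean(readings) > threshold_db
    )


# Illustrative crowd data (invented values, not taken from the paper).
crowd_measurements = {
    "concert hall": [95.0, 101.5, 98.0],
    "city park": [52.0, 48.5, 55.0],
    "main station": [84.0, 88.0, 86.5],
}
print(locations_to_avoid(crowd_measurements))
```

In a real deployment, the averages would be computed by the backend rather than on the device, as described in the following paragraphs.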

Importantly, smartphones with their built-in sensors are able to reliably measure the noise level

of a user’s surroundings and send these measurements to a backend, which calculates average values

and responds with the aggregated data. Based on this information, places with an unhealthy noise

exposure can be, for example, visually highlighted on a map on the smartphone of a user. On the one

hand, as potentially millions of noise measurements have to be collected and processed continuously

and concurrently, such a backend in the mHealth context has to ensure a high degree of scalability to

avoid a degradation of the service for its users [

]. In addition, since the workload of such a system

can change frequently (e.g., due to day times or public events), such a backend should also provide

elasticity. On the other hand, efficient computations on geospatial data are non-trivial and choosing

an adequate data representation format can be a complex task. Furthermore, the architecture of such

a system should allow for flexible technical changes. Ensuring flexibility can become important,

for example, if a mobile operating system changes the sensor features used for measurements or other

additional requirements emerge. Therefore, monolithic backends using relational databases are usually

not sufficient if crowdsensing is used to store, process, and deliver stream-based smartphone data.

For a scenario like the measurement of noise levels, which is considered in the work at hand, we argue that new architectural designs are necessary. For this purpose, we discuss a mobile crowdsensing architecture that is based on (1) cloud-native applications as well as (2) Big Data and stream processing concepts. This work discusses the newly conceived architectural design and also presents a prototype implemented on top of the architecture, which enables the aforementioned noise level measurements. The architecture, its design principles, and the prototype show that the approach is feasible, in particular for tinnitus patients. However, the proposed approach can be used for other healthcare scenarios with similar requirements as well.

As health-conscious people, especially those suffering from a chronic disease, desire technical applications like the one shown here, the overarching goal of this work is to utilize the power of the crowd on the one hand, and to provide a scalable and flexible technical solution on the other hand. To summarize the pursued aspects, the contributions of our work are as follows:

• Practically, geospatial data that are gathered by the smartphones of crowd users shall be the basis to provide a noise exposure map for tinnitus patients and the general public. This map shall help patients and other health-conscious people to avoid unhealthy places.

• Technically, a cloud-native architectural design based on microservices and stream processing has been composed, which is able to collect, process, aggregate, store, and deliver measurement data in mobile crowdsensing scenarios in an efficient and scalable manner. In this context, it is shown how the mobile context is captured by the proposed architectural design in such a way that it is able to cope with potentially increasing amounts of workloads without a noticeable degradation of the resulting response times for its users.

• Generally, the technical setting shown in this paper shall help other researchers to create scalable mobile crowdsensing solutions for various healthcare scenarios.

Moreover, recent works show [ ] that approaches combining contemporary mobile technology with paradigms from the healthcare side should address the question of how this combination can be accomplished so that various diseases and their particular aspects can be flexibly addressed. To this end, the architecture presented in this work, together with the promising experimental results, may constitute a solid basis for disorders other than tinnitus. For example, noisy places should also be avoided for the management of stress. In addition, stress is generally a negative factor for many diseases and disorders, including tinnitus. Therefore, the measurements done in this work may be beneficial for other scenarios. As stress is also relevant for companies and the management of employees, the solution can also be utilized in non-eHealth scenarios. Finally, the presented approach can be adapted to other measurements, for example, the aggregated gathering of weather-related factors. The latter can then be used, for example, for the better management of migraine.

In summary, when designing a system in such a context, the following technical challenges must

be addressed:

• Scalability and Elasticity: The system should be able to cope with increasing workloads without significant loss of performance and adapt its resources accordingly.

• Efficient Geospatial Processing: The system should be able to efficiently process geospatial data and store it in an adequate data representation format.

• Flexibility: The system should allow flexible changes in order to cope with changing requirements.

An overview of the resulting system design that addresses these challenges is shown in Figure 1.

A mobile application as well as Internet of Things (IoT) sensors are able to contribute their measurement

data by sending the data to an ingress service in the cloud. The measurements of all users in a certain

area are then processed in the backend by utilizing stream processing techniques. More specifically,

the received measurements are continuously aggregated and prepared for being visualized on the

smartphone of the crowd users, before finally being stored in a database. The mobile application can

then request the aggregated data by querying an access service and displaying it directly on the device.

The shown architecture distinguishes itself from others (like, for example, TrackYourTinnitus [

])

as it allows for the efficient processing of data from various input sources (e.g., IoT sensors and

mobile applications) in the mHealth context by making use of stream processing concepts, which are

embedded in a cloud-native design.

This work is an extension of the conference paper [5]. It substantially extends the

conference paper by the following aspects. First, the used data model for the representation of the

geospatial data as well as the indexing mechanisms for its storage are discussed (see Section 5).

Second, the stream processing as well as other implementation aspects in the Measurement Context

are described in more detail (see Section 6). Third, the implemented prototype of the architecture is

presented in more detail (see Section 7). Fourth, the performance and scalability of the conceptual

architecture were evaluated by conducting load tests on a running instance of the developed prototype

(see Section 8). Finally, related work and background information are discussed more extensively.

The remainder of this paper is structured as follows. In Section 2, related work is discussed,

while Section 3 introduces relevant background information. The proposed architecture is presented in

Section 4, followed by a discussion in Section 5 that shows in what way geospatial data is captured and

processed by the architecture. The measurement context, which is a peculiarity of the approach,

is separately discussed in Section 6. Selected insights into the prototypical implementation are

presented in Section 7. The experiments conducted in the scope of the performance evaluation of the

architecture and their results are presented in Section 8, before concluding the paper with a summary

and an outlook in Section 9.


Figure 1. System design overview of the crowdsensing platform. [Diagram: the Mobile App and IoT Sensors send measurement data to an Ingress Service in the cloud; the backend applies stream processing and persists the results; an Access Service returns visualization data to the Mobile App.]

2. Related Work

Mobile crowdsensing platforms that are used for urban sensing and collaborative noise maps

have already been discussed. In order to enhance the spatial and temporal data resolution of noise

pollution in cities, the authors of [

] implemented a participatory sensing approach, making use of an

Android application and an urban sensing platform. Importantly, the smartphones’ microphones and

GPS-sensors are utilized by the mobile application to perform location-related noise measurements.

Following this, the collected data is transferred to the urban sensing platform. Furthermore, users are

enabled to access this information basis and generate real-time noise maps or data graphs. The authors

of [

], on the other hand, realize a noise monitoring platform and acoustic urban planning in smart

cities by leveraging crowdsensing based on an Android application and Open Source data collection

and processing techniques. Noise reduction interventions are recommended to urban planners in

order to enable them to comply with European laws and regulations, using a web-based visualization

application. However, none of these platforms offers an interactive visualization of the measured

noise data directly on the mobile application, which would presuppose that the respective architecture

is designed for mobile clients. Moreover, these approaches do not set their main focus on efficiency

and scalability.

Crowdsensing platforms and architectures that address scalability and efficiency have also

been considered in the past. The authors of [

] propose the middleware infrastructure MECA

(mobile edge capture and analysis middleware for social sensing applications). It is designed for

mobile data collection of crowdsensing applications in an efficient, flexible, and scalable manner.

The common infrastructure allows for the collection of real-time data for different kinds of applications

simultaneously. By introducing a high-level abstraction of phenomena to be measured, applications

can express diverse data needs. Additionally, data can be shared among different applications with

common information needs. Primitive data processing can be performed on the edge of the network

(e.g., base stations in cellular networks). The platform CAROMM (context-aware real-time open

mobile miner), proposed by the authors of [

], supports data collection for mobile crowdsensing

applications by leveraging real-time mobile stream mining. This reduces the amount of transmitted data

as well as the energy usage on mobile devices, while providing accuracy comparable to conventional

approaches. Different types of stream data can be captured on mobile devices, processed,

managed, analyzed, and finally queried by mobile users. The collaborative mobile sensing platform

MOSDEN (mobile sensor data engine) [

] enables us to capture and share sensed data between

multiple distributed (mobile) applications. Its design goals are ease of use, ease of development and

deployment, scalability and performance, ease of access to both on-board and external sensors, support


for on-board data analytics and collaboration, as well as data sharing. In contrast to CAROMM,

MOSDEN separates data collection, processing, and storage from the domain-specific application logic

by providing standardized interfaces in order to reduce complexity as well as to ease the re-usability

opportunities for developers. CARDAP [

], a context-aware real-time data analytics platform, deals

with energy-efficient and context-aware distributed mobile data analytics in the context of distributed

applications like crowdsensing. It utilizes a standardized component-oriented approach in order to

provide the application-specific analytics and additionally addresses local data storage and processing

in fog and cloud environments. The authors of [

] propose an approach for mobile crowdsensing

based on the cloud-based publish/subscribe middleware CUPUS (cloud-based publish/subscribe

middleware). The platform is able to acquire sensor data from mobile devices in a flexible and

energy-efficient manner, and to perform near real-time processing of Big Data streams. It allows for

managing mobile sensor resources within the cloud, filtering and aggregating sensor data on mobile

devices before they are transmitted to the cloud based on global data requirements, and sending push

notifications from the cloud to mobile devices. However, none of the approaches discussed focuses

on the mHealth context, and therefore none of them considers aspects such as privacy. In addition, from

a technical point of view, none of these approaches combines cloud-native and stream processing

concepts to enable efficient and (horizontally) scalable processing of measurement data, as proposed

in the work at hand.

Platforms that deal with efficient processing of data streams have already been discussed in the

literature. For example, Microsoft StreamInsight [

] is an extensible stream processing platform that

enables continuous query processing, while ensuring a well-defined temporal model over incoming

events. Its extensibility infrastructure allows us to process geospatial data with the help of the SQL

Server Spatial library [

]. Furthermore, other approaches for processing geospatial data in particular

have been considered in the past. The authors of [

] review recent developments in the context of

crowdsourcing of geospatial data in particular. They identify two basic technologies that facilitate

these developments: geo-referencing (e.g., GPS) and the Web 2.0 to enable user-generated content (e.g.,

by uploading data via broadband communication). Challenges and opportunities of geospatial big data

are discussed in [

]. The authors highlight the opportunities emerging from advancements

in sensor and communication technologies as well as mobile devices, and emphasize the importance

of high performance computing in this context. As an example, the XML-based system G-Portal [

]

has been designed to organize and manage geospatial as well as geo-referenced information in order

to make it available through a web search and an interactive map. Other geospatial applications like

GeogDL [17] have been built on top of G-Portal.

Crowdsensing of geospatial data has already been considered in other application domains,

such as vehicular networks [

] or location-based games [

]. Furthermore, the feasibility of

crowdsensing platforms in the mHealth context that support people suffering from chronic disorders,

especially tinnitus, has already been shown by [

–

]. However, to the best of our knowledge, none

of these approaches combines efficient and scalable processing of geospatial data with crowdsensing

in the mHealth context, as it is done in this work.

3. Background Information

Mobile crowdsensing (MCS) is a paradigm that is increasingly utilized in the mHealth

context [22,23].

It has been shown that MCS has the potential to reveal meaningful medical insights

when it is combined with Ecological Momentary Assessments (EMA) [

]. For instance, the authors

have developed the mobile crowdsensing platform TrackYourTinnitus (TYT), which is designed for

patients with the tinnitus disease [

]. Note that the tinnitus disorder can be described as the phantom

perception of a sound. The related symptoms of patients are subjective and vary over time. In order

to monitor and evaluate this variability of symptoms over time, TYT was realized based on EMAs

and mobile crowdsensing. Notably, with a prevalence rate of 10–15% of the population worldwide,

tinnitus is a chronic disorder with a high economic burden. Since there is no general treatment,


patients search for valuable experiences or methods to better manage their symptoms in daily life.

One possible countermeasure constitutes the avoidance of noisy places, as it is often reported that

patients suddenly get a tinnitus episode or that their already existing symptoms worsen after they

visited a concert. In addition, the avoidance of noisy places is also recommended for other purposes, such as stress management. Therefore, in this work, the power of the crowd shall be leveraged to create noise level maps of a region. More precisely, users should be able to measure the current noise level of their environment with their smartphones using mobile crowdsensing techniques. Noise level maps that are created from this data can be used by tinnitus patients to avoid noisy places (provided that the collected data is reliable), or by other users for other healthcare purposes such as the reduction of stress.

Although this work exploits the power of the crowd, it technically differs greatly from

TrackYourTinnitus with respect to three major issues: First, the platform should be able to process many

concurrent requests for incoming and outgoing measurement data without a significant performance

loss. Hence, it should be scalable and, in the best case, also be elastic. To get a better idea of these two

criteria, the following definitions for scalability and elasticity from the literature are used.

Definition 1.

Scalability is “the ability of a system to maintain the satisfaction of its quality goals to levels that

are acceptable to its stakeholders when characteristics of the execution environment (“the world”) and design

(“the machine”) vary over expected ranges.” [25,26]

Definition 2.

“Elasticity is the degree to which a system is able to adapt to workload changes by provisioning

and deprovisioning resources in an autonomic manner, such that at each point in time the available resources

match the current demand as closely as possible.” [27]
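To make the notion of elasticity more tangible, the following minimal sketch computes how many service replicas would be needed so that the available resources follow the current demand. It is a simple proportional rule under stated assumptions, comparable in spirit to the scaling rules of container orchestrators; the function name, the per-replica capacity, and the replica bounds are hypothetical and not taken from the paper.

```python
import math


def desired_replicas(requests_per_second, capacity_per_replica,
                     min_replicas=1, max_replicas=20):
    """Return the number of replicas needed to serve the current load.

    The result is the load divided by the (assumed) capacity of a single
    replica, rounded up and clamped to the range [min_replicas, max_replicas].
    """
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    needed = math.ceil(requests_per_second / capacity_per_replica)
    return max(min_replicas, min(max_replicas, needed))


# Provisioning when load rises, deprovisioning when it falls again.
for load in (50, 950, 10_000, 120):
    print(load, "req/s ->", desired_replicas(load, capacity_per_replica=200), "replicas")
```

A scalable system can serve the higher load once resources are added; an elastic system additionally performs this adjustment autonomously and in both directions.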

Second, the platform should be able to efficiently process, store, and deliver a large amount of

geospatial data. Geospatial data or geographic data denote data with “implicit or explicit reference

to a location relative to the Earth” [

]. Finding an adequate representation format for this kind of

data is a key task in the design phase of such a platform. In general, computations on geospatial

data are complex, as operations on high-resolution coordinates, which are needed in order to aggregate

geographically and hierarchically related data, are costly [

]. Therefore, it is of utmost importance to

select an efficient approach for indexing and aggregating geospatial data. Third, the platform should

be designed in a generic way, so that it is able to process different types of geospatial crowdsensing

data (e.g., noise pollution, air pollution, or traffic information) as well as different types of utilized

sensors (e.g., smartphone or stationary sensors).

To conclude, designing a crowdsensing platform that is able to collect, process, store, and deliver

data for noise measurements comes with several new challenges compared to other crowdsensing

solutions of the authors in particular and other existing work in general.

4. Technical Approach

The core functions that were identified for the mobile crowdsensing platform, which enables the

implementation of a noise level map for tinnitus patients, are shown in Table 1. After the requirements

analysis and during the design phase, it was decided to decompose the system into bounded contexts.

The term originates from domain-driven design (DDD) and “delimits the applicability of a particular

model, so that team members have a clear and shared understanding of what has to be consistent

and how it relates to other contexts” [

]. The contexts serve as the inner boundaries for a global

domain (e.g., crowdsensing of geospatial data), and are the result of a strategic decomposition

of large components into smaller, more coherent components [

]. After defining these contexts,

respective microservices were developed that can be flexibly adapted or replaced if the requirements

of a context change. To this end, the six bounded contexts User Identity, Social, Measurements,

Incentives, Communication, and Sensors were identified. Following this, functions were mapped to


one of these contexts as shown in Table 1. Finally, one or more microservices compose a bounded

context, as different patterns must be supported to technically implement a bounded context through

microservices.

Table 1. Core functions of the platform mapped to bounded contexts. Reproduced with permission from Kraft et al., In Proceedings of the 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS); published by IEEE, 2019 [5].

# | Function | Bounded Context
1.1 | Let users register and authenticate with the backend. | User Identity
1.2 | Let users change their password and provide lost-password recovery. | User Identity
1.3 | Let users deactivate and delete their user account. | User Identity
2.1 | Let users maintain a User Profile with personal information. | Social
2.2 | Let users join groups and start, follow, and contribute to discussions. | Social
2.3 | Provide geospatial relations of groups and discussions. | Social
2.4 | Trigger a notification to the user on new contributions in subscribed discussions or subscribed areas of interest. | Social
3.1 | Collect measurements provided by smartphones and other IoT-devices and streamline them as a common input stream. | Measurements
3.2 | Aggregate the measurements to provide min-, max-, and average values within certain geospatial areas and time-based windows. | Measurements
3.3 | Allow geospatial request filtering by specifying the area of interest and time windows. | Measurements
3.4 | Allow access to single stored measurements with a pagination like limitation for the number of results. | Measurements
3.5 | Provide an API that returns the results in a common geospatial format to allow straightforward visualization features with commonly used frontend technologies. | Measurements
4.1 | Track user contributions for authorization of additional functionality and to provide a feature that users can evaluate their progress. | Incentives
4.2 | Maintain awards and streaks for certain achievements that motivate users to continue in contributing measurements. | Incentives
5.1 | Inform users about certain events via email. | Communication
5.2 | Inform users about certain events via push-notifications. | Communication
5.3 | Let the user define preferences for the type of events he or she likes to be informed. | Communication
6.1 | Manage meta-information about statically deployed sensors. | Sensors
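Function 3.2 in Table 1 (min, max, and average values per geospatial area and time window) can be sketched as follows. This is a simplified illustration, not the implementation of the prototype: a plain latitude/longitude grid serves as a stand-in for the spatial index, fixed-size (tumbling) windows are assumed, and all names, the cell size, and the window length are hypothetical.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    lat: float        # latitude in degrees
    lon: float        # longitude in degrees
    timestamp: float  # seconds since the epoch
    value: float      # measured value, e.g., noise level in dB


def aggregate(measurements, cell_size_deg=0.01, window_seconds=3600):
    """Group measurements by grid cell and time window.

    Returns a dict that maps (cell, window) to the min, max, average,
    and count of the values that fall into that cell and window.
    """
    groups = defaultdict(list)
    for m in measurements:
        # Assumed stand-in for a spatial index: integer grid coordinates.
        cell = (math.floor(m.lat / cell_size_deg), math.floor(m.lon / cell_size_deg))
        window = int(m.timestamp // window_seconds)
        groups[(cell, window)].append(m.value)
    return {
        key: {
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
            "count": len(values),
        }
        for key, values in groups.items()
    }


# Invented sample data: two readings in the first hour, one in the second.
samples = [
    Measurement(48.4279, 9.9605, 100.0, 50.0),
    Measurement(48.4271, 9.9615, 200.0, 70.0),
    Measurement(48.4279, 9.9605, 4000.0, 60.0),
]
for key, stats in sorted(aggregate(samples).items()):
    print(key, stats)
```

In a streaming setting, the same grouping would be applied incrementally to an unbounded input instead of a finite list.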

A cloud-native approach was selected in order to enable an efficient and scalable processing

of concurrent requests for noise measurements. A cloud-native application (CNA) denotes an

application that is explicitly designed to be deployed in the cloud. To this end, such applications

are distributed, horizontally scalable, and elastic by design. Moreover, CNAs are composed of

microservices, with a minimum of isolated states [

]. The developed prototypical implementation

uses several microservices, utilizing Docker (https://www.docker.com/) as container technology and

Kubernetes (https://kubernetes.io/) as container-orchestration system. Notably, in order to enable

decoupled processing of incoming geospatial data, the cloud-native approach was combined with stream

processing. Stream processing denotes a programming paradigm in which data from an unbounded

(i.e., infinite and ever-growing) dataset (data stream or event stream) is continuously read and processed.

This ongoing processing takes place in a continuous, asynchronous, and non-blocking manner [ ].

Furthermore, due to its compatibility to Apache Kafka (https://kafka.apache.org/), the library Kafka

Streams was used for the implementation of the stream processing. Apache Kafka is a distributed

stream processing platform that allows users to publish and subscribe to streams of messages. In Kafka,


services can act as producer and publish different messages to Kafka Topics. Every interested service

can then act as a consumer that subscribes to these topics and reads the respective messages [33].
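The producer/consumer interplay described above can be illustrated with a small in-memory sketch. It deliberately does not use the Apache Kafka API; the classes ToyBroker and Consumer as well as the topic name are hypothetical and only mimic the idea of append-only topics that several consumers read independently.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a message broker; NOT the Apache Kafka API."""

    def __init__(self):
        # topic name -> append-only list of messages
        self._log = defaultdict(list)

    def publish(self, topic, message):
        """Append a message to the end of the given topic."""
        self._log[topic].append(message)

    def read(self, topic, offset=0):
        """Return all messages of a topic starting at the given offset."""
        return self._log[topic][offset:]


class Consumer:
    """Keeps its own read position per topic, so consumers progress independently."""

    def __init__(self, broker):
        self._broker = broker
        self._offsets = defaultdict(int)

    def poll(self, topic):
        """Return the messages published since the last poll of this consumer."""
        messages = self._broker.read(topic, self._offsets[topic])
        self._offsets[topic] += len(messages)
        return messages


broker = ToyBroker()
aggregator = Consumer(broker)  # e.g., a service that aggregates measurements
archiver = Consumer(broker)    # e.g., a service that stores raw measurements

broker.publish("raw-noise-measurements", {"lat": 48.4279, "lon": 9.9605, "db": 62.5})
broker.publish("raw-noise-measurements", {"lat": 48.4271, "lon": 9.9615, "db": 71.0})

print(len(aggregator.poll("raw-noise-measurements")))  # both messages
print(len(archiver.poll("raw-noise-measurements")))    # both messages as well
```

Because producers only append to a topic and each consumer tracks its own position, services stay decoupled and new consumers can be added without changing the producers.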

Figure 2 shows the overall architecture of the platform. Incoming measurements from smartphones

are handled by the central Measurement API Services. Measurements are then forwarded by these

services to the stream processing with Apache Kafka. In addition, any type of an Internet of Things

(IoT) sensor can use the message queuing telemetry transport (MQTT) protocol in order to directly

contribute measurement data to Apache Kafka. Furthermore, the measurement API services provide

an interface for the other services that allows them to consume raw or transformed measurement

data. In Section 6, this process is described in detail. User authentication and authorization for all

access-restricted services are handled by the Authentication Services. Different autonomous services

manage their individual databases for sensor data, social and discussion data, incentive (i.e., challenges

and awards) data, and finally communication (i.e., contact) data.

Figure 2. Architecture of the crowdsensing platform. The components depicted in orange are part of the proof-of-concept implementation. Reproduced with permission from Kraft et al., In Proceedings of the 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS); published by IEEE, 2019 [5]. [Diagram components: Mobile App, IoT Sensors, API-Gateway, Authentication Services (Credentials Data), Noise Measurement API Services (Measurement Data), Apache Kafka with Kafka Streams Services and Kafka Connect (raw noise measurements, intermediate topics, final results), Social Services (Social/Discussion Data), Incentive Services (Challenges/Awards Data), Communication Services (Contact Data), Push Notification Service, and Sensor Services (Sensor Data, sensor meta information).]


5. Representation of Geospatial Data

In order to efficiently transmit, store, and process geospatial data, it has to be represented by an

adequate data format. A coordinate reference system (CRS) or spatial reference system (SRS) [

] is

a system that describes how entities are located in space based on their coordinates. It allows us to

unambiguously identify any point in a geographical space (e.g., on Earth). In this context, an object

is related to a coordinate system by a geodetic datum. One of the best-known geodetic datums is

the World Geodetic System (WGS). Its latest revision WGS84 [

] is used by the Global Positioning

System (GPS) [

], which, in turn, is commonly used for navigation purposes. WGS approximates the

sea-level of the earth by using a defined ellipsoid. A point is then described by latitude and longitude

angles on this surface.
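As a small worked example of computing with latitude and longitude angles, the following sketch estimates the distance between two points using the haversine formula. Note the assumption: the formula treats the Earth as a sphere with a mean radius, which is a simplification of the WGS84 ellipsoid; the function name and the constant are chosen for illustration only.

```python
import math

# Mean Earth radius in meters (spherical approximation, not the WGS84 ellipsoid).
EARTH_MEAN_RADIUS_M = 6_371_008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Return the great-circle distance in meters between two points.

    All arguments are angles in decimal degrees (latitude, longitude).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_MEAN_RADIUS_M * math.asin(math.sqrt(a))


# One degree of longitude along the equator.
print(round(haversine_m(0.0, 0.0, 0.0, 1.0)))
```

For short distances, as they occur when aggregating nearby measurements, the spherical approximation is usually sufficient; applications that need geodetic accuracy would use an ellipsoidal method instead.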

Furthermore, in order to provide interoperability between systems, we decided to use the OGC Simple Feature Access Specification [ ], published by the Open Geospatial Consortium (OGC). It defines a common architecture for storing and accessing geometric objects that are associated with a spatial reference system. Its geometry model includes definitions for geometries like points, lines, and polygons. GeoJSON [ ] is a geospatial data interchange format based on the JavaScript Object Notation (JSON) that uses WGS84 as its coordinate reference system and encodes geographic data structures according to the Simple Feature Access Specification. A GeoJSON object may represent not only a region of space (i.e., a Geometry), but also a spatially bounded entity with additional properties (i.e., a Feature), or a list of features (i.e., a FeatureCollection). A Feature object contains an identifier (id), a geometry object, and additional properties that describe the entity. Geometry objects are of type Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, or GeometryCollection, and are described by coordinates, which are encoded as arrays of the form [longitude, latitude] (optionally followed by altitude/elevation). The properties attribute contains an arbitrary JSON object and can be used to associate data with the geometry. In the developed prototypical implementation, these properties were used to store the (sensor) measurement data. Figure 3 shows an example of a GeoJSON feature, including a polygon, defined by the coordinates of its vertices, and the properties containing (sensor) measurement values. The right-hand side of the figure shows a graphical representation of the polygon described by the geometry.

{
  "type": "Feature",
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [
        [9.96005, 48.42798],
        [9.95979, 48.42733],
        [9.96051, 48.42690],
        [9.96150, 48.42712],
        [9.96177, 48.42776],
        [9.96104, 48.42819],
        [9.96005, 48.42798]
      ]
    ]
  },
  "properties": {
    "measurement_value1": 50.5,
    "measurement_value2": 0,
    ...
  },
  "id": 622051731373752300
}

Figure 3.

Example for GeoJSON feature object, including a polygon geometry and its visualization

(Screenshot from http://geojson.io).
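Two details of the GeoJSON encoding described above are easy to get wrong in practice: positions are written as [longitude, latitude], and the ring of a polygon has to end with the position it starts with (as in Figure 3). The following is a minimal Java sketch of both points. The class and method names are hypothetical and chosen for illustration only; they are not taken from the platform's code base, and a real implementation would most likely rely on a JSON library instead of string concatenation.

```java
import java.util.List;

/** Illustrative GeoJSON helpers; all names here are hypothetical. */
final class GeoJsonSketch {

    /** Serializes a Point Feature; GeoJSON positions are [longitude, latitude]. */
    static String pointFeature(double latitude, double longitude,
                               String propertyName, double value) {
        if (latitude < -90 || latitude > 90 || longitude < -180 || longitude > 180) {
            throw new IllegalArgumentException("coordinates out of WGS84 range");
        }
        return "{\"type\":\"Feature\","
                + "\"geometry\":{\"type\":\"Point\",\"coordinates\":["
                + longitude + "," + latitude + "]},"
                + "\"properties\":{\"" + propertyName + "\":" + value + "}}";
    }

    /** A polygon ring needs at least four positions and must end where it starts. */
    static boolean isClosedRing(List<double[]> ring) {
        if (ring.size() < 4) {
            return false;
        }
        double[] first = ring.get(0);
        double[] last = ring.get(ring.size() - 1);
        return first[0] == last[0] && first[1] == last[1];
    }

    public static void main(String[] args) {
        // First vertex of the polygon in Figure 3, passed as (latitude, longitude)
        System.out.println(pointFeature(48.42798, 9.96005, "measurement_value1", 50.5));
    }
}
```

Passing latitude and longitude as separate, named parameters and swapping them only at serialization time keeps the axis order in one place.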


In order to enable efficient storage, processing, and hierarchical aggregation of geospatial data, it is helpful to partition the data into buckets. To this end, we make use of a Discrete Global Grid System (DGGS). The following definition is provided by the Open Geospatial Consortium (OGC):

Definition 3.

“A DGGS is a spatial reference system that uses a hierarchical tessellation of cells to partition and

address the globe. DGGS are characterized by the properties of their cell structure, geo-encoding, quantization

strategy and associated mathematical functions” [39].

The spatial reference system of a DGGS is represented by a series of discrete global grids. Each grid in this series has a finer resolution than its predecessor, i.e., it comprises a larger number of cells. Since DGGSs cover the whole spherical surface of the earth, they can be used to partition data collected anywhere on the planet.

Uber’s Hexagonal Hierarchical Spatial Index (H3) [ ] was used as the DGGS implementation. Its grid system (a visualization of the grid system can be found on Uber’s website [ ]) allows the same data to be represented efficiently in buckets of different sizes. This characteristic, in turn, is important for aggregating (and visualizing) data on different scales. For instance, indexing the two points depicted in Figure 4a with H3 at resolution 10 results in two different indexes whose bounding polygons are adjacent to each other (see Figure 4b), while indexing them at resolution 6 results in one common index that includes both points (see Figure 4c). Notably, the H3 library makes it possible to determine the bucket into which a specific geospatial location falls and, inversely, to calculate the boundary of a bucket if its index is known.

(a) Two GeoJSON Points (b) The points from (a) indexed with H3 at resolution 10 (c) The points from (a) indexed with H3 at resolution 6

Figure 4.

GeoJSON points and their respective H3-indexed polygons at different resolutions

(Screenshots from http://geojson.io/).
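The effect shown in Figure 4 (separate buckets at a fine resolution, one shared bucket at a coarse resolution) can be illustrated without the H3 library. The sketch below deliberately swaps in a much simpler square grid over latitude and longitude; it is not the H3 algorithm, it ignores the distortion of such a grid towards the poles, and the class name, the `cellsPerDegree` parameter, and the two sample points are assumptions made for this illustration.

```java
/** Simplified stand-in for a hierarchical grid index; this is NOT the H3 algorithm. */
final class GridBucketSketch {

    /** Bucket key of a square grid whose cells are 1/cellsPerDegree degrees wide. */
    static String bucket(double latitude, double longitude, int cellsPerDegree) {
        long row = (long) Math.floor(latitude * cellsPerDegree);
        long col = (long) Math.floor(longitude * cellsPerDegree);
        return cellsPerDegree + ":" + row + ":" + col;
    }

    public static void main(String[] args) {
        // Two hypothetical points roughly 150 m apart
        double[] p1 = {48.42795, 9.96005};
        double[] p2 = {48.42715, 9.96155};

        // Fine grid (0.001 degree cells): the points fall into different buckets
        System.out.println(bucket(p1[0], p1[1], 1000) + " vs " + bucket(p2[0], p2[1], 1000));

        // Coarse grid (0.1 degree cells): both points share one bucket
        System.out.println(bucket(p1[0], p1[1], 10) + " vs " + bucket(p2[0], p2[1], 10));
    }
}
```

With H3, the role of `bucket` is played by the library's indexing function, and the bucket key is the H3 index of the hexagonal cell at the chosen resolution.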

6. Measurement Context

The measurement context is a decisive aspect and key factor for creating a noise level map. This section describes how the challenges of efficiently representing and aggregating noise measurements are addressed. First, the document-based (NoSQL) data model for noise measurements that was developed for the Measurement Context is shown in Figure 5. GeoJSON Simple Features (see Section 5) are used to model measurements and aggregations in order to provide better compatibility with different geo-libraries and with data storage in MongoDB (https://www.mongodb.com/). Importantly, the latter inherently supports indexing of GeoJSON structures. Technically, the attributes Type, Geometry, and Properties are used. A unique id and, additionally, a geo-index that is calculated using the H3 library are assigned to each noise measurement. Properties are used to store the measurement payload, which contains the type of the sensor, the trigger of the measurement, and one or more Measurement_Types (e.g., LAeq, LCpeak, and TWA [ ]), amongst other attributes. Each of these types can either contain only the type and a value, if it represents a single measurement, or contain minimum, maximum, mean, and count values, if it belongs to an AverageFeature. The latter is used to store aggregated values for a specific time window in a specific geographical area (i.e., a hexagonal polygon). Any user-related data are stored separately from the measurement data in order to preserve privacy.

[Figure 5 diagram. Entities and attributes: MeasurementFeature (id, type, Geometry, Properties); Geometry (type, coordinates); Properties of a MeasurementFeature (h3idx, sensor_type, trigger_type, created_at, accepted_at, created_by_userid, created_by_user_type, position_accuracy_in_meters, Measurement_Types, Device); AverageFeature (id, type, Geometry, Properties); Properties of an AverageFeature (h3idx, h3_resolution, center_coord, calculated_at, window_id, from, to, Measurement_Types); Measurement_Types (type, value, count, min, max, mean); MeasurementUserMapping (measurement_id, user_id, created_at); Device (brand, model, device_id); PrivacySettings (userid, settings, privacy_type, privacy_user_type).]

Figure 5.

Data model for noise measurements. Reproduced with permission from Kraft et al., In

Proceedings of the 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems

(CBMS); published by IEEE, 2019 [5].

Before sending measurements, users can authenticate themselves to the system. The authentication follows a token-based approach based on OAuth 2.0 [ ] in order to reduce the transmission of user credentials and to enable the separation of authentication in a distributed environment. Users may log in once with their credentials and then receive a signed access token that can be used for all subsequent requests for a certain period of time. Only the Authentication Service (see Section 4) is able to retrieve the user information associated with a token, while other services are only able to verify that the token is valid. Alternatively, users may share their measurements anonymously.

Figure 6 shows the implemented data flow of measurements in the Measurement Context. The Data Ingress Phase consists of Steps 1–4, which are briefly outlined: In Step 1, measurement records from the mobile application are sent to an endpoint of the ingress service, while in Step 2, the ingress service checks the records for validity. Additionally, the endpoint handler attaches timestamps as well as a user_id, if the user is authenticated. An example of a GeoJSON object, as it is sent to the ingress service, is shown in Figure 7. In Steps 3 and 4, the measurement is published to the Kafka topic noise-raw-measurements, and a confirmation message is sent to the mobile application (if the measurement is valid). Thereafter, the Stream Processing Phase is performed in Apache Kafka in Steps 5–12. In the following, these steps are discussed in more detail.


[Figure 6 diagram. Components: Mobile App, Ingress Service, Access Service, Apache Kafka, Kafka Streams Preparator, Kafka Streams Aggregator, Kafka Connect (Mongo Connector), and the data stores Single Noise Measurements, Aggregations, and User Data. Labeled steps:

1. Data Ingress Phase: (1) Post record. (2) Check record format and set correct USERID from token. (3) Append noise measurement to Kafka topic noise-raw-measurements. (4) Respond to user device that the noise measurement has been accepted and will be processed.

2. Stream Processing Phase: (5) Poll from noise-raw-measurements. (6) Pre-checks, validation, and privacy separation. (7) Publish noise-anonymized-h3-10-indexed, noise-anonymized-h3-5-indexed, noise-user-measurement-mapping. (8) Poll from noise-anonymized-h3-10-indexed, noise-anonymized-h3-5-indexed. (9) Time-windowed average calculation based on H3-indexes. (10) Publish to different noise-average-rXwY aggregate topics. (11.1) Poll from noise-anonymized-h3-10-indexed. (11.2) Poll from different aggregate topics. (11.3) Poll from noise-user-measurement-mapping. (12) Efficiently off-load topics to database using batches.

3. Data Access Phase: (13) Send request to visualize data. (14) Authorize request using provided token. (15) Load data of interest from corresponding database. (16) Perform final aggregation steps and apply privacy filters. (17) Respond with proper data for visualization purposes.]

Figure 6.

Data flow of noise measurements. Reproduced with permission from Kraft et al.,

In Proceedings of the 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems

(CBMS); published by IEEE, 2019 [5].

{
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [
      9.967101535538086,
      48.384883089298114
    ]
  },
  "properties": {
    "measurement_types": {
      "LAeq": {
        "type": "LAeq",
        "value": 70.5179
      },
      ...
    },
    ...
    "device": {
      "brand": "apple",
      "model": "iphone4,1",
      "device_id": "uuid"
    }
  }
}

Figure 7. Example GeoJSON object that is sent to the ingress service.


Table 2. Kafka topics used in the noise measurement stream processing.

Topic | Key | Description
noise-raw-measurements | created_at | The entrance topic for every measurement.
noise-user-measurement-mapping | userid | Contains mapping objects that relate measurement-id and user-id.
noise-anonymized-h3-10-indexed | H3idx(10) | Contains measurements that are filtered and anonymized. The key is an H3 index of resolution 10, in order to be correctly assigned to partitions used in average aggregation.
noise-anonymized-h3-5-indexed | H3idx(5) | Contains measurements that are filtered and anonymized. The key is an H3 index of resolution 5, in order to be correctly assigned to partitions used in average aggregation.
noise-average-r10w15 | H3idx(10) | Contains smaller aggregations in H3-resolution 10 and a window-length of 15 min.
noise-average-r10w60 | H3idx(10) | Contains smaller aggregations in H3-resolution 10 and a window-length of 60 min.
noise-average-r5w60 | H3idx(5) | Contains larger aggregations in H3-resolution 5 and a window-length of 60 min.
noise-average-r10w1440 | H3idx(10) | Contains smaller aggregations in H3-resolution 10 and a window-length of 1 day.
noise-average-r5w1440 | H3idx(5) | Contains larger aggregations in H3-resolution 5 and a window-length of 1 day.

The Java library Kafka Streams was utilized in order to publish measurement data to different topics. Table 2 shows the topics currently used in the developed stream processing implementation, together with the key that is used to identify each record and a description of the respective topic. In Steps 5–7, all messages from the noise-raw-measurements topic are processed by the preparator, i.e., measurements are validated and subsequently anonymized for privacy reasons. To be precise, the mapping between users and individual measurements is stored in a separate topic, noise-user-measurement-mapping, to which only the respective user has access. This way, all measurements used for the aggregations can be stored without any user data, while users can still retrieve their own measurements. Furthermore, the coordinates of the measurements are used to calculate H3-indexes with the resolutions 10 (smallest cells) and 5 (intermediate). The anonymized and indexed data are then published to the respective Apache Kafka topics. In Steps 8–10, the data from the previous steps serve as input for the aggregator, in which averages (or other aggregation operations) for the different resolutions and time windows are calculated based on the H3-indexes, as shown in Figure 8. A time window is thereby characterized by a window length (e.g., 15 min to allow for reasonably current data) and a retention time. The latter specifies for how long the window is updated retrospectively if measurements arrive at a later time (e.g., one day, so that the data from previous days can be considered complete). In the context of noise measurements, minimum and maximum values are determined and averages are calculated with respect to the logarithmic scale of decibels (see Figure 9). Aggregation results are then published to different Apache Kafka topics. In Steps 11–12, the produced results are persisted to a MongoDB database with Apache Kafka Connect, which allows us to efficiently query data that would otherwise be partitioned across multiple topics in the Kafka cluster. Finally, in Steps 13–17, data are requested and prepared for visualization in the Data Access Phase, which proceeds as follows: In Step 13, the mobile application sends a request to the RESTful API of the access service, specifying an H3 resolution, a time window, and a geo-boundary. Optionally, if the request is access-restricted, authorization is performed in Step 14.


Furthermore, in Step 15, MongoDB’s geospatial indexes are utilized in order to efficiently load data from the corresponding databases. In Step 16, final aggregation steps are performed if the data are not already present in the database in the requested format (e.g., a time window or resolution that is not pre-aggregated), and privacy filters are applied (e.g., to ensure that only the users themselves can see their own raw measurements). Finally, in Step 17, the data are delivered to the mobile application, in which they can be used for visualization on a map.
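The window handling described for Steps 8–10 can be made concrete with a small sketch. It assumes tumbling windows that are aligned to the Unix epoch (so that 1440-minute windows start at midnight UTC) and interprets the retention time as the period after the end of a window during which late measurements are still accepted. Both points are simplifying assumptions for illustration; the class and method names are hypothetical and do not use the Kafka Streams API.

```java
import java.time.Duration;
import java.time.Instant;

/** Illustrative window arithmetic; names and semantics are simplifying assumptions. */
final class WindowSketch {

    /** Start of the epoch-aligned tumbling window that contains the timestamp. */
    static Instant windowStart(Instant timestamp, Duration windowLength) {
        long ts = timestamp.toEpochMilli();
        long length = windowLength.toMillis();
        return Instant.ofEpochMilli(ts - Math.floorMod(ts, length));
    }

    /** Aggregate topic name following the noise-average-rXwY pattern of Table 2. */
    static String aggregateTopic(int h3Resolution, Duration windowLength) {
        return "noise-average-r" + h3Resolution + "w" + windowLength.toMinutes();
    }

    /** A late measurement updates its window until the retention time has elapsed. */
    static boolean stillAccepted(Instant measuredAt, Instant now,
                                 Duration windowLength, Duration retention) {
        Instant windowEnd = windowStart(measuredAt, windowLength).plus(windowLength);
        return !now.isAfter(windowEnd.plus(retention));
    }

    public static void main(String[] args) {
        Instant measuredAt = Instant.parse("2020-06-18T10:37:12Z");
        System.out.println(windowStart(measuredAt, Duration.ofMinutes(15)));
        System.out.println(aggregateTopic(10, Duration.ofMinutes(15)));
    }
}
```

In the aggregator, a record would then be grouped by the pair of H3 index and window start, so that each hexagonal cell has one AverageFeature per time window.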

private Aggregator<...> performAggregation(H3Core h3) {
    return (h3idx, measurementFeature, averageFeature) -> {
        // Initialize the Average Feature based on the H3-index and its geo-boundary
        if (averageFeature.getH3idx() == null) {
            averageFeature.initialize(h3idx, new Polygon("Polygon",
                    GeoUtils.geoListToDoubleArray(h3.h3ToGeoBoundary(h3idx))),
                    h3.h3GetResolution(h3idx));
            averageFeature.setCenterCoord(GeoUtils.getCenterPoint(h3, h3idx));
        }
        Map<String, Type> averageValues = averageFeature.getProperties().getMeasurementTypes();
        // For each measurement type recalculate the average separately with its own count
        measurementFeature.getProperties().getMeasurementTypes().forEach((type, measurementType) -> {
            Type averageValue = averageValues.getOrDefault(type, new Type(type, 0));
            averageValue.recalculateWithMeasurement(measurementType);
            averageValues.put(type, averageValue);
        });
        return averageFeature;
    };
}

Figure 8. Method that performs the aggregation when a new measurement is added.

public void recalculateWithMeasurement(Type value) {
    if (count == 0) {
        min = value.min;
        max = value.max;
    }
    min = (value.min < min) ? value.min : min;
    max = (value.max > max) ? value.max : max;
    mean = (mean != null) ? mean : this.value;
    value.mean = (value.mean != null) ? value.mean : value.value;
    int totalcount = count + value.count;
    // Calculate mean with respect to the logarithmic scale of decibels
    mean = 10 * Math.log10((count * Math.pow(10, 0.1 * mean) + value.count
            * Math.pow(10, 0.1 * value.mean)) / totalcount);
    count = totalcount;
}

Figure 9.

Class method that recalculates the minimum, maximum, and mean values of a

Measurement_Type.
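To see why the mean in Figure 9 is computed on the logarithmic scale, consider two measurements of 60 dB and 70 dB: their arithmetic mean is 65 dB, whereas the energy-based mean is 10 · log10((10^6 + 10^7) / 2) ≈ 67.40 dB, because the louder measurement carries ten times the sound energy. The following self-contained sketch reproduces this calculation and shows that merging partial means weighted by their counts (the formula used in Figure 9) gives the same result as averaging all values at once. The class and method names are hypothetical.

```java
/** Illustrative energy-based averaging of decibel values; names are hypothetical. */
final class DecibelMeanSketch {

    /** Energy-based mean of sound levels in dB: 10 * log10(mean(10^(L/10))). */
    static double energyMean(double... levelsDb) {
        double sum = 0.0;
        for (double level : levelsDb) {
            sum += Math.pow(10, 0.1 * level);
        }
        return 10 * Math.log10(sum / levelsDb.length);
    }

    /** Merges two partial means weighted by their counts, as in Figure 9. */
    static double merge(double meanA, int countA, double meanB, int countB) {
        double energy = countA * Math.pow(10, 0.1 * meanA)
                + countB * Math.pow(10, 0.1 * meanB);
        return 10 * Math.log10(energy / (countA + countB));
    }

    public static void main(String[] args) {
        double batch = energyMean(60, 70, 80);
        double incremental = merge(energyMean(60, 70), 2, 80, 1);
        System.out.println(batch + " vs " + incremental);
    }
}
```

Because only the mean and the count have to be kept per measurement type, an AverageFeature can be updated with each incoming measurement without storing the individual values.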

7. Proof-of-Concept Prototype

A proof-of-concept prototype was implemented, which is briefly outlined in the following.

Figure 10 shows selected screenshots of the prototype. The current environment sound level in

A-weighted decibels (i.e., dB(A)) is continuously displayed to the user of the mobile application (see

Figure 10a). Pressing the measure button initiates a noise measurement. For this process, (1) the

A-weighted and C-weighted sound levels are tracked over a time period of 30 seconds and cached for

further processing, (2) the equivalent continuous A-weighted sound level (LAeq) and the C-weighted

peak sound level (LCpeak) [

] are calculated over these cached sound levels, (3) the results are

displayed to the user, and (4) finally posted to the backend (either immediately or delayed if the

application is currently not able to establish a connection to the server). These measurements are then

processed by the backend as described in Section 6. Following this, the mobile application as well

as a website (see Figure 10b) can request aggregations for different geo-boundaries, time windows,

and zoom levels (i.e., H3 resolutions) through the access service. Utilizing this data, an adequate

visualization of the noise exposure in the form of a map can be provided to the user of the mobile


application or the website. The noise exposure is thereby indicated by a color gradient between

harmful (red) and harmless (green).

(a) Mobile Application (b) Website

Figure 10.

Screenshots of the mobile application and the website showing the noise level map.

Reproduced with permission from Kraft et al., In Proceedings of the 2019 IEEE 32nd International

Symposium on Computer-Based Medical Systems (CBMS); published by IEEE, 2019 [5].

At present, users in the region of Ulm, Germany have been recruited to test the proof-of-concept prototype. First feedback indicates that users generally value the application and recognize its benefits. However, so far only an iOS version of the mobile application has been implemented; an Android application is currently under development. The main reason for opting for iOS only in the first release was that sound measurements are more reliable and comparable on iOS with regard to the analysis and interpretation of results. iOS only runs on a relatively small number of Apple devices, which allows for an easier pre-calibration. Android, in contrast, is used as operating system by various hardware vendors with different microphones. The evaluation of retrieved sound levels therefore becomes more complex, as they may vary among different Android vendors, and would require a mechanism for ad-hoc microphone calibration. Nevertheless, the completion of an Android application will be a decisive step in order to represent the majority of smartphone users. Furthermore, based on the feedback of the users, the iOS application will be revised and extended by new features. The latter could include incentive mechanisms as well as social and communication features, as they were not realized in the first version of the application. In addition, regarding the backend, extensive performance tests are performed in order to evaluate the scalability of the platform, as described in the following section. Finally, an external sensor that is able to measure even more precise environmental data is currently being tested in combination with an Android application. In this context, it is also tested how users experience such an external sensor application over time.


8. Performance Evaluation

In order to evaluate the performance and scalability of the conceived architecture, benchmark load tests were conducted on a running instance of the prototypical implementation of the backend, following guidelines for measuring the performance of parallel computing systems [ ] and computer systems in general [ ]. In the scope of this paper, we evaluate how the performance of the Access Service (i.e., the service used by clients to request measurement data from the backend) develops under different workloads. Other user-centric services like the ingress service are evaluated analogously.

8.1. Experimental Setup and Methodology

For the experiments, the prototypical backend was deployed to the bwCloud, a cloud provider for scientific and educational purposes operated by a federation of German universities in the Federal State of Baden-Württemberg in the context of the bwCloud SCOPE project (https://www.bw-cloud.org/en/). The bwCloud provides infrastructure as a service (IaaS) and is based on the open-source software OpenStack (https://www.openstack.org/) and the distributed object store and file system Ceph (https://ceph.io/). Terraform (https://www.terraform.io/) is used as Infrastructure as Code (IaC) tool to configure the cloud resources based on structured text files. IaC is one of the central parts in building a cloud-native application, as it allows us to maintain, install, and deploy the infrastructure in a reproducible manner [ ]. Using this concept, eight virtual machines (i.e., nodes), each with 4 vCPUs and 8 GB RAM and running CoreOS 1855.4.0 (http://coreos.com/), were configured for the overall experimental setup. One of these nodes serves as master node for the open-source software Rancher (https://rancher.com/), which is used to facilitate the creation and management of the Kubernetes cluster. Another node, in turn, is used as master node for the Kubernetes cluster itself. The remaining six nodes serve as worker nodes for the cluster. Finally, the developed backend, composed of a total of 14 microservices (see Figure 6; some less relevant services, e.g., for authentication, are omitted for simplicity), was deployed to this cluster setting.

The open-source load testing tool Gatling (https://gatling.io/) is used to produce the load for the actual measurements. Gatling allows us to simulate concurrent users in a resource-saving manner by sending asynchronous messages via non-blocking protocols like HTTP. Prior to the experiment, randomly generated (sound measurement) data for the city of Ulm were posted to the backend, as displayed in Figure 10b. Benchmark workloads are created by simulating different numbers of concurrent users that (simultaneously) access the stored measurement data for the city of Ulm and for the last day via the REST API (i.e., the Access Service), utilizing Gatling’s atOnceUsers function. The experiments are run on a single Ubuntu 19.04 machine in the university network that has a stable Internet connection to the bwCloud infrastructure. For each run, the 50% quantile (i.e., the median) of the response times is recorded, as the median is more robust towards outliers than other summary statistics like the mean [ ]. Note that this median value is considered and handled as a single measurement. The experiments were then repeated, and the median of the measurements as well as the confidence interval (CI) at confidence level 1 − α = 0.95 were incrementally recomputed, as described in [ ]. A confidence interval provides a measure of accuracy for the experiment, as it bounds the uncertainty of a summary (i.e., the median) of sample data that results from the randomness of non-deterministic measurements. The interval can be interpreted as a 95% probability (i.e., confidence level) that the observed CI contains the true median [ ]. Each experiment is repeated n times until the CI is within 5% of the median of the respective measurements. Note that 5 measurements are required to compute the confidence intervals for statistical reasons [ ]. To this end, each test run was repeated n = 20 times.
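The stopping rule described above can be sketched as follows. The sketch uses a common rank-based (nonparametric) confidence interval for the median with a normal approximation of the ranks, namely the order statistics at positions ⌊(n − z√n)/2⌋ and ⌈1 + (n + z√n)/2⌉ with z = 1.96 for a 95% confidence level. Whether this is exactly the procedure of the cited guidelines is an assumption, as is the reading of "within 5% of the median" as both bounds deviating by at most 5% from the median. All names are hypothetical.

```java
import java.util.Arrays;

/** Illustrative rank-based confidence interval for the median; names are hypothetical. */
final class MedianCiSketch {

    static double median(double[] samples) {
        double[] s = samples.clone();
        Arrays.sort(s);
        int n = s.length;
        return (n % 2 == 1) ? s[n / 2] : (s[n / 2 - 1] + s[n / 2]) / 2.0;
    }

    /** Returns {lower, upper}; z is the standard normal quantile (1.96 for 95%). */
    static double[] medianCi(double[] samples, double z) {
        double[] s = samples.clone();
        Arrays.sort(s);
        int n = s.length;
        double half = z * Math.sqrt(n);
        // Ranks are 1-based and clamped to the available order statistics
        int lowerRank = Math.max(1, (int) Math.floor((n - half) / 2.0));
        int upperRank = Math.min(n, (int) Math.ceil(1 + (n + half) / 2.0));
        return new double[] {s[lowerRank - 1], s[upperRank - 1]};
    }

    /** Stop criterion for positive medians: both bounds within the given fraction. */
    static boolean ciWithin(double[] samples, double z, double fraction) {
        double m = median(samples);
        double[] ci = medianCi(samples, z);
        return ci[0] >= m * (1 - fraction) && ci[1] <= m * (1 + fraction);
    }

    public static void main(String[] args) {
        double[] runs = new double[20];
        for (int i = 0; i < runs.length; i++) {
            runs[i] = 1991 + i; // hypothetical per-run medians in ms
        }
        System.out.println(Arrays.toString(medianCi(runs, 1.96)) + " " + ciWithin(runs, 1.96, 0.05));
    }
}
```

For n = 20 runs, the approximation selects the 5th and 16th smallest values as interval bounds, so a few outlying runs do not affect the interval.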


8.2. Results

The results of the experiments are illustrated in Figure 11. The exact medians and confidence intervals of the measurements are shown in Table 3. It can be seen that the response times increase almost linearly with the number of concurrent users, as indicated by the dashed trend line. In other words, these results suggest that the system provides almost ideal linear scaling under different workloads. Regarding the longer response times of up to about 9 seconds, one has to take into account that actual simultaneous requests were used to simulate concurrent users in order to represent extreme workload situations. As the proposed architecture is composed entirely of microservices, horizontal scaling in a large-scale cluster beyond our prototypical experimental setup might lead to a flatter curve of response times. However, this could not be assessed due to the limited amount of (cloud) resources in our experimental setup. Another limitation resulting from the lack of available resources and a suitable load balancer is that the elasticity of the architecture cannot be assessed at the current stage. Furthermore, as the load tests are run from a single machine, no statement can be made for a distributed environment. Generally, the architecture has to prove its suitability in different (mHealth) scenarios. Nonetheless, the conducted experiments have shown that the proposed architecture is feasible in a running environment and can sustain increasing quantities of load in a scalable manner. Therefore, it can be considered a first and solid mainstay for the healthcare scenario addressed in this work. Many more considerations and experiments are needed to map the results and architecture to a more generic system. In addition, as medical evidence is always an important aspect in healthcare scenarios, this aspect has to be considered in the light of the presented technical achievements.

[Figure 11 plot. x-axis: number of simulated concurrent users (0 to 5000); y-axis: response time in ms (1000 to 10000).]

Figure 11. Median of the response time measurements for different numbers of simulated concurrent (simultaneous) users. The dashed line represents an ideal linear trend line. All experiments were repeated 20 times and the confidence intervals (CI) at confidence level 0.95 (1 − α = 0.95) were within 5% of the respective median.

Finally, two more aspects must be investigated in more depth. First, beyond the technical performance of the approach, studies will be conducted to capture the demands and experiences of users more properly. Second, based on the idea of data aggregation, it must be evaluated whether other context aspects can be further utilized. For example, it makes a difference whether the measurements are performed at the beginning or at the end of a day. As another example, the specific city or area for which the measurements are performed might have different characteristics and needs than other areas. These factors will therefore be considered in future work.


Table 3. Median of response times and confidence intervals (CI) in milliseconds (ms) for each number of simulated concurrent users.

Number of Concurrent Users | Median of Response Time (ms) | Confidence Interval (CI)
500 | 1080.5 | [1052 ms, 1115 ms]
1000 | 2000 | [1974 ms, 2027 ms]
1500 | 2941 | [2874 ms, 2972 ms]
2000 | 3873 | [3826 ms, 3937 ms]
2500 | 4592 | [4564 ms, 4662 ms]
3000 | 5543.5 | [5306 ms, 5607 ms]
3500 | 6414.5 | [6264 ms, 6674 ms]
4000 | 7384 | [7174 ms, 7535 ms]
4500 | 8204 | [7904 ms, 8433 ms]
5000 | 8943 | [8795 ms, 9182 ms]
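The near-linear growth of the response times can be checked directly on the medians of Table 3 with an ordinary least-squares fit. The sketch below is an illustration added for this purpose and not part of the original evaluation; the class name and the choice of a straight-line model are assumptions.

```java
/** Illustrative least-squares fit on the medians of Table 3; names are hypothetical. */
final class LinearTrendSketch {

    // Number of concurrent users and median response times (ms) from Table 3
    static final double[] USERS =
            {500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000};
    static final double[] MEDIAN_MS =
            {1080.5, 2000, 2941, 3873, 4592, 5543.5, 6414.5, 7384, 8204, 8943};

    /** Fits y = intercept + slope * x and returns {slope, intercept, rSquared}. */
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i] / n;
            meanY += y[i] / n;
        }
        double sxx = 0.0;
        double sxy = 0.0;
        double syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        double slope = sxy / sxx;
        return new double[] {slope, meanY - slope * meanX, (sxy * sxy) / (sxx * syy)};
    }

    public static void main(String[] args) {
        double[] result = fit(USERS, MEDIAN_MS);
        System.out.printf("slope=%.4f ms/user, intercept=%.1f ms, R^2=%.4f%n",
                result[0], result[1], result[2]);
    }
}
```

On these values, the fitted slope is roughly 1.76 ms per additional simultaneous user, and the coefficient of determination is above 0.99, which is consistent with the almost linear trend visible in Figure 11.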

In summary, particularly by taking the promising experimental results into account, the three

main technical challenges raised in the introduction are addressed by the proposed technical solution

as follows:

• Scalability and Elasticity: The architecture is scalable and elastic due to its cloud-native design based on microservices.

• Efficient Geospatial Processing: Geospatial data is efficiently processed in the architecture by utilizing stream processing techniques and a DGGS as spatial reference system for data representation.

• Flexibility: The architecture is flexible due to the modularity of the microservices of the cloud-native design.

9. Summary and Outlook

This work presented an approach to create a noise level map using a mobile crowdsensing

platform capable of processing noise measurements from a large number of crowd users and their

smartphones. Noise exposure for a specific area is thereby indicated by a color coding and different

zoom levels. The latter features were made possible through a newly designed measurement context

that stores and aggregates noise measurements by developing a sophisticated stream processing

pipeline. From a patient point of view, first study results based on a proof-of-concept prototype

indicate that users value the platform’s general approach and welcome its ease of use. From the

technical point of view, a performance evaluation has been conducted that suggests linear scaling of

the conceived architecture under increasing amounts of workloads. However, this work also discussed

technical aspects that need to be improved in order to finally provide a feasible approach that can

be reliably applied in various practical scenarios. Furthermore, extensive performance tests will be

repeatedly conducted against state-of-the-art architectures to evaluate the scalability of the system in

different mHealth and eHealth scenarios. With regard to tinnitus disorder, a noise level map

that is based on the discussed approach may be used by patients to avoid burdensome places. Many

other useful features for tinnitus patients were revealed when testing the proof-of-concept prototype

in practice. For instance, users indicated they could complete a tinnitus-related questionnaire while

performing a noise measurement. Using this information, the data of the tinnitus questionnaire can

be related to the recorded noise levels to enable users to learn more about the daily fluctuations of

their tinnitus. Furthermore, other statements of users indicate that the overall incentive management

must be enhanced to motivate users in participating over a longer period of time. From a technical

point of view, it was revealed that a complex technical architecture and infrastructure are required for

the implementation of the discussed features. The resulting solution, on the other hand, can also be

used in other mHealth contexts. For example, the system could be used to measure weather-related

factors in the context of migraine. In addition, machine learning (ML) approaches on the data streams


have the potential to further improve the system, e.g., by supplementing data sets of areas with sparse

measurement contributions. Overall, it was revealed that mobile crowdsensing in the mHealth context

is still in its infancy. On the other hand, approaches such as the one presented in this work show that

mobile crowdsensing is a promising paradigm for mHealth scenarios. More importantly, the work at

hand shows that the combination of medical-driven information science and computer science is an

important field that requires more in-depth investigations of interdisciplinary teams.

Author Contributions:

Conceptualization, R.K. and F.B.; methodology, R.K., M.R., and H.B.; software, R.K. and

F.B.; validation, R.K.; formal analysis, R.K.; investigation, R.K.; resources, M.R. and H.B.; data curation, R.K.;

writing—original draft preparation, R.K. and R.P.; writing—review and editing, R.K. and R.P.; visualization, R.K.;

supervision, M.R., H.B., and R.P.; project administration, A.D., W.S., B.L., T.P., M.S., and R.P. All authors have read

and agreed to the published version of the manuscript.

Funding: This research received no external funding.

Conflicts of Interest: The authors declare no conflict of interest.


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access

article distributed under the terms and conditions of the Creative Commons Attribution

(CC BY) license (http://creativecommons.org/licenses/by/4.0/).