Investigations on MAC and Link Layer for a
wireless PROFIBUS over IEEE 802.11
by Diplom-Informatiker
Andreas Willig
from Berlin
Dissertation approved by Faculty IV - Electrical Engineering and Computer Science
of the Technische Universität Berlin
for the award of the academic degree
Doktor der Ingenieurwissenschaften
- Dr.-Ing. -
Doctoral committee:
Chair: Prof. Dr.-Ing. Günter Hommel
Reviewer: Prof. Dr.-Ing. Adam Wolisz
Reviewer: Prof. Dr.-Ing. Klaus David
Date of the scientific defense: 17 May 2002
Berlin 2002
D 83
Contents
1 Introduction 20
1.1 Structure of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2 Industrial Communication Systems 26
2.1 Requirements for Industrial Communication Systems / Field Buses . . . . . . . . . . . 27
2.2 Architectural Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3 Popular Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3.1 Systems using Token-Passing-MAC’s . . . . . . . . . . . . . . . . . . . . . . . . 29
2.3.2 Other Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.3.3 MAP/MMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3 Wireless LANs / IEEE 802.11 31
3.1 Wireless LANs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.1.1 Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.1.2 WLAN Properties Important for MAC Design . . . . . . . . . . . . . . . . . . 33
3.1.3 WLAN Standards / Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.2 IEEE 802.11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.2.2 PHY Layer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.2.3 MAC Layer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4 PROFIBUS and Wireless PROFIBUS 45
4.1 The PROFIBUS Fieldbus System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.1.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.1.2 Physical Layer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.1.3 Link Layer Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.1.4 MAC- and Data Link Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.1.5 Important Properties of the PROFIBUS . . . . . . . . . . . . . . . . . . . . . . 56
4.2 Wireless Industrial Communication Systems . . . . . . . . . . . . . . . . . . . . . . . . 58
4.2.1 Integrated Scenario: General Considerations . . . . . . . . . . . . . . . . . . . . 59
4.3 Integrated Scenario for Wireless PROFIBUS . . . . . . . . . . . . . . . . . . . . . . . 60
4.3.1 System under Study and Realtime Performance Measures . . . . . . . . . . . . 60
4.4 Wireless PROFIBUS over IEEE 802.11 MAC . . . . . . . . . . . . . . . . . . . . . . . 64
4.4.1 DCF-based approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.4.2 PCF-based approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.4.3 SRD service handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.4.4 Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.5 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.5.1 Overview of other Work in Wireless Fieldbus Systems . . . . . . . . . . . . . . 67
4.5.2 Real-Time Data Transmission with IEEE 802.11 . . . . . . . . . . . . . . . . . 69
5 Behavior of the PROFIBUS Protocol under Link Errors 72
5.1 PROFIBUS over Error Prone Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.1.1 Major Causes for Ring Instability . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.1.2 Ring Stability Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.1.3 Analytical PROFIBUS Ring Membership Model . . . . . . . . . . . . . . . . . 75
5.1.4 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.1.5 Improvements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.2 PROFIBUS over Wireless Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.3 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6 Error Behavior of Wireless Channels and its Modeling 104
6.1 Sources of Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.1.1 Path Loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.1.2 Multipath Fading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.2 Measurement Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.2.1 IEEE 802.11 / PRISM I PHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.2.2 Measurement Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.2.3 Format of the Generated Packet-Stream . . . . . . . . . . . . . . . . . . . . . . 112
6.3 Measurement Evaluation Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.3.1 Indicator Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.3.2 Trace Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.4 Measurement Parameters and Environment . . . . . . . . . . . . . . . . . . . . . . . . 115
6.4.1 Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
6.4.2 Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.5 Measurement Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.5.1 Packet-Related Phenomena . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.5.2 Packet Losses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
6.5.3 Positions of Bit Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
6.5.4 Mean Bit Error Rates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
6.5.5 Burst Length Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
6.5.6 Error Densities and Error Clustering . . . . . . . . . . . . . . . . . . . . . . . . 131
6.6 Review of other Measurement Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
6.7 Stochastic Modeling of Bit- and Packet-Errors . . . . . . . . . . . . . . . . . . . . . . . 137
6.7.1 Overview of Common Stochastic Models . . . . . . . . . . . . . . . . . . . . . . 137
6.7.2 Bipartite Model for Generating Indicator Sequences . . . . . . . . . . . . . . . 140
6.7.3 Comparison of Different Models . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.8.1 Summary of Measurement Results . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.8.2 Stochastic Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
6.8.3 Overall Channel Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
6.8.4 Consequences for Design of MAC and Link-Layer Protocols . . . . . . . . . . . 149
7 Polling Protocols for Wireless PROFIBUS 151
7.1 System under Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
7.1.1 Description of the System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
7.2 Polling-based Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
7.2.1 k-limited Round-Robin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
7.2.2 Functional Repolling Framework . . . . . . . . . . . . . . . . . . . . . . . . . . 160
7.2.3 Relaying . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
7.2.4 Adaptive Functional Repolling . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
7.3 Realtime-Performance Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
7.3.1 Method of Investigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
7.3.2 Comparison of Round-Robin with PROFIBUS . . . . . . . . . . . . . . . . . . 168
7.3.3 Comparison of the Polling-protocol Modifications with Round-Robin . . . . . . 179
7.4 Polling-based MACs for wireless PROFIBUS . . . . . . . . . . . . . . . . . . . . . . . 202
7.4.1 Integration Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
7.4.2 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
7.5 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
7.6 Discussion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
8 Conclusions and Outlook 209
A Main Characteristics of the Bipartite Model 212
A.1 Asymptotic Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
A.2 Distribution of Generated Burst Lengths . . . . . . . . . . . . . . . . . . . . . . . . . . 214
A.3 Correlation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
B Applicability of Simple FEC Schemes to the Measurement Traces 217
List of Figures
2.1 Hierarchy of information flows in manufacturing applications (from [68, p. 13]) . . . . 26
3.1 Hidden terminal scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.2 Exposed terminal scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3 DSSS PHY PPDU Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.4 DSSS PHY Schematic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.1 PROFIBUS protocol stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.2 Interactions of SDA service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.3 Interactions of SDN service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.4 Interactions of SRD service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.5 PROFIBUS frame formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.6 Integrated PROFIBUS LAN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.1 A single station's life cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.2 Logical structure of PROFIBUS simulation model . . . . . . . . . . . . . . . . . . . . 82
5.3 $\bar{N}(3600)$ and $\bar{N}$ vs. BER (independent errors) . . . . . . . . . . . . . . . . . . . . . . . 84
5.4 $\bar{M}(3600)$ and $\bar{M}$ vs. BER (independent errors) . . . . . . . . . . . . . . . . . . . . . . 84
5.5 $\bar{M}(3600)$ vs. BER (independent errors) . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.6 $\bar{N}(3600)$ vs. BER (independent errors) . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.7 Distribution function C(s) for different BER’s . . . . . . . . . . . . . . . . . . . . . . . 88
5.8 $\bar{N}(3600)$ vs. BI for m = 0.001 (Gilbert errors) . . . . . . . . . . . . . . . . . . . . . . . 88
5.9 N(t) vs. time (Gilbert errors) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.10 MSOT vs. gap factor (independent errors) . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.11 CSOT vs. gap factor (independent errors) . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.12 $\bar{M}(3600)$ vs. BER (independent errors) . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.13 $\bar{N}(3600)$ vs. BER (independent errors) . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.14 Sample coefficients of variation for N vs. BER (independent errors) . . . . . . . . . . . 95
5.15 $\bar{N}(3600)$ vs. BI for m = 0.001 (Gilbert errors) . . . . . . . . . . . . . . . . . . . . . . . 95
5.16 $\bar{N}(10000)$ vs. BER (independent errors) with 10 masters and 36% load . . . . . . . 96
5.17 $\bar{N}(10000)$ vs. BI for m = 0.001 (Gilbert errors) with 10 masters and 36% load . . . 96
5.18 $\bar{M}(10000)$ vs. BER (independent errors) with 4 and 10 masters and 36% load . . . 97
5.19 $\bar{M}(10000)$ vs. BI for m = 0.001 (Gilbert errors) with 4 and 10 masters and 36% load . 97
5.20 N(t) vs. time (Gilbert errors, both protocol improvements) . . . . . . . . . . . . . . . 98
5.21 $\bar{N}(3600)$ vs. PLR (independent packet losses) and no bit errors . . . . . . . . . . . . . 100
5.22 $\bar{M}(3600)$ vs. PLR (independent packet losses) and no bit errors . . . . . . . . . . . . . 101
5.23 $\bar{N}(3600)$ vs. PLR (independent packet losses) and BER of $10^{-3}$ . . . . . . . . . . . . . 102
5.24 $\bar{M}(3600)$ vs. PLR (independent packet losses) and BER of $10^{-3}$ . . . . . . . . . . . . . 103
6.1 Multipath fading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.2 Measurement setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.3 Position of our setup within the building . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.4 Setup of PTZ measurement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.5 Packet loss rate vs. trace number for longterm1 measurement . . . . . . . . . . . . . 120
6.6 Position of portal crane (0=close proximity, 1=short distance, 2=longer distance) for
longterm1-measurement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
6.7 Cumulative distribution functions of packet loss burst lengths and packet loss-free burst
lengths (for COMP-PLIS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
6.8 Conditional probability that packet n+k is lost given that packet n is lost (for
COMP-PLIS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
6.9 Autocovariance function of packet loss burst lengths for the compound loss sequence
and $k_0 = 1$ for longterm1 measurement (for COMP-PLIS) . . . . . . . . . . . . . . . 124
6.10 Autocovariance function of packet loss-free burst lengths for the compound loss sequence
and $k_0 = 1$ for longterm1 measurement (for COMP-PLIS) . . . . . . . . . . . . . . . 124
6.11 Positions of bit errors, factorial trace 83 (QPSK modulation, with scrambling, 6012
bytes packet size) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
6.12 Positions of bit errors, factorial trace 37 (5.5 MBit/s CCK modulation, no scrambling,
2016 bytes packet size) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
6.13 Mean bit error rate vs. trace number for remaining traces (logarithmic scale) . . . . . 127
6.14 Mean bit error rate vs. trace number for longterm1 measurement (logarithmic scale) 127
6.15 Mean error burst length vs. $k_0$ for selected BPSK traces BEIS (factorial measurement) 129
6.16 Mean error burst length vs. $k_0$ for selected QPSK traces BEIS (factorial measurement) 129
6.17 Mean error burst length vs. trace number for selected $k_0$ (for BEIS of longterm1
measurement) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
6.18 Coefficients of variation (CoV) of error burst lengths and error-free burst lengths vs.
$k_0$ for selected BPSK traces BEIS (factorial measurement) . . . . . . . . . . . . . . . 130
6.19 Coefficients of variation (CoV) of error burst lengths and error-free burst lengths vs.
$k_0$ for selected QPSK traces BEIS (factorial measurement) . . . . . . . . . . . . . . . 131
6.20 Cumulative distribution function for the error densities of all BPSK traces without
scrambling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.21 Cumulative distribution function for the error densities of all QPSK traces without
scrambling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
6.22 A sample bipartite Markov chain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
6.23 Conditional probability that packet n+k is erroneous given that packet n is erroneous 144
6.24 Conditional probability that packet n+k is erroneous given that packet n is erroneous 145
7.1 Logical structure of polling simulation model . . . . . . . . . . . . . . . . . . . . . . . 167
7.2 Overall confirmation delays $\widetilde{D_C}$ for the rrk-protocols and the original PROFIBUS
protocol vs. number of wireless terminals N for 10% low priority load, Gilbert-/Elliot error
model and different round-robin bounds k . . . . . . . . . . . . . . . . . . . . . . . . 170
7.3 Overall confirmation delays $\widetilde{D_C}$ for the rrk-protocols and the original PROFIBUS
protocol vs. number of wireless terminals N for 10% low priority load, Semi-Markov error
model and different round-robin bounds k . . . . . . . . . . . . . . . . . . . . . . . . 171
7.4 Overall confirmation delays $\widetilde{D_C}$ for the rrk-protocols and the original PROFIBUS
protocol vs. number of wireless terminals N for 10% low priority load, independent error
model and different round-robin bounds k . . . . . . . . . . . . . . . . . . . . . . . . 171
7.5 Overall confirmation delays $\widetilde{D_C}$ for the rrk-protocols and the original PROFIBUS
protocol vs. number of wireless terminals N for 10% low priority load, complex error model
and different round-robin bounds k . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
7.6 Fraction of time that all stations are members of the logical PROFIBUS ring over all
different gap factors and $T_{TTRT}$ values vs. station number N for different error models
(10% low priority load) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
7.7 Remaining bandwidth for low priority data for the rr-1 protocol and the PROFIBUS
protocol (best value over all parameters) $B_L$ vs. number of wireless terminals N for
50% low priority load and independent and complex error models . . . . . . . . . . . . 173
7.8 Overall confirmation delays $\widetilde{D_C}$ for the rrk-protocols and the original PROFIBUS
protocol vs. number of wireless terminals N for 50% low priority load, Gilbert-/Elliot error
model and different round-robin bounds k . . . . . . . . . . . . . . . . . . . . . . . . 175
7.9 Overall confirmation delays $\widetilde{D_C}$ for the rrk-protocols and the original PROFIBUS
protocol vs. number of wireless terminals N for 50% low priority load, Semi-Markov error
model and different round-robin bounds k . . . . . . . . . . . . . . . . . . . . . . . . 176
7.10 Overall confirmation delays $\widetilde{D_C}$ for the rrk-protocols and the original PROFIBUS
protocol vs. number of wireless terminals N for 50% low priority load, independent error
model and different round-robin bounds k . . . . . . . . . . . . . . . . . . . . . . . . 176
7.11 Overall confirmation delays $\widetilde{D_C}$ for the rrk-protocols and the original PROFIBUS
protocol vs. number of wireless terminals N for 50% low priority load, complex error model
and different round-robin bounds k . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
7.12 Fraction of time that all stations are members of the logical PROFIBUS ring over all
different gap factors and $T_{TTRT}$ values vs. station number N for different error models
(50% low priority load) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
7.13 Ratio of the overall confirmation delay for the rrk+spr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+spr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
Gilbert-/Elliot error model and different round robin bounds k . . . . . . . . . . . . . 182
7.14 Ratio of the overall confirmation delay for the rrk+spr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+spr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
Gilbert-/Elliot error model and different round robin bounds k . . . . . . . . . . . . . 182
7.15 Ratio of the remaining bandwidth for low priority data for the rrk+spr protocol and
the rrk protocol $B_L(\mathrm{rrk+spr}) / B_L(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low
priority load, Gilbert-/Elliot error model and different round robin bounds k . . . . . 183
7.16 Ratio of the overall confirmation delay for the rrk+spr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+spr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
Semi-Markov error model and different round robin bounds k . . . . . . . . . . . . . . 183
7.17 Ratio of the overall confirmation delay for the rrk+spr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+spr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
Semi-Markov error model and different round robin bounds k . . . . . . . . . . . . . . 184
7.18 Ratio of the overall confirmation delay for the rrk+spr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+spr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
independent error model and different round robin bounds k . . . . . . . . . . . . . . 184
7.19 Ratio of the overall confirmation delay for the rrk+spr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+spr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
independent error model and different round robin bounds k . . . . . . . . . . . . . . 185
7.20 Ratio of the overall confirmation delay for the rrk+spr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+spr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
complex error model and different round robin bounds k . . . . . . . . . . . . . . . . 185
7.21 Ratio of the overall confirmation delay for the rrk+spr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+spr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
complex error model and different round robin bounds k . . . . . . . . . . . . . . . . 186
7.22 Ratio of the overall confirmation delay for the rrk+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
Gilbert-/Elliot error model and different round robin bounds k . . . . . . . . . . . . . 188
7.23 Ratio of the overall confirmation delay for the rrk+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
Gilbert-/Elliot error model and different round robin bounds k . . . . . . . . . . . . . 188
7.24 Ratio of the overall confirmation delay for the rrk+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
Semi-Markov error model and different round robin bounds k . . . . . . . . . . . . . . 189
7.25 Ratio of the overall confirmation delay for the rrk+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
Semi-Markov error model and different round robin bounds k . . . . . . . . . . . . . . 189
7.26 Ratio of the overall confirmation delay for the rrk+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
independent error model and different round robin bounds k . . . . . . . . . . . . . . 190
7.27 Ratio of the overall confirmation delay for the rrk+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
independent error model and different round robin bounds k . . . . . . . . . . . . . . 190
7.28 Ratio of the overall confirmation delay for the rrk+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
complex error model and different round robin bounds k . . . . . . . . . . . . . . . . 191
7.29 Ratio of the overall confirmation delay for the rrk+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{rrk+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
complex error model and different round robin bounds k . . . . . . . . . . . . . . . . 191
7.30 Ratio of the overall confirmation delay for the frk protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
Gilbert-/Elliot error model and different round robin bounds k . . . . . . . . . . . . . 193
7.31 Ratio of the overall confirmation delay for the frk protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
Gilbert-/Elliot error model and different round robin bounds k . . . . . . . . . . . . . 193
7.32 Ratio of the overall confirmation delay for the frk protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
Semi-Markov error model and different round robin bounds k . . . . . . . . . . . . . . 194
7.33 Ratio of the overall confirmation delay for the frk protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
Semi-Markov error model and different round robin bounds k . . . . . . . . . . . . . . 194
7.34 Ratio of the overall confirmation delay for the frk protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
independent error model and different round robin bounds k . . . . . . . . . . . . . . 195
7.35 Ratio of the overall confirmation delay for the frk protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
independent error model and different round robin bounds k . . . . . . . . . . . . . . 195
7.36 Ratio of the overall confirmation delay for the frk protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
complex error model and different round robin bounds k . . . . . . . . . . . . . . . . 196
7.37 Ratio of the overall confirmation delay for the frk protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
complex error model and different round robin bounds k . . . . . . . . . . . . . . . . 196
7.38 Ratio of the overall confirmation delay for the frk+spr+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk+spr+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
Gilbert-/Elliot error model and different round robin bounds k . . . . . . . . . . . . . 198
7.39 Ratio of the overall confirmation delay for the frk+spr+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk+spr+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
Gilbert-/Elliot error model and different round robin bounds k . . . . . . . . . . . . . 198
7.40 Ratio of the overall confirmation delay for the frk+spr+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk+spr+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
Semi-Markov error model and different round robin bounds k . . . . . . . . . . . . . . 199
7.41 Ratio of the overall confirmation delay for the frk+spr+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk+spr+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
Semi-Markov error model and different round robin bounds k . . . . . . . . . . . . . . 199
7.42 Ratio of the overall confirmation delay for the frk+spr+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk+spr+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
independent error model and different round robin bounds k . . . . . . . . . . . . . . 200
7.43 Ratio of the overall confirmation delay for the frk+spr+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk+spr+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
independent error model and different round robin bounds k . . . . . . . . . . . . . . 200
7.44 Ratio of the overall confirmation delay for the frk+spr+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk+spr+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 10% low priority load,
complex error model and different round robin bounds k . . . . . . . . . . . . . . . . 201
7.45 Ratio of the overall confirmation delay for the frk+spr+sdr protocol and the rrk protocol
$\widetilde{D_C}(\mathrm{frk+spr+sdr}) / \widetilde{D_C}(\mathrm{rrk})$ vs. number of wireless terminals N for 50% low priority load,
complex error model and different round robin bounds k . . . . . . . . . . . . . . . . 201
List of Tables
3.1 Maximum allowed transmit powers in different regions . . . . . . . . . . . . . . . . . . 39
4.1 Relation between bitrate and distance for the RS-485 PHY and cable type A . . . . . 48
4.2 PROFIBUS FDL-Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.1 Fixed parameters common for the analytical model and the simulation model . . . . . 83
5.2 Fixed parameters for ring stability simulations . . . . . . . . . . . . . . . . . . . . . . 83
5.3 Fixed parameters for ring re-inclusion simulations . . . . . . . . . . . . . . . . . . . . . 86
5.4 Fixed parameters for ring stability simulations over wireless channel . . . . . . . . . . 99
6.1 Path loss exponents for different environments . . . . . . . . . . . . . . . . . . . . . . 108
6.2 Adjustable radio parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
6.3 Adjustable packet stream parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
6.4 Fixed parameters for longterm1 and longterm2 measurements . . . . . . . . . . . . 118
6.5 Fixed parameters for factorial measurement . . . . . . . . . . . . . . . . . . . . . . . 118
6.6 Variable parameters for factorial measurements . . . . . . . . . . . . . . . . . . . . . 118
6.7 First order statistics of compound packet fate sequence of longterm1 measurement
(BL: burst length) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
6.8 First order statistics of burst lengths (BL) of the longterm1 measurement
(COMP-PLIS, $k_0 = 1$) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
6.9 Number of lost packets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
6.10 Mean Bit Error Rates for different modulation types (factorial measurement) . . . . 126
6.11 Burst lengths of error burst with density one for QPSK and BPSK traces . . . . . . . 131
6.12 Summary statistics of trace 24 (EBL=error burst length, EFBL=error-free burst length,
MBER=mean bit error rate) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
6.13 Transmission times for 1 GB data over channels with different error models (based on
factorial trace 24) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
7.1 Summary statistics of selected traces constituting the complex error model (EBL=error
burst length, EFBL=errorfree burst length, MBER=mean bit error rate, PLR=packet
loss rate, PLB=packet loss burst) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
7.2 Different frametypes of polling-based protocols, with minimum parameters, x is either
SDR or SPR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
7.3 Simulation parameters for comparison between PROFIBUS and k-limited round robin 169
7.4 Maximum number of negative confirmations over all N's for the different error models
(PROFIBUS protocol, $T_{TTRT}$ = 0.005 s, gap factor = 1) . . . . . . . . . . . . . . . . . 169
7.5 Common simulation parameters for performance comparison of the different protocol
modifications rrk+x vs. k-limited round robin . . . . . . . . . . . . . . . . . . . . . . . 180
7.6 Repoll function set for the frk protocols, for all functions additionally fx(i) = 1 if
i > max retry holds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
7.7 execute-explicit-poll frame statistics for rrk+spr, independent error model, N=
20, k= 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
B.1 Mean packet error rate (PER) and max. PER for QPSK and BPSK modulation and
different scrambling modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
List of Acronyms
CAN Controller Area Network Small scale fieldbus for automotive applications
PROFIBUS PROcess FIeld BUS
PROFIBUS-DP PROFIBUS Decentralized Periphery
PHY Physical layer
FDL Fieldbus Data Link
MAC medium access control
FMS Fieldbus Message Specification PROFIBUS application layer services
MMS Manufacturing Message Specification
MAP Manufacturing Automation Protocol
LAN local area network
WLAN wireless local area network
NRZ non-return to zero
STP shielded twisted pair
FIFO first in first out
SAP service access point
SDA send data with acknowledge
SDN send data with no acknowledge
SRD send and receive data
CSRD cyclic send and receive data
DSAP destination service access point
SSAP source service access point
LAS list of active stations
GAPL gap list
FCB frame count bit
FCV frame count valid bit
SC short acknowledgement
DA destination address
SA source address
FC frame control
FCS frame check sequence
FIP factory instrumentation protocol
TDMA time division multiple access
DSSS direct sequence spread spectrum
FHSS frequency hopping spread spectrum
DCF distributed coordination function
PCF point coordination function
QoS quality of service
DTMC discrete time markov chain
BER bit error rate
MBER mean bit error rate
PER packet error rate
PLR packet loss rate
BI burstiness index
MSOT mean station outage time
CSOT cumulated station outage time
FEC forward error correction
LOS line of sight
NLOS non line of sight
BPSK binary phase shift keying
QPSK quaternary phase shift keying
DBPSK differential binary phase shift keying
DQPSK differential quaternary phase shift keying
BMBOK binary m-ary bi-orthogonal keying
QMBOK quaternary m-ary bi-orthogonal keying
CCK complementary code keying
ISM industrial, scientific and medical
CoV coefficient of variation
EDM electrical discharge machine
NIC network interface card
BEIS bit error indicator sequence
PLIS packet loss indicator sequence
COMP-PLIS compound packet loss indicator sequence
PEIS packet error indicator sequence
OSI Open Systems Interconnection
ICS Industrial Communication System
MMPP Markov Modulated Poisson Process
OFDM Orthogonal Frequency Division Multiplexing
MSDU MAC service data unit
PDU protocol data unit
AP access point
ARQ automatic repeat request
Acknowledgements
This work was carried out during my stay as a research and teaching assistant at TU Berlin. It would
not have been possible without the help, understanding and friendship of several persons.
First of all I would like to thank Prof. Adam Wolisz, who not only served as a reviewer for my thesis,
but was also a very fair, helpful and humorous “boss” during the last six-and-a-half years. He provided
me not only with insightful discussions, but also with the necessary degree of freedom. I learned a lot
from him.
I am very grateful to Prof. Klaus David, who agreed to review my thesis on a very short-term basis.
The Deutsche Forschungsgemeinschaft (DFG) supported parts of my work during the last two years.
There are three persons who carried out work helpful for my thesis. Christian Hoene and Martin
Kubisch put much effort into the measurement hardware and software, while Andreas Köpke helped me
a lot with performing simulations.
For the tedious effort of proofreading and critiquing parts of my thesis I have to thank the following
persons: Holger Karl, Morten Schläger, Helen J. Wang and Adam Wolisz. Specifically Holger Karl
has put much effort into my thesis and into earlier papers and reports.
I had the pleasure to share much of my work and life during the last years with my dear colleagues
Morten Schläger, Jean-Pierre Ebert, Berthold Rathke, Holger Karl, and Petra Hutt, who were a
constant source of helpful discussions, practical help in certain situations, spontaneous chats, and
several kinds of entertainment. We spent much time together and I consider them as friends.
This work would not have been possible without the support of my family and friends, who helped
me over certain difficult situations by just being there. I would like to name here my mom Angela
Hauschildt and my sister Christiane Reiners, and my closest friends Klaus Müller and Christoph
Volkmann. I am also indebted to my former wife Bärbel Peters. I still have many reasons to be
grateful to her. Helen Wang provided me with lots of motivation during the last stage of my thesis.
Berlin, in October 2001
Andreas Willig
Summary
Fieldbus systems are a specific class of LAN technology for industrial and factory applications, targeted
for fulfilling hard (safety-critical) realtime requirements in harsh environments. Many of these appli-
cations involve mobile subsystems and could benefit from recent wireless LAN technologies replacing
the current cable-based systems.
The IEEE 802.11 wireless LAN standard is currently the leading WLAN technology, and many components
are commercially available. Hence, an immediate question is if and how this technology could be used
for wireless fieldbus systems. An important aspect of this question is how to provide good “realtime
performance” over the error-prone and time-variable wireless link. The notion of realtime performance
captures the hard timing and reliability requirements of industrial applications.
This thesis answers this question for the PROFIBUS, a fieldbus system popular in Europe. As a final
goal, it follows the vision of integrating wired and wireless stations into a single PROFIBUS LAN.
The scope of the investigations is focused on the MAC and link layer, where the PROFIBUS employs
a token-passing protocol on top of a broadcast medium and the stations form a logical ring. The
results reported in this thesis suggest using a specifically tailored MAC protocol on top of the 802.11
physical layer, and dropping the 802.11 MAC protocol as well as the PROFIBUS MAC protocol. One
obvious candidate, namely the 802.11 MAC protocol, could not be matched well with the PROFIBUS
services, and the other obvious candidate, the PROFIBUS protocol, has problems with the stability
of the logical ring over the error-prone wireless medium, even after introducing some modifications.
The choice of dropping the PROFIBUS MAC protocol complicates integration of wired and wire-
less stations in a single LAN (where clearly the wish is to leave the wired station’s protocol stack
unchanged), but offers the opportunity to look for MAC protocols with better realtime performance.
In this thesis the class of polling-based protocols is proposed as a candidate. It is shown that even
a very simple k-limited round-robin variant often achieves much better realtime performance than
the PROFIBUS protocol. Furthermore, the performance of round-robin can be improved. Three
modifications of round-robin are proposed, which are designed to cope with the error characteristics
of a wireless link as found in measurements in an industrial environment. However, the much
better realtime performance of these protocols comes at the price of slightly modified semantics of
the PROFIBUS link layer services.
The overall contribution of this thesis is to provide the first steps towards a wireless PROFIBUS
integrating wired and wireless stations.
Zusammenfassung (German Summary)
So-called fieldbus systems are a special class of local area networks (LANs) aimed in particular at
industrial applications. These applications are characterized by hard realtime constraints: safety-critical
messages, e.g., alarms, must be transmitted reliably within a maximum time. In addition, fieldbus
systems are often deployed in harsh environments. Many industrial applications have mobile subsystems
and can thus benefit from current wireless network technologies (wireless LANs, WLANs), which
support mobility in a natural way, in contrast to the existing cable-based technologies.
The IEEE 802.11 wireless LAN standard is currently the leading WLAN technology. The standard
is mature, and complete systems and components are commercially available. It is therefore
natural to ask if and how this technology can be used for wireless fieldbus systems. A central question
is how the best possible “realtime performance” can be achieved over the wireless medium despite
high error rates and time-variable error behavior. The notion of realtime performance captures both
the timing and the reliability aspects of transmitting safety-critical data.
This dissertation addresses this question for the PROFIBUS, a fieldbus system widely used in Germany
and Europe. The long-term goal, towards which this work takes the first steps, is the integration of
wireless and wired stations in a single PROFIBUS LAN. The focus is on the medium access and data
link layers of the OSI reference model, where the PROFIBUS employs a token-passing protocol on a
broadcast transmission medium and the attached stations form a logical ring. The results presented
suggest using a medium access and link-layer protocol specifically tailored to the (error) properties
of the wireless medium, and not considering the existing protocols, namely the PROFIBUS
token-passing protocol and the IEEE 802.11 access protocol. The 802.11 access protocol does not offer
sufficient support for implementing the services of the PROFIBUS link layer. The PROFIBUS protocol,
in turn, has considerable problems with the stability of the logical ring over an error-prone or
wireless medium, even after introducing suitable protocol modifications.
The design decision to replace the PROFIBUS access protocol by another one complicates the
integration of wireless and wired stations in one PROFIBUS LAN, in particular under the constraint
that the protocol stack of the wired stations should remain unchanged. Conversely, it also enables
the search for protocols with better realtime performance than the PROFIBUS protocol offers.
This work identifies the class of polling-based protocols as a promising candidate. It is shown that
even a simple k-limited round-robin protocol often achieves considerably better realtime performance
than the PROFIBUS protocol; the difference is up to one order of magnitude. Furthermore, three
modifications of the round-robin protocol are presented with which the realtime performance can be
increased further. The design of these modifications took into account the specific (error) properties
of wireless media as observed in measurements in an industrial environment. However, the improved
realtime performance of the modified protocols comes at the cost of slightly changed semantics of
the PROFIBUS link-layer services.
Overall, this work provides the first steps towards a PROFIBUS system in which wireless and wired
stations are integrated.
Chapter 1
Introduction
In the past 20 years we have seen rapid advances in local area network (LAN) technologies, going
hand in hand with an ever increasing number of application areas. One specific field of application is
distributed control systems in industrial environments, e.g., in machine plants. Compared to office
environments, these applications have different requirements. First, the environmental conditions
are often harsh. For example, explosive environments or environments with aggressive chemicals, strong
motors, drives, robots, controllers, and many more devices capable of creating electromagnetic noise
and strong magnetic fields are frequently found. Second, in distributed control systems a multitude of
intelligent controllers exist. These are jointly responsible for controlling and monitoring an industrial
process. Hence, these controllers have to communicate with each other and with lower level devices
like sensors and actuators. These devices constitute the controllers' interface to the underlying physical
process. For communication between controllers and low-level devices, hard realtime requirements
are often posed: the correctness of a message transmission depends not only on its integrity, but also on
timely and reliable/acknowledged delivery before a certain deadline. In hard realtime systems a task
or a message missing a deadline can lead to harm to persons or damage to material (accordingly, we speak
of safety-critical messages), or at least to an interruption of the industrial process. As an example,
consider a milling machine in which the pressure in the cooling water tube drops rapidly because a
program malfunction has damaged the tube. The machine should be stopped within a certain time to avoid
severe damage. In contrast to this, with soft realtime requirements a message missing a deadline may
disturb a user, but does not lead to real damage.
For interconnecting controllers and low-level devices, LAN technology is often used. However, instead
of using popular technologies such as Ethernet or Token-Ring, people resort to a special class of LANs
designed to meet hard realtime requirements and to survive in harsh environments. These systems
are called fieldbuses, and for even more rigid requirements on communications (isochronous data and
response times in the microsecond range) they are called sensor/actuator buses [128], [30], [100], [127].
With respect to the Open Systems Interconnection (OSI) reference model [166, chap. 1.4], most
fieldbus systems have the distinguishing property that only layers 1 (physical), 2 (medium access
control (MAC) and data link) and 7 (application) are covered; all other layers are empty. Fieldbus
applications are usually not considered to need internetworking capabilities.
Another class of LANs, which has attracted much attention in the last ten years, is the wireless local area
network (WLAN) technology. Users appreciate the obvious advantages WLAN technology can offer:
mobility and reduced need for cabling. Nowadays, the IEEE 802.11 WLAN standard is the leading
wireless technology. It is standardized, offers comparatively high bit rates of up to 11 MBit/s, makes
use of license-free bands, and off-the-shelf components are available.
Much of the research effort spent on 802.11 and on WLANs in general centers around the question of
how to overcome the sometimes bad, unpredictable, and time-variable error behavior of wireless
transmission. As opposed to today's wired LANs, transmission errors cannot be considered a
rare exception. Instead, they are a serious problem, and total transmission outages lasting for several
seconds are frequently observed (see Chapter 6).
It is obvious that the advantages of WLAN technology would also be desirable for industrial applica-
tions. Some examples are [52, chap. 2]:
- Applications with mobile subsystems, e.g., autonomous transport vehicles, robots, turntables.
- Implementation of distributed control systems in explosive areas (where sparks created by cables
  or adaptors have to be avoided) or in the presence of aggressive chemicals capable of damaging
  cables.
- Rapid prototyping of industrial plants without putting much effort into cabling.
- Mobile machine plant diagnosis systems, and wireless stations for programming and configuring.
Hence, a basic question is if and how the leading 802.11 WLAN technology can be used for a wireless
fieldbus. A major challenge is clearly to satisfy the hard realtime requirements even for the case of
bad and unpredictable transmission errors.
In this thesis we demonstrate that using the popular IEEE 802.11 direct sequence spread spectrum
(DSSS) Physical layer (PHY) there is no hope for implementing hard, i.e., deterministic guarantees
of acknowledged delivery of safety-critical messages within a prescribed time-bound. A wireless link
showing total transmission outages in the timescale of several seconds cannot be expected to allow for
reliable message transmission within a desired timescale of milliseconds, as often required by fieldbus
applications. Furthermore, due to the physical mechanisms leading to transmission errors, we expect
that the sometimes high error rates and the variability of the error behavior will not disappear when
using comparable wireless transmission technologies. Hence, many of the problems which occur with
802.11 will also occur with other wireless technologies.
Therefore, instead of seeking absolute guarantees for wireless fieldbuses¹, this thesis follows an ap-
proach which could somewhat sloppily be called stochastic hard realtime: the percentage of safety-
critical messages which can be transmitted reliably (i.e., acknowledged) within a prespecified time
bound should be as high as possible (>99.x%), even at the cost of other performance measures like
throughput or mean delay. We define a set of performance measures, called the realtime performance
measures, which capture the timeliness and reliability requirements. Of course, this approach limits
the application areas of a wireless fieldbus: when deterministic guarantees in the range of 10 to 100
msec are essential (e.g., cooling water control in a nuclear power plant), wireless fieldbuses are ruled
out. However, in applications where occasional emergency stop conditions due to message losses
are tolerable, wireless fieldbuses can offer their potential.
¹ From now on we will use the term “wireless fieldbus” as an abbreviation for “a wireless fieldbus based on IEEE
802.11 DSSS-compliant technology”.
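The stochastic hard realtime idea can be illustrated with a small Python sketch. It computes one possible measure of this kind, the fraction of safety-critical messages acknowledged within a given time bound. The names `Message` and `on_time_fraction` and the sample log are hypothetical and chosen only for illustration; they are not the realtime performance measures defined later in this thesis.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Message:
    created: float            # time the message was handed to the link layer (s)
    acked: Optional[float]    # time the acknowledgement arrived, None if never


def on_time_fraction(messages: Sequence[Message], deadline: float) -> float:
    """Fraction of messages acknowledged within `deadline` seconds of creation."""
    if not messages:
        raise ValueError("need at least one message")
    on_time = sum(
        1 for m in messages
        if m.acked is not None and m.acked - m.created <= deadline
    )
    return on_time / len(messages)


# Hypothetical log: two messages meet a 50 ms bound, one is late, one is lost.
log = [
    Message(0.000, 0.004),
    Message(0.010, 0.090),
    Message(0.020, None),
    Message(0.030, 0.035),
]
print(on_time_fraction(log, deadline=0.050))
```

A message that is never acknowledged counts as a miss, in line with the requirement that delivery be both timely and reliable.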
This thesis focuses on the PROcess FIeld BUS (PROFIBUS), a standardized and well-known fieldbus.
This system is quite popular in Europe; there are many products and applications.² Due to its
widespread usage it is attractive to have a wireless extension of PROFIBUS, i.e., a system where
wired stations and wireless stations can run jointly in a single PROFIBUS LAN.
Now we can state the central question this thesis focuses on:
How and to what extent can the IEEE 802.11 technology be used to create a wireless
PROFIBUS extension with the best possible realtime performance? What are good MAC-
and link-layer protocols for this?
The MAC and link-layer protocol is in general a key issue for performing hard realtime communica-
tions: if these layers are not able to give tight timing and reliability guarantees (medium access
time), this can hardly be corrected by other protocol layers.
The approach taken in this thesis is to keep the PHY layer fixed (to the IEEE 802.11 DSSS PHY),
and also the PROFIBUS link-layer interface. The first decision exploits the fact that IEEE 802.11
technology is available, stable, and cheap. The latter decision allows easy porting of applications and
application-layer instances (as is typical for fieldbus systems, the PROFIBUS specification covers only
layers 1, 2, and 7 of the OSI reference model). Within this range several alternatives are explored:
- The PROFIBUS MAC and link-layer protocol (a semi-reliable protocol with an underlying token-
  passing scheme on top of a broadcast medium) runs directly on the 802.11 DSSS PHY. As one
  contribution of this thesis we show that the PROFIBUS protocol, even with some modifications,
  has serious problems regarding its realtime behavior when transported over error-prone, wireless
  type links (Chapter 5). This finding also applies to the case where all PROFIBUS stations have
  a wired transceiver and a piece of cable is replaced by a wireless link. Here, too, the token-passing
  protocol has to be transported over a wireless link, which should be avoided.
- Find a mapping between the PROFIBUS link-layer services and the IEEE 802.11 MAC protocol,
  which in some parts is designed with realtime services in mind (Chapter 3). However, MAC
  services and link-layer services do not match well.
- Look for a specifically tailored MAC and link-layer protocol for the wireless side, which, however,
  could be integrated with the existing PROFIBUS protocol. This integration is important in order
  to run wired and wireless stations in a single LAN without changing the wired stations' protocol
  stacks.
The design space for the alternative using specifically tailored protocols has, to the best of the author's
knowledge, not been explored so far. It includes architecture and protocol stack, physical layer, MAC
and link layer, mobility, power saving, security and authentication, management and configuration
issues, and applications. Out of this design space, this thesis makes the following contributions, focused
on the design of MAC and link-layer protocols:
- We identified some problems in the PROFIBUS MAC and link-layer protocol when faced with a
  lossy link. We propose incremental improvements to the protocol, but the delay results
  obtained for safety-critical messages are still unsatisfactory. This motivated our investigation of
  entirely different approaches.
² The WWW site of the PROFIBUS user organization (www.profibus.com) gives a number of 200,000 installations.
- We advocate the class of polling-based protocols as an attractive candidate for a wireless
  PROFIBUS MAC and link-layer protocol, when the goal is to maximize the probability of successful
  delivery of safety-critical messages within time.
- The thesis provides a characterization and stochastic modeling of a wireless link using real-world traces
  taken in an industrial environment. The results are used as a design input for the polling-based
  MAC and link-layer protocols. Furthermore, they are used for parameterizing existing stochastic
  error models and as a motivation for developing a new class of stochastic error models. This
  class offers better modeling accuracy than well-known stochastic models at a moderate
  increase in model complexity.
- Some approaches for polling-based protocols are presented, taking the findings from the mea-
  surements and the corresponding stochastic error models into account. Starting from a baseline
  protocol (k-limited round robin), three additional protocol mechanisms are discussed and com-
  pared with respect to their realtime performance. It is shown that the round-robin
  protocol alone already significantly outperforms the PROFIBUS protocol under many circumstances.
  Furthermore, it is shown that the realtime performance delivered by k-limited round robin
  can be improved significantly with the proposed modifications. All this together justifies the
  recommendation given in this thesis to drop the PROFIBUS protocol on the wireless link and
  replace it by another one.
Polling is desirable for the following reasons:
- In the PROFIBUS protocol, LAN membership depends critically on permanent and reliable
  transmission of specific control frames. Loss of these frames causes undesirable loss of ring
  members. This is critical, since only ring members are allowed to transmit data. In contrast,
  in the class of polling-based protocols considered here, after successful registration at a central
  controller LAN membership is not an issue.
- A polling approach frees the stations from competing for the right to transmit data frames or
  to make reservations with a central scheduler, while on the other hand being more efficient than
  strict TDMA. In many wireless MAC protocols all stations share a single (or a few) logical
  channels for transmitting data or reservations, introducing the danger of collisions. In polling-based
  systems, one logical channel is used per station, eliminating collisions a priori. This is appealing
  for systems where the focus is not on optimizing throughput, but on optimizing the probability
  of successful delivery of alarm messages.
- In general, bandwidth assignment by a central scheduler can be more efficient than decentral-
  ized approaches, since potentially more knowledge about pending requests is available at the
  scheduler.
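As a rough illustration of why polling avoids collisions, the following Python sketch shows a central controller serving stations in cyclic order. It assumes the common meaning of "k-limited" in the polling literature, namely that a station may send at most k frames per visit, and it ignores channel errors, retransmissions and timing. The function name and the data layout are hypothetical; the protocols actually studied are defined in Chapter 7.

```python
from collections import deque


def k_limited_round_robin(queues, k, rounds):
    """Poll stations cyclically; each visit serves at most k frames.

    queues: list of deques, one per station, holding pending frames.
    Returns the list of (station, frame) pairs in service order.
    """
    served = []
    for _ in range(rounds):
        for station, queue in enumerate(queues):
            for _ in range(k):
                if not queue:
                    break  # station has nothing (more) to send
                served.append((station, queue.popleft()))
    return served


# Three stations; station 1 is idle and is simply skipped after its poll.
queues = [deque(["a1", "a2", "a3"]), deque(), deque(["c1"])]
order = k_limited_round_robin(queues, k=1, rounds=3)
print(order)
```

Only the station currently polled transmits, so no two stations ever contend for the channel. The limit k bounds how long a busy station can delay the others.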
Clearly, polling schemes also have some disadvantages. A single point of failure is introduced, which
needs to be addressed by introducing redundancy. Furthermore, scalability concerns immediately
come to mind. However, these pose no problem, since distributed control systems tend to have only
small to medium numbers of stations.
1.1 Structure of the Thesis
Chapter 2 gives a general introduction to the topic of field buses and industrial communication systems:
their requirements, architecture, traffic characteristics, popular systems.
Chapter 3 focuses on wireless LAN technology and 802.11. After a brief introduction to the general
topic of WLANs (including some important properties of wireless media relevant for MAC design),
we portray some existing systems. Following this, a more detailed overview of the IEEE 802.11
WLAN standard is given, including its architecture, the different physical layers and both the DCF
and PCF MAC layers. The properties of the IEEE 802.11 DSSS PHY are of immediate importance,
since they serve as the basis for the work on wireless link error characterization and the design of the
polling protocols.
In Chapter 4 we describe the PROFIBUS system in detail, including all the important protocol aspects,
a description of the link layer services including the link layer interface, and a summary of the most
important properties of the PROFIBUS protocol w.r.t. realtime behavior. After this, some general
issues for wireless industrial communication systems are discussed, which are then specialized to the
case of a wireless PROFIBUS. An important issue is a general description of the system under study.
Of special importance is the definition of the realtime performance measures, capturing the realtime
and reliability requirements. An overview of related work on wireless fieldbus systems and wireless
PROFIBUS systems follows. This chapter also discusses the approach of finding a mapping between the
PROFIBUS link-layer services and the IEEE 802.11 MAC services (and protocol). It turns out that
the two do not match well.
In Chapter 5 the behavior of the classical PROFIBUS MAC and link layer protocol, when operated
over different types of error-prone links, is investigated. Two different cases are distinguished. In
the first case the protocol is operated in a “cable-like” (yet error-prone) environment, i.e., the kind
of environment for which the PROFIBUS was designed. The results indicate that the protocol was
not designed with high error rates in mind. The second case is that of a “wireless-like” medium,
where no immediate feedback information from the channel is available and where, besides simple bit
errors, the loss of whole packets is an issue. It turns out that in both cases the PROFIBUS protocol
has serious problems with keeping all stations in a state of being LAN members. These problems
can be attributed to the need for explicit LAN membership / ring membership maintenance using
special control frames. Two modifications of the protocol and its parameters are proposed, which give
significant improvements; however, the ring membership problem remains serious. This is taken as a
motivation to look for alternative protocol approaches, which do not require permanent exchange of
control frames for LAN membership maintenance.
Chapter 6 is devoted to a more precise characterization of the error behavior of a wireless link. After a
brief review of the physical mechanisms leading to distorted waveforms, and hence transmission errors,
the most important results of bit- and packet error measurements taken in an industrial environment
are reported. These results can serve as an input for the design of wireless MAC protocols for
industrial applications. A second usage of these results is the “real-world data” parameterization of
stochastic packet level error models, which are an important part of simulation models for assessing
the performance of wireless MAC protocols. Several models are presented, and the class of bipartite
models is introduced, giving good accuracy in predicting relevant performance measures at a moderate
computational complexity.
In Chapter 7 we take the results of the measurements as input for the design of some concrete polling
algorithms. After a description of the system under study and the measures of interest (the “realtime
performance measures”) the realtime performance of the algorithms is evaluated and compared with
a baseline algorithm, a k-limited round robin algorithm. Furthermore, the algorithms are compared
with the classical PROFIBUS protocol, both operated with the same load and channel models. This
comparison allows us to check the basic claim that polling algorithms can give better realtime perfor-
mance.
Finally, in Chapter 8 the conclusions and an outlook on possible research directions are given.
Chapter 2
Industrial Communication Systems
In this chapter we give an introduction to the notion and some basic architectural characteristics of
industrial communication systems and field buses, including a brief overview on existing systems. We
discuss the set of requirements for field buses and industrial communication systems, which make
them distinct from other types of LANs.
Industrial communication systems and fieldbuses are only a single, yet important cornerstone in the
process of making production plants more flexible and yielding a higher degree of integration of
machines and tools from different vendors. In the eighties and in the beginning of the nineties the
Computer Integrated Manufacturing (CIM) concept was a major stream of this process [146].
[Figure 2.1 shows a hierarchy of five layers, from top to bottom: Planungsebene / Enterprise Layer
(business data processing); Leitebene / Production Control Layer (production control computer);
Führungsebene / Cell Control Layer (cell control computer); Steuerungsebene / Process Control Layer
(controllers: RC, PLC, CNC, E/A (I/O) systems); Sensor-Aktuator-Ebene / Sensor-Actuator Layer
(field devices: sensors, actuators, drives). Axes: typical response time requirement (minutes, seconds,
milliseconds, microseconds) and typical data volume requirement (megabytes, kilobytes, hundreds of
bytes, bytes).]
Figure 2.1: Hierarchy of information flows in manufacturing applications (from [68, p. 13])
The trend towards distributed control systems posed, amongst others, the problem of proper commu-
nications between devices of different vendors, using different protocols, media, and running different
applications. An established way of grouping the different types of manufacturing applications and
their needs for communications is to place them in a hierarchy of information flows, such that similar
functions and similar data are grouped into the same layer. An example, taken from [68, p. 13] is shown
in Figure 2.1. Although this distinction is not really sharp, it can be said that the boundary between
the cell control layer and the process control layer is targeted by field buses, while the lower boundary
between the process control layer and the sensor/actuator layer is targeted by sensor/actuator buses.
Some general references for field buses and realtime MAC protocols are [128, 30, 13, 32, 127, 94].
2.1 Requirements for Industrial Communication Systems /
Field Buses
The requirements for industrial communication systems / field buses are much different from those
for LANs in office environments. Clearly, they depend on the targeted application area, but some
requirements are typical.
For this thesis the most important requirements are those regarding timing and reliability behavior.
Fieldbuses are targeted at hard realtime requirements [100, chap. 2], [153]. In hard realtime
communication systems there exists a class of messages with timing constraints. If a message belonging
to this class is not transmitted successfully within a certain time bound, this is a system malfunction,
possibly leading to a catastrophe. Consider as an (extreme) example an alarm message generated
by a pressure sensor in the cooling water circulation system of a nuclear power plant. In contrast to
this, for soft realtime systems missing a timing constraint may lower the system's usability, but does
not compromise the system's integrity. As an example, consider an online transaction system, where
users expect their answers within one second, but do not get too annoyed if it occasionally takes two
seconds. As another example, for packet-based speech conversations like in Voice over IP systems,
there are stringent delay requirements (the delay is desired to be at most 250 ms), but a certain packet
loss rate (typically 1%) seems tolerable, depending on the codec and the influence of error concealment
techniques [65, chap. 7].
The requirements for hard realtime communications can be summarized as follows (compare [127]):
• Safety-critical messages must be transmitted reliably (i.e., acknowledged) within a bounded time.
  The time bound is application-dependent. For cell control and process control applications it
  is often in the range of 1-100 ms (compare Figure 2.1). If the time bound cannot be met,
  this should be signaled to upper layers.
• There should be support for priorities, to allow distinction between urgent (safety-critical) and
  non-urgent messages.
• Packets can be equipped with deadlines. On deadline expiration appropriate actions have to be
  taken (dropping the packet, notifying upper layers).
• For some data it must be known how old they are (“freshness”). For example, the current
  position of a moving drive is valid only for a short time.
• Messages with stringent timing constraints are typically only a few bytes long (compare
  Figure 2.1).
• Both periodic and aperiodic (asynchronous) traffic types should be supported. Some applications
  (e.g., drive control) even require strictly isochronous traffic.
A frequently observed phenomenon is the alarm storm: a plant runs for some time in a
“normal” operation mode. Then something critical happens, and all the subsystems involved
detect the error condition and try to send alarm messages to other stations. This
often leads to the generation of further alarm messages, congesting the communication system more and
more.
Besides the traffic and realtime requirements, fieldbuses often have to fulfill environmental
requirements:
• Fieldbuses are expected to operate in environments with much electromagnetic noise, e.g., due
  to the presence of strong motors, drives, electrical discharges, remote controls.
• In some applications the environment can be explosive, or aggressive chemicals may be present.
• Operation in free air or wet environments, with large temperature differences and vibrations.
In chemical engineering installations a single fieldbus installation can extend up to 10 kilometres.
In manufacturing applications, however, the geographical area covered is much smaller, typically in the
range of 20-50 meters. In some installations small devices like sensors and actuators get their power
supply via the fieldbus.
2.2 Architectural Characteristics
Many fieldbuses and sensor/actuator networks share the property that their specification covers only
the layers 1 (physical layer), 2 (MAC and data link layer), and 7 (application layer) of the OSI
reference model (compare Section 2.3). Distributed control applications typically consist of only a
small to medium number of stations; hence, internetworking capabilities are not needed. The other
layers are empty or their functionality is put into one of the remaining layers:
• The presentation layer is unnecessary, since all usable data types and their memory
  representation are defined in a fixed manner.
• Session layer functionality is considered unnecessary.
• The network layer is empty, since internetworking is not needed.
• With an empty network layer the transport layer can also be left empty.
2.3 Popular Systems
In this section a brief overview of some popular fieldbuses and sensor/actuator buses is given, with
emphasis on their MAC protocols.
Many fieldbuses and sensor/actuator buses (including the PROFIBUS) use an explicit token-passing
scheme on top of a broadcast medium. Broadcast media are preferred over ring-type networks,
since in the latter case a cable break destroys the whole network. The token-passing systems are
discussed first.
2.3.1 Systems using Token-Passing-MAC’s
The PROFIBUS is a fieldbus, hence its targeted application area is the interconnection of cell
controllers and process controllers. It uses a token-passing protocol. This fieldbus is the focus of this
thesis and is discussed in much more detail in Section 4.1.
The IEEE 802.4 Token-bus [71] was designed for factory automation applications with the goal of
guaranteeing an upper bound on medium access time [165, chap. 3]. It is part of the Manufacturing
Automation Protocol (MAP) protocol stack (see below), but has not gained widespread usage. The
system uses a token passing protocol on top of a broadcast medium, and the stations form a logical
ring. In this respect it is quite similar to the PROFIBUS. There are differences in the ring maintenance
/ ring-inclusion mechanisms: in the IEEE 802.4 Token-bus a contention-based approach is used, while
in PROFIBUS the ring members have to poll a certain address range in order to include new stations.
The BITBUS specification came from Intel in 1983; in 1991 an extended version was accepted as an
IEEE standard [70], [68]. It is designed as a fieldbus; its main application area is at the cell layer.
The standard covers only the layers 1, 2, and 7. The BITBUS concept includes a communication
system and a microcontroller card. The communication system is based on a master/slave access
scheme (employing SDLC, a predecessor of HDLC [151] created by IBM) and round-robin polling by
the master. In an extended version the protocol allows for changing masters by means of explicit
token passing. Hence, the protocol has similarities to the PROFIBUS protocol. The microcontroller
card contains a small realtime operating system and allows the user to load arbitrary tasks into the
card. The hardware is not specified, but the operating system's interface is. It seems that nowadays
the BITBUS is not widely used in Europe.
2.3.2 Other Systems
The FIP fieldbus (French: flux d'information processus; English: factory instrumentation protocol) is a
European fieldbus standard, which emerged from a French standard [178], [30], [89]. The standard
covers only the layers 1, 2 and 7 of the OSI reference model. The distinguishing feature of FIP is
its real-time database concept. The basic idea is that most applications are interested merely in the
value of some process variable (as denoted by a variable identifier), and not in which station actually
produces this variable. Therefore, FIP introduces a producer / consumer concept and utilises a central
station (called bus arbiter, BA). The BA keeps a preconfigured polling table, holding the variable
identifiers of interest. When traversing the polling table, the BA broadcasts each variable identifier.
The producer of the variable responds to this by broadcasting the variable's value, and each station
interested in this value (consumer) takes a copy and places it into an internal cache. FIP is designed
for fixed configurations, since adapting to shifts in communication needs requires changing the polling
table.
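The producer / consumer exchange described above can be illustrated with a minimal sketch. It is not taken from the FIP standard: the class `Station`, the function `run_poll_cycle` and the variable name `temp` are hypothetical, and frame formats, timing and error handling are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Station:
    name: str
    produces: dict = field(default_factory=dict)  # variable id -> current value
    consumes: set = field(default_factory=set)    # variable ids of interest
    cache: dict = field(default_factory=dict)     # local copies of consumed values


def run_poll_cycle(poll_table, stations):
    """One pass of the bus arbiter over its preconfigured polling table."""
    for var_id in poll_table:
        # The arbiter broadcasts the identifier; the producer of that variable answers.
        producer = next((s for s in stations if var_id in s.produces), None)
        if producer is None:
            continue  # nobody produces this variable in this configuration
        value = producer.produces[var_id]
        # The answer is a broadcast too, so every interested consumer caches a copy.
        for s in stations:
            if var_id in s.consumes:
                s.cache[var_id] = value


sensor = Station("sensor", produces={"temp": 21.5})
plc = Station("plc", consumes={"temp"})
hmi = Station("hmi", consumes={"temp"})
run_poll_cycle(["temp"], [sensor, plc, hmi])
```

After one cycle both consumers hold the value in their caches without knowing which station produced it. Supporting a new variable means editing the polling table, which is why the scheme suits fixed configurations.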
The P-NET protocol was originally developed in Denmark, and later on adopted as a European
standard [176]. Two types of stations are used: master stations and slave stations. The right to
transmit is passed between multiple master stations according to a virtual token passing scheme;
between master and slave a simple request/answer scheme is used. In the virtual token passing
scheme each station maintains an access counter, which is incremented upon every medium idle time
longer than 40 bit times. If this access counter happens to be the same as one master's station address,
this master is said to have the token.
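As a rough illustration of the virtual token, the following sketch models a single access counter. The wrap-around to address 1 after the highest master address, the initial value 1 and the names `AccessCounter` and `token_holder` are assumptions made for the example; they are not taken from the P-NET standard.

```python
class AccessCounter:
    """Access counter as kept by every station (simplified)."""

    def __init__(self, num_masters, idle_threshold_bits=40):
        self.num_masters = num_masters
        self.idle_threshold_bits = idle_threshold_bits
        self.value = 1  # assumed initial value

    def on_idle(self, idle_bits):
        # Only an idle period longer than the threshold advances the counter.
        if idle_bits > self.idle_threshold_bits:
            # Assumed wrap-around: after the last master comes master 1 again.
            self.value = self.value % self.num_masters + 1
        return self.value


def token_holder(counter, master_addresses):
    """Return the address of the master that currently holds the virtual token."""
    return counter.value if counter.value in master_addresses else None
```

Since all stations observe the same idle periods, their counters stay aligned and no explicit token frame has to be transmitted.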
The InterBus-S system was first defined by a German vendor; in 1994 the specification became a
German standard [36], [37], [38]. The standard covers the layers 1, 2, and 7 of the OSI reference model.
This system is mainly designed as a master / slave communication system with strictly isochronous
services. The latter are required to enable drive control applications or the isochronous operation of
a PLC (Programmable Logic Controller). Instead of using a bus, the architecture is built upon a ring
topology. The master station sends out data frames at fixed intervals. Each slave is assigned a specific
slot in this frame. When a frame reaches a slave, the slave reads the contents of its slot (considering this
as input data) and writes some other data to the same position (output data). After this the frame
is sent to the next station. While this approach is fine for exchanging cyclic measurement data, the
transport of asynchronous data is more cumbersome, since only a small number of additional slots
within a frame is provided. Hence, a message can span several cycles.
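The slot handling on the ring can be sketched as follows. The function `circulate_frame` and the station names are hypothetical, and a slot is modelled as a single list element instead of a bit field.

```python
def circulate_frame(frame, slaves):
    """Pass one frame around the ring.

    frame  -- list of slot contents as sent out by the master
    slaves -- list of (name, slot_index, outgoing_data) in ring order
    Returns the frame as it arrives back at the master and the data each slave read.
    """
    frame = list(frame)  # work on a copy, the caller's frame stays untouched
    received = {}
    for name, slot, outgoing in slaves:
        received[name] = frame[slot]  # the slave reads its slot (its input data)
        frame[slot] = outgoing        # and overwrites it with its own data (its output data)
    return frame, received


returned, seen = circulate_frame([10, 20, 30], [("valve", 0, 7), ("drive", 2, 9)])
```

One circulation therefore exchanges data with every slave exactly once, which is what makes the cycle strictly isochronous.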
The Controller Area Network (CAN) is a widely used fieldbus for (geographically) small installations,
e.g., for interconnecting several devices and controllers in cars. First invented by Bosch and Intel, it
has become an ISO standard [50]. CAN is a decentralized multi-master
network using a CSMA/CD scheme with priority arbitration. The protocol allows for smooth implementation
of rate monotonic scheduling approaches [98]. A station that wants to transmit waits until the bus
goes idle, then starts with a start bit. After the start bit the 12 (or 30) bit long arbitration field is
transmitted. More precisely: every station starts transmitting the first bit of the arbitration field,
and in parallel reads back the signal from the bus. If the bit read back differs from the transmitted
bit, the station has lost contention and waits for the next cycle. If a station is not eliminated during
the arbitration phase, it continues to transmit the next bit of the arbitration field. A prerequisite
for implementing this is that overlapping signals on the medium still give a valid signal. A similar
protocol is used on the ISDN S0 bus [83].
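The bitwise arbitration can be made concrete with a small simulation. The sketch assumes that overlapping signals combine like a wired-AND (a 0 on the bus overrides any 1) and that all contending arbitration fields are distinct and fit into `width` bits; the function name `arbitrate` is hypothetical.

```python
def arbitrate(arbitration_fields, width=12):
    """Return the arbitration field that survives bitwise contention."""
    contenders = set(arbitration_fields)
    if not contenders:
        raise ValueError("at least one station must contend")
    for bit in range(width - 1, -1, -1):  # most significant bit first
        sent = {f: (f >> bit) & 1 for f in contenders}
        bus = min(sent.values())          # assumed wired-AND: any 0 pulls the bus to 0
        # A station that reads back a level different from the one it sent drops out.
        contenders = {f for f in contenders if sent[f] == bus}
    return contenders.pop()


winner = arbitrate([0x65, 0x64, 0x7FF])
```

Under these assumptions the numerically smallest field always wins, so the arbitration field acts as a message priority. No bandwidth is lost to the contention, because the winning station simply continues its frame.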
2.3.3 MAP/MMS
A much broader approach was taken by the MAP consortium [80], [112]. Within MAP a full protocol
stack covering all seven layers of the OSI reference model was specified. In fact, MAP version 3.0
was the first true OSI protocol stack on the market. In the application layer it uses Manufacturing
Message Specification (MMS), a well defined set of services for automation purposes [72]. MMS
offers services like variable access, event management, semaphore management, file transfers, program
execution etc. The application layer services implement the virtual field device abstraction, which
in turn is used by the application processes. A virtual field device represents a real field device
as a collection of objects (variables, execution environments, events, files) and operations on them,
accessible from other stations using the MMS services. On the MAC layer the IEEE 802.4 token bus
is used [71]. However, MAP has not gained widespread acceptance. One reason for this was its
serious performance problems, which are due to the fact that a single message has to pass fourteen
protocol layers [101].
Chapter 3
Wireless LANs / IEEE 802.11
This chapter serves the purpose of providing background information about wireless local area network
(WLAN) systems, and specifically about the IEEE 802.11 WLAN standard. The latter is of particular
relevance for this work, since it is nowadays the leading technology. The commercial availability of
IEEE 802.11 equipment makes it attractive and allows for experimentation and measurements. The
experience we can gain from measurements gives practical hints for design of MAC and link layer
protocols.
In Section 3.1 we describe some general characteristics of WLANs and discuss some specific problems
for design of MAC and link layer protocols. Furthermore, a brief overview of current WLAN stan-
dards is provided. In Section 3.2 the IEEE 802.11 WLAN standard is presented, including the main
characteristics of a specific 802.11-compliant chipset.
Some general references covering wireless transmission and wireless networking are [4, 28, 78, 124,
182, 183, 166], and many separate topics in wireless are treated in [56]. References [107, 150, 125, 9]
deal with the general topic of WLANs; for MAC aspects please refer to [6, 60, 93].
3.1 Wireless LANs
3.1.1 Basics
The notion of wireless local area network (WLAN) systems subsumes systems where data is transmitted
wirelessly over short distances and in packet-switched mode. The focus on packet-switching
makes WLANs different from existing cellular networks (e.g., GSM), which are primarily designed to
support telephony or pager applications. WLANs typically offer higher bitrates. For example, the
current IEEE 802.11 WLAN systems support bit rates up to 11 MBit/s, as compared to GSM (9.6
kBit/s) [144], the GSM enhancement GPRS (116 kBit/s) or the future UMTS (2 MBit/s indoor) [182].
Current WLAN systems use either infrared (e.g., IrDA [187, 166]) or radio frequencies below 6 GHz.
Using radio frequencies allows distances of 50-300 m to be covered (depending on transmission power).
Furthermore, radio waves below 6 GHz can propagate through walls (depending on both frequency
and material) and can be reflected by several types of surfaces, enabling non line of sight (NLOS)
communications. In contrast, systems based on infrared only allow for line of sight (LOS)
communications.
Most radio-based WLANs transmit in license-free frequency bands, e.g., the industrial, scientific and
medical (ISM) bands, which are granted by the FCC¹ and the CEPT². In these bands 26 MHz of
spectrum between 902 MHz and 928 MHz, 83.5 MHz of spectrum between 2.4 GHz and 2.4835 GHz,
and 125 MHz of spectrum between 5.725 GHz and 5.85 GHz are allocated. The transmit power
is legally restricted to 1 Watt. Since radio waves in the ISM bands can penetrate walls, can be
reflected / diffracted and are subject to multipath fading, the wireless channel is comparatively bad and
time-varying (see Chapter 6).
Spread Spectrum Technologies
Many types of wireless LANs, including the IEEE 802.11 WLAN, rely on spread spectrum techniques
[58], where a narrowband information signal is spread to a wideband signal at the transmitter
and despread to a narrowband signal at the receiver. The two most important spread spectrum
techniques are direct sequence spread spectrum (DSSS) and frequency hopping spread spectrum (FHSS).
By using a wideband signal the effects of narrowband noise or narrowband interference are reduced
[58, chap. E].
In DSSS systems a data bit is multiplied with a fixed sequence of bits, called chip sequence, i.e., every
chip is XORed with the data bit. The resulting chip sequence is transmitted. The receiver knows
the chip sequence of the transmitter and tries to decode the original data bit from the received chip
sequence, e.g., using correlation techniques. In many DSSS systems with multiplexing in the time
domain all stations use the same chip sequence (e.g., the IEEE 802.11 with DSSS PHY, see Section
3.2), while in CDMA systems every station has its own.
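The spreading and despreading of data bits can be sketched in a few lines. The 11-chip sequence below is a Barker sequence chosen for illustration; the majority decision in `despread` is a simplified stand-in for a real correlation receiver, and both function names are hypothetical.

```python
# Illustrative 11-chip Barker sequence (assumption: the chip values serve only as an example).
CHIPS = [1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0]


def spread(bits, chips=CHIPS):
    """XOR every chip with the data bit: a 0 sends the sequence, a 1 sends its inverse."""
    return [c ^ b for b in bits for c in chips]


def despread(received, chips=CHIPS):
    """Recover the data bits by comparing each block of chips against the known sequence."""
    n = len(chips)
    bits = []
    for i in range(0, len(received), n):
        block = received[i:i + n]
        matches = sum(1 for r, c in zip(block, chips) if r == c)
        # Mostly matching chips mean the sequence itself was sent, i.e., data bit 0.
        bits.append(0 if matches > n / 2 else 1)
    return bits


data = [1, 0, 1]
tx = spread(data)
tx[0] ^= 1   # corrupt two chips of the first data bit
tx[5] ^= 1
tx[12] ^= 1  # and one chip of the second data bit
```

Despite the three corrupted chips the receiver still decides correctly on every data bit, which illustrates how the redundancy in the chip sequence buys robustness at the price of a higher chip rate.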
In FHSS the available frequency band is divided into a number of subbands. The transmitter station
hops through the subbands according to a predetermined schedule. The receiver must know this
schedule and change frequencies synchronously with the transmitter. One distinguishes between slow
FHSS and fast FHSS. In slow FHSS the transmitter transmits several information bits on the same
frequency before hopping to another frequency. In fast FHSS the transmitter changes the subband
several times during a single information bit. Fast FHSS systems are more costly to realize, because
of the need for fast and accurate synchronization.
A third technique called time hopping spread spectrum is not used in current WLAN technologies.
In recent times there has been considerable interest in Orthogonal Frequency Division Multiplexing
(OFDM) techniques [179]. OFDM is a multi-carrier technique, where blocks of symbols are transmitted
in parallel over a number of subcarriers. A symbol transmitted on each subcarrier has an increased
symbol duration τ as compared to full-rate transmission. The symbol duration τ is usually much larger
than the delay spread of the channel, which combats intersymbol interference. The recently
submitted IEEE 802.11a standard [122] uses an OFDM PHY, as does the recently submitted
HIPERLAN/2 standard [44, 45].
¹ Federal Communications Commission, a US government organization for telecommunication regulation issues.
² European Conference of Postal and Telecommunications Administrations.
Architecture
Many wireless LANs have a (micro- or pico-) cellular structure, where two types of stations are
distinguished: wireless terminals (WT) and base stations (BS), the latter constituting a cell by its maximum
transmission radius (typically between 5 and 100 m). The base stations are often interconnected by
a backbone system or distribution system. Usually, the WTs communicate only with a single BS and
with other WTs located in the same cell. For inter-cell communications the BS acts as a forwarder,
using the backbone to direct packets to the cell containing the target WT. When a WT moves from
one cell to another, it has to associate itself with the new BS (handover), as future packets destined to
the WT have to be directed to the new BS. In some systems the BS does not only act as a forwarder,
but also plays a major role in wireless MAC protocols, e.g., as a central scheduler granting
transmission rights to its associated WTs. The class of WLANs with BSs and a backbone is often denoted as
infrastructure WLANs. In contrast to infrastructure WLANs are ad-hoc WLANs, where WTs in close
proximity communicate with each other on a peer-to-peer basis or in a multi-hop fashion, without
having a central station.
3.1.2 WLAN Properties Important for MAC Design
The wireless medium has some special properties affecting the design of MAC- and link layer protocols.
A major challenge is the error-prone and time-varying channel (see Chapter 6). Common ways to
address this are ARQ protocols (based on immediate acknowledgements), forward error correction
(FEC) codes or (maybe adaptive) combinations of both approaches (hybrid error control) [99].
It is not possible to transmit and receive simultaneously on the same frequency band, due to overloading
of the receive filters. Hence, the transmitter cannot detect collisions by itself, as is required by, e.g., the
collision detection part of a CSMA/CD MAC protocol. A possible solution would be to use feedback
given by the receiver on another frequency band (busy-tone solutions [167]; these require a second
antenna).
Figure 3.1: Hidden terminal scenario (stations A, B, and C, each with its transmission radius drawn as a circle)
Several problems arise due to path loss (discussed in Chapter 6) in conjunction with a threshold
property: wireless receivers require the signal to have a minimum strength, before it is recognized.
Therefore, if the distance between two stations exceeds some threshold, they cannot hear each other.
For protocols incorporating carrier sensing (e.g. CSMA) this gives rise to the hidden terminal scenario
[167], depicted in Figure 3.1. Consider three stations A, B, and C with transmission radii as indicated
by the circles. Stations A and C are in range of B, but A is not in the range of C and vice versa. If
C starts to transmit to B, A does not hear this and considers the medium to be free. Hence, A also
starts packet transmission and a collision occurs at B.
Figure 3.2: Exposed terminal scenario (stations A, B, C, and D)
There exists a second scenario where carrier sensing leads to false predictions about the channel state
at the receiver: the so-called exposed terminal scenario, depicted in Figure 3.2. The four stations A,
B, C and D are placed such that the pairs A/B, B/C, and C/D can hear each other, while all remaining
pairs cannot. Consider the situation where B transmits to A, and a short moment later C
wants to transmit to D. Station C performs carrier sensing and senses the medium busy, due to B's
transmission. As a result, C postpones its transmission. However, C could safely transmit its packet
to D, without disturbing B's transmission to A. This leads to a loss of efficiency.
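Both scenarios can be reproduced with a very small model in which radio connectivity is a symmetric yes/no relation between pairs of stations. This ignores signal strength, capture effects and propagation delay, and the helper names `make_links`, `carrier_busy` and `would_collide` are hypothetical.

```python
def make_links(pairs):
    """Symmetric 'can hear each other' relation between stations."""
    return {frozenset(p) for p in pairs}


def hears(links, a, b):
    return frozenset((a, b)) in links


def carrier_busy(links, station, transmitters):
    """What carrier sensing reports at the sensing station."""
    return any(hears(links, station, t) for t in transmitters if t != station)


def would_collide(links, receiver, transmitters):
    """True if more than one active transmitter is audible at the receiver."""
    return sum(hears(links, receiver, t) for t in transmitters) > 1


# Hidden terminal: A and C are both in range of B, but not of each other.
hidden = make_links([("A", "B"), ("B", "C")])
a_senses_busy = carrier_busy(hidden, "A", {"C"})       # C is already sending to B
collision_at_b = would_collide(hidden, "B", {"A", "C"})

# Exposed terminal: chain A - B - C - D, B is sending to A.
exposed = make_links([("A", "B"), ("B", "C"), ("C", "D")])
c_senses_busy = carrier_busy(exposed, "C", {"B"})
harm_at_a = would_collide(exposed, "A", {"B", "C"})
harm_at_d = would_collide(exposed, "D", {"B", "C"})
```

In the first case the sensing result is "idle" although a collision at B follows; in the second case it is "busy" although neither receiver would be disturbed. In both cases the sender's view of the channel differs from the receiver's view.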
Radio waves can reach the receiver on several paths of different length, leading to multipath fading.
As a result, the channel impulse response shows several maxima (delay spread) and different
information symbols can overlap in time (intersymbol interference) [4, chap. 4.5]. Often, equalization
techniques are used to compensate for delay spread [132]. However, these techniques frequently need
training sequences or preambles, as well as processing power and energy. Another reason for
preambles is the need to let the receiver learn about the clock frequency of the transmitter (bit
synchronization). In such systems, a fixed sequence needs to be transmitted in front of every packet.
These training sequences can occur at several places. For example, in the GSM standard a training
sequence is placed in the middle of a packet [182, chap. 3.3], while IEEE 802.11 with DSSS (see Section 3.2)
uses a preamble at the beginning of a packet.
To summarize, MAC protocol designs based on an IEEE 802.11 PHY face the following
problems:
• Error-prone and time-varying channel.
• It is not possible to transmit and receive simultaneously on the same channel. The send/receive
  turnover costs some time.
• Not all stations see the same signals; carrier sensing gives accurate information only for the
  sensing point.
• Need for preambles.
The susceptibility of CSMA-based protocols to the hidden-terminal and exposed-terminal problems
and the complexity of appropriate countermeasures (e.g., the RTS/CTS protocol described in Section
3.2.3) make their usage in MAC protocols for industrial WLANs questionable.
3.1.3 WLAN Standards / Systems
Besides the IEEE 802.11 WLAN standard, discussed in Section 3.2, some other systems have emerged.
The European HIPERLAN standard [43], [183, chap. 9], standardized by ETSI in 1996, works in
an exclusively assigned frequency band of 150 MHz width (5.15 GHz - 5.3 GHz), subdivided into 5
channels of 23 MHz each. The standard covers the physical layer, the MAC layer and the data link layer
(including some routing functionality for multihop forwarding). It uses two different bit rates: 1.5
MBit/s (low bitrate) for a packet's preamble and header, and 23 MBit/s (high bitrate) for a packet's
data part. HIPERLAN employs the EY-NPMA MAC protocol, a CSMA-based stochastic MAC
protocol with priorities (based on packet deadlines), a collision avoidance algorithm and immediate
acknowledgements. Basically, HIPERLAN is an ad-hoc network with support for overlapping cells:
stations situated in more than one cell can act as forwarders. HIPERLAN has not gained any
commercial impact; no products have ever been available.
The successor of HIPERLAN, the HIPERLAN Type 2 or HIPERLAN/2 standard was finalized in 2000
[44], [143]. HIPERLAN/2 uses OFDM transmission in the 5 GHz frequency band with 52 subcarriers.
The user bitrate is up to 54 MBit/s. The standard prescribes several modulation schemes (BPSK,
QPSK, 16QAM, 64QAM, see [155]) and code rates, and stations can adapt to current transmission
conditions by selecting proper modulation schemes / code rates. HIPERLAN/2 is an infrastructure
network, i.e., it consists of picocells, each one organized around an access point (AP). Every station
must be able to operate as AP. In peer-to-peer situations without any fixed (cabled) network infras-
tructure, a network can be built up by explicitly electing an AP from the available stations. The
MAC layer uses small packets with only two possible sizes: 9 bytes (for control packets) and 54 bytes
(for data and control packets). The AP runs a central scheduler, which grants transmission rights
on demand (demand-assignment protocol). By selecting proper scheduling schemes, quality of service
issues can be addressed. As basic transmission mode, a TDD/TDMA (Time Division Duplexing /
Time Division Multiple Access) MAC scheme with fixed length superframes is used [60]. The data
link control (DLC) protocol uses a selective repeat protocol. The radio link control (RLC) protocol
includes connection management, mobility management, frequency management and power manage-
ment. Several convergence layers placed on top of the DLC/RLC protocols support interconnection
between HIPERLAN/2 and different types of core networks (e.g., Ethernet, FireWire). No products
are available at the time of writing.
There are two emerging standards specifically targeted at home environments and at wireless
coupling of small devices: the HomeRF and Bluetooth systems. The Bluetooth system [61] is defined by
an industry-driven working group, and aims at interconnecting low-cost devices, using cheap wireless
transceivers. It is defined to be a scatter ad-hoc network, i.e., many ad hoc Bluetooth networks can
share the same area and frequency band. It is based on a frequency-hopping CDMA scheme in the
license-free 2.4 GHz ISM band, working on 79 subchannels of 1 MHz width. The frequency is changed
every 625 µs (dwell time), corresponding to a hopping frequency of 1600 Hz. A channel is defined by
a particular hopping sequence. On each channel a raw bitrate of 1 MBit/s is available. A piconet is
formed by a particular hopping sequence and consists of at most eight stations: one master station
and at most seven slave stations. A large number of (nonorthogonal) hopping sequences is
predefined. Within a piconet the master is responsible for arbitrating access to the channel; furthermore,
only master-slave communication is used. The channel is organized in slots of one dwell time length.
One slot can carry a single packet. It is possible to transmit long packets of 3 or 5 slots length.
Within such packets frequency hopping is suppressed. Full duplex communication is achieved by
using a TDD scheme. Both an isochronous connection-oriented service and asynchronous connectionless
services are available.
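The timing figures quoted for Bluetooth follow directly from the dwell time, as the short calculation below shows. The function names are hypothetical, and `raw_bits_per_slot` is only an upper bound on the channel bits of one slot; the usable payload of a real packet is smaller.

```python
DWELL_US = 625               # dwell time per hop in microseconds
RAW_BITRATE_BPS = 1_000_000  # raw channel bitrate in bit/s


def hops_per_second(dwell_us=DWELL_US):
    return 1_000_000 / dwell_us


def packet_duration_us(slots, dwell_us=DWELL_US):
    """Air time of a packet occupying 1, 3 or 5 slots on a single frequency."""
    if slots not in (1, 3, 5):
        raise ValueError("packets occupy 1, 3 or 5 slots")
    return slots * dwell_us


def raw_bits_per_slot(dwell_us=DWELL_US, bitrate_bps=RAW_BITRATE_BPS):
    """Upper bound on channel bits per slot, ignoring guard times and headers."""
    return bitrate_bps * dwell_us // 1_000_000
```

With these numbers a 5-slot packet occupies one frequency for 3125 µs, five times longer than a single-slot packet, because hopping is suppressed within a packet.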
The HomeRF system [114] is designed for home scenarios, where users want to access the Internet
or the public switched telephone network (PSTN) with handheld devices from everywhere within their
homes. The HomeRF working group was formed in 1997 and delivered a first specification in January
1999. A HomeRF network consists of a control point and several clients. The control point is attached
both to the PSTN and to a main home PC with Internet access. Hence, voice and data are transmitted
over the same wireless network; the HomeRF SWAP protocol has support for asynchronous data
(TCP/IP communications) and for isochronous data (telephony). For asynchronous data transfer a
peer-to-peer mode is available. In contrast, voice data transmission requires a control point. The
physical layer is largely adapted from the IEEE 802.11 frequency hopping PHY (see Section 3.2).
It uses a hop frequency of 50 Hz, operates in the 2.4 GHz ISM band and offers a raw bitrate of 1.6
MBit/s. The MAC layer is based on a superframe structure, with a superframe being 20 ms long (each
superframe is transmitted on a single frequency). A superframe is subdivided into two contention free
periods (CFP) and a contention period (CP), located between the CFPs. Speech data is transmitted
on a TDMA basis in the second CFP; the corresponding retransmissions are placed in the first CFP
of the following superframe (thus using another frequency). The CP is used for asynchronous data
transmission and employs a CSMA/CA scheme derived from IEEE 802.11.
Finally, wireless asynchronous transfer mode (ATM) WLANs are targeted to support ATM services
on a wireless medium. Some prototypes have been built (e.g., the WATM system [113], or the Magic
WAND demonstrator [105]), however, none of these systems converged into commercial products.
3.2 IEEE 802.11
The IEEE 802.11 WLAN standard [120] was finalized in 1997; a revised version appeared in 1999.
It belongs to the IEEE 802.x family of LAN and MAN standards. It offers the same abstract MAC
interface as IEEE 802.3 Ethernet, hence, it can be used with an IEEE 802.2 logical link control
(LLC) sublayer [119]. In 1999, two extensions were defined, providing additional physical layers: [122]
describes an OFDM PHY in the 5 GHz band, while [121] describes an 11 MBit/s extension of the 2.4
GHz DSSS PHY. A detailed description of IEEE 802.11 can be found in [123].
Basically, the standard describes an architecture, services and protocols for an Ethernet-like wireless
LAN, using a CSMA/CA-based multiple access method with enhancements for time-bounded services.
The protocols are designed to run on top of several physical layers. In this section an overview of
the IEEE 802.11 architecture and its PHY and MAC layers is given, with emphasis on the most widely
deployed DSSS PHY. Because of its widespread usage we have chosen the DSSS PHY as the basis for
this thesis.
Certain aspects of IEEE 802.11, e.g., security, authentication or power saving are not relevant for this
thesis and therefore not discussed.
For the discussion about services the terminology of the OSI reference model is used, which is sum-
marized in Section 4.1.3.
3.2.1 Architecture
The main elements of the IEEE 802.11 architecture are stations, access points (AP), portals, the
wireless medium, the basic service set (BSS), the distribution system (DS) and extended service sets
(ESS).
A station is usually some computer or portable device with a network interface card (NIC), the latter
carrying IEEE 802.11 PHY and MAC entities. On a station a user typically runs applications which
initiate or respond to communications with other devices or users. Every station has to support the
station services: authentication, deauthentication, privacy and data delivery. While data delivery is
self-explanatory, the other services are designed to cope with the fact that a WLAN is “open” by its
broadcast nature, as compared to wired LANs, where every member has to be explicitly attached to
a wire:
• With authentication / deauthentication services different stations can assure each other that
  they are administratively allowed to communicate. Without authentication a station is not
  allowed to transmit data frames.
• The privacy service encrypts data frames to protect them against eavesdropping.
A basic service set (BSS) is a set of stations which can communicate directly using the wireless
medium. It is not required that every pair of stations belonging to a BSS can hear each other. It
suffices that, if we take the stations as the vertices of a graph and draw an edge between two of them
whenever radio connectivity is given, the graph is connected.
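The connectivity condition can be checked with a standard breadth-first search over the graph whose vertices are the stations and whose edges are the pairs with radio connectivity. The function name `is_connected` is hypothetical, and the sketch assumes that every link joins two of the listed stations.

```python
from collections import deque


def is_connected(stations, links):
    """True if every station is reachable from every other one over radio links."""
    stations = set(stations)
    if not stations:
        return True
    neighbours = {s: set() for s in stations}
    for a, b in links:  # radio connectivity is taken to be symmetric
        neighbours[a].add(b)
        neighbours[b].add(a)
    start = next(iter(stations))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for n in neighbours[current] - seen:
            seen.add(n)
            queue.append(n)
    return seen == stations
```

For three stations A, B and C with links A/B and B/C the function returns True, although A and C cannot hear each other directly.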
There are two types of BSS: a BSS with an AP is denoted as infrastructure BSS, while a BSS without
an AP is denoted as independent BSS (IBSS). The case of an IBSS corresponds to ad-hoc networks,
which do not use authentication/deauthentication services. In an IBSS all communication is done on
a peer-to-peer basis, however, without multihop forwarding.
The case of an infrastructure BSS is different. All communications is relayed through the AP. Several
APs can be coupled via the distribution system (DS) to form an extended service set (ESS). In the
infrastructure case every station has to associate itself with an AP in order to be allowed to communicate
with other stations. When moving through the network the station may lose connectivity to
its current AP and move into reach of another AP. In this case the station has to reassociate with the
new AP. When station A wants to transmit a data frame to station B, it addresses the frame towards
its AP X. The AP X checks whether B is in its BSS. If so, X sends the frame to B. If not, it transmits
the frame via the DS to the AP Y in whose BSS B resides. Finally, Y forwards the frame to B. The
DS itself is not part of the standard. If A’s frame is not destined to a station in the ESS, the AP
forwards it to a portal (e.g., an Ethernet bridge).
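The forwarding decision of an AP described above can be sketched as a simple lookup. The function `forward`, the dictionary `bss_members` and the hop labels "DS" and "portal" are hypothetical names chosen for illustration; the DS itself is not specified by the standard.

```python
def forward(dest, source_ap, bss_members):
    """Return the hops a frame takes once it has reached source_ap.

    bss_members maps an AP name to the set of stations associated with it.
    """
    if dest in bss_members[source_ap]:
        return [source_ap, dest]              # destination is in the same BSS
    for ap, members in bss_members.items():
        if dest in members:
            return [source_ap, "DS", ap, dest]  # relay via the distribution system
    return [source_ap, "portal"]              # destination is outside the ESS

ess = {"X": {"A"}, "Y": {"B"}}
print(forward("B", "X", ess))  # ['X', 'DS', 'Y', 'B']
```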
3.2.2 PHY Layer
The PHY layer of IEEE 802.11 is subdivided into two sublayers. The upper physical layer convergence
procedure (PLCP) sublayer offers the interface to the MAC layer, the lower physical medium dependent
(PMD) sublayer actually transmits and receives data frames.
In the original specification [120] three physical layers are defined: an infrared PHY, a DSSS PHY in
the 2.4 GHz ISM band and a FHSS PHY in the 2.4 GHz ISM band. The additional PHY layers are
an OFDM PHY in the 5 GHz band [122] and an 11 MBit/s extension of the DSSS PHY [121]. We
discuss only the DSSS PHY in some detail, the others are briefly summarized.
PHY Interface
The standard defines an abstract interface the PHY has to offer to an IEEE 802.11 MAC instance.
The interface offers primitives to start and stop transmission of frames, to pass data, to obtain carrier
sense information, and to indicate frame reception.
The PHY-DATA.request and PHY-DATA.indication service primitives are used for transferring single data
bytes from the MAC to the PHY and vice versa. The PHY-DATA.request can only be used when
the PHY is in transmit mode. The PHY answers this request with a PHY-DATA.confirm primitive.
Received data bytes are transferred with the PHY-DATA.indication from the PHY to the MAC.
With the PHY-TXSTART.request primitive the MAC asks the PHY to start transmission of a frame.
The result of this operation is signalled with the PHY-TXSTART.confirm primitive. To complete the
frame transmission after the last data byte, the MAC entity issues a PHY-TXEND.request primitive,
which is acknowledged by the PHY with a PHY-TXEND.confirm primitive.
When receiving packets, the PHY indicates this to the MAC with two primitives: the PHY-RXSTART.indication
is issued after the PHY has acquired bit synchronisation and received a valid start frame delimiter.
The PHY-RXSTOP.indication is issued after finishing frame reception.
The MAC layer needs carrier sense information for performing its CSMA/CA protocol (see Section
3.2.3). The PHY-CCARESET.request and PHY-CCARESET.confirm primitives allow the MAC to control
the carrier sense (or clear channel assessment (CCA)) logic of the PHY. The PHY-CCA.indication is
issued by the PHY every time the channel changes its state from idle to busy or vice versa. This
primitive indicates the channel state to the MAC.
DSSS PHY
The DSSS PHY offers two bitrates: 1 MBit/s with differential binary phase shift keying (DBPSK)
modulation or 2 MBit/s with differential quaternary phase shift keying (DQPSK) [155]. A channel
bandwidth of 22 MHz is used, the center frequencies of the DSSS channel are placed in 5 MHz steps
in the 83.5 MHz wide 2.4 GHz ISM band (for Northern America 11 center frequencies are defined, for
Europe 13). Hence, within the same geographical area three 802.11 WLANs with DSSS
can be operated on non-overlapping channels without mutual interference. The maximum allowed transmit powers in different
geographical regions are shown in Table 3.1.
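The count of three interference-free WLANs follows from the 5 MHz channel spacing and the 22 MHz channel bandwidth. The sketch below illustrates the arithmetic. The centre frequency of channel 1 (2412 MHz) is an assumption not stated in the text, and treating two channels as non-overlapping when their centres are at least one channel width apart is a simplification.

```python
CHANNEL_WIDTH_MHZ = 22
CHANNEL_STEP_MHZ = 5
FIRST_CENTER_MHZ = 2412  # assumption: centre of channel 1, not given in the text

def center_frequency(channel):
    return FIRST_CENTER_MHZ + CHANNEL_STEP_MHZ * (channel - 1)

def non_overlapping(num_channels):
    """Greedily pick channels whose centres are at least one channel width apart."""
    chosen = []
    for ch in range(1, num_channels + 1):
        if not chosen or (center_frequency(ch) - center_frequency(chosen[-1])
                          >= CHANNEL_WIDTH_MHZ):
            chosen.append(ch)
    return chosen

print(non_overlapping(11))  # [1, 6, 11]
print(non_overlapping(13))  # [1, 6, 11]
```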
The 11 MBit/s DSSS extension [121] is an addendum to the DSSS PHY. Additional modulation
schemes (complementary code keying (CCK)) are specified for 5.5 MBit/s and 11 MBit/s data rates.
Furthermore, an additional mechanism (rate shift mechanism) is defined, which allows the
default rate to be set to binary phase shift keying (BPSK) or quaternary phase shift keying (QPSK), in order
to cooperate with legacy WLANs or to adapt to channel conditions.
Region          Maximum transmit power
North America   1000 mW
Europe          100 mW
Japan           10 mW/MHz
Table 3.1: Maximum allowed transmit powers in different regions
The DSSS PHY uses an 11 bit Barker code for direct sequence spreading. Hence, one data bit is
mapped to 11 chips. Each chip is transmitted either as a BPSK or as a QPSK waveform [155]. Each
chip of the Barker sequence takes the value 1 or -1. It has the interesting property that the inner
product $S$ of the Barker code with a shifted version of it (with lag $k$) takes the value $S = 11$ for
$k = 0$ and $|S| \le 1$ for $k \neq 0$. This simplifies the design of correlation receivers and maintenance of bit
synchronisation.
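The autocorrelation property can be verified numerically. The sketch below uses one common ±1 representation of the length-11 Barker sequence; the concrete chip order is an assumption, since the text does not list the chips. The lag is applied aperiodically, i.e., only the overlapping chips are multiplied.

```python
# One common +/-1 representation of the 11-chip Barker sequence (assumed ordering).
BARKER_11 = [1, -1, 1, 1, -1, 1, 1, 1, -1, -1, -1]

def autocorrelation(code, lag):
    """Inner product of the code with a copy of itself shifted by `lag` chips."""
    return sum(code[i] * code[i + lag] for i in range(len(code) - lag))

peak = autocorrelation(BARKER_11, 0)
sidelobes = [autocorrelation(BARKER_11, k) for k in range(1, len(BARKER_11))]
print(peak)       # 11
print(sidelobes)  # [0, -1, 0, -1, 0, -1, 0, -1, 0, -1]
```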
[Figure 3.3 layout: PLCP Preamble = Sync (128 bits), SFD (16 bits); PLCP Header = Signal (8 bits),
Service (8 bits), Length (16 bits), CRC (16 bits); followed by the MPDU]
Figure 3.3: DSSS PHY PPDU Format
The PLCP part of the DSSS PHY forms its own protocol data units (denoted as PPDU) by adding
some fields to a MAC frame. The PPDU format is shown in Figure 3.3, it is subdivided into the
PLCP preamble, the PLCP header and the data part. The PLCP preamble consists of 128 one bits
(sync sequence), followed by a constant value (start frame delimiter, SFD). The sync sequence and
SFD allow the receiver to synchronize on the sender’s clock (bit synchronization) and to determine
the start of the frame. The signal field indicates the modulation type used in the data portion of the
PPDU, while the length field indicates the length of the data portion in microseconds. The service
field is not used yet. The CRC field contains a 16 bit cyclic redundancy check checksum which is
computed from the three previous values. If the checksum is wrong or the signal field carries an
unknown value, the MAC instance is signalled and can decide on aborting PPDU reception. It is
important to note that while the data part can use different modulation types, the PLCP preamble
and PLCP header fields are always transmitted with BPSK modulation. When the data part uses
another modulation type, both transmitter and receiver must switch the modulation type within a
PHY packet (more precisely, after the CRC field). With the PLCP preamble and the PLCP header a
PPDU has a minimum duration of 128 + 64 = 192 µs.
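The PPDU duration can be worked out from the field lengths of Figure 3.3. The following sketch assumes that the PLCP preamble and header are sent at 1 MBit/s (the rate the text associates with the basic modulation) and that the data part uses the rate passed by the caller; the function name `ppdu_duration_us` is illustrative.

```python
SYNC_BITS, SFD_BITS = 128, 16
HEADER_BITS = 8 + 8 + 16 + 16   # signal, service, length, CRC
PLCP_RATE_MBIT = 1              # assumption: preamble and header at 1 MBit/s

def ppdu_duration_us(mpdu_bytes, data_rate_mbit):
    """Duration of a PPDU in microseconds (bits divided by MBit/s gives µs)."""
    overhead = (SYNC_BITS + SFD_BITS + HEADER_BITS) / PLCP_RATE_MBIT
    return overhead + 8 * mpdu_bytes / data_rate_mbit

print(ppdu_duration_us(0, 1))    # 192.0, the minimum duration
print(ppdu_duration_us(100, 2))  # 592.0
```

Even for a 100 byte MPDU at 2 MBit/s, the fixed PLCP overhead accounts for roughly a third of the airtime.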
The logical structure of the PMD part of the DSSS PHY is shown in Figure 3.4 (as taken from [123,
chap. 6]). At the transmitter, first a scrambling step is applied to a PPDU. This is done to randomize
the data, specifically to eliminate long runs of zeros or ones (like in the PLCP preamble). The modulo-
2 adder actually performs the DSSS algorithm. The transmit mask filter restricts the spectrum of the
spreader output to 22 MHz, while the DBPSK/DQPSK modulator produces actual waveforms. On the
receiver side basically the transmitter operations are inverted, with the additional duty of acquiring
the transmitter's clock from the PLCP preamble. The receive process is based on a correlator for the
Barker sequence.
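The spreading and correlation steps can be illustrated in isolation. This is a simplified sketch: it omits scrambling, filtering and differential modulation, represents the modulo-2 addition as a multiplication of ±1 values, and assumes one common chip ordering of the Barker word. The chip positions that are inverted are arbitrary illustrative choices.

```python
BARKER_11 = [1, -1, 1, 1, -1, 1, 1, 1, -1, -1, -1]  # assumed chip ordering

def spread(bits):
    """Map every data bit to 11 chips."""
    chips = []
    for bit in bits:
        symbol = 1 if bit else -1
        chips.extend(symbol * c for c in BARKER_11)
    return chips

def despread(chips):
    """Correlate each block of 11 chips with the Barker word and decide the bit."""
    bits = []
    for start in range(0, len(chips), len(BARKER_11)):
        block = chips[start:start + len(BARKER_11)]
        correlation = sum(x * c for x, c in zip(block, BARKER_11))
        bits.append(1 if correlation > 0 else 0)
    return bits

data = [1, 0, 1, 1, 0]
chips = spread(data)
for position in (3, 15, 16, 40):   # invert a few chips to mimic channel errors
    chips[position] = -chips[position]
print(despread(chips) == data)     # True
```

A few inverted chips per bit only reduce the magnitude of the correlation value, not its sign, which is why the data is still recovered.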
[Figure 3.4 blocks. Transmitter: PPDU, Scrambler, Modulo-2 Adder (Spread) fed by the 11-Bit Barker
Word, Transmit Mask Filter, DBPSK/DQPSK Modulator. Receiver: Correlator (De-Spread) fed by the
11-Bit Barker Word, DBPSK/DQPSK Demodulator, Descrambler, Timing Clock Recovery (Data Clock), PPDU]
Figure 3.4: DSSS PHY Schematic
Other PHYs
The Infrared (IR) PHY uses infrared light as transmission medium. This restricts it to line-of-sight
applications. In addition, infrared waves cannot propagate through walls. Data rates of 1 MBit/s and
2 MBit/s are supported, using pulse position modulation.
The FHSS PHY uses the 2.4 GHz ISM band. The user can specify more data rates than with the IR
or DSSS PHY: from 1 MBit/s to 4.5 MBit/s in increments of 0.5 MBit/s; the standard prescribes the
support of 1 MBit/s and optionally 2 MBit/s. The ISM band is subdivided into 79 different subbands
of 1 MHz width (except in France, Spain, and Japan, where fewer subbands are used). The hopping
process is slow (2.5 Hz); three hopping sequences are defined. The FHSS PHY has not gained as much
market acceptance as the DSSS PHY.
The OFDM PHY uses three different bands in the 5 GHz range (U-NII bands): from 5.15 GHz to
5.25 GHz, from 5.25 GHz to 5.35 GHz and from 5.725 GHz to 5.825 GHz. Using different modulation
types (BPSK, QPSK, 16QAM, 64QAM) and different code rates it offers bitrates of 6, 9, 12, 18 or 24
MBit/s, optionally 36, 48 or 54 MBit/s. The overall bandwidth is subdivided into 52 subbands, 48 of
which are for data transmission, and 4 for carrier pilots. Additionally the PHY uses bit interleaving
and convolutional error correcting codes to compensate for narrowband noise on one or a few subbands.
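The listed bitrates can be reproduced from the number of data subbands, the bits carried per subband and the code rate. The sketch below assumes an OFDM symbol duration of 4 µs and a particular pairing of modulation types with code rates; neither is stated in the text, so both should be read as assumptions.

```python
from fractions import Fraction

DATA_SUBCARRIERS = 48      # from the text
SYMBOL_DURATION_US = 4     # assumption: OFDM symbol duration, not given in the text

# (modulation, coded bits per subcarrier, code rate); the pairing is assumed
MODES = [
    ("BPSK", 1, Fraction(1, 2)),
    ("BPSK", 1, Fraction(3, 4)),
    ("QPSK", 2, Fraction(1, 2)),
    ("QPSK", 2, Fraction(3, 4)),
    ("16QAM", 4, Fraction(1, 2)),
    ("16QAM", 4, Fraction(3, 4)),
    ("64QAM", 6, Fraction(2, 3)),
    ("64QAM", 6, Fraction(3, 4)),
]

def bitrate_mbit(bits_per_subcarrier, code_rate):
    """Data bits per OFDM symbol divided by the symbol duration in µs."""
    return DATA_SUBCARRIERS * bits_per_subcarrier * code_rate / SYMBOL_DURATION_US

rates = [int(bitrate_mbit(bits, rate)) for _, bits, rate in MODES]
print(rates)  # [6, 9, 12, 18, 24, 36, 48, 54]
```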
3.2.3 MAC Layer
This section provides a brief description of the IEEE 802.11 MAC protocol. We restrict the discussion
to the simple data transmission procedures, leaving out management functionality, power saving issues,
multirate support, privacy/encryption or details like the fragmentation/reassembly scheme.
MAC Services
Basically, the IEEE 802.11 MAC provides a connectionless best-effort service to its user (typically an
LLC instance). However, for increasing transmission reliability, a bounded number of retransmissions
is performed for all unicast frames.
When the MAC user wants the MAC to transmit a MAC service data unit (MSDU), it passes
a MA-UNITDATA.request primitive to the MAC layer. This primitive carries several parameters: the
source and destination MAC address, the data block (up to 2304 bytes), and a priority value, indi-
cating whether the MSDU should be transmitted in the contention or contention free period of the
point coordination function (PCF), see below. Transmission in the contention free periods reduces
MSDU losses by eliminating collisions. The last parameter describes, whether the MSDU should be
transmitted strictly in order, i.e., the MSDU may not be postponed or accelerated relative to other
MSDUs.³ The reception of this primitive causes the MAC instance to generate a well-formed frame
and to try to deliver this to the destination address.
For every MA-UNITDATA.request service primitive the MAC entity generates a MA-UNITDATA-STATUS.indication
primitive, which tells about the success of the corresponding request primitive. As parameters the
source and destination address, the priority value, the transmission status, and the provided priority
and service class are passed. The transmission status can take several values, e.g., indicating successful
transmission, unsuccessful transmission due to exceeding retry limits or MSDU lifetimes, notifications
on illegal parameters in the request primitive and so forth. The priority value indicates the priority
(contention or contention free) actually used for the frame. This, in general, need not be the same as
the value wished by the user in the request primitive. Another example for this is the service class
parameter.
Finally, the MA-UNITDATA.indication service primitive is passed from the MAC instance to the LLC
upon successful frame reception. As parameters, it carries the source and destination address, the
priority and service class, and the data block. There is no notification upon erroneously received
frames.
The MAC services do not allow the user to express quality of service (QoS) requirements. It is not
possible to assign a separate lifetime or a separate number of retransmissions for every packet, not even
for the two priority classes. Instead, the corresponding protocol parameters are global for all packets.
It is also not possible to specify a packet's transmit power or its bit rate via the MAC interface.⁴
Even the priority field of the MA-UNITDATA.request primitive is only a “proposal”, and the MAC is
not strictly required to stick to it.
Basic Frame Transmission Procedures
Consider the case that station A has a unicast frame for station B. After A has obtained channel
access (see next section), it transmits the frame. Station B is, upon successful reception, required
³ The case of MSDU reordering applies to multicast and broadcast MSDUs when power saving mechanisms are used.
It is not further discussed in this thesis.
⁴ Clearly, a MAC implementation can try to adapt these to environmental conditions, but typically the user has no
mechanism to influence this.
to send an immediate acknowledgement frame to A after a very short time (no more than SIFS, see
below). The basic access mechanism ensures that between a frame and its corresponding ack no other
transmission can take place. Therefore, the frame/ack sequence is considered to be atomic. If A does
not receive an ack, it performs a retransmission, however, the maximum number of retransmissions is
bounded by a configurable parameter.⁵ The retransmissions have to contend for the channel in just
the same way as initial frames. Furthermore, for every frame the MAC instance maintains a lifetime
variable. If a frame is not successfully delivered within this lifetime, it is discarded.
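The bounded retransmission behaviour can be sketched as a simple loop. This is an illustration only: channel contention and the frame lifetime are left out, and the callback `attempt_succeeds` is a hypothetical stand-in for "the frame was sent and the ack arrived in time".

```python
def send_unicast(attempt_succeeds, retry_limit):
    """Try one initial transmission plus at most retry_limit retransmissions.

    attempt_succeeds(n) is a hypothetical callback that returns True if the
    n-th attempt (counted from 0) was acknowledged.
    Returns (delivered, number_of_attempts).
    """
    for attempt in range(retry_limit + 1):
        if attempt_succeeds(attempt):
            return True, attempt + 1
    return False, retry_limit + 1   # frame is discarded

print(send_unicast(lambda n: n == 2, retry_limit=4))  # (True, 3)
print(send_unicast(lambda n: False, retry_limit=4))   # (False, 5)
```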
IEEE 802.11 defines the RTS/CTS mechanism for attacking the hidden terminal problem. Consider
again the case that A has a unicast frame for B. After A has obtained channel access, it sends a short
RTS frame to B, indicating the time duration needed for the following CTS frame, the data frame
and the acknowledgement. If B receives the RTS frame without errors and is ready to answer, it does
so after a very short time (no more than SIFS) with a CTS frame, which contains the time duration
needed for the remaining transmission cycle (data frame plus ack). If A successfully receives the CTS
frame, it starts transmission of the data frame after a SIFS-bounded time. Finally, if B receives the
frame correctly, it sends an ack after a SIFS-bounded time. By using only very short gap times in
the whole frame exchange sequence, it is, by the basic access mechanism, atomic. Another station
C hearing the RTS from A waits to see whether A starts to transmit a data frame. If so, C defers any
transmission, until A has finished the frame exchange (including ack). If not, C can transmit frames.
Any station D hearing the CTS from B waits with its transmission, until the frame exchange finished.
To implement this, C and D write the length information in the RTS and CTS frames into their NAV
fields. This field is part of the virtual carrier sense mechanism, described below.
The RTS/CTS mechanism is inefficient for small sized frames. Therefore, for small frames (w.r.t. a
configurable frame size) the RTS/CTS exchange can be switched off. The RTS/CTS mechanism is
not used for broadcast / multicast frames.
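The duration values announced in RTS and CTS frames, and the decision whether to use the exchange at all, can be sketched as follows. The inclusion of the SIFS gaps in the announced durations, the strict comparison against the threshold, and all function and parameter names are assumptions made for illustration; the frame durations in the example are arbitrary.

```python
SIFS_US = 10  # DSSS PHY value

def rts_duration(cts_us, data_us, ack_us):
    """Time announced in the RTS: CTS, data frame and ack plus three gaps (assumed)."""
    return 3 * SIFS_US + cts_us + data_us + ack_us

def cts_duration(data_us, ack_us):
    """Time announced in the CTS: data frame and ack plus two gaps (assumed)."""
    return 2 * SIFS_US + data_us + ack_us

def use_rts(frame_bytes, rts_threshold, group_addressed):
    """RTS/CTS only for unicast frames larger than a configurable size."""
    return not group_addressed and frame_bytes > rts_threshold

print(rts_duration(304, 1000, 304))  # 1638
print(cts_duration(1000, 304))       # 1324
print(use_rts(1500, 500, False))     # True
```

Stations hearing only the RTS and stations hearing only the CTS thus learn the same end time for the exchange.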
Basic Access Mechanism (DCF)
The basic access mechanism is called distributed coordination function (DCF). On top of it the centrally
controlled point coordination function (PCF) is defined, described in the next section. The distributed
coordination function (DCF) is a CSMA/CA mechanism, implemented by every station.
As a prerequisite, the DCF uses a combined carrier sense mechanism. This mechanism combines
carrier sense information delivered by the PHY with a virtual carrier sense mechanism: Every station
is required to maintain a NAV variable (network allocation vector), which conceptually is a timer.
Most frame types (e.g., the RTS and CTS frames, the beacon frames of the point coordination function
(PCF), or data frames) carry in their MAC header a duration field, indicating the time needed for
the actual frames and the remaining frames of the frame exchange sequence. A station successfully
receiving such a frame copies the duration field into its NAV value, if it is greater than the current
NAV value. As long as the NAV variable contains a nonzero value, it is periodically decremented.
The medium is considered busy, if either the PHY indicates a busy condition or if the NAV variable
has a nonzero value.
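The combined carrier sense can be sketched with the NAV stored as an expiry time instead of a periodically decremented counter, which is equivalent for this purpose. The class and method names are illustrative and not taken from the standard.

```python
class VirtualCarrierSense:
    """NAV kept as the absolute time (in µs) until which the medium is reserved."""

    def __init__(self):
        self.nav_expiry_us = 0

    def on_frame(self, now_us, duration_us):
        # Only a reservation that ends later than the current one updates the NAV.
        self.nav_expiry_us = max(self.nav_expiry_us, now_us + duration_us)

    def medium_busy(self, now_us, phy_busy):
        # Busy if the PHY reports a carrier or the NAV has not yet run out.
        return phy_busy or now_us < self.nav_expiry_us

cs = VirtualCarrierSense()
cs.on_frame(now_us=0, duration_us=500)
cs.on_frame(now_us=100, duration_us=200)   # ends earlier, so it is ignored
print(cs.medium_busy(400, phy_busy=False))  # True
print(cs.medium_busy(500, phy_busy=False))  # False
```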
The DCF employs different time durations and inter frame spaces for prioritization of different frame
types:
⁵ In fact, there is an additional distinction between short and long frames (w.r.t. a configurable bound), for each
type different retry counters are maintained.
- The slot time is the length of one contention slot (20 µs for the DSSS PHY).
- The short inter frame space (SIFS) is the maximum time stations are allowed to wait before
  sending acks, CTS frames or during the contention free period in the PCF.⁶ In the DSSS PHY the
  SIFS value is 10 µs.
- The priority inter frame space (PIFS, used in the PCF) and the distributed inter frame space
  (DIFS) are defined as follows:
    PIFS = SIFS + slot time
    DIFS = SIFS + 2 · slot time
- Finally, the extended inter frame space (EIFS) is much larger.
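With the DSSS PHY values given above, the two derived inter frame spaces work out as follows (a small worked example; the constant names are illustrative):

```python
SLOT_TIME_US = 20   # DSSS PHY
SIFS_US = 10        # DSSS PHY

PIFS_US = SIFS_US + SLOT_TIME_US
DIFS_US = SIFS_US + 2 * SLOT_TIME_US

print(PIFS_US, DIFS_US)  # 30 50
```

The ordering SIFS < PIFS < DIFS is what gives acks, then the point coordinator, then ordinary contending stations access to the medium.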
The access procedure works as follows: when a station A wants to initiate transmission of a new
or retransmitted frame, it senses the medium using the combined carrier sense mechanism. If the
medium is sensed idle and remains idle for DIFS time,⁷ A starts to transmit its frame. If the medium
is busy, A defers until the medium becomes idle for at least DIFS (or EIFS, if the last frame was
erroneous⁸). After DIFS, the contention period starts. Station A generates a random backoff value
for an additional deferral and puts this into the backoff timer. However, this is done only, if the backoff
timer has a zero value. The random value is taken uniformly distributed from [0, CW], where CW is
the current contention window value. The CW variable is maintained by a modified truncated binary
exponential backoff algorithm, taking the number of retransmissions of this frame into account. The
minimum contention window is given by CWmin = 32, its maximum by CWmax = 1024. The CW
variable is reset to CWmin after successful frame transmission, or if a frame is discarded after the
maximum number of retries or exceeded lifetime.
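The growth of the contention window and the random draw can be sketched as below. The text does not spell out the details of the "modified" algorithm, so plain doubling per retransmission, truncated at the maximum, is an assumption; the window bounds are used exactly as given in the text, and the function names are illustrative.

```python
import random

CW_MIN, CW_MAX = 32, 1024   # values as given in the text

def contention_window(retransmissions):
    """Assumed rule: double the window per retransmission, truncated at CW_MAX."""
    return min(CW_MIN * 2 ** retransmissions, CW_MAX)

def draw_backoff_slots(retransmissions, rng=random):
    """Uniform draw from [0, CW]; randint includes both end points."""
    return rng.randint(0, contention_window(retransmissions))

print([contention_window(n) for n in range(7)])
# [32, 64, 128, 256, 512, 1024, 1024]
```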
In the contention period, station A performs carrier sensing after every slot. If the medium is sensed
idle, the backoff timer is decremented by a slot time. If the medium is busy, the timer is not decre-
mented and the backoff procedure is suspended. Station A has to wait again, until the medium is
idle for at least DIFS, and then resumes the backoff procedure, however, using the old timer and not
computing a new backoff value. Transmission is started, when the backoff timer reaches zero.
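The freezing of the backoff timer during busy periods can be sketched in slot granularity. This is a simplification: each entry of `slot_idle` stands for one observed slot, and the renewed DIFS wait after a busy period is not modelled. The function name is illustrative.

```python
def slots_until_transmit(backoff_slots, slot_idle):
    """Count observed slots until the backoff timer reaches zero.

    slot_idle is a sequence of booleans, one per slot (True = medium idle).
    Returns None if the timer has not expired within the observed slots.
    """
    remaining = backoff_slots
    elapsed = 0
    for idle in slot_idle:
        if remaining == 0:
            break
        elapsed += 1
        if idle:
            remaining -= 1   # busy slots leave the timer untouched
    return elapsed if remaining == 0 else None

# Backoff of 3 slots, interrupted by two busy slots: transmission after 5 slots.
print(slots_until_transmit(3, [True, False, False, True, True, True]))  # 5
```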
Contention Free Service / Point Coordination Function (PCF)
The PCF provides a contention-free frame transfer to stations, which can help to implement nearly
isochronous services. The PCF adds a special station (point coordinator, PC) and some frame types
to the basic protocol. While the presence of the PC, and thus the contention free service, is optional,
every station must understand the frame types. The PC is always co-located with an AP. Hence, the
PCF is available only in infrastructure mode.
The PCF defines a superframe structure, with variable- but maximum-length superframes. A super-
frame consists of a superframe header, followed by a contention free period (CFP) of variable length,
⁶ In fact, it is also used for the second and subsequent data frames of a fragment burst (segmentation / reassembly
protocol).
⁷ As SIFS < DIFS, by this behavior acknowledgement and CTS frames have priority.
⁸ This way, other stations can ack a frame, which is considered erroneous by A. The view on the channel quality may
be vastly different within a set of stations.
followed by a contention period (CP) of variable length. During the CP all stations, including the
AP/PC operate in DCF mode. The superframe header is formed by a beacon packet. The beacon
is a small packet sent by the PC to all stations. A part of the beacon packet is the maximum time
duration of the CFP. Every station receiving the beacon packet puts this value into its NAV variable.
The PC ends the CFP with a special frame (CF-End), which allows all stations to reset their NAV
and work in the DCF mode. The CP must be long enough for at least one frame exchange, e.g., for
transmitting management frames with association requests, allowing stations to be put on the poll
list.
The PC has a poll list of station addresses. Members of this list are polled during the CFP and can
transmit their data contention free. Vice versa, the PC/AP can transmit frames during the CFP to
its stations (downward frames). For efficiency reasons, it is possible to piggyback acknowledgements
for frames sent by stations and poll-commands to the downward frames. When polled by the PC,
each station A can transmit only one frame (if A’s queue is empty, it answers with an ack frame).
It does so after maximum SIFS time, ignoring combined carrier sense information. This frame is not
necessarily targeted to the AP/PC, as is normally required in infrastructure mode. The station’s
frame can piggyback an ack for a previous downward frame addressed to A. A station is only allowed
to send a data frame, if the frame transmission including ack can be finished before the maximum
CFP length. If there is insufficient time, a station may answer with only an ack to the PC, however,
indicating that it has a non-empty queue.
The polling scheme is not fully specified. Every station which wants to be polled, has to signal
this to the PC/AP. When a station associates with its AP, it sets the CF-Polling-Request bit in the
association request. The PC then includes this station into its poll list. The poll list membership ends
upon disassociation or when the station reassociates itself without setting the CF-Polling-Request
bit. The standard prescribes only the following rules for poll list handling: a) during every CFP at
least one station must be polled; b) the poll list is traversed in the same order as the stations have
associated themselves; c) if there is spare time during a CFP the PC may use it arbitrarily; d) the
PC may decide to poll stations not in the poll list.
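The poll list handling described above can be sketched as a list kept in association order. The class `PollList` and its methods are hypothetical names; keeping a station's position when it reassociates with the bit still set is an assumption.

```python
class PollList:
    """Poll list of a point coordinator, ordered by association time."""

    def __init__(self):
        self._stations = []

    def associate(self, station, cf_polling_request):
        if station in self._stations:
            # Reassociation without the bit ends the membership.
            if not cf_polling_request:
                self._stations.remove(station)
        elif cf_polling_request:
            self._stations.append(station)

    def disassociate(self, station):
        if station in self._stations:
            self._stations.remove(station)

    def polling_order(self):
        return list(self._stations)

pl = PollList()
pl.associate("A", True)
pl.associate("B", False)
pl.associate("C", True)
print(pl.polling_order())  # ['A', 'C']
```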
The PC maintains a local periodic timer. When this timer expires, the PC senses the medium. If it is
idle for at least PIFS, it starts the CFP by sending the beacon frame. If the medium is sensed busy,
the PC defers until the medium is idle for at least PIFS, and then sends the beacon frame. Hence, by
using the PIFS inter frame space (which is smaller than DIFS), the PCF has priority over the DCF
mode. Since the medium need not be idle when the timer expires, the CFP start times show some
jitter, preventing strictly isochronous services. This jitter is denoted as superframe stretching. The
PC transmits beacon packets regularly during the CFP. Each beacon indicates the maximum time to
the end of the CFP.
During the CFP the RTS/CTS mechanism is not used. If a frame transmitted during CFP is not
acked, retransmissions may be performed after the next poll or during the following CP. The PC may
retransmit an unacked downward frame after a PIFS period. In general, the PC uses only SIFS inter
frame spaces during the CFP, except on getting no ack on a downward frame or no response to a poll.
Chapter 4
PROFIBUS and Wireless PROFIBUS
In this chapter we introduce the relevant aspects of the PROFIBUS fieldbus system (Section 4.1).
We give a brief description of its architecture, the different physical layers, the link-layer services, the
MAC- and link-layer protocol, and the most important properties with respect to realtime behav-
ior. The link-layer services (and their interface) are the common denominator of the existing wired
PROFIBUS and the wireless PROFIBUS targeted in this thesis, since one of the most important
design goals of wireless PROFIBUS is to implement this interface to allow easy porting of applica-
tion layer instances. The discussion is restricted to the transmission of user data and leaves out any
management functionality.
In the next step we depict a general framework for wireless industrial communication systems (Section
4.2). It turns out that there are several means for integrating wired and wireless stations into a single
fieldbus LAN, and for a certain class of fieldbuses (including PROFIBUS) a coupling on the MAC-
and link-layer is advocated. The approach is to use the existing MAC- and link-layer protocol only
for the wired stations and to use a specifically tailored protocol on the wireless side.
This framework is then specialized to the case of a wireless PROFIBUS (Section 4.3). This includes
a description of the system under study. Of special importance, however, is the definition of the
realtime performance measures, capturing the realtime- and reliability requirements, which protocols
for wireless PROFIBUS should fulfill as well as possible.
This chapter also includes a discussion about the possibility to find a mapping between the PROFIBUS
link-layer services and the IEEE 802.11 MAC services (and protocol). It turns out that both do not
match well.
Finally, in Section 4.5 we give an overview on the related work on wireless fieldbus systems, wireless
PROFIBUS and realtime transmission over IEEE 802.11.
4.1 The PROFIBUS Fieldbus System
The PROFIBUS has been a German national standard since 1991 ([33], [34], some corrections in [135] and
[133], the English versions being [55], [134]), and was also adopted as a European standard in 1996
[177]. A short technical description can be found in [136]. The PROFIBUS comes in two variants.
The variant this thesis focuses on is a fieldbus system (compare Chapter 2), targeted to the coupling
of intelligent field devices. A simplified variant belongs to the class of sensor/actuator buses. The
PROFIBUS is designed to deliver real-time services in harsh, industrial environments. It has gained
widespread usage, the “PROFIBUS International” user organization (www.profibus.org) states that
there are more than 2 million PROFIBUS devices in more than 200000 installations in factory and
process automation, making PROFIBUS the most popular fieldbus in Europe.
In this section the most relevant characteristics of the PROFIBUS are presented. Other descriptions
of the PROFIBUS can be found in [11, 116, 30]. A performance assessment can be found in [89]. The
real-time properties of the PROFIBUS are investigated in [169], [168, chap. 3, chap. 5].
It should be noted that the standards document and the corrections are written in plain text, not
using any formal description techniques like, e.g., SDL or LOTOS [49], [24], [175].
4.1.1 Architecture
The PROFIBUS targets a wide range of applications in manufacturing and control. In order to avoid
“one size fits all” solutions it offers different profiles, i.e. different sets of protocols and application layer
services. The communication profiles denote different sets of protocols, while the application profiles
distinguish between different sets of application layer services, and the physical profiles distinguish
between different transmission technologies.
The PROFIBUS Decentralized Periphery (PROFIBUS-DP) communications profile (defined in 1993
[35] ) is designed for coupling several simple sensors or actuators (or slaves) to a single controller,
using a cyclic polling scheme with only a single master station. The master station is the only station
controlling the medium. In this profile only the layers one (physical layer) and two (MAC and data
link layer) of the OSI reference model are covered. Application layer entities use the link layer interface
to obtain services from the link layer.
The PROFIBUS-FMS (Fieldbus Message Specification) communications profile is targeted at dis-
tributed control applications, with many intelligent nodes. The role of being a master station is
changed over time between a set of stations. The PROFIBUS-FMS covers the layers one, two and
seven of the OSI reference model (an overview of the communication-related part of the PROFIBUS-
FMS protocol stack is shown in Figure 4.1, see [11, chap. 1.7.3]). The PHY covers physical transmis-
sion of single bits. The Fieldbus Data Link (FDL) layer provides MAC functionality and semi-reliable
transmission of data (link layer functionality). The application layer is subdivided into the application
layer services or Fieldbus Message Specification (FMS), and the lower layer interface, the latter pro-
viding e.g., connection management. The application layer services are similar to the MMS services
(Section 2.3). The part of the PROFIBUS-FMS protocol stack dealing with data exchange is shown
in Figure 4.1 (see [11, chap. 1.7.3]).
For both communication profiles the PROFIBUS is a LAN technology, where a number of stations
share a single physical medium. The lack of routing functionality and proper addressing schemes
prevents the coupling of different PROFIBUS LANs to form larger networks. A coupling of different
PROFIBUS LANs has to be provided at the application layer (gateways).
[Figure 4.1 layout, top to bottom: layer 7b Fieldbus Message Specification (FMS); layer 7a Lower
Layer Interface (LLI); FDL-Interface; layer 2 Fieldbus Data Link (FDL); layer 1 Physical Layer (PHY)]
Figure 4.1: PROFIBUS protocol stack
Because of its generality the focus is exclusively on the PROFIBUS-FMS (or PROFIBUS for short),
thus explicitly covering the case of multiple master stations.
4.1.2 Physical Layer
The PROFIBUS is defined for different physical layers: an RS-485 version, a fiber-optic version and
a special version (IEC 1158-2) for use in explosive environments [136].
The RS-485 version uses serial transmission over a shielded twisted pair (STP) cable. Two cable types
are standardized, however, one of them is not used anymore. The data is encoded with non-return to
zero (NRZ) coding [62]. The transmission is byte-oriented; to support proper bit-synchronization the
MAC layer adds start- and stopbits to a data byte, enforcing a minimum rate of signal level changes.
The basic unit of the physical topology is called segment and has a bus structure, i.e. all stations
attached to a segment see the same signals. The maximum number of stations on a single segment
is 32. Segments can be coupled using repeaters, a maximum of three repeaters is allowed between
any pair of stations. The PROFIBUS addressing scheme restricts the number of stations in a single
PROFIBUS LAN with multiple segments to 127. The set of available bit rates vs. the maximum
distance between any pair of stations is shown in Table 4.1.
Bitrate (kBit/s)    9.6    19.2   93.75   187.5   500   1500   12000
max. Distance (m)   1200   1200   1200    1000    400   200    100
Table 4.1: Relation between bitrate and distance for the RS-485 PHY and cable type A
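The trade-off in Table 4.1 can be expressed as a lookup: for a given distance, which is the highest bitrate that still reaches that far? The sketch below encodes the table values; the function name `fastest_bitrate` is illustrative.

```python
# Table 4.1: bitrate in kBit/s -> maximum distance in m (RS-485, cable type A)
MAX_DISTANCE_M = {
    9.6: 1200, 19.2: 1200, 93.75: 1200, 187.5: 1000,
    500: 400, 1500: 200, 12000: 100,
}

def fastest_bitrate(distance_m):
    """Highest bitrate whose maximum distance covers distance_m, or None."""
    candidates = [rate for rate, reach in MAX_DISTANCE_M.items()
                  if reach >= distance_m]
    return max(candidates) if candidates else None

print(fastest_bitrate(300))   # 500
print(fastest_bitrate(1100))  # 93.75
print(fastest_bitrate(2000))  # None
```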
The IEC 1158-2 version uses a fixed bitrate of 31.25 kBit/s and Manchester coding [62] over STP
cables. This version allows small devices to get their power supply from the cables. Furthermore, its
electrical properties are such that hardware failures do not create sparks. Hence, the cable and field
devices can be used in explosive environments. The cabling allows for bus and tree topologies, again
the basic unit is a single segment. Segments can be coupled with repeaters. Special segment coupler
devices allow the coupling of IEC 1158-2 segments to RS 485 segments, while link devices subsume a
single IEC 1158-2 segment into a single RS 485 station.
The fiber optic version is targeted for use in harsh environments, i.e., where strong sources of
electromagnetic interference are present. Different types of fiber can be used (e.g., monomode, multimode); the physical
topology is either a ring or a star built around a passive star coupler. Fiber optic segments and RS 485 segments can be
coupled with special devices.
4.1.3 Link Layer Services
The link layer offers four service types to the upper layers: three asynchronous service types and
a cyclic one (or polling service). All services except the cyclic service allow the user to distinguish
between two priorities: high priority (i.e., important) data and low priority (less important) data.
The high priority class is devoted to the exchange of time sensitive and safety critical sporadic data
(e.g., alarms), the low priority class is used for everything else. The cyclic services belong to the low
priority class, however, the bandwidth assignment rules are different than for ordinary low priority
data.
While not explicitly specified in the standard, for the asynchronous service types it is reasonable to
assume that within each priority class the service requests are processed in first in first out (FIFO)
order, regardless of the service type. In other words, conceptually there are two request queues, one
for low priority requests and the other for high priority requests. Every request primitive, regardless
of its service type, is sorted into one of these queues according to its priority. For the cyclic service
type a kind of polling-table is maintained.
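The queueing assumption made above can be illustrated with a minimal Python sketch. The class RequestQueues and its method names are hypothetical and not taken from the standard; the sketch only models the two conceptual FIFO queues, one per priority class.

```python
from collections import deque


class RequestQueues:
    """Two FIFO queues, one per priority class (a modelling assumption)."""

    def __init__(self):
        self.queues = {"high": deque(), "low": deque()}

    def submit(self, service, priority, payload=b""):
        # The service type (SDA, SDN, SRD) does not influence the ordering;
        # only the priority selects the queue.
        self.queues[priority].append((service, payload))

    def next_request(self, priority):
        # Oldest pending request of the given class, or None if there is none.
        queue = self.queues[priority]
        return queue.popleft() if queue else None


rq = RequestQueues()
rq.submit("SDA", "low", b"a")
rq.submit("SDN", "high", b"b")
rq.submit("SRD", "low", b"c")
print(rq.next_request("low"))  # the SDA request, submitted before the SRD request
```

Requests of different service types but equal priority leave the queue in submission order, which is exactly the FIFO assumption.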
For the discussion about services the terminology of the OSI reference model is used (see [165, chap.
1]): services are accessed via a service access point (SAP), and typically four service primitives are
involved in the communication between service provider and service user: request, indication, response
and confirmation. As a convention, for a service primitive belonging to a service A we will write
A.primitive, e.g., FDL_DATA_ACK.request. For all services there is a distinction between the roles of the
initiator station and the responder station. Service handling is initiated when an application layer
entity (called FDL user) at the initiator station issues a request primitive. Furthermore, a protocol
data unit (PDU) is the means of communications between peer entities.
Service               SDA        SDN        SRD        CSRD
Immediate Ack         yes        no         yes        yes
Connection Oriented   no         no         no         local connection setup
Data in Ack           no         -          yes        yes
Occurrence            sporadic   sporadic   sporadic   cyclic
Table 4.2: PROFIBUS FDL-Services
Address information is a key attribute of several service primitives. It consists
of an address byte and an optional service access point (SAP) [33, chap. 4.8.2]. PROFIBUS supports
three types of addresses: a unicast address has an address byte with a value between 0 and 126
(therefore, the number of stations in a PROFIBUS LAN is restricted), and optionally a SAP can be
used. For multicast addresses and broadcast addresses the address byte has a value of 127 and the
group address / multicast address is selected via the SAP value. In fact, the address scheme can be
more sophisticated (address extensions, segment addresses [33, chap. 4.8.2]), but this is not relevant
to this work.
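As a small illustration of this addressing scheme, the following Python sketch classifies an address. The function name classify_address and its return format are our own illustrative choices; address extensions and segment addresses are deliberately ignored.

```python
GROUP_ADDRESS_BYTE = 127  # address byte used for multicast and broadcast


def classify_address(address_byte, sap=None):
    """Return ("unicast", sap) or ("group", sap) for a basic PROFIBUS address."""
    if not 0 <= address_byte <= GROUP_ADDRESS_BYTE:
        raise ValueError("address byte must be in the range 0..127")
    if address_byte == GROUP_ADDRESS_BYTE:
        # The group (multicast/broadcast) is selected via the SAP value.
        return ("group", sap)
    # Values 0..126 denote a single station; the SAP is optional.
    return ("unicast", sap)


print(classify_address(5))
print(classify_address(127, sap=3))
```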
For every request primitive exactly one confirmation primitive is created, which contains information
about the success of the request. By the FIFO assumption discussed above, within one priority class
the confirm primitives occur in the same order as the corresponding request primitives. The standard
prescribes that once packet transmission for a request has started, it cannot be interrupted by other
requests. Stated differently: the PROFIBUS does not allow for preemption in handling requests. This
property is called the atomicity property.
The PROFIBUS uses no response primitives. All services are designed such that the FDL instance of a
responder station can generate answer frames immediately, without interacting with higher layer
instances.
The PROFIBUS services are explained in the following sections; a short summary of the services can
be found in Table 4.2.
SDA: Send Data with Acknowledge
The send data with acknowledge (SDA) service is basically a semi-reliable acknowledged datagram
service. A sketch of the interactions is shown in Figure 4.2.
The FDL user at the initiator station starts this service. It prepares a data block of up to 246
bytes, chooses a priority, and provides the destination address byte and destination service access
point (DSAP) of the target station as well as its own SAP as source service access point (SSAP). All this
information is passed with the FDL_DATA_ACK.request service primitive to the FDL instance via the
local SAP. The destination address is required to be a unicast address.
Some time later the local FDL instance attempts to send this data block within a single frame (called
telegram in the PROFIBUS standard; we use both terms interchangeably) to the responder station. The
responder has to acknowledge receipt of the frame using an immediate MAC-layer acknowledgement. The
acknowledgement carries no data. If the initiator receives no ack within a pre-specified time (called
slot time, $T_{SL}$), it repeats the frame. The overall number of trials
is upper-bounded. When the result of the transmission trial is known, i.e., an acknowledgement is
received or the maximum number of trials is exhausted, the local FDL user is informed about this
result with the FDL_DATA_ACK.confirm primitive. Additionally, this primitive contains the address
information and priority level passed with the corresponding FDL_DATA_ACK.request primitive.
[Figure 4.2: Interactions of SDA service. Message sequence: FDL_DATA_ACK.request (data, ...) at the initiator; L_PDU (data, ...) from initiator to responder; FDL_DATA_ACK.indication (data, ...) at the responder; L_PDU () from responder to initiator; FDL_DATA_ACK.confirm (status, ...) at the initiator.]
When the FDL instance at the initiator has started processing an FDL_DATA_ACK.request primitive X,
it is required to obtain the result as fast as possible, i.e., it is not allowed to process other requests Y or
to pass the token (see Section 4.1.4) while processing X.
At the responder station, the first reception of a data frame generates an FDL_DATA_ACK.indication
primitive, with the data passed as a parameter. If the responder identifies a frame as retransmitted
(by the alternating bit protocol, see Section 4.1.4), the frame is acknowledged and silently discarded. The
FDL layer ensures that for every FDL_DATA_ACK.request primitive at most one FDL_DATA_ACK.indication
primitive and exactly one FDL_DATA_ACK.confirm primitive is generated.
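The initiator side of the SDA service can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions: sda_request and the send_frame callback are hypothetical names, and send_frame is assumed to transmit the frame once and to report whether an acknowledgement arrived within the slot time.

```python
def sda_request(send_frame, max_trials):
    """Try a frame up to max_trials times; return exactly one confirmation.

    send_frame() transmits the frame once and returns True if an immediate
    acknowledgement was received within the slot time, False otherwise.
    """
    for trial in range(1, max_trials + 1):
        if send_frame():
            return ("ok", trial)       # acknowledged on this trial
    return ("no_ack", max_trials)      # all trials exhausted


# A scripted channel that loses the first two trials.
outcomes = iter([False, False, True])
print(sda_request(lambda: next(outcomes), max_trials=4))
```

The function returns once per request, mirroring the rule that exactly one confirmation is generated, and it does nothing else between trials, mirroring the atomicity property.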
SDN: Send Data with No Acknowledge
The send data with no acknowledge (SDN) service is basically an unacknowledged datagram service,
applicable to unicast- and multicast-/broadcast-addresses. A sketch of the interactions is shown in
Figure 4.3.
The FDL user at the initiator passes an FDL_DATA.request primitive to the FDL instance. This primitive
carries the same parameters as the FDL_DATA_ACK.request primitive described in the previous section.
The destination address may be a unicast, a multicast or broadcast address. At a later time the MAC
instance transmits one data frame with the data and then produces an FDL_DATA.confirm primitive for
the local FDL user. Every other station which is addressed and which successfully receives the frame
creates an FDL_DATA.indication primitive and delivers the data to its FDL user, without confirming
proper reception. No station is allowed to send an acknowledgement.
[Figure 4.3: Interactions of SDN service. Message sequence: FDL_DATA.request (data, ...) at the initiator; L_PDU (data, ...) from initiator to responder; FDL_DATA.confirm (status, ...) at the initiator; FDL_DATA.indication (data, ...) at the responder.]
SRD: Send and Reply With Data
The send and receive data (SRD) service is basically the same as the SDA service; however, the
acknowledgement sent by the responder can carry data. A sketch of the interactions is shown in
Figure 4.4.
[Figure 4.4: Interactions of SRD service. Message sequence: FDL_REPLY_UPDATE.request (data’) and FDL_REPLY_UPDATE.confirm at the responder; FDL_DATA_REPLY.request (opt data, ...) at the initiator; L_PDU (opt data, ...) from initiator to responder; FDL_DATA_REPLY.indication (opt data, ...) at the responder; L_PDU (data’) from responder to initiator; FDL_DATA_REPLY.confirm (status, data’, ...) at the initiator.]
The FDL user at the responder station can place a piece of data (up to 246 bytes) along with an
associated SAP and a priority into an internal buffer of the FDL instance. This is done using the
FDL_REPLY_UPDATE.request primitive. After filling the buffer the FDL instance generates the corresponding
FDL_REPLY_UPDATE.confirm primitive, indicating the status of the operation. An additional
parameter (reuse information) of the request primitive indicates whether the piece of data is used
for answering a single request or multiple requests. The FDL user at the responder station can write
data into this buffer at arbitrary times.
The remaining service proceeds in much the same way as the SDA service (using the FDL_DATA_REPLY.request,
FDL_DATA_REPLY.confirm, and FDL_DATA_REPLY.indication primitives). However, when the responder generates
its acknowledgement, it checks whether there exists a buffer for the requested SAP, and, if so,
writes the contents of this buffer into the immediate ack frame. This data is passed to the FDL user
at the initiator station with the FDL_DATA_REPLY.confirm primitive. If the buffer contents should be
transmitted only for a single request, the buffer is deallocated. Necessary frame repetitions, e.g., due
to a corrupted ack frame, are handled by buffering the ack frame as long as necessary (Section 4.1.4).
If the acknowledgement frame is erroneous, the initiator station retransmits its request frame. The
responder then retransmits the last frame without reconstructing it. If the FDL user at the responder
station uses the FDL_REPLY_UPDATE.request primitive between the first and second ack frame, it has
no effect on the retransmission.
CSRD: Cyclic Send and Reply With Data
The cyclic send and receive data (CSRD) service is the most complicated service of PROFIBUS, and
is not discussed in full detail here. The realtime performance measures defined in Section 4.3.1 focus
entirely on high priority messages, while the CSRD service is a priori of low priority.
At the beginning of this service the local FDL user gives a list with unicast station addresses to
the local FDL instance (poll list). The local FDL instance maintains for every station in this list a
marker and an optional buffer. The local FDL user can put a block of data (up to 246 bytes) into
this buffer, along with SAP, priority and reuse information, i.e., the buffer contents can be sent only
once or multiple times. On the responder station the FDL instance can also allocate buffers, which
are handled just as for the SRD service.
The stations in the poll list are polled in a round robin fashion. To every station in the poll list a request
frame is sent, which may carry some data, if the associated buffer exists and is non-empty. If the buffer
contents should be transmitted only once, the buffer is deallocated. The responder station is required
to send an immediate acknowledgement, which also can carry the contents of a corresponding buffer.
For every request the initiator station performs a bounded number of retransmissions. If all attempts
fail, the responder station is marked as “dead” by modifying the marker variable accordingly. At some
later time, the dead station is pinged again by the local FDL instance, however, no retransmissions are
performed. If the dead station responds, it is marked as “alive” and the local FDL instance proceeds
with its normal operation.
4.1.4 MAC- and Data Link Protocol
On the MAC layer the PROFIBUS combines two principles: master/slave communication for data
exchange and token passing for managing the right to initiate packet transmissions. Furthermore,
two types of stations are distinguished: active stations can participate in the token passing process,
while passive stations cannot. An active station which currently owns the token is called a master
station; all other stations have the role of slave stations. Only the master station is allowed to
initiate a data transfer; a slave station may only transmit data if it has to send an immediate
acknowledgement to a frame directed to itself. Hence, the FDL_DATA_ACK.request, FDL_DATA.request
and FDL_DATA_REPLY.request service primitives can only be issued on active stations.
Token Passing and Ring Maintenance
The PROFIBUS token passing protocol is similar to the IEEE 802.4 Token Bus protocol [71] and
uses a broadcast medium. A logical ring is formed by ascending station addresses. The address space
is small, a station address is in the range of 0 to 126. Every station (denoted as TS: This Station)
knows by the ring maintenance mechanism explained below the address of its logical successor (NS:
Next Station) and its logical predecessor (PS: Previous Station). If TS receives a valid token frame
with TS as destination address, it checks whether it was sent by its PS. If so, the token is accepted,
otherwise the frame is discarded. In the latter case, if the same token frame is received again as the
very next frame, the token is accepted and the token sender is registered as new PS and the list of
active stations (LAS) is updated, see below. In any case, after accepting the token TS determines its
token holding time THT (according to a simplified variant of the timed token protocol with target
token rotation time $T_{TTRT}$) and is allowed to send some data during the THT. If there is no more
data or THT expires, TS is required to pass the token to NS by sending a token frame. This must
be done even if TS is the only ring member (NS = TS = PS), and TS must accept the token in the
same way as if PS ≠ TS. After sending a token frame, TS listens on the medium for some activity.
This can be the reception of a valid frame header (indicating that NS has accepted the token) or
reception of some erroneous transmission. However, TS listens on the medium only for the slot time
$T_{SL}$, which is typically chosen very tight, e.g., in the range of 100 µsec to 400 µsec (see footnote 2).
If this time passes
without any medium activity the token frame is repeated (clearly, active stations are required to react
fast enough on token frames, otherwise collisions occur). If there is again no activity, and a third
trial is also unsuccessful, NS is assumed to be dead and TS determines the next station in the ring
(i.e. the successor of NS), makes this the new NS and tries to pass the token to it, following the same
rules. The new station can be determined from the LAS, which is updated by the ring maintenance
mechanism, as explained below. If TS finds no other station, it sends a token frame to itself.
A special protocol rule is the following: TS must read back from the medium bit by bit all token
frames it transmits (hearback), in order to detect a defective transceiver and to resolve collisions (see
below). If TS encounters a difference the first time, it waits for some response (which indeed may
occur due to undetected errors in the token frame, see below). If there is no activity on the medium
it repeats the token frame. If TS again encounters a difference, it discards the token immediately and
removes itself from the ring, behaving as newly switched on and “forgetting” all knowledge previously
obtained. The rationale for this is the assumption that the transceiver is faulty and its results are
not trustworthy.
The ring maintenance mechanism works by two different means. First, if a station is newly switched
on, it is required to listen passively on the medium, until it has received two successive identical token
cycles and thus has a valid view on the whole logical ring (referred to as listen token state). During
this time it is not allowed to send or answer to data frames or to accept the token. Every station
address found in a token frame belonging to these two cycles is included into the LAS. After building
a valid view the station can enter the ring if another station passes the token to it. The second rule
requires every station to inspect every correctly received token frame and to include the source and
destination address into the LAS. An important rule here is the following: if TS considers itself
already included in the logical ring and reads a token frame in which TS is “skipped” (i.e., the address
of TS lies strictly within the address range spanned by sender and receiver of the token frame), it
removes itself from the ring and behaves as newly switched on.
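The “skipped” test can be written down compactly. The following Python sketch is our own illustration (the function name is_skipped is hypothetical); it assumes that ring order follows ascending addresses with wrap-around from the highest to the lowest address, and that a token frame with identical sender and receiver spans all other addresses.

```python
def is_skipped(ts, sender, receiver):
    """True if address ts lies strictly between sender and receiver in ring order."""
    if sender < receiver:
        return sender < ts < receiver
    # Wrap-around case (sender > receiver), or a single-member ring
    # (sender == receiver), which spans every other address.
    return ts > sender or ts < receiver


print(is_skipped(5, sender=3, receiver=8))     # 5 lies between 3 and 8
print(is_skipped(50, sender=100, receiver=10))  # 50 is outside the wrapped range
```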
2 These numbers are taken from a database with configuration data found at www.profibus.org.
So that another station can pass the token to a newly switched on station, every station a
maintains a gap list (GAPL), containing all possible station addresses between a and its NS b. A
station a is required to periodically poll all addresses in its GAPL by sending a Request-FDL-Status
frame to a single address c and waiting one slot time $T_{SL}$ for an answer, which indicates c’s current
status (ready / not ready for the ring). A station which tries to detect two identical token cycles will
respond with a “not ready” status. Within every token cycle a polls at most one station address in
its GAPL. If a station in the GAPL responds as “ready”, a will change its NS, shorten its GAPL,
update its LAS, and then send a token frame to the new station. The period for scanning the GAPL
is determined by a special timer (gap timer), which is set to an integral multiple (gap factor; the
standard requires values between 1 and 100) of the target token rotation time $T_{TTRT}$.
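The contents of a gap list and the gap timer period can be sketched as follows. This is a simplified Python illustration under our own assumptions: the names gap_list and gap_timer_period are hypothetical, and the address space is taken to be 0 to 126 with wrap-around.

```python
MAX_ADDRESS = 126  # highest unicast station address


def gap_list(a, ns):
    """Addresses strictly between station a and its successor ns, in ring order."""
    size = MAX_ADDRESS + 1
    gap = []
    addr = (a + 1) % size
    while addr != ns:
        gap.append(addr)
        addr = (addr + 1) % size
    return gap


def gap_timer_period(gap_factor, t_ttrt):
    """Gap timer value: an integral multiple of the target token rotation time."""
    if not 1 <= gap_factor <= 100:
        raise ValueError("gap factor must be between 1 and 100")
    return gap_factor * t_ttrt


print(gap_list(3, 7))
print(gap_list(125, 2))
```

For a station that is the only ring member (a equal to its own NS) the sketch yields all other 126 addresses, so every possible newcomer is eventually polled.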
For leaving the ring it suffices to just stop all transmissions. In this case PS will detect the station
loss when unsuccessfully trying to pass the token to TS.
A special mechanism is used for the very first ring initialization or to handle token loss due to system
crash of the current token owner: every station listens permanently on the medium. Every time the
medium goes idle, TS starts a special timer, the timeout timer, with the timeout value $T_{TO}$. The timer
is reset each time the medium goes busy. If the timer expires (no transmission on the medium
for some time), TS “claims the token”, i.e., it starts behaving as the current token owner and
performs some frame transmission: it sends data frames or passes the token to its current NS. If TS
was not in the listen token state when the timeout timer expired, there is no change in its internal
state, specifically in its LAS, NS and PS. In the other case, since the station does not yet have a valid
view of the ring, it assumes the ring to be empty and itself to be the only member of the LAS.
The timeout value depends linearly on the station’s address n:
$$T_{TO}(n) = (6 + 2 \cdot n) \cdot T_{SL}$$
This can lead to collisions, and the hearback feature is necessary to resolve them. One situation where
collisions can occur is the following: consider that in an empty ring two stations are newly switched
on at different times, such that their timeout timers expire simultaneously. When both stations start
transmitting token frames, the resulting collision induces hearback errors. Both stations retire from
the ring and stop transmissions, while simultaneously starting their timeout timers. Because of the
different station addresses the timers expire at different times, and now a valid ring can be built up
without further collisions.
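A short numerical illustration of the address-dependent timeout, written in Python. The slot time of 200 µsec and the station addresses 3 and 7 are example values chosen by us; the function name timeout_value is hypothetical.

```python
def timeout_value(n, t_sl):
    """Timeout T_TO(n) = (6 + 2 * n) * T_SL for station address n."""
    return (6 + 2 * n) * t_sl


T_SL = 200  # slot time in microseconds (example value)
for address in (3, 7):
    print(address, timeout_value(address, T_SL))
```

With these values station 3 times out after 2400 µsec and station 7 after 4000 µsec, so after a collision the station with the lower address claims the token first and the other station hears its transmission before its own timer expires.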
Bandwidth Allocation
After an active station receives the token, it computes its token holding time (THT) by subtracting the
real token rotation time $T_{RR}$ (measured time between two token arrivals) from the configured target
token rotation time $T_{TTRT}$:
$$T_{TH} = T_{TTRT} - T_{RR}$$
The real token rotation time is measured continuously as the time between successive token arrivals.
A timer (token holding timer) is started with the computed $T_{TH}$.
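As a worked example of this computation, consider the following Python sketch. The function name token_holding_time and the numerical values (in microseconds) are our own illustrative assumptions.

```python
def token_holding_time(t_ttrt, t_rr):
    """T_TH = T_TTRT - T_RR; a value <= 0 means the token arrived late."""
    return t_ttrt - t_rr


# Early token: 10 ms target, 7 ms measured rotation -> 3 ms may be used.
print(token_holding_time(10_000, 7_000))
# Late token: the timer is already expired when the token arrives.
print(token_holding_time(10_000, 12_500))
```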
After receiving the token a master station is allowed to handle one high priority frame including necessary
retransmissions, regardless of the value of THT (the handling of a frame including retransmissions
is denoted as cycle). Hence, this feature can only be exploited in the SDA, SDN and SRD services,
since the CSRD service is performed with low priority.
If further high priority messages are available and the THT has not expired, the station continues with
high priority cycles. Except for the first cycle, the token holding timer must not have expired at the
start of a new cycle of low or high priority. If the timer expires during a cycle, the
station is allowed to finish the cycle, including all retransmissions. Then the station must pass the
token to its NS.
The station must first perform all available high priority cycles, until the timer expires or the high
priority queue empties (see footnote 3). In the latter case the station may proceed with low priority or
cyclic frames, if the THT has not expired. Once started, the low priority service is nonpreemptive,
i.e., newly arriving high priority requests are handled upon the next token arrival. It runs until there
are no frames to transmit or until the token holding timer expires.
The low priority service starts with traversing the poll list of the CSRD service. If all stations in
this list are polled and the token holding timer is not expired, the station proceeds with handling
asynchronous low priority cycles (SDA, SDN and SRD services) until timer expiration. If the poll list
can be traversed within one token cycle but there is no remaining time for low priority data, the low
priority queue is handled upon the next token arrival where after processing high priority requests
there is still time for low priority cycles. After this the poll list is traversed again.
If traversing the poll list takes more than the THT would permit, the remaining list is handled in the
subsequent token cycles, without interruption by asynchronous low priority frames. The latter are
handled if the poll list is fully traversed.
Some special frames for ring maintenance (Request-FDL-Status) are treated as asynchronous low
priority frames.
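The service order during a single token visit can be summarized in a simplified Python sketch. It is an illustration under our own assumptions, not the procedure of the standard: token_visit is a hypothetical name, every cycle is represented only by its duration, queues are static during the visit, and re-arming the poll list after a full traversal is left to the caller.

```python
def token_visit(tht, high, poll, low):
    """Serve one token visit and return the served cycles in order.

    tht:  token holding time at token arrival (may be <= 0 for a late token)
    high, poll, low: lists of cycle durations; served cycles are removed.
    """
    served = []
    elapsed = 0

    def run(kind, queue):
        nonlocal elapsed
        duration = queue.pop(0)
        served.append((kind, duration))
        elapsed += duration  # a started cycle always runs to completion

    # One high priority cycle is allowed regardless of the THT.
    if high:
        run("high", high)
    # Further high priority cycles only while the timer has not expired.
    while high and elapsed < tht:
        run("high", high)
    # Low priority phase: poll list first, then asynchronous low priority.
    for kind, queue in (("poll", poll), ("low", low)):
        while queue and elapsed < tht:
            run(kind, queue)
    return served


print(token_visit(10, [4, 4, 4], [3], [2]))
```

In the example the three high priority cycles use up the holding time, so the poll list and the low priority queue are left for a later token visit.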
The protocol described so far is a variant of the well-known timed token protocol which is also used
in the FDDI and IEEE 802.4 standards. It is known to be capable of transmitting multimedia data
[117], [3].
Data Link Protocol
The PROFIBUS data link layer provides a semi-reliable service for the SDA and SRD services, with a
bounded number of retransmissions, given by the global max_retry parameter. The necessary feedback
is provided by immediate MAC layer acknowledgements, which for the SRD service may also carry
some data (see footnote 4). The protocol requires all trials to be performed consecutively; no other
frame transmission or token passing is allowed in the meantime. Once the result of a frame transmission
is known (successful reception of an ack or max_retry trials without receiving an ack), the upper layers
are notified with a confirmation primitive. In order to allow the receiver to distinguish between new and
retransmitted frames, a variant of the well-known alternating bit protocol is used [10].
The FDL instance of the initiator keeps for every possible target station a state variable (distinguishing
between “dead” and “alive”, [33, chap. 4.1]) and a frame count bit (FCB) [33, chap. 4.8.3]. The state
variable is changed according to the following rule: if the target station is “alive” and does not answer
on any trial belonging to a cycle (i.e., it does not ack the request frame and all of its max_retry
transmissions), it is marked as dead. If a cycle is targeted to a dead station, no retransmissions are
performed within the CSRD service. If a station marked as dead answers again, it is marked as alive.
3 The standards document [33] does not state explicitly whether the service is exhaustive (all requests are processed)
or gated. In this work exhaustive service is assumed.
4 The acknowledgement frame in the CSRD service may also carry data, but uses a different retransmission scheme.
The FDL instance of the responder keeps for every possible station address a FCB. Furthermore, it
maintains a global buffer for the last acknowledgement frame (acknowledgement buffer).
The FCB allows the responder station to distinguish between new frames and retransmitted frames.
It is transmitted as part of the frame header (the frame formats are described in the next section),
along with a frame count valid bit (FCV), which indicates, whether the FCB is valid. When the
initiator sends a frame to a responder the first time, or if the responder wakes up after it was dead,
both stations have to synchronize on a FCB value. For doing this the initiator sends frames with FCV
= 0 and FCB = 1 to the responder, however, such frames are not retransmitted. After the responder
has answered, the initiator sends all further frames with FCV = 1 and inverts the FCB after every
finished cycle.
If the responder receives a frame with FCV = 1 and FCB = x, it checks whether this FCB is
different from that stored in its table. If so, the cycle is considered as successfully finished, the FCB
is stored in the table, a new acknowledgement frame is generated, stored in the global buffer and
transmitted to the initiator, and an indication primitive is generated for the responder’s FDL user.
If the responder receives a frame from a station with address b after receiving a frame with FCV =
1 from station a, it considers the last cycle of a as finished. If the responder receives two successive
frames from the same station, both with FCV = 1 and the same FCB, it concludes that the last frame
is a retransmission. In this case only the contents of the acknowledgement buffer are retransmitted; no
further action is taken.
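The responder-side duplicate detection can be sketched in Python as follows. This is a simplified model under our own assumptions: the class Responder, its attribute names and the tuple used as acknowledgement frame are hypothetical, and frames with FCV = 0 are simply treated as new frames whose FCB is stored.

```python
class Responder:
    """Simplified responder-side handling of FCB/FCV (illustrative only)."""

    def __init__(self):
        self.fcb_table = {}      # last FCB stored per initiator address
        self.last_source = None  # source address of the previous frame
        self.ack_buffer = None   # global buffer for the last ack frame
        self.indications = []    # data delivered to the FDL user

    def receive(self, source, fcv, fcb, data):
        is_repeat = (fcv == 1
                     and source == self.last_source
                     and self.fcb_table.get(source) == fcb)
        self.last_source = source
        if is_repeat:
            # Retransmission: resend the buffered ack, no new indication.
            return self.ack_buffer
        # New frame: store the FCB, build a new ack, deliver the data.
        self.fcb_table[source] = fcb
        self.ack_buffer = ("ack", source, fcb)
        self.indications.append((source, data))
        return self.ack_buffer


r = Responder()
r.receive(1, fcv=1, fcb=0, data=b"x")
r.receive(1, fcv=1, fcb=0, data=b"x")  # repeated frame, e.g. after a lost ack
r.receive(1, fcv=1, fcb=1, data=b"y")
print(r.indications)
```

Although three frames arrive, only two indications are generated, in line with the at-most-once delivery of the FDL.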
Frame Formats
The PROFIBUS supports five different types of frames, shown in Figure 4.5. With the exception of
the one byte long short acknowledgement (SC) frame, all frame types start with a start delimiter
(each frame type with a different one), and carry at least a destination address (DA) byte and a
source address (SA) byte. The frame control (FC) byte has different meanings when transmitted from
initiator to responder and vice versa. When sent by the initiator, the FC byte carries control information:
FCB and FCV, and a service type tag (SDA, SDN, SRD, management frame). In acknowledgement frames
it carries the answer code (positive or negative acknowledgement, error codes for indicating
malformed request frames, lack of memory etc.) [33, chap. 4.8.3]. The frame check sequence (FCS)
byte is a simple checksum, present in the frame with no data, the frame with fixed length data
and the frame with variable length data. The FCS byte is the sum modulo 256 of all bytes located
between the DA-field (inclusive) and FCS-field (exclusive) of the frame. The short acknowledgement
and the token frame have no checksum at all.
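The checksum rule is easy to state in code. The following Python sketch is our own illustration; the function names frame_check_sequence and fcs_is_valid are hypothetical, and the byte values in the example are arbitrary.

```python
def frame_check_sequence(covered):
    """Sum modulo 256 of all bytes from the DA field up to (not including) FCS."""
    return sum(covered) % 256


def fcs_is_valid(covered, fcs):
    """Check a received FCS byte against the covered bytes."""
    return frame_check_sequence(covered) == fcs


# DA = 0x02, SA = 0x01, FC = 0x6D (arbitrary example values, no data unit)
covered = bytes([0x02, 0x01, 0x6D])
print(hex(frame_check_sequence(covered)))
```

Since the FCS is a plain byte sum, errors that compensate each other (e.g., one byte increased and another decreased by the same amount) remain undetected.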
The RS 485 version of PROFIBUS uses serial transmission with NRZ coding (see Section 4.1.2). Every
byte of data is transmitted with eleven bits: one start bit, one stop bit, eight data bits and a parity bit
(see Figure 4.5).
Token Frame:                        SD DA SA
Telegram with No Data:              SD DA SA FC FCS ED
Telegram with Fixed Size Data:      SD DA SA FC DATA_UNIT FCS ED
Telegram with Variable Size Data:   SD LE LEr SD DA SA FC DATA_UNIT FCS ED
Short Acknowledgement:              SC
Transmission of a single byte:      S b0 b1 b2 b3 b4 b5 b6 b7 P S
SD = Start Delimiter (different for every frame type)
DA = Destination Address
SA = Source Address
FC = Frame Control
FCS = Frame Check Sequence
ED = End Delimiter
LE = Frame Length
LEr = Frame Length (Repeated)
S = Start-/Stopbit
P = Parity Bit
bx = data bit
Figure 4.5: PROFIBUS frame formats
4.1.5 Important Properties of the PROFIBUS
We briefly summarize the important features of the PROFIBUS with respect to realtime and reliability
behavior. In the absence of transmission errors the protocol can give the following guarantees:
- For high priority messages, the handling of at least one request within a bounded
  time can be guaranteed for every station. An upper bound to this time is given by the target token
  rotation time $T_{TR}$ plus N times the maximum duration of a high priority message exchange (including
  the bounded number of retransmissions), where N is the number of active stations in the ring [33,
  p.26]. This is due to the rule that every station may handle at least one high priority frame
  exchange per token arrival. Stated differently: for every high priority request primitive the time
  until the occurrence of the next high priority confirmation primitive is upper bounded, regardless
  of whether the primitives belong together.
- The time needed for ring initialization after newly switching on the first stations is upper
  bounded. The reason is that for the expiration of the timeout timer two cases occur: it expires
  only in a single station, or it expires in two or more stations simultaneously. In the former
  case the single station starts to transmit, and all other stations reset their timeout timers. The
  first station builds a valid ring with only itself being member. Upon transmitting the first token
  frame with source and destination address being its own address, all other stations get the same
  view on the ring and wait patiently for being included. In the latter case the collision can be
  resolved as described in Section 4.1.4.
- The timed-token protocol gives long-term fair distribution of bandwidth between stations [76,
  chap. 23]. The bandwidth splitting between high priority and low priority data is performed in
  a purely local manner in every station.
- The FDL offers a semireliable service: for every request exactly one confirmation primitive is
  generated, indicating the success of the request. Furthermore, for every request at most one
  indication is generated. Within one priority class the confirms have the same order as the
  requests; the indications are a subset with preserved order.
It should be noted that most of the bounds, e.g., the $T_{TR}$ value and the ring initialization time,
depend on the number of stations in the ring. However, since this number is bounded by
the limited address range, an absolute upper bound can be given.
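The bound for high priority requests stated above can be evaluated numerically. The following Python sketch uses our own example values (all times in microseconds) and a hypothetical function name; the maximum cycle duration is treated as a given parameter that already includes all retransmissions.

```python
MAX_STATIONS = 127  # limit imposed by the address range 0..126


def high_priority_bound(t_tr, n_active, max_cycle):
    """Upper bound T_TR + N * (maximum duration of a high priority cycle)."""
    return t_tr + n_active * max_cycle


# Example: 10 ms target rotation time, cycles of at most 500 microseconds.
print(high_priority_bound(10_000, n_active=4, max_cycle=500))
# Absolute bound for the same parameters, independent of the actual ring size.
print(high_priority_bound(10_000, n_active=MAX_STATIONS, max_cycle=500))
```

For four active stations the bound is 12 ms; with the maximum of 127 stations it grows to 73.5 ms for the same parameters.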
4.2 Wireless Industrial Communication Systems
The creation of a wireless Industrial Communication System (ICS), and specifically of wireless fieldbus
systems, is nowadays “in the air”, due to the ever increasing acceptance of WLAN technology and
its attractive features, like mobility and reduced cabling needs. Of specific interest is the idea of a
wireless fieldbus extension, i.e., the possibility to integrate wireless stations into already existing,
wired fieldbus systems to form a single LAN.
For wireless fieldbus extensions we propose a classification, which reflects the architectural character-
istics of fieldbus systems just covering layers 1, 2 and 7 of the OSI reference model (see Sections 2.2
and 2.3):
- Wireless repeater scenario: all stations are attached to a cable and have a wired transceiver, and
  just a piece of cable is replaced by a wireless link. In this scenario no station has to be aware
  of the wireless link. From now on we use the term wireless station to denote a station with a
  wireless transceiver.
- Wireless bridging scenario: Integration solely on the physical layer. Wired stations are attached
  to a cable, wireless stations have a radio frontend, and a bridge-like device translates the different
  framing rules used on wired and wireless media into each other. The MAC- and link-layer
  protocol and all application layer protocols are the same for both wired and wireless stations.
- Integrated scenario: Integration at the MAC- and data link-layer, with two different MAC- and
  link-layer protocol stacks on both sides, but with fixed link-layer interface.
- Wireless gateway scenario: Integration at the application layer.
- Some mixture of these scenarios.
In this thesis we focus on the integrated scenario, since, for a wireless PROFIBUS it appears to be an
attractive choice:
- Both the wireless repeater and the wireless bridging scenario require the transport of PROFIBUS
  MAC frames over a wireless medium. However, Chapter 5 shows that the error behavior
  of a wireless link can seriously harm the realtime capabilities of PROFIBUS.
- In the wireless gateway scenario a tight coupling of the timing behavior between wired and
  wireless stations is hard to achieve.
[Figure 4.6: Integrated PROFIBUS LAN. Labels: Logical Ring, virtual ring extension, Cable. Legend: CAS = Cabled Active Station, CPS = Cabled Passive Station, WAS = Wireless Active Station, WPS = Wireless Passive Station, BS-IWU = Base Station / Inter-Working Unit.]
4.2.1 Integrated Scenario: General Considerations
There is a certain class of fieldbus systems employing a decentralized MAC protocol with explicit
token passing, used to form a logical ring. Besides the PROFIBUS also the IEEE 802.4 Token Bus
and the multi-master mode of BITBUS belong to this class (see Section 2.3). The basic situation for
this class of protocols is sketched in Figure 4.6 for the specific case of PROFIBUS. The distinction
between active and passive stations refers to the ability to take part in the token passing process:
active stations can participate, passive stations cannot.
In the integrated scenario it is assumed that on the wired part the given protocol is used, while on
the wireless part a specifically tailored MAC- and link-layer protocol is used. The need for protocols
specifically tailored to the wireless medium can be easily justified by the properties of the wireless
medium, as discussed in Section 3.1 and Chapter 6. While the MAC- and link-layer protocol on the
wireless side will be different from that of the wired side, the link-layer interface should be kept fixed.
This requirement allows easy porting of application layer software.
The approach of using the unchanged protocol on the wired side and using another protocol on the
wireless side has important consequences, specifically for the role that the coupling element (denoted
as base station / interworking-unit (BS-IWU)) has to play. In general, it has at least the following
tasks:
Give the wired stations a consistent view on the logical ring.
Give the wireless stations a consistent view on the logical ring.
Synchronize both MAC protocols.
Translate possibly different link-layer protocols.
Forward frames from the wired to the wireless side and vice versa.
Keep track of the wireless stations belonging to “its” LAN.
Optionally perform proxy operations for certain types of wireless terminals (see footnote 5).
4.3 Integrated Scenario for Wireless PROFIBUS
This thesis proposes to use a polling-based protocol on the wireless side, for the sake of realtime-
performance, as defined in Section 4.3.1. This class of protocols assumes a central controller, granting
access to the wireless medium. Since this centralized protocol has to be synchronized with the dis-
tributed PROFIBUS token passing MAC protocol, and since furthermore the BS-IWU is the only
instance capable of knowing the state of both sides (compare Figure 4.6), it is natural to put the
central controller functionality into the BS-IWU.
With this approach, the BS-IWU is a fairly complex device performing several functions:
Frame forwarding between the wireless and wired side. In PROFIBUS data frames require an
immediate acknowledgement, for which the time bounds are typically sharp (see Section 4.1.4).
This suggests using cut-through forwarding instead of store-and-forward.
MAC-Integration (mimicry functions), and optionally link-layer integration (translation of the
PROFIBUS alternating bit protocol to the wireless link-layer protocol).
Central scheduler for the wireless stations.
Signalling functions: mobility support, authentication, station registration, address allocation.
The mimicry functions follow from the concept of keeping explicit token passing and logical ring
maintenance away from the wireless part. Therefore, the BS-IWU is required to act on the wired
segment on behalf of the wireless stations. This includes: generating token frames on behalf of wireless
stations, sending and answering the frames necessary for including new stations into the logical ring,
reacting to certain frames asking for static information about wireless stations (e.g., vendor name,
product id), and preventing the wired medium from becoming idle when the token is logically in the
wireless part (timeout timer).
A more detailed discussion of a possible architecture for a wireless PROFIBUS based on the integrated
scenario can be found in [191].
4.3.1 System under Study and Realtime Performance Measures
In this section and Section 4.2.1 a framework for wireless fieldbuses in general, and more specifically for
a certain class of fieldbuses employing token passing protocols, was outlined. Within this framework
now the system of interest and the performance measures of interest (realtime performance measures)
are described, along with their associated load models. This description, although originating from
the properties of the PROFIBUS, is more abstract and fits both the PROFIBUS protocol described
in Section 4.1 and the polling-based protocols described in Chapter 7.
Footnote 5: Consider, for example, a small, energy-constrained wireless sensor observing a slowly varying physical process. The
BS-IWU can poll this sensor from time to time and answer other stations' requests on its behalf.
System Description
Consider a scenario with N wireless terminals (WT). All these terminals are assumed to be active
stations in the sense that they are willing to acquire transmission rights. In the case of PROFIBUS
this means that they want to participate in the token passing process. All data transmissions are
between the WT’s. For this work a WT consists mainly of the MAC- and link-layer, which is attached
to the physical medium on the one hand and to the upper layers on the other hand. The interface
offered to the upper layers is denoted as link-layer interface. At each WT a number of traffic sources
generating requests and consuming confirmations can be attached to its link-layer interface.
Although the PROFIBUS offers more link-layer services, the service of interest is the acknowledged
datagram service, specifically the PROFIBUS SDA service (see Section 4.1.3). This choice is conve-
nient, since:
For the unacknowledged datagram (SDN) service there are no guarantees anyway.
In the (C)SRD services the acknowledgement may carry data. However, this adds no new quality
to the realtime behavior of acknowledged services, and is left out for simplicity.
A traffic source generates a request and hands this over to the MAC- and link-layer protocol via the
link-layer interface. The MAC- and link-layer protocol tries to successfully transmit this request,
however, the number of trials is bounded (max retry parameter). In any case, when the fate of the
request is known, the MAC- and link-layer instance generates a confirm primitive and passes this to
the traffic source via the link-layer interface. There is no segmentation and reassembly scheme applied
to the requests.
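The request/confirm handling described above can be illustrated with a minimal Python sketch. The names `Confirm`, `transmit_request` and `try_once` are hypothetical and not taken from the PROFIBUS specification; `try_once` stands for one data frame transmission plus its immediate acknowledgement, and the max retry parameter is read here as the number of retransmissions allowed after the first trial.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Confirm:
    """Confirm primitive handed back to the traffic source (hypothetical structure)."""
    success: bool
    trials: int


def transmit_request(try_once: Callable[[], bool], max_retry: int) -> Confirm:
    """Try to deliver one request; give up after max_retry retransmissions.

    try_once() models a single data frame plus immediate acknowledgement and
    returns True if the acknowledgement was received.
    """
    trials = 0
    for _ in range(1 + max_retry):  # first trial plus retransmissions
        trials += 1
        if try_once():
            return Confirm(success=True, trials=trials)
    # The fate of the request is known: it stays unacknowledged.
    return Confirm(success=False, trials=trials)
```

In either outcome exactly one confirm is produced per request, which is what the delay measures defined later rely on.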
There are two types of traffic, low priority and high priority. The high priority traffic is meant
for transmission of safety critical data (like alarms); all other data types (e.g., periodic process data,
file transfers) belong to the class of low priority traffic. In this thesis only the behavior of the high
priority traffic is of interest; the low priority traffic serves only as background load.
A fully meshed topology is assumed, i.e. all stations can hear each other. Furthermore, the distance
between the wireless terminals is assumed to be small (max. 30-50 m, according to machine plant
applications), hence the propagation delay is small and can be neglected.
Load Models
Definition. For a single priority a load value of x percent has the meaning that in the case of no
errors and without packets of other priority present in the system the load offered via the link-layer
interface is such that the time spent for transmitting data frames of the given priority and including
overhead is x percent of the theoretical link bandwidth.
Two different load models are defined, which are closely related to the realtime performance measures.
In both models and for both priorities, the data requests generated by the traffic sources have a size
of 40 bytes user data. The load is varied by varying the interarrival times of the sources accordingly.
There can be several sources attached to a single station.
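As an illustration of the load definition, the following sketch converts a load value into the mean interarrival time per source. It is a simplification under stated assumptions: all sources of a priority class generate requests at the same rate, and the per-frame overhead (10 bytes in the usage note) and the bit rate are hypothetical values, not taken from the thesis.

```python
def frame_time_s(user_bytes: int, overhead_bytes: int, bitrate_bps: float) -> float:
    """Time on the medium for one data frame including per-frame overhead."""
    return 8 * (user_bytes + overhead_bytes) / bitrate_bps


def mean_interarrival_s(load_percent: float, frame_s: float, num_sources: int = 1) -> float:
    """Mean interarrival time per source so that all sources together offer load_percent."""
    if not 0 < load_percent <= 100:
        raise ValueError("load_percent must be in (0, 100]")
    aggregate_rate = (load_percent / 100.0) / frame_s  # requests per second, all sources
    return num_sources / aggregate_rate
```

For example, with 40 bytes of user data, an assumed overhead of 10 bytes and 1 MBit/s, one frame occupies 400 µs, so a load of 10% corresponds to one request every 4 ms in total, or every 20 ms per source if five sources share the load.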
Smooth Model The first model is denoted as the smooth model. It assumes some low priority
background load of x ∈ {10, 50} percent, which is split equally between periodic arrivals and asynchronous
arrivals (Poisson sources).
For the high priority load Poisson arrivals with an overall load value of 10% are assumed. However,
the sources generate high priority requests for a station only when there is no pending high priority
request. Stated differently, for the high priority requests a single station is modeled as an M/G/1/1
queueing system (with its service times involving vacations due to the polling algorithm or the token
passing) [90].
This model allows investigating the confirmation delay for high priority requests defined below without
taking queueing delays at the originating stations into account.
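The blocking behavior of the high priority sources in the smooth model can be sketched as follows. This is not the simulation model used in the thesis: `service_time` is a hypothetical placeholder for the time until the confirm is generated (which in reality depends on polling or token passing), and the function only shows that arrivals finding a pending request are discarded.

```python
import random


def accepted_arrivals(rate, service_time, horizon, rng):
    """Arrival times of the high priority requests accepted at one station.

    rate: Poisson arrival rate (1/s); service_time: callable rng -> seconds;
    horizon: simulated time span (s); rng: random.Random instance.
    """
    t = 0.0
    busy_until = 0.0
    accepted = []
    while True:
        t += rng.expovariate(rate)  # exponential interarrival times
        if t >= horizon:
            return accepted
        if t >= busy_until:
            # No pending high priority request: accept the new one.
            accepted.append(t)
            busy_until = t + service_time(rng)
        # Otherwise the arrival is dropped (no waiting room, M/G/1/1).
```

Because at most one request is pending per station, every accepted request finds an empty high priority queue.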
Batch Model The batch model is an approximation to the alarm storm phenomenon discussed in
Section 2.1.
This phenomenon is approximated by a batch arrival, happening simultaneously at all stations. This
definition is related to the associated measure of interest, namely the consecutive confirm delay, which
measures the distribution of the times between successive confirmation primitives at a given station.
This model is unrealistic by not using different times for batch arrivals at the stations. However, in the
“more realistic” case the load conditions, as seen from a single station, vary over time as the batches
at other stations appear. Hence, the obtained consecutive confirm delay values cannot be assumed to
be from the same distribution.
So the model is as follows: given a constant low priority background load of 50% and no high priority
load before time t0 = 10 s, at time t0 = 10 s a batch of high priority requests of
infinite size arrives at each station.
As a combined scenario, for the high priority arrivals a two state Markov Modulated Poisson Process
(MMPP) can be used, which aims to approximate the “real world” behavior best. However, with
a non-stationary arrival process the delay measures cannot be obtained in a meaningful fashion.
Realtime Performance Measures
The realtime performance measures are a set of measures targeted to capture the requirements regarding
timeliness and reliability. Due to the sometimes harsh and variable error conditions on certain types of
wireless links, it is hopeless to give tight and deterministic guarantees on successful delivery of certain
messages within a bounded time (see the results reported in Chapter 6, where it is shown that there
are periods where for several seconds no packet is successfully received). Therefore, it is appropriate
to express the requirements stochastically.
The main measures are delay-oriented: the confirmation delay $D_C(i)$ and the consecutive confirmation
delay $D_{CC}(i)$, taken for a fixed wireless station $i$. Both are defined with respect to certain traffic
scenarios: the $D_C(i)$ measure is evaluated within the smooth scenario, the $D_{CC}(i)$ measure within
the batch scenario. Furthermore, both measures are taken only for high priority requests.
For a fixed station $i$ the $D_C(i)$ measure denotes the time between the arrival of a high priority request
at the link-layer interface of station $i$ and the time instant when the corresponding confirmation
primitive is generated, i.e., the transmission outcome is known. In the smooth scenario, high
priority requests always arrive at a system with an empty high priority queue; hence, this measure
does not take any additional queueing delays into account.
The $D_{CC}(i)$ values for a fixed station $i$ denote the times between the generation of successive high priority
confirmation primitives in the batch scenario.
Clearly, $D_C(i)$ and $D_{CC}(i)$ are taken as random variables, and it is assumed that (due to the traffic
scenarios chosen) all the values belong to the same distribution.
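Given logged time stamps for one station, samples of the two delay measures could be extracted as in the following sketch. The function names and the log layout (pairs of arrival and confirm times, or a list of confirm times) are assumptions made for illustration.

```python
def confirmation_delays(events):
    """Confirmation delay samples from (arrival_time, confirm_time) pairs."""
    return [confirm - arrival for arrival, confirm in events]


def consecutive_confirm_delays(confirm_times):
    """Times between successive confirm primitives at one station."""
    ts = sorted(confirm_times)
    return [later - earlier for earlier, later in zip(ts, ts[1:])]
```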
Now the main optimization target for this work can be stated conveniently. For a given station
$i$, denote by $F_{D_C(i)}(x) = \Pr[D_C(i) \le x]$ the distribution function of the confirmation delay values. The 99%
percentile $x_{99}(i)$ of this distribution is given by:
$$x_{99}(i) = \inf\{x \in \mathbb{R} : F_{D_C(i)}(x) \ge 0.99\}$$
We want to minimize the quantity $\widetilde{D}_C$ (which we denote as overall confirmation delay):
$$\widetilde{D}_C := \max\{x_{99}(i) : i \in \{1, \ldots, N\}\}$$
i.e., we want to minimize the maximum of the 99% $D_C(i)$ percentiles over all stations. The $\widetilde{D}_C$
measure can be defined the same way for other percentiles, e.g., a 99.9% percentile (see footnote 6).
A similar quantity is of interest for the $D_{CC}(i)$ measures. This aggregate measure is denoted as
$\widetilde{D}_{CC}$.
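For measured or simulated delay samples, the percentile and the overall confirmation delay can be estimated from the empirical distribution function. The following is a minimal sketch; the function names and the dictionary layout are hypothetical, and the estimator simply applies the infimum definition to the empirical distribution.

```python
import math


def percentile_inf(samples, p=0.99):
    """Smallest sample x whose empirical distribution value F(x) is at least p."""
    xs = sorted(samples)
    k = math.ceil(p * len(xs))  # smallest k with k / n >= p
    return xs[max(k, 1) - 1]


def overall_confirmation_delay(delays_per_station, p=0.99):
    """Maximum over all stations of the per-station p-percentile."""
    return max(percentile_inf(d, p) for d in delays_per_station.values())
```

With the samples 1, 2, ..., 100 at a single station, `percentile_inf` returns 99 for p = 0.99, since 99 of the 100 samples are at most 99.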
These measures now can be used to ask for a realtime capacity. Let us assume that from the application
some bound on maximum transmission times of high priority messages is given, say Dmax. As an
example, consider a pressure sensor in a tank, which is configured to have a threshold value. If the
pressure exceeds this threshold, an alarm is generated and sent to a controller station. The latter is
then required to react as fast as possible. Then, for this given bound, the realtime
capacity is defined as the maximum number $N$ of wireless stations such that $\widetilde{D}_C \le D_{max}$ holds. From
the definition of $\widetilde{D}_C$ it is clear that this value depends on $N$, although this is not explicitly shown in
the notation.
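The definition of the realtime capacity can be written down as a small search. In this sketch `overall_delay` is a hypothetical callable returning the overall confirmation delay for a given number of stations (e.g., obtained by simulation), and `n_limit` is an assumed upper bound on the station count to be examined.

```python
def realtime_capacity(overall_delay, d_max, n_limit):
    """Largest N in 1..n_limit with overall_delay(N) <= d_max, or 0 if there is none."""
    capacity = 0
    for n in range(1, n_limit + 1):
        if overall_delay(n) <= d_max:
            capacity = n  # keep the largest N that satisfies the bound
    return capacity
```

The loop scans all candidates instead of stopping at the first violation, so it does not rely on the delay growing monotonically with the number of stations.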
In this definition the realtime capacity is computed for the $D_C(i)$ measures. A similar definition can
be given with respect to the $D_{CC}(i)$ values, but this is not considered further within this work.
As a side measure, it is also worthwhile to investigate the remaining bandwidth for low priority traffic
$B_L$ in the smooth scenario with 50% low priority load. Clearly, if two polling schemes show the same
performance with respect to $\widetilde{D}_C$, the scheme which offers more bandwidth to low priority traffic is
preferable. The $B_L$ measure is always given as a fraction of the overall available bandwidth, and
incorporates data and overhead of low priority frames.
Some additional remarks are in order:
It is assumed that the maximum number of retransmissions (max retry parameter) is set to a
high value 20, in order to increase reliability. Hence, the negative confirmation rate, where
the MAC-/data link-layer has to report about finally unacknowledged requests, should be zero.
If the max retry parameter has a lower value (say, 3 to 5), then the negative confirmation rate
would be of much interest.
It is exactly the movement from a deterministic delay measure to 99% percentiles which accounts
for the variability and error behavior of the wireless link, while simultaneously expressing hard
delay requirements.
Footnote 6: The question of whether the results change for different percentile values is not covered in this thesis.
4.4 Wireless PROFIBUS over IEEE 802.11 MAC
For designing a wireless PROFIBUS, considerable development effort could be saved if existing
technologies were used. This applies not only to wireless PHYs; reusing existing MAC protocols
also seems attractive.
More specifically, the approach discussed in this section is to take an existing MAC protocol providing
a MAC interface, and to construct a mapping between the PROFIBUS link layer interface and the
MAC interface. This mapping should implement the link-layer interface’s syntax and semantics as
closely as possible.
A natural candidate is the IEEE 802.11 WLAN standard with either DCF or PCF. Especially the
PCF seems interesting with its time bounded services. However, this approach makes sense only if no
changes or only minor changes to the existing protocols are required, in order to benefit from existing
implementations. The feasibility of this approach is discussed.
4.4.1 DCF-based approaches
We consider first the DCF-based case. If no additional media access rules are implemented on top
of the DCF (e.g., token passing on top of IEEE 802.11 MAC, as suggested in [130]), the following
problems arise:
By the decentralized protocol, it cannot be guaranteed that in the integrated scenario (see
Section 4.3) the wireless medium is idle, when the token is logically in the wired segment and
the current token-owner sends a request frame or the token to a wireless station.
According to the SDL specification given in the IEEE 802.11 standard [120] the MAC entity
maintains internal queues for its MAC-PDUs, and the MAC interface offers no way for upper
layers to inspect or modify these queues. Only the current queue length can be inferred from
the number of outstanding MA-UNITDATA-STATUS.indication primitives. Consider the case of an
arriving high-priority request when the queues already contain a number of low-priority requests.
There is no way to put the high-priority request in front of the queue, instead it is blocked by
the low-priority requests.
The CSMA-based protocol is subjected to the hidden terminal and exposed terminal scenarios.
In general, a stochastic MAC protocol cannot give any guarantees on delays and losses.
Hence, additional distributed media access rules should be implemented. The focus here is on dis-
tributed rules, since otherwise we need a central controller, which leads to PCF-based scenarios,
discussed below. An obvious candidate rule is explicit token passing [130], which, in the integrated
scenario, could be integrated with the PROFIBUS token passing protocol (otherwise two different
token passing protocols need to be maintained in parallel). However, in Chapter 5 it is shown that ex-
plicit token-passing over error-prone wireless links has serious problems and should not be considered
for a wireless PROFIBUS.
Other additional media access rules (e.g., virtual token passing as in P-Net, see Section 2.3) must be
made compatible with the properties of the PROFIBUS token passing.
Even if one of these approaches could be implemented successfully, it is questionable whether the
result will be more efficient or less complex than a specifically tailored wireless PROFIBUS MAC
protocol. The 802.11 DCF is of considerable complexity, and additional rules increase the complexity
of the resulting MAC protocol.
There are other DCF-based approaches, which, however, extend the DCF such that existing equipment
could not be used. For example, the authors of [156] propose a DCF extension for adding station
priorities. Their approach uses a special jam signal which is not part of the standard and thus requires
special capabilities of the wireless transceiver, which makes this approach unattractive.
4.4.2 PCF-based approaches
In this section we make the assumption that the 802.11-PCF AP serves also as the boundary between
the wired and wireless parts, where the PROFIBUS frames are translated between the different media.
A critical situation for PCF-based approaches is when the token is logically in the wired segment
(wired token situation). To avoid collisions when a wired master sends a frame to a wireless slave, the
wireless medium should be kept idle.
Consider first the case that the AP works in conformance to the IEEE 802.11 standard, i.e., it
implements the concept of superframes subdivided into CFP and minimum length CPs. If the wired
token situation happens within the CFP, the AP can keep the wireless medium idle by continuously
transmitting short dummy frames to a non-existing address (see footnote 7). These frames, however, need a PLCP
preamble and header of at least 192 µs duration, not counting the MAC-PDU data. When a request
frame arrives during the PLCP preamble, it may be possible to replace the dummy frame by the
request frame, if the request frame arrives early enough, say, up to the τ-th bit of the preamble
(τ < 128). The value of τ is determined by the MAC processing latency. If the request frame arrives
just after the τ-th bit, the remaining bits of the dummy frame induce additional forwarding delay.
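The additional forwarding delay caused by a dummy frame can be estimated with a small calculation. The sketch below is a simplification under stated assumptions: positions are measured in µs from the start of the dummy frame (the PLCP part is assumed to run at 1 MBit/s, so one preamble bit takes 1 µs), the dummy MAC-PDU length of 14 bytes used in the usage note is hypothetical, and interframe spaces after the dummy frame are ignored.

```python
PLCP_US = 192.0  # PLCP preamble and header duration stated in the text


def dummy_frame_duration_us(mpdu_bytes, rate_mbps=1.0):
    """Duration of one dummy frame: PLCP part plus MAC-PDU at the given rate."""
    return PLCP_US + 8 * mpdu_bytes / rate_mbps


def extra_forwarding_delay_us(arrival_us, tau_us, mpdu_bytes, rate_mbps=1.0):
    """Extra delay for a request frame arriving arrival_us after the dummy frame started."""
    total = dummy_frame_duration_us(mpdu_bytes, rate_mbps)
    if not 0 <= arrival_us <= total:
        raise ValueError("arrival must fall within the dummy frame")
    if arrival_us <= tau_us:
        return 0.0  # early enough: the dummy frame is replaced by the request
    return total - arrival_us  # the remaining bits of the dummy frame must be sent first
```

With a 14 byte dummy MAC-PDU at 1 MBit/s the dummy frame lasts 304 µs; a request arriving 101 µs into the frame with τ = 100 would then be delayed by another 203 µs.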
If the wired token situation happens within the CP, it is hardly possible to keep the wireless medium
idle without additional medium access rules or without the AP occupying the wireless medium. Fur-
thermore, due to the variable PROFIBUS token cycle times it is hard to guarantee that always the
wired token situation and CFP occur jointly.
The problems with the CP can be attacked by getting rid of the superframe structure and by letting the
AP control the medium all the time. However, this is close to the polling-based approach adopted
in this thesis, which does not need all the complexity introduced with the 802.11 MAC protocol, but
runs directly on top of an 802.11 DSSS PHY.
4.4.3 SRD service handling
It is difficult to implement the (C)SRD service with the DCF or PCF of IEEE 802.11. The basic
problem is that the immediate acknowledgement frames cannot carry any data, as is requested by the
SRD service.
Footnote 7: Another approach is to equip the AP with modified wireless transceivers that can send jam signals of arbitrary
length. However, this cannot be done with off-the-shelf components and is not considered further.
Footnote 8: We assume that the RTS/CTS protocol is disabled, since PROFIBUS frames are typically small.
First we consider the DCF case (see footnote 8). If the token is logically in the wired segment and a wired master
sends an SRD frame to a wireless station, the following approaches are possible:
The SRD frame could be directed to a unicast 802.11 MAC address corresponding to the des-
tination PROFIBUS MAC address. The immediate ack sent by the wireless station can be
suppressed by the AP, i.e., it does not appear on the wired part. Immediately after receiving
the request frame the wireless slave prepares a response frame and hands it over to its MAC
entity (which should have no other frames in its internal queues). After some delay (at least
DIFS), the answer frame is transmitted and forwarded by the AP to the wired master. This
approach has the problem of substantial delay before the first bit of the answer frame appears on
the wired medium: SIFS plus ack frame duration plus the time needed for preparing the answer
plus DIFS plus PLCP preamble and header. If the wireless station’s internal MAC queues are
not empty, this approach does not work.
The SRD request frame could be directed to the 802.11 broadcast address, as well as the answer
data. This way, two immediate acks are eliminated. This approach imposes additional burdens
to all wireless stations, since, by increasing the number of broadcast frames, it increases the
processing load. Again, it works only if there are no other frames in the wireless station’s
internal MAC queues.
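The delay of the unicast approach can be quantified with a rough budget. The sketch below assumes the timing values of the 802.11 DSSS PHY with long preamble (SIFS 10 µs, DIFS 50 µs, PLCP preamble and header 192 µs, 14 byte ack frame sent at 1 MBit/s); these constants and the preparation time `prep_us` are assumptions for illustration and should be checked against the standard. Backoff is ignored, so the result is a lower bound.

```python
SIFS_US = 10.0       # assumed DSSS PHY value
DIFS_US = 50.0       # assumed DSSS PHY value
PLCP_US = 192.0      # PLCP preamble and header (long preamble)
ACK_MPDU_BYTES = 14  # assumed length of the 802.11 ack frame


def min_answer_delay_us(prep_us, rate_mbps=1.0):
    """Lower bound on the time until the first answer bit leaves the wireless station."""
    ack_us = PLCP_US + 8 * ACK_MPDU_BYTES / rate_mbps
    # SIFS + ack frame + answer preparation + DIFS + PLCP of the answer frame
    return SIFS_US + ack_us + prep_us + DIFS_US + PLCP_US
```

Even with zero preparation time this yields 556 µs under the assumed values, which illustrates why the approach conflicts with the sharp time bounds of immediate acknowledgements.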
When using the PCF, two cases can be distinguished: if the SRD frame exchange is performed during
the CP, the same considerations as for the DCF case apply. When performed during the CFP, the
following cases occur:
When the SRD request frame is sent by a wired station, the AP can piggyback a poll-request
to the forwarded frame. If the wireless slave has given the answer data already to the MAC
instance (with the MA-UNITDATA.request service primitive), it can answer with the stored answer
data. Unfortunately a piece of data given to the MAC cannot be replaced by some updated data,
instead both pieces of data are transmitted. Hence, with this mechanism the PROFIBUS SRD
service cannot be fully implemented, since this service has the concept of an FDL buffer, which
can be modified several times and only the last version is put into an ack frame. If the answer
data is not readily available, the wireless slave must be polled some time later, introducing delay.
When the SRD request is sent by a wireless station to another, unicast frames have to be
used and costly immediate acks cannot be suppressed. Furthermore, the AP has to poll the
destination station immediately after the ack, in order to allow for transmission of answer data.
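The semantic mismatch between an updatable buffer and the 802.11 MAC queue can be made concrete with two toy classes. Both are hypothetical illustrations and not an implementation of either standard; in particular, whether the buffer content is consumed when it is sent is left out.

```python
from collections import deque


class FdlBuffer:
    """Single-slot buffer: each update overwrites the previous answer data."""

    def __init__(self):
        self._value = None

    def update(self, data):
        self._value = data

    def read(self):
        return self._value  # only the last version is visible


class MacQueue:
    """FIFO queue: data handed over once cannot be replaced afterwards."""

    def __init__(self):
        self._frames = deque()

    def request(self, data):
        self._frames.append(data)

    def next_frame(self):
        return self._frames.popleft() if self._frames else None
```

After two updates the buffer yields only the second version, whereas the queue transmits the outdated version first and the updated one afterwards.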
In summary, the IEEE 802.11 MAC is not well suited to implement the SRD services.
4.4.4 Final Remarks
Using a full 802.11 protocol stack with its MAC and management functionalities brings significant
complexity to wireless devices. Especially for small and cheap sensors, which only deliver some
measurement values from time to time, 802.11 seems to be oversized in terms of cost.
The approach to implement a mapping of the link layer interface to the 802.11 MAC interface intro-
duces several problems and unneeded complexity. The IEEE 802.11 MAC protocol is not designed
for the services needed and constraints given in a wireless PROFIBUS LAN. Many functions are not
needed for wireless PROFIBUS (e.g., always sending immediate acks), others are missing. Additional
functionality must be added to achieve reasonable behavior. In this case, however, the approach loses
its appeal of using off-the-shelf components.
A specifically tailored MAC protocol is the solution adopted for this thesis.
4.5 Related Work
4.5.1 Overview of other Work in Wireless Fieldbus Systems
There exists much literature on “wireless realtime MAC” related topics, however, in almost all cases
the focus of interest is on time-sensitive and somewhat loss-tolerant data types like voice and video.
For IEEE 802.11 the related literature is reviewed in Section 4.5.2. Much literature exists in the
context of wireless ATM systems (some references are [6, 93, 25]) or for integrating voice and data. In
contrast, the topic of hard real-time communications or fieldbus systems over wireless media is only
sparsely covered.
Wireless PROFIBUS
The Funbus project [52], [81] was an industry-driven project with the goal of finding a cheap and
reliable technology for wireless and transparent coupling of field devices. Three different fieldbus
technologies (PROFIBUS-DP, INTERBUS-S [36], [37], [38], and CAN [50]) and several wireless tech-
nologies (e.g., GSM, DECT, 802.11, TETRA) were investigated, but finally the participants have
chosen 802.11 DSSS related technologies working in the 2.4 GHz ISM band.
The project focused on the wireless repeater scenario. Some alternative approaches for this have been
investigated: a) the forwarding could be done over the 802.11 DCF MAC with either unicast addressing
(thus every wireless station or bridge has to send an ack) or broadcast addressing; or b) forwarding
is done by encapsulating the PROFIBUS frames in 802.11 PHY frames without using the MAC
and sending them in broadcast mode. For performance reasons, the latter approach was adopted.
The forwarding is done by bridge-like devices in a cut-through manner. Some bridge prototypes
were built, based on the Silver Data Stream radio modem [157]. This setup was evaluated with
laboratory measurements and field trials. The minimum forwarding delay introduced by the wireless
link and bridges was measured to be 200 µs. A simple laboratory setup was evaluated, consisting of
one PROFIBUS DP master linked to two PROFIBUS DP slaves. The connection between master
and slaves was either fully wired or contained a wireless link. The other parameters were: bus speed
on the wired part of 500 kBit/s, line of sight (LOS) connection for the wireless link, no antenna
diversity and 1 MBit/s BPSK modulation on the wireless link. For the wireless case a throughput
reduction of 40% was observed in terms of handled requests/s [52, p. 72] (see footnote 9). In the field trial the same
setup was placed in a gypsum warehouse: the DP master had a fixed position at the warehouse's walls,
while the DP slaves were put on a moving gypsum conveyor system (LOS connection). On the cabled
segment a transmission speed of 500 kBit/s was used, on the wireless segment 1 MBit/s. The conveyor
system was actively moving during the 9 h measurement. The data throughput was stable most of the
time, but outliers occurred frequently. For an hour no communication was possible; the authors propose
multipath effects as a possible explanation.
Footnote 9: The numbers given in the report (13800 requests/s in the wired case, 8500 requests/s in the wireless case) are too
high for a 500 kBit/s wired bit rate, but we presume that the relation is correct, i.e., in the wireless case the efficiency
is reduced to 60% of the wired case.
Meanwhile, some companies offer products for the wireless repeater scenario, using infrared communi-
cations. These products allow to couple wireless stations into a wired PROFIBUS or to link different
wired PROFIBUS segments (two exemplary references are [84], [66], both employing infrared waves).
Other Wireless Fieldbus Systems
Within the Funbus project [52] also the INTERBUS-S and CAN were investigated. For the INTERBUS-
S a repeater/bridge solution based on the 802.11 DSSS PHY was used. This was feasible, since the
INTERBUS MAC protocol operates on point-to-point links with no further access mechanism. For
the CAN protocol an application layer gateway was constructed, since the approach of emulating the
CAN MAC protocol on the wireless link was found overly difficult. For performance reasons a specific
MAC protocol was implemented on top of an IEEE 802.11 PHY.
The R-FIELDBUS project (www.rfieldbus.de) within the European Union Information Society
Technologies (IST) program evaluates the use of different radio technologies (UMTS, IEEE 802.11,
HIPERLAN, DECT, Bluetooth) for the fieldbus systems specified in the EN50170 European standard
(including PROFIBUS, P-NET and WorldFIP) with focus on multimedia support [141]. At the time of
writing no technical details were available.
A group at EPFL Lausanne has worked on transparent integration of wireless stations into FIP [109].
The factory instrumentation protocol (FIP) fieldbus is a European fieldbus standard, which emerged
from a French standard [178]. It uses a polling table to implement a realtime database (a more detailed
description can be found in reference [30]). A main element of the approach in [109] is a wireless-to-
wired gateway, which serves as central base station for the wireless part. The MAC protocol is based
on a time division multiple access (TDMA) scheme. The base station is responsible for caching all
process variables produced by mobile stations and to transmit these on the wired part, if requested.
Furthermore the gateway caches all process variables produced by wired stations and consumed by
wireless stations, and broadcasts these on the wireless link. The case of asynchronous directed message
transmission is not discussed. In [110] it is investigated, how the MAP/MMS application layer protocol
[80] can be enhanced with mobility. In the proposed system the IEEE 802.11 MAC protocol with the
(stochastic) DCF is used, time critical transmissions are not considered. In [108] the same question
was investigated with DECT as underlying technology. Again, time critical transmissions were not
considered.
The European Community project OLCHFA (June 92 until September 94) was targeted at enhancing
FIP with wireless stations at the 2.4 GHz ISM band using a DSSS physical layer [74]. However,
the available publications put emphasis on the management of configuration data and on distributed
algorithms for clock synchronization. Within the project a communications controller was developed,
which can switch between using a wired and a wireless medium. The MAC and data link protocol of
FIP was not modified.
For the IEC FieldBus [69] (which uses a centralized, polling-based access protocol for supporting
periodic data and a token passing protocol for asynchronous data) in [22] an architecture was proposed,
which allows coupling of several fieldbus segments using a wireless backbone based on IEEE 802.11
with PCF.
A group at the University of Sussex has worked on the topic of wireless CAN. Since the CAN MAC
protocol is not implementable on a wireless medium, two different approaches were developed: the
WMAC approach uses backoff times directly proportional to the message priority before starting to
transmit [96]; in the second approach the priority value is mapped onto the channel using an
on-off-keying scheme: a station transmits a short burst if the current priority bit is a logical one, or it
switches to receive mode if it is a zero. If the station receives something in the receive mode, it resigns
from contention. The priority bits are handled from the most significant bit to the least significant
bit, and all stations have to be synchronized on bit boundaries [95]. The papers cited do not take channel
errors and retransmissions into account. Furthermore, this approach requires fast switching between
transmit and receive mode for the radio modem.
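The on-off-keying arbitration can be illustrated with a short sketch. It assumes an ideal channel, perfect bit synchronization, and that a station which hears a burst while listening drops out of the contention; the function name and data layout are hypothetical and not taken from [95].

```python
def arbitrate(priorities, bits):
    """Return the set of stations that survive the bitwise arbitration.

    priorities: dict mapping station id -> priority value (0 <= value < 2**bits).
    """
    contenders = set(priorities)
    for bit in reversed(range(bits)):  # most significant bit first
        senders = {s for s in contenders if (priorities[s] >> bit) & 1}
        if senders:
            # Stations with a zero bit are listening, hear the burst and drop out.
            contenders = senders
        # If nobody sends a burst, all contenders proceed to the next bit.
    return contenders
```

Under these assumptions the station with the largest priority value wins; stations with identical values survive together, so priority values would have to be unique to avoid a collision.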
4.5.2 Real-Time Data Transmission with IEEE 802.11
In this section existing work on real-time transmission with IEEE 802.11 is summarized. In almost all
publications the notion of “real-time transmission” is not to be understood in the industrial “hard real-
time” sense, where packet losses should not happen. Instead, it is used in the context of multimedia
transmission, e.g., for speech, audio and video data. While these data types also have stringent timing
requirements, they can tolerate some losses. For voice transmissions the delay is desired to be at most 250
msec, but a certain packet loss rate (typically 1%) seems tolerable, depending on the codec and the
influence of error concealment techniques [65, chap. 7].
The literature in this field can be broadly subdivided into approaches for using and enhancing the DCF
and those using the PCF, see the next two sections. A comparison of several schemes for transmitting
speech data over an IEEE 802.11 WLAN is presented in [131].
Several studies, however, deal with the performance of 802.11 without explicitly referring to real-time
data, e.g., [185]. Especially the backoff algorithm has been the focus of several studies, e.g., [15] and
[186]. In reference [20] capacity limits for IEEE 802.11 DCF were derived using analytical estimates for
the protocol overhead. They showed that for certain scenarios the theoretical capacity is not reached
and proposed a modified backoff algorithm. This algorithm takes the current load into account.
In most of these papers transmission errors were not taken into account.
DCF-based Approaches
In [31] an enhancement of the 802.11 DCF for prioritizing frames is proposed, targeted to enhance the
real-time capabilities for voice and video. This approach distinguishes two types of data: time-bounded
data and asynchronous data. Asynchronous data may be delayed arbitrarily, but should definitely
reach the destination. On the other hand, time-bounded data loses its meaning after deadline
expiration. With the help of new inter frame spaces and shorter backoff intervals a prioritization
of stations is introduced. When prioritizing video frames over data frames using this scheme, an
improved transmission delay and reduced loss probabilities for video frames can be observed.
Recently, QoS is becoming an issue also in the IP world. One of the most prominent approaches is the
Differentiated Services approach [86]. In this approach packets are classified into different flows and
each flow is treated by different rules in the routers and edge nodes. One prerequisite is the ability
of the underlying networks to provide different service levels. Especially in those technologies, where
bandwidth is a scarce resource, this is often achieved by introducing packet priorities and proper
packet scheduling approaches. In [12] two approaches are combined for introducing two service classes:
time-bounded and asynchronous data. The first approach uses two different CWmin values for time-
bounded data (lower value, see Section 3.2.3) and asynchronous data (higher value). It is shown with
simulations that time-bounded data gets faster channel access (in the mean) and that throughput of
time-bounded data is largely insensitive to the load of asynchronous data. By additionally selecting
a lower CWmax value for time-bounded data packets, these are discarded earlier in high congestion
conditions. The second approach employs measurement-based admission control. The current network
load is estimated by measuring the delays that a virtual MAC instance would experience under current
network conditions.
A second approach for modifying the backoff procedure to achieve service differentiation is presented
in [1]. However, instead of modifying the CWmin and CWmax bounds, the growth exponent of the
CW variable is set to values other than two. Each flow is assigned its own exponent. It is shown
that this approach can provide throughput differentiation for different CBR flows, if the growth factor
ratios are not too large. However, for TCP flows no significant service differentiation can be observed.
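The two backoff-based differentiation ideas summarized above (different CWmin/CWmax bounds, and a growth factor other than two) can be illustrated with a toy model of the contention window. This is a sketch under simplifying assumptions and not the exact rule of [12], [1] or the IEEE 802.11 standard: the window is simply multiplied by a per-flow growth factor after each failed attempt and capped at CWmax. The function name and all numeric values are hypothetical.

```python
def contention_windows(cw_min, cw_max, growth, attempts):
    """Contention window used for each of `attempts` successive transmission attempts."""
    windows = []
    cw = cw_min
    for _ in range(attempts):
        windows.append(cw)
        # Simplified growth rule (assumption): multiply after a failed attempt, cap at cw_max.
        cw = min(int(cw * growth), cw_max)
    return windows


# Hypothetical settings: an asynchronous flow and a favoured time-bounded flow.
asynchronous = contention_windows(cw_min=31, cw_max=1023, growth=2.0, attempts=5)
time_bounded = contention_windows(cw_min=15, cw_max=255, growth=1.5, attempts=5)
print(asynchronous)
print(time_bounded)
```

A smaller window means a smaller mean backoff, which is why the flow with the lower bounds or the smaller growth factor tends to access the channel earlier.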
The authors of [156] propose a DCF extension for adding station priorities. Their approach uses
a special jam signal which is not part of the standard and thus requires special capabilities of the
wireless transceiver.
PCF-based Approaches
In [181] a scenario is investigated, where the PCF is used for voice transmission, while data is trans-
mitted with the DCF. Hence, superframe stretching occurs. For the voice sources a simple ON/OFF
model [19] is used (modeling a voice coder with silence detection). The model is based on a two-state
Markov chain, i.e., the state holding times are exponentially distributed (mean time in talk state:
1 s, mean time in silent state: 1.35 s). During the talk state speech frames are generated with 8
kBit/s. Additionally, unsent voice frames are discarded, if their lifetime exceeds a threshold of 25 ms.
Data frames have exponentially distributed interarrival times and frame sizes. The load in the CP
part of the superframe, as given by subtracting the nominal length of the CFP from the superframe
period, is set to 98%. There are 15 stations, all stations send data frames, while a variable number
of stations generates additional voice frames. The voice stations are polled during the CFP in round
robin fashion. Their main result is that there is a large overhead necessary for voice transmission, and
that superframe stretching further limits the voice capacity. For example, with a superframe period
of 20 ms and the CFP length set to support 8 voice connections, only 50% of the bandwidth is available
for DCF transmission. Exploiting the potential statistical multiplexing gain for ON/OFF sources
increases the number of conversations up to 15 with a 3.5% drop rate (lifetime exceeds threshold). In
this case the maximum throughput for data frames is limited to 200 kBit/s (channel rate: 1 MBit/s).
Channel errors were not taken into account.
The work described in [92] investigates a similar scenario, i.e., a mixture of voice and data sources.
It provides a comparison of DCF and PCF mode for speech transmission. In the DCF case a local
scheduler is used for providing priority to speech frames, in the PCF case the CFP is used for speech
frames. Similar to reference [181] a substantial overhead (up to 50%) is found, but nonetheless the
PCF allows supporting more voice streams than the DCF variant. A major source of overhead are
empty polls during the CFP, i.e., stations have no speech packets available when polled by the BS.
When the PC has perfect knowledge about all stations' queue status, a significant increase (600 kBit/s)
of DCF capacity as compared to the case with imperfect knowledge can be achieved (11 MBit/s, 64
kBit/s speech rate). However, errors are not taken into account.
The authors of [180] also consider voice transmission using the PCF and propose suitable settings for
different protocol parameters. The maximum number of admissible CBR voice calls depends on the
superframe length, e.g., for a 90 ms period with 11 MBit/s DSSS up to 26 voice calls can be carried,
with 2 MBit/s FHSS up to 11 calls can be carried, both in case of no transmission errors. They
propose to use these numbers to perform admission control for voice calls.
Some further studies about voice and video transmission with the IEEE 802.11 PCF can be found
in [27] and [159]. A protocol extension, which addresses a group of stations with a single poll-frame,
thereby reducing overhead, is described in [53].
Chapter 5
Behavior of the PROFIBUS
Protocol under Link Errors
This chapter answers the question why it is not a good solution to run the PROFIBUS
MAC and link layer protocol directly on top of a wireless PHY. The main problem is the need for
explicit token passing and the consequences that token losses can have for the stability of the logical
ring and the active stations' opportunities to transmit their data.
The investigation proceeds in two steps: in the first step we show that the PROFIBUS protocol
is not designed for coping with higher error rates or packet losses (Section 5.1). To do this, the
protocol is investigated in its “natural environment” with an underlying RS 485 PHY and certain
error assumptions.
In the second step the PROFIBUS protocol is investigated with the characteristics of an 802.11 PHY
(Section 5.2). The main differences are the error assumptions and the fact that the hearback feature
is not available.
Finally, in Section 5.3 the related literature is reviewed, and in Section 5.4 the results of this chapter
are summarized.
The material presented in Section 5.1 is also published in [188], [190], [189], and [198].
5.1 PROFIBUS over Error Prone Links
The PROFIBUS is designed to deliver real-time services in harsh, industrial environments, as is the
IEEE 802.4 token bus protocol. In both protocols a logical ring is built on top of a broadcast medium,
using special control (token) frames for ring maintenance. However, the maintenance mechanisms differ:
IEEE 802.4 uses a contention-based mechanism for including active stations (stations for short) into
the ring, while PROFIBUS uses explicit polling. In both protocols only members of the logical ring are
allowed to transmit data. Thus, one important goal of the PROFIBUS protocol is that all stations
that want to be members of the ring become members and remain so. The degree to which this is achieved is
referred to as ring stability, and can be captured with different metrics. Since the ring membership
is maintained by exchanging special control frames, the ring stability can be affected by loss of these
frames. Since data transmission is restricted to ring members it is clear that ring stability strongly
impacts the achievable QoS and system reliability.
In this section we study the ring stability of the PROFIBUS protocol operated over the RS 485
PHY in the presence of transmission errors and under different error models. It is shown that the
protocol has serious stability problems under higher error rates and that ring stability is sensitive to
the “burstiness” of errors. We propose two improvements of the protocol and its parameters, which
require no modifications in frame formats and are interoperable with the unchanged protocol rules.
These improvements yield a significant increase in ring stability, but still do not give satisfactory results,
as is shown in Section 5.2 for the case of a wireless link and also in Chapter 7 when compared with
polling-based protocols.
We introduce two definitions: a station loss event (or simply station loss) denotes the single point in
time where an active station detects its loss from the ring and discards all of its knowledge previously
obtained by its ring maintenance mechanism, especially the LAS. After a station loss a station behaves
as if newly switched on.
A station outage time denotes the time duration needed for a lost station to become a ring member
again (by expiration of its timeout timer or by being reincluded by another station).
5.1.1 Major Causes for Ring Instability
By analysis of the protocol specification and of simulator traces (see Section 5.1.4), we have identified
three different ways in which a station can get lost.
The first way is due to the fact that the token frame has no checksum. It is only protected with a
parity bit, start bit and stop bit for every single byte (every byte is transmitted serially with 11 bits,
see Section 4.1.4). Thus there is some probability that a token frame can be corrupted such that no
station except the sender (by the hearback feature) will recognize an error (see footnote 1). Consider now
the case of two stations with addresses $a$ and $b$ respectively, where $a < b$ holds to ease presentation.
If $a$ sends a token frame to $b$ where the destination address is corrupted and equal to $c$ with
$a < b < c$, $b$ considers itself being skipped and immediately removes itself from the ring, behaving as if
newly switched on. If $a$ retransmits the token, $b$ has not yet built a valid LAS and does not accept the
token. After another token frame $a$ considers $b$ as lost from the ring, since again $b$ is not allowed to
answer. We refer to this as error skipping.
The other scenarios are due to the presence of the hearback feature: when station $a$ experiences
hearback errors in two successive trials to send a token frame it gets lost from the ring (i.e. forgets
its LAS). When the token frames are detected as faulty by all other stations, then the medium is idle
until the timeout timer of the station with the lowest address expires. Within this scenario two cases
can be distinguished: $a$ has the lowest station address of all current ring members or not (assume
that $a$ has negligible initialization delay). If $a$ has the lowest address, it is the timeout timer of $a$
that expires. Since there was no transmission during the idle time and $a$ has forgotten its LAS, $a$ now
thinks it is alone in the ring and sends a token frame to itself. Then all other stations remove themselves
from the ring, feeling themselves skipped. We refer to this scenario as ring jacking. If $a$ does not have the
lowest address, the remaining ring stays alive and $a$ is reincluded later. This is denoted as hearback removal.
Footnote 1: This probability can be lower bounded by the probability $P_R$ that exactly two bit errors occur
within the same byte, which cannot be detected by the parity scheme. The token frame is $3 \cdot 11 = 33$ bits
long. Assuming that bit errors are independent (at least over the length of a token frame) and occur with
fixed probability $p$, $P_R$ is then given by
$$P_R = \frac{168}{1056} \cdot b(2; 33, p), \qquad b(k; n, p) = \binom{n}{k} p^k (1-p)^{n-k},$$
where $b(k; n, p)$ is the probability mass function of the binomial distribution. We have used the fact that
of the 1056 ways to distribute two errors over 33 bits only 168 lead to undetectable errors; all others are
detected. With $p = 0.001$ we have $P_R \approx 0.00008$.
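The lower bound $P_R$ from footnote 1 can be checked numerically. The sketch below assumes, as the count of 168 suggests, that an undetected double error must fall on two of the eight data bits of the same byte; the helper names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """b(k; n, p): probability of exactly k errors among n independent bits."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


TOKEN_BYTES = 3          # token frame length in bytes (from the text)
WIRE_BITS_PER_BYTE = 11  # bits on the wire per byte (from the text)
DATA_BITS_PER_BYTE = 8   # assumption: only the 8 data bits can hide a double error

n_bits = TOKEN_BYTES * WIRE_BITS_PER_BYTE                                       # 33
ordered_pairs = n_bits * (n_bits - 1)                                           # 1056
undetected_pairs = TOKEN_BYTES * DATA_BITS_PER_BYTE * (DATA_BITS_PER_BYTE - 1)  # 168


def p_r(p):
    """Lower bound on the probability of an undetected token frame corruption."""
    return undetected_pairs / ordered_pairs * binom_pmf(2, n_bits, p)


print(p_r(0.001))  # close to the 0.00008 quoted in the footnote
```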
To summarize, the mechanisms for losing stations are as follows:
• Station $a$ gets lost due to error skipping.
• Station $a$ experiences a hearback removal.
• Station $a$ gets lost because another station $b$ with the lowest address performs ring jacking.
5.1.2 Ring Stability Metrics
The metrics for ring stability can be roughly divided into two classes: the global stability metrics are
focused on the whole logical ring, while the local stability metrics look at a single station.
Let $K$ be the number of stations and $\{N(t)\}_{t \in \mathbb{R}}$ a set of integer-valued random variables,
denoting the number of stations that are members of the ring at time $t$ (more precisely: which consider
themselves to be members). We have $0 \le N(t) \le K$ for all $t \in \mathbb{R}$, and $N(t)$ changes only at
discrete points in time, by the operation of the protocol. All stations want to be members of the ring all
the time. We introduce the following global metrics for ring stability:
• Consider that at time $t_0$ we have $N(t_0) = K$ and $\lim_{\epsilon \to 0, \epsilon > 0} N(t_0 - \epsilon) < K$,
i.e. the ring has just been completed at $t_0$. Furthermore let $t_1 = \inf\{t > t_0 : N(t) < K\}$ and
$C = t_1 - t_0$. The random variable $C$ denotes the time duration that the ring is complete, before the next
time it loses a station. We are interested in its mean value $\bar{C}$ and distribution function
$C(s) = \Pr[C \le s]$. The “dual” of $C$, i.e. the time needed to re-enter the state of a full ring after the
full ring breaks, is not covered here.
• Mean number of stations in the ring during the interval $[0, t]$:
$$\bar{N}(t) = \frac{1}{t} \int_0^t N(s)\,ds,$$
additionally we are interested in the limiting mean value $\bar{N} = \lim_{t \to \infty} \bar{N}(t)$, which is
assumed to exist and is approximated by evaluating $\bar{N}(t)$ for some large $t$.
• Fraction of time where not all stations are members of the ring during the time interval $[0, t]$:
$$\bar{M}(t) = \frac{1}{t} \int_0^t \mathbf{1}_{[0,K-1]}(N(s))\,ds$$
where $\mathbf{1}_A(x)$ is the indicator function for the set $A$, i.e. $\mathbf{1}_A(x) = 1$ if $x \in A$, and
$\mathbf{1}_A(x) = 0$ otherwise. Additionally we are interested in the limiting fraction
$\bar{M} = \lim_{t \to \infty} \bar{M}(t)$.
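Because $N(t)$ is piecewise constant, both integrals reduce to sums over the intervals between membership changes. The following sketch shows one way to evaluate $\bar{N}(t)$ and $\bar{M}(t)$ from a recorded trajectory; the data layout (a list of `(time, N)` change points starting at time 0) and the example trajectory are assumptions made for illustration.

```python
def time_averages(change_points, t_end, K):
    """Return (N_bar, M_bar) over [0, t_end] for a piecewise-constant trajectory.

    change_points: list of (time, N) pairs sorted by time, with the first time
    equal to 0 and all times smaller than t_end.
    """
    area = 0.0        # integral of N(s) ds
    incomplete = 0.0  # total time with N(s) <= K - 1
    following = change_points[1:] + [(t_end, None)]
    for (start, n), (stop, _) in zip(change_points, following):
        duration = stop - start
        area += n * duration
        if n < K:
            incomplete += duration
    return area / t_end, incomplete / t_end


# Hypothetical trajectory for K = 3 stations observed over 10 time units.
n_bar, m_bar = time_averages([(0, 0), (2, 1), (4, 3), (8, 2), (9, 3)], t_end=10, K=3)
print(n_bar, m_bar)
```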
Some important local metrics for a single station $i$ are the following: the distribution of times between
station loss events, the duration of station outages and the overall fraction of time that $i$ is not a member
of the ring. Some simulation results for these metrics can be found in [189].
Figure 5.1: A single station's life cycle (states: Listen-Token, Ready, Use-Token/Pass-Token, Active-Idle)
5.1.3 Analytical PROFIBUS Ring Membership Model
In the following we describe an analytical model for the behavior of a set of PROFIBUS stations
w.r.t. ring membership. This model allows $\bar{M}$ and $\bar{N}$ to be determined. It depends on some
model parameters, for which in turn approximations are derived using only the protocol description
and some static system parameters.
The purpose of this model is twofold: first it is interesting in itself. Second, by successfully comparing
the qualitative and quantitative behavior of the analytical model's outputs with those of the simulation
model described in Section 5.1.4, the confidence in proper working of the simulation model can be
increased.
The Model
The approach is to derive a discrete time Markov chain (DTMC) from a simplified station life cycle,
employing four states (the PROFIBUS protocol state machine [33, chap. 4.1] has eleven states). The
initial DTMC description has a four-dimensional state description. However, in order to solve for a
steady-state probability vector, the four-dimensional state space is flattened into a one-dimensional
state space.
The station's life cycle is shown in Figure 5.1 as a set of states, which the station visits in some
order within its life. The state Listen-Token corresponds to the state, where a station listens to the
medium, waiting for two successive identical token cycles and not being able to enter the ring (with
the exception of timeout timer expiration). In state Ready a station has successfully received two
successive identical token cycles and thus has a valid LAS, however, it waits for being included in
the ring. In state Use-Token/Pass-Token the station is member of the ring and it currently owns the
token. It either performs some data transmission, pings a station in its gap list or tries to pass the
token to its successor. Finally, when a station is in state Active-Idle, it is a ring member, but does not
currently own the token. All possible transitions can be explained by normal protocol operation, only
the transition from Active-Idle to Listen-Token needs the “hearback removal” condition explained in
Section 5.1.1. In order to obtain a Markovian model, the following assumptions are made:
• Time is slotted, slots have fixed length $T_{Slot}$ (different from the protocol's slot time $T_{SL}$). This
assumption is reasonable if there is no load in the system and only token frames and request
frames for ring inclusion are exchanged. One slot corresponds to one token frame.
• The system is assumed to be time-homogeneous and already running for a long time, hence the
system is in the steady state.
• All feasible state transitions occur for every single station with fixed probability $p_{XY}$, where $X$
and $Y$ are shorthand for the source and target state (e.g., $p_{LU}$ being the probability for a transition
from Listen-Token to Use-Token); for exceptions see below. These fixed probabilities depend on
the current number of ring members, and on environmental conditions, e.g., the error rate. The
stations are independent of each other.
• The protocol allows for two different transitions from state Ready to state Use-Token: in the
first there is no token owner and a station in state Ready experiences a timeout, in the second
a station in Ready state is explicitly included by the current token owner. For the first type
of transition we assume that every station performs an independent Bernoulli experiment in
every slot with fixed probability $p_{RU}$. Upon success one station enters the Use-Token state. For
the second transition type only a single station can be included; this happens with the
state-dependent probability $p_I(i, j)$, where $i$ denotes the number of current ring members (given by
$A + U$) and $j$ denotes the number of stations in state Ready.
• The probability $p_{LR}$ for a transition from the state Listen-Token to the state Ready depends on
$i$, where $i$ is the number of current ring members.
• Bit errors occur independently with fixed rate $p$.
• Station addresses are uniformly distributed in $[0, 126]$.
• If token frames are transmitted correctly, either a transition from Listen-Token to Ready or a
transition from Ready to Use-Token is possible (only one of them at a time); if the token is
erroneous, there can be a hearback error, causing a station to leave the ring, and additionally it
can happen that other stations are skipped (these events do not exclude each other).
• For experiencing a hearback error, we assume that a single erroneous token suffices; however, this
occurs with a probability corresponding to two successive erroneous token frames. Furthermore
it is assumed that the event of an active station being “skipped” by token frames with unfortunate
error patterns is tied to the event of experiencing a hearback error. This assumption is
reasonable, since this event is relatively rare.
A DTMC with a four-dimensional state space $S(K)$ is constructed:
$$S(K) = \{(L, R, U, A) \in \mathbb{N}_0^4 \mid (L + R + U + A = K) \wedge (U \le 1)\}$$
where for $s = (L, R, U, A) \in S(K)$ we define
$L = L(n)$ = number of stations in Listen-Token at time slot $n$
$R = R(n)$ = number of stations in Ready at time slot $n$
$U = U(n)$ = number of stations in Use-Token at time slot $n$
$A = A(n)$ = number of stations in Active-Idle at time slot $n$
for every slot number $n \in \mathbb{N}$. However, the slot number $n$ is dropped in the notation. It is easy
to see that $|S(K)| = K^2 + 2K + 1$ holds.
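The size of the state space can be confirmed by brute-force enumeration. The sketch below lists all tuples $(L, R, U, A)$ satisfying the two constraints; the enumeration order is an arbitrary choice, and the dictionary `h` is one hypothetical way to assign every state a distinct index.

```python
def enumerate_states(K):
    """List all (L, R, U, A) with L + R + U + A = K and U <= 1."""
    states = []
    for U in (0, 1):
        for L in range(K - U + 1):
            for R in range(K - U - L + 1):
                A = K - U - L - R
                states.append((L, R, U, A))
    return states


K = 10
states = enumerate_states(K)
h = {state: index for index, state in enumerate(states)}  # state -> index
h_inverse = states                                        # index -> state
print(len(states))  # (K + 1)**2 = 121 for K = 10
```

For $U = 0$ there are $\binom{K+2}{2}$ states and for $U = 1$ there are $\binom{K+1}{2}$, which add up to $(K+1)^2 = K^2 + 2K + 1$.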
A state $x_n = (L, R, U, A) \in S(K)$ denotes the state of the system at the $n$-th time slot. In the
following we enumerate for every state the possible transitions and give their respective probabilities.
When not explicitly mentioned, the probability for staying within the same state is implicitly defined
by $1 - \sum(\text{all other transition probabilities})$. A specific system usually starts with $x_0 = (K, 0, 0, 0)$.
• $L = K$, $R = 0$, $U = 0$, $A = 0$:
  - Exactly one station experiences a timeout:
    $$\Pr[x_{n+1} = (K-1, 0, 1, 0) \mid x_n = (K, 0, 0, 0)] = b(1; K, p_{LU})$$
    where $b(k; n, p) = \binom{n}{k} p^k (1-p)^{n-k}$ is the probability mass function of the binomial distribution.
• $L = K-1$, $R = 0$, $U = 1$, $A = 0$:
  - The token owner experiences a hearback error:
    $$\Pr[x_{n+1} = (K, 0, 0, 0) \mid x_n = (K-1, 0, 1, 0)] = p_{UL}$$
  - A number of stations enters the Ready state:
    $$\Pr[x_{n+1} = (K-1-k, k, 1, 0) \mid x_n = (K-1, 0, 1, 0)] = (1 - p_{UL}) \cdot b(k; K-1, p_{LR}(1))$$
    where $k \in \{0, \ldots, K-1\}$ holds.
• $L = i$, $R = j$, $U = 1$, $A = 0$ with $i + j + 1 = K$ and $j \ge 1$:
  - The token owner experiences a hearback error:
    $$\Pr[x_{n+1} = (i+1, j, 0, 0) \mid x_n = (i, j, 1, 0)] = p_{UL}$$
  - A number of stations enters the Ready state ($i \ge 1$):
    $$\Pr[x_{n+1} = (i-k, j+k, 1, 0) \mid x_n = (i, j, 1, 0)] = (1 - p_{UL}) \cdot (1 - p_I(1, j)) \cdot b(k; i, p_{LR}(1))$$
    where $k \in \{0, \ldots, i\}$ holds.
  - A single ready station enters the ring and becomes token owner:
    $$\Pr[x_{n+1} = (i, j-1, 1, 1) \mid x_n = (i, j, 1, 0)] = (1 - p_{UL}) \cdot p_I(1, j)$$
• $L = i$, $R = j$, $U = 0$, $A = 0$ with $i + j = K$ and $j \ge 1$:
  - A single station in Ready state experiences a timeout (no transitions from Listen-Token to
    Ready can happen):
    $$\Pr[x_{n+1} = (i, j-1, 1, 0) \mid x_n = (i, j, 0, 0)] = b(1; j, p_{RU})$$
  - A single station in Listen-Token experiences a timeout ($i \ge 1$):
    $$\Pr[x_{n+1} = (i-1, j, 1, 0) \mid x_n = (i, j, 0, 0)] = b(1; i, p_{LU})$$
• $L = i$, $R = j$, $U = 1$, $A = k$ with $i + j + k + 1 = K$ and $k \ge 1$:
  - The token owner experiences a hearback error and in this way a number of stations in
    state Active-Idle feel themselves skipped:
    $$\Pr[x_{n+1} = (i+1+\nu, j, 0, k-\nu) \mid x_n = (i, j, 1, k)] = p_{UL} \cdot b(\nu; k, p_{AL})$$
    where $\nu \in \{0, \ldots, k\}$ holds.
  - A number of stations enters the Ready state ($i \ge 1$):
    $$\Pr[x_{n+1} = (i-\nu, j+\nu, 1, k) \mid x_n = (i, j, 1, k)] = \begin{cases} (1 - p_{UL}) \cdot (1 - p_I(k+1, j)) \cdot b(\nu; i, p_{LR}(k+1)) & : j > 0 \\ (1 - p_{UL}) \cdot b(\nu; i, p_{LR}(k+1)) & : j = 0 \end{cases}$$
    where $\nu \in \{0, \ldots, i\}$ holds.
  - A single ready station enters the ring ($j \ge 1$):
    $$\Pr[x_{n+1} = (i, j-1, 1, k+1) \mid x_n = (i, j, 1, k)] = (1 - p_{UL}) \cdot p_I(k+1, j)$$
• $L = i$, $R = j$, $U = 0$, $A = k$ with $i + j + k = K$ and $k \ge 1$:
  - A station in Listen-Token experiences a timeout and in this way kicks all stations in
    Active-Idle out of the ring ($i \ge 1$):
    $$\Pr[x_{n+1} = (i-1+k, j, 1, 0) \mid x_n = (i, j, 0, k)] = b(1; i, p_{LU})$$
  - A station in Ready experiences a timeout ($j \ge 1$):
    $$\Pr[x_{n+1} = (i, j-1, 1, k) \mid x_n = (i, j, 0, k)] = b(1; j, p_{RU})$$
  - A station in Active-Idle experiences a timeout:
    $$\Pr[x_{n+1} = (i, j, 1, k-1) \mid x_n = (i, j, 0, k)] = b(1; k, p_{AU})$$
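As a plausibility check of the transition rules, the row of the state $(K-1, 0, 1, 0)$ can be written out and its probabilities summed. The sketch below covers only this one state group; the function name and the numeric values passed for $p_{UL}$ and $p_{LR}(1)$ are hypothetical placeholders.

```python
from math import comb


def binom_pmf(k, n, p):
    """b(k; n, p) as used in the transition probabilities."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def row_single_owner(K, p_ul, p_lr_1):
    """Outgoing transitions of state (K-1, 0, 1, 0) as {target_state: probability}."""
    row = {(K, 0, 0, 0): p_ul}  # token owner suffers a hearback error
    for k in range(K):          # k in {0, ..., K-1} stations move to Ready
        row[(K - 1 - k, k, 1, 0)] = (1 - p_ul) * binom_pmf(k, K - 1, p_lr_1)
    return row


row = row_single_owner(K=10, p_ul=1e-3, p_lr_1=0.2)
print(sum(row.values()))
```

The case $k = 0$ is the self-loop of the state, so for this state group the listed probabilities already sum to one.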
This model has a four-dimensional state space. However, common steady-state solution techniques
require Markov chains to be one-dimensional. Hence, a bijective mapping $h$ from the state space $S(K)$
to a one-dimensional state space of exactly the size $|S(K)| = K^2 + 2K + 1$ is needed. This mapping can
be explicitly constructed by simply enumerating the whole space $S(K)$ and assigning every state a distinct
natural number. If $P$ denotes the time-homogeneous state transition matrix for the one-dimensional
Markov chain, then we can determine the steady-state probability vector
$\pi = (\pi_1, \ldots, \pi_{|S(K)|})^T$ as usual by solving the following system of linear equations:
$$\pi^T = \pi^T \cdot P, \qquad \sum_{i=1}^{|S(K)|} \pi_i = 1$$
where $A^T$ denotes the transpose of a matrix $A$. After determining $\pi$ we can translate this
back to a four-dimensional steady-state vector $\pi_{S(K)} = (\pi_{(K,0,0,0)}, \ldots, \pi_{(0,0,1,K-1)})$ using the
inverse mapping $h^{-1}$. The steady-state vector exists for all investigated parameter values.
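The linear system for $\pi$ can be solved with any standard method. The sketch below is one possible pure-Python approach, not the solver used in the thesis: it replaces one balance equation by the normalization condition and applies Gaussian elimination with partial pivoting. The three-state matrix is a hypothetical example, not the PROFIBUS chain.

```python
def steady_state(P):
    """Solve pi = pi * P with sum(pi) = 1 for a row-stochastic matrix P (list of lists)."""
    n = len(P)
    # Rows of (P^T - I); the last equation is replaced by the normalization sum(pi) = 1.
    A = [[P[j][i] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    b = [0.0] * n
    A[n - 1] = [1.0] * n
    b[n - 1] = 1.0
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for c in range(col, n):
                A[row][c] -= factor * A[col][c]
            b[row] -= factor * b[col]
    # Back substitution.
    pi = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(A[row][c] * pi[c] for c in range(row + 1, n))
        pi[row] = (b[row] - tail) / A[row][row]
    return pi


# Hypothetical irreducible three-state chain.
P = [[0.9, 0.1, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 0.2, 0.8]]
pi = steady_state(P)
print(pi)
```

For this example the balance equations give $\pi = (10/17, 2/17, 5/17)$, which the solver reproduces up to rounding.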
Evaluation of Stability Measures
After obtaining the steady-state vector $\pi_{S(K)}$ the $\bar{M}$ and $\bar{N}$ measures can be evaluated as follows:
$$\bar{M} = 1 - \sum_{(L,R,U,A) \in S(K),\; U+A=K} \pi_{(L,R,U,A)}$$
$$\bar{N} = \sum_{(L,R,U,A) \in S(K)} \pi_{(L,R,U,A)} \cdot (U + A)$$
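Given the steady-state probabilities indexed by state, both measures are simple weighted sums. A minimal sketch, assuming the steady-state vector is stored as a dictionary from $(L, R, U, A)$ tuples to probabilities; the example distribution for $K = 2$ is made up for illustration.

```python
def stability_measures(pi, K):
    """Return (M_bar, N_bar) from a dict mapping (L, R, U, A) to steady-state probability."""
    full_ring = sum(prob for (L, R, U, A), prob in pi.items() if U + A == K)
    m_bar = 1.0 - full_ring
    n_bar = sum(prob * (U + A) for (L, R, U, A), prob in pi.items())
    return m_bar, n_bar


# Hypothetical steady-state distribution for K = 2 stations.
example_pi = {(0, 0, 1, 1): 0.7, (1, 0, 1, 0): 0.2, (2, 0, 0, 0): 0.1}
m_bar, n_bar = stability_measures(example_pi, K=2)
print(m_bar, n_bar)
```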
Estimating Model Parameters
The next issue to resolve is the estimation of the model parameters $p_I(i, j)$, $p_{LU}$, $p_{RU}$, $p_{AL}$,
$p_{AU}$, $p_{UL}$ and $p_{LR}(i)$ from some selected system parameters, namely:
• the (constant) bit error rate $p \in (0, 1)$,
• the transmission rate $b$ (bits/sec),
• the gap factor $g$,
• the target token rotation time $T_{TTRT}$,
• the number of stations $K$,
• the protocol slot time $T_{SL}$,
under the assumption of no load in the system.
The probability $p_{UL}$ corresponds to the case where a single station experiences two successive hearback
errors in its token frames. Since a token frame consists of three bytes, each one transmitted with 11
bits, the probability for a single token frame of being correct is given by
$$p_{TokenCorrect} = (1 - p)^{33},$$
hence, the probability of two successive token frames being wrong under independent errors is given
by
$$p_{UL} = (1 - p_{TokenCorrect})^2 = (1 - (1 - p)^{33})^2.$$
Correspondingly, for the Request-FDL-Status frame (six bytes) and its corresponding answer frame
(six bytes) the probability that both frames are successfully transmitted is given by
$$p_{ReqFDLSucc} = ((1 - p)^{66})^2.$$
The model slot time $T_{Slot}$ is for simplicity fixed to 50 bit times (approximately the center value
between the 33 bits long token frame and the 66 bits long Request-FDL-Status frame):
$$T_{Slot} = \frac{50}{b}$$
Within a slot a token frame or a Request-FDL-Status frame can be transmitted. The protocol slot
time $T_{SL}$ is assumed to be an integral multiple of the model slot time $T_{Slot}$:
$$T_{SL} = m \cdot T_{Slot}$$
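To get a feeling for the magnitudes, the formulas above can be evaluated for the parameter values of Table 5.1 ($b$ = 500 kBits/s, $T_{SL}$ = 200 µs). The bit error rate $10^{-3}$ is chosen here as an example from the upper end of the investigated range; the variable and function names are hypothetical.

```python
BIT_RATE = 500_000  # b in bits/s (Table 5.1)
T_SL = 200e-6       # protocol slot time in seconds (Table 5.1)
BER = 1e-3          # example bit error rate p


def p_token_correct(p):
    """Probability that a 33-bit token frame is received without bit errors."""
    return (1 - p) ** 33


def p_ul(p):
    """Probability of two successive erroneous token frames."""
    return (1 - p_token_correct(p)) ** 2


def p_req_fdl_succ(p):
    """Probability that the 66-bit request and the 66-bit answer are both correct."""
    return ((1 - p) ** 66) ** 2


t_slot = 50 / BIT_RATE    # model slot time: 50 bit times
m = round(T_SL / t_slot)  # protocol slot time as a multiple of the model slot time

print(t_slot, m, p_ul(BER), p_req_fdl_succ(BER))
```

With these values the model slot lasts 100 µs and $m = 2$.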
The probability $p_I(i, j)$ is defined to be the probability that the current token owner includes a new
station into the ring by the gap update mechanism (where $i$ denotes the number of current ring
members and $j$ the number of stations in state Ready). Instead of modeling the gap timer, we have
chosen to let every current token owner perform an independent Bernoulli trial. On success, a random
station address is selected and a second random experiment is performed to check whether there is
a station on the selected address and the Request-FDL-Status frame exchange is performed without
error. The probability $p_{Gap}$ that the current token owner performs a gap update trial is defined as:
$$p_{Gap} = \frac{q_1}{q_2} := \frac{\text{mean gap list length}}{\text{number of slots for } g \cdot T_{TTRT}}$$
where $q_1 = \frac{2 \cdot (127 - i)}{i}$ and $q_2 = \frac{g \cdot T_{TTRT}}{T_{Slot}}$. The factor 2 in $q_1$ accounts
for the need of two time slots for the exchange of the Request-FDL-Status frame and its answer frame. Hence
$$p_{Gap} = \frac{2 \cdot (127 - i) \cdot T_{Slot}}{i \cdot g \cdot T_{TTRT}}$$
The probability $p_{AnsFDL}$ that a valid answer frame is received given that the current token owner has
decided to perform a gap update trial to station $k$ can be expressed as:
$$\begin{aligned}
p_{AnsFDL} &= \Pr[\text{frames correct and } k \text{ hits stn in gap list} \mid \text{Request-FDL-Status frame to } k] \\
&= \Pr[\text{frames correct} \mid \text{Request-FDL-Status frame to } k] \cdot \Pr[k \text{ hits stn in gap list} \mid \text{Request-FDL-Status frame to } k] \\
&= p_{ReqFDLSucc} \cdot p_{RequestHits}
\end{aligned}$$
For $i$ stations in the ring and $j$ stations in state Ready, there are in the mean
$$p_{RequestHits} = \frac{j}{\frac{127 - i}{i}}$$
ready stations in the token owner's gap list. Hence,
$$p_{AnsFDL} = \frac{i \cdot j}{127 - i} \cdot p_{ReqFDLSucc}$$
Finally, the probability $p_I(i, j)$ that the current token owner includes a station is given by
$$p_I(i, j) = p_{Gap} \cdot p_{AnsFDL} = \frac{2 \cdot T_{Slot} \cdot j}{g \cdot T_{TTRT}} \cdot p_{ReqFDLSucc}$$
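The cancellation of the factors $(127 - i)$ and $i$ in the product $p_{Gap} \cdot p_{AnsFDL}$ can be verified numerically. The sketch uses the values of Table 5.1 ($g = 6$, $T_{TTRT}$ = 20 msec, 500 kBits/s) and an example bit error rate of $10^{-3}$; function names are hypothetical.

```python
G = 6                  # gap factor (Table 5.1)
T_TTRT = 20e-3         # target token rotation time in seconds (Table 5.1)
T_SLOT = 50 / 500_000  # model slot time: 50 bit times at 500 kBits/s
BER = 1e-3             # example bit error rate p

P_REQ_FDL_SUCC = ((1 - BER) ** 66) ** 2


def p_gap(i):
    """Probability that the token owner starts a gap update trial (i ring members)."""
    return 2 * (127 - i) * T_SLOT / (i * G * T_TTRT)


def p_ans_fdl(i, j):
    """Probability of a valid answer, given a trial (i ring members, j ready stations)."""
    return i * j / (127 - i) * P_REQ_FDL_SUCC


def p_include(i, j):
    """p_I(i, j) as the product of the two factors above."""
    return p_gap(i) * p_ans_fdl(i, j)


def p_include_closed_form(j):
    """Simplified expression; the dependence on i cancels out."""
    return 2 * T_SLOT * j / (G * T_TTRT) * P_REQ_FDL_SUCC


print(p_include(5, 3), p_include_closed_form(3))
```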
Next we determine an estimation jointly for $p_{LU}$, $p_{AU}$ and $p_{RU}$, since they all occur in a scenario
where a station in a given state experiences a timeout and claims the token. A coarse estimation can
be developed as follows: if we assume station addresses to be uniformly distributed over $[0, 126]$, the
mean station address is given by $e \approx 63$. The timeout time for a station with address $n$ is given by
$$T_{TO} = T_{SL} \cdot (6 + 2 \cdot n) = m \cdot T_{Slot} \cdot (6 + 2 \cdot n)$$
so the mean timeout time is then $132 \cdot T_{SL}$ (with $n = e$). We assume that each station in every slot
performs an independent Bernoulli experiment with a fixed probability $p_{RU} = p_{LU} = p_{AU}$. Hence, the
number of slots necessary for a single station experiencing a timeout is a geometric random variable
with success probability $p_{LU}$ (or $p_{AU}$, $p_{RU}$ respectively) and mean value $\frac{1 - p_{LU}}{p_{LU}}$.
For this mean value the relation
$$T_{Slot} \cdot \frac{1 - p_{LU}}{p_{LU}} = 132 \cdot m \cdot T_{Slot}$$
holds, and we conclude
$$p_{LU} = p_{AU} = p_{RU} = \frac{1}{1 + 132 \cdot m}$$
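The two steps of this estimate, the mean timeout of 132 protocol slot times and the resulting per-slot probability, can be reproduced as follows. The value $m = 2$ corresponds to Table 5.1 ($T_{SL}$ = 200 µs with a model slot of 100 µs); the function names are hypothetical.

```python
def mean_timeout_in_protocol_slots():
    """Mean of (6 + 2 * n) over station addresses n = 0, ..., 126."""
    addresses = range(127)
    return sum(6 + 2 * n for n in addresses) / len(addresses)


def p_timeout(m):
    """p_LU = p_AU = p_RU for a protocol slot time of m model slots."""
    return 1 / (1 + 132 * m)


m = 2
p = p_timeout(m)
mean_slots = (1 - p) / p  # mean of the geometric random variable used in the text
print(mean_timeout_in_protocol_slots(), p, mean_slots)
```

For $m = 2$ this gives $p_{LU} = 1/265$, and the geometric mean of 264 model slots matches $132 \cdot m$.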
For determining the probability $p_{AL}$ we note that error skipping can happen only if bit errors in the
token frame occur somewhere in the address bytes and are not detected by the parity bit mechanism
or by failures in the start and stop bits (the token frame is not equipped with a checksum). If an
error occurs in a start or stop bit or in the start delimiter, this will always be detected. We restrict
ourselves to the case where only two bit errors occur within a token frame; higher numbers of bit
errors are neglected. The probability for exactly two bit errors is given by $b(2; 33, p)$. The number of
placements of two errors within the address fields of a token frame such that they are not detected is
given by $16 \cdot 7$, since for the first erroneous bit there are 16 possibilities, while for the second we are
restricted to the remaining seven bits of the same byte. Since the overall number of ways of placing two bit
errors over 33 bits is given by $\frac{33!}{(33-2)!}$ we finally have:
$$p_{AL} = b(2; 33, p) \cdot \frac{16 \cdot 7}{\frac{33!}{(33-2)!}} = \frac{7}{66} \cdot b(2; 33, p)$$
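The reduction of the counting factor to $7/66$ and the resulting value of $p_{AL}$ can be checked with exact fractions. The bit error rate $10^{-3}$ is an example value; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, perm


def binom_pmf(k, n, p):
    """b(k; n, p): probability of exactly k bit errors among n bits."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


undetected = 16 * 7           # ordered placements inside the two address bytes
all_placements = perm(33, 2)  # 33! / (33 - 2)! = 1056 ordered placements
skip_fraction = Fraction(undetected, all_placements)


def p_al(p):
    """Probability that an Active-Idle station is skipped due to an undetected error."""
    return float(skip_fraction) * binom_pmf(2, 33, p)


print(skip_fraction, p_al(1e-3))
```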
For the last unknown probability, $p_{LR}(i)$ (where $i$ is the number of stations currently in the ring) we
recall that two successive identical token cycles without transmission errors have to be detected.
We can think of each token frame as a single Bernoulli experiment with success probability
$\tilde{p} := p_{TokenCorrect} = (1 - p)^{33}$. A station enters the state Ready if it encounters a run of
$2 \cdot i$ successive successes. For the probability $f_n$ that at the $n$-th Bernoulli experiment we observe
$r$ successive successes (with $r = 2 \cdot i$) for the first time, it is known that its generating function is
given by [47, Sec. XIII,7]
$$F(s) = \sum_{\nu=0}^{\infty} f_\nu s^\nu = \frac{\tilde{p}^r \cdot s^r}{1 - \tilde{q} s \left(1 + \tilde{p} s + \ldots + (\tilde{p} s)^{r-1}\right)}$$
where $\tilde{q} = 1 - \tilde{p}$, and the mean value is given by
$$\mu(\tilde{p}, r) = F'(1) = \frac{1 - \tilde{p}^r}{\tilde{q} \tilde{p}^r}$$
However, we do not use the random variable given by the $f_n$'s as a model for the number of slots
necessary for moving from Listen-Token to Ready, but instead we use a geometric random variable
and set its mean value such that it equals $\mu(\tilde{p}, 2 \cdot i)$. Hence, the relation
$$\frac{1}{p_{LR}(i)} = \mu(\tilde{p}, 2 \cdot i)$$
should hold. From this we conclude
$$p_{LR}(i) = \frac{1}{\mu(\tilde{p}, 2 \cdot i)}$$
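The closed form for the mean waiting time until a run of $r$ successes can be cross-checked against the elementary recursion $E_r = (E_{r-1} + 1)/\tilde{p}$ with $E_0 = 0$, which is a standard result and not taken from the thesis. The sketch then evaluates $p_{LR}(i)$ for an example bit error rate of $10^{-3}$; the function names are hypothetical.

```python
def mu(p_success, r):
    """Closed-form mean number of trials until the first run of r successes."""
    q = 1 - p_success
    return (1 - p_success**r) / (q * p_success**r)


def mu_by_recursion(p_success, r):
    """Same mean via the recursion E_r = (E_{r-1} + 1) / p with E_0 = 0."""
    expected = 0.0
    for _ in range(r):
        expected = (expected + 1) / p_success
    return expected


def p_lr(i, ber):
    """p_LR(i): reciprocal of the mean waiting time for 2 * i correct token frames."""
    p_token_correct = (1 - ber) ** 33
    return 1 / mu(p_token_correct, 2 * i)


for ring_members in (1, 5, 10):
    print(ring_members, p_lr(ring_members, 1e-3))
```

As expected, $p_{LR}(i)$ decreases with the ring size $i$, because a longer error-free run of token frames has to be observed.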
5.1.4 Simulation Results
We present simulation results for the global stability metrics defined in Section 5.1.2.
The simulations were performed with a detailed simulation model written in C++ and using the CSIM
simulation library [103]. The model includes parts of the PROFIBUS link layer (SDA, SDN, and SRD
services), the PROFIBUS MAC protocol and a shared medium with the characteristics of the RS 485
PHY. In the shared medium all attached stations including the transmitter see the same signals and
bits (a quite unrealistic assumption for wireless channels), hence the transmitter can perform proper
hearback. All timing properties pertaining to the behavior of the medium (e.g., bit times, required
idle times), and additionally a station’s delay in processing received frames and generating answers
are considered in the model.
Figure 5.2: Logical structure of PROFIBUS simulation model (per station: traffic sources, FDL interface, MAC/FDL protocol engine, packetized PHY interface; all stations attached to the shared medium)
The structure of the simulation model is shown in Figure 5.2. A set of stations is attached to the
shared medium. Each station consists of a variable number of traffic sources, attached via the FDL /
link layer interface to the FDL-/MAC-protocol engine. The packetized PHY interface is an abstraction
of the shared medium, delivering and accepting full packets instead of single bits. Furthermore some
channel-related low level protocol functions are implemented here (e.g., timeout timer handling).
Simulator Validation
The simulator is validated by code inspection, by successful comparison of generated frame sequences
with expected frame sequences, and by comparison of the generated stability measures $\bar{N}$ and $\bar{M}$
with those generated by the analytical model described in Section 5.1.3. Both models were developed
independently.
The fixed parameters common for both models are shown in Table 5.1. Both models assume independent
channel errors with bit error rate (BER) $p$, varying from $10^{-4}$ to $10^{-3}$. Furthermore, there
is no load in the system, therefore only token frames and Request-FDL-Status frames occur on the
medium.

Parameter                     Value
# of stations                 $K = 10$
gap factor                    $g = 6$
target token rotation time    $T_{TTRT} = 20$ msec
bit rate                      $b = 500$ kBit/s
protocol slot time            $T_{SL} = 200$ µs
Table 5.1: Fixed parameters common for the analytical model and the simulation model

Parameter                     Value
# of stations                 $K = 10$
gap factor                    $g = 6$
target token rotation time    $T_{TTRT} = 20$ msec
bit rate                      $b = 500$ kBit/s
protocol slot time            $T_{SL} = 400$ µs
station delay                 100 µs
Table 5.2: Fixed parameters for ring stability simulations
The simulations used for validation are run for 3600 seconds. $N(t)$ is sampled every 100 µs and
the mean values $\bar{N}(3600)$ and $\bar{M}(3600)$ are computed from these values. The confidence intervals
for $\bar{N}(3600)$ and $\bar{M}(3600)$ are very tight and thus not shown (see footnote 2).
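The computation of the mean values from the sampled ring size can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and it assumes that $\bar{M}$ is the fraction of samples in which the ring is incomplete (fewer than $K$ members), which is how the metric is described in the discussion of the results.

```python
def ring_stability_metrics(samples, ring_size):
    """Return (n_bar, sample_variance, m_bar) for equidistant samples N_k of N(t).

    n_bar           -- sample mean of the number of ring members
    sample_variance -- unbiased sample variance of the N_k
    m_bar           -- fraction of samples with an incomplete ring (assumed definition)
    """
    count = len(samples)
    if count < 2:
        raise ValueError("need at least two samples")
    n_bar = sum(samples) / count
    sample_variance = sum((n - n_bar) ** 2 for n in samples) / (count - 1)
    m_bar = sum(1 for n in samples if n < ring_size) / count
    return n_bar, sample_variance, m_bar


if __name__ == "__main__":
    # Toy trace: ten stations, one of them missing for one sample period.
    trace = [10, 10, 9, 10]
    print(ring_stability_metrics(trace, ring_size=10))
```

Because the samples are equidistant in time, the plain sample mean coincides with the time average of the piecewise-constant sampled process.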
In Figure 5.3 the values $\bar{N}(3600)$ from the simulation and $\bar{N}$ from the analytical model are compared.
Both curves have the same shape and their values differ by no more than 5%; however, towards higher
BERs the simulation model gives better results (higher $\bar{N}$ value) than the analytical model. Given
that the analytical model uses many simplifications (e.g., in the protocol state machine), the result is
quite satisfactory.
In Figure 5.4 the values $\bar{M}(3600)$ from the simulations and $\bar{M}$ from the analytical model are displayed.
While both curves have the same shape, the analytical model predicts that the ring is incomplete much
more often than the simulation model does. For high BERs, the $\bar{M}$ value of the analytical model is
about 50% larger than that of the simulation model. This is likely due to the coarse handling of time
in the analytical model.
In summary, the good match of the $\bar{N}$ curves and the fact that the $\bar{M}$ curves have the same shape
increase confidence in the results delivered by the simulator.
Results
In the first set of simulations there are $K = 10$ stations without any external load, thus only token
frames and Request-FDL-Status frames occur. We have chosen this setting to highlight the ring
stability problems clearly; simulations with load are discussed in Section 5.1.5.
Figure 5.3: $\bar{N}(3600)$ and $\bar{N}$ vs. BER (independent errors). Curves: Simulation, Analytical; x-axis: BER
from 0.0001 to 0.001.
Figure 5.4: $\bar{M}(3600)$ and $\bar{M}$ vs. BER (independent errors). Curves: Simulation, Analytical; x-axis: BER
from 0.0001 to 0.001.
Every station always wants to be a member of the ring and there are no failures except transmission
errors. All simulations were run for 3600 simulated seconds; the fixed parameters are shown in Table
5.2. Two different error models were used: independent errors with fixed BER and the two-state
Gilbert/Elliot model [184] (wireless channel models are discussed in Chapter 6). In the Gilbert/Elliot
model (or Gilbert model for short) the channel is always in one of two states: Good or Bad. Within
each state, bit errors are assumed to be independent with a fixed rate. The channel state is modulated
according to a two-state continuous-time Markov chain. For parametrization of the Gilbert model four
values suffice: BER in good state $e_g$, BER in bad state $e_b$ ($e_g \ll e_b$), mean duration of good state
$\lambda$ in seconds, and mean duration of bad state $\mu$ in seconds. With $p_g = \frac{\lambda}{\lambda + \mu}$ and
$p_b = \frac{\mu}{\lambda + \mu}$ being the steady-state probabilities for being in state good or bad, respectively,
the mean BER $m$ is given by
\[ m = p_g \cdot e_g + p_b \cdot e_b. \qquad (5.1) \]
The Gilbert model is very popular for modeling wireless channels due to its simplicity and its ability
to capture bursty error behavior with short term correlation. The (mean) BERs for both error models
are in the range $10^{-4} \ldots 10^{-3}$ (these values are justified from measurements discussed in Chapter 6).
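As an illustration of the channel model, the two-state process can be sampled as alternating Good and Bad periods, and the time-averaged BER can be compared with Equation 5.1. The sketch below is not taken from the simulator: it assumes exponentially distributed state durations with means $\lambda$ and $\mu$ (as implied by a continuous-time Markov chain), a channel that starts in the good state, and hypothetical function names; the numbers in the usage lines are for illustration only.

```python
import random


def gilbert_segments(mean_good, mean_bad, horizon, rng):
    """Yield (start, end, state) periods of the two-state chain up to `horizon` seconds.

    mean_good and mean_bad are mean durations in seconds (not rates).
    """
    t = 0.0
    state = "good"  # assumed initial state
    while t < horizon:
        mean = mean_good if state == "good" else mean_bad
        end = min(t + rng.expovariate(1.0 / mean), horizon)
        yield t, end, state
        t = end
        state = "bad" if state == "good" else "good"


def simulated_mean_ber(e_good, e_bad, mean_good, mean_bad, horizon, seed=1):
    """Time-averaged BER of one sampled channel trajectory."""
    rng = random.Random(seed)
    weighted = 0.0
    for start, end, state in gilbert_segments(mean_good, mean_bad, horizon, rng):
        weighted += (end - start) * (e_good if state == "good" else e_bad)
    return weighted / horizon


if __name__ == "__main__":
    lam, mu, e_g, e_b = 0.06, 0.02, 1e-4, 4e-3  # illustrative values
    analytic = (lam * e_g + mu * e_b) / (lam + mu)  # Equation 5.1
    print(analytic, simulated_mean_ber(e_g, e_b, lam, mu, horizon=3600.0))
```

For a long horizon the simulated value should approach the analytic mean, since the fraction of time spent in each state converges to $p_g$ and $p_b$.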
In Figures 5.5 and 5.6, $\bar{M}(3600)$ and $\bar{N}(3600)$ are displayed, respectively. In both figures we have used
the independent error model with varying BER. Furthermore, in Figure 5.7 the distribution functions
$C(s)$ (the random variable $C$ indicates how long a full ring is stable, see Section 5.1.2) for different
BERs are shown. The nearly vertical line on the left side comes from the time resolution used (5 ms)
and the fact that all distributions have a share between 5% and 21% of their mass within the first 5
ms. The confidence intervals for $\bar{N}(3600)$ are very tight and thus not shown (see footnote 2). In Figure 5.5,
a nearly linear relationship between the BER and the fraction of time where the ring is incomplete can
be observed. For the highest BER this fraction is approximately 1/3. Even more frustrating is the result
that for the lowest investigated BER of $10^{-4}$ a full ring is stable for less than 15 seconds in more than
40% of all cases, even though $\bar{N}(3600)$ and $\bar{M}(3600)$ look good (compare Figure 5.7). This is a serious
problem for real-time applications over error-prone links, since reincluding a lost station takes some
time.
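The distribution of the full-ring lifetime $C$ can be estimated from the sampled ring size. The following sketch shows one possible way to do this; the function names are hypothetical, and dropping the last, still unfinished full-ring period is a design choice of this sketch, not something stated in the text.

```python
def full_ring_durations(samples, ring_size, sample_period):
    """Durations (in seconds) of completed periods during which the ring was full.

    A period still running at the end of the trace is dropped, because its true
    length is unknown (it is right-censored).
    """
    durations = []
    run = 0
    for n in samples:
        if n == ring_size:
            run += 1
        elif run:
            durations.append(run * sample_period)
            run = 0
    return durations


def empirical_cdf(durations, s):
    """Fraction of full-ring periods that lasted at most `s` seconds."""
    if not durations:
        raise ValueError("no completed full-ring period observed")
    return sum(1 for d in durations if d <= s) / len(durations)


if __name__ == "__main__":
    trace = [10, 10, 9, 10, 10, 10, 8, 10]  # toy trace, one sample per second
    lifetimes = full_ring_durations(trace, ring_size=10, sample_period=1.0)
    print(lifetimes, empirical_cdf(lifetimes, 2.0))
```

With a coarse time resolution, all lifetimes shorter than one resolution step fall into the first bin, which is the effect behind the nearly vertical line at the left edge of the plotted distributions.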
In order to show that the protocol is sensitive not only to the overall BER but also to the characteristics
of the error process (specifically: its “burstiness”), we have performed simulations with the Gilbert
model. We have chosen to keep $m = 0.001$, $e_g = 0.0000820$ and $\lambda = 0.061736$ fixed and to vary $\mu$
using values of 5, 10, 20, 30, 40, 50 and 60 ms (see footnote 3), then determining $e_b$ from Equation 5.1.
The burstiness index (BI) is defined to be $\lambda / \mu$. The question whether the ring stability metrics are
invariant to the scale of $\lambda$ and $\mu$ is not further investigated. In Figure 5.8 we present $\bar{N}(3600)$ vs.
BI. Apparently, for more bursty errors (larger BI) this metric decreases. This is to be expected: for
constant $m$ the value $e_b$ increases when BI increases, so it is more likely that a station experiences a
hearback error.
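Solving Equation 5.1 for $e_b$ makes the dependence on the burstiness index explicit. The sketch below uses the fixed values quoted above ($m$, $e_g$, $\lambda$) and the listed values of $\mu$; the function names are hypothetical.

```python
def bad_state_ber(mean_ber, e_good, mean_good, mean_bad):
    """Solve m = p_g * e_g + p_b * e_b (Equation 5.1) for e_b."""
    p_good = mean_good / (mean_good + mean_bad)
    p_bad = mean_bad / (mean_good + mean_bad)
    return (mean_ber - p_good * e_good) / p_bad


def burstiness_index(mean_good, mean_bad):
    """BI = lambda / mu."""
    return mean_good / mean_bad


if __name__ == "__main__":
    m, e_g, lam = 0.001, 0.0000820, 0.061736
    for mu_ms in (5, 10, 20, 30, 40, 50, 60):
        mu = mu_ms / 1000.0
        print(f"mu={mu_ms:2d} ms  BI={burstiness_index(lam, mu):6.2f}  "
              f"e_b={bad_state_ber(m, e_g, lam, mu):.5f}")
```

For $\mu = 5$ ms this gives $e_b \approx 0.012$, and $e_b$ shrinks as $\mu$ grows, so a larger BI corresponds to shorter but more severe bad periods.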
As a visual impression that the number of ring members frequently drops from five or more to one
within a very short time, the evolution of $N(t)$ for the first 100 seconds is displayed in Figure 5.9
(Gilbert errors, $m = 0.001$, $\mu = 20$ ms). A careful analysis of the corresponding simulator traces shows
that often multiple stations are lost simultaneously, and that these breakdowns are indeed caused by
the ring jacking scenario. Furthermore, the frequent transitions from ten members to nine members
are caused by hearback removals. The error skipping scenario is rare: for the worst error parameter
setting (Gilbert errors, $m = 0.001$, $\mu = 5$ ms, $e_b \approx 0.012$) a token frame with undetectable errors is
observed once every minute on average. Therefore this scenario is not considered further.

Footnote 2: The maximum relative error of the $\bar{N}(\cdot)$ value for all simulations is, with 98 percent
confidence, not larger than one percent of the absolute value. Most relative errors are smaller than 0.1
percent. For actually calculating these values within the simulation, $N(t)$ was approximated by a
sampled version $N_k = N(k \cdot T)$ with $T = 100$ µs fixed and $k \in \mathbb{N}$. Accordingly, we calculate $\bar{N}(t)$
with $k_t = \max\{k \in \mathbb{N} : k \cdot T < t\}$ as the sample mean $\bar{N}(t) = \frac{1}{k_t} \sum_{i=0}^{k_t} N_i$ and the
variance $\bar{N}^2(t)$ as the sample variance. Furthermore, in the simulator transient removal techniques were
used for achieving steady-state results.
Footnote 3: The values for $\lambda$ and $e_g$ are calculated directly from [184], while the values chosen for $\mu$
have the same order of magnitude as those from [184].

Figure 5.5: $\bar{M}(3600)$ vs. BER (independent errors). x-axis: BER from 0.0001 to 0.001.

Parameter                     Value
# of active stations          4
# of passive stations         1
target token rotation time    $T_{TTRT} = 20$ msec
bit rate                      $b = 500$ kBit/s
protocol slot time            $T_{SL} = 400$ µs
station delay                 100 µs
Table 5.3: Fixed parameters for ring re-inclusion simulations
A station lost from the ring must be re-included by another station, using the ring maintenance
mechanisms. This may take some time. For an assessment of this time another set of simulations
was performed [189], using the parameters shown in Table 5.3. The scenario consists of four active
stations and one passive station. The active station addresses are 22, 39, 65 and 69, drawn from a
uniform distribution. The active stations offer a load of 20% if all stations are in the ring. The channel
generates independent errors with a fixed BER of $10^{-3}$. We have investigated station outage times.
From the point of view of a single station, a station outage time is the time between the
instant the station gets lost from the ring and its later reinclusion. In Figure 5.10 the mean station
outage time (MSOT) is shown vs. the gap factor. The gap factor is expected to have significant
influence on the ring (re-)inclusion times, and indeed this is confirmed by the figure. In Figure 5.11
the cumulated station outage time (CSOT) is shown, i.e., the fraction of time that a station is not in
the ring. The following points are interesting:
- For all stations except station 22 (lowest station address) the MSOT increases almost linearly
  with the gap factor. Moreover, the slope is greater for higher station addresses. This can be
  explained by the ring jacking scenario described in Section 5.1.1. After a station is lost, it will
  take some time to get re-included.
- The results on the cumulated station outage times are dramatic: for gap factors of around 30
  all active stations except station 22 are ring members for only 50% of the time. This gets worse
  for higher gap factors. Even for small gap factors these stations are not members of the ring for
  approximately 10% of the time. This shows clearly that the deterministic algorithm used for
  station inclusion breaks down under a high bit error rate.
Figure 5.6: $\bar{N}(3600)$ vs. BER (independent errors). x-axis: BER from 0.0001 to 0.001.
The MSOT and CSOT values show the same behavior for a Gilbert channel. Furthermore, varying the
$T_{TTRT}$ value gives equivalent results. The MSOT values increase for increasing load, since there is less
time available for Request-FDL-Status frames. However, the CSOT value decreases for increased load,
since with more data frames the number of vulnerable token frames per fixed unit of time decreases,
and thus fewer stations get lost [189].
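The two outage metrics can be computed per station from a list of outage intervals. The following is an illustrative sketch with a hypothetical function name and made-up interval data; it only encodes the definitions given above (MSOT as the mean outage duration, CSOT as the fraction of the observation time spent outside the ring).

```python
def outage_metrics(outages, observation_time):
    """Return (msot, csot) for one station.

    outages          -- list of (lost_at, reincluded_at) times in seconds
    observation_time -- total simulated time in seconds
    """
    durations = [reincluded - lost for lost, reincluded in outages]
    total = sum(durations)
    msot = total / len(durations) if durations else 0.0
    csot = total / observation_time
    return msot, csot


if __name__ == "__main__":
    # Made-up example: two outages within 100 simulated seconds.
    print(outage_metrics([(1.0, 1.5), (10.0, 11.5)], observation_time=100.0))
```

The two metrics can move in opposite directions, as noted for increasing load: fewer but longer outages raise the MSOT while lowering the CSOT.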
Figure 5.7: Distribution function $C(s)$ for different BERs. Curves: BER = 0.001, 0.0008, 0.0006, 0.0004,
0.0002, 0.0001; x-axis: time (sec); y-axis labeled PDF.
Figure 5.8: $\bar{N}(3600)$ vs. BI for $m = 0.001$ (Gilbert errors).
Figure 5.9: $N(t)$ vs. time (Gilbert errors). x-axis: time in $10^{-4}$ sec.
Figure 5.10: MSOT vs. gap factor (independent errors). Curves: Stations 22, 39, 65, 69; x-axis: gap
factor; y-axis: MSOT (sec).
Figure 5.11: CSOT vs. gap factor (independent errors). Curves: Stations 22, 39, 65, 69; x-axis: gap
factor; y-axis: cumulated SOT (fraction).
5.1.5 Improvements
In this section we propose a new method for setting timeout timers and an additional protocol feature.
The new timer setting tries to prevent the breakdowns of the ring by letting the timeout timer of
current ring members expire before that of stations in state Listen-Token. The additional protocol
feature aims at reincluding lost stations as fast as possible. Since neither of them requires a modification
of frame formats or protocol operation, they are interoperable with the unchanged protocol. Thus,
stations with the modified and the unchanged protocol stack can potentially be operated in the same
PROFIBUS LAN. However, the ability to dynamically influence the timeout timer setting is needed,
which may require an upgrade of today’s ASIC-based protocol implementations. Both methods are
targeted at combating the ring jacking and hearback removal scenarios; avoiding the error skipping
scenario requires a better protection of the token frame and thus a change in frame formats.
The effect of the proposed methods is investigated with simulations, using the same scenarios and
stability metrics as in Section 5.1.4, and with additional simulations taking the effects of system load
and different numbers of active stations into account.
Timeout Calculation
From our simulations and from analysis we have observed that the ring jacking scenario (described in
Section 5.1.1), where the station with the lowest address can destroy the whole ring, occurs frequently.
For station $n$ the timeout value is calculated as follows (see Section 4.1.4):
\[ T_{TO}(n) = (6 + 2 \cdot n) \cdot T_{SL} \]
where $T_{SL}$ is the protocol slot time. The basic problem of this scenario is that the timeout timer may
expire for a station which is in the Listen-Token state and has no valid LAS. If the timer of a station
in the ring (not in the Listen-Token state) expires, the ring stays alive. Thus we propose to make the
timeout calculation state-dependent:
\[ T_{TO}(n) = \begin{cases} (6 + 2 \cdot n) \cdot T_{SL} & : \text{state} \neq \text{listen token} \\ (254 + 6 + 2 \cdot n) \cdot T_{SL} & : \text{state} = \text{listen token} \end{cases} \]
in order to make sure that the timeout timer expires first for stations in the ring and as a result to
avoid ring jacking. The effects of this improvement are shown later in this section.
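The state-dependent rule can be written down directly. The sketch below uses hypothetical names and assumes station addresses in the range 0 to 126, which is an assumption about the PROFIBUS address space rather than something stated in this section. Under that assumption the offset of 254 slots is just large enough that every station in the ring times out before any station in the Listen-Token state.

```python
from enum import Enum


class State(Enum):
    LISTEN_TOKEN = "listen_token"
    IN_RING = "in_ring"  # stands for any state other than Listen-Token


def timeout_slots(address: int, state: State, state_dependent: bool = True) -> int:
    """Timeout in units of the slot time T_SL for the station with the given address."""
    slots = 6 + 2 * address
    if state_dependent and state is State.LISTEN_TOKEN:
        slots += 254
    return slots


def timeout_us(address: int, state: State, t_sl_us: float = 400.0,
               state_dependent: bool = True) -> float:
    """Timeout in microseconds; the default slot time of 400 us is the simulation value."""
    return timeout_slots(address, state, state_dependent) * t_sl_us


if __name__ == "__main__":
    addresses = range(0, 127)  # assumed address range
    latest_in_ring = max(timeout_slots(n, State.IN_RING) for n in addresses)
    earliest_listening = min(timeout_slots(n, State.LISTEN_TOKEN) for n in addresses)
    print(latest_in_ring, earliest_listening)
```

With `state_dependent=False` the sketch reproduces the original rule, under which a listening station with a low address can time out before a ring member with a higher address.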
Fast Reinclusion of Lost Stations
When a station is lost from the ring, it takes some time before it is reincluded. First, the station is
required to observe the same sequence of token frames twice; second, it will not be reincluded before
it is pinged by its predecessor using the Request-FDL-Status frame. We propose to add the following
feature to the protocol: after station $a$ has lost its successor $b$ (i.e., there is no reaction of $b$ to three
consecutive token frames), $a$ waits for two token cycles and then pings $b$ with the Request-FDL-Status
frame as soon as there is token holding time available. This is the earliest moment at which $b$ can be
reincluded, due to $b$'s need for observing two identical token cycles. This procedure should be carried
out independently of the normal ring-inclusion algorithm. Thus it can happen that $a$ includes another
station $c$ during the two token cycles it waits for reincluding $b$. In this case $b$ should only be reincluded
if its address lies in the range between $a$ and $c$; otherwise $c$ will remove itself from the ring, being
skipped by the first token frame $a$ sends to $b$. However, when the ring jacking scenario occurs more
frequently, this protocol extension should be used in conjunction with the new timeout calculation
method, since otherwise fast reinclusion will not happen.
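The address check for the case that $a$ has meanwhile included another station $c$ can be sketched as follows. The function names are hypothetical, and the wrap-around handling assumes that the logical ring is ordered by ascending address, with the highest address passing the token to the lowest; this ordering is an assumption of the sketch.

```python
def lies_between(a: int, b: int, c: int) -> bool:
    """True if address b is passed strictly before c when walking upward from a,
    wrapping around from the highest to the lowest address."""
    if a < c:
        return a < b < c
    return b > a or b < c  # wrap-around case: c has a lower address than a


def may_reinclude(a: int, lost: int, new_successor=None) -> bool:
    """Decide whether station a should ping its lost successor for fast reinclusion.

    new_successor is the station c that a included while waiting, or None.
    """
    if new_successor is None:
        return True
    return lies_between(a, lost, new_successor)


if __name__ == "__main__":
    print(may_reinclude(22, 30, 39))   # lost station sits between a and c
    print(may_reinclude(22, 65, 39))   # reincluding it would skip c
```

Keeping the check separate from the normal gap polling mirrors the requirement that the fast reinclusion procedure runs independently of the regular ring-inclusion algorithm.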
Performance Evaluation
Three different versions of the protocol are compared: the normal protocol without any improvements,
the protocol with the new timeout calculation method, and the protocol with both improvements. The
simulation setup is the same as in Section 5.1.4. The results for $\bar{M}(3600)$ are shown in Figure 5.12,
the results for $\bar{N}(3600)$ are shown in Figure 5.13, both for independent errors and varying BER.
These figures show that the new timeout computation significantly improves stability; the protocol
with both improvements performs best. In Figure 5.14 the sample coefficient of variation for $N$ is
shown. The improvements reduce the variability of $N$. In Figure 5.15 the three protocol versions are
compared for the case of Gilbert errors and varying BI with fixed mean BER $m = 0.001$. The stability
gain of the improvements as compared to the normal protocol is larger for more bursty errors than
for the “smooth” independent errors. As a visual impression, Figure 5.20 shows the evolution of $N(t)$
for the same system as in Figure 5.9 (ten masters, no load, Gilbert errors with $m = 0.001$ and $\mu = 20$
ms), however, with both protocol improvements enabled. It can be seen that most of the
breakdowns visible in Figure 5.9 are removed.
Additionally, the ring jacking scenario also influences the local stability metrics mentioned in Section
5.1.2. One example is the fraction of time that station $i$ is not in the ring. For the station with the
lowest address this fraction is small and nearly independent of the gap factor or the $T_{TTRT}$ value,
while for all other stations this metric depends almost linearly on the gap factor, and furthermore
increases with increasing station address (see ref. [189] for examples).
In order to show that ring stability problems occur also when there is load in the system (and thus
a smaller number of vulnerable token frames per fixed unit of time), two more scenarios were
investigated. In the first scenario there are four active stations, two passive stations, and four traffic
sources, each attached to a different active station. The traffic sources generate requests; the attached
station puts them in a queue of infinite size. Two traffic sources generate requests with a fixed
interarrival time of 10 ms. The corresponding requests lead to frames of 25 bytes size (carrying 16 bytes
of user data), which are acknowledged by the passive station with frames of the same size. The other
sources generate sporadic requests with exponentially distributed interarrival times (10 ms mean value),
destined for the second passive station and with data sizes uniformly distributed between 8 and 30 bytes
(leading to frame sizes between 17 and 39 bytes); however, the acknowledgement carries no data.
Thus, there is a mixture of synchronous and asynchronous traffic.
In the second scenario there are ten active stations and ten traffic sources. The first five sources
are periodic (with 25 ms period), the other sources are sporadic (with 25 ms mean value). Thus in
both scenarios a minimum bandwidth of 35% of the medium bandwidth is devoted to the exchange
of data frames including the acknowledgements, but not including retransmissions. The need for
retransmissions at error rates of $10^{-3}$ saturates the system; higher loads lead to growing request
queues. This is true especially for independent errors; for Gilbert errors the queues can be emptied
during good channel periods. The simulations run for 10000 simulated seconds; the other parameters
(gap factor, $T_{TTRT}$, bit rate, slot time $T_{SL}$) are fixed. The $\bar{N}(10000)$ results for the scenario with
ten stations are shown in Figure 5.16 (independent errors) and Figure 5.17 (Gilbert errors). It can
be seen that for all three protocol versions and both error models this value is better than in the
corresponding simulations without any load. However, for high bit error rates the stability problems
and their dependence on the type of channel errors are still visible. The proposed improvements again
yield a significant gain.
The $\bar{M}(10000)$ values for both station numbers are shown in Figures 5.18 (independent errors) and
5.19 (Gilbert errors) for the normal protocol and the protocol with both improvements. Again, in the
presence of load this metric is better (lower) than for the corresponding simulations without load (not
shown here for Gilbert errors), and the improved protocol version yields the best results. Interestingly,
in both figures the numbers are smaller for fewer stations. While for four stations and ten stations the
times for breaking a full ring are comparable (four stations: mean value $\bar{C} \approx 1.12$ sec, stddev 1.38;
ten stations: mean $\bar{C} \approx 1.17$ sec, stddev 1.27), with ten stations it takes much longer to complete
the ring. Likely the difference stems from the time needed to complete the ring after multiple stations
have been lost at once, as in the ring jacking scenario. If only a single station gets lost, it is reasonable
to expect that reinclusion is slightly faster in the ten station case, since the gap lists are typically
shorter than with fewer stations. Furthermore, for a newly reincluded station there might be some
delay between its reinclusion and the time it starts to poll its gap list, since in the simulation the gap
timer is independent of the station’s state of ring membership. As a result, if more stations need
to be reincluded, a higher delay for ring completion can be expected.
All these findings together confirm the belief that ring instability is an issue for higher bit error rates,
and furthermore that two important sources for instability are the ring jacking and hearback removal
scenario, while the error skipping scenario seems to play a much smaller role. The ring jacking and
hearback removal scenarios can be combatted with the two proposed improvements. Since for lower
bit error rates station losses occur rarely and the improvements are not invoked, they impose no
additional cost in terms of bandwidth or delay.
However, even with these improvements the PROFIBUS protocol is not a good choice. As we show in
Chapter 7, the achievable realtime performance of PROFIBUS is for many scenarios much worse
than that of the investigated polling-based protocols. In Section 5.2 we show the reason for this:
when faced with a wireless-type link with packet losses (see Chapter 6), the stability problems are still
significant.
Figure 5.12: $\bar{M}(3600)$ vs. BER (independent errors). Curves: Normal Protocol, New Timeout, Both.
Figure 5.13: $\bar{N}(3600)$ vs. BER (independent errors). Curves: Normal Protocol, New Timeout, Both.
Figure 5.14: Sample coefficients of variation for $N$ vs. BER (independent errors). Curves: Normal
Protocol, Timeout, Both; y-axis: CoV.
Figure 5.15: $\bar{N}(3600)$ vs. BI for $m = 0.001$ (Gilbert errors). Curves: Normal Protocol, New Timeout, Both.
Figure 5.16: $\bar{N}(10000)$ vs. BER (independent errors) with 10 masters and 36% load. Curves: Normal
Protocol, New Timeout, Both.
Figure 5.17: $\bar{N}(10000)$ vs. BI for $m = 0.001$ (Gilbert errors) with 10 masters and 36% load. Curves:
Normal Protocol, New Timeout, Both.
Figure 5.18: $\bar{M}(10000)$ vs. BER (independent errors) with 4 and 10 masters and 36% load. Curves:
4 stations and 10 stations, each for Normal Protocol and Both.
Figure 5.19: $\bar{M}(10000)$ vs. BI for $m = 0.001$ (Gilbert errors) with 4 and 10 masters and 36% load. Curves:
4 stations and 10 stations, each for Normal Protocol and Both.
Figure 5.20: $N(t)$ vs. time (Gilbert errors, both protocol improvements). x-axis: time in $10^{-4}$ sec.
Parameter                     Value
# of stations                 $K = 10$
gap factor                    $g = 6$
target token rotation time    $T_{TTRT} = 20$ msec
bit rate                      $b = 1$ MBit/s
protocol slot time            $T_{SL} = 400$ µs
station delay                 100 µs
Table 5.4: Fixed parameters for ring stability simulations over wireless channel
5.2 PROFIBUS over Wireless Links
When the PROFIBUS protocol runs on top of an IEEE 802.11 DSSS PHY, some properties of data
transmission change. First, since it is impossible to send and receive simultaneously on the same
channel, the hearback feature is not available. Hence, it is reasonable to expect that the ring jacking
scenario occurs less often. Second, every frame is preceded by the 192 µs long PLCP preamble and
PLCP header, which affects at least some protocol parameters like the slot time $T_{SL}$. And third,
based on the results reported in Chapter 6, it is reasonable to extend the channel error model with
packet losses.
To investigate the PROFIBUS ring stability under wireless conditions, some changes in the protocol,
the channel model and simulation parameters were necessary. With respect to the protocol and
framing rules the following assumptions were made:
- PROFIBUS frames are embedded as they are into the data part of 802.11 DSSS PHY PPDUs
  (see Section 3.2.2). Hence, no 802.11 MAC fields are present and implicit broadcasting is used.
- Every byte is transmitted with eight bits instead of eleven.
- The PLCP preamble and header are assumed to have a length of 192 µs. In addition, a 16 bit
  CRC checksum is appended to every frame, which for simplicity is assumed to detect all bit
  errors.
- The hearback feature is not available; hence, collisions cannot be detected.
- The protocol slot time $T_{SL}$ is at least 400 µs.
- The timeout timer can rely on a “true” carrier sensing facility, which indicates a carrier if
  the received signal strength exceeds some threshold and does not require having achieved bit
  synchronization.
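The effect of these framing rules on the frame duration can be estimated with a small calculation. The sketch below is illustrative only: the function names are hypothetical, the bit rates are the values used in the simulation parameter tables (1 MBit/s wireless, 500 kBit/s wired), and the token frame length of three bytes is an assumption that is consistent with the 33-bit token (eleven bits per byte) used in the analytical model.

```python
PLCP_OVERHEAD_US = 192.0  # PLCP preamble plus header, as assumed above
CRC_BITS = 16             # checksum appended to every frame


def wireless_frame_airtime_us(frame_bytes: int, bit_rate_bps: float = 1_000_000) -> float:
    """Duration of one frame over the wireless PHY: eight bits per byte plus CRC."""
    payload_bits = 8 * frame_bytes + CRC_BITS
    return PLCP_OVERHEAD_US + payload_bits * 1e6 / bit_rate_bps


def wired_frame_airtime_us(frame_bytes: int, bit_rate_bps: float = 500_000) -> float:
    """Duration of one frame over RS-485: eleven bits per byte, no extra overhead."""
    return 11 * frame_bytes * 1e6 / bit_rate_bps


if __name__ == "__main__":
    token_bytes = 3  # assumed token frame length
    print(wireless_frame_airtime_us(token_bytes), wired_frame_airtime_us(token_bytes))
```

Under these assumptions the fixed PLCP overhead dominates the duration of short frames such as the token, which is one reason why parameters like the slot time have to be revisited for the wireless case.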
The channel model was changed to incorporate not only bit errors but also losses of whole packets,
which can occur due to failure of preamble acquisition (Chapter 6). However, all stations see the same
signals on the medium.
Two simple sets of simulations were performed, differing in their respective BER: the first set uses
independent errors with a BER of $10^{-3}$; in the second set no bit errors occur, but only packet
losses. In both sets the packet loss rate (PLR) is varied from 0.0 to 0.1 in steps of 0.01, assuming
independent packet losses. These PLRs are well in the range observed by measurements (see Section
6.5.2).
The scenario consists of $K = 10$ stations with no data load, as in Section 5.1.4 for the PROFIBUS
over RS-485 case. The fixed simulation parameters are summarized in Table 5.4. The simulations are
run for 3600 simulated seconds.
Figure 5.21: $\bar{N}(3600)$ vs. PLR (independent packet losses) and no bit errors. Curves: normal protocol,
new timeout, fast reinclusion, both improvements; x-axis: PLR from 0 to 0.1.
For the case without bit errors in Figure 5.21 the ¯
N(3600) values are presented for varying PLR,
while in Figure 5.22 the ¯
M(3600) values are displayed. Respectively, for the case with bit errors the
corresponding Figures are Figure 5.23 for the ¯
N(3600) values and 5.24 for the ¯
M(3600) values. The
following points are important:
The ring stability is sensitive to packet losses. Even for 6% packet losses without bit errors,
the ring is not full for 50% of the time, while enabling both improvements reduces this fraction
to 30%. However, this is still unacceptable for time-critical communications. If bit errors
occur in addition, the ring is not complete for 56% of the time with both improvements, while
with the unchanged protocol this fraction increases to 78%.
In all figures the curves for the normal protocol and the protocol with the new timeout
computation method are very close; the same holds for the curves for both improvements and the
fast reinclusion feature. Hence, only the fast reinclusion feature gives some gain in ring stability,
while the new timeout method gains nothing. For the RS-485 simulations with hearback the
opposite behavior could be observed. This leads to the conclusion that the ring jacking
scenario occurs only rarely on the wireless link.
To summarize, the original PROFIBUS protocol behaves unacceptably for moderate packet loss rates
of 5% to 10%, as observed in measurements. However, with a wireless link the ring jacking scenario
is not the dominant source of ring instability. Instead, as seen by inspection of simulation logfiles, in
most cases stations get lost because the token frames do not reach them due to packet losses.
Figure 5.22: $\bar{M}(3600)$ vs. PLR (independent packet losses) and no bit errors (curves: normal
protocol, new timeout, fast reinclusion, both improvements)
A worthwhile topic for further research would be to investigate other packet loss patterns, e.g., bursty
patterns or to use directly some traces from measurements. It is also interesting to see what happens
when the “single-channel” assumption is dropped and between every pair of stations a separate channel
is used. Some results for realtime performance in this case are presented in Section 7.3.2.
5.3 Related Work
The behavior of PROFIBUS in the presence of transmission errors, or its ring membership behavior /
ring stability, is to the best of our knowledge not covered in the literature. Most analyses of the PROFIBUS
real-time capabilities ([169], [168, chap. 3, chap. 5]) allow for sporadic transmission errors by taking
retransmissions into account; however, the influence of transient periods during which a station is
involuntarily not a ring member is not considered.
For the IEEE 802.4 Token Bus it is investigated in [82], using analytical techniques and measurements,
how bursty errors affect the token passing process, and how this in turn affects the mean token passing
time and, more importantly, the mean token rotation time. For the PROFIBUS some results on local
stability metrics are available in [189].
5.4 Conclusions
The PROFIBUS protocol was not designed with error prone links in mind. This is manifest in some
design decisions (e.g., to run member (re-)inclusion at low priority, to use only a weak checksum
algorithm, and to not protect the token frame at all) and in the poor ring stability delivered by
the protocol both over RS-485 and wireless-type channels. The need for explicit token passing and
permanent ring maintenance makes the protocol vulnerable to frame losses or (undetected) bit errors.
Figure 5.23: $\bar{N}(3600)$ vs. PLR (independent packet losses) and BER of $10^{-3}$ (curves: normal
protocol, new timeout, fast reinclusion, both improvements)
In the case of the RS-485 link with hearback, it is especially the ring jacking scenario, which influences
ring stability, while in the wireless case it is the loss of token frames. In general, maintaining a
distributed state for ring membership is vulnerable in the presence of link errors.
The bad thing about ring instability is that it may take some time to re-include lost members. During
these outage times the stations are not allowed to transmit data, no matter how critical they are.
Another disadvantage of explicit token passing not discussed so far is the fact that it explicitly requires
a fully meshed topology, i.e., every station must be able to hear all other stations. Modifying the token
passing process such that partially meshed topologies are possible and can be integrated with a wired
PROFIBUS segment is at least challenging.
The proposed improvements can help considerably on wired-type links without packet losses, but as
shown in Section 5.2, the protocol is vulnerable to packet losses on wireless-type links. The improvements
do not avoid these losses; they merely help to re-include a lost station faster. As we will show in Chapter
7, the stability problems are indeed serious for wireless-type media; hence, the realtime performance of
the PROFIBUS protocol is in most cases inferior to that of the polling-based protocols discussed
there.
To summarize, the existing PROFIBUS protocol is not a good candidate for being used on top of a
wireless PHY.
Figure 5.24: $\bar{M}(3600)$ vs. PLR (independent packet losses) and BER of $10^{-3}$ (curves: normal
protocol, new timeout, fast reinclusion, both improvements)
Chapter 6
Error Behavior of Wireless
Channels and its Modeling
In order to design MAC and link-layer protocols with good real-time performance, it is vital to have
some understanding of the error patterns exhibited by the PHY or wireless link (this notion is used
as an abstraction of the ensemble of transmitter, receiver, spread spectrum modem, the channel, the
scrambler, high and intermediate frequency circuitry, and more). This knowledge is important for
several reasons:
The same protocol can show different behavior and performance for different error characteristics.
For example, the results presented in references [201] and [200] show that bursty (Markovian)
bit errors are beneficial for the performance of TCP, as compared to the case of independent
errors with the same mean bit error rate. For the PROFIBUS independent errors result in better
delay performance and stability of the logical token passing ring than bursty errors [189], as is
demonstrated in Chapter 5.
Advance knowledge of the error characteristics can help the protocol designer to select appro-
priate protocol mechanisms, e.g., to choose suitable forward error correction (FEC) schemes
or to find good rules on when to perform retransmissions. As a simple example, time-variable
characteristics call for adaptive mechanisms.
The application layer software and applications are affected, since, in contrast to cable-based
communications, they cannot assume channel outage conditions to be a rare exception. Instead,
safety critical applications over wireless links need to be designed taking longer channel outages
explicitly into account. As a prerequisite, the MAC protocol has to detect channel outages and
to signal these conditions to the applications.
It is often convenient for protocol designers to evaluate protocols with simulations before developing a
prototype implementation and performing complex measurements. A key part of such simulations are
link error models, which, in principle, determine for a transmitted packet which of the receiver stations
see bit errors; sometimes the exact position of errors within a packet is of interest. Often, simulation-
based performance evaluations are done with stochastic link error models, as a convenient alternative
to handling large and clumsy measurement traces. In these models a simple stochastic process, which
often can be described in terms of a few parameters, is used to generate bit error and packet loss
patterns. Most of the stochastic models are not designed to reflect or model physical phenomena, but
simply to reproduce the statistics of given or conjectured error patterns with some accuracy. Hence,
their computational complexity is typically low as compared to, e.g., realistic channel models based
on ray tracing. This simplicity is beneficial for packet level simulations, since often the model has to
be applied to hundreds of thousands of packets and simulation time is becoming an issue. However,
there is a tradeoff between a model’s complexity (and its number of parameters) on the one hand,
and its qualities in matching some given statistics or in giving good performance predictions on the
other hand.
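As an illustration of how such a stochastic link error model generates error patterns from only a few parameters, the following minimal Python sketch contrasts an independent bit error model with a two-state Markov model of the Gilbert/Elliott type. The function names, the parameter names (`p_gb`, `p_bg`, `ber_good`, `ber_bad`) and any values passed to them are assumptions made for this sketch; they are not taken from the measurements of this chapter.

```python
import random


def independent_errors(n_bits, ber, seed=0):
    """Error indicators (1 = bit error) with independent errors at rate `ber`."""
    rng = random.Random(seed)
    return [1 if rng.random() < ber else 0 for _ in range(n_bits)]


def gilbert_elliott_errors(n_bits, p_gb, p_bg, ber_good, ber_bad, seed=0):
    """Error indicators from a two-state (good/bad) Markov chain.

    p_gb: probability of moving from the good to the bad state per bit
    p_bg: probability of moving from the bad to the good state per bit
    ber_good, ber_bad: bit error probability while in the respective state
    """
    rng = random.Random(seed)
    bad = False  # assumption of this sketch: the chain starts in the good state
    errors = []
    for _ in range(n_bits):
        ber = ber_bad if bad else ber_good
        errors.append(1 if rng.random() < ber else 0)
        # State transition after each transmitted bit.
        if bad:
            bad = rng.random() >= p_bg
        else:
            bad = rng.random() < p_gb
    return errors
```

Both functions return a binary sequence of the same length, so the statistics of the generated patterns can be compared directly with those of a measured trace.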
This chapter provides the necessary input for the design of polling schemes for wireless PROFIBUS,
and for the stochastic channel models needed for simulation. The foundation is laid by a measurement
study of wireless link error characteristics in an industrial environment. The focus of this study is
not in “explaining” the results in terms of “physical” phenomena (such as noise sources, propagation
characteristics), but on the statistics of the packet loss and bit error patterns delivered by the wireless
link (via its interface provided by the baseband processor, see Section 6.2.1) to the MAC and link-layer
protocol. The results of these measurements are then used in different ways:
The results allow some basic conclusions to be drawn regarding the design of MAC and link-layer
protocols for a wireless PROFIBUS aiming at achieving good realtime performance.
It turns out that some of the popular wireless error models, e.g., the independent model and the
Gilbert/Elliott model, are not adequate. The statistics of errors generated by the models differ
significantly from those of the traces. Furthermore, for an example communication system, the
predictions of selected performance parameters as generated by the simple stochastic models
deviate significantly from those where a trace is used (see Section 6.7.3). This is taken as an
incentive to design an alternative type of error models, the “bipartite” model.
The measurement data provide “real-world” parameters for several stochastic models.
A structure for an overall channel model can be derived, clearly separating the issues of packet
losses and bit errors.
The measurements were done in an industrial environment to be relevant specifically for design and
simulation of MAC protocols for wireless fieldbus systems. In order to achieve long-term results and
to assess the influence of different parameters, we have chosen to restrict ourselves to a single
scenario (a short-distance non-line-of-sight scenario in a factory building).
For the measurements an IEEE 802.11-compliant radio modem with DSSS modulation was used. It
was possible to obtain a chipset without any upper layer (MAC) functionality (Harris/Intersil PRISM
I chipset as MACless version [2] [73]). When this study started, this chipset was very popular and
used in commercial wireless local area network (WLAN) products.
The measurement setup is constructed such that there is no bias introduced by upper layer protocols
or operating systems. There is no MAC protocol nor any higher layer protocol, just a small engine
for generating well-known packets. Hence, it is possible to have fine grained control over timing and
content of the generated packets. More importantly, using a MAC entity would have introduced
undesired interaction with MAC mechanisms, e.g., packet discarding in case of wrong checksums or
illegal MAC header fields.
Clearly, the study has its limitations. It cannot be extrapolated simply to scenarios other than the
chosen one, nor to other radio modems. One fundamental reason for this difficulty lies in the unique
properties of wireless links in the 2.4 GHz range, as described in Section 6.1. An example is the phe-
nomenon of multipath fading. In general, the wave propagation environment (number of propagation
paths, their respective loss) and its time-varying nature (moving people, moving machines) play a
dominant role in constituting channel characteristics. It is far from being obvious or straightforward
how wave propagation characteristics or presence of noise / interference translate into error behavior.
However, we assume that with the chosen scenario some common characteristics of industrial environ-
ments are captured: presence of strong electrical motors, many (sometimes moving) metal surfaces,
moving people, and machines switching on and off. Furthermore we assume that, although the quan-
titative results like mean bit error rates are likely not valid for other environments, the qualitative
results (time-varying behavior; presence, burstiness behavior and order of magnitude of packet losses;
high variability of error burst lengths) will carry over to similar environments and are important for
designing MAC protocols. This assumption is confirmed by the fact that certain qualitative results
(regarding packet losses and time variability) were also obtained in a similar study in an industrial
environment [52], using a radio modem of a different manufacturer (see Section 6.6).
This chapter is structured as follows. First, in Section 6.1 a brief overview of the physical phenomena
occurring in wireless transmission and leading to transmission errors is given. The next four sections
are devoted to the link error measurements: Section 6.2 explains the measurement setup, Section 6.3
describes the approach for evaluating the measurements. After this, in Section 6.4 we describe the
industrial facility and the environment where the measurements were taken. Finally, in Section 6.5
an overview of the most important measurement results is given.
After reviewing the literature on other bit- and packet-level wireless measurement studies in Sec-
tion 6.6, stochastic modeling is discussed in Section 6.7. Following a brief overview of some popular
stochastic models (Section 6.7.1), an alternative (“bipartite”) model is introduced (Section 6.7.2),
which is intended to overcome some deficiencies of the popular models. The “performance” of differ-
ent stochastic models in matching the measurements statistics and in giving predictions of selected
performance parameters of an example system is investigated in Section 6.7.3.
The final Section 6.8 summarizes several conclusions drawn from the measurements. Besides
discussing the measurement results themselves (Section 6.8.1), the issue of stochastic modeling is
summarized in Sections 6.8.2 and 6.8.3. In the following Section 6.8.4 some implications of the
measurement results for the design of MAC and link-layer protocols are discussed.
Some parts of the work presented here can be found in reference [192].
6.1 Sources of Errors
In this section some basic physical phenomena of wireless transmission and how these lead to distorted
information reception are briefly reviewed. More thorough presentations can be found in [137], [28],
[124], [182], [77], [23], [145]. The main sources of errors are:
Path loss and attenuation on obstacles, leading to slow fading or shadow fading.
Reflection, diffraction, refraction and scattering, causing transmission on multiple paths, resulting
in fast fading (or multipath fading) and intersymbol interference.
Adjacent channel or cochannel interference.
Thermal or man-made noise.
Imperfections of transmitter and receiver.
While thermal noise is present in almost every communication channel, fast fading and slow fading
are specific to wireless transmission. Man-made noise in industrial environments can have several
sources, e.g., remote controls, motors, or microwave ovens.
It must be noted that many of the physical aspects and the resulting bit error behavior depend on
the frequency, the modulation scheme used, and the current environment (e.g., distance, interferers,
number of different paths and their respective loss).
6.1.1 Path Loss
For isotropic antennas the path loss can be modeled approximately as (see Equation 2.8 in [182])
$$P_R = P_T \cdot g_T \cdot g_R \cdot \left(\frac{\lambda}{4\pi}\right)^2 \cdot \frac{1}{d^{\gamma}}$$
where $g_R$ and $g_T$ are the antenna gains of receiver and transmitter, $P_R$ and $P_T$ are the power levels at
receiver and transmitter, $\lambda$ is the wavelength, $d$ is the distance between transmitter and receiver, and
$\gamma$ varies between 2 (free space wave propagation) and 5 (strong attenuation, e.g., due to obstacles).
Some typical values for $\gamma$ are quoted from [145] in Table 6.1. While the details of this equation vary
with the propagation environment and the antenna technology, the qualitative behavior remains the
same: the path loss is at best quadratic in the distance between transmitter and receiver.¹ The path
loss can be shown to be a source of bit corruption even over short distances like in wireless LANs (see
Section 6.6). Furthermore, since wireless receivers typically require the signal strength to be above
some threshold value, cell bounds can be established this way [39]. Unfortunately, most of the path
loss models available (e.g., the Okumura/Hata model [63]) are targeted for larger distances of 1 - 10
kilometres.
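The path loss formula at the beginning of this section can be evaluated numerically. The following Python sketch is only an illustration: the function names are hypothetical, the distance is assumed to be given in metres, and power levels are handled in mW with a helper for the conversion to dBm.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def received_power_mw(p_t_mw, g_t, g_r, wavelength_m, d_m, gamma):
    """P_R = P_T * g_T * g_R * (lambda / (4*pi))**2 / d**gamma."""
    return p_t_mw * g_t * g_r * (wavelength_m / (4.0 * math.pi)) ** 2 / d_m ** gamma


def mw_to_dbm(p_mw):
    """Convert a power level from mW to dBm."""
    return 10.0 * math.log10(p_mw)


# Wavelength for a carrier frequency of 2.4 GHz (about 12.5 cm).
wavelength_2g4 = SPEED_OF_LIGHT / 2.4e9
```

With a path loss exponent of 2, doubling the distance reduces the received power by a factor of 4 (about 6 dB); with an exponent of 4 the factor is 16 (about 12 dB).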
The notion of slow fading refers to significant changes in the mean value of the received signal strength,
as they occur due to significant changes in distance between transmitter and receiver, or by moving
through tunnels or beyond large obstacles. Slow fading phenomena usually occur on longer timescales,
they often coincide with human activity (e.g., mobility). For short durations in the range of a few
seconds the channel can be assumed to have constant path loss. According to [126] in certain situations
the value of the receiver power level $P_R$ fluctuates according to a lognormal distribution about its mean
value.
¹ For indoor scenarios, path loss exponents γ < 2 were sometimes observed, see Table 6.1. These are likely due to
constructive interference generated by the presence of multiple paths with only small delay differences.

Environment                      γ
Free space                       2
Urban area cellular radio        2.7 - 4
Shadowed urban cellular radio    5 - 6
In-building line of sight        1.6 - 1.8
Obstructed in-building           4 - 6
Obstructed in factories          2 - 3
Table 6.1: Path loss exponents for different environments

6.1.2 Multipath Fading
Figure 6.1: Multipath fading (transmitter Tx, receiver Rx)
Signals transmitted in the 2.4 GHz frequency band are subject to reflection, diffraction, refraction,
and scattering. An immediate result is that a signal may travel on multiple different paths from
transmitter to receiver (see Figure 6.1). Since these paths usually have different lengths, multiple
copies of the same signal with different phase angles overlap at the receiver (delay spread). This has
two consequences:
The resulting signal can be amplified or attenuated (constructive or destructive interference),
depending on the relative phase shift (signal strength variation). This fast fading may lead to
loss of received power of up to 40 dB.
The delay spread leads to intersymbol interference, since signals belonging to different informa-
tion symbols may arrive at the same time.
If the stations move relative to each other, the number of paths and their phase shifts vary in time,
thus giving a fast fluctuating signal strength at the receiver, however, with nearly constant mean value
on short timescales. The mean value may vary on longer timescales due to changes in distance or
moving beyond obstacles, both leading to slow fading.
If the delay spread is small relative to the duration of a channel symbol, the channel is called non-
frequency selective or flat, otherwise it is called frequency-selective.
In general it is a hard task to predict the number of paths and their relative strength, since accurate
information about the environment would be needed, including all objects, their material, trajectories
of moving people, and so forth. So usually one resorts to stochastic models, where the number of
paths, their phase shifts and relative strengths at the receiver are modeled as random variables and
the resulting signal strength/signal phase pair at the receiver as a complex-valued random process
$\{r(t)\}_{t \in \mathbb{R}}$. This random process is likely to show some correlation, since for continuous waveforms the
sum signal at the receiver is continuous, at least for time intervals where the number of paths does
not change. For the design of coding and modulation schemes, the following properties of $\{r(t)\}_{t \in \mathbb{R}}$
are of interest: the distributions of the phase $\Psi(t)$ and amplitude $A(t)$ corresponding to $r(t)$, its level
crossing rate, and the fade duration. These are important since most wireless receivers require $A(t)$ to
be above some threshold value $A_{\min}$ in order to be able to successfully detect and decode a signal. For
example, the fade duration can be helpful in designing interleavers. The level crossing rate is defined
as the rate with which the stochastic process $\{|r(t)|\}_{t \in \mathbb{R}}$ crosses some fixed value $r_0$ with negative slope
(hence, entering a deep fade). The fade duration is defined as the duration that the process is below
some fixed value $r_0$. A more thorough discussion about the computation of these values from the
signal strength process can be found in [147], [148] and [78]. According to [138], there are no general
expressions for the probability distribution of the fade durations available, not even for the popular
cases of Rayleigh fading and Rice fading, which are summarized below:
Rayleigh fading: in this case a large number of paths of nearly equal signal strength between
transmitter and receiver is assumed. Using the central limit theorem [47, chap. 10], the resulting
signal strength at the receiver can be shown to be a Rayleigh process and the resulting amplitude
is Rayleigh distributed. This model is often assumed for macrocellular environments without a
LOS path [145].
Rice fading: as opposed to the Rayleigh fading case, a dominant signal path is present, e.g.,
caused by a LOS path. Here the resulting signal strength is a Rice process and the amplitude
is Rice-distributed. The probability of having a LOS component increases with smaller cell
diameter [28, p. 34]. For this reason Rice fading is the more appropriate model for LAN
environments.
Some stochastic bit error models are built from the behavior of the fading process, e.g., the
Wang/Moayeri model [184], which uses a Rayleigh fading assumption (see below).
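The definitions of level crossing rate and fade duration given earlier in this section can be made concrete for a sampled amplitude sequence. The sketch below is a simple empirical estimate, not the analytical computation referenced in the literature; it assumes equidistant samples with spacing `dt`, and the function name and arguments are hypothetical.

```python
def fade_statistics(amplitudes, threshold, dt):
    """Estimate the level crossing rate and the fade durations.

    amplitudes: sampled values of |r(t)|, taken every `dt` seconds
    threshold:  the fixed level r_0
    Returns (downward crossings per second, list of fade durations in seconds).
    """
    crossings = 0
    durations = []
    run = 0            # number of consecutive samples below the threshold
    prev_below = None  # unknown before the first sample
    for a in amplitudes:
        below = a < threshold
        if below:
            # A downward crossing needs a preceding sample above the level.
            if prev_below is False:
                crossings += 1
            run += 1
        elif run:
            durations.append(run * dt)
            run = 0
        prev_below = below
    if run:
        # The fade is cut off by the end of the observation window.
        durations.append(run * dt)
    observation_time = len(amplitudes) * dt
    rate = crossings / observation_time if observation_time > 0 else 0.0
    return rate, durations
```

A fade that is already in progress at the first sample contributes a (truncated) duration but no crossing, since its start lies outside the observation window.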
6.2 Measurement Setup
In this section we give a brief overview of the measurement equipment. A more detailed discussion
of the setup can be found in [196] and [197].
6.2.1 IEEE 802.11 / PRISM I PHY
The IEEE 802.11 DSSS PHY is described in Section 3.2 in some detail. For building the measurement
setup we have used a MACless radio modem (based on the Harris/Intersil PRISM I chipset [2]), which
is compliant with the IEEE 802.11 DSSS PHY. It offers the following modulation types/bitrates: 1
MBit/s with differential binary phase shift keying (DBPSK), 2 MBit/s with differential quaternary
phase shift keying (DQPSK), 5.5 MBit/s with binary m-ary bi-orthogonal keying (BMBOK), 5.5
MBit/s with complementary code keying (CCK), 11 MBit/s with quaternary m-ary bi-orthogonal
keying (QMBOK) and 11 MBit/s with CCK. The CCK modes are compliant with IEEE 802.11, the
BMBOK and QMBOK modes are only present for compatibility reasons and were not used. It is
possible to attach two antennas to the modem and to use receiver diversity (i.e., the receiver selects the
antenna with the maximum signal level). The transmitter power was fixed at 18 dBm, corresponding to
63 mW. The radio modem basically consists of high frequency circuitry and a baseband processor.
The latter accepts and delivers a serial bit stream from upper layers, optionally performs scrambling
(employing a shift register with feedback), performs DSSS processing, and generates / receives PHY
packets [73]. The characteristics of the serial bit stream are the focus of interest.
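The correspondence between the transmitter power of 18 dBm and roughly 63 mW quoted above follows directly from the definition of dBm as a logarithmic power level relative to 1 mW:

$$P = 10^{P_{\mathrm{dBm}}/10}\ \mathrm{mW} = 10^{18/10}\ \mathrm{mW} = 10^{1.8}\ \mathrm{mW} \approx 63.1\ \mathrm{mW}$$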
The PHY packet format of the PRISM chipset is exactly the same as prescribed by the 802.11 standard
for the DSSS PHY (Figure 3.3 and Section 3.2). If the PLCP header checksum is wrong or the signal
field carries an unknown value, the whole packet is discarded by the baseband processor.
6.2.2 Measurement Setup
We used two dedicated stations, a transmitter station and a receiver station, which do not change
their roles during a measurement. The setup is sketched in Figure 6.2. The basic idea is that the
transmitter station sends a well-known packet stream over the wireless link, which is captured and
stored by the receiver station into a logfile. For generation and reception of the packets we have used
a microcontroller board carrying the radio modem and a separate processor (Motorola PowerQUICC
[111] with Tundra PCI Interface [171] and a 50 MHz PowerPC 603e processor). The coupling to
the (Windows NT-based) host is achieved with a segment of 64 kByte shared memory, denoted as
host interface. We call this board a wireless network interface card (NIC). The wireless NIC contains
a specific measurement application and neither MAC functionality nor any higher layer protocols.
This way we have fine grained control over the packet generation and reception process and no bias is
introduced by upper layer protocols. Specifically, using a MAC entity would have introduced undesired
interaction with MAC mechanisms, e.g., packet discarding in case of wrong checksums or illegal MAC
header fields, and might also have had an unwanted influence on the packet’s sending time. In particular,
the 802.11 MAC with its carrier sensing mechanism tries to avoid interference, while for measurement
purposes it can sometimes be of interest to see its influence.

Figure 6.2: Measurement setup. Transmitter station: Tx module on the wireless NIC (PowerQUICC),
TxNICCtrl and TxCtrl on the host PC (Windows NT), coupled via host interface / shared memory.
Receiver station: Rx module on the wireless NIC (PowerQUICC), RxNICCtrl and RxCtrl on the host PC
(Windows NT), coupled via host interface / shared memory and writing trace files. The well-known
packet stream sent over the wireless link arrives as a "not so well-known" packet stream;
synchronization signals are exchanged over Ethernet.
We briefly discuss the different software modules of our measurement setup, see Figure 6.2.
The Tx module is located on the wireless NIC of the transmitter station. It accepts config-
uration commands from the TxNICCtrl module discussed below (allowing to set the variable
parameters), and generates a well-known packet stream, see Section 6.2.3.
The Rx module is also located on the wireless NIC. Its main task is to capture packets from the
wireless link, to add metainformation (e.g., timestamps, packet size, reception status, and signal
strength) and to pass them to the host via the host interface (which puts them in a logfile). The
resulting stream of received packets is called a trace.
The TxNICCtrl module and RxNICCtrl module are wrappers which offer a command line inter-
face to the Tx module and Rx module.
The TxCtrl module is a script which synchronizes itself with the RxCtrl software for controlling
the measurements (using a TCP connection over the Ethernet).
The RxCtrl module actually controls a whole measurement. It loops over all desired values of
variable parameters; for each combination of parameters a packet stream is started (by triggering
the TxCtrl module) and the trace is logged onto the harddisk.
The evaluation of the traces is done off-line, employing several Perl scripts.
Our setup enables variation of several parameters, which are related both to the properties of the
radio modem and packet-stream generation. The important modem-related parameters are shown in
Table 6.2 and the set of packet-stream-related parameters is shown in Table 6.3.
Parameter            Description
ScramblingEnabled    Determines whether scrambling is used
DiversityEnabled     Determines whether receiver antenna diversity is used
PreambleLength       Number of bits for PHY preamble
ModulationCode       Distinguishes modulation used for data portion: 1 MBit/s BPSK, 2 MBit/s QPSK,
                     5.5 MBit/s CCK, 5.5 MBit/s BMBOK, 11 MBit/s CCK, 11 MBit/s QMBOK
Table 6.2: Adjustable radio parameters

Parameter            Description
NumPackets           Number of packets
GapTime              Time gap between two packets
NumChunks            Number of chunks per packet; packet length = NumChunks times 288 bits
Table 6.3: Adjustable packet stream parameters
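The loop over all parameter combinations performed by the RxCtrl module can be pictured as an iteration over a parameter grid. The following Python sketch only illustrates this idea; the original control software is not shown in this text, the function name is hypothetical, and the values in the grid are placeholders rather than the settings actually used in the measurements.

```python
from itertools import product


def parameter_combinations(grid):
    """Yield one dict per combination of the values listed in `grid`."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


# Placeholder values; the parameter names follow Tables 6.2 and 6.3.
example_grid = {
    "ScramblingEnabled": [True, False],
    "ModulationCode": ["1M-BPSK", "2M-QPSK", "5.5M-CCK", "11M-CCK"],
    "NumChunks": [1, 3],
}

configurations = list(parameter_combinations(example_grid))
```

Each resulting configuration would correspond to one packet stream and hence to one trace.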
Our setup was tested in laboratory measurements and in other measurement campaigns [59] in
controlled environments and works fine. In several traces, either no bit errors, packet losses, or other
packet-related phenomena occurred at all, or it was possible to relate the observed phenomena to
environmental conditions.
6.2.3 Format of the Generated Packet-Stream
The transmitter station generates a packet stream. What the receiver captures after its activation is
called a trace. If no errors occur, the trace is the same as the packet stream. The format of the packet
stream was chosen such that
the number of 0’s and 1’s are equal
long runs of 0’s or 1’s are avoided
it suffices to have a fraction of the packet (denoted as a chunk) correctly received in order to
determine which packet it originally was.
Especially the last property enables bit-by-bit comparison of a received packet with the transmitted
packet.
The generated packet stream consists of a prescribed number of packets (according to the NumPackets
parameter), which are transmitted at equidistant start times, all packets having the same values for
all parameters, including packet size. The data part of a packet consists of an integral number of
chunks. For generating a chunk, every bit of a 32 bit sequence number is mapped to eight bits (with
0 ↦ 11000011 and 1 ↦ 00111100), making up 256 bits. Additionally, header (0xffff) and trailer
(0x0000) are generated, thus a chunk has an overall size of 288 bits. The sequence numbers are
incremented from chunk to chunk. For example, with NumChunks = 3 chunks per packet, the first
packet of a packet stream carries sequence numbers 0, 1, and 2, the second packet 3, 4, and 5 and so
forth.
It is immediately clear that this format has the same number of zeros and ones and that long runs of
either value are avoided. A received sequence of 288 bits length is considered as a correctly received
chunk, if header and trailer match their specified values and if all bytes in between are either 0xC3 =
11000011 or 0x3C = 00111100.
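The chunk format described in this section can be sketched in a few lines of Python. This is an illustration and not the code running on the wireless NIC: the function names are hypothetical, and the sketch assumes that the bits of the sequence number are encoded most significant bit first, which the text does not specify.

```python
HEADER = b"\xff\xff"
TRAILER = b"\x00\x00"
ZERO_BYTE = 0xC3  # a 0 bit is mapped to 11000011
ONE_BYTE = 0x3C   # a 1 bit is mapped to 00111100


def encode_chunk(seq):
    """Build the 288 bit (36 byte) chunk for a 32 bit sequence number."""
    if not 0 <= seq < 2 ** 32:
        raise ValueError("sequence number must fit into 32 bits")
    # Assumption: most significant bit of the sequence number first.
    body = bytes(ONE_BYTE if (seq >> (31 - i)) & 1 else ZERO_BYTE for i in range(32))
    return HEADER + body + TRAILER


def is_correct_chunk(chunk):
    """Check header, trailer, and that every body byte is 0xC3 or 0x3C."""
    return (
        len(chunk) == 36
        and chunk[:2] == HEADER
        and chunk[-2:] == TRAILER
        and all(b in (ZERO_BYTE, ONE_BYTE) for b in chunk[2:-2])
    )


def decode_chunk(chunk):
    """Return the sequence number of a correctly received chunk, else None."""
    if not is_correct_chunk(chunk):
        return None
    seq = 0
    for b in chunk[2:-2]:
        seq = (seq << 1) | (1 if b == ONE_BYTE else 0)
    return seq


def packet_payload(packet_index, num_chunks):
    """Data part of the packet with the given index (counted from 0)."""
    first = packet_index * num_chunks
    return b"".join(encode_chunk(first + i) for i in range(num_chunks))
```

Since a single correctly received chunk yields its sequence number, the index of the transmitted packet follows as the sequence number divided by NumChunks (integer division), which is what makes the bit-by-bit comparison with the transmitted packet possible.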
6.3 Measurement Evaluation Methodology
Much of the evaluation of the measurements uses the notion of indicator sequences or the more specific
binary indicator sequences. First the corresponding definitions are given, then the use of such sequences
in measurement evaluation is described.
6.3.1 Indicator Sequences
In general, an indicator sequence is a finite sequence of natural numbers; the numbers in a binary
indicator sequence are restricted to the values zero and one. As a convention, in binary indicator
sequences we associate a 1 with an error event (e.g., an erroneous bit or a lost packet) and a 0 with
the correct event. A binary indicator sequence can be viewed as a finite segment of a sample path of a
random process $\{B_n\}_{n \in \mathbb{N}}$, where each $B_i$ is a Bernoulli random variable.
Binary indicator sequences are subdivided into error bursts and error-free bursts according to a burst
order $k_0$. We define an error-free burst of order $k_0$ to be a maximum-length contiguous all-zero
subsequence with a length of at least $k_0$ (this is the threshold used in the example below). In contrast,
an error burst of order $k_0$ is a subsequence of at least one bit length and with ones at its fringes.
Furthermore, within an error burst at most $k_0 - 1$ consecutive zeros are allowed.
By this definition a binary indicator sequence $i_1 i_2 \ldots i_m$ of length $m$ is segmented into $p$
alternating error bursts and error-free bursts (these definitions are similar to those used in [91]). The
length of the $j$-th error-free burst is denoted as $X_j$, the length of the $j$-th error burst is denoted as $Y_j$,
and $Z_j$ is the actual number of ones occurring in the $j$-th error burst. We can form the burst length
sequence²:
$$X_1, Y_1, Z_1 \quad X_2, Y_2, Z_2 \quad \ldots \quad X_p, Y_p, Z_p$$
Let us denote the sequence $X_1 X_2 \ldots X_p$ as the error-free burst length sequence, $Y_1 Y_2 \ldots Y_p$ as the error
burst length sequence and $\frac{Z_1}{Y_1} \frac{Z_2}{Y_2} \ldots \frac{Z_p}{Y_p}$ as the error density sequence. As an example, take the binary
indicator sequence 00100101000110001100000. With burst orders of $k_0 = 1$ and $k_0 = 2$ we get the
burst length sequences
$k_0 = 1$: 2,1,1  2,1,1  1,1,1  3,2,2  3,2,2  5,0,0;
$k_0 = 2$: 2,1,1  2,3,2  3,2,2  3,2,2  5,0,0.
It is important to note that by recording only the number $Z_j$ of errors within error burst $Y_j$ we
lose information about the exact error positions.
² We write $X_j = 0$ or $Y_j = 0$ to denote the absence of a burst at the fringes of a binary indicator sequence.
Furthermore, the notation does not explicitly indicate the dependence on $k_0$.
Using the notion of burst length sequences, some simple statistics can be computed, e.g., the mean
error rate $\bar{e}$ or the mean error burst length $\bar{Y}$:
$$\bar{e} = \frac{\sum_{j=1}^{p} Z_j}{\sum_{j=1}^{p} (X_j + Y_j)}, \qquad \bar{Y} = \frac{1}{p} \sum_{j=1}^{p} Y_j.$$
Accordingly, some other simple first order statistics [variance, coefficient of variation (CoV)] can also
be computed for the burst length sequence.
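As a rough sketch, these statistics can be computed from a list of $(X_j, Y_j, Z_j)$ triples as shown below. The function name and dictionary keys are hypothetical, and the use of the population standard deviation for the CoV is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def burst_statistics(triples):
    """Mean error rate, mean error burst length and CoV of the error
    burst lengths for a list of (X_j, Y_j, Z_j) triples."""
    total_errors = sum(z for _, _, z in triples)
    total_bits = sum(x + y for x, y, _ in triples)
    error_burst_lengths = [y for _, y, _ in triples]
    mean_y = mean(error_burst_lengths)
    return {
        "mean_error_rate": total_errors / total_bits,
        "mean_error_burst_length": mean_y,
        # Assumption: population standard deviation; undefined if mean_y == 0.
        "cov_error_burst_length": pstdev(error_burst_lengths) / mean_y,
    }


# Burst length sequence of the example above for k0 = 2.
example = [(2, 1, 1), (2, 3, 2), (3, 2, 2), (3, 2, 2), (5, 0, 0)]
stats = burst_statistics(example)
print(stats)
```

For the example, seven errors in 23 bits give a mean error rate of 7/23, and the mean error burst length is 8/5.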
Taking a binary indicator sequence $i_1 i_2 \ldots i_m$ as a sequence of identically distributed Bernoulli random
variables, the conditional probability $\Pr[i_{n+k} = 1 \mid i_n = 1]$ for $1 \le k \le m - n$ is of some interest. It is
approximated as follows (frequency-based approach):

$$\Pr[i_{n+k} = 1 \mid i_n = 1] \approx \frac{\#\text{cases with } i_n = 1 \text{ and } i_{n+k} = 1}{\#\text{cases with } i_n = 1} = \frac{\sum_{j=1}^{m-k} i_j \cdot i_{j+k}}{\sum_{j=1}^{p} Z_j}.$$
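The frequency-based estimate translates almost directly into code. The following is a minimal sketch with a hypothetical function name; as in the formula, the denominator counts all ones in the sequence, including those in the last $k$ positions.

```python
def conditional_error_probability(bits, k):
    """Estimate Pr[i_{n+k} = 1 | i_n = 1] for a 0/1 sequence."""
    ones = sum(bits)                      # equals the sum of all Z_j
    if ones == 0:
        raise ValueError("sequence contains no error events")
    joint = sum(bits[j] * bits[j + k] for j in range(len(bits) - k))
    return joint / ones


example = [int(c) for c in "00100101000110001100000"]
print(conditional_error_probability(example, 1))
```

For the example sequence used earlier there are seven ones and two pairs of directly adjacent ones, so the estimate for $k = 1$ is 2/7.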
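This conditional probability is related to the correlation function of the binary indicator sequence,
since, with the assumption of identically distributed $i_k$ (with mean $\bar{e}$ and variance $\sigma^2 = \bar{e}(1 - \bar{e})$), we have:

$$\begin{aligned}
\mathrm{Corr}[i_n, i_{n+k}] &= \frac{\mathrm{Cov}[i_n, i_{n+k}]}{\sqrt{\sigma^2 \sigma^2}} \\
&= \frac{E[i_n i_{n+k}] - E[i_n] E[i_{n+k}]}{\bar{e}(1 - \bar{e})} \\
&= \frac{E[i_n i_{n+k}] - \bar{e}^2}{\bar{e}(1 - \bar{e})} \\
&= \frac{\Pr[i_{n+k} = 1 \mid i_n = 1] \cdot \bar{e} - \bar{e}^2}{\bar{e}(1 - \bar{e})} \\
&\approx \Pr[i_{n+k} = 1 \mid i_n = 1]
\end{aligned}$$

where the approximation holds for small values of $\bar{e}$. Here we have used that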
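$$\begin{aligned}
E[i_n i_{n+k}] &= \sum_{x, y \in \{0,1\}} x y \Pr[i_n = x, i_{n+k} = y] \\
&= \Pr[i_n = 1, i_{n+k} = 1] \\
&= \Pr[i_{n+k} = 1 \mid i_n = 1] \cdot \Pr[i_n = 1] \\
&= \Pr[i_{n+k} = 1 \mid i_n = 1] \cdot \bar{e}.
\end{aligned}$$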
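The size of the approximation error can be seen by cancelling $\bar{e}$ in the exact expression. In the block below, $p_k$ is shorthand introduced here for $\Pr[i_{n+k} = 1 \mid i_n = 1]$; the numerical values are purely illustrative and are not measurement results.

```latex
\mathrm{Corr}[i_n, i_{n+k}]
  = \frac{p_k \, \bar{e} - \bar{e}^2}{\bar{e}(1 - \bar{e})}
  = \frac{p_k - \bar{e}}{1 - \bar{e}}
% Illustration: \bar{e} = 0.05 and p_k = 0.7 give 0.65 / 0.95 \approx 0.684,
% close to p_k itself. In the uncorrelated case p_k = \bar{e} the
% correlation is exactly 0.
```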
A more in-depth treatment of binary indicator sequences can be found in [85].
6.3.2 Trace Evaluation
Given a single trace, the focus of interest is on the packet losses and on the bit error behaviour of the
packets actually received. Both are expressed as binary indicator sequences. In a preprocessing step
other packet impairments (e.g., ghost packets, truncated packets, bit-shifted packets) are identified
and the corresponding packets are marked as lost packets (more details and a justification are given
in Section 6.5.1).
The packet loss indicator sequence (PLIS) of a single trace is constructed by marking lost packets
with a 1 and received packets with a 0. This sequence only displays lost packets while ignoring bit
errors: received packets of correct length but with bit errors are marked with a 0.
The bit error indicator sequence (BEIS) of a single trace is constructed by XORing every received
(i.e., possibly erroneous) packet $r_i$ with its corresponding transmitted (error-free) packet $t_i$
($i \in \{1, \ldots, \mathrm{NumPackets}\}$):

$$\nu_i = r_i \;\mathrm{XOR}\; t_i$$

The results $\nu_1 \ldots \nu_{\mathrm{NumPackets}}$ are then simply concatenated in the order of increasing packet numbers.
In the BEIS any information about packet boundaries, lost packets, or packet gap times is completely
ignored.
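A minimal sketch of the BEIS construction is given below. It assumes that received and transmitted packets are available as equal-length `bytes` objects and that bits are read most significant bit first; both are assumptions made for this illustration, and the function name is hypothetical.

```python
def bit_error_indicator_sequence(received, transmitted):
    """XOR each received packet with its transmitted counterpart and
    concatenate the resulting bits in packet order."""
    beis = []
    for r, t in zip(received, transmitted):
        for r_byte, t_byte in zip(r, t):
            diff = r_byte ^ t_byte        # a 1 bit marks a bit error
            # Assumption: most significant bit first.
            beis.extend((diff >> shift) & 1 for shift in range(7, -1, -1))
    return beis


received = [b"\x0f", b"\x00"]
transmitted = [b"\x0e", b"\x00"]
print(bit_error_indicator_sequence(received, transmitted))
```

In this toy example only the last bit of the first one-byte packet differs, so the sequence contains a single 1 at position 8 of 16.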
The BEIS can be seen as the available input of a MAC protocol or a coding scheme.
While the PLIS is typically analyzed with $k_0 = 1$, for the BEIS several values of $k_0$ were used to get
more insight into the burst structure.
6.4 Measurement Parameters and Environment
To avoid confusion, we will use the following definitions: a measurement campaign consists of one or
more measurements.
The set of tunable parameters is given in Tables 6.2 and 6.3. For a single measurement, a subset of
these parameters is kept fixed, while the remaining parameters are variable. Furthermore, a suitable
range of values for the variable parameters must be chosen. Two measurements are distinguished by
their choice of variable parameters and parameter ranges. Within a measurement, for each parameter
setting a packet stream is generated. Hence, within a packet stream all packets have the same
parameters and are transmitted at equidistant times.
We have conducted two measurement campaigns in an industrial environment, namely at the
Produktionstechnisches Zentrum (PTZ) in Berlin, Germany. The PTZ is a research facility for machinery
engineering, supported by industry and academia. The first campaign was performed on June 26,
2000 and its main purpose was to evaluate our measurement setup and to find out which phenomena
are important [196]. The second campaign took place from Aug. 28 to Aug. 30, 2000 [197]. The focus
here is on the second campaign.
6.4.1 Environment
The PTZ owns a large factory building which contains several machines of different types and with
people walking around all the time. The ground plan of the building has the shape of a circle. At the
fringe of the circle is a path which can be used by small vehicles, while the inner circle contains the
machinery. During both campaigns we have chosen the same positions for placing our setup within
the building. The choice came from asking the PTZ people where they would place both stations (for
a discussion of whether the chosen position was a “best case” or “worst case” position see Section
6.8.1). In Figure 6.3, we show the relative position of our measurement equipment in the factory
building, while in Figure 6.4 we show the close neighbourhood, especially the machines that are in
close proximity. We have investigated an NLOS scenario, with a closet in between the transmitter and
receiver station, and the die sinking electrical discharge machine (EDM) working area very close to the
direct line. Both stations are 7-8 meters apart and stationary during the measurement campaigns.
The receiver station was in close proximity (1 m) to a cabinet containing the power supply for a
huge 5-axis milling machine, which, however, was not operating during the second campaign. The
die sinking EDM was active most of the time, except when changing the workpiece. A second EDM
machine was located behind the first one (see Figure 6.4). It was used by PTZ staff almost all the
time. At the ceiling, at a height of 8 meters, was a portal crane, capable of moving around 20 tons.
Its motors are placed at the fringe’s end of the portal crane. The crane was used during the first two
days of the second campaign.

[Figure 6.3: Position of our setup within the building. Labels: Machinery Area, Our Setup, Portal Crane.]
Instead of investigating different scenarios with a restricted set of measurements, we have chosen
to focus on the single scenario described above. This concentration allowed us to get more
in-depth insight into different aspects of wireless transmission (e.g., long-term behavior). The reason
for choosing an NLOS scenario is that the measurement results should help in the design of MAC
protocols and coding schemes for industrial WLANs, where the response to bad channel conditions is
of particular interest for assessing a protocol’s reliability.
6.4.2 Parameters
The second measurement campaign was designed to assess: a) the packet loss and packet impairment
behavior on short and long timescales, b) the long-term bit error rate behavior, c) the dependency of
the bit error behavior on packet sizes and modulation types, and d) the dependency of packet losses
and impairments and bit error behavior on the scrambling mode. Furthermore, no interferers in the
same frequency band (e.g., IEEE 802.11 LANs) were present.
We have performed three different measurements within the second campaign: the longterm1
measurement is a long-term measurement performed with a single modulation type and packet size, only
varying the scrambling mode (addressing a), b), and d)). The longterm2 measurement is the same
as the longterm1 measurement, except that another pair of PRISM radio modems was used. In the
factorial measurement we have varied the scrambling mode, modulation type and packet sizes (thus
addressing c)), and for each combination of parameters the short-term bit error behavior was
investigated. The main purpose of the longterm2 measurement was to confirm that the observed
phenomena are not due to the particular pair of radio modems used. Indeed, our results confirm this
belief and allow us to restrict the discussion to the results obtained with the first pair of radio modems
(the longterm1 measurement and factorial measurement) [194].

[Figure 6.4: Setup of PTZ measurement. Labels: Rx, Tx, Closet, Milling Machine Power Supply Cabinet, Die Sinking Electrical Discharge Machine, Working Area, 5 axes milling machine ("A really HUGE"), Electrical Discharge Machine, Fringe Path.]
We have chosen for the longterm1 and longterm2 measurements to keep all parameters fixed,
except the scrambling mode (on, off) and the pair of radio modems used (see Table 6.4). In both
measurements we have taken 90 traces for every scrambling mode. With 90 traces, 2 hours and 10
minutes are covered. Within a measurement the traces are numbered consecutively, thus increasing
trace numbers correspond to increasing time. Within the longterm1 measurement the first 90 traces
are taken without scrambling, the other 90 traces with scrambling. For the factorial measurement
we have chosen a full factorial design [75, chap. 16], varying the modulation type, packet size, and
the scrambling mode as summarized in Table 6.6, while keeping the other parameters fixed (Table
6.5). For every combination of parameters two traces were taken. Depending on modulation type and
packet size the trace duration is between 30 seconds and 1000 seconds. Traces 1 to 56 are taken
without scrambling, the traces 57 to 112 with scrambling. Within each of the two groups we have
varied the modulation scheme from low bitrates to high bitrates and for each modulation scheme we
have varied the packet sizes from small packets to large packets.
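The trace numbering of the full factorial design can be sketched as follows. The parameter values are those of Table 6.6, and the chunk size of 36 bytes follows from the byte counts given there. The assumption that the two traces of each parameter combination are numbered consecutively is ours; it is consistent with the trace numbers quoted in the figure captions of this chapter. All identifiers are hypothetical.

```python
from itertools import product

MODULATIONS = ["1 MBit/s BPSK", "2 MBit/s QPSK", "5.5 MBit/s CCK", "11 MBit/s CCK"]
NUM_CHUNKS = [3, 9, 14, 28, 56, 112, 167]
CHUNK_BYTES = 36           # 3 chunks correspond to 108 bytes
TRACES_PER_SETTING = 2


def factorial_design():
    """Map trace number -> (scrambling, modulation, packet size in bytes)."""
    design = {}
    trace = 1
    # product() varies the last factor fastest: scrambling is the slowest
    # factor, then modulation, then packet size.
    for scrambling, modulation, chunks in product([False, True], MODULATIONS, NUM_CHUNKS):
        for _ in range(TRACES_PER_SETTING):
            # Assumption: repeated traces of one setting are consecutive.
            design[trace] = (scrambling, modulation, chunks * CHUNK_BYTES)
            trace += 1
    return design


design = factorial_design()
print(len(design), design[83])
```

With 2 scrambling modes, 4 modulation types, 7 packet sizes and 2 traces per setting this yields 112 traces, of which the first 56 are without scrambling.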
Parameter            Value
PreambleLength       128 bits
DiversityEnabled     True
Frequency/Channel    12
NumPackets           20000
NumChunks            14 (504 bytes)
GapTime              1000 µs
ModulationCode       2 MBit/s QPSK

Table 6.4: Fixed parameters for longterm1 and longterm2 measurements

Parameter            Value
PreambleLength       128 bits
DiversityEnabled     True
Frequency            12
NumPackets           20000
GapTime              1000 µs

Table 6.5: Fixed parameters for factorial measurement

Parameter            Value
ScramblingEnabled    True, False
ModulationCode       1 MBit/s BPSK, 2 MBit/s QPSK, 5.5 MBit/s CCK, 11 MBit/s CCK
NumChunks            3, 9, 14, 28, 56, 112, 167 (corresponding to 108, 324, 504, 1008, 2016, 4032, 6012 bytes)

Table 6.6: Variable parameters for factorial measurements
6.5 Measurement Results
In this section we discuss the most important measurement results. A more in-depth presentation can
be found in [192] and [197], which in turn rely on data presented in [194], [195], and [193].
6.5.1 Packet-Related Phenomena
There are three major types of transmission errors (compare the PHY PPDU format described in
Section 3.2): Failure to acquire bit synchronization or to properly detect the start frame delimiter, an
error in the header fields (e.g., wrong value in signal field or CRC error), and bit errors in the packet’s
data part.
Failing to acquire bit synchronization leads to packet losses, discussed in Section 6.5.2. An error in
the header fields leads to other packet-related phenomena (ghost packets, missized packets), which are
discussed in more detail in [197, chap. 2 and 5].
It is possible to distinguish between getting no bit synchronization and the other packet-related
phenomena, since the baseband processor generates an interrupt when it has acquired bit synchronization
and detected the SFD field. For lost packets this interrupt is missing. Therefore, we can conclude
that packet losses are due to not acquiring bit synchronization.
Packets with bit shifts are an interesting phenomenon. In this case, the baseband processor delivers
the correct packet length and an appropriate number of bits. However, somewhere in the packet’s data
(typically at the beginning of a packet’s data part, but this is not a general rule) some random bits are
inserted into or deleted from the bit stream, and a corresponding number of bits is truncated or added
at the packet’s end. As a result, the following bit sequence is a left- or right-shifted version of the
original sequence. We have no validated explanation for this phenomenon, but after several personal
discussions with independent experts in communication systems the most educated guess is that it is
due to the bit synchronization algorithm used in the receiver. The “bit shift patterns” change when
the scrambling mode or the environment is varied (see [197], [59]); hence, bit shifts are influenced by
the environment. Furthermore, in other measurements no bit-shifted packets were observed over long
time spans [59]. Neither circumstance fits the assumption of a (permanently) faulty
setup; hence, we assume that our setup is working correctly.
For every trace of the longterm1 measurement the fate of each of its packets was determined, and
the “fate sequences” of all traces were concatenated in order of increasing trace numbers. The possible
fates are: packet ok (this does not imply the absence of bit errors), packet lost, bit-shifted, missized,
ghost packet. Then we have investigated bursts of subsequent packets having the same fate. In Table
6.7 some simple statistics of these bursts are summarized. There are several important points. First,
the rates of bit-shifted, missized (truncated and oversized) or ghost packets are negligible. Second,
the phenomena of truncated, oversized, and ghost packets tend to occur in single packet bursts,
as indicated by their small variations and means. The occurrence of bit-shifted packets is slightly
more bursty. The overall packet loss rate (PLR) is 6.3%, which is non-negligible and needs to be
considered in accurate channel models.
Ghost packets, bit-shifted packets and missized packets are treated as lost packets. This is reasonable,
since their rates are low and they have the tendency to occur paired with lost packets [197, chap. 5.1].
                % packets   BL: mean   BL: CoV   BL: max
packet ok       93.5144     51.1254    20.0712   101158
packet loss     6.2789      3.6173     17.2987   14936
bit shifted     0.0324      1.5116     0.8980    20
truncated       0.0383      1.0430     0.1946    2
oversized       0.0197      1.0456     0.2064    3
ghost packet    0.1160      1.0278     0.1880    4

Table 6.7: First order statistics of compound packet fate sequence of longterm1 measurement (BL:
burst length)

[Figure 6.5: Packet loss rate vs. trace number for longterm1 measurement. Axes: Trace Number, Loss Rate.]
6.5.2 Packet Losses
In this section we discuss the packet loss behavior found in the longterm1 measurement, as manifested
in the corresponding packet loss indicator sequence (PLIS), see Section 6.3.2. Instead of “error bursts”
for the PLIS we will use the term “packet loss bursts”.
For the receiver station a lost packet is indistinguishable from the case that no packet was sent at
all. Therefore, for detecting lost packets the timestamps of the packets in the logfile were used (these
are generated by the receiver station upon finishing packet reception), together with the property
of equidistant packet start times. The difference of subsequent timestamps is compared with the
InterpacketTime (given as the sum of the GapTime, the time needed for the fixed length header (see
Figure 3.3), and the known duration of the data part). From this comparison the number of lost
packets is easily computed.
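The timestamp comparison can be sketched as follows. This is an illustration only: the function name, the microsecond unit and the rounding to the nearest whole number of packet slots are assumptions, not details taken from the measurement software. Losses before the first or after the last received packet cannot be seen by this method.

```python
def plis_from_timestamps(timestamps_us, interpacket_time_us):
    """Build a packet loss indicator sequence (1 = lost, 0 = received)
    from the reception timestamps of the received packets."""
    if not timestamps_us:
        return []
    plis = [0]                                 # first received packet
    for previous, current in zip(timestamps_us, timestamps_us[1:]):
        # Number of packet slots between two received packets; rounding
        # absorbs small timing jitter (assumption).
        slots = round((current - previous) / interpacket_time_us)
        plis.extend([1] * max(slots - 1, 0))   # packets missing in between
        plis.append(0)                         # the packet received at `current`
    return plis


print(plis_from_timestamps([0, 3000, 6000, 15000], 3000))
```

In the usage example the last gap spans three slots, so two packets are marked as lost between the third and fourth received packet.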
The packet loss rates are time-varying. To show this, in Figure 6.5 the PLR of individual traces
vs. the trace number for the longterm1 measurement is presented; it should be noted that this
figure spans more than four hours. The PLRs are sometimes very high (more than 80%) and strongly
varying. Figure 6.6 offers a possible explanation: it shows the “portal crane function” for the longterm1
measurement. This function displays the distance of the portal crane to our setup (0 =
directly above the setup, 1 = no more than five meters away, 2 = more than five meters away). It can
be seen that, except for a peak at traces one and two, the PLRs have the highest rates and the highest
degree of fluctuation when the portal crane is close to the setup. During the factorial measurement
the portal crane was not active and the PLR was always below 10%.

[Figure 6.6: Position of portal crane (0=close proximity, 1=short distance, 2=longer distance) for
longterm1 measurement. Axes: Trace Number, Position.]
Next we consider the “burstiness” of packet losses. To get summary information, we have formed the
compound packet loss indicator sequence (COMP-PLIS) by concatenating the PLIS of all longterm1
traces in order of increasing trace number. The COMP-PLIS is a binary indicator sequence. It is
analyzed with burst order $k_0 = 1$ (only consecutive packet losses belong to the same packet loss burst).
Fraction of Received Packets                     93.5144%
Fraction of Lost Packets                         6.4855%
Received Packets: Mean BL                        51.1254
Lost Packets: Mean BL                            3.5457
Received Packets: CoV BL                         20.0712
Lost Packets: CoV BL                             17.1956
Received Packets: Max. BL                        101158
Lost Packets: Max. BL                            14936
Pr[Packet n+1 lost | Packet n lost]              0.7179
Pr[Packet n+1 received | Packet n received]      0.9804

Table 6.8: First order statistics of burst lengths (BL) of the longterm1 measurement (COMP-PLIS,
$k_0 = 1$)

[Figure 6.7: Cumulative distribution functions of packet loss burst lengths and packet loss-free burst
lengths (for COMP-PLIS). Axes: Burst Length in # of Packets, Pr[Burst Length ≤ x]. Curves: Packet Loss Bursts, Packet Loss Free Bursts.]

                  factorial   longterm1   longterm2
w scrambling      9885        34755       26392
w/o scrambling    20411       191456      45159

Table 6.9: Number of lost packets
The main statistics of the packet-loss-burst lengths and packet-loss-free burst lengths are summarized
in Table 6.8, and their respective distribution functions are shown in Figure 6.7. The overall packet
loss rate (PLR) is 6.4% and thus non-negligible. The packet-loss bursts are typically short (95%
of all bursts last ten packets or less), but their lengths are highly variable, and very long bursts can
be observed (long tailed distribution). The packet-loss-free burst length distribution is even more
variable, has a higher mean value and a longer tail, which fortunately leads to long periods of no
packet losses. Another view of the burstiness of packet losses is given by the conditional probability
that packet $n+k$ is lost given that packet $n$ is lost, shown in Figure 6.8: it decays monotonically
from 0.71 for $k = 1$ to 0.44 for $k = 2000$. Hence, by the approximation given in Section 6.3.1,
packet losses are strongly correlated over several hundreds of packets. In the uncorrelated case the
conditional probability would be equal to the packet loss rate of 6.3% for all $k$.
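For burst order $k_0 = 1$ the one-step conditional probabilities in Table 6.8 are tied to the mean burst lengths: a burst of length $L$ contributes $L - 1$ transitions that stay in the same state and one transition that leaves it, so the probability of staying is approximately $1 - 1/\text{mean BL}$ (neglecting the end of the sequence). The short check below uses the mean burst lengths from Table 6.8; the function name is hypothetical.

```python
def one_step_conditional_probability(mean_burst_length):
    """Pr[next packet in same state | current packet in that state]
    for bursts of consecutive equal indicators (burst order 1)."""
    return 1.0 - 1.0 / mean_burst_length


p_lost_given_lost = one_step_conditional_probability(3.5457)
p_received_given_received = one_step_conditional_probability(51.1254)
print(round(p_lost_given_lost, 3), round(p_received_given_received, 3))
```

Both values agree with the corresponding entries of Table 6.8 to about three decimal places.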
[Figure 6.8: Conditional probability that packet n+k is lost given that packet n is lost (for COMP-PLIS).
Axes: k, Pr[Packet n+k lost | Packet n lost].]

Let $X_1 \ldots X_p$ and $Y_1 \ldots Y_p$ be the corresponding packet-loss-free burst lengths and packet-loss burst
lengths. In order to find out whether the burst length sequences $X_1 X_2 \ldots X_p$ and $Y_1 Y_2 \ldots Y_p$ show
correlation, their autocovariance functions $R_X(k)$ and $R_Y(k)$ were computed, using a standard
approximation formula [18]:

$$R_X(k) \approx r_k = \frac{c_k}{c_0} \qquad (k \in \{1, \ldots, p-1\})$$

where

$$c_k = \frac{1}{p} \sum_{t=1}^{p-k} (X_t - \bar{X})(X_{t+k} - \bar{X})$$
The rule of thumb is that if for some $k \ge 1$ the absolute value of $r_k$ exceeds 0.2 then there is more
than weak correlation and thus the $X_i$ cannot be independent. In Figure 6.10, the autocovariance
function for the packet loss-free burst length sequence $X_1 \ldots X_p$ is shown, in Figure 6.9, the same is
displayed for $Y_1 \ldots Y_p$. The conclusion is that the packet loss burst lengths are uncorrelated (and thus
can be modeled as independent), while for the packet loss-free burst lengths there is more than weak
correlation on the first five lags, then correlation gets weak.
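The estimator $r_k = c_k / c_0$ and the 0.2 rule of thumb can be sketched as below. The function name and the toy input are hypothetical; the sketch assumes a non-constant sequence, since $c_0 = 0$ would otherwise lead to a division by zero.

```python
def normalized_autocovariance(x, max_lag):
    """Return [r_1, ..., r_max_lag] with r_k = c_k / c_0."""
    p = len(x)
    x_bar = sum(x) / p

    def c(k):
        # c_k = (1/p) * sum_{t=1}^{p-k} (X_t - mean)(X_{t+k} - mean)
        return sum((x[t] - x_bar) * (x[t + k] - x_bar) for t in range(p - k)) / p

    c0 = c(0)
    return [c(k) / c0 for k in range(1, max_lag + 1)]


# Toy burst length sequence that alternates strictly between 1 and 2.
r = normalized_autocovariance([1, 2, 1, 2, 1, 2], max_lag=2)
more_than_weak = [abs(value) > 0.2 for value in r]
print(r, more_than_weak)
```

For the alternating toy sequence both lags exceed the threshold, as expected for strongly dependent values.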
A surprising observation is documented in Table 6.9, which shows the overall number of lost packets
for the different measurements with and without scrambling. Packet losses occur significantly more
often if scrambling is disabled. Furthermore, not shown here, with scrambling the packet loss bursts
are typically shorter than without scrambling. We have not found any clear dependency of packet
losses on the modulation scheme used.
[Figure 6.9: Autocovariance function of packet loss burst lengths for the compound loss sequence and
$k_0 = 1$ for longterm1 measurement (for COMP-PLIS). Axes: Lag, ACVF.]

[Figure 6.10: Autocovariance function of packet loss-free burst lengths for the compound loss sequence
and $k_0 = 1$ for longterm1 measurement (for COMP-PLIS). Axes: Lag, ACVF.]

[Figure 6.11: Positions of bit errors, factorial trace 83 (QPSK modulation, with scrambling, 6012
bytes packet size). Axes: Bit Position of Error, Frequency.]
6.5.3 Positions of Bit Errors
Bit errors do not occur in all positions of a packet’s data part with equal probability. This is shown
by way of example in Figure 6.11 for a QPSK trace (factorial measurement) and in Figure 6.12 for
a 5.5 MBit/s CCK trace, where for the first 2000 bit positions within a packet the number of bit
errors occurring at this position during a trace is displayed. These figures are representative of the
patterns occurring for the respective modulation types (provided that one looks at those traces where
the number of errors is sufficiently high; clearly, for traces with only a few errors the figures appear
more irregular).
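Such a per-position count can be sketched as follows, assuming that the error pattern of each received packet is available as a 0/1 list (for example the XOR of received and transmitted packet). The function name and the input representation are assumptions made for this illustration.

```python
def error_position_histogram(error_patterns, max_positions=2000):
    """Count, for each bit position within a packet, how often a bit
    error occurred at that position over all packets of a trace."""
    counts = [0] * max_positions
    for pattern in error_patterns:
        for position, bit in enumerate(pattern[:max_positions]):
            if bit:
                counts[position] += 1
    return counts


patterns = [[0, 1, 0, 1], [0, 1, 0, 0]]
print(error_position_histogram(patterns, max_positions=4))
```

In the toy input the second bit position is hit in both packets and the fourth in one of them.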
There is a peak at the beginning of a packet’s data part. For QPSK and BPSK traces without
scrambling it is frequently found between bit 200 and 250, for traces with scrambling often a peak
at positions 80 to 120 is present.
The figures for the BPSK and QPSK traces show some periodicity. From inspection, for BPSK traces
the basic period is 64 bits, for QPSK it is 128 bits. This periodicity is visible with and without
scrambling, however, as a visual impression, with scrambling the effect seems to be more pronounced.
In [197] some figures are presented which, for selected QPSK traces, show the conditional probability
that bit n+kis wrong given that bit nis wrong (calculated over the respective bit error indicator
sequence (BEIS), see Section 6.3). These figures indicate that indeed often bit errors have a distance
of 128 bits.
As for the bit-shifted packets, we have no validated explanation for both phenomena, but again we
find it highly probable that they are due to artifacts of the wireless receiver’s bit synchronization
algorithm.
[Figure 6.12: Positions of bit errors, factorial trace 37 (5.5 MBit/s CCK modulation, no scrambling,
2016 bytes packet size). Axes: Bit Position of Error, Frequency.]

Modulation           MBER w/o scrambl.   MBER w/ scrambl.
BPSK                 2.5571e-05          0.0003
QPSK                 7.4428e-05          0.0001
CCK (5.5 MBit/s)     0.0018              0.0399
CCK (11 MBit/s)      0.0544              0.0589

Table 6.10: Mean Bit Error Rates for different modulation types (factorial measurement)
6.5.4 Mean Bit Error Rates
The 11 MBit/s CCK and 5.5 MBit/s CCK traces are excluded from further discussion, since they are
extremely error prone. For example, in trace 52 (11 MBit/s CCK, 2016 bytes packet size, without
scrambling) 19729 out of 20000 packets do not contain a single well-formed chunk (footnote 3). Many of the
5.5 MBit/s CCK traces are also extremely error prone (especially with scrambling) and thus are
excluded.
The mean bit error rate (MBER) per trace is time-varying over several orders of magnitude, even for
the same modulation type and packet size. To illustrate this, we show in Figure 6.14 the MBERs
for the longterm1 measurement; it should be noted that this figure spans more
than four hours (see Section 6.4.2). The MBERs reach higher values more often for the traces with
scrambling (footnote 4) (traces 91 to 180); our data supports this. For the factorial measurement the MBER
vs. trace number graph (shown in Figure 6.13 for the BPSK and QPSK traces) has the same characteristic
of spanning several orders of magnitude.

[Figure 6.13: Mean bit error rate vs. trace number for remaining traces (logarithmic scale). Axes: Trace Number, Mean BER.]

[Figure 6.14: Mean bit error rate vs. trace number for longterm1 measurement (logarithmic scale). Axes: Trace Number, Mean BER.]

Footnote 3: For many packets it was not possible to compute the error rate, since it was not possible to determine the
corresponding expected packet. For those packets where the expected packet could be guessed, bit error rates of 25% to 30%
are easily reached.

Footnote 4: This is true for both the factorial and longterm1 measurements, taken with the same modem set. For the
longterm2 measurement both scrambling modes show approximately the same mean bit error rate. A possible
explanation is as follows: the scrambler XORs the received bit stream with the (already internally XORed) contents of a shift
From looking at the MBERs, only a few clear patterns emerge: MBERs seem to be higher with
scrambling enabled, and furthermore, the MBERs increase with transmission speed, as is shown in
Table 6.10. The BPSK modulation shows the best error rates, followed by the QPSK scheme. Other
patterns, e.g., dependency on packet sizes, were not clearly visible; they are likely overshadowed by
the inherently time-varying nature of the link.
6.5.5 Burst Length Statistics
As described in Section 6.3.2, for every trace its bit error indicator sequence (BEIS) $i_1 i_2 \ldots i_m$ was
formed. Hence, for some given burst order $k_0$ there is a number of $p$ error-free bursts and error bursts;
their respective lengths are given by $X_1 X_2 \ldots X_p$ and $Y_1 Y_2 \ldots Y_p$.
In Figure 6.15 the mean error burst length of a trace vs. $k_0$ is shown for some BPSK traces, while in
Figure 6.16 the same is shown for selected QPSK traces. The respective curves are typical for BPSK
and QPSK traces. In general, the mean error burst length clearly increases when increasing $k_0$ from $k_0'$
to $k_0'' > k_0'$. The same is true for the mean error-free burst lengths, since the error-free bursts of length
$l$ with $l > k_0''$ survive as they are, while the error-free bursts of length $l$ with $k_0' < l < k_0''$ disappear
and are not considered in mean burst length calculation. From Figure 6.16, for QPSK traces one can
observe a trend to “step functions” with the steps having a distance of 128. After inspection of
several traces it shows that this behavior is due to the periodicity of bit errors described in Section
6.5.3. A similar behavior, however, with a period of 64, can be observed for BPSK traces (for other
traces not shown in Figure 6.15 the “step function” character is more pronounced).
It is interesting to look along the time axis: for selected values of $k_0$ in Figure 6.17 the mean error burst
length vs. the trace number for the longterm1 measurement is displayed. It can be seen that, even
for all parameters fixed, the mean error burst lengths vary substantially over time (the mean error-free
burst length for fixed $k_0$ fluctuates over several orders of magnitude, not shown here). Furthermore,
as already seen for the mean bit error rates in Figure 6.14, the mean error burst lengths are higher for
scrambling enabled. Thus, the bit error characteristics vary not only on short timescales (from burst
to burst, range of milliseconds), but also on larger timescales (trace order, range of minutes).
In Figures 6.18 and 6.19 we show for selected BPSK and QPSK traces of the factorial measurement
and several values of $k_0$ the coefficients of variation (CoV) of the error burst length and error-free burst
length distribution of the respective BEIS. The variability of the error-free burst length distributions
is much higher than that of the error burst length distributions (the latter are typically between 1
and 3). This is due to very long periods of no errors within the respective BEIS. For the error-free
burst length distributions, there is a tendency of increased variability for increased packet sizes. This
is likely due to the tendency of bit errors to cluster at the beginning of packets (see Section 6.5.3): for
large packets the packet beginnings have a larger distance in the BEIS, hence, likely longer error-free
bursts occur, which increases the variance. Furthermore, the CoV of the error-free burst lengths is
larger for BPSK as compared to QPSK. A likely reason is the typically lower MBER of BPSK traces,
which leads to longer error-free bursts, the latter increasing the variance. All observations are also
true for the other BPSK and QPSK traces.
(Footnote 4, continued:) register of eight bits depth. The shift register changes its content with every bit depending on the incoming bit stream.
Hence, an error in the bit stream propagates into the shift register and may influence the following eight bits.
[Figure 6.15: Mean error burst length vs. $k_0$ for selected BPSK traces BEIS (factorial measurement).
Axes: $k_0$, Mean Error Burst Length. Curves: small packets, no scrambling (trace 2); small packets, w. scrambling (trace 58);
large packets, no scrambling (trace 13); large packets, w. scrambling (trace 69).]

[Figure 6.16: Mean error burst length vs. $k_0$ for selected QPSK traces BEIS (factorial measurement).
Axes: $k_0$, Mean Error Burst Length. Curves: small packets, no scrambling (trace 18); small packets, w. scrambling (trace 72);
large packets, no scrambling (trace 27); large packets, w. scrambling (trace 84).]

[Figure 6.17: Mean error burst length vs. trace number for selected $k_0$ (for BEIS of longterm1
measurement). Axes: Trace Number, Mean Error Burst Length. Curves: burst orders 1, 8, 15, 50, 100, 150.]

[Figure 6.18: Coefficients of variation (CoV) of error burst lengths and error-free burst lengths vs. $k_0$
for selected BPSK traces BEIS (factorial measurement). Axes: $k_0$, CoV. Curves: error bursts and error-free bursts
for traces 2, 58, 13 and 69.]

[Figure 6.19: Coefficients of variation (CoV) of error burst lengths and error-free burst lengths vs. $k_0$
for selected QPSK traces BEIS (factorial measurement). Axes: $k_0$, CoV. Curves: error bursts and error-free bursts
for traces 18, 72, 27 and 84.]

Modulation   len=1    len=2    len>2
BPSK/wo      95782    657      250
BPSK/w       987863   615      379
QPSK/wo      191740   65611    7979
QPSK/w       264891   123225   13833

Table 6.11: Burst lengths of error burst with density one for QPSK and BPSK traces
6.5.6 Error Densities and Error Clustering
In this section we briefly investigate the error densities. As a result, we obtain statements about how
many errors occur within a small number of bits. This information is interesting for the design of
FEC codes, since it provides their design goals.
First we consider the BPSK traces. For every trace and several burst orders $k_0$ we have calculated
its error density sequence $\frac{Z_1}{Y_1} \ldots \frac{Z_p}{Y_p}$ from the trace’s BEIS. Following this, we have merged the error
density sequences of all the BPSK traces without scrambling into a single set of values which, after
proper renaming, are denoted as $\frac{Z_1}{Y_1} \ldots \frac{Z_{p'}}{Y_{p'}}$. In Figure 6.20 the cumulative distribution function
$D(x) = \Pr[\frac{Z_i}{Y_i} < x]$ of this single set is shown for different burst orders. The curve for burst order
$k_0 = 60$ is representative for all burst orders $k_0 < 64$ (footnote 5). Accordingly, the curve for burst order $k_0 = 70$
matches very closely the curves for $64 < k_0 < 128$, and the curve for $k_0 = 130$ is representative for
the curves with $k_0 > 128$. It can be seen that for the short burst order $k_0 = 60$ most of the mass is on
density one, in almost all cases corresponding to single bit errors (see Table 6.11), while the remaining
densities have comparably low frequencies. This means that in almost all cases a single bit error is
surrounded by two error-free bursts of at least 61 bits length. The results shown for the larger burst
orders $k_0$ do not contradict this finding, since the mass shifted from density 1 to densities of below
5% corresponds to those bursts where by the 64 bit periodicity phenomenon a single burst consists of
two (or very few) erroneous bits with a distance of 64 bits.

Footnote 5: For the BPSK traces $k_0 = 64$ is a critical value because of the 64 bit periodicity described in Section 6.5.3.
Correspondingly, for QPSK $k_0 = 128$ is a critical value.

[Figure 6.20: Cumulative distribution function for the error densities of all BPSK traces without
scrambling. Axes: Density, Pr[Density < x]. Curves: Burst Order 60, Burst Order 70, Burst Order 130.]
In Figure 6.21 the cumulative distribution function $D(x)$ for the QPSK traces without scrambling and
burst orders $k_0 = 120$ (below the critical value $k_0 = 128$), $k_0 = 130$ and $k_0 = 260$ is shown.
Again, the curves are typical for the curves in their respective classes. Of special interest are
those bursts with densities between 10% and 80%, which make up 20% of all bursts.⁶ Of these bursts,
90% have a length of 16 bits or fewer, with the dominant burst lengths being 4, 14 and 16 (the other
lengths are insignificant). Of the bursts of 14 or 16 bits length, 94% have only two bit errors (at
both ends of the burst).
⁶ The mass shift from density one to densities below 5% when switching from $k_0 = 120$ to
$k_0 = 130$ can be explained by the 128 bit periodicity typical for QPSK.
[Plot: Pr[Density < x] (vertical axis) vs. density (horizontal axis) for burst orders 100, 120, 130
and 260]
Figure 6.21: Cumulative distribution function for the error densities of all QPSK traces without
scrambling
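The error density computation used in this section can be sketched as follows. This is a minimal
illustration, not the evaluation code of the thesis: the burst definition is given in Section 6.3.1
and is only assumed here (an error burst starts and ends with an erroneous bit and is terminated by
a run of at least $k_0$ error-free bits), and the function names `error_bursts` and `density_cdf`
are hypothetical.

```python
def error_bursts(beis, k0):
    """Split a bit error indicator sequence (0/1 values) into error bursts.

    Assumption: an error burst starts and ends with an erroneous bit, and a
    run of at least k0 error-free bits terminates it.
    Returns a list of (Y, Z) pairs: burst length Y in bits, number of errors Z.
    """
    bursts = []
    start = last_err = None
    errors = 0
    for pos, bit in enumerate(beis):
        if not bit:
            continue
        # Close the running burst if the gap of error-free bits is long enough.
        if start is not None and pos - last_err - 1 >= k0:
            bursts.append((last_err - start + 1, errors))
            start = None
        if start is None:
            start, errors = pos, 0
        last_err = pos
        errors += 1
    if start is not None:
        bursts.append((last_err - start + 1, errors))
    return bursts


def density_cdf(bursts, x):
    """Empirical D(x) = Pr[Z/Y < x] over a list of (Y, Z) bursts."""
    return sum(1 for y, z in bursts if z / y < x) / len(bursts)


# Two errors 64 bits apart plus one isolated error, as in the periodicity effect.
beis = [0] * 201
beis[0] = beis[64] = beis[200] = 1
single_bit = error_bursts(beis, 60)   # three bursts of density one
merged = error_bursts(beis, 70)       # the first two errors merge into one burst
```

With burst order 60 every error forms its own burst of density one, whereas with burst order 70
the two errors at distance 64 merge into a 65 bit burst of density 2/65, which is below 5%.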
6.6 Review of other Measurement Studies
For reference and completeness, this section summarizes some other measurement studies, with a focus
on packet-level and bit-level measurements. Lower-level (wave propagation) measurements of indoor
scenarios are discussed in references [17] (measurements of the channel impulse response) and [7]
(overview of propagation measurements and models). In the following, we restrict the discussion to
indoor measurements.
Within the FUNBUS project some measurements with an IEEE 802.11-compliant DSSS PHY were
carried out [52, chap. 9 and 10]. Namely, the Silver Data Stream radio modem [157] was used. Their
measurement setup (developed at the ifak, Institute for Automation and Communication,
Magdeburg/Germany) has similarities to ours: MACless radio modem, dedicated transmitter and receiver
stations, packet stream with equidistant start times and well-known packet contents (packets of 3,
64 and 252 bytes length). All measurements were performed without diversity, with BPSK modulation,
and with scrambling enabled; the transmitter power was not given. Their setup was able to distinguish
between lost packets, truncated packets (data part too short), erroneous packets (of correct length
but with bit errors) and correct packets. In an outdoor line of sight (LOS) scenario their setup
showed no transmission errors for distances up to 800 meters, hence it can be assumed to work
properly. Four scenarios were investigated: a) an undistorted free-space scenario (outdoor LOS);
b) a residential area with a building, a sports field, parked cars and trees; c) two indoor
scenarios: within a flat and in an industrial environment.
• In the outdoor LOS scenario without interferers, no errors were found for distances up to 800
  meters in fine weather. However, on a foggy day significant fractions of packet losses (more than
  90% at 730 meters distance) and packet truncations (up to 20%) were observed for distances above
  530 meters.
• In the residential area scenario up to 100% packet losses were observed in a non line of sight
  (NLOS) setting with 30 meters distance. For a LOS setting with 100 meters distance, the
  transmission was error-free. However, when a metallic fence was put between receiver and
  transmitter, 97% of the packets were lost.
• In the flat scenario, when both stations were moved within the flat, virtually no errors occurred,
  regardless of whether there was a LOS or NLOS connection. However, when one station was moved
  to another floor, very high packet loss rates (up to 100%) and packet truncation rates (up to
  30%) were observed, and the results varied.
• The industrial environment scenario was located in a factory hall of the University of Magdeburg.
  There was no activity at the time of the measurements. The results indicate that in a LOS
  scenario with varying distance between 5% and 100% of all frames were error-free; however,
  there was no relationship to the distance. The missing packets are mainly due to packet losses
  (sometimes up to 60%) and truncations; the respective rates vary. The transmission
  quality in NLOS scenarios was rated as “unusable”.
Some of their results confirm ours, e.g., the occurrence and order of magnitude of packet losses and
their time-varying nature. These are of prime importance for the design of wireless MAC protocols
for fieldbuses.
In a recent paper by Eckhardt and Steenkiste [41] adaptive error correction techniques were applied
to WLAN traces, recorded in measurements using a DSSS WaveLAN (902-928 MHz frequency band,
2 MBit/s QPSK modulation, receiver antenna diversity). They generated a specific UDP/IP packet
stream; the underlying WaveLAN uses a CSMA/CA variant without retransmission on the MAC
level. This stream is captured and stored by a special receiver station, even if the frame checksum
generated by the WaveLAN MAC is wrong. Their main focus was on investigating the effect of active
interference sources and attenuation on the occurrence of bit errors / bit corruption, packet
truncations and packet losses. The authors attribute packet truncations to loss of receiver
synchronization, and lost packets to receiving a too heavily distorted physical layer preamble. The
main findings are:
• Bit errors were insensitive to the bit value.
• At short distances with no interferers the packet loss rate (PLR) was zero and the packet error
  rate (PER) (rate of packets with at least one bit error) of $3.4 \cdot 10^{-4}$ was negligible.
• With co-channel interferers (cordless telephone in the same frequency band) the PLR went up to
  31%, and truncation rates of up to 23% were observed, depending on the distance and mobility
  of the interferers. Interestingly, the bit corruption rates varied strongly with the scenarios,
  but seem not to be coupled to the packet loss and packet truncation rates.
• Almost all packets with corrupted bits had fewer than 5% of their bits corrupted. Bit errors
  did not tend to cluster in specific bit positions within a packet. Errors tend to occur in
  bursts, which were most often restricted to one or two bytes length (burst order $k_0 = 7$).
• The packet truncation rate depends on the packet length: shorter packets were truncated more
  rarely, and the relative percentage of lost bits was higher for longer packets.
• The PLR and BER were insensitive to the packet size.
The same authors had published another set of results on WaveLAN measurements before [40], using
the same measurement setup. They had investigated signal quality parameters in an in-room LOS
scenario, a scenario with passive obstacles, and a third scenario with active interferers. In the in-room
LOS scenario:
• At fixed distance with good signal level there occurred virtually no bit errors but a PLR of
  0.04%. Furthermore, there was only a single packet truncation for more than a million packets
  transmitted.
• For varying distance they found that the signal level decreases with increasing distance. The
  given plot suggested at least a quadratic loss of signal level.
• When the signal level goes below some threshold the PLR increases drastically.
The investigations with passive obstacles showed that a single wall only affects the signal level,
but not the signal quality. As a general trend they observed that decreasing signal level tends to
increase the number of packets with corrupted bits, while decreasing signal quality tends to increase
the number of truncated packets. Moving human bodies between sender and receiver increased all types
of errors.
The sensitivity to active radiation sources depended on the nature of the source. If the frequency
used has enough offset to the 902-928 MHz frequency band, the receiver filters work well and the
interferer causes no harm. This was verified with an amateur radio transmitter in the 144 MHz band
and with a microwave oven working in the 2 GHz band. Regarding interference in the same frequency
band, one can distinguish between narrowband interference (produced with a 900 MHz cordless phone)
and broadband interference (produced with a 900 MHz spread spectrum cordless phone). While the
WaveLAN was robust against narrowband interference, the broadband interference led to 50% PLR
and truncation of every arrived packet. The susceptibility to broadband interferers was confirmed
when the cordless phone was replaced by a competing WaveLAN station.
The work described in reference [118] is focused on tracing and modeling of wireless channel errors on
a packet level, incorporating a full UDP/IP protocol stack over WaveLAN (902-928 MHz frequency
band, DSSS, QPSK, 2 MBit/s). For each trace a transmitter sends a sequence of fixed-size UDP
packets with a fixed generation rate to a receiver. All other interference and packet sources are
suppressed. When only the packet arrival rate is varied, the PER does not change. When
varying the packet size, the PER doubles with every 300 byte increase of packet size (starting with
100 bytes), reaching $10^{-3}$ for 1400 bytes. When only varying the distance, the PER doubles every
17 feet, up to 0.08 at 130 feet. They defined a binary indicator sequence by assigning a one for an
erroneous packet and a zero for an error-free packet. The mean error burst length was in most cases
between two and three, while the mean error-free length seems to decay almost linearly with increasing
distance. The authors calculated suitable parameters for a simple two-state semi-Markov model for
generating binary indicator sequences from their measurements. They found that a two-state Markov
model (Gilbert/Elliot model, see Section 6.7.1) with its need for geometric burst-length distributions
would not fit their data, since they observed a much larger variability than is possible for a
geometric distribution.
One of the earliest WLAN packet-level studies is [39]. Again, a 902-928 MHz WaveLAN with 2 MBit/s
QPSK, DSSS, and receiver antenna diversity was used. The measurements were performed in
a 56 metres long hallway, with a dedicated sender and receiver station. The receiver is placed near a
wall.⁷ The authors have focused on varying the distance. For increasing distance the PER increases;
at 50 m it is below 2%. However, when increasing the distance to 56 m, the PER increases to 50%.
This behavior is observed for several packet sizes. Furthermore, for the same distance the PER
increases with increasing packet size. In their evaluation, two erroneous bits belong to the same
error burst if they occur in neighbouring bytes. When evaluating their bit error indicator sequence
(BEIS), they found that errors tend to be non-consecutive, that typically only the minimum number of
bits for constituting an error burst is erroneous (only one erroneous bit per byte), and that most
packets show only a single burst. Furthermore, some error burst lengths are strongly preferred at all
distances and packet sizes, e.g., 13 or 14 bits. This is similar to our results with 14 or 16 bits
long error bursts for QPSK. When looking at the burst length distribution functions, for longer runs
they observed a (decaying) sawtooth pattern with maxima at multiples of 8. Hence, the authors also
found some position dependency in the bit error behavior. The mean bit error rates are found to be
“roughly constant” over all packet sizes and distances. An explanation for this could be that
multipath fading instead of noise is the dominant source of errors. The effects of multipath fading
do not correspond in a simple way to the distance.
⁷ They reference a preliminary study in the same environment, measuring channel impulse response. One
interesting result is that 75% of delay spreads were below 50 ns, which is small compared to 1 µs
channel symbol duration. Thus the channel shows no intersymbol interference and the paths other than
LOS appear as random noise.
6.7 Stochastic Modeling of Bit- and Packet-Errors
Often, simulation-based performance evaluations are done with stochastic link error models. An
alternative approach would be to use traces, but these are only rarely available and their handling is
often perceived as cumbersome. In stochastic models, a simple stochastic process, which often can be
described in terms of a few parameters, is employed to generate bit error patterns.⁸ Some models are
frequently used in performance studies of MAC or link-layer protocols, e.g., the Gilbert/Elliot model.
A stochastic model is most useful if:
• it needs only a small number of parameters (say, not more than a few dozen), which can be
  communicated conveniently to other people and be used in their simulations.
• it can be parameterized from “real data”, obtained from measurements.
• it produces error patterns that approximate reality in a sense to be defined.
From the measurements, one can distinguish between packet losses and bit errors. This distinction
makes sense for the designer of MAC protocols and coding schemes, since packet losses cannot be
combatted by any MAC scheme that influences the packet’s data part: they are caused by a failure to
acquire bit synchronization, which would happen in the packet’s PHY header. After separating both
issues, it is possible to describe both packet losses and bit errors by means of (binary) indicator
sequences. Hence, for stochastic channel modeling we are interested in methods for generating binary
indicator sequences that match some statistics of a given binary indicator sequence.
There are several requirements for such a process: a) it should have low computational and memory
complexity; b) it can be easily implemented on a computer; c) its parameters can be computed from
the measurement results; d) the model’s output should match certain channel statistics with sufficient
accuracy; and e) preferably, it is conceptually simple.
In the following Section 6.7.1 a brief overview of the most popular stochastic models for generating
bit errors (or binary indicator sequences) is given. Where appropriate, we explain how these models
can be parameterized from the measurements. Then, in Section 6.7.2 an alternative model type
for generating binary indicator sequences is introduced, called “bipartite” models. It is intended to
remedy some of the other models’ limitations and gives much better results in matching the first and
second order statistics of a trace. In Section 6.8.3, we propose an overall structure of a stochastic
channel model which uses explicit submodels for the phenomena of packet losses and bit errors and
combines these submodels in a unified framework. Finally, in Section 6.7.3 a simple example system
is used to demonstrate the benefits and shortcomings of the different stochastic models, as compared
to directly using a trace. The results confirm that the bipartite model is much more accurate than
other models, at the expense of only a small increase in model complexity.
6.7.1 Overview of Common Stochastic Models
We briefly present some of the most commonly used stochastic processes for generating binary
indicator sequences (in most cases interpreted as bit errors). The majority of these models use
time-homogeneous Markov chains (discrete or continuous). Many of these models are also discussed in
[16].
⁸ Other types of models, e.g., those directly targeting wave propagation aspects [137], [79], are
beyond the scope of this thesis.
Let us assume that a binary indicator sequence $i_1 i_2 \ldots i_m$ is given, and the associated burst
length sequence is $X_1, Y_1, Z_1 \ldots X_p, Y_p, Z_p$. Let the mean error rate be $\bar{e}$, the mean
error-free burst length be $\bar{X}$ and the mean error burst length be $\bar{Y}$ (see Section 6.3.1).
The first and simplest model is the independent model, where one fixed bit error probability
$p_b \in (0,1)$ is given, and, conceptually, for every bit a Bernoulli experiment is carried out, such
that every experiment has the same parameter $p_b$ and is independent of all other experiments. In
order to match at least the mean value between model and given indicator sequence, clearly
$p_b = \bar{e}$ must be chosen.
This model is quite simple to implement; however, it does not capture the “bursty” nature of channel
errors observed in low level measurements or predicted from propagation models. A quite common
approach is to introduce some additional “channel state”.
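As an illustration, the independent model amounts to one Bernoulli draw per bit. The following is a
minimal sketch; the function names and the use of Python's standard `random` module are choices made
here and are not taken from the thesis.

```python
import random


def independent_errors(n_bits, p_b, seed=None):
    """Generate n_bits error indicators, each 1 with probability p_b."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_b else 0 for _ in range(n_bits)]


def mean_error_rate(indicators):
    """Fraction of erroneous bits in a binary indicator sequence."""
    return sum(indicators) / len(indicators)


# The only parameter is fitted by setting p_b to the measured mean error rate.
sequence = independent_errors(200_000, p_b=0.01, seed=1)
estimated = mean_error_rate(sequence)
```

Since successive draws are independent, the resulting error bursts and error-free bursts are no
burstier than chance permits, which is the limitation noted above.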
Two-State Models
A very popular model is the two-state Gilbert model [57] or Gilbert/Elliot model [42]. It assumes a
“good” channel state and a “bad” channel state. Within every state, bit errors occur according to
the independent model with rates $e_g$ and $e_b$, respectively ($e_g \ll e_b$). Conceptually, the next
channel state is determined after every bit according to a discrete two-state time-homogeneous Markov
chain with transition matrix
$$P = \begin{pmatrix} p_{gg} & 1 - p_{gg} \\ 1 - p_{bb} & p_{bb} \end{pmatrix}$$
(with $p_{xy}$ being the probability that the next state is $y$ given that the current state is $x$).
From the Markov or memoryless property, the state holding times have a geometric distribution and are
independent of each other. The steady-state probability of being in the good state is given by
$$p_g = \frac{1 - p_{bb}}{2 - (p_{gg} + p_{bb})},$$
and the steady-state probability of being in the bad state is given by
$$p_b = \frac{1 - p_{gg}}{2 - (p_{gg} + p_{bb})}.$$
The mean bit error rate can then be calculated as $\bar{e} = p_g e_g + p_b e_b$.
It is easy to see that the Gilbert/Elliot model has short-term correlation properties for bit errors
(i.e., $\Pr[i_{n+k} = 1 \mid i_n = 1] \neq \Pr[i_n = 1]$ for proper model parameters and small values
of $k$); however, the burst length sequences are uncorrelated. The Gilbert model uses $e_g = 0$ and
$e_b = 0.5$ while in the Gilbert/Elliot model these values can be chosen arbitrarily. To determine the
matrix $P$ it is sufficient to know the mean state holding times $\frac{1}{1 - p_{gg}}$ for the good
state and $\frac{1}{1 - p_{bb}}$ for the bad state. In our setting it is natural to associate the
“good” state with the error-free bursts and the “bad” state with the error bursts. Thus, we choose
$e_g = 0$, $e_b = \frac{\sum_{i=1}^{p} Z_i}{\sum_{i=1}^{p} Y_i}$, $p_{gg} = 1 - \frac{1}{\bar{X}}$ and
$p_{bb} = 1 - \frac{1}{\bar{Y}}$. It is then easy to see that the mean bit error rate $\bar{e}$ and
the mean bit error rate generated by the model are the same. Clearly, the same holds for the mean
error burst length and mean error-free burst length.
If either the error-free burst length or error burst length distribution is not geometric, it is
appropriate to drop the Markov assumption and to use other distributions, which better match the first
and second moments of the error-free burst length and error burst length distributions. Candidate
distributions are, e.g., the binomial or Poisson distributions, or quantized versions of continuous
distributions, e.g., the lognormal distribution [196]. This class of models is denoted as
semi-Markovian models. However, it is important to note that models of this type also have only
short-term correlation properties for the bit errors and, since all burst lengths are independent,
allow no correlation in the burst length sequences.
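A semi-Markov generator with quantized lognormal burst lengths might look as follows. This is a
sketch under stated assumptions: the lognormal parameters are obtained by moment matching on the
continuous distribution ($\sigma^2 = \ln(1 + \mathrm{CoV}^2)$, $\mu = \ln(\text{mean}) - \sigma^2/2$),
and quantization is done by rounding with a minimum length of one bit, which shifts the moments
slightly. The quantization rule and all names are choices made here; the procedure actually used is
described in [196].

```python
import math
import random


def lognormal_params(mean, cov):
    """Return (mu, sigma) of a lognormal with the given mean and CoV."""
    sigma2 = math.log(1.0 + cov * cov)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def burst_length(rng, mu, sigma):
    """Draw one quantized burst length (rounded, at least 1 bit)."""
    return max(1, round(rng.lognormvariate(mu, sigma)))


def semi_markov_bursts(n_pairs, good, bad, seed=None):
    """Yield alternating (is_error_burst, length) pairs.

    good and bad are (mean, cov) tuples for the error-free and error bursts.
    All lengths are drawn independently of each other.
    """
    rng = random.Random(seed)
    g_mu, g_sigma = lognormal_params(*good)
    b_mu, b_sigma = lognormal_params(*bad)
    for _ in range(n_pairs):
        yield False, burst_length(rng, g_mu, g_sigma)
        yield True, burst_length(rng, b_mu, b_sigma)


pairs = list(semi_markov_bursts(5, good=(100.0, 2.0), bad=(7.0, 2.5), seed=1))
```

Because every length is drawn independently, the sketch reproduces the variability of the burst
lengths but, as stated above, no correlation between successive bursts.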
N-State Models
A popular model using an $N$-state Markov chain is described in [184]. In this model, the Markov
chain is derived from modeling the instantaneous signal-to-noise ratio at the receiver (R-SNR) under
Rayleigh fading assumptions. The range of possible R-SNR values is partitioned into intervals; each
interval is associated with a state of the Markov chain and a bit error rate value. Since the R-SNR
can be assumed to be time-continuous, transitions are only allowed between neighboring states,
and thus the transition matrix has tridiagonal structure. The possibility to generate this model
from a few simple physical parameters (e.g., mean R-SNR value and Doppler frequency) makes it
attractive. However, since these values are not available from our measurements, this model cannot
be parameterized.
In [51], a Markov model with $N$ states is described, which are subdivided into two state classes,
namely error-free states (class A) and error states (class B). This class of models is called
Fritchman models. If the system’s current state is in class B, the transmitted symbols are erroneous
with probability 1. In general, the possible state transitions are not restricted. An application of
Fritchman models to measurements of a 142 MHz land mobile channel can be found in [160]. A similar
model is described in [102]; however, this uses two transition matrices $P$ and $Q$, where $P$ is used
every time the preceding channel symbol was in error, while $Q$ is used otherwise. Here it is allowed
to have bit error rates different from 0 and 1.
Kim and Li [87, 88] propose to use a Markov modulated process (MMP) for approximating the first and
second order statistics of packet error rate measurements. They employ an $N$-state time-continuous
Markov chain, for which the generator matrix $Q$ is of circulant type (i.e., each row is the previous
row, shifted by one element), and within each state $i$ the channel has a packet service rate of
$\gamma_i$ (channels currently subject to errors have a lower service rate) [139]. As input data they
use the service rate process $\{R_c(t)\}_{t \in \mathbb{R}}$, obtained from measurements [140]. They
use the fact that the power spectral density function $R(t)$ of the MMP generated by $Q$ and
$(\gamma_0, \ldots, \gamma_{N-1})$ can be explicitly represented by the eigenvalues of $Q$. These
values are chosen such that $R(t)$ matches the measured power spectral density function of
$\{R_c(t)\}_{t \in \mathbb{R}}$ as closely as possible. Then $Q$ can be constructed from the
chosen eigenvalues. Although this approach has attractive features, it is not easily adapted to our
methodology and notions.
Another class of models are the Hidden Markov Models (e.g., [54, 172, 174]; an in-depth treatment
can be found in [173]). The methodology proposed in [54], however, uses only one state for the
error-free bursts, and thus the error-free burst lengths are a priori independent. Furthermore, the
Hidden Markov Models lack an intuitive relationship between the channel behavior and the underlying
Markov chain. We will not discuss these models further.
6.7.2 Bipartite Model for Generating Indicator Sequences
The models described so far have some problems:
• The Gilbert/Elliot model allows only geometrically distributed state holding times. When
  applying this model to our traces’ bit error indicator sequence (BEIS) such that the mean burst
  lengths match, the resulting geometric distributions for both error burst lengths and error-free
  burst lengths will have coefficients of variation close to 1. However, especially for the
  error-free burst lengths coefficients of variation of 20 up to 100 are typical (see Figures 6.18
  and 6.19). Thus, the “true” distributions are much more variable than the geometric distribution.
• The Gilbert/Elliot model and other two-state models can only express (very) short-term
  correlation for the bit error/packet error process and no correlation in the burst length
  processes, since the burst lengths are a priori independent. However, in several traces,
  correlation is present [197].
• Most other models are rarely used and their parameterization is quite complex.
• Some models (e.g., [184]) need physical parameters (e.g., receiver-SNR) not accessible to the
  measurement setup.
It would be nice to have stochastic models of moderate complexity but capable of expressing variability
of the burst lengths and longer-term correlation.
We introduce a special class of Markov models, called “bipartite models”. This name stems from the
fact that the corresponding Markov chain forms a bipartite graph. This model offers advantages over
the models discussed so far: a) the underlying distribution functions for the error burst lengths and
the error-free burst lengths can be approximated with the desired degree of accuracy (at the cost of
increasing the memory needed for the model), and b) depending on the number of states, the model
can express longer-term correlation than the two-state models discussed so far. Furthermore, the
model is conceptually related to the notion of binary indicator sequences and burst length sequences,
its parameterization from the traces is straightforward, and it is intuitively appealing.
The bipartite model is similar to Fritchman models [51], with some distinguishing features:
• the shape of the transition matrix is explicitly restricted to periodic ones;
• bit errors do not necessarily occur with probability 1 in bad states; and
• the burst length distributions can be arbitrarily chosen.
Model Description
The approach is to employ a number $n_1$ of “bad” states and $n_2$ of “good” states and to allow state
transitions only from good states to bad states and vice versa (forming a bipartite graph). An example
model with two good states and two bad states is shown in Figure 6.22. When states are numbered
$s_1, \ldots, s_{n_1}, s_{n_1+1}, \ldots, s_{n_1+n_2}$, the transition matrix has the form:
$$P = \begin{pmatrix} 0 & Q_1 \\ Q_2 & 0 \end{pmatrix}$$
where $Q_1$ is an $n_1 \times n_2$ stochastic matrix⁹ describing the state transitions from the bad
states to the good states, while $Q_2$ is an $n_2 \times n_1$ stochastic matrix for the transitions
from the good states to the bad states. A state $s_i$ corresponds to a set $I_i$ of possible (error or
error-free) burst lengths that can be generated in this state. Typically, the set $I_i$ is an interval
of natural numbers.
[Diagram: good states G1, G2 and bad states B1, B2]
Figure 6.22: A sample bipartite Markov chain
The operation of this model is as follows: every state $s_i$ is assigned a discrete random variable
$p_i$ with probability distribution $p_i(k) = \Pr[p_i = k]$ (with $k \in \mathbb{N}$ and $p_i(k) = 0$
for $k \notin I_i$) and associated distribution function $F_i(x) = \Pr[p_i \le x]$. This random
variable takes values on a finite interval of the natural numbers. When the system enters a specific
good state $s_\nu$, a random number is drawn from the distribution $p_\nu$. This random number is then
interpreted as the number of bits for which no errors occur. When the system enters a specific bad
state $s_\mu$, again a random number is generated according to $p_\mu$, determining the error burst
length in bits. For an error burst we make the assumption that an error occurs at least at both ends;
in the remainder of the burst the bit errors occur independently with a fixed rate $r_i$.
In order to build a model from the traces one needs to choose the numbers of states $n_1$ and $n_2$,
the matrices $Q_1$ and $Q_2$, the probability distributions $p_i$ and the bit error rates in the bad
states. Assume that for the error-free burst lengths $X_1 \ldots X_p$ the distribution function is
$F_X(\cdot)$ and for the error burst lengths $Y_1 \ldots Y_p$ the distribution function is
$F_Y(\cdot)$. A simple approach can be summarized as follows:
• Select a number of good states and bad states.
• Partition the range of possible error-free burst lengths into subintervals $[a_i, b_i)$ such that
  for every subinterval we have $F_X(b_i) - F_X(a_i) \approx \frac{1}{n_2}$. Do the same for the error
  burst lengths. To every error-free subinterval we assign one error-free state, and to every error
  subinterval one error state. Hence, every state of the Markov chain is associated with an interval.
• Construct the transition matrix $P$ by simply counting, in a trace’s burst length sequence, for
  every state $i$ the number of times it is left towards every possible target state $j$, and
  dividing this by the total number of times the system has left state $i$.
• Assign to every state $i$ a random variable $p_i$ generating the burst lengths of the corresponding
  interval. The choice of $p_i$ is somewhat arbitrary, but the best results are achieved if it matches
  at least the mean value of the burst lengths lying within the interval.
• For the error states we assume that errors occur independently with a fixed rate. For every error
  state $i$, let $\Gamma_i \subseteq \{1, \ldots, p\}$ denote the subset of all error bursts which
  belong to state $i$ and use
  $$r_i = \frac{\sum_{k \in \Gamma_i} Z_k}{\sum_{k \in \Gamma_i} Y_k}.$$
⁹ In a stochastic matrix all elements are nonnegative and the rows sum up to 1. An analogous
definition holds for a stochastic vector.
The procedure for constructing the transition matrix $P$ is described in some more detail in [196].
The model allows arbitrary distributions to be chosen for the subintervals. The accuracy of the model
depends to a large extent on how well the distributions $F_X(\cdot)$ and $F_Y(\cdot)$ are
approximated. Especially the error-free bursts with their long but sparsely covered tail must be
handled with care.
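The parameterization steps listed above can be sketched as follows. This is an illustration under
several assumptions, not the procedure of [196]: the input format (a list of $(X, Y, Z)$ triples),
all function and key names, the quantile-based choice of interval boundaries, and the use of
empirical resampling as per-state length distribution (instead of a fitted distribution) are choices
made here.

```python
import bisect
import random
from collections import Counter, defaultdict


def quantile_edges(values, n_states):
    """Interval boundaries giving each state roughly 1/n_states of the bursts."""
    s = sorted(values)
    return [s[(j * len(s)) // n_states] for j in range(1, n_states)]


def fit_bipartite(bursts, n_good, n_bad):
    """Fit a bipartite model to bursts = [(X, Y, Z), ...].

    X: error-free burst length, Y: length of the following error burst,
    Z: number of bit errors inside that error burst.
    """
    g_edges = quantile_edges([x for x, _, _ in bursts], n_good)
    b_edges = quantile_edges([y for _, y, _ in bursts], n_bad)
    g_seq = [bisect.bisect_right(g_edges, x) for x, _, _ in bursts]
    b_seq = [bisect.bisect_right(b_edges, y) for _, y, _ in bursts]

    g2b = defaultdict(Counter)        # transition counts good -> bad (for Q2)
    b2g = defaultdict(Counter)        # transition counts bad -> good (for Q1)
    good_len = defaultdict(list)      # observed lengths per good state
    bad_len = defaultdict(list)       # observed lengths per bad state
    z_sum, y_sum = Counter(), Counter()
    for i, (x, y, z) in enumerate(bursts):
        g, b = g_seq[i], b_seq[i]
        good_len[g].append(x)
        bad_len[b].append(y)
        g2b[g][b] += 1
        if i + 1 < len(bursts):
            b2g[b][g_seq[i + 1]] += 1
        z_sum[b] += z
        y_sum[b] += y
    rate = {b: z_sum[b] / y_sum[b] for b in y_sum}   # r_i for every bad state
    return {"g2b": g2b, "b2g": b2g, "good_len": good_len,
            "bad_len": bad_len, "rate": rate, "start": g_seq[0]}


def _next_state(rng, counts):
    """Pick a target state with probability proportional to its count."""
    states = list(counts)
    return rng.choices(states, weights=[counts[s] for s in states])[0]


def generate(model, n_pairs, seed=None):
    """Generate an indicator sequence of n_pairs error-free/error burst pairs."""
    rng = random.Random(seed)
    out, g = [], model["start"]
    for _ in range(n_pairs):
        out.extend([0] * rng.choice(model["good_len"][g]))
        b = _next_state(rng, model["g2b"][g])
        y = rng.choice(model["bad_len"][b])
        burst = [1 if rng.random() < model["rate"][b] else 0 for _ in range(y)]
        burst[0] = burst[-1] = 1      # errors at both ends of an error burst
        out.extend(burst)
        targets = model["b2g"][b]
        # A bad state seen only at the very end of the trace has no successor.
        g = _next_state(rng, targets) if targets else model["start"]
    return out
```

Row-normalizing the two count tables yields the matrices $Q_2$ and $Q_1$; the sketch keeps the raw
counts and normalizes implicitly when drawing the next state.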
Main Model Characteristics
From the description given in the previous section it is easy to see that $P$ generates a periodic
Markov chain with period 2, and thus is not ergodic. This means that the state probabilities do not
in general converge to a steady-state probability vector $\pi$ satisfying
$$\pi^T = \pi^T \cdot P, \qquad \sum_{i=1}^{n} \pi_i = 1$$
(where $n = n_1 + n_2$ is the total number of states). However, this class of models satisfies a
weaker form of steady state condition. It can be shown that in the long run, under certain
assumptions, for each state $s_i$ the fraction of the number of visits in state $s_i$ w.r.t. the total
number $k$ of state transitions so far converges with probability one to a fixed value $a_i$.
Furthermore, for states 1 to $n$ these values sum up to one. We give a proof for this in Appendix A.1.
Furthermore, it is shown in Appendix A.2 that the bipartite model allows for arbitrary precision in
approximating the distributions of error burst lengths and error-free burst lengths.
The bipartite model has basically the same autocorrelation properties as other Markovian models.
Correlation vanishes asymptotically, while present in the short run. However, due to the higher number
of states, it can express longer-term correlation than the simple two-state models (see Appendix A.3).
6.7.3 Comparison of Different Models
In this section we show that the different stochastic models achieve different accuracy in predicting
selected performance parameters of an example system. The simple two-state models give quite good
results for an aggregate metric, but they are not able to predict a certain form of correlation over
longer timescales. The bipartite model gives much better results for this, at the cost of a moderate
increase in model complexity (in terms of number of states).
We have chosen to build an example system of two stations communicating with each other. A
part of this system is the wireless link, which is modeled using both a real trace as well as several
stochastic models, which in turn are parameterized from this very trace. The accuracy of the models
is determined by comparing the results obtained with the models with those of using the real trace.
             trace 24 (k0: 100)   trace 24 (k0: 150)
MBER         0.000370             0.000370
mean EBL     6.873                114.807
CoV EBL      2.457                2.341
max. EBL     229                  6529
mean EFBL    6353.049             11514.112
CoV EFBL     77.441               57.775
Table 6.12: Summary statistics of trace 24 (EBL=error burst length, EFBL=error-free burst length,
MBER=mean bit error rate)
Description of Example System
One transmitter and one receiver station are connected via a wireless link. The transmitter wishes
to transfer a file of 1 GB (gigabytes) to the receiver. The file is split into packets of 1000 bits,
the protocol overhead (headers) is neglected. There are no further stations present and no MAC
protocol or propagation delay is considered. The transmitter sends a data packet and waits for an
acknowledgement. If the ack does not arrive within two bit times after transmission finished, the
packet is repeated, otherwise the next packet is transmitted (Send and Wait Protocol). Data packets
can be subject to errors, acknowledgements are always error-free and of negligible size. The receiver
only acks a packet if it contains no errors. The number of retransmissions per packet is unbounded.
The transmission rate is 2 MBit/s QPSK.
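The Send and Wait behaviour described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the simulator used for the results reported in this section: the function name, the per-attempt duration attempt_time_s (assumed to lump together the packet transmission and the wait for the acknowledgement) and the callback packet_ok (standing in for any channel model) are assumptions introduced here.

```python
from typing import Callable


def send_and_wait_time(n_packets: int, attempt_time_s: float,
                       packet_ok: Callable[[], bool]) -> float:
    """Total time to deliver n_packets with unbounded retransmissions."""
    total = 0.0
    for _ in range(n_packets):
        while True:
            total += attempt_time_s   # one attempt: packet plus wait for the ack
            if packet_ok():           # the receiver acks only error-free packets
                break                 # ack received, continue with the next packet
    return total
```

With an error-free channel (packet_ok always returning True) the result is simply n_packets times attempt_time_s, which corresponds to the null model.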
Channel Models
For modeling the wireless link we have chosen trace 24 of the factorial measurement as the
basis (no scrambling, packet size 2016 bytes); its corresponding bit error indicator sequence (BEIS)
was generated with burst order k0 = 150 (the basic statistics are shown in Table 6.12). The following
error models are used:
in the null model no errors occur at all
the independent error model with BER p= 0.000370
the Gilbert/Elliot model, parameterized such that the mean burst lengths match those given
in Table 6.12 (however, the Gilbert/Elliot model largely underestimates the CoV of the traces,
since it generates a CoV close to 1)
a semi-Markov model, using a quantized lognormal distribution, parameterized such that mean
value and variance of the generated burst lengths match the trace.
several bipartite models, differing in their respective number of good and bad states. The
parameterization of these models is done as outlined in Section 6.7.2. As distributions for the
burst lengths, Beta distributions [106] are used, such that mean value and variance of the
measured burst lengths are matched.
The trace itself.
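As an illustration of the simplest bursty model in the list above, the following Python sketch generates a bit error indicator sequence from a two-state Gilbert/Elliot chain. It assumes geometric sojourn times whose means are set to the mean error-free and error burst lengths, no errors in the good state, and a fixed bit error probability err_in_bad inside the bad state. The last two choices, like all names used, are assumptions of this sketch and not necessarily the parameterization used in this thesis.

```python
import random
from typing import List


def gilbert_elliot_bits(n_bits: int, mean_good: float, mean_bad: float,
                        err_in_bad: float, rng: random.Random) -> List[int]:
    """Generate n_bits error indicators (1 = bit error) from a two-state chain."""
    p_leave_good = 1.0 / mean_good   # geometric sojourn time with mean mean_good
    p_leave_bad = 1.0 / mean_bad     # geometric sojourn time with mean mean_bad
    in_bad_state = False             # assumption: the chain starts in the good state
    bits = []
    for _ in range(n_bits):
        if in_bad_state:
            bits.append(1 if rng.random() < err_in_bad else 0)
            if rng.random() < p_leave_bad:
                in_bad_state = False
        else:
            bits.append(0)           # assumption: the good state is error-free
            if rng.random() < p_leave_good:
                in_bad_state = True
    return bits
```

Because both sojourn times are geometric, the coefficient of variation of the generated burst lengths stays close to 1, which is the limitation noted in the list above.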
Model                Mean Time     Variance    Prediction Error
trace                5915.63 s     0           -
independent          8028.61 s     0.47        35.7%
Gilbert/Elliot       6100.30 s     0.65        3.1%
semi-Markov          5803.03 s     137.41      1.9%
null model           5540.00 s     0           -
bipartite (7,7)      5928.29 s     520.84      0.2%
bipartite (15,15)    5914 s        944.63      0.02%
bipartite (20,20)    5917.76 s     608.74      0.03%
Table 6.13: Transmission times for 1 GB of data over channels with different error models (based on
factorial trace 24)
[Figure 6.23: plot of Pr[packet n+k erroneous | packet n erroneous] (vertical axis, 0 to 0.7) versus k (horizontal axis, 0 to 100) for the trace and for the Gilbert/Elliot, semi-Markov, bipartite (7,7), bipartite (12,12), bipartite (15,15) and bipartite (20,20) models]
Figure 6.23: Conditional probability that packet n+k is erroneous given that packet n is erroneous
In the simulations only bit errors are considered, the packet loss behavior of trace 24 is ignored.
Results
The performance measure of interest is the time necessary to transmit the whole file, i.e. from sending
the first packet until receiving the last acknowledgement. With the exception of the null model and
the trace, every simulation was performed 40 times with different seeds of the pseudo random number
generator.
The mean values reported in Table 6.13 show that the independent model gives an unacceptably bad
prediction of the transmission time (35.7% prediction error). The Gilbert/Elliot model predicts the correct
result with an error of only 3.1%, and the semi-Markov model improves this to 1.9%. For
both types of models the variation in the results is small. Hence, the error burstiness induced by these
models plays a significant role, and it does not suffice to fit only the mean bit error rate. The bipartite
models increase prediction accuracy by a further order of magnitude; however, the comparably small gain
here would not justify the increased model complexity (14, 30 or 40 states as compared to 2).
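The prediction errors in Table 6.13 are relative deviations from the trace result, and the value for the independent model can be cross-checked by a short calculation. The sketch below assumes that every transmission attempt takes the same time, so that the error-free (null model) time is scaled by the expected number of attempts per packet, 1/(1-p)^L for a packet of L bits. This closed form and the function names are assumptions of the sketch, not formulas taken from this section.

```python
def prediction_error(model_time_s: float, trace_time_s: float) -> float:
    """Relative deviation of a model's mean transmission time from the trace."""
    return abs(model_time_s - trace_time_s) / trace_time_s


def independent_model_time(null_time_s: float, ber: float, packet_bits: int) -> float:
    """Expected Send and Wait time under independent bit errors.

    A packet is error-free with probability (1 - ber) ** packet_bits, so the
    number of attempts per packet is geometric with mean 1 / p_ok.
    """
    p_ok = (1.0 - ber) ** packet_bits
    return null_time_s / p_ok
```

With the null-model time of 5540 s, p = 0.000370 and L = 1000 bits, this estimate comes out at roughly 8020 s, close to the simulated 8028.61 s for the independent model.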
[Figure 6.24: plot of Pr[packet n+k erroneous | packet n erroneous] (vertical axis, 0 to 0.7) versus k (horizontal axis, 0 to 2000) for the trace and for the Gilbert/Elliot, semi-Markov, bipartite (7,7), bipartite (12,12), bipartite (15,15) and bipartite (20,20) models]
Figure 6.24: Conditional probability that packet n+k is erroneous given that packet n is erroneous
The next example shows that for other purposes the simple two-state models are not accurate enough.
From the receiver’s logfile we have generated a special binary indicator sequence, the packet error
indicator sequence (PEIS), which is formed by assigning a zero to a correctly transmitted packet and
a one to an erroneous packet. From the PEIS we have computed the conditional probability that
packet n+k is erroneous, given that packet n is erroneous, using the equation given in Section 6.3.1.
The results are displayed in Figure 6.23, while in Figure 6.24 the same is shown for a longer timescale.
The following points are remarkable:
The original trace has a packet error rate (PER) of 6.3%, hence, it is low enough to take the
curve for the conditional probabilities as the autocorrelation function of the PEIS (see Section
6.3.1). From Figure 6.24 one can see that the correlation lasts for a large number of packets.
The simple two-state models (Gilbert/Elliot, semi-Markov) fail to match the long-lasting cor-
relation. Specifically, the Gilbert/Elliot model drops off rapidly to its mean PER of 10%.
The semi-Markov model with its more variable burst length distributions drops off more slowly and
estimates the mean PER at 4.5%.
The bipartite models give a rather good approximation on the shorter timescale (up to 100
packets) and drop off much more slowly than the simple models. On very short timescales
(<10 packets) the shapes of the bipartite (12,12), (15,15) and (20,20) curves follow that of the
original trace. On the other hand, the bipartite models predict even for a timescale of 2000
packets significant correlation, where the other models have lost their memory.
Interestingly, for the bipartite models one cannot say that the approximation is better for higher
number of states: the bipartite (12,12) model matches the trace best.
The ability of the bipartite models to match longer-term correlation much better than the other models
is confirmed with other traces.
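The conditional probabilities plotted in Figures 6.23 and 6.24 can be estimated from a PEIS along the following lines. The sketch uses the straightforward relative-frequency estimator, which is assumed here to correspond to the equation of Section 6.3.1; the function name is hypothetical.

```python
from typing import Sequence


def conditional_error_prob(peis: Sequence[int], k: int) -> float:
    """Estimate Pr[packet n+k erroneous | packet n erroneous] from a PEIS.

    peis holds 0 for a correctly received packet and 1 for an erroneous one.
    Returns NaN if no erroneous packet has a successor at distance k.
    """
    conditioning = 0   # erroneous packets n that have a packet n+k in the sequence
    joint = 0          # of those, cases where packet n+k is erroneous as well
    for n in range(len(peis) - k):
        if peis[n] == 1:
            conditioning += 1
            joint += peis[n + k]
    return joint / conditioning if conditioning else float("nan")
```

Evaluating this function for k = 1, 2, ... yields one curve of the kind shown in the figures; for a sequence without memory the curve would be flat at the mean PER.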
The packet error correlation example is not only of theoretical interest; it also has practical relevance.
Consider the case of a FEC code adding 25% overhead to a packet. The trace’s original PER of 6.3%
indicates that applying FEC to every packet would be wasteful. Instead, by the results indicated in
Figure 6.23 it makes sense to switch on FEC only in case of retransmissions and to keep it enabled for
a large number of packets. The effectiveness of this algorithm will change when switching from the
more accurate bipartite model to the two-state models, leading to wrong predictions of the algorithm’s
performance in the latter case.
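The FEC switching rule mentioned above can be written down as a small state machine. The following Python sketch is only one possible reading: the class name, the parameter hold_packets (how many successfully delivered packets FEC stays enabled for) and the decision to restart the hold period on every further error are assumptions made here for illustration.

```python
class RetransmissionTriggeredFec:
    """Enable FEC after a packet error and keep it on for hold_packets packets."""

    def __init__(self, hold_packets: int):
        self.hold_packets = hold_packets
        self.remaining = 0            # packets for which FEC is still enabled

    def use_fec(self) -> bool:
        """Should the next transmission be protected by FEC?"""
        return self.remaining > 0

    def report(self, packet_ok: bool) -> None:
        """Update the state with the outcome of the last transmission."""
        if not packet_ok:
            self.remaining = self.hold_packets   # error: (re)start the hold period
        elif self.remaining > 0:
            self.remaining -= 1                  # success: count the hold period down
```

How large hold_packets should be depends on how long the packet error correlation lasts, which is exactly where the channel models compared in this section differ.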
It is remarkable that the best results are obtained with the bipartite (12,12) model, having 24 states.
This finding is somewhat unintuitive, since one would expect a better approximation with increased
number of states. However, there is a tradeoff between the number of states and the length of the
underlying trace. Increasing the number of states for a given trace decreases the accuracy of the
transition probabilities as calculated in Section 6.7.2, since with a high number of states every state is
visited fewer times as compared to a lower number of states.
To summarize, the bipartite model gives much better approximations of both the overall transmission
time and the packet error correlation than the other models, while having only moderate complexity.
6.8 Conclusions
6.8.1 Summary of Measurement Results
The most obvious, yet far-reaching result is the variability of the wireless link over several timescales,
even when looking over hours. This concerns, for example, packet loss rates and mean bit error rates.
This variability can be attributed to frequently changing environmental conditions: moving people,
portal crane activity, moving parts of machines, and so forth. Many industrial environments share this
property of a frequently changing environment, hence, the measurement study appears representative
in this respect. Stated differently: it cannot be said to represent “worst case” or “best case” condi-
tions, but “typical” conditions. This gives us confidence that, although one cannot directly transfer
numerical results from this environment to others, the qualitative results in fact can be transferred:
time-varying behavior over several timescales
occurrence, burstiness properties, and orders of magnitude of packet losses
great variability of error-free burst length distributions for both packet losses and bit errors,
leading to long periods of good conditions. In fact, these long periods of good conditions make
a wireless link appear nonstationary.
sometimes very long lasting correlation in the packet error process of selected traces (see Section
6.7.3) which cannot be captured by models capable of expressing only short-term correlation.
the tendency of packet losses and bit errors (QPSK) to occur in bursts.
The time variability and the phenomenon of packet losses are also confirmed by another study [52].
We think that the restriction to a single scenario is not really a restriction: when changing to a
seemingly “better” position, this will likely neither remove the variability nor protect from “bad” periods of
time. Furthermore, one often does not have the freedom to move to “better” positions.
Beyond several statistical results, of some importance is the finding that the 5.5 MBit/s and 11
MBit/s modulation schemes have serious performance problems. For the design of MAC and link-
layer protocols the phenomenon of (sometimes long-lasting) packet losses is of utmost importance.
More generally, MAC and link-layer protocols and coding schemes should incorporate some adaptivity,
since the channel is time-varying, both in terms of mean bit error rates and packet loss rates.
Clearly, there is much room for further research. Some interesting questions are the following:
Can the packet loss rates be influenced by increasing preamble length or by varying the trans-
mission power?
Error behavior in other scenarios, e.g., in the presence of interferers, with one mobile station, in
line of sight scenarios.
Performance of several (adaptive) FEC schemes for traces with different mean bit error rates,
correlation behavior, packet loss rates.
6.8.2 Stochastic Modeling
The popular stochastic models used in the literature (independent model, Gilbert/Elliot model) fail
to match the statistics of the measurement data in several respects. The Gilbert/Elliot model:
does not match the high variability of error burst length and error-free burst length distributions,
due to its restriction to geometric distributions;
does not capture correlation in the burst length sequences; and
fails to predict long-term correlation in the packet error process of a simulated system (see
Section 6.7.3).
Nonetheless, the Gilbert/Elliot model and the semi-Markov model give surprisingly good predictions
for selected performance parameters of an example system. Hence, with some care, both models can
be useful for simulating errors on a wireless link.
In the literature several alternative error models were proposed, however, most of them have only
gained limited popularity. In fact, most performance analyses use the Gilbert/Elliot model. The
alternative models often are hard to parameterize, or need higher numbers of parameters. We believe
that with the bipartite model a good compromise between simplicity, quality in generating good
predictions of performance parameters, and parsimonious parameterization is found. In fact, for the
investigated traces frequently good results can be achieved with a moderate overall number of states
(between 20 and 40).
There are several interesting topics for further research:
Comparison of the bipartite model with other N-state Markovian models or Hidden Markov
Models of comparable complexity.
For the bipartite model the influence of the distribution type for the single states on the generated
packet error process should be investigated.
For the bipartite model, from the simulations done so far, selecting the overall number of states
between 20 and 40 seems to be a good engineering rule. However, it is interesting to investigate
the influence of the number of states in some more detail.
Finally, as a general remark, it does not suffice to take only bit errors into account, but the phenomenon
of packet losses should be modeled, too. Hence, an integrated error model such as the simple one proposed
in Section 6.8.3 should be used.
6.8.3 Overall Channel Model
Single-Channel Case
As the basic cornerstone for a stochastic model we use the observation that, on the one hand, there are
packet-related phenomena, specifically packet losses, and, on the other hand, there are bit errors
within the packets. For simplicity the following additional assumptions are made:
There are only packet losses; all other packet impairments are treated as packet losses.
Packet losses and bit errors are statistically independent of each other.
The bit error models depend on the modulation type (QPSK, BPSK) but not on the packet size.
This contradicts the findings in Section 6.5.3, but leads to simpler models.
We propose to compose the channel model out of three different submodels: the packet loss submodel
generates a binary indicator sequence which determines for every packet handed to the channel model
whether it is lost or not. If the packet is not lost, then one of the two remaining bit error submodels
is applied to the packet’s data part (which can include a MAC protocol’s header and trailer). If the
packet should be transmitted with BPSK, then the BPSK submodel is applied, otherwise the QPSK
submodel is applied. In principle, for each submodel one of the models described in Section 6.7.2
or cited in Section 6.7.1 can be instantiated. In order to keep things simple, we assume that the
BPSK and QPSK submodels are independent of each other. If this assumption is not made, and both
submodels keep some channel state variable, the problem of how to decide which channel state should
be entered when switching between the two submodels arises.
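The composition of the three submodels can be sketched as follows. The sketch treats each submodel as an opaque callable, so any of the error models discussed in Section 6.7 could be plugged in; the class name, the callable signatures and the use of None to signal a lost packet are assumptions of this illustration.

```python
from typing import Callable, List, Optional

BitErrorModel = Callable[[], int]     # returns 1 for an erroneous bit, 0 otherwise
PacketLossModel = Callable[[], bool]  # returns True if the next packet is lost


class SingleChannelModel:
    """Packet loss submodel followed by a modulation-specific bit error submodel."""

    def __init__(self, loss: PacketLossModel,
                 bpsk: BitErrorModel, qpsk: BitErrorModel):
        self.loss = loss
        self.bpsk = bpsk   # kept separate from qpsk: the submodels are independent
        self.qpsk = qpsk

    def transmit(self, n_bits: int, modulation: str) -> Optional[List[int]]:
        """Return None for a lost packet, else the bit error pattern of the data part."""
        if self.loss():
            return None    # lost packets never reach the bit error submodels
        bit_model = self.bpsk if modulation == "BPSK" else self.qpsk
        return [bit_model() for _ in range(n_bits)]
```

Since each bit error submodel is only advanced while it is in use, the question of which state to enter after a switch of modulation does not arise in this form.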
Multiple-Channel Case
We consider the case of multiple stations.
The basic assumption is that between every pair of stations there exists a separate channel whose error
behaviour is independent of that of all other channels. The independence assumption can be justified by the following
heuristic argument: when multipath fading is the dominant source of errors, and given the “nonlinear
dependency” of the error behaviour on the propagation environment, it is reasonable to assume that
for every pair of stations the propagation environment is “different enough” from those of every other
pair. Furthermore, for the case of MAC protocols implementing a strict-TDMA scheme there is no
interference from other stations, as opposed to CDMA-based protocols. For a fixed pair of stations
the channel in both directions is assumed to be symmetric.
It is clear that the independence assumption has limitations. For example, consider the case where
two stations are placed in close vicinity to each other and to an interferer, say, a microwave oven.
Clearly, the activity of the oven disturbs both receivers in parallel and makes all the channels from
other stations to these two correlated. However, this is not considered further in this thesis.
The channels between different pairs of stations can follow different error models.
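In a simulator, the multiple-channel assumptions (one independent channel per pair of stations, symmetric in both directions) map naturally onto a table keyed by unordered station pairs. The following sketch shows one way to do this; the class name and the factory callback make_channel are hypothetical.

```python
from typing import Callable, Dict, FrozenSet


class ChannelTable:
    """One channel object per unordered pair of stations."""

    def __init__(self, make_channel: Callable[[int, int], object]):
        self._make_channel = make_channel          # may return a different model per pair
        self._channels: Dict[FrozenSet[int], object] = {}

    def channel(self, a: int, b: int) -> object:
        key = frozenset((a, b))                    # (a, b) and (b, a) share one channel
        if key not in self._channels:
            self._channels[key] = self._make_channel(a, b)
        return self._channels[key]
```

Because the factory is called once per pair, different pairs can be given different error models, as stated above.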
6.8.4 Consequences for Design of MAC and Link-Layer Protocols
The measurement results allow us to draw some simple conclusions and to make some suggestions for
the design of MAC and link-layer protocols aiming at reliability. A general observation is that packet
loss rates and mean bit error rates are time-variable; even for the same modulation type, bit error
rates vary over several orders of magnitude. This calls for the inclusion of adaptivity in the protocol
implementations.
The occurrence of (sometimes long-lasting) packet losses is a major challenge. Packet losses are due to
a failure to acquire bit synchronization. This happens already in the header, thus no MAC protocol
can protect itself against packet losses by influencing the contents of the data part of a packet.
Instead it is necessary to incorporate other mechanisms, e.g., variation of transmit power level, using
retransmission schemes, enabling scrambling, or better shielding the radio equipment. Furthermore,
for invoking these mechanisms a feedback from the receiver is needed, i.e., the MAC protocol has
to incorporate an immediate acknowledgement mechanism. Instead of using whole packets with full-
length preambles for immediate acks, it may suffice to rely on the presence or absence of short noise
bursts.
The burstiness of bit errors and packet losses suggests the use of postponing schemes. Consider a scenario
where a single base station (BS) serves a number of spatially distributed wireless terminals (WT),
and the traffic is mainly from BS to WT and vice versa. Furthermore, we assume that channel access
is time-multiplexed between stations, not code- or frequency-multiplexed. In this case, and with
multipath fading being a significant source of errors, the BS has for every WT a different propagation
environment (number of paths and their respective losses). Hence, the BS has a separate channel to
every WT, which can be viewed as independent from the others. If the errors on all channels have a
bursty nature, this calls for introducing link-state dependent scheduling approaches, as proposed in
[14]. In this type of schemes the BS may decide to postpone retransmissions (triggered by a packet
loss or packet error) and to serve another WT in the meantime. When assuming a strong FEC code
and considering only packet losses, our results indicate that, if the retransmission is postponed for 5
to 10 packet times, with 95% probability it will be successful (compare Figure 6.7).
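The postponing idea can be illustrated with a very small scheduler at the BS. This is a sketch under simplifying assumptions and not the scheme of [14]: time is counted in packet times, a WT whose transmission failed is blocked for a fixed number of packet times, and the BS serves the first pending WT that is not blocked. All names are hypothetical.

```python
from typing import Dict, Iterable, Optional


def postpone(blocked_until: Dict[str, int], wt: str, now: int, postpone_slots: int) -> None:
    """After a failed transmission, do not serve wt again before now + postpone_slots."""
    blocked_until[wt] = now + postpone_slots


def pick_next(pending: Iterable[str], blocked_until: Dict[str, int], now: int) -> Optional[str]:
    """Return the first pending WT whose postponement has expired, or None."""
    for wt in pending:
        if blocked_until.get(wt, 0) <= now:
            return wt
    return None
```

With postpone_slots in the range of 5 to 10, the retransmission falls into the interval for which the measurements above suggest a high success probability, while other WTs use the channel in the meantime.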
The occurrence of longer outage conditions due to packet losses should be recognized by a MAC
protocol and signalled to upper layers to allow them to react properly (e.g., by triggering emergency
stop procedures). Unlike in wireline or fibre optic communications, applications cannot assume the
underlying network to be reliable, but should take changing network conditions explicitly into account.
Therefore, some service primitives for signalling network conditions to upper layers should be added
to the interface of the MAC or link-layer protocol.
The presence of long-term correlation in the bit error process and packet error process (see Section
6.7.3 for an example) can be considered in different ways. If it is likely that many of the packets
following an erroneous packet will also be erroneous, it can be worthwhile to protect them, e.g., by
switching back to BPSK (increasing energy per bit), by using FEC, or by increasing transmit power
for a while. Another alternative is postponing schemes, as sketched above for packet losses in a one
BS / many WT scenario.
Chapter 7
Polling Protocols for Wireless PROFIBUS
For MAC-protocols targeted to hard real-time requirements token-based approaches like the PROFIBUS
protocol are popular, but they are not appropriate for error-prone wireless links (see Chapter 5).
Therefore, it is worthwhile to investigate whether other approaches do better. In this chapter polling-based
protocols are proposed and their realtime performance as defined in Section 4.3.1 is evaluated and
compared to that of the PROFIBUS protocol.
As a very general description [163, 161], a polling system consists of a central station (called base station
(BS)) and a number of stations (called wireless terminals (WT)), with each station conceptually having
a queue for requests or packets. Conceptually, the BS carries out two different tasks: first it queries
the queue’s states from the WT’s, and second, it assigns bandwidth to the WT’s according to the
query results and some polling policy. Typically, for a single WT it is assumed that a query is less
costly than to serve a packet, otherwise the query overhead could not be justified as compared to a
pure TDMA system. The conceptual decoupling between querying a WT’s state and serving the WT
is useful, since this allows both functions to be carried out at distinct times. In some extreme cases,
however, one of these phases can be skipped. Consider the round-robin scheme as an example: the
BS does not really care about the WT’s queue states, but sends just a poll request, allowing a WT to
send a certain amount of data.
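The round-robin example can be made concrete with a short sketch of one polling cycle in which every WT may send a limited number of packets per poll. This is a generic limited round-robin written for illustration only; the specific k-limited protocol studied in this chapter is defined in Section 7.2.1 and may differ in its details. The function name and the queue representation are assumptions.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def round_robin_cycle(queues: Dict[str, Deque[str]], k: int) -> List[Tuple[str, str]]:
    """Poll every WT once in a fixed order; each WT may send at most k packets."""
    served = []
    for wt, queue in queues.items():   # the BS does not look at the queue states
        for _ in range(k):
            if not queue:
                break                  # nothing (more) to send: move on to the next WT
            served.append((wt, queue.popleft()))
    return served
```

Since every WT is polled once per cycle and sends at most k packets, the length of a cycle is bounded as long as no transmission errors occur, which is the property that makes such schemes interesting for hard realtime traffic.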
For hard-realtime systems we demand that every WT has a collision-free possibility to be queried
by the BS, and that always for a nonempty queue (of high priority requests) the time a WT has to
wait for the next bandwidth assignment is upper-bounded. These additional requirements make the
polling-protocols considered here different from most other polling-based protocols, and also different
from demand-assignment MAC protocols [60].
Polling schemes are attractive for the following reasons:
In PROFIBUS a station gets lost from the ring if (amongst other possibilities) it misses three
consecutive token frames, and it needs to be re-included. In polling-based systems a station is
not lost from the LAN, if it misses a poll frame, and no re-inclusion is necessary. It suffices to
wait for the next poll. In theory, after a one-time registration a WT can remain a LAN member for an
indefinitely long time. In practice, it makes sense to implement some, possibly very loose, time
bounds within which some life signs of a WT should be received. If this time expires, some
resources allocated to the WT within the BS can be freed.
There is a contention-free possibility for data transmission and querying a WT’s state.
Bandwidth assignment by a central station (BS) can be more efficient than decentralized ap-
proaches, since potentially more knowledge about pending requests is available at the BS.
If no transmission errors occur, they allow for deterministic system behaviour and thus for
guaranteeing time bounds.
The BS always knows which WT is transmitting. This knowledge can be exploited with adaptive
antenna array technologies [152].
Polling schemes are chosen for their potential to bound the maximum medium access times under
any load conditions, not for the sake of mean delay or throughput, where other schemes may be more
appropriate. Clearly, polling schemes have disadvantages:
They introduce a single point of failure, namely the BS. Hence, appropriate redundancy schemes
are needed. This kind of problem is addressed in detail in [154, 129].
Polling-schemes do not scale easily to a large number of stations. However, for the small number
of stations as in PROFIBUS LANs this concern is not relevant.
The overhead needed for querying or polling the WT’s.
Polling WT’s with no packets increases delay for nonempty WT’s.
If membership changes often due to mobility, then registration and deregistration overhead can
become significant. In addition, a registration delay is introduced.
In designing polling-based MAC-protocols there are several degrees of freedom:
polling-sequence and decisions on when to perform retransmissions;
stations involved in data transmission and the polling process;
variation of transmit power within the legal constraints;
framing and coding schemes (FEC, interleaving, packet duplication);
variation of modulation scheme;
and as a meta-method: adaptivity, based on feedback about the channel behavior [26].
This thesis focuses on the first two “control knobs” and their potential to combat channel error
conditions. They are tightly coupled to the concept of polling. The other knobs have already shown
their capabilities elsewhere, and it is an issue for further research to investigate their behavior when
combined with polling protocols.
In the design of polling schemes the error behavior of wireless links, as presented and modeled in
Chapter 6 should be taken into account. This is true specifically for the phenomenon of (bursty)
packet losses, which a priori cannot be eliminated by adding redundancy to a packet’s MAC header
or data part, since they occur earlier, in the PHY header. But clearly, bit errors and, in general, the
time-varying nature of the wireless link should be taken into account.
In this chapter we present a specific k-limited round-robin protocol (Section 7.2.1). We use a simulation
approach to compare the realtime performance (see Section 4.3.1) of round-robin and the PROFIBUS
protocol for different load models and error models (Section 7.3.2). It is shown that the round-robin
protocols have much better realtime performance for error models exhibiting some burstiness. One
reason for this is that polling avoids a permanently active ring maintenance mechanism. Instead a
mechanism is used, where terminals have to register once at the base station, and the registration can
be maintained with a soft-state approach with timeouts in the range of minutes. Stated differently: in
the PROFIBUS protocol a station can get lost from the ring in every token cycle. With a registration
scheme the “LAN membership loss opportunities” can be much reduced, or even eliminated in the
case of a fixed topology and one-time registration. Hence, a major source of problems is removed.
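The soft-state registration mentioned above can be pictured as a membership table at the BS in which every entry carries the time of the last life sign. The following Python sketch illustrates the idea; the class and method names are hypothetical, and the timeout is a free parameter (the text above suggests values in the range of minutes).

```python
from typing import Dict, List


class SoftStateRegistry:
    """Cell membership list whose entries expire without periodic life signs."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self._last_seen: Dict[int, float] = {}   # station address -> time of last signal

    def heard_from(self, address: int, now_s: float) -> None:
        """Register a WT or refresh its entry (any received frame can serve as life sign)."""
        self._last_seen[address] = now_s

    def expire(self, now_s: float) -> List[int]:
        """Remove and return all WTs that have been silent for longer than the timeout."""
        stale = [a for a, t in self._last_seen.items() if now_s - t > self.timeout_s]
        for address in stale:
            del self._last_seen[address]
        return stale

    def members(self) -> List[int]:
        return sorted(self._last_seen)
```

In contrast to ring maintenance, a missed poll does not change this table; only prolonged silence does.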
In addition, three different modifications for further improvement of realtime performance are proposed
(Sections 7.2.2, 7.2.4 and 7.2.3):
The functional repolling approach, influencing the polling sequence and decisions on when to
perform retransmissions. With this scheme the burstiness of bit errors on the wireless channel
is addressed by trying to find a good time for doing retransmissions.
Two relaying approaches, simple poll relaying and simple data relaying, modifying the set of
stations involved in a data transmission and the polling process. These approaches are targeted
to combat longer-lasting packet losses on a specific channel between two WT’s by letting other
stations help in packet transmission. Hence, spatially different wireless channels come into
play. The relaying approaches are nearly “orthogonal” to the basic round-robin and functional
repolling protocols.
To the best of the author’s knowledge, none of these three approaches has been treated in the literature so far.
Each of these three approaches is compared with the basic round-robin protocol for its realtime
performance, and it is shown that often significant gains can be achieved (Section 7.3.3).
The round-robin protocol and the modifications are combined with a proposal for a modified PROFIBUS
link layer protocol, better adapted to the behavior of wireless links (Section 7.1). Some issues of the
changes in semantics and the interoperability with the PROFIBUS protocol are discussed in Section
7.4.
The findings of this chapter can serve as a starting point for the design and specification of a full
MAC- and link layer protocol stack for an integrated wireless PROFIBUS system.
7.1 System under Study
In this section we give a brief description of the system under study, including its basic topology,
MAC-protocol framework, automatic repeat request (ARQ) protocols, and the channel models used.
The system under study is designed to highlight the issues related to the realtime performance of
different polling schemes under error conditions.
For the polling protocols only the framework is described here; the actual polling schemes are presented
in Section 7.2. Some of the proposed protocols change the semantics of the PROFIBUS services. This
is discussed in Section 7.4.
7.1.1 Description of the System
Topology, PHY, Cell membership
We consider a wireless-only scenario with one base station (BS) and a number N of wireless terminals
(WT), together constituting a cell. This topology is fixed, the stations are not mobile. This scenario
is simpler than the integrated scenario targeted for a wireless PROFIBUS (see Section 4.3), but it
allows us to concentrate on realtime performance issues over wireless links.
A fully meshed topology is assumed, i.e., all stations can hear each other. The maximum distance
between any two stations is small (30-50 m, according to the typical size of a production cell). There-
fore, the propagation delay is small and can be neglected. In larger cells we would have to take guard
times into account. The question whether this assumption can be relaxed is discussed in Section 7.6.
Furthermore, given the small cell size, it is reasonable to assume that there are no near-far effects.
Hence, parallel transmissions have the result that no station receives a valid and decodable signal.
All stations are equipped with an IEEE 802.11 DSSS compliant PHY. They are capable of performing
fast carrier sensing: for detecting the channel’s state it suffices to receive a signal of sufficient strength
for a short time (10-20 µs), and it is not necessary to have acquired bit synchronization before. Query-
ing carrier sense information from the radio modem takes negligible time. Hence, if any station within
a cell transmits, all other stations know this almost immediately.¹ To avoid wrong interpretations of
signals from neighboring cells, careful frequency planning is needed in a factory hall.
All WT’s are considered to be active stations in the sense that they require the right to initiate
transmissions from time to time. Data transmissions are only amongst the WT’s. In the simplified
scenario adopted here, the BS is looked at only with respect to its role in medium access control, in the
integrated scenario it serves also as a source or sink of data. The WT’s have distinct station addresses,
each of one byte length. Conceptually, a WT consists of a PHY, a MAC-entity, and a link-layer
entity. The link-layer entity offers a link-layer interface to upper layers (the same as PROFIBUS).
Furthermore, to each WT a number of traffic sources generating requests can be attached. These
sources are on top of the link-layer interface.
The BS maintains a list of the WT’s which are members of its cell and which have to be polled. To
become a member of this list, a WT has to register itself once at the BS. This is done using special
registration frames on a separate (physical or logical) channel.² There are two ways for a WT to get
removed from the list: it removes itself by unregistering, or the BS deletes it after it has not received
any signal from this WT for a long time. However, since the topology is assumed to be fixed, it is
reasonable to assume that one-time registration suffices and that this has been already done at system
startup.
¹ In fact, the Harris/Intersil PRISM chipset [73] used throughout Chapter 6 allows coupling the CCA (clear channel
assessment) signal directly to the received signal strength. Hence, it is reasonable to assume this feature.
² A possible approach is to use interspersed random access slots, which the WT’s access with a slotted ALOHA protocol.
As a simplification, the necessary overhead for the registration mechanism is not considered. In fact,
it can be argued that this overhead can be made low. To back up this claim, we propose a scheme: let
the BS issue every T_RASlot seconds a random access slot solely for registration purposes. If the address
space is small (as in the PROFIBUS), the BS can resolve collisions in this slot by invoking a binary
search algorithm over the address space, using special random access slots with a restricted address
range. If the overall arrival rate is low (say, one arrival per minute), the collision rate will also be
low (depending on T_RASlot), and the binary search algorithm is invoked rarely. If T_RASlot takes values
of 50-100 ms, then the short random access slots will cost only a small fraction of bandwidth. With
these settings a low degree of mobility can be accommodated.
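The binary search over the address space can be sketched as follows. The sketch assumes an idealized slot outcome (idle, exactly one registration, or collision), distinct station addresses, and that after a collision the BS splits the address range in half and issues one restricted random access slot per half. Function and variable names are hypothetical, and channel errors within the slots are ignored.

```python
from typing import Iterable, List, Tuple


def resolve_registrations(contenders: Iterable[int], lo: int, hi: int) -> Tuple[List[int], int]:
    """Resolve colliding registrations by binary search over addresses lo..hi.

    Returns the addresses in the order they get registered and the number of
    random access slots used, including the initial unrestricted slot.
    """
    waiting = list(contenders)
    registered: List[int] = []
    slots = 0
    ranges = [(lo, hi)]                       # address ranges still to be offered a slot
    while ranges:
        a, b = ranges.pop()
        slots += 1                            # one random access slot restricted to a..b
        active = [x for x in waiting if a <= x <= b]
        if len(active) == 1:
            registered.append(active[0])      # exactly one sender: registration succeeds
        elif len(active) > 1:
            if a == b:
                raise ValueError("duplicate station address %d" % a)
            mid = (a + b) // 2                # collision: split the range and retry
            ranges.append((mid + 1, b))
            ranges.append((a, mid))
    return registered, slots
```

For two contenders with addresses 3 and 200 in a one-byte address space, the initial slot collides and the two half ranges each produce one successful registration, so three slots are used in total.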
Frame Structure and Framing
There are two types of data traffic: low priority and high priority, having the same meaning as for
the PROFIBUS: the high priority traffic is used for asynchronous transmission of safety critical data
(like alarms), all other data types (e.g., periodic process data, file transfers) belong to the class of low
priority traffic. However, even if the high priority traffic is asynchronous, it can occur in batches (or
alarm storms), which may arrive at several stations in parallel. We are only interested in the timing
behaviour of the high priority traffic, since this is typically used for safety-critical data. The low
priority traffic serves as background traffic.
There are two classes of frames: (several) control frames and a single data frame type. All frame
types are transmitted using 802.11 DSSS PHY PPDU’s (see Section 3.2.2). A PPDU consists of a
PHY header and a PHY data part. The PHY header consists of a PLCP preamble of length 128 bits
and a PLCP header of length 64 bits, all transmitted in BPSK mode. Hence, the PHY header has a
length of 192 bits or 192 µs. The PHY data part is transmitted with QPSK modulation. It consists
of a MAC header and optionally a MAC data part, carrying the user data. The MAC header has a
fixed structure. It is assumed to be eight bytes long, which is sufficient to accommodate several
frame-type dependent fields (e.g., source and destination address, a tag field describing the frame type,
several flags [indicating priority, alternating bit protocol information or the method of data encoding],
queue lengths of the low and high priority request queues (see footnote 3), a length field [indicating the
number of data bytes in a data frame]). Furthermore, we assume that this MAC header contains a separate
header checksum. The checksum is assumed to be perfect, i.e., bit errors in the MAC header are always
detected. In fact, with a proper choice of the checksum algorithm the probability that bit errors remain
undetected can be made very low. Hence, it is reasonable to assume that the influence of this case is
negligible.
Control frames carry only the MAC header, whereas the data frame can carry up to 255 bytes of data.
Hence, it can embed a PROFIBUS variable length telegram of full length. No segmentation and
reassembly scheme is assumed.
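The frame layout described above translates directly into frame durations. The following sketch computes them under two assumptions that are not stated in the text: the QPSK data part is taken to run at 2 Mbit/s, and the length of the data checksum is left as a parameter (defaulting to zero) because it is not specified here.

```python
PHY_HEADER_US = 192      # PLCP preamble + PLCP header at 1 Mbit/s (from the text)
MAC_HEADER_BYTES = 8     # assumed MAC header length (from the text)
MAX_DATA_BYTES = 255     # maximum payload of a data frame (from the text)
QPSK_RATE_MBIT = 2       # assumption: 2 Mbit/s for the QPSK data part


def frame_duration_us(data_bytes=0, checksum_bytes=0):
    """Duration of one frame in microseconds.

    data_bytes = 0 models a control frame, which carries only the MAC header.
    checksum_bytes is a hypothetical parameter for the data checksum length.
    """
    if not 0 <= data_bytes <= MAX_DATA_BYTES:
        raise ValueError("data frames carry between 0 and 255 bytes")
    bits = 8 * (MAC_HEADER_BYTES + data_bytes + checksum_bytes)
    return PHY_HEADER_US + bits / QPSK_RATE_MBIT
```

Under these assumptions a control frame lasts 224 µs and a data frame with 255 bytes of payload lasts 1244 µs, so the fixed PHY header dominates the cost of short frames.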
We assume that all stations detect the end of a frame within a very short time (< 10 µs) after its last
bit. This corresponds to the SIFS of 802.11 with DSSS (see Section 3.2.3): within this time
the transmission of an 802.11 immediate ack must have been started. This requirement also
upper-bounds the modem turnover time between receiving and transmitting modes by 10 µs.
Every data frame is equipped with a checksum, covering only the data part and not including the
MAC header. After proper reception the receiver has to send an immediate acknowledgement, using
the ack frame (belonging to the class of control frames). For proper comparison with the PROFIBUS
SDA service (see Section 4.1.3) the ack frame carries no data. As for PROFIBUS, the ack frame
can indicate both positive and negative acknowledgements (e.g., on the occasion of buffer shortage).
However, for simplicity we assume for both the PROFIBUS protocol and the polling protocols that
only positive acks are sent.
3 These can be used by the BS for scheduling decisions.
ARQ-Protocol and WT-Behavior
The PROFIBUS uses a variant of the alternating bit protocol (ABP) with a bounded number of re-
transmissions (protocol parameter max retry) as ARQ protocol (see Section 4.1.4). The ABP provides
protection against losses and duplicates at the receiver.
We introduce two modifications to this protocol: the PROFIBUS atomicity restriction (see Section
4.1.3) is removed and for every priority class a separate instance of the alternating bit protocol is
maintained. Hence, given a number of N WT’s, every WT maintains 2(N − 1) instances of the
protocol, namely, one for every possible target and every priority. The rationale for this is to allow high
priority requests to preempt low priority requests currently in progress. In the nonpreemptive PROFIBUS
scheme the latter may block high priority requests in case of a longer period of transmission errors /
retransmissions. The possible preemption violates the PROFIBUS atomicity property; for a discussion
of the changes in semantics see Section 7.4.2. Since a separate ABP instance is maintained for every
possible target address and every priority, the preemption causes no confusion of the alternating bits
used. How this protocol can be integrated with the PROFIBUS ABP protocol is briefly discussed in
Section 7.4.
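To make the bookkeeping concrete, the following sketch shows one way a WT could hold its 2(N − 1) ABP instances, keyed by target address and priority. The class and function names are hypothetical, and only the sender-side alternating bit is modeled; retransmission counting and the receiver side are left out.

```python
from itertools import product

PRIORITIES = ("low", "high")


class AbpSender:
    """Sender-side alternating bit for one (target, priority) pair."""

    def __init__(self):
        self.bit = 0

    def on_ack(self):
        self.bit ^= 1    # toggle only after a successful transmission


def make_instances(own_address, all_addresses):
    """One ABP instance per possible target and per priority class."""
    targets = [a for a in all_addresses if a != own_address]
    return {(t, p): AbpSender() for t, p in product(targets, PRIORITIES)}
```

Because the high and low priority instances for the same target are separate objects, a high priority request that preempts a pending low priority request leaves the low priority alternating bit untouched.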
Furthermore, two different max retry parameters for low and high priority data are distinguished. For
the high priority requests we assume large values of the max retry parameter, e.g., max retry > 20.
This is due to the high reliability requirement for high priority requests.
Two important rules for the behavior of a WT are:
- For both priority classes separate queues are maintained. Upon every poll, a local scheduler
  selects the queue to serve. Within this work we consider only a simple local priority scheduler,
  where the high priority queue is always served if it is nonempty. However, other types of
  schedulers can be used, too.
- The WT observes the perceived poll intervals. If it is not polled for a certain time, it generates
  appropriate indications or management signals for its upper layers, indicating the outage
  condition.
Channel Models
For investigating the realtime performance of the polling-based protocols and the PROFIBUS protocol,
a set of channel models with different degrees of statistical sophistication is used.
The channel models take the phenomena found in the measurements reported in Chapter 6 into
account, namely bit errors and packet losses. All other phenomena (ghost packets, bit shifted packets
and so forth) are treated as packet losses, see Section 6.5.1. Bit errors and packet losses occur
independently from each other, see Section 6.8.3.
The basic assumption is that between every pair of stations there is a separate and independent channel
w.r.t. its error behaviour, see Section 6.8.3.
We assume that the channels between different station pairs are stochastically independent. If they
are in addition stochastically identical (i.e., follow the same stochastic model), this is called the
homogeneous case, otherwise we have the inhomogeneous case. In the inhomogeneous case every pair
of stations can have its own channel model: given a set of k different channel models (each including
packet losses and bit errors) and N stations with N(N+1)/4 (due to the assumed symmetry) channels
between them, each of the channels is randomly selected from the k channel models (all channel models
are equiprobable).
For a single channel four different channel models are used: the independent model, the Gilbert-Elliot
model, a Semi-Markov model (using lognormal distributions for the state holding times) and a complex
model, described below. Except for the complex model, all models are parameterized from trace 24
of the factorial measurement, assuming a burst order $k_0 = 150$ for the bit error submodels and a
burst order $k_0 = 1$ for the packet loss submodels.
The independent, Gilbert-Elliot and Semi-Markov models are homogeneous channel models, the complex
model is an inhomogeneous model.
For the independent model, the bit errors are assumed to be independent with a fixed bit error
rate (BER) of 0.00037, while packet losses occur independently with a fixed packet loss rate (PLR) of
0.1099.
In the Gilbert-Elliot model both the bit errors and the packet losses are assumed to follow a two-state
Markov chain model, respectively, as discussed in Section 6.7.1. With respect to bit errors, the mean
error burst length is 114.8, the mean error free burst length is 11514.1, and the bit error rate during
an error burst is 0.0375 (independent bit errors). For the packet loss model the following values are
used: mean error burst length: 2.68, mean error free burst length: 21.74, and the packet loss rate
during a packet loss burst is 1.
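The Gilbert-Elliot parameters given above can be cross-checked against the rates of the independent model. The sketch below assumes geometrically distributed state holding times (the usual consequence of a two-state Markov chain) and a chain that starts in the good state; the function names are illustrative and not taken from the thesis.

```python
import random


def stationary_error_rate(mean_bad, mean_good, rate_in_bad):
    """Long-run error rate of a two-state chain with the given mean burst lengths."""
    return mean_bad / (mean_bad + mean_good) * rate_in_bad


def gilbert_elliot(n, mean_bad, mean_good, rate_in_bad, rng):
    """Yield n error indicators (True = error) from a two-state Markov chain."""
    p_leave_bad = 1.0 / mean_bad       # geometric holding time in the bad state
    p_leave_good = 1.0 / mean_good     # geometric holding time in the good state
    bad = False                        # assumption: start in the good state
    for _ in range(n):
        yield bad and rng.random() < rate_in_bad
        if rng.random() < (p_leave_bad if bad else p_leave_good):
            bad = not bad
```

With the values quoted above, `stationary_error_rate(114.8, 11514.1, 0.0375)` gives approximately 0.00037 and `stationary_error_rate(2.68, 21.74, 1.0)` gives approximately 0.110, in line with the BER and PLR of the independent model.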
In the Semi-Markov model both the bit error and packet loss model are assumed to follow a two-state
model, however, with the state holding times drawn from a (quantized) lognormal distribution. For
the bit error model the mean error burst length is 114.8, the variance of the error burst length is
72237 (giving a coefficient of variation (CoV) of 2.341), the bit error rate within an error burst is
0.0375. The mean error free burst length is 11514.1, and its variance is 442520151449 (giving a CoV
of 57.775). For the packet loss model the mean packet loss burst length is 2.68, its variance is 24.53
(giving a CoV of 1.848) and the packet loss rate during a burst is 1. The packet loss free bursts have
a mean burst length of 21.74, a variance of 119492.06 (giving a CoV of 15.9).
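The lognormal holding times of the Semi-Markov model are specified through their mean and variance. A minimal sketch of how the usual lognormal parameters (mu, sigma) follow from these two moments is given below; it covers only the continuous distribution, since the quantization used in the thesis is not described in this section.

```python
import math


def coefficient_of_variation(mean, variance):
    """CoV = standard deviation divided by mean."""
    return math.sqrt(variance) / mean


def lognormal_params(mean, variance):
    """Return (mu, sigma) of a lognormal distribution with the given moments."""
    cov2 = variance / mean ** 2           # squared coefficient of variation
    sigma2 = math.log(1.0 + cov2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)
```

For example, `coefficient_of_variation(114.8, 72237)` reproduces the CoV of about 2.341 quoted for the bit error burst length, and `lognormal_params(2.68, 24.53)` yields parameters that could be passed to a lognormal sampler for the packet loss bursts.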
For the complex model we use different traces: trace 24 of the factorial measurement, trace 21 of
the factorial measurement, trace 1 of the longterm1 measurement, and trace 17 of the longterm1
measurement (QPSK, without scrambling). Some first order parameters of these traces are summarized
in Table 7.1. Each trace was taken as a single channel model, composed of one packet loss and
one bit error submodel. These four traces together form the inhomogeneous complex model. For all
traces both the packet loss and bit error submodels were expressed as bipartite models. The bit error
submodels use 25 good and 25 bad states, the packet loss submodels use 5 good and 5 bad
states. For the bit errors, the number of 25 states was shown in Section 6.7.3 to deliver good predictions
for selected performance measures. The packet loss submodel was chosen with a smaller number of states
due to the much more limited range of burst lengths.
It should be noted that the complex model is not comparable to the others, since it produces a different
mean BER, due to inclusion of traces with almost no bit errors. The overall PLR of 0.13 is also
different.

              trace 24     trace 21     trace 1      trace 17
              (k0: 150)    (k0: 150)    (k0: 150)    (k0: 150)
MBER          0.000370     0.000189     0            0
mean EBL      114.8        157.4        -            -
CoV EBL       2.341        1.967        -            -
max. EBL      6529         4482         -            -
mean EFBL     11514.1      17335.4      -            -
CoV EFBL      57.775       19.439       -            -
PLR           0.1099       0.1          0.3193       0.004
mean PLBL     2.68         1.88         4.0          1.053
var PLBL      24.52        4.02         24.67        0.05
CoV PLBL      1.845        1.066        1.239        0.212
max PLBL      55           30           73           2
mean PLFBL    21.73        16.92        8.55         265.6
var PLFBL     119492.06    10240.69     11795.49     3381778.32
CoV PLFBL     15.9         5.98         12.71        6.92

Table 7.1: Summary statistics of selected traces constituting the complex error model (EBL=error
burst length, EFBL=errorfree burst length, MBER=mean bit error rate, PLR=packet loss rate,
PLB=packet loss burst)
7.2 Polling-based Protocols
In this section we present the polling-protocols used for further study. The first one is a k-limited
round-robin protocol with a simple local priority scheduler, described below. This protocol serves as
a baseline protocol, in the sense that both the original PROFIBUS protocol and also the proposed
improvements are compared to this protocol. The choice of k-limited round-robin is reasonable, since
it has the property of a strictly bounded time between two trials for polling a fixed WT.
Among the different means to combat bit errors on wireless links two popular ones are FEC [97]
and postponing schemes like the one in [14, 104]. In the postponing schemes a retransmission of an
unacknowledged data frame is not performed immediately, instead it is delayed until a “better” time.
The idea is that for bursty errors an immediate retransmission will fail with high probability, and
instead the bandwidth can be used to serve other stations. This kind of scheme can be implemented
by influencing the sequence in which the BS polls the WT’s. For investigating this approach we
propose the functional repolling framework. However, in the course of preliminary investigations we
have observed that this approach should be applied only to high priority data frames, since otherwise
repolling of low priority requests may block high priority requests.
The round-robin and functional repolling protocols are “orthogonally” combined with two approaches
called simple poll relaying and simple data relaying, which aim to overcome (bursty) packet losses on
a wireless channel.
The following protocol description makes use of certain frame types. For reference, these are summarized
in Table 7.2. Some further preparatory remarks are:
- Time is not maintained by the BS as an absolute value. Instead the BS keeps a variable
  $t_c$ counting the number of poll transactions since system startup. Hence, the time is
  discretized. A poll transaction starts with a poll frame, invited-poll frame, explicit-poll
  frame or execute-explicit-poll frame (all explained below) and ends with the frame before
  the next of these poll frames.
- In the following, we will use the term time unit for a time duration of 64 µs, corresponding to 64
  bits transmitted with BPSK. Hence, the PHY header of an 802.11 DSSS PPDU consists of three
  time units.
- The convenient term slot denotes the exchange of a data frame including the following ack
  frame.

Frame type             min. parameters  protocol            meaning
poll                   k, a             all                 WT a may handle up to k data slots
data                   a, b, data       all                 WT a sends data to WT b
ack                    a, b             all                 WT b ack’s data frame of a
explicit-poll          k, a             rrk+SPR, frk+SPR    like poll, but a must answer
null                   -                rrk+SPR, frk+SPR    generated by empty WT on explicit-poll
execute-explicit-poll  k, a, b          rrk+SPR, frk+SPR    BS asks WT b to send explicit-poll frame to a with parameter k
executed-ack           status           rrk+SPR, frk+SPR    poll-relayer gives feedback to BS
invited-poll           a                frk, frk+x          BS grants a one data slot (special handling)

Table 7.2: Different frame types of polling-based protocols, with minimum parameters; x is either SDR
or SPR
7.2.1 k-limited Round-Robin
The k-limited round-robin protocol belongs to the class of cyclic polling protocols (as opposed to
Markovian or table based protocols, see [64, chap. 9]). The WT’s are served in round-robin fashion,
and the BS schedules a maximum of $k \in \mathbb{N}$ contiguous slots for a single WT before proceeding to the
next one. The value k used by the k-limited round-robin protocol and the functional repolling framework
(next section) is called the round-robin bound.
In detail, the protocol works as follows: the BS sends a poll frame to WT a, indicating the maximum
number k of contiguous slots allocated to a. Hence, a is allowed to send k data frames including ack
frames. If WT a has data to send, it starts immediately to do so by sending a data frame to WT
b. Furthermore, WT a maintains a slot counter, initially setting it to k and decrementing it after
every transmitted data frame. When b properly receives the data frame, it immediately (within
half a time unit) sends an ack frame. When a receives the ack frame, it checks whether there is more data
to send and the slot counter is greater than zero. If so, a sends the next data frame. If a has no
further frames to send, it keeps quiet. After the data frame, a waits one time unit and then senses
the medium. If b has received the data frame properly, it transmits an ack frame within half a time
unit. Hence, a will sense a busy medium. WT a now awaits the transmission outcome and
either proceeds with the next frame (new frame or retransmission) or stops sending, if its slot counter
reaches zero. When b does not transmit an ack frame, a determines this by sensing an idle medium.
In this case WT a waits for a second time unit and performs another carrier sense operation. If WT
a senses a carrier, it concludes that the BS has taken control over the channel. WT a then goes back
to its initial state, waiting for the next poll frame from the BS. If there is no carrier, WT a
continues with sending data frames if its slot counter is greater than zero, or gives up if its slot
counter has reached zero.
WT a runs a simple local priority scheduler: when deciding which frame to send, a always selects
the high priority queue for service if it is nonempty. This has the consequence that the atomicity
property is violated for low priority frames. Furthermore, since k does not necessarily have any relation
to the max retry parameter, the atomicity property can also be violated for high priority frames: if
k < max retry, a single request can span multiple poll cycles.
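The behaviour of a polled WT, with its slot counter and its local priority scheduler, can be summarized in a short sketch. This is a simplified illustration rather than the thesis' implementation: the carrier sense outcome is replaced by a hypothetical callback `acked`, and both the max retry bound and the case where the BS takes over the channel after the second carrier sense are left out.

```python
from collections import deque


def serve_poll(k, high, low, acked):
    """Simulate WT a after a poll frame granting k slots.

    high, low: deques of pending frames (head = next frame to send).
    acked:     callable(frame, trial_index) -> bool, standing in for the
               carrier sense result one time unit after the data frame.
    Returns the list of transmitted frames, in order.
    """
    sent = []
    slots_left = k                        # slot counter, initially set to k
    while slots_left > 0 and (high or low):
        queue = high if high else low     # simple local priority scheduler
        frame = queue[0]
        sent.append(frame)
        slots_left -= 1                   # decremented after every data frame
        if acked(frame, len(sent) - 1):
            queue.popleft()               # success: the request leaves its queue
        # otherwise the frame stays at the head and is retried in the next slot
    return sent
```

For example, with `k = 3`, one high priority frame whose first trial fails and two low priority frames, the WT sends the high priority frame twice and then one low priority frame; the second low priority frame has to wait for the next poll.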
The behaviour of the BS is as follows: after sending the poll frame it waits one time unit and then
senses the medium. If the WT has not started to make any transmissions, the next WT is served
(possible errors in fast carrier sensing are not modeled). Otherwise, the BS simply waits for the
medium being idle for at least three time units, which means that the WT has stopped any activity. Then
the next WT is served.
It is important to point out that with the protocol described so far the WT’s need no prior (preconfigured)
knowledge about k, since the BS transmits this value with every poll frame. Hence, it is no
problem to implement bandwidth assignment policies within the BS other than just sending the same
value of k to every station.
The decision to keep a quiet if it has no data to send introduces a problem: the BS cannot distinguish
between cases where a has no data and where a has not received the poll frame correctly. The reason
to choose this option is efficiency, since requiring a to send an empty or null frame in case of empty
queues costs at least one PHY overhead and a MAC overhead, which takes 192+x µs in a setup using
the IEEE 802.11 DSSS PHY. An alternative would be to let the WT send a short burst or jamming
signal, which is significantly shorter than any frame (say, 20-40 µs), and thus can be recognized by the
BS from the burst length. This is discussed in Section 7.6.
The k-limited round-robin protocol was chosen as a baseline protocol for the following reasons:
- It is simple and deterministic.
- It provides fair bandwidth distribution, and prevents stations from starvation.
- It has an upper bound on cycle times.
These properties follow immediately from the protocol description and are valid in case of no errors.
7.2.2 Functional Repolling Framework
In this section we present the functional repolling framework. It provides a means to express when to
perform retransmissions of high priority requests.
Definition. A repoll function is a function
\[ f : \mathbb{N} \to \mathbb{N}_0 \cup \{-1\} \]
with $f(i) = -1$ for $i > \text{max retry}$, $f(i) \geq i - 1$ and $f$ monotonically increasing in the interval
$[1, \ldots, \text{max retry}]$. The repoll function indicates the number of poll transactions $f(i)$, after which the
i-th retransmission takes place. This number is expressed with respect to the time $t_0$ at which
the first trial of transmitting a request fails.
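The conditions of this definition can be checked mechanically. The sketch below is an illustration with hypothetical function names; it reads "monotonically increasing" as non-decreasing (an assumption), and it can only probe finitely many arguments beyond max retry.

```python
def immediate_repolling(max_retry):
    """Repoll function f(i) = i - 1 for 1 <= i <= max_retry, and -1 beyond."""
    return lambda i: i - 1 if i <= max_retry else -1


def is_repoll_function(f, max_retry, probe=5):
    """Check the defining properties of a repoll function on a finite range."""
    inside = [f(i) for i in range(1, max_retry + 1)]
    # f(i) >= i - 1 on [1, ..., max_retry]
    lower_bound_ok = all(v >= i - 1 for i, v in enumerate(inside, start=1))
    # monotonically increasing, read here as non-decreasing (assumption)
    monotone_ok = all(a <= b for a, b in zip(inside, inside[1:]))
    # f(i) = -1 for i > max_retry, checked for a few arguments only
    beyond_ok = all(f(i) == -1
                    for i in range(max_retry + 1, max_retry + 1 + probe))
    return lower_bound_ok and monotone_ok and beyond_ok
```

The lower bound f(i) ≥ i − 1 reflects that the i-th retransmission cannot take place before the preceding i − 1 retransmissions have each used a poll transaction.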
The very basic idea of functional repolling is to let the BS keep track of necessary retransmissions
and to schedule them appropriately. The word “appropriately” only makes sense with respect to the
error conditions on a wireless link: if it is known to be bursty and a transmission error occurs on a
channel, it is reasonable to postpone the retransmissions and to serve other packets/WT’s meanwhile.
For expressing “appropriate” time instants for retransmissions, so-called “repoll functions” are used.
Basically, the BS maintains a k-limited round-robin protocol. However, the round-robin protocol is
applied only to a subset of all stations, namely, the members of the poll list. These are served by
sending regular poll frames as in the k-limited round-robin case. The remaining WT’s are members of
another list, the repoll list, discussed below.
The BS tries to capture all frames generated by the WT’s. Specifically, consider the case that the BS
has sent a poll frame to WT a and a has sent a data frame to b. Let us assume that b perceives
an error. In this case b generates no ack frame, which both a and the BS can determine by sensing
the medium one time unit after a’s data frame. The BS performs a repoll eligibility test. If this
test is passed successfully, the BS removes a from its poll list and enters it into its repoll list, while
simultaneously resetting the retry counter rc(a) associated to a. If this test is not successful, a remains
in the poll list.
The repoll eligibility test distinguishes between two cases: if the BS has successfully received a’s data
frame, the test is passed if the frame is of high priority, and it fails for a low priority frame. If
the BS has not received the frame (or the frame’s MAC header) successfully, it is a matter of policy
whether the frame is assumed to be of low priority (this is likely the typical case) or of high priority
(which is the conservative assumption). Here it is assumed to be of high priority.
The BS uses the repoll function f as follows: in the round-robin mode, when b does not acknowledge
a data frame of a and the repoll eligibility test was passed, the BS sets rc(a) = 0 and evaluates
f(rc(a)) = f(0). For every trial, if f(rc(a)) gives a negative value, the retry counter rc(a) is reset
to zero, and a is deleted from the repoll list and re-inserted into the normal poll list. If it returns
a nonnegative value, the station is inserted into the repoll list. The repoll list is a list of stations
to repoll. Each list element consists of a station’s address and the time instant (expressed in poll
transactions) when the repoll should take place (denoted as repoll time). The list is sorted by these
wait values, and new elements are inserted accordingly.
The BS now operates as follows:
- When the repoll list is nonempty, but the repoll time of the first element is larger than the
  current time, a station of the poll list is served. As a special case, if the poll list is empty and
  the repoll time is greater than the current time, the BS sends an invited-poll frame to an
  arbitrary WT, allowing it to perform data transfers. However, it makes sense to not select an
  arbitrary WT, but one whose queues are nonempty. For determining this the BS uses the queue
  length information it captures from all WT’s MAC headers. The targeted WT has the freedom
  to perform another trial on the blocking high priority request, to choose another high priority
  request targeted to a different station or to choose a low priority request. Selecting another
  request from the high priority queue (to overcome the head of line blocking) is only possible
  when the request’s destination address is different from that of the blocking request (because
  of the alternating bit) and if a possible change in the sequence of confirmation primitives as
  compared to the sequence of request primitives can be tolerated. If the WT chooses to serve
  a high priority frame, this has no effect on the repoll list. If the request was successful, it is
  removed from the queue.
- When the repoll list is nonempty and the repoll time of the first element is smaller than or equal
  to the current time, the first element is removed, and the BS sends a poll frame to the
  corresponding station a. This frame announces a number of k slots to a. If the WT does not
  start to transmit any frame, it is inserted into the poll list. In the other case, the BS performs a
  repoll eligibility test on the outcome. If this test is passed, the retry counter rc(a) is incremented
  and station a is re-inserted into the repoll list with a value according to f(rc(a)). If the test
  fails, rc(a) is set to zero and a is inserted into the poll list.
- When the repoll list is empty, the round-robin protocol is used.
By these rules, a station a is at every time a member of only a single list, and in every list a occurs only
once.
The need to keep the repoll list sorted introduces an O(N) operation; hence, the protocol does not
scale well with an increasing number of stations. However, in the specific case of PROFIBUS this is no
problem, since the number of stations is restricted to 127.
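A sorted repoll list with O(N) insertion can be sketched as follows. The class and method names are hypothetical; the repoll time is computed as the failure time plus the offset returned by the repoll function, both expressed in poll transactions, which is one reading of the definition in this section.

```python
import bisect


class RepollList:
    """Repoll list kept sorted by repoll time (counted in poll transactions)."""

    def __init__(self):
        self._entries = []     # sorted list of (repoll_time, address) tuples

    def insert(self, address, t0, offset):
        """Schedule a repoll of `address` at time t0 + offset (O(N) insertion)."""
        bisect.insort(self._entries, (t0 + offset, address))

    def pop_due(self, now):
        """Remove and return the first address that is due at `now`, else None."""
        if self._entries and self._entries[0][0] <= now:
            return self._entries.pop(0)[1]
        return None

    def __len__(self):
        return len(self._entries)
```

In each poll transaction the BS would first call `pop_due` with its current transaction counter and fall back to the poll list (or an invited-poll frame) when nothing is due. With at most 127 stations the linear cost of `insort` and `pop(0)` is negligible.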
Some sample repoll functions are:
immediate repolling: f(i) = i − 1
exponential repolling: f(i) = 2i1 (the first repoll is immediately)
delayed exponential repolling: f(i) = 2i1 + n0.
(here for simplicity the case f(i) = −1 (i > max retry) is not written out).
7.2.3 Relaying
The idea of relaying is targeted towards combatting packet losses. As we learned from our measurements,
there can be very long periods of sustained packet losses on the channel C(a, b) between WT
a and b. One possibility to overcome this problem is to let another station c relay the communication
between a and b, hoping that the channels C(a, c) and C(c, b) are currently in a good state.
We propose two different schemes, both with the BS as the main actor.
Simple Data-Relaying
Consider the case that WT a sends a data frame to WT b over the channel C(a, b), and WT b does
not decode this frame correctly; hence, it does not transmit an ack frame. The BS and WT a can
determine this by performing carrier sensing one time unit after WT a has finished its data frame.
However, if the channel C(a, z) from a to the BS z is not distorted, BS z has successfully captured the
data frame. In this case (and if it was of high priority), the BS may decide to send the data frame
on the channel C(z, b) to b immediately after it recognized the missing ack frame. If b sends an ack
frame, the BS tries to capture it and sends it again. Hence, a may receive the same ack
frame twice, once on the channel C(b, a) and once on the channel C(z, a). This requires the ability of a to
detect and ignore duplicate ack frames. Furthermore, a does not count z’s data frame as a separate
retransmission trial.
By the rules for the k-limited round-robin protocol, WT a senses the medium two times after finishing
its data frame: after one time unit it can determine whether WT b has answered, after two time units
it can determine whether the BS has (re-)transmitted the data frame. If so, WT a retires from the
medium and awaits the next poll from the BS. The BS has different policies to choose from:
- The BS continues with a’s successor.
- The BS polls a again with the remaining number of cycles available to a.
We have chosen the second alternative.
This approach is denoted as simple data relaying (SDR). It can be easily implemented as an orthogonal
feature to both the k-limited round-robin scheme and the functional repolling framework. Furthermore,
it does not need any history information maintained by the BS. This scheme likely delivers the
best results if the involved channels are not positively correlated. The term “simple” comes from the
restriction to let only the BS play the role of a data relayer. If this restriction is removed, however,
some synchronization between possible relayers seems necessary. A possible approach can use local
timers at every WT, with the timeout set according to a WT’s address. If a WT successfully captures
a data packet, it starts the timer. If the timer expires and there is no activity on the medium, the
WT transmits the captured frame. A similar scheme with local timers is used for triggering retransmissions
in the scalable reliable multicast approach reported in [48]. However, such schemes have the
disadvantage that they put additional complexity into the WT’s.
It is a matter of policy whether SDR is applied to every frame, or, in the case of the functional
repolling framework, only after a certain number of repolls. We assume that it is applied to every
frame.
Simple Poll-Relaying
Definition. A memory-loss function m(·) is a function
\[ m : \mathbb{R}_0^+ \to \mathbb{R}_0^+ \]
which is monotonically decaying and for which $\lim_{t \to \infty} m(t) = 0$ holds.
A critical issue not resolved by the SDR scheme is the case of a heavily distorted channel between the
BS z and a fixed WT a. This may lead to the situation that a does not receive a poll frame for a long
time. Since WT a signals empty queues by keeping quiet, the BS cannot tell whether a’s queues are
empty or the channel to a is distorted. Hence, when polling a the BS should insist on getting a short
answer from time to time, in order to be sure that a’s silence is not due to a corrupt channel C(z, a).
This is done using an explicit-poll frame, which requests a to send a data frame or a short null
frame to the BS (see footnote 4).
4 Special case: a sends a null frame, but the BS does not capture it, so it cannot tell whether it was a data frame.
Then the BS has to wait for a sufficiently long medium idle time.
In general, the BS tries to capture all frames, and for every station a keeps the timestamps of the last n
successfully received frames / MAC headers (transmitted over the channel C(a, z), which is assumed
to be the same as C(z, a)). If for a certain station even after sending explicit-poll frames this
timestamp is older than a certain threshold $T_{Quiet}$, the BS assumes that its channel to a is currently
bad. The question is: how can a be polled despite the bad channel C(z, a)?
The approach is very simple: the BS selects a station u (the poll relayer), which acts temporarily on
behalf of the BS z. The BS z sends an execute-explicit-poll frame to u, which carries a’s address
as a parameter. If u properly receives this frame, it sends an explicit-poll frame to a (see footnote 5).
WT a may then start to do a data transfer to an arbitrary station (including u), or it sends a null frame. In
both cases u waits for a to finish its transfer and then sends an executed-ack frame back to BS z,
indicating the success that u has perceived for a’s data transfer (u is required to check for ack frames
on a’s data frames). This feedback information allows the BS in the functional repolling framework
to execute the repoll scheme for station a. The poll relayer u is not required to perform the SDR
algorithm on behalf of the BS (however, this can be considered as an interesting extension). If the BS
gets no executed-ack frame from u (as can be determined by sensing no signal for a certain time),
it can choose to select another station v as poll relayer; however, within this work it is assumed that
the BS does not do so.
A central question is how the BS selects the poll relayer u. To make a good selection, it is useful
that the BS collects channel history information for each channel C(i, j). Specifically we propose
the following scheme. Let $n \in \mathbb{N}$ be the history depth, which indicates the amount of history information
stored for each channel. For every single channel C(i, j) (with $i, j \in \{1, \ldots, N\}$) the BS stores n transmission
outcomes $x^1_{i,j}, \ldots, x^n_{i,j}$ (with $x^\nu_{i,j} \in \{-1, 1\}$ for all $\nu \in \{1, \ldots, n\}$) along with the corresponding
timestamps $t^1_{i,j}, \ldots, t^n_{i,j}$. If x is a transmission outcome, the value x = −1 denotes an unsuccessful
transmission, and the value x = 1 denotes a successful transmission. The channel quality Q(i, j, t) of
channel C(i, j) at time t is then estimated as follows:
\[ Q(i, j, t) = \sum_{\nu=1}^{n} x^\nu_{i,j} \cdot g(t - t^\nu_{i,j}) \]
where g(·) is a memory-loss function. With g(·) one can express that older entries are less important
than fresh ones.
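The quality estimate can be computed directly from a bounded history. The sketch below stores the last n outcomes of one channel in a ring buffer and evaluates Q with an exponentially decaying memory-loss function; the exponential form and all names are assumptions chosen for illustration, since the text only requires g to decay monotonically towards zero.

```python
import math
from collections import deque


class ChannelHistory:
    """Last n transmission outcomes (+1 / -1) of one channel, with timestamps."""

    def __init__(self, depth):
        self.entries = deque(maxlen=depth)    # ring buffer: oldest entry drops out

    def update(self, success, t):
        self.entries.append((1 if success else -1, t))

    def quality(self, t, g):
        """Q = sum over stored outcomes x * g(t - timestamp)."""
        return sum(x * g(t - t_nu) for x, t_nu in self.entries)


def exp_decay(rate):
    """Hypothetical memory-loss function g(dt) = exp(-rate * dt), rate > 0."""
    return lambda dt: math.exp(-rate * dt)
```

A recent failure therefore lowers Q much more than an old one, and a channel with no recent observations drifts towards a neutral quality of zero.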
The BS chooses the poll relayer u such that
\[ \min\{Q(z, u, t), Q(u, a, t)\} \geq \min\{Q(z, u', t), Q(u', a, t)\} \]
holds for all $u' \in \{1, \ldots, N\}$, i.e., it looks for the “route” with the best minimum quality.
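This max-min selection is easy to express in code. In the following sketch the quality estimates are supplied through a hypothetical callback `quality(i, j)`, and the BS and the target station are excluded from the candidate set, which is an assumption not spelled out in the text.

```python
def choose_poll_relayer(quality, bs, target, stations):
    """Return the relayer u maximizing min(Q(bs, u), Q(u, target)).

    quality:  callable (i, j) -> current estimate of Q(i, j, t).
    stations: iterable of station addresses that may act as relayer.
    Returns None if no candidate is available.
    """
    candidates = [u for u in stations if u not in (bs, target)]
    if not candidates:
        return None
    return max(candidates,
               key=lambda u: min(quality(bs, u), quality(u, target)))
```

Taking the minimum over both hops means that a relayer with one excellent and one poor channel is ranked below a relayer with two moderately good channels.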
The history is updated by the following rules:
- When the BS z successfully captures a frame from a to b at time t, it performs a history update
  procedure on C(a, z). During the history update procedure, the BS removes the oldest entry
  from C(a, z) and, after renumbering, adds a new entry $x^n_{a,z} = 1$ and $t^n_{a,z} = t$ to the channel
  history (see footnote 6). The same is done for the channel C(z, a) by the channel symmetry assumption. When
  the BS captures only the MAC header of a data frame without errors, the transmission outcome
  $x^n_{a,z}$ is set to 1.
- If the BS polls station a, and senses a signal from a but does not capture a’s frame, it performs
  a history update procedure for channels C(a, z) and C(z, a) with transmission outcome −1.
- If WT a sends a data frame to WT b, which the BS can capture, and if this frame is followed by
  some signals in the time where the BS expects the ack frame, the BS performs a history update
  procedure for the channels C(a, b) and C(b, a), both with a successful transmission. If WT b
  sends no ack frame, the BS performs a history update procedure for the channels C(a, b) and
  C(b, a), both with unsuccessful transmission.
5 This immediately starting activity allows the BS to check whether u has received the execute-explicit-poll frame.
If u does not react, the BS seeks another station v. However, the number of trials for selecting a poll relayer is
bounded.
6 This can be easily implemented as an O(1) algorithm using a ring buffer.
It is clear from this description that the observed channel state can be inaccurate, e.g., in the last
case it may happen that a does not receive b’s ack frame successfully. However, this scheme has the
advantage that all burdens are put onto the BS while the WT’s have to do nothing. Specifically, there
is no need to transmit history information from the WT’s to the BS. Hence, the WT’s implementation
is comparatively simple.
7.2.4 Adaptive Functional Repolling
The functional repolling framework allows an arbitrary repoll policy to be implemented by selecting
the repoll function f. For example, if the channel characteristics are known to correspond to a
Gilbert-Elliot model, it makes sense to postpone the retransmission for a short while (say, the mean
error burst length plus two times its standard deviation), which can be easily expressed. However,
wireless channels have an inherently time-varying nature, both with respect to bit error rates and
packet loss rates. Hence, a fixed f may not be a suitable choice for all situations.
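As a worked illustration of the postponement rule mentioned above, the following sketch assumes a discrete-time two-state model in which the bad state is left with probability p per step, so that the error burst length is geometrically distributed with mean 1/p and variance (1 - p)/p^2. This parametrization and the function name are assumptions made for the example only.

```python
import math


def postponement(p_leave_bad):
    """Mean error burst length plus two standard deviations (in steps).

    Assumes a geometric burst length with per-step exit probability p_leave_bad.
    """
    mean = 1.0 / p_leave_bad
    std = math.sqrt(1.0 - p_leave_bad) / p_leave_bad
    return mean + 2.0 * std


# Example: bursts that end with probability 0.5 per step.
delay = postponement(0.5)
```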
In [46] “Meta-MAC” protocols are introduced. The basic idea is simple and elegant: a station contains
not only one MAC instance, but several of them, running in parallel. These can be entirely different
protocols or the same protocol, but with different parameters. However, only one protocol is really
active at a given time in the sense that its decisions are executed; the other instances’ decisions are
only recorded. The active protocol is chosen based on suitable history information about transmission
outcomes. For each protocol it is evaluated how “successful” it would have been given the outcomes
in the history. Based on this ranking a new protocol is chosen.
A similar idea can be adapted for choosing a proper f from a set of given f’s. The “success” of each
function can be evaluated for the known channel history information. The hope is that the observed,
historical channel conditions are a good prediction at least for the “near” future, and thus that the
selected f performs well.
Specifically, the proposed scheme can be described as follows: let $\{f_1, \ldots, f_m\}$ be a set of m
repoll functions as defined in Section 7.2.2. For every channel C(i, j) with $i, j \in \{1, \ldots, N\}$ the
BS maintains a channel history as described in Section 7.2.3. Furthermore, with every channel a
current repoll function $f_{i,j} \in \{f_1, \ldots, f_m\}$ is associated.
The main building block is to evaluate the “success” a given repoll function f would have reached on
a channel C(i, j) given its channel history $x^1_{i,j}, \ldots, x^n_{i,j}$ and $t^1_{i,j}, \ldots, t^n_{i,j}$ (to simplify
notation, the indices i and j are dropped). First of all, we motivate and specify the notion of
“channel quality”: consider the example n = 10 and $x_1 = 1, x_2 = 1, \ldots, x_4 = 1, x_5 = -1, x_6 = 1, \ldots, x_{10} = 1$
and $t_k = k$ ($k \in \{1, \ldots, 10\}$). Here, within ten trials only one failure occurs, which should not be
interpreted as a “bad channel period”, but as a “temporary slip”. Hence, it makes sense to “smooth”
the channel history using low-pass filtering. A simple means to do this is to replace the channel
history with a windowed moving-average version. Specifically, let W be the window size and
c := f(max retry) be the maximum number of poll transactions over which f has any influence. Then
we use the following moving average function considering the last c poll transactions as taken from
time t:

$$d(k, t) = \frac{1}{W} \sum_{i=1}^{n} x_i \cdot \mathbf{1}_{[t-c-W+k,\; t-c+k]}(x_i)$$

where $\mathbf{1}_A(x)$ is the indicator function for the set A, i.e., $\mathbf{1}_A(x) = 1$ if $x \in A$, and
$\mathbf{1}_A(x) = 0$ otherwise.
The “quality” of f is then interpreted as f’s ability to schedule the repolls during the “good” channel
states:

$$Q(f(\cdot), t) = \sum_{i=1}^{c} d(i, t) \cdot h(c - i) \cdot \mathbf{1}_{\mathbb{N}}(f^{-1}(i))$$

where h(·) is an arbitrary memory-loss function as defined in Section 7.2.3. If we do not want to look
back c poll transactions, but only $c_0 < c$, and see how f would have behaved “from $t - c_0$ on”, we use

$$Q(f(\cdot), t) = \sum_{i=c_0}^{c} d(i, t) \cdot h(c - i) \cdot \mathbf{1}_{\mathbb{N}}(f^{-1}(i - c_0))$$

A basic parameter for this approach is the update frequency $T_{AR}$, i.e., how often a new repoll
function is determined for a channel.
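The evaluation of a repoll function against a channel history can be sketched as follows. The sketch rests on several assumptions that are not fixed by the text: the history is a list of (timestamp, outcome) pairs with outcomes +1 and -1, the window of the moving average selects entries by their timestamp, the indicator on the inverse of f is read as "f schedules some repoll in slot i", and f is increasing so that c = f(max retry) is positive. All function names are hypothetical.

```python
import math


def moving_average(history, k, t, c, window):
    """d(k, t): sum of the outcomes whose timestamp lies in [t-c-W+k, t-c+k], divided by W."""
    lo, hi = t - c - window + k, t - c + k
    return sum(x for (t_i, x) in history if lo <= t_i <= hi) / window


def quality(f, history, t, max_retry, window, h):
    """Q(f, t): smoothed channel quality summed over the slots in which f schedules a repoll."""
    c = f(max_retry)
    slots = {f(j) for j in range(1, max_retry + 1)}  # image of f
    return sum(moving_average(history, i, t, c, window) * h(c - i)
               for i in range(1, c + 1) if i in slots)


def select_repoll(candidates, history, t, max_retry, window, h):
    """Return the name of the candidate repoll function with the highest quality."""
    return max(candidates,
               key=lambda name: quality(candidates[name], history, t, max_retry, window, h))


def memory_loss(x):
    # Assumed exponentially decaying memory-loss function.
    return math.exp(-x / 20.0)


def fast_linear(i):
    return 2 * i - 1


def slow_linear(i):
    return 4 * i - 1


# Example history from the text: ten trials at t_k = k with a single failure at k = 5.
history = [(k, -1 if k == 5 else 1) for k in range(1, 11)]
best = select_repoll({"fast": fast_linear, "slow": slow_linear},
                     history, t=10, max_retry=3, window=3, h=memory_loss)
```

In the adaptive scheme such a selection would be repeated for every channel once per update period.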
7.3 Realtime-Performance Results
In this section first a comparison of the baseline k-limited round-robin protocol with the classical
PROFIBUS protocol is presented (Section 7.3.2). The results indicate that indeed the approach to
replace the PROFIBUS protocol by something else is promising. By these results the k-limited round-
robin protocol is established as a good starting point for investigating further improvements, which is
done in Section 7.3.3.
7.3.1 Method of Investigation
Realtime Performance of PROFIBUS
For evaluating the realtime performance of the PROFIBUS protocol the same simulator is used as in
Chapter 5, however, with some adjustments:
- The stations use the wireless-type framing formats as described in Section 5.2, where PROFIBUS
  frames are embedded into 802.11 DSSS PHY PPDUs (see Section 3.2.2), hence, including the
  PLCP preamble and PLCP header.
- The channel error model was changed to incorporate the four models as described in Section
  7.1.1, and the medium characteristics. In fact, both the PROFIBUS simulator and the polling
  simulator described below use the same code and parameters for the channel models.
Figure 7.1: Logical structure of polling simulation model (figure omitted; it shows the BS with its
MAC entity, and WT 1 and WT 2, each with request sources 1, ..., n, a link-layer interface, a link
layer (ARQ) and a MAC entity, connected by the channels C(WT1, WT2), C(WT2, WT1), C(WT1, BS),
C(BS, WT1), C(WT2, BS) and C(BS, WT2) within one collision domain)
The simulator uses the CSIM simulation library and provides a full implementation of the token-
passing and ring-maintenance mechanism, the link-layer protocol and the link-layer interface (see
Section 5.1.4). Aspects like management are not considered in the simulator.
Every PROFIBUS station has separate request sources, which issue SDA requests at the link-layer
interface and accept the corresponding confirmation primitives. The request sources can generate low
and high priority requests.
Realtime Performance of Polling Protocols
For investigating the polling-based protocols a simulation model was developed, written in C++ and
using the CSIM simulation library [103]. This model is distinct from the PROFIBUS simulator. The
simulation approach was preferred over probabilistic analysis, since even for the simplest cases the
stochastic analysis of polling systems gets very involved.
The logical structure of the simulator is shown in Figure 7.1 for a setup of one BS and two WT’s. The
simulator covers the polling-based protocols as described in Section 7.2. A protocol for registration
and deregistration is not considered, nor are management functionalities included.
Every WT can have several request sources for generating requests. A single source generates requests
with a configurable priority, request size distribution, interarrival time distribution, and target address
distribution. The link-layer instances in the WT’s take the requests, enqueue them according to their
priority, handle the ABP protocol and finally run the local scheduler every time the MAC entity
signals the availability of a slot. The MAC entity within the WT’s depends on the polling-protocol,
but typically accepts one of the four poll frame types from the BS and acts accordingly. The MAC
entity of the BS is responsible for polling the WT’s, keeping the history information, executing the
adaptive functional repoll protocol, etc.
All stations form a collision domain. This means that all stations can hear each other and that
two signals transmitted in parallel will destroy the received signals for all stations. However, with
respect to the error behavior, between every pair of stations two separate, yet symmetric channels are
maintained, according to the models described in Section 7.1.1.
Comparability of Simulation Models
In both simulation models the framing is the same, since both embed their MAC frames into 802.11
DSSS PHY packets. Hence, both models consider a PHY overhead of 192 µs.
In both models the sources generate requests with a data length of 40 bytes, belonging to the
acknowledged SDA service. The MAC header length in the polling simulator is eight bytes, the length
of header and trailer in the PROFIBUS simulator is nine bytes. The interarrival times are computed
such that in every model the load definition of the realtime performance measures (see Section 4.3.1)
is satisfied, i.e., both simulators generate the same (error-free) link utilization. The sources of the
polling models always generate requests, those of the PROFIBUS model generate requests only if the
attached station is a member of the ring. This restriction was necessary to keep the simulator’s
memory consumption stable; otherwise the queues grow without bound for many parameter settings.
Please note that this behavior is in favor of the polling simulator.
In both simulators it is assumed that there is no protocol processing time, and there is only a very
small delay (<1 bit) between reception of a data frame and the corresponding acknowledgement.
Actually, this is a good approximation when taking today’s fast microcontrollers into account.
Both simulators use the same channel error models, namely, those described in Section 7.1.1.
In summary, although the protocols in both simulators are different, they operate in the same load
environment, error environment and PHY medium.
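The framing assumptions can be made concrete with a small airtime calculation. The sketch assumes that the whole MAC frame is sent at the 2 MBit/s rate used in Section 7.3.2, that every byte occupies exactly eight bits on the air, and that the 192 µs PHY overhead is added once per frame; the function name is hypothetical.

```python
PHY_OVERHEAD_S = 192e-6   # PLCP preamble and header, as stated in the text
BITRATE_BPS = 2_000_000   # 2 MBit/s (assumed to apply to the whole MAC frame)


def frame_airtime(payload_bytes, mac_overhead_bytes, bitrate_bps=BITRATE_BPS):
    """Duration of one data frame in seconds, assuming 8 bits per byte on the air."""
    return PHY_OVERHEAD_S + 8 * (payload_bytes + mac_overhead_bytes) / bitrate_bps


polling_airtime = frame_airtime(40, 8)    # 8-byte MAC header in the polling simulator
profibus_airtime = frame_airtime(40, 9)   # 9 bytes of header and trailer in PROFIBUS
```

Under these assumptions a polling data frame occupies 384 µs and a PROFIBUS data frame 388 µs, so the PHY overhead accounts for about half of the airtime in both models.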
7.3.2 Comparison of Round-Robin with PROFIBUS
Both simulators were given the same load, namely the smooth loads with 10% and 50% background
load. The set of varied parameters is shown in Table 7.3. The propagation delay is zero, and both
simulators use 2 MBit/s QPSK modulation. The PROFIBUS slot time $T_{SL}$ was set to 400 µs, the
station delay was chosen to be 100 µs (as in Section 5.2). The simulation time was always chosen to
be 3600 simulated seconds, in order to be comparable with the results in Chapter 5.
Table 7.3: Simulation parameters for comparison between PROFIBUS and k-limited round robin

Parameter                                    Values
-------------------------------------------  ----------------------------------------------------
Number of stations                           2, 4, 8, 12, 16, 20
Target token rotation time ($T_{TTRT}$)      0.005, 0.01, 0.015, 0.02, 0.04 s
gap factor                                   1, 2, 4, 6, 8
max retry parameter rrk                      20
max retry parameter PROFIBUS                 20
PROFIBUS improvements                        on
channel models                               independent, Gilbert-Elliot, Semi-Markov, complex
load models                                  smooth (10% low priority load), smooth (50% low priority)
round-robin bound k                          1, 2, 4, 6
Bitrate                                      2 MBit/s QPSK

Table 7.4: Maximum number of negative confirmations over all N’s for the different error models
(PROFIBUS protocol, $T_{TTRT}$ = 0.005 s, gap factor = 1)

Error model        Max. number of negative confirms
-----------------  --------------------------------
independent        0
Gilbert-Elliot     8
Semi-Markov        620

For the PROFIBUS the setting of the max retry parameter is interesting. For high values one can
expect a low negative confirmation rate (see Section 4.3.1), possibly at the cost of blocking effects
due to the non-preemptability of requests (including low priority requests). We have chosen to set
this parameter for the PROFIBUS to max retry = 20, in order to have the same level of reliability
as for the polling protocols. In fact, we state without showing it in this thesis that the realtime
performance results show almost no difference for the two parameter values. This can be explained as
follows. In Table 7.4 we show for the PROFIBUS protocol and the parameter setting with the highest
degree of ring membership ($T_{TTRT}$ = 0.005 s and gap factor equals 1) the maximum number of
negative confirmations, for each error model taken over all different values of N. For the independent
and the Gilbert-Elliot model the rates are close to zero, for the Semi-Markov model the negative
confirmations made up less than one percent of all confirmations. Hence, it can be reasonably expected
that the case of having more than three retransmissions occurs rarely and does not influence the
results much.
In Figures 7.2 and 7.8 we show for the Gilbert-Elliot error model and the low priority loads of 10%
and 50%, respectively, the $\widetilde{DC}$ values for the different round-robin bounds k and the
$\widetilde{DC}$ value for the PROFIBUS. The latter is determined as follows: for a fixed station number N
take the minimum $\widetilde{DC}$ value for all possible gap factors and $T_{TTRT}$ values, i.e., the best
achievable value for the given parameter ranges (which are actually chosen towards fastest re-inclusion
of stations, i.e., there is much bandwidth consumption for re-including stations). For the Semi-Markov
model the corresponding curves are shown in Figures 7.3 (10% low priority load) and 7.9 (50% low
priority load), for the complex model the curves are shown in Figures 7.5 (10% low priority load) and
7.11 (50% low priority load), while for the independent model the curves are shown in Figures 7.4
and 7.10.
Furthermore, in Figure 7.6 we show for the 10% low priority load case the fraction of time that all
stations are members of the logical PROFIBUS ring. For a fixed N, fixed error model and fixed channel
model, this mean value is taken over all possible different gap factors and $T_{TTRT}$ values. The same
is shown for the 50% low priority load case in Figure 7.12.
Figure 7.2: Overall confirmation delays $\widetilde{DC}$ (in seconds) for the rrk-protocols (RR-1, RR-2,
RR-4, RR-6) and the original PROFIBUS protocol (PB) vs. number of wireless terminals N for 10% low
priority load, Gilbert-Elliot error model and different round-robin bounds k
The 10% low priority load case
For the Gilbert-Elliot error model the advantage of the round-robin protocols over the PROFIBUS
protocol is impressive, see Figure 7.2. A factor of eight in $\widetilde{DC}$ performance is reached. The ring
stability data shown in Figure 7.6 provides an explanation: since the logical PROFIBUS ring is rather
unstable, high priority requests occasionally have rather long waiting times in PROFIBUS stations
that are currently not members of the ring. For the highest number of stations the PROFIBUS
performance becomes better. This is likely due to typically shorter gap lists, leading to faster
re-inclusion.
For the Semi-Markov model also a clear advantage of the round-robin protocols over the PROFIBUS
protocol is visible, although there are some differences:
- The PROFIBUS $\widetilde{DC}$ performance is better than for the Gilbert-Elliot case (max. 650 ms for
  20 stations as compared to 760 ms for the Gilbert-Elliot case). This corresponds with a better
  ring stability as indicated in Figure 7.6. This can be attributed to the larger variance of the
  underlying distributions of the bit-error-free and packet-loss-free periods, leading occasionally
  to long times of no errors. During these times the PROFIBUS protocol can achieve a full ring
  and empty its queues.
- The round-robin protocols show worse performance for the Semi-Markov model than for the
  Gilbert-Elliot model. This can be explained by the fact that longer error bursts or packet loss
  bursts occur more often in the Semi-Markov model, due to its higher level of variability in the
  error burst / packet loss burst length distributions.
- Although the differences are small, for both the Gilbert-Elliot and the Semi-Markov model the
  4-limited and 6-limited round-robin protocols achieve better results than the 1-limited or the
  2-limited round-robin. This relationship is inverse to that found in the 50% low priority load
  case (compare Figures 7.8 and 7.9). For the high load case the advantage of small k values over
  the larger ones can be attributed to having WT ν performing low priority requests, while the
  high priority requests at its successor WT ν + 1 have to wait. For the 10% low priority load case
  the advantage of high k values over the smaller ones can be explained as follows. We observe that
  the curves for k = 2 for both the Gilbert-Elliot and the Semi-Markov model are closer to
  the curves for k = 4 and k = 6 than to k = 1. Hence, it is reasonable to assume that a WT can
  benefit from k > 1 due to having the ability to perform immediate retransmissions of high
  priority frames, instead of waiting for the next poll cycle. This effect can only show up in the
  10% low priority load case, since otherwise high priority requests at WT ν + 1 can be blocked by
  ν’s low priority requests.

Figure 7.3: Overall confirmation delays $\widetilde{DC}$ (in seconds) for the rrk-protocols (RR-1, RR-2,
RR-4, RR-6) and the original PROFIBUS protocol (PB) vs. number of wireless terminals N for 10% low
priority load, Semi-Markov error model and different round-robin bounds k

Figure 7.4: Overall confirmation delays $\widetilde{DC}$ (in seconds) for the rrk-protocols (RR-1, RR-2,
RR-4, RR-6) and the original PROFIBUS protocol (PB) vs. number of wireless terminals N for 10% low
priority load, independent error model and different round-robin bounds k

Figure 7.5: Overall confirmation delays $\widetilde{DC}$ (in seconds) for the rrk-protocols (RR-1, RR-2,
RR-4, RR-6) and the original PROFIBUS protocol (PB) vs. number of wireless terminals N for 10% low
priority load, complex error model and different round-robin bounds k

Figure 7.6: Fraction of time $\bar{M}$ that all stations are members of the logical PROFIBUS ring over
all different gap factors and $T_{TTRT}$ values vs. station number N for different error models
(Gilbert-Elliot, Semi-Markov, independent, complex; 10% low priority load)

Figure 7.7: Remaining bandwidth for low priority data $B_L$ for the rr-1 protocol and the PROFIBUS
protocol (best value over all parameters) vs. number of wireless terminals N for 50% low priority
load and independent and complex error models
For the complex model the round-robin protocols again have a clear advantage over the PROFIBUS
protocol (Figure 7.5). The round-robin protocols show a worse performance than for the
Gilbert-Elliot model. This is likely due to the stations which are attached to the bad link
corresponding to trace 1 of the longterm1 measurement. Interestingly, the PROFIBUS does not show
worse results than for the Gilbert-Elliot case.
The results for the independent model look different:
- The overall $\widetilde{DC}$ performance for both the round-robin protocols and the PROFIBUS protocol
  is much better than for the other error models. For the PROFIBUS this is due to two reasons:
  the first one is that ring stability is much better than for the other error models (compare
  Figure 7.6), hence, fewer requests are blocked in the queues of non-ringmember stations. The
  second reason is that the PROFIBUS policy of performing immediate retransmissions is the most
  appropriate for independent errors. In this case, postponing a retransmission does not give a
  better probability of success (as can be expected for bursty channels after waiting long enough),
  but only increases the delay. The latter argument is also true for the rr-2, rr-4 and rr-6
  protocols. But even the rr-1 protocol achieves a gain for the independent error model.
- The PROFIBUS protocol achieves better results than the rrk protocols (the same is true for
  the 50% low priority load case). Specifically, a careful inspection of the simulation results shows
  that the best PROFIBUS version is that with the smallest $T_{TTRT}$ value of 5 msec. This points
  towards an explanation of this phenomenon: by the PROFIBUS bandwidth assignment rules
  (see Section 4.1.4), for a high number of stations and an extremely tight $T_{TTRT}$ value it happens
  often that a station computes a negative token holding time upon token arrival. In this case it
  is allowed to perform at most one high priority request, but no low priority requests. If no high
  priority request is available, the station is required to pass the token to its successor, instead of
  handling a low priority request. In contrast to this, in the round-robin protocols a station always
  handles at least one request if one of its queues is nonempty. This introduces additional delay
  for the following stations. To validate this explanation, we show in Figure 7.7 the remaining
  bandwidth for low priority requests $B_L$ for the independent error model, the PROFIBUS and
  the rr-1 protocol and the 50% low priority load case. It can be seen that rr-1 constantly uses
  30% of the bandwidth for transmission of low priority data, while the PROFIBUS achieves its
  good $\widetilde{DC}$ performance at the cost of decreasing $B_L$ performance.
- Again the rr-4 and rr-6 protocols achieve better performance than the rr-1 and rr-2 protocols.
Within the k-limited round-robin framework a similar bandwidth distribution scheme as for the
PROFIBUS can be approximated by a small modification of the poll frame: a special flag could
indicate to a WT that it should only transmit data, if its high priority queue is nonempty. The BS
can use this flag according to different policies. One possibility is to switch on this flag, when the
cycle time of the last polling cycle exceeded some prescribed threshold.
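The flag-based policy suggested above can be sketched in a few lines. The threshold, the function names and the way the WT reports its queue lengths are assumptions made purely for illustration; the text does not define a concrete frame format for the flag.

```python
def high_priority_only(last_cycle_time, threshold):
    """BS side: set the flag when the last polling cycle exceeded the threshold."""
    return last_cycle_time > threshold


def wt_may_transmit(flag, high_queue_len, low_queue_len):
    """WT side: with the flag set, answer a poll only if high priority data is waiting."""
    if flag:
        return high_queue_len > 0
    return high_queue_len > 0 or low_queue_len > 0


# Example: a congested cycle (12 ms against a hypothetical 10 ms threshold).
flag = high_priority_only(last_cycle_time=0.012, threshold=0.010)
answers = wt_may_transmit(flag, high_queue_len=0, low_queue_len=3)
```

In this example the WT stays quiet, since the cycle was too long and only low priority requests are queued.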
The 50% low priority load case
For the Gilbert-Elliot model we still have a clear advantage of all round-robin protocols over the
PROFIBUS protocol (see Figure 7.8), the latter having at least two times the $\widetilde{DC}$ value of the
worst round-robin protocol (rr-6). For k = 4 the ratio is approximately three, for k = 1 and k = 2
approximately four. It can be seen that the rr-1 and rr-2 protocols have better $\widetilde{DC}$ performance
than rr-4; the rr-6 protocol is even worse. As explained in the preceding section, this can be
attributed to emptying low priority queues at WT ν, which increases the waiting times for high
priority requests at its successor WT ν + 1 (we call this mechanism low priority successor blocking).
The same observations apply to the Semi-Markov model (see Figure 7.9), however, while the PROFIBUS
$\widetilde{DC}$ performance is nearly constant, the $\widetilde{DC}$ performance for the round-robin protocols
decreases slightly. This behavior was already observed for the 10% low priority load case.
For the complex model (shown in Figure 7.11) the rr-1 and rr-2 protocols are always better, rr-4 is
beaten by the PROFIBUS protocol only for N = 20, while rr-6 loses against PROFIBUS already for
N > 12. A likely explanation is that the polling protocols suffer from blocking effects.
Again, when looking at the independent model (see Figure 7.10) all the $\widetilde{DC}$ values are on a
lower absolute level than for the other error models, and the PROFIBUS protocol shows the best
performance. The rr-6 protocol shows the worst performance of the round-robin protocols, followed
by rr-4. Interestingly, rr-2 seems to be slightly better than rr-1. Hence, the increase from k = 1 to
k = 2 is beneficial. Probably this is due to a good balance between the possibility of immediate
retransmissions for high priority frames and low priority successor blocking. For k > 2 the low
priority successor blocking mechanism overshadows this small gain.

Figure 7.8: Overall confirmation delays $\widetilde{DC}$ (in seconds) for the rrk-protocols (RR-1, RR-2,
RR-4, RR-6) and the original PROFIBUS protocol (PB) vs. number of wireless terminals N for 50% low
priority load, Gilbert-Elliot error model and different round-robin bounds k
The PROFIBUS $\widetilde{DC}$ performance for the independent error model is the same as for the 10% low
priority load case. This indicates that the PROFIBUS protocol’s realtime performance does not depend
much on the low priority load. Again, the price to be paid here is the starvation of low priority traffic
for higher station numbers.
Overall Observations
An interesting observation is true for both load scenarios: the round-robin protocols show for all error
models a linear dependence of the $\widetilde{DC}$ performance on the number of stations N; only the slope
varies between the different error models and k values. Once the slope for a certain protocol is known,
this property allows an easy assessment of the realtime capacity (as defined in Section 4.3.1) of the
round-robin protocols.
This observation is not surprising, since it reflects a simple property of the round-robin protocols
(which shows up nicely under homogeneous load conditions): each station gets the same share of
bandwidth and can make purely local decisions on its usage, without considering the behavior of
other stations. This can lead to blocking effects.
In contrast to this, for the PROFIBUS other stations have a short-term influence on the bandwidth
assignment within a single PROFIBUS station by the modified timed-token protocol. This fits together
with the nonlinearities observed in the figures.
From Figures 7.6 and 7.12 it can be observed that for a higher number of stations the ring stability
is much worse for the 50% low priority load case than for the 10% low priority load case. This can be
explained by the fact that under higher system load it takes longer to re-include a WT into the
logical ring, since the ring-maintenance frames are only used when there is spare bandwidth (see
Section 4.1.4), which happens less often than in the low load case.

Figure 7.9: Overall confirmation delays $\widetilde{DC}$ (in seconds) for the rrk-protocols (RR-1, RR-2,
RR-4, RR-6) and the original PROFIBUS protocol (PB) vs. number of wireless terminals N for 50% low
priority load, Semi-Markov error model and different round-robin bounds k

Figure 7.10: Overall confirmation delays $\widetilde{DC}$ (in seconds) for the rrk-protocols (RR-1, RR-2,
RR-4, RR-6) and the original PROFIBUS protocol (PB) vs. number of wireless terminals N for 50% low
priority load, independent error model and different round-robin bounds k

Figure 7.11: Overall confirmation delays $\widetilde{DC}$ (in seconds) for the rrk-protocols (RR-1, RR-2,
RR-4, RR-6) and the original PROFIBUS protocol (PB) vs. number of wireless terminals N for 50% low
priority load, complex error model and different round-robin bounds k

Figure 7.12: Fraction of time $\bar{M}$ that all stations are members of the logical PROFIBUS ring over
all different gap factors and $T_{TTRT}$ values vs. station number N for different error models
(Gilbert-Elliot, Semi-Markov, independent, complex; 50% low priority load)
To summarize, the polling-based protocols have under all bursty error models (Gilbert-Elliot, Semi-
Markov, complex) almost always an impressive advantage over the PROFIBUS protocol. Only in the
case of independent errors (which is unlikely to occur in real environments) the PROFIBUS shows
advantages. However, the polling-based protocols can easily mimic the behavior of the PROFIBUS
to suppress low priority traffic in congested situations.
7.3.3 Comparison of the Polling-protocol Modifications with Round-Robin
In the previous section we have shown that the round-robin protocols deliver superior realtime per-
formance over the PROFIBUS protocol for the bursty error models. And even for the independent
error model the round-robin protocols deliver good realtime performance (as compared to the other
error models), while the better PROFIBUS performance comes at the price of suppressing low priority
traffic at higher number of stations.
These results establish round-robin protocols as a good starting point for the design of protocols with
improved realtime performance. In this section we investigate the effects of the mechanisms proposed
in Section 7.2, using all the error models (independent, Gilbert-Elliot, Semi-Markov, complex) defined
in Section 7.1.1.
The results reported in this section are all obtained with the polling-simulator described in Section
7.3.1. We compare the round-robin protocol first with each single mechanism, and finally with a
protocol with all three proposals (SDR, SPR, adaptive functional repoll) enabled.
In the following, the simple k-limited round robin protocol is denoted as rrk; rrk augmented with
the SPR protocol is denoted as rrk+spr, and accordingly rrk+sdr for the SDR protocol. The adaptive
functional repoll protocol with a specific round-robin bound k is denoted as fr-k, the class of protocols
is denoted as frk.
The common parameters for all protocols are summarized in Table 7.5. The SPR protocol incorporates
a memory-loss function g(·), indicating the influence of older history information. This function was
chosen to be

$$g(x) = \mathbf{1}_{[0,1000]}(x) \cdot \exp\left(-\frac{x}{100}\right)$$

The history depth is set to 20. The timeout $T_{Quiet}$, after which the BS starts an explicit-poll cycle,
is set to three times $T_{MaxCycle}$, where $T_{MaxCycle}$ denotes the maximum time a single cycle for the
k-limited round robin case can take for N stations:

$$T_{MaxCycle} = N \cdot (\text{controlpacketsize} + k \cdot (\text{datapacketsize} + \text{ackpacketsize}))$$
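The SPR parameters can be written down directly. The sketch below assumes that the memory-loss function decays, i.e., that the exponent is -x/100, and that the three packet "sizes" in the cycle bound are given as transmission durations in a common unit; the function names are hypothetical.

```python
import math


def g(x):
    """Memory-loss function: exponential decay, cut off outside [0, 1000]."""
    return math.exp(-x / 100.0) if 0 <= x <= 1000 else 0.0


def t_max_cycle(n, k, t_control, t_data, t_ack):
    """Longest k-limited round-robin cycle for n stations (durations in one common unit)."""
    return n * (t_control + k * (t_data + t_ack))


def t_quiet(n, k, t_control, t_data, t_ack):
    """Timeout after which the BS starts an explicit-poll cycle: three maximum cycles."""
    return 3 * t_max_cycle(n, k, t_control, t_data, t_ack)


# Example with purely illustrative durations (arbitrary units).
quiet = t_quiet(n=20, k=1, t_control=1, t_data=2, t_ack=1)
```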
For the frk protocols several functions need to be provided: the memory-loss function h(·) and the
set of repoll functions. Furthermore, the update frequency $T_{AR}$ has to be chosen. For this work an
update is performed after every 100 poll frames (i.e., one of poll, explicit-poll, invited-poll,
execute-explicit-poll). The function h(·) is set to $h(x) = e^{-x/20}$, where the number 20 is related to
the max retry parameter. The window size for the moving average channel quality estimation is set
to 40. As the set of possible repoll functions to choose from in the frk protocols we have considered
the functions summarized in Table 7.6.
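The repoll function set of Table 7.6 can be expressed compactly in code. The sketch assumes that the value -1 marks "no further repoll", that the offsets read i - 1, 2i - 1, 4i - 1 and i^2, and that max retry = 20 as in Table 7.5; the Python names are hypothetical.

```python
MAX_RETRY = 20    # max retry parameter, as in Table 7.5
NO_REPOLL = -1    # assumed marker for "do not repoll any more"


def bounded_by_max_retry(f):
    """Wrap f so that it returns NO_REPOLL for every trial beyond MAX_RETRY."""
    def wrapped(i):
        return NO_REPOLL if i > MAX_RETRY else f(i)
    return wrapped


# i is the trial number (1, 2, ...); the result is the offset of the i-th repoll.
immediate = bounded_by_max_retry(lambda i: i - 1)
bounded_immediate = bounded_by_max_retry(lambda i: i - 1 if 1 <= i <= 3 else NO_REPOLL)
fast_linear = bounded_by_max_retry(lambda i: 2 * i - 1)
slow_linear = bounded_by_max_retry(lambda i: 4 * i - 1)
quadratic = bounded_by_max_retry(lambda i: i * i if 1 <= i <= 5 else NO_REPOLL)

REPOLL_FUNCTIONS = {
    "ir": immediate,
    "bir": bounded_immediate,
    "cr1": fast_linear,
    "cr2": slow_linear,
    "qr": quadratic,
}
```

A dictionary of this kind is one possible representation of the candidate set from which the frk protocols choose a function per channel.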
Comparison of Round-Robin with Round-Robin+SPR
The method of comparison of rrk+spr with rrk is simple: for every investigated number of stations the
rrk+spr value $\widetilde{DC}(\text{rrk+spr})$ is divided by the corresponding value $\widetilde{DC}(\text{rrk})$ for the
rrk protocol, giving the ratio $\widetilde{DC}(\text{rrk+spr}) / \widetilde{DC}(\text{rrk})$.
Table 7.5: Common simulation parameters for performance comparison of the different protocol
modifications rrk+x vs. k-limited round robin

Parameter                            Values
-----------------------------------  ----------------------------------------------------
Number of stations                   2, 4, 8, 12, 16, 20
round-robin bound k                  1, 2, 4, 6
Modifications                        SDR, SPR, fr-k, SDR+SPR+fr-k
channel models                       independent, Gilbert-Elliot, Semi-Markov, complex
load models                          smooth (10% low priority load), smooth (50% low priority)
max retry parameter (low, high)      20
Bit rate                             2 MBit/s (QPSK modulation)

Table 7.6: Repoll function set for the frk protocols; for all functions additionally $f_x(i) = -1$ if
i > max retry holds

Name                           Expression
-----------------------------  ------------------------------------------------------------
immediate repolling            $f_{ir}(i) = i - 1$
bounded immediate repolling    $f_{bir}(i) = \mathbf{1}_{[1,3]}(i) \cdot (i - 1) + (-1) \cdot \mathbf{1}_{(3,\infty)}(i)$
fast linear repoll             $f_{cr1}(i) = 2 \cdot i - 1$
slow linear repoll             $f_{cr2}(i) = 4 \cdot i - 1$
quadratic repoll               $f_{qr}(i) = \mathbf{1}_{[1,5]}(i) \cdot i^2 + (-1) \cdot \mathbf{1}_{(5,\infty)}(i)$

First we look at the $B_L$ performance measure, indicating for the 50% low priority load case the
fraction of bandwidth remaining for low priority traffic (see Section 4.3.1). For the Gilbert-Elliot
model we first
observe from Figure 7.15 that the additional cost in terms of $B_L$ performance of the SPR mechanism
in the rrk+spr protocols as compared to the basic rrk protocols is below 3.5%, which we assume to
be acceptable.^7 The bandwidth loss of rrk+spr is larger for smaller k values. This is clearly due to
the better ratio of data frames to poll frames achievable with larger k values. The same observation
is true for the Semi-Markov model, with the bandwidth loss approaching 4%. For the complex error
model the bandwidth loss is below 1%.
In general, for all the protocols rrk+spr, rrk+sdr, frk, frk+spr+sdr and all error models the bandwidth
overhead of the respective additional mechanism is below 4% as compared to rrk. Hence, these
mechanisms incur only small penalties in terms of $B_L$ performance. Interestingly, for the rrk+sdr
protocol and the frk protocol actually slightly more bandwidth was available for the low priority
traffic, giving a small gain in $B_L$ performance.
For the 50% low priority load case the ratio $\widetilde{DC}(\text{rrk+spr}) / \widetilde{DC}(\text{rrk})$ vs. number of
stations N is shown in Figures 7.14, 7.17, 7.19, and 7.21 for the Gilbert-Elliot, Semi-Markov,
independent, and complex error models, respectively. The following observations are interesting:
For k2 the rrk+spr protocol achieves a real gain for all error models. For the Gilbert-Elliot
and the Semi-Markov model this gain is significant (rrk+spr shows for k= 6 and N4 less
than half the g
DCvalue of rrk, for k= 2 still savings of 30% to 40% can be achieved, and even
k= 1 saves between 10% and 20%), for independent errors up to 20% savings for higher number
of stations could be reached, and for the complex error model for N8 between 15% and 30%
can be saved.
⁷ This fraction likely depends on the $T_{Quiet}$ parameter. This is a subject of further research.
                                                 50% low priority load   10% low priority load
# of execute-explicit-poll frames                         1902                   29916
# of successful execute-explicit-poll frames              1453                   23077
# of null answers to explicit-poll frames                 1388                   22771
Table 7.7: execute-explicit-poll frame statistics for rrk+spr, independent error model, N = 20, k = 1
• For all error models the rrk+spr protocols achieve a higher gain for higher values of k, with
  k = 4 and k = 6 giving better gains than k = 2, which in turn beats k = 1. In all cases the
  performance gain tends to increase with the number of stations.
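For reading the ratio plots it may help to convert a plotted ratio into a relative saving. The relation below only restates the definition of the ratio; the symbol $s$ is introduced here for illustration and is not part of the original notation.

```latex
s \;=\; 1 - \frac{\widetilde{D_C}(\text{rrk+spr})}{\widetilde{D_C}(\text{rrk})},
\qquad
\text{ratio } 0.5 \;\Rightarrow\; s = 50\% \text{ (saving)},
\qquad
\text{ratio } 1.1 \;\Rightarrow\; s = -10\% \text{ (loss)}.
```

A ratio below 1 therefore means that the modified protocol has the smaller overall confirmation delay.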
For the 10% low priority load case (see Figures 7.13, 7.16, 7.18, and 7.20) the picture changes:
• For $k \geq 2$ and the Gilbert-Elliot and Semi-Markov models, the rrk+spr protocols make up to 30%
  gains over the rrk protocols in $\widetilde{D_C}$ performance. The best gains are achieved for
  k = 2, the smallest for k = 6. For the independent and the complex error model rrk+spr has an up
  to 10% worse $\widetilde{D_C}$ performance than rrk.
• For k = 1 the rr-1+spr protocol sometimes produces $\widetilde{D_C}$ losses of 10% to 25%
  (Gilbert-Elliot, independent, complex) as compared to rr-1, while for the Semi-Markov model gains
  of up to 12% could be observed.
The different behavior of the 10% low priority load and the 50% low priority load case can be explained
by the operation of the round-robin protocols: as pointed out in Section 7.2.1, when a WT is polled
and has no packet to transmit, it keeps quiet. However, for the BS this is indistinguishable from
the case that the WT has not properly received the poll frame. If the overall load is low, a WT
has a higher probability of seeing no newly arriving requests within a certain time; hence, the
probability of the SPR mechanism querying a WT with empty queues is higher. In these cases the
SPR mechanism is pure overhead without any gain. To illustrate this, we show in Table 7.7 for
N = 20, k = 1, the independent error model and the rrk+spr protocol some statistics about the
occurrence of execute-explicit-poll frames. It can be seen that for the 10% low priority load case
we have more than 15 times as many of these frames as for the 50% low priority load case. In this
table, an execute-explicit-poll frame is counted as successful if the target station actually
responds to the explicit-poll frame sent by the relayer station.
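The counts of Table 7.7 can be turned into the ratios used in this argument. The sketch below is only arithmetic on the published numbers; it assumes that the null answers are a subset of the successful execute-explicit-poll frames, which the table suggests but does not state. The names `TABLE_7_7`, `frame_ratio`, `success_share` and `null_share` are illustrative.

```python
# Frame counts copied from Table 7.7 (rrk+spr, independent error model, N = 20, k = 1).
TABLE_7_7 = {
    "50% load": {"sent": 1902, "successful": 1453, "null_answers": 1388},
    "10% load": {"sent": 29916, "successful": 23077, "null_answers": 22771},
}


def frame_ratio(table):
    """How many times more execute-explicit-poll frames occur at 10% load than at 50% load."""
    return table["10% load"]["sent"] / table["50% load"]["sent"]


def success_share(row):
    """Fraction of execute-explicit-poll frames for which the target station responded."""
    return row["successful"] / row["sent"]


def null_share(row):
    """Fraction of successful frames that ended in a null answer.

    Assumption: null answers are counted among the successful frames.
    """
    return row["null_answers"] / row["successful"]


if __name__ == "__main__":
    print(f"frame ratio 10% vs. 50% load: {frame_ratio(TABLE_7_7):.2f}")
    for load, row in TABLE_7_7.items():
        print(f"{load}: success share {success_share(row):.3f}, null share {null_share(row):.3f}")
```

With these numbers the frame ratio is about 15.7, and under the stated assumption more than 95% of the successful relayed polls return a null answer in both load cases, which fits the reading that most SPR invocations query a WT with empty queues.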
This points to a general tradeoff: if a single WT’s load is high, the SPR protocol is invoked only rarely
for this WT; hence, only little extra bandwidth is spent. If a single WT’s load is comparably low, the
SPR protocol is going to be invoked more often (depending on the $T_{Quiet}$ parameter). If there is
one subset of WT’s with high loads and another subset with low loads, the $T_{Quiet}$ parameter has a
direct influence on the remaining bandwidth available for the high-load WT’s and on the
$\widetilde{D_C}$ performance of the low-load WT’s.
Figure 7.13: Ratio of the overall confirmation delay of the rrk+spr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+spr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
10% low priority load, Gilbert-Elliot error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.14: Ratio of the overall confirmation delay of the rrk+spr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+spr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
50% low priority load, Gilbert-Elliot error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.15: Ratio of the remaining bandwidth for low priority data of the rrk+spr protocol to that of
the rrk protocol, $B_L(\text{rrk+spr}) / B_L(\text{rrk})$, vs. number of wireless terminals N for 50%
low priority load, Gilbert-Elliot error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.16: Ratio of the overall confirmation delay of the rrk+spr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+spr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
10% low priority load, Semi-Markov error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.17: Ratio of the overall confirmation delay of the rrk+spr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+spr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
50% low priority load, Semi-Markov error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.18: Ratio of the overall confirmation delay of the rrk+spr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+spr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
10% low priority load, independent error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.19: Ratio of the overall confirmation delay of the rrk+spr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+spr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
50% low priority load, independent error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.20: Ratio of the overall confirmation delay of the rrk+spr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+spr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
10% low priority load, complex error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.21: Ratio of the overall confirmation delay of the rrk+spr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+spr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
50% low priority load, complex error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Comparison of Round-Robin with Round-Robin+SDR
The approach for comparing rrk and rrk+sdr is the same as in the previous section, namely, we
look at the ratios $\widetilde{D_C}(\text{rrk+sdr}) / \widetilde{D_C}(\text{rrk})$.
For the 10% low priority load case the ratio $\widetilde{D_C}(\text{rrk+sdr}) / \widetilde{D_C}(\text{rrk})$
vs. the number of stations N is shown in Figures 7.22, 7.24, 7.26, and 7.28 for the Gilbert-Elliot,
Semi-Markov, independent, and complex error models, respectively. The same is shown for the 50% low
priority load case in Figures 7.23, 7.25, 7.27, and 7.29, respectively.
For all loads and all error models we observe that the rrk+sdr protocol makes the biggest gains for
k = 1. Furthermore, for k = 1 there are always real gains, sometimes up to 26% for the “artificial
models”, and up to 55% for the complex model (see Figure 7.28).⁸ This can be explained by the fact
that the SDR approach performs a kind of immediate retransmission, whereas in the rr-1 protocol a
retransmission has to wait one poll cycle. For k = 2 it can also happen frequently that a retransmission
has to wait one poll cycle, and indeed for k = 2 rrk+sdr achieves the second-best results almost all
the time.
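The remark that a retransmission "has to wait one poll cycle" can be made concrete with a toy slot model of k-limited round robin. The sketch below is an idealization and not the simulation model of this chapter: it assumes that a station spends all of its k slots per cycle on the same frame, that every other station occupies a fixed number of slots per cycle, and that poll frames cost no time. The function name `attempt_slot` and the parameter `slots_per_other` are hypothetical.

```python
def attempt_slot(attempt: int, k: int, n_stations: int, slots_per_other: int = 1) -> int:
    """Slot index (0-based) at which a station makes its `attempt`-th try (1-based) for one frame.

    Toy model: the station gets k consecutive slots per poll cycle, then all
    other stations are served (slots_per_other slots each) before it is polled again.
    """
    if attempt < 1 or k < 1 or n_stations < 1:
        raise ValueError("attempt, k and n_stations must be positive")
    cycle_length = k + (n_stations - 1) * slots_per_other
    full_cycles, position = divmod(attempt - 1, k)
    return full_cycles * cycle_length + position


if __name__ == "__main__":
    n = 10
    for k in (1, 2, 4, 6):
        slots = [attempt_slot(a, k, n) for a in range(1, 5)]
        print(f"k={k}: slots of attempts 1..4 = {slots}")
```

In this model, with N = 10 and k = 1, the first retransmission only happens a full cycle later (slot 10), while with k = 2 the first retransmission is immediate (slot 1) and only the second one waits for the next cycle (slot 11). This is the waiting time that an immediate retransmission scheme can avoid for small k.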
For k = 4 and k = 6, for both loads and all error models, we observe comparably small gains (up
to 10%) or even slight losses (up to 6%). Furthermore, the curves show no clear dependence on the
number of stations. Even for the complex error model the gain for k = 4 reaches only 10%. For higher
k values a WT can most often handle a high priority request fully within one token cycle, in contrast
to the smaller k values. Hence, it occurs less often that the SDR scheme can save a full poll cycle.
The bias of the SDR scheme towards the largest gains for k = 1 and k = 2 makes its usage beneficial
for high load situations. To justify this, we again observe from the comparison with the PROFIBUS
protocol (see Section 7.3.2, Figures 7.8, 7.9, and 7.10) that the absolute values of the
$\widetilde{D_C}$ measure for the round-robin protocols are better for small k values. And it is
specifically in these cases that rrk+sdr gives gains.
As an overall impression, the rrk+sdr scheme seems to be not very sensitive to the load, but more
to the error model. For the same error model the curves for the different low priority loads tend to
be similar in shape and order of magnitude. The curves for different error models and fixed load
differ much more. Hence, the error model affects rrk+sdr performance more than the overall load
does.
⁸ Here is an interesting problem for further research: one can guess that the gains of SDR are larger for scenarios
with heterogeneous channel error models than with homogeneous ones.
Figure 7.22: Ratio of the overall confirmation delay of the rrk+sdr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
10% low priority load, Gilbert-Elliot error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.23: Ratio of the overall confirmation delay of the rrk+sdr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
50% low priority load, Gilbert-Elliot error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.24: Ratio of the overall confirmation delay of the rrk+sdr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
10% low priority load, Semi-Markov error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.25: Ratio of the overall confirmation delay of the rrk+sdr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
50% low priority load, Semi-Markov error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.26: Ratio of the overall confirmation delay of the rrk+sdr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
10% low priority load, independent error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.27: Ratio of the overall confirmation delay of the rrk+sdr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
50% low priority load, independent error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.28: Ratio of the overall confirmation delay of the rrk+sdr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
10% low priority load, complex error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.29: Ratio of the overall confirmation delay of the rrk+sdr protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{rrk+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for
50% low priority load, complex error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Comparison of Round-Robin with Functional Repolling
While the SPR and SDR relaying schemes modify the set of WT’s involved in data transmission and
the polling process, the functional repolling scheme (frk, see Sections 7.2.2 and 7.2.4) varies the polling
sequence and decisions on when to perform retransmissions.
For the 10% low priority load case the ratio $\widetilde{D_C}(\text{frk}) / \widetilde{D_C}(\text{rrk})$
vs. the number of stations N is shown in Figures 7.30, 7.32, 7.34, and 7.36 for the Gilbert-Elliot,
Semi-Markov, independent, and complex error models, respectively. The same is shown for the 50% low
priority load case in Figures 7.31, 7.33, 7.35, and 7.37, respectively.
From these figures it can be seen that the frk scheme produces worse $\widetilde{D_C}$ performance for
k = 4 and k = 6 for all numbers of stations, all error models and all load cases. In contrast, for
k = 1 we always make gains, and for k = 2 we observe some gains only in the complex error model (up
to 20% for certain values of N).
To explain the gains for k = 1 and k = 2, we observe from the set of repolling functions listed in Table
7.6 that all functions have the tendency to make the first retransmissions comparably fast
($f(1) \leq 3$ for all the listed functions). For k = 1 and k = 2 this is, for N large enough,
typically faster than the next time a WT is polled in the normal round-robin protocols.
For higher k values the $\widetilde{D_C}$ performance suffers from two design decisions:
• to announce the full number of k slots to a WT a currently repolled.
• to not interrupt a station for repoll cycles: if the BS sends a poll frame with round-robin bound
  k to WT a at time $t_0$, and the next repoll cycle is due at $t_0 + 1$, then the repoll has to wait
  for a maximum of k data transmissions of a before it takes place. Stated differently: repolls are
  subject to blocking, a poll cycle to a WT a is not preempted.
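The blocking effect described in the second item can be illustrated with a small slot calculation. This is a sketch under simplifying assumptions that are not taken from the protocol specification: time is counted in slots, the poll frame occupies slot t0, the polled WT uses all k following slots for data, and a repoll can start in the first slot after that. The function name `repoll_blocking` is hypothetical.

```python
def repoll_blocking(t0: int, k: int, due: int) -> int:
    """Number of slots a repoll that becomes due at `due` is delayed.

    Toy model: slot t0 carries the poll frame, slots t0+1 .. t0+k carry the
    (at most k) data transmissions of the polled WT, and the poll cycle is
    never preempted.
    """
    if k < 1:
        raise ValueError("round-robin bound k must be positive")
    if due <= t0:
        raise ValueError("the repoll is assumed to become due after the poll frame was sent")
    earliest_free_slot = t0 + k + 1
    return max(0, earliest_free_slot - due)


if __name__ == "__main__":
    for k in (1, 2, 4, 6):
        print(f"k={k}: repoll due at t0+1 waits {repoll_blocking(0, k, 1)} slots")
```

In this model a repoll that is due at t0 + 1 waits exactly k slots in the worst case, so the blocking grows linearly with the round-robin bound, which is in line with the losses observed for k = 4 and k = 6.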
Reducing these blocking times is a topic for further research. One possible approach is that the BS
interrupts the poll cycle of a WT a if it determines that a is going to perform a retransmission of a
low priority frame. To interrupt the cycle, the BS sends a poll frame to a’s successor. By the simple
local priority scheduler’s operation it is clear that a has no high priority requests in its queues at
this time. The BS can leave a in the normal poll list, while the functional repolling protocol is still
executed for the high priority requests.
Figure 7.30: Ratio of the overall confirmation delay of the frk protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{frk}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for 10%
low priority load, Gilbert-Elliot error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.31: Ratio of the overall confirmation delay of the frk protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{frk}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for 50%
low priority load, Gilbert-Elliot error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.32: Ratio of the overall confirmation delay of the frk protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{frk}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for 10%
low priority load, Semi-Markov error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.33: Ratio of the overall confirmation delay of the frk protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{frk}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for 50%
low priority load, Semi-Markov error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.34: Ratio of the overall confirmation delay of the frk protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{frk}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for 10%
low priority load, independent error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.35: Ratio of the overall confirmation delay of the frk protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{frk}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for 50%
low priority load, independent error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.36: Ratio of the overall confirmation delay of the frk protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{frk}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for 10%
low priority load, complex error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Figure 7.37: Ratio of the overall confirmation delay of the frk protocol to that of the rrk protocol,
$\widetilde{D_C}(\text{frk}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless terminals N for 50%
low priority load, complex error model and different round robin bounds k (curves for
k = 1, 2, 4, 6)
Comparison of Round-Robin with Functional Repolling+SDR+SPR
In the previous sections we have evaluated the three mechanisms functional repolling, SPR and SDR
separately. It was shown that they often achieve gains in $\widetilde{D_C}$ performance, but
occasionally also perform worse than rrk.
It is also interesting to assess different combinations of the mechanisms and their mutual influence.
Out of the four possible combinations to compare against rrk (SPR+SDR, frk+SPR, frk+SDR, and
frk+SPR+SDR), we have chosen to focus on the “full” frk+SPR+SDR approach. The main purpose is to
check how these approaches interfere with each other.
For the 10% low priority load case the ratio
$\widetilde{D_C}(\text{frk+spr+sdr}) / \widetilde{D_C}(\text{rrk})$ vs. the number of stations N is
shown in Figures 7.38, 7.40, 7.42, and 7.44 for the Gilbert-Elliot, Semi-Markov, independent, and
complex error models, respectively. The same is shown for the 50% low priority load case in Figures
7.39, 7.41, 7.43, and 7.45, respectively.
The overall result is that the combined approach frk+spr+sdr delivers superior performance over the
round-robin protocol for the Gilbert-Elliot and the Semi-Markov error model. For the complex and the
independent model we achieve almost always gains for k= 1 and k= 2, while k= 4 and k= 6 show
significantly worse performance for the independent error model (probably due to the frk protocols,
which show their worst performance for the independent error model).
A highlight is the saving for the 50% low priority load case and the Gilbert-Elliot and Semi-Markov
models specifically for k= 1 and k= 2 as compared to the rrk+spr protocol (see Figures 7.39 and
7.14 for the Gilbert-Elliot model, and Figures 7.41 and 7.17 for the Semi-Markov model).
The most impressive result would have been that the frk+spr+sdr approach is always better than
rrk+x for x being SPR, SDR or frk. This result could not be achieved due to the problems with the
frk approach for k = 4 and k = 6. When only looking at k = 1 and k = 2, we state without showing
the curves in this thesis that:
• frk+spr+sdr is superior to rrk+spr for k = 1 and k = 2 for all error models, load cases and
  all N. Typically the savings are better for k = 1.
• frk+spr+sdr is superior to rrk+sdr for k = 1 and k = 2 for all error models and load cases
  and all N, except for the complex and independent models at 10% low priority load and the
  independent model at 50% load. However, the advantage of rrk+sdr for the independent model
  is only given for N = 2; for all other N the frk+spr+sdr protocol performs better.
• frk+spr+sdr is superior to frk for k = 1 and k = 2 for all error models, load cases and N,
  except for the complex error model at 10% low priority load (here frk is superior only for N = 2)
  and the independent error model at 10%.
The savings of the frk+spr+sdr protocol over the other protocols can reach up to 60%. The worse
performance of the “full” protocol occurs mostly for N = 2; for higher numbers of WT’s N the combined
protocol is superior. Hence, for k = 1 and k = 2 the protocols influence each other in a constructive
way.
Figure 7.38: Ratio of the overall confirmation delay of the frk+spr+sdr protocol to that of the rrk
protocol, $\widetilde{D_C}(\text{frk+spr+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless
terminals N for 10% low priority load, Gilbert-Elliot error model and different round robin bounds k
(curves for k = 1, 2, 4, 6)
Figure 7.39: Ratio of the overall confirmation delay of the frk+spr+sdr protocol to that of the rrk
protocol, $\widetilde{D_C}(\text{frk+spr+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless
terminals N for 50% low priority load, Gilbert-Elliot error model and different round robin bounds k
(curves for k = 1, 2, 4, 6)
Figure 7.40: Ratio of the overall confirmation delay of the frk+spr+sdr protocol to that of the rrk
protocol, $\widetilde{D_C}(\text{frk+spr+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless
terminals N for 10% low priority load, Semi-Markov error model and different round robin bounds k
(curves for k = 1, 2, 4, 6)
Figure 7.41: Ratio of the overall confirmation delay of the frk+spr+sdr protocol to that of the rrk
protocol, $\widetilde{D_C}(\text{frk+spr+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless
terminals N for 50% low priority load, Semi-Markov error model and different round robin bounds k
(curves for k = 1, 2, 4, 6)
Figure 7.42: Ratio of the overall confirmation delay of the frk+spr+sdr protocol to that of the rrk
protocol, $\widetilde{D_C}(\text{frk+spr+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless
terminals N for 10% low priority load, independent error model and different round robin bounds k
(curves for k = 1, 2, 4, 6)
Figure 7.43: Ratio of the overall confirmation delay of the frk+spr+sdr protocol to that of the rrk
protocol, $\widetilde{D_C}(\text{frk+spr+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless
terminals N for 50% low priority load, independent error model and different round robin bounds k
(curves for k = 1, 2, 4, 6)
Figure 7.44: Ratio of the overall confirmation delay of the frk+spr+sdr protocol to that of the rrk
protocol, $\widetilde{D_C}(\text{frk+spr+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless
terminals N for 10% low priority load, complex error model and different round robin bounds k
(curves for k = 1, 2, 4, 6)
Figure 7.45: Ratio of the overall confirmation delay of the frk+spr+sdr protocol to that of the rrk
protocol, $\widetilde{D_C}(\text{frk+spr+sdr}) / \widetilde{D_C}(\text{rrk})$, vs. number of wireless
terminals N for 50% low priority load, complex error model and different round robin bounds k
(curves for k = 1, 2, 4, 6)
Summary of Results
The difference in remaining bandwidth for low priority data $B_L$ of all three modifications as compared
to rrk is always below 4%, and the SDR mechanism sometimes even gives more remaining bandwidth
to the low priority data. In comparison to the significant gains (10%-60%) in $\widetilde{D_C}$
performance the slight losses in $B_L$ performance are well tolerable.
The single modifications often achieve gains in $\widetilde{D_C}$ performance as compared to rrk.
Specifically:
• The rrk+spr protocol achieves for 50% low priority loads the best gains (more than 50%) for
  $k \geq 2$; for k = 1 the gains are smaller and sometimes performance is worse. For the 10% low
  priority load case the gains are less impressive (<30%), and worse performance is observed more
  often. The rrk+spr protocol seems to work better for high loads.
• The rrk+sdr protocol is not load sensitive and achieves better gains for smaller k (up to 55%
  for k = 1 and the complex model), and smaller gains or sometimes even performance losses for
  k = 4 and k = 6. For k = 2 gains of typically 5% up to 15% are observed.
• The frk algorithm pays off only for k = 1 and k = 2; for higher k values it produces worse
  $\widetilde{D_C}$ performance due to blocking effects. For k = 1 the gains are due to the tendency
  to have the first retransmission earlier than the next polling instant. For k = 2 the best gains
  occur in the complex error model, hence, in a situation where transmission problems occur not
  “symmetrically” on all links, but only a subset of bad links needs to be corrected.
• The combined frk+spr+sdr always makes gains over rrk for k = 1 and k = 2. For k = 4 and
  k = 6 we often observe losses, due to the problems with frk. When comparing frk+spr+sdr
  with each of rrk+spr, rrk+sdr, and frk, we find that we almost always make gains for k = 1 and
  k = 2; hence, combining the protocols enhances $\widetilde{D_C}$ performance as compared to every
  single modification.
These results justify the heuristic recommendation to use k = 2 as the basic protocol parameter.
This value seems to be a good compromise, since k = 4 and k = 6 suffer from blocking problems
(specifically in frk) and for k = 1 a single request can often span multiple token cycles, thereby
increasing delays. Furthermore, k = 2 has less protocol overhead than k = 1 and often shows (close
to) the second-best $\widetilde{D_C}$ performance. The value k = 2 allows for one immediate
retransmission. This seems to be a good heuristic for bursty channels: if one trial and one
retransmission do not suffice, then it is good to proceed with other stations.
7.4 Polling-based MACs for wireless PROFIBUS
So far we have investigated the polling-based protocols with only loose reference to the PROFIBUS.
For example, we have proposed a modified variant of the alternating bit protocol for the wireless
stations, and we have not discussed the resulting integration and semantic issues so far. However,
these need to be considered when looking at the integrated scenario, where a wireless segment with a
polling-based protocol needs to be integrated at the MAC and link layer with the original PROFIBUS
protocol.
7.4.1 Integration Issues
As described in Section 7.1.1 the polling protocols employ a modified version of the PROFIBUS
alternating bit protocol. This modification lets every WT maintain two instances of the ABP per
target station, instead of one. Specifically, a WT maintains for every priority class a separate protocol
instance.
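The bookkeeping implied by "two instances of the ABP per target station" can be sketched as a table keyed by target station and priority class. This is a minimal illustration, not the implementation used in this work; the class names `Priority` and `AbpInstances`, the method names, and the initial bit value 0 are assumptions made for the example.

```python
from collections import defaultdict
from enum import Enum


class Priority(Enum):
    LOW = "low"
    HIGH = "high"


class AbpInstances:
    """One alternating bit per (target station, priority class) pair."""

    def __init__(self):
        # Assumption: every protocol instance starts with bit value 0.
        self._bits = defaultdict(int)

    def current_bit(self, target: int, priority: Priority) -> int:
        """Alternating bit to use for the next frame (or retransmission) to `target`."""
        return self._bits[(target, priority)]

    def acknowledge(self, target: int, priority: Priority) -> int:
        """Toggle the bit after a frame was acknowledged and return the new value."""
        key = (target, priority)
        self._bits[key] ^= 1
        return self._bits[key]


if __name__ == "__main__":
    abp = AbpInstances()
    abp.acknowledge(target=7, priority=Priority.HIGH)
    print(abp.current_bit(7, Priority.HIGH), abp.current_bit(7, Priority.LOW))
```

Because the two instances are independent, a high priority exchange with a target leaves the low priority bit for the same target untouched, so an interrupted low priority transmission can later resume with its old alternating bit.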
In the integrated scenario (see Section 4.3) a polling-based protocol (on the wireless side) and the
PROFIBUS protocol (on the wired side) are required to run in a single PROFIBUS LAN. Hence,
some kind of translation is needed, which is best performed in the BS-IWU. We assume that the base
station functionality of the polling-based protocols is a part of / colocated with the BS-IWU.
Let us first consider the case of a wired master polling a wireless slave. The wired master sets the
alternating bit in the frame control (FC) byte of a PROFIBUS data frame or acknowledgement frame
(see Section 4.1.4). This byte includes not only the ABP information, but also indicates the priority
of the frame. This information is sufficient for the BS-IWU to construct a corresponding data
frame on the wireless side. This can be carried out during the frame forwarding process.
The case of a wireless master (e.g., a wireless plant diagnosis station) polling a wired slave is more
complicated. A problem occurs when a WT a starts with a low priority transmission to a wired slave
x, and gets interrupted by an arriving high priority request for the same target x (all other targets
pose no problem). By the rules of the polling protocol the WT interrupts the low priority transmission,
continues with the high priority transmission and resumes the low priority transmission with its old
alternating bit after finishing the high priority request. The BS-IWU is not only required to perform
the forwarding of frames between the wired and wireless segments, but it should also monitor the
frame exchange and perform bookkeeping of the alternating bits sent by a to x. If the BS receives
at time $t_0$ the first high priority frame from a to x after receiving low priority frames from a to
x, it marks the corresponding PROFIBUS telegram as a new one by toggling the alternating bit. The low
priority frame sent immediately before $t_0$ is called $L_1$. When WT a finishes the high priority
transfer (or a sequence of high priority transfers) and the BS receives a low priority frame from a to
x at time $t_1$ (this frame is called $L_2$), there are two possibilities: if $L_2$ is distinct from
$L_1$, the BS-IWU just marks the corresponding PROFIBUS telegram for $L_2$ as a new frame by toggling
the alternating bit. If $L_2$ is a retransmission of $L_1$, there are three different cases:
• The BS-IWU received an ack frame from x for $L_1$. In this case we can assume that x’s ack frame
  got lost on the wireless link. The BS-IWU suppresses forwarding of $L_2$ and repeats the ack to
  a instead.
• The BS-IWU has not received any signal at all from x for any transmission (original frame or
  retransmissions) of $L_1$ sent out before $t_0$. In this case the BS-IWU can be sure that x has
  never properly received the frame. Hence, it can safely mark this frame as a new one by toggling
  the alternating bit.
• Finally, in the case where the BS-IWU has received a signal from x, but no valid ack frame to
  $L_1$ or its predecessors, it cannot know whether x has received the low priority frame properly
  or not. It is a matter of policy what to do:
  - The BS-IWU can suppress $L_2$ and its retransmissions, in order to avoid duplicates at x.
    This approach makes sense if some losses of low priority frames can be tolerated.
  - The BS-IWU can mark $L_2$ as a new frame, introducing the danger of duplicates at x.
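The case distinction for a resumed low priority frame can be summarized as a small decision function. The sketch below only encodes the cases listed above; the names `Action`, `AmbiguousPolicy` and `handle_resumed_low_priority_frame`, as well as the boolean inputs, are hypothetical and say nothing about how a BS-IWU would actually detect these conditions.

```python
from enum import Enum


class Action(Enum):
    FORWARD_AS_NEW = "forward the frame with a toggled alternating bit"
    REPEAT_ACK = "suppress the frame and repeat the stored ack to the WT"
    SUPPRESS = "suppress the frame and its retransmissions"


class AmbiguousPolicy(Enum):
    AVOID_DUPLICATES = "accept possible losses of low priority frames"
    AVOID_LOSSES = "accept possible duplicates at the wired slave"


def handle_resumed_low_priority_frame(
    is_retransmission: bool,
    ack_received: bool,
    signal_received: bool,
    policy: AmbiguousPolicy,
) -> Action:
    """Decide what to do with the low priority frame received after a high priority burst.

    is_retransmission: the frame repeats the last low priority frame sent before the burst.
    ack_received:      the wired slave sent a valid ack for that earlier frame.
    signal_received:   some signal from the wired slave was seen, valid or not.
    """
    if not is_retransmission:
        return Action.FORWARD_AS_NEW
    if ack_received:
        return Action.REPEAT_ACK  # the ack was presumably lost on the wireless link
    if not signal_received:
        return Action.FORWARD_AS_NEW  # the slave never received the frame
    # Ambiguous case: a signal was seen, but no valid ack.
    if policy is AmbiguousPolicy.AVOID_DUPLICATES:
        return Action.SUPPRESS
    return Action.FORWARD_AS_NEW


if __name__ == "__main__":
    print(handle_resumed_low_priority_frame(True, False, True, AmbiguousPolicy.AVOID_DUPLICATES))
```

Only the last branch depends on policy; the other three outcomes follow directly from what the BS-IWU has observed on the wired segment.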
Some further issues which have to be resolved in the integrated scenario are the following:
• Choice of polling activity: the BS polls WT’s when the PROFIBUS token is logically in the
  wireless segment. It would be beneficial to also use those time slots where only wired stations
  are involved in a frame exchange. The BS-IWU can determine this by inspecting all frames on
  the wired segment. If a data frame of sufficient length is sent from one wired station to another,
  the BS-IWU can initiate some parallel activity on the wireless medium. However, it must be
  ensured that no WT sends any frame to a wired station during this time. As an example, these
  time slots can be used for registration frames sent by newly arriving WT’s to the BS.
• During registration the BS collects some static information from the WT’s, e.g., their PROFIBUS
  station type (active or passive, see Section 4.1.4), vendor, product id, and so forth. This enables
  the BS to answer corresponding inquiry frames from wired stations on behalf of the WT’s, and
  also to keep ring maintenance frames away from the wireless segment.
• The BS-IWU has to perform mimicry functions, as described in Section 4.3.
7.4.2 Semantics
The polling-based protocols discussed in this chapter break with the PROFIBUS semantics in some
respects:
The atomicity property is violated in order to allow high priority requests to preempt low priority
requests, and due to the possibility of having k < max retry, meaning that it can take multiple
token cycles to serve one request. Without preemption a low priority request can block high
priority requests for a long time if the round-robin bound k is small (k = 1 or k = 2) and the
max retry parameter for low priority requests has moderate to large values (5 to 20). This makes
the max retry setting in PROFIBUS a rather delicate issue, since for error-prone channels one has to
find a balance between blocking times (low values) and reliability requirements for high priority
requests (high values).
Different max retry parameters for low and high priority requests are introduced. This can affect
PROFIBUS management.
The invited-poll scheme as proposed in Section 7.2.2 has the property that it can change the
sequence of confirmations: the confirmations c1 and c2 to the high priority requests r1 and r2
(with r1 occurring before r2) can be exchanged such that c2 occurs before c1.
7.5 Related Work
There exists much literature on polling systems in general, mostly concerned with their queueing
analysis. Many references can be found in [163]; the topic is treated in detail in [161], and shorter
treatments can be found in [90] and [64]. Some selected references on more general polling systems
are [8], where the sequence of visiting stations is determined by an arbitrary table, and [5], where the
general problem of determining the state (e.g. availability of data packets) of a subset of stations over
a shared medium is investigated and different methods (polling, TDMA and group testing protocols)
are compared. In [29] polling systems imposing an upper bound on the time spent for a single queue
are investigated using an embedded Markov chain approach. In [149] an analysis of message delays in
polling systems can be found. Tree polling protocols are investigated in [170]. However, typically no
transmission errors are taken into account.
The topic of packet polling systems with transmission errors is rarely treated in the literature. In [163]
some results for infinite buffer polling systems with Bernoulli feedback are presented. Under Bernoulli
feedback the central station serves, for every queue, the customer at the head of the queue. After the
service a Bernoulli experiment is performed, and the customer either leaves the system with some fixed
probability or stays at the head of the queue. For this scenario results for the mean message response time can be found in [162].
Another reference for polling systems with Bernoulli feedback is [164]. In [199] a polling system with
only downlink traffic (requested by the mobiles) is investigated under the Gilbert-Elliot error model.
The focus is on the efficiency of the protocol under high loads. Specifically, the downstream queue
size in the central station and the cycle times are investigated. Message delays are not considered. In
[67] a table-based polling protocol is investigated under the Gilbert-Elliot error model. The protocol
performs retransmissions, hence uses feedback. The polling-table accommodates synchronous and
asynchronous traffic. The mean response time and bandwidth utilization vs. load are investigated
using a simulation approach.
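As a small worked illustration of the Bernoulli feedback discipline described above (the symbols $p$ and $N$ are introduced here only for illustration, and the feedback experiments are assumed to be independent): if the customer at the head of a queue leaves the system after a service with fixed probability $p$, the number $N$ of services it receives is geometrically distributed,

```latex
\Pr[N = m] = (1-p)^{m-1}\, p \quad (m = 1, 2, \ldots),
\qquad
E[N] = \sum_{m=1}^{\infty} m\,(1-p)^{m-1}\, p = \frac{1}{p}
```

so that, for example, a departure probability of $p = 0.8$ corresponds to $1.25$ services per customer on average.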
The use of polling protocols in wireless LANs is not a new idea. However, most wireless MAC protocols
use contention-based approaches, even those supporting delay-sensitive data, e.g. wireless
ATM protocols. One of the most prominent protocols employing a polling scheme is the IEEE 802.11
standard with the point coordination function PCF, see Section 3.2. In reference [152] the authors
propose a scheme where the capturing phenomenon is explicitly used: data packets are transmitted
in round-robin fashion with a high transmission power, but only those stations which are known to
have data are included in the round-robin cycle. In parallel, the BS polls all stations at low power, and
separates the parallel signals of poll answers and data frames. This scheme requires special hardware
and is shown to decrease the delay.
7.6 Discussion and Future Work
In this chapter we made the very first steps towards the definition of a wireless PROFIBUS system. We
have proposed and investigated several polling-based protocols for use in such a system. These protocols
can be implemented using off-the-shelf components for the IEEE 802.11 DSSS PHY. We have shown
that even the simplest protocol, the k-limited round-robin protocol, has much better realtime/$\widetilde{DC}$
performance than the PROFIBUS protocol when the latter has problems with ring stability. And even
when PROFIBUS ring stability is not so critical (as for the independent error model), we have seen
that the advantages of the PROFIBUS protocols in $\widetilde{DC}$ performance come at the cost of suppressing
low priority transmissions.
The results suggest that the polling-based protocols are a good choice for the wireless part of an
integrated PROFIBUS, even if some adaptation between wired and wireless protocols is necessary.
For the bursty error models the k-limited round-robin protocol outperforms the best PROFIBUS
version by up to an order of magnitude.
Although the k-limited round-robin protocol is a much better choice than the PROFIBUS protocol
with respect to realtime performance, it has also become clear that there is room for improvement.
And indeed, the proposed protocol modifications (SDR, SPR, functional repolling) show better $\widetilde{DC}$
performance than round-robin, at small costs in BL performance (smaller than 4%). In some cases
savings of more than 50% in $\widetilde{DC}$ times as compared to round-robin are observed. Specifically, when
the improvements are applied to the case k = 2 the modifications show convincing results for most
of the load and error situations. When we look at the protocol with all modifications combined
(frk+spr+sdr), the gains can reach up to 70%.
The modifications adapt much better to the wireless environment than the PROFIBUS protocol, by
explicitly taking some of its characteristics into account. The relaying schemes make explicit use of
possibly different channel error states between different pairs of stations to circumvent bursty packet
losses and longer periods of bit errors. The results for the functional repolling approach (taking
advantage of bursty bit errors) are not convincing in all cases, but can give gains for k = 1 and k = 2.
However, the PROFIBUS has one desirable property that round-robin and the modified protocols
do not have: its $\widetilde{DC}$ performance is nearly insensitive to the low priority load and also, in the range
investigated, to the number of stations N. In contrast, the round-robin protocols have been shown to
be sensitive to the low priority load and also to the number of stations. This opens potential for
further research. As a starting point, we propose a small modification of the round-robin protocol
approximating this behavior, see below.
A tradeoff in the k-limited round-robin protocols is the choice of k: due to the WT’s simple local priority
scheduler, large k values lead to increased delays (and hence increased absolute $\widetilde{DC}$ performance),
but to better BL performance. For small k values we have the opposite behavior. When the priority
is on $\widetilde{DC}$ performance, a topic of further research could be schemes for adaptively varying k: small
k’s for high load situations, and large k’s for low load situations. When looking at the round-robin
modifications, the value k = 2 seems to be a good heuristic value.
A basic design decision for all these protocols was to put almost all the complexity and computational
burdens (channel history maintenance, packet capturing, scheduling, execution of the polling protocols,
and furthermore all the other integration-related functions, e.g. the mimicry functions) into the
BS/BS-IWU. This has the advantage that the WT’s are kept free from these functions and can have
simple and cheap implementations. A disadvantage is the introduction of a single point of failure,
which requires proper redundancy schemes. A second disadvantage is the inherent inaccuracy, when
the channel history information is only collected by the BS.
There are many interesting and open issues for the proposed protocols, and also for the design of
future protocols:
Round-Robin:
Modification of the poll frame by introducing a new bit called priority restriction bit. If this
bit is set, the polled WT is only allowed to perform data transmissions if it has high
priority requests available; otherwise it has to keep quiet. In addition, a proper control
scheme has to be implemented within the BS. As an example, this bit can be set when the
previous token cycle took more than 60% of the maximum token cycle duration.
Schemes for adapting k based on load conditions.
Relaying schemes:
Find heuristics for proper choice of memory-loss functions.
SDR: find schemes for letting other stations than the BS help in retransmitting a data
frame, e.g., using a local, address-dependent timeout timer setting in each WT and the BS,
as described in Section 7.2.3.
SPR: evaluation of alternative schemes for poll relayer selection. For example: if the best
minimum-quality route is not much better than the direct way between the BS and the
target WT, then skip the SPR protocol.
SPR: find schemes to let WT’s collect channel history information and transmit it to the
BS in order to have more accurate channel estimates.
Functional repolling:
Development of a systematic approach for finding good repoll function sets.
Other types of quality functions.
Additional feature: in the scheme described so far a WT is interrupted by the BS upon
not getting a high priority frame acknowledged, but not upon bad luck with a low priority
frame. In an alternative scheme, the BS could interrupt a WT in either case. In case of
a low priority request the WT remains in the poll list, in other cases it is moved into the
repoll list. We anticipate reduced blocking times for successor stations.
Investigation of other schemes, e.g., involving group testing protocols [5, 21] to reduce polling
overhead.
What happens if the assumption about not having near-far effects is not true, i.e. if it can
happen that two stations send in parallel and the BS receives a decodable signal?
How can the polling-schemes be used with only partially meshed topologies?
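The priority restriction bit suggested in the Round-Robin item above can be illustrated by a minimal sketch. The function names, the integer time unit and the default threshold parameter are assumptions made for this illustration only; the 60% value is the example figure given in the text, and the sketch is not a specification of the BS control scheme.

```python
# Illustrative sketch of the "priority restriction bit" control rule.
# All names are hypothetical; times are integers in an arbitrary unit.

def priority_restriction_bit(prev_cycle_time, max_cycle_time, threshold_percent=60):
    """BS side: set the bit if the previous token cycle took more than
    threshold_percent of the maximum token cycle duration."""
    return 100 * prev_cycle_time > threshold_percent * max_cycle_time


def wt_may_transmit(restriction_bit, has_high_priority_request):
    """WT side: with the bit set, only high priority requests may be sent;
    without the bit, any pending request may be sent."""
    return has_high_priority_request or not restriction_bit
```

Integer arithmetic is used for the threshold comparison to avoid floating point boundary effects; a real control scheme would probably also smooth the decision over several cycles.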
It is also worthwhile to have a closer look at the assumptions we made for the protocols and the
system model:
We have assumed that an IEEE 802.11 DSSS PHY is not capable of generating short noise
bursts. Here the term “short” refers to burst durations much less than a PLCP preamble of
128 µs length. Indeed, for the Harris/Intersil PRISM I chipset this is true. Let us assume that
future radio modems support the following: a) generation of short noise bursts, and b) the capability
to detect these bursts without the need to acquire bit synchronization. Let us furthermore
assume that we can design the MAC protocols having only positive acknowledgements, and that
all conditions leading to negative acknowledgements (e.g., lack of buffer space at the receiver)
are signalled by some other means to the transmitter. In this case we can make ack frames
much shorter by replacing them with a short noise burst. However, by doing this we lose
an occasion for the ack sender to piggyback its request queue lengths onto a frame. Another
modification is to require a polled WT to generate an answer in any case. If the WT’s queues
are empty, it generates a short noise burst. In the proposed protocols we avoided always
generating null frames for performance reasons, and have introduced the explicit-poll frame
to resolve the ambiguity between a WT a having no data and the BS having a bad channel to
WT a. When the WT is required to send short noise bursts upon empty queues, this ambiguity
is resolved, and the explicit-poll cycle could be dropped. If the BS does not receive the noise
burst for some subsequent polls, it can invoke the SPR protocol directly.
We have assumed the channels between different pairs of stations to be independent, and for a
fixed pair of stations to be symmetric (see Section 6.8.3). Although these assumptions seem to
be reasonable, they should be investigated by detailed measurements.
The assumption of having no propagation delay is not critical, since for small cell sizes (30-50
m) it is a good approximation. In practice, the propagation delay has to be considered by
introducing small guard times. These guard times do not affect the protocols proposed here, as
long as they are significantly lower than the time unit of 64 µs.
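As a rough plausibility check of the propagation delay argument above (a worked figure only, assuming free-space propagation at $c \approx 3 \cdot 10^{8}$ m/s and the largest cell size of 50 m mentioned above; the symbols $\tau$ and $d$ are introduced here for illustration):

```latex
\tau = \frac{d}{c} = \frac{50\,\mathrm{m}}{3 \cdot 10^{8}\,\mathrm{m/s}} \approx 0.17\,\mu\mathrm{s},
\qquad
\frac{\tau}{64\,\mu\mathrm{s}} \approx 0.3\,\%
```

Under these assumptions a guard time covering the one-way propagation delay is more than two orders of magnitude smaller than the 64 µs time unit.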
Clearly, besides polling-related control knobs there are many other means of achieving better trans-
mission reliability and realtime performance. Among these methods are the variation of transmit
power, modulation schemes, better preamble acquisition methods to avoid packet losses, and more.
An interesting area for further research is the use of different framing and coding methods (FEC,
interleaving schemes). Some sample schemes are:
Usage of FEC: always FEC’ing the MAC header, and only FEC’ing a data frame’s data part
on occasion of retransmissions. However, this does not help for packet losses.
A data frame’s data part can be split into several chunks, where each chunk is equipped with
a separate checksum. Correctly received chunks are stored in a buffer, and from retransmitted
frames only the missing chunks need to be successful. This way, the retransmission frame may
have bit errors, but when the missing chunks are correct, these do not hurt. By this method the
number of needed retransmissions can be reduced at the cost of a slightly increased overhead.
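The chunk-based retransmission idea in the last item can be sketched as follows. This is a minimal illustration under stated assumptions, not the framing proposed in the thesis: the chunk size, the use of CRC-32 from Python's `zlib` module as the per-chunk checksum, and all class and function names are chosen here for illustration only.

```python
import zlib

CHUNK_SIZE = 8  # hypothetical chunk size in bytes


def encode(payload, chunk_size=CHUNK_SIZE):
    """Split the data part into chunks, each paired with its own CRC-32."""
    frame = []
    for i in range(0, len(payload), chunk_size):
        part = payload[i:i + chunk_size]
        frame.append((part, zlib.crc32(part)))
    return frame


class ChunkReceiver:
    """Buffers correctly received chunks across (re)transmissions."""

    def __init__(self, n_chunks):
        self.buffer = [None] * n_chunks

    def receive(self, frame):
        """Store every still-missing chunk whose checksum matches.
        Returns True once all chunks are available."""
        for idx, (part, crc) in enumerate(frame):
            if self.buffer[idx] is None and zlib.crc32(part) == crc:
                self.buffer[idx] = part
        return all(c is not None for c in self.buffer)

    def payload(self):
        return b"".join(self.buffer)


def corrupt(frame, idx):
    """Test helper: flip one bit in chunk idx, leaving its checksum unchanged."""
    part, crc = frame[idx]
    damaged = bytes([part[0] ^ 0x01]) + part[1:]
    out = list(frame)
    out[idx] = (damaged, crc)
    return out
```

In this sketch a retransmitted frame that is itself damaged still completes the reception, as long as the damage hits only chunks that are already buffered.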
The proposed protocols form only a first step towards the definition of a wireless PROFIBUS capable
of supporting the integrated scenario. Some of the next steps are:
Formal specification and validation of the polling-based protocols, proof of the interoperability
between the PROFIBUS protocol and the polling-based protocols.
Investigation of schemes and protocols for mobility support and the associated registration
overhead and registration delay, redundancy / fault tolerance, management, security / authen-
tication, power saving, and hardware implementations.
Design of a prototypical BS-IWU implementation and field tests.
Although there is much work to do on the way to a wireless PROFIBUS, the proposed approach of
using a specifically tailored, polling-based MAC protocol on the wireless side and integrating it on
the MAC and link-layer level with the existing protocol turns out to be feasible and promising.
Chapter 8
Conclusions and Outlook
A wireless PROFIBUS would be an attractive choice for many applications. However, we are faced
with the problem of having hard realtime requirements on the one hand, and having an error-prone and
time-variable transmission medium on the other hand. These characteristics of wireless links call for
different approaches to the design of protocols and transmission schemes than for other types of links.
This is especially true for wireless fieldbus systems in general, and more specifically for a wireless
PROFIBUS system.
One of the first steps towards a wireless PROFIBUS is the choice of a wireless transmission technology.
In this thesis we have taken the leading IEEE 802.11 WLAN technology with DSSS PHY as the basis.
It turns out, however, that for use in a wireless PROFIBUS we should only adopt the PHY of 802.11
and drop the MAC layer. The next question is which MAC and link-layer protocol to run on top
of the 802.11 DSSS PHY. Taking the error-prone nature of a wireless link in general and specifically
the measurement results for an 802.11 PHY into consideration, this thesis provides an answer to this
question: the PROFIBUS protocol, a quite obvious candidate, is not well suited for this environment
(not even with some modifications) and should be replaced by a specifically tailored protocol. To show
this, we have used the notion of realtime performance, which is a set of measures reflecting timing
constraints and reliability requirements for the case of error-prone wireless links. We have shown that
the PROFIBUS MAC protocol has serious problems with the stability of the logical ring. Since only
ring members are allowed to transmit data, these problems lead to poor realtime performance.
In this thesis we suggested polling-based protocols as a starting point for the design of specifically
tailored MAC and link-layer protocols. The simplest protocol, a k-limited round-robin protocol, shows
a much better realtime performance under bursty errors than the PROFIBUS protocol; the difference
reaches up to an order of magnitude.
Furthermore, we have proposed three modifications of round-robin, which can improve the realtime
performance significantly (more than 50%) under many circumstances. For the design of these mod-
ifications the properties of the wireless link were taken into account, specifically its burstiness and
spatial diversity. The search for further improvements and other types of protocols with good realtime
performance constitutes a field for further research.
When one wants to integrate wired and wireless stations into a single PROFIBUS LAN, the approach
of using a specifically tailored MAC protocol on the wireless side while maintaining the existing
protocol for the wired stations is a far-reaching design decision. It requires some MAC-/link-layer
translation between both sides. The design space is large and worth future exploration.
A way to avoid this design decision would be to find further improvements of the PROFIBUS protocol,
which help to better overcome the deficiencies of this protocol on wireless-type links. However, we
believe that the potential of this approach is limited, specifically when keeping the need for interop-
erability with wired stations (and the unchanged protocol) in mind.
Although the results achieved in this thesis are promising, they form only the first steps towards
a wireless PROFIBUS integrating wired and wireless stations in a single LAN. Many steps have to
follow. Some of the next steps are the investigation of other modulation schemes and PHY technologies
(OFDM is an interesting candidate) and the impact of mobility on the wireless PROFIBUS protocol
stack and also on the achievable realtime performance.
There is another aspect, which is not only important for wireless PROFIBUS, but also for another
interesting class of networks with realtime requirements, namely, the class of sensor networks (e.g.,
the PicoRadio system developed at the Berkeley Wireless Research Center [142]). The issue of power
saving and minimizing energy consumption is a central design goal for these systems. It adds another
dimension to the tradeoff between realtime requirements on the one hand and the error behavior of a
wireless link on the other hand.
In summary, this thesis has made the following contributions:
We have shown that the PROFIBUS MAC and link layer protocol has serious problems with
the stability of the logical ring over lossy / wireless links. We have proposed two protocol
improvements, which can significantly relax the problems on wired-type, error-prone links, but
on wireless-type links the ring stability is still unsatisfactory, which in turn leads to degraded
realtime performance.
We suggested using specifically tailored MAC and link layer protocols on top of an IEEE
802.11 DSSS PHY to replace the PROFIBUS protocol. We advocated the class of polling-based
protocols as a candidate.
We have argued that constructing a mapping between PROFIBUS link layer services and the
IEEE 802.11 DCF/PCF MAC protocol is not a viable solution.
We have given a characterization of, and approaches to the stochastic modeling of, a wireless link using
real-world traces taken in an industrial environment. The results were used:
as design input for polling-based MAC protocols.
for parameterization of popular stochastic link error models.
A new class of stochastic models with increased modeling precision at moderate complexity costs
was proposed.
We have defined the notion of realtime performance, which takes timeliness and reliability over
wireless-type links into account. This notion provides an important optimization goal for wireless
MAC protocols targeted at hard realtime / industrial applications.
We have presented approaches for polling-based protocols and shown that:
A simple k-limited round-robin protocol shows substantially better realtime performance
than the PROFIBUS protocol under several error assumptions. The difference can reach
up to an order of magnitude.
The realtime performance of k-limited round-robin can be further improved. We have
proposed three modifications of round-robin. We have evaluated these modifications under
several error and load assumptions, and shown that they achieve under many circumstances
a significantly (>50%) better realtime performance than round-robin.
These results justify the suggestion to replace the PROFIBUS protocol by another protocol.
Appendix A
Main Characteristics of the
Bipartite Model
It is convenient to recall the necessary definitions for the bipartite model. Let $n_1$ be the number of
“bad” states, $n_2$ the number of good states, and $n = n_1 + n_2$. For simplicity the states are numbered
from 1 to $n$. The transition matrix $P$ has the form
\[
P = \begin{pmatrix} 0 & Q_1 \\ Q_2 & 0 \end{pmatrix}
\]
with $Q_1$ being an $n_1 \times n_2$ stochastic matrix, and $Q_2$ being an $n_2 \times n_1$ stochastic matrix.
To every state $i$ there is associated an interval $I_i$ of the natural numbers, and a probability distribution
$p_i(k) = \Pr[p_i = k]$ (with $k \in \mathbb{N}$ and $p_i(k) = 0$ for $k \notin I_i$) generating values in this interval. This
probability distribution has the distribution function $F_i(x) = \Pr[p_i \le x]$.
A.1 Asymptotic Behavior
From the description in Section 6.7.2 it is easy to see that $P$ generates a periodic Markov chain with
period 2, and thus has no steady state. But the bipartite model satisfies a weaker form of steady-state
condition. In this section we show that in the long run, under certain assumptions, for each state $i$ the
fraction of the number of visits in state $i$ w.r.t. the total number $k$ of state transitions so far converges
with probability one to a fixed value $a_i$.
For the following we need the observation that for $k \in \mathbb{N}_0$ we have:
\[
P^{2k} = \begin{pmatrix} (Q_1 \cdot Q_2)^k & 0 \\ 0 & (Q_2 \cdot Q_1)^k \end{pmatrix}
\]
and
\[
P^{2k+1} = \begin{pmatrix} 0 & Q_1 \cdot (Q_2 \cdot Q_1)^k \\ Q_2 \cdot (Q_1 \cdot Q_2)^k & 0 \end{pmatrix}
\]
which can easily be proved by induction. We assume that the Markov chains generated by $Q_1 \cdot Q_2$ and
$Q_2 \cdot Q_1$ are ergodic (i.e., aperiodic and positive recurrent) and thus the limits $A := \lim_{k \to \infty} (Q_1 \cdot Q_2)^k$
and $B := \lim_{k \to \infty} (Q_2 \cdot Q_1)^k$ exist (clearly, $A$ and $B$ are also stochastic matrices). The matrices $A$
and $B$ have the specific feature that all their rows are identical, i.e., within every column all elements
have the same value [115, chap. 8].
Using this, and the fact that $Q_1$ and $Q_2$ are stochastic matrices, it is easy to see that $A' := Q_2 \cdot A$
has the same number of columns as $A$, and again all its rows are identical, namely equal to the
common row of $A$. An analogous property holds for $B' := Q_1 \cdot B$.
Let $n := n_1 + n_2$, let $(\pi^0)^T \in \mathbb{R}^n$ with $\pi^0 = (\pi^0_1, \ldots, \pi^0_n)$ be a stochastic vector describing the initial state
distribution, let $z_0 z_1 z_2 z_3 \ldots$ be a sample path of the Markov chain (i.e. $z_j \in \{s_1, \ldots, s_n\}$), and let
$e_1, \ldots, e_n$ denote the unit vectors of $\mathbb{R}^n$. For simplicity we assume that the states $s_1$ to $s_n$ are given
by the natural numbers from 1 to $n$. Furthermore, denote by $1_S(x)$ the indicator function of the set $S$,
i.e. $1_S(x) = 1$ if $x \in S$ and 0 otherwise. Define the vector $a_k$ by
\[
a_k = e_{z_0} + \sum_{i=1}^{k} \sum_{j=1}^{n} 1_{\{s_j\}}(z_i)\, e_j = e_{z_0} + \sum_{j=1}^{n} e_j \sum_{i=1}^{k} 1_{\{s_j\}}(z_i)
\]
i.e. it counts in coordinate $j$ of $a_k$ how often the system was in state $j$ during the first $k$ state
transitions of the given sample path. We define $a'_k := \frac{1}{k} a_k$ and interpret the $j$-th coordinate of $a'_k$
as the fraction of the number of visits in state $j$ w.r.t. the total number $k$ of state transitions so far.
We write the $j$-th coordinate of a vector $a$ as $[a]_j$, and for a matrix $O$ the matrix element in the $i$-th
row and $j$-th column is written as $[[O]]_{i,j}$. The long-term fraction of visits in state $j$ (as an example we
choose $j = 1$) is then defined as:
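\[
a_1 := \lim_{k \to \infty} [a'_k]_1 = \lim_{k \to \infty} \frac{1}{k} \left( [e_{z_0}]_1 + \sum_{i=1}^{k} 1_{\{s_1\}}(z_i) \right)
\]
provided the limit exists. It is convenient to rewrite this in order to drop the influence of the start of
the sample path. Let $\nu \in \mathbb{N}$ be fixed; for $1 < 2\nu < k$ we can write:
\[
[a'_k]_1 = \frac{1}{k} \left( [e_{z_0}]_1 + \sum_{i=1}^{2\nu-1} 1_{\{s_1\}}(z_i) \right)
+ \frac{1}{k} \left( \sum_{i=2\nu}^{k+2\nu} 1_{\{s_1\}}(z_i) \right)
- \frac{1}{k} \left( \sum_{i=k+1}^{k+2\nu} 1_{\{s_1\}}(z_i) \right)
\tag{A.1}
\]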
For $k$ large enough we can drop the first and last term of Equation A.1, since in both cases the sum in
the braces is $\le 2\nu$ and $\frac{2\nu}{k} \to 0$ for $k \to \infty$. Without loss of generality we assume $k$ to be even, $k = 2l$
and $l > 2\nu$. Then we can split the sum in the middle term of Equation A.1 into even and odd terms:
\[
[a'_k]_1 = \frac{1}{2l} \left( \sum_{i=0}^{l} 1_{\{s_1\}}(z_{2\nu+2i}) + \sum_{i=0}^{l} 1_{\{s_1\}}(z_{2\nu+2i+1}) \right)
\tag{A.2}
\]
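The terms in the first sum are independent random variables, the same holds for the second sum. We
consider the first sum, denoted as $S_l = \sum_{i=0}^{l} 1_{\{s_1\}}(z_{2\nu+2i})$. For every random variable $1_{\{s_1\}}(z_{2\nu+2i})$
we have that
\[
\Pr[1_{\{s_1\}}(z_{2\nu+2i}) = 1] = \left[ \pi^0 \cdot P^{2\nu+2i} \right]_1
= \left[ \pi^0 \cdot \begin{pmatrix} (Q_1 \cdot Q_2)^{\nu+i} & 0 \\ 0 & (Q_2 \cdot Q_1)^{\nu+i} \end{pmatrix} \right]_1
\]
An analogous equation holds for the second sum. Since we have assumed the existence of $A$ and $B$, for
$\nu$ large enough we can replace $(Q_1 \cdot Q_2)^{\nu+i}$ by $A$ and $(Q_2 \cdot Q_1)^{\nu+i}$ by $B$. Now each random variable
$1_{\{s_1\}}(z_{2\nu+2i})$ is an independent Bernoulli random variable with fixed probability
\[
c := \Pr[1_{\{s_1\}}(z_{2\nu+2i}) = 1] = \left[ \pi^0 \cdot \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \right]_1
\]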
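Thus the first sum of Equation A.2 is a sum of independent Bernoulli random variables. The strong
law of large numbers [47, chap. 8] asserts that for $l \to \infty$ the random variable $\frac{S_l}{l}$ converges to
$E\left[ 1_{\{s_1\}}(z_{2\nu+2i}) \right] = c$ with probability one, i.e., for almost all sample paths. The same calculations
can be done for the second sum of Equation A.2. Putting this together, it is shown that
\[
a_1 = \lim_{k \to \infty} [a'_k]_1 \xrightarrow{\text{a.s.}}
\frac{1}{2} \left( \left[ \pi^0 \cdot \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \right]_1
+ \left[ \pi^0 \cdot \begin{pmatrix} 0 & B' \\ A' & 0 \end{pmatrix} \right]_1 \right)
\]
where $\xrightarrow{\text{a.s.}}$ denotes almost sure convergence, i.e., convergence with probability one.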
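It remains to show that the values $a_1$ to $a_n$ are nonnegative and sum up to one. The nonnegativity
follows immediately from the fact that all coefficients of $A$, $B$, $A'$, $B'$, and $\pi^0$ are nonnegative by
definition. The sum of $a_1$ to $a_n$ is given as
\begin{align*}
\sum_{i=1}^{n} a_i
&= \frac{1}{2} \sum_{i=1}^{n} \left( \left[ \pi^0 \cdot \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \right]_i
+ \left[ \pi^0 \cdot \begin{pmatrix} 0 & B' \\ A' & 0 \end{pmatrix} \right]_i \right)
= \frac{1}{2} \left( \sum_{i=1}^{n} \left[ \pi^0 \cdot \begin{pmatrix} A & B' \\ A' & B \end{pmatrix} \right]_i \right) \\
&= \frac{1}{2} \left( \sum_{i=1}^{n_1} \left[ \pi^0 \cdot \begin{pmatrix} A & B' \\ A' & B \end{pmatrix} \right]_i
+ \sum_{i=n_1+1}^{n} \left[ \pi^0 \cdot \begin{pmatrix} A & B' \\ A' & B \end{pmatrix} \right]_i \right) \\
&= \frac{1}{2} \left( \sum_{i=1}^{n_1} [[A]]_{1,i} + \sum_{i=n_1+1}^{n} [[B]]_{1,i-n_1} \right)
= \frac{1}{2} (1 + 1) = 1
\end{align*}
In the last step we have used that $\pi^0$ is a stochastic vector, that all rows of $A$ are identical (and
likewise all rows of $B$), and that the rows of $A'$ and $B'$ coincide with those of $A$ and $B$, respectively.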
If we have the values $a_1$ up to $a_n$ it is straightforward to compute, e.g., the mean bit error rate,
the mean error burst length or the mean error-free burst length from the process generated by the
bipartite model. If $P_i := E[p_i]$ is the mean state holding time of state $i$, then the long-term fraction
of time the system is in state $i$ is proportional to $a_i P_i$, and the mean bit error rate is given by
\[
\bar{m} = \frac{\sum_{i=1}^{n} r_i a_i P_i}{\sum_{i=1}^{n} a_i P_i}
\]
However, a drawback of the bipartite model is that these mean values cannot be represented in a
“simple” and “intuitive” manner from the values of P, as is the case for the Gilbert/Elliot model.
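The computation of the visit fractions and of the mean bit error rate described above can be sketched numerically. This is a minimal illustration under stated assumptions: the matrices `Q1` and `Q2`, the per-state error rates `r` (taken here to be the bit error rate within each state) and the mean holding times `P_mean` are made-up example values, the limits $A$ and $B$ are approximated by a fixed number of matrix multiplications, and all function names are chosen here for illustration.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def limit_power(M, steps=200):
    """Approximate lim M^k by repeated multiplication (assumes M is ergodic)."""
    R = M
    for _ in range(steps):
        R = matmul(R, M)
    return R


def visit_fractions(Q1, Q2, pi0):
    """Long-run fractions a_1..a_n of visits per state of the bipartite chain."""
    n1, n2 = len(Q1), len(Q2)
    A = limit_power(matmul(Q1, Q2))   # n1 x n1, identical rows
    B = limit_power(matmul(Q2, Q1))   # n2 x n2, identical rows
    A_prime = matmul(Q2, A)           # n2 x n1
    B_prime = matmul(Q1, B)           # n1 x n2
    # Block matrices diag(A, B) and [[0, B'], [A', 0]].
    even = [A[i] + [0.0] * n2 for i in range(n1)] + \
           [[0.0] * n1 + B[i] for i in range(n2)]
    odd = [[0.0] * n1 + B_prime[i] for i in range(n1)] + \
          [A_prime[i] + [0.0] * n2 for i in range(n2)]
    v_even = matmul([pi0], even)[0]
    v_odd = matmul([pi0], odd)[0]
    return [0.5 * (e + o) for e, o in zip(v_even, v_odd)]


def mean_bit_error_rate(a, r, P_mean):
    """Mean bit error rate: sum(r_i a_i P_i) / sum(a_i P_i)."""
    num = sum(ri * ai * Pi for ri, ai, Pi in zip(r, a, P_mean))
    den = sum(ai * Pi for ai, Pi in zip(a, P_mean))
    return num / den


# Hypothetical example: two "bad" states (1, 2) and two good states (3, 4).
Q1 = [[0.7, 0.3], [0.4, 0.6]]          # bad -> good transitions
Q2 = [[0.5, 0.5], [0.2, 0.8]]          # good -> bad transitions
r = [0.5, 0.1, 0.0, 0.0]               # assumed per-state bit error rates
P_mean = [5.0, 20.0, 100.0, 1000.0]    # assumed mean holding times (bits)

a = visit_fractions(Q1, Q2, [1.0, 0.0, 0.0, 0.0])
m_bar = mean_bit_error_rate(a, r, P_mean)
```

In this sketch the resulting fractions do not depend on the initial distribution, and the bad and the good states each receive half of all visits, as expected for a chain that strictly alternates between the two classes.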
A.2 Distribution of Generated Burst Lengths
The following calculation shows that the distribution functions $F_X(\cdot)$ and $F_Y(\cdot)$ of the error-free burst
length and error burst length distributions can be approximated with arbitrary precision by choosing
a proper number of states and proper distribution functions for the single intervals. We show this only for the
error burst lengths; the calculations for the error-free burst lengths are identical.
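The distribution function of the generated error burst lengths $Y'$ in step $k$ can be calculated with the
law of total probability:
\begin{align*}
\Pr[Y' \le y \mid z_k \in \{1, \ldots, n_1\}]
&= \frac{\Pr[Y' \le y \wedge (\{z_k = 1\} \cup \ldots \cup \{z_k = n_1\})]}{\Pr[\{z_k = 1\} \cup \ldots \cup \{z_k = n_1\}]} \\
&= \frac{\Pr[Y' \le y \wedge \{z_k = 1\}] + \ldots + \Pr[Y' \le y \wedge \{z_k = n_1\}]}{\Pr[z_k = 1] + \ldots + \Pr[z_k = n_1]} \\
&= \frac{\Pr[Y' \le y \mid z_k = 1] \Pr[z_k = 1] + \ldots + \Pr[Y' \le y \mid z_k = n_1] \Pr[z_k = n_1]}{\Pr[z_k = 1] + \ldots + \Pr[z_k = n_1]} \\
&= \frac{F_1(y) \Pr[z_k = 1] + \ldots + F_{n_1}(y) \Pr[z_k = n_1]}{\Pr[z_k = 1] + \ldots + \Pr[z_k = n_1]}
\end{align*}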
Hence, the distribution of the generated error burst lengths $Y'$ can be represented in a simple manner
as a linear combination of the approximating distributions $F_i(\cdot)$. By properly selecting the $F_i(\cdot)$ the
distribution $F_Y(\cdot)$ can be well approximated.
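The linear combination derived above can be illustrated with a small sketch. The choice of discrete uniform distributions on the intervals, the example state probabilities and the function names are assumptions made for this illustration only; the bipartite model itself allows arbitrary distributions on each interval.

```python
import math


def uniform_cdf(lo, hi):
    """Distribution function of a discrete uniform distribution on {lo, ..., hi}."""
    def F(y):
        if y < lo:
            return 0.0
        if y >= hi:
            return 1.0
        return (math.floor(y) - lo + 1) / (hi - lo + 1)
    return F


def burst_length_cdf(state_probs, cdfs):
    """Mixture F(y) = sum_i F_i(y) Pr[z_k = i] / sum_i Pr[z_k = i]
    over the bad states i = 1..n1."""
    total = sum(state_probs)

    def F(y):
        return sum(p * Fi(y) for p, Fi in zip(state_probs, cdfs)) / total
    return F


# Hypothetical example: two bad states with burst lengths in {1..4} and {5..20}.
F_burst = burst_length_cdf([0.3, 0.1], [uniform_cdf(1, 4), uniform_cdf(5, 20)])
```

Here the first state carries three quarters of the probability mass of the bad states, so three quarters of the generated error bursts have a length of at most 4.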
A.3 Correlation Properties
In order to show that the state process generated by $P$ has short-term or fast decaying correlation
properties, we assume that $P$ is diagonalizable, i.e. there exist two $n \times n$ matrices $R$ and $D$ such that
$R^{-1}$ exists and $D$ is a diagonal matrix with the eigenvalues $\lambda_i$ of $P$ on the diagonal and $P = R^{-1} \cdot D \cdot R$
holds.¹ Since $P$ is a stochastic matrix, $|\lambda_i| \le 1$ holds for all eigenvalues [158, chapter 1.6]. Then the
autocorrelation function of the generated process can be represented as
\begin{align*}
R(k) = E[z_0 z_k]
&= \sum_{i=1}^{n} \sum_{j=1}^{n} ij \Pr[z_0 = i, z_k = j]
= \sum_{i=1}^{n} \sum_{j=1}^{n} ij \Pr[z_0 = i] \Pr[z_k = j \mid z_0 = i] \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} ij\, [\pi^0]_i \left[ e_i \cdot P^k \right]_j
= \sum_{i=1}^{n} \sum_{j=1}^{n} ij\, [\pi^0]_i \left[ e_i \cdot R^{-1} \cdot D^k \cdot R \right]_j
\end{align*}
The influence of the eigenvalues $\lambda_i$ with $|\lambda_i| < 1$ plays a role for small $k$ and vanishes for $k \to \infty$. The
eigenvalue $\lambda_1 = 1$ contributes the “mean” value of the generated state process.
Now we show that in the asymptotic case the generated process becomes independent of its
start state $z_0$, hence it loses correlation. To do this we have to show that for $k$ large enough
$E[z_0 z_k] = E[z_0] E[z_k]$ holds.
For calculating the long-term correlation we first observe that for the case of $k$ even and large enough
we have
\begin{align*}
E[z_0 z_k]
&= \sum_{i=1}^{n} \sum_{j=1}^{n} ij\, [\pi^0]_i \left[ e_i \cdot P^k \right]_j
= \sum_{i=1}^{n} \sum_{j=1}^{n} ij\, [\pi^0]_i \left[ e_i \cdot \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \right]_j \\
&= \sum_{i=1}^{n_1} \sum_{j=1}^{n_1} ij\, [\pi^0]_i\, [[A]]_{1,j}
+ \sum_{i=n_1+1}^{n} \sum_{j=n_1+1}^{n} ij\, [\pi^0]_i\, [[B]]_{1,j-n_1}.
\end{align*}
¹ The assumption that $P$ is diagonalizable is just for simplicity. Without this, the calculations are more involved. An
approach would be to switch into the complex domain and use the Jordan normal form of $P$ instead.
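In the last equation we have used that all rows of $A$ are identical, and likewise all rows of $B$. On
the other hand we have
\begin{align*}
E[z_0] E[z_k]
&= \left( \sum_{i=1}^{n} i\, [\pi^0]_i \right) \cdot \sum_{j=1}^{n} j \left[ \pi^0 \cdot \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \right]_j \\
&= \left( \sum_{i=1}^{n} i\, [\pi^0]_i \right) \cdot
\left( \sum_{j=1}^{n_1} j\, [[A]]_{1,j} \sum_{l=1}^{n_1} [\pi^0]_l
+ \sum_{j=n_1+1}^{n} j\, [[B]]_{1,j-n_1} \sum_{l=n_1+1}^{n} [\pi^0]_l \right)
\end{align*}
Using both equations, it is straightforward to verify that for all possible system start states $\pi^0 = e_\nu$
with $\nu \in \{1, \ldots, n\}$ the relation $E[z_0 z_k] = E[z_0] E[z_k]$ holds and thus there is no correlation.
Furthermore, this calculation also holds true for $k$ odd. Hence, there is no long-term correlation.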
Appendix B
Applicability of Simple FEC
Schemes to the Measurement
Traces
This appendix is an addendum to the evaluations of the measurement traces presented in Chapter 6.
We investigate here the feasibility of simple block FEC schemes when applied to the measurement
results.
In block FEC schemes [97] a block of $k$ user bits is mapped onto $n$ code bits, with $n > k$. For
general block codes the Hamming bound [97, chap. 3] applies, which states that up to $t$ errors can
be corrected in a codeword of $n$ bits length and $k$ user bits only if the following relation holds:
\[
2^{n-k} \ge \sum_{i=0}^{t} \binom{n}{i}
\]
The fact that a triple $(n, k, t) \in \mathbb{N}^3$ satisfies this relation does not imply that a code with these
properties really exists. The ratio $\frac{k}{n}$ is denoted as the code rate. In the following, we restrict ourselves to the
case of $n \in \{8, \ldots, 32\}$. This restriction is somewhat arbitrary but can be justified by the observation
that in industrial communications frequently very small packets are used (e.g., short PROFIBUS
frames), i.e. $k$ is often small.
With BPSK modulation in most cases a single bit error is surrounded by many correct bits (see Section
6.5.6). Hence, it suffices to look at the case $t = 1$. By the Hamming bound, the best achievable code
rate $\frac{k}{n}$ for $t = 1$ is 84% ($(n, k, t) = (31, 26, 1)$). Stated differently, if every packet is transmitted with
FEC, the overhead is at least 16% for $t = 1$. In Table B.1 we show the mean and maximum packet
error rate (PER) (all packets with at least one bit error) for BPSK and QPSK with and without
scrambling.¹ For the investigated BPSK traces the PER is below 5.5%. Hence, applying FEC to all
packets is wasteful and should be restricted to retransmissions.

Modulation / scrambling      Mean PER    Max. PER
BPSK w/o scrambling          0.9%        6%
BPSK w/ scrambling           6.4%        25.6%
QPSK w/o scrambling          7.7%        20.7%
QPSK w/ scrambling           3.2%        14.7%
Table B.1: Mean packet error rate (PER) and max. PER for QPSK and BPSK modulation and
different scrambling modes
For QPSK, bursts of 14 or 16 bits length with two bit errors occur often (see Section 6.5.6); hence,
we consider the case $t = 2$. The best code rate achievable for $t = 2$ and $n \in \{8, \ldots, 32\}$ is 71%
($(n, k, t) = (31, 22, 2)$). Again, comparing with the PERs reported in Table B.1 it is easy to see that
applying FEC to all packets is wasteful.
To summarize, the findings about error densities and the fact that error-free burst lengths are sometimes
very long suggest that FEC should be enabled only for retransmissions. Furthermore, since
errors show longer-term correlation, FEC should stay enabled for a while (suitable history information
should be used).
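The text leaves open what "suitable history information" would look like. As one possible reading, the Python sketch below switches FEC on after a packet error, so that the retransmission is protected, and keeps it on until a fixed number of consecutive packets have been received correctly. The class name `FecGate` and the parameter `hold` are hypothetical; neither the mechanism nor any concrete value for `hold` is taken from the thesis.

```python
class FecGate:
    """Hypothetical policy: FEC off by default, on for a while after an error."""

    def __init__(self, hold: int = 10):
        # Assumed tuning knob: consecutive error-free packets needed
        # before FEC is switched off again.
        self.hold = hold
        self.remaining = 0  # packets for which FEC stays enabled

    def use_fec(self) -> bool:
        """Should the next transmission (or retransmission) carry FEC?"""
        return self.remaining > 0

    def report(self, ok: bool) -> None:
        """Feed back the outcome of the last transmission."""
        if not ok:
            # Error seen: (re)arm FEC; this covers the retransmission too.
            self.remaining = self.hold
        elif self.remaining > 0:
            self.remaining -= 1


if __name__ == "__main__":
    gate = FecGate(hold=2)
    for outcome in (True, False, True, True, True):
        print("FEC on" if gate.use_fec() else "FEC off", "-> ok" if outcome else "-> error")
        gate.report(outcome)
```

A simple counter is used here only because it is the smallest form of history; a policy based on measured burst-length statistics would be an equally valid interpretation.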
¹ One would expect increasing PERs for increasing packet sizes. This is only true for QPSK without scrambling.
Bibliography
[1] Imad Aad and Claude Castelluccia. Introducing Service Differentiation into IEEE 802.11. In Proc.
Fifth IEEE Symposium on Computers and Communications (ISCC 2000), Antibes, France, July
2000.
[2] Richard L. Abrahams. 2.4GHz 11Mbps MACless DSSS Radio HWB1151 Users Guide -
AN9835.1. Intersil, 1999.
[3] G. Agrawal, B. Chen, W. Zhao, and S. Davari. Guaranteeing synchronous message deadlines with
the timed token medium access control protocol. IEEE Transactions on Computers, 43(2):327–339,
March 1994.
[4] Lars Ahlin and Jens Zander. Principles of Wireless Communications. Studentlitteratur, Lund,
Sweden, 1998.
[5] Mostafa H. Ammar and George N. Rouskas. On the performance of protocols for collecting
responses over a multiple-access channel. IEEE Transactions on Communications, 43(2):412–
420, February 1995.
[6] Giuseppe Anastasi, Luciano Lenzini, Enzo Mingozzi, Andreas Hettich, and Andreas Krämling.
MAC protocols for wideband wireless local access: Evolution towards wireless ATM. IEEE Personal
Communications, 5(5):53–64, October 1998.
[7] Jorgen Bach Andersen, Theodore S. Rappaport, and Susumu Yoshida. Propagation Measure-
ments and Models for Wireless Communications Channels. IEEE Communications Magazine,
33(1):42–49, January 1995.
[8] Joseph E. Baker and Izhak Rubin. Polling with a general-service order table. IEEE Transactions
on Communications, 35(3):283–288, March 1987.
[9] David F. Bantz and Frederic J. Bauchot. Wireless LAN design alternatives. IEEE Network
Magazine, 8(3):43ff, 1994.
[10] K.A. Bartlett, R.A. Scantlebury, and P.T. Wilkinson. A note on reliable full-duplex transmission
over half duplex lines. Communications of the ACM, 12(5):260ff, 1969.
[11] Klaus Bender. PROFIBUS - Der Feldbus für die Automation, volume 2. Carl Hanser Verlag,
München, second edition, 1992.
[12] Michael Berry, Andrew T. Campbell, and Andras Veres. Distributed control algorithms for
service differentiation in wireless packet networks. In Proc. INFOCOM 2001, Anchorage, Alaska,
April 2001. IEEE.
[13] H.-P. Beuerle and G. Bach-Bezenar. Kommunikation in der Automatisierungstechnik. Siemens
Aktienges., Berlin; München, 1991.
[14] Pravin Bhagwat, Partha Bhattacharya, Arvind Krishna, and Satish K. Tripathi. Using channel
state dependent packet scheduling to improve TCP throughput over wireless LANs. Wireless
Networks, 3(1):91–102, March 1997.
[15] Giuseppe Bianchi. Throughput Evaluation of the IEEE 802.11 Distributed Coordination Function.
In Proc. Fifth International Workshop on Mobile Multimedia Communication (MoMuC’98),
pages 307–318, Berlin, Germany, 1998.
[16] Janos Bito. Digitale Mobilfunk-Kanalmodelle unter besonderer Berücksichtigung von adaptiven
digitalen Modellen. Dissertation, Technische Universität Berlin, Department of Electrical Engineering,
December 1996.
[17] Kenneth L. Blackard, Theodore S. Rappaport, and Charles W. Bostian. Measurements and
models of radio frequency impulsive noise for indoor wireless communications. IEEE Journal
on Selected Areas in Communications, 11(7):991–1001, September 1993.
[18] George E. P. Box, Gwilym M. Jenkins, and Gregory C. Reinsel. Time Series Analysis - Forecasting
and Control. Holden-Day, San Francisco, third edition, 1994.
[19] Paul T. Brady. A model for generating on-off speech patterns in two-way conversation. Bell
Systems Technical Journal, 48, September 1969.
[20] F. Cali, M. Conti, and E. Gregori. IEEE 802.11 wireless LAN: Capacity analysis and protocol
enhancement. In Proc. INFOCOM 1998, San Francisco, April 1998. IEEE.
[21] J. I. Capetanakis. Tree Algorithm for Packet Broadcast Channels. IEEE Transactions on
Information Theory, 25(5):505–515, September 1979.
[22] S. Cavalieri and D. Panno. On the integration of fieldbus traffic within IEEE 802.11 wireless LAN.
In Proc. 1997 IEEE International Workshop on Factory Communication Systems (WFCS’97),
Barcelona (Spain), 1997.
[23] James K. Cavers. Mobile Channel Characteristics. Kluwer Academic Publishers, Boston, Dor-
drecht, 2000.
[24] CCITT. Recommendation Z.100: Specification and Description Language SDL. ITU General
Secretariat, 1988.
[25] Cheng-Shang Chang, Kwang-Cheng Chen, Ming-Young You, and Jin-Fu Chang. Guaranteed
quality-of-service wireless access to ATM networks. IEEE Journal on Selected Areas in Communications,
15(1):106–118, January 1997.
[26] Charles Chien, Mani B. Srivastava, Rajeev Jain, Paul Lettieri, Vipin Aggarwal, and Robert
Sternowski. Adaptive Radio for Multimedia Wireless Links. IEEE Journal on Selected Areas in
Communications, 17(5):793–813, May 1999.
[27] Brian P. Crow, Indra Widjaja, Jeong Geun Kim, and Prescott T. Sakai. IEEE 802.11 wireless
local area networks. IEEE Communications Magazine, 35(9):116–126, September 1997.
[28] Klaus David and Thorsten Benkner. Digitale Mobilfunksysteme. Informationstechnik. B.G.
Teubner, Stuttgart, 1996.
[29] Edmundo de Souza e Silva, H. Richard Gail, and Richard R. Muntz. Polling systems with
server timeouts and their application to token passing networks. IEEE/ACM Transactions on
Networking, 3, October 1995.
[30] Jean-Dominique Decotignie and Patrick Pleineveaux. A survey on industrial communication
networks. Ann. Telecomm., 48(9):435ff, 1993.
[31] Dr-Jiunn Deng and Ruay-Shiung Chang. A priority scheme for IEEE 802.11 DCF access method.
IEICE Transactions on Communications, E82-B(1):96–102, January 1999.
[32] H. Dietsch. Feldbus. Informatik Spektrum, 13:217ff, 1990.
[33] DIN - Deutsches Institut für Normung, Beuth Verlag Berlin. DIN 19245 Teil 1 - PROFIBUS:
Übertragungstechnik, Buszugriffs- und Übertragungsprotokoll, Dienstschnittstelle zur Anwendungsschicht,
Management, April 1991.
[34] DIN - Deutsches Institut für Normung, Beuth Verlag Berlin. DIN 19245 Teil 2 - PROFIBUS:
Kommunikationsmodell, Dienste für die Anwendung, Protokoll, Syntax, Codierung, Schnittstelle
zur Schicht 2, Management, April 1991.
[35] DIN - Deutsches Institut für Normung, Beuth Verlag Berlin. PROFIBUS-DP - Process Field
Bus Decentralised Periphery (DP) - Part 3, Draft Standard DIN 19245, April 1993.
[36] DIN - Deutsches Institut für Normung, Beuth Verlag Berlin. DIN 19258 Teil 1 - INTERBUS-S,
Sensor-/Aktornetzwerk für industrielle Steuerungssysteme - System-Architektur, May 1994.
Entwurf.
[37] DIN - Deutsches Institut für Normung, Beuth Verlag Berlin. DIN 19258 Teil 2 - INTERBUS-S,
Sensor-/Aktornetzwerk für industrielle Steuerungssysteme - Physical Layer
(Bitübertragungsschicht), May 1994. Entwurf.
[38] DIN - Deutsches Institut für Normung, Beuth Verlag Berlin. DIN 19258 Teil 3 - INTERBUS-S,
Sensor-/Aktornetzwerk für industrielle Steuerungssysteme - Data Link Layer (Sicherungsschicht),
May 1994. Entwurf.
[39] D. Duchamp and N. F. Reynolds. Measured performance of wireless LAN. In Proc. of 17th Conf.
on Local Computer Networks, Minneapolis, 1992.
[40] David Eckhardt and Peter Steenkiste. Measurement and analysis of the error characteristics of
an in-building wireless network. In Proc. of ACM SIGCOMM’96 Conference, pages 243–254,
Stanford University, California, August 1996.
[41] David A. Eckhardt and Peter Steenkiste. A trace-based evaluation of adaptive error correction
for a wireless local area network. MONET - Mobile Networks and Applications, 4:273–287, 1999.
[42] E. O. Elliot. Estimates of error rates for codes on burst-noise channels. Bell Systems Technical
Journal, 42:1977–1997, September 1963.
[43] ETSI. High Performance Radio Local Area Network (HIPERLAN) - Draft Standard. ETSI,
March 1996.
[44] ETSI. TR 101 683, HIPERLAN Type 2: System Overview. ETSI, February 2000.
[45] ETSI. TS 101 475, BRAN, HIPERLAN Type 2: Physical (PHY) Layer. ETSI, March 2000.
[46] Andras Farago, Andrew D. Myers, Violet R. Syrotiuk, and Gergely V. Zaruba. Meta-MAC
Protocols: Automatic Combination of MAC Protocols to Optimize Performance for Unknown
Conditions. IEEE Journal on Selected Areas in Communications, 18(9):1670–1681, September
2000.
[47] William Feller. An Introduction to Probability Theory and Its Applications - Volume I. John
Wiley, New York, third edition, 1968.
[48] Sally Floyd, Van Jacobson, Ching-Gung Liu, Steven McCanne, and Lixia Zhang. A Reliable
Multicast Framework for Light-Weight Sessions and Application Level Framing. IEEE/ACM
Transactions on Networking, 5(6):784–803, 1997.
[49] International Organization for Standardization. ISO Standard 8807 - Information processing
systems - Open Systems Interconnection - LOTOS - A formal description technique based on
the temporal ordering of observational behaviour. ISO - International Organization for Standardization,
February 1989.
[50] International Organization for Standardization. ISO Standard 11898 - Road Vehicle - Inter-
change of Digital Information - Controller Area Network (CAN) for High-Speed Communication.
ISO - International Organization for Standardization, 1993.
[51] B. D. Fritchman. A binary channel characterisation using partitioned markov chains. IEEE
Transactions on Information Theory, 13(2):221–227, April 1967.
[52] Projektkonsortium Funbus. Das Verbundprojekt Drahtlose Feldbusse im Produktionsumfeld (Funbus)
Abschlußbericht. INTERBUS Club Deutschland e.V., Postf. 1108, 32817 Blomberg,
Bestell-Nr: TNR 5121324, October 2000. http://www.softing.de/d/NEWS/Funbusbericht.pdf.
[53] Aura Ganz, Anan Phonphoem, and Zvi Ganz. Robust Superpoll Protocol for IEEE 802.11
Wireless LANs. In Proc. IEEE Military Communications Conference, Boston, Massachusetts,
October 1998.
[54] J. Garcia-Frias and P. M. Crespo. Hidden Markov Models for burst error characterization in
indoor radio channels. IEEE Transactions on Vehicular Technology, 46:1006–1020, November
1997.
[55] German Institute of Standardization (DIN). PROFIBUS Standard Part 1 and 2, 1991.
[56] Jerry D. Gibson, editor. The Communications Handbook. CRC Press / IEEE Press, Boca Raton,
Florida, 1996.
[57] E. N. Gilbert. Capacity of a burst-noise channel. Bell Systems Technical Journal, 39:1253–1265,
September 1960.
[58] Alois M. J. Goiser. Handbuch der Spread-Spectrum Technik. Springer Verlag, Wien, New York,
1998.
[59] James Gross, Michael Jaeger, and Andreas Willig. Measurements of a Wireless Link in different
RF-isolated Environments. TKN Technical Report Series TKN-01-005, Telecommunication
Networks Group, Technical University Berlin, June 2001.
http://www-tkn.ee.tu-berlin.de/publications/tknrreports.html.
[60] Ajay Chandra V. Gummalla and John O. Limb. Wireless medium access control protocols. IEEE
Communications Surveys and Tutorials, 3(2), 2000. http://www.comsoc.org/pubs/surveys.
[61] Jaap C. Haartsen. The Bluetooth Radio System. IEEE Personal Communications, 7(1):28–36,
February 2000.
[62] Fred Halsall. Data Communications, Computer Networks and Open Systems. Addison-Wesley,
Reading, Massachusetts, 1996.
[63] M. Hata. Empirical formula for propagation loss in land mobile radio services. IEEE Transac-
tions on Vehicular Technology, 29(3):317–325, August 1980.
[64] Boudewijn R. Haverkort. Performance of Computer Communication Systems - A Model Based
Approach. John Wiley and Sons, Chichester / New York, 1998.
[65] Olivier Hersent, David Gurle, and Jean-Pierre Petit. IP Telephony - Packet-based multimedia
communications systems. Addison-Wesley, Harlow / England, London, 2000.
[66] Hirschmann Rheinmetall Elektronik, Neckartenzlingen. IZD Profi 01 Description and Oper-
ating Instructions Infrared Transmission System, March 1999.
[67] A. Hoffmann, R. J. Haines, and A. H. Aghvami. Performance analysis of a token based MAC
protocol with asymmetric polling strategy (’TOPO’) for indoor radio local area networks under
channel outage conditions. In Proc. International Conference on Communications (ICC), pages
1306–1311, New Orleans, Louisiana, 1994. IEEE.
[68] Frank J. Furrer (Hrsg.). BITBUS - Grundlagen und Praxis. Hüthig Buch Verlag, Heidelberg,
1994.
[69] IEC - International Electrotechnical Commission. IEC-1158-1, FieldBus Specification, Part 1,
FieldBus Standard for Use in Industrial Control: Functional Requirements.
[70] IEEE - Institute of Electrical and Electronics Engineers, IEEE Standards Department, 445 Hoes
Lane, P.O. Box 1331, Piscataway, NJ 08855-1331, USA. IEEE Standard 1118 - IEEE Standard
Microcontroller System Serial Control Bus, August 1991.
[71] IEEE/ISO. Information processing systems - Local Area Networks - part 4: Token-passing bus
access method and physical layer specifications. International Organization for Standardization,
August 1990.
[72] International Standards Organization. ISO 9506 Industrial Automation Systems Integration
and Communications - Manufacturing Message Specification, Part 1: Service Definition, Part
2: Protocol Specification.
[73] Intersil. HFA3860B Data Sheet, File Number 4594.1, 1999.
[74] Ivan Izikowitz and Michael Solvie. Industrial needs for time-critical wireless communication &
wireless data transmission and application layer support for time critical communication. In
Proc. Euro-Arch’93, München, 1993. Springer Verlag, Berlin.
[75] Raj Jain. The Art of Computer Systems Performance Analysis - Techniques for Experimental
Design, Measurement, Simulation, and Modeling. Wiley Professional Computing. John Wiley
and Sons, New York, Chichester, 1991.
[76] Raj Jain. FDDI Handbook: High-Speed Networking Using Fiber and Other Media. Addison-
Wesley, Reading, Massachusetts, 1994.
[77] W. C. Jakes. Microwave Mobile Communications. Wiley, New York, 1974.
[78] W. C. Jakes, editor. Microwave Mobile Communications. IEEE Press, New Jersey, 1993.
[79] Michel C. Jeruchim, Philip Balaban, and K. Sam Shanmugan. Simulation of Communication
Systems - Modeling, Methodology and Techniques. Information Technology: Transmission, Processing
and Storage. Kluwer Academic/Plenum Publishers, New York, Boston, second edition,
2000.
[80] Vincent C. Jones. MAP / TOP Networking - Achieving Computer Integrated Manufacturing.
McGraw-Hill, New York, 1988.
[81] L. Rauchhaupt and Jörg Hähniche. Opportunities and problems of wireless fieldbus extensions. In
Proc. FeT’99: Feldbustechnik Fieldbus Technology, Magdeburg, 1999. Springer Verlag, Wien
/ New York.
[82] Hong ju Moon, Hong Seong Park, Sang Chul Ahn, and Wook Hyun Kwon. Performance Degrada-
tion of the IEEE 802.4 Token Bus Network in a Noisy Environment. Computer Communications,
21:547–557, 1998.
[83] Andreas Kanbach and Andreas Körber. ISDN. Die Technik. Schnittstellen, Protokolle, Dienste,
Endsysteme. Hüthig Buch Verlag, Heidelberg, third edition, 1999.
[84] Michael Kasper. Profibus goes wireless - drahtlose Übertragung mit Infrarotstrahlen. Industrie
Service, page 30, July-August 1999.
[85] B. Kedem. Binary Time Series. Springer, New York, Basel, 1980.
[86] Kalevi Kilkki. Differentiated Services for the Internet. Macmillan Technical Publishing, Indi-
anapolis, 1999.
[87] Young Yong Kim and San qi Li. Modeling multipath fading channel dynamics for packet data
performance analysis. In Proc. IEEE INFOCOM 98. IEEE, 1998.
[88] Young Yong Kim and San qi Li. Capturing Important Statistics of a Fading/Shadowing Chan-
nel for Network Performance Analysis. IEEE Journal on Selected Areas in Communications,
17(5):888–901, May 1999.
[89] Ulrich Klehmet, Markus Ettl, and Peter otz. Leistungsbewertung der feldbus-protokolle
profibus und fip. atp - automatisierungstechnische praxis, 35(6):355ff, 1993.
[90] Leonard Kleinrock. Queueing Systems Volume 2: Computer Applications, volume 2. John
Wiley and Sons, New York, 1976.
[91] Almudena Konrad, Ben Y. Zhao, Anthony D. Joseph, and Reiner Ludwig. A Markov-Based
Channel Model Algorithm for Wireless Networks. In Proc. of Fourth ACM International Work-
shop on Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWiM 2001),
Rome, July 2001.
[92] Andreas Köpsel. A comparison between point and distributed coordination function of an
IEEE 802.11 WLAN. Diploma thesis, Telecommunication Networks Group (TKN), Technical University
Berlin, Berlin, July 2000. in German.
[93] Osama Kubbar and Hussein T. Mouftah. Multiple access control protocols for wireless ATM:
Problems definition and design objectives. IEEE Communications Magazine, 35(11):93–99,
November 1997.
[94] J. F. Kurose, M. Schwartz, and Y. Yemini. Multiple-access protocols and time-constrained
communication. ACM Computing Surveys, 16:43–70, March 1984.
[95] A. Kutlu, H. Ekiz, M. D. Baba, and E. T. Powner. Implementation of “comb” based wireless
access method for control area network. In Proc. 11th Intl. Symp. on Computer and Information
Science, pages 565–573, Antalya, Turkey, November 1996.
[96] A. Kutlu, H. Ekiz, and E. T. Powner. Performance analysis of MAC protocols for wireless
control area network. In Proc. Intl. Symp. on Parallel Architectures, Algorithms and Networks,
pages 494–499, Beijing, China, June 1996.
[97] Shu Lin and Daniel J. Costello. Error Control Coding - Fundamentals and Applications.
Prentice-Hall, Englewood Cliffs, New Jersey, 1983.
[98] C.L. Liu and J. Layland. Scheduling algorithms for multiprogramming in a hard real-time
environment. Journal of the ACM, 20(1):46–61, 1973.
[99] Hang Liu, Hairuo Ma, Magda El Zarki, and Sanjay Gupta. Error control schemes for networks:
An overview. MONET - Mobile Networks and Applications, 2(2):167–182, 1997.
[100] Jane W. S. Liu. Real-Time Systems. Prentice-Hall, Upper Saddle River, NJ, 2000.
[101] Madhav V. Marathe and Robin A. Smith. Performance of a MAP network adapter. IEEE
Network, 2(3):82ff, 1988.
[102] R. H. McCullough. The binary regenerative channel. Bell Systems Technical Journal, 47:1713–
1735, October 1968.
[103] Mesquite Software, Inc., T. Braker Lane, Austin, Texas. CSIM18 Simulation Engine Users
Guide, 1997.
[104] John J. Metzner. Message scheduling for efficient data communication under varying channel
conditions. IEEE Transactions on Communications, 32(1):48–55, January 1984.
[105] Jouni Mikkonen, James Aldis, Geert Awater, Andrew Lunn, and David Hutchison. The Magic
WAND functional overview. IEEE Journal on Selected Areas in Communications, 16(6):953–972,
August 1998.
[106] P. Heinz Müller, editor. Wahrscheinlichkeitsrechnung und Mathematische Statistik. Akademie
Verlag, Berlin, third edition, 1980.
[107] M. Molinari and M. Zekar. Drahtlose lokale Netze. DATACOM-Verlag, Bergheim, 1994.
[108] Philip Morel. Mobility in MAP networks using the DECT wireless protocols. In Proc. 1995 IEEE
Workshop on Factory Communication Systems, WFCS’95, Leysin, Switzerland, 1995.
[109] Philip Morel and Alain Croisier. A wireless gateway for fieldbus. In Proc. Sixth International
Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC 95), 1995.
[110] Philip Morel and Jean-Dominique Decotignie. Integration of wireless mobile nodes in MAP/MMS.
In Proc. 13th IFAC Workshop on Distributed Computer Control Systems DCCS95, 1995.
[111] Motorola Inc. MPC 860 PowerQUICC User’s Manual, 1998.
[112] General Motors. Manufacturing Automation Protocol V3.0. General Motors, 1988.
[113] P. Narasimhan, S. K. Biswas, C. A. Johnston, R. J. Syracusa, and H. Kim. Design and performance
of radio access protocols in WATMNet, a prototype wireless ATM network. In Proc.
ICUPC 97, 6th WINLAB Workshop, 1997.
[114] Kevin J. Negus, Adrian P. Stephens, and Jim Lansford. HomeRF: Wireless Networking for the
Connected Home. IEEE Personal Communications, 7(1):20–27, February 2000.
[115] Randolph Nelson. Probability, Stochastic Processes, and Queueing Theory - The Mathematics
of Computer Performance Modeling. Springer Verlag, New York, 1995.
[116] P. Neumann, C. Diedrich, and J. Hähniche. Der nationale Feldbusstandard PROFIBUS - Profile,
Implementationen und Tests. ZwF - Zeitschrift für wirtschaftliche Fertigung, 87(7):365ff, July
1992.
[117] Joseph Kee-Yin Ng. MPEG transmission schemes for a timed token medium access control
network. ACM Computer Communication Review, 29(1):66–80, January 1999.
[118] Giao T. Nguyen, Randy H. Katz, Brian Noble, and Mahadev Satyanarayanan. A trace-based
approach for modeling wireless channel behavior. In Proceedings of the Winter Simulation
Conference, Coronado, CA, December 1996.
[119] The Editors of IEEE 802. IEEE 802.2, ISO/IEC 8802-2: Local Area Networks: Logical Link
Control, 1989.
[120] The Editors of IEEE 802.11. IEEE Standard for Wireless LAN Medium Access Control (MAC)
and Physical Layer (PHY) specifications, November 1997.
[121] The Editors of IEEE 802.11. IEEE Standard for Information Technology - Telecommunications
and information exchange between systems - Local and Metropolitan networks - Specific require-
ments - Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY)
specifications: Higher speed Physical Layer (PHY) extension in the 2.4 GHz band, 1999.
[122] The Editors of IEEE 802.11. IEEE Standard for Telecommunications and Information Exchange
Between Systems - LAN/MAN Specific Requirements - Part 11: Wireless Medium Access Control
(MAC) and physical layer (PHY) specifications: High Speed Physical Layer in the 5 GHz band,
1999.
[123] Bob O’Hara and Al Petrick. IEEE 802.11 Handbook - A Designer’s Companion. IEEE Press,
New York, 1999.
[124] K. Pahlavan and A.H. Levesque. Wireless Information Networks. J. Wiley and Sons, 1995.
[125] Kaveh Pahlavan, Ali Zahedi, and Prashant Krishnamurthy. Wideband local access: Wireless
LAN and wireless ATM. IEEE Communications Magazine, 35(11):34–40, November 1997.
[126] J. D. Parsons. The Mobile Radio Propagation Channel. Pentech Press, London, 1992.
[127] Juan R. Pimentel. Communication Networks for Manufacturing. Prentice-Hall International,
1990.
[128] Patrick Pleineveaux and Jean-Dominique Decotignie. Time critical communication networks:
Field buses. IEEE Network, 2(3):55ff, 1988.
[129] D. K. Pradhan, editor. Fault-tolerant Computer System Design. Prentice Hall, Upper Saddle
River, NJ, 1996.
[130] Prashant Pradhan and Tzi-Cker Chiueh. Real-Time Performance Guarantees over
Wired/Wireless LANs. In Proc. IEEE Real-Time Application and Technology Symposium. IEEE,
June 1998.
[131] Anand R. Prasad. Performance comparison of voice over IEEE 802.11 schemes. In Proc. IEEE
Vehicular Technology Conference (VTC) ’99. IEEE, 1998.
[132] John G. Proakis. Channel equalization. In Jerry D. Gibson, editor, The Communications
Handbook, pages 339–363. CRC Press / IEEE Press, Boca Raton, Florida, 1996.
[133] PROFIBUS Nutzerorganisation e.V., PROFIBUS Nutzerorganisation e.V., Haid-und-Neu-Str.
7, Karlsruhe, Germany. PROFIBUS Entwurf Technische Richtlinie Implementierungshin-
weise zur DIN 19245 Teil 2, December 1993.
[134] PROFIBUS Nutzerorganisation e.V., PROFIBUS Nutzerorganisation e.V., Haid-und-Neu-Str.
7, Karlsruhe, Germany. Implementation Guide to DIN 19245 Part 1, August 1994.
[135] PROFIBUS Nutzerorganisation e.V., PROFIBUS Nutzerorganisation e.V., Haid-und-Neu-Str.
7, Karlsruhe, Germany. PROFIBUS Entwurf Technische Richtlinie Implementierungshin-
weise zur DIN 19245 Teil 1, August 1994.
[136] PROFIBUS Nutzerorganisation e.V., PROFIBUS Nutzerorganisation e.V., Haid-und-Neu-Str.
7, Karlsruhe, Germany. PROFIBUS Technical Description, September 1999.
[137] Matthias Pätzold. Mobilfunkkanäle - Modellierung, Analyse und Simulation. Vieweg, Braunschweig,
1999.
[138] Matthias Pätzold and Frank Laue. Level crossing rate and average duration of fades of deterministic
simulation models for Rice fading channels. IEEE Transactions on Vehicular Technology,
48(4):1121–1129, July 1999.
[139] San qi Li and Chia-Lin Hwang. Queue response to input correlation functions: Continuous
spectral analysis. IEEE/ACM Transactions on Networking, 1:678–692, December 1993.
[140] San qi Li and Chia-Lin Hwang. On the convergence of traffic measurement and queueing analysis:
A statistical-match queueing (smaq) tool. IEEE/ACM Transactions on Networking, 5:95–110,
February 1997.
[141] R-Fieldbus Consortium. R-FIELDBUS High Performance Wireless Fieldbus in Industrial
Related Multi-Media Environment, 2000. Presentation Slides from www.rfieldbus.de.
[142] Jan M. Rabaey, M. Josie Ammer, Julio L. da Silva, Danny Patel, and Shad Roundy. PicoRadio
Supports Ad Hoc Ultra-Low Power Wireless Networking. IEEE Computer, 33(7), July 2000.
[143] Markus Radimirsch and Jamshid Khun-Jush. Application of HIPERLAN Type 2 systems in private
environments. In Proc. International Conference on Telecommunications, Acapulco, Mexico,
2000.
[144] M. Rahnema. Overview of the GSM system and protocol architecture. IEEE Communications
Magazine, 31(4):92–100, April 1993.
[145] Theodore S. Rappaport, Rias Muhamed, and Varun Kapoor. Propagation models. In Jerry D.
Gibson, editor, The Communications Handbook, pages 1182–1196. CRC Press / IEEE Press,
Boca Raton, Florida, 1996.
[146] U. Rembold, B. O. Nnaji, and A. Storr. Computer Integrated Manufacturing and Engineering.
Addison-Wesley, 1993.
[147] S. O. Rice. Mathematical analysis of random noise. Bell Systems Technical Journal, 23:282–332,
July 1944.
[148] S. O. Rice. Mathematical analysis of random noise. Bell Systems Technical Journal, 24:46–156,
January 1945.
[149] Izhak Rubin and L. F. M. de Moraes. Message Delay Analysis for Polling and Token Multiple-
Access Schemes for Local Communication Networks. IEEE Journal on Selected Areas in Com-
munications, 1(5):935–947, 1983.
[150] Asuncion Santamaria and Francisco J. Lopez-Hernandez, editors. Wireless LAN Standards
and Applications. Mobile Communication Series. Artech House, Boston, London, 2001.
[151] Mischa Schwartz. Telecommunication Networks - Protocols, Modeling and Analysis. Addison-
Wesley, Reading, Massachusetts, 1988.
[152] Oran Sharon and Eitan Altman. An efficient polling MAC for wireless LANs. IEEE/ACM Transactions
on Networking, 9(4):439–451, August 2001.
[153] Kang G. Shin and Parameswaran Ramanathan. Real-Time Computing: A New Discipline of
Computer Science and Engineering. Proceedings of the IEEE, 82(1):6–24, January 1994.
[154] D. P. Siewiorek and R. S. Swarz. Reliable Computer Systems - Design and Evaluation. Digital
Press, Burlington, MA, 2nd edition, 1992.
[155] Bernard Sklar. Digital Communications - Fundamentals and Applications. Prentice Hall, Englewood
Cliffs, New Jersey, 1988.
[156] Joao L. Sobrinho and A. S. Krishnakumar. Real-time traffic over the IEEE 802.11 medium access
control layer. Bell Labs Technical Journal, 1(2):172–187, 1996.
[157] SRB Innovative Industrie Elektronik GmbH. Silver Data Stream Transfer-Modul User Manual
for SDSTM-F24-V1.0.1, 1999.
[158] William J. Stewart. Introduction to the Numerical Solution of Markov Chains. Princeton Uni-
versity Press, Princeton, New Jersey, 1994.
[159] Takahiro Suzuki and Shuji Tasaka. Performance evaluation of video transmission with the
PCF of the IEEE 802.11 standard MAC protocol. IEICE Transactions on Communications,
E83-B(9):2068–2076, September 2000.
[160] F. Swarts and H.C. Ferreira. Markov characterization of digital fading mobile VHF channels.
IEEE Transactions on Vehicular Technology, 43(4):977–985, November 1994.
[161] Hideaki Takagi. Analysis of Polling Systems. MIT Press, Cambridge, Massachusetts, 1986.
[162] Hideaki Takagi. Analysis and applications of a multi-queue cyclic service system with feedback.
IEEE Transactions on Communications, 35(2):248–250, February 1987.
[163] Hideaki Takagi. Queueing analysis of polling models: an update. In Hideaki Takagi, editor,
Stochastic Analysis of Computer and Communication Systems, pages 267–318. Elsevier, Ams-
terdam, 1990.
[164] T. Takine, H. Takagi, and T. Hasegawa. Sojourn times in vacation and polling systems with
Bernoulli feedback. Journal of Applied Probability, 28:422–432, June 1991.
[165] Andrew S. Tanenbaum. Computer-Netzwerke. Wolframs Fachverlag, Attenkirchen, second edi-
tion, 1992.
[166] Andrew S. Tanenbaum. Computernetzwerke. Prentice-Hall, Muenchen, third edition, 1997.
[167] Fouad A. Tobagi and Leonard Kleinrock. Packet switching in radio channels: Part II - the hidden
terminal problem in CSMA and busy-tone solutions. IEEE Transactions on Communications,
23(12):1417–1433, 1975.
[168] Eduardo Tovar. Supporting Real-Time Communications with Standard Factory-Floor Networks.
PhD dissertation, Dept. of Electrical Engineering, Univ. of Porto, Portugal, 1999.
[169] Eduardo Tovar and Francisco Vasques. Real-Time Fieldbus Communications Using Profibus
Networks. IEEE Transactions on Industrial Electronics, 46(6):1241–1251, December 1999.
[170] Don Towsley and J. K. Wolf. On adaptive tree polling algorithms. IEEE Transactions on
Communications, 32(12):1294–1298, 1984.
[171] Tundra Corporation. Reference Manual Tundra PCI Interconnect, 1998.
[172] W. Turin and R. van Nobelen. Hidden Markov modeling of fading channels. In Proc. IEEE
Vehicular Technology Conference, pages 1234–1238, May 1998.
[173] William Turin. Digital Transmission Systems - Performance Analysis and Modeling.
McGraw-Hill Telecommunications. McGraw-Hill, New York, 1998.
[174] William Turin and Robert van Nobelen. Hidden Markov Modeling of Flat Fading Channels.
IEEE Journal on Selected Areas in Communications, 16(9):1809–1817, December 1998.
[175] Kenneth J. Turner, editor. Using Formal Description Techniques - An Introduction to Estelle,
LOTOS and SDL. Prentice Hall, Chichester, New York, 1993.
[176] Union Technique de l’Electricité. General Purpose Field Communication System, EN 50170,
Volume 1: P-NET, 1996.
[177] Union Technique de l’Electricité. General Purpose Field Communication System, EN 50170,
Volume 2: PROFIBUS, 1996.
[178] Union Technique de l’Electricité. General Purpose Field Communication System, EN 50170,
Volume 3: WorldFIP, 1996.
[179] Richard van Nee and Ramjee Prasad. OFDM for Wireless Multimedia Communications. Artech
House Publisher, 2000.
[180] Malathi Veeraraghavan, Nabeel Cocker, and Tim Moors. Support of voice services in IEEE 802.11
wireless LANs. In Proc. INFOCOM 2001, Anchorage, Alaska, April 2001. IEEE.
[181] Matthijs A. Visser and Magda El Zarki. Voice and Data Transmission over an 802.11 Wireless
network. In Proc. IEEE Personal, Indoor and Mobile Radio Conference (PIMRC) 95, pages
648–652, Toronto, Canada, September 1995.
[182] Bernhard Walke. Mobilfunknetze und ihre Protokolle, Band 1. Informationstechnik. B.G. Teub-
ner, Stuttgart, 1998.
[183] Bernhard Walke. Mobilfunknetze und ihre Protokolle, Band 2. Informationstechnik. B.G. Teub-
ner, Stuttgart, 1998.
[184] H.S. Wang and N. Moayeri. Finite State Markov Channel - A Useful Model for Radio Com-
munication Channels. IEEE Transactions on Vehicular Technology, 44(1):163–171, February
1995.
[185] Jost Weinmiller, Morten Schläger, Andreas Festag, and Adam Wolisz. Performance study of
access control in wireless LANs - IEEE 802.11 DFWMAC and ETSI RES 10 Hiperlan. MONET
- Mobile Networks and Applications, 2(1):55–67, 1997.
[186] Jost Weinmiller, Hagen Woesner, Jean-Pierre Ebert, and Adam Wolisz. Analyzing and improving
the 802.11-MAC protocol for wireless LANs. In Proc. MASCOT 95, San Jose, California,
February 1995.
[187] Stuart Williams. IrDA: Past, Present and Future. IEEE Personal Communications, 7(1),
February 2000.
[188] Andreas Willig. Analysis and Tuning of the PROFIBUS Token Passing Protocol for Use Over
Error-Prone Links. TKN Technical Report Series TKN-99-001, Telecommunication Networks
Group, Technical University Berlin, March 1999.
[189] Andreas Willig. Analysis of the PROFIBUS Token Passing Protocol over Error Prone Links.
In Proc. 25th Annual Conference of the IEEE Industrial Electronics Society (IECON’99), pages
1246–1252. IEEE, November 1999.
[190] Andreas Willig. Markov Modeling of PROFIBUS Ring Membership over Error Prone Links.
TKN Technical Report Series TKN-99-004, Telecommunication Networks Group, Technical Uni-
versity Berlin, May 1999. http://www-tkn.ee.tu-berlin.de/publications/tknrreports.html.
[191] Andreas Willig. Architectural Considerations for a wireless integrated PROFIBUS. TKN Tech-
nical Report Series TKN-01-006, Telecommunication Networks Group, Technical University
Berlin, October 2001. http://www-tkn.ee.tu-berlin.de/publications/tknrreports.html.
[192] Andreas Willig, Martin Kubisch, Christian Hoene, and Adam Wolisz. Measurements of a Wire-
less Link in an Industrial Environment using an IEEE 802.11-Compliant Physical Layer. IEEE
Transactions on Industrial Electronics, 2001. accepted for publication.
[193] Andreas Willig, Martin Kubisch, and Adam Wolisz. Bit Error Rate Measurements Second
Campaign, factorial measurement. TKN Technical Report Series TKN-00-011, Telecommunication
Networks Group, Technical University Berlin, November 2000.
http://www-tkn.ee.tu-berlin.de/publications/tknrreports.html.
[194] Andreas Willig, Martin Kubisch, and Adam Wolisz. Bit Error Rate Measurements Second
Campaign, longterm1 measurement. TKN Technical Report Series TKN-00-009, Telecommunication
Networks Group, Technical University Berlin, November 2000.
http://www-tkn.ee.tu-berlin.de/publications/tknrreports.html.
[195] Andreas Willig, Martin Kubisch, and Adam Wolisz. Bit Error Rate Measurements Second
Campaign, longterm2 measurement. TKN Technical Report Series TKN-00-010, Telecommunication
Networks Group, Technical University Berlin, November 2000.
http://www-tkn.ee.tu-berlin.de/publications/tknrreports.html.
[196] Andreas Willig, Martin Kubisch, and Adam Wolisz. Results of Bit Error Rate Measurements
with an IEEE 802.11 compliant PHY. TKN Technical Report Series TKN-00-008, Telecommunication
Networks Group, Technical University Berlin, November 2000.
http://www-tkn.ee.tu-berlin.de/publications/tknrreports.html.
[197] Andreas Willig, Martin Kubisch, and Adam Wolisz. Measurements and Stochastic Modeling
of a Wireless Link in an Industrial Environment. TKN Technical Report Series TKN-01-001,
Telecommunication Networks Group, Technical University Berlin, March 2001.
http://www-tkn.ee.tu-berlin.de/publications/tknrreports.html.
[198] Andreas Willig and Adam Wolisz. Ring Stability of the PROFIBUS Token Passing Protocol
over Error Prone Links. IEEE Transactions on Industrial Electronics, 48(5):1025–1033, October
2001.
[199] Zhensheng Zhang and Anthony S. Acampora. Performance of a modified polling strategy for
broadband wireless LANs in a harsh fading environment. Telecommunication Systems, 1:279–294,
1993.
[200] Michele Zorzi and Ramesh R. Rao. Performance of ARQ Go-Back-N protocol in Markov channels
with unreliable feedback. Wireless Networks, 2:183–193, 1997.
[201] Michele Zorzi and Ramesh R. Rao. Perspectives on the impact of error statistics on protocols
for wireless networks. IEEE Personal Communications, 6(5), October 1999.