Millimeter Wave Wireless
Communication: Initial Acquisition,
Data Communication and Relay
Network Investigation
submitted by
M.Eng.
Xiaoshen Song
to Faculty IV - Electrical Engineering and Computer Science
of the Technische Universität Berlin
to obtain the academic degree
Doktor der Ingenieurwissenschaften
- Dr.-Ing. -
approved dissertation
Doctoral committee:
Chair: Prof. Rafael Schaefer
Reviewer: Prof. Giuseppe Caire
Reviewer: Prof. Robert W. Heath Jr.
Reviewer: Prof. Joerg Widmer
Date of the scientific defense: 18 September 2020
Berlin 2020
Abstract
Wireless communication has become an important part of our daily lives. In the past
decades, the phenomenally increasing demand for mobile wireless data services has been
pushing both industry and academia to move to millimeter wave (mmWave) frequencies
(30-300 GHz) for the next generation (5G) mobile communication. The main motivation
for mmWave communication is the unprecedented massive bandwidth (multi-GHz), which
can offer multi-Gbps data rates for each mobile device. However, mmWave signals
experience high path loss, directivity and blockages, which severely limit the network
performance.
To overcome the aforementioned challenges in mmWave communication, large antenna
arrays are used at the transceivers to obtain a large beamforming gain that compensates
for the severe path loss. In addition, hybrid digital analog (HDA) architectures, with a
much smaller number of radio frequency (RF) chains than antennas, are
commonly implemented at the transceivers in order to reduce the hardware complexity
and power consumption. All these new features pose a big challenge for signaling and
networking in mmWave wireless systems.
The goal of this thesis is to clearly incorporate all the new features at mmWave
frequencies and, on top of that, to provide new state-of-the-art schemes for the different
communication phases. Specifically, this thesis contains four main contributions.
First, we propose an efficient beam alignment (BA) scheme for mmWave OFDM
(orthogonal frequency division multiplexing) systems. In this scheme, we use pseudo-
random multi-finger beam patterns in the downlink to explore the beam-domain channel,
and then construct an estimate of the channel second-order statistics. Using the
non-negative least-squares (NNLS) technique, the resulting under-determined equations can
be solved efficiently. Accordingly, the proposed BA scheme is very robust to channel
time-dynamics and scales well to multi-user scenarios.
Second, we further explore single-carrier (SC) operation mode at mmWave frequencies
and propose a new BA scheme for mmWave SC systems. In this scheme, the BS
periodically probes the channel via a pre-specified pseudo-random beamforming codebook
and pseudo-random spreading codes. Each UE formulates the BA problem as the
estimation of a sparse non-negative second-order statistic channel vector, which can be
efficiently solved by using the NNLS technique. In addition to the advantage of multi-user
scalability, the proposed scheme operates purely in the time domain and is highly robust to fast
channel variations caused by the large Doppler spread between multipath components.
Third, we define two HDA antenna architectures which can be regarded as two
“extreme” cases, i.e., the fully-connected (FC) architecture and the
one-stream-per-subarray (OSPS) architecture. We propose a joint performance evaluation of the initial
BA, the subsequent data communication, and the hardware impairments, where we
consider, from a realistic point of view, only the limited channel state information (CSI)
obtained from the BA phase. Also, a family of multi-user MIMO (MU-MIMO)
precoding schemes is investigated that adapts well to the hybrid architectures and the
beam information extracted from the BA phase. An interesting observation from this
work is that the two aforementioned architectures achieve similar sum spectral efficiency,
while the OSPS architecture is advantageous with respect to the FC case in terms
of hardware complexity and power efficiency, only at the cost of a slightly longer BA
time-to-acquisition due to its reduced beam angle resolution.
Fourth, we extend our work to relay networking to further increase the communication
range at mmWave frequencies. For a general mmWave half-duplex (HD) relay network
with arbitrary relay connections, we use the information-theoretically optimal
schedule to first simplify the network topology, on top of which we propose two
practical beam scheduling schemes, i.e., the deterministic edge coloring (EC) scheduler
and the adaptive backpressure (BP) scheduler. The former is more suitable for static
scenarios, while the latter is more favorable for time-varying scenarios. Both proposed
schedulers effectively stabilize the network within its capacity range while
achieving much smaller queuing backlogs, much smaller backlog fluctuations, and much
lower packet end-to-end delays in comparison with the reference baseline scheme.
Zusammenfassung
Wireless communication has become an important part of our daily lives. Demand for
mobile data services has risen massively in recent years. This has led both industry and
academia to move to frequencies in the millimeter wave range (mmWave, 30-300 GHz)
for next-generation (5G) mobile communication. The main motivation for
mmWave communication is the availability of enormous bandwidth (several GHz),
which enables data rates of several Gbps for multiple mobile devices. However,
mmWave signals experience high path loss, highly specific directional characteristics,
and blockages, which severely limits network performance.
To overcome the above challenges for mmWave communication, large antenna arrays
are used at the transceivers. These provide a large beamforming gain to compensate
for the high path loss. In addition, a hybrid digital analog (HDA) architecture, with
far fewer baseband signal paths than antennas, is commonly used at the transceivers.
This reduces hardware complexity and power consumption. All these new features
pose major challenges for the designers of the physical layer of mmWave radio systems
and their networks.
The goal of this thesis is to clearly incorporate all the new properties at mmWave
frequencies and, beyond that, to provide new state-of-the-art algorithms for the different
phases of communication. Specifically, this thesis contains four main contributions.
First, we present an efficient algorithm for the initial alignment of antenna beam
patterns (beam alignment, BA) for mmWave orthogonal frequency division multiplexing
(OFDM) systems. In this algorithm, we use pseudo-random multi-finger beam patterns
in the downlink to probe the mobile radio channel and then estimate the second-order
statistics of the channel. Using the non-negative least-squares (NNLS) technique, the
resulting under-determined equations can be solved efficiently. The proposed BA
algorithm is very robust to the time dynamics of the channel and scales very well to
multi-user scenarios.
Second, we investigate the single-carrier (SC) operation mode at mmWave frequencies
and propose a new BA algorithm for mmWave SC systems. In this scheme, the base
station periodically probes the channel via a predefined codebook of pseudo-random
multi-finger beam patterns and pseudo-random spreading codes. Each user device
formulates the BA problem as the estimation of a sparse non-negative second-order
statistical channel vector, which can be solved efficiently using the NNLS technique.
In addition to the advantage of multi-user scalability, the proposed algorithm operates
purely in the time domain and is therefore highly robust to fast channel variations
caused by a large Doppler shift between the path components of the radio channel.
Third, we define two specific structures of the HDA architecture that can be regarded
as two extreme cases: the fully connected (FC) architecture and the
one-stream-per-subarray (OSPS) architecture. We jointly consider the performance of the
initial BA, the resulting data communication, and the hardware impairments. In doing
so, we realistically use only limited channel state information (CSI), which can be
obtained directly from the BA phase. We also investigate various multi-user MIMO
(MU-MIMO) precoding schemes that are adapted to the proposed HDA architectures
and to the channel information extracted from the BA phase. An interesting observation
from this work is that the two architectures achieve similar spectral efficiency. While
the OSPS architecture is advantageous compared with the FC case in terms of hardware
complexity and energy efficiency, the FC architecture shows slightly better BA
performance, owing to the lower angular resolution of the OSPS structure.
Fourth, we extend our work to relay networks in order to further increase the range at
mmWave frequencies. For a general half-duplex mmWave relay network with arbitrary
relay connections, we introduce the information-theoretically optimal schedule. This
allows us to first carry out a topology simplification procedure. We then propose two
practical beam scheduling schemes: the deterministic edge coloring (EC) scheme and
the adaptive backpressure (BP) scheme. The former is better suited to static scenarios,
while the latter is more favorable for time-varying scenarios. Both proposed schemes
can effectively operate the network within its capacity range. Compared with the
reference scheme, both have much smaller queues, much smaller fluctuations in queue
occupancy, and much lower end-to-end delays.
To My Youth and My Beloved Husband.
— Xiaoshen Song
Acknowledgements
First and foremost, I would like to express my deepest gratitude to my PhD supervisor
Prof. Giuseppe Caire. Without him this research would not have been possible. He has
always been there for me, providing unending support and motivation. He tolerates my
shortcomings and helps me to overcome my weakness. Besides, he is passionate about
new technologies and exciting ideas. He trained me in building efficient research skills
which led me to finish my PhD research effectively and in time. I consider myself very
lucky to have joined his group.
A very special thanks to Dr. Saeid Haghighatshoar. “Research life is tough, find a
mentor!”. So yes, he is my best mentor. Without him I would never have got started
in my PhD research. From academic writing to technical skill, he unconditionally
passes on his valuable experience to me without any reservation. His critical advice,
wide perspective and methodological precision have substantially shaped my scientific
thinking.
I would also like to thank Mozhgan Bayat and Thomas Kühne. Adapting to a new
country and a new culture is not easy. They have provided a listening ear, encouraged
me and unconditionally helped me in private. They have made me feel that I am not
alone in this country. I will always be grateful to them for their valuable friendship.
Many thanks to the doctoral committee members Prof. Robert W. Heath Jr., Prof.
Joerg Widmer and Prof. Rafael Schaefer for providing insightful comments on my
research and thesis. I have learned a lot from their publications as well as from the long
discussion with them during the defense.
In addition, I would like to express my gratitude and appreciation to my colleagues.
They made my PhD journey fun and interesting. I will always hold dear the days
and nights spent in the office.
I also thank my parents and my sisters for their inspiration and unequivocal support
during my PhD journey. The journey from China to TU Berlin would not have been
possible without their support.
Last but not least, I owe thanks to my beloved husband Youjiang. I find it difficult
to express my appreciation because it is so boundless. He has seen me through the ups
and downs of the entire PhD journey. He is my most enthusiastic cheerleader; he is my
best friend; and he is an amazing husband. Without his sunny optimism, I would be a
much grumpier person; without his love and support, I would be lost. Fate brought us
together in college. Looking back on the nine years since we met, our time together
has remained happy and romantic. I have always been determined to spend my
future with him, pursuing our dreams and facing life’s challenges hand in hand.
I would like to thank all my friends from different countries and cultures for long
lasting friendships and enjoyable experiences during my stay in Berlin.
List of Publications
Below is a selection of publications that I authored or co-authored during my PhD
candidacy.
Journal Papers:
1. X. Song, S. Haghighatshoar, and G. Caire, “A scalable and statistically robust beam
alignment technique for mm-wave systems,” IEEE Transactions on Wireless
Communications, 2018. (page 20)
2. X. Song, S. Haghighatshoar, and G. Caire, “Efficient beam alignment for mmWave
single-carrier systems with hybrid MIMO transceivers,” IEEE Transactions on Wireless
Communications, 2019. (page 38)
3. X. Song, T. Kühne, and G. Caire, “Fully-/Partially-Connected Hybrid Beamforming
Architectures for mmWave MU-MIMO,” IEEE Transactions on Wireless Communications,
2019. (page 58)
4. X. Song, Y. H. Ezzeldin, G. Caire, and C. Fragouli, “Efficient Beam
Scheduling for Half-Duplex mmWave Relay Networks,” IEEE Transactions on Wireless
Communications, 2020. (to be submitted) (page 78)
Conference Papers:
1. X. Song, S. Haghighatshoar, and G. Caire, “A robust time-domain beam alignment
scheme for multi-user wideband mmwave systems,” in WSA 2018; 22nd International
ITG Workshop on Smart Antennas, 2018, pp. 1-7.
2. X. Song, S. Haghighatshoar, and G. Caire, “An Efficient CS-Based and Statistically
Robust Beam Alignment Scheme for mmWave Systems,” in 2018 IEEE International
Conference on Communications (ICC), 2018, pp. 1-6.
3. X. Song, T. Kühne, and G. Caire, “Fully-Connected vs. Sub-Connected Hybrid Precoding
Architectures for mmWave MU-MIMO,” in ICC 2019-2019 IEEE International Conference
on Communications (ICC), 2019, pp. 1-7.
4. X. Song and G. Caire, “Queue-Aware Beam Scheduling for Half-Duplex mmWave Relay
Networks,” in 2020 IEEE International Symposium on Information Theory (ISIT).
(accepted)
5. T. Kühne, X. Song, G. Caire, K. Rasilainen, T. H. Le, M. Rossi, et al., “Performance
Simulation of a 5G Hybrid Beamforming Millimeter-Wave System,” in WSA 2020; 24th
International ITG Workshop on Smart Antennas, 2020, pp. 1-6.
This is a cumulative thesis. It is based on the journal papers selected above
(three papers published after peer review and one journal manuscript yet to be
submitted), which I wrote as first author. These four journal papers constitute
the four main chapters (Chapter 3 - Chapter 6) of this thesis. At the beginning of
each of these chapters, an introductory section provides supplementary background
information and clarifies each author’s contributions.
In reference to IEEE copyrighted material which is used with permission in this
thesis, the IEEE does not endorse any of TU Berlin’s products or services. Internal or
personal use of this material is permitted. If interested in reprinting/republishing IEEE
copyrighted material for advertising or promotional purposes or for creating new collective
works for resale or redistribution, please go to
https://www.ieee.org/publications/rights/rights-link.html
to learn how to obtain a License from RightsLink.
Table of Contents
Title Page
Abstract
Zusammenfassung
Acknowledgements
List of Publications
1 Introduction
1.1 Background for mmWave communication
1.1.1 Distinctive mmWave characteristics
1.1.2 Potentials and challenges for mmWave communication
1.2 Related works in the state of the art
1.3 Contributions and structure of this thesis
1.4 Notations
2 System Model for mmWave MU-MIMO
2.1 Hybrid mmWave transceiver architectures
2.2 Channel model
2.3 Signaling model
2.4 Summary
3 Initial Beam Alignment for mmWave OFDM Systems
3.1 Introduction
3.2 Clarification of each authors’ contributions
3.3 Original journal article
4 Initial Beam Alignment for mmWave Single-Carrier Systems
4.1 Introduction
4.2 Clarification of each authors’ contributions
4.3 Original journal article
5 Data Communication for mmWave Multi-User MIMO
5.1 Introduction
5.2 Clarification of each authors’ contributions
5.3 Original journal article
6 Beam Scheduling for mmWave Relay Networks
6.1 Introduction
6.2 Clarification of each authors’ contributions
6.3 Original journal article
7 Conclusions
7.1 Summary of this thesis
7.2 Future directions
Appendix A Acronyms and Abbreviations
Bibliography
1 Introduction
1.1 Background for mmWave communication
Wireless communication has become an integral part of our lives today. As data-hungry
mobile devices and applications become increasingly prevalent, the mobile communication
infrastructure needs to evolve dramatically to support the exploding demand for wireless
data. Ericsson has predicted that the volume of mobile data traffic will increase fivefold
from 2018 to 2024, reaching 136 exabytes (EB) per month (as illustrated in Figure 1.1),
equivalent to a compound annual growth rate of 31% [1]. This ever-growing trend is
expected to continue mainly due to services that require massive data, such as high
definition (HD) video streaming, online gaming, virtual reality applications and so on [2,
3, 4]. For example, video streaming contributed 60% (∼16 EB/month) of total
mobile traffic in 2018, which is expected to reach 74% (∼100 EB/month) by 2024 [1].
In addition, billions of new devices envisioned to be connected in the future generation
of wireless networks to provide massive connectivity are also expected to contribute to
the increase in data consumption [1].
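
As a quick consistency check on the traffic figures quoted above, the following minimal sketch computes the compound annual growth rate implied by fivefold growth over the six years from 2018 to 2024. The function name `cagr` is ours, and the 2018 baseline is inferred from the fivefold claim rather than quoted from [1].

```python
def cagr(growth_factor: float, years: int) -> float:
    """Compound annual growth rate implied by a total growth factor."""
    return growth_factor ** (1.0 / years) - 1.0


# Fivefold growth between 2018 and 2024 (six yearly steps).
rate = cagr(5.0, 6)

# Baseline implied by 136 EB/month in 2024 and a fivefold increase
# (an inference from the text, not a number taken from the report).
traffic_2024_eb = 136.0
traffic_2018_eb = traffic_2024_eb / 5.0

# Video share of total traffic quoted in the text: 60% (2018), 74% (2024).
video_2018_eb = 0.60 * traffic_2018_eb
video_2024_eb = 0.74 * traffic_2024_eb

print(f"implied CAGR: {rate:.1%}")
print(f"implied 2018 traffic: {traffic_2018_eb:.1f} EB/month")
print(f"video traffic: {video_2018_eb:.1f} -> {video_2024_eb:.1f} EB/month")
```

The implied rate is close to 31%, and the implied video volumes are close to the quoted 16 and 100 EB/month, so the figures in this paragraph are mutually consistent.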
The motivation for the evolution from 1G to 2G, 3G, and 4G (the first, second,
third and fourth generation mobile communication, respectively) was to improve a
particular aspect of mobile communication. For example, the transition from 1G to 2G improved
voice service and increased network capacity by using digital communications; 3G and
4G were developed to improve the data rates. However, the evolution to the next
generation mobile communication (5G) features concurrent improvements in many
areas. Specifically, these include high data rates (1-10 Gbps), low latency (less than 1
millisecond), massive connectivity (∼tens of billions of new devices), and better quality
of service [1, 5, 6].
Figure 1.1: Expected worldwide mobile data demand [1]. (Panels: global mobile data traffic in EB per month, 2014-2024, split into 5G and 4G/3G/2G traffic; mobile data traffic per active smartphone in GB per month by region, 2018 vs. 2024.)

The International Telecommunication Union (ITU) has classified the 5G usage
scenarios (i.e., the international mobile telecommunications (IMT) for 2020 and beyond)
into three broad categories [7]: Enhanced mobile broadband (eMBB), ultra-reliable
and low-latency communications (uRLLC), and massive machine type communications
(mMTC). Specifically, eMBB aims to meet the customers’ demand for an increasingly
digital lifestyle, and focuses on services that have high requirements for bandwidth,
such as HD videos, virtual reality (VR), and augmented reality (AR). uRLLC aims to
meet expectations for the demanding digital industry and focuses on latency-sensitive
services, such as assisted and automated driving, and remote management. Finally, mMTC
aims to meet demands for a further developed digital society and focuses on services
that include high requirements for connection density, such as smart city and smart
agriculture. Figure 1.2 illustrates some examples of the envisioned 5G usage scenarios
in IMT 2020 and beyond [8].
According to [7], the IMT 2020 requirements for 5G data rates are: a peak data rate of
20 Gbps in downlink and 10 Gbps in uplink, a user-perceived data rate of 100 Mbps in
downlink and 50 Mbps in uplink, and a spectral efficiency of 30 bps/Hz in downlink and
15 bps/Hz in uplink. In order to achieve such high data rates, some key technologies
are expected to be integral parts of 5G, e.g., network densification with various
small cells for interference management [9, 10, 11], massive multiple input multiple
output (MIMO) for spatial multiplexing [12, 13], and shifting to higher frequencies for
larger bandwidth [14, 15]. In particular, due to the congestion of the sub-6 GHz bands
used by current cellular networks, migration towards millimeter wave (mmWave)
frequency bands (30-300 GHz) is considered the most attractive enabler for 5G and
1.1.1 Distinctive mmWave characteristics
Although the available bandwidth at mmWave frequencies is very large, the propagation
characteristics are significantly different from those of the microwave frequency bands,
and can be briefly summarized as follows [22, 16]:
• Path loss. From Friis’s law, the isotropic path loss increases with the carrier
frequency. As an example, the free-space path loss grows with the square of the
carrier frequency. Thus, in point-to-point communication, one may expect
significantly higher path loss when moving from sub-6 GHz to carrier frequencies
greater than 30 GHz [23].
• Diffraction and blockage. Diffraction leads to wave propagation in the geometrical
shadow region behind obstacles. Diffraction may cause non-negligible multipath
propagation under both line of sight (LOS) and non-LOS conditions [24]. From
electromagnetic theory, it is well understood that electromagnetic waves diffract
poorly around obstacles whose physical dimensions are significantly larger than
the wavelength [25]. Furthermore, signals at microwave frequencies can penetrate
solid materials and buildings more easily than mmWaves. For these reasons,
mmWave signals are affected by shadowing and diffraction to a much greater extent
than microwave signals. For instance, one can observe blockage losses of more than
35 dB due to brick, concrete, etc., and of around 35 dB due to the human body,
whereas these losses are negligible at microwave frequency bands [14].
• Rain attenuation. In general, the losses due to rain attenuation at mmWave
frequency bands are much larger than those at microwave bands. At
a typical mmWave frequency of 73 GHz, one can observe a rain attenuation of
roughly 10 dB/km, which is quite large [26].
• Atmospheric absorption. Field measurement results have shown that mmWave
signals are more susceptible to oxygen absorption than microwave signals.
For instance, one can observe roughly 20 dB loss for mmWave signals around 60 GHz
(see Fig. 1 of [27]).
• Foliage loss. The attenuation of radio signals caused by trees
obstructing the radio link is termed foliage loss. Foliage losses for mmWaves
are significant and can be a limiting factor in some propagation environments.
Empirical results demonstrate that at 10 m foliage penetration, the loss at 80 GHz
reaches around 23.5 dB, which is about 15 dB higher than
that at the 3 GHz microwave frequency [28].
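
To make the frequency dependence of the path loss concrete, the sketch below evaluates the standard free-space path loss between isotropic antennas, FSPL = 20 log10(4πdf/c) in dB. The 100 m link distance and the chosen carrier frequencies are illustrative assumptions, not values from the measurement campaigns cited above, and the function name `fspl_db` is ours.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def fspl_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss between isotropic antennas (Friis), in dB."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / C)


d = 100.0  # link distance in metres (illustrative assumption)
for f_ghz in (3.0, 30.0, 60.0):
    print(f"{f_ghz:5.1f} GHz: {fspl_db(d, f_ghz * 1e9):6.1f} dB")

# Tenfold carrier frequency => 20*log10(10) = 20 dB of extra isotropic loss.
extra_db = fspl_db(d, 30e9) - fspl_db(d, 3e9)
print(f"extra loss from 3 GHz to 30 GHz: {extra_db:.1f} dB")
```

The 20 dB penalty is independent of distance. It covers only the isotropic free-space term and excludes the blockage, rain, absorption and foliage losses listed above.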
1.1.2 Potentials and challenges for mmWave communication
Having understood the distinctive characteristics of mmWave propagation, the
advantages of mmWave communication are self-evident. As mentioned before,
mmWave frequencies allow for larger channel bandwidth allocations, which directly result
in higher data rates and indirectly in reduced network latency. Namely, service
providers will be able to support data-hungry user applications with minimal latency.
Also, mmWave communication can be used in small-cell settings with reduced coverage
area, i.e., to establish more densely packed communication links and exploit spatial
reuse to provide increased capacity gains.
In addition to the large bandwidth and capacity gain, the use of massive MIMO
techniques can provide many extra performance gains for mmWave communication:
• Beamforming gain. The small wavelength at mmWave frequencies makes it possible
to pack a large number of antenna elements in a small form factor, which offers high
beamforming gain to compensate for the excessive free space propagation path
loss.
• Interference suppression. In multi-user systems, the use of multiple antennas at
the transmitter and receiver can significantly increase the transmission directivity,
which in turn helps to alleviate intra-channel interference.
This is achieved via precoding at the transmitter, combining at the receiver, or
a combination of both.
• Diversity gain. Spatial diversity can be exploited in mmWave systems with multiple
antennas at both ends to mitigate the impact of fluctuations and loss of signals in
the channel.
• Multiplexing gain. With multiple antennas at the transmitter, parallel streams can
be transmitted to the users without using additional bandwidth or power. This
increases the number of spatial dimensions for communication.
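
The beamforming-gain argument above can be illustrated with a small sketch that counts how many half-wavelength-spaced elements fit into a fixed square aperture and evaluates the ideal coherent array gain 10 log10(N). The 5 cm aperture, the half-wavelength spacing, the ideal-gain model and the helper names are all illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def elements_per_side(aperture_m: float, freq_hz: float) -> int:
    """Elements that fit along one side at half-wavelength spacing."""
    spacing_m = (C / freq_hz) / 2.0
    return max(1, math.floor(aperture_m / spacing_m))


def array_gain_db(num_elements: int) -> float:
    """Ideal array gain of N coherently combined elements, in dB."""
    return 10.0 * math.log10(num_elements)


aperture_m = 0.05  # 5 cm x 5 cm square aperture (illustrative assumption)
for f_ghz in (3.0, 30.0):
    side = elements_per_side(aperture_m, f_ghz * 1e9)
    n = side * side
    print(f"{f_ghz:4.1f} GHz: {side} x {side} = {n} elements, "
          f"ideal gain {array_gain_db(n):.1f} dB")
```

Under these assumptions, the same aperture that holds a single element at 3 GHz holds a 10 x 10 array at 30 GHz. Its ideal 20 dB gain at one end of the link matches the 20 dB of additional isotropic free-space loss incurred by the tenfold increase in carrier frequency.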
However, despite the great potential associated with mmWave communication, a
number of challenges need to be addressed so as to be able to exploit these benefits.
• Power consumption. In a conventional fully-digital massive MIMO structure, a
radio frequency (RF) chain is dedicated to each antenna element, which would
require a large number of RF chains at mmWave frequencies and consequently
impose prohibitive power consumption.
• Hybrid transceiver architecture. A common solution to reduce the power
consumption in mmWave systems is to utilize a hybrid transceiver architecture.
Namely, each RF chain is cascaded with a digital baseband unit, which leads to
a much lower number of RF chains in comparison with the number of antennas.
Accordingly, traditional channel estimation / precoding / combining approaches
devised for fully-digital MIMO architectures are not applicable to mmWave hybrid
MIMO architectures. This is because the transceivers lack direct access to each
antenna element owing to the limited number of RF chains. In addition, the
channel matrix is large because large arrays are employed in mmWave systems, and
the signal to noise ratio (SNR) is fairly low as a result of severe path loss before
beamforming.
• User mobility. A major challenge that comes with user mobility in mmWave
transmissions is the significant fluctuation of the channel coefficients, since the channel
coherence time in the mmWave range is very short, corresponding to a large Doppler
spread. Thus, the signaling schemes used in mobile mmWave communication
must take into account the fast time-varying channel states.
• Integrated circuit (IC) design. Additional factors that need to be considered
when designing ICs for mmWave systems with high carrier frequencies and wide
bandwidth include non-linear distortions in the power amplifiers (PAs), phase
noise and IQ (in-phase and quadrature) imbalance, because the severity of these
errors scales up with high frequency transmissions.
• mmWave relaying. With the increasing interest in developing small cells for
mmWave communication, how to use relays to increase coverage and to support
mmWave wireless backhaul for dense small cell deployments remains a big challenge.
The main concerns include how to guarantee efficient beam scheduling,
high data rate, network stability, low latency, etc.
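
The user-mobility challenge listed above can be quantified with the maximum Doppler shift f_D = v f_c / c. The sketch below also uses the coarse rule of thumb T_c ≈ 1/f_D for the coherence time, which is only one of several common approximations. The user speed, the carrier frequencies and the function names are illustrative assumptions.

```python
C = 299_792_458.0  # speed of light in m/s


def max_doppler_hz(speed_mps: float, carrier_hz: float) -> float:
    """Maximum Doppler shift for a terminal moving at speed_mps."""
    return speed_mps * carrier_hz / C


def coherence_time_s(speed_mps: float, carrier_hz: float) -> float:
    """Rule-of-thumb coherence time T_c ~ 1 / f_D (coarse approximation)."""
    return 1.0 / max_doppler_hz(speed_mps, carrier_hz)


speed = 10.0  # user speed in m/s, i.e. 36 km/h (illustrative assumption)
for f_ghz in (3.0, 30.0, 60.0):
    fc = f_ghz * 1e9
    print(f"{f_ghz:4.1f} GHz: f_D = {max_doppler_hz(speed, fc):7.1f} Hz, "
          f"T_c ~ {coherence_time_s(speed, fc) * 1e3:.2f} ms")
```

Since f_D scales linearly with the carrier frequency, moving from 3 GHz to 30 GHz at the same speed shortens the rule-of-thumb coherence time by a factor of ten. This is why signaling schemes for mobile mmWave links cannot assume a channel that stays constant over long training periods.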
1.2 Related works in the state of the art
Due to the great potential of mmWave communications, multiple international
standardization efforts have emerged. In particular, IEEE 802.15.3c
specifies the physical layer and MAC (Media Access Control) layer for indoor wireless
personal area networks (WPAN, also referred to as piconets) in the unlicensed 60 GHz
band, where a piconet is composed of several wireless nodes and a single piconet controller [29].
IEEE 802.11ad specifies the physical layer and MAC layer for local area networks (LAN)
in the 60 GHz band to support multi-Gbps wireless applications [30]. In particular for
the physical layer, two operating modes are defined, the orthogonal frequency division
multiplexing (OFDM) mode for high performance applications (e.g., high data rate), and
the single carrier (SC) mode for low power and low complexity implementation [29, 6].
Additionally, 5G NR (new radio), which is designed to be the global standard for a unified
and more capable 5G wireless air interface, specifies the capability to use mmWave bands
to achieve high data rates, enhanced network energy performance, forward compatibility,
low latency, and a beam-centric design that allows for a massive number of antennas [3]. I
refer the reader to [3, 6, 29, 30] for a more detailed review of the corresponding standards.
In addition to the aforementioned standardization activities, many research efforts
have also been put into mmWave system design. In terms of the different communication
phases in a wireless system, we classify the related literature into four categories, i.e.,
the initial access phase, the data communication phase, the relay networking and finally
the hardware aspect.
An essential component to obtain large antenna gains at mmWave frequencies consists
of identifying suitable narrow beam combinations in the initial access phase. The problem
of finding an AoA-AoD (angle of arrival, angle of departure) pair is referred to as beam
alignment (BA). The inefficiency of naive exhaustive search and the sparse nature of
mmWave channels have motivated a large variety of BA approaches in the literature, e.g.,
the multi-level hierarchical schemes [31, 32, 33, 34, 35], the compressed sensing schemes
[36, 37, 38, 39], etc. All these algorithms suffer from some limitation, e.g., they do
not scale to multi-user scenarios, assume a channel that stays invariant over a long
time, or are limited to single-side training.
A large number of works on hybrid architectures have investigated the data
communication phase for mmWave systems with an assumption of full channel state
information (CSI) [40, 41, 42, 43, 44, 45, 46, 47], namely, the vectors of baseband
complex channel coefficients at each array element are known. These works focus on
the optimization of the hybrid precoder using the full CSI knowledge. Unfortunately,
this assumption is obviously not feasible in a realistic system, since in order to acquire
such coefficients, one should be able to sample each antenna element, i.e., one would
need an RF chain per antenna element or exhaustively measure all elements successively.
Obviously, the former is prohibitively power consuming and the latter is prohibitively
time consuming.
While relays on sub-6 GHz bands suffer from severe interference due to their
omnidirectional transmissions, the directivity of mmWave antennas significantly
mitigates interference, especially in backhaul systems [19, 48]. Recently many efforts
have also been made to study the mmWave relay network regime with an emphasis on
one or several aspects, such as relay selection, congestion control, routing, scheduling and
so on [17, 49, 19, 50, 48, 51]. However, we observe that the existing works
have certain limitations, e.g., the restriction to single path streaming, the neglect
of source admission control, etc. In particular, a fundamental information theoretical
understanding of the maximum potential of mmWave relay networks remains largely
unexplored.
For the hardware aspects, a common theme that underlies most of the hybrid
mmWave works is that the fully-connected (FC) architecture outperforms the subarray
architecture only at the cost of a higher hardware complexity. However, many reference
works [52, 41, 40, 46, 43] have ignored hardware impairments [42], such as the power
dissipation, the PA nonlinear distortion and so on [53]. In particular, the nonlinear
PAs employed at the BS can drastically distort the transmit signal when operated close
to saturation [54]. To this end, a certain power backoff from the saturation power of
a PA should be considered accordingly for different signaling schemes and transceiver
architectures, such that the PAs can always work in their linear operating region.
As we can see, although many studies have been dedicated to mmWave communication
in the last decade, there are still many research gaps regarding a practical mmWave
implementation.
1.3 Contributions and structure of this thesis
This thesis is an accumulation of publications. It is based on four selected journal papers
(three published papers after peer-reviewing [55, 56, 57] and one to-be-submitted journal
manuscript [58]), which I wrote as first author. These four journal papers constitute the
four main chapters (Chapter 3 - Chapter 6) of this thesis.
An overview of the thesis structure and the contributions of each chapter is given
below.
•
Chapter 1 is the introduction, which provides the background of mmWave
communication as well as its distinctive characteristics, potential, challenges
and the state of the art.
•
Chapter 2 provides a description of mmWave wireless communication systems as
well as the relevant concepts. The mathematical channel and signaling models for
mmWave multi-user MIMO (MU-MIMO) are also provided in this chapter so as
to prepare the reader for the technical subjects covered in this thesis.
•
Chapter 3 studies the initial beam alignment (BA) problem for OFDM mmWave
systems. This chapter presents an efficient BA scheme, which explores the AoA-
AoD channel domain through pseudo-random multi-finger beam patterns, and
then constructs an estimate of the resulting channel second-order statistics. The
resulting under-determined system of equations is efficiently solved by using the
technique of non-negative least-squares (NNLS). As a result of quadratic channel
measuring, the proposed scheme is highly robust to variations of the channel
time-dynamics compared with the concurrent approaches in the literature. Also,
since all the estimations take place in the downlink, the proposed approach has a
strong scalability for multi-user scenarios.
•
Chapter 4 is a horizontal extension of Chapter 3, which studies the BA problem
for single-carrier (SC) mmWave systems. In this chapter, we propose a new BA
scheme where the base station (BS) periodically probes the channel in the downlink
via a pre-specified pseudo-random beamforming codebook and pseudo-random
spreading codes, letting each user equipment (UE) estimate its strongest path
direction. This scheme again formulates the BA problem as the estimation of a
sparse non-negative second-order statistic channel vector and then uses the NNLS
technique to efficiently find the strongest AoA-AoD pair connecting each UE to the
BS. The proposed scheme is completely done in time domain and is highly robust to
fast channel variations caused by the large Doppler spread between the multipath
components. Furthermore, this chapter will show that after achieving BA, the
beamformed channel is essentially frequency-flat, such that SC communication
needs no equalization in the time domain.
•
Chapter 5 focuses on data communication after BA is achieved. This chapter
presents two typical hybrid digital analog (HDA) mmWave antenna architectures
that can be regarded as two extreme cases, namely, the fully-connected (FC) and
the one-stream-per-subarray (OSPS) architectures. A joint evaluation of the initial
BA and the consequent data communication is considered, where the latter takes
place by using the beam direction information obtained by the former. A family
of MU-MIMO precoding schemes is investigated, adapted to the hybrid
architectures and the beam information extracted from the BA phase. In addition,
the power efficiency of the two hybrid architectures is also evaluated by taking
into account the power dissipation at different hardware components as well as the
power backoff under typical power amplifier constraints. The main conclusion of
this chapter is that the two architectures achieve similar sum spectral efficiency,
while the OSPS architecture is advantageous with respect to the FC case in terms
of hardware complexity and power efficiency, at the sole cost of a slightly longer
BA time-to-acquisition due to its reduced beam angle resolution.
•
Chapter 6 studies the relay networking for mmWave wireless systems. Although the
optimal beam directions for each node pair can be obtained through a BA phase,
how to efficiently schedule the beams, in terms of avoiding the queuing explosion
as well as assuring large data rates and small end-to-end delays is the main focus
of this chapter. More precisely, this chapter studies the beam scheduling problem
for mmWave half-duplex (HD) relay networks, where the relay topology can be
arbitrary and a link is active only if both nodes focus their beams to face each
other. The approximate information theoretical Shannon capacity is introduced to
help understand the maximum potential of the underlying networks. Based on
the theoretically optimal schedule results, a prior network simplification procedure
is implemented to reduce the network topology complexity, on top of which two
practical beam scheduling schemes, i.e., the deterministic edge coloring (EC)
scheduler and the adaptive backpressure (BP) scheduler are presented. The former
is a very simple one-time computation followed by periodic state repetition, and hence
is more suitable for static scenarios. The latter is an “online” approach which
updates in every time slot, and thus is more favorable for time-varying scenarios.
Both of the proposed schedulers can achieve much smaller queuing backlogs,
much smaller backlog fluctuations, and much lower packet end-to-end delays in
comparison with the reference baseline scheme.
•
Chapter 7 finally concludes the thesis and provides suggestions for future work.
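The non-negative least-squares (NNLS) step mentioned in the summaries of Chapters 3 and 4 can be illustrated on a toy problem. The sketch below is a generic projected-gradient solver, not necessarily the solver used in the thesis; the function name, the 3 × 5 matrix `A` (standing in for pseudo-random beam patterns) and the vector `b` (standing in for average received powers) are hypothetical. It is only meant to show how the non-negativity constraint can single out a sparse solution of an under-determined system.

```python
def nnls_projected_gradient(A, b, iterations=5000):
    """Approximately solve min ||A x - b||^2 subject to x >= 0."""
    rows, cols = len(A), len(A[0])
    # Safe step size: the squared Frobenius norm upper-bounds the largest
    # eigenvalue of A^T A, which is the Lipschitz constant of the gradient.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * cols
    for _ in range(iterations):
        residual = [sum(A[r][c] * x[c] for c in range(cols)) - b[r]
                    for r in range(rows)]
        grad = [sum(A[r][c] * residual[r] for r in range(rows))
                for c in range(cols)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[c] - step * grad[c]) for c in range(cols)]
    return x


# Hypothetical example: 3 measurements, 5 candidate directions, one active path.
A = [[1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0],
     [1, 1, 0, 1, 0]]
b = [2.0, 2.0, 0.0]          # generated by x = [0, 0, 2, 0, 0]
x_hat = nnls_projected_gradient(A, b)
strongest = max(range(len(x_hat)), key=lambda i: x_hat[i])
```

In this example the third measurement is zero, so non-negativity forces the first, second and fourth components to vanish, and the remaining equations then determine the solution uniquely even though there are fewer equations than unknowns. The index `strongest` plays the role of the strongest AoA-AoD pair.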
1.4 Notations
Vectors, matrices and scalars are denoted by boldface small letters (e.g., $\mathbf{a}$), boldface
capital letters (e.g., $\mathbf{A}$) and non-boldface letters (e.g., $a$, $A$), respectively. Sets are
denoted by calligraphic letters (e.g., $\mathcal{A}$), with cardinality denoted by $|\mathcal{A}|$. The empty set is
denoted by $\emptyset$. $\mathbb{E}$ denotes the expectation, $\otimes$ the Kronecker product, $\odot$ the Hadamard
product, and $\circledast$ continuous-time convolution. $\mathbf{A}^T$ denotes the transpose, $\mathbf{A}^*$ the
conjugate, and $\mathbf{A}^H$ the conjugate transpose of a matrix $\mathbf{A}$. The
complex circularly symmetric Gaussian distribution with mean $\mu$ and variance $\gamma$ is
denoted by $\mathcal{CN}(\mu, \gamma)$. For an integer $K \in \mathbb{Z}_+$, the shorthand notation $[K]$ is used to
represent the set of positive integers $\{1, ..., K\}$.
2 System Model for mmWave MU-MIMO
5G promises great flexibility to support a myriad of Internet Protocol (IP) devices,
small cell architectures, and dense coverage areas. Applications envisioned for 5G
include the Tactile Internet, vehicle-to-vehicle communication, vehicle-to-infrastructure
communication, as well as peer-to-peer and machine-to-machine communication, all of
which will require extremely low network latency and on-call demand for large bursts of
data over minuscule time epochs. Figure 2.1 shows how backhaul connects the fixed
cellular infrastructure (e.g., BS) to the core telephone network and the Internet [5].
Figure 2.1:
An illustration of 5G small cells, edge servers, wireless backhaul, and
multi-tier architecture model [5].
As we have discussed before, to address the ever increasing data demand, the wireless
industry for 5G is moving to mmWave frequencies, since for the backhaul/fronthaul,
14 2.2 Channel model
equals to a half-wavelength $\lambda/2$, the elements of $\mathbf{a}_T(\theta_{k,l})$ and $\mathbf{a}_R(\phi_{k,l})$ are given by
$$[\mathbf{a}_T(\theta)]_{(i'-1)\cdot\hat{M}+d} = e^{j(d-1)\pi\sin(\theta)} \cdot e^{j\Psi(i',\theta)}, \quad d \in [\hat{M}], \tag{2.2a}$$
$$[\mathbf{a}_R(\phi)]_n = e^{j(n-1)\pi\sin(\phi)}, \quad n \in [N], \tag{2.2b}$$
where in (2.2a) we assume that ($i' \equiv 1$, $\hat{M} = M$) for the FC architecture as shown
in Figure 2.3(a), and ($i' \in [M_{\mathrm{RF}}]$, $\hat{M} = \frac{M}{M_{\mathrm{RF}}}$) for the OSPS architecture as shown in
Figure 2.3(b). The additional term $\Psi(i', \theta)$ in (2.2a) takes into account the phase shifts
among different subarrays, given by
$$\Psi(i', \theta) = \frac{2\pi}{\lambda}(i'-1) \cdot D_x \cdot \sin(\theta), \tag{2.3}$$
where $i'$ indicates the index of the subarrays and $D_x \geq 0$ denotes the subarray center-to-center
spacing in the scan direction. Hence, in the special case with $D_x = 0$, all the
subarrays are co-located; while with $D_x = \frac{M}{M_{\mathrm{RF}}} \cdot \frac{\lambda}{2}$, the antenna element layout in the
scan direction for the OSPS architecture is exactly the same as for the FC architecture.
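The array response in (2.2a) and (2.3) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name and the parameters `n_sub` (number of subarrays, 1 for FC and $M_{\mathrm{RF}}$ for OSPS) and `dx_over_lambda` ($D_x$ expressed in wavelengths) are naming assumptions made for illustration. It confirms the statement above that the OSPS layout with $D_x = \hat{M}\lambda/2$ coincides with the FC layout.

```python
import cmath
import math


def tx_array_response(theta, M, n_sub=1, dx_over_lambda=0.0):
    """Transmit array response following (2.2a) and (2.3).

    n_sub = 1 reproduces the FC case (i' = 1, M_hat = M); n_sub = M_RF gives
    the OSPS case with M_hat = M / M_RF elements per subarray.
    """
    m_hat = M // n_sub
    response = []
    for i in range(1, n_sub + 1):                      # subarray index i'
        # Phase offset of subarray i' relative to the first one, eq. (2.3).
        psi = 2 * math.pi * (i - 1) * dx_over_lambda * math.sin(theta)
        for d in range(1, m_hat + 1):                  # element index d
            phase = (d - 1) * math.pi * math.sin(theta) + psi
            response.append(cmath.exp(1j * phase))
    return response


theta = 0.3                                            # arbitrary AoD in radians
fc = tx_array_response(theta, M=8)
# OSPS with 2 subarrays of 4 elements, spaced D_x = M_hat * lambda / 2 = 2 lambda.
osps_spread = tx_array_response(theta, M=8, n_sub=2, dx_over_lambda=2.0)
# OSPS with co-located subarrays (D_x = 0).
osps_colocated = tx_array_response(theta, M=8, n_sub=2, dx_over_lambda=0.0)
```

With `dx_over_lambda=2.0` the two responses agree element by element, whereas with `dx_over_lambda=0.0` the second subarray simply repeats the first.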
We adopt a block fading model, where the coefficient $\rho_{s,k,l}$ of the $l$-th multipath component
is constant over a short interval (within one slot) and changes from slot to slot
according to the statistics of a wide-sense stationary process characterized by its power spectral
density (Doppler spectrum) [60]. When the channel coherence time (related to the
inverse of the bandwidth of the Doppler spectrum, see [60]) is significantly larger than
the slot duration but equal to or smaller than the (non-consecutive) slot separation in time,
a convenient model is to consider the coefficients as i.i.d. across different slots. Moreover,
the Doppler shift $\nu_{k,l}$ as defined in (2.1) introduces a continuous phase rotation for
each channel sample. Each multipath component (channel tap coefficient) is formed
by the superposition of a large number of micro-scattering components (e.g., due to
rough surfaces) having (approximately) the same AoA-AoD and delay. By the central
limit theorem, it is customary to model the superposition of these many small effects as
Gaussian [61, 62]. Hence, the multipath component coefficients can be modeled as Rice
fading given by
$$\rho_{s,k,l} \sim \sqrt{\gamma_{k,l}} \left( \sqrt{\frac{\eta_{k,l}}{1+\eta_{k,l}}} + \frac{1}{\sqrt{1+\eta_{k,l}}}\, \check{\rho}_{s,k,l} \right), \tag{2.4}$$
where $\gamma_{k,l}$ denotes the overall multipath component strength, $\eta_{k,l} \in [0, \infty)$ indicates the
strength ratio between the specular reflection (or LOS) and the scattered components,
and $\check{\rho}_{s,k,l} \sim \mathcal{CN}(0, 1)$ is a zero-mean unit-variance complex Gaussian random variable
whose value changes in an i.i.d. fashion across different slots. In particular, $\eta_{k,l} \to \infty$
indicates a pure LOS path while $\eta_{k,l} = 0$ indicates a pure scattered path, affected by
Rayleigh fading.
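As a sanity check on (2.4), the specular and scattered parts carry fractions $\eta_{k,l}/(1+\eta_{k,l})$ and $1/(1+\eta_{k,l})$ of the power, so the mean power of $\rho_{s,k,l}$ equals $\gamma_{k,l}$ for any $\eta_{k,l}$. The sketch below draws i.i.d. slot realizations and estimates that mean power; the function name, the seed and the values of `gamma` and `eta` are arbitrary illustrative choices.

```python
import math
import random


def sample_coefficient(gamma, eta, rng):
    """Draw one slot realization of rho_{s,k,l} according to (2.4)."""
    # rho_check ~ CN(0, 1): real and imaginary parts each have variance 1/2.
    rho_check = complex(rng.gauss(0.0, math.sqrt(0.5)),
                        rng.gauss(0.0, math.sqrt(0.5)))
    specular = math.sqrt(eta / (1.0 + eta))
    scattered = rho_check / math.sqrt(1.0 + eta)
    return math.sqrt(gamma) * (specular + scattered)


rng = random.Random(0)
gamma, eta = 2.0, 4.0
samples = [sample_coefficient(gamma, eta, rng) for _ in range(20000)]
mean_power = sum(abs(r) ** 2 for r in samples) / len(samples)

# eta = 0 removes the specular term, leaving a zero-mean (Rayleigh) coefficient.
rayleigh = [sample_coefficient(gamma, 0.0, rng) for _ in range(20000)]
rayleigh_mean = sum(rayleigh) / len(rayleigh)
```

Here `mean_power` should be close to `gamma`, and `rayleigh_mean` close to zero, up to Monte Carlo fluctuations.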
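The AoA-AoDs $(\phi_{k,l}, \theta_{k,l})$ in (2.1) can take on arbitrary values in the continuous
AoA-AoD domain. Following the widely used approach of [63], known as beam-domain
representation, we obtain a finite-dimensional representation of the channel response
(2.1). More precisely, we consider the discrete set of AoA-AoDs
$$\Phi := \left\{ \check{\phi} : (1+\sin(\check{\phi}))/2 = \frac{n-1}{N},\ n \in [N] \right\}, \tag{2.5a}$$
$$\Theta := \left\{ \check{\theta} : (1+\sin(\check{\theta}))/2 = \frac{m-1}{M},\ m \in [M] \right\}. \tag{2.5b}$$
It follows that the corresponding sets $\mathcal{A}_R := \{\mathbf{a}_R(\check{\phi}) : \check{\phi} \in \Phi\}$ and $\mathcal{A}_T := \{\mathbf{a}_T(\check{\theta}) : \check{\theta} \in \Theta\}$
form discrete dictionaries to represent the channel response. For the ULAs considered
in this paper, the dictionaries $\mathcal{A}_R$ and $\mathcal{A}_T$, after suitable normalization, reduce to
the columns of unitary discrete Fourier transform (DFT) matrices $\mathbf{F}_N \in \mathbb{C}^{N \times N}$ and
$\mathbf{F}_M \in \mathbb{C}^{M \times M}$, with elements
$$[\mathbf{F}_N]_{n,n'} = \frac{1}{\sqrt{N}}\, e^{j2\pi(n-1)\left(\frac{n'-1}{N} - \frac{1}{2}\right)}, \quad n, n' \in [N], \tag{2.6a}$$
$$[\mathbf{F}_M]_{m,m'} = \frac{1}{\sqrt{M}}\, e^{j2\pi(m-1)\left(\frac{m'-1}{M} - \frac{1}{2}\right)}, \quad m, m' \in [M]. \tag{2.6b}$$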
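The link between the grid (2.5a), the array response (2.2b) and the DFT matrix (2.6a) can be verified directly: substituting $\sin(\check{\phi}) = 2(n'-1)/N - 1$ into (2.2b) gives the exponent of (2.6a). The sketch below, with hypothetical function names and an arbitrary size `N = 8`, checks this identity and the unitarity of $\mathbf{F}_N$ using only the standard library.

```python
import cmath
import math


def rx_array_response(sin_phi, N):
    """Receive ULA response (2.2b), parameterized by sin(phi)."""
    return [cmath.exp(1j * (n - 1) * math.pi * sin_phi) for n in range(1, N + 1)]


def dft_dictionary(N):
    """Matrix F_N of (2.6a), stored as a list of rows."""
    return [[cmath.exp(2j * math.pi * (n - 1) * ((col - 1) / N - 0.5)) / math.sqrt(N)
             for col in range(1, N + 1)]
            for n in range(1, N + 1)]


N = 8
F = dft_dictionary(N)

# Column n' of F_N versus a_R on the grid (2.5a), scaled by 1 / sqrt(N).
max_dev = 0.0
for col in range(1, N + 1):
    sin_phi = 2 * (col - 1) / N - 1      # from (1 + sin(phi)) / 2 = (n' - 1) / N
    a = rx_array_response(sin_phi, N)
    for n in range(N):
        max_dev = max(max_dev, abs(F[n][col - 1] - a[n] / math.sqrt(N)))

# Unitarity: the Gram matrix F^H F should equal the identity.
gram_err = max(
    abs(sum(F[n][p].conjugate() * F[n][q] for n in range(N)) - (1.0 if p == q else 0.0))
    for p in range(N) for q in range(N)
)
```

Both `max_dev` and `gram_err` should vanish up to floating-point rounding.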
Consequently, based on a subarray basis indexed by $i'$, the beam-domain representation
of the channel response (2.1) is given by [63, 15]
$$\check{\mathbf{H}}^{i'}_{s,k}(t, \tau) = \mathbf{F}_N^H \mathbf{H}_{s,k}(t, \tau) \cdot \left( \mathbf{F}_M \odot \mathbf{1}_{\{(i'-1)\hat{M}+1 : i'\hat{M},\, 1:M\}} \right) = \sum_{l=1}^{L_k} \check{\mathbf{H}}^{i'}_{s,k,l}(t)\, \delta(\tau - \tau_l), \tag{2.7}$$
where ($i' \equiv 1$, $\hat{M} = M$) for the FC architecture, and ($i' \in [M_{\mathrm{RF}}]$, $\hat{M} = \frac{M}{M_{\mathrm{RF}}}$) for the
OSPS architecture. Here we define $\check{\mathbf{H}}^{i'}_{s,k,l}(t) := \mathbf{F}_N^H \mathbf{H}_{s,k,l}(t) \cdot \left( \mathbf{F}_M \odot \mathbf{1}_{\{(i'-1)\hat{M}+1 : i'\hat{M},\, 1:M\}} \right)$
as the beam-domain $l$-th multipath component between the $k$-th UE and the BS, where
$\mathbf{1}_{\{a_1:a_2,\, b_1:b_2\}} \in \mathbb{C}^{M \times M}$ is an indicator matrix, with 1 at the components indexed by rows
from $a_1$ to $a_2$ and by columns from $b_1$ to $b_2$, and zero otherwise. The indicator matrix takes
into account the fact that the number of antenna elements for each subarray in the
OSPS architecture is $M_{\mathrm{RF}}$ times smaller than that in the FC architecture.
2.3 Signaling model
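Let $\mathbf{x}_s(t) = [x_{s,1}(t), x_{s,2}(t), ..., x_{s,K}(t)]^T$ denote the continuous-time baseband equivalent
signal (either pilot or data signal), transmitted from the BS over the $s$-th slot. With
HDA beamforming, the beamformed signal at the output of the transmitter over the
$s$-th slot is generally given by
$$\hat{\mathbf{x}}_s(t) = \sqrt{E_0} \cdot \mathbf{U}^{\mathrm{RF}}_s \cdot \mathbf{W}^{\mathrm{BB}}_s \cdot \mathbf{x}_s(t), \tag{2.8}$$
where for simplicity of exposition we restrict to the case of uniform power allocation, with
$E_0 = \frac{P_{\mathrm{tot}} T_c}{K}$ indicating the per-chip energy of each signal stream, where $P_{\mathrm{tot}}$ denotes the
total radiated power at the BS and $T_c = \frac{1}{B}$ denotes the chip duration with $B$ indicating
the signaling bandwidth. In (2.8), we define $\mathbf{W}^{\mathrm{BB}}_s \in \mathbb{C}^{M_{\mathrm{RF}} \times K}$ and $\mathbf{U}^{\mathrm{RF}}_s \in \mathbb{C}^{M \times M_{\mathrm{RF}}}$ as
the baseband (digital) and the RF analog beamforming matrices, respectively. Note
that, depending on the transmitter architecture, the analog beamforming matrix $\mathbf{U}^{\mathrm{RF}}_s$
takes on the form
$$[\tilde{\mathbf{u}}_{s,1}, \tilde{\mathbf{u}}_{s,2}, \cdots, \tilde{\mathbf{u}}_{s,M_{\mathrm{RF}}}] \quad \text{and} \quad \begin{bmatrix} \tilde{\mathbf{u}}_{s,1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \tilde{\mathbf{u}}_{s,2} & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \tilde{\mathbf{u}}_{s,M_{\mathrm{RF}}} \end{bmatrix} \tag{2.9}$$
for the FC and the OSPS architectures, respectively, where $\tilde{\mathbf{u}}_{s,i} \in \mathbb{C}^{\hat{M}}$, $i \in [M_{\mathrm{RF}}]$, with
$\hat{M} = M$ for the FC architecture and $\hat{M} = \frac{M}{M_{\mathrm{RF}}}$ for the OSPS architecture. Hence, in
both cases $\mathbf{U}^{\mathrm{RF}}_s$ has dimension $M \times M_{\mathrm{RF}}$, but FC has a full matrix, while OSPS has a
block-diagonal matrix, due to the constrained connectivity. Without loss of generality,
the beamforming vectors are normalized as $\sum_{i=1}^{M_{\mathrm{RF}}} \|\mathbf{u}_{s,i}\|^2 = M_{\mathrm{RF}}$.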
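The two structures in (2.9) can be made concrete with a small construction. The sketch below is illustrative only: the function name and the toy vectors are hypothetical, the matrices are stored as plain lists of rows, and the normalization stated above is omitted for brevity.

```python
def analog_beamformer(vectors, fully_connected):
    """Assemble U^RF of (2.9) from the per-RF-chain vectors u~_{s,i}.

    vectors: list of M_RF lists, each of length M (FC) or M / M_RF (OSPS).
    Returns an M x M_RF matrix as a list of rows.
    """
    m_rf = len(vectors)
    m_hat = len(vectors[0])
    if fully_connected:
        # FC: the vectors are the columns of a full M x M_RF matrix.
        return [[vectors[i][r] for i in range(m_rf)] for r in range(m_hat)]
    # OSPS: vector i fills rows i * m_hat .. (i + 1) * m_hat - 1 of column i,
    # and every other entry of that column stays zero (block-diagonal form).
    M = m_hat * m_rf
    U = [[0j] * m_rf for _ in range(M)]
    for i, vec in enumerate(vectors):
        for d, value in enumerate(vec):
            U[i * m_hat + d][i] = value
    return U


# Toy example with M = 4 antennas and M_RF = 2 RF chains (unnormalized values).
U_fc = analog_beamformer([[1, 1j, -1, -1j], [1, -1, 1, -1]], fully_connected=True)
U_osps = analog_beamformer([[1, 1j], [1, -1]], fully_connected=False)
```

Both results have 4 rows and 2 columns; in `U_osps` each column is non-zero only on the rows of its own subarray, which reflects the constrained connectivity.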
The beamformed signal (2.8) goes through the channel as defined in (2.1). At the
UE side, because of the HDA architecture, the UE does not have direct access to
each antenna element. Instead, at each slot $s$, the UE obtains only a projection of
the received signal by applying some beamforming vector in the analog domain. For
notational simplicity, let us consider a single RF chain at each UE, i.e., $N_{\mathrm{RF}} = 1$. The
extension to $N_{\mathrm{RF}} > 1$ is straightforward and will be considered in later sections. Thus,
the received signal at the $k$-th UE side is given by
$$\hat{y}_{s,k}(t) = \mathbf{v}^H_{s,k} \mathbf{H}_{s,k}(t, \tau) \circledast \hat{\mathbf{x}}_s(t) + z_{s,k}(t) = \sqrt{E_0}\, \mathbf{v}^H_{s,k} \mathbf{H}_{s,k}(t, \tau) \circledast \left( \mathbf{U}^{\mathrm{RF}}_s \cdot \mathbf{W}^{\mathrm{BB}}_s \cdot \mathbf{x}_s(t) \right) + z_{s,k}(t), \tag{2.10}$$
where $\mathbf{v}_{s,k} \in \mathbb{C}^N$ denotes the normalized beamforming vector with $\|\mathbf{v}_{s,k}\| = 1$ at the $k$-th
UE, and $z_{s,k}(t)$ is the continuous-time complex additive white Gaussian noise (AWGN)
at the output of the UE RF chain, with a power spectral density (PSD) of $N_0$ Watt/Hz.
In order to clearly describe the channel condition between the BS and a generic UE,
it is useful to first define the channel SNR before beamforming (BBF), $\mathrm{SNR}_{\mathrm{BBF}}$, given by
$$\mathrm{SNR}_{\mathrm{BBF},k} = \frac{P_{\mathrm{tot}} \sum_{l=1}^{L_k} \gamma_{k,l}}{N_0 B}, \tag{2.11}$$
where $k$ is the index of the UE and $\gamma_{k,l}$ denotes the strength of the $l$-th multipath
component. The SNR in (2.11) indicates the ratio of the total received signal power
(summing over all the multipath components) over the total noise power at the receiver
baseband processor input, assuming that the signal is isotropically transmitted by
the BS and isotropically received at the $k$-th UE over the total bandwidth $B$. As
mentioned before, one of the challenges of mmWave communication is that the SNR
before beamforming $\mathrm{SNR}_{\mathrm{BBF}}$ in (2.11) may be very low.
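To give a feeling for the magnitude of (2.11), the following sketch evaluates it for a purely hypothetical link: 1 W of radiated power, two paths with strengths of -120 dB and -126 dB, a noise PSD of about -174 dBm/Hz, and 1 GHz of bandwidth. None of these numbers are taken from the thesis; they are assumptions chosen only to illustrate the formula.

```python
import math


def snr_bbf_db(p_tot_w, gammas, n0_w_per_hz, bandwidth_hz):
    """SNR before beamforming according to (2.11), returned in dB."""
    snr = p_tot_w * sum(gammas) / (n0_w_per_hz * bandwidth_hz)
    return 10.0 * math.log10(snr)


# Hypothetical parameters (not from the thesis).
gammas = [10 ** (-120 / 10), 10 ** (-126 / 10)]   # path strengths, linear scale
n0 = 10 ** ((-174 - 30) / 10)                     # -174 dBm/Hz converted to W/Hz
value_db = snr_bbf_db(1.0, gammas, n0, 1e9)
```

Under these assumptions the result is roughly -5 dB, i.e., the signal is below the noise floor before any beamforming gain is applied, which is the regime the BA schemes of the following chapters have to cope with.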
2.4 Summary
This chapter presents two “extreme” hybrid mmWave antenna architectures, on top of
which the mathematical channel and signaling models are also provided. The main objective
of this chapter is to prepare the reader for the basic mmWave channel mathematics
covered in this thesis.
3 Initial Beam Alignment for mmWave OFDM Systems
3.1 Introduction
To cope with the severe path loss at mmWave frequencies, directional beamforming
both at the BS side and the UE side is necessary in order to establish a strong path
conveying enough signal power. Finding such beamforming directions is referred to as
beam alignment (BA). This chapter presents an efficient BA scheme which can be used
in the initial access phase for mmWave OFDM systems.
3.2 Clarification of each author’s contributions
This chapter is a journal publication, which is a joint work with Saeid Haghighatshoar
and Giuseppe Caire. I wrote this journal article as the first author. The citation information is
given below:
X. Song, S. Haghighatshoar, and G. Caire, “A scalable and statistically robust
beam alignment technique for mm-wave systems,” IEEE Transactions on Wireless
Communications, 2018. DOI: 10.1109/TWC.2018.2831697
All the authors contributed to this paper, but I have implemented all the experiments
and simulations. I also wrote the complete first draft (including all sections) of this
paper.
Saeid Haghighatshoar provided valuable ideas for the channel and signaling model
as well as the mathematical techniques for the channel estimation. He also revised
the English of my first draft.
Giuseppe Caire, who is my PhD supervisor, provided valuable discussions in each
meeting of this work. He also made the final revisions to the overall draft.
3.3 Original journal article
The following article is a reprint of the original journal paper. It is the accepted version
of the paper. The copyright information is given on page xii of this thesis as well as on
the first page of the reprinted paper.
A Scalable and Statistically Robust Beam
Alignment Technique for mm-Wave Systems
Xiaoshen Song, Student Member, IEEE, Saeid Haghighatshoar, Member, IEEE, Giuseppe Caire, Fellow,
IEEE
©2018 IEEE. Reprinted, with permission, from X. Song, S. Haghighatshoar, and G. Caire, “A scalable and statistically robust beam alignment technique
for mm-wave systems,” IEEE Transactions on Wireless Communications, 2018. The published version can be found online:
https://ieeexplore.ieee.org/abstract/document/8356247. This reprint is the accepted version of the paper.
Abstract—Millimeter-Wave (mm-Wave) frequency bands provide an opportunity for much wider
channel bandwidth compared with the traditional sub-6 GHz band. Communication at mm-Waves
is, however, quite challenging due to the severe propagation pathloss incurred by
conventional isotropic antennas. To cope with this problem, directional beamforming both at
the Base Station (BS) side and at the User Equipment (UE) side is necessary in order to
establish a strong path conveying enough signal power. Finding such beamforming directions
is referred to as Beam Alignment (BA). This paper presents a new scheme for efficient BA.
Our scheme finds a strong propagation path identified by an Angle-of-Arrival (AoA) and
Angle-of-Departure (AoD) pair, by exploring the AoA-AoD domain through pseudo-random
multi-finger beam patterns, and constructing an estimate of the resulting second-order
statistics (namely, the average received power for each pseudo-random beam configuration).
The resulting under-determined system of equations is efficiently solved using non-negative
constrained Least-Squares, yielding naturally a sparse non-negative vector solution whose
maximum component identifies the optimal path. As a result, our scheme is highly robust to
variations of the channel time-dynamics compared with alternative concurrent approaches
based on the estimation of the instantaneous channel coefficients, rather than of their
second-order statistics. In the proposed scheme, the BS probes the channel in the Downlink
(DL) and trains simultaneously an arbitrarily large number of UEs. Thus, “beam refinement”,
with multiple interactive rounds of Downlink/Uplink (DL/UL) transmissions, is not needed.
This results in a scalable BA protocol, where the protocol overhead is virtually independent
of the number of UEs since all the UEs run the BA procedure at the same time. Extensive
simulation results illustrate that our approach is superior to the state-of-the-art BA
schemes proposed in the literature in terms of training overhead in multi-user scenarios and
robustness to variations in the channel dynamics.
Index Terms—Millimeter-Wave, Beam Alignment, Compressed Sensing, Non-Negative Least-Squares (NNLS).
I. INTRODUCTION
Communication at millimeter-waves (mm-Waves) provides an opportunity to fulfill the demand
for high data rates in the next generation communication networks because of the large
available bandwidth [1]. A critical challenge to signaling at mm-Waves compared with sub-6 GHz
spectrum is the severe propagation loss when conventional isotropic antennas are used [2].
The standard way to counter the isotropic pathloss consists of using antenna gain at both
the transmitter and the receiver sides. In a mobile environment, such antenna gain is
achieved by electronically steerable antenna arrays, in order to cope with beam direction
changes due to the relative motion of transmitter and receiver. Fortunately, due to the
small wavelength, it is possible to package a large number of antenna elements in a small
form factor, such that large antenna arrays can be implemented at both the Base Station (BS)
side and the User Equipment (UE) side. Moreover, it has been observed experimentally and
modeled mathematically that the propagation channel at mm-Waves is formed by a very sparse
collection of scatterers in the angle domain [3–6]. This implies that, to establish reliable
communication, the BS and the UE need to focus their beams in the direction of a strong
path. For example, in the case of Line-of-Sight (LoS) propagation, the beams must point at
each other since the LoS path is typically the strongest one.
The authors are with the Electrical Engineering and Computer Science Department, Technische
Universität Berlin, 10587 Berlin, Germany (e-mail: [email protected]).
X. Song is sponsored by the China Scholarship Council (201604910530).
More generally, we refer to the problem of finding
a narrow beam direction at both the BS and the user
sides yielding an SNR after beamforming above a desired
threshold as the Beam Alignment (BA) problem. This
problem is quite well studied in the literature [3–16]. In
particular, it is known to be a challenging problem since in
mm-Waves the SNR before beamforming (i.e., in isotropic
propagation conditions) is typically very low, especially
in outdoor non-LoS conditions. Moreover, although the
number of array antennas may be very large, the number
of Radio Frequency (RF) chains is limited, due to the
difficulty of implementing a full RF chain (including
A/D conversion, modulation, and PA/LNA amplification)
for each array element in a very small form factor and
for a very large bandwidth. The small number of RF
chains prevents the implementation of classical digital
beamforming schemes in the baseband domain. Hence, a
widely studied approach consists of Hybrid-Digital-Analog
(HDA) beamforming [7, 17]. In this case, a naive sequential
scanning of the Angle-of-Departure (AoD) and Angle-of-
Arrival (AoA) domains with narrow beams in order to
find an alignment to a strongly connected propagation
path is very time-consuming and would incur a large
initial acquisition protocol overhead, not suited for outdoor
mobile applications [11–18].
A. Related State-of-the-Art
The inefficiency of naive alignment search has motivated
BA algorithms based on hierarchical adaptive search, in-
teractive search, and Compressed Sensing (CS) techniques
[8–16].
The fundamental idea of hierarchical methods is to use
wider beam patterns at the start of the search and to refine
them in several consecutive stages. In [11], for example, the
authors develop a bisection algorithm in which the ranges
of AoDs and AoAs are each divided by a factor of 2 at each step,
and the estimate is refined by probing the resulting 2×2 sections and
identifying the section with the maximum received power.
A similar idea using overlapped beam patterns is used in
[12]. Such hierarchical techniques, however, require the
interaction of the BS with each individual UE, since the
training is bi-directional and involves both Downlink (DL)
probing and Uplink (UL) feedback for each iterative round.
In [13], a method is proposed where the BS and the UE
iteratively and collaboratively identify the dominant eigen-
vector of their channel matrix via the well-known power
method. However, this approach requires demodulating the
signal at each antenna at both the BS and the UE sides.
Therefore, this method is essentially incompatible with the
HDA beamforming structure.
More recently, considering the natural channel sparsity
in the AoA-AoD domain [3–6], CS-based algorithms have
been proposed for BA in mm-Waves [14–16, 19–21]. These
algorithms are efficient and particularly attractive for multi-
user scenarios, but they are based on the assumption that
the instantaneous channel remains invariant during the
whole probing/measuring stage (the same assumption is
also adopted in [11, 12]). This assumption is typically not
satisfied in practice due to the large Doppler spread at mm-
Waves, implying significant time-variations of the chan-
nel coefficients even in conditions of moderate mobility
[22, 23].¹
B. Contributions
In this paper, we propose a novel BA scheme that has
the following advantages compared with the existing works
in the literature:
1) Low-Complexity Beam Direction Estimation: Our
scheme finds a strong propagation path identified by
an AoA-AoD pair, by exploring the AoA-AoD domain
through pseudo-random multi-finger beam patterns, and
constructing an estimate of the resulting second-order
statistics (namely, the average received power for each
pseudo-random beam configuration). The resulting underdetermined
system of equations is efficiently solved using
Non-Negative Least-Squares (NNLS), naturally yielding
a sparse non-negative vector solution whose maximum
component identifies the optimal path.
¹Notice that the channel time-variations are greatly reduced after BA is
achieved, since once the beams are aligned, the effective channel angular
spread is very small [23]. However, before BA is achieved, the channel
variability over time can be large, since even a small motion of a few
centimeters traverses several wavelengths, potentially producing multiple
deep fades [22].
2) System-Level Scalability: In our approach, the BS
actively probes the channel by periodically broadcasting a
beamforming codebook (consisting of a sequence of pseudo-random
beamforming patterns) over reserved beacon slots
in the DL, while all UEs stay in listening mode. Measure-
ments are collected by the UEs, which locally and in-
dependently identify the AoA-AoD of a strong multipath
component. Since there is no need for interaction between
the BS and each UE, the proposed BA scheme is highly
scalable and its overhead and complexity do not grow with
the number of active users in the system.
3) User-Specific Beamforming Codebook: During the
beacon slots, each UE makes use of its own receive
beamforming codebook. The BS needs no knowledge of
such codebook, which can be locally generated by each UE.
We shall show that the optimal angular spreading factor of
the receiver beamforming patterns yielding the fastest BA
acquisition time depends on the pre-beamforming SNR.
Hence, our method has the advantage that the beamforming
codebook of each UE can be individually and locally
tailored, depending on hardware constraints (number of RF
chains) and SNR conditions, without impacting the overall
system functions.
4) Robustness to Variations in Channel Statistics: Our
scheme is based on quadratic measurements (i.e., averaged
received power, yielding estimates of the channel second
order statistics), rather than linear measurements of the
channel coefficient vectors. As such, our scheme is highly
robust to variations in the channel time-dynamics. We
also illustrate via numerical simulations that existing CS-
based algorithms fail to estimate the channel strong path
direction when the channel is significantly time-varying,
i.e., it undergoes several fading cycles during the estimation
period, whereas our scheme performs well for a wide range
of channel dynamics. Using channel second order statistics
for BA is also considered in [24] via the Maximum Like-
lihood (ML) estimation of the channel covariance matrix.
However, in [24] the channel probing signals are transmit-
ted isotropically through a single antenna or via a fixed
beamforming pattern from the BS side. The drawback is
that with isotropic transmission the received SNR at the UE
side might be very low whereas with fixed beamforming
the transmit pattern might not hit any strong multipath
component. Moreover, in [24] the UEs can estimate only
their corresponding AoAs rather than the joint AoA-AoD
pairs of the strong paths. In contrast, our scheme yields the
joint AoA-AoD pairs, allowing full BA at both the BS and
the UE sides.
Notation: We denote vectors by boldface small letters (e.g., $\mathbf{a}$) and matrices by boldface capital letters (e.g., $\mathbf{A}$). Scalars are denoted by non-boldface letters (e.g., $a$, $A$). We represent sets by calligraphic letters such as $\mathcal{A}$ and their cardinality by $|\mathcal{A}|$. We denote the empty set by $\emptyset$. We use $\mathbb{E}$ for the expectation, $\otimes$ for the Kronecker product of two matrices, $\mathbf{A}^{\mathsf{T}}$ for transpose, $\mathbf{A}^{*}$ for conjugate, and $\mathbf{A}^{\mathsf{H}}$ for conjugate transpose of a matrix $\mathbf{A}$. The output of an optimization problem such as $\arg\min_{x \in \mathcal{X}} f(x)$ is denoted by $x^{*}$. The complex circularly symmetric Gaussian distribution with mean $\mu$ and variance $\gamma$ is denoted by $\mathcal{CN}(\mu, \gamma)$. For an integer $k \in \mathbb{Z}$, we use the shorthand notation $[k]$ for the set of positive integers $\{1, ..., k\}$.

[Figure 1: a BS with M antennas, scatterer clusters, and two UEs (UE1, UE2) with N antennas each. The BS broadcasts a pseudo-random codebook (BS → UEs); each UE applies a random codebook followed by NNLS estimation.]
Fig. 1: Illustration of the physical channel model and our
proposed Beam Alignment (BA) scheme.
II. BASIC SETUP
A. Channel Model
We consider a mm-Wave system including a BS equipped with a Uniform Linear Array (ULA) with $M$ antennas and $m \ll M$ RF chains. We consider a generic UE, also equipped with a ULA with $N$ antennas and $n \ll N$ RF chains. We assume that both the BS and UE arrays have antenna spacing $d = \frac{\lambda}{2}$, where $\lambda = \frac{c_0}{f_0}$ is the wavelength, $c_0$ is the speed of light, and $f_0$ is the carrier frequency. We denote by $\theta, \phi \in [-\frac{\pi}{2}, \frac{\pi}{2}]$ the steering angles with respect to the BS and UE arrays. We represent the array responses of the BS and UE to a planar wave coming from the angles $\theta$ and $\phi$ by the $M$-dim and $N$-dim array vectors $\mathbf{a}(\theta) \in \mathbb{C}^{M}$ and $\mathbf{b}(\phi) \in \mathbb{C}^{N}$, respectively, with elements

$$[\mathbf{a}(\theta)]_k = e^{j(k-1)\pi\sin(\theta)}, \quad k \in [M], \tag{1a}$$
$$[\mathbf{b}(\phi)]_l = e^{j(l-1)\pi\sin(\phi)}, \quad l \in [N]. \tag{1b}$$
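As a quick numerical illustration of (1a)-(1b), the following minimal sketch evaluates the half-wavelength ULA response in NumPy. The function name `ula_response` and the example angles and array sizes are illustrative choices, not part of the original text.

```python
import numpy as np

def ula_response(angle_rad: float, num_antennas: int) -> np.ndarray:
    """Array response of a half-wavelength ULA as in (1a)-(1b).

    Element k (0-based here, k-1 in the 1-based notation of the text)
    equals exp(j * pi * k * sin(angle)).
    """
    k = np.arange(num_antennas)
    return np.exp(1j * np.pi * k * np.sin(angle_rad))

# Illustrative values (assumptions): M = 8 at the BS, N = 4 at the UE.
a = ula_response(np.deg2rad(30.0), 8)    # BS response a(theta)
b = ula_response(np.deg2rad(-10.0), 4)   # UE response b(phi)
```

Every element has unit modulus, so the squared norm of the response equals the number of antennas; this is the source of the factors $M$ and $N$ in the beamforming gains used later.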
We assume that the communication between the BS and the UE occurs via a collection of sparse multi-path components (MPCs) in the AoA-AoD-delay domain [1], where the $N \times M$ low-pass equivalent impulse response of the channel at symbol time $s$ is given by²

$$\mathbf{H}_s(\tau) = \sum_{l=1}^{L} \rho_{s,l}\, \mathbf{b}(\phi_l)\, \mathbf{a}(\theta_l)^{\mathsf{H}}\, \delta(\tau - \tau_l), \tag{2}$$

where $\rho_{s,l}$ is the random channel gain of the $l$-th MPC at AoA-AoD-delay $(\theta_l, \phi_l, \tau_l)$, $l \in [L]$. Typically the number of significant MPCs satisfies $L \ll \max\{M, N\}$ [2]. In practice there may be a large number of MPCs that convey so little signal power that they can simply be neglected, since in any case they would not be useful for signal transmission even after BA is achieved. Note that in the channel model we made the implicit assumption (very common in most beamforming and array processing literature) that the communication bandwidth $B$ is much smaller than the carrier frequency $f_0$, such that the array responses in (1) are essentially constant over $f \in [f_0 - B/2, f_0 + B/2]$. We adopt a block fading model, where the channel gains $\rho_{s,l}$, $l \in [L]$, remain invariant over the channel coherence time $\Delta t_c$ but change randomly across different coherence times according to a given wide-sense stationary process with given Doppler power spectral density [25]. We also assume that each MPC is formed by a cluster of micro-scatterers corresponding (roughly) to the same delay and AoA-AoD (see Fig. 1), such that the channel gains $\rho_{s,l} \sim \mathcal{CN}(0, \gamma_l)$ have a zero-mean complex Gaussian distribution.

²Consistently with the current technology trend in mm-Wave systems, in this paper we focus on Time-Division Duplexing (TDD), where the UL and the DL communication occur over the same frequency band.
We also assume that the angle coherence time, i.e., the time scale over which the AoA-AoDs of the scatterers $\{(\theta_l, \phi_l)\}_{l=1}^{L}$ change significantly, is much longer than the channel coherence time $\Delta t_c$. Hence, the angles can be treated as locally constant (but unknown) during the BA phase. This local stationarity of the scattering geometry is widely used in the literature and confirmed by channel sounding measurements (e.g., see [23, 26]).
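The channel model of this subsection can be sketched numerically as follows. The sketch keeps the angles fixed across coherence blocks and redraws the gains $\rho_{s,l} \sim \mathcal{CN}(0, \gamma_l)$ in each block. As a simplifying assumption it draws the gains independently from block to block (the text only requires a wide-sense stationary process with a given Doppler spectrum), and it omits the delays. All function names, angles, and powers are hypothetical.

```python
import numpy as np

def ula_response(angle_rad, num_antennas):
    """Half-wavelength ULA response, as in (1a)-(1b)."""
    k = np.arange(num_antennas)
    return np.exp(1j * np.pi * k * np.sin(angle_rad))

def sample_block_gains(gammas, rng):
    """Draw rho_l ~ CN(0, gamma_l) for one coherence block (assumed i.i.d. over blocks)."""
    gammas = np.asarray(gammas, dtype=float)
    re = rng.standard_normal(gammas.shape)
    im = rng.standard_normal(gammas.shape)
    return np.sqrt(gammas / 2.0) * (re + 1j * im)

def channel_matrix(rho, thetas, phis, M, N):
    """N x M matrix sum_l rho_l b(phi_l) a(theta_l)^H, i.e. (2) with delays omitted."""
    H = np.zeros((N, M), dtype=complex)
    for r, th, ph in zip(rho, thetas, phis):
        H += r * np.outer(ula_response(ph, N), ula_response(th, M).conj())
    return H

rng = np.random.default_rng(0)
M, N = 16, 8
thetas = np.deg2rad([20.0, -35.0, 50.0])   # AoDs, constant over the BA phase
phis = np.deg2rad([-5.0, 40.0, 10.0])      # AoAs, constant over the BA phase
gammas = [1.0, 0.3, 0.1]                   # illustrative MPC powers gamma_l

H_block0 = channel_matrix(sample_block_gains(gammas, rng), thetas, phis, M, N)
H_block1 = channel_matrix(sample_block_gains(gammas, rng), thetas, phis, M, N)
```

The two matrices differ in their instantaneous coefficients but share the same angular structure, which is the property the second-order-statistics approach relies on.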
B. Signaling Model
Consider the communication between the BS and a generic UE. Since the BS has $m$ RF chains, it can transmit up to $m$ different data streams. For a given signaling interval $t_0$, let $x_{s,i}(t)$, $t \in [s t_0, (s+1) t_0)$, be the continuous-time baseband equivalent signal corresponding to the $i$-th data stream. We assume that the channel is time-invariant over each symbol, i.e., $t_0 < \Delta t_c$. To transmit the $i$-th data stream, the BS applies a beamforming vector $\mathbf{u}_{s,i} \in \mathbb{C}^{M}$. Without loss of generality, the beamforming vectors are normalized such that $\|\mathbf{u}_{s,i}\| = 1$.³ The (baseband equivalent) transmitted signal at symbol time $s$ is given by

$$\mathbf{x}_s(t) = \sum_{i=1}^{m} x_{s,i}(t)\, \mathbf{u}_{s,i}. \tag{3}$$

The corresponding received signal at the UE array is

$$
\begin{aligned}
\mathbf{r}_s(t) &= \int \mathbf{H}_s(\tau)\, \mathbf{x}_s(t - \tau)\, d\tau \\
&= \sum_{l=1}^{L} \sum_{i=1}^{m} \rho_{s,l}\, x_{s,i}(t - \tau_l)\, \mathbf{b}(\phi_l)\, \mathbf{a}(\theta_l)^{\mathsf{H}} \mathbf{u}_{s,i} \\
&= \sum_{l=1}^{L} \sum_{i=1}^{m} \rho_{s,l}\, g^{\rm BS}_{s,l,i}\, x_{s,i}(t - \tau_l)\, \mathbf{b}(\phi_l),
\end{aligned} \tag{4}
$$

where $g^{\rm BS}_{s,l,i} := \mathbf{a}(\theta_l)^{\mathsf{H}} \mathbf{u}_{s,i}$ denotes the beamforming gain along the $l$-th MPC at the BS side for the $i$-th RF chain.

³Also, note that here we are assuming that the beamforming vectors $\mathbf{u}_{s,i}$, $i \in [m]$, are implemented in the RF domain via an analog beamforming network and therefore they are frequency flat, i.e., they are constant over the whole signal bandwidth.
As stated before, we assume that the UE is also equipped with $n$ RF chains and the analog RF signal received at the UE antenna array is distributed into these chains for demodulation. This is achieved by signal splitters that divide the signal power by a factor of $n$. The noise in the receiver is mainly introduced by the RF chain electronics (filter, mixer, and A/D conversion). It follows that the noisy received signal at the output of the $j$-th RF chain at the UE side is given by

$$
\begin{aligned}
y_{s,j}(t) &= \frac{1}{\sqrt{n}}\, \mathbf{v}_{s,j}^{\mathsf{H}} \mathbf{r}_s(t) + z_{s,j}(t) \\
&= \frac{1}{\sqrt{n}} \sum_{l=1}^{L} \sum_{i=1}^{m} \rho_{s,l}\, g^{\rm BS}_{s,l,i}\, x_{s,i}(t - \tau_l)\, \mathbf{v}_{s,j}^{\mathsf{H}} \mathbf{b}(\phi_l) + z_{s,j}(t) \\
&= \sum_{i=1}^{m} \frac{1}{\sqrt{n}} \sum_{l=1}^{L} \rho_{s,l}\, g^{\rm BS}_{s,l,i}\, g^{\rm UE}_{s,l,j}\, x_{s,i}(t - \tau_l) + z_{s,j}(t),
\end{aligned} \tag{5}
$$

where $\mathbf{v}_{s,j} \in \mathbb{C}^{N}$ denotes the normalized beamforming vector of the $j$-th RF chain at the UE side, where $g^{\rm UE}_{s,l,j} := \mathbf{v}_{s,j}^{\mathsf{H}} \mathbf{b}(\phi_l)$ denotes the array gain of the $j$-th RF chain along the $l$-th MPC, and where $z_{s,j}(t)$ is the continuous-time complex Additive White Gaussian Noise (AWGN) at the output of the $j$-th RF chain, with Power Spectral Density (PSD) of $N_0$ Watt/Hz. The factor $1/\sqrt{n}$ in (5) accounts for the power split described above.
In this paper, we consider OFDM signaling with given subcarrier separation $\Delta f$; hence, each symbol $x_{s,i}(t)$ in the general model defined before corresponds here to an OFDM symbol. The number of subcarriers is given by $F := B/\Delta f$, where $B$ denotes the channel bandwidth as defined before. We make the standard assumption that the duration $\tau_{\rm cp}$ of the Cyclic Prefix (CP) of the OFDM modulation is longer than the channel delay spread, implying $t_0 = 1/\Delta f + \tau_{\rm cp}$ with $\tau_{\rm cp} \ge \max\{\tau_l\} - \min\{\tau_l\}$. Hence, after OFDM demodulation, the Inter-Block Interference is completely removed and we can focus on a per-symbol model in the frequency domain [25]. Also, for simplicity, we neglect the effect of pulse-shaping in the OFDM signaling and assume a frequency-flat pulse response. Applying the Fourier transform to the matrix-valued channel impulse response (2), the frequency-domain channel matrix at symbol interval $s$ is given by

$$\check{\mathbf{H}}_s(f) = \sum_{l=1}^{L} \rho_{s,l}\, \mathbf{b}(\phi_l)\, \mathbf{a}(\theta_l)^{\mathsf{H}}\, e^{-j 2\pi f \tau_l}. \tag{6}$$
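We denote the OFDM subcarriers as $\{f_\omega = \frac{\omega}{t_0} : \omega \in [F]\}$. The channel matrix at subcarrier $\omega$ is given by $\mathbf{H}_s[\omega] := \check{\mathbf{H}}_s(f_\omega)$. Let $\check{x}_{s,i}[\omega]$ denote the frequency-domain data symbol for the $i$-th stream. Applying OFDM demodulation to the received signal (5), we obtain the corresponding frequency-domain received signal at the $j$-th receiver RF chain, with transmit beamforming vector $\mathbf{u}_{s,i}$ and receive beamforming vector $\mathbf{v}_{s,j}$, in the form

$$
\begin{aligned}
\check{y}_{s,i,j}[\omega] &= \frac{1}{\sqrt{n}}\, \mathbf{v}_{s,j}^{\mathsf{H}} \mathbf{H}_s[\omega]\, \mathbf{u}_{s,i}\, \check{x}_{s,i}[\omega] + \check{z}_{s,j}[\omega] \\
&= \frac{1}{\sqrt{n}} \sum_{l=1}^{L} \rho_{s,l}\, e^{-j 2\pi \frac{\omega}{t_0} \tau_l}\, g^{\rm BS}_{s,l,i}\, g^{\rm UE}_{s,l,j}\, \check{x}_{s,i}[\omega] + \check{z}_{s,j}[\omega],
\end{aligned} \tag{7}
$$

where $\check{z}_{s,j}[\omega] \sim \mathcal{CN}(0, \sigma^2)$ denotes the noise at the $j$-th RF chain of the UE at subcarrier $\omega$, with variance $\sigma^2 = \Delta f N_0$, which we assume is known at each UE [6].

[Figure 2: four surface plots, panels (a) to (d), over the $M \times N$ beamspace grid, with vertical axis from 0 to 1.]
Fig. 2: Illustration of the sparsity of the channel matrix $\check{\mathbf{H}}_s[\omega]$ at an arbitrary subcarrier $\omega$ consisting of 3 off-grid AoA-AoDs with increasing number of antennas for $M = N = 4$ (a), $M = N = 8$ (b), $M = N = 16$ (c), $M = N = 32$ (d).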
C. Beam Alignment
During the DL probing slots (see frame structure discussed in Section III), we assume that the signals corresponding to different transmitted streams $x_{s,i}(t)$ are orthogonal, i.e.,

$$\langle x_{s,i}, x_{s,i'} \rangle := \int_{s t_0}^{(s+1) t_0} x_{s,i}(t)^{*}\, x_{s,i'}(t)\, dt = E_i\, \delta_{i,i'}, \tag{8}$$

where $E_i$ is the energy per symbol for the $i$-th data stream and $\delta_{i,i'}$ is the Kronecker delta symbol (equal to 1 for $i = i'$ and 0 otherwise). For example, this can be obtained in the frequency domain by using OFDM and mapping the different streams onto sets of non-overlapping subcarriers. Letting $r_{s,i,j}(t) := \sum_{l=1}^{L} \rho_{s,l}\, g^{\rm BS}_{s,l,i}\, g^{\rm UE}_{s,l,j}\, x_{s,i}(t - \tau_l)$ denote the signal contribution relative to the $i$-th transmitted data stream of the BS received at the output of the $j$-th RF chain of the UE (see (5)), defining $B_i$ and $P_i = E_i / t_0$ to be the bandwidth and the average power of $x_{s,i}(t)$, respectively, and recalling that $\gamma_l = \mathbb{E}[|\rho_{s,l}|^2]$, the SNR after beamforming (ABF) for the $i$-th data stream received at the $j$-th RF chain at the UE is given by

$$
{\rm SNR}^{\rm ABF}_{i,j} := \frac{\frac{1}{t_0}\, \mathbb{E}\left[ \int_{s t_0}^{(s+1) t_0} |r_{s,i,j}(t)|^2\, dt \right]}{n N_0 B_i}
= \frac{P_i \sum_{l=1}^{L} \gamma_l\, |g^{\rm UE}_{s,l,j}|^2\, |g^{\rm BS}_{s,l,i}|^2}{n N_0 B_i}. \tag{9}
$$
We define the total transmit power of the BS as $P_{\rm tot} = \sum_{i=1}^{m} P_i$. In particular, for equal power allocation ($P_i = P_{\rm tot}/m$) over the streams, we have

$$
{\rm SNR}^{\rm ABF}_{i,j} = \frac{P_{\rm tot} \sum_{l=1}^{L} \gamma_l\, |g^{\rm UE}_{s,l,j}|^2\, |g^{\rm BS}_{s,l,i}|^2}{m\, n\, N_0 B_i}. \tag{10}
$$

For later use, we also define the SNR before beamforming (BBF) by

$$
{\rm SNR}_{\rm BBF} := \frac{P_{\rm tot} \sum_{l=1}^{L} \gamma_l}{N_0 B}. \tag{11}
$$

This is the SNR obtained when a single data stream ($m = 1$) is transmitted through a single BS antenna and is received in a single UE antenna (isotropic transmission) over a single RF chain ($n = 1$) with full-band spreading.
A challenge in mm-Wave communication is that the SNR before beamforming ${\rm SNR}_{\rm BBF}$ in (11) is typically very low. This cannot be increased by simply boosting the transmit power $P_{\rm tot}$, both because of hardware and regulatory limitations and because, in general, we would like to design energy-efficient systems. Another option consists of communicating over a smaller bandwidth $B' < B$. However, it is well-known that this strategy is suboptimal.⁴ In fact, assuming a Gaussian channel with SNR equal to ${\rm SNR}_{\rm BBF}$, Shannon's capacity formula yields that the achievable rate in bit/s when communicating over a bandwidth $B'$ is given by $R = B' \log(1 + (B/B')\, {\rm SNR}_{\rm BBF})$, which is increasing for $0 < B' \le B$. Hence, by using a bandwidth smaller than the available channel bandwidth $B$, the achievable rate is reduced. It follows that the only viable alternative to achieve a reasonable SNR consists in using antenna arrays with a large number of antennas both at the BS and at the UE. The goal of BA is to find good beamforming vectors $\mathbf{u}_s$ and $\mathbf{v}_s$ at the BS and the UE, respectively, in order to boost the SNR by a factor $\approx M$ at the BS side and a factor $\approx N$ at the UE side. This is achieved by aligning the beamforming vectors along the AoA-AoD of a strong MPC of the channel.
⁴This statement holds only in the case where the channel coefficients change sufficiently slowly in time. More generally, for time-varying wideband fading channels, it has been shown (e.g., see [27–30]) that spreading the transmit power over the entire bandwidth is suboptimal and drives the achievable rate to zero for $B \to \infty$. Intuitively, this is due to the inability of the receiver to estimate the fading channel coefficients, as explained in [27]. The issue of optimal signaling in the presence of time-varying fading is quite intricate and goes beyond the scope of this paper. As a matter of fact, when a large beamforming gain is available at both the BS and the UE side, the effective channel coefficients after beam alignment are slowly varying (see [23]) and the SNR after beamforming is large enough, such that the channel can be treated as a standard block-fading AWGN channel with fully known channel coefficients.
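The rate argument of Section II-C can be checked numerically. The short sketch below evaluates $R = B' \log_2(1 + (B/B')\,{\rm SNR}_{\rm BBF})$ for several fractions $B'/B$ under the Gaussian-channel assumption used in the text. The bandwidth and SNR values are illustrative assumptions, and the base-2 logarithm is chosen here to express the rate in bit/s.

```python
import numpy as np

def rate_bits_per_s(b_used_hz, b_total_hz, snr_bbf):
    """Shannon rate when the full transmit power is concentrated on b_used_hz.

    snr_bbf is the SNR over the full bandwidth b_total_hz, so the SNR over
    the smaller band is (b_total_hz / b_used_hz) * snr_bbf.
    """
    return b_used_hz * np.log2(1.0 + (b_total_hz / b_used_hz) * snr_bbf)

B = 1e9                        # assumed channel bandwidth: 1 GHz
snr_bbf = 10 ** (-20 / 10)     # assumed SNR before beamforming: -20 dB
fractions = np.array([0.01, 0.1, 0.5, 1.0])   # B'/B
rates = rate_bits_per_s(fractions * B, B, snr_bbf)

for frac, r in zip(fractions, rates):
    print(f"B'/B = {frac:4.2f}  ->  R = {r / 1e6:8.3f} Mbit/s")
```

The rate grows monotonically with $B'$, but at low SNR it saturates close to the power-limited value $B\,{\rm SNR}_{\rm BBF}/\ln 2$, so shrinking the bandwidth never helps and only beamforming gain can raise the operating SNR substantially.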
D. Sparse Beamspace Representation
The AoA-AoDs $(\theta_l, \phi_l)$ in (6) take continuous values. In this paper we adopt the approximate finite-dimensional (discrete) beamspace representation following the well-known approach of [1, 3, 31]. We consider the discrete set of AoA-AoDs

$$\Theta := \Big\{ \check{\theta} : (1 + \sin(\check{\theta}))/2 = \tfrac{k-1}{M},\ k \in [M] \Big\}, \tag{12a}$$
$$\Phi := \Big\{ \check{\phi} : (1 + \sin(\check{\phi}))/2 = \tfrac{k'-1}{N},\ k' \in [N] \Big\}, \tag{12b}$$

and use the corresponding array responses $\mathcal{A} := \{\mathbf{a}(\check{\theta}) : \check{\theta} \in \Theta\}$ and $\mathcal{B} := \{\mathbf{b}(\check{\phi}) : \check{\phi} \in \Phi\}$ as a discrete dictionary to represent the channel response. For the ULAs considered in this paper, the dictionaries $\mathcal{A}$ and $\mathcal{B}$, after suitable normalization, yield orthonormal bases corresponding to the columns of the unitary DFT matrices $\mathbf{F}_M$ and $\mathbf{F}_N$ [5], where we define the $D$-dimensional DFT matrix with elements

$$[\mathbf{F}_D]_{k,k'} = \frac{1}{\sqrt{D}}\, e^{j 2\pi (k-1) \left( \frac{k'-1}{D} - \frac{1}{2} \right)}, \quad k, k' \in [D]. \tag{13}$$
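Hence, we obtain the beamspace representation of the channel matrix as $\mathbf{H}_s[\omega] = \mathbf{F}_N \check{\mathbf{H}}_s[\omega] \mathbf{F}_M^{\mathsf{H}}$, where

$$\check{\mathbf{H}}_s[\omega] = \sum_{l=1}^{L} \rho_{s,l}\, e^{-j 2\pi \frac{\omega}{t_0} \tau_l}\, \check{\mathbf{b}}(\phi_l)\, \check{\mathbf{a}}(\theta_l)^{\mathsf{H}}, \tag{14}$$

where $\check{\mathbf{a}}(\theta_l) := \mathbf{F}_M^{\mathsf{H}} \mathbf{a}(\theta_l)$ and $\check{\mathbf{b}}(\phi_l) := \mathbf{F}_N^{\mathsf{H}} \mathbf{b}(\phi_l)$ denote the coefficient vectors of the array responses $\mathbf{a}(\theta_l)$ and $\mathbf{b}(\phi_l)$ with respect to the DFT bases, respectively. The $m'$-th entry of $\check{\mathbf{a}}(\theta_l)$ is given by

$$
\begin{aligned}
[\check{\mathbf{a}}(\theta_l)]_{m'} &= \frac{1}{\sqrt{M}} \sum_{i=0}^{M-1} e^{-j 2\pi i \left( \frac{m'-1}{M} - \frac{1}{2} \right)}\, e^{j \pi i \sin(\theta_l)} \\
&= \frac{1}{\sqrt{M}}\, \frac{\sin(\pi \psi_l M)}{\sin(\pi \psi_l)}\, e^{-j \pi \psi_l (M-1)},
\end{aligned} \tag{15}
$$

where $\psi_l = \frac{m'-1}{M} - \frac{1}{2}\sin(\theta_l) - \frac{1}{2}$. A similar expression holds for $\check{\mathbf{b}}(\phi_l)$. It is seen from (15) that $|[\check{\mathbf{a}}(\theta_l)]_{m'}| = \frac{1}{\sqrt{M}} \frac{|\sin(\pi \psi_l M)|}{|\sin(\pi \psi_l)|}$ is a localized kernel around $\theta_l = \sin^{-1}\!\big[\frac{2(m'-1)}{M} - 1\big]$ with a resolution of $\frac{1}{M}$. In general, the AoA-AoDs of the MPCs are not aligned with the discrete grid $\mathcal{G} = \Theta \times \Phi$. However, as the number of antennas $M$ at the BS and $N$ at the UE increases, the DFT bases provide a good sparsification of the channel matrix $\check{\mathbf{H}}_s[\omega]$. This is qualitatively illustrated in Fig. 2 for a channel with $L = 3$ discrete off-grid MPCs. It is seen that, as $M$ and $N$ increase, the resulting representation $\check{\mathbf{H}}_s[\omega]$ becomes more and more sparse.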
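The beamspace construction above can be reproduced with a few lines of NumPy. The sketch below builds the shifted DFT matrix of (13), maps an on-grid and an off-grid array response into beamspace, and measures how much of the off-grid energy falls into the two strongest bins. The function names, the array size $M = 16$, and the chosen angles are illustrative assumptions.

```python
import numpy as np

def dft_dictionary(D):
    """Unitary shifted DFT matrix with entries as in (13), using 0-based indices."""
    rows = np.arange(D)[:, None]     # k - 1
    cols = np.arange(D)[None, :]     # k' - 1
    return np.exp(2j * np.pi * rows * (cols / D - 0.5)) / np.sqrt(D)

def ula_response(angle_rad, num_antennas):
    """Half-wavelength ULA response, as in (1a)-(1b)."""
    k = np.arange(num_antennas)
    return np.exp(1j * np.pi * k * np.sin(angle_rad))

M = 16
F_M = dft_dictionary(M)

# On-grid angle from (12a): sin(theta) = 2 (k' - 1) / M - 1, here k' = 5.
k_prime = 5
theta_grid = np.arcsin(2 * (k_prime - 1) / M - 1)
a_check_on = F_M.conj().T @ ula_response(theta_grid, M)

# Off-grid angle: energy leaks according to the kernel in (15).
theta_off = np.deg2rad(17.0)
a_check_off = F_M.conj().T @ ula_response(theta_off, M)
top2_fraction = np.sort(np.abs(a_check_off) ** 2)[-2:].sum() / M
print(f"fraction of energy in the two strongest bins: {top2_fraction:.3f}")
```

For the on-grid angle the beamspace vector has a single non-zero entry of magnitude $\sqrt{M}$; for the off-grid angle most of the energy still concentrates in the bins adjacent to the true direction, which is the sparsification effect described in the text.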
III. PROPOSED BEAM-ALIGNMENT ALGORITHM
A. High-Level Overview
In the proposed scheme, the channel is periodically probed by the BS while the UEs remain in listening mode. During the listening mode, each UE gathers measurements of the channel, and continues until it has gathered a sufficient number of measurements such that the AoA-AoD of a strong MPC can be reliably identified. After this directional channel estimation is done, the UE tries to announce its identity (user ID) and its beam ID (i.e., the index of the discrete AoD corresponding to the estimated strong MPC) to the BS by sending a control packet. Such a control packet is sent over the Random Access Control CHannel (RACCH), i.e., a dedicated slot in the frame used for random access, as in virtually all cellular standards in use today. During the RACCH, the BS stays in listening mode. If the control packet is successfully decoded, the BS responds with a beamformed ACK using the AoD information extracted from the control packet, over a DL data slot. During the data slots, the UE stays in listening mode using its own estimated beam. It follows that the ACK enjoys the full (two-sided) beamforming gain. At this point, BA is achieved and high-SNR communication can take place over the data slots. An overview of the proposed initial acquisition and BA protocol is illustrated in Fig. 3.
Fig. 4 illustrates the proposed frame structure, consisting
of three parts: the DL beacon slot, the RACCH slot,
and Data Transmission slot. During the DL beacon slots
(corresponding to Fig. 3 #1), the BS probing signal is
formed by a sequence of pseudo-random beam patterns
(referred to as the transmit beamforming codebook), repeated
periodically, and known a priori to all UEs. Each
UE makes measurements of the beacon transmission by
applying its own (individual) sequence of receive beam
patterns (referred to as the receive beamforming codebook).
The number of measurements may differ from user to user,
depending on the individual pre-beamforming SNR and on
the number of receiver RF chains. We will show in the
simulation section that, when a UE is close to the BS,
i.e., its received signal power (SNR) is sufficiently high, it
can use wider beams and take fewer measurement rounds in
time to speed up the estimation. In contrast, when a UE
is far from the BS, i.e., the received signal power (SNR)
is very low, it applies narrower receive beams to achieve
sufficiently large SNR and takes more rounds in time to
collect a sufficient number of measurements. In general, a
UE might not know the SNR of its channel and may
need an adaptive strategy to find a suitable beamwidth
for BA. Nevertheless, since the beacon signal is repeated
periodically, all the users no matter whether they are weak
or strong are able to gather as many measurements as they
need.
During the RACCH slots (Fig. 3 #2), the BS stays in listening mode and uses its $m$ RF chains to form $m$ coarse beam patterns (sectors), covering the whole BS angle domain, in order to provide some receiver beamforming gain. Notice that the control packet in the RACCH may fail because of incorrect directional channel estimation (i.e., the UE beam points in a wrong direction), or because of a collision in the RACCH due to another user, or simply because of statistical fluctuations of the noise, yielding a small but non-zero packet error probability. In all these cases, the BS will not respond with the ACK packet in the data field, and the UE will try again after gathering more beacon measurements. It should be noticed that packet losses in the RACCH are handled in various ways in all cellular standards in operation today, and the RACCH can be dimensioned such that it does not represent a system bottleneck. Furthermore, collisions in the RACCH are not a specific problem of our scheme. In fact, they exist in some form in any scheme for initial acquisition operating in a multiuser environment. Indeed, schemes based on interactive beam refinement, requiring multiple control packets and pilot signals to be sent in both UL and DL, are more prone to such problems than the proposed scheme. Since the RACCH is not specific to the proposed scheme, in the following we shall assume that, when the UE has correctly estimated its best MPC, the control packet is received without errors. This allows us to compare different systems in a simple and direct manner, and to focus on the important and specific aspects of BA.

[Figure 3: BS-UE message sequence diagram with the labels #1 Pilot, Listening Mode, Estimation, #2 Control Packet, #3 ACK, #4 ACK, Random Access, #5 Data, Data transmission.]
Fig. 3: Illustration of the proposed Beam Alignment (BA) process between the BS and a generic UE. The procedures (#2∼#5) are independently done at each UE. All the UEs share the same BS beamforming codebook (#1).
B. BS Channel Probing and UE Sensing
Without loss of generality, we focus on the BA procedure for a generic UE and omit the UE index. Consider the channel matrix $\mathbf{H}_s[\omega]$ between the BS and the UE arrays, as defined in Section II-D, and its beamspace representation $\check{\mathbf{H}}_s[\omega]$ at beacon slot $s \in [T]$ and subcarrier $\omega \in [F]$, where $T$ is the effective period of beam training.

For simplicity of exposition, we assume that the beacon slot contains a single OFDM symbol interval.⁵ At each beacon slot, the BS uses its $m$ RF chains to probe the channel along $m$ beamforming vectors $\mathbf{u}_{s,i}$, $i \in [m]$, by transmitting an OFDM symbol $x_{s,i}(t)$ along each $\mathbf{u}_{s,i}$. We design the beacon OFDM symbols $x_{s,i}(t)$ such that they are mutually orthogonal in the frequency domain and can be separated at the UE side. Thanks to orthogonality, each beacon pilot stream provides the UE with different measurements from the underlying channel. In particular, for each $i \in [m]$ we define a subset $\mathcal{F}_i \subset [F]$ of size $|\mathcal{F}_i| \le F$ such that $\mathcal{F}_i \cap \mathcal{F}_{i'} = \emptyset$ for $i \ne i'$ (see Fig. 4). We choose each subset $\mathcal{F}_i$ to form a "comb" of subcarriers of equal size $|\mathcal{F}_i| = F'$, with sufficient subcarrier separation such that the corresponding channel matrices $\{\mathbf{H}_s[\omega] : \omega \in \mathcal{F}_i\}$ are mutually uncorrelated.

⁵The generalization to multiple OFDM symbols per slot is immediate and slots of $S \ge 1$ OFDM symbols shall be used in the numerical results.

[Figure 4: (top) time-frequency frame over slot index $s$ and subcarrier index $\omega$, showing pseudo-random beam sweeping (BS beacon) slots, Random Access Control CHannel (RACCH) slots, and data slots; (bottom) two subcarrier combs $\omega \in \mathcal{F}_1$ and $\omega \in \mathcal{F}_2$ with large separation.]
Fig. 4: (Top) Frame structure of the proposed BA scheme. Notice that the beacon, RACCH, and data slots are multiplexed in time according to a TDD scheme. The data slot includes both DL and UL subslots. (Bottom) Different beacon signals are orthogonally multiplexed over disjoint sets of subcarriers $\omega \in \mathcal{F}_i$, $i \in [m]$. In the figure's example we have two orthogonal beacon signals on the "blue" and on the "orange" combs of subcarriers.
A main ingredient of our proposed BA scheme is the pseudo-random beamforming codebook transmitted by the BS during the beacon slots, defined as the collection of sets $\mathcal{C}_{\rm BS} := \{\mathcal{U}_{s,i} : s \in [T], i \in [m]\}$, where $\mathcal{U}_{s,i}$ is the angle-domain support (i.e., the subset of quantized angles in the virtual beamspace representation) defining the directions to which the transmit beam pattern $\mathbf{u}_{s,i}$ sends the signal power. We assume $|\mathcal{U}_{s,i}| = \kappa_u \le M$ for all $(s, i)$. The beamforming vectors are given by $\mathbf{u}_{s,i} = \mathbf{F}_M \check{\mathbf{u}}_{s,i}$, where $\check{\mathbf{u}}_{s,i} = \frac{\mathbf{1}_{\mathcal{U}_{s,i}}}{\sqrt{\kappa_u}}$, and where $\mathbf{1}_{\mathcal{U}_{s,i}}$ denotes a vector with 1 at the components in the support set $\mathcal{U}_{s,i}$ and 0 elsewhere. An example of such patterns with the corresponding vector $\check{\mathbf{u}}_{s,i}$ is shown in Fig. 5 (a). The pseudo-random nature of the codebook is due to the fact that the sequences of angular support sets $\{\mathcal{U}_{s,i} : i \in [m], s \in [T]\}$ are generated in a pseudo-random manner.
The second ingredient of our proposed BA algorithm is a local receive codebook at each UE, through which the UE makes measurements in order to estimate the AoA-AoD information of its strong MPCs. Each UE can customize (locally) its own receive beamforming codebook defined by the collection of sets $\mathcal{C}_{\rm UE} := \{\mathcal{V}_{s,j} : s \in [T], j \in [n]\}$, where $\mathcal{V}_{s,j}$ is the angle-domain support defining the directions from which the receive beam pattern $\mathbf{v}_{s,j}$ collects signal power. We assume $|\mathcal{V}_{s,j}| = \kappa_v \le N$ for all $(s, j)$. The beamforming vectors are given by $\mathbf{v}_{s,j} = \mathbf{F}_N \check{\mathbf{v}}_{s,j}$, where $\check{\mathbf{v}}_{s,j} = \frac{\mathbf{1}_{\mathcal{V}_{s,j}}}{\sqrt{\kappa_v}}$. Similar to the parameter $\kappa_u$ at the transmitter side, the parameter $\kappa_v$ controls the spread of the sensing window at the UE side. This is illustrated again in Fig. 5 (a).
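The two codebooks can be sketched as follows. The text only states that the support sets are generated in a pseudo-random manner; drawing each support uniformly without replacement from a seeded generator, as done here, is an assumption made for illustration, and all function names and parameter values are hypothetical.

```python
import numpy as np

def dft_dictionary(D):
    """Unitary shifted DFT matrix with entries as in (13), using 0-based indices."""
    rows = np.arange(D)[:, None]
    cols = np.arange(D)[None, :]
    return np.exp(2j * np.pi * rows * (cols / D - 0.5)) / np.sqrt(D)

def random_support(dim, kappa, rng):
    """Pick kappa distinct beamspace indices (assumed: uniform, no replacement)."""
    return np.sort(rng.choice(dim, size=kappa, replace=False))

def beam_from_support(support, F):
    """Return (beamforming vector, beamspace coefficients) for a support set."""
    coeffs = np.zeros(F.shape[0])
    coeffs[support] = 1.0 / np.sqrt(len(support))   # 1_U / sqrt(kappa)
    return F @ coeffs, coeffs

rng = np.random.default_rng(1)       # a shared seed makes the BS codebook reproducible
M, N, kappa_u, kappa_v = 16, 8, 4, 2
F_M, F_N = dft_dictionary(M), dft_dictionary(N)

U = random_support(M, kappa_u, rng)  # one transmit support set U_{s,i}
V = random_support(N, kappa_v, rng)  # one receive support set V_{s,j}
u, u_check = beam_from_support(U, F_M)
v, v_check = beam_from_support(V, F_N)
```

Because the DFT matrix is unitary, both beamforming vectors have unit norm, and the gain toward any on-grid direction inside the support equals $M/\kappa_u$ (resp. $N/\kappa_v$): a wider support covers more directions at a proportionally lower gain per direction.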
During the $s$-th beacon slot, the UE applies the receive beamforming vector $\mathbf{v}_{s,j}$ to its $j$-th RF chain, obtaining the frequency-domain received signal (after OFDM demodulation) given by (7) for $i \in [m]$ and $\omega \in \mathcal{F}_i$. Note that the $m$ probing signals $x_{s,i}(t)$ are orthogonal in the frequency domain and therefore can be perfectly separated at the receiver. It is convenient to write (7) directly in terms of the beamspace representation as

$$\check{y}_{s,i,j}[\omega] = \frac{1}{\sqrt{n}}\, \check{\mathbf{v}}_{s,j}^{\mathsf{H}}\, \check{\mathbf{H}}_s[\omega]\, \check{\mathbf{u}}_{s,i}\, \check{x}_{s,i}[\omega] + \check{z}_{s,j}[\omega]. \tag{16}$$

The BS total transmit power $P_{\rm tot}$ is allocated equally to all the probing streams $i \in [m]$, all the subcarriers $\omega \in \mathcal{F}_i$, and all the $\kappa_u$ beamspace directions. Hence, the symbols $\{\check{x}_{s,i}[\omega] : \omega \in \mathcal{F}_i\}$ have uniform power distribution with $\mathbb{E}[|\check{x}_{s,i}[\omega]|^2] = \frac{P_{\rm tot}}{m F'} := P_{\rm dim}$ (power per transmit signal dimension). In fact, without loss of generality, we choose the frequency-domain probing symbols to be constant and given by $\check{x}_{s,i}[\omega] = \sqrt{P_{\rm dim}}$.
Considering the beamforming patterns defined by $\mathcal{C}_{\rm BS}$ and $\mathcal{C}_{\rm UE}$, it is not difficult to see that

$$|g^{\rm BS}_{s,l,i}|^2 = |\mathbf{a}(\theta_l)^{\mathsf{H}} \mathbf{u}_{s,i}|^2 = |\mathbf{a}(\theta_l)^{\mathsf{H}} \mathbf{F}_M \check{\mathbf{u}}_{s,i}|^2 \le \frac{M}{\kappa_u}, \tag{17a}$$
$$|g^{\rm UE}_{s,l,j}|^2 = |\mathbf{v}_{s,j}^{\mathsf{H}} \mathbf{b}(\phi_l)|^2 = |\mathbf{b}(\phi_l)^{\mathsf{H}} \mathbf{F}_N \check{\mathbf{v}}_{s,j}|^2 \le \frac{N}{\kappa_v}. \tag{17b}$$

Applying the upper bounds (17a) and (17b) in (10), we obtain the maximum possible SNR for channel estimation in the per-subcarrier observation (16), given by

$$
{\rm SNR}^{\rm CE}_{\rm ABF} := \frac{P_{\rm dim}}{n}\, \frac{M N \sum_{l=1}^{L} \gamma_l}{\kappa_u \kappa_v \sigma^2}
= \frac{M N}{\kappa_u \kappa_v m n} \times \frac{B}{F' \Delta f} \times {\rm SNR}_{\rm BBF}, \tag{18}
$$

where $F'$ denotes the number of subcarriers used per probing stream and $\Delta f$ denotes the subcarrier bandwidth.

By setting $\kappa_u = \kappa_v = 1$ in (18), we obtain the beamforming gain after BA, namely, when aligning the beams along the strongest scatterer. Moreover, (18) highlights the role of the different factors: the first term expresses the power concentration in the spatial domain, i.e., the ratio of the maximal available beamforming gain $MN$ to the total signal dimensions in the spatial multiplexing domain $\kappa_u \kappa_v m n$ over which the signal is spread; the second term corresponds to the power concentration in the frequency domain; the third term is the SNR before beamforming, defined in (11).
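To make the three factors in (18) concrete, the sketch below evaluates the expression in both of its forms. All numerical values (array sizes, bandwidth, subcarrier spacing, and a pre-beamforming SNR of -30 dB) are illustrative assumptions and are not taken from the simulation section; the function names are hypothetical.

```python
import math

def snr_ce_abf(M, N, kappa_u, kappa_v, m, n, B, F_prime, delta_f, snr_bbf):
    """Right-hand form of (18): spatial factor x frequency factor x SNR_BBF."""
    spatial = (M * N) / (kappa_u * kappa_v * m * n)
    spectral = B / (F_prime * delta_f)
    return spatial * spectral * snr_bbf

def snr_ce_abf_from_power(M, N, kappa_u, kappa_v, m, n, F_prime, delta_f,
                          p_tot, gamma_sum, n0):
    """Left-hand form of (18), with P_dim = P_tot / (m F') and sigma^2 = delta_f * N0."""
    p_dim = p_tot / (m * F_prime)
    sigma2 = delta_f * n0
    return (p_dim / n) * (M * N * gamma_sum) / (kappa_u * kappa_v * sigma2)

def to_db(x):
    return 10.0 * math.log10(x)

common = dict(M=32, N=32, m=2, n=2, B=1e9, F_prime=10, delta_f=1e6,
              snr_bbf=10 ** (-30 / 10))
snr_probe = snr_ce_abf(kappa_u=4, kappa_v=4, **common)     # during probing
snr_aligned = snr_ce_abf(kappa_u=1, kappa_v=1, **common)   # single-direction beams

print(f"probing: {to_db(snr_probe):.1f} dB, aligned: {to_db(snr_aligned):.1f} dB")
```

With these assumed numbers, widening the beams from one to four directions on each side costs a factor of 16 in per-measurement SNR, which is the exploration cost discussed in the text.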
The frequency spreading factor $F'$ and angle spreading factors $\kappa_u, \kappa_v$ can be optimized depending on the specific cell topology (e.g., on the size of the cell, which in turn determines the worst-case SNR before beamforming). Clearly, by making $\kappa_u$ (resp., $\kappa_v$) larger, each beam pattern probes (resp., senses) more directions simultaneously, but the total power is spread over all such directions. In contrast, by making $\kappa_u$ (resp., $\kappa_v$) smaller, the beam pattern explores fewer directions but obtains better power concentration in the angle domain. It is also important to notice the effect of $F'$: as we shall see in Section III-C, the AoA-AoD estimator builds some sample-mean statistics by averaging over a sufficiently large number of uncorrelated channel fading realizations over the frequency domain. Hence, a larger $F'$ provides better averaging at the cost of spreading the total power over more subcarriers.⁶

Remark 1: Angular probing schemes via random, pseudo-random, or even adaptive codebooks can also be found in [11, 12, 15, 21]. Our proposed codebook in this paper can be seen as an improved version of those schemes where the width of the beam, i.e., $\kappa_v$, can be individually selected by each UE to achieve an optimal tradeoff between angular exploration and the SNR obtained in each measurement. ♦
C. Channel Estimation at the UE Side
The strong MPCs of the channel correspond to the components $(k, k')$ in the matrix $\check{\mathbf{H}}_s[\omega]$ with large second moment. An immediate consequence of the channel model definition and the standard assumption of uncorrelated MPCs is that the element second moments $\mathbb{E}[|(\check{\mathbf{H}}_s[\omega])_{k,k'}|^2]$ are invariant both with respect to $s$ (time) and with respect to $\omega$ (frequency) [32]. If we had direct access to measurements of the elements of $\check{\mathbf{H}}_s[\omega]$, a naive approach would build estimators for the second moments (sample covariance) and try to identify the largest element. However, this would require a number of RF chains equal to the number of antenna elements. In contrast, we have access only to the projections $\check{\mathbf{v}}_{s,j}^{\mathsf{H}} \check{\mathbf{H}}_s[\omega] \check{\mathbf{u}}_{s,i}$ from the observation in (16).
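⁶ This tradeoff in the choice of the spreading parameters $F'$ and $\kappa_u$, $\kappa_v$ can be seen as an instance of the well-known exploration-exploitation tradeoff in statistics.

Using $\check{x}_{s,i}[\omega] = \sqrt{P_{\mathrm{dim}}}$ in (16), we can write the received beacon symbol observation at the UE as

$$
\begin{aligned}
\check{y}_{s,i,j}[\omega] &= \sqrt{\frac{P_{\mathrm{dim}}}{n}}\, \check{\mathbf{v}}_{s,j}^{\mathsf{H}} \check{\mathbf{H}}_s[\omega] \check{\mathbf{u}}_{s,i} + \check{z}_{s,j}[\omega] \\
&= \sqrt{\frac{P_{\mathrm{dim}}}{n}}\, \big(\check{\mathbf{u}}_{s,i}^{\mathsf{T}} \otimes \check{\mathbf{v}}_{s,j}^{\mathsf{H}}\big) \check{\mathbf{h}}_s[\omega] + \check{z}_{s,j}[\omega] \\
&= \sqrt{\frac{P_{\mathrm{dim}}}{n}}\, \mathbf{g}_{s,i,j}^{\mathsf{H}} \check{\mathbf{h}}_s[\omega] + \check{z}_{s,j}[\omega],
\end{aligned}
\tag{19}
$$

where $\check{\mathbf{h}}_s[\omega] = \mathrm{vec}(\check{\mathbf{H}}_s[\omega])$ denotes the vectorized beamspace representation of the channel matrix at subcarrier $\omega \in \mathcal{F}_i$, where we used the well-known identity $\mathrm{vec}(\mathbf{A}\mathbf{B}\mathbf{C}) = (\mathbf{C}^{\mathsf{T}} \otimes \mathbf{A})\,\mathrm{vec}(\mathbf{B})$, and where we define the combined probing and sensing beamforming pattern as $\mathbf{g}_{s,i,j} = \check{\mathbf{u}}_{s,i}^{*} \otimes \check{\mathbf{v}}_{s,j} \in \mathbb{C}^{MN}$, which is common across all the subcarriers $\omega \in \mathcal{F}_i$ but differs for different pairs of BS and UE RF chains $(i, j)$.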
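The step from the first to the last line of (19) can be checked numerically. The following is a minimal sketch in plain Python, assuming small illustrative dimensions and random complex vectors; the function names `kron` and `projection_two_ways` are hypothetical and not part of the original scheme. It compares the bilinear form $\check{\mathbf{v}}^{\mathsf{H}} \check{\mathbf{H}} \check{\mathbf{u}}$ with $\mathbf{g}^{\mathsf{H}} \mathrm{vec}(\check{\mathbf{H}})$ for $\mathbf{g} = \check{\mathbf{u}}^{*} \otimes \check{\mathbf{v}}$.

```python
import random


def kron(a, b):
    """Kronecker product of two vectors stored as flat lists."""
    return [x * y for x in a for y in b]


def projection_two_ways(M=4, N=3, seed=0):
    """Return v^H H u and g^H vec(H) for random complex u, v, H."""
    rng = random.Random(seed)

    def cgauss():
        return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))

    u = [cgauss() for _ in range(M)]                      # BS probing vector, length M
    v = [cgauss() for _ in range(N)]                      # UE sensing vector, length N
    H = [[cgauss() for _ in range(M)] for _ in range(N)]  # N x M beamspace channel

    # Direct bilinear form v^H H u.
    direct = sum(v[k].conjugate() * H[k][kp] * u[kp]
                 for k in range(N) for kp in range(M))

    # vec(H) stacks the columns of H; g = conj(u) kron v as defined in the text.
    h = [H[k][kp] for kp in range(M) for k in range(N)]
    g = kron([x.conjugate() for x in u], v)
    via_g = sum(gi.conjugate() * hi for gi, hi in zip(g, h))
    return direct, via_g
```

Both returned values should agree up to floating-point rounding, which is the content of the identity $\mathrm{vec}(\mathbf{A}\mathbf{B}\mathbf{C}) = (\mathbf{C}^{\mathsf{T}} \otimes \mathbf{A})\,\mathrm{vec}(\mathbf{B})$ applied with $\mathbf{A} = \check{\mathbf{v}}^{\mathsf{H}}$ and $\mathbf{C} = \check{\mathbf{u}}$.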
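In practice, each beacon slot is formed by a block of $S \geq 1$ OFDM symbols. With a slight abuse of notation, we index the symbols belonging to the $(s+1)$-th slot as $sS + s'$, for $s' \in [S]$. In order to estimate the average received power at the UE $j$-th RF chain output due to the signal transmitted by the BS $i$-th RF chain in the $s$-th beacon slot, we form the averaged quadratic measurement

$$
\begin{aligned}
\check{q}_{s,i,j} &= \frac{1}{SF'} \sum_{s' \in [S]} \sum_{\omega \in \mathcal{F}_i} \big|\check{y}_{sS+s',i,j}[\omega]\big|^2 \\
&= \frac{P_{\mathrm{dim}}}{n}\, \mathbf{g}_{s,i,j}^{\mathsf{H}} \Bigg( \frac{1}{SF'} \sum_{s' \in [S]} \sum_{\omega \in \mathcal{F}_i} \check{\mathbf{h}}_{sS+s'}[\omega]\, \check{\mathbf{h}}_{sS+s'}[\omega]^{\mathsf{H}} \Bigg) \mathbf{g}_{s,i,j} \\
&\quad + \frac{1}{SF'} \sum_{s' \in [S]} \sum_{\omega \in \mathcal{F}_i} \big|\check{z}_{sS+s',j}[\omega]\big|^2 + \frac{1}{SF'} \sum_{s' \in [S]} \sum_{\omega \in \mathcal{F}_i} \xi_{sS+s',j}[\omega],
\end{aligned}
\tag{20}
$$

where the first and the second terms correspond to the signal contribution and to the noise contribution, and where

$$
\xi_{sS+s',j}[\omega] = 2 \sqrt{\frac{P_{\mathrm{dim}}}{n}}\, \mathrm{Re}\Big\{ \mathbf{g}_{s,i,j}^{\mathsf{H}}\, \check{\mathbf{h}}_{sS+s'}[\omega]\, \check{z}_{sS+s',j}[\omega]^{\mathsf{H}} \Big\}
\tag{21}
$$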
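denotes the signal-noise cross term. Note that since the AWGN noise ($\check{z}_{sS+s',j}[\omega]$) and the Gaussian channel coefficients ($\check{\mathbf{h}}_{sS+s'}[\omega]$) are independent, the cross term has a zero mean. Thus, when the number of dimensions $S \times F'$ (over which the instantaneous power $\check{q}_{s,i,j}$ is averaged) is large, it contributes negligibly to (20) and can be treated as a residual term in our formulation. Moreover, the empirical covariance matrix of the channel vector converges as

$$
\frac{1}{SF'} \sum_{s' \in [S]} \sum_{\omega \in \mathcal{F}_i} \check{\mathbf{h}}_{sS+s'}[\omega]\, \check{\mathbf{h}}_{sS+s'}[\omega]^{\mathsf{H}} \;\to\; \mathbb{E}\big[\check{\mathbf{h}}_s[\omega]\, \check{\mathbf{h}}_s[\omega]^{\mathsf{H}}\big] =: \boldsymbol{\Sigma}_{\mathbf{h}}.
\tag{22}
$$

Similarly, the noise term converges to

$$
\frac{1}{SF'} \sum_{s' \in [S]} \sum_{\omega \in \mathcal{F}_i} \big|\check{z}_{sS+s',j}[\omega]\big|^2 \;\to\; \sigma^2.
\tag{23}
$$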
Hence, the received power in (20) gives an approximate 1-dimensional noisy projection of the covariance matrix $\boldsymbol{\Sigma}_{\mathbf{h}}$ with respect to the combined probing and sensing vector $\mathbf{g}_{s,i,j}$. We include the signal-noise cross term in (21) and the difference between the empirical and statistical averages in (22) and (23) as a residual error. This results in

$$
\check{q}_{s,i,j} = \frac{P_{\mathrm{dim}}}{n}\, \mathbf{g}_{s,i,j}^{\mathsf{H}} \boldsymbol{\Sigma}_{\mathbf{h}}\, \mathbf{g}_{s,i,j} + \sigma^2 + \check{w}_{s,i,j},
\tag{24}
$$
where $\check{w}_{s,i,j}$ is the measurement error term. When all the AoA-AoDs lie on a discrete grid, $\check{\mathbf{h}}_s[\omega]$ is a sparse vector with i.i.d. components with only a few nonzero coefficients corresponding to the scatterers. In practice, for large $M$ and $N$, $\check{\mathbf{h}}_s[\omega]$ is almost sparse with small clusters of non-zero coefficients concentrated around the AoA-AoD pairs of the strong MPCs (as illustrated in Fig. 2). Correspondingly, $\boldsymbol{\Sigma}_{\mathbf{h}}$ is also a very sparse matrix, with strong components localized on the main diagonal.
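The convergence of the averaged quadratic measurement to the right-hand side of (24) can be illustrated with a small Monte Carlo experiment. This is only a sketch under simplifying assumptions: $\boldsymbol{\Sigma}_{\mathbf{h}}$ is taken to be exactly diagonal, every sample is drawn independently (standing in for the $S F'$ symbol and subcarrier averages), and the function names and numerical values are illustrative rather than taken from the paper.

```python
import math
import random


def averaged_power(g, gamma_diag, p_over_n, sigma2, num_samples, seed=0):
    """Empirical mean of |y|^2 with y = sqrt(P/n) * g^H h + z."""
    rng = random.Random(seed)

    def cn(var):
        # Circularly symmetric complex Gaussian sample with variance var.
        s = math.sqrt(var / 2.0)
        return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

    acc = 0.0
    for _ in range(num_samples):
        # Independent channel coefficients; zero-variance entries stay zero.
        h = [cn(v) if v > 0 else 0j for v in gamma_diag]
        proj = sum(complex(gi).conjugate() * hi for gi, hi in zip(g, h))
        y = math.sqrt(p_over_n) * proj + cn(sigma2)
        acc += abs(y) ** 2
    return acc / num_samples


def predicted_power(g, gamma_diag, p_over_n, sigma2):
    """(P/n) * g^H Sigma_h g + sigma^2 for a diagonal Sigma_h."""
    signal = sum(abs(gi) ** 2 * v for gi, v in zip(g, gamma_diag))
    return p_over_n * signal + sigma2
```

For instance, with a 16-dimensional beamspace, a pattern `g` of four entries equal to 0.5, one strong component inside the probed set and one outside, the empirical average approaches the prediction as the number of samples grows; the gap between the two plays the role of the residual $\check{w}_{s,i,j}$.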
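Next, we put (24) in the form suitable for the proposed AoA-AoD estimation algorithm. With reference to Fig. 5 (a), recall the beam probing and sensing vectors $\check{\mathbf{u}}_{s,i} = \frac{\mathbf{1}_{\mathcal{U}_{s,i}}}{\sqrt{\kappa_u}}$ and $\check{\mathbf{v}}_{s,j} = \frac{\mathbf{1}_{\mathcal{V}_{s,j}}}{\sqrt{\kappa_v}}$. With reference to Fig. 5 (b), let $\check{\boldsymbol{\Gamma}}$ denote the $N \times M$ matrix with elements $\frac{P_{\mathrm{dim}}}{n \kappa_u \kappa_v}\, \mathbb{E}\big[|[\check{\mathbf{H}}_s[\omega]]_{k,k'}|^2\big]$. Finally, define the $NM \times 1$ binary vectors $\mathbf{b}_{s,i,j} := \mathbf{1}_{\mathcal{U}_{s,i}} \otimes \mathbf{1}_{\mathcal{V}_{s,j}}$, with components equal to 0 or 1, where the 1's correspond to the positions in the set $\mathcal{V}_{s,j} \times \mathcal{U}_{s,i}$ of the quantized AoA-AoD pairs probed/sensed by the beamforming vectors $\check{\mathbf{v}}_{s,j}$ and $\check{\mathbf{u}}_{s,i}$, respectively. With these definitions, after a little algebra, we can rewrite (24) as

$$
\check{q}_{s,i,j} = \mathbf{b}_{s,i,j}^{\mathsf{T}}\, \mathrm{vec}(\check{\boldsymbol{\Gamma}}) + \sigma^2 + \check{w}_{s,i,j}.
\tag{25}
$$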
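The noiseless part of (25) simply adds up the entries of $\check{\boldsymbol{\Gamma}}$ that fall inside the probed rectangle $\mathcal{V}_{s,j} \times \mathcal{U}_{s,i}$. The sketch below makes this concrete in plain Python; it uses 0-based indices and hypothetical helper names (`indicator`, `kron`, `noiseless_measurement`), and it assumes $\mathrm{vec}(\cdot)$ stacks columns, which matches the ordering of $\mathbf{1}_{\mathcal{U}} \otimes \mathbf{1}_{\mathcal{V}}$.

```python
def indicator(subset, size):
    """Binary indicator vector of a subset of {0, ..., size-1}."""
    return [1 if k in subset else 0 for k in range(size)]


def kron(a, b):
    """Kronecker product of two vectors stored as flat lists."""
    return [x * y for x in a for y in b]


def noiseless_measurement(U, V, gamma):
    """b^T vec(Gamma) for an N x M matrix gamma given as a list of rows."""
    N, M = len(gamma), len(gamma[0])
    b = kron(indicator(U, M), indicator(V, N))               # length M*N
    vec_gamma = [gamma[k][kp] for kp in range(M) for k in range(N)]
    return sum(bi * gi for bi, gi in zip(b, vec_gamma))
```

With a matrix that has a single strong entry at row 1, column 2, any pattern pair whose AoD subset contains column 2 and whose AoA subset contains row 1 picks up that entry, and all other pairs do not.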
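Since the BS transmits along $m$ RF chains in each beacon slot and the UE has $n$ RF chains to sense the channel, the UE obtains $mn$ equations for the unknown vector $\mathrm{vec}(\check{\boldsymbol{\Gamma}})$ as in (25). Over $T$ beacon slots the UE obtains $mnT$ equations, which can be written in the form

$$
\check{\mathbf{q}} = \mathbf{B} \cdot \mathrm{vec}(\check{\boldsymbol{\Gamma}}) + \sigma^2 \mathbf{1} + \check{\mathbf{w}},
\tag{26}
$$

where the vector $\check{\mathbf{q}} = [\check{q}_{1,1,1}, \ldots, \check{q}_{1,m,n}, \ldots, \check{q}_{T,m,n}]^{\mathsf{T}}$ consists of all the $mnT$ measurements calculated as in (20), the $mnT \times MN$ matrix $\mathbf{B} = [\mathbf{b}_{1,1,1}, \ldots, \mathbf{b}_{1,m,n}, \ldots, \mathbf{b}_{T,m,n}]^{\mathsf{T}}$ is uniquely defined by the beamforming codebooks $\mathcal{C}_{\mathrm{BS}}$ and $\mathcal{C}_{\mathrm{UE}}$, and $\check{\mathbf{w}} \in \mathbb{R}^{mnT}$ is the residual noise and error in the measurements. At this point, some remarks are in order.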
Remark 2: An implicit assumption made here is that each UE is frame-synchronous with the BS, i.e., it knows, at each beacon slot $s$, the subsets $\{\mathcal{U}_{s,i} : i = 1, \ldots, m\}$ of beam directions of the BS. It is clear that a lack of frame synchronization between a UE and the BS would lead to a wrong construction of the measurement matrix $\mathbf{B}$ in (26). Notice however that, since the beacon patterns are repeated periodically with some period of $T$ frames, this requires only awareness of the start epoch of the period. This assumption is explicitly or implicitly made in virtually all works dealing with initial beam acquisition (aka, BA problem) [8–15], as reviewed in Section I. Therefore, this is not a particularly restrictive assumption specific to our approach. As in most works, we assume that such coarse frame information can be gathered from some external source. In practice, this may be either an overlay cell operating at some standard cellular frequency (e.g., typically in the sub-6 GHz range) or, for a stand-alone small-cell mm-Wave system, by letting the cells be frame synchronous. In this way, the UE needs to acquire the frame period only once, when it first joins the system, at the cost of a small additional overhead.⁷ Then, the frame synchronization is maintained while the UE roams from cell to cell. ♦
Remark 3: As already remarked, there exists a fairly large number of existing/concurrent works that make use of pseudo-random beam patterns in order to gather linear (compressed) measurements of the channel matrix, with the goal of estimating the channel matrix coefficients. This is typically obtained by using some CS technique, leveraging the fact that the propagation channels are sparse in the angle/delay domain [14–16, 19–21]. It is important to note that our scheme differs from all these works in one key respect: it gathers quadratic compressed measurements (see (24)) rather than linear ones. While in this way we lose the ability to estimate the complex channel coefficients, we can estimate the channel second-order statistics in the beamspace domain, and identify the strong MPCs in terms of the corresponding average received signal power. This information is much more stable and robust to variations in the channel time-dynamics than the channel coefficients themselves. In fact, it is easy to see that when the channel coefficients vary significantly over the measurement time, the system of linear equations in CS schemes tends to become unidentifiable. For example, in the limiting case of independent channel coefficients across the measurement slots, each new measurement depends on a new set of coefficients, such that the number of measurements is always less than the number of non-zero channel coefficients. In these conditions, fundamental information theoretic bounds show that stable reconstruction is not possible for any CS algorithm, no matter how sparse the channel is [33]. In contrast, focusing on the channel second-order statistics sampled by the quadratic measurements in (24)-(26), we can always gather a number of measurements $mnT$ larger than the number of non-zero coefficients in $\mathrm{vec}(\check{\boldsymbol{\Gamma}})$, for sufficiently large $T$, such that the strong components in $\mathrm{vec}(\check{\boldsymbol{\Gamma}})$ can be identified with high probability. ♦
As a general comment, we notice here that it is more
sensible and more robust to first estimate the beamforming
direction (e.g., via the proposed scheme) and then estimate
the beamformed channel in the regime of high SNR, rather
than trying to first estimate the channel coefficients (in low
SNR and at the mercy of the possibly large time-variations)
and then computing the beamforming coefficients.
⁷ For example, this can be obtained by trying sequentially the different cyclic shifts of the beacon sequence until successful alignment.

Fig. 5: (a) Illustration of the subset of AoA-AoDs at time slot $s$ probed by the $i$-th beacon stream transmitted by the BS (AoD subset $\mathcal{U}_{s,i}$) and received by the $j$-th RF chain of the UE (AoA subset $\mathcal{V}_{s,j}$), for $M = N = 10$; both angular ranges span $-\frac{\pi}{2}$ to $\frac{\pi}{2}$. The AoD subset is given by $\mathcal{U}_{s,i} = \{1, 3, 4, 6, 8, 10\}$ (numbered counterclockwise) with beamforming vector $\check{\mathbf{u}}_{s,i} = \frac{1}{\sqrt{6}}[1, 0, 1, 0, 1, 0, 1, 1, 0, 1]^{\mathsf{T}}$. The AoA subset is given by $\mathcal{V}_{s,j} = \{2, 4, 5, 7, 9\}$ (numbered counterclockwise) with receive beamforming vector $\check{\mathbf{v}}_{s,j} = \frac{1}{\sqrt{5}}[0, 1, 0, 1, 1, 0, 1, 0, 1, 0]^{\mathsf{T}}$. (b) The channel gain matrix $\check{\boldsymbol{\Gamma}}$ (with two strong MPCs indicated by the dark spots) measuring along $\mathcal{V}_{s,j} \times \mathcal{U}_{s,i}$.

D. Non-Negative Least-Squares
In order to identify the AoA-AoD directions of the strong scatterers, we estimate the $MN$-dimensional vector $\mathrm{vec}(\check{\boldsymbol{\Gamma}})$ from the $mnT$-dimensional observation given in (26). Because of the presence of the measurement noise $\check{\mathbf{w}}$, a standard approach consists of solving the Least-Squares (LS) problem $\min_{\check{\boldsymbol{\Gamma}}} \|\mathbf{B} \cdot \mathrm{vec}(\check{\boldsymbol{\Gamma}}) + \sigma^2 \mathbf{1} - \check{\mathbf{q}}\|^2$. However, in general $MN$ is significantly larger than $mnT$, such that the system of equations is heavily underdetermined and the LS solution yields meaningless results. The key observation here is that $\check{\boldsymbol{\Gamma}}$ is sparse (by assumption) and non-negative (by construction). Recent results in CS show that when the underlying parameter $\check{\boldsymbol{\Gamma}}$ is non-negative, the simple non-negative constrained LS given by

$$
\check{\boldsymbol{\Gamma}}^{*} = \arg\min_{\mathrm{vec}(\check{\boldsymbol{\Gamma}}) \in \mathbb{R}_{+}^{MN}} \|\mathbf{B} \cdot \mathrm{vec}(\check{\boldsymbol{\Gamma}}) + \sigma^2 \mathbf{1} - \check{\mathbf{q}}\|^2,
\tag{27}
$$
is still enough to impose sparsity of the solution $\check{\boldsymbol{\Gamma}}^{*}$ [34, 35], with no need for an explicit sparsity-promoting regularization term in the objective function as in the classical LASSO algorithm [36]. The (convex) optimization problem (27) is generally referred to as Non-Negative Least-Squares (NNLS), and has been well investigated in the literature, starting with Donoho et al. in [37]. More recently, in the context of CS it has been shown that the non-negativity constraint alone might suffice to recover a sparse non-negative signal from under-determined linear measurements both in the noiseless case [38–41] and in the noisy case [34, 35]. Moreover, [34] demonstrates that NNLS has a noisy recovery performance comparable to that of LASSO. In [34] it is also shown that NNLS along with an appropriate thresholding provides state-of-the-art performance in terms of support estimation. This property is very relevant in the context of this paper, where the identification of the support of $\check{\boldsymbol{\Gamma}}$ corresponds to finding the AoA-AoD directions strongly coupled by MPCs.
As discussed in [34], NNLS implicitly performs $\ell_1$-regularization and promotes the sparsity of the resulting solution provided that the measurement matrix satisfies the $\mathcal{M}^{+}$-criterion [42]. This property is beneficial for our proposed BA scheme because of the natural sparsity of the mm-Wave channel in the AoA-AoD domain. Posed in our framework, the measurement matrix $\mathbf{B}$ fulfills the $\mathcal{M}^{+}$-criterion if there is a vector $\mathbf{g}_0 \in \mathbb{R}_{+}^{mnT}$ such that $\mathbf{B}^{\mathsf{T}} \mathbf{g}_0 > 0$. It is not difficult to see that when $\mathbf{g}_0 = \mathbf{1}$ is an all-one vector of dimension $mnT$, the $i$-th component $[\mathbf{B}^{\mathsf{T}} \mathbf{g}_0]_i$ corresponds to the number of measurement patterns that hit the AoA-AoD pair corresponding to $i \in [MN]$. Hence, the necessary condition $\mathbf{B}^{\mathsf{T}} \mathbf{g}_0 > 0$ can be simply interpreted as the fact that the set of $mnT$ measurement patterns should hit all $MN$ AoA-AoD pairs at least once. Also, as stated in [42], NNLS performs better when the condition number $\frac{\max_{i \in [MN]} [\mathbf{B}^{\mathsf{T}} \mathbf{g}_0]_i}{\min_{i \in [MN]} [\mathbf{B}^{\mathsf{T}} \mathbf{g}_0]_i}$ is close to 1, which is met when the measurement patterns (i.e., the rows of $\mathbf{B}$) cover the whole set of AoA-AoDs quite uniformly. This also provides a criterion to design good pseudo-random beamforming codebooks for the BA problem.
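The coverage interpretation of the criterion lends itself to a quick numerical check of a candidate codebook. The sketch below counts, for each AoA-AoD pair, how many random patterns hit it (the column sums of $\mathbf{B}$, i.e., $\mathbf{B}^{\mathsf{T}} \mathbf{1}$) and reports the max-to-min ratio. As a simplification that is not part of the original scheme, each row of $\mathbf{B}$ is drawn here from an independent pair of BS and UE patterns; all function names are illustrative.

```python
import random


def random_pattern(size, weight, rng):
    """Binary vector of given length with exactly `weight` ones."""
    chosen = set(rng.sample(range(size), weight))
    return [1 if k in chosen else 0 for k in range(size)]


def coverage_counts(M, N, kappa_u, kappa_v, num_rows, seed=0):
    """Column sums of B (B^T 1): hits per AoA-AoD pair over all rows."""
    rng = random.Random(seed)
    counts = [0] * (M * N)
    for _ in range(num_rows):
        u = random_pattern(M, kappa_u, rng)   # BS probing pattern
        v = random_pattern(N, kappa_v, rng)   # UE sensing pattern
        for kp in range(M):
            if u[kp]:
                for k in range(N):
                    if v[k]:
                        counts[kp * N + k] += 1
    return counts


def coverage_ratio(counts):
    """max/min of the hit counts; infinite if some pair is never hit."""
    lowest = min(counts)
    return float("inf") if lowest == 0 else max(counts) / lowest
```

A ratio of infinity signals that at least one AoA-AoD pair is never probed, so the necessary condition fails; a ratio close to 1 indicates the near-uniform coverage favoured in the discussion above.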
In terms of numerical implementations, the NNLS can
be posed as an unconstrained LS problem over the positive
orthant and can be solved by several efficient techniques
such as Gradient Projection, Primal-Dual techniques, etc.,
with an affordable computational complexity [43], gener-
ally significantly less than CS algorithms for problems of
the same size and sparsity level. We refer to [44, 45] for
the recent progress on the numerical solution of NNLS and
a discussion on other related work in the literature.
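As an illustration of the gradient projection approach mentioned above, the following is a minimal sketch in plain Python, not the solver used for the reported simulations (which rely on MATLAB's lsqnonneg). The function name, the fixed step size and the iteration count are assumptions made for this example. To apply it to (27), one would pass `y` equal to $\check{\mathbf{q}} - \sigma^2 \mathbf{1}$.

```python
def nnls_projected_gradient(B, y, num_iters=5000):
    """Minimize ||B x - y||^2 subject to x >= 0 by projected gradient."""
    rows, cols = len(B), len(B[0])
    # Step 1/L, where L = 2 * ||B||_F^2 upper-bounds the gradient's
    # Lipschitz constant 2 * lambda_max(B^T B).
    step = 1.0 / (2.0 * sum(b * b for row in B for b in row))
    x = [0.0] * cols
    for _ in range(num_iters):
        # Residual r = B x - y.
        r = [sum(B[i][j] * x[j] for j in range(cols)) - y[i]
             for i in range(rows)]
        # Gradient 2 B^T r, then projection onto the non-negative orthant.
        grad = [2.0 * sum(B[i][j] * r[i] for i in range(rows))
                for j in range(cols)]
        x = [max(0.0, x[j] - step * grad[j]) for j in range(cols)]
    return x
```

A toy case with three 0/1 measurement rows and four unknowns shows how non-negativity alone can pin down a sparse solution: for `B = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]]` and `y = [0, 3, 3]`, the only non-negative vector consistent with the measurements is `[0, 0, 3, 0]`, and the index of the largest entry of the returned estimate identifies the support.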
IV. PERFORMANCE EVALUATION
In this section we evaluate the performance of our proposed
algorithm via numerical simulations. To run the NNLS
optimization in (27), we use the implementation of NNLS
in MATLAB© called lsqnonneg.m.
Fig. 6: Detection probability $P_D$ of the proposed scheme versus the number of slots $T$ for different pseudo-random codebooks (denoted by $\mathcal{C}_{\mathrm{BS}}$ #1 to #4), where $M = N = 32$, $F' = 3$, $m = 3$, $n = 2$, $\kappa_u = \kappa_v = 8$, $\mathsf{SNR}_{\mathrm{BBF}} = -33$ dB.

Channel and Signal Model. We assume $f_0 = 70$ GHz carrier frequency and bandwidth of $B = 1$ GHz. The OFDM subcarrier spacing is 480 kHz in compliance with recent 3GPP standard specifications [46, 47]. Assuming $\tau_{\mathrm{cp}} \Delta f = 0.07$ (i.e., the CP length is 7% of the OFDM duration), we obtain $t_0 = 2.23$ µs and around $F = 2048$ subcarriers (plus some guard band). We fix the frame duration of our scheme (i.e., the repetition interval of the beacon slot) to 1 ms, which consists of 448 OFDM symbols (per subcarrier). A beacon slot contains $S = 14$ OFDM symbols, the random access slot also contains 14 OFDM symbols, and the remaining 420 symbols are used for data transmission [46]. We assume that the BS has $M = 32$ antennas and $m = 3$ RF chains, and the UE has $N = 32$ antennas and $n = 2$ RF chains. We declare an individual experiment to be successful if the index of the strongest component in $\check{\boldsymbol{\Gamma}}$ is correctly estimated (i.e., it coincides with the actual strongest MPC AoA-AoD location, up to the discrete angle grid quantization).
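The numerology quoted above follows from a few lines of arithmetic, reproduced here as a small sketch. The function name and its default arguments are illustrative; the values are the ones stated in the text.

```python
def ofdm_numerology(bandwidth_hz=1e9, spacing_hz=480e3,
                    cp_fraction=0.07, frame_s=1e-3):
    """Return (symbol duration with CP, subcarrier count, symbols per frame)."""
    useful = 1.0 / spacing_hz                 # useful OFDM symbol duration
    t0 = useful * (1.0 + cp_fraction)         # duration including cyclic prefix
    subcarriers = bandwidth_hz / spacing_hz   # raw count before rounding
    symbols_per_frame = int(frame_s / t0)     # whole symbols fitting in a frame
    return t0, subcarriers, symbols_per_frame
```

With the default values, the symbol duration comes out at about 2.23 µs, the raw subcarrier count is roughly 2083 (consistent with the quoted figure of around 2048 plus guard band), and 448 whole symbols fit in a 1 ms frame, matching the split into 14 beacon, 14 random access and 420 data symbols.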
Dependence on the Random BS Codebook. We generated 4 different beamforming codebooks at the BS side. Each codebook consists of a randomly generated sequence of patterns, identified by binary vectors of dimension $M$ and Hamming weight $\kappa_u$, obtained by independently sampling the set of all possible $\binom{M}{\kappa_u}$ such equal-weight vectors. For simplicity, we consider a very sparse channel model with only one MPC. Fig. 6 illustrates the detection probability for the different pseudo-random codebooks, where the angle spreading factors at the BS and the user sides are set to $\kappa_u = \kappa_v = 8$. We repeated each experiment 200 times and plot the resulting detection probability versus the training period length $T$. Notice that different codebooks have quite similar performances. This demonstrates the fact, well known in several CS contexts, that the scheme is quite insensitive to the specific measurement matrix, as long as it is sufficiently randomized.
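A codebook of this kind can be generated in a few lines. The sketch below draws each pattern uniformly from the equal-weight binary vectors, independently across slots; the function names and the seed handling are assumptions for illustration, and the exact generator used for the reported experiments is not specified in the text.

```python
import math
import random


def random_codebook(M, kappa, T, seed=None):
    """T binary patterns of length M, each with Hamming weight kappa."""
    rng = random.Random(seed)
    book = []
    for _ in range(T):
        support = set(rng.sample(range(M), kappa))  # uniform kappa-subset
        book.append([1 if k in support else 0 for k in range(M)])
    return book


def beamforming_vector(pattern):
    """Unit-norm beamspace vector: indicator scaled by 1/sqrt(weight)."""
    weight = sum(pattern)
    return [p / math.sqrt(weight) for p in pattern]
```

Calling `random_codebook(32, 8, 40, seed=1)` yields 40 patterns of the size used in Fig. 6; `math.comb(32, 8)` gives the number of distinct patterns available for each draw.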
Fig. 7: Detection probability $P_D$ of the proposed scheme versus the number of slots $T$ for different number of paths (scatterers) $L = 1, 2, 3$, where $M = N = 32$, $F' = 3$, $m = 3$, $n = 2$, $\kappa_u = \kappa_v = 16$, $\mathsf{SNR}_{\mathrm{BBF}} = -33$ dB ($L = 1$), $-32$ dB ($L = 2$), $-31$ dB ($L = 3$).

Fig. 8: Detection probability $P_D$ of the proposed scheme versus the number of slots $T$ for different angle spreading factors ($\kappa_u = \kappa_v = 4, 8, 16, 25$), where $M = N = 32$, $F' = 3$, $m = 3$, $n = 2$, $\mathsf{SNR}_{\mathrm{BBF}} = -33$ dB.
Performance with Different Number of Paths (Scatterers) $L$. To illustrate that our proposed scheme works equally well for single-path and multi-path scenarios, we repeat the simulation with different numbers of MPCs $L = 1, 2, 3$, with different strengths $\gamma_1 > \gamma_2 = \gamma_3$. Fig. 7 shows the performance of the proposed scheme, where we declare an individual experiment to be successful if the strongest path ($\gamma_1$) is correctly identified. It is seen that the scheme performs equally well for a single MPC ($L = 1$) and multiple MPCs ($L = 2, 3$), where in both cases, at most $T = 40$ beacon slots are sufficient to ensure a successful BA with high probability.
Dependence on the Angle Spreading Factors $\kappa_u$ and $\kappa_v$. The angle spreading factors $\kappa_u$ and $\kappa_v$ impose a trade-off between the angle coverage of the probing/sensing matrix $\mathbf{B}$ (exploration) and its receive SNR at the user side (exploitation). By making $\kappa_u$ (resp., $\kappa_v$) larger, each beam pattern probes more directions simultaneously, but the total power is spread over all such directions. In contrast, by making $\kappa_u$ (resp., $\kappa_v$) smaller, the beam pattern explores fewer directions but obtains better power concentration in the angle domain. This is illustrated in Fig. 8. It is seen that increasing the spreading factor from $\kappa_u = \kappa_v = 4$ to $\kappa_u = \kappa_v = 8$ yields a performance improvement. However, a further increase to $\kappa_u = \kappa_v = 16$ slightly degrades the performance, and the degradation is severe for even larger values $\kappa_u = \kappa_v = 25$. As already remarked a few times in this paper, the choice of the spreading factor $\kappa_v$ at the UE can be tailored to the individual SNR condition (e.g., this may depend on the distance between UE and BS). To pinpoint this, we repeated the simulation to find the best $\kappa_v$ at the UE side as a function of its channel SNR. This is illustrated in Fig. 9, reporting the best $\kappa_v$ and the best search time $T$ as a function of the UE SNR (assuming a threshold of $P_D \geq 0.95$ and a spreading factor of $\kappa_u = 8$ at the BS side). It is seen that, as expected, when a UE enjoys a larger SNR (e.g., it is close to the BS), it should use a larger $\kappa_v$ in order to better explore the channel, thus reducing the search time $T$. In contrast, when a UE is in low SNR conditions (e.g., it is far from the BS), it should apply a smaller $\kappa_v$ in order to gather measurements with sufficiently large SNR.
Dependence on the Number of Subcarriers $F'$. As explained in Sections III-B and III-C, using a large number of subcarriers $F'$ in the beacon signals ensures a reliable averaging of the received instantaneous power (see (24)) at the cost of a reduced SNR per subcarrier. As shown in Fig. 10, increasing the number of subcarriers from $F' = 1$ to $F' = 3$ improves the performance, but increasing further to $F' = 10, 30$ degrades the performance considerably.
Dependence on the Probing Dimensions ($\kappa_u \kappa_v mn$). Note that, for a certain pre-beamforming SNR, the output of the proposed BA scheme inherently depends on the probing dimensions, i.e., the product $\kappa_u \kappa_v mn$. This is illustrated by the curves marked as (#1, #1′) and (#2, #2′) in Fig. 11. For various different configurations of the parameters, if both the number of measurements ($mn$) and the probing dimensions ($\kappa_u \kappa_v mn$) in each slot are the same, a similar performance is achieved. This is useful in terms of system design where more complexity can be pushed towards the BS (e.g., more RF chains at the BS) while keeping the same system-level performance.
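For the four configurations listed in the legend of Fig. 11, the two quantities discussed above can be tabulated directly. The snippet below is a small worked example; the dictionary layout and the function name are illustrative, and the ASCII apostrophe in the keys stands for the prime in the curve labels.

```python
# Parameter sets taken from the legend of Fig. 11.
CONFIGS = {
    "#1":  {"m": 3, "n": 2, "kappa_u": 8,  "kappa_v": 8},
    "#1'": {"m": 6, "n": 1, "kappa_u": 8,  "kappa_v": 8},
    "#2":  {"m": 3, "n": 1, "kappa_u": 8,  "kappa_v": 16},
    "#2'": {"m": 3, "n": 1, "kappa_u": 16, "kappa_v": 8},
}


def probing_summary(cfg):
    """Return (measurements per slot mn, probing dimensions kappa_u*kappa_v*m*n)."""
    measurements = cfg["m"] * cfg["n"]
    return measurements, measurements * cfg["kappa_u"] * cfg["kappa_v"]


SUMMARY = {name: probing_summary(cfg) for name, cfg in CONFIGS.items()}
```

Within each pair, #1 and #1′ on the one hand and #2 and #2′ on the other, both the number of measurements per slot and the probing dimensions coincide, which is the condition under which the text reports similar performance.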
System-Level Scalability. We consider a multiuser scenario where $K$ denotes the number of UEs in the system and $K(T)$ denotes the number of UEs that have achieved BA (i.e., that have successfully detected their strong MPC) after $T$ frames. Fig. 12 compares the fraction $\frac{K(T)}{K}$ achieved by the interactive bisection scheme [11] and our proposed scheme. In the simulations, we assume that for [11] the users are trained one by one with an ideal cost-free feedback in each round, whereas in our case, all the users are trained independently and simultaneously, i.e., all the users share the same BS pseudo-random probing codebook, while each user uses its own sensing codebook, which is also randomly generated. As we can see, the training overhead of interactive methods scales proportionally with the number of active users, whereas in our scheme the overhead does not grow with the number of users. Note that in practice, the feedback scheme for each iterative round in [11] costs UL transmissions and may not be ideal since the beamforming gains are very poor at the initial rounds. In contrast, the proposed scheme needs only one UL transmission of the RACCH packet, where the full beamforming gain at the UE side and the sectored beamforming gain at the BS side (as discussed in Section III-A) are available.
Robustness w.r.t. Variations in Channel Statistics. To investigate the sensitivity of the proposed scheme as well as competing CS-based schemes to channel time-variations, we consider a simple Gauss-Markov model for the channel correlation in time given by

$$
\rho_{s,l} = \alpha\, \rho_{s-1,l} + \sqrt{1 - |\alpha|^2}\, \nu_{s,l}, \quad s \in \mathbb{Z}_{+},
\tag{28}
$$

where $\rho_{0,l} \sim \mathcal{CN}(0, \gamma_l)$, where $\nu_{s,l} \sim \mathcal{CN}(0, \gamma_l)$ is an i.i.d. sequence (innovation), and where $|\alpha| \in [0, 1]$ controls the channel correlation in time. This model is widely used as a simple and intuitive way to model correlated fading (see [48]). We assume that the channel is constant over each beacon slot of 14 OFDM symbols, and evolves in time according to (28) from slot to slot. More precisely, $|\alpha| = 1$ yields channel coefficients constant over the whole BA phase, while $|\alpha| = 0$ yields channel coefficients changing in an i.i.d. fashion over the beacon slots. In general, a full range of channel time variations can be obtained by varying $|\alpha|$ between 0 and 1. In Fig. 13, we compare the detection probability of the proposed scheme with that of other CS-based schemes presented in [15, 16]. In [15], the instantaneous channel coefficients are estimated using the Orthogonal Matching Pursuit (OMP) technique. In [16] an improvement is proposed where the congruence of the channel AoA/AoD components across a Selected comb of subcarriers is exploited by applying a Simultaneous Orthogonal Matching Pursuit (SS-OMP) technique. Fig. 13 illustrates the simulation results. It is seen that the proposed NNLS scheme performs much better over a wide range of channel time-correlations whereas the OMP/SS-OMP schemes are quite fragile in the presence of channel time-variations.
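The recursion (28) is straightforward to simulate. The sketch below generates the slot-by-slot evolution of a single path gain in plain Python; the function name, the real-valued choice of $\alpha$ and the seed handling are assumptions made for this example.

```python
import math
import random


def gauss_markov_path(alpha, gamma, num_slots, seed=0):
    """Path gains rho_0, ..., rho_{num_slots-1} following recursion (28)."""
    rng = random.Random(seed)

    def cn(var):
        # Circularly symmetric complex Gaussian sample with variance var.
        s = math.sqrt(var / 2.0)
        return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

    innovation_scale = math.sqrt(1.0 - abs(alpha) ** 2)
    rho = cn(gamma)                      # rho_0 ~ CN(0, gamma)
    path = [rho]
    for _ in range(num_slots - 1):
        rho = alpha * rho + innovation_scale * cn(gamma)
        path.append(rho)
    return path
```

The scaling of the innovation keeps the average power of the gain equal to $\gamma_l$ for every slot, so only the temporal correlation changes with $\alpha$: `alpha = 1` freezes the gain at its initial value and `alpha = 0` redraws it independently in each slot.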
Fig. 9: (a) The optimal spreading factor $\mathrm{opt}(\kappa_v)$ at the UE in terms of different $\mathsf{SNR}_{\mathrm{BBF}}$ (in dB) when the BS spreading factor is fixed at $\kappa_u = 8$. (b) The average training slots $T$ that ensures a high detection probability $P_D \geq 0.95$, versus $\mathsf{SNR}_{\mathrm{BBF}}$ (in dB).

Fig. 10: Detection probability $P_D$ of the proposed scheme versus the number of slots $T$ for different number of subcarriers $F' = 1, 3, 10, 30$ deployed for each RF chain at the BS side, where $M = N = 32$, $m = 3$, $n = 2$, $\mathsf{SNR}_{\mathrm{BBF}} = -33$ dB.

V. CONCLUSION
In this paper, we proposed an efficient Beam Alignment (BA) scheme for mm-Wave multiuser MIMO systems. In the proposed scheme, the AoA/AoD of a strong MPC component is estimated by exploring the AoA-AoD domain through pseudo-random multi-finger beam patterns, and constructing an estimate of the resulting second-order statistics (namely, the average received power for each pseudo-random beam configuration). The resulting under-determined system of equations is efficiently solved using
NNLS, yielding naturally a sparse non-negative vector
solution whose maximum component identifies the optimal
path. In the proposed scheme, the channel is probed by
the BS by sending (pseudo-random) beamformed beacon
DL signals, and sensed by the UEs by applying (pseudo-
random) receive beam patterns. The scheme can train
simultaneously a large number of users, since it requires
no interactive (multiple rounds) bi-directional transmission
of pilots and/or control packets as in bisection methods.
Also, the scheme is robust to the channel coefficient time-
dynamics since it is based on the estimation of the channel
second order statistics (received power) rather than on
trying to estimate the complex channel coefficients, as done
in other concurrent schemes also based on random beams
and compressed sensing. Overall, the proposed scheme
provides a very competitive performance both in terms of scalability (with respect to the number of users) and robustness (with respect to the channel coefficient statistics), compared with the state-of-the-art algorithms for initial beam acquisition proposed so far.

Fig. 11: Detection probability $P_D$ of the proposed scheme versus the number of slots $T$ when $mn$ and $mn\kappa_u\kappa_v$, i.e., the number of measurements and the probing dimensions in the spatial multiplexing domain, are constant, and where $M = N = 32$, $F' = 3$, $\mathsf{SNR}_{\mathrm{BBF}} = -33$ dB. Curves: #1 ($m = 3$, $n = 2$, $\kappa_u = 8$, $\kappa_v = 8$); #1′ ($m = 6$, $n = 1$, $\kappa_u = 8$, $\kappa_v = 8$); #2 ($m = 3$, $n = 1$, $\kappa_u = 8$, $\kappa_v = 16$); #2′ ($m = 3$, $n = 1$, $\kappa_u = 16$, $\kappa_v = 8$).
ACKNOWLEDGMENTS
X.S. is sponsored by the China Scholarship Council
(201604910530). G.C. is supported by the Alexander von
Humboldt Foundation through a Professorship Grant, and
this work was also supported in part by a Collaborative
Research Grant of Intel Research.
Fig. 12: Comparison of the performance of our proposed scheme (NNLS) with that of the interactive bisection method (Bisec) in [11] in terms of the fraction of users whose channel is estimated until a given time slot $T$, given by $\mathbb{E}\big[\frac{K(T)}{K}\big]$, for $K = 6, 9, 15$. We take $M = N = 32$, $F' = 3$, $m = 3$, $n = 2$, $\kappa_u = \kappa_v = 8$, $\mathsf{SNR}_{\mathrm{BBF}} = -33$ dB.

Fig. 13: Comparison of detection probability $P_D$ versus the number of slots $T$ between the proposed NNLS scheme ($m = 3$, $F' = 3$, $\kappa_u = \kappa_v = 8$), the OMP scheme in [15] ($m = 1$, $F' = 1$, $\kappa_u = \kappa_v = 16$), and the SS-OMP scheme in [16] ($m = 3$, $F' = 3$, $\kappa_u = \kappa_v = 8$) for $M = N = 32$, $n = 2$ and $\mathsf{SNR}_{\mathrm{BBF}} = -33$ dB, when the path gains change from i.i.d. ($\alpha = 0$) to constant ($\alpha = 1$) over consecutive beacon slots; curves are shown for $\alpha = 0, 0.7, 1$.

REFERENCES
[1] R. W. Heath, N. Gonzalez-Prelcic, S. Rangan, W. Roh, and A. M.
Sayeed, “An overview of signal processing techniques for millimeter
wave MIMO systems,” IEEE journal of selected topics in signal
processing, vol. 10, no. 3, pp. 436–453, 2016.
[2] T. S. Rappaport, S. Sun, R. Mayzus, H. Zhao, Y. Azar, K. Wang,
G. N. Wong, J. K. Schulz, M. Samimi, and F. Gutierrez, “Millimeter
wave mobile communications for 5g cellular: It will work!” Access,
IEEE, vol. 1, pp. 335–349, 2013.
[3] A. M. Sayeed, “Deconstructing multiantenna fading channels,” IEEE
Transactions on Signal Processing, vol. 50, no. 10, pp. 2563–2579,
2002.
[4] T. Nitsche, C. Cordeiro, A. B. Flores, E. W. Knightly, E. Perahia,
and J. C. Widmer, “IEEE 802.11 ad: directional 60 GHz communi-
cation for multi-Gigabit-per-second Wi-Fi,” IEEE Communications
Magazine, vol. 52, no. 12, pp. 132–141, 2014.
[5] Z. Chen and C. Yang, “Pilot decontamination in wideband massive
MIMO systems by exploiting channel sparsity,” IEEE Transactions
on Wireless Communications, vol. 15, no. 7, pp. 5087–5100, 2016.
[6] S. Haghighatshoar and G. Caire, “The beam alignment problem in
mmwave wireless networks,” in 2016 50th Asilomar Conference on
Signals, Systems and Computers, Nov 2016, pp. 741–745.
[7] V. Desai, L. Krzymien, P. Sartori, W. Xiao, A. Soong, and A. Alkha-
teeb, “Initial beamforming for mmwave communications,” in 2014
48th Asilomar Conference on Signals, Systems and Computers, Nov
2014, pp. 1926–1930.
[8] J. Wang, Z. Lan, C.-W. Pyo, T. Baykas, C.-S. Sum, M. A. Rahman,
J. Gao, R. Funada, F. Kojima, H. Harada et al., “Beam codebook
based beamforming protocol for multi-gbps millimeter-wave wpan
systems,” Selected Areas in Communications, IEEE Journal on,
vol. 27, no. 8, pp. 1390–1399, 2009.
[9] L. Chen, Y. Yang, X. Chen, and W. Wang, “Multi-stage beam-
forming codebook for 60ghz wpan,” in 2011 6th International
ICST Conference on Communications and Networking in China
(CHINACOM), Aug 2011, pp. 361–365.
[10] S. Hur, T. Kim, D. J. Love, J. V. Krogmeier, T. Thomas, A. Ghosh
et al., “Millimeter wave beamforming for wireless backhaul and
access in small cell networks,” Communications, IEEE Transactions
on, vol. 61, no. 10, pp. 4391–4403, 2013.
[11] A. Alkhateeb, O. El Ayach, G. Leus, and R. W. Heath, “Channel
estimation and hybrid precoding for millimeter wave cellular sys-
tems,” Selected Topics in Signal Processing, IEEE Journal of, vol. 8,
no. 5, pp. 831–846, 2014.
[12] M. Kokshoorn, H. Chen, P. Wang, Y. Li, and B. Vucetic, “Millimeter
wave MIMO channel estimation using overlapped beam patterns and
rate adaptation,” IEEE Transactions on Signal Processing, vol. 65,
no. 3, pp. 601–616, 2017.
[13] P. Xia, R. W. Heath, and N. Gonzalez-Prelcic, “Robust analog
precoding designs for millimeter wave MIMO transceivers with
frequency and time division duplexing,” IEEE Transactions on
Communications, vol. 64, no. 11, pp. 4622–4634, Nov 2016.
[14] D. E. Berraki, S. M. D. Armour, and A. R. Nix, “Application
of compressive sensing in sparse spatial channel recovery for
beamforming in mmwave outdoor systems,” in 2014 IEEE Wire-
less Communications and Networking Conference (WCNC), 2014,
Conference Proceedings, pp. 887–892.
[15] A. Alkhateeb, G. Leus, and R. W. Heath, “Compressed sensing
based multi-user millimeter wave systems: How many measurements
are needed?” in 2015 IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP), 2015, Conference Proceed-
ings, pp. 2909–2913.
[16] J. Rodríguez-Fernández, N. González-Prelcic, K. Venugopal, and
R. W. Heath Jr, “Frequency-domain compressive channel estimation
for frequency-selective hybrid mmwave MIMO systems,” arXiv
preprint arXiv:1704.08572, 2017.
[17] A. F. Molisch, V. V. Ratnam, S. Han, Z. Li, S. L. H. Nguyen, L. Li,
and K. Haneda, “Hybrid beamforming for massive MIMO-a survey,”
arXiv preprint arXiv:1609.05078, 2016.
[18] C. N. Barati, S. A. Hosseini, S. Rangan, P. Liu, T. Korakis,
S. S. Panwar, and T. S. Rappaport, “Directional cell discovery in
millimeter wave cellular networks,” IEEE Transactions on Wireless
Communications, vol. 14, no. 12, pp. 6664–6678, 2015.
[19] J. Choi, “Beam selection in mm-wave multiuser MIMO systems
using compressive sensing,” IEEE Transactions on Communications,
vol. 63, no. 8, pp. 2936–2947, 2015.
[20] R. Méndez-Rial, C. Rusu, N. González-Prelcic, A. Alkhateeb, and
R. W. Heath, “Hybrid MIMO architectures for millimeter wave
communications: Phase shifters or switches?” IEEE Access, vol. 4,
pp. 247–267, 2016.
[21] K. Venugopal, A. Alkhateeb, N. G. Prelcic, and R. W. Heath, “Chan-
nel estimation for hybrid architecture based wideband millimeter
wave systems,” IEEE Journal on Selected Areas in Communications,
2017.
[22] R. J. Weiler, M. Peter, W. Keusgen, and M. Wisotzki, “Measuring
the busy urban 60 ghz outdoor access radio channel,” in 2014
IEEE International Conference on Ultra-WideBand (ICUWB), 2014,
Conference Proceedings, pp. 166–170.
[23] V. Va, J. Choi, and R. W. Heath, “The impact of beamwidth on
temporal channel variation in vehicular channels and its implica-
tions,” IEEE Transactions on Vehicular Technology, vol. 66, no. 6,
pp. 5014–5029, 2017.
[24] P. A. Eliasi, S. Rangan, and T. S. Rappaport, “Low-rank spatial
channel estimation for millimeter wave cellular systems,” IEEE
Transactions on Wireless Communications, vol. 16, no. 5, pp. 2748–
2759, 2017.
[25] A. F. Molisch, Wireless communications. John Wiley & Sons, 2012,
vol. 34.
[26] W. Shen, L. Dai, B. Shim, Z. Wang, and R. W. H. Jr.,
“Channel feedback based on aod-adaptive subspace codebook in
FDD massive MIMO systems,” CoRR, vol. abs/1704.00658, 2017.
[Online]. Available: http://arxiv.org/abs/1704.00658
[27] M. Médard and R. G. Gallager, “Bandwidth scaling for fading
multipath channels,” IEEE Transactions on Information Theory,
vol. 48, no. 4, pp. 840–852, 2002.
[28] A. Lozano and D. Porrat, “Non-peaky signals in wideband fading
channels: Achievable bit rates and optimal bandwidth,” IEEE Trans-
actions on Wireless Communications, vol. 11, no. 1, pp. 246–257,
2012.
[29] F. Gómez-Cuba, J. Du, M. Médard, and E. Erkip, “Unified capacity
limit of non-coherent wideband fading channels,” IEEE Transactions
on Wireless Communications, vol. 16, no. 1, pp. 43–57, 2017.
[30] G. Durisi, U. G. Schuster, H. Bölcskei, and S. Shamai, “Noncoherent
capacity of underspread fading channels,” IEEE Transactions on
Information Theory, vol. 56, no. 1, pp. 367–395, 2010.
[31] W. U. Bajwa, J. Haupt, A. M. Sayeed, and R. Nowak, “Compressed
channel sensing: A new approach to estimating sparse multipath
channels,” Proceedings of the IEEE, vol. 98, no. 6, pp. 1058–1076,
2010.
[32] M. R. Akdeniz, Y. Liu, M. K. Samimi, S. Sun, S. Rangan, T. S.
Rappaport, and E. Erkip, “Millimeter wave channel modeling and
cellular capacity evaluation,” IEEE Journal on Selected Areas in
Communications, vol. 32, no. 6, pp. 1164–1179, 2014.
[33] Y. Wu and S. Verdú, “Optimal phase transitions in compressed
sensing,” IEEE Transactions on Information Theory, vol. 58, no. 10,
pp. 6241–6263, 2012.
[34] M. Slawski, M. Hein et al., “Non-negative least squares for high-
dimensional linear models: Consistency and sparse recovery without
regularization,” Electronic Journal of Statistics, vol. 7, pp. 3004–
3056, 2013.
[35] R. Kueng and P. Jung, “Robust nonnegative sparse recovery
and the nullspace property of 0/1 measurements,” arXiv preprint
arXiv:1603.07997, 2016.
[36] R. Tibshirani, “Regression shrinkage and selection via the lasso,”
Journal of the Royal Statistical Society. Series B (Methodological),
pp. 267–288, 1996.
[37] D. L. Donoho, I. M. Johnstone, J. C. Hoch, and A. S. Stern,
“Maximum entropy and the nearly black object,” Journal of the
Royal Statistical Society. Series B (Methodological), pp. 41–81,
1992.
[38] A. M. Bruckstein, M. Elad, and M. Zibulevsky, “On the uniqueness
of non-negative sparse & redundant representations,” in 2008 IEEE
International Conference on Acoustics, Speech and Signal Process-
ing, March 2008, pp. 5145–5148.
[39] D. L. Donoho and J. Tanner, “Counting the faces of randomly-
projected hypercubes and orthants, with applications,” Discrete &
computational geometry, vol. 43, no. 3, pp. 522–541, 2010.
[40] M. Wang and A. Tang, “Conditions for a unique non-negative
solution to an underdetermined system,” in 2009 47th Annual
Allerton Conference on Communication, Control, and Computing
(Allerton), Sept 2009, pp. 301–307.
[41] M. Wang, W. Xu, and A. Tang, “A unique “nonnegative” solution
to an underdetermined system: From vectors to matrices,” IEEE
Transactions on Signal Processing, vol. 59, no. 3, pp. 1007–1016,
2011.
[42] R. Kueng and P. Jung, “Robust nonnegative sparse recovery
and the nullspace property of 0/1 measurements,” arXiv preprint
arXiv:1603.07997, 2016.
[43] D. P. Bertsekas, Convex optimization algorithms.
Belmont, MA: Athena Scientific, 2015.
[44] D. Kim, S. Sra, and I. S. Dhillon, “Tackling box-constrained
optimization via a new projected quasi-newton approach,” SIAM
Journal on Scientific Computing, vol. 32, no. 6, pp. 3548–3563,
2010.
[45] D. K. Nguyen and T. B. Ho, “Anti-lopsided algorithm for
large-scale nonnegative least square problems,” arXiv preprint
arXiv:1502.01645, 2015.
[46] “3GPP TR 38.802 V2.0.0 (2017-03) - Study on New Radio (NR)
Access Technology; Physical Layer Aspects (Release 14),” 2017.
[47] A. Ghosh. (2017) 5G mmWave Revolution & New Radio - IEEE 5G. [Online].
Available: https://5g.ieee.org/images/files/pdf/5GmmWave_Webinar_IEEE_Nokia_09_20_2017_final.pdf
[48] C. C. Tan and N. C. Beaulieu, “On first-order Markov modeling for
the Rayleigh fading channel,” IEEE Transactions on Communica-
tions, vol. 48, no. 12, pp. 2032–2040, 2000.
Xiaoshen Song (S’17) received the B.Sc. degree
in Communication Engineering from Northwest-
ern Polytechnical University, Xi’an, China, in
2013, and the M.Sc. degree in Communication
and Information Systems from the Institute of
Electronics, University of Chinese Academy of
Sciences, Beijing, China, in 2016. Her master’s
thesis focuses on video synthetic aperture radar
(VideoSAR) system design and imaging algorithms.
She is currently pursuing the Ph.D. degree
with the Communications and Information
Theory (CommIT) group at Technische Universität
Berlin, Berlin, Germany.
Her research interests include wireless communication, mmWave
massive MIMO, and compressed sensing.
Saeid Haghighatshoar (S’12–M’15) received
the B.Sc. degree in Electrical Engineering (Elec-
tronics) in 2007 and the M.Sc. degree in Elec-
trical Engineering (Communication Systems) in
2009, both from Sharif University of Technology,
Tehran, Iran, and the Ph.D. degree in Computer
and Communication Sciences from École
Polytechnique Fédérale de Lausanne, Lausanne,
Switzerland, in 2014. Since 2015, he has been a
postdoctoral researcher with the Communications and
Information Theory (CommIT) group at Technische
Universität Berlin, Berlin, Germany. His research interests lie in
Information Theory, Communication Systems, Wireless Communication,
Optimization Theory, and Compressed Sensing.
Giuseppe Caire (S’92 – M’94 – SM’03 – F’05)
was born in Torino, Italy, in 1965. He received
the B.Sc. in Electrical Engineering from Politec-
nico di Torino (Italy), in 1990, the M.Sc. in Elec-
trical Engineering from Princeton University in
1992 and the Ph.D. from Politecnico di Torino in
1994. He was a post-doctoral research fellow
with the European Space Agency (ESTEC,
Noordwijk, The Netherlands) in 1994-1995,
Assistant Professor in Telecommunications at the
Politecnico di Torino, Associate Professor at the
University of Parma, Italy, Professor with the Department of Mobile
Communications at the Eurecom Institute, Sophia-Antipolis, France, a
Professor of Electrical Engineering with the Viterbi School of Engineer-
ing, University of Southern California, Los Angeles, and he is currently
an Alexander von Humboldt Professor with the Electrical Engineering
and Computer Science Department of the Technical University of Berlin,
Germany.
He served as Associate Editor for the IEEE Transactions on Communi-
cations in 1998-2001 and as Associate Editor for the IEEE Transactions on
Information Theory in 2001-2003. He received the Jack Neubauer Best
System Paper Award from the IEEE Vehicular Technology Society in
2003, the IEEE Communications Society & Information Theory Society
Joint Paper Award in 2004 and in 2011, the Okawa Research Award
in 2006, the Alexander von Humboldt Professorship in 2014, and the
Vodafone Innovation Prize in 2015. Giuseppe Caire has been a Fellow of the IEEE
since 2005. He served on the Board of Governors of the IEEE
Information Theory Society from 2004 to 2007, and as an officer from 2008
to 2013. He was President of the IEEE Information Theory Society in
2011. His main research interests are in the field of communications
theory, information theory, channel and source coding with particular
focus on wireless communications.
4
Initial Beam Alignment for mmWave
Single-Carrier Systems
4.1 Introduction
As discussed before, the IEEE 802.11ad standard specifies two operating modes in the
60 GHz band, i.e., the OFDM mode for high-performance applications (e.g., high data
rate), and the single-carrier (SC) mode for low-power and low-complexity implementation.
Building on the efficient BA scheme for mmWave OFDM systems provided in the last
chapter, this chapter focuses on developing a new BA scheme for mmWave SC systems.
4.2 Clarification of each author’s contributions
This chapter is a journal publication, which is joint work with Saeid Haghighatshoar
and Giuseppe Caire. I wrote this journal paper as the first author. The citation information is
given below:
X. Song, S. Haghighatshoar, and G. Caire,“Efficient beam alignment for mmWave
single-carrier systems with hybrid MIMO transceivers,” IEEE Transactions on Wireless
Communications, 2019. DOI: 10.1109/TWC.2019.2892043
All the authors contributed to this paper, but I have implemented all the experiments
and simulations. I also wrote the complete first draft (including all sections) of this
paper.
Saeid Haghighatshoar provided valuable ideas for the signaling model. He also
edited the English of my first draft.
Giuseppe Caire, who is my PhD supervisor, provided valuable discussions in each
meeting on this work. He also made the final revision of the overall draft.
4.3 Original journal article
The following article is a reprint of the original journal paper. It is the accepted version
of the paper. The copyright information is given on page xii of this thesis as well as on
the first page of the reprinted paper.
Efficient Beam Alignment for mmWave
Single-Carrier Systems with Hybrid MIMO
Transceivers
Xiaoshen Song, Student Member, IEEE, Saeid Haghighatshoar, Member, IEEE, Giuseppe Caire, Fellow,
IEEE
©2019 IEEE. Reprinted, with permission, from X. Song, S. Haghighatshoar, and G. Caire, “Efficient beam alignment for mmWave single-carrier
systems with hybrid MIMO transceivers,” IEEE Transactions on Wireless Communications, 2019. The published version can be found online:
https://ieeexplore.ieee.org/abstract/document/8625694. This reprint is the accepted version of the paper.
Abstract—Communication at millimeter wave (mmWave)
bands is expected to become a key ingredient of next
generation (5G) wireless networks. Effective mmWave
communications require fast and reliable methods for
beamforming at both the User Equipment (UE) and the
Base Station (BS) sides, in order to achieve a sufficiently
large Signal-to-Noise Ratio (SNR) after beamforming. We
refer to the problem of finding a pair of strongly
coupled narrow beams at the transmitter and receiver
as the Beam Alignment (BA) problem. In this paper, we
propose an efficient BA scheme for single-carrier mmWave
communications. In the proposed scheme, the BS periodically
probes the channel in the downlink via a pre-specified
pseudo-random beamforming codebook and pseudo-random
spreading codes, letting each UE estimate the Angle-of-Arrival
/ Angle-of-Departure (AoA-AoD) pair of the multipath channel
for which the energy transfer is maximum. We leverage
the sparse nature of mmWave channels in the AoA-AoD
domain to formulate the BA problem as the estimation of a
sparse non-negative vector. Based on the recently developed
Non-Negative Least Squares (NNLS) technique, we efficiently
find the strongest AoA-AoD pair connecting each UE to the
BS. We evaluate the performance of the proposed scheme
under a realistic channel model, where the propagation
channel consists of a few multipath components each having
different delays, AoAs-AoDs, and Doppler shifts. The channel
model parameters are consistent with experimental channel
measurements. Simulation results indicate that the proposed
method is highly robust to fast channel variations caused by
the large Doppler spread between the multipath components.
Furthermore, we also show that after achieving BA the
beamformed channel is essentially frequency-flat, such that
single-carrier communication needs no equalization in the
time domain.
Index Terms—mmWave, Beam Alignment, Single-Carrier,
Compressed Sensing, Non-Negative Least Squares (NNLS).
The authors are with the Electrical Engineering and Computer Science
Department, Technische Universität Berlin, 10587 Berlin, Germany
(e-mail: [email protected]).
The work of X. Song is sponsored by the China Scholarship Council
(201604910530).
The work of G. Caire and S. Haghighatshoar is partially funded by a
Professorship Grant of the Alexander von Humboldt Foundation and by
the EU H2020 Project SERENA.
I. INTRODUCTION
The majority of existing wireless communication systems
operate in the sub-6GHz microwave spectrum, which
has now become very crowded. As a result, millimeter
wave (mmWave) spectrum ranging from 30 to 300 GHz
has been considered as an alternative to achieve very
high data rates in the next generation wireless systems.
At these frequencies, a signal bandwidth of 1GHz with
Signal-to-Noise Ratio (SNR) between 0dB and 3dB yields
data rates ∼1Gb/s per data stream. A mmWave Base
Station (BS) supporting multiple data streams through the
use of multiuser Multiple-Input Multiple-Output (MIMO)
can achieve tens of Gb/s of aggregate rate, thus fulfilling
the requirements of enhanced Mobile Broad Band (eMBB)
in 5G [1, 2].
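As a quick sanity check of the per-stream rate quoted above, the following sketch evaluates the Shannon formula $B\log_2(1+\mathrm{SNR})$ for a 1 GHz bandwidth at 0 dB and 3 dB. The ideal AWGN channel is a simplifying assumption made here for illustration, and the function name is hypothetical.

```python
import math

def shannon_rate_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon capacity B*log2(1+SNR) of an ideal AWGN channel, in bit/s."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)

bandwidth_hz = 1e9  # 1 GHz, as in the text
rate_0db = shannon_rate_bps(bandwidth_hz, 0.0)
rate_3db = shannon_rate_bps(bandwidth_hz, 3.0)

for snr_db, rate in ((0.0, rate_0db), (3.0, rate_3db)):
    print(f"SNR = {snr_db:.0f} dB -> {rate / 1e9:.2f} Gb/s")
```

Under these assumptions the rate is 1 Gb/s at 0 dB and roughly 1.6 Gb/s at 3 dB, consistent with the order of magnitude stated in the text.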
A main challenge of communication at mmWaves
is the short range of isotropic propagation. According
to Friis’s Law [3], the effective area of an isotropic
antenna decreases polynomially with frequency; therefore,
the isotropic pathloss at mmWaves is considerably larger
than that of its sub-6GHz counterpart. Moreover, signal
propagation through scattering elements also suffers from a
large attenuation at high frequencies. Fortunately, the small
wavelength of mmWave signals makes it possible to pack a large
number of antenna elements in a small form factor, such
that it is possible to cope with the severe isotropic pathloss
by using large antenna arrays both at the BS side and
the User Equipment (UE) side, providing an overall large
beamforming gain. An essential component to obtain such
large antenna gains consists of identifying suitable narrow
beam combinations, i.e., a pair of Angle of Departure
(AoD) at the BS and Angle of Arrival (AoA) at the UE,
yielding a sufficiently large beamforming gain through the
scatterers in the channel.¹ The problem of finding an
AoA-AoD pair with a large channel gain is referred to
as Initial Beam Training, Acquisition, or Alignment in the
literature (see references in Section I-A). Consistently with
our previous work [4], we shall refer to it simply as Beam
Alignment (BA).
¹We refer to AoD for the BS and AoA for the UE since the proposed
scheme consists of downlink probing from the BS to the UEs. Of course,
due to the propagation angle reciprocity, the roles of AoA and AoD are
reversed in the uplink.
It is important to define the conditions under which the
BA operation must be performed. In this work, we focus
on MIMO devices with a Hybrid Digital Analog (HDA)
structure. HDA MIMO is widely proposed especially for
mmWave systems, since the size and power consumption
of all-digital architectures prevent the integration of many
antenna elements in a small space. An HDA transceiver
architecture consists of the concatenation of an analog part
implementing the beamforming functions, and a digital part
implementing the baseband processing [5, 6]. This poses
some specific challenges: i) The signal received at the
antennas passes through an analog beamforming network
with only a limited number of Radio Frequency (RF)
chains, much smaller than the number of antennas. Hence,
the baseband received signal samples at the
outputs of all physical antenna elements are not simultaneously
available; ii) Due to the large isotropic pathloss, the
received signal power is very low before beamforming, i.e.,
at every antenna port. Therefore, the BA scheme must be
able to operate in very low SNR conditions; iii) Because
of the large number of antennas at both sides, the size
of the channel matrix between each UE and the BS is
very large. However, extensive measurements have shown
that mmWave channels typically exhibit a small number
of multipath components (on average, up to 3 strong
components), each corresponding to a scattering cluster
with small delay / angle spreading [7, 8]. Considering the
discretization of the AoA-AoD domain according to the
UE and BS array resolution, a suitable BA scheme requires
the identification of a very sparse set of AoA-AoD pairs
coupled via strong propagation coefficients in the very
high-dimensional matrix of all possible pairs of discrete
beam directions [9, 10].
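The low pre-beamforming SNR in challenge ii) can be made concrete with Friis's free-space formula. The sketch below is illustrative only: the 100 m distance, the 3 GHz and 60 GHz carriers, and the 32-element arrays are assumed values rather than parameters of this paper, and the $10\log_{10}(MN)$ gain presumes perfectly aligned beams.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def isotropic_path_loss_db(distance_m: float, freq_hz: float) -> float:
    """Free-space loss between two isotropic antennas (Friis), in dB."""
    wavelength = C / freq_hz
    return 20 * math.log10(4 * math.pi * distance_m / wavelength)

def ideal_array_gain_db(num_tx: int, num_rx: int) -> float:
    """Combined gain of perfectly aligned Tx/Rx arrays: 10*log10(M*N)."""
    return 10 * math.log10(num_tx * num_rx)

distance_m = 100.0  # assumed link distance
extra_loss_db = (isotropic_path_loss_db(distance_m, 60e9)
                 - isotropic_path_loss_db(distance_m, 3e9))
gain_db = ideal_array_gain_db(32, 32)  # assumed M = N = 32

print(f"extra isotropic loss at 60 GHz vs 3 GHz: {extra_loss_db:.1f} dB")
print(f"ideal beamforming gain with 32 x 32 antennas: {gain_db:.1f} dB")
```

With these assumed numbers the additional isotropic loss is about 26 dB, which a well-aligned pair of 32-element arrays can in principle recover. That gain is only available after BA, which is why the alignment itself has to work at very low SNR.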
Another fundamental aspect of the BA problem
is that this is the first operation that a UE must
accomplish in order to communicate with the BS. Hence,
while coarse frame and carrier frequency synchronization
may be assumed (especially for non-standalone
systems, assisted by some other existing cell operating
at lower frequencies), fine timing and Doppler shift
compensation cannot be assumed. It follows that the BA
operation must cope with significant timing offsets and
Doppler shifts. In addition, in a multipath propagation
environment with paths coming from different directions,
each path may be affected by a different Doppler shift.
In multicarrier (OFDM-based) systems, this may lead
to significant inter-carrier interference, which has been
typically ignored in most of the current literature.
A. Related Work
The most straightforward BA method is an exhaustive
search, where the BS and the UE scan all the AoA-AoD
beam pairs until they find a strong one [7]. This is, however,
prohibitively time-consuming, especially considering the
very large dimension of the channel matrix due to the very
large number of antennas. Several BA algorithms have been
recently proposed in the literature. All these algorithms, in
some way, aim at achieving reliable BA while using less
overhead than exhaustive search.
In [11], a two-stage pseudo-exhaustive BA scheme was
proposed, where in the first stage, the BS isotropically
probes the channel, while the UE scans its discrete beam
directions (beam sweeping) to find the best AoA. In the
second stage, the UE probes the channel along the AoA
found in the first stage, while the BS performs beam
sweeping to find the best AoD. A main limitation of [11] is
that, due to the isotropic BS beamforming in the first stage,
the scheme may suffer from a low pre-beamforming SNR
[9, 12, 13], which may impair the whole BA performance.
Some mmWave standards such as IEEE 802.11ad [14]
proposed to use multi-level hierarchical BA schemes
(e.g., see also [15–18]). The underlying idea is to start
with sectors of wide beams to do a coarse BA and
then shrink the beamwidth adaptively and successively
to obtain a more refined BA. The drawback of such
schemes, however, is that each UE has its own specific
AoA as seen from the BS side, thus, the BS needs
to interact with each UE individually. As a result, all
these hierarchical schemes require non-trivial coordination
among the UEs and the BS, which is difficult to have
at the initial channel acquisition stage. Moreover, since
hierarchical schemes require interactive uplink-downlink
communication between the BS and each individual UE,
it is not clear how the overhead of such schemes scales
in small cell scenarios with significant mobility of users
across cells, where the BA procedure should be repeated
at each handover.
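To give a rough feel for the overheads discussed above, the following sketch counts probing measurements for exhaustive search and for a hierarchical search. The binary (bisection) hierarchy with two sectors tested per level, the beam counts, and all function names are assumptions introduced for illustration; actual standards and the cited schemes differ in their details.

```python
import math

def exhaustive_probes(num_bs_beams: int, num_ue_beams: int) -> int:
    """Every BS beam is tried against every UE beam."""
    return num_bs_beams * num_ue_beams

def bisection_probes(num_beams: int) -> int:
    """One side of an assumed binary hierarchy: two sectors tested per level."""
    return 2 * math.ceil(math.log2(num_beams))

def hierarchical_probes(num_bs_beams: int, num_ue_beams: int, num_users: int) -> int:
    """Refinement is done per UE, so the count grows linearly with the UEs."""
    per_user = bisection_probes(num_bs_beams) + bisection_probes(num_ue_beams)
    return num_users * per_user

M, N = 64, 32  # assumed numbers of discrete BS and UE beam directions
print("exhaustive:", exhaustive_probes(M, N))
for users in (1, 10, 100):
    print(f"hierarchical, {users} UE(s):", hierarchical_probes(M, N, users))
```

Under these assumptions the hierarchical search is far cheaper for a single UE, but its cost grows in proportion to the number of UEs because each refinement step is interactive and UE-specific.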
The sparse nature of mmWave channels, i.e.,
large-dimension channel matrices along with very
sparse scatterers in the AoA-AoD domain [7, 8], motivates
the application of Compressed Sensing (CS) methods to
speed up the BA. There are two groups of CS-based
methods in the literature. The first group (e.g., see
[9, 19–21]) applies CS to estimate the complex baseband
channel coefficients. These algorithms are efficient and
particularly attractive for multiuser scenarios, but they are
based on the assumption that the instantaneous channel
remains invariant during the whole probing/measuring
stage. As anticipated before, this assumption is difficult
to meet at mmWaves because of the large Doppler
spread between the multipath components coming from
different angles, implying significant time-variations of
the channel coefficients even for UEs with small mobility
[10, 22, 23].² The second group of CS-based schemes
focuses on estimating the second-order statistics of the
channel, i.e., the covariance of the channel matrix, which
is very robust to channel variations. In [10] for example,
a Maximum Likelihood (ML) method was proposed to
estimate the covariance of the channel matrix. However,
this scheme suffers from low SNR and the BA is achieved
only at the UE side because of isotropic probing at the
BS.
²Notice that the channel delay spread and time-variation are greatly
reduced after BA is achieved, since once the beams are aligned, the
communication occurs only through a single multipath component with
small effective angular spread, whose delay and Doppler shift can be well
compensated [23]. However, before BA is achieved the channel delay
spread and time-variation can be large due to the presence of several
multipath components, each with its own delay and Doppler shift. In
this case, even a small motion of a few centimeters traverses several
wavelengths, potentially producing multiple deep fades [22].
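The remark that a few centimeters of motion span several wavelengths can be checked with elementary arithmetic. The sketch below assumes a 60 GHz carrier, a 3 cm displacement, and a 1 m/s relative speed; these are illustrative values, not measurements from the cited works. The Doppler shift uses the standard relation $\nu = v f_0 / c$.

```python
C = 299_792_458.0  # speed of light in m/s

def wavelength_m(freq_hz: float) -> float:
    """Carrier wavelength lambda = c / f."""
    return C / freq_hz

def doppler_shift_hz(speed_mps: float, freq_hz: float) -> float:
    """Doppler shift nu = v * f0 / c for a given relative speed."""
    return speed_mps * freq_hz / C

f0 = 60e9                                # assumed carrier frequency
lam = wavelength_m(f0)                   # about 5 mm
wavelengths_in_3cm = 0.03 / lam          # about 6 wavelengths
nu_walking = doppler_shift_hz(1.0, f0)   # about 200 Hz at 1 m/s

print(f"wavelength: {lam * 1e3:.2f} mm")
print(f"3 cm of motion spans {wavelengths_in_3cm:.1f} wavelengths")
print(f"Doppler shift at 1 m/s: {nu_walking:.0f} Hz")
```

Because each multipath component sees its own relative speed, the individual Doppler shifts differ, and the resulting phase drift between paths is what makes the instantaneous channel coefficients vary quickly before BA.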
In our previous work [4], we proposed a novel efficient
BA scheme that jointly estimates the two-sided AoA-AoD
of the strongest path from the second-order statistics
of the channel matrix. A limitation of [4], as well as
most works based on OFDM signaling [9, 10], is the
assumption of perfect OFDM frame synchronization and no
inter-carrier interference. This is in fact difficult to achieve
at mmWaves due to the potentially large multipath delay
spread, Doppler shifts, and very low SNR before BA. These
weaknesses, together with the fact that OFDM signaling
suffers from large Peak-to-Average Power Ratio (PAPR),
have motivated the proposal of single-carrier transmission
[24, 25] as a more favorable option at mmWaves. Recently,
[20, 21] proposed a time-domain BA approach based
on CS techniques for single-carrier mmWave systems.
However, as in [9, 19], these works focus on estimating
the instantaneous complex channel coefficients, with the
assumption that these complex coefficients remain invariant
over the whole training stage, which is an unrealistic
assumption, as discussed above [4, 10, 22, 23].
B. Contributions
In this paper, we propose a novel efficient BA scheme
for single-carrier mmWave communications with HDA
transceivers and frequency-selective multipath channels. In
the proposed scheme, each UE independently estimates its
best AoA-AoD pair over the reserved beacon slots (see
Section III), during which the BS periodically broadcasts
its probing time-domain sequences. We exploit the sparsity
of the mmWave channel in both angle and delay domains
[26] to reduce the training overhead. We also pose
the estimation of the strongest AoA-AoD pair as a
Non-Negative Least Squares (NNLS) problem, which can
be efficiently solved by standard techniques. Our main
contributions can be summarized as follows:
1) Pure time-domain operation. Unlike our prior work in
[4] and other works based on OFDM signaling [9, 10], the
scheme proposed in this paper takes place completely in
the time domain and uses Pseudo-Noise (PN) sequences
with good correlation properties that suit single-carrier
mmWave systems.
2) More general and realistic mmWave channel model.
We consider a quite general mmWave wireless channel
model, taking into account the fundamental features of
mmWave channels such as fast time-variation due to
Doppler, frequency-selectivity, and the AoA-AoD sparsity
[10, 22, 27].
3) Tolerance to large Doppler shifts. As in [4, 10],
we design a signaling scheme to collect quadratic
measurements, yielding estimates of the channel
second-order statistics in the discretized AoA-AoD
domain. Since quadratic measurements are related to the
estimation of the received signal power, which is invariant
with respect to the phase rotation of the channel taps,
the proposed scheme is highly robust to the channel
time-variations caused by the large Doppler spread
between the multipath components.
4) Impact of the PN sequence length. Unlike our prior
work in [28] and the work in [6], where Doppler is
modeled as a phase rotation across different frames but
the phase is kept constant over each beacon slot, here
we consider a truly continuous linear (in time) phase
rotation within the whole beacon slot. As a by-product
of this realistic Doppler model, we notice that longer PN
sequences do not necessarily exhibit better performance
since they undergo larger phase rotations. We illustrate by
numerical simulations that there is an optimal PN sequence
length based on the given set of parameters, using which
the proposed scheme achieves better performance in the
presence of large Doppler shifts.
5) System-level scalability and low-complexity beam
direction estimation. In our scheme, the BS actively probes
the channel by periodically broadcasting a pseudo-random
beamforming codebook over reserved beacon slots while
all the UEs remain in the listening mode. Therefore, each
UE is able to collect measurements from its channel locally
and independently of all the other UEs. We pose the
identification of the strongly coupled AoA-AoD pairs based
on the measurements of each UE as an underdetermined
system of noisy linear equations and solve it efficiently
using Non-Negative Least-Squares (NNLS). Due to the
properties of the NNLS, this yields a sparse estimate of
the vector of non-negative channel gain coefficients in the
discrete AoA-AoD domain. We illustrate via numerical
simulations that the proposed scheme outperforms existing
time-domain BA algorithms proposed in the literature in
terms of training overhead. Moreover, in contrast with
hierarchical algorithms, it does not require multiple rounds
of uplink-downlink interaction between the BS and the
UEs during the BA. Therefore, the proposed scheme
is scalable, in the sense that its protocol overhead is
essentially constant with the number of active UEs in the
system.
6) Effectiveness of single-carrier modulation. Our
proposed time-domain BA scheme is tailored to
single-carrier mmWave systems. In particular, we
show that, after achieving BA, the effective channel
reduces essentially to a single path with a single delay
and Doppler shift, with relatively large SNR due to the
high beamforming gain. This means that single-carrier
modulation needs no time-domain equalization and the
baseband signal processing after BA is indeed very
simple, requiring only standard timing and carrier
frequency offset (CFO) recovery, operating in relatively
large SNR conditions (after beamforming).
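Contributions 3) and 4) can be illustrated with a toy experiment. The sketch below despreads a random ±1 sequence whose chips have been rotated by a linearly growing Doppler phase and reports the magnitude of the zero-lag correlation. It is a simplified illustration and not the signaling scheme of this paper: the ±1 sequence, the noise-free single-path setting, and the deliberately exaggerated normalized Doppler of $10^{-4}$ cycles per chip are all assumptions.

```python
import cmath
import math
import random

def pn_sequence(length: int, seed: int = 0) -> list:
    """Random +/-1 chips standing in for a PN sequence (illustrative only)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def correlation_peak(length: int, doppler_cycles_per_chip: float,
                     phase0: float = 0.0) -> float:
    """Magnitude of the zero-lag correlation with a Doppler-rotated copy."""
    chips = pn_sequence(length)
    acc = 0j
    for n, c in enumerate(chips):
        phase = phase0 + 2 * math.pi * doppler_cycles_per_chip * n
        received = c * cmath.exp(1j * phase)  # chip after channel phase rotation
        acc += c * received                   # despreading with the known chip
    return abs(acc)

doppler = 1e-4  # assumed normalized Doppler, exaggerated for visibility
peak_short = correlation_peak(1000, doppler)
peak_mid = correlation_peak(5000, doppler)
peak_long = correlation_peak(10000, doppler)
peak_rotated = correlation_peak(1000, doppler, phase0=1.0)

for name, value in (("1000 chips", peak_short), ("5000 chips", peak_mid),
                    ("10000 chips", peak_long)):
    print(f"{name}: correlation peak = {value:.1f}")
```

In this toy setting the peak first grows with the sequence length and then collapses once the phase drifts by a full cycle over the sequence, which matches the qualitative observation that longer PN sequences are not necessarily better. The value `peak_rotated` equals `peak_short`, reflecting that a common phase offset leaves the measured power unchanged.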
Notation: We denote vectors, matrices and scalars by
$\mathbf{a}$, $\mathbf{A}$, and $a$ ($A$), respectively. We represent sets by $\mathcal{A}$ and
their cardinality by $|\mathcal{A}|$. We use $\mathbb{E}$ for the expectation,
$\otimes$ for the Kronecker product, $\mathbf{A}^{T}$ for transpose, $\mathbf{A}^{*}$ for
conjugate, and $\mathbf{A}^{H}$ for conjugate transpose. We define the
vectorization operator as $\mathrm{vec}(\cdot)$. For an integer $k \in \mathbb{Z}$, we
use the shorthand notation $[k]$ for the index set $\{1, ..., k\}$.
Fig. 1: Illustration of the channel sparsity in the Angle of Arrival
(AoA), Angle of Departure (AoD), and delay domains. (a) Slices
of the channel power spread function over discrete delay taps,
where only a few slices contain scattering components with large
power. (b) Marginal power spread function of the channel in
the AoA-AoD domain obtained from the integration of the power
spread function over the delay domain.
II. PROBLEM STATEMENT
In this section, we provide a general overview of the BA
problem based on the channel second-order statistics. Then,
in Sections III and IV we provide the fully detailed system
model and the proposed algorithm.
A. Channel Second-Order Statistics
We consider a widely used and well accepted mmWave
scattering channel model (e.g., see [7, 8]), where the
propagation between the BS and a generic UE occurs
along a sparse collection of multipath components in the
continuous AoA-AoD-delay $(\phi, \theta, \tau)$ domain, including
a possible Line-of-Sight (LOS) component as well as
some Non-Line-of-Sight (NLOS) reflected paths [25].
The channel follows locally the classical Wide-Sense
Stationary with Uncorrelated Scattering (WSSUS) model
[29, 30]. The average signal energy distribution over the
AoA-AoD-delay domain is described by the Power Spread
Function (PSF) $f_p(\phi, \theta, \tau)$. In brief, $f_p(\phi, \theta, \tau)\, d\phi\, d\theta\, d\tau$
is the aggregate signal power transfer coefficient for the
propagation paths in the AoA-AoD region $[\phi, \phi + d\phi) \times
[\theta, \theta + d\theta)$ with path delays in $[\tau, \tau + d\tau)$. The PSF encodes
the second-order statistics of the channel and it is locally
time-invariant as long as the propagation geometry does
not change significantly. The time scale over which the
PSF is time-invariant is very large with respect to the
inverse of the signaling bandwidth, justifying the local
WSS assumption. Practical channel measurements have
shown that only a few discrete delays carry significant
signal energy, corresponding to the propagation delays of
the LOS and some reflected NLOS paths [7, 8, 26]. This
is illustrated in Fig. 1 (a), where only a few slices of the
PSF with respect to the delay domain contain scattering
components with large power. The marginal PSF of the
channel in the AoA-AoD domain is obtained by integrating
over the delay variable as
$$f_p(\phi, \theta) = \int_{\tau} f_p(\phi, \theta, \tau)\, d\tau, \qquad (1)$$
and it is typically very sparse in the continuous angle
domain (see, e.g., Fig. 1 (b)).
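On a discrete grid, the marginalization in (1) reduces to a sum over the delay axis. The following sketch uses a toy 3 x 3 x 4 grid with two paths; the grid size, the power values, and the function name are made up for illustration and do not come from the paper.

```python
def marginal_psf(psf):
    """Sum a discretized PSF psf[aoa][aod][delay] over the delay axis."""
    return [[sum(delay_slice) for delay_slice in row] for row in psf]

num_aoa, num_aod, num_delay = 3, 3, 4  # assumed toy grid
psf = [[[0.0] * num_delay for _ in range(num_aod)] for _ in range(num_aoa)]
psf[0][1][0] = 1.0   # strong path at delay tap 0
psf[0][1][1] = 0.1   # small delay spread of the same cluster
psf[2][2][3] = 0.4   # weaker reflected path at delay tap 3

marginal = marginal_psf(psf)
support = [(i, j) for i in range(num_aoa) for j in range(num_aod)
           if marginal[i][j] > 0]

for row in marginal:
    print(row)
print("support of the marginal PSF:", support)
```

Only two of the nine AoA-AoD bins carry power in this toy example, which mirrors the sparsity sketched in Fig. 1 (b).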
B. Beam-Alignment Using Second-order Statistics
In terms of BA, we are interested in finding an AoA-AoD
pair corresponding to a strong communication path between
the UE and the BS. If the marginal PSF of the channel in
the AoA-AoD domain $f_p(\phi, \theta)$ in (1) were known a priori,
the BA problem would simply boil down to finding the
support of $f_p(\phi, \theta)$ (e.g., see the two bubbles in Fig. 1
(b)). In practice, however, $f_p(\phi, \theta)$ is not known and should
be estimated via a suitable signaling scheme. With this in
mind, we can pose the BA problem as follows.
BA Problem: Design a suitable signaling between the
BS and the UE, find an estimate of the AoA-AoD PSF
$f_p(\phi, \theta)$, and identify an AoA-AoD pair $(\phi_0, \theta_0)$ with
a sufficiently large strength $f_p(\phi_0, \theta_0)$.
In this paper, we use pseudo-random waveforms with
good auto-/cross-correlation properties as the probing
signals. We will show that, using the proposed signaling,
each UE is able to collect its own quadratic measurements,
which yield noisy linear projections of a suitably
discretized version of the marginal PSF $f_p(\phi, \theta)$. By
expressing such linear projections as a matrix-vector
product, we formulate the PSF estimation as the
Least-Squares solution of an underdetermined system
of linear equations. Imposing the non-negativity of the
discretized PSF coefficients, we obtain an
NNLS problem, which naturally yields a sparse solution
[31, 32].
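The claim that non-negativity alone can single out a sparse solution of an underdetermined system can be seen on a toy example. In the sketch below, four noise-free measurements observe six non-negative unknowns, and each unknown contributes to exactly two measurements. The 0/1 measurement matrix, the projected-gradient solver, and all names are assumptions chosen for a self-contained illustration; they are not the measurement design or the NNLS solver used in this paper.

```python
from itertools import combinations

def nnls_projected_gradient(A, b, step, num_iters=500):
    """Minimise 0.5*||A x - b||^2 subject to x >= 0 by projected gradient."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(num_iters):
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i]
                    for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m))
                for j in range(n)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[j] - step * grad[j]) for j in range(n)]
    return x

num_meas = 4
pairs = list(combinations(range(num_meas), 2))  # 6 bins, each seen by 2 probes
A = [[1.0 if i in pair else 0.0 for pair in pairs] for i in range(num_meas)]

x_true = [0.0] * len(pairs)
x_true[4] = 2.0  # a single strong AoA-AoD bin (assumed)
b = [sum(A[i][j] * x_true[j] for j in range(len(pairs)))
     for i in range(num_meas)]

# step = 1/6 is 1/lambda_max(A^T A) for this particular matrix A.
x_hat = nnls_projected_gradient(A, b, step=1.0 / 6.0)
strongest = max(range(len(x_hat)), key=lambda j: x_hat[j])

print("estimate:", [round(v, 6) for v in x_hat])
print("strongest bin index:", strongest)
```

Here the system has more unknowns than equations, yet the zero entries of `b` force every bin that touches an unexcited measurement to zero once `x >= 0` is imposed, leaving the single true bin. Unconstrained least squares has no such mechanism and would in general spread energy over several bins.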
Fig. 2 (a) illustrates the proposed frame structure which
consists of three parts: the downlink beacon slot, the
Random Access Control CHannel (RACCH) slot, and the
data slot. An overview of the proposed initial acquisition
and BA protocol is illustrated in Fig. 2 (b). As in [4, 28],
the measurements are collected by the UEs from the
sequence of downlink beacon slots broadcast by the
BS. By running the NNLS estimation algorithm mentioned
above, each UE selects its strongest AoA-AoD pair, i.e., the
discrete beam indices corresponding to the strongest path in
the estimated discretized PSF. Then, the initial acquisition
protocol proceeds as described in [4, 28]. Namely, the UE
sends a beamformed packet to the BS in the RACCH slot,
during which the BS stays in listening mode and uses
its $M_{\mathrm{RF}}$ RF chains to form $M_{\mathrm{RF}}$ coarse beam patterns
(sectors) covering the whole BS angle domain, in order
to provide some receiver beamforming gain. The RACCH
packet contains basic information such as user ID and the
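where $(\phi_l, \theta_l, \tau_l, \nu_l)$ denote the AoA, AoD, delay, and
Doppler shift of the $l$-th component, and $\delta(\cdot)$ denotes
the Dirac delta function. The vectors $\mathbf{a}_T(\theta_l) \in \mathbb{C}^M$ and
$\mathbf{a}_R(\phi_l) \in \mathbb{C}^N$ are the array response vectors of the BS
and UE at AoD $\theta_l$ and AoA $\phi_l$, respectively, with elements
given by
$$[\mathbf{a}_T(\theta)]_m = e^{j(m-1)\pi\sin(\theta)}, \quad m \in [M], \qquad (3a)$$
$$[\mathbf{a}_R(\phi)]_n = e^{j(n-1)\pi\sin(\phi)}, \quad n \in [N], \qquad (3b)$$
where we assume that the spacing of the ULA antennas
equals half the wavelength.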
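The array response in (3a)-(3b) is easy to evaluate numerically. The sketch below builds the half-wavelength ULA response (with a 0-based element index) and computes the gain of a matched beam, first aligned with the path and then steered to a null direction. The 16-element array, the chosen angles, and the function names are illustrative assumptions.

```python
import cmath
import math

def ula_response(num_antennas: int, angle_rad: float) -> list:
    """Half-wavelength ULA response: element m (0-based) is exp(j*m*pi*sin(angle))."""
    return [cmath.exp(1j * m * math.pi * math.sin(angle_rad))
            for m in range(num_antennas)]

def beamforming_gain(num_antennas: int, steer_rad: float, path_rad: float) -> float:
    """|a(steer)^H a(path)|^2 / M for a matched beam pointed at steer_rad."""
    w = ula_response(num_antennas, steer_rad)
    a = ula_response(num_antennas, path_rad)
    inner = sum(wm.conjugate() * am for wm, am in zip(w, a))
    return abs(inner) ** 2 / num_antennas

M = 16  # assumed number of antennas
aligned_gain = beamforming_gain(M, 0.3, 0.3)
# Steering so that sin(angle) differs by 2/M from the path hits a pattern null.
null_gain = beamforming_gain(M, math.asin(2.0 / M), 0.0)

print(f"aligned gain: {aligned_gain:.2f} (equals M = {M})")
print(f"gain in a null direction: {null_gain:.2e}")
```

An aligned beam collects a gain of $M$, while a modest pointing error can remove the gain almost entirely. This is the reason the AoA-AoD pair has to be identified before data transmission.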
For the sake of modeling simplicity, we assume in (2)
that each multipath component has a very narrow footprint
over the AoA-AoD and delay domain. Extension to more
widely spread multipath clusters is straightforward and
will be applied in the numerical simulations. Moreover,
we make the standard assumption in array processing
that the array response vectors are invariant with frequency
over the signal bandwidth. More precisely, we assume
that the wavelength $\lambda$ over the frequency interval
$f \in [f_0 - B/2, f_0 + B/2]$ can be approximated as $\lambda_0 = c/f_0$,
where $c$ denotes the speed of light. This is indeed
well verified when $B$ is less than 1/10 of the carrier
frequency (e.g., $B = 1$ GHz with carrier between 30 and 70
GHz). Each scatterer corresponding to an AoA-AoD-delay triple
$(\phi_l, \theta_l, \tau_l)$ has a Doppler shift $\nu_l = \frac{\Delta v_l f_0}{c}$, where $\Delta v_l$
indicates the relative speed of the receiver, the $l$-th scatterer,
and the transmitter [6]. We adopt a block fading model,
where the coefficient $\rho_{s,l}$ of the $l$-th multipath component
remains invariant over the channel coherence time
$\Delta t_c$ but changes i.i.d. randomly across different coherence
times [10]. Each scatterer is formed by the superposition
of a large number of micro-scattering components (e.g.,
due to rough surfaces) having (approximately) the same
AoA-AoD and delay. By the central limit theorem, it
is customary to model the superposition of these many
small effects as Gaussian [29, 30]. Hence, the multipath
component coefficients are modeled as Rice fading given
by
$$\rho_{s,l} \sim \sqrt{\gamma_l}\left(\sqrt{\frac{\eta_l}{1+\eta_l}} + \frac{1}{\sqrt{1+\eta_l}}\,\check{\rho}_{s,l}\right), \tag{4}$$
where $\gamma_l$ denotes the overall multipath component strength,
$\eta_l \in [0, \infty)$ indicates the strength ratio between the
LOS and the NLOS components, and $\check{\rho}_{s,l} \sim \mathcal{CN}(0, 1)$
is a zero-mean unit-variance complex Gaussian random
variable. In particular, $\eta_l \to \infty$ indicates a pure LOS path,
while $\eta_l = 0$ indicates a pure NLOS path, affected by
standard Rayleigh fading.
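The Rice model (4) can be sampled directly. The sketch below is only an illustration under the stated model; the helper name `sample_rice_coeff`, the seed, and the sample size are assumptions. It draws coefficients for a pure NLOS path ($\eta = 0$) and a strongly LOS path ($\eta = 100$), and the empirical mean power should be close to $\gamma_l$ in both cases.

```python
import numpy as np

def sample_rice_coeff(gamma, eta, size, rng):
    """Draw i.i.d. samples of rho_{s,l} according to (4), for finite eta."""
    # Zero-mean, unit-variance circularly symmetric complex Gaussian part.
    rho_check = (rng.standard_normal(size) + 1j * rng.standard_normal(size)) / np.sqrt(2)
    los = np.sqrt(eta / (1.0 + eta))       # deterministic (LOS) part
    nlos = rho_check / np.sqrt(1.0 + eta)  # random (NLOS) part
    return np.sqrt(gamma) * (los + nlos)

rng = np.random.default_rng(0)             # fixed seed, for reproducibility only
rho_nlos = sample_rice_coeff(gamma=0.6, eta=0.0, size=200_000, rng=rng)
rho_los = sample_rice_coeff(gamma=1.0, eta=100.0, size=200_000, rng=rng)
```

Since the LOS and NLOS parts are uncorrelated, $\mathbb{E}[|\rho_{s,l}|^2] = \gamma_l \big(\tfrac{\eta_l}{1+\eta_l} + \tfrac{1}{1+\eta_l}\big) = \gamma_l$ regardless of $\eta_l$.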
B. Proposed Signaling Scheme
We assume that the BS can simultaneously transmit up
to $M_{\rm RF} \ll M$ different pilot streams. In our previous
work [4], we considered OFDM signaling where the
different pilot streams are assigned to non-overlapping sets
of orthogonal subcarriers, such that (in the absence of
inter-carrier interference) they can be perfectly separated
by the UE in the frequency domain. However, such a scheme
may incur performance degradation in the presence of
significant Doppler spread between the different multipath
components and/or carrier frequency offset between the
BS transmitter and UE receiver. Hence, in this work
we consider single-carrier signaling where different PN
sequences are assigned to each pilot stream, similar
to Code Division Multiple Access (CDMA). In the
proposed scheme, the different pilot streams are generally
non-orthogonal but the cross-interference is very small if
the assigned PN sequences are sufficiently long with good
cross-correlation properties. As we shall see, this signaling
scheme yields very good robustness to Doppler.
Let $x_{s,i}(t)$, $t \in [s t_0, (s+1) t_0)$, be the continuous-time
baseband equivalent PN signal corresponding to the $i$-th
($i \in [M_{\rm RF}]$) pilot stream transmitted over the $s$-th slot, given
by
$$x_{s,i}(t) = \sum_{n=1}^{N_c} \varrho_{n,i}\, p_r(t - nT_c), \quad \varrho_{n,i} \in \{1, -1\}, \tag{5}$$
where $t_0$ denotes the duration of the PN sequence, $p_r(t)$
is a square-root Nyquist pulse$^4$ [33] with normalized
energy $\int |p_r(t)|^2\,dt = 1$, and $\varrho_{n,i}$, $n \in [N_c]$, is
the $n$-th chip symbol. The PN sequence has a chip
duration $T_c$, bandwidth $B' \approx 1/T_c \le B$ (where $B$
denotes the maximum available channel bandwidth), and
a total of $N_c = t_0/T_c$ chips. We shall choose a suitable
sequence length $N_c$, such that the time-domain signal (5)
is transmitted in a sufficiently small time interval $t_0$ over
which the channel can be considered time-invariant, i.e.,
$t_0 \ll \Delta t_c$.
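To make the "approximately orthogonal" property of the chip sequences in (5) concrete, here is a minimal sketch that draws random $\pm 1$ chips and compares auto- and cross-correlation peaks. Random binary sequences are used here only as a stand-in assumption; the text does not specify the PN family, and the seed and helper name `corr_peak` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)   # fixed seed, for reproducibility only
N_c = 64                         # chips per PN sequence
M_RF = 3                         # number of pilot streams

# One row of +/-1 chips per pilot stream (random stand-in for a PN family).
chips = rng.choice([-1.0, 1.0], size=(M_RF, N_c))

def corr_peak(a, b):
    """Largest magnitude of the aperiodic correlation over all lags, normalized by N_c."""
    return np.max(np.abs(np.correlate(a, b, mode="full"))) / len(a)

auto_peak = corr_peak(chips[0], chips[0])    # equals 1 at zero lag
cross_peak = corr_peak(chips[0], chips[1])   # residual cross-interference
```

For random sequences the cross-correlation level shrinks roughly as $1/\sqrt{N_c}$, which is why longer sequences separate the pilot streams better.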
To transmit the $i$-th pilot stream, the BS applies
a beamforming vector $\mathbf{u}_{s,i} \in \mathbb{C}^M$. Without loss of
generality, the beamforming vectors are normalized such
that $\|\mathbf{u}_{s,i}\| = 1$. As mentioned before, we consider an
HDA beamforming architecture where the beamforming
function is implemented in the analog RF domain. Hence,
the beamforming vectors $\mathbf{u}_{s,i}$, $i \in [M_{\rm RF}]$, are independent
of frequency and constant over the whole bandwidth. The
transmitted signal at slot $s$ is given by
$$\mathbf{x}_s(t) = \sum_{i=1}^{M_{\rm RF}} \sqrt{\frac{P_{\rm tot} T_c}{M_{\rm RF}}}\, x_{s,i}(t)\, \mathbf{u}_{s,i} = \sum_{i=1}^{M_{\rm RF}} \sum_{n=1}^{N_c} \sqrt{\frac{P_{\rm tot} T_c}{M_{\rm RF}}}\, \varrho_{n,i}\, p_r(t - nT_c)\, \mathbf{u}_{s,i}, \tag{6}$$
where $P_{\rm tot}$ is the total transmit power, which is equally
distributed among the $M_{\rm RF}$ RF chains of the BS. The term
$\frac{P_{\rm tot} T_c}{M_{\rm RF}}$ indicates the energy per chip of the transmitted
PN sequences, where $T_c$ denotes the chip duration.

$^4$ A square-root Nyquist pulse is a finite-energy waveform $p_r(t)$ such
that the squared magnitude of its spectrum $|P_r(f)|^2$ satisfies the Nyquist
criterion [33].
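Consequently, the received baseband equivalent signal at
the UE array is
$$\mathbf{r}_s(t) = \int \mathbf{H}_s(t, \tau)\, \mathbf{x}_s(t - \tau)\, d\tau = \sum_{l=1}^{L} \mathbf{H}_{s,l}(t)\, \mathbf{x}_s(t - \tau_l) = \sum_{i=1}^{M_{\rm RF}} \sum_{l=1}^{L} \sqrt{\frac{P_{\rm tot} T_c}{M_{\rm RF}}}\, \mathbf{H}_{s,l}(t)\, x_{s,i}(t - \tau_l)\, \mathbf{u}_{s,i}, \tag{7}$$
where $\mathbf{H}_{s,l}(t) := \rho_{s,l}\, e^{j2\pi\nu_l t}\, \mathbf{a}_R(\phi_l)\, \mathbf{a}_T(\theta_l)^H$, $l \in [L]$, are the
time-varying MIMO channel taps corresponding to the $L$
multipath components as in (2).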
With a hybrid MIMO structure, the UE does not have
direct access to (a sampled version of) the components of
$\mathbf{r}_s(t)$. Instead, at each beacon slot $s$, the UE must apply
some beamforming vector in the analog domain, obtaining a
projection of the received signal. Since the UE has $N_{\rm RF}$ RF
chains, it can obtain up to $N_{\rm RF}$ such projections per slot.
The analog RF signal received at the UE antenna array
is distributed across the $N_{\rm RF}$ RF chains for demodulation.
This is achieved by signal splitters that divide the signal
power by a factor of $N_{\rm RF}$. Thus, the received signal at the
output of the $j$-th RF chain at the UE side is given by
$$\hat{y}_{s,j}(t) = \frac{1}{\sqrt{N_{\rm RF}}}\, \mathbf{v}_{s,j}^H \mathbf{r}_s(t) + z_{s,j}(t) = \sum_{i=1}^{M_{\rm RF}} \sum_{l=1}^{L} \sqrt{E_{\rm dim}}\, \mathbf{v}_{s,j}^H \mathbf{H}_{s,l}(t)\, \mathbf{u}_{s,i}\, x_{s,i}(t - \tau_l) + z_{s,j}(t), \tag{8}$$
where $E_{\rm dim} = \frac{P_{\rm tot} T_c}{M_{\rm RF} N_{\rm RF}}$ indicates the per-stream pilot chip
energy distributed over the transmit and receive RF chains,
$\mathbf{v}_{s,j} \in \mathbb{C}^N$ denotes the normalized beamforming vector of
the $j$-th RF chain at the UE side with $\|\mathbf{v}_{s,j}\| = 1$, and $z_{s,j}(t)$
is the continuous-time complex Additive White Gaussian
Noise (AWGN) at the output of the $j$-th RF chain, with a
Power Spectral Density (PSD) of $N_0$ Watt/Hz. The noise
at the receiver is mainly introduced by the RF chain
electronics, e.g., filter, mixer, and A/D conversion. The
factor $\frac{1}{\sqrt{N_{\rm RF}}}$ in (8) takes into account the power split mentioned
above, assuming that this only applies to the useful signal
and not to the thermal noise. Therefore, this received signal
model is a conservative worst-case assumption.
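In realistic conditions, we have $T_c \nu_l \ll 1$.$^5$ Hence, the
phase time-variation over the duration of the chip pulse
shape is negligible. It follows that we can replace the
continuously time-varying matrix tap coefficient $\mathbf{H}_{s,l}(t)$
with its discrete approximation, which can be simply
written in the form
$$\mathbf{H}_{s,l}(t)\big|_{t \in [nT_c, (n+1)T_c)} \approx \rho_{s,l}\, e^{j2\pi(\check{\nu}_{s,l} + \nu_l n T_c)}\, \mathbf{a}_R(\phi_l)\, \mathbf{a}_T(\theta_l)^H = \mathbf{H}_{s,l}\, e^{j2\pi\nu_l n T_c} \tag{9}$$
with $n \in [N_c]$, where $\mathbf{H}_{s,l} := \rho_{s,l}\, e^{j2\pi\check{\nu}_{s,l}}\, \mathbf{a}_R(\phi_l)\, \mathbf{a}_T(\theta_l)^H$,
and where $\check{\nu}_{s,l}$ represents a phase rotation at the beginning
of the $s$-th beacon slot which is irrelevant since it can be
incorporated in the Gaussian coefficient $\rho_{s,l}$. As a result,
the product term $\mathbf{H}_{s,l}(t)\, x_{s,i}(t - \tau_l)$ in (8) can be written
as
$$\mathbf{H}_{s,l}(t)\, x_{s,i}(t - \tau_l) = \mathbf{H}_{s,l} \sum_{n=1}^{N_c} \varrho_{n,i}\, e^{j2\pi\nu_l n T_c}\, p_r(t - nT_c - \tau_l) := \mathbf{H}_{s,l}\, x^l_{s,i}(t - \tau_l), \tag{10}$$
where $x^l_{s,i}(t)$ is given by
$$x^l_{s,i}(t) = \sum_{n=1}^{N_c} \varrho_{n,i}\, e^{j2\pi\nu_l n T_c}\, p_r(t - nT_c). \tag{11}$$
Notice that $x^l_{s,i}(t)$ consists of a modified modulated PN
sequence where the chip symbols $\varrho_{n,i}$ are rotated by the
time-varying phase factor $e^{j2\pi\nu_l n T_c}$ due to the Doppler
shift. Substituting (10) into (8), we can write the received
signal $\hat{y}_{s,j}(t)$ in (8) as
$$\hat{y}_{s,j}(t) = \sum_{i=1}^{M_{\rm RF}} \sum_{l=1}^{L} \sqrt{E_{\rm dim}}\, \mathbf{v}_{s,j}^H \mathbf{H}_{s,l}\, \mathbf{u}_{s,i}\, x^l_{s,i}(t - \tau_l) + z_{s,j}(t). \tag{12}$$

$^5$ For example, consider $T_c = 1$ ns, $\Delta v_l = 10$ m/s and $f_0 = 60$ GHz,
yielding $\nu_l = 2$ kHz and $T_c \nu_l = 2 \cdot 10^{-6}$.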
Since the PN sequences assigned to the $M_{\rm RF}$ RF
chains are mutually (roughly) orthogonal, the $M_{\rm RF}$ pilot
streams transmitted from the BS side can be approximately
separated at the UE by passing each $j$-th received signal
(12) through a bank of matched filters, where the $i$-th filter
has impulse response $x^*_{s,i}(-t) = \sum_{n=1}^{N_c} \varrho_{n,i}\, p_r^*(-t - nT_c)$.
Consequently, the $i$-th BS pilot stream received through the
$j$-th RF chain at the UE is given by
$$\begin{aligned}
y_{s,i,j}(t) &= \int \hat{y}_{s,j}(\tau)\, x^*_{s,i}(\tau - t)\, d\tau \\
&= \sum_{l=1}^{L} \sum_{i'=1}^{M_{\rm RF}} \sqrt{E_{\rm dim}}\, \mathbf{v}_{s,j}^H \mathbf{H}_{s,l}\, \mathbf{u}_{s,i'}\, R^{x^l}_{i',i}(t - \tau_l) + z^c_{s,j}(t) \\
&\overset{(a)}{\approx} \sum_{l=1}^{L} \sqrt{E_{\rm dim}}\, \mathbf{v}_{s,j}^H \mathbf{H}_{s,l}\, \mathbf{u}_{s,i}\, R^{x^l}_{i,i}(t - \tau_l) + z^c_{s,j}(t),
\end{aligned} \tag{13}$$
where, for all $i, i' \in [M_{\rm RF}]$, $R^{x^l}_{i',i}(t) := \int x^l_{s,i'}(\tau)\, x^*_{s,i}(\tau - t)\, d\tau$
represents the correlation between the Doppler-rotated
sequence $x^l_{s,i'}(t)$ given by (11) and the desired sequence
$x_{s,i}(t)$, and $z^c_{s,j}(t) = \int z_{s,j}(\tau)\, x^*_{s,i}(\tau - t)\, d\tau$ denotes the
noise at the output of the matched filter. The approximation
$(a)$ in (13) follows from the fact that the cross-correlations
between different PN sequences are nearly zero, i.e.,
$R^{x}_{i',i}(t) = \int x_{s,i'}(\tau)\, x^*_{s,i}(\tau - t)\, d\tau \approx 0$ for $i' \neq i$. Since
the phase rotation introduced by Doppler is very small
($\nu_l T_c \ll 1$), we can also safely assume that
$R^{x^l}_{i',i}(t) = \int x^l_{s,i'}(\tau)\, x^*_{s,i}(\tau - t)\, d\tau \approx 0$ for $i' \neq i$. However, it
is important to point out that these are only working
assumptions in order to derive our algorithm. The actual
performance of the scheme will of course also depend on
the residual non-zero cross-interference between the PN
sequences. Hence, in our numerical simulations, we made
no such simplification and took into account all the cross
terms arising from non-perfect orthogonality.
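The claim that the Doppler rotation barely affects the correlation peak can be checked numerically. The sketch below uses the example values quoted in the text ($T_c = 1$ ns, $\Delta v_l = 10$ m/s, $f_0 = 60$ GHz) and assumes chip-rate sampling at zero lag with an ideal Nyquist pulse, so that the peak reduces to a sum over chips. The rounded speed of light, the seed, and the random chips are illustrative assumptions.

```python
import numpy as np

C_LIGHT = 3e8      # speed of light in m/s (rounded, as in the 2 kHz example)
f0 = 60e9          # carrier frequency in Hz
T_c = 1e-9         # chip duration in s
dv = 10.0          # relative speed in m/s
N_c = 64           # chips per PN sequence

nu = dv * f0 / C_LIGHT                     # Doppler shift, nu = dv * f0 / c
n = np.arange(1, N_c + 1)
rotation = np.exp(1j * 2 * np.pi * nu * n * T_c)   # phase factor in (11)

rng = np.random.default_rng(2)             # fixed seed, for reproducibility only
chips = rng.choice([-1.0, 1.0], size=N_c)

# Zero-lag correlation between the Doppler-rotated and the original chips,
# normalized by the unrotated peak N_c.
peak_ratio = np.abs(np.sum(chips * rotation * chips)) / N_c
```

With these numbers the total phase drift across the sequence is below $10^{-3}$ rad, so the normalized peak is numerically indistinguishable from 1.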
Consider (13) and suppose that the output signal at
the UE side is sampled at the chip rate. The resulting
discrete-time signal can be written as
$$y_{s,i,j}[k] = y_{s,i,j}(t)\big|_{t = kT_c} = \sum_{l=1}^{L} \sqrt{E_{\rm dim}}\, \mathbf{v}_{s,j}^H \mathbf{H}_{s,l}\, \mathbf{u}_{s,i}\, R^{x^l}_{i,i}(kT_c - \tau_l) + z^c_{s,j}[k], \tag{14}$$
where $k \in [\check{N}_c]$ indicates the sampling index,
$\check{N}_c \ge N_c + \frac{\Delta\tau_{\max}}{T_c}$ denotes the total number of samples in
each received PN sequence, and $\Delta\tau_{\max} = \max\{|\tau_l - \tau_{l'}| : l, l' \in [L]\}$
denotes the maximum delay spread of
the channel. Note that for PN sequences, the sequence
of samples $\{|R^{x^l}_{i,i}(kT_c - \tau_l)| : k \in [\check{N}_c]\}$ in (14) has
sharp peaks at indices $k_l \approx \frac{\tau_l}{T_c}$, corresponding to the
delays of the channel multipath components. Intuitively
speaking, the output $y_{s,i,j}[k]$ at those indices $k_l$ yields
Gaussian variables whose power is obtained by projecting
the AoA-AoD-Delay PSF $f_p(\phi, \theta, \tau)$ along the beamforming
vectors $(\mathbf{u}_{s,i}, \mathbf{v}_{s,j})$ in the angular domain and along the
$k_l$-th delay slice $\tau \in [k_l T_c, (k_l + 1) T_c]$. The slicing in the
delay domain results from the fact that, as said before,
$|R^{x^l}_{i,i}(kT_c - \tau_l)|$ is well localized around $k_l$. We refer to
Fig. 1 (a) for an illustration and will use this property later
on in the paper to design our BA algorithm.
C. Sparse Beam-space Representation
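The AoA-AoDs $(\phi_l, \theta_l)$ in (2) take on arbitrary values
in the continuous AoA-AoD domain. Following the
widely used approach of [34], known as beam-space
representation, we obtain a finite-dimensional
representation of the channel response (2) by discretizing
the angle domain. Consider the discrete sets of AoA-AoDs
$$\Phi := \left\{\check{\phi} : (1 + \sin(\check{\phi}))/2 = \frac{n-1}{N},\ n \in [N]\right\}, \tag{15a}$$
$$\Theta := \left\{\check{\theta} : (1 + \sin(\check{\theta}))/2 = \frac{m-1}{M},\ m \in [M]\right\}. \tag{15b}$$
The corresponding sets of array responses $\mathcal{A}_R := \{\mathbf{a}_R(\check{\phi}) : \check{\phi} \in \Phi\}$
and $\mathcal{A}_T := \{\mathbf{a}_T(\check{\theta}) : \check{\theta} \in \Theta\}$ form discrete
dictionaries to represent the channel response. For the
ULAs considered in this paper, the dictionaries $\mathcal{A}_R$ and
$\mathcal{A}_T$, after suitable normalization, reduce to the columns
of unitary Discrete Fourier Transform (DFT) matrices
$\mathbf{F}_N \in \mathbb{C}^{N \times N}$ and $\mathbf{F}_M \in \mathbb{C}^{M \times M}$, with elements
$$[\mathbf{F}_N]_{n,n'} = \frac{1}{\sqrt{N}}\, e^{j2\pi(n-1)\left(\frac{n'-1}{N} - \frac{1}{2}\right)}, \quad n, n' \in [N], \tag{16a}$$
$$[\mathbf{F}_M]_{m,m'} = \frac{1}{\sqrt{M}}\, e^{j2\pi(m-1)\left(\frac{m'-1}{M} - \frac{1}{2}\right)}, \quad m, m' \in [M]. \tag{16b}$$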
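The channel beam-space representation consists of
expressing the channel matrix as a linear combination of
rank-1 matrices of the form $\mathbf{f}_{N,n} \mathbf{f}_{M,m}^H$
for all $n \in [N]$ and $m \in [M]$, where $\mathbf{f}_{N,n}$ and $\mathbf{f}_{M,m}$ denote
the $n$-th and $m$-th columns of $\mathbf{F}_N$ and of $\mathbf{F}_M$, respectively.
Explicitly, the beam-space representation is
given by
$$\mathbf{H}_s(t, \tau) = \sum_{n=1}^{N} \sum_{m=1}^{M} [\check{\mathbf{H}}_s(t, \tau)]_{n,m}\, \mathbf{f}_{N,n} \mathbf{f}_{M,m}^H = \mathbf{F}_N \check{\mathbf{H}}_s(t, \tau)\, \mathbf{F}_M^H, \tag{17}$$
where the beam-space representation of the channel
response is given by
$$\check{\mathbf{H}}_s(t, \tau) = \mathbf{F}_N^H \mathbf{H}_s(t, \tau)\, \mathbf{F}_M = \sum_{l=1}^{L} \check{\mathbf{H}}_{s,l}(t)\, \delta(\tau - \tau_l), \tag{18}$$
where $\check{\mathbf{H}}_{s,l}(t) := \mathbf{F}_N^H \mathbf{H}_{s,l}(t)\, \mathbf{F}_M$ corresponds to the
beam-space $l$-th channel path.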
As shown in our earlier work [4], as the number of
antennas $M$ at the BS and $N$ at the UE increases, the
DFT basis provides a good sparsification of the propagation
channel. As a result, $\check{\mathbf{H}}_s(t, \tau)$ can be approximated as
a sparse matrix, with non-zero elements in the locations
corresponding to small clusters of discrete AoA-AoD pairs
in the proximity of the (continuous) angle pairs of the $L$
scatterers of the physical channel. We may encounter a grid
error in (18) since the AoAs/AoDs do not necessarily fall
on the uniform grid $\Phi \times \Theta$. Nevertheless, as shown in [4],
the grid error becomes negligible as the number
of antennas (i.e., the grid resolution) increases. We hasten to say that,
in our simulations, we do not constrain the AoA-AoD pairs
of the physical channel to take on values on the discrete
grid; therefore, the grid discretization effect is fully taken
into account in our numerical results.
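The link between the angle grid (15) and the DFT dictionary (16) can be verified in a few lines. The sketch below is an illustration only; the helper name `dft_dictionary` and the grid index are assumptions. It builds $\mathbf{F}_N$, checks that it is unitary, and shows that an on-grid array response maps to a single beam-space coefficient.

```python
import numpy as np

def dft_dictionary(n_ant: int) -> np.ndarray:
    """Shifted unitary DFT matrix with entries as in (16a)/(16b)."""
    row = np.arange(n_ant)[:, None]      # (n - 1)
    col = np.arange(n_ant)[None, :]      # (n' - 1)
    return np.exp(2j * np.pi * row * (col / n_ant - 0.5)) / np.sqrt(n_ant)

N = 32
F_N = dft_dictionary(N)

# Pick an on-grid AoA from (15a): (1 + sin(phi)) / 2 = grid_idx / N.
grid_idx = 4                              # 0-based grid index, illustrative
phi_on_grid = np.arcsin(2 * grid_idx / N - 1)
a_R = np.exp(1j * np.pi * np.arange(N) * np.sin(phi_on_grid))  # response (3b)

beam_coeffs = F_N.conj().T @ a_R          # beam-space representation of a_R
```

For an off-grid angle the energy leaks into neighboring beam indices, which is the grid error discussed above.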
IV. PROPOSED BEAM ALIGNMENT SCHEME
A. BS Channel Probing and UE Sensing
Consider the scattering channel model in (2) and its
beam-space representation in (18). In our proposed scheme,
at each beacon slot $s$, the BS probes the channel along $M_{\rm RF}$
beamforming vectors $\mathbf{u}_{s,i}$, $i \in [M_{\rm RF}]$, each of which is
applied to a unique PN sequence signal $x_{s,i}(t)$. We select
the beamforming vectors at the BS side according to a
pre-defined pseudo-random codebook, which is a collection
of the angle sets $\mathcal{C}_T := \{\mathcal{U}_{s,i} : s \in [T],\ i \in [M_{\rm RF}]\}$, where
$\mathcal{U}_{s,i}$ denotes the angle-domain support of the beamforming
vector $\mathbf{u}_{s,i}$, i.e., the indices of the quantized angles in
the beam-space representation of $\mathbf{u}_{s,i}$, and where $T$ is
the effective period of beam training. We assume that
the beamforming vector $\mathbf{u}_{s,i}$ sends equal power along the
directions in $\mathcal{U}_{s,i}$, with the number of active angles given by
$|\mathcal{U}_{s,i}| =: \kappa_u \le M$, which we assume to be the same for all
$(s, i)$. We call $\kappa_u$ the angle spreading factor with respect
to the transmit beamforming vectors. Consequently, we can
write such beamforming vectors as $\mathbf{u}_{s,i} = \mathbf{F}_M \check{\mathbf{u}}_{s,i}$, where
$\check{\mathbf{u}}_{s,i} = \frac{\mathbf{1}_{\mathcal{U}_{s,i}}}{\sqrt{\kappa_u}}$, and where $\mathbf{1}_{\mathcal{U}_{s,i}}$ denotes a vector with 1 at
components in the support set $\mathcal{U}_{s,i}$ and 0 elsewhere. One
can simply imagine the vector $\check{\mathbf{u}}_{s,i}$ as a multi-finger beam
pattern in the angle domain, as illustrated in Fig. 3 (a).$^6$
We assume that the angle indices in $\mathcal{U}_{s,i}$ in the codebook
$\mathcal{C}_T$ are a priori generated in a random manner and are a
priori known to all UEs in the system. This is similar
to the BS-dependent pseudo-random synchronization codes
used in the 3G WCDMA standard [36]. Thus, we call $\mathcal{C}_T$
a pseudo-random codebook.
At the UE side, each UE can locally customize its own
receive beamforming codebook defined as $\mathcal{C}_R := \{\mathcal{V}_{s,j} : s \in [T],\ j \in [N_{\rm RF}]\}$,
where $\mathcal{V}_{s,j}$, with $|\mathcal{V}_{s,j}| = \kappa_v \le N$
for all $(s, j)$, is the angle-domain support, defining the
directions from which the receiver beam patterns collect
the signal power. We define the beamforming vectors at
the UE side by $\mathbf{v}_{s,j} = \mathbf{F}_N \check{\mathbf{v}}_{s,j}$, where $\check{\mathbf{v}}_{s,j} = \frac{\mathbf{1}_{\mathcal{V}_{s,j}}}{\sqrt{\kappa_v}}$ again
defines the finger-shaped beam patterns as shown in Fig. 3
(a). Similar to the power spreading factor $\kappa_u$ at the BS,
the parameter $\kappa_v$ controls the spread of the sensing beam
patterns at the UE.
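A pseudo-random multi-finger codebook of this kind can be sketched as follows. This is only one possible construction: drawing each support independently and uniformly at random is an assumption (the text only states that supports are generated randomly), and the helper names, seed, and small training length are illustrative.

```python
import numpy as np

def dft_dictionary(n_ant: int) -> np.ndarray:
    """Shifted unitary DFT matrix with entries as in (16a)/(16b)."""
    row = np.arange(n_ant)[:, None]
    col = np.arange(n_ant)[None, :]
    return np.exp(2j * np.pi * row * (col / n_ant - 0.5)) / np.sqrt(n_ant)

def finger_beam(support: np.ndarray, n_ant: int) -> np.ndarray:
    """Antenna-domain beamformer u = F * 1_U / sqrt(kappa) for a given support."""
    check = np.zeros(n_ant)
    check[support] = 1.0 / np.sqrt(len(support))   # equal power on each finger
    return dft_dictionary(n_ant) @ check

rng = np.random.default_rng(3)        # shared seed plays the role of the known codebook
M, M_RF, T, kappa_u = 32, 3, 4, 8     # T kept small for illustration

# Codebook C_T: one support set U_{s,i} per beacon slot s and BS RF chain i.
codebook_T = [
    [np.sort(rng.choice(M, size=kappa_u, replace=False)) for _ in range(M_RF)]
    for _ in range(T)
]
u_00 = finger_beam(codebook_T[0][0], M)   # beamformer of slot 0, RF chain 0
```

Because $\mathbf{F}_M$ is unitary, the resulting beamformer has unit norm and its beam-space representation is non-zero exactly on the chosen support.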
In our scheme, the UEs collect their measurements
independently and simultaneously, without any mutual influence
or coordination. Therefore, the scheme is
quite scalable for multiuser scenarios, where the overhead
of training all the UEs does not increase with the number
of UEs. This represents a significant advantage with
respect to traditional multi-level/interactive BA schemes,
that require multiple beam-sweeping rounds and interactive
data exchanges between the BS and each UE, such that the
acquisition protocol overhead grows proportionally to the
number of UEs being acquired.
B. UE Measurement Sparse Formulation
During the $s$-th beacon slot, the UE applies the receive
beamforming vector $\mathbf{v}_{s,j}$ to its $j$-th RF chain. Assuming
that the probing PN signals $x_{s,i}(t)$ are approximately
orthogonal in the time domain as discussed before, each
RF chain at the UE side can almost perfectly separate the
transmitted $M_{\rm RF}$ pilot streams. Thus, using the beam-space
representation of the channel in (18), we can write (14) as
$$y_{s,i,j}[k] = \sum_{l=1}^{L} \sqrt{E_{\rm dim}}\, \check{\mathbf{v}}_{s,j}^H \check{\mathbf{H}}_{s,l}\, \check{\mathbf{u}}_{s,i}\, R^{x^l}_{i,i}(kT_c - \tau_l) + z^c_{s,j}[k], \tag{19}$$
where $\check{\mathbf{u}}_{s,i} = \mathbf{F}_M^H \mathbf{u}_{s,i}$ and $\check{\mathbf{v}}_{s,j} = \mathbf{F}_N^H \mathbf{v}_{s,j}$ are the
beamforming vectors in the beam-space domain. Here,
we used the unitary property of the DFT matrices, i.e.,
$\mathbf{F}_M^H \mathbf{F}_M = \mathbf{I}_M$ and $\mathbf{F}_N^H \mathbf{F}_N = \mathbf{I}_N$, where $\mathbf{I}_M$ and $\mathbf{I}_N$ are
identity matrices of dimension $M$ and $N$, respectively.

$^6$ Note that, in our scheme, the beamforming vectors result from a
uniform linear combining of the DFT vectors. Further optimization of
the beamforming vectors with non-uniform combining [35] is possible.
However, this goes outside the scope of the present work and is left for
future investigation.
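To formulate the sparse estimation problem, we define
the vectors $\check{\mathbf{h}}_{s,l} = \frac{1}{\sqrt{N_{\rm RF}}} \cdot {\rm vec}(\check{\mathbf{H}}_{s,l})$, $l \in [L]$, resulting
in a reformulated channel matrix $\check{\mathbf{H}}_s = [\check{\mathbf{h}}_{s,1}, \cdots, \check{\mathbf{h}}_{s,L}]$
that collects all the channel coefficients in the beam-space
domain. We also define a vector
$\mathbf{c}^i_k = [R^{x^1}_{i,i}(kT_c - \tau_1), \cdots, R^{x^L}_{i,i}(kT_c - \tau_L)]^T \cdot \sqrt{E_{\rm dim}}$, which can be regarded
as the Power Delay Profile (PDP) of the $i$-th pilot stream
transmitted along the $L$ paths and sampled at the $k$-th
discrete delay tap $kT_c$. Consequently, we can express the
received beacon signal (19) at the UE as
$$\begin{aligned}
y_{s,i,j}[k] &= \sum_{l=1}^{L} \sqrt{E_{\rm dim}}\, \check{\mathbf{v}}_{s,j}^H \check{\mathbf{H}}_{s,l}\, \check{\mathbf{u}}_{s,i}\, R^{x^l}_{i,i}(kT_c - \tau_l) + z^c_{s,j}[k] \\
&= (\check{\mathbf{u}}_{s,i} \otimes \check{\mathbf{v}}^*_{s,j})^T \check{\mathbf{H}}_s\, \mathbf{c}^i_k + z^c_{s,j}[k] \\
&= \mathbf{g}_{s,i,j}^T \check{\mathbf{H}}_s\, \mathbf{c}^i_k + z^c_{s,j}[k],
\end{aligned} \tag{20}$$
where we used the well-known identity ${\rm vec}(\mathbf{A}\mathbf{B}\mathbf{C}) = (\mathbf{C}^T \otimes \mathbf{A})\,{\rm vec}(\mathbf{B})$,
and where $\mathbf{g}_{s,i,j} := \check{\mathbf{u}}_{s,i} \otimes \check{\mathbf{v}}^*_{s,j}$
denotes the combined beam-space representation of the
beamforming vectors corresponding to the $i$-th RF chain
at the BS and the $j$-th RF chain at the UE.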
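The vectorization step behind (20) is easy to check numerically. The following sketch, with small illustrative dimensions and random complex data (all assumptions), confirms that $\mathbf{v}^H \mathbf{H} \mathbf{u} = (\mathbf{u} \otimes \mathbf{v}^*)^T {\rm vec}(\mathbf{H})$ when ${\rm vec}(\cdot)$ stacks columns.

```python
import numpy as np

rng = np.random.default_rng(4)   # fixed seed, for reproducibility only
N, M = 4, 5                      # small UE/BS dimensions, for illustration

H = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
u = rng.standard_normal(M) + 1j * rng.standard_normal(M)
v = rng.standard_normal(N) + 1j * rng.standard_normal(N)

lhs = v.conj() @ H @ u                   # v^H H u
g = np.kron(u, v.conj())                 # combined vector u (x) v^*
rhs = g @ H.flatten(order="F")           # g^T vec(H), column-major vec
```

Note the column-major (`order="F"`) flattening: NumPy flattens row by row by default, which would not match the usual ${\rm vec}(\cdot)$ convention.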
Next, we introduce a slight generalization of the scheme
illustrated so far, by allowing the repetition of the PN
beacon sequences $S \ge 1$ times during each beacon slot (see
Fig. 2 (a)). Hence, each beacon slot consists of $S$ subslots,
each of which contains a PN sequence transmission as
explained above. Since beamforming is implemented in
the analog RF domain, it is typically impractical to switch
the beamforming pattern during the beacon slot. Hence,
we assume that the combined beamforming vector $\mathbf{g}_{s,i,j}$
remains constant over the $S$ subslots, whereas $\check{\mathbf{H}}_s$ changes
because of the Doppler shifts $\nu_l$. Over different beacon
slots, in contrast, the beamforming vector $\mathbf{g}_{s,i,j}$ changes
periodically according to the pre-defined pseudo-random
beamforming codebook $\mathcal{U}_{s,i} \times \mathcal{V}_{s,j}$, as said before. In order
to accommodate this extension, with a slight abuse of
notation, we index the received subslots belonging to the
$s$-th beacon slot as $sS + s'$, $s' \in [S]$, where the index $s$
labels the beacon slots and the index $s'$ labels the subslots
inside each beacon slot. It follows that the received signal
through the $i$-th RF chain at the BS and the $j$-th RF chain
at the UE after matched filtering (refer to (20)) can be
written as
$$y_{sS+s',i,j}[k] = \mathbf{g}_{s,i,j}^T \check{\mathbf{H}}_{sS+s'}\, \mathbf{c}^i_k + z^c_{sS+s',j}[k]. \tag{21}$$
As anticipated in Section I, in order to ensure a
robust scheme with respect to fast channel variations [10],
we focus on the second-order statistics of the channel
coefficients. More specifically, we accumulate the energy
at the output of the matched filter across all the $\check{N}_c$
discrete delay taps, by computing the following quadratic
measurements in (22), where the first two terms correspond
to the useful signal and noise contributions, respectively,
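$$\begin{aligned}
q_{s,i,j} &= \frac{1}{S} \sum_{s'=1}^{S} \check{q}_{sS+s',i,j} \\
&= \mathbf{g}_{s,i,j}^T \left[ \sum_{l=1}^{L} \left( \frac{1}{S} \sum_{s'=1}^{S} \check{\mathbf{h}}_{sS+s',l}\, \check{\mathbf{h}}_{sS+s',l}^H \right) \sum_{k=1}^{\check{N}_c} E_{\rm dim}\, |R^{x^l}_{i,i}(kT_c - \tau_l)|^2 \right] \mathbf{g}^*_{s,i,j} \qquad \text{(23a)} \\
&\quad + \sum_{k=1}^{\check{N}_c} \left( \frac{1}{S} \sum_{s'=1}^{S} |z^c_{sS+s',j}[k]|^2 \right) + w_{s,i,j}, \qquad \text{(23b)}
\end{aligned}$$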
have variance $N_0 R^x(0)$ [33]. Hence, we can assume the
approximation
$$\frac{1}{S} \sum_{s'=1}^{S} |z^c_{sS+s',j}[k]|^2 \approx \mathbb{E}\big[|z^c_{sS+s',j}[k]|^2\big] = N_0 R^x(0), \tag{26}$$
which holds true in the limit of large $S$. Using the definition
of $\mathbf{\Gamma}$ in (25), the approximations (24) and (26), and defining
the $NM$-dimensional binary vectors
$$\mathbf{b}_{s,i,j} := \sqrt{\kappa_u \kappa_v}\; \mathbf{g}_{s,i,j} = \mathbf{1}_{\mathcal{U}_{s,i}} \otimes \mathbf{1}_{\mathcal{V}_{s,j}}, \tag{27}$$
we can eventually write (23) in the convenient compact
form
$$q_{s,i,j} = \mathbf{b}_{s,i,j}^T\, {\rm vec}(\mathbf{\Gamma}) + \check{N}_c N_0 R^x(0) + \tilde{w}_{s,i,j}, \tag{28}$$
where $\tilde{w}_{s,i,j}$ collects the error term $w_{s,i,j}$ plus all the
residual errors incurred by the above approximations.$^7$
Notice that $\mathbf{b}_{s,i,j}$ contains ones in the positions
corresponding to the discrete angle support $\mathcal{U}_{s,i} \times \mathcal{V}_{s,j}$
from the beamforming codebook, while it contains zeros
everywhere else. Hence, the inner product $\mathbf{b}_{s,i,j}^T\, {\rm vec}(\mathbf{\Gamma})$
corresponds effectively to collecting all the signal energy
received from the AoA-AoD pairs indexed by the angular
support $\mathcal{U}_{s,i} \times \mathcal{V}_{s,j}$. An example of the probing geometry
is illustrated in Fig. 3 (b).
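The interpretation of $\mathbf{b}_{s,i,j}^T {\rm vec}(\mathbf{\Gamma})$ as "energy collected over the support $\mathcal{U}_{s,i} \times \mathcal{V}_{s,j}$" can be illustrated with a small sketch. The dimensions, support sets, and the random non-negative $\mathbf{\Gamma}$ below are illustrative assumptions.

```python
import numpy as np

N, M = 6, 8                      # small UE/BS grid sizes, for illustration
U = np.array([1, 4, 6])          # BS angle support (0-based indices)
V = np.array([0, 3])             # UE angle support (0-based indices)

one_U = np.zeros(M)
one_U[U] = 1.0
one_V = np.zeros(N)
one_V[V] = 1.0
b = np.kron(one_U, one_V)        # binary measurement vector as in (27)

rng = np.random.default_rng(5)   # fixed seed, for reproducibility only
Gamma = rng.random((N, M))       # non-negative path strengths, N x M

measured = b @ Gamma.flatten(order="F")   # b^T vec(Gamma), column-major vec
direct = Gamma[np.ix_(V, U)].sum()        # sum over the probed AoA-AoD pairs
```

Both quantities coincide, and the number of ones in `b` equals $\kappa_u \kappa_v$ (here $3 \times 2 = 6$).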
$^7$ We should point out here again that the goodness of such
approximations is reflected in the variance of the error term $\tilde{w}_{s,i,j}$. The fact
that our algorithm works very well in very low SNR conditions (see
Section V) confirms that all the working assumptions made here are valid
and justified.

In order to gain insight into the role of the algorithm
parameters $\kappa_u$, $\kappa_v$, $M_{\rm RF}$, $N_{\rm RF}$, and $T_c$, it is useful to
compare the SNR before beamforming (BBF) with the
SNR associated with each of the measurements in (28). We
define the SNR BBF as
$${\sf SNR}_{\rm BBF} = \frac{P_{\rm tot} \sum_{l=1}^{L} \gamma_l}{N_0 B}. \tag{29}$$
This is the ratio of the total received signal power
(summing over all the multipath components) over the
total noise power at the receiver baseband processor input,
with total bandwidth $B$. As mentioned before, one of
the challenges of BA, and in general of communication at
mmWaves, is that the SNR before beamforming ${\sf SNR}_{\rm BBF}$
in (29) is typically very low. The average SNR of
the measurements in (28), with the average taken over the
randomness of the beam codebook and the channel, can
be qualitatively quantified as
$${\sf SNR}_{\rm MEA} = \frac{P_{\rm tot} T_c \sum_{l=1}^{L} \gamma_l \cdot MN}{\kappa_u \kappa_v M_{\rm RF} N_{\rm RF} N_0}. \tag{30}$$
This quantity is explained as follows: the energy per
chip $P_{\rm tot} T_c$ is uniformly spread over the angular
fraction $\kappa_u \kappa_v/(MN)$ and over the $M_{\rm RF} N_{\rm RF}$ measurements
obtained in each beacon slot. Comparing (29) and
(30) leads to the following qualitative observations:
i) By making the product $\kappa_u \kappa_v$ large, we explore
more angle directions simultaneously, but the signal power
is spread over a broader angle such that the SNR per
measurement decreases. Therefore, we expect the existence
of an exploration/exploitation trade-off with respect to
the product $\kappa_u \kappa_v$ (as noticed in [4]). ii) The scheme
gathers $M_{\rm RF} N_{\rm RF}$ new measurements for each beacon slot,
but the SNR per measurement decreases with $M_{\rm RF} N_{\rm RF}$.
Hence, a similar exploration/exploitation trade-off exists
with respect to the number of RF chains used in the BA
algorithm (see also [4]). iii) By making $T_c$ larger than
$1/B$, the signal power is effectively concentrated in a
bandwidth $1/T_c < B$. This energy accumulation in the
frequency domain improves the SNR per measurement.
However, given a total pilot signal duration, increasing
$T_c$ decreases the number of chips of the PN sequence
such that the cross-interference between PN sequences and
their delayed versions increases. Therefore, there exists a
trade-off between energy concentration in the frequency
domain and self-interference in the system, reflected in the
variance of the error term $\tilde{w}_{s,i,j}$.
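Dividing (30) by (29) gives ${\sf SNR}_{\rm MEA}/{\sf SNR}_{\rm BBF} = T_c B \cdot MN/(\kappa_u \kappa_v M_{\rm RF} N_{\rm RF})$, which makes the trade-offs above easy to quantify. The helper below is a minimal sketch of that ratio in dB. The function name and its arguments are illustrative, and $T_c B = 1$ corresponds to the setting $B' = B$ used in the simulations.

```python
import math

def snr_mea_db(snr_bbf_db, M, N, kappa_u, kappa_v, M_RF, N_RF, Tc_times_B=1.0):
    """Per-measurement SNR in dB, from the ratio of (30) to (29)."""
    gain = Tc_times_B * M * N / (kappa_u * kappa_v * M_RF * N_RF)
    return snr_bbf_db + 10.0 * math.log10(gain)

# Simulation setting of Section V with kappa_u = kappa_v = 8.
snr_k8 = snr_mea_db(-14.0, M=32, N=32, kappa_u=8, kappa_v=8, M_RF=3, N_RF=2)
# Wider beams (kappa_u = kappa_v = 16) cover more angles at a lower SNR.
snr_k16 = snr_mea_db(-14.0, M=32, N=32, kappa_u=16, kappa_v=16, M_RF=3, N_RF=2)
```

Under these assumptions, doubling both spreading factors costs about 6 dB of per-measurement SNR, which is one side of the exploration/exploitation trade-off.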
C. Path Strength Estimation via Non-Negative Least
Squares
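After $T$ beacon slots, the UE obtains a total number of
$M_{\rm RF} N_{\rm RF} T$ equations, given by
$$\mathbf{q} = \mathbf{B} \cdot {\rm vec}(\mathbf{\Gamma}) + \check{N}_c N_0 R^x(0) \cdot \mathbf{1} + \tilde{\mathbf{w}}, \tag{31}$$
where the vector $\mathbf{q} = [q_{1,1,1}, \ldots, q_{1,M_{\rm RF},N_{\rm RF}}, \ldots, q_{T,M_{\rm RF},N_{\rm RF}}]^T \in \mathbb{R}^{M_{\rm RF} N_{\rm RF} T}$
consists of all $M_{\rm RF} N_{\rm RF} T$
measurements obtained as in (28),
$\mathbf{B} = [\mathbf{b}_{1,1,1}, \ldots, \mathbf{b}_{1,M_{\rm RF},N_{\rm RF}}, \ldots, \mathbf{b}_{T,M_{\rm RF},N_{\rm RF}}]^T \in \mathbb{R}^{M_{\rm RF} N_{\rm RF} T \times MN}$
is uniquely defined by the pseudo-random
beamforming codebook of the BS and the local
beamforming codebook of the UE, and $\tilde{\mathbf{w}} \in \mathbb{R}^{M_{\rm RF} N_{\rm RF} T}$
denotes the residual error.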
In order to identify the strong AoA-AoD quantized
directions, the UE needs to estimate the $MN$-dimensional vector
${\rm vec}(\mathbf{\Gamma})$ from the $M_{\rm RF} N_{\rm RF} T$-dimensional observation (31) in the
presence of the measurement noise $\tilde{\mathbf{w}}$, where in general
$MN$ is significantly larger than $M_{\rm RF} N_{\rm RF} T$. There is
a great variety of algorithms to solve (31) in the
Least-Squares sense. The key observation here is that $\mathbf{\Gamma}$
is sparse (by the sparse nature of mmWave channels) and
non-negative (by the second-order statistic construction of
our scheme). As discussed in our previous work [4], recent
results in CS show that when the underlying parameter $\mathbf{\Gamma}$
is non-negative, the simple non-negative constrained Least
Squares (LS) given by
$$\mathbf{\Gamma}^\star = \arg\min_{\mathbf{\Gamma} \in \mathbb{R}_+^{N \times M}} \left\| \mathbf{B} \cdot {\rm vec}(\mathbf{\Gamma}) + \check{N}_c N_0 R^x(0) \cdot \mathbf{1} - \mathbf{q} \right\|^2, \tag{32}$$
is sufficient to yield a sparse solution $\mathbf{\Gamma}^\star$ [31, 32], without
the need for an explicit sparsity-promoting regularization
term in the objective function as, for example, in the
classical LASSO algorithm [37]. The (convex) optimization
problem (32) is generally referred to as Non-Negative
Least Squares (NNLS), and has been well investigated
in the literature. As discussed in [31], NNLS implicitly
performs $\ell_1$-regularization and promotes the sparsity of the
resulting solution provided that the measurement matrix
$\mathbf{B}$ satisfies the $\mathcal{M}^+$-criterion [32], i.e., there exists a
vector $\mathbf{d} \in \mathbb{R}_+^{M_{\rm RF} N_{\rm RF} T}$ such that $\mathbf{B}^T \mathbf{d} > 0$. In our
case, this criterion can be simply interpreted as the requirement
that the set of $M_{\rm RF} N_{\rm RF} T$ measurement beam patterns
should hit all the $MN$ AoA-AoD pairs at least once,
which is satisfied with high probability in our scheme because of
the random finger-shaped beam patterns and
the pseudo-random property of the designed beamforming
codebook.
In terms of numerical implementation, the NNLS can be
posed as an LS problem over the non-negative
orthant and can be solved by several efficient techniques
such as Gradient Projection, Primal-Dual techniques,
etc., with an affordable computational complexity [38],
which is generally significantly less than that of conventional CS
algorithms for problems of the same size and sparsity
level. We refer to [39, 40] for the recent progress on the
numerical solution of NNLS and a discussion on other
related work in the literature.
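As a toy illustration of (31) and (32), the sketch below builds a binary measurement matrix from random supports and solves the NNLS problem with SciPy's `scipy.optimize.nnls`. Several simplifications are assumed and are not from the text: a small $8 \times 8$ angle grid, noiseless measurements, a known noise offset that is subtracted before solving, and independently drawn supports. It is not the simulation setup of the paper, which uses MATLAB's `lsqnonneg`.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(6)     # fixed seed, for reproducibility only
M, N = 8, 8                        # small BS/UE angle grids (illustrative)
M_RF, N_RF, T = 3, 2, 10           # RF chains and number of beacon slots
kappa_u, kappa_v = 3, 3            # angle spreading factors (illustrative)

def indicator(size, support):
    vec = np.zeros(size)
    vec[support] = 1.0
    return vec

# Stack the binary vectors b_{s,i,j} = 1_U (x) 1_V row by row, as in (31).
rows = []
for s in range(T):
    U = [rng.choice(M, kappa_u, replace=False) for _ in range(M_RF)]
    V = [rng.choice(N, kappa_v, replace=False) for _ in range(N_RF)]
    for i in range(M_RF):
        for j in range(N_RF):
            rows.append(np.kron(indicator(M, U[i]), indicator(N, V[j])))
B = np.array(rows)                 # shape (M_RF * N_RF * T, M * N)

# Sparse non-negative ground truth with two paths (N x M, AoA by AoD).
Gamma_true = np.zeros((N, M))
Gamma_true[2, 5] = 1.0
Gamma_true[6, 1] = 0.6
noise_offset = 0.3                 # stands in for N_c_check * N_0 * R^x(0)
q = B @ Gamma_true.flatten(order="F") + noise_offset

# NNLS on the offset-corrected measurements, equivalent to (32).
gamma_hat, residual = nnls(B, q - noise_offset, maxiter=10 * B.shape[1])
Gamma_hat = gamma_hat.reshape((N, M), order="F")
aoa_idx, aod_idx = np.unravel_index(np.argmax(Gamma_hat), Gamma_hat.shape)
```

The pair `(aoa_idx, aod_idx)` is the estimated strongest AoA-AoD index. Whether it matches the true strongest path depends on the number of slots and on the random supports, which is what the detection probability curves in Section V quantify.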
V. PERFORMANCE EVALUATION
We consider a system with $M = 32$ antennas and $M_{\rm RF} = 3$ RF
chains at the BS, and $N = 32$ antennas and $N_{\rm RF} = 2$ RF chains
at a generic UE. We assume a short preamble structure as used
in IEEE 802.11ad [20, 41], where the beacon slot is of
duration $t_0 S = 1.891\ \mu$s. The system is assumed to work
at $f_0 = 70$ GHz with a maximum available bandwidth of
$B = 1.76$ GHz, namely, each beacon slot amounts to more
than 3200 chips as in [20, 21]. We assume the channel
contains $L = 3$ links given by $(\gamma_1 = 1, \eta_1 = 100)$, $(\gamma_2 = 0.6, \eta_2 = 10)$
and $(\gamma_3 = 0.6, \eta_3 = 0)$, where $\gamma_l$ denotes the
scatterer strength and $\eta_l$ indicates the strength ratio between
the LOS and the NLOS propagation as in (4). Thus, the first
scatterer can be roughly regarded as the LOS path, while
the remaining scatterers represent the NLOS paths. This
is consistent with the practical mmWave MIMO channel
measurements in [27], where the relative power level of
the NLOS path is around 10 dB lower than the desired
LOS path. We assume that the relative speed $\Delta v_l$ for each
path is in the range of 0 to 8 m/s. We announce a success if
the location of the strongest component in $\mathbf{\Gamma}^\star$ (see (32))
coincides with the LOS path.$^8$
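The numbers quoted above can be cross-checked with a few lines of arithmetic. The sketch assumes $T_c = 1/B$ (i.e., $B' = B$) and a rounded speed of light; the variable names are illustrative.

```python
import math

C_LIGHT = 3e8            # speed of light in m/s (rounded)
slot_duration = 1.891e-6 # beacon slot duration t0 * S in seconds
B = 1.76e9               # available bandwidth in Hz
f0 = 70e9                # carrier frequency in Hz
dv_max = 8.0             # largest relative speed considered, in m/s

chips_per_slot = int(slot_duration * B)     # chips per slot when T_c = 1/B
nu_max = dv_max * f0 / C_LIGHT              # largest Doppler shift in Hz
phase_per_slot = 2 * math.pi * nu_max * slot_duration  # phase drift over one slot, rad
```

This gives roughly 3300 chips per beacon slot, consistent with "more than 3200 chips", and a Doppler shift below 2 kHz, so the phase drift within a single slot stays small.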
In the following simulations,$^9$ we evaluate the
performance of our time-domain BA scheme from
three viewpoints: i) We study the effect of various scheme
parameters on the achieved BA probability; ii) We show
the superiority of our proposed scheme in comparison with
other recently proposed time-domain BA schemes [20, 21];
iii) We consider the effectiveness of the BA scheme in
the context of single-carrier modulation. To tackle the
latter aspect, we compute upper and lower bounds on the
ergodic achievable rate for the effective SISO channel
between the BS and the UE after BA. These bounds
show that BA yields essentially a frequency-flat channel
even when the original channel has multiple multipath
components. Also, the effective SNR of the channel after
BA is quite large. Therefore, single-carrier modulation with
standard timing and carrier synchronization and without
time-domain equalization works very well.
A. Success Probability of the Proposed BA Scheme
Dependence on the beam spreading factors $(\kappa_u, \kappa_v)$.
As discussed at the end of Section IV-B (see also our
previous work [4, 28]), there is a trade-off between the angle
exploration of the measurement matrix $\mathbf{B}$ and the SNR
of the received measurements. This is illustrated in Fig. 4 (a).
Increasing the angular spreading factor from $\kappa_u = \kappa_v = 4$
to $\kappa_u = \kappa_v = 8$ improves the performance. However, the
performance degrades when $(\kappa_u, \kappa_v)$ are increased further
to $\kappa_u = \kappa_v = 16$ and $22$.
Dependence on the PN sequence length $N_c$ and
robustness to Doppler shifts. In general, a larger PN
sequence length $N_c$ provides better correlation properties,
such that different pilot streams can be well separated at the
UE. However, increasing $N_c$ increases the whole duration
$t_0 = N_c T_c$ of the transmitted signal. Thus, because of the
Doppler shift, the received PN sequence undergoes a larger
phase rotation of the chips. This rotation degrades the PN
sequence correlation property. This is illustrated in Fig. 4
(b). Increasing the PN sequence length from $N_c = 16$
to $N_c = 32, 64$ improves the performance of the proposed
scheme. However, the performance degrades slightly when
$N_c$ is increased to $N_c = 128, 256$. In general, our scheme is
highly insensitive to the Doppler spread between different
multipath components, as illustrated in Fig. 4 (c). For
example, varying the speed difference between the paths
from 0 to 8 m/s, the BA success probability remains
virtually unchanged. This provides a significant advantage
with respect to schemes based on OFDM signaling, which
is known to be fragile to uncompensated Doppler shifts
yielding inter-carrier interference.

$^8$ In the case that there is no LOS link, one can announce a success if
the location of the strongest component in $\mathbf{\Gamma}^\star$ coincides with the central
AoA-AoD of the strongest scatterer cluster.

$^9$ We use lsqnonneg.m in MATLAB© to solve the NNLS
optimization problem in (32). Also, for simplicity, in our simulations, we
assume that the sizes of the beamforming codebooks, given by $\frac{1}{M_{\rm RF}} |\mathcal{C}_T|$
and $\frac{1}{N_{\rm RF}} |\mathcal{C}_R|$ on the two sides, are the same as the number of effective
beam training beacon slots $T$. In a practical implementation, however, the
BS codebook size should be fixed and used periodically, since it is shared
with all UEs in advance, while the local beamforming codebook for each
UE can be set to any size depending on the individual UE.

[Fig. 4, panels (a)-(c): detection probability $P_D$ versus the number of beacon slots $T$. Curves: (a) $\kappa_u = \kappa_v = 4, 8, 16, 22$; (b) $N_c = 16, 32, 64, 128, 256$; (c) $\Delta v_{l^\star} = 0, 3, 5, 8$ m/s.]
Fig. 4: Detection probability $P_D$ of the proposed time-domain scheme with respect to (a) different power spreading factors ($\kappa_u$,
$\kappa_v$), where $N_c = 64$ and the relative speed of the strongest path $\Delta v_{l^\star} = 5$ m/s; (b) different PN sequence lengths $N_c$, where
$\kappa_u = \kappa_v = 8$ and the relative speed of the strongest path $\Delta v_{l^\star} = 5$ m/s; (c) different relative speed values of the strongest path
$\Delta v_{l^\star}$, where $\kappa_u = \kappa_v = 8$, $N_c = 64$. In all the above cases, $M = N = 32$, $M_{\rm RF} = 3$, $N_{\rm RF} = 2$, $B' = B$, ${\sf SNR}_{\rm BBF} = -14$ dB.
Comparison with other time-domain methods. Fig. 5
compares the performance of our proposed scheme with
a recently proposed time-domain approach [20, 21] based
on the Orthogonal Matching Pursuit (OMP) CS technique.
The approach in [20, 21] assumes that the channel
vector coefficients remain constant over the whole training
stage (in other words, it assumes a completely stationary
situation with zero Doppler shifts). It can be seen from
Fig. 5 that the proposed scheme exhibits much more robust
performance with respect to the channel time-variations
whereas the approach in [20, 21] fails when the channel is
fast time-varying.
[Fig. 5: detection probability $P_D$ versus the number of beacon slots $T$. Curves: NNLS slow-varying channel, NNLS fast-varying channel, OMP slow-varying channel, OMP fast-varying channel.]
Fig. 5: Comparison of the proposed scheme based on NNLS
with that in [20, 21] based on OMP for both slow-varying
channels (i.e., when the instantaneous channel coefficients
are time-invariant) and fast-varying channels (i.e., when the
instantaneous channel coefficients decorrelate almost completely
from slot to slot due to the large Doppler spread), where $M = N = 32$,
$M_{\rm RF} = 3$, $N_{\rm RF} = 2$, $\kappa_u = \kappa_v = 8$, $B' = B$, $N_c = 64$,
${\sf SNR}_{\rm BBF} = -14$ dB.

Remark 1: In all the simulations so far, for simplicity,
we have considered path delays equal to integer multiples
of the chip duration, namely, $\tau_l = G_l \cdot T_c$ for some
integer $G_l$. In such cases, any
square-root Nyquist chip pulse shape [33] yields the same
performance, since the samples of any Nyquist pulse at the
output of the matched filter (see, e.g., (14)) are zero at
all integer multiples of $T_c$ except 0. In practice, however,
the path delays are not integer multiples of $T_c$ and can
generally be written as $\tau_l = G_l \cdot T_c + \Delta\tau_l$ for some
$0 < \Delta\tau_l < T_c$, referred to as the delay fractional part. In
general, this is not an issue during the data communication
phase since the delays are well compensated by suitable
synchronization at the receiver, but it may affect the
performance of our proposed BA since it is unrealistic to
assume any proper synchronization during the BA. As a
result, in the presence of non-null delay fractional parts,
the performance of our scheme may depend on the specific
chip pulse used. In order to investigate this effect, we
perform numerical simulations with two different pulse
shapes: root-raised-cosine (RRC) and rectangular (Rect)
pulses [33]. Fig. 6 illustrates the simulation results for
random fractional delays. As expected, we see a slight
performance degradation compared with the ideal integer
delay curve (an additional 5∼10 slots for $P_D \geq 0.95$).
However, the effect of non-integer delays and of the
specific chip pulse shape is rather small. Furthermore, it
is observed that the Rect pulse yields less degradation than
RRC.
[Figure 6: detection probability $P_D$ versus number of beacon slots $T$; curves: NNLS, integer delay; NNLS, RRC, fractional delay; NNLS, Rect, fractional delay.]
Fig. 6: Illustration of pulse shaping effect, where $M = N = 32$,
$M_{\mathrm{RF}} = 3$, $N_{\mathrm{RF}} = 2$, $\kappa_u = \kappa_v = 8$, $B' = B$, $N_c = 64$, $\mathrm{SNR}_{\mathrm{BBF}} = -14$ dB,
the relative speed of the strongest path $\Delta v_{l^\star} = 5$ m/s.
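The role of the delay fractional part in Remark 1 can be made concrete with a small sketch. It evaluates a raised-cosine pulse (the matched-filter output of an RRC chip pulse) on the chip grid and on a grid shifted by a fraction of a chip. The roll-off factor 0.25 and the 0.3-chip offset are assumptions picked for illustration, and time is normalized so that $T_c = 1$.

```python
import numpy as np

def raised_cosine(t, rolloff=0.25):
    """Raised-cosine pulse with unit chip duration (RRC after its matched filter)."""
    t = np.asarray(t, dtype=float)
    denom = 1.0 - (2.0 * rolloff * t) ** 2
    singular = np.isclose(denom, 0.0)
    # Avoid dividing by zero at t = +/- 1/(2*rolloff); the limit is used there.
    safe_denom = np.where(singular, 1.0, denom)
    regular = np.sinc(t) * np.cos(np.pi * rolloff * t) / safe_denom
    limit = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * rolloff))
    return np.where(singular, limit, regular)

taps = np.arange(-4, 5)                          # chip-spaced sampling instants
integer_samples = raised_cosine(taps)            # delay on the chip grid
fractional_samples = raised_cosine(taps + 0.3)   # hypothetical 0.3-chip offset

print("integer delay:   ", np.round(integer_samples, 3))
print("fractional delay:", np.round(fractional_samples, 3))
```

With an integer delay only the center tap is non-zero, whereas the fractional offset spreads part of the energy onto neighboring taps, which is the effect whose impact on BA is examined in Fig. 6.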
B. Effectiveness of Single-Carrier Modulation
After running the BA protocol as described in Section
IV, and assuming that the strongest multipath component
is correctly identified, we denote such component as the $l^\star$-th.
Hence, the estimated beamforming vectors for the data
transmission are given by $\mathbf{u}_{l^\star} = \mathbf{F}_M \check{\mathbf{u}}_{l^\star}$ at the BS and
$\mathbf{v}_{l^\star} = \mathbf{F}_N \check{\mathbf{v}}_{l^\star}$ at the UE, respectively, where $\check{\mathbf{u}}_{l^\star} \in \mathbb{C}^M$ is
an all-zero vector with a 1 at the component corresponding
to the AoD of the $l^\star$-th scatterer, and $\check{\mathbf{v}}_{l^\star} \in \mathbb{C}^N$ is an
all-zero vector with a 1 at the component corresponding
to the AoA of the $l^\star$-th scatterer. In this section we focus
on the data transmission phase under the beam alignment
assumption. During the data transmission phase, a standard
single-carrier linearly modulated signal consisting of $N_d$
information symbols is used. The complex baseband signal
is given by $x(t) = \sum_{n=1}^{N_d} \sqrt{P_{\mathrm{tot}} T_d} \cdot d_n p_r(t - nT_d)$, where
$p_r(t)$ denotes a unit-energy square-root Nyquist pulse
shaping filter (e.g., an RRC pulse), $T_d = 1/B$ indicates the
symbol interval, and $\{d_n\}$ is the sequence of unit-energy
modulation symbols, belonging to a suitable modulation
constellation [33]. From (7) and (8), the received signal
including the transmit and receive beamforming vectors
$(\mathbf{v}_{l^\star}, \mathbf{u}_{l^\star})$ is given by
$$
\begin{aligned}
\hat{y}(t) &= \int \mathbf{v}_{l^\star}^{\mathsf{H}} \mathbf{H}(t, \tau) \mathbf{u}_{l^\star} x(t - \tau)\, d\tau + z(t) \\
&= \sum_{l=1}^{L} \sum_{n=1}^{N_d} C_l d_n p_r(t - nT_d - \tau_l) e^{j2\pi(\check{\nu}_l + \nu_l n T_d)} + z(t),
\end{aligned} \tag{33}
$$
where $C_l := \sqrt{P_{\mathrm{tot}} T_d}\, \rho_l \mathbf{v}_{l^\star}^{\mathsf{H}} \mathbf{a}_R(\phi_l) \mathbf{a}_T(\theta_l)^{\mathsf{H}} \mathbf{u}_{l^\star}$. The
receiver uses standard timing and carrier synchronization
with respect to the multipath component $l^\star$ selected
by the BA algorithm. When BA is achieved, the SNR
corresponding to this multipath component is quite large,
since it is boosted by the combined beamforming gain of
the UE and the BS. For example, in order to support a
spectral efficiency of 1 bit/s/Hz with practical coded modulation (e.g.,
using a QPSK constellation with binary coding rate 1/2),
the SNR after beamforming should be between 0 and 3 dB,
depending on the coding scheme used. In these conditions,
it is well known that timing and carrier synchronization
can be considered virtually ideal. Therefore, the receiver
performs matched filtering with respect to the symbol pulse
$p_r(t)$, sampling at epochs $t = kT_d + \tau_{l^\star}$, and symbol
de-rotation by the factor $e^{-j2\pi(\check{\nu}_{l^\star} + \nu_{l^\star} n T_d)}$. It follows
that the discrete-time baseband signal at the output of
the matched filter and synchronizer takes on the form
of (34), where $z^c_n[k]$ denotes the noise at the output of
the matched filter with variance $N_0$ (see footnote 10) and we define
$\varphi(t) = \int p_r(\tau) p_r^*(\tau - t)\, d\tau$. In (34) (a) we used the fact that
since $p_r(t)$ is a square-root Nyquist pulse, $\varphi(\bar{k} \cdot T_d)$ is
equal to 1 for $\bar{k} = 0$ and is zero otherwise. The first term
in (34) corresponds to the desired symbol $d_k$ multiplied
by an overall channel coefficient $C_{l^\star}$ that contains the
beamforming gain achieved by BA, whereas the last two
terms correspond to the inter-symbol interference and noise,
respectively.
The resulting SNR after beamforming $\mathrm{SNR}_{\mathrm{ABF}}$ is given
by (35), where in (35) (a) we used the fact that $\varphi(t) \approx 0$ for
$|t| > T_d$, thus $\sum_{n \in [N_d]} \mathbb{E}[|C_l \varphi((k-n)T_d + \tau_{l^\star} - \tau_l)|^2] \lesssim \mathbb{E}[|C_l|^2]$;
in (35) (b) we used the fact that the interference
caused by the other paths is negligible (compared with
the noise floor of the receiver), since for the paths whose
AoA-AoD is away from the beamforming directions the
SNR is even lower than the isotropic SNR $\mathrm{SNR}_{\mathrm{BBF}}$ defined
in (29). Finally, in (35) (c) we used the fact that the
dominant path $l^\star$ has nearly full beamforming gain $MN$. It
is seen from (35) that $\mathrm{SNR}_{\mathrm{ABF}}$ is around $MN$ times larger
than $\mathrm{SNR}_{\mathrm{BBF}}$. This justifies the assumption of nearly ideal
timing and carrier recovery.
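As a quick numerical check of the statement that the post-beamforming SNR is about $MN$ times the pre-beamforming SNR, the following sketch converts the gain to decibels for the array sizes and SNR used in the simulations ($M = N = 32$, $\mathrm{SNR}_{\mathrm{BBF}} = -14$ dB). The function name is hypothetical and the full gain $MN$ is assumed to be achieved exactly.

```python
import math

def snr_after_beamforming_db(snr_bbf_db, num_bs_antennas, num_ue_antennas):
    """Post-beamforming SNR in dB, assuming the full array gain M*N is achieved."""
    gain_db = 10.0 * math.log10(num_bs_antennas * num_ue_antennas)
    return snr_bbf_db + gain_db

snr_abf_db = snr_after_beamforming_db(-14.0, 32, 32)
print(f"SNR after beamforming: {snr_abf_db:.1f} dB")  # about 16.1 dB
```

The result sits well above the 0 to 3 dB needed for 1 bit/s/Hz, consistent with the assumption of nearly ideal synchronization.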
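Consequently, the ergodic achievable rate in (34) can be
upper and lower bounded as (36) and (37), respectively
[42]. The upper bound (36) is obtained via Maximum
Ratio Combining for the case where all the delayed
versions of the transmitted signal are separately observable
(this is sometimes referred to as the “matched filter upper
bound”). The lower bound is actually achieved by a simple
receiver that treats all the Inter-symbol Interference (ISI)
as Gaussian noise.
10 As usual, we assume that the symbol pulse has unit energy, i.e.,
$\int |p_r(t)|^2\, dt = 1$, therefore the noise sample has the same variance as
the noise power spectral density $N_0$ [33].
$$
\begin{aligned}
y(t)\big|_{t = kT_d + \tau_{l^\star}} &= \underbrace{\sum_{n=1}^{N_d} d_n C_{l^\star} \varphi[(k-n)T_d]}_{\overset{(a)}{=}\, d_k C_{l^\star}} \\
&\quad + \sum_{n=1}^{N_d} d_n \sum_{l \neq l^\star} C_l \varphi[(k-n)T_d + \tau_{l^\star} - \tau_l] \cdot e^{j2\pi(\check{\nu}_l - \check{\nu}_{l^\star} + (\nu_l - \nu_{l^\star}) n T_d)} + \sum_{n=1}^{N_d} z^c_n[k],
\end{aligned} \tag{34}
$$
$$
\begin{aligned}
\mathrm{SNR}_{\mathrm{ABF}} &= \frac{\mathbb{E}[|C_{l^\star} d_k|^2]}{\sum_{l \neq l^\star} \sum_{n \in [N_d]} \mathbb{E}[|d_n|^2]\, \mathbb{E}[|C_l \varphi((k-n)T_d + \tau_{l^\star} - \tau_l)|^2] + \mathbb{E}[|z^c_n[k]|^2]} \\
&\overset{(a)}{\approx} \frac{\mathbb{E}[|d_k|^2] \times \mathbb{E}[|C_{l^\star}|^2]}{\mathbb{E}[|d_n|^2] \times \sum_{l \neq l^\star} \mathbb{E}[|C_l|^2] + \mathbb{E}[|z^c_n[k]|^2]} \\
&\overset{(b)}{\approx} \frac{\mathbb{E}[|d_k|^2] \times \mathbb{E}[|C_{l^\star}|^2]}{\mathbb{E}[|z^c_n[k]|^2]}
\overset{(c)}{=} \frac{P_{\mathrm{tot}} \cdot \gamma_{l^\star} \cdot MN}{N_0 B},
\end{aligned} \tag{35}
$$
$$
R^{\star}_{\mathrm{ub}} = \mathbb{E}\left[ \log_2\left( 1 + \frac{\sum_{l=1}^{L} |C_l \varphi(\tau_{l^\star} - \tau_l)|^2}{N_0} \right) \right], \tag{36}
$$
$$
R^{\star}_{\mathrm{lb}} = \log_2\left( 1 + \frac{|\mathbb{E}[C_{l^\star} \varphi(0)]|^2}{N_0 + \mathrm{Var}(C_{l^\star} \varphi(0)) + \sum_{m \in [N_d]} \sum_{l \neq l^\star} \mathbb{E}[|C_l \varphi(mT_d + \tau_{l^\star} - \tau_l)|^2]} \right). \tag{37}
$$
[Figure 7: achievable rate (bit/s/Hz) versus $\mathrm{SNR}_{\mathrm{BBF}}$ (dB) from −30 to 30 dB; curves: ub, lb; annotation: Desired Region.]
Fig. 7: The ergodic achievable rate after a successful Beam
Alignment using the proposed time-domain scheme, where
$M = N = 32$, $B' = B$, $N_c = 64$, the relative speed of the strongest
path $\Delta v_{l^\star} = 5$ m/s.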
Ergodic achievable rate bounds. In Fig. 7, we illustrate
the lower and upper bounds on the achievable ergodic rate
(see (36) and (37)) as a function of $\mathrm{SNR}_{\mathrm{BBF}}$, under the assumption
of successful BA, i.e., that the BA algorithm found the beam
indices of the strongest path (see footnote 11). It is clear that the lower
bound is self-interference limited while the upper bound is
not. However, the gap between the bounds is quite small
in the regime of low pre-beamforming SNR ($\mathrm{SNR}_{\mathrm{BBF}} < 10$ dB),
which is relevant in mmWave applications. At the same time,
the achievable ergodic spectral efficiency
in this regime can be quite high. In particular, it is important to
recall here that the lower bound refers to the case of
single-carrier transmission without any equalization. For
example, focusing on a realistic spectral efficiency between
1 and 2 bit/s/Hz, we notice that single-carrier with the
proposed BA scheme and no equalization (just standard
post-beamforming timing and frequency synchronization)
achieves the relevant spectral efficiency in the range of
$\mathrm{SNR}_{\mathrm{BBF}}$ between −30 and −20 dB, and suffers from a very
small gap with respect to the best possible equalization
(given by the upper bound).
11 As seen before, this happens with probability ≈ 1 after a few tens
of beacon slots.
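The qualitative behavior of the two bounds (a lower bound that saturates because of self-interference and an upper bound that keeps growing) can be reproduced with a simplified sketch. It assumes fixed, non-random path gains, so the expectation in (36) and the variance term in (37) drop out. All numerical values (a dominant path power of 1024 and residual path powers of 0.6 and 0.4) are hypothetical and are not taken from the simulations of Fig. 7.

```python
import math

def rate_upper_bound(path_powers, n0):
    """(36) for fixed gains: the energy of all paths is combined."""
    return math.log2(1.0 + sum(path_powers) / n0)

def rate_lower_bound(mean_gain_sq, gain_variance, isi_power, n0):
    """(37): gain uncertainty and ISI are treated as additional noise."""
    return math.log2(1.0 + mean_gain_sq / (n0 + gain_variance + isi_power))

# Hypothetical post-beamforming powers: dominant path first, then residual paths.
PATH_POWERS = [1024.0, 0.6, 0.4]
ISI_POWER = sum(PATH_POWERS[1:])  # residual paths act as interference

for n0 in (1e3, 1e1, 1e-1, 1e-3):
    ub = rate_upper_bound(PATH_POWERS, n0)
    lb = rate_lower_bound(PATH_POWERS[0], 0.0, ISI_POWER, n0)
    print(f"N0 = {n0:8.3f}: ub = {ub:6.2f}, lb = {lb:6.2f} bit/s/Hz")
```

In this toy setting the two bounds nearly coincide when the noise dominates, and the lower bound levels off near $\log_2(1 + 1024)$ once the residual paths dominate the noise.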
PDP before and after Beam Alignment (BA). Fig. 8
compares the average PDP of the mmWave channel with
L= 3 multipath components before and after BA. It can
be seen from Fig. 8 (a) that, before BA, the channel has
a relatively large delay spread and is highly frequency
selective. Moreover, since different multipath components
are mixed with each other and since each one has its own
delay and Doppler shift, the time-domain channel is highly
time-varying. In contrast, as seen from Fig. 8 (b), after
BA, the channel effectively consists of a single multipath
component, thus, it is almost flat in frequency. Also,
note that in contrast with the former case where different
multipath components were mixed with different Doppler
frequencies, in the latter case the Doppler frequency of
the single multipath component can be easily compensated
by standard timing, frequency, and phase synchronization
techniques at the receiver.
[Figure 8: two panels, (a) and (b), each showing $\mathbb{E}[|y_{s,i,j}[k]|^2]$ versus delay.]
Fig. 8: Illustration of the PDP with multipath ($L = 3$) channel
in (14). (a) Before Beam Alignment. (b) After Beam Alignment.
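The reduction of the multipath channel to a single dominant component can be sketched as follows. The example assumes that $\mathbf{F}_M$ and $\mathbf{F}_N$ are unitary DFT matrices, that the array responses have unit-modulus entries, and that all angles fall exactly on the DFT grid; the three paths with their gains and bin indices are hypothetical.

```python
import numpy as np

def dft_matrix(size):
    """Unitary DFT matrix whose columns serve as beamforming vectors."""
    idx = np.arange(size)
    return np.exp(2j * np.pi * np.outer(idx, idx) / size) / np.sqrt(size)

def beamformed_path_powers(M, N, paths, tx_bin, rx_bin):
    """Power of each path after applying the beams (tx_bin, rx_bin).

    Each path is (gain, AoD bin, AoA bin); on-grid angles are an assumption.
    """
    F_M, F_N = dft_matrix(M), dft_matrix(N)
    u, v = F_M[:, tx_bin], F_N[:, rx_bin]
    powers = []
    for rho, aod_bin, aoa_bin in paths:
        a_t = np.sqrt(M) * F_M[:, aod_bin]  # unit-modulus transmit response
        a_r = np.sqrt(N) * F_N[:, aoa_bin]  # unit-modulus receive response
        coeff = rho * (v.conj() @ a_r) * (a_t.conj() @ u)  # rho * v^H a_R a_T^H u
        powers.append(abs(coeff) ** 2)
    return np.array(powers)

# Hypothetical L = 3 channel; beams are aligned with the first (strongest) path.
paths = [(1.0, 3, 7), (0.6, 10, 20), (0.4, 25, 1)]
powers = beamformed_path_powers(32, 32, paths, tx_bin=3, rx_bin=7)
print(np.round(powers, 6))
```

Under these idealized assumptions the aligned path is amplified by $MN = 1024$ while the other two are suppressed completely; with off-grid angles the suppression would be only partial.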
VI. CONCLUSION
In this paper, we proposed a novel time-domain Beam
Alignment (BA) scheme for mmWave MIMO systems
with an HDA architecture. The proposed scheme is
particularly suited for single-carrier multiuser mmWave
communication, where each user has access to the whole
bandwidth, and all the users within the BS coverage can
be trained simultaneously. We focused on the channel
second-order statistics, incorporating both the random
channel gains and Doppler shifts into the channel matrix to
further capture the realistic features of mmWave channels.
We applied the recently developed Non-Negative Least
Squares (NNLS) technique to efficiently find the strongest
path for each user. Simulation results showed that the
proposed scheme incurs moderately low training overhead
and is very robust both to fast time-varying
channels and to large Doppler shifts
among different multipath components. Furthermore, we
have shown that the multipath channel after BA reduces
essentially to a single giant tap. Hence, single-carrier
signaling can perform very efficiently and requires just
standard timing and frequency synchronization (that works
well at high SNR after beamforming) while it requires
no time-domain equalization. This makes the proposed
BA scheme together with single-carrier signaling a strong
contender for future mmWave systems, especially in
outdoor mobile scenarios.
REFERENCES
[1] F. Boccardi, R. W. Heath, A. Lozano, T. L. Marzetta, and
P. Popovski, “Five disruptive technology directions for 5G,” IEEE
Communications Magazine, vol. 52, no. 2, pp. 74–80, 2014.
[2] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C.
Soong, and J. C. Zhang, “What will 5G be?” IEEE Journal on
selected areas in communications, vol. 32, no. 6, pp. 1065–1082,
2014.
[3] J. Zhao, X. Wang, and H. Viswanathan, “Directional beam
alignment for millimeter wave cellular systems,” in 2016 IEEE
36th International Conference on Distributed Computing Systems
(ICDCS), June 2016, pp. 619–628.
[4] X. Song, S. Haghighatshoar, and G. Caire, “A scalable and
statistically robust beam alignment technique for mm-Wave
systems,” IEEE Trans. on Wireless Comm., vol. PP, pp. 1–1, 2018.
[5] A. F. Molisch, V. V. Ratnam, S. Han, Z. Li, S. L. H. Nguyen,
L. Li, and K. Haneda, “Hybrid beamforming for massive MIMO:
A survey,” IEEE Communications Magazine, vol. 55, no. 9, pp.
134–141, 2017.
[6] R. W. Heath, N. Gonzalez-Prelcic, S. Rangan, W. Roh, and A. M.
Sayeed, “An overview of signal processing techniques for millimeter
wave MIMO systems,” IEEE journal of selected topics in signal
processing, vol. 10, no. 3, pp. 436–453, 2016.
[7] M. R. Akdeniz, Y. Liu, M. K. Samimi, S. Sun, S. Rangan, T. S.
Rappaport, and E. Erkip, “Millimeter wave channel modeling and
cellular capacity evaluation,” IEEE Journal on Selected Areas in
Communications, vol. 32, no. 6, pp. 1164–1179, June 2014.
[8] P. Schniter and A. Sayeed, “Channel estimation and precoder design
for millimeter-wave communications: The sparse way,” in 2014 48th
Asilomar Conference on Signals, Systems and Computers, Nov 2014,
pp. 273–277.
[9] J. Rodríguez-Fernández, N. González-Prelcic, K. Venugopal, and
R. W. Heath Jr, “Frequency-domain compressive channel estimation
for frequency-selective hybrid mmWave MIMO systems,” arXiv
preprint arXiv:1704.08572, 2017.
[10] P. A. Eliasi, S. Rangan, and T. S. Rappaport, “Low-rank spatial
channel estimation for millimeter wave cellular systems,” IEEE
Transactions on Wireless Communications, vol. 16, no. 5, pp.
2748–2759, 2017.
[11] J. Palacios, D. D. Donno, and J. Widmer, “Tracking mm-Wave
channel dynamics: Fast beam training strategies under mobility,”
in IEEE INFOCOM 2017 - IEEE Conference on Computer
Communications, May 2017, pp. 1–9.
[12] S. Hur, T. Kim, D. J. Love, J. V. Krogmeier, T. A. Thomas,
and A. Ghosh, “Millimeter wave beamforming for wireless
backhaul and access in small cell networks,” IEEE Transactions
on Communications, vol. 61, no. 10, pp. 4391–4403, October 2013.
[13] S. Haghighatshoar and G. Caire, “The beam alignment problem in
mmWave wireless networks,” in 2016 50th Asilomar Conference on
Signals, Systems and Computers, Nov 2016, pp. 741–745.
[14] IEEE P802.11ad, Part 11, “Wireless LAN medium access control
(MAC) and physical layer (PHY) specifications amendment 3:
enhancements for very high throughput in the 60 GHz band,” IEEE
Computer Society, 2012.
[15] A. Alkhateeb, O. El Ayach, G. Leus, and R. W. Heath, “Channel
estimation and hybrid precoding for millimeter wave cellular
systems,” Selected Topics in Signal Processing, IEEE Journal of,
vol. 8, no. 5, pp. 831–846, 2014.
[16] M. Kokshoorn, H. Chen, P. Wang, Y. Li, and B. Vucetic, “Millimeter
wave MIMO channel estimation using overlapped beam patterns and
rate adaptation,” IEEE Transactions on Signal Processing, vol. 65,
no. 3, pp. 601–616, 2016.
[17] S. Noh, M. D. Zoltowski, and D. J. Love, “Multi-resolution
codebook and adaptive beamforming sequence design for
millimeter wave beam alignment,” IEEE Transactions on Wireless
Communications, vol. 16, no. 9, pp. 5689–5701, Sept 2017.
[18] M. Hussain and N. Michelusi, “Throughput optimal beam alignment
in millimeter wave networks,” in 2017 Information Theory and
Applications Workshop (ITA), Feb 2017, pp. 1–6.
[19] A. Alkhateeb, G. Leus, and R. W. Heath, “Compressed sensing
based multi-user millimeter wave systems: How many measurements
are needed?” in 2015 IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP), April 2015, pp.
2909–2913.
[20] K. Venugopal, A. Alkhateeb, N. G. Prelcic, and R. W.
Heath, “Channel estimation for hybrid architecture-based wideband
millimeter wave systems,” IEEE Journal on Selected Areas in
Communications, vol. 35, no. 9, pp. 1996–2009, 2017.
[21] K. Venugopal, A. Alkhateeb, R. W. Heath, and N. G. Prelcic,
“Time-domain channel estimation for wideband millimeter wave
systems with hybrid architecture,” in Acoustics, Speech and Signal
Processing (ICASSP), 2017 IEEE International Conference on.
IEEE, 2017, Conference Proceedings, pp. 6493–6497.
[22] R. J. Weiler, M. Peter, W. Keusgen, and M. Wisotzki, “Measuring
the busy urban 60 GHz outdoor access radio channel,” in 2014 IEEE
International Conference on Ultra-WideBand (ICUWB), Sept 2014,
pp. 166–170.
[23] V. Va, J. Choi, and R. W. Heath, “The impact of beamwidth
on temporal channel variation in vehicular channels and its
implications,” IEEE Transactions on Vehicular Technology, vol. 66,
no. 6, pp. 5014–5029, 2017.
[24] A. Ghosh, T. A. Thomas, M. C. Cudak, R. Ratasuk, P. Moorut,
F. W. Vook, T. S. Rappaport, G. R. MacCartney, S. Sun, and S. Nie,
“Millimeter-wave enhanced local area systems: A high-data-rate
approach for future wireless networks,” IEEE Journal on Selected
Areas in Communications, vol. 32, no. 6, pp. 1152–1163, June 2014.
[25] S. Buzzi, C. D’Andrea, T. Foggi, A. Ugolini, and G. Colavolpe,
“Single-carrier modulation versus OFDM for millimeter-wave
wireless MIMO,” IEEE Transactions on Communications, vol. PP,
no. 99, pp. 1–1, 2017.
[26] A. Nasser and M. Elsabrouty, “Frequency-selective massive MIMO
channel estimation and feedback in angle-time domain,” in 2016
IEEE Symposium on Computers and Communication (ISCC), June
2016, pp. 1018–1023.
[27] T. Hälsig, D. Cvetkovski, E. Grass, and B. Lankl, “Statistical
properties and variations of LOS MIMO channels at millimeter wave
frequencies,” arXiv preprint arXiv:1803.07768, 2018.
[28] X. Song, S. Haghighatshoar, and G. Caire, “A robust time-domain
beam alignment scheme for multi-user wideband mmWave systems,”
in WSA 2018; 22nd International ITG Workshop on Smart Antennas
(to be published), March 2018, pp. 1–7.
[29] P. Bello, “Characterization of randomly time-variant linear
channels,” IEEE Transactions on Communications Systems, vol. 11,
no. 4, pp. 360–393, 1963.
[30] A. Goldsmith, Wireless communications. Cambridge University
Press, 2005.
[31] M. Slawski, M. Hein et al., “Non-negative least squares for
high-dimensional linear models: Consistency and sparse recovery
without regularization,” Electronic Journal of Statistics, vol. 7, pp.
3004–3056, 2013.
[32] R. Kueng and P. Jung, “Robust nonnegative sparse recovery and
the nullspace property of 0/1 measurements,” IEEE Transactions on
Information Theory, vol. 64, no. 2, pp. 689–703, Feb 2018.
[33] J. G. Proakis and M. Salehi, Digital communications. McGraw-Hill,
2008.
[34] A. M. Sayeed, “Deconstructing multiantenna fading channels,” IEEE
Transactions on Signal Processing, vol. 50, no. 10, pp. 2563–2579,
2002.
[35] J. Song, J. Choi, T. Kim, and D. J. Love, “Advanced quantizer
designs for FDD-based FD-MIMO systems using uniform planar
arrays,” IEEE Transactions on Signal Processing, vol. 66, no. 14,
pp. 3891–3905, July 2018.
[36] E. Dahlman, P. Beming, J. Knutsson, F. Ovesjo, M. Persson, and
C. Roobol, “WCDMA – the radio interface for future mobile
multimedia communications,” IEEE Transactions on Vehicular
Technology, vol. 47, no. 4, pp. 1105–1118, 1998.
[37] R. Tibshirani, “Regression shrinkage and selection via the lasso,”
Journal of the Royal Statistical Society. Series B (Methodological),
pp. 267–288, 1996.
[38] D. P. Bertsekas, Convex optimization algorithms.
Athena Scientific, Belmont, 2015.
[39] D. Kim, S. Sra, and I. S. Dhillon, “Tackling box-constrained
optimization via a new projected quasi-Newton approach,” SIAM
Journal on Scientific Computing, vol. 32, no. 6, pp. 3548–3563,
2010.
[40] D. K. Nguyen and T. B. Ho, “Anti-lopsided algorithm for
large-scale nonnegative least square problems,” arXiv preprint
arXiv:1502.01645, 2015.
[41] E. Perahia, C. Cordeiro, M. Park, and L. L. Yang, “IEEE 802.11
ad: Defining the next generation multi-Gbps Wi-Fi,” in Consumer
Communications and Networking Conference (CCNC), 2010 7th
IEEE. IEEE, Conference Proceedings, pp. 1–5.
[42] G. Caire, “On the ergodic rate lower bounds with applications to
massive MIMO,” arXiv preprint arXiv:1705.03577, 2017.
Xiaoshen Song (S’17) received the B.Sc.
degree in Communication Engineering from
Northwestern Polytechnical University, Xi’an,
China, in 2013, and the M.Sc. degree in
Communication and Information Systems from
the Institute of Electronics, University of
Chinese Academy of Sciences, Beijing, China,
in 2016. Her master’s thesis focuses on
video synthetic aperture radar (VideoSAR)
system design and imaging algorithms. She is
currently pursuing the Ph.D. degree with the
Communications and Information Theory (CommIT) group at Technische
Universität Berlin, Berlin, Germany. Her research interests include
wireless communication, mmWave MIMO, and compressed sensing.
Saeid Haghighatshoar (S’12–M’15) received
the B.Sc. degree in Electrical Engineering
(Electronics) in 2007 and the M.Sc. degree
in Electrical Engineering (Communication
Systems) in 2009, both from Sharif University
of Technology, Tehran, Iran, and the Ph.D.
degree in Computer and Communication
Sciences from École Polytechnique Fédérale
de Lausanne, Lausanne, Switzerland, in 2014.
Since 2015, he has been a postdoctoral researcher
with the Communications and Information Theory
(CommIT) group at Technische Universität Berlin, Berlin, Germany. His
research interests lie in Information Theory, Communication Systems,
Wireless Communication, Optimization Theory, and Compressed
Sensing.
Giuseppe Caire (S’92 – M’94 – SM’03
– F’05) was born in Torino, Italy, in
1965. He received the B.Sc. in Electrical
Engineering from Politecnico di Torino (Italy),
in 1990, the M.Sc. in Electrical Engineering
from Princeton University in 1992 and the
Ph.D. from Politecnico di Torino in 1994.
He has been a post-doctoral research fellow
with the European Space Agency (ESTEC,
Noordwijk, The Netherlands) in 1994-1995,
Assistant Professor in Telecommunications at
the Politecnico di Torino, Associate Professor at the University of Parma,
Italy, Professor with the Department of Mobile Communications at the
Eurecom Institute, Sophia-Antipolis, France, a Professor of Electrical
Engineering with the Viterbi School of Engineering, University of
Southern California, Los Angeles, and he is currently an Alexander
von Humboldt Professor with the Electrical Engineering and Computer
Science Department of the Technical University of Berlin, Germany.
He served as Associate Editor for the IEEE Transactions on
Communications in 1998-2001 and as Associate Editor for the IEEE
Transactions on Information Theory in 2001-2003. He received the Jack
Neubauer Best System Paper Award from the IEEE Vehicular Technology
Society in 2003, the IEEE Communications Society & Information Theory
Society Joint Paper Award in 2004 and in 2011, the Okawa Research
Award in 2006, the Alexander von Humboldt Professorship in 2014, and
the Vodafone Innovation Prize in 2015. Giuseppe Caire has been a Fellow of
the IEEE since 2005. He served on the Board of Governors of the IEEE
Information Theory Society from 2004 to 2007, and as an officer from 2008
to 2013. He was President of the IEEE Information Theory Society in
2011. His main research interests are in the field of communications
theory, information theory, channel and source coding with particular
focus on wireless communications.
5 Data Communication for mmWave Multi-User MIMO
5.1 Introduction
Hybrid digital analog (HDA) beamforming is the most practical solution for mmWave
communication with regard to implementation cost, performance and power efficiency.
This chapter presents two HDA mmWave antenna architectures that can be regarded as
two extreme cases, namely, the fully-connected (FC) architecture and the
one-stream-per-subarray (OSPS) architecture. A joint performance evaluation of the initial beam
alignment and the subsequent data communication is provided, such that the latter
takes place by using the beam direction information obtained by the former. In addition,
the power efficiency of the two architectures is evaluated, taking into
account hardware impairments, e.g., power dissipation and power backoff.
5.2 Clarification of each author’s contributions
This chapter is a journal publication, which is joint work with Thomas Kühne and
Giuseppe Caire. I wrote this journal article as the first author. The citation information is
given below:
X. Song, T. Kühne, and G. Caire, “Fully-/Partially-Connected Hybrid Beamforming
Architectures for mmWave MU-MIMO,” IEEE Transactions on Wireless Communications,
2019. DOI: 10.1109/TWC.2019.2957227
All the authors contributed to this paper. I authored the channel and signaling models.
I implemented the simulations for the beam alignment and data communication
sections. I also wrote the complete first draft (including all sections) of this paper.
Thomas Kühne authored all the hardware sections. He proposed the HDA antenna
architecture model and the hardware impairment model.
Giuseppe Caire, who is my PhD supervisor, contributed valuable discussions throughout
this work. He also made the final revision of the overall draft.
5.3 Original journal article
The following article is a reprint of the original journal paper. It is the accepted version
of the paper. The copyright information is given on page xii of this thesis as well as on
the first page of the reprinted paper.
Fully- / Partially-Connected Hybrid Beamforming
Architectures for mmWave MU-MIMO
Xiaoshen Song, Student Member, IEEE, Thomas Kühne, and Giuseppe Caire, Fellow, IEEE
©2019 IEEE. Reprinted, with permission, from X. Song, T. Kühne, and G. Caire, “Fully-/Partially-Connected Hybrid Beamforming Architectures for
mmWave MU-MIMO,” IEEE Transactions on Wireless Communications, 2019. The published version can be found online: https://ieeexplore.ieee.org/document/8931770.
This reprint is the accepted version of the paper.
Abstract—Hybrid digital analog (HDA) beamforming has
attracted considerable attention in practical implementation
of millimeter wave (mmWave) multiuser multiple-input
multiple-output (MU-MIMO) systems due to the low power
consumption with respect to its fully digital baseband
counterpart. The implementation cost, performance, and
power efficiency of HDA beamforming depend on the
level of connectivity and reconfigurability of the analog
beamforming network. In this paper, we investigate the
performance of two typical architectures that can be regarded
as extreme cases, namely, the fully-connected (FC) and
the one-stream-per-subarray (OSPS) architectures. In the
FC architecture each RF antenna port is connected to all
antenna elements of the array, while in the OSPS architecture
the RF antenna ports are connected to disjoint subarrays.
We jointly consider the initial beam acquisition and data
communication phases, such that the latter takes place by
using the beam direction information obtained by the former.
We use the state-of-the-art beam alignment (BA) scheme
previously proposed by the authors and consider a family
of MU-MIMO precoding schemes well adapted to the beam
information extracted from the BA phase. We also evaluate
the power efficiency of the two HDA architectures taking
into account the power dissipation at different hardware
components as well as the power backoff under typical power
amplifier constraints. Numerical results show that the two
architectures achieve similar sum spectral efficiency, while
the OSPS architecture is advantageous with respect to the FC
case in terms of hardware complexity and power efficiency,
at the sole cost of a slightly longer BA time-to-acquisition due
to its reduced beam angle resolution.
Index Terms—Millimeter Waves, MU-MIMO, HDA
Beamforming, Beam Acquisition, Spectral Efficiency, Power
Efficiency.
The authors are with the Electrical Engineering and Computer Science
Department, Technische Universität Berlin, 10587 Berlin, Germany.
X. Song is sponsored by the China Scholarship Council
(201604910530). This work was funded by the European Union’s
Horizon 2020 research and innovation programme under grant agreement
No. 779305 (SERENA).
I. INTRODUCTION
Millimeter wave (mmWave) multiuser multiple-input
multiple-output (MU-MIMO) communications have
emerged as one of the most promising techniques for the
second phase of 5G wireless systems, aimed at achieving
broadband data communications at unprecedented
high rates (≥ 1 Gb/s per user) in very dense urban
small-cell environments. The relatively underutilized
mmWave spectrum (30-300 GHz) makes it possible to achieve a
target of ∼1 Gb/s per data stream with ∼1 GHz signal
bandwidth, provided that the system can support a spectral
efficiency of about 1 bit/s/Hz. Such relatively low spectral
efficiency per stream can be achieved with rather standard
modulation and coding techniques (e.g., binary codes of
rate 1/2 mapped onto a QPSK constellation), when
the signal to interference plus noise ratio (SINR) at the
receiver is between 0 and 3 dB (depending on the gap to
capacity of the underlying code; see footnote 1).
Due to the severe isotropic pathloss incurred by
mmWave frequencies, large antenna gains are required both
at the base station (BS) side and the user equipment (UE)
side. Fortunately, the small carrier wavelength associated
with mmWave frequencies enables large antenna arrays to
be packed in a small form factor, such that the required
large antenna gain can be obtained using beamforming. For
example, in a single-user scenario where the signal-to-noise
ratio (SNR) at the receiver in isotropic propagation
conditions² is between −30 dB and −20 dB (a quite realistic
situation for outdoor mmWave channels), a combined Tx
and Rx beamforming gain of 30 dB is needed such that,
when the Tx and the Rx beams are well aligned, the
resulting SNR after beamforming reaches the desired target
(a bit above 0 dB, as argued before).
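As a rough link-budget illustration, the sketch below assumes ideal, perfectly aligned beams for which an array of M elements provides a coherent gain of 10 log10(M) dB. The array sizes in the example are hypothetical and serve only to show how a combined gain of 30 dB can arise.

```python
import math


def array_gain_db(num_elements):
    """Ideal coherent beamforming gain of an array, in dB."""
    return 10.0 * math.log10(num_elements)


def snr_after_beamforming_db(snr_before_db, tx_elements, rx_elements):
    """SNR after perfectly aligned Tx and Rx beamforming, in dB."""
    return (snr_before_db
            + array_gain_db(tx_elements)
            + array_gain_db(rx_elements))


# Hypothetical example: 100 BS antennas and 10 UE antennas give
# 20 dB + 10 dB = 30 dB, lifting an isotropic SNR of -30 dB to 0 dB.
snr_example_db = snr_after_beamforming_db(-30.0, 100, 10)
```

Any misalignment, or an analog network that does not realize the full coherent gain, reduces this figure, so the value should be read as an upper bound.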
Realizing fast and accurate digitally steerable
beamforming at mmWave, however, is not a trivial
task. One main challenge is that the conventional full
digital transceiver architecture (with one radio frequency
(RF) chain per antenna element) is infeasible due
to hardware cost, power consumption, and above all
power dissipation in the small integrated array form
factor. Each RF chain consists of (roughly speaking)
analog-to-digital / digital-to-analog (A/D, D/A) converters,
up / down-conversion mixers, filters, power amplifiers
(PAs), and low-noise amplifiers (LNAs). It follows that
a design goal for mmWave transceivers is to reduce the
number of RF chains to be significantly smaller than the
number of antenna array elements.
¹ With ideal single-user capacity achieving codes for the Gaussian
channel, we have that $\log_2(1 + {\rm SINR}) = 1$ bit/s/Hz is achieved for
SINR = 1 (i.e., 0 dB). In practice, gaps of a fraction of a dB to 3-4
dB are obtained by actual coding schemes adopted in current standards.
² Here the isotropic propagation conditions correspond to one active
antenna at the transmitter (Tx) and one active antenna at the receiver
(Rx), respectively.
For this reason, the
concatenation of digital and analog beamforming, known
as hybrid digital analog (HDA) beamforming architecture,
has been widely considered. In such a context, the limited
number of RF chains are used to enable the multistream
baseband processing, while an analog processing is used to
realize the antenna beamforming gain. A primary objective
of HDA beamforming is to maximize the multiuser sum
rate, while keeping the hardware costs, complexity, and
power efficiency, within some desirable targets.
A. Related Work
A large number of works have addressed HDA
beamforming for mmWave communication systems.
Rather than giving a complete account of such considerable
body of literature (out of scope of the present non-tutorial
paper), we consider a few significant representatives and
examine their proposed approaches in a critical manner.
A common assumption in most existing works is that
the analog part of the HDA precoder can only utilize
phase control. This phase control can be realized through
either phase shifters [1–6] or lenses [7, 8]. Consequently,
the problem of finding the (sub-) optimal analog and
digital precoding matrices is transformed into a series
of relatively complicated decomposition steps [2–6],
since the underlying optimization problem is non-convex.
This phase-only control assumption may somewhat
reduce the hardware complexity. However, the signaling
freedom is also drastically reduced and the corresponding
optimization computational complexity is typically
prohibitive for practical real-time implementations.
These drawbacks motivate the exploration of an analog
precoding architecture with both phase and amplitude
controls [9, 10]. In fact, it has been demonstrated in
practice that simultaneous phase and amplitude control
is fully feasible at mmWaves with good accuracy, low
complexity, and low cost [11, 12].
Another severe limitation appearing in several HDA
beamforming works is the assumption of invariant
instantaneous channel coefficients over a large time
duration [1, 13, 14]. It is known that, in order to overcome
the heavy signal attenuation, communication at mmWaves
requires an initial beam acquisition (which we refer to as
beam alignment (BA)) [7, 15, 16]. The goal of BA is to
find a pair of narrow beams connecting each UE with the
BS.³ Thus, the nearly invariant channel assumption only
makes sense after BA is achieved, since once the beams are
aligned, the communication occurs only through a single
narrow path with small effective angular spread, whose
delay and Doppler shift can be easily compensated using
standard synchronization techniques [17–19]. However,
before BA is achieved, the channel delay spread and
time-variation can be large due to the presence of several
multipath components (both the LOS and the non-LOS
(NLOS) paths), each with its own delay and Doppler shift.
³ E.g., in line-of-sight (LOS) propagation, the aligned directions
typically coincide with the AoA and AoD of the LOS path.
In this case, the instantaneous channel coefficients
change very fast. Any BA algorithm relying on an
invariant instantaneous channel assumption is no longer
feasible, since for example, even a small motion of a
few centimeters traverses several wavelengths, potentially
producing multiple deep fades [20, 21].
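The scale of this effect is easy to quantify. The sketch below assumes a 60 GHz carrier and a 3 cm displacement, both chosen purely for illustration, and counts how many wavelengths the motion spans.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def wavelength_m(carrier_hz):
    """Free-space wavelength in meters."""
    return SPEED_OF_LIGHT_M_S / carrier_hz


def wavelengths_traversed(displacement_m, carrier_hz):
    """Number of wavelengths covered by a given displacement."""
    return displacement_m / wavelength_m(carrier_hz)


# Illustrative values: at 60 GHz the wavelength is about 5 mm,
# so a 3 cm motion spans roughly six wavelengths.
lam = wavelength_m(60e9)
spans = wavelengths_traversed(0.03, 60e9)
```

Since the phases of the multipath components change by a full cycle over distances on the order of one wavelength, a motion of this size is enough to move the receiver through several fading nulls.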
In addition, a large number of works on HDA
architectures investigated only the data communication
phase and assume full channel state information (CSI)
[2–6, 10, 22, 23], i.e., that the vectors of baseband
complex channel coefficients at each array element
are known. These works focus on the optimization
of the HDA precoder using the full CSI knowledge.
Unfortunately, this assumption is obviously not feasible in
a realistic system. In order to acquire such coefficients,
one should be able to sample each antenna element, i.e.,
one would need an RF chain per antenna element or
exhaustively measure all elements successively. Hence, if
full CSI knowledge were available, no HDA beamforming
would be needed, since we could simply implement
baseband digital beamforming/multiuser precoding, which
is performance-wise more efficient. As a matter of fact,
it makes sense to study HDA architectures under the
assumption that only a low-dimensional projection of the
channel vectors can be measured by the limited number
of RF chains. To this end, a hybrid precoding scheme
exploiting implicit CSI (i.e., the couplings of all possible
pairs of analog beamforming vectors) was proposed in [24].
However, the work in [24] (as well as in [4, 6, 10, 22, 23])
is limited to a single-user configuration and does not treat
the MU-MIMO case.
It is known that MU-MIMO is superior to single-user
beamforming from a network spectral efficiency
perspective even under HDA, provided that the user
density is rich enough such that the BS can schedule
subsets of UEs to be served by spatial multiplexing
with sufficient angular separation [25, 26]. Hence, this
motivates us to consider the implementation of MU-MIMO
schemes under realistic HDA architecture constraints. Two
“extreme” HDA architectures are depicted in Fig. 1 [27].
Fig. 1 (a) shows a fully-connected (FC) architecture, where
each RF antenna port is connected to all antenna elements
of the array. At the other extreme, Fig. 1 (b) shows
what we refer to as the one-stream-per-subarray (OSPS)
architecture, where each RF antenna port is connected
to a disjoint subarray. A common theme that underlies
most of the HDA works is that the FC architecture
outperforms the OSPS architecture only at the cost of
higher hardware complexity. However, many reference
works [3, 8, 10, 22, 23] ignore hardware impairments
[6], such as the power dissipation and the PA nonlinear
distortion. In particular, the nonlinear PAs employed at
the BS can drastically distort the transmit signal when
operated close to saturation [28]. To this end, a certain
power backoff from the saturation power of a PA should
be considered accordingly for different signaling schemes
and transceiver architectures, such that the PAs can always
work in their linear operating region.
B. Contributions
In this paper we overcome the shortcomings of the
present literature outlined before, and comprehensively
evaluate the performance of HDA architectures (in
particular, as shown in Fig. 1), where we assume both
amplitude and phase control for each analog path. Our
main focus is on the MU-MIMO downlink, but similar
and symmetric conclusions can be reached for the uplink
as well. Our main contributions are summarized as follows:
1) More general and realistic mmWave channel model.
We consider a quite general mmWave wireless channel
model, taking into account the fundamental features of
mmWave channels such as fast time-variation due to
Doppler, frequency-selectivity, and the AoA-AoD sparsity
[20, 21, 29]. The numerical results based on our proposed
channel model are further verified on the 3D geometry
based channel generator QuaDRiGa [30], which has
become a standard tool in industrial R&D as well as in
3GPP standardization.
2) More practical hardware impairments and
power efficiency analysis. When comparing the HDA
beamforming performance of different transmitter
architectures, we take into account the practical hardware
impairments, particularly, the potential power dissipation
of the underlying analog network components, as well
as the unavoidable power backoff for the nonlinear PAs.
While the former is not difficult to compensate, the
latter is highly dependent on the peak-to-average power
ratio (PAPR) of the input signal, which (as illustrated in
Section V) should be carefully investigated in terms of
different signaling and modulation schemes. On top of
the potential hardware impairments, we also evaluate the
power efficiency of the most power consuming PAs with
respect to different transmitter architectures. Numerical
results show that the OSPS architecture with single-carrier
(SC) modulation achieves the highest power efficiency.
3) A joint evaluation of initial BA and data
communication. As mentioned before, a main limitation in
most hybrid beamforming works is that they only focus on
the data communication and assume full CSI. To address
this issue, we consider both initial BA and consecutive data
communication in this paper. We assume that the precoder
in the data communication phase can only exploit a limited
amount of CSI, which is obtained along the beams acquired
in the BA phase. Hence, the signaling and communication
procedure in our paper captures the fundamental features
of practical mmWave communication.
4) Low-complexity data transmit precoding. In the
BA phase, we use our previously proposed BA scheme
[16, 18, 19], after which each UE obtains a sparse estimate
of the channel gains associated to all pairs of AoA-AoD on
a finely spaced discrete grid, corresponding to the Tx and
Rx beamforming codebooks. For the data communication
phase, we consider three alternative precoding options on
top of the effective channel after the BA phase. These are
referred to as beam steering (BST), analog maximum ratio
transmission (MRT), and joint analog maximum ratio and
baseband zeroforcing (MR-ZF), respectively. The proposed
schemes are well suited to practical implementation
due to their low time overhead and low complexity. In
particular, the MR-ZF precoding scheme proposed in this
paper outperforms the state-of-the-art counterparts in the
literature.
Notation: We denote vectors, matrices, and scalars by
$\mathbf{a}$, $\mathbf{A}$, and $a$ ($A$), respectively. For an integer $K \in \mathbb{Z}$,
$[K]$ denotes the index set $\{1, ..., K\}$. We represent sets by
calligraphic $\mathcal{A}$ and their cardinality by $|\mathcal{A}|$. We use $\mathbb{E}[\cdot]$
for the expectation, $\|\cdot\|$ for the $\ell_2$-norm, $\circledast$ for continuous-time
convolution, $\otimes$ for the Kronecker product, and $\odot$ for the Hadamard
product.
II. CHANNEL AND SIGNAL MODELS
A. Channel Model
One of the main new features of 5G wireless networks
is the densely spread small cell layer [31]. In small
cell configurations as illustrated in Fig. 2 (a), the BS
creates a fixed arc-like sectorized beam in the elevation
direction. The orientation of the BS beam center in the
elevation direction tends to be fixed with an elevation
angle $\alpha_e$ [32]. It follows that the probing area in the
range direction is restricted and the intensive initial beam
searching takes place mainly in the azimuth direction.
For notational simplicity, in this paper we only focus on
the 2D azimuth plane. Extension to the 3D geometry is
conceptually straightforward, although it may lead to a rather
high dimensional search for the initial beam acquisition
phase. In the small cell scenario as illustrated in Fig. 2,
where the beam shape in the elevation direction is fixed
a priori in order to define the cell footprint area, the 2D
azimuth geometry is fully justified. We assume that the BS
serves $K$ UEs simultaneously. The BS is equipped with
a uniform linear array (ULA) of $M$ antennas and $M_{\rm RF}$ RF
chains, where $K \le M_{\rm RF} \ll M$. Each UE is equipped with
a ULA of $N$ antennas and $N_{\rm RF} \ll N$ RF chains. Since
the focus of this paper is the BS architecture, we consider
the case of $N_{\rm RF} = 1$; the extension to $N_{\rm RF} > 1$
is straightforward and was considered in our work on BA
[16, 18, 19]. The propagation channel between the BS and
the $k$-th UE, $k \in [K]$, consists of $L_k \ll \max\{M, N\}$
significant multipath components. As a result, the $N \times M$
baseband equivalent impulse response of the channel at
time slot $s$ can be written as
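$$
\begin{aligned}
\mathbf{H}_{s,k}(t,\tau) &= \sum_{l=1}^{L_k} \rho_{s,k,l}\, e^{j2\pi\nu_{k,l}t}\, \mathbf{a}_{\rm R}(\phi_{k,l})\, \mathbf{a}_{\rm T}(\theta_{k,l})^{\rm H}\, \delta(\tau-\tau_{k,l}) \\
&= \sum_{l=1}^{L_k} \mathbf{H}_{s,k,l}(t)\, \delta(\tau-\tau_{k,l}),
\end{aligned} \tag{1}
$$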
multipath component ρs,k,l is constant over a short interval
(within one slot) and changes from slot to slot according
to a wide-sense stationary process statistics characterized
by its power spectral density (Doppler spectrum) [33].
When the channel coherence time (related to the inverse
of the bandwidth of the Doppler spectrum, see [33]) is
significantly larger than the slot duration but equal or
smaller than the (non-consecutive) slot separation in time,
a convenient model is to consider the coefficients as i.i.d.
across different slots. Moreover, the Doppler shift νk,l as
defined in (1) introduces a continuous phase rotation for
each channel sample. Each multipath component (channel
tap coefficient) is formed by the superposition of a large
number of micro-scattering components (e.g., due to rough
surfaces) having (approximately) the same AoA-AoD and
delay. By the central limit theorem, it is customary to model
the superposition of these many small effects as Gaussian
[34, 35]. Hence, the multipath component coefficients can
be modeled as Rice fading given by
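$$
\rho_{s,k,l} \sim \sqrt{\gamma_{k,l}} \left( \sqrt{\frac{\eta_{k,l}}{1+\eta_{k,l}}} + \frac{1}{\sqrt{1+\eta_{k,l}}}\, \check{\rho}_{s,k,l} \right), \tag{4}
$$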
where $\gamma_{k,l}$ denotes the overall multipath component
strength, $\eta_{k,l} \in [0,\infty)$ indicates the strength ratio
between the specular reflection (or LOS) and the scattered
components, and $\check{\rho}_{s,k,l} \sim \mathcal{CN}(0,1)$ is a zero-mean
unit-variance complex Gaussian random variable whose
value changes in an i.i.d. fashion across different slots.
In particular, $\eta_{k,l} \to \infty$ indicates a pure LOS path, while
$\eta_{k,l} = 0$ indicates a pure scattered path, affected by
Rayleigh fading.
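A minimal sketch of how one coefficient could be drawn according to (4) is given below. The function name and the parameter values are illustrative assumptions; the only modeling fact used is that a $\mathcal{CN}(0,1)$ variable has independent real and imaginary parts of variance 1/2 each.

```python
import math
import random


def sample_rice_coefficient(gamma, eta, rng):
    """Draw one multipath coefficient following (4).

    gamma: overall path strength, eta: specular-to-scattered ratio,
    rng: a random.Random instance (passed in for reproducibility).
    """
    sigma = math.sqrt(0.5)  # per-dimension std of a CN(0, 1) variable
    scattered = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
    specular = math.sqrt(eta / (1.0 + eta))
    return math.sqrt(gamma) * (specular + scattered / math.sqrt(1.0 + eta))


# The average power equals gamma for any eta, since
# eta / (1 + eta) + 1 / (1 + eta) = 1.
rng = random.Random(1)
powers = [abs(sample_rice_coefficient(2.0, 3.0, rng)) ** 2
          for _ in range(20000)]
mean_power = sum(powers) / len(powers)
```

Drawing a fresh coefficient per slot with independent calls reproduces the i.i.d. behavior across slots assumed in the text.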
The AoA-AoDs $(\phi_{k,l}, \theta_{k,l})$ in (1) can take on arbitrary
values in the continuous AoA-AoD domain. Following
the widely used approach of [36], known as beam-domain
representation, we obtain a finite-dimensional
representation of the channel response (1). More precisely,
we consider the discrete set of AoA-AoDs
$$
\Phi := \left\{ \check{\phi} : (1 + \sin(\check{\phi}))/2 = \frac{n-1}{N},\ n \in [N] \right\}, \tag{5a}
$$
$$
\Theta := \left\{ \check{\theta} : (1 + \sin(\check{\theta}))/2 = \frac{m-1}{M},\ m \in [M] \right\}. \tag{5b}
$$
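It follows that the corresponding sets $\mathcal{A}_{\rm R} := \{\mathbf{a}_{\rm R}(\check{\phi}) : \check{\phi} \in \Phi\}$
and $\mathcal{A}_{\rm T} := \{\mathbf{a}_{\rm T}(\check{\theta}) : \check{\theta} \in \Theta\}$ form discrete dictionaries
to represent the channel response. For the ULAs considered
in this paper, the dictionaries $\mathcal{A}_{\rm R}$ and $\mathcal{A}_{\rm T}$, after suitable
normalization, reduce to the columns of unitary Discrete
Fourier Transform (DFT) matrices $\mathbf{F}_N \in \mathbb{C}^{N \times N}$ and
$\mathbf{F}_M \in \mathbb{C}^{M \times M}$, with elements
$$
[\mathbf{F}_N]_{n,n'} = \frac{1}{\sqrt{N}}\, e^{j2\pi(n-1)\left(\frac{n'-1}{N} - \frac{1}{2}\right)}, \quad n, n' \in [N], \tag{6a}
$$
$$
[\mathbf{F}_M]_{m,m'} = \frac{1}{\sqrt{M}}\, e^{j2\pi(m-1)\left(\frac{m'-1}{M} - \frac{1}{2}\right)}, \quad m, m' \in [M]. \tag{6b}
$$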
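The two properties claimed for the dictionary, unitarity and agreement with the array response on the grid (5a), can be verified numerically. The sketch below assumes a half-wavelength ULA with unit-norm steering vectors, since the array response definition is not reproduced in this excerpt; all function names are illustrative.

```python
import cmath
import math


def dft_dictionary(n_ant):
    """Matrix with entry [n][q] following (6a), using 0-based indices."""
    scale = 1.0 / math.sqrt(n_ant)
    return [[scale * cmath.exp(2j * math.pi * n * (q / n_ant - 0.5))
             for q in range(n_ant)] for n in range(n_ant)]


def column(matrix, q):
    """Extract column q of a matrix stored as a list of rows."""
    return [row[q] for row in matrix]


def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def grid_angle(q, n_ant):
    """Angle on the grid (5a): (1 + sin(angle)) / 2 = q / n_ant."""
    return math.asin(2.0 * q / n_ant - 1.0)


def ula_steering(angle, n_ant):
    """Assumed half-wavelength ULA response, normalized to unit norm."""
    return [cmath.exp(1j * math.pi * n * math.sin(angle)) / math.sqrt(n_ant)
            for n in range(n_ant)]


F = dft_dictionary(8)
# Columns are orthonormal, and column q equals the steering vector
# evaluated at the q-th grid angle.
gram_ok = all(abs(inner(column(F, a), column(F, b)) - (a == b)) < 1e-9
              for a in range(8) for b in range(8))
```

The half-period offset of 1/2 in the exponent of (6a) is what centers the grid on the interval of sines from −1 to 1.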
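Consequently, based on a subarray basis indexed by $i'$, the
beam-domain representation of the channel response (1) is
given by [7, 36]
$$
\begin{aligned}
\check{\mathbf{H}}^{i'}_{s,k}(t,\tau) &= \mathbf{F}_N^{\rm H}\, \mathbf{H}_{s,k}(t,\tau) \cdot \left( \mathbf{F}_M \odot \mathbf{1}_{\{(i'-1)\hat{M}+1:i'\hat{M},\,1:M\}} \right) \\
&= \sum_{l=1}^{L_k} \check{\mathbf{H}}^{i'}_{s,k,l}(t)\, \delta(\tau-\tau_{k,l}),
\end{aligned} \tag{7}
$$
where $(i' \equiv 1, \hat{M} = M)$ for the FC architecture,
and $(i' \in [M_{\rm RF}], \hat{M} = \frac{M}{M_{\rm RF}})$ for the OSPS
architecture. Here we define
$\check{\mathbf{H}}^{i'}_{s,k,l}(t) := \mathbf{F}_N^{\rm H}\, \mathbf{H}_{s,k,l}(t) \cdot \left( \mathbf{F}_M \odot \mathbf{1}_{\{(i'-1)\hat{M}+1:i'\hat{M},\,1:M\}} \right)$
as the beam-domain $l$-th
multipath component between the $k$-th UE and the BS,
where $\mathbf{1}_{\{a_1:a_2,\,b_1:b_2\}} \in \mathbb{C}^{M \times M}$ is an indicator matrix, with
1 at the components indexed by rows from $a_1$ to $a_2$ and by
columns from $b_1$ to $b_2$, and zero otherwise. The indicator matrix
takes into account the fact that the number of antenna
elements for each subarray in the OSPS architecture is $M_{\rm RF}$
times smaller than that in the FC architecture.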
As shown in our earlier work [16] (and the references
therein), for the FC architecture, as the number of antennas
$M$ at the BS and $N$ at the UE increases, the DFT
basis provides a good sparsification of the propagation
channel. As a result, $\check{\mathbf{H}}^{i'}_{s,k}(t,\tau)$ can be approximated as
a sparse matrix, with non-zero elements in the locations
corresponding to small clusters of discrete AoA-AoD pairs.
For the OSPS architecture, note that the indices of non-zero
elements in $\check{\mathbf{H}}^{i'}_{s,k}(t,\tau)$ are identical for all $i' \in [M_{\rm RF}]$.
However, the channel sparsity depends on the number of
antennas in each subarray. In both cases, we may encounter
a grid error in (7) since the AoAs-AoDs do not necessarily
fall into the uniform grid $\Phi \times \Theta$. Nevertheless, as shown
in [16], the grid error becomes negligible by increasing the
number of (subarray) antennas (i.e., the grid resolution). In
our simulations, we do not constrain the AoA-AoD pairs
of the physical channel to take on values on the discrete
grid; therefore, the grid discretization effect is fully taken
into account in our numerical results.
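The clustering of beam-domain energy, and the leakage caused by an off-grid angle, can be illustrated in one dimension. The sketch below projects a single assumed half-wavelength ULA response onto the DFT beams of (6a) and reports how much energy the strongest beams capture; the array size and angles are illustrative assumptions.

```python
import cmath
import math


def steering(sin_angle, n_ant):
    """Unit-norm half-wavelength ULA response for a given sine of angle."""
    return [cmath.exp(1j * math.pi * n * sin_angle) / math.sqrt(n_ant)
            for n in range(n_ant)]


def beam_domain_energy(sin_angle, n_ant):
    """Energy of the path in each DFT beam (grid sines 2q/N - 1)."""
    response = steering(sin_angle, n_ant)
    energies = []
    for q in range(n_ant):
        beam = steering(2.0 * q / n_ant - 1.0, n_ant)
        projection = sum(b.conjugate() * x for b, x in zip(beam, response))
        energies.append(abs(projection) ** 2)
    return energies


def top_fraction(energies, count):
    """Fraction of the total energy held by the `count` strongest beams."""
    strongest = sorted(energies, reverse=True)[:count]
    return sum(strongest) / sum(energies)


N = 16
on_grid = beam_domain_energy(2.0 * 5 / N - 1.0, N)     # exactly beam 5
off_grid = beam_domain_energy(2.0 * 5.5 / N - 1.0, N)  # halfway to beam 6
```

For the on-grid angle a single beam holds all the energy, whereas for the worst-case off-grid angle the energy splits mainly between the two neighboring beams, which is the small-cluster structure described above.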
B. Signaling Model
Because of space limitations, in this paper we focus
on SC signaling. Similar conclusions can be reached for
OFDM, although the latter is generally more sensitive to
frame synchronization errors, suffers from a large PAPR, and, before
BA is achieved, is exposed to inter-carrier interference due to
the fact that the Doppler spread between the several
multipath components may be large [19, 37]. Let
$\mathbf{x}_s(t) = [x_{s,1}(t), x_{s,2}(t), ..., x_{s,K}(t)]^{\rm T}$ denote the continuous-time
baseband equivalent signal (either pilot or data signal),
transmitted over the $s$-th slot. With HDA beamforming, the
beamformed signal at the output of the transmitter over the
$s$-th slot is generally given by
$$
\hat{\mathbf{x}}_s(t) = \sqrt{E_0} \cdot \mathbf{U}^{\rm RF}_s \cdot \mathbf{W}^{\rm BB}_s \cdot \mathbf{x}_s(t), \tag{8}
$$
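where for simplicity of exposition we restrict to the case
of uniform power allocation, with $E_0 = \frac{P_{\rm tot} T_c}{K}$ indicating
the per-chip energy of each signal stream, where $P_{\rm tot}$
denotes the total radiated power at the BS and $T_c = \frac{1}{B}$
denotes the chip duration, with $B$ indicating the signaling
bandwidth. In (8), we define $\mathbf{W}^{\rm BB}_s \in \mathbb{C}^{M_{\rm RF} \times K}$ and
$\mathbf{U}^{\rm RF}_s \in \mathbb{C}^{M \times M_{\rm RF}}$ as the baseband (digital) and the RF analog
beamforming matrices, respectively. Note that, depending
on the transmitter architecture, the analog beamforming
matrix $\mathbf{U}^{\rm RF}_s$ takes on the form
$$
\left[ \tilde{\mathbf{u}}_{s,1}, \tilde{\mathbf{u}}_{s,2}, \cdots, \tilde{\mathbf{u}}_{s,M_{\rm RF}} \right]
\quad \text{and} \quad
\begin{bmatrix}
\tilde{\mathbf{u}}_{s,1} & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{0} & \tilde{\mathbf{u}}_{s,2} & \cdots & \mathbf{0} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{0} & \mathbf{0} & \cdots & \tilde{\mathbf{u}}_{s,M_{\rm RF}}
\end{bmatrix} \tag{9}
$$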
for the FC (left) and the OSPS (right) architectures,
respectively, where $\tilde{\mathbf{u}}_{s,i} \in \mathbb{C}^{\hat{M}}$, $i \in [M_{\rm RF}]$, with $\hat{M} = M$
for the FC architecture and $\hat{M} = \frac{M}{M_{\rm RF}}$ for the OSPS
architecture. Hence, in both cases $\mathbf{U}^{\rm RF}_s$ has dimension
$M \times M_{\rm RF}$, but FC has a full matrix, while OSPS has a
block-diagonal matrix, due to the constrained connectivity.
Without loss of generality, the beamforming vectors are
normalized as $\sum_{i=1}^{M_{\rm RF}} \|\mathbf{u}_{s,i}\|^2 = M_{\rm RF}$.
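The structural difference in (9) can be made concrete with a small sketch that assembles both matrices from per-chain weight vectors. The helper names and the toy dimensions ($M = 8$, $M_{\rm RF} = 2$) are illustrative assumptions.

```python
def fc_analog_matrix(vectors):
    """FC: each length-M vector is one full column of the M x M_RF matrix."""
    m_ant = len(vectors[0])
    return [[vec[row] for vec in vectors] for row in range(m_ant)]


def osps_analog_matrix(vectors):
    """OSPS: vector i (length M / M_RF) fills only the rows of subarray i."""
    m_hat = len(vectors[0])
    m_rf = len(vectors)
    matrix = [[0j] * m_rf for _ in range(m_hat * m_rf)]
    for i, vec in enumerate(vectors):
        for r, value in enumerate(vec):
            matrix[i * m_hat + r][i] = value
    return matrix


def count_nonzero(matrix):
    """Number of non-zero entries, i.e. active analog paths."""
    return sum(1 for row in matrix for value in row if value != 0)


# Toy example with M = 8 antennas and M_RF = 2 RF chains.
fc = fc_analog_matrix([[1j] * 8, [1] * 8])
osps = osps_analog_matrix([[1] * 4, [1j] * 4])
```

Both matrices are $8 \times 2$, but the FC matrix has 16 active paths and the OSPS matrix only 8, which is the hardware saving that motivates the comparison in this paper.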
The beamformed signal (8) goes through the channel
as defined in (1). At the UE side, because of the HDA
architecture, the UE does not have direct access to each
antenna element. Instead, at each slot s, the UE obtains
only a projection of the received signal by applying some
beamforming vector in the analog domain. We consider a
single RF chain at each UE as mentioned before. Thus, the
received signal at the k-th UE side is given by
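$$
\begin{aligned}
\hat{y}_{s,k}(t) &= \mathbf{v}^{\rm H}_{s,k}\, \mathbf{H}_{s,k}(t,\tau) \circledast \hat{\mathbf{x}}_s(t) + z_{s,k}(t) \\
&= \sqrt{E_0}\, \mathbf{v}^{\rm H}_{s,k}\, \mathbf{H}_{s,k}(t,\tau) \circledast \mathbf{U}^{\rm RF}_s \cdot \mathbf{W}^{\rm BB}_s \cdot \mathbf{x}_s(t) + z_{s,k}(t),
\end{aligned} \tag{10}
$$
where $\mathbf{v}_{s,k} \in \mathbb{C}^N$ denotes the normalized beamforming
vector with $\|\mathbf{v}_{s,k}\| = 1$ at the $k$-th UE, and $z_{s,k}(t)$ is the
continuous-time complex Additive White Gaussian Noise
(AWGN) at the output of the UE RF chain, with a Power
Spectral Density (PSD) of $N_0$ Watt/Hz.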
In the following, we will evaluate the performance of
different transmitter architectures as shown in Fig. 1. For
this purpose, it is useful to first define the channel SNR
before beamforming (BBF), ${\rm SNR}_{\rm BBF}$, given by
$$
{\rm SNR}_{{\rm BBF},k} = \frac{P_{\rm tot} \sum_{l=1}^{L_k} \gamma_{k,l}}{N_0 B}, \tag{11}
$$
where $k$ is the index of the UE and $\gamma_{k,l}$ denotes the
strength of the l-th multipath component. The SNR in
(11) indicates the ratio of the total received signal power
(summing over all the multipath components) over the
total noise power at the receiver baseband processor input,
assuming that the signal is isotropically transmitted by
the BS and isotropically received at the k-th UE over
the total bandwidth $B$. As mentioned before, one of the
challenges of mmWave communication is that the SNR
before beamforming ${\rm SNR}_{\rm BBF}$ in (11) may be very low.
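To get a feeling for the magnitude of (11), the sketch below evaluates it in dB. The thermal noise density of −174 dBm/Hz (room temperature, no receiver noise figure) and the example transmit power, path gains, and bandwidth are assumptions made only for illustration.

```python
import math


def snr_bbf_db(p_tot_dbm, path_gains_db, bandwidth_hz,
               noise_psd_dbm_hz=-174.0):
    """SNR before beamforming following (11), in dB.

    path_gains_db: strengths gamma_{k,l} of the multipath components,
    expressed in dB; they are summed in the linear domain.
    """
    total_gain = sum(10.0 ** (g / 10.0) for g in path_gains_db)
    signal_dbm = p_tot_dbm + 10.0 * math.log10(total_gain)
    noise_dbm = noise_psd_dbm_hz + 10.0 * math.log10(bandwidth_hz)
    return signal_dbm - noise_dbm


# Hypothetical link: 30 dBm radiated power, two paths with strengths
# of -140 dB and -150 dB, and 1 GHz of bandwidth.
example_snr_db = snr_bbf_db(30.0, [-140.0, -150.0], 1e9)
```

With these numbers the result is close to −25.6 dB, inside the −30 dB to −20 dB range quoted in the introduction.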
III. BEAM ACQUISITION AND DATA TRANSMISSION
We evaluate the performance of the FC and OSPS
architectures including both the BA phase and the
consequent data transmission phase, where the latter uses
the beam information obtained by the former. For the
BA phase we use the scheme proposed in our previous
work [19], that compares favorably with respect to several
competing schemes proposed in the literature. Because
of space limitations, we provide here only a high-level
summary of the scheme and refer the reader to
[19] for the full details. Fig. 3 (a) illustrates the considered
frame structure, which consists of three parts: the beacon
slot, the random access control channel (RACCH) slot,
and the data slot. As shown in Fig. 3 (b), the BS
broadcasts its pilot signals periodically over the beacon
slots. The measurements are collected at each UE locally
and independently of other UEs. Based on measurements
accumulated over a sequence of several beacon slots, each
UE can estimate a set of strongly coupled AoA-AoD
pairs, corresponding to the directions of strong propagation
paths between the UE and the BS arrays. These determine
the beamforming direction for possible data transmission.
During the RACCH slot, the BS stays in listening mode and
the UEs send beamformed uplink packets. These packets
contain basic information such as the UE ID and the
beam indices of the selected BS beam directions. The BS
responds with an acknowledgment (ACK) data packet in
the data subslot of a next frame, using the indicated beam
indices for transmission. From this moment on, the BS and
the UE are connected in the sense that, if the procedure is
successful, they can communicate by aligning their beams
along a small number of multipath components with strong
average power transmission.
As explained in detail in [19], the BS beacon signals
are formed by $M_{\rm RF}$ different PN sequences, each of
which undergoes a “multifinger” beam pattern obtained by
selecting a subset of the DFT columns (or masked DFT columns
as in the case of OSPS). The beamforming patterns spread
the signal energy uniformly along subsets of
the BS AoD grid. The beamforming patterns follow a
pre-determined pseudo-random sequence, similar in
spirit to the primary synchronization code of a W-CDMA
3G system for BS identification. During the beacon slot,
each UE $k$ receives using its own pseudo-random sequence
of multifinger beam patterns, and integrates the received
signal energy over the multiple time segments within a
beacon slot in order to obtain an estimate of the average
received energy. As a result, this fully non-coherent energy
measurement yields (approximately) the average energy
sum of several multipath components. These multipath
components correspond to the AoA-AoD pairs in the grid
[33], and $N_d$ indicates the number of the transmit symbols.
Accordingly, the received data signal at the $k$-th UE is
given by (13), where $\mathbf{w}_{k'}$ denotes the $k'$-th column of
$\mathbf{W}^{\rm BB}$, $\Delta_{k,n,l} = 2\pi(\check{\nu}_{k,l} + \nu_{k,l} n T_c)$, and
$C_{k,k',l,n} := \rho_{k,l}\, d_{k',n} \sqrt{E_0}\, (\mathbf{v}_k^{\rm H} \mathbf{a}_{\rm R}(\phi_{k,l}) \mathbf{a}_{\rm T}(\theta_{k,l})^{\rm H} \mathbf{U}^{\rm RF} \mathbf{w}_{k'})$. We assume
that each UE uses standard timing synchronization with
respect to its strongest multipath component indexed by
$l_1$, which is selected by its initial BA. To decode the
data signal, each UE performs matched filtering with
respect to the symbol pulse $p_r(t)$, sampling at epochs
$t = \hat{n} T_c + \tau_{k,l_1}$. It follows that the discrete-time baseband
signal received at the $k$-th UE receiver takes on the form
of (14), where $\hat{n}^{\Delta}_{k,k',\hat{n},n,l} := (\hat{n} - n) T_c + \tau_{k,l_1} - \tau_{k',l}$,
$\varphi_r[t_\Delta] = \varphi_r(t)|_{t=t_\Delta} := \int p_r(\tau)\, p_r^*(\tau - t_\Delta)\, d\tau$, and $z^c_k[\hat{n}]$
denotes the noise at the output of the matched filter with
variance $N_0 \cdot \int |p_r(t)|^2 dt = N_0$. As we can see, the first
term in (14) corresponds to the desired data symbol $d_{k,n}$
multiplied by a different complex coefficient over each path
$l$.⁶ The last two terms in (14) correspond to the
multiuser interference and noise, respectively. By treating
the multiuser interference as noise, the asymptotic ergodic
spectral efficiency of the $k$-th UE is given by (15), and the
sum rate reads $R_{\rm sum} = \sum_{k=1}^{K} R_k$. In all the schemes treated
here, coherent communication can be practically achieved
by including per-user beamformed pilot symbols at the cost
of a very small overhead, as is standard practice
in virtually any modern wireless communication standard.
For simplicity, we shall not take into account this overhead
or the degradation of quasi-coherent receivers, which is
well known and not a specific feature of the systems under
consideration.
1) Hybrid Precoding Formulation
Now the remaining problem is how to define the
precoding/combining vectors. We assume that the BS
communicates with the $k$-th UE along its top-$p$ beams.
We will show later that the parameter $p \ge 1$ controls a
tradeoff between the transmitter power spreading, multiuser
interference, and the system robustness to potential
blockages. To simplify the practical implementation, we
define the combining vector at the $k$-th UE as
$$
\mathbf{v}_k = \frac{1}{\sqrt{p}}\, \mathbf{F}_N \cdot \sum_{p'=1}^{p} \check{\mathbf{v}}_{k,p'}, \tag{16}
$$
where $\check{\mathbf{v}}_{k,p'} \in \mathbb{C}^N$ is an all-zero vector with a 1 at
the component corresponding to the $p'$-th strongest AoA,
i.e., the AoA index of the $p'$-th strongest component in
$\mathbf{\Gamma}^{\star}_k$. We denote by $\mathbf{V} \in \mathbb{C}^{NK \times K}$ the aggregated receive
beamforming matrix, given by $\mathbf{V} = {\rm diag}(\mathbf{v}_1, \mathbf{v}_2, ..., \mathbf{v}_K)$.
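A minimal sketch of (16) follows. It assumes that the estimated gain matrix is stored with AoA indices along rows and AoD indices along columns, which is a convention chosen here for illustration; the helper names and the toy matrix are likewise hypothetical.

```python
import cmath
import math


def dft_column(q, n_ant):
    """Column q (0-based) of the unitary DFT matrix defined in (6a)."""
    return [cmath.exp(2j * math.pi * n * (q / n_ant - 0.5)) / math.sqrt(n_ant)
            for n in range(n_ant)]


def top_p_pairs(gain_matrix, p):
    """(AoA index, AoD index) of the p largest entries, strongest first."""
    entries = [(gain, row, col)
               for row, line in enumerate(gain_matrix)
               for col, gain in enumerate(line)]
    entries.sort(key=lambda entry: entry[0], reverse=True)
    return [(row, col) for _, row, col in entries[:p]]


def combining_vector(aoa_indices, n_ant):
    """Combining vector of (16): scaled sum of the selected DFT columns."""
    p = len(aoa_indices)
    vector = [0j] * n_ant
    for q in aoa_indices:
        for n, value in enumerate(dft_column(q, n_ant)):
            vector[n] += value / math.sqrt(p)
    return vector


# Toy 4 x 4 estimate with two dominant AoA-AoD pairs.
gains = [[0.1, 0.0, 0.0, 0.0],
         [0.0, 0.9, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.5],
         [0.0, 0.0, 0.0, 0.0]]
pairs = top_p_pairs(gains, 2)
v_k = combining_vector([aoa for aoa, _ in pairs], 4)
```

When the selected AoA indices are distinct, the DFT columns are orthonormal and the factor $1/\sqrt{p}$ keeps the combining vector at unit norm.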
⁶ Actually, we have shown in our previous work [19] that the phase
perturbations over several strong paths are easy to compensate by standard
carrier synchronization techniques, given that a successful BA is achieved
and the effective channel after BA has a very small time spreading. Due
to the space limit, in (14) and also in our simulations, we will keep
the phase perturbations such that the numerical results coincide with the
conservative worst-case scenario.
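It follows that the receive data signal vector
$\bar{\mathbf{y}}(t) = [y_1(t), y_2(t), ..., y_K(t)]^{\rm T} \in \mathbb{C}^K$ corresponding to the $K$
UEs can be written as
$$
\begin{aligned}
\bar{\mathbf{y}}(t) &= \sqrt{E_0}\, \mathbf{V}^{\rm H} \cdot \mathbf{H}(t,\tau) \circledast \mathbf{U}^{\rm RF} \cdot \mathbf{W}^{\rm BB} \cdot \mathbf{x}_d(t) + \bar{\mathbf{z}}(t) \\
&\overset{(a)}{=} \sqrt{E_0}\, \mathbf{V}^{\rm H} \cdot \mathbf{H}(t,\tau) \cdot \mathbf{U} \cdot \mathbf{A}^{\rm RF} \cdot \mathbf{W}^{\rm BB} \circledast \mathbf{x}_d(t) + \bar{\mathbf{z}}(t) \\
&\overset{(b)}{=} \sqrt{E_0}\, \tilde{\mathbf{H}}(t,\tau) \cdot \mathbf{A}^{\rm RF} \cdot \mathbf{W}^{\rm BB} \circledast \mathbf{x}_d(t) + \bar{\mathbf{z}}(t),
\end{aligned} \tag{17}
$$
where $\bar{\mathbf{z}}(t) \in \mathbb{C}^K$ indicates the noise vector,
$\mathbf{U}^{\rm RF} := \mathbf{U} \cdot \mathbf{A}^{\rm RF}$ is the analog beamforming matrix,
$\tilde{\mathbf{H}}(t,\tau) := \mathbf{V}^{\rm H} \cdot \mathbf{H}(t,\tau) \cdot \mathbf{U}$ denotes a constructed effective channel,
and $\mathbf{H}(t,\tau) \in \mathbb{C}^{NK \times M}$ represents the aggregated
instantaneous channel of all the $K$ UEs given by
$$
\mathbf{H}(t,\tau) = \left[ \mathbf{H}_1(t,\tau)^{\rm T}, \mathbf{H}_2(t,\tau)^{\rm T}, \cdots, \mathbf{H}_K(t,\tau)^{\rm T} \right]^{\rm T}, \tag{18}
$$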
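where $\mathbf{H}_k(t,\tau)$, $k \in [K]$, is given in (1). In (17) (a), we
define $\mathbf{U} \in \mathbb{C}^{M \times pK}$ as the angular support, and
$\mathbf{A}^{\rm RF} = [\mathbf{a}_1, \mathbf{a}_2, ..., \mathbf{a}_K] \in \mathbb{C}^{pK \times K}$ as the coefficient tuning for the
analog part. More precisely, we assume $\mathbf{U} = [\mathbf{U}_1, ..., \mathbf{U}_K]$,
where $\mathbf{U}_k \in \mathbb{C}^{M \times p}$, $k \in [K]$, takes on the form
$$
\mathbf{U}_k = \left( \mathbf{F}_M \odot \mathbf{1}_{\{(i'-1)\hat{M}+1:i'\hat{M},\,1:M\}} \right) \times \left[ \check{\mathbf{u}}_{k,1}, \check{\mathbf{u}}_{k,2}, ..., \check{\mathbf{u}}_{k,p} \right], \tag{19}
$$
where $(i' \equiv 1, \hat{M} = M)$ for the FC architecture, and
$(i' = k, \hat{M} = \frac{M}{M_{\rm RF}})$ for the OSPS architecture. Also, we define
$\check{\mathbf{u}}_{k,p'} \in \mathbb{C}^M$, $p' \in [p]$, as an all-zero vector with a 1 at
the component corresponding to the $p'$-th strongest AoD
of $\mathbf{\Gamma}^{\star}_k$.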
Notice that in order to construct the beamforming vector
at the $k$-th UE and the precoding vectors at the BS, only
the AoA-AoD indices of the $p$ strongest components in the
estimated channel gain matrix $\mathbf{\Gamma}^{\star}_k$ are needed. Then, once
these vectors are fixed, the resulting effective channel has
much lower dimensions than the original physical $N \times M$
channel (from array to array). Therefore, it can be estimated
using orthogonal uplink pilots and channel reciprocity as
in regular TDD MU-MIMO (e.g., see [38, 39]). Namely,
the constructed effective channel matrix $\tilde{\mathbf{H}}(t,\tau)$ in (17) (b)
has dimension $K \times (pK)$, and can be estimated using $pK$
uplink pilot sub-slots using TDD reciprocity.
2) Beam Steering (BST) Scheme
The BST scheme consists of simply steering the $K$ data
streams towards the $K$ UEs along their strongest AoD.
Hence, we have $p = 1$ in (16) and in (19), respectively.
In this case, the analog tuning matrix and the baseband
precoding matrix under the BST precoding scheme reduce
to the identity, i.e., $\mathbf{A}^{\rm RF} = \mathbf{W}^{\rm BB} = \mathbf{I}_K$. Note that in the
BST scheme, we do not need any additional uplink channel
estimation of $\tilde{\mathbf{H}}(t,\tau)$. Namely, once each UE has fed back
the control packet indicating its strongest AoD, the BS can
immediately apply the BST precoder.
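$$
\begin{aligned}
\hat{y}_k(t) &= \sqrt{E_0}\, \mathbf{v}_k^{\rm H}\, \mathbf{H}_k(t,\tau) \circledast \mathbf{U}^{\rm RF} \cdot \mathbf{W}^{\rm BB} \cdot \mathbf{x}_d(t) + z_k(t) \\
&= \sum_{n=1}^{N_d} \sum_{k'=1}^{K} \sum_{l=1}^{L_k} d_{k',n} \sqrt{E_0}\, \mathbf{v}_k^{\rm H}\, \mathbf{H}_{k,l}(t)\, \mathbf{U}^{\rm RF} \mathbf{w}_{k'}\, p_r(t - \tau_{k',l} - nT_c) + z_k(t) \\
&= \sum_{n=1}^{N_d} \sum_{k'=1}^{K} \sum_{l=1}^{L_k} C_{k,k',l,n}\, e^{j\Delta_{k,n,l}}\, p_r(t - \tau_{k',l} - nT_c) + z_k(t)
\end{aligned} \tag{13}
$$
$$
\begin{aligned}
y_k[\hat{n}] &= y_k(t)\big|_{t=\hat{n}T_c+\tau_{k,l_1}} = \hat{y}_k(t) \circledast p_r^*(-t)\big|_{t=\hat{n}T_c+\tau_{k,l_1}} \\
&= \sum_{n=1}^{N_d} \sum_{k'=1}^{K} \sum_{l=1}^{L_k} C_{k,k',l,n}\, e^{j\Delta_{k,n,l}}\, \varphi_r\!\left[\hat{n}^{\Delta}_{k,k',\hat{n},n,l}\right] + z^c_k[\hat{n}] \\
&= \sum_{n=1}^{N_d} \sum_{l=1}^{L_k} C_{k,k,l,n}\, e^{j\Delta_{k,n,l}}\, \varphi_r\!\left[\hat{n}^{\Delta}_{k,k,\hat{n},n,l}\right]
 + \sum_{n=1}^{N_d} \sum_{k' \neq k} \sum_{l=1}^{L_k} C_{k,k',l,n}\, e^{j\Delta_{k,n,l}}\, \varphi_r\!\left[\hat{n}^{\Delta}_{k,k',\hat{n},n,l}\right] + z^c_k[\hat{n}]
\end{aligned} \tag{14}
$$
$$
R_k = \log_2\left( 1 + \frac{\mathbb{E}\left[ \left| \sum_{l=1}^{L_k} C_{k,k,l,n}\, e^{j\Delta_{k,n,l}}\, \varphi_r\!\left[\hat{n}^{\Delta}_{k,k,\hat{n},n,l}\right] \right|^2 \right]}{\mathbb{E}\left[ \left| \sum_{k' \neq k} \sum_{l=1}^{L_k} C_{k,k',l,n}\, e^{j\Delta_{k,n,l}}\, \varphi_r\!\left[\hat{n}^{\Delta}_{k,k',\hat{n},n,l}\right] \right|^2 \right] + N_0} \right) \tag{15}
$$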
3) Analog Maximum Ratio Transmission (MRT) Scheme
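In this scheme, we aim to maximize the desired
signal power as well as to increase the robustness against
blockage. To this end, the baseband precoding matrix
remains the identity, i.e., $\mathbf{W}^{\rm BB} = \mathbf{I}_K$, while the $k$-th analog
MRT tuning vector (i.e., the $k$-th column of $\mathbf{A}^{\rm RF}$) is given
by
$$
\mathbf{a}_k = \left( \tilde{\mathbf{H}}(t,\tau)_{\{k,:\}}^{\rm H} \odot \mathbf{1}_{\{(k'-1)\hat{p}+1:k'\hat{p}\}} \right) \cdot \Delta_{\rm RF}, \tag{20}
$$
where $\tilde{\mathbf{H}}(t,\tau)_{\{k,:\}}$ indicates the $k$-th row of $\tilde{\mathbf{H}}(t,\tau)$,
and $\Delta_{\rm RF} \in \mathbb{R}_+$ denotes the normalizing factor
such that $\sum_{i=1}^{M_{\rm RF}} \|\mathbf{u}_i\|^2 = M_{\rm RF}$. The indicator vector
$\mathbf{1}_{\{(k'-1)\hat{p}+1:k'\hat{p}\}} \in \mathbb{C}^{pK}$ has components 1 over the index range
$\{(k'-1)\hat{p}+1 : k'\hat{p}\}$ and 0 otherwise, where $(k' \equiv 1, \hat{p} = pK)$
for the FC architecture and $(k' = k, \hat{p} = p)$ for the
OSPS architecture. Here the indicator vector ensures that,
in the OSPS architecture, the analog beamforming matrix
$\mathbf{U}^{\rm RF} = \mathbf{U} \cdot \mathbf{A}^{\rm RF}$ satisfies the block diagonal structure as
illustrated in (9).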
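The masking in (20) can be sketched as follows. For simplicity the sketch normalizes the coefficient vectors directly so that their squared norms sum to $K$, which matches the constraint on $\mathbf{u}_i$ only under the assumption that the columns of $\mathbf{U}$ are orthonormal; the function name and the toy channel are illustrative.

```python
import math


def mrt_tuning_vectors(h_eff, p, architecture):
    """Analog MRT coefficient vectors a_k following the structure of (20).

    h_eff: K x pK effective channel, given as a list of rows.
    Returns K vectors of length pK with a common scaling factor.
    """
    num_users = len(h_eff)
    raw = []
    for k, row in enumerate(h_eff):
        if architecture == "FC":
            keep = set(range(p * num_users))        # all pK coefficients
        elif architecture == "OSPS":
            keep = set(range(k * p, (k + 1) * p))   # own block only
        else:
            raise ValueError("architecture must be 'FC' or 'OSPS'")
        raw.append([value.conjugate() if idx in keep else 0j
                    for idx, value in enumerate(row)])
    total = sum(abs(x) ** 2 for vec in raw for x in vec)
    scale = math.sqrt(num_users / total)            # plays the role of Delta_RF
    return [[scale * x for x in vec] for vec in raw]


# Toy effective channel for K = 2 users and p = 1 beam per user.
h_eff = [[1 + 1j, 0.5], [0.2j, 2.0]]
a_fc = mrt_tuning_vectors(h_eff, 1, "FC")
a_osps = mrt_tuning_vectors(h_eff, 1, "OSPS")
```

In the OSPS case each vector keeps only the coefficients of its own block, which is what preserves the block-diagonal form of (9).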
4) Joint Analog Maximum Ratio and Baseband
Zeroforcing (MR-ZF) Scheme
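On top of the previous MRT scheme, in this joint MR-ZF
scheme, we propose to make use of the baseband precoding
to further reduce the multiuser interference. Accordingly,
the analog MRT vectors in $\mathbf{A}^{\rm RF}$ are given by (20), while
the baseband ZF matrix $\mathbf{W}^{\rm BB}$ takes on the form
$$
\mathbf{W}^{\rm BB} = \left( \tilde{\mathbf{H}}(t,\tau) \mathbf{A}^{\rm RF} \right)^{\rm H} \cdot \left( \tilde{\mathbf{H}}(t,\tau) \mathbf{A}^{\rm RF} \left( \tilde{\mathbf{H}}(t,\tau) \mathbf{A}^{\rm RF} \right)^{\rm H} \right)^{-1} \cdot \Delta_{\rm ZF}, \tag{21}
$$
where $\Delta_{\rm ZF} \in \mathbb{R}_+$ is the normalizing factor ensuring the
total radiated power constraint, i.e., $\sum_{k=1}^{K} \|\mathbf{w}_k\|^2 = K$.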
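The zero-forcing property of (21) can be checked on a toy example. The sketch below takes the product $\tilde{\mathbf{H}} \mathbf{A}^{\rm RF}$ as a given $K \times K$ matrix `g` with hypothetical entries, and implements the small matrix helpers by hand so that it depends only on the standard library.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def hermitian(a):
    """Conjugate transpose."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(a)
    aug = [[complex(x) for x in row] + [1 + 0j if i == j else 0j
                                        for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def zf_precoder(g):
    """Baseband ZF matrix of (21), scaled so that sum_k ||w_k||^2 = K."""
    g_h = hermitian(g)
    w = matmul(g_h, inverse(matmul(g, g_h)))
    power = sum(abs(x) ** 2 for row in w for x in row)
    scale = math.sqrt(len(g) / power)   # plays the role of Delta_ZF
    return [[scale * x for x in row] for row in w]


g = [[1.0, 0.3j], [0.2, 1 + 0.5j]]   # hypothetical K = 2 example
w_bb = zf_precoder(g)
end_to_end = matmul(g, w_bb)         # should be a scaled identity
```

The end-to-end matrix is diagonal with equal entries, so each user sees its own stream without multiuser interference, at the price of the power lost in the normalization.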
IV. HARDWARE IMPAIRMENTS
In all the above derivations, we have implicitly assumed
that all the hardware components work in their ideal
range without any distortion or power dissipation. However,
in practical hardware systems, such an assumption is not
trivial to meet. For example, the implementation of HDA
transceivers consists of a large number of power dividers
and combiners in the analog part, particularly for the
FC architecture. The power dissipation caused by these
components has a severe impact on the transmit power and
the power efficiency. Moreover, due to the superposition
of multiple beamformed pilots / data, the input signal at
the PAs may encounter a large PAPR. Also, different
beamforming vectors will create different power levels
for different PAs. As a result, the input power for
some individual PAs may exceed their saturation limit
(relevant to per-antenna power constraint) and even cause
a disruption of the whole transmission. All these hardware
impairments have a severe impact on the transmitter
performance and should not be neglected. In this section,
we will provide the mathematical model to evaluate the
hardware efficiency of different transmitter architectures
given in Fig. 1.
We assume that each analog path has simultaneous
amplitude and phase control, as shown in Fig. 1. Referring
to (8), let $\tilde{x} \in \mathbb{C}^{M}$ denote the pre-amplified
beamformed signal$^{7}$, given by
$$\tilde{x} = \sqrt{\alpha_{com}} \cdot \widetilde{U}_{RF} \cdot \sqrt{\alpha_{div}} \cdot W_{BB} \cdot x, \qquad (22)$$
where $x = [x_1, \cdots, x_K] \in \mathbb{C}^{K}$ denotes the transmit
symbol vector, with $\mathbb{E}[|x_i|^2] = 1$, $i \in [K]$. The factor
$\alpha_{div}$ indicates the power splitting at the divider, with
$\alpha_{div} = \frac{1}{M}$ for the FC architecture as shown in
Fig. 1 (a) and $\alpha_{div} = \frac{M_{RF}}{M}$ for the OSPS
architecture as shown in Fig. 1 (b). Moreover, the factor
$\alpha_{com}$ models the power dissipation of the combiners, i.e.,
$\alpha_{com} = \frac{1}{M_{RF}}$ for the FC architecture and
$\alpha_{com} = 1$ for the OSPS architecture. Both $\alpha_{div}$ and
$\alpha_{com}$ result from the hardware implementation and are based on
the corresponding S-parameters of the dividers and combiners as in [5].
We assume that the baseband beamforming matrix $W_{BB}$ is of dimension
$K \times K$ with $K = M_{RF}$, and that the analog beamforming matrix
$\widetilde{U}_{RF} = [u_1, ..., u_{M_{RF}}] \in \mathbb{C}^{M \times M_{RF}}$
satisfies the specific FC / OSPS structure illustrated in (9).
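We consider the rather simple BST precoding with
$W_{BB} = I_K$. To first meet the total power constraint, for
any $i \in [M_{RF}]$, we have $\|u_i\|^2 = M$ for the FC architecture
and $\|u_i\|^2 = \frac{M}{M_{RF}}$ for the OSPS architecture, respectively.
It follows that the effective pre-amplified radiated power
of the beamformed signal $\tilde{x}$ in (22) can be written as
$$\tilde{P} = \mathbb{E}[\tilde{x}^{H} \tilde{x}] = \alpha_{com}\alpha_{div} \cdot \mathbb{E}\big[x^{H} (\widetilde{U}_{RF})^{H} \widetilde{U}_{RF}\, x\big] = \alpha_{com}\alpha_{div} \cdot \mathrm{tr}\big(\mathbb{E}[x x^{H}] \cdot (\widetilde{U}_{RF})^{H} \widetilde{U}_{RF}\big). \qquad (23)$$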
Accordingly, the pre-amplified radiated power for the FC
and the OSPS architectures reads
$\tilde{P}_{FC} = M_{RF} \cdot \frac{1}{M_{RF}}$ and
$\tilde{P}_{OSPS} = M_{RF}$, respectively. As we can see, in order
to achieve the same output power, the FC transmitter
should compensate for an additional combiner power
dissipation. More precisely, the transmitter should either
boost the input signal as $M_{RF}\, x$ or choose PAs with
larger gain for the amplification stage. We consider the
former approach and mathematically include the potential
boosting factor $M_{RF}$ as well as the factors
$(\alpha_{com}, \alpha_{div})$ into the beamforming matrix
$\widetilde{U}_{RF}$. Denoting by $U_{RF}$ the integrated analog
beamforming matrix, the pre-amplified beamformed signal in (22)
can be written as $\tilde{x} = U_{RF} \cdot W_{BB} \cdot x$, which
is consistent with our assumptions and formulations in Section II.
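The two values of the pre-amplified power can be reproduced numerically from (23). The sketch below assumes BST precoding with uncorrelated unit-power symbols, so that the trace in (23) reduces to the sum of the squared column norms of the analog beamforming matrix. The function name is hypothetical, and the values M = 128, M_RF = 4 are taken from the numerical section.

```python
def pre_amplified_power(M, M_RF, architecture):
    """Evaluate (23) for BST precoding (W_BB = I_K, K = M_RF).

    Assumes E[x x^H] = I_K, so tr(E[x x^H] U^H U) = sum_i ||u_i||^2.
    """
    if architecture == "FC":
        alpha_div = 1.0 / M        # each RF chain is split over all M paths
        alpha_com = 1.0 / M_RF     # combiner dissipation factor
        col_norm_sq = float(M)     # ||u_i||^2 for the FC architecture
    elif architecture == "OSPS":
        alpha_div = M_RF / M       # each RF chain feeds M / M_RF paths
        alpha_com = 1.0            # no combiners in the OSPS architecture
        col_norm_sq = M / M_RF     # ||u_i||^2 for the OSPS architecture
    else:
        raise ValueError("architecture must be 'FC' or 'OSPS'")
    trace = M_RF * col_norm_sq     # sum of the M_RF squared column norms
    return alpha_com * alpha_div * trace


if __name__ == "__main__":
    for arch in ("FC", "OSPS"):
        print(arch, pre_amplified_power(M=128, M_RF=4, architecture=arch))
```

Under these assumptions the ratio between the OSPS and FC values equals M_RF, which is the combiner dissipation that the FC transmitter has to compensate.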
$^{7}$ For notational simplicity, we ignore the slot index $s$ and the
time index $t$ here.

The beamformed signal (22) then goes through the
amplification stage, where at each antenna branch a
PA amplifies the signal before transmission. We assume
that the PAs in different antenna branches have the
same input-output relation. For any given antenna in the
transmitter array, let $P_{rad}$ denote the radiated power of
the antenna, and $P_{cons}$ denote the consumed power of the
corresponding PA, which includes both the radiated power
and the dissipated power. Following the approach in [28],
the power consumed by the PA takes the form
$$P_{cons} = \frac{\sqrt{P_{max}}}{\eta_{max}} \sqrt{P_{rad}}, \qquad (24)$$
where $P_{max}$ is the maximum output power of the PA with
$P_{rad} \le P_{max}$ and $\eta_{max}$ is the maximum efficiency of the
PA. Note that this relation holds for the most common
PA implementations and is therefore a good choice for the
following calculation. Considering that the PAs are often
the predominant power consumers, we define
$$\eta_{eff} = \frac{P_{rad}}{P_{cons}} \qquad (25)$$
as the metric to compare the power efficiency of
the two transmitter architectures shown in Fig. 1. Note that
due to the superposition of multiple beamforming vectors
(particularly for the FC architecture) and the potentially
high PAPR of the time-domain transmit waveform $\tilde{x}$ in
(22) (particularly with OFDM signaling), the input power
for some individual PAs may exceed their saturation limit. This
would result in non-linear distortion and even the disruption
of the whole transmission. To compare the two transmitter
architectures and ensure that all the underlying $M$ PAs
simultaneously work in their linear range, we generally
have two options:
Option I: Both the FC architecture and the OSPS
architecture utilize the same PA but apply a different input
back-off $\alpha_{off} \in (0,1]$, such that the peak power of the
radiated signal is smaller than $P_{max}$. As a reference, we
denote by $(P_{rad,0}, \eta_{max,0})$ the parameters of a reference
PA under the reference precoding/beamforming strategy
with a power backoff factor $\alpha_{off,0}$ (as illustrated later
in Section V). For different scenarios (with certain $\alpha_{off}$),
the average radiated power and the consumed power take
the form $P_{rad} = \frac{\alpha_{off}}{\alpha_{off,0}} P_{rad,0}$ and
$P_{cons} = \frac{\sqrt{P_{max,0}}}{\eta_{max,0}} \sqrt{P_{rad}}$. The
transmitter power efficiency is given by
$$\eta_{eff} = \frac{P_{rad}}{P_{cons}} = \frac{\sqrt{P_{rad}} \cdot \eta_{max,0}}{\sqrt{P_{max,0}}}. \qquad (26)$$
Option II: We choose to deploy different PAs for the
FC architecture and the OSPS architecture. More precisely,
we assume that the underlying PA has a maximum output
power of $P_{max} = \frac{\alpha_{off,0}}{\alpha_{off}} P_{max,0}$, where
$\alpha_{off}$ has the same value as in Option I. Consequently, the
average radiated power and the consumed power of the underlying PA can
be written as $P_{rad} = P_{rad,0}$ and
$P_{cons} = \frac{\sqrt{P_{max,0} \cdot \alpha_{off,0} / \alpha_{off}}}{\eta_{max}} \sqrt{P_{rad}}$.
The transmitter power efficiency is given by
$$\eta_{eff} = \frac{P_{rad}}{P_{cons}} = \frac{\sqrt{P_{rad}} \cdot \eta_{max}}{\sqrt{P_{max,0} \cdot \alpha_{off,0}}} \cdot \sqrt{\alpha_{off}}. \qquad (27)$$
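The PA model in (24)-(25) and the two options in (26)-(27) can be sketched as follows. This is a minimal illustration rather than the evaluation code behind the reported results: the function names are hypothetical, powers are in linear units (e.g. mW), and the reference values in the demo are placeholders chosen only to exercise the formulas.

```python
import math


def consumed_power(p_rad, p_max, eta_max):
    """PA consumed power as in (24); requires 0 < p_rad <= p_max."""
    if not 0.0 < p_rad <= p_max:
        raise ValueError("require 0 < p_rad <= p_max")
    return math.sqrt(p_max) / eta_max * math.sqrt(p_rad)


def efficiency(p_rad, p_max, eta_max):
    """Effective power efficiency eta_eff = P_rad / P_cons as in (25)."""
    return p_rad / consumed_power(p_rad, p_max, eta_max)


def option_one(alpha_off, alpha_off0, p_rad0, p_max0, eta_max0):
    """Same PA, different back-off: returns (P_rad, eta_eff) following (26)."""
    p_rad = alpha_off / alpha_off0 * p_rad0
    return p_rad, efficiency(p_rad, p_max0, eta_max0)


def option_two(alpha_off, alpha_off0, p_rad0, p_max0, eta_max):
    """Different PA with scaled P_max: returns (P_rad, eta_eff) following (27)."""
    p_max = alpha_off0 / alpha_off * p_max0
    return p_rad0, efficiency(p_rad0, p_max, eta_max)


if __name__ == "__main__":
    # Placeholder reference values, chosen only for illustration.
    ref = dict(alpha_off0=0.2, p_rad0=0.8, p_max0=4.0)
    for alpha_off in (0.2, 0.1, 0.05):
        p1, e1 = option_one(alpha_off, eta_max0=0.3, **ref)
        p2, e2 = option_two(alpha_off, eta_max=0.3, **ref)
        print(f"alpha_off={alpha_off}: Option I P_rad={p1:.3f}, eta={e1:.3f}; "
              f"Option II P_rad={p2:.3f}, eta={e2:.3f}")
```

In this sketch, Option I reduces the radiated power in proportion to the back-off, whereas Option II keeps the radiated power at its reference value and pays for the larger PA through a lower efficiency.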
Note that the characteristics ($P_{max}$ and $\eta_{max}$) of
different PAs highly depend on the operating frequency,
implementation, and technology. Aiming at illustrating how
to apply the proposed analysis framework in practical
system design, we will exemplify a set of PA parameters
in Section V to evaluate the efficiency $\eta_{eff}$ of the two
architectures given in Fig. 1. For the comparison of BA
and data communication algorithms, we are interested in
the performance of the corresponding algorithms using
the different transmitter architectures but with the same
channel condition as in (11). Therefore, we assume
the same total radiated power constraint $P_{tot}$ for both
architectures in Fig. 1. In practical systems, this assumption
can be satisfied by applying a certain power backoff as in
Option I or choosing different PAs as in Option II. This
additionally fulfills the per-antenna power constraint, such that
all the underlying PAs work in their linear range with an
identical scalar gain. However, we will show in Section V
that, under the same radiated power constraint, different
architectures may have a different power efficiency.
V. NUMERICAL EVALUATION
We now present the numerical results to evaluate
the proposed precoding schemes and to illustrate the
performance of different transmitter architectures as shown
in Fig. 1. The BA scheme was already extensively
studied in [16, 18, 19] in terms of complexity,
system-level scalability, and robustness to fast channel
time-variations / large Doppler spread. Hence, here we
focus only on the difference in time-to-successful BA
required by the two BS architectures under comparison.
We consider a system with a BS using $M = 128$ antennas
and $M_{RF} = 4$ RF chains. The BS simultaneously schedules
$K = M_{RF} = 4$ UEs, each of which uses $N = 16$ antennas
and $N_{RF} = 1$ RF chain. We assume a short preamble
structure used in IEEE 802.11ad [1, 40], where the beacon
slot is of duration $t_0 S = 1.891\ \mu$s. The system is assumed
to work at $f_0 = 40$ GHz with a bandwidth of $B = 0.8$ GHz,
namely, each beacon slot amounts to more than 1500 chips.
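The chip count quoted above can be checked with a one-line calculation. The check assumes that one chip lasts $1/B$ seconds, which is an assumption made here for illustration; $T_{\mathrm{slot}}$ is a shorthand introduced only for the beacon slot duration.

```latex
T_{\mathrm{slot}} \cdot B
  = 1.891 \times 10^{-6}\,\mathrm{s} \times 0.8 \times 10^{9}\,\mathrm{Hz}
  \approx 1513 \;>\; 1500 .
```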
In the following simulations, unless otherwise stated, we
assume a fixed total radiated power constraint $P_{tot}$, where
all the underlying PAs work in their linear range (w.r.t. the
per-antenna power constraint $P_{max}$) with an identical scalar
gain. The MU-MIMO channel is generated in two ways:
1) In Section V-A and Section V-B, we use the channel
model in (1) to generate the channel matrix between
each UE $k$ and the BS. Based on the practical mmWave
MIMO channel measurements in [29], we assume $L_k = 3$,
$k \in [K]$, multipath components for each UE, given by
$(\gamma_{k,1} = 1, \eta_{k,1} = 100)$, $(\gamma_{k,2} = 0.6, \eta_{k,2} = 10)$ and
$(\gamma_{k,3} = 0.4, \eta_{k,3} = 0)$ with respect to (4). Thus, the first
link can be roughly regarded as the LOS path, while the
remaining links represent the NLOS paths. We also assume
that the LOS paths for the simultaneously scheduled UEs
are well separated in the beam domain, while all the NLOS
paths are generated in a random way.
2) In Section V-C, we use the quasi-deterministic radio
channel generator (QuaDRiGa) to generate the propagation
channel matrix. The channel model is based on the
3GPP 38.901 standard and takes spatial consistency into
account [30, 41]. In this case, the height of the BS
antenna array is set to 10 m. The beam center of the
BS points towards the ground with an elevation angle$^{8}$ of
$\alpha_e = -20^\circ$, as shown in Fig. 2 (a). The simultaneously
scheduled UEs are set to 1.5 m in height and 18 ∼ 25 m
horizontally away from the BS, with a downlink AoD
difference of $\Delta\theta_{min} \approx 8^\circ$ [42]. Each UE $k$ moves towards
the BS at a speed of $\Delta v_k = 1$ m/s. We will show
that the numerical results based on our proposed channel
model (1) are consistent with the results based on the
QuaDRiGa generator, implying that the proposed work
provides valuable references for mmWave system design
not only theoretically but also practically.
A. Evaluation of the Proposed Precoding Schemes
The efficiency of the proposed precoding schemes is
illustrated in Fig. 6. As a comparison, we also simulate the
ZF precoder proposed in [26], where the effective channel
is approximated by the initial BA vectors, and only a
single path is selected between each UE and the BS. As
we can see from Fig. 6 (a), for the FC architecture with
no blockage, all the schemes coincide with each other in
the range of $\mathrm{SNR}_{BBF} \le 0$ dB. When $\mathrm{SNR}_{BBF} > 0$ dB,
the performance ranking of the underlying precoding
schemes is as follows: (MR-ZF, $p = 2$) ≈ (MR-ZF, $p = 1$) >
(MRT, $p = 1$) > (MRT, $p = 2$) > (BST) ≈ (ZF in [26]).
Here the MRT scheme with $p = 2$ performs worse than
with $p = 1$ because of power spreading and because,
with multiple receiving directions, the UE tends
to experience more interference. However, this effect is not
observable in the MR-ZF scheme because of the further
power coefficient tuning and interference cancellation
provided by the baseband zeroforcing. Next, when the
blockage probability of the strongest path is increased
while all the weaker paths between each UE and the BS
remain unblocked, as shown in Fig. 6 (b)
and Fig. 6 (c), the curves with $p = 2$ drop much less
than the others (equivalent to $p = 1$), and the MR-ZF
scheme with $p = 2$ achieves the best performance.
For the OSPS architecture, when there is no blockage as
shown in Fig. 6 (d), in the low SNR range ($\mathrm{SNR}_{BBF} \le
-10$ dB), all the curves (roughly) coincide with each
other. For $\mathrm{SNR}_{BBF} > -10$ dB, the
precoding schemes rank (MR-ZF, $p = 2$) ≈ (MR-ZF, $p = 1$) >
(MRT, $p = 1$) ≈ (BST) ≈ (ZF in [26]) > (MRT, $p = 2$).
Similar to the FC case, the MR-ZF scheme for the
OSPS architecture achieves the best performance when
the blockage probability increases, as shown in Fig. 6 (e)
and Fig. 6 (f). As a brief summary w.r.t. the given scenario,
for both architectures, when the channel SNR is weak and
there is no blockage, we claim that the BST scheme is
$^{8}$ In QuaDRiGa, an elevation angle of $90^\circ$ points to the zenith
and $0^\circ$ points to the horizon.
[Figure 6: six panels (a)-(f). Horizontal axis: $\mathrm{SNR}_{BBF}$ (dB), from −30 to 30. Vertical axis: $R_{sum}$ (bit/s/Hz), from 0 to 80. Curves in panels (a)-(c): FC with BST; MRT, $p = 1$; MRT, $p = 2$; MR-ZF, $p = 1$; MR-ZF, $p = 2$; ZF in [26]. Curves in panels (d)-(f): the same six schemes for OSPS.]
Fig. 6: The sum spectral efficiency vs. increasing $\mathrm{SNR}_{BBF}$. The blockage probability of the strongest path is given by (a) 0.0,
(b) 0.3, (c) 0.6 for the FC architecture, and (d) 0.0, (e) 0.3, (f) 0.6 for the OSPS architecture.
than the latter for $P_D \ge 0.95$.
Spectral efficiency for the data communication phase.
To compare the spectral efficiency of the two transmitter
architectures shown in Fig. 1, we consider a no-blockage
scenario and focus on two precoding schemes, i.e., the
simple BST scheme and the high-performance MR-ZF
scheme with $p = 2$. As we can see in Fig. 7 (b), in the
range of $\mathrm{SNR}_{BBF} \le -10$ dB, which is more relevant in
mmWave channels, all four curves coincide with each
other. Namely, for either the MR-ZF scheme or the BST
scheme, the two architectures achieve a rather similar
spectral efficiency. In contrast, when $\mathrm{SNR}_{BBF} > -10$ dB,
the MR-ZF scheme performs better. The two architectures
with the MR-ZF precoding again achieve a rather similar
performance.
Hardware power efficiency. To evaluate the architecture
power efficiency, unless otherwise stated, we consider the
simple BST precoder. Also, since the modulation strongly
affects the power efficiency, we take into account
both SC and OFDM signaling in this section.
We first assume a reference scenario as the baseline, i.e.,
the OSPS architecture using the BST precoder and
SC modulation. We use reference PAs with $P_{max,0} = 6$
dBm and $\eta_{max,0} = 0.3$. The backoff factor with respect
to different waveforms and transmitter architectures can
be written as $\alpha_{off} = 1/P_{PAPR}$, where $P_{PAPR}$ represents
the PAPR of the input signal at a PA. The investigation
for 3GPP LTE in [37] showed that with a probability
of 0.999, the PAPR of the LTE SC waveform (known as
SC-FDMA) is smaller than ∼7.2 dB and the PAPR of the
LTE OFDM waveform (with 512 subcarriers employing
QPSK) is smaller than ∼11.4 dB. We set $P_{PAPR}$ to these
values for the OSPS architecture. For the FC architecture,
however, the input signals of the PAs are the sum of
the signals from different RF chains. Since each OFDM
signal can be modeled as a Gaussian random process [37]
and the signals from different RF chains are independent,
the PAPR of the sum is the same as that of one RF chain.
For the case of SC signaling, there is no clear work in
the literature that shows how the sum of SC signals
behaves. We simulated the sum of $M_{RF} = 4$ SC signals
using the same parameters as in [37]. The result shows
that with a probability of 0.999 the PAPR of the sum is
smaller than ∼9.8 dB. We apply these values and, without
loss of generality, choose $\alpha_{off,0} = -7.2$ dB as the
reference scenario. As shown in (26), by deploying the
same PAs (Option I), the two architectures achieve the
same efficiency for a given $P_{rad}$. However, as illustrated
in Fig. 7 (c), the OSPS architecture with SC signaling
(OSPS, SC) achieves the highest $P_{rad}$, followed by (FC,
SC), (OSPS, OFDM), and (FC, OFDM). In contrast, by
deploying different PAs (Option II)$^{9}$, Fig. 7 (d) shows that
(OSPS, SC) achieves the highest power efficiency, followed
by (FC, SC), (OSPS, OFDM) and (FC, OFDM).
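The back-off values implied by the quoted PAPR figures can be converted from dB to linear scale and compared against the reference scenario with a short script. This is a simplified sketch that uses only the relations $\alpha_{off} = 1/P_{PAPR}$, (26) and (27) with equal maximum PA efficiency; it does not reproduce Fig. 7, and all function and variable names are hypothetical.

```python
import math

# PAPR values in dB quoted in the text for each (architecture, waveform) pair.
# FC with OFDM reuses the single-chain value, as argued in the text.
PAPR_DB = {
    ("OSPS", "SC"): 7.2,
    ("FC", "SC"): 9.8,
    ("OSPS", "OFDM"): 11.4,
    ("FC", "OFDM"): 11.4,
}
REFERENCE = ("OSPS", "SC")  # baseline scenario with alpha_off,0 = -7.2 dB


def backoff_linear(papr_db):
    """Back-off factor alpha_off = 1 / PAPR, with the PAPR given in dB."""
    return 1.0 / (10.0 ** (papr_db / 10.0))


def backoff_ratio(case):
    """alpha_off / alpha_off,0 relative to the reference scenario."""
    return backoff_linear(PAPR_DB[case]) / backoff_linear(PAPR_DB[REFERENCE])


def relative_radiated_power_db(case):
    """Option I: P_rad / P_rad,0 = alpha_off / alpha_off,0, in dB."""
    return 10.0 * math.log10(backoff_ratio(case))


def relative_efficiency(case):
    """Option II with equal eta_max: ratio of eta_eff to the reference value."""
    return math.sqrt(backoff_ratio(case))


if __name__ == "__main__":
    for case in PAPR_DB:
        print(case,
              f"alpha_off={backoff_linear(PAPR_DB[case]):.4f}",
              f"P_rad rel.={relative_radiated_power_db(case):.1f} dB",
              f"eta rel.={relative_efficiency(case):.3f}")
```

With these inputs the relative radiated power under Option I is 0 dB for (OSPS, SC), −2.6 dB for (FC, SC) and −4.2 dB for both OFDM cases, since the text takes the PAPR of the two OFDM cases to be equal.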
To sum up, given the parameters in this paper, the two
architectures achieve a similar sum spectral efficiency with
certain precoders, but the OSPS architecture outperforms
$^{9}$ Since the $\eta_{max}$ of different PAs highly depends on the technology,
for simplicity, we assume that different PAs working in their linear
range have roughly the same maximum efficiency $\eta_{max,0}$.
13
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(a)
FC, BST
FC, MRT, p= 1
FC, MRT, p= 2
FC, MR-ZF, p= 1
FC, MR-ZF, p= 2
FC, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
(b)
FC, BST
FC, MRT, p= 1
FC, MRT, p= 2
FC, MR-ZF, p= 1
FC, MR-ZF, p= 2
FC, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
(c)
FC, BST
FC, MRT, p= 1
FC, MRT, p= 2
FC, MR-ZF, p= 1
FC, MR-ZF, p= 2
FC, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(d)
OSPS, BST
OSPS, MRT, p= 1
OSPS, MRT, p= 2
OSPS, MR-ZF, p=1
OSPS, MR-ZF, p= 2
OSPS, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
(e)
OSPS, BST
OSPS, MRT, p= 1
OSPS, MRT, p= 2
OSPS, MR-ZF, p=1
OSPS, MR-ZF, p= 2
OSPS, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
(f)
OSPS, BST
OSPS, MRT, p= 1
OSPS, MRT, p= 2
OSPS, MR-ZF, p=1
OSPS, MR-ZF, p= 2
OSPS, ZF in [26]
Fig. 6: The sum spectral efficiency vs. increasing SNRBBF. The blockage probability of the strongest path is given by (a) 0.0,
(b) 0.3, (c) 0.6for the FC architecture, and (d) 0.0, (e) 0.3, (f) 0.6for the OSPS architecture.
than the latter for PD≥0.95.
Spectral efficiency for the data communication phase.
To compare the spectral efficiency of the two transmitter
architectures as shown in Fig. 1, we consider a no-blockage
scenario and focus on two precoding schemes, i.e., the
simple BST scheme and the high-performance MR-ZF
scheme with p= 2. As we can see in Fig. 7 (b), in the
range of SNRBBF ≤ −10 dB, which is more relevant in
mmWave channels, all the 4curves coincide with each
other. Namely, for either the MR-ZF scheme or the BST
scheme, the two architectures achieve a rather similar
spectral efficiency. In contrast, when SNRBBF >−10 dB,
the MR-ZF scheme performs better. The two architectures
with the MR-ZF precoding again achieve a rather similar
performance.
Hardware power efficiency. To evaluate the architecture
power efficiency, otherwise stated, we will consider the
simple BST precoder. Also, since the modulation highly
affects the power efficiency, we will take into account
both the SC and the OFDM signaling in this section.
We first assume a reference scenario as the baseline, i.e,
the OSPS architecture using the BST precoder and a
SC modulation. We use reference PAs with Pmax,0= 6
dBm and ηmax,0= 0.3. The backoff factor with respect
to different waveforms and transmitter architectures can
be written as αoff = 1/(PPAPR), where PPAPR represents
the PAPR of the input signal at a PA. The investigation
for 3GPP LTE in [37] showed that with a probability
of 0.999, the PAPR of the LTE SC waveform (known as
SC-FDMA) is smaller than ∼7.2dB and the PAPR of the
LTE OFDM waveform (with 512 subcarriers employing
QPSK) is smaller than ∼11.4dB. We set PPAPR to these
values for the OSPS architecture. For the FC architecture,
however, the input signals of the PAs are the sum of
the signals from different RF chains. Since each OFDM
signal can be modeled as a Gaussian random process [37]
and the signals from different RF chains are independent,
the PAPR of the sum is the same as of one RF chain.
For SC signaling, no work in the literature clearly
shows how the PAPR of a sum of SC signals
behaves. We therefore simulated the sum of MRF = 4 SC signals
using the same parameters as in [37]. The result shows
that, with probability 0.999, the PAPR of the sum is
smaller than ∼9.8 dB. We apply these values and, without
loss of generality, choose αoff,0 = −7.2 dB as the backoff of the
reference scenario. As shown in (26), by deploying the
same PAs (Option I), the two architectures achieve the
same efficiency for a given Prad. However, as illustrated
in Fig. 7(c), the OSPS architecture with SC signaling
(OSPS, SC) achieves the highest Prad, followed by (FC,
SC), (OSPS, OFDM), and (FC, OFDM). In contrast, by
deploying different PAs (Option II; see footnote 9), Fig. 7(d) shows that
(OSPS, SC) achieves the highest power efficiency, followed
by (FC, SC), (OSPS, OFDM), and (FC, OFDM).
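To make the backoff values used in this comparison concrete, the following minimal Python sketch evaluates αoff = 1/PPAPR for the four (architecture, waveform) pairs, using only the PAPR levels quoted in the text. The names PAPR_DB, REFERENCE, backoff_db, backoff_linear, extra_backoff_db and ranked_configs are hypothetical helpers chosen for this illustration. The sketch does not reproduce the efficiency model in (26) or the curves in Fig. 7.

```python
# PAPR levels (dB) not exceeded with probability 0.999, as quoted in the text.
PAPR_DB = {
    ("OSPS", "SC"): 7.2,     # LTE SC-FDMA waveform, from [37]
    ("OSPS", "OFDM"): 11.4,  # LTE OFDM waveform, from [37]
    ("FC", "SC"): 9.8,       # simulated sum of M_RF = 4 SC signals
    ("FC", "OFDM"): 11.4,    # Gaussian model: same as a single RF chain
}
REFERENCE = ("OSPS", "SC")  # baseline scenario of the text


def backoff_db(papr_db: float) -> float:
    """alpha_off = 1 / P_PAPR, expressed in dB."""
    return -papr_db


def backoff_linear(papr_db: float) -> float:
    """alpha_off = 1 / P_PAPR on a linear scale."""
    return 10.0 ** (-papr_db / 10.0)


def extra_backoff_db(config) -> float:
    """Additional backoff (dB) of a configuration relative to the reference."""
    return PAPR_DB[config] - PAPR_DB[REFERENCE]


def ranked_configs():
    """Configurations ordered from smallest to largest required backoff."""
    # sorted() is stable, so configurations with equal PAPR keep dict order.
    return sorted(PAPR_DB, key=PAPR_DB.get)


if __name__ == "__main__":
    for config in ranked_configs():
        arch, waveform = config
        papr = PAPR_DB[config]
        print(f"({arch}, {waveform}): alpha_off = {backoff_db(papr):.1f} dB "
              f"({backoff_linear(papr):.3f} linear), "
              f"{extra_backoff_db(config):.1f} dB beyond the reference")
```

Sorting by backoff alone gives the same order in which the text lists the configurations, with the two OFDM cases tied on this quantity.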
To sum up, given the parameters in this paper, the two
architectures achieve a similar sum spectral efficiency with
certain precoders, but the OSPS architecture outperforms
9 Since the ηmax of a PA depends strongly on the technology,
for simplicity we assume that different PAs working in their linear
range have roughly the same maximum efficiency ηmax,0.
[Figure 6: six panels, (a) to (f), each plotting the sum spectral efficiency Rsum (bit/s/Hz, 0 to 80) against SNRBBF (dB, −30 to 30). Curves in each panel: BST; MRT with p = 1 and p = 2; MR-ZF with p = 1 and p = 2; ZF in [26].]
Fig. 6: The sum spectral efficiency vs. increasing SNRBBF. The blockage probability of the strongest path is given by (a) 0.0,
(b) 0.3, (c) 0.6 for the FC architecture, and (d) 0.0, (e) 0.3, (f) 0.6 for the OSPS architecture.
Hardware power efficiency. To evaluate the architecture
power efficiency, otherwise stated, we will consider the
simple BST precoder. Also, since the modulation highly
affects the power efficiency, we will take into account
both the SC and the OFDM signaling in this section.
We first assume a reference scenario as the baseline, i.e,
the OSPS architecture using the BST precoder and a
SC modulation. We use reference PAs with Pmax,0= 6
dBm and ηmax,0= 0.3. The backoff factor with respect
to different waveforms and transmitter architectures can
be written as αoff = 1/(PPAPR), where PPAPR represents
the PAPR of the input signal at a PA. The investigation
for 3GPP LTE in [37] showed that with a probability
of 0.999, the PAPR of the LTE SC waveform (known as
SC-FDMA) is smaller than ∼7.2dB and the PAPR of the
LTE OFDM waveform (with 512 subcarriers employing
QPSK) is smaller than ∼11.4dB. We set PPAPR to these
values for the OSPS architecture. For the FC architecture,
however, the input signals of the PAs are the sum of
the signals from different RF chains. Since each OFDM
signal can be modeled as a Gaussian random process [37]
and the signals from different RF chains are independent,
the PAPR of the sum is the same as of one RF chain.
For the case of SC signaling, there is no clear work in
the literature that shows how the sum of SC signals
behaves. We simulated the sum of MRF = 4 SC signals
using the same parameters as in [37]. The result shows
that with probability of 0.999 the PAPR of the sum is
smaller than ∼9.8dB. We apply these values and without
loss of generality, we choose αoff,0=−7.2dB as the
reference scenario. As shown in (26), by deploying the
same PAs (Option I), the two architectures achieve the
same efficiency for a given Prad. However, as illustrated
in Fig. 7 (c), the OSPS architecture with SC signaling
(OSPS, SC) achieves the highest Prad, followed by (FC,
SC), (OSPS, OFDM), and (FC, OFDM). In contrast, by
deploying different PAs (Option II)9, Fig. 7 (d) shows that
(OSPS, SC) achieves the highest power efficiency, followed
by (FC, SC), (OSPS, OFDM) and (FC, OFDM).
To sum up, given the parameters in this paper, the two
architectures achieve a similar sum spectral efficiency with
certain precoders, but the OSPS architecture outperforms
9Since the ηmax of different PAs highly depends on the technology,
for simplicity, we assume that different PAs working in their linear
range have roughly the same maximum efficiency ηmax,0.
13
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(a)
FC, BST
FC, MRT, p= 1
FC, MRT, p= 2
FC, MR-ZF, p= 1
FC, MR-ZF, p= 2
FC, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
(b)
FC, BST
FC, MRT, p= 1
FC, MRT, p= 2
FC, MR-ZF, p= 1
FC, MR-ZF, p= 2
FC, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
(c)
FC, BST
FC, MRT, p= 1
FC, MRT, p= 2
FC, MR-ZF, p= 1
FC, MR-ZF, p= 2
FC, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(d)
OSPS, BST
OSPS, MRT, p= 1
OSPS, MRT, p= 2
OSPS, MR-ZF, p=1
OSPS, MR-ZF, p= 2
OSPS, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
(e)
OSPS, BST
OSPS, MRT, p= 1
OSPS, MRT, p= 2
OSPS, MR-ZF, p=1
OSPS, MR-ZF, p= 2
OSPS, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
(f)
OSPS, BST
OSPS, MRT, p= 1
OSPS, MRT, p= 2
OSPS, MR-ZF, p=1
OSPS, MR-ZF, p= 2
OSPS, ZF in [26]
Fig. 6: The sum spectral efficiency vs. increasing SNRBBF. The blockage probability of the strongest path is given by (a) 0.0,
(b) 0.3, (c) 0.6for the FC architecture, and (d) 0.0, (e) 0.3, (f) 0.6for the OSPS architecture.
than the latter for PD≥0.95.
Spectral efficiency for the data communication phase.
To compare the spectral efficiency of the two transmitter
architectures as shown in Fig. 1, we consider a no-blockage
scenario and focus on two precoding schemes, i.e., the
simple BST scheme and the high-performance MR-ZF
scheme with p= 2. As we can see in Fig. 7 (b), in the
range of SNRBBF ≤ −10 dB, which is more relevant in
mmWave channels, all the 4curves coincide with each
other. Namely, for either the MR-ZF scheme or the BST
scheme, the two architectures achieve a rather similar
spectral efficiency. In contrast, when SNRBBF >−10 dB,
the MR-ZF scheme performs better. The two architectures
with the MR-ZF precoding again achieve a rather similar
performance.
Hardware power efficiency. To evaluate the architecture
power efficiency, otherwise stated, we will consider the
simple BST precoder. Also, since the modulation highly
affects the power efficiency, we will take into account
both the SC and the OFDM signaling in this section.
We first assume a reference scenario as the baseline, i.e,
the OSPS architecture using the BST precoder and a
SC modulation. We use reference PAs with Pmax,0= 6
dBm and ηmax,0= 0.3. The backoff factor with respect
to different waveforms and transmitter architectures can
be written as αoff = 1/(PPAPR), where PPAPR represents
the PAPR of the input signal at a PA. The investigation
for 3GPP LTE in [37] showed that with a probability
of 0.999, the PAPR of the LTE SC waveform (known as
SC-FDMA) is smaller than ∼7.2dB and the PAPR of the
LTE OFDM waveform (with 512 subcarriers employing
QPSK) is smaller than ∼11.4dB. We set PPAPR to these
values for the OSPS architecture. For the FC architecture,
however, the input signals of the PAs are the sum of
the signals from different RF chains. Since each OFDM
signal can be modeled as a Gaussian random process [37]
and the signals from different RF chains are independent,
the PAPR of the sum is the same as of one RF chain.
For the case of SC signaling, there is no clear work in
the literature that shows how the sum of SC signals
behaves. We simulated the sum of MRF = 4 SC signals
using the same parameters as in [37]. The result shows
that with probability of 0.999 the PAPR of the sum is
smaller than ∼9.8dB. We apply these values and without
loss of generality, we choose αoff,0=−7.2dB as the
reference scenario. As shown in (26), by deploying the
same PAs (Option I), the two architectures achieve the
same efficiency for a given Prad. However, as illustrated
in Fig. 7 (c), the OSPS architecture with SC signaling
(OSPS, SC) achieves the highest Prad, followed by (FC,
SC), (OSPS, OFDM), and (FC, OFDM). In contrast, by
deploying different PAs (Option II)9, Fig. 7 (d) shows that
(OSPS, SC) achieves the highest power efficiency, followed
by (FC, SC), (OSPS, OFDM) and (FC, OFDM).
To sum up, given the parameters in this paper, the two
architectures achieve a similar sum spectral efficiency with
certain precoders, but the OSPS architecture outperforms
9Since the ηmax of different PAs highly depends on the technology,
for simplicity, we assume that different PAs working in their linear
range have roughly the same maximum efficiency ηmax,0.
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(d)
OSPS, BST
OSPS, MRT, p= 1
OSPS, MRT, p= 2
OSPS, MR-ZF, p=1
OSPS, MR-ZF, p=2
OSPS, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
(e)
OSPS, BST
OSPS, MRT, p= 1
OSPS, MRT, p= 2
OSPS, MR-ZF, p=1
OSPS, MR-ZF, p=2
OSPS, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
(f)
OSPS, BST
OSPS, MRT, p= 1
OSPS, MRT, p= 2
OSPS, MR-ZF, p=1
OSPS, MR-ZF, p=2
OSPS, ZF in [26]
Fig. 6: The sum spectral efficiency vs. increasing SNRBBF. The blockage probability of the strongest path is given by (a) 0.0,
(b) 0.3, (c) 0.6for the FC architecture, and (d) 0.0, (e) 0.3, (f) 0.6for the OSPS architecture.
than the latter for PD≥0.95.
Spectral efficiency for the data communication phase.
To compare the spectral efficiency of the two transmitter
architectures as shown in Fig. 1, we consider a no-blockage
scenario and focus on two precoding schemes, i.e., the
simple BST scheme and the high-performance MR-ZF
scheme with p= 2. As we can see in Fig. 7 (b), in the
range of SNRBBF ≤ −10 dB, which is more relevant in
mmWave channels, all the 4curves coincide with each
other. Namely, for either the MR-ZF scheme or the BST
scheme, the two architectures achieve a rather similar
spectral efficiency. In contrast, when SNRBBF >−10 dB,
the MR-ZF scheme performs better. The two architectures
with the MR-ZF precoding again achieve a rather similar
performance.
Hardware power efficiency. To evaluate the architecture
power efficiency, otherwise stated, we will consider the
simple BST precoder. Also, since the modulation highly
affects the power efficiency, we will take into account
both the SC and the OFDM signaling in this section.
We first assume a reference scenario as the baseline, i.e,
the OSPS architecture using the BST precoder and a
SC modulation. We use reference PAs with Pmax,0= 6
dBm and ηmax,0= 0.3. The backoff factor with respect
to different waveforms and transmitter architectures can
be written as αoff = 1/(PPAPR), where PPAPR represents
the PAPR of the input signal at a PA. The investigation
for 3GPP LTE in [37] showed that with a probability
of 0.999, the PAPR of the LTE SC waveform (known as
SC-FDMA) is smaller than ∼7.2dB and the PAPR of the
LTE OFDM waveform (with 512 subcarriers employing
QPSK) is smaller than ∼11.4dB. We set PPAPR to these
values for the OSPS architecture. For the FC architecture,
however, the input signals of the PAs are the sum of
the signals from different RF chains. Since each OFDM
signal can be modeled as a Gaussian random process [37]
and the signals from different RF chains are independent,
the PAPR of the sum is the same as of one RF chain.
For the case of SC signaling, there is no clear work in
the literature that shows how the sum of SC signals
behaves. We simulated the sum of MRF = 4 SC signals
using the same parameters as in [37]. The result shows
that with probability of 0.999 the PAPR of the sum is
smaller than ∼9.8dB. We apply these values and without
loss of generality, we choose αoff,0=−7.2dB as the
reference scenario. As shown in (26), by deploying the
same PAs (Option I), the two architectures achieve the
same efficiency for a given Prad. However, as illustrated
in Fig. 7 (c), the OSPS architecture with SC signaling
(OSPS, SC) achieves the highest Prad, followed by (FC,
SC), (OSPS, OFDM), and (FC, OFDM). In contrast, by
deploying different PAs (Option II)9, Fig. 7 (d) shows that
(OSPS, SC) achieves the highest power efficiency, followed
by (FC, SC), (OSPS, OFDM) and (FC, OFDM).
To sum up, given the parameters in this paper, the two
architectures achieve a similar sum spectral efficiency with
certain precoders, but the OSPS architecture outperforms
9Since the ηmax of different PAs highly depends on the technology,
for simplicity, we assume that different PAs working in their linear
range have roughly the same maximum efficiency ηmax,0.
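Fig. 6: The sum spectral efficiency $R_{\mathrm{sum}}$ (bit/s/Hz) vs. increasing $\mathrm{SNR}_{\mathrm{BBF}}$ (dB).
The blockage probability of the strongest path is given by (a) 0.0, (b) 0.3, (c) 0.6 for the FC
architecture, and (d) 0.0, (e) 0.3, (f) 0.6 for the OSPS architecture. Each panel compares the BST,
MRT ($p = 1, 2$), and MR-ZF ($p = 1, 2$) precoders with the ZF scheme in [26].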
preferred since it is simple yet adequate to achieve
good performance. However, when the channel SNR is
not too low or there are potential blockages, the MR-ZF
scheme with $p > 1$ outperforms the other schemes. As
a side note, in a practical implementation $p$ should not
be chosen too large, since it governs a trade-off between
blockage robustness, power spreading, and the overhead for
additional channel estimation.
B. Fully-Connected (FC) or One-Stream-Per-Subarray (OSPS)?
Note that the performance of the different architectures
depends strongly on the channel conditions and the underlying
precoders. For the scenario considered in this paper,
we jointly evaluate the architecture performance in three
aspects:
Training efficiency for the initial BA phase. Let $P_D$
denote the detection probability, i.e., the probability of
finding the strongest AoA-AoD pair between the BS and a
generic UE. The BA results are illustrated in Fig. 7 (a). For
comparison, we also simulate a recent time-domain BA
algorithm proposed in [43], which estimates the
instantaneous channel coefficients with an orthogonal
matching pursuit (OMP) technique. As we can see, the
proposed BA scheme requires much less training overhead
than that in [43]. In addition, because the OSPS
architecture has lower angular resolution and suffers
larger sidelobe power leakage than the FC case, the former
requires roughly 10 more beacon slots than the latter
to reach $P_D \geq 0.95$.
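As a reading aid for this comparison, the following minimal Python sketch shows how a detection probability can be estimated empirically over repeated trials, and how the training overhead needed for a target $P_D$ can be read off a curve. The function names and the sample curves are hypothetical illustrations; they are neither the BA algorithm of this paper nor the data of Fig. 7 (a).

```python
from typing import Optional, Sequence


def detection_probability(true_pairs: Sequence[tuple],
                          estimated_pairs: Sequence[tuple]) -> float:
    """Fraction of trials in which the estimated strongest AoA-AoD pair
    equals the true one."""
    if len(true_pairs) != len(estimated_pairs) or not true_pairs:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(t == e for t, e in zip(true_pairs, estimated_pairs))
    return hits / len(true_pairs)


def min_slots_for_target(slots: Sequence[int],
                         pd_values: Sequence[float],
                         target: float = 0.95) -> Optional[int]:
    """Smallest number of beacon slots T whose P_D reaches the target.
    Assumes `slots` is sorted in increasing order; returns None if the
    target is never reached."""
    for t, pd in zip(slots, pd_values):
        if pd >= target:
            return t
    return None


# Made-up curves for illustration only (NOT values from Fig. 7 (a)).
slots = [20, 40, 60, 80, 100]
pd_arch_a = [0.30, 0.80, 0.96, 0.99, 1.00]
pd_arch_b = [0.20, 0.65, 0.90, 0.97, 1.00]

extra = min_slots_for_target(slots, pd_arch_b) - min_slots_for_target(slots, pd_arch_a)
print(extra)  # 20
```

The resolution of such a comparison is limited by the grid of simulated slot counts, so a gap of about 10 slots can only be resolved if the curves are sampled at least that finely.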
Spectral efficiency for the data communication phase.
To compare the spectral efficiency of the two transmitter
architectures shown in Fig. 1, we consider a no-blockage
scenario and focus on two precoding schemes, i.e., the
simple BST scheme and the high-performance MR-ZF
scheme with $p = 2$. As we can see in Fig. 7 (b), in the range
$\mathrm{SNR}_{\mathrm{BBF}} \leq -10$ dB, which is more relevant in mmWave
channels, all four curves coincide. Namely,
for either the MR-ZF scheme or the BST scheme, the two
architectures achieve a very similar spectral efficiency. In
contrast, when $\mathrm{SNR}_{\mathrm{BBF}} > -10$ dB, the MR-ZF scheme
outperforms the BST scheme, and the two architectures with
MR-ZF precoding again achieve very similar performance.
Hardware power efficiency. To evaluate the architecture
power efficiency, otherwise stated, we will consider the
simple BST precoder. Also, since the modulation highly
affects the power efficiency, we will take into account
both the SC and the OFDM signaling in this section.
We first assume a reference scenario as the baseline, i.e,
the OSPS architecture using the BST precoder and a SC
modulation. We use reference PAs with Pmax,0= 6 dBm
and ηmax,0= 0.3. The backoff factor with respect to
different waveforms and transmitter architectures can be
written as αoff = 1/(PPAPR), where PPAPR represents the
PAPR of the input signal at a PA. The investigation
5. Data Communication for mmWave Multi-User MIMO 71
13
20 40 60 80 100 120 140 160
0.2
0.4
0.6
0.8
1
Number of beacon slots T
PD
(a)
FC, NNLS
OSPS, NNLS
FC, OMP [43]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(b)
FC, BST
FC, MR-ZF p=2
OSPS, BST
OSPS, MR-ZF p=2
−2 0 2 4 6
−5
0
5
10
Prad,0(dBm)
Prad (dBm)
(c)
FC, SC, αoff=−9.8dB
FC, OFDM, αoff =−11.4dB
OSPS, SC, αoff=−7.2dB
OSPS, OFDM, αoff=−11.4dB
−6−4−20246
0.1
0.15
0.2
0.25
0.3
Prad (dBm)
ηeff
(d)
FC, SC
FC, OFDM
OSPS, SC
OSPS, OFDM
Fig. 7: The performance comparison of different transmitter architectures. (a) The initial BA detection probability vs. the training overhead
with SNRBBF =−19 dB. (b) The sum spectral efficiency vs. increasing SNRBBF, without blockage. (c) The actual radiated power under
Option I vs. the radiated power of the reference scenario. (d) The power efficiency under Option II vs. the actual radiated power.
achieve a rather similar spectral efficiency. In contrast, when
SNRBBF >−10 dB, the MR-ZF scheme performs better. The
two architectures with the MR-ZF precoding again achieve a
rather similar performance.
Hardware power efficiency. To evaluate the architecture
power efficiency, otherwise stated, we will consider the simple
BST precoder. Also, since the modulation highly affects the
power efficiency, we will take into account both the SC and the
OFDM signaling in this section. We first assume a reference
scenario as the baseline, i.e, the OSPS architecture using
the BST precoder and a SC modulation. We use reference
PAs with Pmax,0= 6 dBm and ηmax,0= 0.3. The backoff
factor with respect to different waveforms and transmitter
architectures can be written as αoff = 1/(PPAPR), where
PPAPR represents the PAPR of the input signal at a PA.
The investigation for 3GPP LTE in [37] showed that with
a probability of 0.999, the PAPR of the LTE SC waveform
(known as SC-FDMA) is smaller than ∼7.2dB and the PAPR
of the LTE OFDM waveform (with 512 subcarriers employing
QPSK) is smaller than ∼11.4dB. We set PPAPR to these
values for the OSPS architecture. For the FC architecture,
however, the input signals of the PAs are the sum of the
signals from different RF chains. Since each OFDM signal
can be modeled as a Gaussian random process [37] and the
signals from different RF chains are independent, the PAPR
of the sum is the same as of one RF chain. For the case of SC
signaling, there is no clear work in the literature that shows
how the sum of SC signals behaves. We simulated the sum
of MRF = 4 SC signals using the same parameters as in [37].
The result shows that with probability of 0.999 the PAPR of
the sum is smaller than ∼9.8dB. We apply these values and
without loss of generality, we choose αoff,0=−7.2dB as
the reference scenario. As shown in (26), by deploying the
same PAs (Option I), the two architectures achieve the same
efficiency for a given Prad. However, as illustrated in Fig. 7 (c),
the OSPS architecture with SC signaling (OSPS, SC) achieves
the highest Prad, followed by (FC, SC), (OSPS, OFDM), and
(FC, OFDM). In contrast, by deploying different PAs (Option
II)9, Fig. 7 (d) shows that (OSPS, SC) achieves the highest
power efficiency, followed by (FC, SC), (OSPS, OFDM) and
(FC, OFDM).
To sum up, given the parameters in this paper, the two
architectures achieve a similar sum spectral efficiency with
certain precoders, but the OSPS architecture outperforms the
FC case in terms of hardware complexity and power efficiency,
only at the cost of a slightly longer latency for the initial BA.
C. Simulations Based on QuaDRiGa
In this section, we resort to the 3D geometry based channel
generator QuaDRiGa [30] to show that our numerical results
are quite consistent with practical mmWave communication
channels.10 More precisely, we apply our BA and precoding
schemes over ∼3×105channel snapshots generated by
QuaDRiGa. These channel snapshots correspond to a short
9Since the ηmax of different PAs highly depends on the technology, for
simplicity, we assume that different PAs working in their linear range have
roughly the same maximum efficiency ηmax,0.
10Due to the QuaDRiGa generator limits, only the no-blockage scenario is
considered in this section.
13
20 40 60 80 100 120 140 160
0.2
0.4
0.6
0.8
1
Number of beacon slots T
PD
(a)
FC, NNLS
OSPS, NNLS
FC, OMP [43]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(b)
FC, BST
FC, MR-ZF p=2
OSPS, BST
OSPS, MR-ZF p=2
−2 0 2 4 6
−5
0
5
10
Prad,0(dBm)
Prad (dBm)
(c)
FC, SC, αoff=−9.8dB
FC, OFDM, αoff =−11.4dB
OSPS, SC, αoff=−7.2dB
OSPS, OFDM, αoff=−11.4dB
−6−4−20246
0.1
0.15
0.2
0.25
0.3
Prad (dBm)
ηeff
(d)
FC, SC
FC, OFDM
OSPS, SC
OSPS, OFDM
Fig. 7: The performance comparison of different transmitter architectures. (a) The initial BA detection probability vs. the training overhead
with SNRBBF =−19 dB. (b) The sum spectral efficiency vs. increasing SNRBBF, without blockage. (c) The actual radiated power under
Option I vs. the radiated power of the reference scenario. (d) The power efficiency under Option II vs. the actual radiated power.
achieve a rather similar spectral efficiency. In contrast, when
SNRBBF >−10 dB, the MR-ZF scheme performs better. The
two architectures with the MR-ZF precoding again achieve a
rather similar performance.
Hardware power efficiency. To evaluate the architecture
power efficiency, otherwise stated, we will consider the simple
BST precoder. Also, since the modulation highly affects the
power efficiency, we will take into account both the SC and the
OFDM signaling in this section. We first assume a reference
scenario as the baseline, i.e, the OSPS architecture using
the BST precoder and a SC modulation. We use reference
PAs with Pmax,0= 6 dBm and ηmax,0= 0.3. The backoff
factor with respect to different waveforms and transmitter
architectures can be written as αoff = 1/(PPAPR), where
PPAPR represents the PAPR of the input signal at a PA.
The investigation for 3GPP LTE in [37] showed that with
a probability of 0.999, the PAPR of the LTE SC waveform
(known as SC-FDMA) is smaller than ∼7.2dB and the PAPR
of the LTE OFDM waveform (with 512 subcarriers employing
QPSK) is smaller than ∼11.4dB. We set PPAPR to these
values for the OSPS architecture. For the FC architecture,
however, the input signals of the PAs are the sum of the
signals from different RF chains. Since each OFDM signal
can be modeled as a Gaussian random process [37] and the
signals from different RF chains are independent, the PAPR
of the sum is the same as of one RF chain. For the case of SC
signaling, there is no clear work in the literature that shows
how the sum of SC signals behaves. We simulated the sum
of MRF = 4 SC signals using the same parameters as in [37].
The result shows that with probability of 0.999 the PAPR of
the sum is smaller than ∼9.8dB. We apply these values and
without loss of generality, we choose αoff,0=−7.2dB as
the reference scenario. As shown in (26), by deploying the
same PAs (Option I), the two architectures achieve the same
efficiency for a given Prad. However, as illustrated in Fig. 7 (c),
the OSPS architecture with SC signaling (OSPS, SC) achieves
the highest Prad, followed by (FC, SC), (OSPS, OFDM), and
(FC, OFDM). In contrast, by deploying different PAs (Option
II)9, Fig. 7 (d) shows that (OSPS, SC) achieves the highest
power efficiency, followed by (FC, SC), (OSPS, OFDM) and
(FC, OFDM).
To sum up, given the parameters in this paper, the two
architectures achieve a similar sum spectral efficiency with
certain precoders, but the OSPS architecture outperforms the
FC case in terms of hardware complexity and power efficiency,
only at the cost of a slightly longer latency for the initial BA.
C. Simulations Based on QuaDRiGa
In this section, we resort to the 3D geometry based channel
generator QuaDRiGa [30] to show that our numerical results
are quite consistent with practical mmWave communication
channels.10 More precisely, we apply our BA and precoding
schemes over ∼3×105channel snapshots generated by
QuaDRiGa. These channel snapshots correspond to a short
9Since the ηmax of different PAs highly depends on the technology, for
simplicity, we assume that different PAs working in their linear range have
roughly the same maximum efficiency ηmax,0.
10Due to the QuaDRiGa generator limits, only the no-blockage scenario is
considered in this section.
13
20 40 60 80 100 120 140 160
0.2
0.4
0.6
0.8
1
Number of beacon slots T
PD
(a)
FC, NNLS
OSPS, NNLS
FC, OMP [43]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(b)
FC, BST
FC, MR-ZF p=2
OSPS, BST
OSPS, MR-ZF p=2
−2 0 2 4 6
−5
0
5
10
Prad,0(dBm)
Prad (dBm)
(c)
FC, SC, αoff=−9.8dB
FC, OFDM, αoff =−11.4dB
OSPS, SC, αoff=−7.2dB
OSPS, OFDM, αoff=−11.4dB
−6−4−20246
0.1
0.15
0.2
0.25
0.3
Prad (dBm)
ηeff
(d)
FC, SC
FC, OFDM
OSPS, SC
OSPS, OFDM
Fig. 7: The performance comparison of different transmitter architectures. (a) The initial BA detection probability vs. the training overhead
with SNRBBF =−19 dB. (b) The sum spectral efficiency vs. increasing SNRBBF, without blockage. (c) The actual radiated power under
Option I vs. the radiated power of the reference scenario. (d) The power efficiency under Option II vs. the actual radiated power.
achieve a rather similar spectral efficiency. In contrast, when
SNRBBF >−10 dB, the MR-ZF scheme performs better. The
two architectures with the MR-ZF precoding again achieve a
rather similar performance.
Hardware power efficiency. To evaluate the architecture
power efficiency, otherwise stated, we will consider the simple
BST precoder. Also, since the modulation highly affects the
power efficiency, we will take into account both the SC and the
OFDM signaling in this section. We first assume a reference
scenario as the baseline, i.e, the OSPS architecture using
the BST precoder and a SC modulation. We use reference
PAs with Pmax,0= 6 dBm and ηmax,0= 0.3. The backoff
factor with respect to different waveforms and transmitter
architectures can be written as αoff = 1/(PPAPR), where
PPAPR represents the PAPR of the input signal at a PA.
The investigation for 3GPP LTE in [37] showed that with
a probability of 0.999, the PAPR of the LTE SC waveform
(known as SC-FDMA) is smaller than ∼7.2dB and the PAPR
of the LTE OFDM waveform (with 512 subcarriers employing
QPSK) is smaller than ∼11.4dB. We set PPAPR to these
values for the OSPS architecture. For the FC architecture,
however, the input signals of the PAs are the sum of the
signals from different RF chains. Since each OFDM signal
can be modeled as a Gaussian random process [37] and the
signals from different RF chains are independent, the PAPR
of the sum is the same as of one RF chain. For the case of SC
signaling, there is no clear work in the literature that shows
how the sum of SC signals behaves. We simulated the sum
of MRF = 4 SC signals using the same parameters as in [37].
The result shows that with probability of 0.999 the PAPR of
the sum is smaller than ∼9.8dB. We apply these values and
without loss of generality, we choose αoff,0=−7.2dB as
the reference scenario. As shown in (26), by deploying the
same PAs (Option I), the two architectures achieve the same
efficiency for a given Prad. However, as illustrated in Fig. 7 (c),
the OSPS architecture with SC signaling (OSPS, SC) achieves
the highest Prad, followed by (FC, SC), (OSPS, OFDM), and
(FC, OFDM). In contrast, by deploying different PAs (Option
II)9, Fig. 7 (d) shows that (OSPS, SC) achieves the highest
power efficiency, followed by (FC, SC), (OSPS, OFDM) and
(FC, OFDM).
To sum up, given the parameters in this paper, the two
architectures achieve a similar sum spectral efficiency with
certain precoders, but the OSPS architecture outperforms the
FC case in terms of hardware complexity and power efficiency,
only at the cost of a slightly longer latency for the initial BA.
C. Simulations Based on QuaDRiGa
In this section, we resort to the 3D geometry based channel
generator QuaDRiGa [30] to show that our numerical results
are quite consistent with practical mmWave communication
channels.10 More precisely, we apply our BA and precoding
schemes over ∼3×105channel snapshots generated by
QuaDRiGa. These channel snapshots correspond to a short
9Since the ηmax of different PAs highly depends on the technology, for
simplicity, we assume that different PAs working in their linear range have
roughly the same maximum efficiency ηmax,0.
10Due to the QuaDRiGa generator limits, only the no-blockage scenario is
considered in this section.
13
20 40 60 80 100 120 140 160
0.2
0.4
0.6
0.8
1
Number of beacon slots T
PD
(a)
FC, NNLS
OSPS, NNLS
FC, OMP [43]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(b)
FC, BST
FC, MR-ZF p=2
OSPS, BST
OSPS, MR-ZF p=2
−2 0 2 4 6
−5
0
5
10
Prad,0(dBm)
Prad (dBm)
(c)
FC, SC, αoff=−9.8dB
FC, OFDM, αoff =−11.4dB
OSPS, SC, αoff=−7.2dB
OSPS, OFDM, αoff=−11.4dB
−6−4−20246
0.1
0.15
0.2
0.25
0.3
Prad (dBm)
ηeff
(d)
FC, SC
FC, OFDM
OSPS, SC
OSPS, OFDM
Fig. 7: The performance comparison of different transmitter architectures. (a) The initial BA detection probability vs. the training overhead
with SNRBBF =−19 dB. (b) The sum spectral efficiency vs. increasing SNRBBF, without blockage. (c) The actual radiated power under
Option I vs. the radiated power of the reference scenario. (d) The power efficiency under Option II vs. the actual radiated power.
achieve a rather similar spectral efficiency. In contrast, when
SNRBBF >−10 dB, the MR-ZF scheme performs better. The
two architectures with the MR-ZF precoding again achieve a
rather similar performance.
Hardware power efficiency. To evaluate the architecture
power efficiency, otherwise stated, we will consider the simple
BST precoder. Also, since the modulation highly affects the
power efficiency, we will take into account both the SC and the
OFDM signaling in this section. We first assume a reference
scenario as the baseline, i.e, the OSPS architecture using
the BST precoder and a SC modulation. We use reference
PAs with Pmax,0= 6 dBm and ηmax,0= 0.3. The backoff
factor with respect to different waveforms and transmitter
architectures can be written as αoff = 1/(PPAPR), where
PPAPR represents the PAPR of the input signal at a PA.
The investigation for 3GPP LTE in [37] showed that with
a probability of 0.999, the PAPR of the LTE SC waveform
(known as SC-FDMA) is smaller than ∼7.2dB and the PAPR
of the LTE OFDM waveform (with 512 subcarriers employing
QPSK) is smaller than ∼11.4dB. We set PPAPR to these
values for the OSPS architecture. For the FC architecture,
however, the input signals of the PAs are the sum of the
signals from different RF chains. Since each OFDM signal
can be modeled as a Gaussian random process [37] and the
signals from different RF chains are independent, the PAPR
of the sum is the same as of one RF chain. For the case of SC
signaling, there is no clear work in the literature that shows
how the sum of SC signals behaves. We simulated the sum
of MRF = 4 SC signals using the same parameters as in [37].
The result shows that with probability of 0.999 the PAPR of
the sum is smaller than ∼9.8dB. We apply these values and
without loss of generality, we choose αoff,0=−7.2dB as
the reference scenario. As shown in (26), by deploying the
same PAs (Option I), the two architectures achieve the same
efficiency for a given Prad. However, as illustrated in Fig. 7 (c),
the OSPS architecture with SC signaling (OSPS, SC) achieves
the highest Prad, followed by (FC, SC), (OSPS, OFDM), and
(FC, OFDM). In contrast, by deploying different PAs (Option
II)9, Fig. 7 (d) shows that (OSPS, SC) achieves the highest
power efficiency, followed by (FC, SC), (OSPS, OFDM) and
(FC, OFDM).
To sum up, given the parameters in this paper, the two
architectures achieve a similar sum spectral efficiency with
certain precoders, but the OSPS architecture outperforms the
FC case in terms of hardware complexity and power efficiency,
only at the cost of a slightly longer latency for the initial BA.
C. Simulations Based on QuaDRiGa
In this section, we resort to the 3D geometry based channel
generator QuaDRiGa [30] to show that our numerical results
are quite consistent with practical mmWave communication
channels.10 More precisely, we apply our BA and precoding
schemes over ∼3×105channel snapshots generated by
QuaDRiGa. These channel snapshots correspond to a short
9Since the ηmax of different PAs highly depends on the technology, for
simplicity, we assume that different PAs working in their linear range have
roughly the same maximum efficiency ηmax,0.
10Due to the QuaDRiGa generator limits, only the no-blockage scenario is
considered in this section.
Fig. 7: The performance comparison of different transmitter architectures. (a) The initial BA detection probability vs. the training
overhead with SNRBBF =−19 dB. (b) The sum spectral efficiency vs. increasing SNRBBF, without blockage. (c) The actual radiated
power under Option I vs. the radiated power of the reference scenario. (d) The power efficiency under Option II vs. the actual radiated
power.
for 3GPP LTE in [37] showed that, with a probability of
0.999, the PAPR of the LTE SC waveform (known as
SC-FDMA) is smaller than ∼7.2 dB and the PAPR of the
LTE OFDM waveform (with 512 subcarriers employing
QPSK) is smaller than ∼11.4 dB. We set PPAPR to these
values for the OSPS architecture. For the FC architecture,
however, the input signals of the PAs are the sum of
the signals from different RF chains. Since each OFDM
signal can be modeled as a Gaussian random process [37]
and the signals from different RF chains are independent,
the PAPR of the sum is the same as that of a single RF chain.
For SC signaling, to our knowledge, no work in the
literature clearly shows how the sum of SC signals
behaves. We therefore simulated the sum of MRF = 4 SC signals
using the same parameters as in [37]. The result shows
that, with a probability of 0.999, the PAPR of the sum is
smaller than ∼9.8 dB. We apply these values and, without
loss of generality, choose αoff,0 = −7.2 dB as the
reference scenario. As shown in (26), by deploying the
same PAs (Option I), the two architectures achieve the
same efficiency for a given Prad. However, as illustrated
in Fig. 7 (c), the OSPS architecture with SC signaling
(OSPS, SC) achieves the highest Prad, followed by (FC,
SC), (OSPS, OFDM), and (FC, OFDM). In contrast, by
deploying different PAs (Option II; see footnote 9), Fig. 7 (d) shows that
(OSPS, SC) achieves the highest power efficiency, followed
by (FC, SC), (OSPS, OFDM), and (FC, OFDM).
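The Gaussian argument above for OFDM can be checked numerically. The following is a minimal sketch, not the simulation used in this paper: it assumes Nyquist-rate sampling without pulse shaping or oversampling, QPSK on all 512 subcarriers, and independent data on each RF chain. The function names and the number of simulated symbols are illustrative choices. It estimates the PAPR level that is not exceeded with probability 0.999, both for a single OFDM signal and for the sum of MRF = 4 independent OFDM signals.

```python
import numpy as np


def ofdm_symbols(rng, num_symbols, num_subcarriers):
    """Time-domain OFDM symbols with i.i.d. QPSK on every subcarrier."""
    bits = rng.integers(0, 2, size=(2, num_symbols, num_subcarriers))
    qpsk = ((2 * bits[0] - 1) + 1j * (2 * bits[1] - 1)) / np.sqrt(2)
    # Unitary IFFT, so the average power per sample stays equal to 1.
    return np.fft.ifft(qpsk, axis=1) * np.sqrt(num_subcarriers)


def papr_db(x):
    """Per-symbol peak-to-average power ratio in dB (one value per row)."""
    power = np.abs(x) ** 2
    return 10 * np.log10(power.max(axis=1) / power.mean(axis=1))


def papr_quantile_db(num_chains, num_symbols=4000, num_subcarriers=512,
                     prob=0.999, seed=0):
    """PAPR level (dB) not exceeded with probability `prob` when the signals
    of `num_chains` independent RF chains are added (assumed setup)."""
    rng = np.random.default_rng(seed)
    total = sum(ofdm_symbols(rng, num_symbols, num_subcarriers)
                for _ in range(num_chains))
    return float(np.quantile(papr_db(total), prob))


if __name__ == "__main__":
    print("one RF chain  :", papr_quantile_db(1), "dB")
    print("four RF chains:", papr_quantile_db(4), "dB")
```

Under these assumptions the two estimates should be close to each other, in line with the statement that summing independent OFDM signals leaves the PAPR distribution essentially unchanged. Absolute values can differ from those reported in [37] because oversampling and pulse shaping are omitted here.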
To sum up, given the parameters in this paper, the two
architectures achieve a similar sum spectral efficiency with
certain precoders, but the OSPS architecture outperforms
the FC case in terms of hardware complexity and power
efficiency, only at the cost of a slightly longer latency for
the initial BA.
C. Simulations Based on QuaDRiGa
In this section, we resort to the 3D geometry-based
channel generator QuaDRiGa [30] to show that our
numerical results are consistent with practical
mmWave communication channels (see footnote 10). More precisely, we
apply our BA and precoding schemes over ∼3 × 10^5
channel snapshots generated by QuaDRiGa. These channel
snapshots correspond to a short segment of time evolution,
where the BS is stationary and each UE moves at 1 m/s
along its direction of motion. The simulation results
with respect to different transmitter architectures are shown
Footnote 9: Since the ηmax of a PA depends strongly on the technology,
we assume for simplicity that different PAs working in their linear range
have roughly the same maximum efficiency ηmax,0.
Footnote 10: Due to limitations of the QuaDRiGa generator, only the
no-blockage scenario is considered in this section.
[Fig. 8, plot panels. (a) PD vs. number of beacon slots T; curves: FC, NNLS; OSPS, NNLS; FC, OMP [43]. (b) Rsum (bit/s/Hz) vs. SNRBBF (dB); curves: FC, BST; FC, MR-ZF, p = 2; OSPS, BST; OSPS, MR-ZF, p = 2. (c) Rsum (bit/s/Hz) vs. SNRBBF (dB); curves: FC, BST; FC, MRT, p = 1; FC, MRT, p = 2; FC, MR-ZF, p = 1; FC, MR-ZF, p = 2; FC, ZF in [26]. (d) Rsum (bit/s/Hz) vs. SNRBBF (dB); curves: OSPS, BST; OSPS, MRT, p = 1; OSPS, MRT, p = 2; OSPS, MR-ZF, p = 1; OSPS, MR-ZF, p = 2; OSPS, ZF in [26].]
Fig. 8: The simulations based on QuaDRiGa: (a) The initial BA detection probability vs. the training overhead, with SNRBBF =
−19 dB. (b) The sum spectral efficiency of different transmitter architectures vs. increasing SNRBBF. (c) The sum spectral
efficiency of the FC architecture vs. increasing SNRBBF . (d) The sum spectral efficiency of the OSPS architecture vs. increasing
SNRBBF.
zero-forcing (MR-ZF), respectively. Particularly, both the
BA scheme and the MR-ZF precoding scheme outperform
the state-of-the-art counterparts in the literature. Given
the parameters in this paper, our simulation results show
that the two architectures achieve a similar sum spectral
efficiency, but the OSPS architecture outperforms the
FC case in terms of hardware complexity and power
efficiency, only at the cost of a slightly longer latency for
the initial BA. Therefore, the OSPS architecture emerges
as a good choice for a simple and efficient design of
MU-MIMO base stations operating at mmWave.
References
[1] K. Venugopal, A. Alkhateeb, N. G. Prelcic, and R. W. Heath,
“Channel estimation for hybrid architecture-based wideband
millimeter wave systems,” IEEE Journal on Selected Areas in
Communications, vol. 35, no. 9, pp. 1996–2009, 2017.
[2] F. Sohrabi and W. Yu, “Hybrid digital and analog beamforming
design for large-scale antenna arrays,” IEEE Journal of Selected
Topics in Signal Processing, vol. 10, no. 3, pp. 501–513, April
2016.
[3] A. Li and C. Masouros, “Hybrid analog-digital millimeter-wave
MU-MIMO transmission with virtual path selection,” IEEE
Communications Letters, vol. 21, no. 2, pp. 438–441, 2017.
[4] S. S. Ioushua and Y. C. Eldar, “Hybrid analog-digital
beamforming for massive MIMO systems,” arXiv preprint
arXiv:1712.03485, 2017.
[5] J. Du, W. Xu, H. Shen, X. Dong, and C. Zhao, “Hybrid
precoding architecture for massive multiuser MIMO with
dissipation: sub-connected or fully connected structures?” IEEE
Transactions on Wireless Communications, vol. 17, no. 8, pp.
5465–5479, 2018.
[6] P. L. Cao, T. J. Oechtering, and M. Skoglund, “Precoding design
for massive MIMO systems with sub-connected architecture
and per-antenna power constraints,” in WSA 2018; 22nd
International ITG Workshop on Smart Antennas, March 2018,
pp. 1–6.
[7] R. W. Heath, N. Gonzalez-Prelcic, S. Rangan, W. Roh, and
A. M. Sayeed, “An overview of signal processing techniques
for millimeter wave MIMO systems,” IEEE journal of selected
topics in signal processing, vol. 10, no. 3, pp. 436–453, 2016.
[8] X. Gao, L. Dai, and A. M. Sayeed, “Low RF-complexity
technologies to enable millimeter-wave MIMO with large
antenna array for 5G wireless communications,” IEEE
Communications Magazine, vol. 56, no. 4, pp. 211–217, APRIL
2018.
[9] A. F. Molisch, V. V. Ratnam, S. Han, Z. Li, S. L. H. Nguyen,
L. Li, and K. Haneda, “Hybrid beamforming for massive MIMO:
A survey,” IEEE Communications Magazine, vol. 55, no. 9, pp.
134–141, 2017.
[10] M. Majidzadeh, A. Moilanen, N. Tervo, H. Pennanen, A. Tölli,
and M. Latva-aho, “Hybrid beamforming for single-user MIMO
with partially connected RF architecture,” in 2017 European
Conference on Networks and Communications (EuCNC), June
2017, pp. 1–6.
[11] M. R. Castellanos, V. Raghavan, J. H. Ryu, O. H. Koymen,
J. Li, D. J. Love, and B. Peleato, “Hybrid multi-user precoding
with amplitude and phase control,” in 2018 IEEE International
Conference on Communications (ICC), May 2018, pp. 1–6.
[12] V. Raghavan, A. Partyka, A. Sampath, S. Subramanian,
O. H. Koymen, K. Ravid, J. Cezanne, K. Mukkavilli,
and J. Li, “Millimeter-wave MIMO prototype: Measurements
and experimental results,” IEEE Communications Magazine,
vol. 56, no. 1, pp. 202–209, 2018.
[13] Z. Gao, L. Dai, and Z. Wang, “Channel estimation for mmwave
massive MIMO based access and backhaul in ultra-dense
network,” in Communications (ICC), 2016 IEEE International
Conference on. IEEE, Conference Proceedings, pp. 1–6.
[14] J. Rodríguez-Fernández, N. González-Prelcic, K. Venugopal,
and R. W. Heath Jr, “Frequency-domain compressive channel
estimation for frequency-selective hybrid mmWave MIMO
15
20 40 60 80 100 120 140 160
0.2
0.4
0.6
0.8
1
Number of beacon slots T
PD
(a)
FC, NNLS
OSPS, NNLS
FC, OMP [43]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(b)
FC, BST
FC, MR-ZF, p= 2
OSPS, BST
OSPS, MR-ZF p=2
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(c)
FC, BST
FC, MRT, p= 1
FC, MRT, p= 2
FC, MR-ZF, p= 1
FC, MR-ZF, p= 2
FC, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(d)
OSPS, BST
OSPS, MRT, p= 1
OSPS, MRT, p= 2
OSPS, MR-ZF, p=1
OSPS, MR-ZF, p= 2
OSPS, ZF in [26]
Fig. 8: The simulations based on QuaDRiGa: (a) The initial BA detection probability vs. the training overhead, with SNRBBF =
−19 dB. (b) The sum spectral efficiency of different transmitter architectures vs. increasing SNRBBF. (c) The sum spectral
efficiency of the FC architecture vs. increasing SNRBBF . (d) The sum spectral efficiency of the OSPS architecture vs. increasing
SNRBBF.
zero-forcing (MR-ZF), respectively. Particularly, both the
BA scheme and the MR-ZF precoding scheme outperform
the state-of-the-art counterparts in the literature. Given
the parameters in this paper, our simulation results show
that the two architectures achieve a similar sum spectral
efficiency, but the OSPS architecture outperforms the
FC case in terms of hardware complexity and power
efficiency, only at the cost of a slightly longer latency for
the initial BA. Therefore, the OSPS architecture emerges
as a good choice for a simple and efficient design of
MU-MIMO base stations operating at mmWave.
References
[1] K. Venugopal, A. Alkhateeb, N. G. Prelcic, and R. W. Heath,
“Channel estimation for hybrid architecture-based wideband
millimeter wave systems,” IEEE Journal on Selected Areas in
Communications, vol. 35, no. 9, pp. 1996–2009, 2017.
[2] F. Sohrabi and W. Yu, “Hybrid digital and analog beamforming
design for large-scale antenna arrays,” IEEE Journal of Selected
Topics in Signal Processing, vol. 10, no. 3, pp. 501–513, April
2016.
[3] A. Li and C. Masouros, “Hybrid analog-digital millimeter-wave
MU-MIMO transmission with virtual path selection,” IEEE
Communications Letters, vol. 21, no. 2, pp. 438–441, 2017.
[4] S. S. Ioushua and Y. C. Eldar, “Hybrid analog-digital
beamforming for massive MIMO systems,” arXiv preprint
arXiv:1712.03485, 2017.
[5] J. Du, W. Xu, H. Shen, X. Dong, and C. Zhao, “Hybrid
precoding architecture for massive multiuser MIMO with
dissipation: sub-connected or fully connected structures?” IEEE
Transactions on Wireless Communications, vol. 17, no. 8, pp.
5465–5479, 2018.
[6] P. L. Cao, T. J. Oechtering, and M. Skoglund, “Precoding design
for massive MIMO systems with sub-connected architecture
and per-antenna power constraints,” in WSA 2018; 22nd
International ITG Workshop on Smart Antennas, March 2018,
pp. 1–6.
[7] R. W. Heath, N. Gonzalez-Prelcic, S. Rangan, W. Roh, and
A. M. Sayeed, “An overview of signal processing techniques
for millimeter wave MIMO systems,” IEEE journal of selected
topics in signal processing, vol. 10, no. 3, pp. 436–453, 2016.
[8] X. Gao, L. Dai, and A. M. Sayeed, “Low RF-complexity
technologies to enable millimeter-wave MIMO with large
antenna array for 5G wireless communications,” IEEE
Communications Magazine, vol. 56, no. 4, pp. 211–217, APRIL
2018.
[9] A. F. Molisch, V. V. Ratnam, S. Han, Z. Li, S. L. H. Nguyen,
L. Li, and K. Haneda, “Hybrid beamforming for massive MIMO:
A survey,” IEEE Communications Magazine, vol. 55, no. 9, pp.
134–141, 2017.
[10] M. Majidzadeh, A. Moilanen, N. Tervo, H. Pennanen, A. Tölli,
and M. Latva-aho, “Hybrid beamforming for single-user MIMO
with partially connected RF architecture,” in 2017 European
Conference on Networks and Communications (EuCNC), June
2017, pp. 1–6.
[11] M. R. Castellanos, V. Raghavan, J. H. Ryu, O. H. Koymen,
J. Li, D. J. Love, and B. Peleato, “Hybrid multi-user precoding
with amplitude and phase control,” in 2018 IEEE International
Conference on Communications (ICC), May 2018, pp. 1–6.
[12] V. Raghavan, A. Partyka, A. Sampath, S. Subramanian,
O. H. Koymen, K. Ravid, J. Cezanne, K. Mukkavilli,
and J. Li, “Millimeter-wave MIMO prototype: Measurements
and experimental results,” IEEE Communications Magazine,
vol. 56, no. 1, pp. 202–209, 2018.
[13] Z. Gao, L. Dai, and Z. Wang, “Channel estimation for mmwave
massive MIMO based access and backhaul in ultra-dense
network,” in Communications (ICC), 2016 IEEE International
Conference on. IEEE, Conference Proceedings, pp. 1–6.
[14] J. Rodríguez-Fernández, N. González-Prelcic, K. Venugopal,
and R. W. Heath Jr, “Frequency-domain compressive channel
estimation for frequency-selective hybrid mmWave MIMO
15
20 40 60 80 100 120 140 160
0.2
0.4
0.6
0.8
1
Number of beacon slots T
PD
(a)
FC, NNLS
OSPS, NNLS
FC, OMP [43]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(b)
FC, BST
FC, MR-ZF, p= 2
OSPS, BST
OSPS, MR-ZF p=2
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(c)
FC, BST
FC, MRT, p= 1
FC, MRT, p= 2
FC, MR-ZF, p= 1
FC, MR-ZF, p= 2
FC, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(d)
OSPS, BST
OSPS, MRT, p= 1
OSPS, MRT, p= 2
OSPS, MR-ZF, p=1
OSPS, MR-ZF, p= 2
OSPS, ZF in [26]
Fig. 8: The simulations based on QuaDRiGa: (a) The initial BA detection probability vs. the training overhead, with SNRBBF =
−19 dB. (b) The sum spectral efficiency of different transmitter architectures vs. increasing SNRBBF. (c) The sum spectral
efficiency of the FC architecture vs. increasing SNRBBF . (d) The sum spectral efficiency of the OSPS architecture vs. increasing
SNRBBF.
zero-forcing (MR-ZF), respectively. Particularly, both the
BA scheme and the MR-ZF precoding scheme outperform
the state-of-the-art counterparts in the literature. Given
the parameters in this paper, our simulation results show
that the two architectures achieve a similar sum spectral
efficiency, but the OSPS architecture outperforms the
FC case in terms of hardware complexity and power
efficiency, only at the cost of a slightly longer latency for
the initial BA. Therefore, the OSPS architecture emerges
as a good choice for a simple and efficient design of
MU-MIMO base stations operating at mmWave.
References
[1] K. Venugopal, A. Alkhateeb, N. G. Prelcic, and R. W. Heath,
“Channel estimation for hybrid architecture-based wideband
millimeter wave systems,” IEEE Journal on Selected Areas in
Communications, vol. 35, no. 9, pp. 1996–2009, 2017.
[2] F. Sohrabi and W. Yu, “Hybrid digital and analog beamforming
design for large-scale antenna arrays,” IEEE Journal of Selected
Topics in Signal Processing, vol. 10, no. 3, pp. 501–513, April
2016.
[3] A. Li and C. Masouros, “Hybrid analog-digital millimeter-wave
MU-MIMO transmission with virtual path selection,” IEEE
Communications Letters, vol. 21, no. 2, pp. 438–441, 2017.
[4] S. S. Ioushua and Y. C. Eldar, “Hybrid analog-digital
beamforming for massive MIMO systems,” arXiv preprint
arXiv:1712.03485, 2017.
[5] J. Du, W. Xu, H. Shen, X. Dong, and C. Zhao, “Hybrid
precoding architecture for massive multiuser MIMO with
dissipation: sub-connected or fully connected structures?” IEEE
Transactions on Wireless Communications, vol. 17, no. 8, pp.
5465–5479, 2018.
[6] P. L. Cao, T. J. Oechtering, and M. Skoglund, “Precoding design
for massive MIMO systems with sub-connected architecture
and per-antenna power constraints,” in WSA 2018; 22nd
International ITG Workshop on Smart Antennas, March 2018,
pp. 1–6.
[7] R. W. Heath, N. Gonzalez-Prelcic, S. Rangan, W. Roh, and
A. M. Sayeed, “An overview of signal processing techniques
for millimeter wave MIMO systems,” IEEE journal of selected
topics in signal processing, vol. 10, no. 3, pp. 436–453, 2016.
[8] X. Gao, L. Dai, and A. M. Sayeed, “Low RF-complexity
technologies to enable millimeter-wave MIMO with large
antenna array for 5G wireless communications,” IEEE
Communications Magazine, vol. 56, no. 4, pp. 211–217, APRIL
2018.
[9] A. F. Molisch, V. V. Ratnam, S. Han, Z. Li, S. L. H. Nguyen,
L. Li, and K. Haneda, “Hybrid beamforming for massive MIMO:
A survey,” IEEE Communications Magazine, vol. 55, no. 9, pp.
134–141, 2017.
[10] M. Majidzadeh, A. Moilanen, N. Tervo, H. Pennanen, A. Tölli,
and M. Latva-aho, “Hybrid beamforming for single-user MIMO
with partially connected RF architecture,” in 2017 European
Conference on Networks and Communications (EuCNC), June
2017, pp. 1–6.
[11] M. R. Castellanos, V. Raghavan, J. H. Ryu, O. H. Koymen,
J. Li, D. J. Love, and B. Peleato, “Hybrid multi-user precoding
with amplitude and phase control,” in 2018 IEEE International
Conference on Communications (ICC), May 2018, pp. 1–6.
[12] V. Raghavan, A. Partyka, A. Sampath, S. Subramanian,
O. H. Koymen, K. Ravid, J. Cezanne, K. Mukkavilli,
and J. Li, “Millimeter-wave MIMO prototype: Measurements
and experimental results,” IEEE Communications Magazine,
vol. 56, no. 1, pp. 202–209, 2018.
[13] Z. Gao, L. Dai, and Z. Wang, “Channel estimation for mmwave
massive MIMO based access and backhaul in ultra-dense
network,” in Communications (ICC), 2016 IEEE International
Conference on. IEEE, Conference Proceedings, pp. 1–6.
[14] J. Rodríguez-Fernández, N. González-Prelcic, K. Venugopal,
and R. W. Heath Jr, “Frequency-domain compressive channel
estimation for frequency-selective hybrid mmWave MIMO
15
20 40 60 80 100 120 140 160
0.2
0.4
0.6
0.8
1
Number of beacon slots T
PD
(a)
FC, NNLS
OSPS, NNLS
FC, OMP [43]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(b)
FC, BST
FC, MR-ZF, p= 2
OSPS, BST
OSPS, MR-ZF p=2
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(c)
FC, BST
FC, MRT, p= 1
FC, MRT, p= 2
FC, MR-ZF, p= 1
FC, MR-ZF, p= 2
FC, ZF in [26]
−30 −20 −10 0 10 20 30
0
20
40
60
80
SNRBBF (dB)
Rsum (bit/s/Hz)
(d)
OSPS, BST
OSPS, MRT, p= 1
OSPS, MRT, p= 2
OSPS, MR-ZF, p=1
OSPS, MR-ZF, p= 2
OSPS, ZF in [26]
Fig. 8: The simulations based on QuaDRiGa: (a) The initial BA detection probability vs. the training overhead, with SNRBBF =
−19 dB. (b) The sum spectral efficiency of different transmitter architectures vs. increasing SNRBBF. (c) The sum spectral
efficiency of the FC architecture vs. increasing SNRBBF . (d) The sum spectral efficiency of the OSPS architecture vs. increasing
SNRBBF.
zero-forcing (MR-ZF), respectively. Particularly, both the
BA scheme and the MR-ZF precoding scheme outperform
the state-of-the-art counterparts in the literature. Given
the parameters in this paper, our simulation results show
that the two architectures achieve a similar sum spectral
efficiency, but the OSPS architecture outperforms the
FC case in terms of hardware complexity and power
efficiency, only at the cost of a slightly longer latency for
the initial BA. Therefore, the OSPS architecture emerges
as a good choice for a simple and efficient design of
MU-MIMO base stations operating at mmWave.
References
[1] K. Venugopal, A. Alkhateeb, N. G. Prelcic, and R. W. Heath,
“Channel estimation for hybrid architecture-based wideband
millimeter wave systems,” IEEE Journal on Selected Areas in
Communications, vol. 35, no. 9, pp. 1996–2009, 2017.
[2] F. Sohrabi and W. Yu, “Hybrid digital and analog beamforming
design for large-scale antenna arrays,” IEEE Journal of Selected
Topics in Signal Processing, vol. 10, no. 3, pp. 501–513, April
2016.
[3] A. Li and C. Masouros, “Hybrid analog-digital millimeter-wave
MU-MIMO transmission with virtual path selection,” IEEE
Communications Letters, vol. 21, no. 2, pp. 438–441, 2017.
[4] S. S. Ioushua and Y. C. Eldar, “Hybrid analog-digital
beamforming for massive MIMO systems,” arXiv preprint
arXiv:1712.03485, 2017.
[5] J. Du, W. Xu, H. Shen, X. Dong, and C. Zhao, “Hybrid
precoding architecture for massive multiuser MIMO with
dissipation: sub-connected or fully connected structures?” IEEE
Transactions on Wireless Communications, vol. 17, no. 8, pp.
5465–5479, 2018.
[6] P. L. Cao, T. J. Oechtering, and M. Skoglund, “Precoding design
for massive MIMO systems with sub-connected architecture
and per-antenna power constraints,” in WSA 2018; 22nd
International ITG Workshop on Smart Antennas, March 2018,
pp. 1–6.
[7] R. W. Heath, N. Gonzalez-Prelcic, S. Rangan, W. Roh, and
A. M. Sayeed, “An overview of signal processing techniques
for millimeter wave MIMO systems,” IEEE journal of selected
topics in signal processing, vol. 10, no. 3, pp. 436–453, 2016.
[8] X. Gao, L. Dai, and A. M. Sayeed, “Low RF-complexity
technologies to enable millimeter-wave MIMO with large
antenna array for 5G wireless communications,” IEEE
Communications Magazine, vol. 56, no. 4, pp. 211–217, APRIL
2018.
[9] A. F. Molisch, V. V. Ratnam, S. Han, Z. Li, S. L. H. Nguyen,
L. Li, and K. Haneda, “Hybrid beamforming for massive MIMO:
A survey,” IEEE Communications Magazine, vol. 55, no. 9, pp.
134–141, 2017.
[10] M. Majidzadeh, A. Moilanen, N. Tervo, H. Pennanen, A. Tölli,
and M. Latva-aho, “Hybrid beamforming for single-user MIMO
with partially connected RF architecture,” in 2017 European
Conference on Networks and Communications (EuCNC), June
2017, pp. 1–6.
[11] M. R. Castellanos, V. Raghavan, J. H. Ryu, O. H. Koymen,
J. Li, D. J. Love, and B. Peleato, “Hybrid multi-user precoding
with amplitude and phase control,” in 2018 IEEE International
Conference on Communications (ICC), May 2018, pp. 1–6.
[12] V. Raghavan, A. Partyka, A. Sampath, S. Subramanian,
O. H. Koymen, K. Ravid, J. Cezanne, K. Mukkavilli,
and J. Li, “Millimeter-wave MIMO prototype: Measurements
and experimental results,” IEEE Communications Magazine,
vol. 56, no. 1, pp. 202–209, 2018.
[13] Z. Gao, L. Dai, and Z. Wang, “Channel estimation for mmwave
massive MIMO based access and backhaul in ultra-dense
network,” in Communications (ICC), 2016 IEEE International
Conference on. IEEE, Conference Proceedings, pp. 1–6.
[14] J. Rodríguez-Fernández, N. González-Prelcic, K. Venugopal,
and R. W. Heath Jr, “Frequency-domain compressive channel
estimation for frequency-selective hybrid mmWave MIMO
Fig. 8: The simulations based on QuaDRiGa: (a) The initial BA detection probability vs. the training overhead, with SNRBBF =−19
dB. (b) The sum spectral efficiency of different transmitter architectures vs. increasing SNRBBF. (c) The sum spectral efficiency of
the FC architecture vs. increasing SNRBBF. (d) The sum spectral efficiency of the OSPS architecture vs. increasing SNRBBF.
in Fig. 8. As we can see from Fig. 8 (a), for the initial
BA with PD ≥ 0.95, the FC architecture requires ∼10
fewer beacon slots than the OSPS case. For the
data communication phase shown in Fig. 8 (b), in contrast,
the two architectures achieve very similar performance
when using either the BST or the MR-ZF precoder in the low
SNR range (SNRBBF ≤ −15 dB) and the MR-ZF
precoder in the high SNR range (SNRBBF > −15 dB). In
addition, for both architectures, as shown in Fig. 8 (c) and
Fig. 8 (d), respectively, all the curves coincide
in the low SNR range, whereas the MR-ZF precoder
outperforms the rest in the high SNR range. All the
results based on the QuaDRiGa generator are thus
consistent with the results based on our proposed channel
model. This consistency suggests that our models, schemes,
and conclusions hold not only in theory but also under
practical channel conditions.
VI. CONCLUSION
In this paper, we proposed an analysis framework
to evaluate the performance of typical hybrid
transmitters at mmWave frequencies. In particular,
we focused on the comparison of a fully-connected (FC)
architecture and a partially-connected architecture with
one-stream-per-subarray (OSPS) for a MU-MIMO base
station using HDA beamforming. We jointly evaluated
the performance of the two architectures in terms of the
initial beam alignment (BA), the data communication, and
the transmitter power efficiency. We used our recently
proposed BA scheme and further proposed three simple
precoding schemes on top of the effective channel after
the BA. The precoding schemes are based on beam
steering (BST), analog maximum ratio transmission
(MRT), and joint analog maximum ratio and baseband
zero-forcing (MR-ZF), respectively. In particular, both the
BA scheme and the MR-ZF precoding scheme outperform
the state-of-the-art counterparts in the literature. Given
the parameters in this paper, our simulation results show
that the two architectures achieve a similar sum spectral
efficiency, but the OSPS architecture outperforms the
FC case in terms of hardware complexity and power
efficiency, only at the cost of a slightly longer latency for
the initial BA. Therefore, the OSPS architecture emerges
as a good choice for a simple and efficient design of
MU-MIMO base stations operating at mmWave.
REFERENCES
[1] K. Venugopal, A. Alkhateeb, N. G. Prelcic, and R. W.
Heath, “Channel estimation for hybrid architecture-based wideband
millimeter wave systems,” IEEE Journal on Selected Areas in
Communications, vol. 35, no. 9, pp. 1996–2009, 2017.
[2] F. Sohrabi and W. Yu, “Hybrid digital and analog beamforming
design for large-scale antenna arrays,” IEEE Journal of Selected
Topics in Signal Processing, vol. 10, no. 3, pp. 501–513, April 2016.
[3] A. Li and C. Masouros, “Hybrid analog-digital millimeter-wave
MU-MIMO transmission with virtual path selection,” IEEE
Communications Letters, vol. 21, no. 2, pp. 438–441, 2017.
[4] S. S. Ioushua and Y. C. Eldar, “Hybrid analog-digital beamforming
for massive MIMO systems,” arXiv preprint arXiv:1712.03485,
2017.
[5] J. Du, W. Xu, H. Shen, X. Dong, and C. Zhao, “Hybrid precoding
architecture for massive multiuser MIMO with dissipation:
sub-connected or fully connected structures?” IEEE Transactions
on Wireless Communications, vol. 17, no. 8, pp. 5465–5479, 2018.
[6] P. L. Cao, T. J. Oechtering, and M. Skoglund, “Precoding design
for massive MIMO systems with sub-connected architecture and
per-antenna power constraints,” in WSA 2018; 22nd International
ITG Workshop on Smart Antennas, March 2018, pp. 1–6.
[7] R. W. Heath, N. Gonzalez-Prelcic, S. Rangan, W. Roh, and A. M.
Sayeed, “An overview of signal processing techniques for millimeter
wave MIMO systems,” IEEE journal of selected topics in signal
processing, vol. 10, no. 3, pp. 436–453, 2016.
[8] X. Gao, L. Dai, and A. M. Sayeed, “Low RF-complexity
technologies to enable millimeter-wave MIMO with large antenna
array for 5G wireless communications,” IEEE Communications
Magazine, vol. 56, no. 4, pp. 211–217, April 2018.
[9] A. F. Molisch, V. V. Ratnam, S. Han, Z. Li, S. L. H. Nguyen,
L. Li, and K. Haneda, “Hybrid beamforming for massive MIMO:
A survey,” IEEE Communications Magazine, vol. 55, no. 9, pp.
134–141, 2017.
[10] M. Majidzadeh, A. Moilanen, N. Tervo, H. Pennanen, A. Tölli, and
M. Latva-aho, “Hybrid beamforming for single-user MIMO with
partially connected RF architecture,” in 2017 European Conference
on Networks and Communications (EuCNC), June 2017, pp. 1–6.
[11] M. R. Castellanos, V. Raghavan, J. H. Ryu, O. H. Koymen,
J. Li, D. J. Love, and B. Peleato, “Hybrid multi-user precoding
with amplitude and phase control,” in 2018 IEEE International
Conference on Communications (ICC), May 2018, pp. 1–6.
[12] V. Raghavan, A. Partyka, A. Sampath, S. Subramanian,
O. H. Koymen, K. Ravid, J. Cezanne, K. Mukkavilli, and
J. Li, “Millimeter-wave MIMO prototype: Measurements and
experimental results,” IEEE Communications Magazine, vol. 56,
no. 1, pp. 202–209, 2018.
[13] Z. Gao, L. Dai, and Z. Wang, “Channel estimation for mmwave
massive MIMO based access and backhaul in ultra-dense network,”
in Communications (ICC), 2016 IEEE International Conference on.
IEEE, Conference Proceedings, pp. 1–6.
[14] J. Rodríguez-Fernández, N. González-Prelcic, K. Venugopal, and
R. W. Heath Jr, “Frequency-domain compressive channel estimation
for frequency-selective hybrid mmWave MIMO systems,” arXiv
preprint arXiv:1704.08572, 2017.
[15] S. Haghighatshoar and G. Caire, “The beam alignment problem in
mmWave wireless networks,” in 2016 50th Asilomar Conference on
Signals, Systems and Computers, Nov 2016, pp. 741–745.
[16] X. Song, S. Haghighatshoar, and G. Caire, “A scalable and
statistically robust beam alignment technique for mm-Wave
systems,” IEEE Trans. on Wireless Comm., vol. PP, pp. 1–1, 2018.
[17] V. Va, J. Choi, and R. W. Heath, “The impact of beamwidth
on temporal channel variation in vehicular channels and its
implications,” IEEE Transactions on Vehicular Technology, vol. 66,
no. 6, pp. 5014–5029, 2017.
[18] X. Song, S. Haghighatshoar, and G. Caire, “A robust time-domain
beam alignment scheme for multi-user wideband mmWave systems,”
in WSA 2018; 22nd International ITG Workshop on Smart Antennas
(to be published), March 2018, pp. 1–7.
[19] ——, “Efficient beam alignment for mmWave single-carrier systems
with hybrid MIMO transceivers,” IEEE Transactions on Wireless
Communications, 2019.
[20] R. J. Weiler, M. Peter, W. Keusgen, and M. Wisotzki, “Measuring
the busy urban 60 GHz outdoor access radio channel,” in 2014 IEEE
International Conference on Ultra-WideBand (ICUWB), Sept 2014,
pp. 166–170.
[21] P. A. Eliasi, S. Rangan, and T. S. Rappaport, “Low-rank spatial
channel estimation for millimeter wave cellular systems,” IEEE
Transactions on Wireless Communications, vol. 16, no. 5, pp.
2748–2759, 2017.
[22] O. El Ayach, R. W. Heath, S. Rajagopal, and Z. Pi, “Multimode
precoding in millimeter wave MIMO transmitters with multiple
antenna sub-arrays,” in Global Communications Conference
(GLOBECOM), 2013 IEEE. IEEE, Conference Proceedings, pp.
3476–3480.
[23] D. Zhang, Y. Wang, X. Li, and W. Xiang, “Hybridly connected
structure for hybrid beamforming in mmWave massive MIMO
systems,” IEEE Transactions on Communications, vol. 66, no. 2,
pp. 662–674, 2018.
[24] H.-L. Chiang, W. Rave, T. Kadur, and G. Fettweis, “Hybrid
beamforming based on implicit channel state information for
millimeter wave links,” IEEE Journal of Selected Topics in Signal
Processing, vol. 12, no. 2, pp. 326–339, 2018.
[25] V. Raghavan, S. Subramanian, J. Cezanne, A. Sampath, O. Koymen,
and J. Li, “Directional hybrid precoding in millimeter-wave MIMO
systems,” in Global Communications Conference (GLOBECOM),
2016 IEEE. IEEE, Conference Proceedings, pp. 1–7.
[26] V. Raghavan, S. Subramanian, J. Cezanne, A. Sampath, O. H.
Koymen, and J. Li, “Single-user versus multi-User precoding for
millimeter wave MIMO systems,” IEEE Journal on Selected Areas
in Communications, vol. 35, no. 6, pp. 1387–1401, June 2017.
[27] X. Song, T. Kühne, and G. Caire, “Fully-connected vs.
sub-connected hybrid precoding architectures for mmWave
MU-MIMO,” in 2019 IEEE International Conference on
Communications (ICC) (accepted).
[28] N. N. Moghadam, G. Fodor, M. Bengtsson, and D. J. Love, “On
the energy efficiency of MIMO hybrid beamforming for millimeter
wave systems with nonlinear power amplifiers,” arXiv preprint
arXiv:1806.01602, 2018.
[29] T. Hälsig, D. Cvetkovski, E. Grass, and B. Lankl, “Statistical
properties and variations of LOS MIMO channels at millimeter wave
frequencies,” arXiv preprint arXiv:1803.07768, 2018.
[30] S. Jaeckel, L. Raschkowski, K. Börner, and L. Thiele, “QuaDRiGa:
A 3-D multi-cell channel model with time evolution for
enabling virtual field trials,” IEEE Transactions on Antennas and
Propagation, vol. 62, no. 6, pp. 3242–3256, 2014.
[31] A. Gupta and R. K. Jha, “A Survey of 5G Network: Architecture
and Emerging Technologies,” IEEE Access, vol. 3, pp. 1206–1232,
2015.
[32] M. Agiwal, A. Roy, and N. Saxena, “Next Generation 5G Wireless
Networks: A Comprehensive Survey,” IEEE Communications
Surveys & Tutorials, vol. 18, no. 3, pp. 1617–1655, third quarter 2016.
[33] J. G. Proakis and M. Salehi, Digital communications. McGraw-Hill,
2008.
[34] P. Bello, “Characterization of randomly time-variant linear
channels,” IEEE Transactions on Communications Systems, vol. 11,
no. 4, pp. 360–393, 1963.
[35] A. Goldsmith, Wireless communications. Cambridge University
Press, 2005.
[36] A. M. Sayeed, “Deconstructing multiantenna fading channels,” IEEE
Transactions on Signal Processing, vol. 50, no. 10, pp. 2563–2579,
2002.
[37] H. G. Myung, J. Lim, and D. J. Goodman, “Peak-to-average
power ratio of single carrier FDMA signals with pulse shaping,”
in Personal, Indoor and Mobile Radio Communications, 2006 IEEE
17th International Symposium on. IEEE, Conference Proceedings,
pp. 1–5.
[38] C. Shepard, H. Yu, N. Anand, E. Li, T. Marzetta, R. Yang,
and L. Zhong, “Argos: Practical many-antenna base stations,” in
Proceedings of the 18th annual international conference on Mobile
computing and networking. ACM, 2012, pp. 53–64.
[39] T. L. Marzetta, E. G. Larsson, H. Yang, and H. Q. Ngo,
Fundamentals of massive MIMO. Cambridge University Press,
2016.
[40] E. Perahia, C. Cordeiro, M. Park, and L. L. Yang, “IEEE 802.11
ad: Defining the next generation multi-Gbps Wi-Fi,” in Consumer
Communications and Networking Conference (CCNC), 2010 7th
IEEE. IEEE, Conference Proceedings, pp. 1–5.
[41] T. S. Rappaport, G. R. MacCartney, S. Sun, H. Yan, and
S. Deng, “Small-Scale, Local Area, and Transitional Millimeter
Wave Propagation for 5G Communications,” IEEE Transactions on
Antennas and Propagation, vol. 65, no. 12, pp. 6474–6490, Dec
2017.
[42] S. Jaeckel, L. Raschkowski, K. Börner, L. Thiele, and F. Burkhardt,
“Quasi deterministic radio channel generator user manual and
documentation,” Fraunhofer Heinrich Hertz Institute Wireless
Communications and Networks, 2016.
[43] K. Venugopal, A. Alkhateeb, R. W. Heath, and N. G. Prelcic,
“Time-domain channel estimation for wideband millimeter wave
systems with hybrid architecture,” in Acoustics, Speech and Signal
Processing (ICASSP), 2017 IEEE International Conference on.
IEEE, 2017, Conference Proceedings, pp. 6493–6497.
Xiaoshen Song (S’17) received the B.Sc.
degree in Communication Engineering from
Northwestern Polytechnical University, Xi’an,
China, in 2013, and the M.Sc. degree in
Communication and Information Systems from
the Institute of Electronics, University of
Chinese Academy of Sciences, Beijing, China,
in 2016. Her master’s thesis focuses on
video synthetic aperture radar (VideoSAR)
system design and imaging algorithms. She is
currently pursuing the Ph.D. degree with the
Communications and Information Theory (CommIT) group at Technische
Universität Berlin, Berlin, Germany. Her research interests include
wireless communication, mmWave MIMO, and compressed sensing.
Thomas Kühne received his university degree
(5-year Dipl.-Ing., equivalent to an M.Sc.) in
Electrical Engineering from the University
of Technology Dresden. During his master's
studies he focused on communication systems
and circuit design. He gained professional
research experience while working for 3 years
at the Fraunhofer Heinrich-Hertz-Institute in
Berlin, Germany. At the Heinrich-Hertz-Institute
he developed prototypes for mm-wave
communication and measurement devices for
mm-wave channels. Since 2015 he has worked for the Communications and
Information Theory group of Prof. Caire at the Technische Universität
Berlin. His research interests include hardware-software co-design,
wireless communication systems, and signal processing.
Giuseppe Caire (S’92 – M’94 – SM’03
– F’05) was born in Torino in 1965. He
received the B.Sc. in Electrical Engineering
from Politecnico di Torino in 1990, the M.Sc.
in Electrical Engineering from Princeton
University in 1992, and the Ph.D. from
Politecnico di Torino in 1994. He has been a
post-doctoral research fellow with the European
Space Agency (ESTEC, Noordwijk, The
Netherlands) in 1994-1995, Assistant Professor
in Telecommunications at the Politecnico di
Torino, Associate Professor at the University of Parma, Italy, Professor
with the Department of Mobile Communications at the Eurecom Institute,
Sophia-Antipolis, France, a Professor of Electrical Engineering with
the Viterbi School of Engineering, University of Southern California,
Los Angeles, and he is currently an Alexander von Humboldt Professor
with the Faculty of Electrical Engineering and Computer Science at the
Technical University of Berlin, Germany.
He received the Jack Neubauer Best System Paper Award from the
IEEE Vehicular Technology Society in 2003, the IEEE Communications
Society & Information Theory Society Joint Paper Award in 2004 and
in 2011, the Leonard G. Abraham Prize for best IEEE JSAC paper in
2019, the Okawa Research Award in 2006, the Alexander von Humboldt
Professorship in 2014, the Vodafone Innovation Prize in 2015, and an
ERC Advanced Grant in 2018. Giuseppe Caire has been a Fellow of IEEE since
2005. He served on the Board of Governors of the IEEE Information
Theory Society from 2004 to 2007, and as an officer from 2008 to 2013.
He was President of the IEEE Information Theory Society in 2011.
His main research interests are in the field of communications theory,
information theory, channel and source coding with particular focus on
wireless communications.
6
Beam Scheduling for mmWave Relay
Networks
6.1 Introduction
In mmWave communication, one effective way to mitigate the severe path loss
and the sensitivity to blockages, while also increasing the communication range, is
beamforming in combination with relaying. Having studied the beamforming issues in
the previous chapters, this chapter focuses on the beam scheduling problem for mmWave
half-duplex (HD) relay networks. Two practical beam scheduling schemes, i.e., the
deterministic edge coloring (EC) scheduler and the adaptive backpressure (BP) scheduler,
will be presented to stabilize the network within its capacity range while
guaranteeing small queuing backlogs and a small end-to-end delay.
6.2 Clarification of each author’s contributions
This chapter is a journal manuscript written jointly with Yahya H. Ezzeldin,
Giuseppe Caire, and Christina Fragouli. I wrote this journal manuscript as the first
author. The manuscript will be submitted to the journal IEEE TWC shortly.
Currently the manuscript is still being revised by the co-authors. The citation
information is given below:
Xiaoshen Song, Yahya H. Ezzeldin, Giuseppe Caire, Christina Fragouli, “Efficient
Beam Scheduling for Half-Duplex mmWave Relay Networks,” IEEE Transactions on
Wireless Communications, 2020. (to be submitted).
All the authors contributed to this paper. I authored the beam scheduling sections. I
proposed the underlying beam scheduling methods and implemented the simulations for
different beam schedulers. I also wrote the complete first draft (including all sections)
of this paper.
Yahya H. Ezzeldin authored the network capacity section. He implemented the
simulations for the network capacity.
Giuseppe Caire, who is my PhD supervisor, provided valuable input in every
meeting on this work. He will also make the final revisions to the overall draft
together with Christina Fragouli.
6.3 Original journal article
The following article is a reprint of the original journal manuscript. It is the latest
version of our work. The copyright information is given on page xii of this thesis as well
as on the first page of the reprinted paper.
Efficient Beam Scheduling for Half-Duplex
mmWave Relay Networks
Xiaoshen Song†, Student Member, IEEE, Yahya H. Ezzeldin∗, Student Member, IEEE, Giuseppe Caire†,
Fellow, IEEE, Christina Fragouli∗, Fellow, IEEE
© All the authors. Reprinted, with permission, from X. Song, Y. H. Ezzeldin, G. Caire, and C. Fragouli. This paper will be submitted to one of the IEEE
transactions. This reprint is the latest version of the paper.
Abstract—Millimeter wave (mmWave) communication is
expected to play a central role in next generation mobile
systems (5G) and beyond, by providing multi-Gbps data rates.
However, the severe pathloss and sensitivity to blockages
at mmWave frequencies significantly challenge practical
implementations. One effective way to mitigate these effects
and to increase the communication range is beamforming in
combination with relaying. In this paper, we study the beam
scheduling problem for mmWave half-duplex (HD) relay
networks, where the relay topology can be arbitrary. Based
on theoretically optimal schedule results, we first implement
a network simplification procedure to reduce the network
topology complexity, and then propose two practically
relevant beam scheduling schemes: the deterministic edge
coloring (EC) scheduler and the adaptive backpressure (BP)
scheduler. The former consists of a very simple one-time
computation of the sequence of scheduling states, which
is then repeated periodically. The one-time computation
depends on the underlying network topology, and therefore
it must be repeated when such topology changes. As such,
this approach is more suited to quasi-static scenarios. The
latter is an “online” approach which updates scheduling
weights and solves a weighted sum rate maximization
at each time slot. Hence, its computational complexity may be
significantly higher than that of EC, but it is better suited to
dynamic time-varying scenarios. With the aid of computer
simulations, we show that both the proposed schedulers
guarantee network stability within the network capacity.
Particularly, in comparison with a baseline scheme, the
proposed schedulers achieve much smaller queuing backlogs,
much smaller backlog fluctuations, and much lower packet
end-to-end delays.
Index Terms—mmWave, relay network, scheduling,
network stability, end-to-end delay, network capacity
I. INTRODUCTION
Migration towards millimeter wave (mmWave) bands
(30-300 GHz) is considered a key enabler for next
generation (5G) mobile networks and beyond [1–4].
Thanks to the large available bandwidth, a mmWave
transceiver can potentially achieve individual link rates
in tens of Gbps. However, compared with the traditional
sub-6GHz frequencies, mmWave communication has three
main characteristics [4–6]: 1) High free-space isotropic
propagation loss; 2) Highly directional propagation along
†X. Song and G. Caire are with the Electrical Engineering and
Computer Science Department, Technische Universität Berlin, 10587
Berlin, Germany. ∗Y. H. Ezzeldin and C. Fragouli are with the University
of California, Los Angeles, CA 90095, USA.
the line of sight (LoS) and a small number of specular
paths; 3) Vulnerability to obstacles. One effective way to
mitigate these effects is beamforming in combination with
relaying [2], where the former is achieved by utilizing large
antenna arrays at both the transmitter (Tx) and receiver
(Rx) sides and pointing their beams towards each other,
and the latter refers to using intermediate nodes to relay
the source signal to the destination [3].
The beamforming problem in a small cell mmWave
scenario with one base station (BS) and multiple user
equipments (UEs) has been studied in our previous
work [7–9], in which we proposed an efficient initial
beam alignment scheme to find the strongest beam pair
connecting each UE and the BS, such that the consequent
data communication phase can achieve large directivity and
beamforming gain. Under this directional communication,
the multi-user interference among different links is
negligible [3, 10–12], and thus concurrent transmissions
(i.e., spatial reuse) can be fully utilized to improve
the transmission efficiency and to increase the network
capacity.
With the increasing interest in developing small cells for
mmWave communication, how to use relays to increase
the coverage and to support high-rate mmWave wireless
connections for dense small cell deployments remains
a major challenge [3]. The relay network problem at
sub-6 GHz frequencies has been well studied in the past
decades [13–15]. A single source single destination relay
network is a classical information theoretic model [16],
and represents one step towards the understanding and
designing of general multiple multicast networks. In its
own right, there are important situations where one node
wishes to communicate a common message to a set of
other nodes (single multicast relay network), e.g., vehicle
to vehicle (V2V) communication for platooning, where the
head of the platoon sends commands to the other vehicles,
or vehicle to everything (V2X) fast emergency control,
where a road-side base station wishes to send emergency
control messages to all the vehicles in a certain area
[17, 18]. By taking into account the unique characteristics
at mmWave frequencies, the relay nodes in a mmWave
network divide the long link into some short but very
high-rate links to overcome the mmWave sensitivity to
blockages. In such a case, a link is active only if both
nodes focus their beams to face each other, which is
determined by the underlying beam scheduling scheme.
The source and destination cannot communicate with each
other directly because the distance between them is too
large to achieve the required data rate and/or obstacles
between them prevent direct communication.
Consider a general half-duplex (HD) relay network, where
all the nodes are assumed to work in HD mode and thus
cannot simultaneously transmit and receive¹. Although
the optimal beam directions for each node pair can be
obtained through an initial BA phase, how to efficiently
schedule the beams, in terms of avoiding too large queuing
backlogs at the intermediate nodes as well as assuring a
small end-to-end delay, becomes an important concern in
practical network operations [12].
In this paper, we study the beam scheduling problem
for HD mmWave relay networks with arbitrary topology.
Our study will focus on developing practically relevant
scheduling algorithms guided by theoretical results on
the network approximate capacity $C_{\mathrm{cs,iid}}$ and the optimal
scheduling in mmWave network models [19].
A. Related work
While relays on sub-6 GHz bands suffer from severe
interference due to their omnidirectional transmissions,
the directivity of mmWave antennas significantly mitigates
interference [10, 11], especially in backhaul systems [3,
12]. A large body of work has studied the
mmWave relay network regime with an emphasis on one
or several aspects, e.g., relay selection, congestion control,
routing, and scheduling. However, we observe that
the existing works lack a fundamental understanding
of the information theoretic limit of the underlying relay
network model.
The work in [1, 20] studied the relay selection problem,
in which once a direct LoS link is blocked, a relay
selection scheme would be activated to search a best
relay path in terms of the achievable data rate. The work
in [1, 20] can effectively handle an accidental blockage.
However, one should note that mmWave relay network
settings potentially offer far more benefits than only
passively dealing with blockages. The work in [3] and
[21] focused on designing a multi-hop mmWave network
for backhauling, range extension and improved robustness
from path diversity. A main limitation in [3, 21], however,
is that it considers only single path streaming [12], i.e., the
selection of a single relay-path with the highest throughput
for each UE. Although a claim is made to maximize the
network capacity, we observe that a more fundamental
capacity exploitation between the source-destination pair
with possibly multiple relay paths is not taken into
consideration. Actually, the throughput improvement with
multiple relay paths (flows) for a single source-destination
pair has been proved in [22]. The underlying idea is to
¹ We assume each node is equipped with an electronically steerable
antenna array to beamform in the transmitting or receiving directions, so
each node works in the half-duplex (HD) mode.
inject as much traffic demands as possible so as to activate
more concurrent transmission flows. Unfortunately the
authors in [22] have ignored a crucial congestion control
procedure, which may result in large queuing aggregation
and network instability. A recent work in [12] used a
network utility maximization (NUM) framework to study
the operation regime of mmWave relay networks, subject to
an upper delay bound and network stability. As suggested
in that work, one option is to randomly
re-select some paths from the set of all available paths and
then shift among the links with higher payoff (e.g., the
minimum power consumption or the highest throughput).
However, without a prior topology simplification to remove
unnecessary links, the underlying method in [12] is very
likely to split data into too many paths, resulting in
increased signaling overhead and traffic congestion.
In general, the Shannon capacity of an arbitrary HD
mmWave relay network is unknown and is notoriously
hard to study, since for a network with $N$ nodes, each
of which can either transmit or receive, there exist
as many as $2^N$ possible states. The classical network
optimization scheme uses a NUM framework [12, 21,
23–25], which includes a joint congestion control and
routing/scheduling, so as to accept data into the network
to maximize certain utilities and to make scheduling
decisions at each node, such that all accepted data are
delivered to intended destinations without overflowing
any queue in intermediate nodes. However, since the
network capacity is unknown, all the existing algorithms
suffer from the complexity of a multi-parameter tuning
procedure to tackle the fundamental utility-delay tradeoff
[23–25]. Recent progress in information theory [19]
proposed a Gaussian HD 1-2-1 model, which corresponds
to an idealized and simplified information theoretic relay
network. In this model, all the nodes work in HD mode. A
potential link is active only if the transmitter beam and the
receiver beam are pointing at each other. In this way, the
fundamental characteristic of directional transmission and
necessity of two-sided (Tx and Rx) beamforming to “close
the link” (i.e., achieving a sufficient received signal power
after beamforming), are captured by the 1-2-1 model. The
authors in [19] designed an algorithm that computes the
optimal schedule to achieve the approximate capacity in
polynomial time. The approximate capacity is information
theoretically optimal for the Gaussian HD 1-2-1 model
within a gap that depends only on the network size $N$
but not on the topology or the operating signal-to-noise
ratio (SNR). Moreover, this approximate capacity can be
achieved by activating only a subset of all the available
links.
By noticing the great similarities between the Gaussian
HD 1-2-1 model [19] and the HD mmWave relay network
(i.e., very high pathloss and strong directivity), in this
paper, we introduce this information theoretical result
from [19] into the operation regime of HD mmWave
relay networks. This helps us to understand the maximum
II. SYSTEM MODEL
A. Channel model
We consider a general topology for a HD mmWave
$N$-node network denoted by $\mathcal{N}_0$. The network, as shown
in Fig. 1 (c), consists of $N-2$ relays assisting the
communication between a source node (node $1$) and a
destination node (node $N$)². We assume that the network
operates in slotted time, indexed by $t \geq 0$. At any
time slot, each node can point its transmitting/receiving
beam towards at most one other node along the links
corresponding to the edges of the network graph. In
addition, all the relay nodes operate in HD mode, namely,
at any time slot, each relay can be either transmitting to
or receiving from at most one node. Note that the network
graph describes the ensemble of potential links, i.e., the
links that can transmit information provided that the beam
pointing condition is satisfied. This captures the notions
of blocking and distance, i.e., two nodes in the graph are
connected by an edge if they are sufficiently close and there
is no blocking object between them. A potential link
is actually “active” only when the beams of the Tx and Rx
nodes connected by the link are “aligned”. This captures
the fact that even in LoS/proximity condition, isotropic
transmission is not sufficient to achieve the desired SNR
over the link, and that beam alignment is necessary. At
any point in time, the network state is determined by
where the node beams are pointing and whether each node
is transmitting or receiving. We denote the network state
by $s$. We can mathematically model the aforementioned
network operational features by introducing two discrete
set variables $S_{i,t}$ and $S_{i,r}$ for each node $i \in [N]$ in state
$s$. The set variable $S_{i,t}$ (respectively, $S_{i,r}$) indicates the
node towards which node $i$ is pointing its Tx (respectively,
Rx) beam in state $s$. With this, we have
\begin{align}
S_{i,t} &\subseteq [N] \setminus \{1, i\}, \quad |S_{i,t}| \leq 1, \tag{1a} \\
S_{i,r} &\subseteq [N] \setminus \{i, N\}, \quad |S_{i,r}| \leq 1, \tag{1b} \\
|S_{i,t}| &+ |S_{i,r}| \leq 1, \tag{1c}
\end{align}
where $S_{1,r} = S_{N,t} = \emptyset$ since the source node always
transmits and the destination node always receives, and
where (1c) follows from the HD operation, i.e., for any relay
node $i$, if $S_{i,t} \neq \emptyset$, then $S_{i,r} = \emptyset$, and vice versa.
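As an illustration of the constraints in (1), the following minimal sketch checks whether a candidate network state is feasible and lists the links that are active in it. The dictionary representation (`tx[i]` and `rx[i]` hold the single node that node i points its Tx or Rx beam at, or are absent for an empty set) and the function names are assumptions made for this example only; they do not come from the paper.

```python
from typing import Dict, List, Tuple


def is_feasible_state(tx: Dict[int, int], rx: Dict[int, int], n: int) -> bool:
    """Check (1a)-(1c) for one state of an n-node network (nodes 1..n).

    tx[i] / rx[i] is the node that node i points its Tx / Rx beam at.
    A missing key models the empty set, so |S_{i,t}|, |S_{i,r}| <= 1 holds
    by construction of this (hypothetical) representation.
    """
    for i in range(1, n + 1):
        t, r = tx.get(i), rx.get(i)
        # (1a): a Tx beam never points at the source (node 1) or at node i itself.
        if t is not None and (t in (1, i) or not 1 <= t <= n):
            return False
        # (1b): an Rx beam never points at the destination (node n) or at node i itself.
        if r is not None and (r in (i, n) or not 1 <= r <= n):
            return False
        # (1c): half-duplex, so a node never uses both beams in the same state.
        if t is not None and r is not None:
            return False
    # The source never receives and the destination never transmits.
    return rx.get(1) is None and tx.get(n) is None


def active_links(tx: Dict[int, int], rx: Dict[int, int], n: int) -> List[Tuple[int, int]]:
    """Links (i, j) whose Tx beam (at i) and Rx beam (at j) point at each other."""
    return [(i, j) for i in range(1, n + 1) for j in range(1, n + 1)
            if tx.get(i) == j and rx.get(j) == i]
```

For a 4-node network, the state with `tx = {1: 2, 3: 4}` and `rx = {2: 1, 4: 3}` is feasible and activates the links 1→2 and 3→4 concurrently, whereas any state in which a relay appears in both `tx` and `rx` violates (1c).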
We denote by $H_0 \in \mathbb{C}^{N \times N}$ the matrix of complex
channel coefficients between nodes in the network, with
element $H_{0,[j,i]} = h_{j,i}$, $i, j \in [N]$, representing the
complex channel coefficient from node $i$ to node $j$.
Also, since the source node can only transmit and the
destination can only receive, we have $h_{1,i} = h_{j,N} \equiv 0$
for all $i, j \in [N]$. Aside from these restrictions, the
node connections and channel coefficients can be arbitrary.
² For clarity, we focus on a single source-destination pair. However,
the proposed work can be readily extended to multicasting as long as the
(approximate) network capacity and the corresponding optimal scheduling
are known.
Denote the point-to-point link capacities from node $i$ to
node $j$ by the matrix $L_0 \in \mathbb{C}^{N \times N}$, with elements $L_{0,[j,i]} = l_{j,i}$,
$i, j \in [N]$. Suppose that the channel inputs satisfy a unit
average power constraint; hence the link capacity $l_{j,i}$ can
be written as
\begin{equation}
l_{j,i} = \log\left(1 + G \cdot |h_{j,i}|^2\right), \quad \forall i, j \in [N], \tag{2}
\end{equation}
where we assume the additive Gaussian noise at each
node is independent and identically distributed (i.i.d.) as
$\mathcal{CN}(0, 1)$. The factor $G$ indicates the combined BF gain
of the Tx and Rx beams in alignment condition. Following
[27], we refer to the HD mmWave network described above
as a Gaussian HD 1-2-1 network.
Note that the Gaussian capacity in (2) is fully justified
in light of our previous results in [8], where we have
shown that, after beam alignment, the channel
for each link is effectively reduced to a pure delay and Doppler shift
(all other multipath components are suppressed by directional beamforming); hence,
timing and frequency synchronization after beamforming
can be easily implemented. Therefore, the unfaded
Gaussian capacity (2) for the links after beamforming is
a good first-order model.
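The link capacity in (2) is straightforward to evaluate numerically. The sketch below is a hedged illustration: the base of the logarithm is not stated in the text, so base 2 (bits per channel use) is assumed here, and the function names and the nested-list matrix layout are hypothetical choices for this example.

```python
import math
from typing import List


def link_capacity(h: complex, bf_gain: float) -> float:
    """Evaluate (2): l = log(1 + G * |h|^2), with unit noise variance.

    Assumption: the logarithm is taken to base 2.
    """
    return math.log2(1.0 + bf_gain * abs(h) ** 2)


def capacity_matrix(h0: List[List[complex]], bf_gain: float) -> List[List[float]]:
    """Build L_0 from H_0 entry by entry; entry [j][i] refers to the link i -> j."""
    return [[link_capacity(h, bf_gain) for h in row] for row in h0]
```

A zero channel coefficient (no potential link) yields zero capacity, so the sparsity pattern of $H_0$ carries over to $L_0$.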
B. Network capacity results
The Shannon capacity $C$ of the considered Gaussian HD
1-2-1 network is in general unknown. However, the work
in [27] has proved that $C$ can be approximated by $C_{\mathrm{cs,iid}}$
as follows,
\begin{align}
C_{\mathrm{cs,iid}} &\leq C \leq C_{\mathrm{cs,iid}} + \mathrm{GAP}, \tag{3a} \\
C_{\mathrm{cs,iid}} &= \max_{\substack{\lambda_s:\, \lambda_s \geq 0 \\ \sum_s \lambda_s = 1}} \;
\min_{\substack{\bar{A} \subseteq [N-1], \\ A = \bar{A} \cup \{1\}}} \;
\sum_{\substack{(j,i):\, i \in A, \\ j \in A^c}}
\Bigg( \sum_{\substack{s:\, j \in S_{i,t}, \\ i \in S_{j,r}}} \lambda_s \Bigg) l_{j,i}, \tag{3b} \\
\mathrm{GAP} &= O(N \log N), \tag{3c}
\end{align}
where (i) $A$ enumerates all possible cuts in the graph
representing the network topology, the source node $1$
always belongs to $A$, and $A^c = [N] \setminus A$; (ii) $s$ ranges over
all possible network states of the HD 1-2-1 network, with
each network state $s$ corresponding to specific values for
the set variables $S_{i,t}$ and $S_{i,r}$ as defined in (1); (iii) $\{\lambda_s\}$,
i.e., the optimization variables, are the fractions of time
for which each state $s$ is active. We refer to a schedule as
the collection of $\{\lambda_s\}$ for all feasible states, such that
they sum up to at most 1. The expression in (3b) can
be explained as maximizing a graph-theoretical min-cut
over all feasible schedules of the HD 1-2-1
network. For any Gaussian HD 1-2-1 network, $C_{\mathrm{cs,iid}}$ is
the approximate capacity of the network, where there exists
a gap in comparison with the Shannon capacity $C$ and the
gap depends only on the network size $N$ as shown in (3c).
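To make the inner minimization of (3b) concrete, the following sketch evaluates the min-cut for one fixed schedule by brute force. It assumes that the schedule has already been summarized by the time-averaged rate of each link, i.e., the inner sum over states times $l_{j,i}$; the dictionary layout keyed by (transmitter, receiver) and the function name are hypothetical. The enumeration is exponential in $N$ and is only meant for toy examples, not as the method used in the paper.

```python
from itertools import combinations
from typing import Dict, Tuple


def min_cut_value(avg_rate: Dict[Tuple[int, int], float], n: int) -> float:
    """Brute-force min-cut of (3b) for a fixed schedule.

    avg_rate[(i, j)] is (fraction of time link i -> j is active) * l_{j,i}.
    Cuts A always contain the source (node 1) and never the destination (node n).
    """
    relays = list(range(2, n))
    best = float("inf")
    for size in range(len(relays) + 1):
        for subset in combinations(relays, size):
            cut_set = {1, *subset}
            # Sum the average rates of all links crossing from A to its complement.
            crossing = sum(rate for (i, j), rate in avg_rate.items()
                           if i in cut_set and j not in cut_set)
            best = min(best, crossing)
    return best


# Toy diamond network (hypothetical numbers): every link has capacity 2 and is
# active half of the time, e.g. states {1->2, 3->4} and {1->3, 2->4}.
diamond = {(1, 2): 1.0, (3, 4): 1.0, (1, 3): 1.0, (2, 4): 1.0}
```

In this toy diamond network every cut is crossed by exactly two links of average rate 1, so the schedule supports a min-cut value of 2.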
In [19], it was shown that $C_{\mathrm{cs,iid}}$ in (3b) can be efficiently
computed by solving an equivalent linear program (LP),
where the state activation times $\{\lambda_s\}$ are replaced by link
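activation times $\{\lambda_{j,i}\}$. Let $\bar{\Lambda} \in \mathbb{C}^{N \times N}$ be the average link
activation time fraction matrix with elements $\bar{\Lambda}_{[j,i]} = \lambda_{j,i}$.
Then, it follows that
\begin{align}
C_{\mathrm{cs,iid}} = \max \; & \sum_{j=1}^{N} F_{j,1}, \tag{4a} \\
\text{s.t.} \quad & 0 \leq F_{j,i} \leq \lambda_{j,i} l_{j,i}, \quad i, j \in [N], \tag{4b} \\
& \sum_{j \in [N]} F_{j,i} = \sum_{k \in [N]} F_{i,k}, \quad i \in [N-1] \setminus \{1\}, \tag{4c} \\
& \lambda_{j,i} \geq 0, \quad i, j \in [N], \tag{4d} \\
& \hat{\lambda}_{i,j} = \lambda_{j,i} + \lambda_{i,j}, \quad i \in [N-1],\; j \in [N] \setminus [i], \tag{4e} \\
& \hat{\lambda}_{i,j} \geq 0, \quad i \in [N-1],\; j \in [N] \setminus [i], \tag{4f} \\
& \sum_{\substack{(i,j):\, i = k \text{ or } j = k \\ i < j}} \hat{\lambda}_{i,j} \leq 1, \quad k \in [N], \tag{4g} \\
& \sum_{\substack{i \in \mathcal{S},\, j \in \mathcal{S} \\ i < j}} \hat{\lambda}_{i,j} \leq \frac{|\mathcal{S}| - 1}{2}, \quad \mathcal{S} \subseteq [N],\; |\mathcal{S}| \text{ is odd}, \tag{4h}
\end{align}
where $F_{j,i}$ represents the data flow through the link of
capacity $l_{j,i}$ and $\lambda_{j,i}$ represents the fraction of time in
which the link from node $i$ to node $j$ is active. Note that
although the relay links satisfy reciprocity with $l_{j,i} = l_{i,j}$,
$i, j \in [N-1] \setminus \{1\}$, the corresponding link activation times
$\lambda_{i,j}$ and $\lambda_{j,i}$ are not necessarily equal.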
Remark 1: Although the LP in (4) has an exponential
number of constraints, it has been shown in [19] that
using the ellipsoid method, the optimal solution of (4) can
be found in time polynomial in $N$. The approach relies
on constructing a polynomial-time separation oracle for
the ellipsoid method for the HD 1-2-1 network using the
concept of Gomory-Hu trees [28]. We refer to [19] for more
comprehensive details. Throughout this work, we will use
the approximate capacity $C_{\mathrm{cs,iid}}$ in (4) as a prior to bound the
network capacity. ♦
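The constraints (4g) and (4h) can be checked directly for a small candidate set of link activation times. The sketch below does this by enumerating all odd-sized node subsets, which is exponential in $N$; it is a toy verifier under stated assumptions, not the polynomial-time separation oracle of [19]. The dictionary layout keyed by (transmitter, receiver), the tolerance, and the function name are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def satisfies_4g_4h(lam: Dict[Tuple[int, int], float], n: int, tol: float = 1e-9) -> bool:
    """Check (4g) and (4h) for link activation fractions lam[(i, j)] (link i -> j)."""

    def lam_hat(i: int, j: int) -> float:
        # (4e): total time the node pair {i, j} is busy, in either direction.
        return lam.get((i, j), 0.0) + lam.get((j, i), 0.0)

    # (4g): every node is engaged in at most one link at any time.
    for k in range(1, n + 1):
        if sum(lam_hat(k, j) for j in range(1, n + 1) if j != k) > 1.0 + tol:
            return False

    # (4h): inside any odd-sized node set S, at most (|S| - 1) / 2 links
    # can be active simultaneously.
    for size in range(3, n + 1, 2):
        for subset in combinations(range(1, n + 1), size):
            busy = sum(lam_hat(i, j) for i, j in combinations(subset, 2))
            if busy > (size - 1) / 2 + tol:
                return False
    return True
```

For instance, activating each of the three links of a triangle for half of the time satisfies (4g) at every node but violates (4h) for the node set {1, 2, 3}, since only one of the three links can be active at a time under HD operation with single beams.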
C. Network stability and end-to-end delay
All the exogenous arrivals first enter the transport layer
at the source node, and this data is held in storage reservoirs
to await acceptance to the network layer. The resulting
source admission rate is determined by a congestion control
mechanism. Assume that the transport layer reservoir at
the source node $1$ is infinitely backlogged. We denote
by $x_1(t)$ the source admission rate at slot $t$. We say
that a network is stable for an average admission rate
$\bar{x}_1 = \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} x_1(t)$ if there exists a scheduling
strategy such that the average backlog of all queues is
finite. A well-known result [23] is that the network can
be stabilized for any $\bar{x}_1 < C$, where $C$ is the Shannon capacity
of the network. Considering a first-in-first-out (FIFO) system,
we assume that only the packets currently in node
$i$ at the beginning of slot $t$ can be transmitted during
that slot. Let $D_i(t)$ and $A_i(t)$ be the transmitting and
arriving processes at node $i$, respectively. The arriving
process $A_i(t)$ is composed of random exogenous arrivals
as well as endogenous arrivals resulting from routing and
transmission decisions from other nodes of the network.
We assume that the $A_i(t)$ arrivals occur at the end of
each slot $t$, so that they cannot be transmitted during that
slot. Accordingly, the slot-to-slot dynamics of the queuing
backlog $U_i(t)$ satisfy
\begin{equation}
U_i(t+1) = \max\{U_i(t) - D_i(t),\, 0\} + A_i(t). \tag{5}
\end{equation}
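The recursion (5) can be traced slot by slot. The following minimal sketch applies it to given departure and arrival sequences; the function names and the example sequences are illustrative assumptions.

```python
from typing import List, Sequence


def next_backlog(u: float, d: float, a: float) -> float:
    """One step of (5): serve up to d packets first, then add the a new arrivals."""
    return max(u - d, 0) + a


def simulate_queue(departures: Sequence[float], arrivals: Sequence[float],
                   u0: float = 0) -> List[float]:
    """Return the backlog trajectory [U(1), U(2), ...] starting from U(1) = u0."""
    backlog = [u0]
    for d, a in zip(departures, arrivals):
        backlog.append(next_backlog(backlog[-1], d, a))
    return backlog


print(simulate_queue([0, 2, 2, 1], [3, 1, 0, 2]))  # prints [0, 3, 2, 0, 2]
```

Because arrivals are added after the `max`, packets arriving in slot t are never served in the same slot, which matches the end-of-slot arrival assumption above.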
To evaluate the network stability under a certain
scheduling scheme, we define the network average sum
backlog given by
\begin{align}
\bar{U} &= \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} \sum_{i=1}^{N-1} U_i(t) \nonumber \\
&= \sum_{i=1}^{N-1} \left\{ \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} U_i(t) \right\}
= \sum_{i=1}^{N-1} \bar{U}_i, \tag{6}
\end{align}
where $\bar{U}_i = \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} U_i(t)$ denotes the time average
backlog in the queue of node $i$. Here we have implicitly
ignored the destination node since the backlog at the
destination node, $U_N(t)$, is always zero.
Note that if the average admission rate at the source node
$\bar{x}_1$ exceeds the network capacity, the network would surely
become unstable regardless of the underlying scheduling
scheme. However, within the network capacity region, a
superior beam scheduling scheme should achieve a smaller
average backlog as defined in (6). By Little’s
theorem [29], a small average backlog also indicates a
small end-to-end delay. Here the end-to-end delay refers
to the time taken for a packet to be transmitted across
the network from the source node $1$ to the destination
node $N$. The end-to-end delay comes from several
sources including transmission delay, propagation delay,
processing delay and queuing delay. We assume that the
slot duration is long enough such that the aforementioned
transmission, propagation and processing times are included
within each slot. Moreover, the slot duration remains
constant regardless of the coding and scheduling policies.
Accordingly, the most time-consuming part is the queuing
delay [30]. By Little’s theorem [29], the average queuing
delay $\bar{\omega}$ that a packet spends in the network satisfies
$\bar{\omega} = \bar{U} / \bar{x}_1$. In the later simulation section, we will evaluate
the network stability as well as the packet end-to-end delay
performance with respect to (w.r.t.) different scheduling
schemes.
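As a small worked example of (6) and of Little’s theorem, the sketch below computes a finite-horizon estimate of the average sum backlog from recorded queue trajectories and converts it to an average queuing delay. Replacing the limit $T \to \infty$ by a finite number of slots is an approximation made for this example, and the function names and numbers are hypothetical.

```python
from typing import List, Sequence


def average_sum_backlog(backlogs: Sequence[Sequence[float]]) -> float:
    """Finite-horizon estimate of (6).

    backlogs[i] holds U_i(1), ..., U_i(T) for one non-destination node;
    all trajectories are assumed to have the same length T.
    """
    horizon = len(backlogs[0])
    return sum(sum(trajectory) for trajectory in backlogs) / horizon


def average_queuing_delay(avg_backlog: float, avg_admission_rate: float) -> float:
    """Little's theorem: delay (in slots) = average backlog / average admission rate."""
    return avg_backlog / avg_admission_rate


trajectories: List[List[float]] = [[2, 4, 3, 3], [1, 1, 2, 0]]  # two queues, T = 4
u_bar = average_sum_backlog(trajectories)                        # (12 + 4) / 4 = 4.0
print(average_queuing_delay(u_bar, avg_admission_rate=2.0))      # prints 2.0
```

With an average of 4 packets stored in the network and 2 packets admitted per slot, a packet spends 2 slots in the queues on average.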
D. The NUM framework for joint congestion control and
scheduling
When the exogenous arrival rates are outside the network
capacity region, the network cannot be stabilized without
a congestion control mechanism to limit the amount
of data that is admitted. The classical NUM (network
utility maximization) framework controls the admission
Algorithm 1: The network utility maximization (NUM) framework for joint congestion control and scheduling
Initialization:
Choose $V > 0$ and $x_{\max} > 0$ as constant parameters. Initialize the queue backlog at the beginning of time slot $t = 1$ as
$U_i(1) = 0$, $\forall i \in [N]$.
Iteration:
In each time slot $t \geq 1$, repeat the following three steps.
1. Scheduling: At the beginning of each slot $t$, define the differential backlog weight matrix $W(t)$, with elements
$W(t)_{[j,i]} = \max\{U_i(t) - U_j(t),\, 0\}$ for all $j, i \in [N]$. Then choose the scheduling decision matrix $\Lambda(t) \in \mathbb{C}^{N \times N}$ and the
link rate allocation matrix $R(t) \in \mathbb{C}^{N \times N}$ as the solution to the following optimization problem
\begin{align}
\Lambda(t), R(t) = \arg\max \; & \sum_{j=1}^{N} \sum_{i=1}^{N} \big( W(t) \Lambda(t) R(t) \big)_{[j,i]} \tag{7a} \\
\text{s.t.} \quad & R(t)_{[j,i]} \leq l_{j,i}, \quad \forall i, j \in [N], \tag{7b} \\
& \Lambda(t) \in \mathcal{I}, \tag{7c}
\end{align}
where $l_{j,i}$ is the link capacity defined in (2), and $\mathcal{I}$ consists of all feasible link activation sets, i.e., all sets of links that can be
simultaneously activated.
2. Congestion control: For the source node $i = 1$, calculate the admission rate $x_1(t)$ as the solution to the following
optimization problem
\begin{align}
x_1(t) = \arg\max \; & V \cdot g_1(x_1(t)) - x_1(t) \cdot U_1(t) \tag{8a} \\
\text{s.t.} \quad & x_1(t) \in [0, x_{\max}], \tag{8b}
\end{align}
where the utility function $g_1(\cdot)$ is assumed to be non-decreasing and concave, and $x_{\max}$ is a large constant.
3. Queuing update: For each node $i \in [N-1]$, update the queue backlogs for the next time slot as
\begin{equation}
U_i(t+1) = U_i(t) - \sum_{j \in O(i)} R(t)_{[j,i]} + \sum_{j \in I(i)} R(t)_{[i,j]} + x_1(t) \cdot 1_{\{i = 1\}}, \tag{9}
\end{equation}
where $O(i)$ and $I(i)$ represent the sets of outgoing links and incoming links of node $i$, respectively, and $1_{\{\cdot\}}$ is an indicator
function that takes the value $1$ if the underlying condition is true and $0$ otherwise.
congestion via an optimization of the utility function
$g_1(x_1(t))$, which represents the “satisfaction” received by
sending the commodity data from source node $1$ to the
destination node $N$ at an admission rate of $x_1(t)$. The
network is then stabilized by applying the backpressure
algorithm at each time slot $t$ [12, 23, 25, 31]. Define $R(t) \in
\mathbb{C}^{N \times N}$ and $\Lambda(t) \in \mathbb{C}^{N \times N}$ as the link rate allocation and
the scheduling decision matrices at time slot $t$, respectively.
The scheduling decision matrix has elements $\Lambda(t)_{[j,i]} = 1$
if link $(i, j)$ is activated, and $0$ otherwise. We summarize the
conceptual NUM framework in Algorithm 1. As discussed
before, in most of the literature it is not clear how to
tune the algorithm parameters $V$ and $x_{\max}$, which often
requires an empirical trial-and-error procedure. In contrast,
by knowing $C_{\mathrm{cs,iid}}$ in our proposed scheme, we can
dispense with the parameters $V$ and $x_{\max}$ and set the source
admission rate to a simple constant. Moreover, we exploit
the insight on the underlying optimization problem to perform
network simplification so as to significantly reduce the
network topology/scheduling complexity. In what follows
we will present our scheduling schemes in more detail.
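Two steps of the NUM framework in Algorithm 1 admit a compact illustration. The sketch below builds the differential backlog weights used in the scheduling step and solves the congestion control problem (8) in closed form for one particular utility. The choice $g_1(x) = \ln(1 + x)$ is an assumption made only for this example (the text merely requires a non-decreasing concave utility); for it, setting the derivative $V/(1+x) - U_1(t)$ to zero gives $x = V/U_1(t) - 1$, which is then clipped to $[0, x_{\max}]$. Nodes are indexed from 0 in the code, and all names are hypothetical.

```python
from typing import List, Sequence


def differential_backlog_weights(u: Sequence[float]) -> List[List[float]]:
    """Weights of the scheduling step: w[j][i] = max(U_i - U_j, 0) for link i -> j.

    u[k] is the current backlog of node k (0-based indexing in this sketch).
    """
    n = len(u)
    return [[max(u[i] - u[j], 0) for i in range(n)] for j in range(n)]


def admission_rate(u1: float, v: float, x_max: float) -> float:
    """Maximizer of V * ln(1 + x) - x * U_1 over [0, x_max] (assumed log utility)."""
    if u1 <= 0:
        # The objective is increasing in x when the source queue is empty.
        return x_max
    # Unconstrained stationary point V / U_1 - 1, clipped to the feasible interval.
    return min(max(v / u1 - 1.0, 0.0), x_max)


weights = differential_backlog_weights([5, 2, 0])
print(weights[1][0], weights[0][1])                 # prints 3 0
print(admission_rate(u1=5.0, v=10.0, x_max=4.0))    # prints 1.0
```

The example shows the role of $V$ and $x_{\max}$ discussed above: a larger $V$ admits more traffic for the same source backlog, which is the tuning that the proposed scheme avoids by fixing the admission rate from the known approximate capacity.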
III. PROPOSED BEAM SCHEDULING METHODS
In this section we first introduce a pre-processing procedure
to simplify the network topology, on top of which two
beam scheduling schemes are provided: the deterministic
edge coloring (EC) scheduler and the adaptive backpressure
(BP) scheduler.
A. The prior network topology simplification
The topology of the original network $\mathcal{N}_0$ hinges on the
link capacity matrix $L_0$. Namely, a link connection $(i, j)$
exists only if $L_{0,[j,i]} = l_{j,i} > 0$. However, based on the
link activation times $\bar{\Lambda}_{[j,i]} = \lambda_{j,i}$ calculated in (4),
some links need not be activated at all in
order to achieve the network approximate capacity $C_{\mathrm{cs,iid}}$. Aiming
at reducing the scheduling complexity, we define a new
associated $N$-node network $\mathcal{N}$ with link capacity matrix $L$ given
by
\begin{equation}
L_{[j,i]} =
\begin{cases}
l_{j,i}, & \lambda_{j,i} > 0 \\
0, & \text{otherwise}.
\end{cases} \tag{10}
\end{equation}
The new network $\mathcal{N}$ is a simplified version of the original
network $\mathcal{N}_0$ and contains only the links that are necessary
to use w.r.t. the network approximate capacity $C_{\mathrm{cs,iid}}$. We
consider a running example as shown in Fig. 2 (a). Without
loss of generality, we assume that the link capacities $l_{j,i}$
are in units of packets per slot (packet/slot). The link
B. The deterministic edge coloring (EC) beam scheduler
The EC scheduler leverages the similarities between
network states in HD and edge coloring in a graph [32, 33].
In particular, an edge coloring assigns colors to edges in
a graph such that no two adjacent edges are colored with
the same color. Similarly, in HD operation a network node cannot
be a receiver and a transmitter simultaneously. Consider
the same running example as illustrated in Fig. 2 with link
capacity matrix $L$ (after a prior network simplification) and
its associated link activation time matrix $\bar{\Lambda}$, as defined in
(12). Let $M$ be a common multiple of the denominators
in $\bar{\Lambda}$. We construct an associated multigraph $\mathcal{N}_1$ w.r.t. the
network $\mathcal{N}$, as illustrated in Fig. 2 (c), where the set of
nodes is the same as in $\mathcal{N}$ and each link $(i, j)$ with capacity
$L_{[j,i]} > 0$ is replaced by $n_{j,i}$ parallel edges, given by
\begin{equation}
n_{j,i} =
\begin{cases}
M \cdot \bar{\Lambda}_{[j,i]}, & L_{[j,i]} > 0 \\
0, & \text{otherwise}.
\end{cases} \tag{15}
\end{equation}
It is not difficult to see that $n_{j,i} \in \mathbb{Z}$, $\forall i, j \in [N]$. It follows
that the maximum node degree $\Delta$ of the graph $\mathcal{N}_1$ can be
written as
\begin{equation}
\Delta = \max_{i,j,k \in [N]} \{ n_{j,k} + n_{k,i} \}. \tag{16}
\end{equation}
In the running example we have $M = 12$, $\Delta = 12$. The
values of $n_{j,i}$ are labeled beside each edge in Fig. 2 (c).
Our proposed EC scheduler is applied on $\mathcal{N}$ and $\mathcal{N}_1$
consecutively, and consists of two procedures, namely, the
Path Partitioning (PP) and the Alternating Coloring (AC)
procedures described below.
1) Path Partitioning (PP): The PP procedure is based
on network $\mathcal{N}$ and partitions the network links
into independent paths, such that each link in $\mathcal{N}$ appears
in exactly one path. The main motivation of the PP procedure
is to provide a logical order for the subsequent coloring
procedure. Also, each path resulting from the PP procedure
corresponds to a simple line network with a single flow
direction that logically aligns with the overall data flow
from the source node $1$ to the destination node $N$. This
enables us to implement a subsequent alternating coloring
(AC) procedure (as provided in the next paragraph) to
reduce the packet delay [34]. The PP procedure can be
applied as follows: Choose a node in $\mathcal{N}$ with no incoming
edges. Traverse an edge from that node to another, and
erase that edge. Continue traversing and erasing edges until
a node with no outgoing edge is reached. This gives a
path in the partition. Then choose a new start node and
repeat the process. Do this until no possible start node
remains. We summarize the PP procedure in step 1) of
Algorithm 2. Define $\mathcal{P}$ as the set of disjoint paths from the
PP procedure and let $P$ be the number of such paths. For the
running example we have $P = 3$, and the paths in $\mathcal{P}$ are
illustrated in Fig. 2(d).
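The traversal described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function name `path_partition`, the representation of a link as a directed `(tail, head)` pair, and the rule of always taking the first listed outgoing link are assumptions made here. The input is the set of six links of the simplified second example network given in (26), read with link $(i, j)$ pointing from node $i$ to node $j$. The sketch also assumes that the simplified network has no directed cycle, and raises an error otherwise.

```python
def path_partition(links):
    """Split directed links (tail, head) into edge-disjoint source-to-sink paths."""
    remaining = list(links)              # links not yet assigned to a path
    paths = []
    while remaining:
        heads = {h for _, h in remaining}
        # candidate start nodes: have an outgoing link but no incoming link
        starts = [t for t, _ in remaining if t not in heads]
        if not starts:
            raise ValueError("remaining links contain a directed cycle")
        node, path = starts[0], []
        while True:
            outgoing = [e for e in remaining if e[0] == node]
            if not outgoing:             # reached a node with no outgoing link
                break
            edge = outgoing[0]           # assumption: take the first listed link
            remaining.remove(edge)       # erase the traversed link
            path.append(edge)
            node = edge[1]
        paths.append(path)
    return paths


# Links of the simplified Exp2 network, read off the nonzero entries of (26).
links = [(1, 2), (1, 3), (2, 4), (3, 7), (4, 6), (6, 7)]
paths = path_partition(links)
for p in paths:
    print(p)
```

With this input the sketch returns two paths, 1-2-4-6-7 and 1-3-7, and every link appears in exactly one of them. A different tie-breaking rule for the outgoing link could give a different but equally valid partition.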
2) Alternating Coloring (AC): For each path in $\mathcal{P}$,
replace each link in the path with its corresponding parallel
edges in $\mathcal{N}_1$, whose number is defined in (15). We color
the edges in an alternating manner, such that any data
packets entering the network $\mathcal{N}$ are transmitted
towards the destination node as soon as possible. More
precisely, for each path, extract one non-colored edge for
each link, if one exists. Start from the first non-colored
edge. Consecutive edges in the path are alternately colored
with the smallest legal color. Continue this extracting
and alternately coloring process until no non-colored edge
remains. We summarize the AC procedure in step 2) of
Algorithm 2. The color assignments for the running
example are shown in Fig. 2(d). Note that, as illustrated
in Fig. 2(d), the coloring is done greedily. The first path
$p_1$ is colored before moving on to color the second path
$p_2$, and so on. When coloring the edge in $p_2$ for the first
hop, the smallest legal color is #2 since color #1 is already
used for an edge connected to node 1 in $p_1$.
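The greedy coloring rule can be illustrated with a short Python sketch. It is a minimal reading of the description above, not the authors' code: the name `alternating_coloring`, the data layout, and the interpretation of a "legal" color as one not yet used on any edge incident to either endpoint of the link are assumptions. The input paths and the numbers of parallel edges are those of the second example network, obtained from (25) with $M = 18$.

```python
from itertools import count


def alternating_coloring(paths, n_edges):
    """Greedily color the parallel edges of each path, hop by hop."""
    used = {}                                   # node -> colors on its incident edges
    colors = {link: [] for path in paths for link in path}
    for path in paths:                          # paths are colored one after another
        left = {link: n_edges[link] for link in path}
        while any(left.values()):               # some parallel edge is still uncolored
            for link in path:                   # walk the hops from source to sink
                if left[link] == 0:
                    continue
                u, v = link
                taken = used.setdefault(u, set()) | used.setdefault(v, set())
                c = next(k for k in count(1) if k not in taken)  # smallest legal color
                used[u].add(c)
                used[v].add(c)
                colors[link].append(c)
                left[link] -= 1
    return colors


# Exp2: paths from the PP step and n_{j,i} = M * lambda_{j,i} with M = 18.
paths = [[(1, 2), (2, 4), (4, 6), (6, 7)], [(1, 3), (3, 7)]]
n_edges = {(1, 2): 9, (1, 3): 9, (2, 4): 7, (4, 6): 9, (3, 7): 9, (6, 7): 9}
colors = alternating_coloring(paths, n_edges)
K = max(c for cs in colors.values() for c in cs)
print("colors used:", K)
```

On this input the sketch ends with $K = 19$ colors, one more than $\Delta = 18$, which agrees with the values reported for this example in the caption of Fig. 3. Whether the traversal order used here matches the authors' implementation in every detail is not confirmed by the text.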
Define $K$ as the total number of unique colors used
in Algorithm 2, and $\mathcal{E}_{j,i}$ as the set of colors assigned to
link $(i, j)$. Once the two procedures in Algorithm 2 have
finished, the subsequent beam scheduling reduces to
a simple deterministic repetition among the $K$ states, where
each state indexed by $k \in [K]$ corresponds to activating
the links associated with the $k$-th color. In particular, for each
time slot $t \geq 1$, the scheduling decision is given by
$$\mathbf{\Lambda}(t)_{[j,i]} = \mathbb{1}\{\hat{t} \in \mathcal{E}_{j,i}\}, \quad i, j \in [N], \quad \hat{t} = ((t-1) \bmod K) + 1, \tag{17}$$
where $\mathbf{\Lambda}(t)_{[j,i]} = 1$ indicates that link $(i, j)$ is activated at
slot $t$, and it is idle otherwise. The actual transmission rate
for link $(i, j)$ at slot $t$ is given by
$$\mathbf{R}(t)_{[j,i]} = \min\{\mathbf{L}(t)_{[j,i]} \cdot \mathbf{\Lambda}(t)_{[j,i]},\, U_i(t)\}. \tag{18}$$
Accordingly, the slot-to-slot queuing evolution follows (9).
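The periodic rule (17) and the rate (18) amount to a table lookup followed by a minimum. The Python sketch below illustrates this on a made-up three-node line network. The function names, the dictionary layout and the toy color sets are hypothetical and chosen only for illustration.

```python
def ec_active_links(color_sets, K, t):
    """Links activated in slot t >= 1 under the periodic rule (17)."""
    t_hat = (t - 1) % K + 1                      # position inside the K-slot period
    return {link for link, cols in color_sets.items() if t_hat in cols}


def ec_rate(capacity, is_active, backlog_at_tx):
    """Transmission rate (18): limited by the link capacity and the sender's queue."""
    return min(capacity * (1 if is_active else 0), backlog_at_tx)


# Toy example (hypothetical): link (1, 2) holds colors {1, 3}, link (2, 3) holds {2}.
color_sets = {(1, 2): {1, 3}, (2, 3): {2}}
K = 3
schedule = [ec_active_links(color_sets, K, t) for t in range(1, 5)]
print(schedule)            # the pattern repeats with period K
print(ec_rate(capacity=7, is_active=True, backlog_at_tx=4))
```

Slot 4 reuses the decision of slot 1, which is the deterministic repetition described above. The rate of an active link is capped by the sender's backlog, so an active link with capacity 7 and only 4 queued packets sends 4 packets.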
Lemma 1: The proposed EC scheduler can achieve at
least $\frac{1}{2}$ of the network approximate capacity, i.e.,
$$\frac{1}{2} C_{\mathrm{cs,iid}} < C_{\max} \leq C_{\mathrm{cs,iid}}, \tag{19}$$
where $C_{\max}$ denotes the maximum achievable data rate
under the EC scheduler.
Proof. With a total number of $K$ colors used in
the EC scheduler, we have $C_{\max} = \frac{\Delta}{K} C_{\mathrm{cs,iid}}$, where
$K \geq \Delta$ since no two incident edges have the same color.
Accordingly, we have proved the upper bound in (19) with
$C_{\max} \leq C_{\mathrm{cs,iid}}$. On the other hand, the number of colors
$K$ w.r.t. the network graph $\mathcal{N}_1$ satisfies $K \leq 2\Delta - 1$. The
proof is straightforward, since for any given edge, there
are at most $\Delta - 1$ colored edges incident to each of its
endpoints; thus, even if all $2\Delta - 2$ edges have different
colors, there is still a single usable color. Accordingly, the
proposed EC scheduler is guaranteed to achieve at least $\frac{1}{2}$
of the approximate capacity $C_{\mathrm{cs,iid}}$, with $C_{\max} > \frac{1}{2} C_{\mathrm{cs,iid}}$.
This is a strong performance guarantee, given the low
complexity and simplicity of the scheduling and the fact that
it yields very low latency (as shown later). Actually, a
Algorithm 2: The two procedures for the edge coloring (EC) beam scheduler
1) Procedure path partitioning (PP)
Initialization: Make $\mathcal{P}$ an empty list; Make $V$ a set that contains all the nodes in the network $\mathcal{N}$;
while $V$ is nonempty do
    let $v$ be the first node in $V$ that has no incoming links;
    delete $v$ from $V$;
    if node $v$ has nonzero outgoing links then
        make a new empty path $\rho$;
        $\hat{v} := v$;
        while node $\hat{v}$ has nonzero outgoing links do
            let $(w, \hat{v})$ be an outgoing link of $\hat{v}$;
            delete $(w, \hat{v})$ from $\mathcal{N}$;
            put $(w, \hat{v})$ in $\rho$;
            $\hat{v} := w$;
        end
        put path $\rho$ in $\mathcal{P}$;
    end
    if node $v$ has nonzero degree then
        put $v$ in $V$
    end
end
2) Procedure alternating coloring (AC)
Initialization: Define $P$ as the number of paths in $\mathcal{P}$; Define $\bar{v}_p$, $p \in [P]$, as the number of nodes in the $p$-th path; Define
$\mathcal{E}_{j,i}$ as the set of colors assigned to link $(i, j)$, where all $\mathcal{E}_{j,i}$ are initialized as empty; Replace each link in the $p$-th path with parallel
edges as defined in (15);
for each path $p$ in $\mathcal{P}$ do
    while there still exists a non-colored edge in the $p$-th path do
        for $k$ in $[\bar{v}_p - 1]$ do
            assign the smallest legal color $e \in \mathbb{Z}^+$ to one of the non-colored edges in the $k$-th hop of path $p$;
            put $e$ in the corresponding set $\mathcal{E}_{j,i}$;
        end
    end
end
classical upper bound [35] on coloring the multigraph $\mathcal{N}_1$
states that an optimal coloring scheme (one that uses the
minimum number of colors possible) uses at most $\Delta + \mu$
colors, where $\mu$ is the multiplicity of graph $\mathcal{N}_1$, i.e., the
maximum number of edges in any bundle of parallel edges.
Although not theoretically proven, we have observed in our
simulations that the number of colors $K$ used by our EC
scheduler satisfies $K \leq \Delta + \mu$. Hence, in most
cases, the EC scheduler can guarantee much more than
$\frac{1}{2} C_{\mathrm{cs,iid}}$.
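The bounds discussed in this subsection can be written out as one chain of inequalities. The following worked derivation uses only quantities defined above. The second line rests on the empirical observation $K \leq \Delta + \mu$, which the text states is not proven, together with the fact that $\mu \leq \Delta$ because a bundle of parallel edges is incident to a single node. The numerical line uses the values $\Delta = 18$, $K = 19$ and $C_{\mathrm{cs,iid}} = 7$ packets/slot reported for the second example network in Section IV.

```latex
\begin{align*}
  K \le 2\Delta - 1
    &\;\Longrightarrow\;
    C_{\max} = \frac{\Delta}{K}\, C_{\mathrm{cs,iid}}
      \;\ge\; \frac{\Delta}{2\Delta - 1}\, C_{\mathrm{cs,iid}}
      \;>\; \frac{1}{2}\, C_{\mathrm{cs,iid}}, \\
  K \le \Delta + \mu
    &\;\Longrightarrow\;
    C_{\max} \;\ge\; \frac{\Delta}{\Delta + \mu}\, C_{\mathrm{cs,iid}}
      \;\ge\; \frac{1}{2}\, C_{\mathrm{cs,iid}}
      \qquad (\mu \le \Delta), \\
  \Delta = 18,\; K = 19
    &\;\Longrightarrow\;
    C_{\max} = \frac{18}{19} \cdot 7 \approx 6.63 \ \text{packets/slot}.
\end{align*}
```

The smaller the multiplicity $\mu$ is relative to $\Delta$, the closer the second bound comes to the full approximate capacity.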
C. The adaptive backpressure (BP) beam scheduler
The EC scheduler described in the previous subsection
is rather simple, since once the $K$ color states are obtained,
the network scheduling becomes deterministic, namely,
the scheduler just needs to periodically repeat the $K$
states defined by (17). However, since the EC scheduler
is computed once from the network link capacities
$l_{j,i}$, it is best suited to quasi-static
scenarios and needs to be recomputed whenever
significant changes in the network topology or the potential
link capacities occur. As an alternative approach for
time-varying scenarios, we consider “online” dynamic
scheduling policies that are guaranteed to achieve stability
for all $x_1(t) \leq C_{\mathrm{cs,iid}}$. In particular, we consider the
well-known BP algorithm [23], which is known to
stabilize the network whenever the source admission rate
lies within the capacity region of the network.
Define the differential backlog weight matrix $\mathbf{W}(t) \in \mathbb{C}^{N \times N}$
with elements given by
$$\mathbf{W}(t)_{[j,i]} = \begin{cases} \max\{U_i(t) - U_j(t),\, 0\}, & \lambda_{j,i} > 0 \\ 0, & \text{otherwise,} \end{cases} \tag{20}$$
where, as mentioned before, we have intentionally ignored
all the links that would never be activated ($\lambda_{j,i} = 0$)
in terms of achieving the network approximate capacity
$C_{\mathrm{cs,iid}}$. As a consequence, the scheduler only needs to deal
with a much smaller set of links, which can significantly
reduce the scheduling complexity. Similarly, define the
candidate transmit rate matrix $\hat{\mathbf{R}}(t) \in \mathbb{C}^{N \times N}$ with
elements given by
$$\hat{\mathbf{R}}(t)_{[j,i]} = \begin{cases} \min\{U_i(t),\, \mathbf{L}_{[j,i]}\}, & \lambda_{j,i} > 0 \\ 0, & \text{otherwise.} \end{cases} \tag{21}$$
Then choose the scheduling matrix $\mathbf{\Lambda}(t)$ at slot $t$ as the
solution to the following binary integer program (BIP)
optimization problem
$$\mathbf{\Lambda}(t) = \arg\max \sum_{j=1}^{N} \sum_{i=1}^{N} \mathbf{W}(t)_{[j,i]}\, \hat{\mathbf{R}}(t)_{[j,i]}\, \mathbf{\Lambda}(t)_{[j,i]} \tag{22a}$$
$$\text{s.t.} \quad \mathbf{\Lambda}(t)_{[j,i]} \in \{0, 1\}, \tag{22b}$$
$$\|\mathbf{\Lambda}(t)_{[:,i]}\|_1 + \|\mathbf{\Lambda}(t)_{[i,:]}\|_1 \leq 1, \quad i \in [N], \tag{22c}$$
where (22b) denotes the binary activation constraint, and
(22c) indicates the HD 1-2-1 network operating constraint,
i.e., a node can at most receive from (or transmit to) one
node and cannot do both simultaneously. Accordingly, the
actual link transmission rate due to (22) is given by
$$\mathbf{R}(t)_{[j,i]} = \hat{\mathbf{R}}(t)_{[j,i]} \cdot \mathbf{\Lambda}(t)_{[j,i]}, \tag{23}$$
and the slot-to-slot queuing evolution follows the procedure
of (9).
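For a small network, the per-slot problem (22) can be solved by plain enumeration, which makes the roles of (20), (21) and (22c) concrete. The Python sketch below is only an illustration under stated assumptions: the function name `bp_schedule`, the dictionary layout and the backlog values are hypothetical, and the exhaustive search stands in for a proper maximum-weight matching solver. The link capacities are those of the simplified second example network in (26), with link $(i, j)$ read as pointing from $i$ to $j$.

```python
def bp_schedule(backlog, capacity):
    """Exhaustively solve (22): pick node-disjoint links maximizing sum of W * R_hat."""
    cand = []
    for (i, j), cap in capacity.items():         # only links kept by the simplification
        weight = max(backlog[i] - backlog[j], 0)  # differential backlog, (20)
        rate = min(backlog[i], cap)               # candidate rate, (21)
        if weight * rate > 0:
            cand.append((weight * rate, (i, j)))

    def best(idx, busy):
        if idx == len(cand):
            return 0, []
        value, chosen = best(idx + 1, busy)       # option 1: leave this link idle
        gain, (i, j) = cand[idx]
        if i not in busy and j not in busy:       # (22c): each node in at most one link
            sub_value, sub_chosen = best(idx + 1, busy | {i, j})
            if gain + sub_value > value:
                value, chosen = gain + sub_value, [(i, j)] + sub_chosen
        return value, chosen

    return best(0, frozenset())


# Capacities from (26); the backlog values are made up for illustration.
capacity = {(1, 2): 7, (1, 3): 7, (2, 4): 9, (4, 6): 7, (3, 7): 7, (6, 7): 7}
backlog = {1: 10, 2: 4, 3: 6, 4: 1, 6: 5, 7: 0}
value, active = bp_schedule(backlog, capacity)
print(value, active)
```

For these backlogs the search activates links $(1, 2)$ and $(3, 7)$ with objective value $42 + 36 = 78$. Link $(4, 6)$ never competes because its differential backlog is zero. The enumeration grows exponentially with the number of candidate links, so it is suitable only for toy instances.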
It is well known that BP is able to stabilize the network
whenever the source admission rate lies within the capacity region
of the network [23]. In the following section, we compare
the performance of our two proposed algorithms with a
“standard” baseline scheme, and show how applying the
EC and BP schedulers on top of the simplified network $\mathcal{N}$
can significantly reduce the scheduling complexity and
thus yield smaller queuing backlogs as well as
much smaller packet end-to-end delays.
Remark 3: Note that although (22) is an integer
linear program, the convex hull of its feasible points
can be represented by a set of linear inequalities using
Edmonds' matching polytope [36]. Although the matching polytope
has an exponential number of constraints, a linear program
over it can be solved in polynomial time using the ellipsoid
method [19]. ♦
IV. NUMERICAL RESULTS
In this section, we investigate the numerical performance
of the proposed EC and BP schedulers, and compare
them with a “standard” baseline scheme. We start off by
presenting our simulation scenarios and then discuss our
baseline comparison before delving into the simulation
results.
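Simulated Examples. We consider two running
examples (two random network topologies) and denote the
two examples by Exp1 and Exp2, respectively. The first
running example Exp1 is the same network $\mathcal{N}_0$ used in the
previous subsections, with total number of nodes $N = 7$
and the link capacity matrix $\mathbf{L}_0$ given by (11). By solving
(4) and (10) for Exp1, the network approximate capacity is
$C_{\mathrm{cs,iid}} = 15$ packets/slot. The link activation time fraction
matrix $\bar{\mathbf{\Lambda}}$ and the link capacity matrix $\mathbf{L}$ for the simplified
network $\mathcal{N}$ are shown in (12). The second running example
Exp2 again has $N = 7$ nodes. The link capacity matrix for
Exp2 is given by
$$\mathbf{L}_0 = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
7 & 0 & 6 & 9 & 8 & 9 & 0 \\
7 & 6 & 0 & 7 & 6 & 6 & 0 \\
9 & 9 & 7 & 0 & 6 & 7 & 0 \\
9 & 8 & 6 & 6 & 0 & 6 & 0 \\
8 & 9 & 6 & 7 & 6 & 0 & 0 \\
0 & 6 & 7 & 6 & 6 & 7 & 0
\end{bmatrix}. \tag{24}$$
Following the approach in (4) and (10), the link activation
time fraction matrix $\bar{\mathbf{\Lambda}}$ and the link capacity matrix $\mathbf{L}$ of the simplified
network $\mathcal{N}$ satisfy
$$\bar{\mathbf{\Lambda}} = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\
\frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \frac{7}{18} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \frac{1}{2} & 0 & 0 & 0 \\
0 & 0 & \frac{1}{2} & 0 & 0 & \frac{1}{2} & 0
\end{bmatrix} \tag{25}$$
and
$$\mathbf{L} = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
7 & 0 & 0 & 0 & 0 & 0 & 0 \\
7 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 9 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 7 & 0 & 0 & 0 \\
0 & 0 & 7 & 0 & 0 & 7 & 0
\end{bmatrix}, \tag{26}$$
respectively. The network approximate capacity reads
$C_{\mathrm{cs,iid}} = 7$ packets/slot. Note that unless otherwise stated,
we assume that the simulated network is static.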
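The step from (24) and (25) to (26), and on to the associated multigraph, can be checked numerically. The Python sketch below applies the simplification rule (10) and the edge-count rule (15) to the Exp2 matrices. The helper names are hypothetical, $M$ is taken as the least common multiple of the denominators (the text only requires a common multiple), and the node degree is computed as the total number of parallel edges incident to a node, which is the usual multigraph notion and an assumption about how (16) is meant.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

L0 = [
    [0, 0, 0, 0, 0, 0, 0],
    [7, 0, 6, 9, 8, 9, 0],
    [7, 6, 0, 7, 6, 6, 0],
    [9, 9, 7, 0, 6, 7, 0],
    [9, 8, 6, 6, 0, 6, 0],
    [8, 9, 6, 7, 6, 0, 0],
    [0, 6, 7, 6, 6, 7, 0],
]
# Nonzero entries of the activation matrix (25), keyed by 1-based (row j, column i).
LAM = {(2, 1): Fraction(1, 2), (3, 1): Fraction(1, 2), (4, 2): Fraction(7, 18),
       (6, 4): Fraction(1, 2), (7, 3): Fraction(1, 2), (7, 6): Fraction(1, 2)}


def simplify(L0, lam):
    """Rule (10): keep a capacity only where the activation time is positive."""
    n = len(L0)
    return [[L0[j][i] if lam.get((j + 1, i + 1), 0) > 0 else 0 for i in range(n)]
            for j in range(n)]


def parallel_edges(lam):
    """Rule (15): n_{j,i} = M * lambda_{j,i}, with M the lcm of the denominators."""
    M = reduce(lambda a, b: a * b // gcd(a, b), (f.denominator for f in lam.values()), 1)
    return M, {ji: int(M * f) for ji, f in lam.items()}


def max_degree(n_edges):
    """Largest number of parallel edges meeting at a single node."""
    deg = {}
    for (j, i), n in n_edges.items():
        deg[j] = deg.get(j, 0) + n
        deg[i] = deg.get(i, 0) + n
    return max(deg.values())


L = simplify(L0, LAM)
M, n_edges = parallel_edges(LAM)
delta = max_degree(n_edges)
print(M, delta, n_edges)
```

The simplified matrix reproduces (26), and the sketch gives $M = 18$ and $\Delta = 18$, consistent with the value of $\Delta$ quoted for Exp2 in the caption of Fig. 3.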
Baseline scheme. As a comparison, we also consider
a baseline backpressure scheme from [23] and denote it
by BPo. The baseline scheme BPo has been commonly
used in the literature [12, 21, 31]. It uses the same
NUM framework as in Algorithm 1, and the underlying
scheduling is also based on the concept of backpressure.
In contrast with our proposed beam schedulers, BPo
does not exploit the knowledge of the network approximate
capacity and the resulting network simplification. As a
result, the congestion control in BPo requires a complex
multi-parameter $(V, x_{\max})$ tuning procedure so as to tackle
the fundamental utility-delay tradeoff as illustrated in (8).
Moreover, the scheduling procedure in BPo is directly
implemented on the original network topology $\mathcal{N}_0$, which
consequently incurs a larger scheduling complexity.
In what follows, we provide numerical results that: 1)
evaluate the performance of our proposed schemes; 2)
compare the performance of our proposed schemes with
the aforementioned baseline scheme. We also separately
provide simulation results for time-varying scenarios,
where the network encounters accidental blockages.
A. Performance evaluation of the proposed schemes
We consider three performance metrics, i.e., the
network stability, the packet end-to-end delay and the
queuing evolution in time-varying scenarios with accidental
blockages.
In terms of network stability, Fig. 3 (a) illustrates the
numerical performance for the example network Exp1.
Here the network approximate capacity is $C_{\mathrm{cs,iid}} = 15$
packets/slot, shown by the vertical dotted line. As we
can see from Fig. 3 (a), within the network capacity
region $x_1(t) \leq C_{\mathrm{cs,iid}}$, both the EC scheduler and the
BP scheduler can guarantee a finite average backlog $\bar{U}$,
i.e., can effectively stabilize the network. In particular,
[Fig. 3 plot, panels (a) and (b). Horizontal axis: admission rate $x_1(t)$ (packet/slot); vertical axis: average backlog $\bar{U}$ (packets); curves: EC and BP; vertical dotted line: $C_{\mathrm{cs,iid}}$.]
Fig. 3: The average backlog $\bar{U}$ with respect to different source admission rates $x_1(t)$. (a) Evaluation on the running example Exp1,
where the network approximate capacity reads $C_{\mathrm{cs,iid}} = 15$ packets/slot, the maximum node degree of the corresponding associated
multigraph is $\Delta = 12$, and the total number of unique colors used in the EC scheduler reads $K = 12 = \Delta$. (b) Evaluation on the
running example Exp2, with $C_{\mathrm{cs,iid}} = 7$ packets/slot, $\Delta = 18$ and $K = 19 > \Delta$.
since the total number of unique colors used in the
EC scheduler equals the maximum node degree of the
corresponding associated multigraph $\mathcal{N}_1$, with $K = \Delta = 12$,
the maximum achievable data rate under the EC scheduler
reaches exactly the maximum network capacity point with
$C_{\max} = \frac{\Delta}{K} C_{\mathrm{cs,iid}} = C_{\mathrm{cs,iid}}$. The stability evaluation w.r.t.
the example network Exp2 is shown in Fig. 3 (b), where the
network approximate capacity is $C_{\mathrm{cs,iid}} = 7$ packets/slot.
In this particular example, the total number of unique
colors used in the EC scheduler is slightly greater than
the maximum node degree of the corresponding associated
multigraph $\mathcal{N}_1$, with $K = 19$ colors and maximum
degree $\Delta = 18$. As a result, the maximum achievable data rate for
Exp2 using the EC scheduler is $C_{\max} = \frac{\Delta}{K} C_{\mathrm{cs,iid}} < C_{\mathrm{cs,iid}}$.
Apart from this, the numerical performance in Exp2 is
similar to that in Exp1. As shown in Fig. 3 (b), within
the corresponding capacity ranges, i.e., $x_1(t) \leq C_{\max}$
for the EC scheduler and $x_1(t) \leq C_{\mathrm{cs,iid}}$ for the BP
scheduler, both schedulers can efficiently stabilize
the network with finite average backlog $\bar{U}$. With the same
source admission rate $x_1(t) \leq C_{\max}$, the average backlog $\bar{U}$
with the BP scheduler is slightly smaller than that with the
EC scheduler.
Note that, although the BP scheduler shows a slight
benefit over the EC scheduler as seen in
Fig. 3 (a)-(b), this should be weighed against its operational
complexity. The BP scheduler must solve the weighted
sum rate maximization (22) at each time slot, while the
EC scheduler requires only a one-time computation followed by
periodic state repetition.
In terms of packet end-to-end delay, Fig. 4 (a)-(b)
illustrates the numerical performance w.r.t. Exp1. Here
the end-to-end delay indicates how long the packets are
delayed in the queues during the transmission from the
source node to the destination node. The cumulative
distribution function (CDF) of the packet delay in Fig. 4
indicates the probability that the packet end-to-end delay
is smaller than the specified delay. The packet delay
distribution for example Exp1 under the EC scheduler
is shown in Fig. 4 (a), where the source admission rate
is set to $x_1(t) = 12$ packets/slot. As we can see, with
probability $1$ the end-to-end delay of each individual data
packet is smaller than $13$ slots. By increasing the source
admission rate from $x_1(t) = 12$ packets/slot to $x_1(t) = 15$
packets/slot, the maximum end-to-end delay increases to
$15$ slots. The BP scheduler achieves similar performance,
as shown in Fig. 4 (b). As we can see, with probability
$1$ the packet end-to-end delay is smaller than $3$ slots for
source admission rate $x_1(t) = 12$ packets/slot. This maximum
delay shifts to $13$ slots for source admission rate $x_1(t) = 15$
packets/slot.
The delay performance for Exp2 is similar, as shown
in Fig. 4 (c)-(d). Namely, with probability $1$ the packet
end-to-end delay is smaller than $10$ slots under the EC
scheduler with $x_1(t) = 4$ packets/slot, $14$ slots under the EC
scheduler with $x_1(t) = 6$ packets/slot, $4$ slots under the BP
scheduler with $x_1(t) = 4$ packets/slot, and finally $5$ slots
under the BP scheduler with $x_1(t) = 7$ packets/slot. In short,
for either of the two proposed schedulers,
increasing the source admission rate $x_1(t)$ shifts the packet delay
CDF curve further to the right, namely, the
packets experience longer delays.
The queuing evolution regarding the instantaneous (i.e.,
not time-averaged) sum backlog $\sum_{i=1}^{N-1} U_i(t)$ w.r.t. Exp1
is shown in Fig. 5 (a) and (b) for the static and the
time-varying scenarios, respectively. Here we assume that
in the static scenario, the link capacities are constant
with no accidental blockages; however, in the time-varying
[Fig. 4 plot, panels (a)-(d). Horizontal axis: end-to-end delay (slots); vertical axis: CDF. Curves: (a) EC with $x_1(t) = 12$ and $x_1(t) = 15$; (b) BP with $x_1(t) = 12$ and $x_1(t) = 15$; (c) EC with $x_1(t) = 4$ and $x_1(t) = 6$; (d) BP with $x_1(t) = 4$ and $x_1(t) = 7$.]
Fig. 4: The packet end-to-end delay distribution under the proposed EC and BP schedulers. (a) The delay distribution in Exp1 under
the EC scheduler, with admission rate $x_1(t) = 12$ and $x_1(t) = 15$, respectively. (b) The delay distribution in Exp1 under the BP
scheduler, with admission rate $x_1(t) = 12$ and $x_1(t) = 15$, respectively. (c) The delay distribution in Exp2 under the EC scheduler,
with admission rate $x_1(t) = 4$ and $x_1(t) = 6$, respectively. (d) The delay distribution in Exp2 under the BP scheduler, with admission
rate $x_1(t) = 4$ and $x_1(t) = 7$, respectively. In all cases, the maximum delay increases with the source admission
rate $x_1(t)$.
scenario, link $(7,6)$ will be blocked every $T_0 = 200$
slots and each time the blocking will last for $80$ slots.
We assume that the network state, the computation of the
network approximate capacity and the overall scheduling
decisions are updated every $T_{\mathrm{EC}} = 50$ slots for the EC
scheduler and every $T_{\mathrm{BP}} = 1$ slot for the BP scheduler,
respectively. As we can see from Fig. 5 (a), in the static
scenario, the sum backlog and its fluctuations under the
EC scheduler are slightly larger than those under the BP
scheduler. In the time-varying scenario, however, the sum
backlog and its fluctuations under the EC scheduler are
much larger than those under the BP scheduler, as illustrated
in Fig. 5 (b). We can observe a similar performance in the
Exp2 network, as illustrated in Fig. 5 (c) and (d) for the
static and the time-varying scenarios, respectively. Here
we assume that in the static scenario, the link capacities
are constant with no accidental blockages, while in the
time-varying scenario, link $(1,0)$ will be blocked every
$T_0 = 200$ slots and each time the blocking will last for
$40$ slots. The scheduler updates are the same as in Exp1.
As we can see, the performance difference between the
two proposed schedulers in the static scenario is very
moderate. However, in the time-varying scenario, the BP
scheduler again outperforms the EC scheduler in terms of the
amount of queuing backlog and its fluctuations. Therefore,
we claim that the EC scheduler is more suitable for static
scenarios, with much less computation and a slightly larger
sum backlog than that under the BP scheduler. In contrast,
the BP scheduler is updated in every time slot and
reacts very fast to blockages, and is thus more favorable for
time-varying scenarios.
[Fig. 5 plot, panels (a)-(d). Horizontal axis: iterations (slots); vertical axis: sum backlog (packets). Curves: EC and BP with $x_1(t) = 13$ in (a) and (b); EC and BP with $x_1(t) = 6$ in (c) and (d).]
Fig. 5: The instantaneous sum backlog $\sum_{i=1}^{N-1} U_i(t)$ w.r.t. increasing iterations (slots): (a) Exp1 in static scenario. (b) Exp1 in
time-varying scenario with accidental blockages. (c) Exp2 in static scenario. (d) Exp2 in time-varying scenario with accidental
blockages.
B. The performance comparison with the baseline scheme
Fig. 6 (a) compares the instantaneous sum backlog
$\sum_{i=1}^{N-1} U_i(t)$ between the proposed schemes (EC, BP) and
the baseline scheme (BPo) w.r.t. the running example Exp1.
For the baseline scheme BPo, we choose the sum-rate
utility as follows: since there is only one commodity,
in the NUM framework in Algorithm 1 we have
$g_1(x_1(t)) = x_1(t)$. In order to approach
the network capacity (which calls for a large value of $x_{\max}$) on
the one hand, and to handle the $(O(V), O(1/V))$ utility-delay
tradeoff on the other hand, we choose three sets of parameters for
the baseline BPo scheme with $(V, x_{\max}) = (200, 200)$,
$(V, x_{\max}) = (200, 50)$ and $(V, x_{\max}) = (50, 200)$,
respectively. For our proposed schemes EC and BP, since
we have managed to compute the network approximate
capacity $C_{\mathrm{cs,iid}}$ as shown in (4), the congestion control
reduces to a simple constant threshold given by (14). Hence
we do not need a complex multi-parameter
$(V, x_{\max})$ tuning procedure. We pick the point with the
maximum achievable data rate (source admission rate) for
the proposed schemes, i.e., $x_1(t) = 15$ packets/slot for both
the EC and BP schedulers. As we can see from Fig. 6 (a),
all the underlying schemes can stabilize the network since
they all converge to finite backlogs. The baseline scheme
BPo can approximately approach $C_{\mathrm{cs,iid}}$ in a long-term
average sense, as indicated by $\bar{x}_1$. It is worth noting that the
fluctuation ranges of the instantaneous sum backlog converge
to $[96, 105]$ packets under the EC scheduler and $[93, 98]$
packets under the BP scheduler, respectively. However, these
fluctuations increase to the ranges of $[326, 529]$, $[306, 404]$,
and $[118, 346]$ packets under the baseline BPo scheme
with $(V, x_{\max}) = (200, 200)$, $(V, x_{\max}) = (200, 50)$, and
$(V, x_{\max}) = (50, 200)$, respectively. Hence, the proposed
schemes achieve much smaller backlogs and much smaller
backlog fluctuations compared with the baseline scheme.
As for the packet end-to-end delay, Fig. 6 (b) illustrates
[Fig. 6 plot, panels (a) and (b). (a) Horizontal axis: iterations (slots); vertical axis: sum backlog (packets). (b) Horizontal axis: end-to-end delay (slots); vertical axis: CDF. Curves in both panels: EC with $x_1(t) = 15$; BP with $x_1(t) = 15$; BPo #1 with $\bar{x}_1 = 14.8$; BPo #2 with $\bar{x}_1 = 14.7$; BPo #3 with $\bar{x}_1 = 14.5$.]
Fig. 6: The performance comparison between the proposed schemes (EC, BP) and the baseline scheme (BPo) w.r.t. the first running
example Exp1. (a) The instantaneous sum backlog $\sum_{i=1}^{N-1} U_i(t)$ w.r.t. increasing iterations (slots). (b) The packet end-to-end delay
distribution. The multi-parameter sets in the BPo scheme are $(V, x_{\max}) = (200, 200)$, $(V, x_{\max}) = (200, 50)$ and $(V, x_{\max}) =
(50, 200)$ for #1, #2 and #3, respectively.
the delay distributions w.r.t. different schemes in the
running example Exp1. As we can see, when the source
admission rate is set to $x_1(t) = 15$ packets/slot, with
probability $1$ the end-to-end delay of each individual data
packet is smaller than $15$ slots under the EC scheduler and
smaller than $13$ slots under the BP scheduler, respectively.
However, all the curves w.r.t. the BPo scheme are significantly
shifted to the right. Namely, the maximum packet
end-to-end delays under the BPo scheme with different
parameter sets are much larger than those under the proposed
schemes ($> 68$ slots).
As illustrated in Fig. 7, the numerical results in the
running example Exp2 show a performance similar to
that in Exp1. Again we choose three sets of parameters
for the baseline scheme BPo with $(V, x_{\max}) = (40, 50)$,
$(V, x_{\max}) = (40, 20)$ and $(V, x_{\max}) = (10, 50)$,
respectively. For the proposed schemes, we pick the
point with the maximum achievable data rate (source
admission rate), i.e., $x_1(t) = 6$ packets/slot for the EC
scheduler and $x_1(t) = 7$ packets/slot for the BP scheduler,
respectively. As we can see from Fig. 7 (a), the fluctuation
ranges of the instantaneous sum backlog converge to $[26, 36]$
packets under the EC scheduler, $[21, 21]$ packets under
the BP scheduler, $[224, 326]$ packets under BPo with
$(V, x_{\max}) = (40, 50)$, $[226, 279]$ packets under BPo
with $(V, x_{\max}) = (40, 20)$, and $[50, 149]$ packets under
BPo with $(V, x_{\max}) = (10, 50)$. Hence, the proposed
schemes achieve much smaller backlogs and much smaller
backlog fluctuations compared with the baseline scheme.
The packet end-to-end delay distribution is illustrated in
Fig. 7 (b). As we can see, the maximum end-to-end delays
under the proposed schemes are $14$ slots (EC, $x_1(t) = 6$)
and $5$ slots (BP, $x_1(t) = 7$), respectively. However, the
packets under the baseline BPo scheme with different
parameter sets experience much longer delays (35 slots).
V. CONCLUSION
In this paper, we studied the beam scheduling problem
for HD mmWave relay networks with arbitrary topology.
Our study focused on developing practically relevant
scheduling algorithms guided by theoretical results on the
approximate capacity $C_{\mathrm{cs,iid}}$ and optimal scheduling in
mmWave network models [19]. Based on the theoretically
optimal schedule results, we first implemented a network
simplification procedure to reduce the network topology
complexity. Using this simplified topology, we then
proposed two practical and very simple beam scheduling
schemes: the deterministic edge coloring (EC) scheduler
and the adaptive backpressure (BP) scheduler. The former
is a very simple one-time computation followed by a
periodic repetitive schedule, and hence is more suitable for
quasi-static scenarios. The latter is an “online” approach
that updates in every time slot, and thus is more
favorable for time-varying scenarios. We have shown
through simulation that both proposed schedulers can
guarantee network stability within a certain operating
range of the input rate. In particular, the EC scheduler
guarantees stability for input rates less than $\frac{\Delta}{K} C_{\mathrm{cs,iid}}$,
where $\Delta$ and $K$ denote the maximum degree and the
number of colors used in EC for an associated multigraph,
respectively; the BP scheduler guarantees stability for rates
less than $C_{\mathrm{cs,iid}}$. Moreover, in comparison with a standard
baseline scheme, which consists of applying classical
BP-based NUM over the whole network (without network
simplification), the proposed schedulers do not require the
[Fig. 7 plot, panels (a) and (b). (a) Horizontal axis: iterations (slots); vertical axis: sum backlog (packets). (b) Horizontal axis: end-to-end delay (slots); vertical axis: CDF. Curves in both panels: EC with $x_1(t) = 6$; BP with $x_1(t) = 7$; BPo #1 with $\bar{x}_1 = 6.8$; BPo #2 with $\bar{x}_1 = 6.7$; BPo #3 with $\bar{x}_1 = 6.7$.]
Fig. 7: The performance comparison between the proposed schemes (EC, BP) and the baseline scheme (BPo) w.r.t. the second running
example Exp2. (a) The instantaneous sum backlog $\sum_{i=1}^{N-1} U_i(t)$ w.r.t. increasing iterations (slots). (b) The packet end-to-end delay
distribution. The multi-parameter sets in the BPo scheme are $(V, x_{\max}) = (40, 50)$, $(V, x_{\max}) = (40, 20)$ and $(V, x_{\max}) = (10, 50)$
for #1, #2 and #3, respectively.
empirical tuning of the BP control parameters and achieve
much smaller queuing backlogs and packet end-to-end
delays.
REFERENCES
[1] Y. Niu, W. Ding, H. Wu, Y. Li, X. Chen, B. Ai, and Z. Zhong,
“Relay-Assisted and QoS Aware Scheduling to Overcome Blockage
in mmWave Backhaul Networks,” IEEE Transactions on Vehicular
Technology, vol. 68, no. 2, pp. 1733–1744, 2019.
[2] A. Dimas, D. S. Kalogerias, and A. P. Petropulu, “Cooperative
beamforming with predictive relay selection for urban mmWave
communications,” IEEE Access, vol. 7, pp. 157 057–157 071, 2019.
[3] Y. Yan, Q. Hu, and D. M. Blough, “Path Selection with Amplify
and Forward Relays in mmWave Backhaul Networks,” in 2018
IEEE 29th Annual International Symposium on Personal, Indoor
and Mobile Radio Communications (PIMRC), Sep. 2018, pp. 1–6.
[4] M. Shafi, J. Zhang, H. Tataria, A. F. Molisch, S. Sun, T. S.
Rappaport, F. Tufvesson, S. Wu, and K. Kitao, “Microwave vs.
Millimeter-Wave Propagation Channels: Key Differences and Impact
on 5G Cellular Systems,” IEEE Communications Magazine, vol. 56,
no. 12, pp. 14–20, 2018.
[5] A. F. Molisch, V. V. Ratnam, S. Han, Z. Li, S. L. H. Nguyen,
L. Li, and K. Haneda, “Hybrid beamforming for massive MIMO:
A survey,” IEEE Communications Magazine, vol. 55, no. 9, pp.
134–141, 2017.
[6] R. W. Heath, N. Gonzalez-Prelcic, S. Rangan, W. Roh, and A. M.
Sayeed, “An overview of signal processing techniques for millimeter
wave MIMO systems,” IEEE journal of selected topics in signal
processing, vol. 10, no. 3, pp. 436–453, 2016.
[7] X. Song, S. Haghighatshoar, and G. Caire, “A scalable and
statistically robust beam alignment technique for mm-Wave
systems,” IEEE Trans. on Wireless Comm., vol. PP, pp. 1–1, 2018.
[8] X. Song, S. Haghighatshoar, and G. Caire, “Efficient Beam
Alignment for Millimeter Wave Single-Carrier Systems With
Hybrid MIMO Transceivers,” IEEE Transactions on Wireless
Communications, vol. 18, no. 3, pp. 1518–1533, 2019.
[9] X. Song, T. Kühne, and G. Caire, “Fully-/Partially-Connected
Hybrid Beamforming Architectures for mmWave MU-MIMO,”
IEEE Transactions on Wireless Communications, vol. 19, no. 3, pp.
1754–1769, 2020.
[10] Y. Xu, H. Shokri-Ghadikolaei, and C. Fischione, “Distributed
Association and Relaying With Fairness in Millimeter Wave
Networks,” IEEE Transactions on Wireless Communications,
vol. 15, no. 12, pp. 7955–7970, 2016.
[11] Y. Xu, G. Athanasiou, C. Fischione, and L. Tassiulas,
“Distributed Association Control and Relaying in Millimeter
Wave Wireless Networks,” in 2016 IEEE International Conference
on Communications (ICC), 2016, pp. 1–6.
[12] T. K. Vu, M. Bennis, M. Debbah, and M. Latva-Aho,
“Joint Path Selection and Rate Allocation Framework for 5G
Self-Backhauled mm-wave Networks,” IEEE Transactions on
Wireless Communications, vol. 18, no. 4, pp. 2431–2445, 2019.
[13] J. Chang and Y. Chen, “A cluster-based relay station deployment
scheme for multi-hop relay networks,” Journal of Communications
and Networks, vol. 17, no. 1, pp. 84–92, 2015.
[14] Y. Wei, Y. Hou, L. Li, and M. Song, “Energy efficient topology
control for multi-hop relay cellular networks based on flow
management,” Journal of Communications and Networks, vol. 19,
no. 6, pp. 618–626, 2017.
[15] X. Song, R. Zhang, J. Pan, and J. Liu, “A statistical geometric
approach for capacity analysis in two-hop relay communications,”
in 2013 IEEE Global Communications Conference (GLOBECOM),
2013, pp. 4823–4829.
[16] T. Cover and A. E. Gamal, “Capacity theorems for the relay
channel,” IEEE Transactions on Information Theory, vol. 25, no. 5,
pp. 572–584, 1979.
[17] W. Yi, Y. Liu, Y. Deng, A. Nallanathan, and R. W. Heath, “Modeling
and analysis of mmwave v2x networks with vehicular platoon
systems,” IEEE Journal on Selected Areas in Communications,
vol. 37, no. 12, pp. 2851–2866, 2019.
[18] S. Lien, Y. Kuo, D. Deng, H. Tsai, A. Vinel, and A. Benslimane,
“Latency-optimal mmwave radio access for v2x supporting next
generation driving use cases,” IEEE Access, vol. 7, pp. 6782–6795,
2019.
[19] Y. H. Ezzeldin, M. Cardone, C. Fragouli, and G. Caire,
“Polynomial-time Capacity Calculation and Scheduling for
Half-Duplex 1-2-1 Networks,” in 2019 IEEE International
Symposium on Information Theory (ISIT), 2019, pp. 460–464.
[20] R. Abdel-Raouf, H. Esmaiel, and O. A. Omer, “Fuzzy logic
based relay selection for mmWave communications,” in 2019 9th
Annual Information Technology, Electromechanical Engineering
and Microelectronics Conference (IEMECON), March 2019, pp.
263–267.
[21] J. García-Rois, F. Gómez-Cuba, M. R. Akdeniz, F. J.
González-Castaño, J. C. Burguillo, S. Rangan, and B. Lorenzo, “On
the analysis of scheduling in dynamic duplex multihop mmwave
cellular systems,” IEEE Transactions on Wireless Communications,
vol. 14, no. 11, pp. 6028–6042, 2015.
[22] B. P. S. Sahoo, C. Yao, and H. Wei, “Millimeter-Wave Multi-Hop
Wireless Backhauling for 5G Cellular Networks,” in 2017 IEEE 85th
Vehicular Technology Conference (VTC Spring), 2017, pp. 1–5.
[23] L. Georgiadis, M. J. Neely, and L. Tassiulas, “Resource allocation
and cross-layer control in wireless networks,” Foundations and
Trends in Networking, vol. 1, no. 1, pp. 1–144, 2006.
[24] S. Wang and N. Shroff, “Towards fast-convergence, low-delay and
low-complexity network optimization,” Proceedings of the ACM on
Measurement and Analysis of Computing Systems, vol. 1, no. 2,
p. 34, 2017.
[25] H. Yu and M. J. Neely, “A new backpressure algorithm for joint rate
control and routing with vanishing utility optimality gaps and finite
queue lengths,” IEEE/ACM Transactions on Networking, vol. 26,
no. 4, pp. 1605–1618, 2018.
[26] X. Song and G. Caire, “Queue-Aware Beam Scheduling for
Half-Duplex mmWave Relay Networks,” in 2020 IEEE International
Symposium on Information Theory (ISIT), 2020, pp. 1611–1616.
[27] Y. H. Ezzeldin, M. Cardone, C. Fragouli, and G. Caire, “Gaussian
1-2-1 networks: Capacity results for mmWave communications,” in
2018 IEEE International Symposium on Information Theory (ISIT),
June 2018, pp. 2569–2573.
[28] R. E. Gomory and T. C. Hu, “Multi-Terminal Network Flows,”
Journal of the Society for Industrial and Applied Mathematics,
vol. 9, no. 4, pp. 551–570, 1961.
[29] J. D. Little, “A proof for the queuing formula: L = λW,” Operations
Research, vol. 9, no. 3, pp. 383–387, 1961.
[30] D. Park, “A throughput-optimal scheduling policy for wireless relay
networks,” in 2010 IEEE Wireless Communication and Networking
Conference, April 2010, pp. 1–5.
[31] J. Wang, L. He, and J. Song, “Stochastic Optimization Based
Dynamic User Scheduling and Hybrid Precoding for Broadband
MmWave MIMO,” in ICC 2019 - 2019 IEEE International
Conference on Communications (ICC), 2019, pp. 1–6.
[32] H. N. Gabow, “Using Euler Partitions to Edge Color Bipartite
Multigraphs,” International Journal of Computer & Information
Sciences, vol. 5, no. 4, pp. 345–355, 1976.
[33] C. Sinnamon, “Fast and Simple Edge-Coloring Algorithms,”
preprint arXiv:1907.03201, 2019.
[34] X. Song and G. Caire, “Queue-Aware Beam Scheduling for
Half-Duplex mmWave Relay Networks,” in 2020 IEEE International
Symposium on Information Theory (ISIT), 2020, pp. 1611–1616.
[35] V. G. Vizing, “On an estimate of the chromatic class of a p-graph,”
Discret Analiz, vol. 3, pp. 25–30, 1964.
[36] J. Edmonds, “Maximum matching and a polyhedron with
0,1-vertices,” Journal of Research of the National Bureau of Standards
B, vol. 69, pp. 125–130, 1965.
7
Conclusions
7.1 Summary of this thesis
In the past decades, tremendous funding and research effort have been dedicated to
the investigation of millimeter wave (mmWave) wireless communication, since the use of
mmWave frequencies can alleviate the spectrum shortage in current sub-6 GHz cellular
communication systems and offer unprecedented multi-Gbps data rates for each mobile device
in the next generation (5G) mobile communication systems. This thesis has proposed several
enabling schemes to address the challenges in mmWave communication, including
initial access, data communication and relay networking.
For the initial access, we proposed two efficient beam alignment (BA) schemes, one for
mmWave OFDM (orthogonal frequency division multiplexing) systems and one for mmWave SC
(single-carrier) systems. The proposed schemes are based on quadratic
channel measurements and the non-negative least squares (NNLS) technique from
compressed sensing (CS). These schemes operate under much more realistic conditions
than existing schemes in the literature, scale well to multi-user scenarios,
and are robust to fast channel variations caused by Doppler spread.
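To make the NNLS step concrete, the following is a minimal sketch rather than the implementation used in this thesis. It assumes that the quadratic measurements can be written as q ≈ B g, where B is a known non-negative sensing matrix and g is the unknown non-negative vector of beam-domain channel strengths. The names B, q, g and both function names are illustrative assumptions, and the plain projected-gradient iteration merely stands in for whichever NNLS solver is actually used.

```python
def nnls_projected_gradient(B, q, iters=500):
    """Approximately solve min_g ||B g - q||^2 subject to g >= 0.

    B is a list of rows (m x n) and q a list of m measurements. Both names
    are illustrative assumptions, not notation taken from the thesis.
    """
    m, n = len(B), len(B[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of
    # B^T B, so 1 / ||B||_F^2 is a safe (conservative) step size.
    step = 1.0 / sum(b * b for row in B for b in row)
    g = [0.0] * n
    for _ in range(iters):
        residual = [sum(B[i][j] * g[j] for j in range(n)) - q[i] for i in range(m)]
        gradient = [sum(B[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by projection onto the non-negative orthant.
        g = [max(0.0, g[j] - step * gradient[j]) for j in range(n)]
    return g


def strongest_component(g):
    """Index of the largest entry, e.g. the dominant beam-direction pair."""
    return max(range(len(g)), key=lambda j: g[j])


# Toy example: three measurements, three candidate directions, one active path.
B = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0],
     [1.0, 1.0, 0.0]]
g_true = [0.0, 2.0, 0.0]
q = [sum(B[i][j] * g_true[j] for j in range(3)) for i in range(3)]
g_hat = nnls_projected_gradient(B, q)
```

In this noiseless toy case g_hat recovers the single active component, and strongest_component(g_hat) points to it. With noisy or underdetermined measurements the result is only an estimate.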
For the data communication after BA is achieved, we defined two “extreme” hybrid
digital analog (HDA) antenna architectures, i.e., the fully-connected (FC) architecture
and the one-stream-per-subarray (OSPS) architecture. We provided a joint performance
evaluation of the initial access and data communication phases under more realistic channel
and hardware conditions. For each phase, we proposed our own BA and precoding schemes
that outperform their counterparts in the literature. We observed that the
two architectures achieve similar sum spectral efficiency, but the OSPS architecture
outperforms the FC architecture in terms of hardware complexity and power efficiency, at
the cost of only a slightly longer initial beam acquisition time.
Building on the above beamforming work, we further extended our investigation to
mmWave relay networking. For a general half-duplex (HD) mmWave relay network with
arbitrary relay connections, we proposed two beam scheduling schemes that approach
the approximate information-theoretic (Shannon) capacity, namely, the deterministic
edge coloring (EC) scheduler and the adaptive backpressure (BP) scheduler. The EC
scheduler is more suitable for static scenarios, since it requires only a one-time
computation whose resulting sequence of states is then repeated periodically. In contrast,
the BP scheduler is more favorable for time-varying scenarios because it updates its
decision in every time slot. Both schedulers effectively stabilize the network while
achieving much smaller queuing backlogs, much smaller backlog fluctuations, and much lower
end-to-end packet delays than the reference baseline scheme.
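The per-slot decision of a backpressure-type scheduler can be sketched as follows. This is a simplified illustration, not the scheduler of this thesis. It assumes that each link is weighted by its positive queue differential times the link rate, that the half-duplex constraint lets every node take part in at most one active link per slot, and that the network is small enough for exhaustive search. The function names and the toy topology are hypothetical.

```python
def backpressure_weights(links, rates, queues):
    """Weight of link (u, v): positive queue differential times link rate."""
    return [max(0.0, queues[u] - queues[v]) * rates[(u, v)] for (u, v) in links]


def max_weight_schedule(links, weights):
    """Exhaustively pick node-disjoint links with maximum total weight."""
    best_total, best_links = 0.0, []

    def search(start, used, total, chosen):
        nonlocal best_total, best_links
        if total > best_total:
            best_total, best_links = total, list(chosen)
        for k in range(start, len(links)):
            u, v = links[k]
            # Half-duplex assumption: a node is in at most one active link.
            if weights[k] <= 0 or u in used or v in used:
                continue
            chosen.append(links[k])
            search(k + 1, used | {u, v}, total + weights[k], chosen)
            chosen.pop()

    search(0, frozenset(), 0.0, [])
    return best_links


def backpressure_step(links, rates, queues):
    """One time slot: compute weights from the current queues, then schedule."""
    return max_weight_schedule(links, backpressure_weights(links, rates, queues))


# Toy diamond network: source "s", relays "a" and "b", destination "d".
links = [("s", "a"), ("s", "b"), ("a", "d"), ("b", "d")]
rates = {link: 1.0 for link in links}
queues = {"s": 3.0, "a": 4.0, "b": 0.0, "d": 0.0}
active = backpressure_step(links, rates, queues)
```

Here the backlog at relay a gives the link from s to a a non-positive differential, so the sketch activates the links from s to b and from a to d in the same slot. Because the decision depends only on the current queue states, it is recomputed in every slot, in line with the adaptive behavior described above. For larger networks, a polynomial-time maximum-weight matching algorithm would replace the exhaustive search.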
7.2 Future directions
One interesting direction building on this thesis is mmWave vehicle-to-vehicle
(V2V) and vehicle-to-everything (V2X) communication. mmWave V2V and V2X
communication can provide non-line-of-sight (NLOS) information about the surrounding
environment and thus improve the safety and traffic efficiency of cooperative automated
driving. In V2V and V2X scenarios, the nodes form a relay network and route data packets
through multi-hop transmission. An essential component of these networks is
selecting, at each time slot, which beams are active and in which direction they
should be pointed. Therefore, the problem of beamforming (addressed in the chapters on
beam alignment and hybrid precoding) is intrinsically connected with the problem of beam
scheduling (addressed in Chapter 6), since transmission is not isotropic as in
conventional wireless networks, but rather highly directional.
Also, the development of mmWave massive MIMO communication technology is now
in the hands of the product departments of companies such as Huawei, Qualcomm,
Ericsson and Nokia. A large number of communication, signal processing, and
optimization algorithms have been developed over the years, and it remains to be
seen which ones will work well in practice. If 5G becomes a commercial success,
massive digitally controllable antenna arrays will be deployed “everywhere” for countless
applications at mmWave frequencies and even at much higher THz frequencies (6G). Thus,
we can expect a future where extremely large aperture arrays with thousands of antenna
elements are used to serve a set of users. There are, however, practical limits to how
many antennas can be deployed at conventional tower and rooftop locations.
In addition, there is a recent surge of papers applying machine learning (ML) to
various problems in communications. ML is especially powerful when a system has
characteristics that are hard to model or analyze by conventional approaches. Thus,
it would be an exciting possibility to use ML in future mmWave wireless systems
whenever a good model is lacking, or a model is available but intractable for analysis.
However, before ML can be successfully used in communication systems, many obstacles,
such as the acquisition of training data and hard real-time constraints, still need
to be overcome.
A
Acronyms and Abbreviations
AoA angle of arrival
AoD angle of departure
AR augmented reality
AWGN additive white Gaussian noise
BA beam alignment
BBF before beamforming
BP backpressure
BS base station
CSI full channel state information
D-RoF digital radio-over-fiber
DFT discrete Fourier transform
EB exabytes
EC edge coloring
eMBB enhanced mobile broadband
FC fully-connected
Gbps gigabit per second
HD half-duplex
HD high definition
HDA hybrid digital analog
IC integrated circuit
IMT international mobile telecommunications
IP Internet protocol
IQ in-phase and quadrature
LAN local area network
MAC media access control
MIMO multiple input multiple output
mMTC massive machine type communications
mmWave millimeter wave
OFDM orthogonal frequency division multiplexing
OSPS one-stream-per-subarray
PA power amplifier
PSD power spectral density
RF radio frequency
SC single carrier
SNR signal to noise ratio
UE user equipment
ULA uniform linear array
uRLLC ultra-reliable and low-latency communications
V2V vehicle to vehicle
V2X vehicle to everything
VR virtual reality
WPAN wireless personal area network
Bibliography
[1]
Ericsson Inc. Ericsson Mobility Report. Mar. 2018. url:
https://www.ericsson.com/assets/local/mobility-report/documents/2018/ericsson-mobility-report-november-2018.pdf.
[2]
Erik Dahlman, Stefan Parkvall, and Johan Skold. 5G NR: The next generation
wireless access technology. Academic Press, 2018. isbn: 012814324X.
[3]
Xingqin Lin et al. “5G New Radio: Unveiling the essentials of the next generation
wireless access technology”. In: IEEE Communications Standards Magazine 3.3
(2019), pp. 30–37. issn: 2471-2825.
[4]
Khagendra Belbase. “Analysis of Millimeter Wave Wireless Relay Networks”.
PhD thesis. University of Alberta, 2019.
[5]
Theodore S Rappaport et al. “Overview of millimeter wave communications for
fifth-generation (5G) wireless networks—with a focus on propagation models”. In:
IEEE Transactions on Antennas and Propagation 65.12 (2017), pp. 6213–6230.
issn: 0018-926X.
[6]
M. Shafi et al. “5G: A Tutorial Overview of Standards, Trials, Challenges,
Deployment, and Practice”. In: IEEE Journal on Selected Areas in Communications
35.6 (2017), pp. 1201–1221. issn: 0733-8716. doi:10.1109/JSAC.2017.2692307.
[7]
ITU-R. “IMT Vision–Framework and overall objectives of the future development
of IMT for 2020 and beyond”. Recommendation ITU-R M.2083 (2015).
[8]
Huawei Technologies Co. “White Paper: 5G Network Architecture - A High-Level
Perspective”. In: (2016).
[9]
Naga Bhushan et al. “Network densification: the dominant theme for wireless
evolution into 5G”. In: IEEE Communications Magazine 52.2 (2014), pp. 82–89.
issn: 0163-6804.
[10]
A. Gupta and R. K. Jha. “A Survey of 5G Network: Architecture and Emerging
Technologies”. In: IEEE Access 3 (2015), pp. 1206–1232. issn: 2169-3536. doi:
10.1109/ACCESS.2015.2461602.
[11]
M. Agiwal, A. Roy, and N. Saxena. “Next Generation 5G Wireless Networks:
A Comprehensive Survey”. In: IEEE Communications Surveys & Tutorials 18.3
(2016), pp. 1617–1655. issn: 1553-877X. doi:10.1109/COMST.2016.2532458.
[12]
Erik G Larsson et al. “Massive MIMO for next generation wireless systems”. In:
IEEE communications magazine 52.2 (2014), pp. 186–195. issn: 0163-6804.
[13]
Emil Björnson, Jakob Hoydis, and Luca Sanguinetti. “Massive MIMO networks:
Spectral, energy, and hardware efficiency”. In: Foundations and Trends in Signal
Processing 11.3-4 (2017), pp. 154–655.
[14]
M. R. Akdeniz et al. “Millimeter wave channel modeling and cellular capacity
evaluation”. In: IEEE Journal on Selected Areas in Communications 32.6 (June
2014), pp. 1164–1179. issn: 0733-8716. doi:10.1109/JSAC.2014.2328154.
[15]
Robert W Heath et al. “An overview of signal processing techniques for millimeter
wave MIMO systems”. In: IEEE journal of selected topics in signal processing 10.3
(2016), pp. 436–453. issn: 1932-4553.
[16]
M. Shafi et al. “Microwave vs. Millimeter-Wave Propagation Channels: Key
Differences and Impact on 5G Cellular Systems”. In: IEEE Communications
Magazine 56.12 (2018), pp. 14–20.
[17]
Yong Niu et al. “Relay-Assisted and QoS Aware Scheduling to Overcome Blockage
in mmWave Backhaul Networks”. In: IEEE Transactions on Vehicular Technology
68.2 (2019), pp. 1733–1744. issn: 0018-9545.
[18]
A. Dimas, D. S. Kalogerias, and A. P. Petropulu. “Cooperative beamforming with
predictive relay selection for urban mmWave communications”. In: IEEE Access 7
(2019), pp. 157057–157071. issn: 2169-3536. doi:10.1109/ACCESS.2019.2950274.
[19]
Y. Yan, Q. Hu, and D. M. Blough. “Path Selection with Amplify and Forward
Relays in mmWave Backhaul Networks”. In: 2018 IEEE 29th Annual International
Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC).
Sept. 2018, pp. 1–6. doi:10.1109/PIMRC.2018.8580768.
[20]
Yilin Li et al. “Radio resource management considerations for 5G millimeter wave
backhaul and access networks”. In: IEEE Communications Magazine 55.6 (2017),
pp. 86–92. issn: 0163-6804.
[21]
Yilin Li. “Efficient Data Delivery in 5G Mobile Communication Networks”. PhD
thesis. Technische Universität Berlin, 2019.
[22]
TE Bogale, X Wang, and LB Le. “mmWave communication enabling techniques
for 5G wireless systems: A link level perspective”. In: mmWave Massive MIMO.
Elsevier, 2017, pp. 195–225.
[23]
Farooq Khan and Zhouyue Pi. “mmWave mobile broadband (MMB): Unleashing
the 3–300 GHz spectrum”. In: 34th IEEE Sarnoff Symposium. IEEE, pp. 1–6. isbn:
1612846807.
[24]
M. Jacob et al. “Diffraction in mm and Sub-mm Wave Indoor Propagation
Channels”. In: IEEE Transactions on Microwave Theory and Techniques 60.3
(2012), pp. 833–844.
[25]
Z. Shi et al. “Three-dimensional spatial multiplexing for directional millimeter-
wave communications in multi-cubicle office environments”. In: 2013 IEEE Global
Communications Conference (GLOBECOM). 2013, pp. 4384–4389.
[26]
T. S. Rappaport, J. N. Murdock, and F. Gutierrez. “State of the Art in 60-GHz
Integrated Circuits and Systems for Wireless Communications”. In: Proceedings
of the IEEE 99.8 (2011), pp. 1390–1436.
[27]
Z. Qingling and J. Li. “Rain Attenuation in Millimeter Wave Ranges”. In: 2006 7th
International Symposium on Antennas, Propagation & EM Theory. 2006, pp. 1–4.
[28]
S. Joshi and S. Sancheti. “Foliage loss measurements of tropical trees at 35 GHz”.
In: 2008 International Conference on Recent Advances in Microwave Theory and
Applications. 2008, pp. 531–532.
[29]
Ahmed M Al-samman, Marwan Hadri Azmi, and Tharek Abd Rahman. “A survey
of millimeter wave (mm-Wave) communications for 5G: Channel measurement
below and above 6 GHz”. In: International Conference of Reliable Information
and Communication Technology. Springer, pp. 451–463.
[30]
T. Nitsche, C. Cordeiro, A. B. Flores, E. W. Knightly, E. Perahia, and J. C.
Widmer. “IEEE 802.11 ad: directional 60 GHz communication for multi-Gigabit-
per-second Wi-Fi”. In: IEEE Communications Magazine 52.12 (2014), pp. 132–141.
issn: 0163-6804.
[31]
Ahmed Alkhateeb et al. “Channel estimation and hybrid precoding for millimeter
wave cellular systems”. In: Selected Topics in Signal Processing, IEEE Journal of
8.5 (2014), pp. 831–846.
[32]
Matthew Kokshoorn et al. “Millimeter wave MIMO channel estimation using
overlapped beam patterns and rate adaptation”. In: IEEE Transactions on Signal
Processing 65.3 (2016), pp. 601–616. issn: 1053-587X.
[33]
S. Noh, M. D. Zoltowski, and D. J. Love. “Multi-resolution codebook and adaptive
beamforming sequence design for millimeter wave beam alignment”. In: IEEE
Transactions on Wireless Communications 16.9 (Sept. 2017), pp. 5689–5701. issn:
1536-1276. doi:10.1109/TWC.2017.2713357.
[34]
M. Hussain and N. Michelusi. “Throughput optimal beam alignment in millimeter
wave networks”. In: 2017 Information Theory and Applications Workshop (ITA).
Feb. 2017, pp. 1–6. doi:10.1109/ITA.2017.8023460.
[35]
J. Palacios, D. De Donno, and J. Widmer. “Tracking mm-Wave channel
dynamics: Fast beam training strategies under mobility”. In: IEEE INFOCOM
2017 - IEEE Conference on Computer Communications. May 2017, pp. 1–9. doi:
10.1109/INFOCOM.2017.8056991.
[36]
A. Alkhateeb, G. Leus, and R. W. Heath. “Compressed sensing based multi-user
millimeter wave systems: How many measurements are needed?” In: 2015 IEEE
International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Apr. 2015, pp. 2909–2913. doi:10.1109/ICASSP.2015.7178503.
[37]
J. Rodríguez-Fernández, N. González-Prelcic, K. Venugopal, and R. W. Heath Jr.
“Frequency-domain compressive channel estimation for frequency-selective hybrid
mmWave MIMO systems”. In: arXiv preprint arXiv:1704.08572 (2017).
[38]
Kiran Venugopal et al. “Channel estimation for hybrid architecture-based wideband
millimeter wave systems”. In: IEEE Journal on Selected Areas in Communications
35.9 (2017), pp. 1996–2009. issn: 0733-8716.
[39]
Kiran Venugopal et al. “Time-domain channel estimation for wideband millimeter
wave systems with hybrid architecture”. In: Acoustics, Speech and Signal Processing
(ICASSP), 2017 IEEE International Conference on. IEEE, 2017, pp. 6493–6497.
isbn: 1509041176.
[40]
O. El Ayach, R. W. Heath, S. Rajagopal, and Z. Pi. “Multimode precoding in
millimeter wave MIMO transmitters with multiple antenna sub-arrays”. In: Global
Communications Conference (GLOBECOM), 2013 IEEE. IEEE, pp. 3476–3480.
isbn: 1479913537.
[41]
Didi Zhang et al. “Hybridly connected structure for hybrid beamforming in
mmWave massive MIMO systems”. In: IEEE Transactions on Communications
66.2 (2018), pp. 662–674. issn: 0090-6778.
[42]
P. L. Cao, T. J. Oechtering, and M. Skoglund. “Precoding design for massive MIMO
systems with sub-connected architecture and per-antenna power constraints”. In:
WSA 2018; 22nd International ITG Workshop on Smart Antennas. Mar. 2018,
pp. 1–6.
[43]
M. Majidzadeh et al. “Hybrid beamforming for single-user MIMO with partially
connected RF architecture”. In: 2017 European Conference on Networks and
Communications (EuCNC). June 2017, pp. 1–6. doi:10.1109/EuCNC.2017.7980696.
[44]
Shahar Stein Ioushua and Yonina C Eldar. “Hybrid analog-digital beamforming
for massive MIMO systems”. In: arXiv preprint arXiv:1712.03485 (2017).
[45]
F. Sohrabi and W. Yu. “Hybrid digital and analog beamforming design for large-
scale antenna arrays”. In: IEEE Journal of Selected Topics in Signal Processing 10.3
(Apr. 2016), pp. 501–513. issn: 1932-4553. doi:10.1109/JSTSP.2016.2520912.
[46]
Ang Li and Christos Masouros. “Hybrid analog-digital millimeter-wave MU-MIMO
transmission with virtual path selection”. In: IEEE Communications Letters 21.2
(2017), pp. 438–441. issn: 1089-7798.
[47]
Jingbo Du et al. “Hybrid precoding architecture for massive multiuser MIMO
with dissipation: Sub-connected or fully-connected structures?” In: arXiv preprint
arXiv:1806.02857 (2018).
[48]
T. K. Vu et al. “Joint Path Selection and Rate Allocation Framework for
5G Self-Backhauled mm-wave Networks”. In: IEEE Transactions on Wireless
Communications 18.4 (2019), pp. 2431–2445.
[49]
R. Abdel-Raouf, H. Esmaiel, and O. A. Omer. “Fuzzy logic based relay selection
for mmWave communications”. In: 2019 9th Annual Information Technology,
Electromechanical Engineering and Microelectronics Conference (IEMECON).
Mar. 2019, pp. 263–267. doi:10.1109/IEMECONX.2019.8877074.
[50]
J. García-Rois et al. “On the Analysis of Scheduling in Dynamic Duplex Multihop
mmWave Cellular Systems”. In: IEEE Transactions on Wireless Communications
14.11 (2015), pp. 6028–6042.
[51]
B. P. S. Sahoo, C. Yao, and H. Wei. “Millimeter-Wave Multi-Hop Wireless
Backhauling for 5G Cellular Networks”. In: 2017 IEEE 85th Vehicular Technology
Conference (VTC Spring). 2017, pp. 1–5.
[52]
X. Gao, L. Dai, and A. M. Sayeed. “Low RF-complexity technologies to enable
millimeter-wave MIMO with large antenna array for 5G wireless communications”.
In: IEEE Communications Magazine 56.4 (Apr. 2018), pp. 211–217. issn: 0163-6804.
doi:10.1109/MCOM.2018.1600727.
[53]
J. Palacios, N. González-Prelcic, and J. Widmer. “Managing Hardware Impairments
in Hybrid Millimeter Wave MIMO Systems: A Dictionary Learning-Based
Approach”. In: 2019 53rd Asilomar Conference on Signals, Systems, and Computers.
IEEE, pp. 168–172. isbn: 1728143004.
[54]
Nima N Moghadam et al. “On the energy efficiency of MIMO hybrid beamforming
for millimeter wave systems with nonlinear power amplifiers”. In: arXiv preprint
arXiv:1806.01602 (2018).
[55]
Xiaoshen Song, Saeid Haghighatshoar, and Giuseppe Caire. “A scalable and
statistically robust beam alignment technique for mm-Wave systems”. In: IEEE
Trans. on Wireless Comm. PP (2018), pp. 1–1.
[56]
X. Song, S. Haghighatshoar, and G. Caire. “Efficient Beam Alignment for
Millimeter Wave Single-Carrier Systems With Hybrid MIMO Transceivers”. In:
IEEE Transactions on Wireless Communications 18.3 (2019), pp. 1518–1533.
[57]
X. Song, T. Kühne, and G. Caire. “Fully-/Partially-Connected Hybrid
Beamforming Architectures for mmWave MU-MIMO”. In: IEEE Transactions on Wireless
Communications 19.3 (2020), pp. 1754–1769.
[58]
X. Song et al. “Joint Topology Simplification and Beam Scheduling for Half-Duplex
mmWave Relay Networks”. In: IEEE Transactions on Wireless Communications
(2020, to be submitted).
[59]
P. Schniter and A. Sayeed. “Channel estimation and precoder design for millimeter-
wave communications: The sparse way”. In: 2014 48th Asilomar Conference on
Signals, Systems and Computers. Nov. 2014, pp. 273–277. doi:10.1109/ACSSC.2014.7094443.
[60]
John G. Proakis and Masoud Salehi. Digital communications. McGraw-Hill, 2008.
[61]
Philip Bello. “Characterization of randomly time-variant linear channels”. In:
IEEE Transactions on Communications Systems 11.4 (1963), pp. 360–393.
[62]
Andrea Goldsmith. Wireless communications. Cambridge University Press, 2005.
[63]
Akbar M Sayeed. “Deconstructing multiantenna fading channels”. In: IEEE
Transactions on Signal Processing 50.10 (2002), pp. 2563–2579. issn: 1053-587X.