Document [original]

entropy

Article

Stealthy Secret Key Generation

Pin-Hsun Lin 1,*, Carsten R. Janda 1, Eduard A. Jorswieck 1and Rafael F. Schaefer 2

1Information Theory and Communication Systems Department, Technische Universität Braunschweig,

38106 Braunschweig, Germany; [email protected] (C.R.J.); [email protected] (E.A.J.)

2Information Theory and Applications Chair, Technische Universität Berlin, 10623 Berlin, Germany;

[email protected]

*Correspondence: [email protected]

Received: 1 June 2020; Accepted: 15 June 2020; Published: 18 June 2020





Abstract:

In order to make a warden, Willie, unaware of the existence of meaningful communications,

there have been different schemes proposed including covert and stealth communications.

When legitimate users have no channel advantage over Willie, the legitimate users may need

additional secret keys to confuse Willie, if the stealth or covert communication is still possible.

However, secret key generation (SKG) may raise Willie’s attention since it has a public discussion,

which is observable by Willie. To prevent Willie’s attention, we consider the source model for

SKG under a strong secrecy constraint, which has further to fulfill a stealth constraint. Our first

contribution is that, if the stochastic dependence between the observations at Alice and Bob fulfills

the strict more capable criterion with respect to the stochastic dependence between the observations

at Alice and Willie or between Bob and Willie, then a positive stealthy secret key rate is identical

to the one without the stealth constraint. Our second contribution is that, if the random variables

observed at Alice, Bob, and Willie induced by the common random source form a Markov chain, then

the key capacity of the source model SKG with the strong secrecy constraint and the stealth constraint

is equal to the key capacity with the strong secrecy constraint, but without the stealth constraint.

For the case of fast fading models, a sufficient condition for the existence of an equivalent model,

which is degraded, is provided, based on stochastic orders. Furthermore, we present an example to

illustrate our results.

Keywords:

secret key generation; source model; stealthy communications; covert communications;

channel resolvability; conceptual wiretap channel; stochastically degraded; stochastic orders

1. Introduction

Consider the following motivating example. Two agents, Alice and Bob, want to establish

a communication that does not raise the curiosity of a warden Willie, whose duty is to monitor if

there is any suspicious activity and also decrypts the data. In order to realize a confidential transmission

for such a scenario, we may adopt the following two steps. The first step is to make Willie unaware of the

existence of the meaningful communication, which is embedded in the messages intended to be delivered

to Bob. In contrast, in a meaningless communication, Bob does not care about the received signal, which is

only used to confuse Willie. If Willie can successfully detect the existence of the meaningful transmission,

then the second step is to use wiretap coding [

] to provide secrecy (or hidability [

]). There are two

main concepts to attain the goal of the first step: (1) communications with a stealth constraint [

] and

(2) communications with a covert constraint [

]. Both concepts make Willie unable to differentiate

between the existence or nonexistence of the meaningful transmission, solely according to the probability

distributions of his observations. More specifically, in the first concept, we transmit meaningful and

meaningless signals non-overlapped in time. Note that the meaningful signal is the one Alice wants

Entropy 2020,22, 679; doi:10.3390/e22060679 www.mdpi.com/journal/entropy

Entropy 2020,22, 679 2 of 14

to communicate with Bob, while the meaningless signal is used to confuse Willie. If well designed,

Willie cannot differentiate between those two signals, because the induced output distributions are close

(the closeness can be defined in several different ways, e.g., by total variational distance, divergence, etc.)

to each other. In the second concept, the meaningful signal can be superimposed on the meaningless

one. Under the stealth constraint, we can have a positive capacity, while the covert transmission rate

is zero, asymptotically. Even though the transmission rate of the second concept is in general zero,

asymptotically, the second order rate is positive following the square root law [5,6].

For the aforementioned two concepts, if the channel between Alice and Bob (denoted by Bob’s channel

in the following) is no better than the channel between Alice and Willie (denoted by Willie’s channel

in the following), we need additional keys to conceal the meaningful signals, e.g., [

]. In particular,

these additional keys are used to choose between codebooks to fool Willie. Our motivation is to

design an achievable scheme for the source model secret key generation (SKG) for the above scenario,

i.e., Bob has no channel advantage over Willie, while fulfilling both the security and stealth constraints,

simultaneously. Note that the keys generated from the stealthy SKG can also be used to protect the data

on top of concealing the behavior of transmission, e.g., encryption/one-time pad, etc. Note also that our

design goal is violated if we directly apply common SKG schemes [

]. This is because common SKG

schemes use public communications for several important operations including advantage distillation,

information reconciliation, and privacy amplification ([

] Chapter 4.3).Without subtle modifications,

these operations will raise Willie’s curiosity. To attain our objective, we focus on stealthy SKG, which is

from its counterpart, stealth communications [

]. The main reason not to consider covertness but stealth

for the SKG is that, under the assumption of a noiseless public discussion channel, there is no ambient

noise to hide the discussion signal. Instead, covert SKG may be feasible if there is a noisy public channel.

In addition, in general, the covert SKG suffers a sub-linear rate, e.g., [

], which is inherited from the covert

communications. Recall that a channel-model SKG with a rate-unlimited public channel was considered

in [

]. The authors applied the scheme from covert communication to the key transmission, while a

stealth-like public discussion was used.

The main contributions of this work are summarized in the following:

•We investigate a source-model SKG under strong secrecy with an additional stealth constraint.

•

We derive an achievable secret key (SK) rate under the stealth constraint, if

I(X

;

Y)≥I(X

;

where

, and

are the observations of the common randomness source at Alice, Bob,

and Willie, respectively. Moreover, if

form a Markov chain

X−Y−Z

, then the SK

capacity with the additional stealth constraint can be achieved without extra cost, compared to

the SKG without the stealth constraint.

•

We prove that a sufficient condition to achieve the stealthy SK capacity can be relaxed from

the physically degraded channel to a stochastically degraded one.

•

A sufficient condition for the existence of an equivalent degraded model is derived by the usual

stochastic order [9], which is for the fast fading Gaussian Maurer’s (satellite) model [10].

Notation: Lower case bold letters denote deterministic vectors, and upper case normal/bold

letters denote random variables/random vectors (or matrices), which will be defined when they are

first mentioned. We denote the probability mass function (pmf) by

. The entropy of

is denoted

H(X)

. The mutual information between two random variables

and

is denoted by

I(X

;

The divergence between distributions

and

is denoted by

D(PX||PY)

X∼F

denotes that the

random variable

follows the distribution

, while

−F

. The subscript

denotes the

symbol, and

Xi,[X1

· · ·

Xi]

X−Y−Z

denotes the Markov chain.

d·e

denotes the ceiling

operator. All logarithms are to base two. (a)+,max(a, 0).

The rest of the paper is organized as follows. In Section 2, we introduce the preliminaries and

the considered system model. In Section 3, we derive our main results. Finally, Section 4concludes

this paper.

Entropy 2020,22, 679 3 of 14

2. Preliminaries and System Model

2.1. Preliminaries

We first introduce some necessary definitions and results to develop our work.

Definition 1. The strong secrecy and the stealth constraints are respectively defined as:

D(PMZn||PMPZn)≤e,

D(PZn||QZn)≤e,

for arbitrarily small

0, where

PZn

, and

QZn

are transmitted messages, the observed signal at Willie,

and the output distributions at Willie induced by meaningful and meaningless signals, respectively.

The second constraint in the above definition can be explained by hypothesis testing as discussed

in [

]. By this viewpoint, if the second constraint is fulfilled, the adversary’s best strategy is to blindly

guess whether the current transmitted signal is meaningful or meaningless.

Definition 2.

Denote a common random source as

PXYZ)

, where

and Z

are the alphabets

of the observations at Alice, Bob, and Willie. The random source is stochastically degraded, if the marginal

distributions

PY|X

and

PZ|X

are identical to those of another source of common randomness

PX˜

Y˜

following the physical degradedness, i.e., X −˜

Y−˜

Corollary 1.

The same marginal property for one transmitter ([

] Theorem 13.9) Consider a discrete

memoryless multiuser channel including one transmitter and two non-cooperative receivers with input

and output alphabets

and

Y × Z

, respectively. The capacity region of such a channel depends only on

the conditional marginal distributions

PY|X

and

PZ|X

and not on the joint conditional distribution

PY,Z|X

where X ∈ X and Y ∈ Y and Z ∈ Z are the transmit signal and the two receive signals, respectively.

Definition 3. δ-robust typicality ([12] Appendix)

The sequence xn∈ X nis δ-robust typical for δ>0:

T(n)

δ(PX) = xn∈ X n:

N(a|xn)

n−PX(a)

≤δPX(a),∀a∈ X , (1)

where N(a|xn)is the number of occurrences of a in xn.

Definition 4. ([9] (1.A.3) For random variables A and B, A ≤st B if and only if ¯

FA(a)≤¯

FB(a)for all a.

Let A=st A0denote that Aand A0have the same distribution.

Lemma 1.

Coupling [

A≤st B

if and only if there exists random variables

A=st A

and

B=st B

such

that ˆ

A≤ˆ

B almost surely.

2.2. System Model

The considered system model is shown in Figure 1. We denote the

-time source observations

at Alice, Bob, and Willie by

, and

, respectively, which follow the independent and identically

distributed (i.i.d.) joint distribution

PXnYnZn=∏n

i=1PXiYiZi=∏n

i=1PXYZ

with alphabets

respectively. The public discussion between Alice and Bob through a noiseless channel is denoted by a

random vector

Fn∈ Fn

. We consider the case without rate limitation on the public discussion channel.

Willie can perfectly observe

. The joint distributions of the signals that Willie can observe when the

SKG is meaningful and meaningless are denoted by

PFnZn

and

QFnZn

, respectively. Alice and Bob aim

at sharing keys K∈ K satisfying the constraints as follows:

Entropy 2020,22, 679 4 of 14

Pe,Pr(K6=ˆ

K)≤e, (2)

log |K| − H(K)≤e, (3)

D(PKZnFn||PKPZnFn)≤e, (4)

D(PFnZn||QFnZn)≤e(5)

for arbitrarily small

0, where

(2)

is the error probability having different keys at Bob from

Alice,

(3)

is the keys’ uniformity constraint, while

|K|

is the number of keys and

(4)

is the constraint

for the strong secret key, which is an adaptation from the stealth communication in Definition 1.

In particular,

is dual to

, and

ZnFn

is dual to

. The stealth constraint is considered in

(5)

which is again an adaptation from Definition 1, i.e., here,

FnZn

is what Willie can observe, instead of

solely Znin stealth communications.

Alice Bob

Willie

PXYZ

Is there SKG?

Figure 1. The system model of the considered stealthy secret key generation (SKG).

Definition 5. The rate of the keys generated fulfilling (2)–(5)is called the achievable stealthy strong SK rate.

Definition 6. The maximum achievable stealthy strong SK rate is called the stealthy strong SK capacity.

3. Main Results

We show two main result in this section: (1) the stealthy strong SK rate and a condition to attain

the capacity; (2) a scheme to identify the fast fading Gaussian Maurer’s model as a degraded one,

so that the stealth SK capacity can be determined explicitly.

3.1. Stealthy Strong Secret Key Rate and Capacity

Our main result is described by the following theorem followed by discussions.

Theorem 1.

drawn from the common random source

PXYZ)

, then the stealthy strong

SK capacity CSK of source model SKG with the stealth constraint can be bounded by:

max{I(X;Y)−I(X;Z),I(Y;X)−I(Y;Z)} ≤ CSK ≤min{I(X;Y|Z),I(X;Y)}. (6)

Furthermore, if (X,Y,Z)forms a Markov chain X −Y−Z, the stealthy strong SK capacity is:

CSK =I(X;Y)−I(X;Z). (7)

Entropy 2020,22, 679 5 of 14

3.2. Sufficient Conditions for a Degraded Common Randomness

In this section, we derive a sufficient condition to obtain

CSK =I(X

;

Y)−I(X

;

. In particular,

we show that this sufficient condition, i.e., the common randomness forming a Markov chain

X−Y−Z

can be relaxed to be stochastically degraded. After that, we show that this relaxed condition can

be satisfied under a quite common setting by, e.g., the fast fading Gaussian Maurer’s (satellite)

model [

]. In particular, a central random source

emits signals passing through fast fading

additive white Gaussian noise (AWGN) channels, which are observed as

, and

at Alice, Bob,

and Willie, respectively.

Theorem 2.

If a common random source

PX˜

Y˜

is stochastically degraded such that

P˜

Y|X=PY|X

and P˜

Z|X=PZ|X, where X −Y−Z, then:

CSK =I(X;Y)−I(X;Z). (8)

The proof is delegated to Appendix C.

Example: Consider the fast fading Gaussian Maurer’s (satellite) model [

] as a special case of

Theorem 2:

X=AXS0+NX, (9a)

Y=AYS0+NY, (9b)

Z=AZS0+NZ, (9c)

where

, and

are independent AWGNs at Bob and Willie, respectively, while both are with

zero mean and unit variance;

, and

follow CDFs

, and

, respectively, and are

the i.i.d. fast fading channel gains from the source

to Alice and Willie, respectively. Note that

intuitively,

, and

have no degradedness relation in general due to the random fading. This is

because, by the definition of degradedness, the trichotomy order of all realizations between the two

fading channels within a codeword length should be the same. We can invoke the same marginal

property [

] to construct an equivalent channel, wherein by imposing the usual stochastic order

constraint, we can identify those fading channels that can be re-ordered in the equivalent channel to

keep the trichotomy order fixed.

If the random channels

and

fulfill

FA2

X(x)≥¯

FA2

Z(x)

for all

, where the subscripts denote

the absolute square of the channel magnitudes, then from Lemma 1, we have equivalent (in the

sense of having the same stealthy SK capacity) observations at Bob and Willie as

Y=ˆ

AXS0+NY

and

Z=ˆ

AZS0+NZ

, respectively, where

X≥ˆ

almost surely. Therefore, it is clear that

FA2

X(x)≥¯

FA2

Z(x)

is a relaxed sufficient condition to guarantee that

is an equivalently stochastically

degraded version of Y.

Assume that

and

in Equations (9a)–(9c) are fast fading magnitudes following the

Nakagami-

distribution with shape parameters

and

and spread parameters

and

[

respectively. From Theorem 2, we know that Zis a degraded version of Xif:

γmx,mx

xΓ(mz)≥γmz,mz

xΓ(mx),∀x,

where

γ(s

x) = Rx

0ts−1e−tdt

is the incomplete gamma function and

Γ(s) = R∞

0ts−1e−tdt

is the ordinary gamma function. An example satisfying the above inequality is

(mx

wx)=(

1, 3

)

and (mz,wz) = (1, 2).

Entropy 2020,22, 679 6 of 14

4. Conclusions

In this work, we analyzed the performance of the secret key generation from a common random

source, which satisfied the additional constraint that the generation of keys should not invoke

the warden Willie’s attention. Our results showed that compared to the normal SKG, the additional

stealth constraint could be fulfilled without extra cost. In particular, the stealthy SK capacity with

strong secrecy constraint is

I(X

;

Y)−I(X

;

, if the common random source satisfies

I(X

;

Y)≥I(X

;

To emphasize the practical relevance, a sufficient condition was derived to attain the degradedness

by the usual stochastic order for the Gaussian Maurer’s (satellite) model for the source of common

randomness under fast fading. As a final note, we can also use Slepian–Wolf coding with a proper use

of the binning code book to derive the same result.

Author Contributions:

Formal analysis, P.-H.L.; investigation, P.-H.L., C.R.J., and E.A.J.; supervision, E.A.J.;

validation, C.R.J. and R.F.S.; writing, review and editing, C.R.J., E.A.J., and R.F.S. All authors have read and agreed

to the published version of the manuscript.

Funding:

The work of Pin-Hsun Lin was supported in part by German Research Foundation (DFG) LI 2886/2-1.

The work of Eduard A. Jorswieck was supported in part by DFG, C. Janda was supported by DFG within the

project Play Scate (DFG JO 801/21-1). We acknowledge support by the German Research Foundation and the

Open Access Publication Funds of the Technische Universität Braunschweig. This work of R. F. Schaefer was

supported by the German Federal Ministry of Education and Research (BMBF) within the national initiative for

“Post Shannon Communication (NewCom)” under Grant 16KIS1004.

Acknowledgments:

The authors would like to thank Matthieu Bloch for fruitful discussions and the anonymous

reviewers’ efforts on Remark 1 and also on improving the quality of the presentation of this paper.

Conflicts of Interest: The authors declare no conflict of interest.

Appendix A. Proof of Theorem 1

Our main idea for deriving the lower bound of the SK capacity in the second scheme is by

constructing a conceptual WTC (CWTC) as in [

]. An equivalent wiretap codebook

{Un(m

w)}

is constructed, where

· · ·

L0,

is the number of secure messages,

and

L1,

nR1

is the number of confusion messages;

and

are uniformly and independently

selected;

Un(m

w)∈ X n

∀(m

. In addition,

(Zn

Un)

are generated according to

PZnFnUn=

∏n

i=1PZiFiUi=∏n

i=1PZiFi|UiPUi, where we consider the equivalent channel from Alice to Willie as:

PZnFn|Un=

∏

i=1

PZiFi|Ui, (A1)

where

(Zn

Fn)

is the equivalent channel output at Willie. Similarly,

(Yn

Fn)

is the equivalent channel

output at Bob. We choose Unmutually independent of Xn,Yn, and Zn.

In order to analyze the stealth, the respective distributions of the meaningful and meaningless

signals at the equivalent channel output at Willie are:

PZnFn=1

L0L1

∑

m=1

∑

w=1

PZn,Fn|Un(zn,fn|un(m,w)), (A2)

QZnFn=∑

PZn,Fn|Un(zn,fn|un)PUn(un). (A3)

Entropy 2020,22, 679 7 of 14

We first decompose the stealth secrecy constraint as follows:

D(PKZnFn||PKPZnFn) + D(PZnFn||QZnFn)

=∑

K,Zn,Fn

PKZnFnlog PKZnFn

PKPZnFn+log PZnFn

QZnFn

=∑

K,Zn,Fn

PKZnFnlog PKZnFn

PKQZnFn

,D(PKZnFn||PKQZnFn)(A4)

(a)

=D(PK||PK) + D(PZnFn|K||QZnFn|PK)

=D(PZnFn|K||QZnFn|PK), (A5)

where (a) follows the chain rule of divergence ([17] Th.2.2.2).

We then apply the channel resolvability analysis [

] to the CWTC, in order to find the rate

constraint on the confusion messages, in order to guarantee the validity of the stealth secrecy

constraint (A4).

From the analysis conducted in Appendix B, we know:

EC[D(PZnFn|K||QZnFn|PK)]≤EZnFnUnlogPZnFn|Un

L1QZnFn+1

. (A6)

Recall that

L1=

dnR1e

is the number of confusion messages inside each bin, and we need to design

it such that

(A6)

is vanishing. The main difference between our proof and that in [

] is that we introduce

an additional channel output at both Bob and Willie by constructing a CWTC for the considered SKG

model, which makes the results from [3] not able to be directly applied.

Now, we reexpress the ratio in the logarithm on the right-hand side (RHS) of (A6) as follows:

PZnFn|Un

QZnFn

(a)

=PZnFnUn

PUn

PZnQFn

(b)

=PZnFnUn

PZnUn

QFn=PFn|ZnUn

QFn, (A7)

where (a) is due to the fact that

and

are independent when a meaningless discussion

is transmitted, which has a pmf denoted by

QFn

; (b) comes the fact that

is independent of

by selection, i.e., PZnUn=PZnPUn.

Note that even though

and

are independent and

and

are independent by

assumption, that does not mean

, and

are necessarily generated according to

PZn,Fn,Un=

PZnPFn,Un

PZn,Fn,Un=PZnPFnPUn

. In fact, since pairwise independence does not imply mutual

independence ([

] Chapter 7.1, 7.2), there exists joint distribution

PZn,Fn,Un

such that we can invoke

tools from the joint asymptotic equipartition property [20].

Then, we can rewrite (A6) as follows:

EC[D(PZnFn|K||QZnFn|PK)] ≤EZnFnUnlog PFn|ZnUn

L1QFn+1. (A8)

Entropy 2020,22, 679 8 of 14

The RHS of

(A8)

can be discussed in two cases as follows similar to [

], according to whether

(zn,fn,un)are jointly typical or not:

d1=∑

(zn,fn,un)∈

δ(PZn,Fn,Un)

PZnFnUn(zn,fn,un)log PFn|UnZn(fn|unzn)

L1QFn(fn)+1!,

d2=∑

(zn,fn,un)/∈

δ(PZn,Fn,Un)

PZnFnUn(zn,fn,un)log PFn|UnZn(fn|unzn)

L1QFn(fn)+1!,

where Tn

δfollows the δ-robust typicality [12] definition for the subsequent derivation.

The Chernoff bound and an important upper bound, which will be used later, are restated in

the following.

Lemma A1. (Chernoff bound ([12] Lemma 16) For every a ∈ X , xn∈ X n, and δ>0,

PN(a|xn)

n≤(1+δ)PX(a)≤e−δ2PX(a)n/3. (A9)

Lemma A2. (Upper bound of the probability of a non-typical set ([12] Lemma 17)

P(xn/∈T(n)

δ)≤2|SX|e−δ2µXn/3, (A10)

where SX,{x∈ X :P(x)>0}and µx,minx∈SXP(x).

Next, we derive the constraint (The constraint that Bob should successfully decode both the secret

and confusion messages is a point-to-point transmission without secrecy, which can be seen from [

Therefore, we omit the proof here.) on R1as follows:

(a)

≤ ∑

(zn,fn,un)∈

δ(PZnFnUn)

PZnFnUn(zn,fn,un)!log 1+2−n[H(F|UZ)−δ]

L12−n(1+e)H(F)!

(b)

≤log 1+2−n[H(F|UZ)−δ]

L12−n(1+e)H(F)!

(c)

=log 1+2−n(R1−I(F;UZ)−e0), (A11)

where (a) comes from ([

] Lemma 18, Lemma 20); (b) comes from the fact that the total probability

of the jointly typical set is smaller than one; (c) is from definition of

and

e0,e[

+H(F)] =

e[1+H(U)]. After that, we have d1→0 if n→∞when the following constraint is fulfilled:

R1>I(UZ;F) + e0(a)

=I(UZ;U⊕X) + e0(b)

=H(U)−H(X|Z) + e0, (A12)

where (a) comes from a specific use of the public discussion following ([

] Theorem 3) and

⊕

is the modulo addition in

. We can follow the argument in ([

] Appendix B) in order to apply

the crypto lemma in

(A12)

(A14)

to unbounded

like the Gaussian case. (b) comes from the fact

Entropy 2020,22, 679 9 of 14

that

is uniformly distributed with the crypto lemma. In addition, we can derive that

d2→

0 as

n→∞as follows:

(a)

≤∑

(zn,un,fn)/∈Tn

δ(PZn,Un,Fn)

PZnFnUn(zn,fn,un)log 1

Fn(fn)+1

(b)

≤∑

(zn,un,fn)/∈Tn

δ(PZn,Un,Fn)

PZnFnUn(zn,fn,un)log 1

µn

+1!

(c)

≤∑

(zn,un,fn)/∈Tn

δ(PZn,Un,Fn)

PZnFnUn(zn,fn,un)nlog 1

µf

+1!

(d)

=nPr ((zn,un,fn)/∈Tn

δ(PZn,Un,Fn))log 1

µf

+1!

(e)

≤2n|SZUF|e−δ2µZUFn/3 log 1

µf

+1!, (A13)

where (a) is due to the fact that

PFn|UnZn(fn|unzn)≤

1, and therefore,

PFn|UnZn(fn|unzn)/L1≤

1; (b) is

by lower bounding

Fn(fn)

with

µf=minfn∈SFnQFn(fn)

, where

SFn,{fn∈ X n:P(fn)>

}

;

µf

(A13)

is a constant, but not a function of

. Therefore, the RHS of

(A13)

can be easily seen to vanish

exponentially fast, if n→∞. Then, from (A11) and (A13), it is clear that (4) and (5) are fulfilled.

By constructing the CWTC, the following rate between Alice and Bob is achievable:

n(R+R1)≤I(Un⊕Xn,Yn;Un)

=H(Yn,Un⊕Xn)−H(Yn,Un⊕Xn|Un)

(a)

=H(Yn) + H(Un)−H(Xn,Yn)

=H(Un)−H(Xn|Yn), (A14)

where (a) again comes from the crypto lemma with the selection of

being independent of

. Then,

from (A12) and (A14), the achievable stealthy strong SK rate can be derived as follows:

nR ≤H(Un)−H(Xn|Yn)−nR1

(a)

<n[H(X|Z)−H(X|Y)]

=n[I(X;Y)−I(X;Z)], (A15)

where (a) is by plugging

(A12)

with the assumption of a memoryless common randomness, which is

independent and identically distributed (i.i.d.). We can interchange the roles of

and

to get

I(Y;X)−I(Y;Z), which completes the proof.

Remark A1.

In addition to the channel resolvability scheme, we can also attain the proof by a modified SWC

scheme, which is sketched as follows. We can first construct a binary auxiliary random variable

, which selects

the meaningful or meaningless discussion when

1and

0, respectively. The stealth constraint is to avoid

Willie successfully guessing the realization of

and can be formulated as

I(S

;

ZnFn)≤e

.By the data processing

inequality for divergence [

], we can have the inequality

I(S

;

ZnFn)≤D(PKZnFn||PKQZnQFn)

, which is an

effective secrecy constraint of the considered SKG model. It is the counterpart to the one of the wiretap channel

with stealth constraint [

]. We can then construct two binning codebooks for

0and

1, where in each

case, there is one corresponding binning codebook, in order to fulfill the secrecy constraint. In the error analysis,

there are two cases: (1) when

1, the probability of Alice and Bob having different keys; (2) when

Entropy 2020,22, 679 10 of 14

the probability of Bob generating a key that is not null. By vanishing enforcing the average error probability,

we can attain the result in Theorem 1.

Remark A2.

In the analysis by the WTC scheme in Appendix A, we derive the stealthy strong SK rate by

combining a tool based one channel resolvability developed in [

] with the CWTC. In particular, we first derive

an upper bound of the averaged stealth secrecy constraint

(A5)

by the random coding analysis. By enforcing the

upper bound to vanish, we can derive the constraints on the stealthy strong SK rate and the confusion rate in the

codebook design. By this scheme, we can proceed with the derivation based on the result of the wiretap channel.

Appendix B. Proof of (A6)

In the following, we show the tedious derivation for (A6) from channel resolvability [3]:

EC[D(PZnFn|K||QZnFn|PK)]

(a)

=EC[D(PZnFn|M||QZnFn|PM)]

(b)

=EC,M"∑

zn,Fn

PZn,Fn|M(zn,fn|M=m)log PZn,Fn|M(zn,fn|M=m)

QZn,Fn(zn,fn)!#

(c)

=EC,M

∑

zn,Fn

∑

w=1

PZn,Fn|Un(zn,fn|un(m,w)) log 



∑L1

l=11

L1PZn,Fn|Un(zn,fn|un(m,l))

QZn,Fn(zn,fn)





(d)

LL1

EC"∑

zn,Fn

∑

m=1

∑

w=1

PZn,Fn|Un(zn,fn|un(m,w)) log ∑L1

l=1QZn,Fn|Un(zn,fn|un(m,l))

L1PZn,Fn(zn,fn)!#

(e)

LL1

EC"∑

zn,Fn

∑

m=1

∑

w=1

PZn,Fn|Un(zn,fn|un(m,w)) log Am(zn,fn|un)

B(zn,fn)#

(f)

LL1∑

un(1,1)

· · · ∑

un(L,L1)

L,L1

∏

m=1,w=1

U(un(m,w)) "∑

zn,Fn

∑

m=1

∑

w=1

PZn,Fn|Un(zn,fn|un)log Am(zn,fn|un)

B(zn,fn)#

(g)

LL1∑

un(1,1)

· · · ∑

un(L,L1)

L,L1

∏

m=1,w=1

U(un(m,w))

∑

zn,Fn







(PZn,Fn|Un(·|un(1, 1))+ · · · +PZn,Fn|Un(·|un(1, L1))) ·log A1(zn,fn|un)

B(zn,fn)+

.....

(PZn,Fn|Un(·|un(L, 1)) · · · +PZn,Fn|Un(·|un(L,L1))) ·log AL(zn,fn|un)

B(zn,fn)







(h)

LL1∑

un(1,1)

· · · ∑

un(L,L1)

∑

zn,Fn







PZn,Fn,Un(·,un(1, 1))

L,L1

∏

m6=1,w6=1

U(un(m,w)) + · · · +PZn,Fn,Un(·,un(1, L1))

L,L1

∏

m6=1,w6=L1

U(un(m,w))·

log A1(zn,fn|un)

B(zn,fn)+· · ·

PZn,Fn,Un(·,un(L, 1))

L,L1

∏

m6=L,w6=1

U(un(m,w)) + · · · +PZn,Fn,Un(·,un(L,L1))

L,L1

∏

m6=L,w6=L1

U(un(m,w))·

log AL(zn,fn|un)

B(zn,fn)







Entropy 2020,22, 679 11 of 14

(i)

LL1∑

zn,Fn







∑

un(1,1)

PZn,Fn,Un(·,un(1, 1))EUn\Un(1,1)hlog A1(zn,fn|un)

B(zn,fn)i+· · · +

∑

un(k,l)

PZn,Fn,Un(·,un(k,l))EUn\Un(k,l)hlog Mk(zn,fn|un)

B(zn,fn)i+· · · +

∑

un(L,L1)

PZn,Fn,Un(·,un(L,L1))EUn\Un(L,L1)hlog AL(zn,fn|un)

B(zn,fn)i







(j)

LL1∑

zn,Fn

(L,L1)

∑

(a,b)=(1,1)

∑

un(a,b)

PZn,Fn,Un(·,un(a,b))EUn\Un(a,b)log Aa(zn,fn|un)

B(zn,fn)

(k)

≤1

LL1∑

zn,Fn

(L,L1)

∑

(a,b)=(1,1)

∑

un(a,b)

PZn,Fn,Un(·,un(a,b)) log EUn\Un(a,b)Aa(zn,fn|un)

B(zn,fn)

(l)

LL1∑

zn,Fn

(L,L1)

∑

(a,b)=(1,1)

∑

un(a,b)

PZn,Fn,Un(·,un(a,b)) log 





PZn,Fn|Un(·|un(a,b)) + ∑

s6=b

∑

un(a,s)

PZn,Fn,Un(·,un(a,s))

B(zn,fn)





(m)

≤1

LL1∑

zn,Fn

(L,L1)

∑

(a,b)=(1,1)

∑

un(a,b)

PZn,Fn,Un(·,un(a,b)) log 





PZn,Fn|Un(·|un(a,b)) +

(L,L1)

∑

(r,s)=(1,1)

∑

un(r,s)

PZn,Fn,Un(·,un(r,s))

B(zn,fn)







(n)

≤1

LL1∑

zn,Fn

(L,L1)

∑

(a,b)=(1,1)

∑

un(a,b)

PZn,Fn,Un(·,un(a,b)) log PZn,Fn|Un(·|un(a,b))

B(zn,fn)+1!

(o)

=EZnFnUnlog PZnFn|Un

L1QZnFn+1,(A16)

where (a) is by constructing a CWTC, such that the key

is interchangeable with the message

; (b)

is by definition of the conditional K-L distance ([

] Definition 2.2); (c) is due to the fact that

PZn,F|M

is the

marginalization of

PZn,Fn|Un

with respect to

, which is the index of the confusion message; in (d), we

expand the expectation with respect to

; (e) is by defining

∑L1

l=1PZn,Fn|Un(zn

fn|un(m

l))

and

L1QZn,Fn(zn

fn)

Am(zn

fn|un)

and

B(zn

fn)

, respectively, to simplify the expression, where

· · ·

; (f) is by

definition of the expectation over

{Un(m

w)}L,L1

m=1,w=1

. Since

{Un(m

w)}

are generated independently

and identically according to

, the joint distribution of codewords in a codebook is the product of

marginal distributions; in (g), we expand the summation with respect to

and

; in (h), we expand the

product according to the form in step (g); in (i), we collect terms to form the expectation

EUn\Un(k,l)

; in (j),

we collect the terms by introducing additional indices

; in (k), we apply Jensen’s inequality to the

logarithm; (l) is by expanding the expectation

EUn\Un(k,l)

; (m) is by adding the term

PZnFnUn(zn

un(a

b))

;

(n) is by the definition of marginalization over

PZnFnUn

with respect to

. In particular, the second

term on the RHS of the numerator in (m) becomes

EUn[PZn,Fn|Un] = QZnFn

from

(A3)

; (o) is by definition

of the expectation.

Appendix C. Proof of Theorem 2

We sketch the proof as follows in three steps, while the main idea is summarized in Figure A1.

The key is to show that the stochastically degraded source

implies that the corresponding

CWTC [

] is also stochastically degraded, then the secret key capacity

CSK

of the source is the same

as the secrecy capacity of a CWTC constructed from a physically degraded source

. The first

step is to construct the CWTC of the source

and prove that, if

X−Y−Z

, then

U−Y0−Z0

, i.e., the

corresponding CWTC is also physically degraded, where

is the conceptual code symbol, uniformly

distributed in

and independent of

. The equivalently received signals at Bob and Willie in the

CWTC are

Y0,(Y

U⊕X)

and

Z0,(Z

U⊕X)

, respectively, while

U⊕X

is the signal transmitted through

the public channel. The second step is to construct a stochastically degraded source

from

. After that, we construct the corresponding CWTC of

Y00

Z00)

, where

Y00 ,(˜

U⊕X)

and

Z00 ,(˜

U⊕X)

. The third step is to show that the CWTC described by

Y00

Z00)

has the same

marginals as the CWTC described by

Z0)

, i.e., the two CWTC’s have the same secrecy capacity.

In addition to the fact that stochastic degradedness is no stronger than the physical degradedness,

Entropy 2020,22, 679 12 of 14

this results in that the former one should not have a higher secret key capacity than the latter one.

We then know that the sources

and

have the same secret key capacity, which completes

the proof.

Common random source Conceptual WTC

Physically degraded:

Stochastically degraded:

Figure A1.

Key steps in the proof of Theorem 2. First is to show that the CWTCalso physically

degraded if the source is physically degraded. Then, we show that the CWTC constructed from

a stochastically degraded source corresponding to the physically degraded source is a stochastically

degraded CWTC with the same marginal as that physically degraded CWTC. Finally, we show that

the secrecy capacity of the second CWTC is indeed the secret key capacity of the stochastically degraded

source. The key and secrecy capacities of the physically degraded source and the corresponding CWTC

are denoted by

CSK

and

, respectively, while the key (rate) and secrecy capacities of the stochastic

source and the corresponding CWTC are denoted by C0

SK (R0

SK)and C0

S, respectively.

In the following, we will prove that the stochastically degraded random source

implies

that the corresponding CWTC [

] is also stochastically degraded, which is constructed by the

corresponding physically degraded source

. We start from constructing the CWTC of the

random source

, where the equivalently received signals at Bob and Willie are

Y0,(Y

U⊕X)

and

Z0,(Z

U⊕X)

, respectively. If

X−Y−Z

, then

U−Y0−Z0

, i.e., the CWTC is also a physically degraded

one, which can be shown as follows:

I(U;Z0|Y0)(a)

=H(Z,U⊕X|Y,U⊕X)−H(Z,U⊕X|Y,U⊕X,U)

(b)

=H(Z|Y)−H(Z|Y,U⊕X,U)

(c)

=H(Z|Y)−H(Z|Y,X,U)(A17)

(d)

=H(Z|Y)−H(Z|Y,X)

(e)

=0,

where (a) is by the definition of

and

; (b) is by the crypto lemma [

], and

is selected to be

independent of

and

; (c) is from the fact that given

, we can know

from

U⊕X

; (d) is by the same

reason as (b); (e) is due to X−Y−Zand by the definition of conditional mutual information.

Now, we consider the stochastically degraded source of common randomness

fulfilling

P˜

Y|X=PY|X

and

P˜

Z|X=PZ|X

. Similar to the first step, we construct the CWTC from

, namely

{(U

Y00

Z00)

PU,Y00,Z00 =PUPY00,Z00|U}

, where

Y00 ,(˜

U⊕X)

and

Z00 ,(˜

U⊕X)

are the equivalent channel

outputs at Bob and Willie, respectively. To prove that the two CWTC’s

PY0,Z0|U

and

PY00,Z00|U

are equivalent,

we can invoke the same marginal property in ([

] Theorem 16.6) to prove that

PY00|U=PY0|U

and

PZ00 |U=PZ0|U

, which is shown in the following. By the definitions of

Y00

and

, we know that

PY00|U=u=

P˜

Y,u⊕X,P˜

Y,Xu

and

PY0|U=u=PY,u⊕X,PY,Xu

, respectively, where we define

u⊕X

. Note that due to the

Entropy 2020,22, 679 13 of 14

closed operation

⊕

, the distribution of

is a left circular shift of that of

. Instead of directly

proving P˜

Y,Xu=PY,Xu, we can equivalently prove P˜

Y|Xu=PY|Xu,∀u, as follows:

P˜

Y|Xu=x=P˜

Y|X=x0=PY|X=x0=PY|Xu=x,(A18)

where the second equality is due to the assumption of

forming a stochastically degraded source

from the physically degraded one

. Therefore,

PY00|U=PY0|U

. Similarly, we can derive

PZ00 |U=PZ0|U

Hence, we prove that the CWTC of a stochastic degraded source is also a degraded CWTC and the

two CWTC’s have the same secrecy capacity.

Note that due to

X−Y−Z

, we can derive the key capacity

CSK

from the corresponding CWTC,

not just an achievable key rate. For the source

, till now, we may only claim the achievable

secret key rate, but not the secret key capacity, namely

, is the same as

CSK

. That is because the

CWTC is in general only an achievable scheme to derive the secret key rate. However, due to the fact

that the stochastic degradedness is more general than the physical degradedness, i.e., less stringent on

characterizing the order between

and

, the stochastic degradedness cannot result in a larger secret

key capacity than the physically degraded one. Therefore, we attain that

SK =CSK =I(X

;

Y)−I(X

;

which completes the proof.

References

1. Wyner, A.D. The wiretap channel. Bell Syst. Tech. J. 1975,54, 1355–1387. [CrossRef]

Che, P.H.; Bakshi, M.; Jaggi, S. Reliable deniable communication: Hiding messages in noise. In Proceedings

of the 2013 IEEE International Symposium on Information Theory, Istanbul, Turkey, 7–12 July 2013;

pp. 2945–2949.

Hou, J.; Kramer, G. Effective secrecy: Reliability, confusion and stealth.In Proceedings of the 2014 IEEE

International Symposium on Information Theory, Honolulu, HI, USA, 29 June–4 July 2014; pp. 601–605.

Wang, L.; Wornell, G.W.; Zheng, L. Fundamental limits of communication with low probability of detection.

IEEE Trans. Inf. Theory 2016,62, 3493–3503. [CrossRef]

Bloch, M.R. Covert Communication Over Noisy Channels: A Resolvability Perspective. IEEE Trans. Inf.

Theory 2016,62, 2334–2353. [CrossRef]

Bash, B.A.; Goeckel, D.; Towsley, D. Limits of reliable communication with low probability of detection on

AWGN channels. IEEE J. Sel. Areas Commun. 2013,31, 1921–1930. [CrossRef]

Bloch, M.; Barros, J. Physical-Layer Security From Information Theory to Security Engineering; Cambridge

University Press: Cambridge, UK, 2011.

Tahmasbi, M.; Bloch, M.R. Covert Secret Key Generation. In Proceedings of the 2017 IEEE Conference on

Communications and Network Security (CNS), Las Vegas, NV, USA, 9–11 October 2017; pp. 540–544.

9. Shaked, M.; Shanthikumar, J.G. Stochastic Orders; Springer: Berlin/Heidelberger, Germany,2007.

10.

Naito, M.; Watanabe, S.; Matsumoto, R.; Uyematsu, T. Secret key agreement by soft-decision of signals in

Gaussian Maurer’s model. IEICE Trans. Fundam. 2009,E92-A, 525–534. [CrossRef]

11.

Moser, S.M. Advanced Topics in Information Theory-Lecture Notes. 2013. Available online: http://moser-

isi.ethz.ch/docs/it_script_v46.pdf (accessed on 10 June 2020).

12. Orlitsky, A.; Roche, J. Coding for computing. IEEE Trans. Inf. Theory 2001,47, 903–917. [CrossRef]

13. Thorisson, H. Coupling, Stationarity, and Regeneration; Springer: New York, NY, USA, 2000.

14. Gamal, A.E.; Kim, Y.H. Network Information Theory; Cambridge University Press: Cambridge, UK, 2012.

15.

Simon, M.K.; Alouini, M.S. Digital Communication over Fading Channels; John Wiley & Sons: Chichester,

UK, 2000.

16.

Maurer, U.M. Secret Key Agreement by Public Discussion from Common Information. IEEE Trans. Inf.

Theory 1993,39, 733–742. [CrossRef]

17.

Polyanskiy, Y.; Wu, Y. Lecture Notes on Information Theory 2019. Available online: http://www.stat.yale.

edu/~yw562/teaching/itlectures.pdf (accessed on 10 June 2020).

18.

Hou, J.; Kramer, G. Informational divergence approximations to product distributions. In Proceedings of

the 2013 13th Canadian Workshop on Information Theory, Toronto, ON, Canada, 18–21 June 2013.

Entropy 2020,22, 679 14 of 14

19. Stoyanov, J.M. Counterexamples in Probability, 3rd ed.; Dover: New York, NY, USA, 2013.

20. Cover, T.M. Broadcast Channels. IEEE Trans. Inf. Theory 1972,18, 2–14. [CrossRef]

21.

Bennatan, A.; Burshtein, D.; Caire, G.; Shamai, S. Superposition coding for side-information channels.

IEEE Trans. Inf. Theory 2006,52, 1872–1889. [CrossRef]

22.

Forney, G.D., Jr. On the Role of MMSE Estimation in Approaching the Information-Theoretic Limits of Linear

Gaussian Channels: Shannon meets Wiener. 2004. Available online: https://arxiv.org/pdf/cs/0409053.pdf

(accessed on 10 June 2020).

2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access

article distributed under the terms and conditions of the Creative Commons Attribution

(CC BY) license (http://creativecommons.org/licenses/by/4.0/).