
Cognitive Computation

https://doi.org/10.1007/s12559-020-09733-5

Limitations of the Recall Capabilities in Delay-Based Reservoir Computing Systems

Felix Köster1 · Dominik Ehlert1 · Kathy Lüdge1

Received: 28 February 2020 / Accepted: 14 May 2020

©The Author(s) 2020

Abstract

We analyse the memory capacity of a delay-based reservoir computer with a Hopf normal form as nonlinearity and

numerically compute the linear as well as the higher order recall capabilities. A possible physical realization could be a

laser with external cavity, for which the information is fed via electrical injection. A task-independent quantification of

the computational capability of the reservoir system is done via a complete orthonormal set of basis functions. Our results

suggest that even for constant readout dimension the total memory capacity is dependent on the ratio between the information

input period, also called the clock cycle, and the time delay in the system. Optimal performance is found for a time delay

about 1.6 times the clock cycle.

Keywords Lasers · Reservoir computing · Nonlinear dynamics

Introduction

Reservoir computing is a machine learning paradigm [1]

inspired by the human brain [2], which utilizes the natural

computational capabilities of dynamical systems. As a

subset of recurrent neural networks it was developed to

predict time-dependent tasks with the advantage of a very

fast training procedure. Generally the training of recurrent

neural networks is connected with high computational cost

resulting e.g. from connections that are correlated in time.

Therefore, problems like the vanishing gradient in time arise

[3]. Reservoir computing avoids this problem by training

just a linear output layer, leaving the rest of the system (the reservoir) as it is. Thus, the inherent computing capabilities can be exploited. One can divide a reservoir into three distinct subsystems: the input layer, which corresponds to the projection of the input information into the system; the dynamical system itself, which processes the information; and the output layer, which is a linear combination of the system’s states trained to predict an often time-dependent task.

This article belongs to the Topical Collection: Trends in Reservoir Computing
Guest Editors: Claudio Gallicchio, Alessio Micheli, Simone Scardapane, Miguel C. Soriano

Felix Köster
f.koester@tu-berlin.de

Dominik Ehlert
[email protected]

Kathy Lüdge
kathy.luedge@tu-berlin.de

1 Institut für Theoretische Physik, Technische Universität Berlin, Straße des 17. Juni 135, 10623 Berlin, Germany

Many different realizations have been presented in the

last years, ranging from a bucket of water [4] over field

programmable gate arrays (FPGAs) [5] to dissociated neural

cell cultures [6], being used for satellite communications

[7], real-time audio processing [8,9], bit-error correction

for optical data transmission [10], amplitude of chaotic laser

pulse prediction [11] and cross-predicting the dynamics of

an injected laser [12]. Especially opto-electronic [13,14]

and optical setups [15–19] were frequently studied because

their high speed and low energy consumption make them

preferable for hardware realizations.

The interest in reservoir computing was refreshed

when Appeltant et al. showed a realization with a single

dynamical node under influence of feedback [20], which

introduced a time-multiplexed reservoir rather than a

spatially extended system. A schematic sketch is shown in

Fig. 1. In general the delay architecture slows down the

information processing speed but reduces complexity of

the hardware. Many neuron based, electromechanical, opto-

electronic and photonic realizations [21–26] showed the

Cogn Comput

capabilities from time series predictions [27,28] over an equalization task on nonlinearly distorted signals [29] up to fast word recognition [30]. A more general analysis showed the general and task-independent computational capabilities of semiconductor lasers [31]. A broad overview is given in [32,33].

Fig. 1 Schematic sketch of the time-multiplexed reservoir computing scheme. The input is preprocessed by multiplication with a mask that induces the time-multiplexing and is then electrically injected. The laser in our case is governed by a Hopf normal form. The output dimension of the system is 4 in this example

In this paper we perform a numerical analysis of the

recall capabilities and the computing performance of a

simple nonlinear oscillator, modelled by a Hopf normal

form, with delayed feedback. We calculate the total memory

capacity as well as the linear and nonlinear contributions

using the method derived by Dambre et al. in [34].

The paper is structured as follows. First, we briefly explain the concept of time-multiplexed reservoir computing and give a short overview of the method used for calculating the memory capacity. After that we present our results and discuss the impact of the delay time on the performance and the different nonlinear recall contributions.

Methods

Traditionally, reservoir computing was realized by ran-

domly connecting nodes with simple dynamics (for example

the tanh-function [1]) to a network, which was then used

to process information. The linear estimator of the readouts

is then trained to approximate a target, e.g. predict a time-

dependent task. The network thus transforms the input into

a high dimensional space in which the linear combination

can be used to separate different inputs, i.e. to classify the

given data.

In the traditional reservoir computing setup a reaction from the system $s_n = (s_{1n}, s_{2n}, \ldots, s_{Mn}) \in \mathbb{R}^M$ is recorded together with the corresponding input $u_n$ and the target $o_n$. In this case $n$ is the index for the $n$th input-output training datapoint, ranging from 1 to $N$, and $M$ is the dimension of the measured system states. The goal for the reservoir computing paradigm is to approximate the target $o_n$ as close as possible with linear combinations of the states $s_n$ for all input-output pairs $n$, meaning that $\sum_{m=1}^{M} w_m s_{mn} = \hat{o}_n \approx o_n$ for all $n$, where $w = (w_1, w_2, \ldots, w_M) \in \mathbb{R}^M$ are the weights to be trained. We want to find the best solution for

$$s \cdot w \approx o, \qquad (1)$$

where $s \in \mathbb{R}^{N \times M}$ is the state matrix defined by all system state reactions $s_n$ to their corresponding inputs $u_n$, $w$ are the weights to train and $o \in \mathbb{R}^N$ is the vector of targets to be approximated. This is equivalent to a least square problem which is analytically solved by [35]

$$w = (s^T s)^{-1} s^T o. \qquad (2)$$

The capability of the system to approximate the target task can be quantified by the normalized root mean square difference between the approximated answers $\hat{o}_n$ and the targets $o_n$

$$\mathrm{NRMSE} = \sqrt{\frac{\sum_{n=1}^{N} (o_n - \hat{o}_n)^2}{N \cdot \mathrm{var}(o)}}, \qquad (3)$$

where NRMSE is the normalized root mean square error of the target task with $\mathrm{var}(o)$ being the variance of the target values $o = (o_1, o_2, \ldots, o_N)$ and $N$ the number of sample points. An NRMSE of 1 indicates that the system is not capable of approximating the task better than approximating the mean value, a value NRMSE = 0 indicates that it is able to compute the task perfectly. For a successful operation $N \gg M$ needs to be fulfilled, where $M$ is the number of output weights $w_m$ and $N$ is the number of training data points. This corresponds to a training data set of size $N$ being significantly bigger than the possible output dimension $M$ to prevent overfitting.
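The readout training of Eq. (2) and the error measure of Eq. (3) can be sketched in a few lines. This is a minimal illustration in Python/NumPy, not the authors' C++ code; the state matrix below is random toy data rather than a reservoir response, and the names `train_readout` and `nrmse` are hypothetical.

```python
import numpy as np

def train_readout(s, o):
    """Least-squares weights w minimising ||s w - o||, cf. Eq. (2)."""
    # lstsq avoids forming (s^T s)^{-1} explicitly, which is numerically safer
    w, *_ = np.linalg.lstsq(s, o, rcond=None)
    return w

def nrmse(o, o_hat):
    """Normalized root mean square error, cf. Eq. (3)."""
    return np.sqrt(np.mean((o - o_hat) ** 2) / np.var(o))

rng = np.random.default_rng(0)
N, M = 2000, 5                      # N >> M, as required in the text
s = rng.normal(size=(N, M))         # toy state matrix (assumption: not a real reservoir)
w_true = np.arange(1.0, M + 1)
o = s @ w_true                      # target that is exactly linear in the states
w = train_readout(s, o)
err = nrmse(o, s @ w)               # close to 0: the task is solved perfectly
err_mean = nrmse(o, np.full_like(o, o.mean()))  # predicting the mean gives NRMSE = 1
```

The second error value illustrates the statement that an NRMSE of 1 corresponds to doing no better than predicting the mean of the target.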

Appeltant et al. introduced in [20] a time-multiplexed scheme for applying the reservoir computing paradigm on a dynamical system with delayed feedback. In this case, the measured states for one input-output pair reaction $s_n = (s_{1n}, s_{2n}, \ldots, s_{Mn})$ are recorded at different times $t_m = t_n + m\theta$, with $m = 1, 2, \ldots, M$, where $t_n$ is the time at which the $n$th input $u_n$ is fed into the system. θ describes the distance between two recorded states of the system and is called the virtual node separation time. The time between two inputs $t_{n+1} - t_n$ is called the clock cycle T and describes the period length in which one input $u_n$ is applied to the system. To get different reactions between two virtual nodes a time-multiplexed masking process is applied. The information fed into the system is preprocessed by multiplying a T-periodic mask g on the inputs (see sketch Fig. 1), which is a piecewise constant function consisting of M intervals, each of length θ. This corresponds to the input weights in the spatially extended system with the difference that now the input weights are distributed over time.
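The masking step described above can be illustrated as follows. This is a sketch under assumptions: the source does not state how the mask values are drawn, so the binary mask used here is only a placeholder, and `masked_input` and `steps_per_node` are hypothetical names.

```python
import numpy as np

def masked_input(u, mask, steps_per_node):
    """Piecewise-constant drive g(t) * u_n sampled on the integration grid.

    u: (N,) inputs, one per clock cycle T
    mask: (M,) mask values, one per virtual node interval of length theta
    steps_per_node: number of integration steps per interval theta
    """
    # row n holds u_n * g_m for m = 1..M, i.e. one clock cycle T = M * theta
    per_node = np.outer(u, mask)
    # hold each value constant for the duration theta, then flatten in time order
    return np.repeat(per_node, steps_per_node, axis=1).ravel()

rng = np.random.default_rng(1)
M = 4
mask = rng.choice([-1.0, 1.0], size=M)   # assumption: binary mask, not taken from the paper
u = np.array([0.5, -0.2, 0.9])
drive = masked_input(u, mask, steps_per_node=3)
```

Each input is thus spread over M intervals, which plays the role of the input weights of a spatially extended network.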

Dambre et al. showed in [34] that the computational capability of a system can be quantified completely via a complete orthonormal set of basis functions on a sequence of inputs $u_n = (\ldots, u_{n-2}, u_{n-1}, u_n)$ at time $n$. In this case the index $n - \Delta n$ indicates the input $\Delta n$ time steps ago. The goal is to investigate how the system transforms the inputs $u_n$. For this the chosen basis functions $z(u_n)$, forming a Hilbert space, are constructed and used to describe every possible transformation on the inputs $u_n$. The system’s capability to approximate these basis functions is evaluated. Consider the following examples: The function $z(u_n) = u_{n-5}$ is chosen as a task $o_n$. This is a transformation of the input sequence 5 steps back. The question this task asks is how well the system can remember the input 5 steps ago. Another case would be $o_n = z(u_n) = u_{n-5} u_{n-2}$, asking how well it can perform the nonlinear transformation of multiplying the input 5 steps into the past with the input 2 steps into the past. A useful quantity to measure the capability of the system is the capacity defined as

$$C = 1 - \mathrm{NRMSE}^2. \qquad (4)$$

A value of 1 corresponds to the system being perfectly capable of approximating the transformation task and 0 corresponds to no capability at all. A simpler method developed by Dambre et al. in [34], which gives the same results as Eq. (4), calculates $C$ as

$$C = \frac{o^T s (s^T s)^{-1} s^T o}{\|o\|^2}, \qquad (5)$$

where $T$ indicates the transpose of a matrix and $-1$ the inverse. We use Eq. (5) to calculate the memory capacity.
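The equivalence of the two capacity definitions can be checked numerically. The sketch below assumes that the normalisation in Eq. (5) is the squared norm of a zero-mean target and that the capacity is evaluated on the training data itself; both are assumptions made for this illustration, and the function names are hypothetical.

```python
import numpy as np

def capacity_projection(s, o):
    """Capacity from the projection of o onto the column space of s, cf. Eq. (5)."""
    # pinv stands in for (s^T s)^{-1}; the source mentions a Moore-Penrose pseudoinverse
    return float(o @ s @ np.linalg.pinv(s.T @ s) @ s.T @ o / (o @ o))

def capacity_nrmse(s, o):
    """Capacity as 1 - NRMSE^2, cf. Eq. (4), with least-squares readout weights."""
    w, *_ = np.linalg.lstsq(s, o, rcond=None)
    nrmse_sq = np.mean((o - s @ w) ** 2) / np.var(o)
    return float(1.0 - nrmse_sq)

rng = np.random.default_rng(2)
N, M = 5000, 6
s = rng.normal(size=(N, M))               # toy states (assumption: not a real reservoir)
o = s[:, 0] + 0.5 * rng.normal(size=N)    # target that is only partly recoverable
o -= o.mean()                             # zero-mean target, needed for the equivalence
c5 = capacity_projection(s, o)
c4 = capacity_nrmse(s, o)
```

For a target with non-zero sample mean the two expressions differ, which is one reason zero-mean basis functions are convenient.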

In this paper we use finite products of normalized Legendre polynomials $P_{d_{\Delta n}}$ as a full basis of the constructed Hilbert space for each input step combination. $d_{\Delta n}$ is the order of the used Legendre polynomial and $\Delta n$ the $\Delta n$th step into the past passed as value to the Legendre polynomial. Multiplying a set of those Legendre polynomials gives the target task $y_{\{d_{\Delta n}\}}$, which yields (see example below for clarification)

$$y_{\{d_{\Delta n}\}} = \prod_{\Delta n} P_{d_{\Delta n}}(u_{-\Delta n}). \qquad (6)$$

This is directly taken from [34]. It is important that the inputs to the system are uniformly distributed random numbers $u_n$, which are independent and identically drawn in $[-1, 1]$ to match the used normalized Legendre polynomials. To calculate the memory capacity $MC_d$ for a degree $d$, a summation over all possible past input sets is done

$$MC_d = \sum_{\{\Delta n\}} C^d_{\{\Delta n\}}, \qquad (7)$$

where $\{\Delta n\}$ is the set of past input steps, $C^d_{\{\Delta n\}}$ is the capacity of the system to approximate a specific transformation task $z_{\{\Delta n\}}(u_n)$ and $d$ is the degree of all Legendre polynomials combined in the task $z_{\{\Delta n\}}(u_n)$. In the example from above with $z_{\{-5,-2\}}(u_n) = u_{n-5} u_{n-2}$, it is $d = 2$ and $\{\Delta n\} = \{-5, -2\}$. For $d = 1$ we get the well known linear memory capacity. To compute the total memory capacity, a summation over all degrees $d$ is done,

$$MC = \sum_{d=1}^{D} MC_d. \qquad (8)$$

Dambre et al. showed in [34] that the MC is limited by the readout dimension $M$, given here by the number of virtual nodes $N_V$.
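A target task of the form of Eq. (6) can be constructed as sketched below. The normalisation factor sqrt(2d + 1), which makes the Legendre polynomials orthonormal for inputs uniform in [-1, 1], is an assumption of this sketch, as are the function name `legendre_task` and the dictionary convention for specifying the task.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_task(u, degrees):
    """Target y_n = prod_k P_{d_k}(u_{n-k}) for degrees = {k: d_k}, k = steps into the past."""
    y = np.ones_like(u)
    for k, d in degrees.items():
        coeffs = np.zeros(d + 1)
        coeffs[d] = 1.0                       # selects the Legendre polynomial of degree d
        # np.roll(u, k)[n] equals u[n - k]; the factor normalises P_d (assumption)
        y *= np.sqrt(2 * d + 1) * legendre.legval(np.roll(u, k), coeffs)
    y[: max(degrees)] = np.nan                # no valid history there (np.roll wraps around)
    return y

rng = np.random.default_rng(3)
u = rng.uniform(-1.0, 1.0, size=100_000)      # i.i.d. uniform inputs in [-1, 1]
y = legendre_task(u, {5: 1, 2: 1})            # degree-2 task built from u_{n-5} and u_{n-2}
```

With this normalisation the example from the text, the product of the inputs 5 and 2 steps back, appears with a prefactor of 3 and has unit mean square.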

The simulation was written in C++ with standard libraries used except for linear algebra calculations, which were calculated via the library “Armadillo”. A Runge-Kutta 4th-order method was applied to numerically integrate the delay-differential equation given by Eq. (10) with an integration step $\Delta t = 0.01$ in time units of the system. First, the system was simulated without any inputs to let transients decay. Afterwards a buffer time was applied with 100000 inputs that were excluded from the training process. Then, the training and testing process itself was done with 250000 inputs to have sufficient statistics. The tasks are constructed via Eq. (6) and the corresponding capacities $C^d_{\{\Delta n\}}$ were calculated via Eq. (5). All possible combinations of the Legendre polynomials up to degree $D = 10$ and $\Delta n = 1000$ input steps into the past were considered. Capacities $C^d_{\{\Delta n\}}$ below 0.001 were excluded because of finite statistics. To calculate the inverse, the Moore–Penrose pseudoinverse from the C++ linear algebra library “Armadillo” was used.
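A fixed-step Runge-Kutta integration of a delay equation of the form of Eq. (10) might look as follows. This is a Python sketch and not the authors' C++ implementation; in particular, the linear interpolation of the delayed term at the half step and the constant initial history are assumptions, and `simulate_hopf_delay` is a hypothetical name. The default parameter values follow Table 1.

```python
import numpy as np

def simulate_hopf_delay(drive, dt, tau, lam=-0.02, eta=0.01, omega=0.0,
                        gamma=-0.1, kappa=0.1, phi=0.0, z0=0.1 + 0.0j):
    """Integrate dZ/dt = (lam + eta*drive + i*omega + gamma*|Z|^2) Z
    + kappa*exp(i*phi)*Z(t - tau) with fixed-step RK4 (requires tau >= dt).
    drive[k] is the masked input g*I, held constant during step k."""
    n_delay = int(round(tau / dt))             # delay measured in integration steps
    z = np.empty(n_delay + len(drive) + 1, dtype=complex)
    z[: n_delay + 1] = z0                      # assumption: constant history on [-tau, 0]
    fb = kappa * np.exp(1j * phi)

    def f(zc, zd, j):
        return (lam + eta * j + 1j * omega + gamma * abs(zc) ** 2) * zc + fb * zd

    for k, j in enumerate(drive):
        i = n_delay + k                        # index of the current time
        zd0, zd1 = z[i - n_delay], z[i - n_delay + 1]
        zdh = 0.5 * (zd0 + zd1)                # delayed state at the half step (interpolated)
        k1 = f(z[i], zd0, j)
        k2 = f(z[i] + 0.5 * dt * k1, zdh, j)
        k3 = f(z[i] + 0.5 * dt * k2, zdh, j)
        k4 = f(z[i] + dt * k3, zd1, j)
        z[i + 1] = z[i] + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return z[n_delay + 1:]

# Without input the intensity should settle at (lam + kappa) / (-gamma) = 0.8
z = simulate_hopf_delay(np.zeros(10_000), dt=0.1, tau=20.0)
```

The coarser step and shorter delay in the usage line are chosen only to keep the example fast; they are not the values used in the paper.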

We characterize the performance of our nonlinear oscillator by evaluating the total memory capacity MC, the contributions $MC_d$ as well as the NRMSE of the NARMA10 task. The latter is a benchmark test and combines memory and nonlinear transformations. It is given by an iterative formula

$$A_{n+1} = 0.3 A_n + 0.05 A_n \sum_{i=0}^{9} A_{n-i} + 1.5 u_{n-9} u_n + 0.1. \qquad (9)$$

Here, $A_n$ is an iteratively given number and $u_n$ is an independent and identically drawn uniformly distributed random number in $[0, 0.5]$. The reservoir is driven by the random numbers $u_n$ and has to be able to predict the value of $A_{n+1}$, $o = A$.

The reservoir we use for our analysis is a Stuart-Landau oscillator, also called Hopf normal form [36], with delayed feedback. This is a generalized model applicable for all systems operated close to a Hopf bifurcation, i.e. close to the onset of intensity oscillations. One example would be a laser operated closely above threshold [37]. A derivation from the Class B rate equations is shown in the Appendix. The equation of motion is given by

$$\dot{Z} = (\lambda + \eta g I + i\omega + \gamma |Z|^2) Z + \kappa e^{i\phi} Z(t - \tau), \qquad (10)$$

and was taken from [18]. Here, $Z$ is a complex dynamical variable (in the case of a laser $|Z|^2$ resembles the intensity), λ is a dimensionless pump rate, η the input strength of the information fed into the system via electrical injection, g is the masking function, I is the input, ω is the frequency with which the dynamical variable Z rotates in the complex plane without feedback (in case of a laser, this is the frequency of the emitted laser light), γ the nonlinearity in the system, κ is the feedback strength, φ the feedback phase and τ the delay time. The corresponding parameters used in the simulations are found in Table 1 if not stated otherwise.
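The NARMA10 recursion of Eq. (9) can be generated with a short loop. In this sketch the first ten values of the series are initialised to zero, which is an assumption not stated in the source; the function name `narma10` is likewise hypothetical.

```python
import numpy as np

def narma10(u):
    """A_{n+1} = 0.3 A_n + 0.05 A_n sum_{i=0}^{9} A_{n-i} + 1.5 u_{n-9} u_n + 0.1"""
    a = np.zeros(len(u) + 1)                  # assumption: A_0 ... A_9 start at zero
    for n in range(9, len(u)):
        a[n + 1] = (0.3 * a[n]
                    + 0.05 * a[n] * a[n - 9 : n + 1].sum()   # sum over A_{n-9} ... A_n
                    + 1.5 * u[n - 9] * u[n]
                    + 0.1)
    return a

# Deterministic check: for constant input 0.2 the series settles at the
# smaller root of 0.5 A^2 - 0.7 A + 0.16 = 0
a_const = narma10(np.full(500, 0.2))

# Typical use: i.i.d. inputs drawn uniformly from [0, 0.5]
rng = np.random.default_rng(4)
u = rng.uniform(0.0, 0.5, size=1000)
a = narma10(u)
```

The constant-input case is only a sanity check of the recursion; the benchmark itself uses the random input sequence.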

Results

To get a first impression about how the system can recall inputs from the input sequence $u_n = (\ldots, u_{n-2}, u_{n-1}, u_n)$, we show the linear recall capacities $C^1_{\{\Delta n\}}$ in Fig. 2. Here, each set of inputs $\{\Delta n\}$ consists of only one input step $\Delta n$, because $d = 1$, for which $z_{\{\Delta n\}}(u_n)$ consists of the Legendre polynomial $P_1(u_{n-\Delta n}) = u_{n-\Delta n}$. The capacities $C^1_{\{\Delta n\}}$ are plotted over the step $\Delta n$ to be recalled for 3 different delay times τ (blue, orange and green in Fig. 2) while the input period time T is kept fixed at 80 and the readout dimension $N_V$ at 50. These timescale parameters were chosen to fit the characteristic timescale of the system, such that the time between two virtual nodes θ is long enough for the system to react, but short enough that the processing speed is still as high as possible. For input period times T = τ (the blue solid line in Fig. 2) a high capacity is achieved for a few recall steps, after which the recallability drops steadily down to 0 at about the 15th step ($\Delta n = 15$) to recall. This changes when the delay time reaches 3 times the input period time, τ = 3T (the orange solid line in Fig. 2). Here, the linear recallability $C^1_{\{\Delta n\}}$ oscillates between high and low values as a function of $\Delta n$, while its envelope steadily decreases until it reaches 0 at around the 35th ($\Delta n = 35$) step to be recalled. Considering that τ = 3T is a resonance between the input period time T and the delay time τ, one can also take a look at the case for off-resonant setups, which is shown by the green solid line in Fig. 2 with τ ≈ 3.06T. This parameter choice shows a similar behaviour as the τ = 3T one but with higher capacities for short recall steps and a faster decay of the recallability, which reaches 0 at around the 29th ($\Delta n = 29$) step.

Table 1 Parameters used in the simulation if not stated otherwise

Parameter   Description               Value
λ           Pump rate                 −0.02
η           Input strength            0.01
ω           Free running frequency    0.0
γ           Nonlinearity              −0.1
κ           Feedback strength         0.1
φ           Feedback phase            0.0
N_V         Number of virtual nodes   50
T           Input period time         80

Fig. 2 $C^1_{\{\Delta n\}}$ as defined in Eq. (7) plotted over the $\Delta n$th input step to recall for 3 different delay times τ. The input period time is T = 80

To get a more complete picture, we evaluated the linear capacities $C^1_{\{\Delta n\}}$ and quadratic capacities $C^2_{\{\Delta n\}}$ of the system and depicted these as a heatmap over the delay time τ and the input steps $\Delta n$ in Fig. 3 for a constant input period time T. The x-axis indicates the $\Delta n$th step to be recalled while the delay time τ is varied from bottom to top on the y-axis. In Fig. 3a the linear capacities $C^1_{\{\Delta n\}}$ are shown, for which the red horizontal solid lines indicate the scan from Fig. 2. One can see a continuous capacity $C^1_{\{\Delta n\}}$ for τ < 2T which forks into rays of certain recallable steps $\Delta n$ that linearly increase with the delay time τ. This implies that specific steps $\Delta n$ can be remembered while others in between are forgotten, a crucial limitation to the performance of the system. Generally the number of steps into the past that can be remembered increases with τ (at constant T), while on the other hand the gaps in between the recallable steps also increase. Thus, the total memory capacity stays constant. This will be discussed later in Fig. 4. In Fig. 3b the pure quadratic capacity $C^{2,p}_{\{\Delta n\}}$ is plotted within the same parameter space as in Fig. 3a. Pure means that only Legendre polynomials of degree 2, i.e. $P_2(u_{n-\Delta n}) = \frac{1}{2}(3u_{n-\Delta n}^2 - 1)$, were considered, rather than also considering combinations of two Legendre polynomials of degree 1, i.e. $P_1(u_{n-\Delta n_1}) P_1(u_{n-\Delta n_2}) = u_{n-\Delta n_1} u_{n-\Delta n_2}$. In the graph one can see the same behaviour as for the linear capacities $C^1_{\{\Delta n\}}$ (Fig. 3a), but with fewer rays and thus fewer steps that can be remembered from the past. This indicates that the dynamical system is not as effective in recalling inputs and additionally transforming them nonlinearly as it is in just recalling them linearly.

For the full quadratic nonlinear transformation capacity, all combinations of two Legendre polynomials of degree 1 for different input steps into the past have to be considered, i.e.

$$P_1(u_{n-\Delta n_1}) P_1(u_{n-\Delta n_2}) = u_{n-\Delta n_1} u_{n-\Delta n_2}.$$

This is shown in Fig. 3c. Again, the capacities $C^2_{\{\Delta n\}}$ are depicted as a heatmap and the delay time τ is varied along the y-axis. This time the x-axis shows the steps of the first Legendre polynomial $\Delta n_1$, while in between two ticks of the x-axis, the second Legendre polynomial’s step $\Delta n_2$ is scanned from 0 up to 45 steps into the past. For the steps of the second Legendre polynomial $\Delta n_2$ the capacity exhibits the same behaviour as already discussed for Fig. 3a and b. This also applies to the first Legendre polynomial, which induces interference patterns in the capacity space of the two combined Legendre polynomials. The red dashed lines highlight the ray behaviour of the first Legendre polynomial. We therefore learn that the performance of a reservoir computer described by a Hopf normal form with delay drastically depends on the task. There are certain nonlinear transformation combinations $u_{-\Delta n_1} u_{-\Delta n_2}$ of the inputs $u_{-\Delta n_1}$ and $u_{-\Delta n_2}$ which cannot be approximated due to the missing memory at specific steps. To overcome these limitations it would be recommended to use multiple systems with different parameters to compensate for each other.

Fig. 3 a Linear capacity $C^1_{\{\Delta n\}}$ plotted color-coded over the delay time τ and the input steps $\Delta n$ to recall. Parameters as given in Table 1. The red horizontal solid lines indicate the scan from Fig. 2. b Quadratic pure capacity $C^{2,p}_{\{\Delta n\}}$. c Combination of two Legendre polynomials of degree 1 indicating the capability of nonlinear transformations of the form $u_{-\Delta n_1} u_{-\Delta n_2}$. Here $\Delta n_1$ of the first polynomial is plotted while between two $\Delta n_1$-steps $\Delta n_2$ is increased from 0 to 45 steps into the past. Yellow indicates good while blue and black indicate bad recallability. The input period time is T = 80

Fig. 4 Total memory capacity MC as defined by Eq. (8) (blue) and memory capacities $MC_{1,2,3,4}$ of degree 1 to 4 (orange, green, red, violet) plotted over the delay time τ for the same parameters as in Fig. 3. Resonances between the clock cycle T and the delay time τ are depicted as vertical red and green dashed lines. One can see the loss in memory capacity at the resonances, especially for degree 2. Higher order transformations with d > 3 are more effective in the regime where τ < 1.5 T

To fully characterize the computational capabilities of our reservoir computer, a full analysis of the degree-d memory capacities $MC_d$ and the total memory capacity MC as defined in Eq. (8) is done. The results are depicted in Fig. 4a as a function of the delay time τ. All other parameters are fixed as in Fig. 3. The orange solid line in Fig. 4 refers to the linear, the green, red and violet lines to the quadratic, cubic and quartic memory capacity $MC_{1,2,3,4}$, respectively. The blue solid line shows the total memory capacity MC summed up over all degrees up to 10. Dambre et al. showed in [34] that the MC is limited by the number of read-out dimensions and equals it when all read-out dimensions are linearly independent. In our case the read-out dimension is given by the number of virtual nodes $N_V = 50$. Nevertheless, the total memory capacity MC starts at around 15 for a very short time delay with respect to the input period time T. This low value arises from the fact that a short delay induces a high correlation between the responses of the dynamical system, which induces highly linearly dependent virtual nodes. This is an important general result that has to be kept in mind for all delay-based reservoir computing systems: with τ < 1.5 T the capability of the reservoir computer is partially wasted. Increasing the delay time τ also increases the total memory capacity MC, which reaches the upper bound of 50 at around 1.5 times the input period time T.

For τ > 1.5 T an interesting behaviour emerges. Depicted by the vertical red dashed lines are multiples of the input period time T at which the total memory capacity MC drops again significantly to around 40. A drop in the linear memory capacity was discussed in the paper by Stelzer et al. [38] and explained by the fact that resonances between the delay time τ and the input period time T result in a sparse connection between the virtual nodes. Our results now show that this affects the total memory capacity MC, mainly by reducing the quadratic memory capacity $MC_2$. At the resonances the quadratic nonlinear transformation capability of the system is reduced. To conclude, delay-based reservoir computing systems should be kept off the resonances between T and τ to maximize the computational capability. A surprising result is that for the chosen Hopf nonlinearity the linear memory capacity $MC_1$ is only slightly influenced by the resonances. A result from Dambre et al. in [34], analysed by Inubushi et al. in [39], showed that a trade-off between the linear recallability and the nonlinear transformation capability exists. This is clearly only the case if the theoretical limit of the total memory capacity MC is reached and kept constant, so that every change in the linear memory capacity $MC_1$ has to induce a change in the nonlinear memory capacities $MC_d$, d > 1. In the case of resonances, a decrease in the total memory capacity MC happens and thus this loss can be distributed in any possible way over the different memory capacities $MC_d$. In our case, we see that the influence on the quadratic memory capacity $MC_2$ is highest.

The system is capable of a small amount of cubic transformations, depicted by the solid red line in Fig. 4a, which also decreases at the resonances in a similar way as the quadratic contribution does. Higher order memory capacities $MC_d$, with d > 3, have only small contributions for short delay times τ, dropping to 0 for increased time delay τ. A possible explanation is the fact that short delays induce an interaction of the last input directly with itself for $k = T/\tau$ times, depending on the ratio between τ and T. As a result, short delay times τ enable highly nonlinear tasks at the expense of a lower total memory capacity MC.

For more insights into the computing capabilities of our nonlinear oscillator we now also discuss the NARMA10 time series prediction task, shown in Fig. 4b. Comparing the memory capacities $MC_d$ with the NARMA10 computation error NRMSE in Fig. 4b, a small increase in the NARMA10 NRMSE can be seen at the resonances with $n\tau = mT$, where $n \in [0, 1, 2, \ldots]$ and $m \in [0, 1, 2, \ldots]$. For a systematic characterization a scan of the input period time T and the delay time τ was done and the total memory capacity MC (Fig. 5c), the memory capacities of degrees 1–4 $MC_{1,2,3,4}$ (Fig. 5a, b, d, e) and the NARMA10 NRMSE (Fig. 5f) were plotted color-coded over the two timescales. This is an extension of the results of Röhm et al. in [19], where only the linear memory capacity and the NARMA10 computation error were analysed. For short time delays τ and input period times T the memory capacities of degree 1–3 $MC_{1,2,3}$ and the total memory capacity MC are significantly below the theoretical limit of 50, as already seen in the results from Fig. 4, while the NARMA10 NRMSE also has high errors of around 0.8. This comes from the fact that short input period times T also mean short virtual node distances θ, which induces a high linear correlation between the read-out dimensions. Degree 4 on the other hand only has values as long as T > τ, a result coming from the fact that the input $u_n$ has to interact with itself to get a transformation of degree 4. A possible explanation comes from the fact that the dynamical system itself is not capable of transformations higher than degree 3, since the highest order in Eq. (10) is 3. If the delay time τ and the input period time T are long enough, the total memory capacity MC reaches 50, with the exception of resonances between τ and T. These resonances are also seen in the NARMA10 NRMSE, for which higher errors occur. Looking at the memory capacities of degree 1 and 2, $MC_{1,2}$, and comparing them with the NARMA10 NRMSE, one can see a tendency in which the NARMA10 NRMSE is lowest where both have the highest capacities, arising from the fact that the NARMA10 task is highly dependent on linear memory and quadratic nonlinear transformations. This can also be seen in the area below the τ = T resonance. To conclude, one can use the parameter dependencies of the memory capacities $MC_d$ to make predictions of the reservoir capability to approximate certain tasks.

Fig. 5 a Linear memory capacity $MC_1$ plotted color-coded over the delay time τ and the input period time T. b Degree 2, $MC_2$. c Total memory capacity, MC. d Degree 3, $MC_3$. e Degree 4, $MC_4$. f NARMA10 prediction error NRMSE
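The resonance condition between τ and T discussed above can be checked for a given pair of timescales with a small helper. This is only an illustrative sketch: the function name `nearest_resonance` and the cut-off `max_order` for what counts as a low-order resonance are assumptions, not quantities defined in the paper.

```python
from fractions import Fraction

def nearest_resonance(tau, T, max_order=5):
    """Return (n, m, detuning) for the low-order resonance n*tau = m*T closest to tau/T."""
    # closest rational m/n to tau/T with denominator n <= max_order
    ratio = Fraction(tau / T).limit_denominator(max_order)
    m, n = ratio.numerator, ratio.denominator
    return n, m, tau / T - m / n

on_res = nearest_resonance(240.0, 80.0)    # tau = 3T, exactly on a resonance
off_res = nearest_resonance(244.8, 80.0)   # tau approx. 3.06T, slightly detuned
```

The returned detuning measures how far the chosen delay sits from the nearest low-order resonance in units of T.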

Conclusions

We analysed the memory capacities and nonlinear transformation capabilities of a reservoir computer consisting of an oscillatory system with delayed feedback operated close to a Hopf bifurcation, i.e. a paradigmatic model also applicable for lasers close to threshold. We systematically varied the timescales and found regions of high and low reservoir computing performance. Resonances between the information input period time T and the delay time τ should be avoided to fully utilize the natural computational capability of the nonlinear oscillator. A ratio of τ = 1.6 T was found to be optimal for the computed memory capacities, resulting in a good NARMA10 task approximation. Furthermore, it was shown that the recallability for high delay times τ ≫ T is restricted to specific past inputs, which rules out certain tasks. By computing the memory capacities of a Hopf normal form, one can make general assumptions about the reservoir computing capabilities of any system operated close to a Hopf bifurcation. This significantly helps in understanding and predicting the task dependence of reservoir computers.

Acknowledgements The authors would like to thank André Röhm, Joni Dambre and David Hering for fruitful discussions.

Funding Open Access funding provided by Projekt DEAL. This study

was funded by the “Deutsche Forschungsgemeinschaft” (DFG) in the

framework of SFB910.

Compliance with Ethical Standards

Conflict of Interest The authors declare that they have no conflict of interest.

Ethical Approval This article does not contain any studies with human

participants or animals performed by any of the authors.

Open Access This article is licensed under a Creative Commons

Attribution 4.0 International License, which permits use, sharing,

adaptation, distribution and reproduction in any medium or format, as

long as you give appropriate credit to the original author(s) and the

source, provide a link to the Creative Commons licence, and indicate

if changes were made. The images or other third party material in this

article are included in the article’s Creative Commons licence, unless

indicated otherwise in a credit line to the material. If material is not

included in the article’s Creative Commons licence and your intended

use is not permitted by statutory regulation or exceeds the permitted

use, you will need to obtain permission directly from the copyright

holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix

Derivation of the Stuart-Landau equation with delay from the Class B laser rate equations

$$\dot{E} = (1 + i\alpha) E N, \qquad (11)$$

$$\dot{N} = \frac{1}{T}\left(P + \eta g I - N - (1 + 2N)|E|^2\right), \qquad (12)$$

where $E$ is the non-dimensionalized complex electrical field and $N$ the non-dimensionalized carrier inversion, $P$ the pump relative to the threshold with $P_{\mathrm{thresh}} = 0$ and α the Henry factor. The reservoir computing signal is fed into the system via electrical injection $\eta g I$. If fast carriers are considered, an adiabatic elimination of the charge carriers yields

$$0 = \frac{1}{T}\left(P + \eta g I - N - (1 + 2N)|E|^2\right), \qquad (13)$$

$$N = \frac{P + \eta g I - |E|^2}{1 + 2|E|^2}, \qquad (14)$$

which after substituting into Eq. (11) gives

$$\dot{E} = (1 + i\alpha) E \, \frac{\tilde{P} - |E|^2}{1 + 2|E|^2}, \qquad (15)$$

where we introduced the quantity $\tilde{P} = P + \eta g I$ for convenience. This equation yields the full Class A rate equation for the non-dimensionalized complex electric field. Simulations with the full Class A rate equation close to the threshold show similar results to the reduced case. Because we consider lasers that are operated close to the threshold level, a Taylor expansion of the denominator for $|E|^2 \approx 0$ is done,

$$\dot{E} = (1 + i\alpha) E \left(\tilde{P} - |E|^2 - 2\tilde{P}|E|^2\right), \qquad (16)$$

where we set $|E|^4 \approx 0$ as a negligible term. As we consider a laser operated close to the threshold, it follows that the pump $\tilde{P}$ and the intensity $|E|^2$ are of order $O(\epsilon)$, where $\epsilon$ is a small factor. This holds true only if the input signal $\eta g I$ is a small electrical injection. After applying this, the product $\tilde{P}|E|^2$ is of second order in $\epsilon$ and can be dropped, so that the equation is given by

$$\dot{E} = (1 + i\alpha) E \left(\tilde{P} - |E|^2\right). \qquad (17)$$

We can substitute $\tilde{P} = P + \eta g I$ back into the equation, change the rotating frame of the laser by setting $E = Z e^{-i(\omega - \alpha\tilde{P})t}$ and introduce a complex factor $\gamma = -(1 + i\alpha)$ that scales the nonlinearity,

$$\dot{Z} = Z\left(P + \eta g I + i\omega + \gamma|Z|^2\right). \qquad (18)$$

By adding feedback $\kappa e^{i\phi} Z(t - \tau)$ to the system one arrives at Eq. (10),

$$\dot{Z} = Z\left(P + \eta g I + i\omega + \gamma|Z|^2\right) + \kappa e^{i\phi} Z(t - \tau). \qquad (19)$$
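The quality of the near-threshold approximation can be checked numerically by comparing the Class A gain term of Eq. (15) with the reduced cubic form of Eq. (17). The sketch below is illustrative only; the choice of setting the intensity to half the pump value is an arbitrary assumption used to probe the scaling, and the function names are hypothetical.

```python
def gain_class_a(P, I):
    """Gain factor (P - |E|^2) / (1 + 2|E|^2) of the Class A equation, with I = |E|^2."""
    return (P - I) / (1.0 + 2.0 * I)

def gain_hopf(P, I):
    """Reduced gain factor P - |E|^2 of the Hopf normal form."""
    return P - I

def approximation_error(eps):
    # pump and intensity both of order eps (assumption: I = eps / 2 for illustration)
    return abs(gain_class_a(eps, 0.5 * eps) - gain_hopf(eps, 0.5 * eps))

err_coarse = approximation_error(1e-2)
err_fine = approximation_error(1e-3)
ratio = err_coarse / err_fine      # close to 100 if the neglected terms are of order eps^2
```

A ratio near 100 for a tenfold reduction of ε is consistent with the neglected terms being of second order.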

References

1. Jaeger H. The ‘echo state’ approach to analysing and train-

ing recurrent neural networks. GMD Report 148 GMD - Ger-

man National Research Institute for Computer Science. 2001.

https://doi.org/publica.fraunhofer.de/documents/b-73135.html.

2. Maass W, Natschl¨

ager T, Markram H. Neural Comp.

2002;14:2531. https://doi.org/10.1162/089976602760407955.

3. Hochreiter S. Int J Uncert Fuzz Knowl-Based Syst. 1998;6:107.

https://doi.org/10.1142/S0218488598000094.

4. Fernando C, Sojakka S. In Advances in artificial life, pp 588–97. https://doi.org/10.1007/978-3-540-39432-7_63. 2003.

5. Antonik P, Duport F, Hermans M, Smerieri A, Haelterman

M, Massar S. IEEE Trans Neural Netw Learn Syst, 28(11). 2016.

https://doi.org/10.1109/tnnls.2016.2598655.

6. Dockendorf K, Park I, He P, Principe JC, DeMarse TB. Biosystems, 95(2). 2009. https://doi.org/10.1016/j.biosystems.2008.08.001.

7. Bauduin M, Smerieri A, Massar S, Horlin F. 2015 IEEE 81st Vehicular Technology Conference (VTC Spring). 2015.

8. Keuninckx L, Danckaert J, Van der Sande G. Cogn Comput 9(3).

2017. https://doi.org/10.1007/s12559-017-9457-5.

9. Scardapane S, Uncini A. Cogn Comput. 2017;9:125–135.

https://doi.org/10.1007/s12559-016-9439-z.

10. Argyris A, Bueno J, Soriano MC, Fischer I. In

2017 Conf. on Lasers and Electro-optics Europe European

Quantum Electronics Conference (CLEO/Europe-EQEC), p 1.

https://doi.org/10.1109/cleoe-eqec.2017.8086463. 2017.

11. Amil P, Soriano MC, Masoller C. Chaos: Interdiscip J Nonlin

Sci. 2019;29(11):113111. https://doi.org/10.1063/1.5120755.

12. Cunillera A, Soriano MC, Fischer I. Chaos: Interdiscip J Nonlin

Sci. 2019;29(11):113113. https://doi.org/10.1063/1.5120822.

13. Larger L, Soriano M, Brunner D, Appeltant L, Gutierrez JM,

Pesquera L, Mirasso CR, Fischer I. Opt Express. 2012;20(3):

3241. https://doi.org/10.1364/oe.20.003241.

14. Paquot Y, Duport F, Smerieri A, Dambre J, Schrauwen B, Haelterman M, Massar S. Sci Rep 2(287). 2012. https://doi.org/10.1038/srep00287.

15. Brunner D, Soriano M, Mirasso CR, Fischer I. Nat Commun.

2013;4:1364. https://doi.org/10.1038/ncomms2368.

16. Vinckier Q, Duport F, Smerieri A, Vandoorne K, Bienstman P, Haelterman M, Massar S. Optica 2(5). 2015. https://doi.org/10.1364/optica.2.000438.

17. Nguimdo RM, Lacot E, Jacquin O, Hugon O, Van der Sande G, de Chatellus HG. Opt Lett 42(3). 2017. https://doi.org/10.1364/ol.42.000375.

18. Röhm A, Lüdge K. J Phys Commun. 2018;2:085007. https://doi.org/10.1088/2399-6528/aad56d.

19. Röhm A, Jaurigue LC, Lüdge K. IEEE J Sel Top Quantum Electron. 2019;26(1):7700108. https://doi.org/10.1109/jstqe.2019.2927578.

20. Appeltant L, Soriano M, Van der Sande G, Danckaert J, Massar

S, Dambre J, Schrauwen B, Mirasso CR, Fischer I. Nat

Commun. 2011;2:468. https://doi.org/10.1038/ncomms1476.

21. Ortin S, Soriano MC, Pesquera L, Brunner D, San-Martín D, Fischer I, Mirasso CR, Gutiérrez JM. Sci Rep. 2015;5:2045. https://doi.org/10.1038/srep14945.

22. Dion G, Mejaouri S, Sylvestre J. J Appl Phys.

2018;124(15):152132. https://doi.org/10.1063/1.5038038.

23. Brunner D, Penkovsky B, Marquez BA, Jacquot M, Fischer I, Larger L. J Appl Phys. 2018;124(15):152004. https://doi.org/10.1063/1.5042342.

24. Chen Y, Yi L, Ke J, Yang Z, Yang Y, Huang L, Zhuge Q, Hu W. Opt Express. 2019;27(20):27431. https://doi.org/10.1364/oe.27.027431.

25. Hou YS, Xia GQ, Yang WY, Wang D, Jayaprasath E,

Jiang Z, Hu CX, Wu ZM. Opt Express. 2018;26(8):10211.

https://doi.org/10.1364/oe.26.010211.

26. Sugano C, Kanno K, Uchida A. IEEE J Sel Top Quantum Electron. 2020;26(1):1500409. https://doi.org/10.1109/jstqe.2019.2929179.

27. Bueno J, Brunner D, Soriano M, Fischer I. Opt Express.

2017;25(3):2401. https://doi.org/10.1364/oe.25.002401.

28. Kuriki Y, Nakayama J, Takano K, Uchida A. Opt Express.

2018;26(5):5777. https://doi.org/10.1364/oe.26.005777.

29. Argyris A, Cantero J, Galletero M, Pereda E, Mirasso CR,

Fischer I, Soriano MC. IEEE J Sel Top Quantum Electron.

2020;26(1):5100309. https://doi.org/10.1109/jstqe.2019.2936947.

30. Larger L, Baylón-Fuentes A, Martinenghi R, Udaltsov VS, Chembo YK, Jacquot M. Phys Rev X. 2017;7:011015. https://doi.org/10.1103/physrevx.7.011015.

31. Harkhoe K, Van der Sande G. Photonics 6(4). 2019.

https://doi.org/10.3390/photonics6040124.

32. Brunner D, Soriano M, Van der Sande G, Dambre J, Bienstman

P, Larger L, Pesquera L, Massar S. Photonic Reservoir

Computing Optical Recurrent Neural Networks. 2019.

33. Van der Sande G, Brunner D, Soriano M. Nanophotonics.

2017;6(3):561. https://doi.org/10.1515/nanoph-2016-0132.

34. Dambre J, Verstraeten D, Schrauwen B, Massar S. Sci Rep.

2012;2:514. https://doi.org/10.1038/srep00514.

35. Williams JH. Quantifying measurement: the tyranny of

numbers (Morgan & Claypool Publishers, UK, 2016).

https://doi.org/http://iopscience.iop.org/book/978-1-6817-4433-9.

36. Schöll E, Schuster HG, (eds). 2008. Handbook of chaos control. Weinheim: Wiley. Second completely revised and enlarged edition.

37. Erneux T, Glorieux P. Laser dynamics. UK: Cambridge

University Press; 2010.

38. Stelzer F, Röhm A, Lüdge K, Yanchuk S. Neural Netw. 2020;124:158. https://doi.org/10.1016/j.neunet.2020.01.010.

39. Inubushi M, Yoshimura K. Sci Rep. 2017;7:10199.

Publisher’s Note Springer Nature remains neutral with regard to

jurisdictional claims in published maps and institutional affiliations.