
Stefan Kitzler, Friedhelm Victor, Pietro Saggese, Bernhard Haslhofer
Disentangling Decentralized Finance (DeFi)
Compositions
Open Access via institutional repository of Technische Universität Berlin
Document type
Preprint | Submitted version
(i. e. version that has been submitted to a publisher for (peer) review; also known as: Author’s Original
Manuscript (AOM), Original manuscript, Preprint)
Date of this version
Nov-2021
This version is available at
https://doi.org/10.14279/depositonce-12640
Citation details
Kitzler, Stefan; Victor, Friedhelm; Saggese, Pietro & Haslhofer, Bernhard (2021). Disentangling Decentralized
Finance (DeFi) Compositions. 1–11. http://dx.doi.org/10.14279/depositonce-12640.
Terms of use
This work is licensed under a Creative Commons Attribution 4.0 International license:
https://creativecommons.org/licenses/by/4.0/

Disentangling Decentralized Finance (DeFi) Compositions
Stefan Kitzler
Complexity Science Hub Vienna
Vienna, Austria
Friedhelm Victor
friedhelm.victor@tu-berlin.de
Technische Universität Berlin
Berlin, Germany
Pietro Saggese
pietro[email protected]
AIT - Austrian Institute of Technology
Vienna, Austria
Bernhard Haslhofer
AIT - Austrian Institute of Technology
Vienna, Austria
ABSTRACT
We present the first study on compositions of Decentralized Fi-
nance (DeFi) protocols, which aim to disrupt traditional finance
and offer financial services on top of distributed ledgers such
as Ethereum. Starting from a ground-truth of 23 DeFi protocols
and 10,663,881 associated accounts, we study the interactions of
DeFi protocols and associated smart contracts from a macroscopic
perspective. We find that DEX and lending protocols have a high
degree centrality, that interactions among protocols primarily occur
in a strongly connected component, and that known community
detection cannot disentangle DeFi protocols. Therefore, we propose
an algorithm for extracting the building blocks and uncovering
the compositions of DeFi protocols. We apply the algorithm and
conduct an empirical analysis finding that swaps are the most fre-
quent building blocks and that DeFi aggregation protocols utilize
functions of many other DeFi protocols. Overall, our results and
methods contribute to a better understanding of a new family of
financial products and could play an essential role in assessing
systemic risks if DeFi continues to proliferate.
CCS CONCEPTS
• Applied computing → Digital cash; Electronic funds transfer.
KEYWORDS
decentralized finance, blockchain, networks
1 INTRODUCTION
Decentralized Finance (DeFi) stands for a new paradigm that aims to
disrupt established financial markets. It offers financial services in
the form of smart contracts, which are executable software programs
deployed on top of distributed ledger technologies (DLT) such as
Ethereum. Despite being a relatively recent development, we can
already observe rapid growth in DeFi protocols enabling lending
of virtual assets, exchanging them for other virtual assets without
intermediaries, or betting on future price developments in the form
of derivatives like options and futures. The term “financial lego” is
sometimes used because DeFi services can be composed into new
financial products and services.
[Figure 1: transaction trace graph. Vertices: EOA (user); DeFi protocol contracts 1inch, BAT/ETH (UniSwap), ETH/USDT (SushiSwap); token contracts BAT, WETH, USDT.]
Figure 1: A DeFi composition where BAT tokens are swapped
against USDT tokens through the DeFi service 1inch in a
single transaction. 1inch executes the swap sequentially
through the DeFi services UniSwap and SushiSwap, using
WETH as an intermediary token. In the transaction trace
graph, we can see the user calling the 1inch smart contract,
which in turn triggers several calls to DeFi protocol and
token smart contracts.
As an example of DeFi composition, consider Figure 1, which
illustrates a user interacting with the 1inch decentralized exchange
(DEX) aggregator Web service¹. The user holds an amount of BAT
tokens and wants to swap them to USDT tokens. Using the Web
application, she creates a transaction against the 1inch contract,
which in turn triggers a sequence of two swaps on two DeFi proto-
cols within the same transaction: from BAT to WETH on UniSwap
and thereafter from WETH to USDT on SushiSwap. In this paper, we
study such single transaction DeFi interactions and the networks
that arise when combining multiple DeFi transactions.
¹ https://app.1inch.io

Kitzler et al.
Motivation. Within the last year, the total value of tokens held by
smart contracts underlying the DeFi protocols has reached 96 billion
USD [9], growing at a rate that central banks increasingly perceive
as a risk (cf. [24]). While decentralization of finance offers many
opportunities, such as technological innovation or new governance
models, it can also undermine established forms of accountability
and erode the effectiveness of financial regulation and enforcement.
If these protocols are adopted more broadly without being understood,
they could have unforeseeable systemic effects on financial markets
and our society as a whole, as seen in the 2008 financial crisis [16].
Previous work (cf. [7, 13]) has already shown possible strategies
allowing rational agents to maximize their revenues by subverting
the intended design of DeFi protocols. However, so far, this has
only been discussed within the restricted scope of an individual
decentralized exchange or lending protocol. Furthermore, none of
the existing studies have systematically investigated compositions
of DeFi protocols, which form complex, interconnected financial
constructs that can only be understood if we first disentangle them.
Contributions. Our work aims to analyze DeFi protocols and to
develop a novel algorithmic method that helps to understand them.
We can summarize our contributions as follows:
(1) We provide a manually curated ground-truth of 23 DeFi
    protocols and 10,663,881 associated smart contracts and
    construct two network abstractions representing interactions
    among DeFi protocols and smart contracts (Section 3).
(2) We study intertwined DeFi protocols from a macroscopic
    perspective by analyzing the topology of both networks.
    We find that DEX and lending protocols have a high degree
    centrality and that protocol interactions primarily occur in
    a strongly connected component. We also find that known
    community detection algorithms cannot disentangle DeFi
    protocols, indicating DeFi compositions (Section 4).
(3) We address the microscopic transaction level and propose
    an algorithm for extracting the building blocks of DeFi
    protocols. We apply the algorithm to all protocol transactions
    in our ground-truth, identify the most frequent building
    blocks, and find that swaps are the most frequent ones. We
    also demonstrate how to disentangle the building blocks of
    a single protocol using 1inch as an example (Section 5.1).
(4) We present an overall picture of DeFi compositions by
    extracting and flattening the entire nested building block
    structure across multiple DeFi protocols. Then, we apply
    our algorithm and conduct an empirical analysis showing
    that DeFi aggregation protocols (1inch, 0x, or Instadapp)
    utilize functions of many other DeFi protocols (Section 5.2).
For reproducibility of results, we make our ground-truth dataset,
including the labels as well as our source code, openly available at
https://github.com/StefanKit/Untangling_DeFi_Composition.
Implications. We believe that our results are an essential contribu-
tion towards understanding DeFi compositions. Furthermore, our
algorithm can help assess the composition of individual protocols.
Considering the volume of the global financial markets, DeFi is still
a niche phenomenon. However, if DeFi continues to proliferate and
possibly integrate with the traditional financial sector, understand-
ing DeFi compositions will be an important first step in a wider
assessment of systemic risks.
2 BACKGROUND AND DEFINITIONS
We now establish preliminary terms and definitions that are used
throughout this work and briefly introduce the related works.
2.1 Ethereum Account Types
Ethereum is currently the most important distributed ledger technology
(blockchain) for DeFi services [36]. It differs from the Bitcoin
blockchain conceptually as it implements the so-called “account
model” with two different account types: an externally owned
account (EOA) is a “regular” account controlled by a private key
held by some user. A code account (CA), which is synonymous
with the notion “smart contract”, is an account controlled by a
computer program, which is invoked by issuing a transaction with the
code account as the recipient.
A CA must always be initially called by an external transaction
originating from an EOA, but a CA can itself trigger other CAs. In
the latter case, the interaction, which is also known as a “message”, is
denoted as an internal transaction. Several branches of internal
transactions with varying depth can follow an external transaction,
resulting in a cascade, which is also called a trace.
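The cascade described above can be pictured as a tree rooted at the external transaction. The following minimal Python sketch is for illustration only: the `Call` class, its field names, and the placeholder account names are assumptions introduced here and are not part of the dataset schema.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Call:
    """One transaction in a trace (hypothetical structure for illustration)."""
    source: str                                            # calling account
    target: str                                            # called code account
    children: List["Call"] = field(default_factory=list)   # triggered internal transactions


def walk(call: Call, depth: int = 0) -> Iterator[Tuple[int, Call]]:
    """Yield (depth, call) pairs in depth-first order, starting at the root."""
    yield depth, call
    for child in call.children:
        yield from walk(child, depth + 1)


# An EOA calls CA_1 (external transaction). CA_1 triggers CA_2 and CA_3,
# and CA_3 in turn triggers CA_4 (internal transactions).
trace = Call("EOA", "CA_1", [
    Call("CA_1", "CA_2"),
    Call("CA_1", "CA_3", [Call("CA_3", "CA_4")]),
])

for depth, call in walk(trace):
    kind = "external" if depth == 0 else "internal"
    print("  " * depth + f"{call.source} -> {call.target} ({kind})")
```

The root of the tree is the only external transaction; every other vertex is an internal transaction, and the tree has as many branches as the called CAs trigger further calls.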
CAs allow users to implement application-layer protocols, which
are essentially programs that can follow some standardized inter-
face. Tokens are popular CA-based applications and a way to define
arbitrary assets that can be transferred between accounts. The pro-
gram behind a token manages token ownership and can implement
a standardized interface like ERC20, which defines functions stan-
dardizing token transfer semantics.
2.2 Decentralized Finance (DeFi) Protocol
A DeFi protocol is an application-layer program that provides
financial service functions such as swapping or lending assets. More
technically, we can define it as follows:
Definition 2.1. A DeFi protocol P is a decentralized application
that facilitates specific financial service functions defined and
implemented by a set of protocol-specific code accounts.
The following properties distinguish DeFi services from tradi-
tional financial services: first, they are non-custodial, meaning that
no intermediary such as a bank or a broker holds custody of a user’s
funds. Second, they are permissionless, meaning that anyone can
use existing or implement new services. Third, they are transparent,
which means that anyone with the necessary technical capabilities
and skills can investigate and audit the state of protocols.
2.3 DeFi Protocol Compositions
The fourth, and in this work most crucial, property of DeFi protocols
is that they are composable: CAs can call each other, and
their individual functions can be arbitrarily composed into new
financial products and services (“Financial Lego”) [33]. While this
analogy is widely used in the literature, to the best of our knowledge,
no work investigates what the basic composable building
blocks of more complex financial services are and how they are related.
Harvey et al. [15] refer broadly to composability as asset tokenization
and networked liquidity, while von Wachter et al. [29] conceive
composability narrowly as a repeated wrapping operation of tokens
resulting in new derivative products. However, as illustrated
before in Figure 1, we note that DeFi compositions also involve CAs
that are not tokens. Also, Engel and Herlihy [10] and Tolmach
et al. [26] respectively discuss compositions only in the context of
automated market makers (AMMs) and of formal verification of
CAs related to decentralized exchanges and lending services, which
is again a very narrow conception. Thus, there is no comprehensive,
technically grounded definition for DeFi compositions to the best
of our knowledge. For our work, we define it as follows:
Definition 2.2. A DeFi protocol composition occurs when an
account leverages one or more accounts belonging to at least one
other DeFi protocol within a single transaction to provide a novel
financial service.
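The structural part of Definition 2.2 can be approximated with a simple check on a single trace. The sketch below is one possible reading, not the algorithm proposed in this paper: it flags a trace as soon as an internal transaction targets an account labeled with a protocol different from the caller's. The label table, the placeholder addresses, and the function name are hypothetical, and the "novel financial service" criterion is not something the check can decide.

```python
from typing import Dict, List, Tuple

# Hypothetical address-to-protocol labels; token contracts and other
# unlabeled accounts are simply absent from the table.
PROTOCOL_OF: Dict[str, str] = {
    "0x1inch": "1inch",
    "0xuni_bat_eth": "UniSwap",
    "0xsushi_eth_usdt": "SushiSwap",
}

# A trace is a list of (source, target) pairs; index 0 is the external
# transaction sent by the EOA, all later entries are internal transactions.
Trace = List[Tuple[str, str]]


def is_composition(trace: Trace) -> bool:
    """True if some CA calls an account of a protocol other than its own."""
    for source, target in trace[1:]:
        target_protocol = PROTOCOL_OF.get(target)
        if target_protocol is not None and target_protocol != PROTOCOL_OF.get(source):
            return True
    return False


# The swap from Figure 1: 1inch routes through UniSwap and SushiSwap.
aggregated_swap: Trace = [
    ("0xuser", "0x1inch"),
    ("0x1inch", "0xbat_token"),
    ("0x1inch", "0xuni_bat_eth"),
    ("0x1inch", "0xsushi_eth_usdt"),
]
# A direct swap on a single protocol that only touches a token contract.
direct_swap: Trace = [
    ("0xuser", "0xuni_bat_eth"),
    ("0xuni_bat_eth", "0xbat_token"),
]

print(is_composition(aggregated_swap), is_composition(direct_swap))
```

The external transaction is skipped on purpose: a user calling a protocol directly does not, on its own, involve a second protocol.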
2.4 Related Work
Others have studied networks closely related to the ones we
investigate: Lee et al. [18] analyzed the local and global properties
of static interaction networks extracted from the entire Ethereum
blockchain and found heavy-tailed degree distributions. In
a follow-up, Zhao et al. [37] analyzed the temporal evolution of
Ethereum interaction networks and found that they proliferate and
follow the preferential attachment growth model. Furthermore, several
studies focus on the network of Ethereum’s tokenized assets:
Somin et al. [25], for instance, studied the combined graph of all
fungible token networks, while Victor and Lüders [28] explored
the networks of the top 1,000 ERC20 tokens individually. Fröwis et
al. [11] proposed a method for detecting token systems independent
of an implementation standard. Also, Chen et al. [5] conducted
a systematic investigation of the whole Ethereum ERC20 token
ecosystem and analyzed token activeness, purpose, relationships, and
role in token trading. However, none of these related works consider
networks that represent DeFi protocols and their relationships.
Another growing body of research concentrates on specific functions
offered by individual DeFi protocols or types of protocols.
We are aware of many DEX-related measurements focusing on
protocol-specific aspects, such as the magnitude of cyclic arbitrage
activity [31], the behavior of liquidity providers [32], or the role
of oracles as providers of external information [19]. Other studies
focus on lending and borrowing services: Perez et al. [21] analyze
liquidations and related participants’ behavior in the DeFi protocol
Compound, while Gudgeon et al. [14] compare market efficiency,
utilization, and borrowing rates in different lending protocols. Also,
Wang et al. [30] provide methods to identify flash loans in three
different DeFi providers and measure their related activity. Finally,
we are aware that von Wachter et al. [29] investigate composability
from an asset perspective and measure composability by identifying
the number of derivatives produced from an initial root asset. However,
we apply a more technical, service-oriented perspective and
consider, to put it simply, a DeFi composition as being a computer
program utilizing other programs’ functions.
Overall, we are not aware of previous studies providing a com-
prehensive picture of DeFi compositions across various protocols.
We also do not know any work that analyzes in detail the building
blocks of individual DeFi protocols. With this work, we want to
close this gap.
3 DATASET AND NETWORK CONSTRUCTION
This section describes the data we collected and network abstrac-
tions we constructed for subsequent analysis steps.
3.1 Dataset collection
To study DeFi compositions, we are interested in transactions be-
tween Ethereum code accounts associated with known DeFi proto-
cols. Thus, we used on-chain transaction data from the Ethereum
blockchain and built a ground-truth of known CAs and their asso-
ciations to DeFi protocols.
3.1.1 On-chain transaction data. We used an OpenEthereum client
and ethereum-etl² to gather all Ethereum transactions from 01-Jan-2021
(block 11,565,019) to 05-Aug-2021 (block 12,964,999), covering
the most recent DeFi history. We collected each external transaction
and also parsed its cascade of internal transactions, which together
gives us the trace. For each transaction, we extracted the source and
destination account addresses, the transaction hash, the transferred
value, the transaction type (call, create, or self-destroy), as well
as the trace id, which indexes the transactions by their execution
order. Additionally, we collected the method id of the 4-byte input
sequence, which allows us to identify the signature of called
methods using the 4Byte lookup service³.
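To make the method id concrete: it is the first four bytes of the call input, which can then be mapped to a human-readable signature. The sketch below uses a tiny local dictionary as a stand-in for a lookup service; the function name `method_id` and the example input are assumptions for illustration, while the three listed selectors are the well-known ERC20 ones.

```python
from typing import Optional

# Small local stand-in for a signature lookup service (ERC20 selectors).
KNOWN_SIGNATURES = {
    "0xa9059cbb": "transfer(address,uint256)",
    "0x095ea7b3": "approve(address,uint256)",
    "0x23b872dd": "transferFrom(address,address,uint256)",
}


def method_id(input_hex: str) -> Optional[str]:
    """Return the 4-byte method id of a call input, or None if it is shorter."""
    data = input_hex[2:] if input_hex.startswith("0x") else input_hex
    if len(data) < 8:  # fewer than 4 bytes, e.g. a plain value transfer
        return None
    return "0x" + data[:8].lower()


# Hypothetical input of an ERC20 transfer: selector, then the 32-byte
# padded recipient address, then the 32-byte amount.
call_input = "0xa9059cbb" + "00" * 12 + "11" * 20 + format(1000, "064x")

selector = method_id(call_input)
print(selector, KNOWN_SIGNATURES.get(selector, "unknown"))
```

Selectors missing from the dictionary resolve to "unknown" here; a real lookup would query a larger signature database instead.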
To distinguish between CAs and EOAs, we gathered all code account
creation transactions from the first CA created on Ethereum
until the end of our observation period. We also use these creation
traces to associate each CA with its creator CA. In total, we
found 46,112,390 CAs and used the output byte sequence to identify
324,143 contracts conforming to the ERC20 standard.
3.1.2 Ground-truth data. We focus on the most relevant protocols
regarding valuation and gas-burned from 06-Mar-2021 to 05-Aug-2021.
We use monthly samples of the top three total-value-locked
protocols from DeFi Pulse⁴ for each financial service category to
define the set of investigated DeFi protocols. We exclude those in
the payment category because services like Polygon provide off-chain
functionality rather than composable financial services or
products. Additionally, we consider protocols including CAs of the
top ten gas burner list⁵ in the observation period.
After identifying the most relevant DeFi protocols, we manually
collected the CAs associated with each protocol. Since this information
is not available on the blockchain, we rely on off-chain
and publicly available sources like protocol websites and available
documentation. We resolved conflicts arising from duplicated CA-to-protocol
assignments and identical names by querying CA addresses on
Etherscan⁶, so that each CA address is uniquely assigned to its original
protocol and carries a unique label. We denote these manually
collected data points as seed data and make them available as part
of our source code repository.
Next, we extended our seed data by implementing a heuristic
that uses the creation transactions and identifies the CAs deployed
by each seed address. By default, all extended addresses inherit
the label and protocol assignments from the corresponding seed
address. Combined with our seed data, these extended addresses
form our extended seed data set. If an extended address conflicts with
an existing seed address, we keep the deployed CA and remove the
seed address. Table 1 summarizes the number of seed and extended
addresses collected for each DeFi protocol category. It shows that
our automated expansion adds comparatively few addresses
associated with DeFi protocols for assets and derivatives. However,
it massively expands the dataset for DEXs and lending protocols
utilizing automated factory contract deployments. A significant
share of the DEX addresses belongs to 1inch due to the use of gas
tokens. For more details on considered DeFi protocols, we refer to
Table 6 in the Appendix.

Table 1: Ground-truth dataset summary statistics. Seed addresses
were collected manually for each DeFi protocol and
then heuristically extended.

                   Number of addresses
Protocol type      Seed    Seed extended
Assets              289            1,311
Derivatives         390              400
DEX                 242       10,397,838
Lending             486          264,262
Total             1,407       10,663,811

² https://github.com/blockchain-etl/ethereum-etl
³ https://www.4byte.directory/
⁴ https://defipulse.com/
⁵ https://ethgasstation.info/gasguzzlers.php
⁶ https://etherscan.io/
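The label inheritance described above can be sketched in a few lines. This is a simplified reading rather than the released implementation: it only follows direct deployments by seed addresses (the text does not state whether the heuristic is applied recursively), and it interprets the conflict rule as letting the inherited label replace a conflicting seed entry. The function name and the placeholder addresses are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def extend_seed(seed: Dict[str, str],
                creations: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Extend address -> protocol labels along (creator, created) pairs."""
    extended = dict(seed)
    for creator, created in creations:
        if creator in seed:
            # The deployed CA inherits its creator's label. If `created` was
            # itself a seed address, the inherited label takes precedence
            # (assumed reading of the conflict rule).
            extended[created] = seed[creator]
    return extended


seed = {"0xfactory": "UniSwap", "0xpool_a": "SushiSwap"}
creations = [
    ("0xfactory", "0xpool_a"),   # conflicts with an existing seed entry
    ("0xfactory", "0xpool_b"),   # new address, inherits "UniSwap"
    ("0xother", "0xpool_c"),     # creator unknown, stays unlabeled
]

print(extend_seed(seed, creations))
```

A factory contract that deploys one pool per token pair explains why a handful of seed addresses can expand into millions of labeled CAs.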
3.1.3 Dataset reduction. As we are only interested in known DeFi
protocols, we finally limited and reduced the traces data set to
the subset protocol traces, where the initial external transaction
originating from an EOA triggers a CA address in our extended
seed dataset. This reduction allows us to investigate and interpret
compositions within the context of known protocols.
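As a minimal sketch of this reduction, assuming each trace is stored as a list of (source, target) pairs whose first entry is the external transaction, the filter keeps exactly those traces whose entry point is a known protocol CA. The data layout, the function name, and the placeholder hashes and addresses are assumptions for illustration.

```python
from typing import Dict, List, Set, Tuple

# (source, target) pairs; index 0 is the external transaction from an EOA.
Trace = List[Tuple[str, str]]


def protocol_traces(traces: Dict[str, Trace], known_cas: Set[str]) -> Dict[str, Trace]:
    """Keep traces whose external transaction targets a known protocol CA."""
    return {tx_hash: trace for tx_hash, trace in traces.items()
            if trace and trace[0][1] in known_cas}


traces = {
    "0xhash1": [("0xuser", "0x1inch"), ("0x1inch", "0xuni_bat_eth")],
    "0xhash2": [("0xuser", "0xunknown"), ("0xunknown", "0x1inch")],
}
known_cas = {"0x1inch", "0xuni_bat_eth"}

print(list(protocol_traces(traces, known_cas)))
```

Note that the second trace is dropped even though it reaches a known CA internally, because its entry point is not part of the extended seed data.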
3.2 Network construction
In our analysis, we want to understand and discover relations be-
tween DeFi protocols and associated CAs. For that purpose, as
shown in Figure 2, we constructed networks consisting of DeFi
traces on two abstraction levels: the lower-level DeFi Code Account
(CA) Network and the higher-level DeFi Protocol Network.
The DeFi CA network includes all known ground-truth CAs
triggered by external transactions from arbitrary EOA addresses and
all CAs subsequently called by cascades of internal transactions. We
note that a CA in the network may or may not be associated with a DeFi
protocol in our ground-truth dataset. We construct the network by
filtering all internal and external transactions between CAs from
the protocol traces. Since repeated usage of DeFi services results in
recurring transaction patterns, we aggregate and count transactions
with the same source and destination address.
The DeFi Protocol network represents interactions between pro-
tocols. We constructed it by merging all DeFi CA vertices associated
with the same DeFi protocol into a single node.
We note that we modeled both networks as directed graphs,
in which vertices represent either a protocol or a single CA. The
edge weights represent the number of aggregated transactions between
DeFi protocols or CAs.
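Both constructions can be sketched with plain edge counters. The code below is a hedged illustration, not the released implementation: the function names are hypothetical, unlabeled CAs are assumed to remain individual vertices in the protocol network, and calls between two CAs of the same protocol are assumed to become a self-loop on the merged vertex.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]


def ca_network(calls: Iterable[Edge], code_accounts: Set[str]) -> Counter:
    """Count transactions per (source, target) pair, keeping CA-to-CA calls only."""
    return Counter((s, t) for s, t in calls
                   if s in code_accounts and t in code_accounts)


def protocol_network(ca_edges: Counter, protocol_of: Dict[str, str]) -> Counter:
    """Merge CAs of the same protocol into one vertex and sum edge weights."""
    merged: Counter = Counter()
    for (s, t), weight in ca_edges.items():
        merged[(protocol_of.get(s, s), protocol_of.get(t, t))] += weight
    return merged


calls = [("EOA1", "CA1"), ("CA1", "CA2"), ("CA1", "CA2"),
         ("CA2", "CA3"), ("CA1", "CA4")]
code_accounts = {"CA1", "CA2", "CA3", "CA4"}
protocol_of = {"CA1": "P1", "CA2": "P1", "CA3": "P2"}  # CA4 is unlabeled

ca_edges = ca_network(calls, code_accounts)
print(dict(ca_edges))
print(dict(protocol_network(ca_edges, protocol_of)))
```

The call from the EOA is dropped because only transactions between CAs enter the network, and the repeated call from CA1 to CA2 becomes a single edge of weight 2.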
[Figure 2: two-layer schematic. DeFi Protocol Network: vertices P1, P2, P3. DeFi CA Network: vertices CA1 to CA8, triggered by EOA1, EOA2, EOA3.]
Figure 2: Schematic illustration of constructed networks. The
lower-level DeFi Code Account (CA) network represents
interactions between CAs. The higher-level DeFi Protocol
Network models relations between DeFi protocols. Lower-level
CA vertices are associated with higher-level protocol
vertices. CAs are triggered by EOAs or other CAs.
Table 2: Summary statistics of the analyzed networks.

                  DeFi CA network   DeFi Protocol network
Nodes                   2,536,371                  43,624
Edges                   3,472,757                  84,789
Self-loops                  6,668                     146
Average degree              1.369                   1.944
Density                 5.398e-07               4.456e-05
4 TOPOLOGY MEASUREMENTS
We now analyze the constructed networks from a macroscopic
perspective and investigate whether and how their topological
properties are affected by compositions. Table 2 reports basic sum-
mary statistics for the DeFi CA network and the DeFi Protocol
network. The main difference is in the network dimension, the
latter being two orders of magnitude smaller. The presence of self-
loops indicates that some contracts include multiple functionalities
and thus can also call themselves. Both networks are sparse, as
shown by the average degree and density measure, suggesting that
CAs tend to interact with only a few other CAs.
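The average degree and density in Table 2 are consistent with the common directed-graph definitions m/n and m/(n(n-1)), where n is the number of nodes and m the number of edges. That these are the exact formulas used is an assumption; the sketch below simply recomputes both measures from the node and edge counts reported in the table.

```python
def average_degree(nodes: int, edges: int) -> float:
    """Edges per node (assumed definition: m / n)."""
    return edges / nodes


def density(nodes: int, edges: int) -> float:
    """Share of possible directed edges that exist (assumed: m / (n * (n - 1)))."""
    return edges / (nodes * (nodes - 1))


# Node and edge counts as reported in Table 2.
networks = {
    "DeFi CA network": (2_536_371, 3_472_757),
    "DeFi Protocol network": (43_624, 84_789),
}

for name, (n, m) in networks.items():
    print(f"{name}: average degree {average_degree(n, m):.3f}, "
          f"density {density(n, m):.3e}")
```

Both densities are far below 1, which is what the text means by sparse: only a tiny fraction of all possible vertex pairs is connected.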
4.1 Degree distribution
Looking at the total-value-locked at DeFi Pulse, we can observe
that some DeFi protocols and their contracts play a major role.
This observation suggests that they implement core functionality,
which other protocols in DeFi compositions can utilize. Under this
assumption, preferential attachment [1, 22] is a plausible generative
mechanism for both networks. More generally, networks whose
degree distribution follows a power law, i.e., the fraction of vertices
with degree $k$ is given by $P(k) \sim k^{-\alpha}$ for values of
$k \geq k_{\min}$, are