Introduction
The schema is a DSL, implemented with HCL, that defines the data we're interested in. Because it builds on HCL, basic arithmetic operations and the ternary operator for control flow are supported by default.
Variables
We can start by defining variables in a variables map. For example:
    variables = {
      weth_address = "0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2"
    }
These variables will be available for use across the rest of our schema.
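As a sketch of how a variable might be used later in the schema, the fragment below references weth_address by its bare name, the same way the save example further down references its b and s variables. The query name weth_transfers, the use of a variable for the address property, and the Transfer output names are illustrative assumptions rather than confirmed behavior; the erc20.abi.json file name is borrowed from the global events example.

```hcl
variables = {
  weth_address = "0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2"
}

// Hypothetical query: assumes `address` accepts a variable reference.
query weth_transfers {
  chain = "ethereum"

  contract {
    address = weth_address
    abi     = "erc20.abi.json"

    event Transfer {
      // Output names must match the ABI file; "from", "to", "value" are assumed here.
      outputs = ["from", "to", "value"]
    }
  }
}
```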
Query
The next top-level element is always a query block, along with the name of the query. This name will be used to determine the name of the output file or the name of the table in a database.
For example:
    query usdc_eth_swaps {
      chain = "ethereum"
      ...
    }
This query with CSV output enabled would create usdc_eth_swaps.csv, and with database output we would get the usdc_eth_swaps table.
In the query block, we must define the chain. This will determine the RPC for the query, and your chain name must correspond to a chain in the configuration file (config.yml). For now, only one query at a time is supported. Later, you will be able to execute multiple queries over multiple chains at the same time, for data that has cross-chain relevance.
Everything in a query block needs to happen at the same time. This means that you can't call a method and filter for events in one query (unless you're calling a method when a certain event occurs).
This is where things become interesting. Inside of the query block we can choose between a couple of options: global events and contracts. Let's start with the first.
Global Events
If we want to query certain events that are not tied to a specific address, we create an event block inside of the query block.
For example:
    query all_arbitrum_transfers {
      chain = "arbitrum"
      event Transfer {
        abi = "erc20.abi.json"
        ...
      }
    }
This query would look for all of the ERC20 Transfer events that happen on the Arbitrum rollup. Since Transfer is defined in the ERC20 ABI, we need a property inside event to specify the ABI.
Contracts
The other option is a contract block. This allows you to call methods or filter events that are tied to a specific contract. For example:
    query usdc_eth_reserves {
      chain = "arbitrum"
      contract {
        address = "0x905dfCD5649217c42684f23958568e533C711Aa3"
        abi = "unipair.abi.json"
        ...
      }
    }
This in itself is not a very useful example, so let's continue with calling methods.
Contract Methods
Building on the last example, let's see how we can call getReserves and define which outputs we want for further processing:
    query usdc_eth_reserves {
      chain = "arbitrum"
      contract {
        address = "0x905dfCD5649217c42684f23958568e533C711Aa3"
        abi = "unipair.abi.json"
        method getReserves {
          outputs = ["_reserve0", "_reserve1"]
        }
        ...
      }
    }
The method name getReserves and the output names _reserve0 and _reserve1 are exactly how they are defined in the ABI, so be careful with typos. This method has one more output (_blockTimestampLast), but since we're not interested in that, we can just omit it.
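To illustrate what the selected outputs might feed into, here is a sketch that extends the query with the transform and save blocks described later in this document. It assumes that method outputs can be used in transform the same way event outputs are, that blocknumber is available for method calls, and that _reserve0 is WETH (18 decimals) and _reserve1 is USDC (6 decimals), matching the token order used in the Swap examples. The names eth_reserve, usdc_reserve, and spot_price are made up for this illustration.

```hcl
query usdc_eth_reserves {
  chain = "arbitrum"

  contract {
    address = "0x905dfCD5649217c42684f23958568e533C711Aa3"
    abi     = "unipair.abi.json"

    method getReserves {
      // _blockTimestampLast is deliberately left out.
      outputs = ["_reserve0", "_reserve1"]
    }

    transform {
      // Assumption: token0 is WETH (18 decimals), token1 is USDC (6 decimals).
      eth_reserve  = parse_decimals(_reserve0, 18)
      usdc_reserve = parse_decimals(_reserve1, 6)
    }
  }

  save {
    block = blocknumber
    // Guard against an empty pool before dividing.
    spot_price = eth_reserve != 0 ? (usdc_reserve / eth_reserve) : 0
  }
}
```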
Contract Events
Contract-level events are almost the same as global events, but there's no need for an abi property. For example:
    query usdc_eth_swaps {
      chain = "arbitrum"
      contract {
        address = "0x905dfCD5649217c42684f23958568e533C711Aa3"
        abi = "unipair.abi.json"
        event Swap {
          outputs = ["amount1In", "amount0Out", "amount0In", "amount1Out"]
        }
        ...
      }
    }
This would allow you to look at all the swaps on the USDC-ETH SushiSwap pool on Arbitrum and save the outputs for further processing.
Processing Data
We've talked a lot about how to get our data. Now let's see what we can do with it.
Transform
Both contract and global event blocks can have additional transform blocks. These blocks can work with previously fetched data to do some transformations before proceeding to the final output. Any input and output is automatically provided as a variable to work with, along with some helper functions like parse_decimals(). More on this here. Let's look at an example to make this all clear.
In the previous example, when recording Swap events on the USDC-WETH pool, the outputs we're getting are very raw.
For one, the amounts are expressed in each token's smallest unit (scaled by its number of decimals), which is not a great format to work with.
Let's say we want to change that:
    query usdc_eth_swaps {
      chain = "arbitrum"
      contract {
        address = "0x905dfCD5649217c42684f23958568e533C711Aa3"
        abi = "unipair.abi.json"
        event Swap {
          outputs = ["amount1In", "amount0Out", "amount0In", "amount1Out"]
        }
        transform {
          usdc_sold = parse_decimals(amount1In, 6)
          eth_sold = parse_decimals(amount0In, 18)
          usdc_bought = parse_decimals(amount1Out, 6)
          eth_bought = parse_decimals(amount0Out, 18)
          buy = amount0Out != 0
        }
      }
      ...
    }
The apollo DSL provides some functions to work with, and in this case we can use parse_decimals to do an initial cleanup of the data. We save the results in some easy-to-remember variables that we can use later. Any output with the number 1 in its name is USDC, which has 6 decimals. Any output with the number 0 is WETH, with 18 decimals.
We can also define a buy boolean, which will keep track of whether the swap bought or sold WETH.
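As a rough illustration of what this cleanup does, the annotated fragment below walks through two values. It assumes parse_decimals(value, n) divides the raw integer by 10^n, which is the usual way token amounts are scaled; the raw numbers in the comments are made up for illustration.

```hcl
transform {
  // Assumed behavior: parse_decimals(x, n) = x / 10^n.
  // A raw amount1In of 70471237 (USDC, 6 decimals) would become 70.471237.
  usdc_sold = parse_decimals(amount1In, 6)

  // A raw amount0Out of 4000000000000000000 (WETH, 18 decimals) would become 4.
  eth_bought = parse_decimals(amount0Out, 18)

  // true when the swap paid out WETH, i.e. the trader bought ETH.
  buy = amount0Out != 0
}
```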
Filter
Before proceeding to our final block, we can filter out results we're not interested in. We do this in the filter list. Let's say we're only looking for ETH buys, and don't want to save the sells:
    query usdc_eth_swaps {
      chain = "arbitrum"
      contract {
        address = "0x905dfCD5649217c42684f23958568e533C711Aa3"
        abi = "unipair.abi.json"
        event Swap {
          outputs = ["amount1In", "amount0Out", "amount0In", "amount1Out"]
        }
        transform {
          usdc_sold = parse_decimals(amount1In, 6)
          eth_sold = parse_decimals(amount0In, 18)
          usdc_bought = parse_decimals(amount1Out, 6)
          eth_bought = parse_decimals(amount0Out, 18)
        }
      }
      filter = [
        eth_bought != 0
      ]
      save {
        timestamp = timestamp
        block = blocknumber
        contract = contract_address
        tx_hash = tx_hash
        // Example: we want to calculate the price of the swap.
        swap_price = usdc_sold / eth_bought
        size_in_udsc = usdc_sold
      }
    }
If there's anything in the filter list that evaluates to false, the result will be discarded.
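Because any entry that evaluates to false discards the result, listing several conditions effectively combines them with a logical AND. The sketch below keeps only ETH buys above a size threshold; the threshold of 1000 USDC is an arbitrary value chosen for illustration, and the second condition relies on HCL's standard > comparison operator.

```hcl
// Both conditions must hold for a swap to be saved.
filter = [
  eth_bought != 0,   // only ETH buys
  usdc_sold > 1000   // hypothetical minimum size, in USDC
]
```

Note the comma between the entries: HCL list elements need one even when each sits on its own line.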
Save
Our last block is the save block. This block is at the query level, and it defines the final output format of our data.
Everything that passes our filter is available for use, including any transformed variables.
We'll continue to build on the previous example (but without any filters):
    // Note the variables!
    variables = {
      b = upper("eth_buy")
      s = upper("eth_sell")
    }
    query usdc_eth_swaps {
      chain = "arbitrum"
      contract {
        address = "0x905dfCD5649217c42684f23958568e533C711Aa3"
        abi = "unipair.abi.json"
        event Swap {
          outputs = ["amount1In", "amount0Out", "amount0In", "amount1Out"]
        }
        transform {
          usdc_sold = parse_decimals(amount1In, 6)
          eth_sold = parse_decimals(amount0In, 18)
          usdc_bought = parse_decimals(amount1Out, 6)
          eth_bought = parse_decimals(amount0Out, 18)
          buy = amount0Out != 0
        }
      }
      save {
        timestamp = timestamp
        block = blocknumber
        contract = contract_address
        tx_hash = tx_hash
        // Example: we want to calculate the price of the swap.
        // We have to make sure we don't divide by 0, so we use the ternary operator.
        swap_price = eth_bought != 0 ? (usdc_sold / eth_bought) : (usdc_bought / eth_sold)
        direction = buy ? b : s
        size_in_udsc = eth_bought != 0 ? usdc_sold : usdc_bought
      }
    }
If we choose CSV as the output option, this would produce something like this:
size_in_udsc,timestamp,block,contract,tx_hash,swap_price,direction
70.471237,1653105492,12604569,0x905dfCD5649217c42684f23958568e533C711Aa3,0x92da0717731b067a28663c3053998af4ca9d39164ea96024054688d5febbdfb5,1954.052675,ETH_SELL
7831.683925,1653100138,12600500,0x905dfCD5649217c42684f23958568e533C711Aa3,0xec90864ecd0d6e8eb40bf059fd3c5d44e268871f6632fee6024a70409d24821e,1957.920981,ETH_BUY
1508.127957,1653101447,12601445,0x905dfCD5649217c42684f23958568e533C711Aa3,0x7958274b6d9ce536f7c40f0be8441c3327c8aea788aa6a7a7c615fadd4daa567,1958.898423,ETH_BUY
590.085622,1653101587,12601551,0x905dfCD5649217c42684f23958568e533C711Aa3,0xdf5fc5a39bc93a1cbe5480533a70fdb9da44d970c5acb17ec1bad12f01ff02a0,1959.289088,ETH_BUY
8,1653100643,12600951,0x905dfCD5649217c42684f23958568e533C711Aa3,0x8f8de8c985a3e1118ec4abfcb3a07935a0d43d921e4d6eee35b3275200099840,1958.978772,ETH_BUY
An important thing to note: the save block has access to some additional context variables, and the ones we use here are timestamp, blocknumber, contract_address, and tx_hash. See Context variables and functions for more information.
In this example we also make use of the ternary operator to make sure we don't divide by 0, and to determine if the swap bought or sold WETH.
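To make the ternary logic concrete, the annotated fragment below traces the second row of the CSV output above through the save expressions. The ETH amount of roughly 4 is inferred by dividing size_in_udsc by swap_price (7831.683925 / 1957.920981); it is not a value shown in the output itself, and the zero values for usdc_bought and eth_sold are assumed for a plain buy.

```hcl
// Row: size_in_udsc = 7831.683925, swap_price = 1957.920981, direction = ETH_BUY
// Assumed inputs: usdc_sold = 7831.683925, eth_bought ≈ 4, usdc_bought = 0, eth_sold = 0

// eth_bought != 0, so the first branch is taken: 7831.683925 / 4 ≈ 1957.92
swap_price = eth_bought != 0 ? (usdc_sold / eth_bought) : (usdc_bought / eth_sold)

// buy is true, so direction is b, i.e. upper("eth_buy") = "ETH_BUY"
direction = buy ? b : s

// eth_bought != 0, so the USDC that was sold is reported as the size
size_in_udsc = eth_bought != 0 ? usdc_sold : usdc_bought
```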
You've made it! If you're curious about some more use cases, check out our examples.