Skip Go Fast R&D

By Sam Hart and Marc Graczyk

Prelude

Skip (now Interchain Labs) has been interested in developing a fast transfer/intent system for quite some time now. Here’s a thread dating all the way back to spring 2023, when we were speaking to Elijah from Duality/Neutron about the subject.

We had to prioritize conventional transfer and swap routing, so intent-based bridging was put on the back burner for some time. But about 8 months ago it felt like the right moment to pick things up again, so I led an R&D initiative to understand what solutions were available on the market and to map out a design space we could work in. Much of this research was carried out by Skip alum Marc Graczyk (thank you Marc!!).

Our work started by using several existing fast transfer systems ourselves and analyzing them from both a product and mechanism standpoint. These included Across, Connext, Socket, GMP Express / Squid, and UniswapX. The purpose of this market research was to understand how different backend architectures might impact UX, and get a sense of the main product considerations while using deployed swap interfaces. We rated these along key dimensions and took notes, but the main takeaways from this hands-on investigation were the following:

  • Tx costs vary significantly between solutions on the market today
  • Users face a fundamental latency vs price trade-off
  • Systems trade off capital efficiency vs service availability under demand; a lending market is often introduced to break this trade-off

After completing this preliminary research I had to shift my attention to other projects; however, with our research document in hand, Jeremy Liu and Zach Becker were able to quickly commit to the key architectural choices that would translate into a deployed system. So the final Skip Go Fast design is not elaborated explicitly in this post, but falls within the described space of options (the team opted for something simple to start, which worked out nicely).

I thought it would be nice to make this internal research document available to the public, mainly because we found the research process illuminating and thought at least a few members of our community would enjoy learning more about the thinking that went into developing the Go Fast product.

Note: this document reflects research performed approximately 8 months ago, therefore data and statements about the status of certain products are likely somewhat out of date.


Potential design criteria for an initial Skip Go Fast system

  • Ethereum fees should be competitive
  • Prioritize UX: up to a certain transfer size, users should always be provided service
  • Solving should be outsourced, but can be kept to a limited set of actors initially
  • Deploying an auction is not an initial requirement if there’s a market structure that’s simpler to work with to start
  • The design should work with Ethereum main chain + Optimism, Arbitrum, Base, and eventually Bitcoin
  • Should be able to service some multiple of Skip Go’s current transfer load

Pricing fast transfer service

There are a variety of factors that a fast bridge liquidity provider might consider when quoting a price. We’ll enumerate them briefly here, and then go into the most important of these (reorg risk, currency risk, cost of capital, and gas costs) in more depth as we discuss different liquidity bridge designs.

Technical risk

Though bridges do fail, and smart contracts do have bugs, anecdotally, this is not factored into transaction pricing. It may, however, be a determinant in liquidity providers’ general willingness to participate at certain profit levels.

Reorg risk

Looking at historical data, the risk of a transaction getting confirmed and then not making it into the chain on Ethereum is negligible, barring some tail-risk event. In speaking with liquidity providers and other protocol teams, reorgs are not a meaningful factor within their cost model. This would of course change for Bitcoin, and we look into the data for dropped transactions at the end of this document.

Currency risk

Liquidity providers typically want to keep a neutral position relative to USD. The currency risk incurred by holding another asset can be hedged by taking a short position on e.g. dYdX. One of the liquidity providers we spoke to mentioned the existence of ETH perps as the reason they would be open to market making the asset.

Measuring the cost incurred by opening such a position would give our cost estimate greater precision; however, we assume this cost to be negligible for the sake of model simplicity.

Liquidity risk

If a liquidity provider happens to be quoting a price based on a borrow or swap they intend to make in the future, then they incur the risk of liquidity being taken off or repriced on exchange or lending books between when the fast transfer quote is submitted and when the transaction occurs.

Credit risk

If a liquidity provider is making quotes based on a credit line that has been extended to them they incur the risk of their creditor being unable to make called capital available.

Cost of capital

The main cost liquidity providers face is the cost of not having their capital available for alternative use. After having fast filled the user intent, the liquidity provider must wait for the bridge to complete. During this time their capital is locked in the bridge and the liquidity provider incurs opportunity cost (as well as bridge and currency risk). We start by pricing this capital cost using the risk-free rate, i.e. the rate of return on a risk-free investment the liquidity provider could have made with the same capital. For ETH, we use the staking APR offered by Lido. For USDC, we use the lending APY offered by Aave.

In order to determine inventory requirements, we have to estimate the maximum amount of capital that must be locked in the system during a given time-window to service demand.
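To make this concrete, here’s a minimal sketch of pricing this opportunity cost from an annualized risk-free rate. The APR values are illustrative snapshots, not live data.

```python
# Sketch: opportunity cost of capital locked while the slow path settles.
# APRs are illustrative snapshots (e.g. Lido staking ~3.4%, Aave USDC ~11%).
def opportunity_cost(notional_usd: float, apr: float, lock_hours: float) -> float:
    """Risk-free return forgone while `notional_usd` is locked in the bridge."""
    return notional_usd * apr * lock_hours / (24 * 365)

# A $50K fill locked for one hour, priced against an assumed Aave USDC rate:
print(opportunity_cost(50_000, 0.111, 1))  # ~$0.63
```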

Gas costs

In order for a liquidity provider to execute a transaction, the user needs to post “intent” information on the source chain (final token, worst price, destination chain) as call data. A message then needs to be sent so the intent can be checked against a proof of fulfillment, and gas must be paid to match the two pieces of data against one another.

Minimizing call data

One of the longest recorded routes in the Skip Go API was 1200 characters long. We are mostly concerned with this data having to appear on the Ethereum chain, either during transaction initiation or during the verification process. Posted data can be optimized via lossless compression or commitment to data roots.

Using a compression scheme such as zlib we can reduce the size to roughly 700B. The price of call data on Ethereum is 16 gas units per non-zero byte and 4 per zero byte. If we assume all the bytes are non-zero we obtain an upper bound of 11200 gas units.

Alternatively, we can use a commitment system with an off-chain lookup table (adding this to the Skip Go API). The user would communicate their routing information to Skip Go and commit to it by posting the hashed route on the source chain. The liquidity provider would then query the Skip Go API to obtain information about that route. Skip Go would ensure data availability of routing information, while avoiding the cost of posting on-chain. The keccak256 hash function used in Ethereum, for example, has a constant output size of 256 bits. We therefore consider this to be the size of what we include in the call data.
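Here’s a minimal sketch of both approaches, assuming a hypothetical route payload and pycryptodome for keccak256; exact byte counts depend on the real route data.

```python
import zlib
from Crypto.Hash import keccak  # pycryptodome; any keccak256 implementation works

def calldata_gas(data: bytes) -> int:
    # EIP-2028 pricing: 16 gas per non-zero byte, 4 gas per zero byte.
    return sum(16 if b else 4 for b in data)

# Hypothetical ~1200-character route payload standing in for real Skip Go data.
route = b'{"source_chain":"ethereum","dest_chain":"cosmoshub","min_out":"42"}' * 18

# Compression path: post the zlib-compressed route as call data.
compressed = zlib.compress(route, level=9)
print(len(compressed), calldata_gas(compressed))

# Commitment path: post only keccak256(route); the full route is served
# off-chain by the Skip Go API, which the liquidity provider queries.
commitment = keccak.new(data=route, digest_bits=256).digest()
print(len(commitment))  # 32 bytes
```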

Model for estimating single transaction gas costs with verification data posted to Ethereum

We use an upper bound estimate where we assume that each transaction incurs a cost in the 99th percentile.

We denote g_c as the worst-case gas usage we can expect. We also denote r_c and \delta_c as the worst-case basefee and tip, respectively, that we’ll realistically pay during a typical day.

We would then pay at most g_c \times (r_c + \delta_c) for such a transaction.

Historical gas costs according to public Ethereum data

The gas calculation is decomposed into two components: the number of gas units spent and the gas price per unit. The gas price is the sum of the base fee and the tip. We look at two different gas prices: 160 Gwei (the 99th percentile during the past year) and 300 Gwei (the 99th percentile when the base fee is also in its 99th percentile). We also look at two different gas unit amounts: 21000 (the standard amount for a simple transaction) and 50000 (the median number of gas units used during the last year).

Fee ranges based on gas use and gas price estimates

If we take g_c = 11200 + 21000 = 32200 and r_c + \delta_c = 300 \text{ Gwei} according to the estimates given above, then we get a cost of roughly $32 per transaction.

| Compression cost (gas units, upper bound) | Commitment cost (gas units, upper bound) | Gas units for the transaction carrying the verification data | Gas price | Total cost, compression (single tx) | Total cost, commitment (single tx) |
|---|---|---|---|---|---|
| 11200 = 16 × 700 | 4096 = 256 × 16 | 21000 (simple transfer) | 160 Gwei (low upper bound) | $16 | $13 |
| 11200 | 4096 | 21000 (simple transfer) | 300 Gwei (high upper bound) | $32 | $23 |
| 11200 | 4096 | 50000 (median gas units used during the last year) | 300 Gwei | $58 | $51 |
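For concreteness, the $32 figure can be reproduced as follows; the ETH price here is an assumption chosen to roughly match the dollar conversions used in this document.

```python
GWEI = 1e-9  # ETH per Gwei

gas_units = 11_200 + 21_000  # compressed calldata + simple transfer
gas_price_gwei = 300         # high upper bound (99th percentile)
eth_price_usd = 3_300        # assumed snapshot price

cost_eth = gas_units * gas_price_gwei * GWEI
print(cost_eth, cost_eth * eth_price_usd)  # ~0.0097 ETH, ~$32
```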

Trade-off 1: Batching

While batching saves on call data costs, it trades off capital efficiency: capital sits idle while waiting for the batch to complete. This trade-off is modulated by the batching rate.

The verification gas cost is paid by the entity that relays filling information via the slow path. Because the slow path is the batched component, a low batching rate also increases the time of the slow path.

Higher capital efficiency reduces the inventory requirement for the liquidity provider and reduces their opportunity cost. This could lead to them accepting lower fees. There is a positive relationship between the liquidity provider’s incentives and high capital efficiency.

  • The relationship with the user needs to be further explored. Higher capital efficiency means increased systemic cost, increasing the compensating component of the user’s bid. At the same time, higher capital efficiency decreases the component of the user’s bid that compensates the liquidity provider’s opportunity cost. Higher capital efficiency also gives the system better reliability from the user’s perspective if the slow path is triggered.
  • Low capital efficiency increases the inventory requirements of the liquidity provider and increases their opportunity cost (they have to wait longer to obtain a refund). This could lead to them demanding higher fees from the user.
  • If we consider uniquely the Ethereum → Cosmos direction, the verification cost is only paid on Cosmos and is therefore negligible. If this is the only direction we care about it may be preferable to select a high batching rate.
  • The risk-free rate represents a floor on the premium that should be paid to the liquidity provider. We study how the risk-free rate varies depending on the batching rate. As mentioned above, this affects the user’s bid since they pay for the increased opportunity cost that a smaller batch time causes for the liquidity provider. For a batching time of 30min the opportunity cost of using the capital elsewhere for that 30 minutes is passed onto the user. Likewise, if the batch time is one day, that one day opportunity cost is passed onto the user.

Cost model for batching

The cost of verification is high on a per-transaction basis, but the marginal cost of verification goes to zero with more batching. A batching system consists of refunding the liquidity provider not for one fill but for a bundle of fills they execute.

To accomplish this we construct a Merkle tree where the leaves are the individual fills performed by the liquidity provider. The only data included in the batching transaction would be the Merkle root. A Merkle proof is then submitted (possibly by the liquidity provider) to execute repayment after the bridge transaction has completed. The proof size is negligible (logarithmic in the size of the tree). The cost of bundling k transactions is upper bounded by g_c(r_c + \delta_c). Hence F(\frac{s}{T}, T) \leq \frac{T}{s} \times g_c \times (r_c + \delta_c), where F is the daily cost function, s the number of transactions per bundle, \frac{s}{T} the bundling frequency, and T the number of transactions processed during a day (so \frac{T}{s} bundles are posted per day).
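A minimal sketch of this batching commitment, using SHA-256 for illustration (an on-chain implementation would more likely use keccak256):

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def merkle_root(leaves):
    """Root committing to every fill in the batch; only this goes on-chain."""
    layer = list(leaves)
    while len(layer) > 1:
        if len(layer) % 2:                      # duplicate last node on odd layers
            layer.append(layer[-1])
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

def merkle_proof(leaves, index):
    """Sibling hashes needed to prove one fill against the posted root."""
    proof, layer = [], list(leaves)
    while len(layer) > 1:
        if len(layer) % 2:
            layer.append(layer[-1])
        proof.append(layer[index ^ 1])
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
        index //= 2
    return proof

def verify(root, leaf, index, proof):
    for sibling in proof:
        leaf = h(leaf + sibling) if index % 2 == 0 else h(sibling + leaf)
        index //= 2
    return leaf == root

fills = [h(f"fill-{i}".encode()) for i in range(5)]   # hashes of individual fills
root = merkle_root(fills)
assert verify(root, fills[3], 3, merkle_proof(fills, 3))
```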

Trade-off between gas cost and capital efficiency

It’s perhaps easiest to think about this trade-off at the limit. If we batch only once per day, we pay the cost of a single transaction for verification but the capital efficiency is minimal since the locking time for any fast fill is the entire day. On the other hand, if we bundle at every transaction (which in effect is the same as not bundling) we have maximal capital efficiency since the lockup time is only the incompressible bridging time. However we pay gas for the verification of every single transaction.

More batching creates inelasticity in demand response

The batching rate defines a fixed window during which increasing demand causes increased stress on the liquidity provider, since their capital is locked. Consider the case where we batch every 30 minutes and there is a very large demand increase right after the start of a new batching period. The liquidity provider’s inventory keeps diminishing without being replenished during these 30 minutes. Therefore, once we decide on a cost threshold (by deciding on a batching rate) we can size the inventory the liquidity provider needs to provide by estimating the maximum capital we want to be able to serve during this 30-minute window. In our estimate we consider the 90th percentile of capital demand on the Skip Go API during such a time window. Under a steady regime, a certain amount of the liquidity provider’s capital is always locked in the system.


The capital efficiency of the system varies in response to the batching time.
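A sketch of this inventory estimate, assuming arrays of historical transfer timestamps (seconds) and USD amounts pulled from the Skip Go API:

```python
import numpy as np

def window_demand_percentile(timestamps_s, amounts_usd, window_s=1800, q=90):
    """q-th percentile of capital demanded within any single batching window."""
    ts = np.asarray(timestamps_s)
    bins = ((ts - ts.min()) // window_s).astype(int)
    # Sum transfer amounts per window; empty windows count as zero demand.
    per_window = np.bincount(bins, weights=np.asarray(amounts_usd))
    return np.percentile(per_window, q)
```

Whether empty windows should count as zero demand (pulling the percentile down) or be excluded is itself a sizing choice; including them is the less conservative option.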

Using historical Skip Go data to inform a cutoff for fast transfer size limit

We pulled historical data from the Skip Go API to understand what kind of volume we have served for Ethereum <> Cosmos routes, and the distribution of transaction sizes.

Our own flow data can be used to tune the system. For example, we could calculate the 90th percentile of volume transferred in a given time window and use this to establish a size limit.

Alternatively, we can directly compute how much inventory is needed to serve all transactions below a certain threshold amount, given a desired batching rate. We can then set our size limit to match a known capital commitment.

If we’re not convinced past data is reflective of future volume we could alternatively use volume data from similar services, or simply assume some transfer size distribution function in order to derive the threshold size parameter.
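As a sketch of the second approach: scan candidate size limits and accept the largest one whose worst-case window demand fits a fixed inventory budget. The worst-case heuristic here (the largest eligible fills all arriving within one window) is an assumption, not necessarily how the deployed system models it.

```python
def max_size_limit(sizes_usd, inventory_budget_usd, max_tx_per_window):
    """Largest fast-transfer size limit a fixed inventory budget can support."""
    for limit in sorted(set(sizes_usd), reverse=True):
        eligible = sorted((s for s in sizes_usd if s <= limit), reverse=True)
        # Crude worst case: the window's largest eligible fills arrive together.
        if sum(eligible[:max_tx_per_window]) <= inventory_budget_usd:
            return limit
    return 0.0
```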

Flow analysis: inventory requirements

For the parameter T defined previously, given the relatively low total number of transactions using the Skip Go API to transfer from Ethereum (1000), we take the maximum number of transactions occurring during a given day: 52. We then fetch the 90th percentile of volume for 30-minute, 1-hour, 2-hour, 4-hour, and 1-day windows.

| | 30-minute window | 1 hour | 2 hours | 4 hours | 1 day |
|---|---|---|---|---|---|
| Batching rate (batches per day) | 52 | 26 | 10 | 5 | 1 |
| 90th percentile of volume, at 52 tx/day (the maximal number of transactions from/to Ethereum during a day on the Skip Go API) | $20K | $25K | $33K | $51K | $210K |
| Total inventory needed on one leg (TBD - based on the previous locked-capital graph) | $30K | $31K | $35K | $51K | $210K |

âś±

Costs with batching

| Estimated number of tx during a day | Gas price | Gas units per verification | Batches per day | Total cost per day |
|---|---|---|---|---|
| 52 | 300 Gwei | 4096 + 21000 | 52 | $1240 |
| 52 | 300 Gwei | 4096 + 21000 | 26 | $620 |
| 52 | 300 Gwei | 4096 + 21000 | 10 | $240 |
| 52 | 300 Gwei | 4096 + 21000 | 5 | $120 |
| 52 | 300 Gwei | 4096 + 21000 | 1 | $23 |
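These per-day figures follow directly from the batching cost function above; the sketch below reproduces them to within rounding, under an assumed ETH price of roughly $3,200.

```python
GWEI = 1e-9
verify_gas = 4_096 + 21_000  # commitment calldata + base transaction
eth_usd = 3_200              # assumed snapshot price

for batches_per_day in (52, 26, 10, 5, 1):
    cost_usd = batches_per_day * verify_gas * 300 * GWEI * eth_usd
    print(batches_per_day, round(cost_usd))  # ~1253, 626, 241, 120, 24
```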

âś±

Without batching, i.e. using only compression

| Tx number | Gas price | Gas units | Total cost per day |
|---|---|---|---|
| 52 | 300 Gwei | 11200 + 21000 | $1600 |

âś±

Backing out a minimum user bid from the risk-free rate

| Risk-free rate per window | 1 day | 4 hours | 2 hours | 1 hour | 30 minutes |
|---|---|---|---|---|---|
| Lido (stETH) | 0.0000932 | 0.0000155 | 0.00000776 | 0.00000388 | 0.00000194 |
| Aave (USDC) | 0.000304 | 0.0000507 | 0.0000253 | 0.0000127 | 0.00000634 |

| Risk-free cost ($) on the 90th-percentile capital | 1 day | 4 hours | 2 hours | 1 hour | 30 minutes |
|---|---|---|---|---|---|
| Lido (stETH) | 19.562 | 0.792 | 0.256 | 0.0970 | 0.0388 |
| Aave (USDC) | 63.927 | 2.588 | 0.837 | 0.317 | 0.127 |
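These tables can be reproduced from annualized rates; the APRs below (Lido ~3.4%, Aave USDC ~11.1%) are snapshot assumptions backed out from the figures above.

```python
WINDOWS_PER_DAY = {"1 day": 1, "4 hours": 6, "2 hours": 12,
                   "1 hour": 24, "30 minutes": 48}
P90_CAPITAL_USD = {"1 day": 210_000, "4 hours": 51_000, "2 hours": 33_000,
                   "1 hour": 25_000, "30 minutes": 20_000}

def window_rate(apr, window):
    """Per-window risk-free rate from an annualized rate."""
    return apr / 365 / WINDOWS_PER_DAY[window]

for window, capital in P90_CAPITAL_USD.items():
    lido, aave = window_rate(0.034, window), window_rate(0.111, window)
    print(window, round(lido * capital, 4), round(aave * capital, 4))
```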

Batch by time or batch by capital?

We considered a fixed batching frequency for the purpose of the model; however, batching could also occur automatically once a certain capital demand threshold is reached. For example, we could batch every $20K served. We could also define a dynamic batching rate à la EIP-1559, responding dynamically to demand increases or decreases.
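A sketch of such a hybrid trigger, settling a batch when either the time window elapses or accumulated volume crosses a capital threshold (both thresholds here are illustrative):

```python
import time

class BatchTrigger:
    """Settle a batch on whichever comes first: max age or max capital."""
    def __init__(self, max_age_s=1800, max_capital_usd=20_000):
        self.max_age_s = max_age_s
        self.max_capital_usd = max_capital_usd
        self.opened_at = time.time()
        self.capital = 0.0

    def record_fill(self, amount_usd):
        """Returns True when the current batch should be settled."""
        self.capital += amount_usd
        age = time.time() - self.opened_at
        if self.capital >= self.max_capital_usd or age >= self.max_age_s:
            self.opened_at, self.capital = time.time(), 0.0  # start a new batch
            return True
        return False
```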

Netting

Batching occurs in both fast transfer directions, i.e. we batch refund payments for each destination chain (Cosmos and Ethereum). Netting consists of coordinating these two batching processes. Instead of running two separate verification processes, both could be executed at the same time. We then only need to pay liquidity providers the offset of aggregate flows in each direction. If opposing flows are balanced, this can result in substantial cost savings.
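The netting computation itself is trivial: only the imbalance between the two directions needs to actually settle over the bridge.

```python
def net_settlement(eth_to_cosmos_usd, cosmos_to_eth_usd):
    """Net offset that must move over the bridge, and its direction."""
    offset = eth_to_cosmos_usd - cosmos_to_eth_usd
    direction = "ethereum->cosmos" if offset >= 0 else "cosmos->ethereum"
    return abs(offset), direction

# $120K flowed one way and $100K the other: only $20K needs to settle.
print(net_settlement(120_000, 100_000))  # (20000, 'ethereum->cosmos')
```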

Trade-off 2: Forward vs backward propagation

For the Skip Go Fast system to later verify that fulfillment was performed correctly, either information about the intent must be transmitted to the destination for verification OR information about final token delivery must be transmitted to the source for verification.

Both forward-propagated and backward-propagated data are compressible; however, only the former can be incorporated into a batching process. The costly component is always Ethereum, and its cost depends on the direction of the fast transfer as well as the system’s architecture for propagating verification data.

Backward propagation vs forward propagation

  • Forward propagation of verification data - check whether the liquidity provider fill was correct or not on the destination chain and refund the liquidity provider on the destination chain
  • Backward propagation of verification data - check whether the liquidity provider fill was correct or not on the source chain and refund the liquidity provider on the source chain
  • Slow path - refund to the user through the relevant bridge if the liquidity provider did not execute their quote (or did so incorrectly)

Forward propagation requires the liquidity provider to perform less inventory management. Without forward propagation the liquidity provider would be refilled on the source chain, and their destination chain inventory would continue to decrease, forcing them to manually bridge funds after settlement.

Backward propagation can allow for a more systemic approach since the balancing information is centralized. This can also enable a netting system (when we refund a liquidity provider, their inventories come back to their initial ratio), and would be a key part of a lending system. For example, if we want to centralize information on Ethereum then we would backward propagate for the Ethereum → Cosmos direction and forward propagate for the Cosmos → Ethereum direction. One can also have a more complex system where the refunding and the verification are decoupled (as is the case with Across): the refunding could always occur on the destination chain while the verification always occurs on a centralized venue (such as Ethereum).

Different implications for the unhappy path

The backward and forward mechanisms are the two possible architectures for the verification process. In the backward propagation architecture the fast transfer data is sent back from the destination chain to the source chain for comparison with the initial data defining the user’s intent. If the comparison succeeds the liquidity provider is refunded on the source chain; otherwise a bridge transfer is initiated. In the forward propagation architecture, the verification data is forwarded immediately to the destination chain along with a bridge transfer. Once the bridge transfer arrives, the verification happens on the destination chain. If it succeeds the bridge transfer goes to the liquidity provider; otherwise, the funds go to the user.
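A toy sketch of the verification step both architectures share: whichever chain performs verification compares a commitment to the user’s intent against the reported fill. All field names here are hypothetical, and SHA-256 stands in for whatever hash the production contracts use.

```python
import hashlib

def intent_commitment(user, token_out, min_amount_out, dest_chain):
    """Hash posted on the source chain when the user submits their intent."""
    preimage = f"{user}|{token_out}|{min_amount_out}|{dest_chain}".encode()
    return hashlib.sha256(preimage).digest()

def verify_fill(commitment, fill):
    """True: refund the liquidity provider. False: slow-path refund to the user."""
    return (fill["amount_out"] >= fill["min_amount_out"]
            and commitment == intent_commitment(fill["user"], fill["token_out"],
                                                fill["min_amount_out"],
                                                fill["dest_chain"]))
```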

The existence of a lending market has a bearing on the relative capital efficiency of forward vs backward propagation

In a backward propagation architecture, the liquidity provider gets refilled on the source after the evidence of delivery has completed slow bridge propagation. Their inventory on the destination chain therefore continues to diminish until they manually bridge the capital back, and the bridging process can only begin after a completed cycle. Thus, without the existence of a lending market, a forward propagation architecture is more capital efficient.

Presuming the existence of a lending market on the destination, the verification could be performed on the source chain in order to aggregate all transaction flow decisions, as well as to offer the liquidity provider flexibility in choosing the chain where they want to be refunded.

Translating the design to Bitcoin

Possible Bitcoin denominations that could be used

There are two options today:

  1. wBTC - a custodial representation of BTC issued by BitGo (minting wBTC requires a KYC flow from an authorized provider)
  2. nBTC - a trust-minimized representation of BTC issued by the Nomic chain

And in the near future a number of other options will become available:

  • tBTC via Mezo will be available early next year
  • Citrea plans to export BTC over Hyperlane and then subsequently IBC along a similar timeline
  • Axelar is working on a Bitcoin integration
  • Dfinity has a threshold signature Bitcoin solution that could be used to issue BTC in Cosmos
  • Thorchain may export BTC over IBC (TBD)

At the moment we don’t have enough information to know which asset the market will prefer, or specifics about when these options will be production ready.

Bitcoin block times

The Bitcoin protocol targets an average block time of 10 minutes by adjusting mining difficulty. In practice block times vary significantly, with most block times falling between 5-50 minutes.


Theoretical (red) vs. empirical (blue) distribution of block times. (link)
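Since block discovery is approximately a Poisson process, inter-block times are roughly exponential with a 10-minute mean (ignoring difficulty adjustment), which makes tail estimates easy to sanity-check:

```python
import math

def p_block_time_exceeds(minutes, mean_minutes=10.0):
    """P(next block takes longer than `minutes`) under the exponential model."""
    return math.exp(-minutes / mean_minutes)

print(p_block_time_exceeds(50))  # ~0.007: under 1% of gaps exceed 50 minutes
print(10 * math.log(2))          # median gap ~6.9 minutes, below the 10-min mean
```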

Native Bitcoin fees

The Bitcoin fee market is a priority auction based on fees paid directly to miners. The Bitcoin scripting language has very limited expressivity and UTXOs decrease state contention, which has historically resulted in less MEV. That said, Bitcoin’s blockspace is extremely limited (roughly 1MB of base block space, up to 4MB of weight with SegWit, with an average of one block every 10 minutes), so fees tend to spike dramatically during times of congestion.


Median Bitcoin fees 2021 until today. (link)

Bridge fees

  • wBTC - Depends on the merchant. Would need to investigate further to know exactly.
  • nBTC - The protocol currently takes 1% on deposits, 0.5% on IBC transfers out, paid to NOM stakers. It’s currently artificially high to discourage deposits before audits are complete. The estimated timeline for a change is mid 2024.

Bridge timing

  • wBTC - Effectively a fast transfer, liquidity provider delivers after some number of confirmations.
  • nBTC - Nomic waits 6 confirmations on Bitcoin before registering a main chain deposit. The transfer gets executed the next time there’s a Nomic checkpoint, about once an hour.

Fast path requirements: 0-conf vs 1-conf

  • 0 confirmation - In this case a user could submit a transaction with a higher fee that ends up being included prior to the transaction received by Skip’s fast transfer system. Risk reduction is very involved, e.g. direct channels with miners, modeling hash power distribution, mempool propagation, and fee distributions. It’s likely not worth the effort, particularly since anyone transacting on Bitcoin already expects very slow block times.
  • 1 confirmation - Much more feasible. There is still a non-zero risk of a reorg or, more likely, a “stale block,” in which multiple golden nonces are found within the span of P2P propagation. In these scenarios a user could submit a conflicting transaction that makes it into the longest chain. PoW’s stochastic block production makes this risk much higher than on Ethereum.


Stale block candidates from the BitMex Research feed. Plot only includes the data from their node’s view of the network. (link)

Options for conditional escrow

  • Covenants - OP_CAT or an equivalent BIP. This would allow us to write a conditional escrow in a straightforward way, however TBD if/when this functionality makes it into a Bitcoin upgrade.
  • BitVM - We could write the equivalent conditional logic using BitVM, which Merklizes all possible branches of the computation and provides partial transactions for each path that can be used to resolve the execution path. Very time consuming to perform a computation, so most people are using this for optimistic execution with a challenge game. This is pretty much a research project at this stage, so not practical for us at the moment.
  • Escrow on BTC peg domain - Presumably this chain would have an expressive state machine that would allow us to implement the conditional escrow logic directly.
  • Threshold signing via threshold ECDSA or Schnorr signatures - The signing system would custody funds and would therefore need to be decentralized.
  • Lightning channel - The user could open a lightning channel with the liquidity provider on Bitcoin with a channel capacity of the correct size. The Skip Go API could hold the signature required to send funds in the lightning channel and watch the Cosmos side for the liquidity provider to deliver funds. The downside of this option is the number of required transactions. Two to open the channel, one to close the channel, and the liquidity provider must make one transaction to replenish funds on the Cosmos side.

Introduction of a lending market (future work)

The role of lending markets for fast transfer systems

A lending market allows us to abstract the refund payment from the bridge transfer: the relayer’s time-to-refund becomes dependent only on the verification time. The lending market maintains an LP pool on each destination chain whose size is proportional to the flow into that chain. This pool can also act as a lender of last resort, i.e. if there are no fast fills it can execute a user refund.

A lender of last resort is helpful even without the introduction of a lending system, but particularly useful to capitalize a multi-chain lending market.

Important factors for introducing a lending market:

  1. Number of fast transfer routes
  2. Variability in the directionality and magnitude of flows
  3. Aggregate volume

Note: batching by time makes coordinating lending positions much easier.

Additional functionality that may be worth considering

Using Skip Connect to decouple bridging from verification data propagation

We could use Skip Connect to decouple the verification process from the bridging process, abstracting away the bridging process. The restaked validators would handle passing the Ethereum state root and performing verification, upon which the bridged funds would be unlocked to either the liquidity provider or the user.

Turning fast transfer into a pool

Instead of using the LP pool on the destination chain as a mechanism to refund relayers or act as a lender of last resort, one could use it to directly complete the fast transfer. A validator would just need to supply the routing information to the destination chain along with an inclusion proof. The validator set, by constantly fetching Merkle state roots from the Ethereum chain, would then agree upon this data and unlock funds from the pool.


Give Skip:Go Fast a try today, or check out our developer docs on how to integrate the Skip:Go API, which includes our fast transfer system out of the box. It’s careful design and engineering like this that enables what we believe to be the best cross-chain UX available on the market.