UniV3 Strategy - Part 5, UMAMI-WETH

Forecast the UMAMIETH Price 1 and 2 Hours Ahead with an Autoregressive Moving Average Model

Dec 10, 2022

Previously, I demonstrated how to use AutoRegression (AR) to predict the next-hour BTCETH price. Today, I’ll show you how to predict the next-hour UMAMIETH (UMAMI/ETH) price with the AutoRegressive Moving Average (ARMA) model. Because ARMA is built on top of AR, I recommend reading my previous article first to connect the dots.

Why are we interested in predicting the short-term movement of token A's price relative to token B's? Because if we can do that successfully, we can set better lower and upper price limits and hence play the in-and-out liquidity provision game better. This is a game sophisticated UniV3 users have been playing. For example, they’d identify a small pool with $1M ~ $10M TVL and provide liquidity for as little as 1 ~ 3 hours to rack up trading fees while minimizing impermanent loss. The game is playable because short-term price movement is predictable to some extent: price momentum tends to carry forward over a short time window.

The UMAMI-WETH 1bp pool on Arbitrum has a TVL of $657K at the moment, so it seems a good choice for demonstration. But the recipe and Python code I provide here can be easily adapted to other pairs. So let’s get started.

First, we need to download hourly UMAMI and WETH price data from Arbitrum. Thanks to DeFiLlama’s data API and the defillama2 package, we can easily do that in Python. For this demo, I set the download period between 2022-11-01 and 2022-12-09. It took ~11 minutes to download the data. I then divided the UMAMI price by the WETH price to get the UMAMIETH price.

Hourly prices displayed in 2 decimal places

There are no missing values in the prices so we don’t need to worry about imputation or filling. I then ran some summary statistics and plotted UMAMIETH to gain a quick understanding of the data.
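As a rough sketch of this step, the snippet below computes the ratio, counts missing values, and prints summary statistics. The price frame is synthetic stand-in data (the real prices come from the download described above), and the column names `UMAMI`, `WETH` and `UMAMIETH` are my own labels for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded hourly USD prices (not real data).
idx = pd.date_range("2022-11-01", periods=48, freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(0)
prices = pd.DataFrame(
    {
        "UMAMI": 20 + rng.normal(0, 0.2, len(idx)).cumsum(),
        "WETH": 1250 + rng.normal(0, 5, len(idx)).cumsum(),
    },
    index=idx,
)

# Both legs are quoted in USD, so dividing them gives UMAMI priced in ETH.
prices["UMAMIETH"] = prices["UMAMI"] / prices["WETH"]

# Count missing values per column; all zeros means no imputation is needed.
n_missing = prices.isna().sum()
print(n_missing)

# Quick summary statistics of the ratio series.
print(prices["UMAMIETH"].describe())
```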

The second step is to prepare data for model building and forecasting. Using an arbitrary cutoff time, 2022-12-03 23:00:00, I split UMAMIETH into training and test sets. The training set has 792 prices and the test set has 121. I then min-max scaled the data so that the training values lie between 0 and 1. To avoid data leakage, I used the min and max of UMAMIETH in the training set to scale the test series as well, so scaled test values can fall slightly outside that range. The following table shows scaled values from the training and test series.
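Here is a minimal sketch of the split and scaling, assuming an hourly index that runs from 2022-11-01 00:00 to 2022-12-09 00:00 (which yields the same 792/121 split). The price values are synthetic, and the helper names `scale` and `unscale` are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic hourly series standing in for UMAMIETH (not real data).
idx = pd.date_range("2022-11-01 00:00", "2022-12-09 00:00", freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(1)
y = pd.Series(0.016 + rng.normal(0, 1e-5, len(idx)).cumsum(), index=idx, name="UMAMIETH")

cutoff = pd.Timestamp("2022-12-03 23:00:00")
train = y.loc[:cutoff]                          # up to and including the cutoff
test = y.loc[cutoff + pd.Timedelta(hours=1):]   # strictly after the cutoff

# Scaling parameters come from the training set only, to avoid leakage.
lo, hi = train.min(), train.max()

def scale(s):
    """Min-max scale with the training min and max."""
    return (s - lo) / (hi - lo)

def unscale(s):
    """Undo scale(), returning values in the original price units."""
    return s * (hi - lo) + lo

train_scaled, test_scaled = scale(train), scale(test)
print(len(train), len(test))
print(test_scaled.min(), test_scaled.max())  # may fall outside [0, 1]
```

The same `unscale` helper can be applied to forecasts later to bring them back to the original price scale.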

Scaled prices displayed in 2 decimal places

The third step is to determine parameter values that give the best model fit to the training data. An AutoRegression (AR) model uses only historical prices to predict future prices, whereas an AutoRegressive Moving Average (ARMA) model uses both historical prices and historical prediction errors to predict the future. An ARMA model has two parameters, p and q, where p is the number of lagged prices and q is the number of lagged prediction errors used as predictors. For example, ARMA(1, 1) is short for autoregressive moving average of order (1, 1), and its model equation is

Y(t) = a + b1*Y(t-1) + c1*error(t-1) + error(t)

and ARMA(2, 1) has a model equation of

Y(t) = a + b1*Y(t-1) + b2*Y(t-2) + c1*error(t-1) + error(t)
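To make the ARMA(1, 1) equation concrete, the sketch below generates a series directly from it and then forms a one-step forecast by hand. The coefficient values and the function name `simulate_arma11` are made up for illustration; they are not estimates from the UMAMIETH data.

```python
import numpy as np

def simulate_arma11(a, b1, c1, n, sigma=1.0, seed=0):
    """Generate n values of Y(t) = a + b1*Y(t-1) + c1*error(t-1) + error(t)."""
    rng = np.random.default_rng(seed)
    err = rng.normal(0.0, sigma, n)
    y = np.empty(n)
    # Start near the long-run mean a / (1 - b1); this requires |b1| < 1.
    y[0] = a / (1.0 - b1) + err[0]
    for t in range(1, n):
        y[t] = a + b1 * y[t - 1] + c1 * err[t - 1] + err[t]
    return y, err

y, err = simulate_arma11(a=0.05, b1=0.9, c1=0.3, n=500, sigma=0.02)

# One-step-ahead forecast: the unknown future error is replaced by its mean, 0.
next_hour = 0.05 + 0.9 * y[-1] + 0.3 * err[-1]
print(next_hour)
```

In practice the errors are not observed directly; the fitted model estimates them as the residuals of its own past one-step predictions.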

We want to find p and q such that ARMA(p, q) fits the training data best. We can use the ar_select_order() function from the statsmodels package to automatically determine the best p. I ran it on my training data and got p = 2, telling me to use the prices one hour ago and two hours ago as predictors.

To determine the best q value, I fit both ARMA(2, 1) and ARMA(2, 2) and examined their residuals by looking at the model diagnostic plot.

Both correlograms of the residuals show almost all lollipop sticks squashed down inside the blue 95% confidence band, except at lag 15. This says both ARMA(2, 1) and ARMA(2, 2) are able to explain away the autocorrelation in the price series and hence are good model choices. The tiny peak at lag 15 is not a concern: with a 95% band, we expect about 5% of the lags (1 out of the 20 shown) to poke outside it due to sampling variation alone.

So shall we go with ARMA(2, 1) or ARMA(2, 2)? To make that choice, I looked at the coefficient estimates of ARMA(2, 1) and found that `ar.L2` is not statistically significant and hence can be dropped. This contradicts the auto-selection algorithm's suggestion that the best AR order is 2.

I then looked at the coefficient estimates of ARMA(2, 2) and found that although `ar.L2` becomes significant after adding `ma.L2`, `ar.L1` is no longer significant. But obviously we can't drop `ar.L1`, because that would mean ignoring the most recent price while keeping the older one. So let’s go with the parsimonious ARMA(1, 1).

I fit an ARMA(1, 1) on the training set and obtained the following diagnostic plots. The correlogram (bottom right) looks good and we are good to move to the next step.

Iterative Training and Forecasting

Now that we know ARMA(1, 1) fits our data best, let's train an ARMA(1, 1) model and use it to make forecasts iteratively on a rolling basis. That is, as we iterate over the timestamps in the test series, we will

1. train an ARMA(1, 1) on the training set,
2. make a 2-step forecast, i.e., predict next-hour and next-2-hour prices simultaneously,
3. grow the training set by adding the test set observation at that timestamp to the end of the training series. Because we have a small dataset and a simple model, we don’t need to keep the training size fixed to speed up the computation.

Afterwards, we’ll reverse the scaling operation so that we have the forecast values and the actuals in the original scale. We can then plot them. Notice that the next-hour forecasts are better than the next-2-hour forecasts. This is expected, as it’s more difficult to predict the distant future than the near future.

Next-hour Forecasts vs. Actuals, UMAMIETH

Next-2-hour Forecasts vs. Actuals, UMAMIETH

Finally, we should calculate a single metric to summarize the error. Mean Absolute Percent Error (MAPE) is a good choice.
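MAPE is the average of |actual - forecast| / |actual|, expressed as a percentage. A small sketch, with made-up numbers rather than the article's results:

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percent Error, in percent. Actuals must be non-zero."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)

# Illustrative values in the original (unscaled) price units.
actual = [0.0160, 0.0162, 0.0158]
forecast = [0.0161, 0.0160, 0.0159]
print(f"MAPE: {mape(actual, forecast):.2f}%")
```

It makes sense to compute MAPE after reversing the scaling, since scaled actuals can sit at or near 0 and would blow up the percentage errors.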

Code & Data

Python notebook. You can easily adapt the code to any UniV3 pair.

Coin Data School
