Hello World

%matplotlib inline
import pandas as pd
import cvxportfolio as cp

Download the problem data with yfinance. We select five liquid stocks.

import yfinance
tickers = ["AMZN", "AAPL", "MSFT", "GOOGL", "TSLA"]
# To restrict the sample, pass start="2012-01-01" and end="2016-12-31"
# to yfinance.download; here we download the full available history.
# Daily simple returns from adjusted close prices, one column per ticker.
returns = pd.DataFrame(
    {
        ticker: yfinance.download(ticker)["Adj Close"].pct_change()
        for ticker in tickers
    }
)
[*********************100%***********************]  1 of 1 completed

returns.describe()
               AMZN          AAPL          MSFT         GOOGL          TSLA
count   6523.000000  10675.000000   9349.000000   4697.000000   3222.000000
mean       0.001704      0.001100      0.001134      0.000983      0.002125
std        0.036077      0.028212      0.021314      0.019434      0.036126
min       -0.247661     -0.518692     -0.301159     -0.116341     -0.210628
25%       -0.013295     -0.013056     -0.009217     -0.007995     -0.015596
50%        0.000402      0.000000      0.000352      0.000745      0.001218
75%        0.014799      0.014695      0.011355      0.010087      0.019381
max        0.344714      0.332280      0.195652      0.199915      0.243951
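
The statistics above describe simple daily returns. As a minimal sketch of what pct_change computes, the block below applies it to a short, made-up price series (the prices and dates are assumptions for illustration, not market data) and compares it with the explicit formula p_t / p_{t-1} - 1.

```python
import pandas as pd

# Hypothetical adjusted close prices for one ticker (not real market data).
prices = pd.Series(
    [100.0, 102.0, 99.96, 99.96],
    index=pd.to_datetime(
        ["2023-01-02", "2023-01-03", "2023-01-04", "2023-01-05"]
    ),
)

# pct_change() computes p_t / p_{t-1} - 1; the first entry has no
# predecessor, so it is NaN.
simple_returns = prices.pct_change()

# The same quantity written out explicitly.
manual_returns = prices / prices.shift(1) - 1

print(simple_returns)
```

The leading NaN in each column is one reason the notebook later calls dropna() before using the data.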

We get the return on cash from FRED.

import pandas_datareader

# DFF is an annualized rate in percent: convert it to a daily fraction.
returns[["USDOLLAR"]] = pandas_datareader.get_data_fred("DFF", start="2000-01-01") / (250 * 100)
returns = returns.ffill().dropna()

returns.tail()
                 AMZN       AAPL       MSFT      GOOGL       TSLA   USDOLLAR
Date
2023-04-12  -0.020917  -0.004353   0.002334  -0.006739  -0.033460   0.000193
2023-04-13   0.046714   0.034104   0.022399   0.026663   0.029689   0.000193
2023-04-14   0.001074  -0.002114  -0.012766   0.013404  -0.004841   0.000193
2023-04-17   0.002244   0.000121   0.009296  -0.026637   0.011027   0.000193
2023-04-18  -0.006619   0.004600  -0.004778  -0.008116  -0.012992   0.000193
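
The USDOLLAR column comes from dividing an annualized rate in percent by 250 * 100. The sketch below spells out that arithmetic. The helper name and the 4.83% input are assumptions chosen for illustration; the input is picked only because it maps to a daily value of the same size as the one in the table above.

```python
# Assumed trading-day convention, matching the divisor used in the notebook.
PERIODS_PER_YEAR = 250


def annual_percent_to_daily(rate_percent: float) -> float:
    """Convert an annualized rate in percent to a simple per-day fraction."""
    return rate_percent / (PERIODS_PER_YEAR * 100)


# Hypothetical annualized rate of 4.83%.
daily = annual_percent_to_daily(4.83)
print(f"{daily:.6f}")  # prints 0.000193
```

This is simple (non-compounded) scaling, which is what the notebook uses; a compounded conversion would give a slightly different number.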

We compute rolling estimates of the first and second moments of the returns (mean and covariance) using a window of 1000 days. We shift them by one period, so that on each day the optimizer sees only past data.

r_hat = returns.rolling(window=1000).mean().shift(1).dropna()
Sigma_hat = returns.shift(1).rolling(window=1000).cov().dropna()
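
To see that both estimates use only past data, here is a small sketch on synthetic returns with a window of 3 instead of 1000. The asset names, dates, and random data are assumptions for illustration. It also shows why the shift is applied before the rolling covariance rather than after: the covariance result is indexed by (date, asset) pairs, so shifting it by one row would misalign assets rather than dates.

```python
import numpy as np
import pandas as pd

# Synthetic returns for two hypothetical assets (not market data).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2023-01-02", periods=8)
rets = pd.DataFrame(
    rng.normal(0.0, 0.01, size=(8, 2)), index=dates, columns=["A", "B"]
)

window = 3  # short window for illustration; the notebook uses 1000

# Mean: roll first, then shift the result down by one row.
mean_hat = rets.rolling(window=window).mean().shift(1).dropna()

# Covariance: shift the inputs first, then roll, so the (date, asset)
# MultiIndex of the result stays aligned.
cov_hat = rets.shift(1).rolling(window=window).cov().dropna()

# At date t, both estimates should match statistics of the three days
# strictly before t.
t = dates[5]
past = rets.loc[dates[2]:dates[4]]
print(np.allclose(mean_hat.loc[t], past.mean()))
print(np.allclose(cov_hat.loc[t], past.cov()))
```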

r_hat
                 AMZN       AAPL       MSFT      GOOGL       TSLA   USDOLLAR
Date
2014-06-20   0.001314   0.001108   0.000786   0.001036   0.002961   0.000005
2014-06-23   0.001300   0.001116   0.000803   0.001059   0.002971   0.000005
2014-06-24   0.001293   0.001127   0.000804   0.001085   0.003083   0.000005
2014-06-25   0.001300   0.001127   0.000794   0.001089   0.003189   0.000005
2014-06-26   0.001302   0.001121   0.000777   0.001113   0.003368   0.000005
...               ...        ...        ...        ...        ...        ...
2023-04-12   0.000320   0.001412   0.001074   0.000738   0.003294   0.000048
2023-04-13   0.000280   0.001393   0.001063   0.000718   0.003256   0.000048
2023-04-14   0.000338   0.001429   0.001089   0.000753   0.003306   0.000049
2023-04-17   0.000339   0.001436   0.001043   0.000761   0.003343   0.000049
2023-04-18   0.000316   0.001441   0.001047   0.000726   0.003405   0.000049

2222 rows × 6 columns

Sigma_hat
                               AMZN           AAPL           MSFT          GOOGL           TSLA       USDOLLAR
Date
2014-06-20  AMZN       4.237160e-04   1.014461e-04   9.996433e-05   1.531471e-04   2.003648e-04   9.720131e-10
            AAPL       1.014461e-04   2.841137e-04   7.152255e-05   1.037271e-04   1.220260e-04  -6.378275e-10
            MSFT       9.996433e-05   7.152255e-05   1.983346e-04   8.818659e-05   1.074351e-04  -7.018987e-10
            GOOGL      1.531471e-04   1.037271e-04   8.818659e-05   2.528908e-04   1.364906e-04   2.541239e-10
            TSLA       2.003648e-04   1.220260e-04   1.074351e-04   1.364906e-04   1.424632e-03  -6.411383e-10
...         ...                 ...            ...            ...            ...            ...            ...
2023-04-18  AAPL       3.212597e-04   4.669670e-04   3.427499e-04   3.168139e-04   4.707923e-04  -1.459881e-08
            MSFT       3.287896e-04   3.427499e-04   4.118900e-04   3.408980e-04   4.190421e-04   1.773549e-09
            GOOGL      3.220580e-04   3.168139e-04   3.408980e-04   4.396580e-04   3.835795e-04  -2.223956e-08
            TSLA       4.559975e-04   4.707923e-04   4.190421e-04   3.835795e-04   1.848895e-03  -1.065242e-07
            USDOLLAR  -2.047893e-08  -1.459881e-08   1.773549e-09  -2.223956e-08  -1.065242e-07   3.357492e-09

13332 rows × 6 columns

For the cash return, we instead simply use the previous day’s return as the forecast.

r_hat['USDOLLAR'] = returns['USDOLLAR'].shift(1)

Here we define the transaction cost and holding cost models (sections 2.3 and 2.4 of the paper). Their parameters can be expressed as

  • a scalar (like we’re doing here), the same value for all assets and all time periods;

  • a Pandas Series indexed by the asset names, for asset-specific values;

  • a Pandas DataFrame indexed by timestamps with asset names as columns, for values that vary by asset and in time.

tcost_model = cp.TcostModel(half_spread=10e-4)
hcost_model = cp.HcostModel(borrow_costs=1e-4)
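
To make the three parameter forms concrete, the sketch below resolves a half-spread given as a scalar, a Series, or a DataFrame to one value per asset at a given time, and uses it in the linear (bid-ask spread) part of a transaction cost, half_spread * |trade|. The helpers value_at and spread_cost, the trades, and the per-asset numbers are all hypothetical; this is not how cvxportfolio resolves its parameters internally, and it ignores any nonlinear cost terms.

```python
import pandas as pd


def value_at(param, t, assets):
    """Hypothetical helper: one parameter value per asset at time t."""
    if isinstance(param, pd.DataFrame):  # varies by time and by asset
        return param.loc[t, assets]
    if isinstance(param, pd.Series):  # varies by asset only
        return param.loc[assets]
    return pd.Series(param, index=assets)  # scalar: same value everywhere


def spread_cost(half_spread, t, trades):
    """Linear spread cost in dollars: sum over assets of half_spread * |trade|."""
    per_asset = value_at(half_spread, t, trades.index)
    return float((per_asset * trades.abs()).sum())


t = pd.Timestamp("2023-04-18")
# Made-up dollar trades: buy 10,000 of one asset, sell 5,000 of another.
trades = pd.Series({"AAPL": 10_000.0, "MSFT": -5_000.0})

scalar = 10e-4  # 10 basis points, as in the notebook
per_asset = pd.Series({"AAPL": 5e-4, "MSFT": 10e-4})
per_time = pd.DataFrame(
    {"AAPL": [5e-4, 6e-4], "MSFT": [10e-4, 12e-4]},
    index=pd.to_datetime(["2023-04-17", "2023-04-18"]),
)

for half_spread in (scalar, per_asset, per_time):
    print(round(spread_cost(half_spread, t, trades), 6))
```

With these inputs the three costs come out to 15, 10, and 12 dollars respectively.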

We define the single-period optimization policy (section 4 of the paper).

risk_model = cp.FullCovariance(Sigma_hat)
gamma_risk, gamma_trade, gamma_hold = 1.0, 1.0, 1.0
leverage_limit = cp.LeverageLimit(3)

spo_policy = cp.SinglePeriodOpt(
    return_forecast=r_hat,
    costs=[
        gamma_risk * risk_model,
        gamma_trade * tcost_model,
        gamma_hold * hcost_model,
    ],
    constraints=[leverage_limit],
)
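
As a rough guide to what this policy solves at each period, the single-period problem can be sketched as follows. This is a paraphrase in the spirit of the paper's notation, not a quotation, and should be checked against section 4: w_t is the current weight vector, z_t the normalized trade vector, the hatted cost functions are forecasts built from the cost models, and the subscript 1:n selects the non-cash assets.

```latex
\begin{array}{ll}
\text{maximize}   & \hat r_t^T z_t
                    - \gamma^{\mathrm{risk}}  \, \psi_t(w_t + z_t)
                    - \gamma^{\mathrm{trade}} \, \hat\phi_t^{\mathrm{trade}}(z_t)
                    - \gamma^{\mathrm{hold}}  \, \hat\phi_t^{\mathrm{hold}}(w_t + z_t) \\
\text{subject to} & \mathbf{1}^T z_t = 0, \qquad
                    \| (w_t + z_t)_{1:n} \|_1 \le L^{\max}
\end{array}
\qquad \text{with} \quad \psi_t(x) = x^T \hat\Sigma_t \, x .
```

In the cell above, all three weights gamma are set to 1 and the leverage limit L^max is 3.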

We run a backtest for each policy; run_multiple_backtest returns one result object per policy, in the order the policies are given. By calling a result's summary method we get some basic statistics.

market_sim = cp.MarketSimulator(
    returns, [tcost_model, hcost_model], cash_key="USDOLLAR"
)
init_portfolio = pd.Series(index=returns.columns, data=250000.0)
init_portfolio.USDOLLAR = 0
results = market_sim.run_multiple_backtest(
    init_portfolio,
    start_time="2020-01-01",
    end_time="2023-04-01",
    policies=[spo_policy, 
             cp.Hold()
            ],
)
results[0].summary()
---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages/multiprocess/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/home/runner/work/cvxportfolio/cvxportfolio/cvxportfolio/simulator.py", line 272, in _run_backtest
    return self.run_backtest(
  File "/home/runner/work/cvxportfolio/cvxportfolio/cvxportfolio/simulator.py", line 231, in run_backtest
    u = policy.get_trades(h, t)
  File "/home/runner/work/cvxportfolio/cvxportfolio/cvxportfolio/policies.py", line 493, in get_trades
    constraints += constr.weight_expr(t, wplus, z, value)
  File "/home/runner/work/cvxportfolio/cvxportfolio/cvxportfolio/constraints.py", line 47, in weight_expr
    result = self.compile_to_cvxpy(wplus, z, v)
NameError: name 'wplus' is not defined
"""

The above exception was the direct cause of the following exception:

NameError                                 Traceback (most recent call last)
Cell In[10], line 6
      4 init_portfolio = pd.Series(index=returns.columns, data=250000.0)
      5 init_portfolio.USDOLLAR = 0
----> 6 results = market_sim.run_multiple_backtest(
      7     init_portfolio,
      8     start_time="2020-01-01",
      9     end_time="2023-04-01",
     10     policies=[spo_policy, 
     11              cp.Hold()
     12             ],
     13 )
     14 results[0].summary()

File ~/work/cvxportfolio/cvxportfolio/cvxportfolio/simulator.py:279, in MarketSimulator.run_multiple_backtest(self, initial_portf, start_time, end_time, policies, loglevel, parallel)
    277 if parallel:
    278     workers = multiprocess.Pool(num_workers)
--> 279     results = workers.map(_run_backtest, policies)
    280     workers.close()
    281     return results

File /opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages/multiprocess/pool.py:367, in Pool.map(self, func, iterable, chunksize)
    362 def map(self, func, iterable, chunksize=None):
    363     '''
    364     Apply `func` to each element in `iterable`, collecting the results
    365     in a list that is returned.
    366     '''
--> 367     return self._map_async(func, iterable, mapstar, chunksize).get()

File /opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages/multiprocess/pool.py:774, in ApplyResult.get(self, timeout)
    772     return self._value
    773 else:
--> 774     raise self._value

NameError: name 'wplus' is not defined

The summary for the second policy, cp.Hold(), followed by the total value of each portfolio over time.

results[1].summary()
Number of periods                               818
Initial timestamp               2020-01-02 00:00:00
Final timestamp                 2023-03-31 00:00:00
Portfolio return (%)                         41.343
Excess return (%)                            40.340
Excess risk (%)                              45.013
Sharpe ratio                                  0.897
Max. drawdown                                59.516
Turnover (%)                                  0.000
Average policy time (sec)                     0.000
Average simulator time (sec)                  0.001
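
For intuition about the return, risk, Sharpe ratio, and drawdown entries, the sketch below computes comparable quantities from a short, made-up value series using only the standard library. The annualization by 250 periods and the exact definitions are assumptions; cvxportfolio may compute its summary differently, so this is not expected to reproduce the numbers above.

```python
import math
import statistics

PERIODS_PER_YEAR = 250  # assumed trading-day convention


def summarize(values, cash_returns):
    """Annualized excess return and risk, Sharpe ratio, and max drawdown."""
    # Simple per-period portfolio returns from consecutive values.
    returns = [v1 / v0 - 1 for v0, v1 in zip(values, values[1:])]
    # Excess returns are measured relative to the cash return.
    excess = [r - c for r, c in zip(returns, cash_returns)]
    ann_return = statistics.mean(excess) * PERIODS_PER_YEAR
    ann_risk = statistics.stdev(excess) * math.sqrt(PERIODS_PER_YEAR)
    # Max drawdown: largest fractional drop from a running peak.
    peak, max_drawdown = values[0], 0.0
    for v in values:
        peak = max(peak, v)
        max_drawdown = max(max_drawdown, 1 - v / peak)
    return {
        "excess_return": ann_return,
        "excess_risk": ann_risk,
        "sharpe": ann_return / ann_risk,
        "max_drawdown": max_drawdown,
    }


# Made-up portfolio values and zero cash returns, for illustration only.
values = [100.0, 110.0, 99.0, 108.9]
stats = summarize(values, cash_returns=[0.0, 0.0, 0.0])
print(stats)
```

Here the value falls from a peak of 110 to 99, so the maximum drawdown is 10%.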
results[0].v.plot(figsize=(12, 5))
results[1].v.plot(figsize=(12, 5))
[Figure: total portfolio value over time, one line per policy]

The portfolio weight vector over time.

results[0].w.plot(figsize=(12, 6))
[Figure: portfolio weights over time for the first policy]
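
As a minimal sketch of what a weight vector is, assuming weights are dollar holdings divided by total portfolio value, the block below normalizes the initial portfolio used in the backtest and computes its leverage as the sum of absolute non-cash weights. The function names are hypothetical and not part of cvxportfolio.

```python
CASH_KEY = "USDOLLAR"


def weights(holdings):
    """Normalize dollar holdings (including cash) by total portfolio value."""
    total = sum(holdings.values())
    return {name: h / total for name, h in holdings.items()}


def leverage(w):
    """Sum of absolute weights over the non-cash assets."""
    return sum(abs(x) for name, x in w.items() if name != CASH_KEY)


# The initial portfolio from the backtest: 250,000 in each stock, no cash.
holdings = {
    "AMZN": 250000.0,
    "AAPL": 250000.0,
    "MSFT": 250000.0,
    "GOOGL": 250000.0,
    "TSLA": 250000.0,
    CASH_KEY: 0.0,
}

w = weights(holdings)
print(w)
print(leverage(w))
```

The initial portfolio has weight 0.2 in each stock and leverage 1, well inside the leverage limit of 3 set on the policy.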