Hello World

%matplotlib inline
import pandas as pd
import cvxportfolio as cp

We download the problem data with yfinance. We select five liquid stocks and convert their adjusted closing prices to daily returns.

import yfinance

tickers = ["AMZN", "AAPL", "MSFT", "GOOGL", "TSLA"]

# Daily simple returns from adjusted closing prices, one column per ticker.
# To restrict the sample, pass e.g. start="2012-01-01", end="2016-12-31" to yfinance.download.
returns = pd.DataFrame(
    {
        ticker: yfinance.download(ticker)["Adj Close"].pct_change()
        for ticker in tickers
    }
)

[*********************100%***********************]  1 of 1 completed

returns.describe()

	AMZN	AAPL	MSFT	GOOGL	TSLA
count	6523.000000	10675.000000	9349.000000	4697.000000	3222.000000
mean	0.001704	0.001100	0.001134	0.000983	0.002125
std	0.036077	0.028212	0.021314	0.019434	0.036126
min	-0.247661	-0.518692	-0.301159	-0.116341	-0.210628
25%	-0.013295	-0.013056	-0.009217	-0.007995	-0.015596
50%	0.000402	0.000000	0.000352	0.000745	0.001218
75%	0.014799	0.014695	0.011355	0.010087	0.019381
max	0.344714	0.332280	0.195652	0.199915	0.243951
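
The pct_change() call used above turns each adjusted close series into simple period-over-period returns, r_t = p_t / p_{t-1} - 1, so the first row of each column has no value. The differing counts in the table reflect the different lengths of each ticker's price history. A minimal sketch on made-up prices (the numbers are illustrative, not market data):

```python
import pandas as pd

# Hypothetical closing prices, for illustration only.
prices = pd.Series(
    [100.0, 102.0, 99.96],
    index=pd.to_datetime(["2023-01-02", "2023-01-03", "2023-01-04"]),
)

# Simple return: r_t = p_t / p_{t-1} - 1.
# The first entry has no predecessor, so it is NaN.
simple_returns = prices.pct_change()

# The same quantity written out by hand.
by_hand = prices / prices.shift(1) - 1
```

Here the second day's return is 2% and the third day's is -2%.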

We get the return on cash from FRED, using the effective federal funds rate (series DFF).

import pandas_datareader

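# DFF is quoted in percent per year; dividing by 250 * 100 gives an approximate daily return.
returns[["USDOLLAR"]] = pandas_datareader.get_data_fred("DFF", start="2000-01-01") / (250 * 100)
# Forward-fill gaps, then keep only the dates on which every column has a value.
returns = returns.ffill().dropna()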

returns.tail()

	AMZN	AAPL	MSFT	GOOGL	TSLA	USDOLLAR
Date
2023-04-12	-0.020917	-0.004353	0.002334	-0.006739	-0.033460	0.000193
2023-04-13	0.046714	0.034104	0.022399	0.026663	0.029689	0.000193
2023-04-14	0.001074	-0.002114	-0.012766	0.013404	-0.004841	0.000193
2023-04-17	0.002244	0.000121	0.009296	-0.026637	0.011027	0.000193
2023-04-18	-0.006619	0.004600	-0.004778	-0.008116	-0.012992	0.000193
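
The division by 250 * 100 converts the FRED series, which is quoted in percent per year, into an approximate return per trading day: dividing by 100 turns percent into a fraction, and dividing by 250 spreads it over an assumed 250 trading days. The forward fill then carries the last known value across dates with missing entries. A small sketch with made-up rates (the 250-day convention comes from the notebook; the sample values are assumptions):

```python
import pandas as pd

TRADING_DAYS_PER_YEAR = 250  # convention used in the notebook

# Hypothetical annualized rates in percent, with a gap on the second date.
annual_rate_pct = pd.Series(
    [4.83, None, 5.08],
    index=pd.to_datetime(["2023-04-17", "2023-04-18", "2023-04-19"]),
)

# Percent per year -> fraction per trading day.
daily_cash_return = annual_rate_pct / (TRADING_DAYS_PER_YEAR * 100)

# Forward fill: a missing day reuses the most recent known value.
daily_cash_return = daily_cash_return.ffill()
```

An annual rate of 4.83 gives about 0.000193 per day, the same magnitude as the USDOLLAR column shown above.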

We compute rolling estimates of the first and second moments of the returns (means and covariances) using a window of 1000 days. We shift the estimates by one period, so that on each day the optimizer is given only past data.

r_hat = returns.rolling(window=1000).mean().shift(1).dropna()
Sigma_hat = returns.shift(1).rolling(window=1000).cov().dropna()

r_hat

	AMZN	AAPL	MSFT	GOOGL	TSLA	USDOLLAR
Date
2014-06-20	0.001314	0.001108	0.000786	0.001036	0.002961	0.000005
2014-06-23	0.001300	0.001116	0.000803	0.001059	0.002971	0.000005
2014-06-24	0.001293	0.001127	0.000804	0.001085	0.003083	0.000005
2014-06-25	0.001300	0.001127	0.000794	0.001089	0.003189	0.000005
2014-06-26	0.001302	0.001121	0.000777	0.001113	0.003368	0.000005
...	...	...	...	...	...	...
2023-04-12	0.000320	0.001412	0.001074	0.000738	0.003294	0.000048
2023-04-13	0.000280	0.001393	0.001063	0.000718	0.003256	0.000048
2023-04-14	0.000338	0.001429	0.001089	0.000753	0.003306	0.000049
2023-04-17	0.000339	0.001436	0.001043	0.000761	0.003343	0.000049
2023-04-18	0.000316	0.001441	0.001047	0.000726	0.003405	0.000049

2222 rows × 6 columns
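
The effect of the one-period shift is easiest to see on a tiny series. The sketch below uses made-up returns and a window of 3 instead of 1000 (both are assumptions for readability): the estimate stamped on day t is the mean of the three days before t and never includes day t itself.

```python
import pandas as pd

# Hypothetical daily returns for one asset.
r = pd.Series(
    [0.01, 0.02, 0.03, 0.04, 0.05],
    index=pd.date_range("2023-01-02", periods=5, freq="B"),
)

window = 3  # the notebook uses 1000; 3 keeps the example readable

# Rolling mean that includes the current day.
same_day = r.rolling(window=window).mean()

# Shifted by one period: the estimate for day t uses days t-3 .. t-1 only.
r_hat_demo = same_day.shift(1).dropna()
```

The first usable estimate appears on the fourth day and equals the mean of the first three returns.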

Sigma_hat

		AMZN	AAPL	MSFT	GOOGL	TSLA	USDOLLAR
Date
2014-06-20	AMZN	4.237160e-04	1.014461e-04	9.996433e-05	1.531471e-04	2.003648e-04	9.720131e-10
	AAPL	1.014461e-04	2.841137e-04	7.152255e-05	1.037271e-04	1.220260e-04	-6.378275e-10
	MSFT	9.996433e-05	7.152255e-05	1.983346e-04	8.818659e-05	1.074351e-04	-7.018987e-10
	GOOGL	1.531471e-04	1.037271e-04	8.818659e-05	2.528908e-04	1.364906e-04	2.541239e-10
	TSLA	2.003648e-04	1.220260e-04	1.074351e-04	1.364906e-04	1.424632e-03	-6.411383e-10
...	...	...	...	...	...	...	...
2023-04-18	AAPL	3.212597e-04	4.669670e-04	3.427499e-04	3.168139e-04	4.707923e-04	-1.459881e-08
	MSFT	3.287896e-04	3.427499e-04	4.118900e-04	3.408980e-04	4.190421e-04	1.773549e-09
	GOOGL	3.220580e-04	3.168139e-04	3.408980e-04	4.396580e-04	3.835795e-04	-2.223956e-08
	TSLA	4.559975e-04	4.707923e-04	4.190421e-04	3.835795e-04	1.848895e-03	-1.065242e-07
	USDOLLAR	-2.047893e-08	-1.459881e-08	1.773549e-09	-2.223956e-08	-1.065242e-07	3.357492e-09

13332 rows × 6 columns
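
Sigma_hat is stored in long form: its row index has two levels (date, asset), so each date contributes one 6 × 6 block, and the 13332 rows correspond to 2222 dates × 6 assets. The sketch below shows how one date's covariance matrix can be pulled out of such a frame and checked against a direct computation. It uses synthetic returns, two hypothetical assets, and a short window, all of which are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2023-01-02", periods=8, freq="B")

# Synthetic returns for two hypothetical assets.
demo_returns = pd.DataFrame(
    rng.normal(scale=0.01, size=(8, 2)), index=dates, columns=["A", "B"]
)

window = 4
# Same pattern as in the notebook: shift first, then take rolling covariances.
sigma_demo = demo_returns.shift(1).rolling(window=window).cov().dropna()

# Selecting on the first index level returns the matrix for that date.
last_date = dates[-1]
sigma_last = sigma_demo.loc[last_date]

# Cross-check against the sample covariance of the four preceding days.
expected = np.cov(demo_returns.iloc[-1 - window:-1].to_numpy(), rowvar=False)
```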

For cash, instead of the rolling mean, we simply use the previous day’s return as the forecast.

r_hat['USDOLLAR'] = returns['USDOLLAR'].shift(1)

Here we define the transaction cost and holding cost models (sections 2.3 and 2.4 of the paper). Their parameters can be given as:

a scalar (as we do here), which applies the same value to all assets and all time periods;
a pandas Series indexed by asset name, for asset-specific values;
a pandas DataFrame indexed by timestamp with asset names as columns, for values that vary by asset and over time.

tcost_model = cp.TcostModel(half_spread=10e-4)
hcost_model = cp.HcostModel(borrow_costs=1e-4)
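
The three input forms listed above can be sketched with plain pandas objects. The values and the lookup helper below are hypothetical and only illustrate the indexing convention; how cvxportfolio resolves these inputs internally is not shown in this document.

```python
import pandas as pd

assets = ["AAPL", "MSFT"]
dates = pd.to_datetime(["2023-04-17", "2023-04-18"])

# 1. Scalar: one value for every asset and every period.
half_spread_scalar = 10e-4

# 2. Series indexed by asset name: asset-specific, constant in time.
half_spread_by_asset = pd.Series({"AAPL": 5e-4, "MSFT": 8e-4})

# 3. DataFrame indexed by timestamp, assets as columns: varies by asset and time.
half_spread_by_time = pd.DataFrame(
    [[5e-4, 8e-4], [6e-4, 9e-4]], index=dates, columns=assets
)


def lookup(data, t, asset):
    """Hypothetical helper: return the value that applies to `asset` at time `t`."""
    if isinstance(data, pd.DataFrame):
        return data.loc[t, asset]
    if isinstance(data, pd.Series):
        return data[asset]
    return data
```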

We define the single period optimization policy (section 4 of the paper).

risk_model = cp.FullCovariance(Sigma_hat)
gamma_risk, gamma_trade, gamma_hold = 1.0, 1.0, 1.0
leverage_limit = cp.LeverageLimit(3)

spo_policy = cp.SinglePeriodOpt(
    return_forecast=r_hat,
    costs=[
        gamma_risk * risk_model,
        gamma_trade * tcost_model,
        gamma_hold * hcost_model,
    ],
    constraints=[leverage_limit],
)
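
To make the role of the three gamma multipliers concrete, the sketch below evaluates a simplified single-period objective for a candidate trade: expected return minus weighted risk, transaction cost, and holding cost terms. It is an illustration under stated assumptions (a quadratic risk term, a linear half-spread transaction cost, a borrow cost charged on short positions, cash stored as the last entry, made-up numbers). It is not cvxportfolio's implementation, and it only evaluates the objective; it does not optimize it.

```python
import numpy as np


def spo_objective(w, z, r_hat, Sigma, gamma_risk=1.0, gamma_trade=1.0,
                  gamma_hold=1.0, half_spread=10e-4, borrow_cost=1e-4):
    """Simplified single-period objective; the last entry of each vector is cash."""
    w_plus = w + z                                   # post-trade weights
    expected_return = r_hat @ w_plus
    risk = w_plus @ Sigma @ w_plus                   # quadratic risk term
    trade_cost = half_spread * np.abs(z[:-1]).sum()  # spread paid on traded stock weight
    hold_cost = borrow_cost * np.maximum(-w_plus[:-1], 0.0).sum()  # shorts only
    return (expected_return - gamma_risk * risk
            - gamma_trade * trade_cost - gamma_hold * hold_cost)


def leverage(w):
    """Sum of absolute non-cash weights (assumed definition for this sketch)."""
    return np.abs(w[:-1]).sum()


# Made-up example with two stocks plus cash.
w = np.array([0.5, 0.5, 0.0])
z = np.array([0.1, -0.1, 0.0])   # trades sum to zero
r_hat = np.array([0.002, 0.001, 0.0])
Sigma = np.diag([4e-4, 2e-4, 0.0])

value = spo_objective(w, z, r_hat, Sigma)
within_limit = leverage(w + z) <= 3  # mirrors the leverage limit of 3 used above
```

Raising one of the gamma values penalizes the corresponding term more heavily, which makes the same trade less attractive.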

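We run a backtest for each of two policies: the single-period optimization policy and a hold policy (cp.Hold()) that makes no trades. The call returns one result object per policy, in the same order as the policies. By calling a result's summary method we get some basic statistics.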

market_sim = cp.MarketSimulator(
    returns, [tcost_model, hcost_model], cash_key="USDOLLAR"
)

# Start with 250,000 in each stock and no cash.
init_portfolio = pd.Series(index=returns.columns, data=250000.0)
init_portfolio["USDOLLAR"] = 0

# One backtest per policy; results come back in the same order.
results = market_sim.run_multiple_backtest(
    init_portfolio,
    start_time="2020-01-01",
    end_time="2023-04-01",
    policies=[spo_policy, cp.Hold()],
)
results[0].summary()

This cell raised an error in a worker process when the page was built:

Traceback (most recent call last):
  File "/home/runner/work/cvxportfolio/cvxportfolio/cvxportfolio/simulator.py", line 272, in _run_backtest
    return self.run_backtest(
  File "/home/runner/work/cvxportfolio/cvxportfolio/cvxportfolio/simulator.py", line 231, in run_backtest
    u = policy.get_trades(h, t)
  File "/home/runner/work/cvxportfolio/cvxportfolio/cvxportfolio/policies.py", line 493, in get_trades
    constraints += constr.weight_expr(t, wplus, z, value)
  File "/home/runner/work/cvxportfolio/cvxportfolio/cvxportfolio/constraints.py", line 47, in weight_expr
    result = self.compile_to_cvxpy(wplus, z, v)
NameError: name 'wplus' is not defined
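
The initial portfolio passed to the simulator is expressed in dollar holdings: 250,000 in each of the five stocks and nothing in cash. A short sketch of the corresponding portfolio weights, using the column names from this notebook (the weight calculation itself is a plain normalization, shown here for illustration):

```python
import pandas as pd

columns = ["AMZN", "AAPL", "MSFT", "GOOGL", "TSLA", "USDOLLAR"]

# Dollar holdings, built the same way as in the backtest cell.
h = pd.Series(index=columns, data=250000.0)
h["USDOLLAR"] = 0.0

total_value = h.sum()      # total portfolio value in dollars
weights = h / total_value  # fraction of the total held in each asset
```

The portfolio starts with a total value of 1,250,000 and a weight of 0.2 in each stock.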

We print the summary for the second policy (hold), then plot the total value of each portfolio over time.

results[1].summary()

Number of periods                               818
Initial timestamp               2020-01-02 00:00:00
Final timestamp                 2023-03-31 00:00:00
Portfolio return (%)                         41.343
Excess return (%)                            40.340
Excess risk (%)                              45.013
Sharpe ratio                                  0.897
Max. drawdown                                59.516
Turnover (%)                                  0.000
Average policy time (sec)                     0.000
Average simulator time (sec)                  0.001
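
A statistic such as the maximum drawdown can be recomputed from a portfolio value series (results[1].v in this notebook). The sketch below uses one common definition, the largest decline from a running peak expressed as a percentage of that peak, on a made-up value series. Whether the library uses exactly this formula is an assumption.

```python
import pandas as pd

# Hypothetical portfolio values, for illustration only.
v = pd.Series([100.0, 120.0, 90.0, 110.0, 60.0, 130.0])

running_peak = v.cummax()          # highest value seen so far
drawdown = 1.0 - v / running_peak  # fractional decline from that peak
max_drawdown_pct = 100.0 * drawdown.max()
```

In this example the value falls from a peak of 120 to 60, a drawdown of 50%.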

results[0].v.plot(figsize=(12, 5))
results[1].v.plot(figsize=(12, 5))

[Figure: portfolio value over time]

The portfolio weights over time.

results[0].w.plot(figsize=(12, 6))

[Figure: portfolio weights over time]