title: "Bitcoin Quant Strategies - Classification Methods"
date: 2019-07-20
tags: tech
mathjax: true
In this research I looked at intraday Bitcoin trading based on price and volume information using classification models. A summary of the results on the test set is shown below; the full analysis follows.
| | Strategy | Precision | P&L | Sharpe Ratio |
|---|---|---|---|---|
10 | MLP Classifier | 0.55 | 2.06 | 1.79 |
3 | KNN | 0.50 | 1.95 | 1.85 |
0 | Baseline | 0.00 | 1.69 | 1.14 |
4 | Decision Tree | 0.49 | 1.61 | 1.84 |
5 | Random Forest | 0.53 | 1.55 | 1.16 |
8 | XGBoost | 0.52 | 1.33 | 0.86 |
9 | SVC | 0.47 | 1.31 | 0.73 |
1 | Logistic Regression | 0.47 | 1.14 | 0.47 |
7 | Gradient Boost | 0.48 | 1.06 | 0.33 |
6 | AdaBoost | 0.50 | 0.86 | -0.09 |
2 | Linear Discriminant Analysis | 0.47 | 0.85 | -0.17 |
import warnings
import itertools
import numpy as np
import pandas as pd
from sklearn import tree
from sklearn.svm import SVC
import matplotlib.pyplot as plt
from xgboost import XGBClassifier
import matplotlib.ticker as ticker
from sklearn.decomposition import PCA
from imblearn.over_sampling import SMOTE
from sklearn.metrics import confusion_matrix
from multiprocessing import set_start_method
from sklearn.tree import DecisionTreeClassifier
from IPython.display import display, HTML, Image
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, GridSearchCV
from pandas.plotting import register_matplotlib_converters
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier
register_matplotlib_converters()
warnings.filterwarnings("ignore")
plt.rcParams['font.family'] = "serif"
plt.rcParams['font.serif'] = "DejaVu Serif"
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['figure.dpi'] = 100
plt.rcParams['lines.linewidth'] = 0.75
pd.set_option('display.max_rows', 10)
Below is a small helper function used to display pandas DataFrames as HTML tables in the notebook.
def disp(df, max_rows=10):
return display(HTML(df.to_html(max_rows=max_rows, header=True).replace('<table border="1" class="dataframe">','<table>')))
I got the preliminary Bitcoin data from bitcoincharts. The data includes price and volume information recorded by Bitstamp at one-second resolution. This provides great granularity that can later be aggregated to any desired level.
data = pd.read_csv('bitstampUSD.csv', header=None, names=['time', 'price', 'volume'])
data['time'] = pd.to_datetime(data['time'], unit='s')
data.set_index('time', inplace=True)
Get the 3-month Treasury bill rate (DTB3) from FRED, used later as the risk-free rate.
url = 'https://fred.stlouisfed.org/graph/fredgraph.csv?id=DTB3'
tr = pd.read_csv(url, index_col=0, parse_dates=True)
We first resample the data by hour. Most Bitcoin exchanges nowadays charge transaction fees, which makes retail trading at high frequency impractical, so I leave out the second- and minute-level data and aggregate it into hours. Note that the price is averaged while the volume is summed within each hour.
A two-year data window from 2017-07-01 to 2019-06-30 is used, as this is when Bitcoin and other cryptocurrencies came to the attention of the broader public and, most importantly, started to be heavily traded. The training set should therefore be more representative of any future trading environment. The plot below illustrates the total dollar amount traded per hour over time.
df0 = data.resample('H').agg({'price': np.mean, 'volume': np.sum}).fillna(method='ffill')
plt.plot(df0.volume * df0.price, c='black')
plt.title('Bitcoin Hourly Volume in Dollar Terms')
plt.show()
df1 = data.loc['2017-07-01':'2019-06-30'].resample('H').agg({'price': np.mean,
'volume': np.sum}).fillna(method='ffill')
df2 = tr.loc['2017-07-01':'2019-06-30']
df = df1.join(df2).replace('.', np.NaN).fillna(method='ffill').fillna(method='bfill').rename({'DTB3': 'tr'}, axis=1)
df.tr = df.tr.astype(float)/100
disp(df)
time | price | volume | tr
---|---|---|---
2017-07-01 00:00:00 | 2473.427264 | 200.793669 | 0.0104 |
2017-07-01 01:00:00 | 2463.946180 | 228.853771 | 0.0104 |
2017-07-01 02:00:00 | 2441.314976 | 475.068038 | 0.0104 |
2017-07-01 03:00:00 | 2449.063866 | 177.876034 | 0.0104 |
2017-07-01 04:00:00 | 2453.192311 | 120.916328 | 0.0104 |
... | ... | ... | ... |
2019-06-30 19:00:00 | 11173.875377 | 389.958860 | 0.0208 |
2019-06-30 20:00:00 | 11276.492157 | 372.471619 | 0.0208 |
2019-06-30 21:00:00 | 11340.807808 | 295.522323 | 0.0208 |
2019-06-30 22:00:00 | 11037.539360 | 963.543871 | 0.0208 |
2019-06-30 23:00:00 | 10838.165248 | 1152.810243 | 0.0208 |
plt.plot(df.price, c='black')
plt.title('Bitcoin Price 2017-07-01 to 2019-06-30')
plt.show()
We then create several additional features intended to extract more information from the preceding n-hour windows.
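In formulas, for each window length $i \in \{6, 12, 24, 48, 120\}$ hours and each series $x$ (price or volume), the loop below computes the following rolling features over the previous $i$ observations; the $\sqrt{24/i}$ factor puts the rolling standard deviations of different windows on a comparable daily scale:

$$
\begin{aligned}
\text{change}_{t,i} &= \frac{x_t}{x_{t-i}} - 1, &
\text{high}_{t,i} &= \frac{\max(x_{t-i}, \ldots, x_{t-1})}{x_t}, &
\text{low}_{t,i} &= \frac{\min(x_{t-i}, \ldots, x_{t-1})}{x_t}, \\
\text{avg}_{t,i} &= \frac{\operatorname{mean}(x_{t-i}, \ldots, x_{t-1})}{x_t}, &
\text{std}_{t,i} &= \frac{\operatorname{std}(x_{t-i}, \ldots, x_{t-1})}{x_t}\,\sqrt{\frac{24}{i}}. &&
\end{aligned}
$$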
interval = [6, 12, 24, 48, 120] # 0.25, 0.5, 1, 2, 5 days
for i in interval:
for c in ['price', 'volume']:
df[c+'_change_'+str(i)+'H'] = df[c]/df[c].shift(i)-1
df[c+'_high_'+str(i)+'H'] = df[c].rolling(i).max().shift(1) / df[c]
df[c+'_low_'+str(i)+'H'] = df[c].rolling(i).min().shift(1) / df[c]
df[c+'_avg_'+str(i)+'H'] = df[c].rolling(i).mean().shift(1) / df[c]
df[c+'_std_'+str(i)+'H'] = df[c].rolling(i).std().shift(1) / df[c] * np.sqrt(24/i)
df.dropna(inplace=True)
disp(df.head())
time | price | volume | tr | price_change_6H | price_high_6H | price_low_6H | price_avg_6H | price_std_6H | volume_change_6H | volume_high_6H | volume_low_6H | volume_avg_6H | volume_std_6H | price_change_12H | price_high_12H | price_low_12H | price_avg_12H | price_std_12H | volume_change_12H | volume_high_12H | volume_low_12H | volume_avg_12H | volume_std_12H | price_change_24H | price_high_24H | price_low_24H | price_avg_24H | price_std_24H | volume_change_24H | volume_high_24H | volume_low_24H | volume_avg_24H | volume_std_24H | price_change_48H | price_high_48H | price_low_48H | price_avg_48H | price_std_48H | volume_change_48H | volume_high_48H | volume_low_48H | volume_avg_48H | volume_std_48H | price_change_120H | price_high_120H | price_low_120H | price_avg_120H | price_std_120H | volume_change_120H | volume_high_120H | volume_low_120H | volume_avg_120H | volume_std_120H
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
2017-07-06 00:00:00 | 2607.823311 | 233.619901 | 0.0102 | 0.005149 | 1.001294 | 0.994877 | 0.998116 | 0.005057 | -0.539692 | 2.172459 | 1.271991 | 1.710225 | 0.606196 | 0.019447 | 1.001294 | 0.980924 | 0.991722 | 0.010182 | -0.137448 | 2.604881 | 0.986408 | 1.724699 | 0.659083 | 0.012479 | 1.001294 | 0.973805 | 0.985347 | 0.008630 | -0.708220 | 3.444264 | 0.815879 | 1.940980 | 0.748083 | 0.019561 | 1.008966 | 0.973805 | 0.990540 | 0.006865 | 0.552344 | 4.357035 | 0.644187 | 1.832178 | 0.620839 | 0.054336 | 1.008966 | 0.916032 | 0.966001 | 0.011353 | 0.163482 | 8.713623 | 0.498690 | 1.772706 | 0.491582 |
2017-07-06 01:00:00 | 2592.974565 | 229.561261 | 0.0102 | -0.005044 | 1.007028 | 1.001879 | 1.004691 | 0.004088 | -0.415935 | 1.872772 | 1.017680 | 1.541597 | 0.657009 | 0.011195 | 1.007028 | 0.988929 | 0.999000 | 0.009511 | -0.622775 | 2.650935 | 1.003848 | 1.741677 | 0.698711 | 0.011259 | 1.007028 | 0.979381 | 0.991506 | 0.009179 | -0.707043 | 3.505159 | 0.830303 | 1.872374 | 0.713401 | 0.009083 | 1.014744 | 0.979381 | 0.996615 | 0.006895 | 0.168152 | 4.434067 | 0.830303 | 1.872115 | 0.625493 | 0.052367 | 1.014744 | 0.921278 | 0.971965 | 0.011479 | 0.003091 | 8.867680 | 0.507507 | 1.805239 | 0.499861 |
2017-07-06 02:00:00 | 2595.240970 | 111.498601 | 0.0102 | -0.005029 | 1.006148 | 0.999127 | 1.002969 | 0.005542 | -0.624789 | 3.855797 | 2.058871 | 2.929583 | 1.561810 | 0.009963 | 1.006148 | 0.989687 | 0.999049 | 0.008379 | -0.765640 | 4.551893 | 2.058871 | 3.302634 | 1.296594 | 0.016319 | 1.006148 | 0.978526 | 0.991104 | 0.009312 | -0.861432 | 7.216670 | 1.709488 | 3.647934 | 1.347291 | 0.000808 | 1.013858 | 0.978526 | 0.995932 | 0.006872 | -0.870219 | 9.129174 | 1.709488 | 3.860618 | 1.283035 | 0.063050 | 1.013858 | 0.920474 | 0.971530 | 0.011491 | -0.765300 | 18.257410 | 1.044891 | 3.716808 | 1.029132 |
2017-07-06 03:00:00 | 2601.939179 | 154.465403 | 0.0102 | 0.001575 | 1.003558 | 0.996555 | 0.999547 | 0.005543 | -0.640708 | 2.783251 | 0.721835 | 1.914348 | 1.612821 | 0.013029 | 1.003558 | 0.987139 | 0.997297 | 0.007360 | -0.329707 | 3.285718 | 0.721835 | 2.187443 | 1.098140 | 0.017659 | 1.003558 | 0.976007 | 0.989220 | 0.009328 | -0.719538 | 4.131480 | 0.721835 | 2.446232 | 0.882973 | -0.004151 | 1.011248 | 0.976007 | 0.993385 | 0.006859 | -0.848249 | 6.589761 | 0.721835 | 2.685895 | 0.903309 | 0.062422 | 1.011248 | 0.918104 | 0.969522 | 0.011449 | -0.131612 | 13.178846 | 0.721835 | 2.663309 | 0.746976 |
2017-07-06 04:00:00 | 2594.198903 | 323.934946 | 0.0102 | -0.006510 | 1.006553 | 0.999528 | 1.002792 | 0.005453 | -0.093037 | 1.273228 | 0.344201 | 0.771118 | 0.713993 | 0.006701 | 1.006553 | 0.993343 | 1.001348 | 0.005868 | -0.345152 | 1.566764 | 0.344201 | 1.023516 | 0.558260 | 0.021535 | 1.006553 | 0.978919 | 0.992897 | 0.009496 | -0.492401 | 1.970058 | 0.344201 | 1.115490 | 0.427613 | -0.005803 | 1.014265 | 0.978919 | 0.996261 | 0.006822 | 0.102404 | 2.598252 | 0.344201 | 1.225214 | 0.392388 | 0.057479 | 1.014265 | 0.920843 | 0.972906 | 0.011490 | 1.679001 | 6.284212 | 0.344201 | 1.269372 | 0.356447 |
Due to the large number of features created in the last step, we use PCA to reduce the dimensionality of the data. The price, volume, and T-bill rate columns are kept as-is, while the engineered features are replaced by 15 principal components. Since we mostly care about predictive accuracy, we are fine with losing some interpretability in the PCA process.
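As a reminder (not specific to this dataset), the explained variance ratio that scikit-learn reports for component $k$ is

$$
\text{EVR}_k = \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j},
$$

where $\lambda_k$ is the $k$-th largest eigenvalue of the covariance matrix of the standardized features and $p$ is the number of features. As the output below shows, the 15 retained components account for roughly 93% of the total variance.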
X = StandardScaler().fit_transform(df.iloc[:, 3:])
comp = 15
pca = PCA(n_components=comp)
X_pca = pca.fit_transform(X)
np.round(pca.explained_variance_ratio_, 2)
array([0.29, 0.23, 0.13, 0.06, 0.05, 0.03, 0.03, 0.02, 0.02, 0.02, 0.01,
0.01, 0.01, 0.01, 0.01])
np.round(np.sum(pca.explained_variance_ratio_), 2)
0.93
df_pca = pd.DataFrame(X_pca, index=df.index, columns = ['PC' + str(i) for i in range(1, comp+1)])
df = pd.DataFrame(df.iloc[:, 0:3]).join(df_pca)
disp(df.head())
time | price | volume | tr | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | PC11 | PC12 | PC13 | PC14 | PC15
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
2017-07-06 00:00:00 | 2607.823311 | 233.619901 | 0.0102 | 1.204847 | -1.442388 | -0.936078 | 0.562964 | -1.802252 | -0.126177 | -2.115852 | 0.258446 | 0.547159 | -0.005956 | -0.345683 | -0.077257 | -0.389290 | -0.253596 | 0.501094 |
2017-07-06 01:00:00 | 2592.974565 | 229.561261 | 0.0102 | 1.072486 | -0.509851 | -1.208146 | 1.100969 | -1.991122 | 0.106322 | -1.995090 | 0.089313 | 0.690748 | -0.003055 | -0.152291 | -0.266696 | -0.403753 | -0.415290 | 0.602009 |
2017-07-06 02:00:00 | 2595.240970 | 111.498601 | 0.0102 | 6.313332 | 0.286169 | -0.274636 | 1.772318 | -3.465762 | 0.816421 | -5.245687 | 0.737407 | 1.652572 | 0.043234 | 0.118171 | -0.658779 | -0.606447 | -0.557087 | 0.499793 |
2017-07-06 03:00:00 | 2601.939179 | 154.465403 | 0.0102 | 1.986983 | -0.820551 | -1.195466 | 0.697308 | -1.574198 | 0.289253 | -1.100313 | 0.593288 | 0.655685 | -0.005791 | -0.018679 | -0.152047 | -0.354950 | -0.493463 | 0.640942 |
2017-07-06 04:00:00 | 2594.198903 | 323.934946 | 0.0102 | -1.350755 | -0.873287 | -1.805126 | 0.663435 | -0.972641 | 0.036059 | -0.009574 | 0.000981 | 0.200082 | 0.315963 | -0.225084 | -0.120908 | -0.308701 | -0.390018 | 0.597231 |
Train and test sets are created for modeling purposes. Since this is time series data, the split is not randomized. Rather, the train and test sets are chosen such that each includes both a market upturn and a downturn.
train = df.loc['2017-07-01':'2018-06-30']
test = df.loc['2018-07-01':'2019-06-30']
Here we specify some modeling parameters. The trading frequency is set to one hour.
trade_interval = '1H'
trade_interval_min = 60
ann_factor = 24 * 365
training_threshold = 0.0075
transaction_fee = 0.0025
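The annualization factor of $24 \times 365 = 8760$ hourly periods is used later to annualize the Sharpe ratio. Roughly, with hourly strategy returns $r_t$ and the annual T-bill rate $r^f_t$, the computation in `run_model` below amounts to

$$
\text{Sharpe} \approx \frac{\overline{r_t - r^f_t / 8760}}{\operatorname{std}(r_t)} \cdot \sqrt{8760}.
$$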
Create a model engine that fits the training data and uses grid-search cross-validation to tune over the parameter grid. A long trade is executed only if the model has predicted a next-5-day up move in each of the last 24 consecutive hours. This limits the trading frequency, which reduces the impact of the relatively large per-trade transaction fee. The training threshold is set to 75 bps, meaning the model is trained to identify a potential up move of more than 75 bps over the next 5 days.
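To make the rule precise: with hourly prediction $\hat{y}_t \in \{0, 1\}$ and price $P_t$, the engine below uses

$$
\begin{aligned}
y_t &= \mathbf{1}\left\{ \tfrac{P_{t+120}}{P_t} > 1.0075 \right\} && \text{(training label: >75 bps up move over the next 120 hours)} \\
\text{ind}_t &= \mathbf{1}\left\{ \hat{y}_{t-23} = \cdots = \hat{y}_t = 1 \right\} && \text{(24 consecutive hours of positive predictions)} \\
\text{buy}_t &= \mathbf{1}\left\{ \textstyle\sum_{j=0}^{119} \text{ind}_{t-j} > 0 \right\} && \text{(stay long while a signal occurred within the past 120 hours)}
\end{aligned}
$$

and a 25 bps transaction fee is applied in any hour where $\text{ind}_t$ changes.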
def run_model(Model, model_name, param, param_init, param_grid, search=False):
# prepare data
train_copy = train.resample(trade_interval).first()
test_copy = test.resample(trade_interval).first()
train_copy.dropna(inplace=True)
test_copy.dropna(inplace=True)
indicator = 24 # hr
offset = 120 # hr
X_train = train_copy.iloc[:-offset, 3:]
Y_train = (train_copy.price.shift(-offset)/train_copy.price)[:-offset] > (1 + training_threshold)
X_test = test_copy.iloc[:-offset, 3:]
Y_test = (test_copy.price.shift(-offset)/test_copy.price)[:-offset] > (1 + training_threshold)
# run model
if search:
model = GridSearchCV(estimator=Model(**param_init),
cv=KFold(n_splits=5, random_state=0),
scoring='precision',
param_grid=param_grid).fit(X_train, Y_train)
print(f'cv precision: {round(model.best_score_, 2)}, best param: {model.best_params_}')
else:
model = Model(**param).fit(X_train, Y_train)
Y_pred = model.predict(X_test)
cm = confusion_matrix(Y_test, Y_pred)
precision = round(cm[1][1]/(cm[1][1] + cm[0][1]), 2)
# calculate pnl
# test_copy['ind'] = np.append(Y_pred, False)
# test_copy['pnl'] = test_copy.ind * (test_copy.price.shift(-1) / test_copy.price)
test_copy['pred'] = np.append(Y_pred, [False] * offset)
test_copy['ind'] = test_copy.pred.rolling(indicator).sum() == indicator
test_copy['buy'] = test_copy.ind.rolling(offset).sum() > 0
test_copy['pnl'] = test_copy.buy * (test_copy.price.shift(-1) / test_copy.price)
test_copy.pnl.replace(0, 1, inplace=True)
test_copy.dropna(inplace=True)
test_copy['fee'] = np.where(test_copy.ind != test_copy.ind.shift(1), 1-transaction_fee, 1)
test_copy.pnl *= test_copy.fee
test_pnl = round(test_copy.pnl.cumprod()[-1], 2)
test_spr = round(np.mean(test_copy.pnl - 1 - test_copy.tr/(ann_factor))
/ (test_copy.pnl - 1).std() * np.sqrt(ann_factor), 2)
print(f'test precision: {precision}; pnl: {test_pnl}, spr: {test_spr}')
return test_copy.pnl, precision, test_pnl, test_spr
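For reference, this is a minimal sketch of how the engine can be invoked with fixed, previously tuned parameters instead of a grid search (all the runs below use search=True; the hypothetical call here just reuses the logistic regression settings for illustration).
# Hypothetical invocation: skip the grid search and fit directly with `param` (search=False)
pnl_series, precision, total_pnl, sharpe = run_model(
    LogisticRegression, 'Logistic Regression',
    param={'class_weight': 'balanced', 'solver': 'liblinear',
           'random_state': 0, 'C': 0.001, 'penalty': 'l2'},
    param_init={}, param_grid={}, search=False)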
baseline = test.resample(trade_interval).first()
baseline['pnl'] = baseline.price / baseline.price.shift(1) - 1
baseline_pnl = round(baseline.price.iloc[-1] / baseline.price.iloc[0], 2)
baseline_spr = round(np.mean(baseline.pnl - baseline.tr/(ann_factor))
/ baseline.pnl.std() * np.sqrt(ann_factor), 2)
result = test[['price']].copy().rename({'price': 'Baseline'}, axis=1)/test.price.iloc[0]*1000
comp = pd.DataFrame({'Strategy': 'Baseline', 'Precision': 'NA',
'P&L': baseline_pnl, 'Sharpe Ratio': baseline_spr}, index=[0])
print(f'test precision: {np.NaN}; pnl: {baseline_pnl}, spr: {baseline_spr}')
test precision: nan; pnl: 1.69, spr: 1.14
Model = LogisticRegression
model_name = 'Logistic Regression'
param = {'class_weight':'balanced',
'solver':'liblinear',
'random_state': 0,
'C': 0.001,
'penalty': 'l2'}
param_init = {'random_state': 0, 'class_weight': 'balanced'}
param_grid = {'C': [1e-2, 1e-1, 1, 10], 'penalty': ['l1', 'l2']}
a, b, c, d = run_model(Model, model_name, param, param_init, param_grid, search=True)
result = result.join(a.cumprod() * 1000).rename({'pnl': model_name}, axis=1).dropna()
comp = comp.append({'Strategy': model_name, 'Precision': b, 'P&L': c, 'Sharpe Ratio': d}, ignore_index=True)
cv precision: 0.52, best param: {'C': 0.01, 'penalty': 'l1'}
test precision: 0.47; pnl: 1.14, spr: 0.47
Model = LinearDiscriminantAnalysis
model_name = 'Linear Discriminant Analysis'
param = {'solver': 'svd', 'n_components': None}
param_init = {}
param_grid = {'solver': ['svd', 'lsqr'], 'n_components': [None, 5, 10, 25]}
a, b, c, d = run_model(Model, model_name, param, param_init, param_grid, search=True)
result = result.join(a.cumprod() * 1000).rename({'pnl': model_name}, axis=1).dropna()
comp = comp.append({'Strategy': model_name, 'Precision': b, 'P&L': c, 'Sharpe Ratio': d}, ignore_index=True)
cv precision: 0.5, best param: {'n_components': None, 'solver': 'svd'}
test precision: 0.47; pnl: 0.85, spr: -0.17
Model = KNeighborsClassifier
model_name = 'KNN'
param = {'p': 2, 'leaf_size': 2, 'n_neighbors': 100}
param_init = {'p': 2}
param_grid = {'n_neighbors': [5, 25, 100], 'leaf_size': [2, 25, 100]}
a, b, c, d = run_model(Model, model_name, param, param_init, param_grid, search=True)
result = result.join(a.cumprod() * 1000).rename({'pnl': model_name}, axis=1).dropna()
comp = comp.append({'Strategy': model_name, 'Precision': b, 'P&L': c, 'Sharpe Ratio': d}, ignore_index=True)
cv precision: 0.53, best param: {'leaf_size': 2, 'n_neighbors': 100}
test precision: 0.5; pnl: 1.95, spr: 1.85
Model = DecisionTreeClassifier
model_name = 'Decision Tree'
param = {'random_state':0, 'criterion': 'gini', 'max_depth': None, 'max_features': 10}
param_init = {'random_state':0}
param_grid = {'criterion': ['gini', 'entropy'],
'max_depth': [None, 5, 10, 25],
'max_features': [None, 'auto', 5, 10]}
a, b, c, d = run_model(Model, model_name, param, param_init, param_grid, search=True)
result = result.join(a.cumprod() * 1000).rename({'pnl': model_name}, axis=1).dropna()
comp = comp.append({'Strategy': model_name, 'Precision': b, 'P&L': c, 'Sharpe Ratio': d}, ignore_index=True)
cv precision: 0.53, best param: {'criterion': 'gini', 'max_depth': 10, 'max_features': 5}
test precision: 0.49; pnl: 1.61, spr: 1.84
Model = RandomForestClassifier
model_name = 'Random Forest'
param = {'class_weight':'balanced',
'random_state': 0,
'criterion': 'entropy',
'max_depth': 10,
'max_features': 5,
'n_estimators': 200}
param_init = {'class_weight':'balanced', 'random_state': 0, 'n_estimators': 200}
param_grid = {'criterion': ['gini', 'entropy'],
'max_depth': [5, 25],
'max_features': [5, 10]}
a, b, c, d = run_model(Model, model_name, param, param_init, param_grid, search=True)
result = result.join(a.cumprod() * 1000).rename({'pnl': model_name}, axis=1).dropna()
comp = comp.append({'Strategy': model_name, 'Precision': b, 'P&L': c, 'Sharpe Ratio': d}, ignore_index=True)
cv precision: 0.53, best param: {'criterion': 'gini', 'max_depth': 5, 'max_features': 10}
test precision: 0.53; pnl: 1.55, spr: 1.16
Model = AdaBoostClassifier
model_name = 'AdaBoost'
param = {'random_state': 0, 'algorithm': 'SAMME.R', 'n_estimators': 200, 'learning_rate': 0.1}
param_init = {'random_state': 0, 'algorithm': 'SAMME.R', 'n_estimators': 200}
param_grid = {'learning_rate': [1e-2, 1e-1, 1, 10]}
a, b, c, d = run_model(Model, model_name, param, param_init, param_grid, search=True)
result = result.join(a.cumprod() * 1000).rename({'pnl': model_name}, axis=1).dropna()
comp = comp.append({'Strategy': model_name, 'Precision': b, 'P&L': c, 'Sharpe Ratio': d}, ignore_index=True)
cv precision: 0.52, best param: {'learning_rate': 0.01}
test precision: 0.5; pnl: 0.86, spr: -0.09
Model = GradientBoostingClassifier
model_name = 'Gradient Boost'
param = {'random_state': 0, 'warm_start': True, 'n_estimators': 200,
'max_depth': 10,
'max_features': 10,
'learning_rate': 0.1}
param_init = {'random_state': 0, 'warm_start': True, 'n_estimators': 200,}
param_grid = {'max_depth': [5, 25],
'max_features': [5, 10],
'learning_rate': [1e-2, 1e-1, 1, 10]}
a, b, c, d = run_model(Model, model_name, param, param_init, param_grid, search=True)
result = result.join(a.cumprod() * 1000).rename({'pnl': model_name}, axis=1).dropna()
comp = comp.append({'Strategy': model_name, 'Precision': b, 'P&L': c, 'Sharpe Ratio': d}, ignore_index=True)
cv precision: 0.54, best param: {'learning_rate': 10, 'max_depth': 5, 'max_features': 5}
test precision: 0.48; pnl: 1.06, spr: 0.33
set_start_method('forkserver', force=True) # set the multiprocessing start method for parallel fitting
Model = XGBClassifier
model_name = 'XGBoost'
param = {'n_jobs':4, 'seed':0, 'n_estimators': 200,
'max_depth': 5,
'min_child_weight': 1,
'gamma': 10,
'learning_rate': 0.01}
param_init = {'n_jobs':4, 'seed':0, 'n_estimators': 200}
param_grid = {'max_depth': [5, 25],
'min_child_weight': [1, 5],
'gamma': [1e-2, 1e-1, 1, 10],
'learning_rate': [1e-2, 1e-1, 1, 10]}
a, b, c, d = run_model(Model, model_name, param, param_init, param_grid, search=True)
result = result.join(a.cumprod() * 1000).rename({'pnl': model_name}, axis=1).dropna()
comp = comp.append({'Strategy': model_name, 'Precision': b, 'P&L': c, 'Sharpe Ratio': d}, ignore_index=True)
cv precision: 0.52, best param: {'gamma': 10, 'learning_rate': 0.01, 'max_depth': 25, 'min_child_weight': 1}
test precision: 0.52; pnl: 1.33, spr: 0.86
Model = SVC
model_name = 'SVC'
param = {'probability':True, 'class_weight':'balanced', 'C': 10, 'gamma': 0.01}
param_init = {'probability':True, 'class_weight':'balanced'}
param_grid = {'C': [1e-2, 1e-1, 1, 10],
'gamma': [1e-2, 1e-1, 1, 10]}
a, b, c, d = run_model(Model, model_name, param, param_init, param_grid, search=True)
result = result.join(a.cumprod() * 1000).rename({'pnl': model_name}, axis=1).dropna()
comp = comp.append({'Strategy': model_name, 'Precision': b, 'P&L': c, 'Sharpe Ratio': d}, ignore_index=True)
cv precision: 0.64, best param: {'C': 0.1, 'gamma': 1}
test precision: 0.47; pnl: 1.31, spr: 0.73
Model = MLPClassifier
model_name = 'MLP Classifier'
param = {'random_state': 0, 'hidden_layer_sizes': (25, 25), 'alpha': 0.01}
param_init = {'random_state': 0}
param_grid = {# 'hidden_layer_sizes': [x for x in itertools.product((5, 25, 100),repeat=2)],
'alpha' : [1e-2, 1e-1, 1, 10],
# 'activation' : ['identity', 'logistic', 'tanh', 'relu'],
# 'solver' : ['lbfgs', 'sgd', 'adam'],
# 'learning_rate' : ['constant', 'invscaling', 'adaptive'],
# 'max_itr' : [100, 200, 1000]
}
a, b, c, d = run_model(Model, model_name, param, param_init, param_grid, search=True)
result = result.join(a.cumprod() * 1000).rename({'pnl': model_name}, axis=1).dropna()
comp = comp.append({'Strategy': model_name, 'Precision': b, 'P&L': c, 'Sharpe Ratio': d}, ignore_index=True)
cv precision: 0.53, best param: {'alpha': 1}
test precision: 0.55; pnl: 2.06, spr: 1.79
The results are summarized as follows.
disp(comp.replace('NA', 0).sort_values('P&L', ascending=False), 20)
| | Strategy | Precision | P&L | Sharpe Ratio |
|---|---|---|---|---|
10 | MLP Classifier | 0.55 | 2.06 | 1.79 |
3 | KNN | 0.50 | 1.95 | 1.85 |
0 | Baseline | 0.00 | 1.69 | 1.14 |
4 | Decision Tree | 0.49 | 1.61 | 1.84 |
5 | Random Forest | 0.53 | 1.55 | 1.16 |
8 | XGBoost | 0.52 | 1.33 | 0.86 |
9 | SVC | 0.47 | 1.31 | 0.73 |
1 | Logistic Regression | 0.47 | 1.14 | 0.47 |
7 | Gradient Boost | 0.48 | 1.06 | 0.33 |
6 | AdaBoost | 0.50 | 0.86 | -0.09 |
2 | Linear Discriminant Analysis | 0.47 | 0.85 | -0.17 |
Finally, we plot the cumulative return of each strategy, with transaction fees reflected.
plt.plot(result)
plt.legend(result.columns, frameon=False)
plt.xticks(rotation=30)
plt.gca().xaxis.set_major_locator(ticker.MultipleLocator(30))
plt.ylabel('Cumulative Value Based on $1000 Investment')
plt.show()