# Chapter 18 Solutions to exercises

## 18.1 Chapter 3

For annual values, see Figure 18.1:

```
data_ml %>%
  group_by(date) %>%
  mutate(growth = Pb > median(Pb)) %>%               # Creates the sort
  ungroup() %>%                                      # Ungroup
  mutate(year = lubridate::year(date)) %>%           # Creates a year variable
  group_by(year, growth) %>%                         # Analyze by year & sort
  summarize(ret = mean(R1M_Usd)) %>%                 # Compute average return
  ggplot(aes(x = year, y = ret, fill = growth)) +
  geom_col(position = "dodge") +                     # Plot!
  theme(legend.position = c(0.7, 0.8))
```

For monthly values, see Figure 18.2:

```
returns_m <- data_ml %>%
  group_by(date) %>%
  mutate(growth = Pb > median(Pb)) %>%               # Creates the sort
  group_by(date, growth) %>%                         # Analyze by date & sort
  summarize(ret = mean(R1M_Usd)) %>%                 # Compute average return
  spread(key = growth, value = ret) %>%              # Pivot to wide matrix format
  ungroup()
colnames(returns_m)[2:3] <- c("value", "growth")     # Changing column names
returns_m %>%
  mutate(value = cumprod(1 + value),                 # From returns to portf. values
         growth = cumprod(1 + growth)) %>%
  gather(key = portfolio, value = value, -date) %>%  # Back in tidy format
  ggplot(aes(x = date, y = value, color = portfolio)) + geom_line() +  # Plot!
  theme(legend.position = c(0.7, 0.8))
```

Next, we build portfolios based on quartiles, using the tidyverse only. We rely heavily on the fact that the features are uniformized, i.e., that their distribution is uniform for each given date. Overall, small firms outperform heavily (see Figure 18.3).

```
data_ml %>%
  mutate(small = Mkt_Cap_6M_Usd <= 0.25,             # Small firms...
         medium = Mkt_Cap_6M_Usd > 0.25 & Mkt_Cap_6M_Usd <= 0.5,
         large = Mkt_Cap_6M_Usd > 0.5 & Mkt_Cap_6M_Usd <= 0.75,
         xl = Mkt_Cap_6M_Usd > 0.75,                 # ...Xlarge firms
         year = year(date)) %>%
  group_by(year) %>%
  summarize(small = mean(small * R1M_Usd),           # Compute avg returns
            medium = mean(medium * R1M_Usd),
            large = mean(large * R1M_Usd),
            xl = mean(xl * R1M_Usd)) %>%
  gather(key = size, value = return, -year) %>%
  ggplot(aes(x = year, y = return, fill = size)) + geom_col(position = "dodge")
```
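The quartile bucketing above works only because each feature has been mapped to ranks rescaled into (0, 1]. Below is a minimal Python sketch of such a rank-uniformization, for illustration only; the book's actual helper is the R function *norm_unif*.

```python
# Illustrative sketch: map one cross-section of a feature to ranks in (0, 1].
def norm_unif(values):
    """Rank-uniformize a list of numbers (no tie handling, for simplicity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r / len(values)
    return ranks

caps = [3.2, 0.4, 7.7, 1.5]    # Raw market caps for one date
u = norm_unif(caps)            # Uniformized scores
# After the transform, "small" firms are exactly those with u <= 0.25, etc.
```

This is why the hard-coded thresholds 0.25, 0.5 and 0.75 split the cross-section into quartiles regardless of the raw scale of the feature.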

## 18.2 Chapter 4

Below, we import a credit spread supplied by Bank of America. Its symbol/ticker is “BAMLC0A0CM.” We apply the data expansion on the small number of predictors to save memory space. One important trick that should not be overlooked is the uniformization step after the product (4.3) is computed. Indeed, we want the new features to have the same properties as the old ones. If we skip this step, distributions will be altered, as we show in one example below.

We start with the data extraction and joining. It’s important to join early so as to keep the highest data frequency (daily) and replace missing points with the **most recent past values**. Joining with monthly data before replacing the missing points creates unnecessary lags.

```
getSymbols.FRED("BAMLC0A0CM", # Extract data
env = ".GlobalEnv",
return.class = "xts")
```

`## [1] "BAMLC0A0CM"`

```
cred_spread <- fortify(BAMLC0A0CM)                   # Transform to dataframe
colnames(cred_spread) <- c("date", "spread")         # Change column name
cred_spread <- cred_spread %>%                       # Take extraction and...
  full_join(data_ml %>% dplyr::select(date), by = "date") %>%  # Join!
  mutate(spread = na.locf(spread))                   # Replace NA by previous value
cred_spread <- cred_spread[!duplicated(cred_spread),]  # Remove duplicates
```
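The ordering argument (fill first, sample month-ends second) can be seen with a toy carry-forward example; this Python sketch is illustrative and not the book's code.

```python
# Toy illustration: fill missing values on the DAILY series before sampling
# month-ends, so each month-end carries the freshest available value.
def locf(series):
    """Last observation carried forward: replace None with the latest value."""
    out, last = [], None
    for v in series:
        last = v if v is not None else last
        out.append(last)
    return out

daily = [1.0, 1.1, 1.2, 1.3, None, None]  # Two 3-day "months"; 2nd month-end missing
fill_then_sample = [locf(daily)[2], locf(daily)[5]]  # Month-ends after daily fill
sample_then_fill = locf([daily[2], daily[5]])        # Month-ends filled afterwards
```

The first route yields a value that is at most two days stale; the second falls back a whole month.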

The creation of the augmented dataset requires some manipulation. Features are no longer uniform as is shown in Figure 18.4.

```
data_cond <- data_ml %>%                             # Create new dataset
  dplyr::select(c("stock_id", "date", features_short))
names_cred_spread <- paste0(features_short, "_cred_spread")  # New column names
feat_cred_spread <- data_cond %>%                    # Old values
  dplyr::select(features_short)
cred_spread <- data_ml %>%                           # Create vector of spreads
  dplyr::select(date) %>%
  left_join(cred_spread, by = "date")
feat_cred_spread <- feat_cred_spread *               # This product creates...
  matrix(cred_spread$spread,                         # the new values...
         length(cred_spread$spread),                 # using duplicated...
         length(features_short))                     # columns
colnames(feat_cred_spread) <- names_cred_spread      # New column names
data_cond <- bind_cols(data_cond, feat_cred_spread)  # Aggregate old & new
data_cond %>% ggplot(aes(x = Eps_cred_spread)) + geom_histogram()  # Plot example
```

To prevent this issue, uniformization is required and is verified in Figure 18.5.

```
data_cond <- data_cond %>%                           # From new dataset
  group_by(date) %>%                                 # Group by date and...
  mutate_at(names_cred_spread, norm_unif)            # Uniformize the new features
data_cond %>% ggplot(aes(x = Eps_cred_spread)) + geom_histogram(bins = 100)  # Verification
```

The second question naturally requires the downloading of VIX series first and the joining with the original data.

```
getSymbols.FRED("VIXCLS", # Extract data
env = ".GlobalEnv",
return.class = "xts")
```

`## [1] "VIXCLS"`

```
vix <- fortify(VIXCLS)                               # Transform to dataframe
colnames(vix) <- c("date", "vix")                    # Change column name
vix <- vix %>%                                       # Take extraction and...
  full_join(data_ml %>% dplyr::select(date), by = "date") %>%  # Join!
  mutate(vix = na.locf(vix))                         # Replace NA by previous value
vix <- vix[!duplicated(vix),]                        # Remove duplicates
vix <- data_ml %>%                                   # Keep original data format
  dplyr::select(date) %>%                            # ...
  left_join(vix, by = "date")                        # Via left_join()
```

We can then proceed with the categorization. We create the label vector in a new (smaller) dataset rather than attaching it to the large data_ml variable. We also check the balance of the labels and its evolution through time (see Figure 18.6).

```
delta <- 0.5                                         # Magnitude of vix correction
vix_bar <- median(vix$vix)                           # Median of vix
data_vix <- data_ml %>%                              # Smaller dataset
  dplyr::select(stock_id, date, R1M_Usd) %>%
  mutate(r_minus = (-0.02) * exp(-delta*(vix$vix-vix_bar)),  # r_-
         r_plus = 0.02 * exp(delta*(vix$vix-vix_bar)))       # r_+
data_vix <- data_vix %>%
  mutate(R1M_Usd_Cvix = if_else(R1M_Usd < r_minus, -1,       # New label!
                                if_else(R1M_Usd > r_plus, 1, 0)),
         R1M_Usd_Cvix = as.factor(R1M_Usd_Cvix))
data_vix %>%
  mutate(year = year(date)) %>%
  group_by(year, R1M_Usd_Cvix) %>%
  summarize(nb = n()) %>%
  ggplot(aes(x = year, y = nb, fill = R1M_Usd_Cvix)) + geom_col()
```
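The mechanics of the VIX-dependent thresholds can be sketched in Python (same delta and 2% base as the R code above; the VIX levels are illustrative):

```python
import math

def cvix_label(r, vix, vix_bar, delta=0.5, base=0.02):
    """3-class label with VIX-dependent thresholds, mirroring the R code above."""
    r_minus = -base * math.exp(-delta * (vix - vix_bar))
    r_plus  =  base * math.exp( delta * (vix - vix_bar))
    return -1 if r < r_minus else (1 if r > r_plus else 0)

# A +3% month is a "buy" label in calm markets but unremarkable when the VIX is
# high: the positive threshold grows with the VIX (and the negative one tightens).
calm  = cvix_label(0.03, vix=12, vix_bar=18)
tense = cvix_label(0.03, vix=30, vix_bar=18)
```

This is what produces the shifting class balance visible across years in Figure 18.6.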

Finally, we switch to the outliers (Figure 18.7).

```
data_ml %>%
  ggplot(aes(x = R12M_Usd)) + geom_histogram()
```

Returns above 50 should indeed be rare.

`data_ml %>% filter(R12M_Usd > 50) %>% dplyr::select(stock_id, date, R12M_Usd)`

```
## # A tibble: 8 x 3
##   stock_id date       R12M_Usd
##      <int> <date>        <dbl>
## 1      212 2000-12-31     53.0
## 2      221 2008-12-31     53.5
## 3      221 2009-01-31     55.2
## 4      221 2009-02-28     54.8
## 5      296 2002-06-30     72.2
## 6      683 2009-02-28     96.0
## 7      683 2009-03-31     64.8
## 8      862 2009-02-28     58.0

The largest return comes from stock #683. Let’s have a look at the stream of monthly returns in 2009.

```
data_ml %>%
  filter(stock_id == 683, year(date) == 2009) %>%
  dplyr::select(date, R1M_Usd)
```

```
## # A tibble: 12 x 2
##    date       R1M_Usd
##    <date>       <dbl>
##  1 2009-01-31  -0.625
##  2 2009-02-28   0.472
##  3 2009-03-31   1.44
##  4 2009-04-30   0.139
##  5 2009-05-31   0.086
##  6 2009-06-30   0.185
##  7 2009-07-31   0.363
##  8 2009-08-31   0.103
##  9 2009-09-30   9.91
## 10 2009-10-31   0.101
## 11 2009-11-30   0.202
## 12 2009-12-31  -0.251

Most of the monthly returns are very high (notably +144% in March and +991% in September), so the annual value is plausible. In addition, a quick glance at the Vol1Y values shows that the stock is the most volatile of the dataset.

## 18.3 Chapter 5

We recycle the training and testing data variables created in the chapter (coding section notably). In addition, we create a dedicated function and resort to the *map2*() function from the *purrr* package.

```
alpha_seq <- (0:10)/10                               # Sequence of alpha values
lambda_seq <- 0.1^(0:5)                              # Sequence of lambda values
pars <- expand.grid(alpha_seq, lambda_seq)           # Exploring all combinations!
alpha_seq <- pars[,1]
lambda_seq <- pars[,2]
lasso_sens <- function(alpha, lambda, x_train, y_train, x_test, y_test){  # Function
  fit_temp <- glmnet(x_train, y_train,               # Model
                     alpha = alpha, lambda = lambda)
  return(sqrt(mean((predict(fit_temp, x_test) - y_test)^2)))  # Output
}
rmse_elas <- map2(alpha_seq, lambda_seq, lasso_sens, # Automation
                  x_train = x_penalized_train, y_train = y_penalized_train,
                  x_test = x_penalized_test, y_test = testing_sample$R1M_Usd)
bind_cols(alpha = alpha_seq, lambda = as.factor(lambda_seq), rmse = unlist(rmse_elas)) %>%
  ggplot(aes(x = alpha, y = rmse, fill = lambda)) + geom_col() + facet_grid(lambda ~ .) +
  coord_cartesian(ylim = c(0.19, 0.193))
```
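The expand.grid-plus-map2 pattern above generalizes to any two-parameter search: build every pair once, then map a single evaluation function over the pairs. A language-agnostic Python sketch (the evaluation function is a placeholder, not the glmnet fit):

```python
from itertools import product

# Same grid as above: 11 alpha values x 6 lambda values.
alphas  = [i / 10 for i in range(11)]
lambdas = [0.1 ** k for k in range(6)]
pairs   = list(product(alphas, lambdas))     # All combinations, built once

def evaluate(pair):
    alpha, lam = pair
    return (alpha, lam, alpha + lam)         # Stand-in for the RMSE of a fit

results = list(map(evaluate, pairs))         # One result per combination
```

Keeping the grid explicit makes it trivial to reshape the results for the faceted plot.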

As is outlined in Figure 18.8, the parameters have a very marginal impact. Maybe the model is not a good fit for the task.

## 18.4 Chapter 6

```
fit1 <- rpart(formula,
              data = training_sample,                # Data source: full sample
              cp = 0.001)                            # Precision: smaller = more leaves
mean((predict(fit1, testing_sample) - testing_sample$R1M_Usd)^2)
```

`## [1] 0.04018973`

```
fit2 <- rpart(formula,
              data = training_sample,                # Data source: full sample
              cp = 0.01)                             # Precision: smaller = more leaves
mean((predict(fit2, testing_sample) - testing_sample$R1M_Usd)^2)  # Test!
```

`## [1] 0.03699696`

`rpart.plot(fit1) # Plot the first tree`

The first model (Figure 18.9) is **too** precise: going into the details of the training sample does not translate into good performance out-of-sample. The second, simpler model yields better results.

```
n_trees <- c(10, 20, 40, 80, 160)
mse_RF <- 0
for(j in 1:length(n_trees)){                         # No need for functional programming here...
  fit_temp <- randomForest(
    as.formula(paste("R1M_Usd ~", paste(features_short, collapse = " + "))),  # New formula!
    data = training_sample,                          # Data source: training sample
    sampsize = 30000,                                # Size of (random) sample for each tree
    replace = TRUE,                                  # Is the sampling done with replacement?
    ntree = n_trees[j],                              # Nb of random trees
    mtry = 5)                                        # Nb of predictors for each tree
  mse_RF[j] <- mean((predict(fit_temp, testing_sample) - testing_sample$R1M_Usd)^2)
}
mse_RF
```

`## [1] 0.03967754 0.03885924 0.03766900 0.03696370 0.03699772`

The trees are random by construction, so results can vary from test to test. Overall, large numbers of trees are preferable because each new tree tells a new story and diversifies the risk of the whole forest. More technical reasons why this is the case are outlined in the original paper by Breiman (2001).
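The diversification argument can be illustrated numerically: averaging many noisy, unbiased predictors shrinks the dispersion of the aggregate forecast. This is a toy sketch with independent "trees"; in a real forest the trees are correlated, which caps the benefit (and is why mtry injects extra randomness).

```python
import random, statistics

random.seed(0)
TRUE_SIGNAL = 0.05

def predictor():
    """One 'tree': an unbiased but noisy estimate of the signal."""
    return TRUE_SIGNAL + random.gauss(0, 0.1)

singles   = [predictor() for _ in range(300)]                     # Lone trees
ensembles = [sum(predictor() for _ in range(50)) / 50             # 50-tree forests
             for _ in range(300)]

spread_single   = statistics.stdev(singles)     # Dispersion of one tree
spread_ensemble = statistics.stdev(ensembles)   # Dispersion of the average
```

With independent trees the variance of the average falls like 1/B; correlation between trees puts a floor on this reduction.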

For the last exercises, we recycle the *formula* used in Chapter 6.

```
tree_2008 <- rpart(formula,
                   data = data_ml %>% filter(year(date) == 2008),  # Data source: 2008
                   cp = 0.001,
                   maxdepth = 2)
rpart.plot(tree_2008)
```

The first splitting criterion in Figure 18.10 is enterprise value (EV). EV is an indicator that adjusts market capitalization by adding debt and subtracting cash, which gives a more faithful account of the true value of a company. In 2008, the companies that fared the least poorly were those with the highest EV (i.e., large, robust firms).

```
tree_2009 <- rpart(formula,
                   data = data_ml %>% filter(year(date) == 2009), # Data source: 2009
                   cp = 0.001,
                   maxdepth = 2)
rpart.plot(tree_2009)
```

In 2009 (Figure 18.11), the firms that recovered the fastest were those that experienced high volatility in the past (likely, downwards volatility). Momentum is also very important: the firms with the lowest past returns are those that rebound the fastest. This is a typical example of the momentum crash phenomenon studied in Barroso and Santa-Clara (2015) and K. Daniel and Moskowitz (2016). The rationale is the following: after a market downturn, the stocks with the most potential for growth are those that have suffered the largest losses. Consequently, the negative (short) leg of the momentum factor performs very well, often better than the long leg. And indeed, being long in the momentum factor in 2009 would have generated negative profits.

## 18.5 Chapter 7: the autoencoder model & universal approximation

First, it is imperative to format the inputs properly. To avoid any issues, we work with perfectly rectangular data and hence restrict the investment set to the stocks with no missing points. Dimensions must also be in the correct order.

```
data_short <- data_ml %>%                            # Shorter dataset
  filter(stock_id %in% stock_ids_short) %>%
  dplyr::select(c("stock_id", "date", features_short, "R1M_Usd"))
dates <- unique(data_short$date)                     # Vector of dates

N <- length(stock_ids_short)                         # Dimension for assets
Tt <- length(dates)                                  # Dimension for dates
K <- length(features_short)                          # Dimension for features

factor_data <- data_short %>%                        # Factor side data
  dplyr::select(date, stock_id, R1M_Usd) %>%
  spread(key = stock_id, value = R1M_Usd) %>%
  dplyr::select(-date) %>%
  as.matrix()
beta_data <- array(unlist(data_short %>%             # Beta side data: beware the permutation below!
                            dplyr::select(-stock_id, -date, -R1M_Usd)),
                   dim = c(N, Tt, K))
beta_data <- aperm(beta_data, c(2,1,3))              # Permutation
```

Next, we turn to the specification of the network, using a functional API form.

```
main_input <- layer_input(shape = c(N), name = "main_input")   # Main input: returns
factor_network <- main_input %>%                     # Def of factor side network
  layer_dense(units = 8, activation = "relu", name = "layer_1_r") %>%
  layer_dense(units = 4, activation = "tanh", name = "layer_2_r")
aux_input <- layer_input(shape = c(N,K), name = "aux_input")   # Aux input: characteristics
beta_network <- aux_input %>%                        # Def of beta side network
  layer_dense(units = 8, activation = "relu", name = "layer_1_l") %>%
  layer_dense(units = 4, activation = "tanh", name = "layer_2_l") %>%
  layer_permute(dims = c(2,1), name = "layer_3_l")   # Permutation!
main_output <- layer_dot(c(beta_network, factor_network),      # Product of 2 networks
                         axes = 1, name = "main_output")
model_ae <- keras_model(                             # AE model specs
  inputs = c(main_input, aux_input),
  outputs = c(main_output)
)
```
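The final layer_dot implements a bilinear model: for each date, fitted returns are the product of an N x 4 matrix of characteristic-driven betas and a 4 x 1 vector of latent factor returns. A minimal Python sketch with N = 3 assets and purely illustrative numbers:

```python
# Plain matrix-vector product: fitted returns = betas x latent factor returns.
def matvec(B, f):
    return [sum(b * x for b, x in zip(row, f)) for row in B]

betas   = [[0.5, 0.1, 0.0, 0.2],    # Asset 1 loadings on the 4 latent factors
           [0.3, 0.0, 0.4, 0.1],    # Asset 2
           [0.0, 0.2, 0.1, 0.6]]    # Asset 3
factors = [0.02, -0.01, 0.03, 0.01] # Latent factor returns for one date

fitted = matvec(betas, factors)     # One fitted return per asset
```

In the network, the beta side produces the loadings from characteristics and the factor side compresses the return cross-section into the four factors; the dot product reassembles the panel.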

Finally, we ask for the structure of the model, and train it.

`summary(model_ae) # See model details / architecture`

```
## Model: "model_5"
## __________________________________________________________________________________________
## Layer (type) Output Shape Param # Connected to
## ==========================================================================================
## aux_input (InputLayer) [(None, 793, 7)] 0
## __________________________________________________________________________________________
## layer_1_l (Dense) (None, 793, 8) 64 aux_input[0][0]
## __________________________________________________________________________________________
## main_input (InputLayer) [(None, 793)] 0
## __________________________________________________________________________________________
## layer_2_l (Dense) (None, 793, 4) 36 layer_1_l[0][0]
## __________________________________________________________________________________________
## layer_1_r (Dense) (None, 8) 6352 main_input[0][0]
## __________________________________________________________________________________________
## layer_3_l (Permute) (None, 4, 793) 0 layer_2_l[0][0]
## __________________________________________________________________________________________
## layer_2_r (Dense) (None, 4) 36 layer_1_r[0][0]
## __________________________________________________________________________________________
## main_output (Dot) (None, 793) 0 layer_3_l[0][0]
## layer_2_r[0][0]
## ==========================================================================================
## Total params: 6,488
## Trainable params: 6,488
## Non-trainable params: 0
## __________________________________________________________________________________________
```

```
model_ae %>% compile(                                # Learning parameters
  optimizer = "rmsprop",
  loss = "mse"
)
model_ae %>% fit(                                    # Learning function
  x = list(main_input = factor_data, aux_input = beta_data),
  y = list(main_output = factor_data),
  epochs = 20,                                       # Nb rounds
  batch_size = 49                                    # Nb obs. per round
)
```

For the second exercise, we use a simple architecture. The activation function, number of epochs and batch size may matter…

```
model_ua <- keras_model_sequential()
model_ua %>%      # This defines the structure of the network, i.e., how layers are organized
  layer_dense(units = 16, activation = 'sigmoid', input_shape = 1) %>%
  layer_dense(units = 1)
model_ua %>% compile(                                # Model specification
  loss = 'mean_squared_error',                       # Loss function
  optimizer = optimizer_rmsprop(),                   # Optimisation method (weight updating)
  metrics = c('mean_absolute_error')                 # Output metric
)
summary(model_ua)                                    # A simple model!
```

```
## Model: "sequential_15"
## __________________________________________________________________________________________
## Layer (type) Output Shape Param #
## ==========================================================================================
## dense_48 (Dense) (None, 16) 32
## __________________________________________________________________________________________
## dense_47 (Dense) (None, 1) 17
## ==========================================================================================
## Total params: 49
## Trainable params: 49
## Non-trainable params: 0
## __________________________________________________________________________________________
```

```
fit_ua <- model_ua %>%
  fit(seq(0, 6, by = 0.001) %>% matrix(ncol = 1),        # Training data = x
      sin(seq(0, 6, by = 0.001)) %>% matrix(ncol = 1),   # Training label = y
      epochs = 30, batch_size = 64                       # Training parameters
  )
```

In full disclosure, to improve the fit, we also increase the sample size. We show the improvement in the figure below.

```
model_ua2 <- keras_model_sequential()
model_ua2 %>%     # This defines the structure of the network, i.e., how layers are organized
  layer_dense(units = 128, activation = 'sigmoid', input_shape = 1) %>%
  layer_dense(units = 1)
model_ua2 %>% compile(                               # Model specification
  loss = 'mean_squared_error',                       # Loss function
  optimizer = optimizer_rmsprop(),                   # Optimisation method (weight updating)
  metrics = c('mean_absolute_error')                 # Output metric
)
summary(model_ua2)                                   # A simple model!
```

```
## Model: "sequential_8"
## __________________________________________________________________________________________
## Layer (type) Output Shape Param #
## ==========================================================================================
## dense_24 (Dense) (None, 128) 256
## __________________________________________________________________________________________
## dense_23 (Dense) (None, 1) 129
## ==========================================================================================
## Total params: 385
## Trainable params: 385
## Non-trainable params: 0
## __________________________________________________________________________________________
```

```
fit_ua2 <- model_ua2 %>%
  fit(seq(0, 6, by = 0.0002) %>% matrix(ncol = 1),       # Training data = x
      sin(seq(0, 6, by = 0.0002)) %>% matrix(ncol = 1),  # Training label = y
      epochs = 60, batch_size = 64                       # Training parameters
  )
x <- seq(0, 6, by = 0.001)                               # Grid over which the fits are plotted
tibble(x = x) %>%
  ggplot() +
  geom_line(aes(x = x, y = predict(model_ua, x), color = "Small model")) +
  geom_line(aes(x = x, y = predict(model_ua2, x), color = "Large model")) +
  stat_function(fun = sin, aes(color = "sin(x) function")) +
  scale_color_manual(values = c("#9999FF", "#333399", "#000000"))
```

## 18.6 Chapter 8

Since we are going to reproduce a similar analysis several times, let’s simplify the task with three tricks: first, by using default parameter values that are passed as common arguments to the *svm* function; second, by creating a custom function that computes the MSE; third, by resorting to functional programming via the *map*() function from the *purrr* package. Below, we recycle datasets created in Chapter 6.

```
mse <- function(fit, features, label){               # MSE function
  return(mean((predict(fit, features) - label)^2))
}
par_list <- list(y = train_label_xgb[1:10000],       # From Tree chapter
                 x = train_features_xgb[1:10000,],
                 type = "eps-regression",
                 epsilon = 0.1,                      # Width of strip for errors
                 gamma = 0.5,                        # Constant in the radial kernel
                 cost = 0.1)
svm_par <- function(kernel, par_list){               # Function for SVM fit automation
  require(e1071)
  return(do.call(svm, c(kernel = kernel, par_list)))
}
kernels <- c("linear", "radial", "polynomial", "sigmoid")  # Kernels
fit_svm_par <- map(kernels, svm_par, par_list = par_list)  # SVM models
map(fit_svm_par, mse,                                # MSEs
    features = test_feat_short,                      # From SVM chapter
    label = testing_sample$R1M_Usd)
```

```
## [[1]]
## [1] 0.03849786
##
## [[2]]
## [1] 0.03924576
##
## [[3]]
## [1] 0.03951328
##
## [[4]]
## [1] 334.8173
```

The first two kernels yield the best fit, while the last one should be avoided. Note that apart from the linear kernel, all other options require parameters. We have used the default ones, which may explain the poor performance of some nonlinear kernels.
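The shared-defaults pattern used above translates directly to other languages: fix the common arguments once, then map over the one argument that varies. A Python sketch with a placeholder fitting function (not the e1071 API):

```python
from functools import partial

def fit_model(kernel, epsilon=0.1, gamma=0.5, cost=0.1):
    """Placeholder 'fit': records the configuration it would train with."""
    return {"kernel": kernel, "epsilon": epsilon, "gamma": gamma, "cost": cost}

fit = partial(fit_model, epsilon=0.1, cost=0.1)   # Common arguments fixed once
kernels = ["linear", "radial", "polynomial", "sigmoid"]
models = list(map(fit, kernels))                  # One model per kernel
```

Only the kernel changes across runs, so typos in the shared parameters cannot creep in between fits.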

Below, we train an SVM model on a training sample with all observations but that is limited to the 7 major predictors. Even with a smaller number of features, the training is time consuming.

```
svm_full <- svm(y = train_label_xgb,                 # Train label
                x = train_features_xgb,              # Training features
                type = "eps-regression",             # SVM task type (see LIBSVM documentation)
                kernel = "linear",                   # SVM kernel
                epsilon = 0.1,                       # Width of strip for errors
                cost = 0.1)                          # Slack variable penalisation
test_feat_short <- dplyr::select(testing_sample, features_short)  # Test set
mean(predict(svm_full, test_feat_short) * testing_sample$R1M_Usd > 0)  # Hit ratio
```

`## [1] 0.490343`

This figure is very low: it lies below the 50% that a random guess would deliver on average. Below, we test a very simple form of boosted trees for comparison purposes.
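For reference, the hit ratio used in this comparison is simply the share of observations where the prediction and the realized return have the same sign; a minimal sketch with made-up numbers:

```python
def hit_ratio(pred, real):
    """Share of observations where prediction and realization share a sign."""
    return sum(p * r > 0 for p, r in zip(pred, real)) / len(pred)

pred = [0.01, -0.02,  0.03, -0.01]   # Forecasts
real = [0.04,  0.01, -0.05, -0.02]   # Realized returns
hr = hit_ratio(pred, real)           # 2 hits out of 4
```

A coin flip scores roughly 0.5 on this metric, which is why a 0.49 hit ratio is disappointing.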

```
xgb_full <- xgb.train(data = train_matrix_xgb,       # Data source
                      eta = 0.3,                     # Learning rate
                      objective = "reg:linear",      # Objective function
                      max_depth = 4,                 # Maximum depth of trees
                      nrounds = 60                   # Number of trees used (a bit low here)
)
```

`## [14:43:24] WARNING: amalgamation/../src/objective/regression_obj.cu:174: reg:linear is now deprecated in favor of reg:squarederror.`

`mean(predict(xgb_full, xgb_test) * testing_sample$R1M_Usd > 0) # Hit ratio`

`## [1] 0.5017377`

The forecasts are slightly better, and the computation time is much lower. Two reasons why the models perform poorly:

- there are not enough predictors;

- the models are static: they do not adjust dynamically to macro-conditions.

## 18.7 Chapter 11: ensemble neural network

First, we create the three feature sets. The first one gets all multiples of 3 between 3 and 93. The second one gets the same indices, minus one, and the third one, the initial indices minus two.

```
feat_train_1 <- training_sample %>% dplyr::select(features[3*(1:31)]) %>%    # First set of feats
  as.matrix()
feat_train_2 <- training_sample %>% dplyr::select(features[3*(1:31)-1]) %>%  # Second set of feats
  as.matrix()
feat_train_3 <- training_sample %>% dplyr::select(features[3*(1:31)-2]) %>%  # Third set of feats
  as.matrix()
feat_test_1 <- testing_sample %>% dplyr::select(features[3*(1:31)]) %>%      # Test features 1
  as.matrix()
feat_test_2 <- testing_sample %>% dplyr::select(features[3*(1:31)-1]) %>%    # Test features 2
  as.matrix()
feat_test_3 <- testing_sample %>% dplyr::select(features[3*(1:31)-2]) %>%    # Test features 3
  as.matrix()
```
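As a quick check, the three index sets used above are disjoint and together cover the first 93 features exactly; a Python sketch:

```python
# Multiples of 3, multiples of 3 minus one, multiples of 3 minus two.
set_1 = {3 * i for i in range(1, 32)}        # 3, 6, ..., 93
set_2 = {3 * i - 1 for i in range(1, 32)}    # 2, 5, ..., 92
set_3 = {3 * i - 2 for i in range(1, 32)}    # 1, 4, ..., 91

full = set_1 | set_2 | set_3                 # Union of the three sets
```

Each sub-network therefore sees 31 features, with no feature shared between them.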

Then, we specify the network structure. First, the 3 independent networks, then the aggregation.

```
first_input <- layer_input(shape = c(31), name = "first_input")    # First input
first_network <- first_input %>%                     # Def of 1st network
  layer_dense(units = 8, activation = "relu", name = "layer_1") %>%
  layer_dense(units = 2, activation = 'softmax')     # Softmax for categ. output
second_input <- layer_input(shape = c(31), name = "second_input")  # Second input
second_network <- second_input %>%                   # Def of 2nd network
  layer_dense(units = 8, activation = "relu", name = "layer_2") %>%
  layer_dense(units = 2, activation = 'softmax')     # Softmax for categ. output
third_input <- layer_input(shape = c(31), name = "third_input")    # Third input
third_network <- third_input %>%                     # Def of 3rd network
  layer_dense(units = 8, activation = "relu", name = "layer_3") %>%
  layer_dense(units = 2, activation = 'softmax')     # Softmax for categ. output
main_output <- layer_concatenate(c(first_network,
                                   second_network,
                                   third_network)) %>%  # Combination
  layer_dense(units = 2, activation = 'softmax', name = 'main_output')
model_ens <- keras_model(                            # Agg. model specs
  inputs = c(first_input, second_input, third_input),
  outputs = c(main_output)
)
```

Lastly, we can train and evaluate (see Figure 18.13).

`summary(model_ens) # See model details / architecture`

```
## Model: "model_6"
## __________________________________________________________________________________________
## Layer (type) Output Shape Param # Connected to
## ==========================================================================================
## first_input (InputLayer) [(None, 31)] 0
## __________________________________________________________________________________________
## second_input (InputLayer) [(None, 31)] 0
## __________________________________________________________________________________________
## third_input (InputLayer) [(None, 31)] 0
## __________________________________________________________________________________________
## layer_1 (Dense) (None, 8) 256 first_input[0][0]
## __________________________________________________________________________________________
## layer_2 (Dense) (None, 8) 256 second_input[0][0]
## __________________________________________________________________________________________
## layer_3 (Dense) (None, 8) 256 third_input[0][0]
## __________________________________________________________________________________________
## dense_49 (Dense) (None, 2) 18 layer_1[0][0]
## __________________________________________________________________________________________
## dense_50 (Dense) (None, 2) 18 layer_2[0][0]
## __________________________________________________________________________________________
## dense_51 (Dense) (None, 2) 18 layer_3[0][0]
## __________________________________________________________________________________________
## concatenate_1 (Concatenate) (None, 6) 0 dense_49[0][0]
## dense_50[0][0]
## dense_51[0][0]
## __________________________________________________________________________________________
## main_output (Dense) (None, 2) 14 concatenate_1[0][0]
## ==========================================================================================
## Total params: 836
## Trainable params: 836
## Non-trainable params: 0
## __________________________________________________________________________________________
```

```
model_ens %>% compile(                               # Learning parameters
  optimizer = optimizer_adam(),
  loss = "binary_crossentropy",
  metrics = "categorical_accuracy"
)
fit_NN_ens <- model_ens %>% fit(                     # Learning function
  x = list(first_input = feat_train_1,
           second_input = feat_train_2,
           third_input = feat_train_3),
  y = list(main_output = NN_train_labels_C),         # Recycled from NN Chapter
  epochs = 12,                                       # Nb rounds
  batch_size = 512,                                  # Nb obs. per round
  validation_data = list(list(feat_test_1, feat_test_2, feat_test_3),
                         NN_test_labels_C)
)
plot(fit_NN_ens)
```

## 18.8 Chapter 12

### 18.8.1 EW portfolios with the tidyverse

This one is incredibly easy; it’s simpler and more compact but close in spirit to the code that generates Figure 3.1. The returns are plotted in Figure 18.14.

```
data_ml %>%
  group_by(date) %>%                                 # Group by date
  summarize(return = mean(R1M_Usd)) %>%              # Compute return
  ggplot(aes(x = date, y = return)) + geom_point() + geom_line()  # Plot
```

### 18.8.2 Advanced weighting function

First, we code the function with all inputs.

```
weights <- function(Sigma, mu, Lambda, lambda, k_D, k_R, w_old){
  N <- nrow(Sigma)
  M <- solve(lambda*Sigma + 2*k_R*Lambda + 2*k_D*diag(N))  # Inverse matrix
  num <- 1 - sum(M %*% (mu + 2*k_R*Lambda %*% w_old))      # eta numerator
  den <- sum(M %*% rep(1,N))                               # eta denominator
  eta <- num / den                                         # eta
  vec <- mu + eta * rep(1,N) + 2*k_R*Lambda %*% w_old      # Vector in weight
  return(M %*% vec)
}
```
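A useful sanity check on the function above: \(\eta\) is chosen precisely so that the resulting weights sum to one. The two-asset Python sketch below verifies this budget constraint with a hand-coded 2x2 inverse, \(\boldsymbol{\Lambda} = \textbf{I}\), and purely illustrative inputs:

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def weights2(Sigma, mu, lam, k_D, k_R, w_old):
    # M = (lambda*Sigma + 2*k_R*Lambda + 2*k_D*I)^(-1), with Lambda = I here
    A = [[lam * Sigma[0][0] + 2 * k_R + 2 * k_D, lam * Sigma[0][1]],
         [lam * Sigma[1][0], lam * Sigma[1][1] + 2 * k_R + 2 * k_D]]
    M = inv2(A)
    base = [mu[i] + 2 * k_R * w_old[i] for i in range(2)]  # mu + 2*k_R*Lambda*w_old
    Mb = [M[i][0] * base[0] + M[i][1] * base[1] for i in range(2)]
    eta = (1 - sum(Mb)) / sum(sum(row) for row in M)       # Enforces sum(w) = 1
    vec = [base[i] + eta for i in range(2)]
    return [M[i][0] * vec[0] + M[i][1] * vec[1] for i in range(2)]

w = weights2(Sigma=[[0.04, 0.01], [0.01, 0.09]], mu=[0.05, 0.02],
             lam=1.0, k_D=1.0, k_R=1.0, w_old=[0.5, 0.5])
# Whatever the inputs, the weights sum to one by construction of eta.
```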

Second, we test it on some random dataset. We use the returns created at the end of Chapter 1 and used for the Lasso allocation in Section 5.2.2. For \(\boldsymbol{\mu}\), we use the sample average, which is rarely a good idea in practice. It serves as illustration only.

```
Sigma <- returns %>% dplyr::select(-date) %>% as.matrix() %>% cov()  # Covariance matrix
mu <- returns %>% dplyr::select(-date) %>% apply(2, mean)            # Vector of exp. returns
Lambda <- diag(nrow(Sigma))                          # Trans. cost matrix
lambda <- 1                                          # Risk aversion
k_D <- 1
k_R <- 1
w_old <- rep(1, nrow(Sigma)) / nrow(Sigma)           # Prev. weights: EW
weights(Sigma, mu, Lambda, lambda, k_D, k_R, w_old) %>% head()  # First weights
```

```
## [,1]
## 1 0.0031339308
## 3 -0.0003243527
## 4 0.0011944677
## 7 0.0014194215
## 9 0.0015086240
## 11 -0.0005015207
```

Some weights can of course be negative. Finally, we use the map2() function to test some sensitivity. We examine 3 key indicators:

- **diversification**, which we measure via the inverse of the sum of squared weights (inverse Herfindahl-Hirschman index);

- **leverage**, which we assess via the absolute sum of negative weights;

- **in-sample volatility**, which we compute as \(\textbf{w}' \boldsymbol{\Sigma} \textbf{w}\).

To do so, we create a dedicated function below.

```
sensi <- function(lambda, k_D, Sigma, mu, Lambda, k_R, w_old){
  w <- weights(Sigma, mu, Lambda, lambda, k_D, k_R, w_old)
  out <- c()
  out$div <- 1/sum(w^2)                              # Diversification
  out$lev <- sum(abs(w[w<0]))                        # Leverage
  out$vol <- t(w) %*% Sigma %*% w                    # In-sample vol
  return(out)
}
```
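The three indicators have simple closed forms. A Python sketch (illustrative covariance matrix) confirms, for instance, that an equally weighted two-asset portfolio reaches the maximum inverse HHI (the number of assets) with zero leverage:

```python
def indicators(w, Sigma):
    """Diversification (inverse HHI), leverage, and in-sample variance w' Sigma w."""
    div = 1 / sum(x * x for x in w)                           # Inverse Herfindahl-Hirschman
    lev = sum(-x for x in w if x < 0)                         # Total short positions
    vol = sum(w[i] * Sigma[i][j] * w[j]
              for i in range(len(w)) for j in range(len(w)))  # w' Sigma w
    return div, lev, vol

Sigma = [[0.04, 0.01], [0.01, 0.09]]        # Illustrative 2x2 covariance matrix
div, lev, vol = indicators([0.5, 0.5], Sigma)
```

Any deviation from equal weights lowers the inverse HHI, and any short position raises leverage above zero.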

Instead of using the baseline *map2* function, we rely on a version thereof that concatenates results into a dataframe directly.

```
lambda <- 10^(-3:2)                                  # Parameter values
k_D <- 2*10^(-3:2)                                   # Parameter values
pars <- expand_grid(lambda, k_D)                     # Parameter grid
lambda <- pars$lambda
k_D <- pars$k_D

res <- map2_dfr(lambda, k_D, sensi,
                Sigma = Sigma, mu = mu, Lambda = Lambda, k_R = k_R, w_old = w_old)
bind_cols(lambda = as.factor(lambda), k_D = as.factor(k_D), res) %>%
  gather(key = indicator, value = value, -lambda, -k_D) %>%
  ggplot(aes(x = lambda, y = value, fill = k_D)) + geom_col(position = "dodge") +
  facet_grid(indicator ~ ., scales = "free")
```

In Figure 18.15, each panel displays an indicator. In the first panel, we see that diversification increases with \(k_D\): indeed, as this number increases, the portfolio converges to uniform (EW) weights. The parameter \(\lambda\) has a minor impact. The second panel naturally shows the inverse effect for leverage: as diversification increases with \(k_D\), leverage (i.e., the total of negative positions, or short sales) decreases. Finally, the last panel shows that in-sample volatility is largely driven by the risk aversion parameter: as \(\lambda\) increases, volatility logically decreases. For small values of \(\lambda\), \(k_D\) is negatively related to volatility, but the pattern reverses for large values of \(\lambda\). This is because the equally weighted portfolio is less risky than very leveraged mean-variance policies, but riskier than the minimum-variance portfolio.

### 18.8.3 Functional programming in the backtest

Often, programmers prefer to avoid loops. In order to avoid a loop in the backtest, we need to code what happens for one given date. This is encapsulated in the following function. For simplicity, we code it for only one strategy. Also, the function will assume the structure of the data is known, but the columns (features & labels) could also be passed as arguments. We recycle the function **weights_xgb** from Chapter 12.

```
portf_map <- function(t, data_ml, ticks, t_oos, m_offset, train_size, weight_func){
  train_data <- data_ml %>% filter(date < t_oos[t] - m_offset * 30,   # Rolling window w. buffer
                                   date > t_oos[t] - m_offset * 30 - 365 * train_size)
  test_data <- data_ml %>% filter(date == t_oos[t])  # Test set
  realized_returns <- test_data %>%                  # Computing returns via:
    dplyr::select(R1M_Usd)                           # 1M holding period!
  temp_weights <- weight_func(train_data, test_data, features)  # Weights => recycled!
  ind <- match(temp_weights$names, ticks) %>% na.omit()  # Index of test assets
  x <- c()
  x$weights <- rep(0, length(ticks))                 # Empty weights
  x$weights[ind] <- temp_weights$weights             # Locate weights correctly
  x$returns <- sum(temp_weights$weights * realized_returns)  # Compute returns
  return(x)
}
```

Next, we pass this function to **map**(). We only test the first three out-of-sample dates, which keeps computation times low.

```
back_test <- 1:3 %>%                  # Test on the first 3 out-of-sample dates
    map(portf_map, data_ml = data_ml, ticks = ticks, t_oos = t_oos,
        m_offset = 1, train_size = 5, weight_func = weights_xgb)
```

```
## [14:43:55] WARNING: amalgamation/../src/objective/regression_obj.cu:174: reg:linear is now deprecated in favor of reg:squarederror.
## [14:44:04] WARNING: amalgamation/../src/objective/regression_obj.cu:174: reg:linear is now deprecated in favor of reg:squarederror.
## [14:44:14] WARNING: amalgamation/../src/objective/regression_obj.cu:174: reg:linear is now deprecated in favor of reg:squarederror.
```

`head(back_test[[1]]$weights) # Sample weights`

`## [1] 0.001675042 0.000000000 0.000000000 0.001675042 0.000000000 0.001675042`

`back_test[[1]]$returns # Return of first period`

`## [1] 0.0189129`

Each element of **back_test** is a list with two components: the portfolio weights and the portfolio return. To access these data easily, functions like *melt* from the *reshape2* package are useful.
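For readers outside R, the same flattening ("melting") idea can be sketched in a few lines of Python. The `back_test` entries below are made-up stand-ins for the weights/returns pairs produced above, not actual backtest output:

```python
# Toy stand-in for the backtest output: one dict per out-of-sample date,
# each holding the portfolio weights and the realized portfolio return.
back_test = [
    {"weights": [0.5, 0.5, 0.0], "returns": 0.019},
    {"weights": [0.0, 0.6, 0.4], "returns": -0.004},
]

# "Melt" the nested structure into tidy (date, asset, weight, return) rows.
tidy = [
    {"date": t, "asset": i, "weight": w, "portf_return": res["returns"]}
    for t, res in enumerate(back_test, start=1)
    for i, w in enumerate(res["weights"])
]

returns = [res["returns"] for res in back_test]  # the return series alone
print(len(tidy), returns)
```

Once in this long format, the rows can be filtered or aggregated by date or by asset, which is exactly what *melt* delivers on the R side.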

## 18.9 Chapter 15

We recycle the AE model trained in Chapter 15. Strangely, building a smaller model (the encoder) from a larger one (the AE) requires saving and then reloading the weights. This creates an external file, which we call “ae_weights.” We can check that the output does have 4 columns (compressed) instead of 7 (original data).

```
save_model_weights_hdf5(object = ae_model, filepath = "ae_weights.hdf5", overwrite = TRUE)
encoder_model <- keras_model(inputs = input_layer, outputs = encoder)
encoder_model %>%
    load_model_weights_hdf5(filepath = "ae_weights.hdf5", skip_mismatch = TRUE, by_name = TRUE)
encoder_model %>% compile(
    loss = 'mean_squared_error',
    optimizer = 'adam',
    metrics = c('mean_absolute_error')
)
encoder_model %>%
    keras::predict_on_batch(x = training_sample %>%
                                dplyr::select(features_short) %>%
                                as.matrix()) %>%
    head(5)
```

```
## [,1] [,2] [,3] [,4]
## [1,] -1.676068 0.2152901 0.1490787 0.8586778
## [2,] -1.684428 0.1919250 0.1985757 0.8494947
## [3,] -1.686019 0.1909507 0.1556696 0.8760668
## [4,] -1.689391 0.1893837 0.1492719 0.8799014
## [5,] -1.694696 0.1924362 0.1314638 0.8783331
```
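The underlying idea is simply that the encoder is the first half of the autoencoder: once training is done, the first layers and their weights define a standalone map from the 7 original features to the 4 codes. A minimal NumPy sketch of this idea (the weight values below are random placeholders rather than trained parameters, and the tanh activation is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for the trained AE's first (encoding) layer: 7 features -> 4 codes.
W_enc = rng.normal(size=(7, 4))   # encoding weights (would be loaded from the AE)
b_enc = rng.normal(size=4)        # encoding biases

def encoder(x):
    """Standalone encoder reusing only the AE's first-layer parameters."""
    return np.tanh(x @ W_enc + b_enc)

X = rng.normal(size=(5, 7))       # 5 sample rows with 7 features
codes = encoder(X)
print(codes.shape)                # 5 rows, compressed from 7 to 4 columns
```

The save/reload dance in the Keras code above achieves exactly this transfer: only the layers that exist in both models (matched by name) receive weights.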

## 18.10 Chapter 16

All we need to do is change the rho coefficient in the code of Chapter 16.

```
set.seed(42)                                                  # Fixing the random seed
n_sample <- 10^5                                              # Number of samples generated
rho <- (-0.8)                                                 # Autoregressive parameter
sd <- 0.4                                                     # Std. dev. of noise
a <- 0.06 * rho                                               # Scaled mean of returns
data_RL3 <- tibble(returns = a/rho + arima.sim(n = n_sample,  # Returns via AR(1) simulation
                                               list(ar = rho),
                                               sd = sd),
                   action = round(runif(n_sample)*4)/4) %>%   # Random action (portfolio)
    mutate(new_state = if_else(returns < 0, "neg", "pos"),    # Coding of state
           reward = returns * action,                         # Reward = portfolio return
           state = lag(new_state),                            # Current state = lagged new state
           action = as.character(action)) %>%
    na.omit()                                                 # Remove one missing state
```

The learning can then proceed.

```
control <- list(alpha = 0.1,                  # Learning rate
                gamma = 0.7,                  # Discount factor for rewards
                epsilon = 0.1)                # Exploration rate
fit_RL3 <- ReinforcementLearning(data_RL3,    # Main RL function
                                 s = "state",
                                 a = "action",
                                 r = "reward",
                                 s_new = "new_state",
                                 control = control)
print(fit_RL3)                                # Show the output
```

```
## State-Action function Q
## 0.25 0 1 0.75 0.5
## neg 0.7107268 0.5971710 1.4662416 0.9535698 0.8069591
## pos 0.7730842 0.7869229 0.4734467 0.4258593 0.6257039
##
## Policy
## neg pos
## "1" "0"
##
## Reward (last iteration)
## [1] 3013.162
```

In this case, the strongly mean-reverting (constantly switching) return process changes the outcome. The negative state is associated with large profits when the portfolio is fully invested, while in the positive state the best average reward is obtained when the agent refrains from investing.
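This result can be replicated with the bare tabular Q-learning update, \(Q(s,a) \leftarrow Q(s,a) + \alpha\,(r + \gamma \max_{a'} Q(s',a') - Q(s,a))\), outside any package. Below is a self-contained Python sketch on the same strongly mean-reverting AR(1) returns; it is a deliberate simplification of the exercise (only two actions, 0 and 1, and purely random exploration):

```python
import random

random.seed(42)

alpha, gamma = 0.1, 0.7                 # learning rate & discount, as in `control`
actions = (0.0, 1.0)                    # stay out vs. fully invested
Q = {(s, a): 0.0 for s in ("neg", "pos") for a in actions}

rho, sd = -0.8, 0.4                     # strong mean reversion, as in the exercise
r, state = 0.0, "pos"
for _ in range(100_000):
    r_next = rho * r + random.gauss(0.0, sd)        # AR(1) return
    a = random.choice(actions)                      # pure random exploration
    reward = a * r_next                             # reward = portfolio return
    new_state = "neg" if r_next < 0 else "pos"
    best_next = max(Q[(new_state, b)] for b in actions)
    Q[(state, a)] += alpha * (reward + gamma * best_next - Q[(state, a)])
    r, state = r_next, new_state

# Greedy policy: with rho < 0, a negative return tends to be followed by a
# positive one, so the agent should invest in the "neg" state only.
policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in ("neg", "pos")}
print(policy)
```

The sign of \(\rho\) is all that matters here: flipping it back to a positive value recovers the momentum-style policy of Chapter 16.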

For the second exercise, the trick is to define all possible actions, that is, all combinations of (−1, 0, +1) for the two assets on all dates. We recycle the data from Chapter 16.

```
pos_3 <- c(-1,0,1)                                      # Possible alloc. to asset 3
pos_4 <- c(-1,0,1)                                      # Possible alloc. to asset 4
pos <- expand_grid(pos_3, pos_4)                        # All combinations
pos <- bind_cols(pos, id = 1:nrow(pos))                 # Adding combination id

ret_pb_RL <- bind_cols(r3 = return_3, r4 = return_4,    # Returns & P/B dataframe
                       pb3 = pb_3, pb4 = pb_4)
data_RL4 <- sapply(ret_pb_RL,                           # Combining return & positions
                   rep.int, times = nrow(pos)) %>%
    data.frame() %>%
    bind_cols(id = rep(1:nrow(pos), 1, each = length(return_3))) %>%
    left_join(pos) %>% dplyr::select(-id) %>%
    mutate(action = paste(pos_3, pos_4),                # Uniting actions
           pb3 = round(5 * pb3),                        # Simplifying states
           pb4 = round(5 * pb4),                        # Simplifying states
           state = paste(pb3, pb4),                     # Uniting states
           reward = pos_3*r3 + pos_4*r4,                # Computing rewards
           new_state = lead(state)) %>%                 # Infer new state
    dplyr::select(-pb3, -pb4, -pos_3,                   # Remove superfluous vars.
                  -pos_4, -r3, -r4)
```
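As a quick cross-check of the combinatorics above, the joint action set can be enumerated directly; in this Python sketch, `itertools.product` plays the role of `expand_grid`:

```python
from itertools import product

positions = (-1, 0, 1)                        # possible allocation to each asset
actions = list(product(positions, repeat=2))  # all (asset_3, asset_4) pairs
print(len(actions), actions[0], actions[-1])  # 3 x 3 = 9 joint actions
```

With two assets and three allocation levels there are \(3^2 = 9\) joint actions, which matches the nine columns of the Q-matrix below.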

We can then plug this data into the RL function.

```
fit_RL4 <- ReinforcementLearning(data_RL4,        # Main RL function
                                 s = "state",
                                 a = "action",
                                 r = "reward",
                                 s_new = "new_state",
                                 control = control)
fit_RL4$Q <- round(fit_RL4$Q, 3)                  # Round the Q-matrix
print(fit_RL4)                                    # Show the output
```

```
## State-Action function Q
## 0 0 0 1 0 -1 -1 -1 -1 0 -1 1 1 -1 1 0 1 1
## 0 2 0.000 0.000 0.002 -0.017 -0.018 -0.020 0.023 0.025 0.024
## 0 3 0.001 -0.005 0.007 -0.013 -0.019 -0.026 0.031 0.027 0.021
## 3 1 0.003 0.003 0.003 0.002 0.002 0.003 0.002 0.002 0.003
## 2 1 0.027 0.038 0.020 0.004 0.015 0.039 0.013 0.021 0.041
## 2 2 0.021 0.014 0.027 0.038 0.047 0.045 -0.004 -0.011 -0.016
## 2 3 0.007 0.006 0.008 0.054 0.057 0.056 -0.041 -0.041 -0.041
## 1 1 0.027 0.054 0.005 -0.031 -0.005 0.041 0.025 0.046 0.072
## 1 2 0.019 0.020 0.020 0.015 0.023 0.029 0.012 0.014 0.023
## 1 3 0.008 0.019 0.000 -0.036 -0.027 -0.016 0.042 0.053 0.060
##
## Policy
## 0 2 0 3 3 1 2 1 2 2 2 3 1 1 1 2 1 3
## "1 0" "1 -1" "0 -1" "1 1" "-1 0" "-1 0" "1 1" "-1 1" "1 1"
##
## Reward (last iteration)
## [1] 0
```

The Q-matrix is much less sparse than the one in Chapter 16: we have covered much more ground. Some policy recommendations are unchanged compared to the smaller sample, but others have flipped. The changes occur in states for which only a few points were available in the first trial; with more data, the decision is altered.


*Annual Review of Financial Economics*10: 449–79.

*IEEE Transactions on Knowledge & Data Engineering*, no. 3: 659–65.

*Journal of Financial Markets*Forthcoming: 100588.

*Harvard Business Review*43 (1): 63–75.

*2017 IEEE 19th Conference on Business Informatics (CBI)*, 1:7–12.

*Journal of Empirical Finance*Forthcoming.

*Journal of Financial and Quantitative Analysis*45 (4): 959–86.

*Econometrics Journal*22 (1): 34–56.

*Flexible Imputation of Missing Data*. Chapman & Hall / CRC.

*Journal of Banking & Finance*35 (12): 3263–74.

*Automation and Remote Control*24: 774–80.

*Review of Financial Studies*26 (5): 1087–1145.

*arXiv Preprint*, no. 2003.11132.

*Journal of Financial Markets*Forthcoming: 100598.

*Omega*15 (2): 145–55.

*arXiv Preprint*, no. 2001.04185.

*Organizational Behavior and Human Performance*8 (1): 139–58.

*arXiv Preprint*, no. 2003.00130.

*Expert Systems with Applications*38 (1): 223–30.

*SSRN Working Paper*3382932.

*Omega*40 (6): 758–66.

*Expert Systems with Applications*143: 113042.

*Machine Learning*8 (3-4): 279–92.

*arXiv Preprint*, no. 1910.03743.

*Journal of Big Data*3 (1): 9.

*Econometrica*68 (5): 1097–1126.

*Journal of Open Source Software*4 (43): 1686.

*IRE WESCON Convention Record*, 4:96–104.

*Quantitative Finance*Forthcoming.

*Complex Systems*6 (1): 47.

*Neural Networks*5 (2): 241–59.

*IEEE Transactions on Evolutionary Computation*1 (1): 67–82.

*arXiv Preprint*, no. 2003.02515.

*Management Science*Forthcoming.

*arXiv Preprint*, no. 1811.07522.

*Review of Financial Studies*Forthcoming.

*SSRN Working Paper*3443998.

*SSRN Working Paper*3517888.

*Expert Systems with Applications*114: 388–401.

*North American Journal of Economics and Finance*, 101274.

*arXiv Preprint*, no. 1901.08740.

*Journal of Banking & Finance*Forthcoming: 105966.

*arXiv Preprint*, no. 1212.5701.

*Ensemble Machine Learning: Methods and Applications*. Springer.

*Expert Systems with Applications*36 (5): 8849–54.

*Journal of Financial Data Science*2 (2): 25–40.

*Journal of Business & Economic Statistics*Forthcoming.

*Ensemble Methods: Foundations and Algorithms*. Chapman & Hall / CRC.

*Journal of the Royal Statistical Society: Series B (Statistical Methodology)*67 (2): 301–20.

*The Man Who Solved the Market: How Jim Simons Launched the Quant Revolution*. Penguin Random House.