Let’s look at how we could have changed our returns object portfolio_returns_tq_rebalanced_monthly
to match the Fama French factors.
We want to recast the date
column to be tied to the first of the month. If it were an xts
object, we could use to.monthly(indexAt = "firstof", OHLC = FALSE)
. So, let’s convert it to xts
with tk_xts(date_var = date)
from the timetk
package first, and then run our operation.
symbols <- c("SPY","EFA", "IJS", "EEM","AGG")
w <- c(0.25, 0.25, 0.20, 0.20, 0.10)
# The prices object will hold our raw price data throughout this book.
portfolio_returns_tq_rebalanced_monthly <-
getSymbols(symbols,
src = 'yahoo',
from = "2012-12-31",
to = "2017-12-31",
auto.assign = TRUE,
warnings = FALSE) %>%
map(~Ad(get(.))) %>%
reduce(merge) %>%
`colnames<-`(symbols) %>%
to.monthly(indexAt = "lastof",
OHLC = FALSE) %>%
tk_tbl(preserve_index = TRUE,
rename_index = "date") %>%
gather(asset, returns, -date) %>%
group_by(asset) %>%
mutate(returns = (log(returns) - log(lag(returns)))) %>%
spread(asset, returns) %>%
select(date, symbols) %>%
na.omit() %>%
gather(asset, returns, -date) %>%
tq_portfolio(assets_col = asset,
returns_col = returns,
weights = w,
col_rename = "returns",
rebalance_on = "months")
temp <- tempfile()
base <-
"http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/"
factor <-
"Global_3_Factors_"
format<-
"CSV.zip"
full_url <-
paste(base,
factor,
format,
sep = "")
download.file(
full_url,
temp)
Global_3_Factors <-
read_csv(unz(temp, "Global_3_Factors.csv"),
skip = 6) %>%
rename(date = X1) %>%
mutate(date =
ymd(parse_date_time(date, "%Y%m"))) %>%
mutate_if(is.character,as.numeric)
pf_returns_tq_rebalanced_monthly_first_day <-
portfolio_returns_tq_rebalanced_monthly %>%
tk_xts(date_var = date) %>%
to.monthly(indexAt = "firstof",
OHLC = FALSE)
head(pf_returns_tq_rebalanced_monthly_first_day)
## returns
## 2013-01-01 0.0308487591
## 2013-02-01 -0.0008695893
## 2013-03-01 0.0186624765
## 2013-04-01 0.0206247906
## 2013-05-01 -0.0053528018
## 2013-06-01 -0.0229532633
head(Global_3_Factors)
## # A tibble: 6 x 5
## date `Mkt-RF` SMB HML RF
## <date> <dbl> <dbl> <dbl> <dbl>
## 1 1990-07-01 0.860 0.770 -0.250 0.680
## 2 1990-08-01 -10.8 -1.60 0.600 0.660
## 3 1990-09-01 -12.0 1.23 0.810 0.600
## 4 1990-10-01 9.57 -7.39 -4.25 0.680
## 5 1990-11-01 - 3.86 1.22 1.14 0.570
## 6 1990-12-01 1.10 -0.790 -1.60 0.600
The dates are in the same format now. This makes our life a lot easier - but there was a cost. We changed our base data object. Was it worth it? Well, we can perform a join to put these into one object now. But first we have to convert the portfolio_returns_tq_rebalanced_monthly_first_day
back to a tibble with tk_tbl(preserve_index = TRUE, rename_index = "date")
.
ff_portfolio_returns_joined <-
pf_returns_tq_rebalanced_monthly_first_day %>%
tk_tbl(preserve_index = TRUE,
rename_index = "date") %>%
left_join(Global_3_Factors)
tail(ff_portfolio_returns_joined)
## # A tibble: 6 x 6
## date returns `Mkt-RF` SMB HML RF
## <date> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2017-07-01 0.0248 2.51 -0.110 1.16 0.0700
## 2 2017-08-01 0.000689 0.130 -0.130 -1.36 0.0900
## 3 2017-09-01 0.0266 2.30 1.31 1.54 0.0900
## 4 2017-10-01 0.0177 1.80 -0.850 -0.950 0.0900
## 5 2017-11-01 0.0157 1.93 -0.680 -0.260 0.0800
## 6 2017-12-01 0.0134 1.38 0.940 0.140 0.0900
That process seems slighly less painful than when we changed the date format of the FF factors though neither one is perfect. Disparate data sources lead to these kinds of dilemmas and I apologize for making us go through so much date reformatting (though if you found that insufferable it’s not too late to take the LSAT). Hopefully I’ve convinced you that data provenance can really start to matter when we start taking on new projects and using data from different sources.