Loading...

tsbox 0.1: class-agnostic time series

Avatar
Author
Christoph Sax

The R ecosystem knows a vast number of time series classes: ts, xts, zoo, tsibble, tibbletime or timeSeries. The plethora of standards causes confusion. As different packages rely on different classes, it is hard to use them in the same analysis. tsbox provides a set of tools that make it easy to switch between these classes. It also allows the user to treat time series as plain data frames, facilitating the use with tools that assume rectangular data.


xkcd comic: Standards
comic by xkcd


The tsbox package is built around a set of functions that convert time series of different classes to each other. They are frequency-agnostic and allow the user to combine multiple non-standard and irregular frequencies. Because coercion works reliably, it is easy to write functions that work identically for all classes. So whether we want to smooth, scale, differentiate, chain-link, forecast, regularize, or seasonally adjust a time series, we can use the same tsbox-command for any time series class.

This blog gives a short overview of the changes introduced in 0.1. A detailed overview of the package functionality is given in the documentation page (or in a previous blog-post).

Keeping explicit missing values

Version 0.1, now on CRAN, brings many bug fixes and improvements. A substantial change involves the treatment of NA values in data frames. Previously, all NAs in data frames were treated as implicit and were only made explicit by a call to ts_regular.

This has changed now. If you convert a ts object to a data frame, all NA values will be preserved. To replicate previous behavior, apply the ts_na_omit function:

library(tsbox)
x.ts <- ts_c(mdeaths, austres)
x.ts
ts_df(x.ts)
ts_na_omit(ts_df(x.ts))

ts_span extends outside of series span

This lays the groundwork for ts_span to be extensible. With extend = TRUE, ts_span extends a regular series with NA values, up to the specified limits, similar to base window. Like all functions in tsbox, this is frequency-agnostic. For example, in the following, the monthly series mdeaths is extended by monthly NA values, while the quarterly series austres is extended by quarterly NA values.

x.df <- ts_df(ts_c(mdeaths, austres))
ts_span(x.df, end = "1999-12-01", extend = TRUE)

ts_default standardizes column names in a data frame

In rectangular data structures, i.e., in a data.frame, a data.table, or a tibble, tsbox stores one or multiple time series in the ‘long’ format. By default, tsbox detects a value, a time and zero, one or several id columns. Alternatively, the time column and the value column can be explicitly named time and value. If explicit names are used, the column order will be ignored.

While automatic column name detection is useful in interactive mode, it produces unnecessary overhead in longer workflows. The helper function ts_default detects and renames the time and the value column so that auto-detection will be turned off in subsequent steps (note that the names of the id columns are not affected):

x.df <- ts_df(ts_c(mdeaths, austres))
names(x.df) <- c("a fancy id name", "date", "count")
ts_plot(x.df)  # tsbox is fine with that
ts_default(x.df)

ts_summary summarizes time series

ts_summary provides a frequency agnostic summary of a ts-boxable object:

ts_summary(ts_c(mdeaths, austres))
#>        id obs    diff freq      start        end
#> 1 mdeaths  72 1 month   12 1974-01-01 1979-12-01
#> 2 austres  89 3 month    4 1971-04-01 1993-04-01

ts_summary returns a plain data frame that can be used for any purpose. It is also recommended for the extraction of various time series properties, such as start, freq or id:

ts_summary(austres)$id
#> [1] "austres"
ts_summary(austres)$start
#> [1] "1971-04-01"

And a cheat sheet!

Finally, we fabricated a tsbox cheat sheet that summarizes most functionality. Print and enjoy working with time series.

tsbox cheat sheet




More posts

DevOps Engineer (80-100%)

cynkra team

EFS vs. NFS for RStudio on Kubernetes (AWS): Configuration and considerations

Patrick Schratz

Accessing Google's API via OAuth2

Patrick Schratz

Data Scientist (80-100%)

cynkra team

seasonal 1.9: Accessing composite output

Christoph Sax

Google Season of Docs with R: useR! Information Board

Ben Ubah

Running old versions of TeXlive with tinytex

Kirill Müller

tsbox 0.3.1: extended functionality

Christoph Sax

Celebrating one-year anniversary as RStudio Full Service Certified Partner

Cosima Meyer, Patrick Schratz

Deprecating a pkgdown site served via GitHub Pages

Patrick Schratz, Kirill Müller

gfortran support for R on macOS

Patrick Schratz

Seasonal Adjustment of Multiple Series

Christoph Sax

Dynamic build matrix in GitHub Actions

Kirill Müller

Setting up a load-balanced Jitsi Meet instance

Patrick Schratz

DevOps Expert (f/m/d, 60-100%)

cynkra team

Maintaining multiple identities with Git

Kirill Müller

Relational data models in R

Angel D'az, Kirill Müller

tempdisagg: converting quarterly time series to daily

Christoph Sax

tsbox 0.2: supporting additional time series classes

Christoph Sax

DevOps System Engineer (40-60%)

cynkra team

Introducing dm: easy juggling of tables and relations

Balthasar Sager

tsbox 0.1: class-agnostic time series

Christoph Sax

Data Scientist/Engineer (40-100%)

cynkra team

Time series of the world, unite!

Christoph Sax

Done “Establishing DBI”!?

Kirill Müller


Other blogs

R-bloggers
Top