Logo by Elisa Sovrano

depigner 0.8.1

I’m very excited to announce the first release of {depigner} to CRAN!

Pigna [pìn’n’a] is the Italian word for pine cone. In jargon, it is used to identify a task which is boring, banal, annoying, painful, frustrating and maybe even with a not so beautiful or rewarding result, just like the obstinate act of trying to challenge yourself in extracting pine nuts from a pine cone, provided that, in the end, you will find at least one inside it.

The {depigner} package aims to provide useful functions for solving the small everyday problems of coding and analyzing data with R. The hope is to offer solutions to the kind of little problems that would normally be solved with quick-and-dirty (ugly and maybe even wrong) patches.

Installation

You can install the released version from CRAN by calling:

install.packages("depigner")

If you would like to stay up to date with the latest development version, you can install it from its source on GitHub by calling:

# install.packages("devtools")
devtools::install_github("CorradoLanera/depigner")

Next, you can attach it to your session as usual by:

library(depigner)
## Welcome to depigner: we are here to un-stress you!

Provided Tools

| Tools Category | Function(s) | Aim |
|---|---|---|
| Harrell’s verse | tidy_summary() | pander-ready data frame from Hmisc::summary() |
| | paired_test_continuous | Paired test for continuous variable into Hmisc::summary |
| | paired_test_categorical | Paired test for categorical variable into Hmisc::summary |
| | adjust_p() | Adjusts P-values for multiplicity of tests at tidy_summary() |
| | summary_interact() | data frame of OR for interaction from rms::lrm() |
| | htypes() | Will be your variables continuous or categorical in Hmisc::describe()? |
| Statistical | ci2p() | Get P-value from estimation and confidence interval |
| Programming | pb_len() | Quick set-up of a progress::progress_bar() progress bar |
| | install_pkg_set() | Politely install set of packages (topic-related sets at ?pkg_sets) |
| Development | use_ui() | Activate {usethis} user interface into your own package |
| | please_install() | Politely ask the user to install a package |
| | imported_from() | List packages imported from a package (which has to be installed) |
| Telegram | start_bot_for_chat() | Quick start of a {telegram.bot} Telegram’s bot |
| | send_to_telegram() | Unified wrapper to send someRthing to a Telegram chat |
| | errors_to_telegram() | Divert all your error messages from the console to a Telegram chat |
| Why not?! | gdp() | Do you have TOO many pignas on your back?! … try this out ;-) |

Harrell’s Verse Tools

Harrell’s packages {Hmisc} and {rms} are amazing packages to do statistical analyses, especially for clinical data and purposes. If you do not know them, you should.

They are so useful and vast that they have earned the label “Harrell’verse”.

One of the functions I use most is the summary() method of {Hmisc}, especially with the option method = "reverse". It is incredibly useful for getting “table one”-like information.

Pigna

Often, my colleagues and I have faced the problem of using the data inside the output of the summary() function, and I have seen a great many colorful patches to manage this issue.

{depigner}: has a pipeable tidy_summary() function that outputs a lovely table, already adjusted to be processed by {pander}!

Currently, it is tested for method = "reverse" only:

library(rms, quietly = TRUE)
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
## 
##     format.pval, units
## 
## Attaching package: 'SparseM'
## The following object is masked from 'package:base':
## 
##     backsolve
options(datadist = "dd")
library(survival)
library(pander)

dd <- datadist(iris)
my_summary <- summary(Species ~., data = iris, method = "reverse")
tidy_summary(my_summary) %>% 
  pander()

| | setosa (N=50) | versicolor (N=50) | virginica (N=50) |
|---|---|---|---|
| Sepal.Length | 4.800/5.000/5.200 | 5.600/5.900/6.300 | 6.225/6.500/6.900 |
| Sepal.Width | 3.200/3.400/3.675 | 2.525/2.800/3.000 | 2.800/3.000/3.175 |
| Petal.Length | 1.400/1.500/1.575 | 4.000/4.350/4.600 | 5.100/5.550/5.875 |
| Petal.Width | 0.2/0.2/0.3 | 1.2/1.3/1.5 | 1.8/2.0/2.3 |

dd <- datadist(heart)
surv <- Surv(heart$start, heart$stop, heart$event)
f    <- cph(surv ~ age + year + surgery, data = heart)
my_summary <- summary(f)
tidy_summary(my_summary) %>% 
  pander()

| | Diff. | HR | Lower 95% CI | Upper 95% CI |
|---|---|---|---|---|
| age | 10.69 | 1.336 | 1.009 | 1.767 |
| year | 3.374 | 0.6104 | 0.3831 | 0.9727 |
| surgery | 1 | 0.5286 | 0.2574 | 1.085 |

Pigna

What if the data you would like to pass to summary() contain paired samples?

{depigner}: provides the objects to pass to the summary() function to perform a paired test, for both continuous and categorical data, with two or more groups.[1]

  • paired_test_*(): Paired test for categorical/continuous variables to be used in the summary() of the {Hmisc} (Harrell 2020) package:
data(Arthritis)
# categorical -------------------------
## two groups
summary(Treatment ~ Sex,
    data    = Arthritis,
    method  = "reverse",
    test    = TRUE,
    catTest = paired_test_categorical
)
## 
## 
## Descriptive Statistics by Treatment
## 
## +----------+--------------------+--------------------+------------------------------+
## |          |Placebo             |Treated             |  Test                        |
## |          |(N=43)              |(N=41)              |Statistic                     |
## +----------+--------------------+--------------------+------------------------------+
## |Sex : Male|           26%  (11)|           34%  (14)|Chi-square=5.92 d.f.=1 P=0.015|
## +----------+--------------------+--------------------+------------------------------+
## more than two groups
summary(Improved ~ Sex,
    data    = Arthritis,
    method  = "reverse",
    test    = TRUE,
    catTest = paired_test_categorical
)
## 
## 
## Descriptive Statistics by Improved
## 
## +----------+-----------------+-----------------+-----------------+------------------------+
## |          |None             |Some             |Marked           |  Test                  |
## |          |(N=42)           |(N=14)           |(N=28)           |Statistic               |
## +----------+-----------------+-----------------+-----------------+------------------------+
## |Sex : Male|        40%  (17)|        14%  ( 2)|        21%  ( 6)|chi2=1.71 d.f.=3 P=0.634|
## +----------+-----------------+-----------------+-----------------+------------------------+
# continuous --------------------------
## two groups
summary(Species ~.,
    data    = iris[iris$Species != "setosa",],
    method  = "reverse",
    test    = TRUE,
    conTest = paired_test_continuous
)
## 
## 
## Descriptive Statistics by Species
## 
## +------------+---------------------+---------------------+------------------------+
## |            |versicolor           |virginica            |  Test                  |
## |            |(N=50)               |(N=50)               |Statistic               |
## +------------+---------------------+---------------------+------------------------+
## |Sepal.Length|    5.600/5.900/6.300|    6.225/6.500/6.900| t=-5.28 d.f.=49 P<0.001|
## +------------+---------------------+---------------------+------------------------+
## |Sepal.Width |    2.525/2.800/3.000|    2.800/3.000/3.175| t=-3.08 d.f.=49 P=0.003|
## +------------+---------------------+---------------------+------------------------+
## |Petal.Length|    4.000/4.350/4.600|    5.100/5.550/5.875|t=-12.09 d.f.=49 P<0.001|
## +------------+---------------------+---------------------+------------------------+
## |Petal.Width |       1.2/1.3/1.5   |       1.8/2.0/2.3   |t=-14.69 d.f.=49 P<0.001|
## +------------+---------------------+---------------------+------------------------+
## more than two groups
summary(Species ~.,
    data    = iris,
    method  = "reverse",
    test    = TRUE,
    conTest = paired_test_continuous
)
## 
## 
## Descriptive Statistics by Species
## 
## +------------+--------------------+--------------------+--------------------+-----------------------+
## |            |setosa              |versicolor          |virginica           |  Test                 |
## |            |(N=50)              |(N=50)              |(N=50)              |Statistic              |
## +------------+--------------------+--------------------+--------------------+-----------------------+
## |Sepal.Length|   4.800/5.000/5.200|   5.600/5.900/6.300|   6.225/6.500/6.900| F=30.55 d.f.=2 P<0.001|
## +------------+--------------------+--------------------+--------------------+-----------------------+
## |Sepal.Width |   3.200/3.400/3.675|   2.525/2.800/3.000|   2.800/3.000/3.175| F=12.63 d.f.=2 P<0.001|
## +------------+--------------------+--------------------+--------------------+-----------------------+
## |Petal.Length|   1.400/1.500/1.575|   4.000/4.350/4.600|   5.100/5.550/5.875|F=322.89 d.f.=2 P<0.001|
## +------------+--------------------+--------------------+--------------------+-----------------------+
## |Petal.Width |      0.2/0.2/0.3   |      1.2/1.3/1.5   |      1.8/2.0/2.3   |F=234.21 d.f.=2 P<0.001|
## +------------+--------------------+--------------------+--------------------+-----------------------+
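
In the two-group continuous table above, the statistic is reported as t with d.f. = 49, which is the shape of a paired t-test on 50 pairs. As a rough base-R sketch of that kind of test, assuming (purely for illustration) that the versicolor and virginica rows are paired by their order in iris, and without claiming this is what paired_test_continuous does internally:

```r
# Two groups of iris, paired by row order (an assumption made for illustration).
versicolor <- iris$Sepal.Length[iris$Species == "versicolor"]
virginica  <- iris$Sepal.Length[iris$Species == "virginica"]

# Paired t-test: is the mean of the within-pair differences equal to zero?
paired_fit <- t.test(versicolor, virginica, paired = TRUE)

# With 50 pairs there are 50 - 1 = 49 degrees of freedom.
paired_fit$parameter
paired_fit$statistic
paired_fit$p.value
```

The point of the sketch is only that a paired test works on within-pair differences, which is why the way records are ordered and matched matters so much.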

Pigna

How often have you been asked to provide “all vs. all” checks and tests to find that unique randomly sampled P-value, which is less than 0.05 by a factor of \(10^{-15}\) order of magnitude?

{depigner}: gives you a function to automatically adjust for multiplicity the P-values reported in the summary() tables.

  • adjust_p(): Adjust the P-values of a tidy_summary object:
my_summary <- summary(Species ~., data = iris,
                      method = "reverse",
                      test = TRUE)

tidy_summary(my_summary, prtest = "P") %>%
  adjust_p() %>%
  pander()
## ✓ P adjusted with BH method.

| | setosa (N=50) | versicolor (N=50) | virginica (N=50) | P-value |
|---|---|---|---|---|
| Sepal.Length | 4.800/5.000/5.200 | 5.600/5.900/6.300 | 6.225/6.500/6.900 | <0.001 |
| Sepal.Width | 3.200/3.400/3.675 | 2.525/2.800/3.000 | 2.800/3.000/3.175 | <=0.001 |
| Petal.Length | 1.400/1.500/1.575 | 4.000/4.350/4.600 | 5.100/5.550/5.875 | <=0.001 |
| Petal.Width | 0.2/0.2/0.3 | 1.2/1.3/1.5 | 1.8/2.0/2.3 | <=0.001 |
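
The message above mentions the BH (Benjamini-Hochberg) method. To get a feeling for what such an adjustment does to a set of P-values, here is a small base-R sketch using stats::p.adjust(). The raw P-values are made up for illustration, and whether adjust_p() relies on p.adjust() internally is an assumption, not something stated here:

```r
# Hypothetical raw P-values from four separate tests (made up for illustration).
raw_p <- c(0.001, 0.012, 0.030, 0.200)

# Benjamini-Hochberg adjustment, which controls the false discovery rate.
adj_p <- p.adjust(raw_p, method = "BH")

# Side-by-side view: adjusted values are never smaller than the raw ones.
data.frame(raw = raw_p, adjusted = adj_p)
```

With these four values the adjusted P-values are 0.004, 0.024, 0.040, and 0.200: the smallest raw P-value is penalized the most, the largest is left unchanged.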

Producing a lovely, concise, printable table when interactions come into play in the model is the aim of the summary_interact() function.

  • summary_interact(): Produce a data frame of OR (with the corresponding CI95%) for the interactions between different combinations of a continuous variable (for which it is possible to define the reference and the target values) and (every or a selection of levels of) a categorical one in a logistic model provided by lrm() (from the {rms} package (Harrell, Jr. 2020)):
# lrm_mod is a model previously fitted with rms::lrm(); its fitting code is not shown here
summary_interact(lrm_mod, age, abo) %>%
  pander()
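
| | Low | High | Diff. | Odds Ratio | Lower 95% CI | Upper 95% CI |
|---|---|---|---|---|---|---|
| age - A | 43 | 58 | 15 | 1.002 | 0.557 | 1.802 |
| age - B | 43 | 58 | 15 | 1.817 | 0.74 | 4.463 |
| age - AB | 43 | 58 | 15 | 0.635 | 0.186 | 2.169 |
| age - O | 43 | 58 | 15 | 0.645 | 0.352 | 1.182 |
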
summary_interact(lrm_mod, age, abo, p = TRUE) %>%
  pander()
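
| | Low | High | Diff. | Odds Ratio | Lower 95% CI | Upper 95% CI | P-value |
|---|---|---|---|---|---|---|---|
| age - A | 43 | 58 | 15 | 1.002 | 0.557 | 1.802 | 0.498 |
| age - B | 43 | 58 | 15 | 1.817 | 0.74 | 4.463 | 0.137 |
| age - AB | 43 | 58 | 15 | 0.635 | 0.186 | 2.169 | 0.728 |
| age - O | 43 | 58 | 15 | 0.645 | 0.352 | 1.182 | 0.883 |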

Pigna

Another super useful function provided by {Hmisc} is describe(), which has a lovely plot() method for its results. I ran into some issues when I needed to use those plots programmatically, because their output is not easy to predict before seeing it. For example, suppose you want to know which variables will be plotted, and in which plot (or even whether a plot will be produced at all). Hence, when working with {Hmisc}, it is crucial to know which variables Harrell considers continuous and which categorical. Among them, it is also necessary to know which ones he considers to carry enough (good) information to be plotted.

{depigner} solves those problems with its family of htypes() functions, which seem to give “reasonable” results… officially:

  • htypes() and friends: get/check types of variable with respect to the {Hmisc} ecosystem (Harrell 2020).
htypes(mtcars)
##    mpg    cyl   disp     hp   drat     wt   qsec     vs     am   gear   carb 
##  "con" "none"  "con"  "con"  "con"  "con"  "con"  "cat"  "cat" "none" "none"
desc <- Hmisc::describe(mtcars)
htypes(desc)
##    mpg    cyl   disp     hp   drat     wt   qsec     vs     am   gear   carb 
##  "con" "none"  "con"  "con"  "con"  "con"  "con"  "cat"  "cat" "none" "none"
htype(desc[[1]])
## [1] "con"
is_hcat(desc[[1]])
## [1] FALSE
is_hcon(desc[[1]])
## [1] TRUE

Statistical Tools

Pigna

Some people can and prefer to read more information into a (single) value than into two…

{depigner}: has a wrapper to provide an answer for them.

  • ci2p(): compute the p-value related to a provided confidence interval (assuming that all the necessary assumptions hold):
ci2p(1.125, 0.634,  1.999, log_transform = TRUE)
## [1] 0.367902
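
To see how a P-value can be recovered from an estimate and its confidence interval at all, here is a minimal base-R sketch based on the usual normal approximation: the standard error is read off the width of the interval, and the estimate is then compared with a null value of zero (on the log scale for ratios). The function name ci_to_p_sketch() and its level argument are hypothetical, and this is not necessarily the approximation that ci2p() uses, so the two may well return different numbers for the same input:

```r
# Hypothetical helper (not part of {depigner}): normal-approximation P-value
# from a point estimate and its confidence interval.
ci_to_p_sketch <- function(est, lower, upper,
                           log_transform = FALSE, level = 0.95) {
  if (log_transform) {
    # Ratios (OR, HR, RR) are approximately normal on the log scale.
    est   <- log(est)
    lower <- log(lower)
    upper <- log(upper)
  }
  z_crit <- qnorm(1 - (1 - level) / 2)   # about 1.96 for a 95% interval
  se     <- (upper - lower) / (2 * z_crit)
  z      <- est / se                      # distance from the null value 0
  2 * pnorm(-abs(z))                      # two-sided P-value
}

# The lower bound sits exactly on the null value 1, so P is 0.05.
ci_to_p_sketch(2, 1, 4, log_transform = TRUE)
```

The sanity check in the last line is the useful part: an interval at level 95% that just touches the null value corresponds to a two-sided P-value of 0.05.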

Programming Tools

Pigna

Progress bars provided by {progress} are great, complete, and powerful. They are also quite easy to set up, but for simple everyday usage (which comes up quite often) an even easier-to-use wrapper can be useful.[2]

{depigner}: helps you with a super simple wrapper: you only have to know the overall length (i.e., number of steps) of the progress, and that’s it!

pb <- pb_len(100)

for (i in 1:100) {
    Sys.sleep(0.1)
    tick(pb, paste("i = ", i))
}

Pigna

How often have you (re)installed the same set of packages on your own or someone else’s computer? You need to remember which packages you or they need, you may forget one of them, and then… you have to do it all over again on the next computer!

{depigner}: gives you a function and a collection of predefined package sets, so you can install all the packages you need in a single call. Moreover, it does so politely, asking the user whether they agree to the installation and, possibly, to a general update.

  • install_pkg_set(): Simple and polite wrapper to install sets of packages. Moreover, {depigner} provides some sets already defined for common scenarios in R (analyses, production, documenting, …). See them by calling ?pkg_sets.
install_pkg_set() # this installs the whole `?pkg_all`
install_pkg_set(pkg_stan)

?pkg_sets

Development Tools

Pigna

When developing a package, using the {usethis} user interface would be fantastic. However, importing {usethis} may not be enough: you still need to call its functions explicitly, e.g., you could end up writing something like

usethis::ui_stop('field {usethis::ui_field("foo")} must has value {usethis::ui_value(value)}, ...')

Which is not that nice, is it?

{depigner}: helps you add all of the {usethis} UI functions to your NAMESPACE with a single call in the {usethis} style, so that you can go straight to something like

ui_stop('field {ui_field("foo")} must has value {ui_value(value)}, ...')
# in the initial setup steps of the development of a package
use_ui()

Pigna

When you need to write code that others should run, it is not kind to force the installation or the update of packages in their environments.

{depigner}: thanks to a prompt given by He The Hadley during a fantastic workshop in NYC, gives you a function to politely ask others to install on their system the packages that you need.

  • please_install(): This is a polite wrapper to install.packages() inspired (= w/ very minimal modification) by a function Hadley showed us during a course.
# compare package names (the row names), not every cell of the two matrices
a_pkg_i_miss <- setdiff(rownames(available.packages()), rownames(installed.packages()))[[1]]
please_install(a_pkg_i_miss)
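
The idea of asking before installing can be sketched in base R as follows. Both helpers below, missing_pkgs() and ask_to_install(), are hypothetical names written for this illustration: they are not {depigner} functions and do not reproduce the actual please_install() code. The sketch only assumes that a polite installer should check what is missing, ask in interactive sessions, and never install anything unattended:

```r
# Hypothetical helper: which of the requested packages are not installed?
missing_pkgs <- function(pkgs) {
  is_installed <- vapply(
    pkgs,
    function(pkg) nzchar(system.file(package = pkg)),
    logical(1)
  )
  unname(pkgs[!is_installed])
}

# Hypothetical helper: ask before installing, never install unattended.
ask_to_install <- function(pkgs) {
  to_install <- missing_pkgs(pkgs)
  if (length(to_install) == 0L) {
    return(invisible(character(0)))
  }
  pkg_list <- paste(to_install, collapse = ", ")
  if (!interactive()) {
    # No one to ask: just report what is missing.
    message("Missing packages: ", pkg_list)
    return(invisible(to_install))
  }
  choice <- utils::menu(c("Yes", "No"), title = paste0("Install ", pkg_list, "?"))
  if (choice == 1L) {
    utils::install.packages(to_install)
  }
  invisible(to_install)
}

# Only the package that is not installed is reported.
missing_pkgs(c("stats", "aPackageThatSurelyDoesNotExist"))
```

Returning the missing packages invisibly keeps the console quiet while still letting the calling code decide what to do next.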

Pigna

When you install or attach a package, often (almost always) other packages (i.e., its dependencies) are installed or attached too. Which ones are they? Do we need to install two packages explicitly, or is one of them sufficient because the other will be installed automatically as a consequence?

{depigner}: has a single, simple function that gives you a quick answer to those questions.[3]

  • imported_from(): If you would like to know which packages are imported by a package (e.g., to know which packages are required for its installation, or which ones are installed along with it), you can use this function:
imported_from("depigner")
##  [1] "desc"         "dplyr"        "fs"           "ggplot2"      "Hmisc"       
##  [6] "magrittr"     "progress"     "purrr"        "rlang"        "rms"         
## [11] "rprojroot"    "stats"        "stringr"      "telegram.bot" "tibble"      
## [16] "tidyr"        "usethis"      "utils"
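
For an installed package, this kind of information lives in the Imports field of its DESCRIPTION file, which base R exposes through utils::packageDescription(). The sketch below shows one way to turn that field into a vector of package names. The helper parse_imports() is a hypothetical name used for illustration, and this is not claimed to be how imported_from() is implemented:

```r
# Hypothetical helper (not {depigner}'s implementation): split an "Imports"
# field into bare package names, dropping any version requirement.
parse_imports <- function(imports_field) {
  if (is.null(imports_field) || is.na(imports_field)) {
    return(character(0))
  }
  pkgs <- strsplit(imports_field, ",", fixed = TRUE)[[1]]
  pkgs <- sub("\\(.*\\)", "", pkgs)  # drop "(>= x.y.z)"
  pkgs <- trimws(pkgs)               # drop spaces and line breaks
  pkgs[nzchar(pkgs)]
}

# On a hand-written field
parse_imports("dplyr (>= 1.0.0),\n    rlang,\n    utils")

# On an installed package
parse_imports(utils::packageDescription("stats")$Imports)
```

Like imported_from(), the sketch only looks at imported packages; suggested ones are stored in a separate DESCRIPTION field.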

Telegram Tools

Telegram… I love it![4]

Pigna

Have you ever run some long, very long, super long, extra long, computation… on a server? How many times did you need to come back to your PC, or to your ssh + screen session to check its state, to check the logs, to see if some errors evilly happened?

{depigner}: provides some super easy-to-use wrappers around some functionalities of the {telegram.bot} package, which let you be notified directly on your phone with custom messages, images, and even errors. You only need to define your bot (instructions in the help page), and you are ready to go!

  • Wrappers for a simple use of Telegram’s bots: wrappers from the {telegram.bot} package (Benedito 2019):
# Set up a Telegram bot. read `?start_bot_for_chat`
start_bot_for_chat()

# Send something to telegram
send_to_telegram("hello world")

library(ggplot2)
gg <- ggplot(mtcars, aes(x = mpg, y = hp, colour = cyl)) +
    geom_point()
send_to_telegram(
  "following an `mtcars` coloured plot",
  parse_mode = "Markdown"
)
send_to_telegram(gg)

# Divert output errors to the telegram bot
errors_to_telegram()

Why Not?!

Pigna

Sometimes there are too many pignas to be solved, and you just have to give in to them.

{depigner}: is near you.[5]

  • gdp(): A wrapper to relax.
gdp(7)

Acknowledgements

The {depigner} logo was beautifully designed by Elisa Sovrano.

References

Benedito, Ernest. 2019. Telegram.bot: Develop a Telegram Bot with R. http://github.com/ebeneditos/telegram.bot.
Csárdi, Gábor, and Rich FitzJohn. 2019. Progress: Terminal Progress Bars. https://github.com/r-lib/progress#readme.
Daróczi, Gergely, and Roman Tsegelskyi. 2018. Pander: An R Pandoc Writer. http://rapporter.github.io/pander.
Harrell, Frank E, Jr. 2020. Hmisc: Harrell Miscellaneous.
Harrell, Jr., Frank E. 2020. Rms: Regression Modeling Strategies.
Wickham, Hadley, and Jennifer Bryan. 2020. Usethis: Automate Package and Project Setup.

  1. Pay particular attention to the ?paired_test_* documentation of interest, because data should be provided in a specific way due to the internal management of records by the {Hmisc} functions.

  2. Note: currently, after erum2020, I am considering switching to the {progressr} package (https://github.com/HenrikBengtsson/progressr). However, for the moment, I still consider {progress} and our {depigner} wrapper as simple to use as they are powerful for most cases.

  3. Note: it reports only the imported packages, not the suggested ones.

  4. Do not ask me why… It is wonderful! That’s it!

  5. Why “gdp”?… no one knows…

Corrado Lanera
Fellow researcher, data scientist, and trainer