R is a Lego Construction Set
Bioinformatics Core Facility CECAD
2025-03-17
git clone https://github.com/CECADBioinformaticsCoreFacility/Intermediate_R_Course_2025.git
https://cecadbioinformaticscorefacility.github.io/Intermediate_R_Course_2025/
Session 1 :: R is a Lego Construction Set
Let’s say we are tinkering interactively in the console:
But what if we want to try a different value for hour? Do we need to write it all again?
The first level of abstraction in R is the function:
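The console code from the slide is not reproduced here, so the following is only a minimal sketch of the idea. The function name greet_by_hour() and its greeting rules are hypothetical; only the variable name hour comes from the text.

```r
# Hypothetical example: wrap the tinkering code in a function,
# so that trying a new value for `hour` needs only a new call.
greet_by_hour <- function(hour) {
  if (hour < 12) {
    "Good morning"
  } else if (hour < 18) {
    "Good afternoon"
  } else {
    "Good evening"
  }
}

greet_by_hour(9)
greet_by_hour(20)
```

Nothing has to be rewritten to try another value: only the argument changes.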
Catching illegal input: throw an error
… or return early with a message
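Both strategies could be sketched as follows. The function names and the rule for a valid hour (a single number from 0 to 23) are assumptions made for illustration only.

```r
# Hypothetical validity rule: a single, non-missing number from 0 to 23
is_valid_hour <- function(hour) {
  is.numeric(hour) && length(hour) == 1 && !is.na(hour) &&
    hour >= 0 && hour <= 23
}

# Variant 1: throw an error, which aborts the calling code
check_hour_strict <- function(hour) {
  if (!is_valid_hour(hour)) {
    stop("`hour` must be a single number between 0 and 23")
  }
  hour
}

# Variant 2: print a message and return early with NA
check_hour_lenient <- function(hour) {
  if (!is_valid_hour(hour)) {
    message("`hour` must be a single number between 0 and 23; returning NA")
    return(NA_real_)
  }
  hour
}

# The error can be caught by the caller if needed
tryCatch(check_hour_strict(25), error = function(e) conditionMessage(e))
check_hour_lenient(25)
```

An error forces the caller to deal with the problem, while the lenient variant lets a longer script continue with a missing value.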
Keep a file for hand-crafted functions:
## Functions for drawing heatmaps:
## pheatmap code, from https://www.biostars.org/p/223532/

## Scale each row of a matrix to mean 0 and standard deviation 1
scale_rows <- function(x){
  m <- apply(x, 1, mean, na.rm = TRUE)
  s <- apply(x, 1, sd, na.rm = TRUE)
  return((x - m) / s)
}

## Scale a matrix by row or by column, or leave it unchanged
scale_mat <- function(mat, scale){
  if(!(scale %in% c("none", "row", "column"))){
    stop("scale argument should take values: 'none', 'row' or 'column'")
  }
  mat <- switch(scale, none = mat,
                row = scale_rows(mat),
                column = t(scale_rows(t(mat))))
  return(mat)
}
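Such a file is made available in a session with source(). The sketch below writes a tiny function file to a temporary location, as a stand-in for a real file such as "my_functions.R" (a hypothetical name), and then sources it. The function center_rows() is likewise made up for the example.

```r
# Create a small function file (stand-in for a hand-maintained file)
fun_file <- tempfile(fileext = ".R")
writeLines(c(
  "## Subtract the row mean from every row of a matrix",
  "center_rows <- function(x) {",
  "  x - rowMeans(x, na.rm = TRUE)",
  "}"
), fun_file)

# Load all functions defined in the file into the current session
source(fun_file)

m <- matrix(1:6, nrow = 2)
centered <- center_rows(m)
rowMeans(centered)
```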
Source package: it is simply a folder with strictly defined content in plain text.
It contains a NAMESPACE file, which describes how the package depends on other packages (what it imports from them).
The post-installation structure is operating-system dependent and no longer plain text.
The function package_dependencies() from the tools package (part of the standard R distribution) queries information on the CRAN repository. Therefore we need to set a valid CRAN mirror, if we have not already done so:
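One way to set the mirror for the current session is via options(). The URL below is only one possible choice of mirror.

```r
# Set the CRAN mirror for this session (assumed mirror, pick any valid one)
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Check the current setting
getOption("repos")
```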
List the strong dependencies of the plotting package ggplot2, i.e. those packages which need to be installed for ggplot2 to work:
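A possible call is sketched below. It needs internet access, and the mirror URL is an assumption. The argument which = "strong" covers the dependency types Depends, Imports and LinkingTo.

```r
# Download the package index of the chosen CRAN mirror (requires internet)
cran_db <- available.packages(repos = "https://cloud.r-project.org")

# Direct strong dependencies of ggplot2
deps <- tools::package_dependencies("ggplot2", db = cran_db, which = "strong")
deps$ggplot2

# With recursive = TRUE, the dependencies of the dependencies are included
deps_all <- tools::package_dependencies("ggplot2", db = cran_db,
                                        which = "strong", recursive = TRUE)
length(deps_all$ggplot2)
```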
While repositories are package sources, a destination folder for packages on our local system is called a library.
There may be more than a single library folder. The function .libPaths() can be used to list them:
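A short sketch; the folders listed differ from system to system.

```r
# All library folders known to this session, in search order
lib_dirs <- .libPaths()
lib_dirs

# The first entry is the default destination for newly installed packages
lib_dirs[1]
```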
Bioi is a small and barebones package related to bioimaging. Let’s install it from CRAN:
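A sketch of such a call with both parameters spelled out. It needs internet access, and the mirror URL is an assumption.

```r
# Install Bioi into the first library folder from the given CRAN mirror
install.packages("Bioi",
                 lib = .libPaths()[1],
                 repos = "https://cloud.r-project.org")
```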
Note that parameters "lib" and "repos" are superfluous in this case, because "lib" defaults to .libPaths()[1] (the user library) and "repos" defaults to the mirror set in getOption("repos").
First remove it again …
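Removal could look like the sketch below. The guard is an addition for safety, since remove.packages() fails for a package that is not installed.

```r
# Remove Bioi from the default library, but only if it is installed
# (if it sits in another library folder, pass that folder via `lib`)
if ("Bioi" %in% rownames(installed.packages())) {
  remove.packages("Bioi")
}

# Check whether it is still listed
"Bioi" %in% rownames(installed.packages())
```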
Restart the R session so that RStudio's Packages pane registers the change!
Then do it the RStudio-compliant way:
Keep an eye on what happens in the console:
It uses a different CRAN mirror from the one we had set above!
This happens because R had been restarted in between.
If you want a permanent value for an option, set it in your .Rprofile file!
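A minimal sketch of such an entry, assuming the CRAN mirror is the option to be made permanent. The mirror URL is only one possible choice. The lines would go into the .Rprofile file, which R normally runs at the start of each session.

```r
# Possible content of .Rprofile: keep other repositories, set only CRAN
local({
  repos <- getOption("repos")
  repos["CRAN"] <- "https://cloud.r-project.org"  # assumed mirror
  options(repos = repos)
})
```

Wrapping the code in local() keeps the helper variable repos out of the global workspace.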
Checking an entry in the Packages list invokes the library() function:
library(PKG) without further arguments scans .libPaths().
For the first entry found with name “PKG”:
- the package is attached to the search path (code can refer to PKG’s functions by name, without specifying the package)
- its namespace is loaded
Functions which would be exposed after loading a package via library() may also be accessed via the package name: PKG::my_func(). PKG is then loaded via its namespace, but not attached.
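The difference between the two states can be inspected in a session. The sketch below uses the tools package, which ships with R; the file name is made up.

```r
# Access a function via the package name, without calling library()
ext <- tools::file_ext("analysis_script.R")
ext

# The namespace of tools is loaded now
isNamespaceLoaded("tools")

# Whether the package is attached shows on the search path
"package:tools" %in% search()

# library() additionally attaches the package ...
library(tools)
"package:tools" %in% search()

# ... so its functions can be called without the prefix
file_ext("analysis_script.R")
```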
"ecosystem" of interconnected packages
(most of them in R), with "sub-ecosystems" and its own repository and installer.
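For Bioconductor, the installer is the BiocManager package, which itself comes from CRAN. A sketch, assuming internet access; the package name "Biostrings" is only an example and is not taken from the slides.

```r
# Install the installer from CRAN, if it is not there yet
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Install a package from the Bioconductor repository
# ("Biostrings" is an arbitrary example)
BiocManager::install("Biostrings")
```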
Bioconductor’s guiding idea:
allow chaining of function input and output into flexible analysis workflows, even across packages
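The idea of chaining can be illustrated in plain R, without any Bioconductor package: the output of one function becomes the input of the next, here with the native pipe |> (R 4.1 or later) and functions from two different packages. The numbers are arbitrary.

```r
values <- c(4, 9, 16)

# Output of sqrt() (package base) is passed on to median() (package stats)
result <- values |>
  sqrt() |>
  stats::median()

result
```

Chaining works whenever the output type of one step matches the input type of the next step.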
Here are two Bioconductor URLs in plain text, for easier copying:
While I find the official site a bit hard to navigate, the Carpentries Incubator site is a really nice introduction.
A nice set of interdependent statistics-related packages is easystats (https://easystats.github.io/easystats/), with a mission “to provide a unifying and consistent framework to tame, discipline, and harness the scary R statistics and their pesky models”.
And then there is the Tidyverse, a large and growing ecosystem of cross-compatible packages, with a focus on workflows on 2D tables and graphical objects: