Friday, November 20, 2015


Integrated Development Environments

Integrated Development Environment
  • RStudio - A powerful and productive user interface for R. Works great on Windows, Mac, and Linux.
  • Emacs + ESS - Emacs Speaks Statistics is an add-on package for emacs text editors.
  • Sublime Text + R-Box - Add-on package for Sublime Text 2/3.
  • TextMate + r.tmbundle - Add-on package for TextMate 1/2.
  • StatET - An Eclipse based IDE for R.
  • Revolution R Enterprise - Offered free to academic users; the commercial software focuses on big data and large-scale multiprocessor functionality.
  • R Commander - A package that provides a basic graphical user interface.
  • IPython - An interactive Python interpreter that supports executing R code while capturing both output and figures.
  • Deducer - A menu-driven data analysis GUI with a spreadsheet-like data editor.
  • Radiant - A platform-independent, browser-based interface for business analytics in R, based on Shiny.
  • Vim-R - Vim plugin for R.
  • JASP - A complete package for both Bayesian and frequentist methods that is familiar to users of SPSS.
  • Bio7 - An IDE containing tools for model creation, scientific image analysis, and statistical analysis for ecological modelling.
Minimal functional packages:


Packages that change the way you use R.
  • magrittr - Let's pipe it.
  • pipeR - Multi-paradigm Pipeline Implementation.
  • lambda.r - Functional programming and simple pattern matching in R.
  • purrr - An FP package for R in the spirit of underscore.js.

Data Manipulation

Packages for cooking data.
  • dplyr - Fast data frame manipulation and database queries.
  • data.table - Fast data manipulation in a short and flexible syntax.
  • reshape2 - Flexibly rearrange, reshape, and aggregate data.
  • readr - A fast and friendly way to read tabular data into R.
  • haven - Improved methods to import SPSS, Stata and SAS files in R.
  • tidyr - Easily tidy data with spread and gather functions.
  • broom - Convert statistical analysis objects into tidy data frames.
  • rlist - A toolbox for non-tabular data manipulation with lists.
  • jsonlite - A robust and quick way to parse JSON files in R.
  • ff - Data structures designed to store large datasets.
  • lubridate - A set of functions to work with dates and times.
  • stringi - ICU based string processing package.
  • stringr - Consistent API for string processing, built on top of stringi.

Graphic Displays

Packages for showing data.
  • ggplot2 - An implementation of the Grammar of Graphics.
  • ggfortify - A unified interface for plotting the results of popular statistical packages with ggplot2, using one line of code.
  • lattice - A powerful and elegant high-level data visualization system.
  • rgl - 3D visualization device system for R.
  • Cairo - R graphics device using cairo graphics library for creating high-quality display output.
  • extrafont - Tools for using fonts in R graphics.
  • showtext - Enable R graphics device to show text using system fonts.
  • animation - A simple way to produce animated graphics in R, using ImageMagick.
  • misc3d - Powerful functions to deal with 3d plots, isosurfaces, etc.
  • xkcd - Use xkcd style in graphs.
  • imager - An image processing package based on CImg library to work with images and display them.

HTML Widgets

Packages for interactive visualizations.
  • d3heatmap - Interactive heatmaps with D3.
  • DataTables - Displays R matrices or data frames as interactive HTML tables.
  • DiagrammeR - Create JS graph diagrams and flowcharts in R.
  • dygraphs - Charting time-series data in R.
  • formattable - Formattable Data Structures.
  • ggvis - Interactive grammar of graphics for R.
  • Leaflet - One of the most popular JavaScript libraries for interactive maps.
  • MetricsGraphics - Enables easy creation of D3 scatterplots, line charts, and histograms.
  • networkD3 - D3 JavaScript Network Graphs from R.
  • scatterD3 - Interactive scatterplots with D3.
  • plotly - Interactive ggplot2 and Shiny plotting.
  • rCharts - Interactive JS Charts from R.
  • rbokeh - R Interface to Bokeh.
  • threejs - Interactive 3D scatter plots and globes.

Reproducible Research

Packages for literate programming.
  • knitr - Easy dynamic report generation in R.
  • xtable - Export tables to LaTeX or HTML.
  • rapport - An R templating system.
  • rmarkdown - Dynamic documents for R.
  • slidify - Generate reproducible html5 slides from R markdown.
  • Sweave - A package designed to write LaTeX reports using R.
  • texreg - Formatting statistical models in LaTeX and HTML.
  • checkpoint - Install packages from snapshots on the checkpoint server.
  • brew - Pre-compute data to enhance your report templates. Can be combined with knitr.
  • ReporteRs - An R package to generate Microsoft Word, Microsoft PowerPoint and HTML reports.

Web Technologies and Services

Packages to surf the web.
  • shiny - Easy interactive web applications with R.
  • RCurl - General network (HTTP/FTP/...) client interface for R.
  • httr - User-friendly RCurl wrapper.
  • httpuv - HTTP and WebSocket server library.
  • XML - Tools for parsing and generating XML within R.
  • rvest - Simple web scraping for R, using CSS selector or XPath syntax.
  • OpenCPU - HTTP API for R.
  • Rfacebook - Access to Facebook API via R.
  • twitteR - Access to Twitter API via R.
  • Rlinkedin - Access to LinkedIn API via R.

Parallel Computing

Packages for parallel computing.
  • parallel - Included with R since release 2.14.0; incorporates (slightly revised) copies of the multicore and snow packages.
  • Rmpi - Provides an interface (wrapper) to MPI APIs, along with an interactive R slave environment.
  • foreach - Execute loops in parallel.
  • SparkR - R frontend for Spark.
  • DistributedR - A scalable high-performance platform from HP Vertica Analytics Team.
  • ddR - Provides distributed data structures and simplifies distributed computing in R.

High Performance

Packages for making R faster.
  • Rcpp - Provides a powerful C++ API on top of R, making R functions much faster.
  • Rcpp11 - A complete redesign of Rcpp, targeting C++11.
  • compiler - Speed up your R code using the JIT.

Language API

Packages for other languages.
  • rJava - Low-level R to Java interface.
  • jvmr - Integration of R, Java, and Scala.
  • rJython - R interface to Python via Jython.
  • rPython - Package allowing R to call Python.
  • runr - Run Julia and Bash from R.
  • RJulia - R package to call Julia.
  • RinRuby - A Ruby library that integrates the R interpreter in Ruby.
  • R.matlab - Reading and writing of MAT files, together with R-to-MATLAB connectivity.
  • RcppOctave - Seamless Interface to Octave and Matlab.
  • RSPerl - A bidirectional interface for calling R from Perl and Perl from R.
  • V8 - Embedded JavaScript Engine.
  • htmlwidgets - Bring the best of JavaScript data visualization to R.
  • rpy2 - Python interface for R.

Database Management

Packages for managing data.
  • RODBC - ODBC database access for R.
  • DBI - Defines a common interface between R and database management systems.
  • elastic - Wrapper for the Elasticsearch HTTP API
  • mongolite - Streaming Mongo Client for R
  • RMySQL - R interface to the MySQL database.
  • ROracle - OCI based Oracle database interface for R.
  • RPostgreSQL - R interface to the PostgreSQL database system.
  • RSQLite - SQLite interface for R
  • RJDBC - Provides access to databases through the JDBC interface.
  • rmongodb - R driver for MongoDB.
  • rredis - Redis client for R.
  • RCassandra - Direct interface (not Java) to the most basic functionality of Apache Cassandra.
  • RHive - R extension facilitating distributed computing via Apache Hive.
  • RNeo4j - Neo4j graph database driver.

Machine Learning

Packages for making R cleverer.
  • AnomalyDetection - AnomalyDetection R package from Twitter.
  • ahaz - Regularization for semiparametric additive hazards regression.
  • arules - Mining Association Rules and Frequent Itemsets
  • bigrf - Big Random Forests: Classification and Regression Forests for Large Data Sets
  • bigRR - Generalized Ridge Regression (with special advantage for p >> n cases)
  • bmrm - Bundle Methods for Regularized Risk Minimization Package
  • Boruta - A wrapper algorithm for all-relevant feature selection
  • BreakoutDetection - Breakout Detection via Robust E-Statistics from Twitter.
  • bst - Gradient Boosting
  • CausalImpact - Causal inference using Bayesian structural time-series models.
  • C50 - C5.0 Decision Trees and Rule-Based Models
  • caret - Classification and Regression Training
  • Clever Algorithms For Machine Learning
  • CORElearn - Classification, regression, feature evaluation and ordinal evaluation
  • CoxBoost - Cox models by likelihood based boosting for a single survival endpoint or competing risks
  • Cubist - Rule- and Instance-Based Regression Modeling
  • e1071 - Misc Functions of the Department of Statistics (e1071), TU Wien
  • earth - Multivariate Adaptive Regression Spline Models
  • elasticnet - Elastic-Net for Sparse Estimation and Sparse PCA
  • ElemStatLearn - Data sets, functions and examples from the book: "The Elements of Statistical Learning, Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani and Jerome Friedman
  • evtree - Evolutionary Learning of Globally Optimal Trees
  • FSelector - A feature selection framework, based on subset-search or feature ranking approaches.
  • frbs - Fuzzy Rule-based Systems for Classification and Regression Tasks
  • GAMBoost - Generalized linear and additive models by likelihood based boosting
  • gamboostLSS - Boosting Methods for GAMLSS
  • gbm - Generalized Boosted Regression Models
  • glmnet - Lasso and elastic-net regularized generalized linear models
  • glmpath - L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model
  • GMMBoost - Likelihood-based Boosting for Generalized mixed models
  • grplasso - Fitting user specified models with Group Lasso penalty
  • grpreg - Regularization paths for regression models with grouped covariates
  • h2o - Deep learning, Random forests, GBM, KMeans, PCA, GLM
  • hda - Heteroscedastic Discriminant Analysis
  • Introduction to Statistical Learning
  • ipred - Improved Predictors
  • kernlab - Kernel-based Machine Learning Lab
  • klaR - Classification and visualization
  • kohonen - Supervised and Unsupervised Self-Organising Maps.
  • lars - Least Angle Regression, Lasso and Forward Stagewise
  • lasso2 - L1 constrained estimation aka ‘lasso’
  • LiblineaR - Linear Predictive Models Based On The Liblinear C/C++ Library
  • LogicReg - Logic Regression
  • maptree - Mapping, pruning, and graphing tree models
  • mboost - Model-Based Boosting
  • Machine Learning For Hackers
  • mvpart - Multivariate partitioning
  • MXNet - MXNet brings flexible and efficient GPU computing and state-of-art deep learning to R.
  • ncvreg - Regularization paths for SCAD- and MCP-penalized regression models
  • nnet - Feed-forward Neural Networks and Multinomial Log-Linear Models
  • oblique.tree - Oblique Trees for Classification Data
  • pamr - Pam: prediction analysis for microarrays
  • party - A Laboratory for Recursive Partytioning
  • partykit - A Toolkit for Recursive Partytioning
  • penalized - L1 (lasso and fused lasso) and L2 (ridge) penalized estimation in GLMs and in the Cox model
  • penalizedLDA - Penalized classification using Fisher's linear discriminant
  • penalizedSVM - Feature Selection SVM using penalty functions
  • quantregForest - Quantile Regression Forests
  • randomForest - Breiman and Cutler's random forests for classification and regression.
  • randomForestSRC - Random Forests for Survival, Regression and Classification (RF-SRC).
  • rattle - Graphical user interface for data mining in R.
  • rda - Shrunken Centroids Regularized Discriminant Analysis
  • rdetools - Relevant Dimension Estimation (RDE) in Feature Spaces
  • REEMtree - Regression Trees with Random Effects for Longitudinal (Panel) Data
  • relaxo - Relaxed Lasso
  • rgenoud - R version of GENetic Optimization Using Derivatives
  • rgp - R genetic programming framework
  • Rmalschains - Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R
  • rminer - Simpler use of data mining methods (e.g. NN and SVM) in classification and regression
  • ROCR - Visualizing the performance of scoring classifiers
  • RoughSets - Data Analysis Using Rough Set and Fuzzy Rough Set Theories
  • rpart - Recursive Partitioning and Regression Trees
  • RPMM - Recursively Partitioned Mixture Model
  • RSNNS - Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS)
  • RWeka - R/Weka interface
  • RXshrink - RXshrink: Maximum Likelihood Shrinkage via Generalized Ridge or Least Angle Regression
  • sda - Shrinkage Discriminant Analysis and CAT Score Variable Selection
  • SDDA - Stepwise Diagonal Discriminant Analysis
  • SuperLearner and subsemble - Multi-algorithm ensemble learning packages.
  • svmpath - svmpath: the SVM Path algorithm
  • tgp - Bayesian treed Gaussian process models
  • tree - Classification and regression trees
  • varSelRF - Variable selection using random forests
  • xgboost - eXtreme Gradient Boosting Tree model, well known for its speed and performance.

Natural Language Processing

Packages for Natural Language Processing.
  • tm - A comprehensive text mining framework for R.
  • openNLP - Apache OpenNLP Tools Interface.
  • koRpus - An R Package for Text Analysis.
  • zipfR - Statistical models for word frequency distributions.
  • tmcn - A text mining toolkit for international characters, especially Chinese.
  • Rwordseg - Chinese word segmentation.
  • NLP - Basic functions for Natural Language Processing.
  • LDAvis - Interactive visualization of topic models.
  • topicmodels - Topic modeling interface to the C code developed by David M. Blei for Latent Dirichlet Allocation (LDA) and Correlated Topic Models (CTM).
  • syuzhet - Extracts sentiment from text using three different sentiment dictionaries.


Bayesian

Packages for Bayesian Inference.
  • coda - Output analysis and diagnostics for MCMC.
  • mcmc - Markov Chain Monte Carlo.
  • MCMCpack - Markov chain Monte Carlo (MCMC) Package.
  • R2WinBUGS - Running WinBUGS and OpenBUGS from R / S-PLUS.
  • BRugs - R interface to the OpenBUGS MCMC software.
  • rjags - R interface to the JAGS MCMC library.
  • rstan - R interface to the Stan MCMC software.


Finance

Packages for dealing with money.
  • quantmod - Quantitative Financial Modelling & Trading Framework for R.
  • TTR - Functions and data to construct technical trading rules with R.
  • PerformanceAnalytics - Econometric tools for performance and risk analysis.
  • zoo - S3 Infrastructure for Regular and Irregular Time Series.
  • xts - eXtensible Time Series.
  • tseries - Time series analysis and computational finance.
  • fAssets - Analysing and Modelling Financial Assets.


Bioinformatics

Packages for processing biological datasets.
  • Bioconductor - Tools for the analysis and comprehension of high-throughput genomic data.
  • genetics - Classes and methods for handling genetic data.
  • gap - An integrated package for genetic data analysis of both population and family data.
  • ape - Analyses of Phylogenetics and Evolution.
  • pheatmap - Pretty heatmaps made easy.

Network Analysis

Packages to construct, analyze and visualize network data.
  • igraph - A collection of network analysis tools.
  • network - Basic tools to manipulate relational data in R.
  • sna - Basic network measures and visualization tools.
  • networkDynamic - Support for dynamic, (inter)temporal networks.
  • ndtv - Tools to construct animated visualizations of dynamic network data in various formats.
  • statnet - The project behind many R network analysis packages.
  • ergm - Exponential random graph models in R.
  • latentnet - Latent position and cluster models for network objects.
  • tnet - Network measures for weighted, two-mode and longitudinal networks.
  • rgexf - Export network objects from R to GEXF, for manipulation with network software like Gephi or Sigma.

R Development

Packages for packages.
  • devtools - Tools to make an R developer's life easier.
  • testthat - An R package to make testing fun.
  • R6 - A simpler, faster, lighter-weight alternative to R's built-in classes.
  • pryr - Make it easier to understand what's going on in R.
  • roxygen - Describe your functions in comments next to their definitions.
  • lineprof - Visualise line profiling results in R (see also profvis, installable with install.packages("profvis")).
  • packrat - Make your R projects more isolated, portable, and reproducible.
  • installr - Functions for installing software from within R (for Windows).
  • import - An import mechanism for R.
  • Rocker - R configurations for Docker.
  • drat - Creation and use of R repositories on GitHub or other repos.
  • covr - Test coverage for your R package and (optionally) upload the results to coveralls or codecov.
  • lintr - Static code analysis for R to enforce code style.


Logging

Packages for logging.
  • futile.logger - A logging package in R similar to log4j
  • log4r - A log4j derivative for R
  • logging - A logging package emulating the python logging package.

Other Interpreters

Alternative R engines.
  • renjin - A JVM-based interpreter for R.
  • pqR - A "pretty quick" implementation of R.
  • fastR - FastR is an implementation of the R Language in Java atop Truffle and Graal.
  • riposte - A fast interpreter and JIT for R.
  • TERR - TIBCO Enterprise Runtime for R.
  • RRO - Revolution R Open.
  • CXXR - Refactorising R into C++.

Learning R

Packages for Learning R.
  • swirl - An interactive R tutorial directly in your R console.


Resources

Where to discover new R-esources.


Websites

  • R-project - The R Project for Statistical Computing.
  • R Bloggers - There are people scattered across the Web who blog about R. This is simply an aggregator of many of those feeds.
  • DataCamp - Learn R data analytics online.
  • Quick-R - An excellent quick reference.
  • Advanced R - An in-progress book site for Advanced R.
  • CRAN Task Views - Task Views for CRAN packages.
  • The R Programming Wikibook - A collaborative handbook for R.
  • R-users - A job board for R users (and the people who are looking to hire them)
  • R Cookbook - A problem-oriented website that supports the R Graphics Cookbook.
  • tryR - A quick course for getting started with R.


Books

  • The Art of R Programming - A good resource for systematically learning fundamentals such as types of objects, control statements, variable scope, classes and debugging in R.
  • Free Books - CRAN Contributed Documentation in many languages.
  • R Cookbook - A quick and simple introduction to conducting many common statistical tasks with R.
  • Books written as part of the Johns Hopkins Data Science Specialization:
  • R Packages - A book (in paper and website formats) on writing R packages.
  • R in Action - This book is aimed at users of all levels, with sections for beginning, intermediate and advanced R ranging from "Exploring R data structures" to running regressions and conducting factor analyses.
  • Use R! - A series of inexpensive, focused, shorter books from Springer aimed at practitioners. Books can discuss the use of R in a particular subject area, such as Bayesian networks, ggplot2 and Rcpp.
  • R for SAS and SPSS users - An excellent resource for users already familiar with SAS or SPSS.
  • An Introduction to R - A very good introductory text on R, also covers some advanced topics.

Reference Cards


Massive open online courses.
