Aviral Goel

I am a Computer Science Ph.D. student at Northeastern University. I am a member of the Programming Research Laboratory, advised by Professor Jan Vitek. I received a Bachelor's degree in Electronics and Communication Engineering from Netaji Subhas Institute of Technology.

Previously, I interned at Oracle Labs, Austria, and worked at the National Center for Biological Sciences and Yahoo!.

Research

Broadly, my research follows two directions. First, I am interested in applying programming language techniques to improve data science tools. Second, I am interested in developing mechanisms for data-driven evolution of mainstream programming languages. To explore these directions, I employ empirical evaluations together with dynamic and static analyses.

I am working primarily on the migration of R from lazy-by-default to lazy-on-demand semantics. The interaction of R's laziness with missing arguments, side effects, metaprogramming, and reflective operations introduces challenges for users, developers, and language implementors. I am developing tools and algorithms to switch R's semantics with minimal impact on legacy code. To test the correctness and robustness of my approach, I am leveraging millions of lines of code available from R's official package repositories, CRAN and Bioconductor.
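
As a small illustration of the interactions mentioned above, the following sketch uses only base R to show how an argument stays unevaluated until it is first used. The names `events`, `note`, `first`, `maybe`, and `quoted` are made up for this example and do not come from any of the tools listed on this page.

```r
# Record when an argument expression is actually evaluated.
events <- character(0)
note <- function(msg, value) {
  events <<- c(events, msg)  # side effect: append to the event log
  value
}

# `y` is never used, so its promise is never forced.
first <- function(x, y) x
r1 <- first(note("x evaluated", 1), note("y evaluated", 2))

# A missing argument is harmless as long as the function never touches it.
maybe <- function(x, y) if (missing(y)) x else x + y
r2 <- maybe(10)

# substitute() inspects the unevaluated expression without forcing it.
quoted <- function(x) deparse(substitute(x))
r3 <- quoted(note("never evaluated", 3))

print(events)  # only "x evaluated" is recorded
```

Only the side effect attached to `x` is ever observed, which is why evaluating every argument eagerly could change the behavior of existing code.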

I am also involved in other projects related to R. For example, I am helping retrofit a type system onto R, motivated by empirical insights from existing R code.

Publications

OOPSLA'20
Designing Types for R, Empirically
Alexi Turcotte, Aviral Goel, Filip Křikava, Jan Vitek

OOPSLA'19
On the Design, Implementation, and Use of Laziness in R
Aviral Goel, Jan Vitek

POPL'18
Correctness of Speculative Optimizations with Dynamic Deoptimization
Olivier Flückiger, Gabriel Scherer, Ming-ho Yee, Aviral Goel, Amal Ahmed, Jan Vitek

Software

Frameworks

R-dyntrace
A modified R VM that provides an event framework for program execution. This forms the basis of all our dynamic analyses and empirical evaluations of R. It supports over forty events related to function calls, side effects, garbage collection, attributes, promises, environments, eval, metaprogramming, and dynamic dispatch.

instrumentr
An R package that complements R-dyntrace by modeling all R objects and the R session for tracing and empirical evaluations. instrumentr is the culmination of years of experience writing tracers and provides a rich API to facilitate many complex analyses used in our research papers. It accumulates extensive metadata about R objects and provides wrapper APIs to extract fine-grained information used in dynamic analyses for object tainting and event tracking.

experimentr
An R package for large-scale program analysis experiments related to R. First, it provides utility functions for ranking R packages based on dependencies, extracting runnable programs, counting lines of code, and other routine tasks. Second, it exposes a DSL to express complex analysis pipelines as a graph of step abstractions, with a facility to invoke the steps from the command line or a web-app-based interface.

dockr
A Docker image providing a comprehensive environment for R and C/C++ development, large-scale program analysis experiments, and setting up a corpus of R programs. It provides standard development tools such as emacs, gdb, valgrind, and perf; includes all the dependencies needed to build our in-house dynamic and static analysis software; and contains enough native dependencies to install most of the CRAN and Bioconductor repositories (~20,000 R packages with ~450,000 programs).

Applications

envtracer
A dynamic analyzer for tracking first-class environments and reflective frame access in R. We are using this to understand how R developers take advantage of the first-class nature of environments and function scopes.

strictr
An R package that alters R semantics by eagerly evaluating function arguments based on strictness specifications. We are using this to study the impact of switching R's semantics from lazy-by-default to lazy-on-demand.
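
To make the idea of eager argument evaluation concrete, here is a minimal base R sketch. It is not strictr's API: the hand-written `force()` calls merely stand in for what a strictness specification would request, and `events`, `note`, `lazy_first`, and `strict_first` are hypothetical names used only for illustration.

```r
# Record when an argument expression is actually evaluated.
events <- character(0)
note <- function(msg, value) {
  events <<- c(events, msg)
  value
}

# Default R behavior: `y` is never used, so it is never evaluated.
lazy_first <- function(x, y) x

# Strict variant: both arguments are forced on entry, in order.
strict_first <- function(x, y) {
  force(x)
  force(y)
  x
}

events <- character(0)
lazy_result <- lazy_first(note("x", 1), note("y", 2))
lazy_events <- events      # only "x" was evaluated

events <- character(0)
strict_result <- strict_first(note("x", 1), note("y", 2))
strict_events <- events    # "x" and then "y" were evaluated

# An erroring argument is silently ignored under lazy evaluation
# but surfaces once the argument is forced.
lazy_ok <- lazy_first(1, stop("never raised"))
strict_outcome <- tryCatch(
  strict_first(1, stop("raised under strict evaluation")),
  error = function(e) "error"
)
```

Both variants return the same value for well-behaved arguments; they differ only in which side effects and errors become observable, which is the kind of impact a semantics switch has to account for.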

lazr
A dynamic analyzer for profiling laziness in R applications and synthesizing strictness signatures. We are using this in conjunction with strictr to propose a semi-automated laziness removal technique.

evil
A dynamic analyzer for studying the use of the eval family of functions in R. We are using this to better understand how dynamic evaluation is employed by R package authors and how their usage patterns differ from those of JavaScript developers.

contractr
An R package that inserts function argument and return type contracts and monitors failures. We used this to evaluate the design of type signatures for R for 8.7K R packages with 98M assertions.
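
The general idea of argument and return contracts can be sketched in a few lines of base R. This is an illustration under stated assumptions, not contractr's interface: `with_contract`, `checked_sqrt`, and the `failures` counter are hypothetical names, and the checks use the base predicate `is.numeric` rather than real type signatures.

```r
# Hypothetical helper: wrap a one-argument function with a check on its
# argument and a check on its return value.
with_contract <- function(f, arg_ok, ret_ok) {
  function(x) {
    if (!arg_ok(x)) stop("argument contract violated")
    result <- f(x)
    if (!ret_ok(result)) stop("return contract violated")
    result
  }
}

failures <- 0
checked_sqrt <- with_contract(sqrt, is.numeric, is.numeric)

# A call that satisfies both contracts passes through unchanged.
ok <- checked_sqrt(16)

# A call that violates the argument contract is caught and counted.
bad <- tryCatch(
  checked_sqrt("16"),
  error = function(e) {
    failures <<- failures + 1
    conditionMessage(e)
  }
)
```

Note that this wrapper evaluates its argument in order to check it, so a contract of this shape also changes when arguments are evaluated.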

tastr
A C++ library implementing a grammar of type signatures for R. It provides APIs to parse type signatures from input streams into C++ objects. This was used by contractr to generate type contracts for 22K R functions from 412 packages.

promisedyntracer
A dynamic analysis tool to analyze the use of laziness in R, specifically the evaluation of promises, side effects, metaprogramming, and argument evaluation order. It was used to study laziness in 230K R programs from 16,707 R packages. The tool generated 5.2 TB of execution traces from 271B promises (thunks).

Aviral Goel 2021