Papers that received an ACM SIGSOFT Distinguished Paper Award:
List of accepted papers of the main research track:
Concolic Testing for Models of State-Based Systems
Reza Ahmadi and
Juergen Dingel
(Queen’s University, Canada)
Testing models of modern cyber-physical systems is not straightforward
due to timing constraints, numerous if not infinite possible
behaviors, and complex communications between components.
Software testing tools and approaches that can generate test cases
to test these systems are therefore important. Many of the existing
automatic approaches support testing at the implementation level
only. The existing model-level testing tools either treat the model
as a black box (e.g., random testing approaches) or have limitations
when it comes to generating complex test sequences (e.g., symbolic
execution). This paper presents a novel approach and tool support
for automatic unit testing of models of real-time embedded systems
by conducting concolic testing, a hybrid testing technique based
on concrete and symbolic execution. Our technique conducts automatic
concolic testing in two phases. In the first phase, the model is
isolated from its environment, transformed into a testable model,
and integrated with a test harness. In the second phase, the harness
tests the model concolically and reports the test execution
results. We describe an implementation of our approach in the
context of Papyrus-RT, an open source Model Driven Engineering
(MDE) tool based on the modeling language UML-RT, and report
the results of applying our concolic testing approach to a set of
standard benchmark models to validate our approach.
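The concrete-plus-symbolic loop that concolic testing combines can be sketched in a few lines. This is an illustrative Python toy, not the paper's UML-RT tooling: it runs a program concretely, records the path condition, and negates branch conditions to steer later runs; the brute-force `solve` stands in for the SMT solver a real engine would use.

```python
# Illustrative sketch of a concolic-testing loop (all names are ours,
# not the paper's): concrete execution + recorded path condition,
# with branch negation to explore new paths.

def program(x, trace):
    # Program under test; `trace` records (predicate, outcome) pairs.
    b1 = x > 10
    trace.append(("x > 10", b1))
    if b1:
        b2 = x % 2 == 0
        trace.append(("x % 2 == 0", b2))
        if b2:
            return "even-big"
        return "odd-big"
    return "small"

def solve(constraints):
    # Toy constraint solver: brute-force over a small integer domain
    # (a real concolic engine would call an SMT solver such as Z3).
    preds = {"x > 10": lambda x: x > 10, "x % 2 == 0": lambda x: x % 2 == 0}
    for x in range(-50, 51):
        if all(preds[p](x) == want for p, want in constraints):
            return x
    return None

def concolic(seed):
    inputs, seen, outputs = [seed], set(), set()
    while inputs:
        x = inputs.pop()
        trace = []
        outputs.add(program(x, trace))
        # Negate each branch along the path to reach unexplored paths.
        for i in range(len(trace)):
            flipped = trace[:i] + [(trace[i][0], not trace[i][1])]
            key = tuple(flipped)
            if key not in seen:
                seen.add(key)
                x2 = solve(flipped)
                if x2 is not None:
                    inputs.append(x2)
    return outputs

print(sorted(concolic(0)))  # all three program paths reached from one seed
```

Starting from the single concrete input 0, the loop discovers inputs for all three paths without enumerating the input domain blindly.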
Artifacts Reusable
Target-Driven Compositional Concolic Testing with Function Summary Refinement for Effective Bug Detection
Yunho Kim, Shin Hong, and Moonzoo Kim
(KAIST, South Korea; Handong Global University, South Korea)
Concolic testing is popular in unit testing because it can detect bugs quickly in a relatively small search space. However, in system-level testing, it suffers from symbolic path explosion and often misses bugs. To resolve this problem, we have developed a focused compositional concolic testing technique, FOCAL, for effective bug detection. Focusing on a target unit failure v (a crash or an assert violation) detected by concolic unit testing, FOCAL generates a system-level test input that validates v. This test input is obtained by building and solving symbolic path formulas that represent system-level executions raising v. FOCAL builds such formulas by combining function summaries one by one backward from the function that raised v to main. If a function summary φa of function a conflicts with the summaries of the other functions, FOCAL refines φa to φa′ by applying a refining constraint learned from the conflict. FOCAL showed high system-level bug detection ability by detecting 71 out of the 100 real-world target bugs in the SIR benchmark, while other relevant cutting-edge techniques (i.e., AFL-fast, KATCH, Mix-CCBSE) detected at most 40 bugs. FOCAL also detected 13 new crash bugs in popular file parsing programs.
Generating Automated and Online Test Oracles for Simulink Models with Continuous and Uncertain Behaviors
Claudio Menghi, Shiva Nejati, Khouloud Gaaloul, and
Lionel C. Briand
(University of Luxembourg, Luxembourg)
Test automation requires automated oracles to assess test outputs. For cyber-physical systems (CPS), oracles, in addition to being automated, should meet several key objectives: (i) they should check test outputs in an online manner, to stop expensive test executions as soon as a failure is detected; (ii) they should handle time- and magnitude-continuous CPS behaviors; (iii) they should provide a quantitative degree of satisfaction or failure measure instead of binary pass/fail outputs; and (iv) they should be able to handle uncertainties due to CPS interactions with the environment. We propose an automated approach to translate CPS requirements specified in a logic-based language into test oracles specified in Simulink, a widely used development and simulation language for CPS. Our approach achieves the objectives noted above through the identification of a fragment of Signal First Order Logic (SFOL) to specify requirements, the definition of a quantitative semantics for this fragment, and a sound translation of the fragment into Simulink. The results from applying our approach to 11 industrial case studies show that: (i) our requirements language can express all 98 requirements of our case studies; (ii) the time and effort required by our approach are acceptable, showing potential for the adoption of our work in practice; and (iii) for large models, our approach can dramatically reduce test execution time compared to checking test outputs in an offline manner.
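The flavor of objectives (i) and (iii), an online oracle that returns a degree of satisfaction rather than pass/fail, can be sketched for a single bound property. The semantics below (margin of "always(signal <= threshold)" over a sampled signal) is a simplified stand-in for the paper's SFOL fragment, and all names are illustrative.

```python
# Sketch of a quantitative, online oracle: compute a satisfaction
# margin for "always(signal <= threshold)" and abort the run as soon
# as the margin turns negative (a failure is detected).

def online_oracle(signal_samples, threshold):
    robustness = float("inf")
    for t, value in enumerate(signal_samples):
        margin = threshold - value          # > 0: satisfied with slack
        robustness = min(robustness, margin)
        if robustness < 0:                  # failure detected online:
            return robustness, t            # stop the expensive execution
    return robustness, len(signal_samples) - 1

# A simulated temperature trace that violates the 25.0 bound at step 3.
trace = [20.0, 21.5, 22.0, 25.5, 23.0]
degree, stopped_at = online_oracle(trace, threshold=25.0)
print(degree, stopped_at)  # -0.5 3: failed by 0.5 units, stopped at step 3
```

A binary oracle would only report "fail" after the full run; the margin quantifies how badly the bound was violated and where.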
Artifacts Available
Artifacts Reusable
Lifting Datalog-Based Analyses to Software Product Lines
Ramy Shahin, Marsha Chechik, and Rick Salay
(University of Toronto, Canada)
Applying program analyses to Software Product Lines (SPLs) has been a fundamental research problem at the intersection of Product Line Engineering and software analysis. Different attempts have been made to “lift” particular product-level analyses to run on the entire product line. In this paper, we tackle the class of Datalog-based analyses (e.g., pointer and taint analyses), study the theoretical aspects of lifting Datalog inference, and implement a lifted inference algorithm inside the Soufflé Datalog engine. We evaluate our implementation on a set of benchmark product lines. We show significant savings in processing time and fact database size (billions of times faster on one of the benchmarks) compared to brute-force analysis of each product individually.
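The core idea of lifting can be illustrated on the classic transitive-closure rule: each fact carries a presence condition (here simplified to the set of products it holds in), and a derived fact's condition is the conjunction, i.e. intersection, of its premises'. The paper works inside Soufflé; this standalone fixpoint is only a sketch of the idea.

```python
# Toy "lifted" Datalog inference: reach(a, d) :- reach(a, b), reach(b, d),
# where every fact is annotated with the set of products it exists in.

def lifted_transitive_closure(edges):
    # edges: dict (a, b) -> set of products in which the edge exists.
    reach = dict(edges)
    changed = True
    while changed:                       # naive fixpoint iteration
        changed = False
        for (a, b), pc1 in list(reach.items()):
            for (c, d), pc2 in list(reach.items()):
                if b == c:
                    pc = pc1 & pc2       # lifted conjunction of conditions
                    if pc and pc - reach.get((a, d), set()):
                        reach[(a, d)] = reach.get((a, d), set()) | pc
                        changed = True
    return reach

edges = {
    ("x", "y"): {"P1", "P2"},
    ("y", "z"): {"P2"},          # this edge only exists in product P2
}
closure = lifted_transitive_closure(edges)
print(closure[("x", "z")])  # {'P2'}: reachability holds only in P2
```

One lifted run replaces two product-level runs here; the savings grow combinatorially with the number of products, which is what the brute-force comparison in the abstract measures.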
An Empirical Study of Real-World Variability Bugs Detected by Variability-Oblivious Tools
Austin Mordahl, Jeho Oh, Ugur Koc, Shiyi Wei, and Paul Gazzillo
(University of Texas at Dallas, USA; University of Texas at Austin, USA; University of Maryland, USA; University of Central Florida, USA)
Many critical software systems developed in C utilize compile-time configurability. The many possible configurations of this software make bug detection through static analysis difficult. While variability-aware static analyses have been developed, there remains a gap between those and state-of-the-art static bug detection tools. In order to collect data on how such tools may perform and to develop real-world benchmarks, we present a way to leverage configuration sampling, off-the-shelf “variability-oblivious” bug detectors, and automatic feature identification techniques to simulate a variability-aware analysis. We instantiate our approach using four popular static analysis tools on three highly configurable, real-world C projects, obtaining 36,061 warnings, 80% of which are variability warnings. Analyzing the warnings collected from these experiments, we find that they cover a variety of kinds, such as NULL dereferences. We then manually investigate these warnings to produce a benchmark of 77 confirmed true bugs (52 of which are variability bugs) useful for future development of variability-aware analyses.
Artifacts Available
Artifacts Reusable
Principles of Feature Modeling
Damir Nešić, Jacob Krüger,
Ștefan Stănciulescu, and Thorsten Berger
(KTH, Sweden; University of Magdeburg, Germany; ABB, Switzerland; Chalmers University of Technology, Sweden; University of Gothenburg, Sweden)
Feature models are arguably one of the most intuitive and successful notations for modeling the features of a variant-rich software system. Feature models help developers to keep an overall understanding of the system, and also support scoping, planning, development, variant derivation, configuration, and maintenance activities that sustain the system’s long-term success.
Unfortunately, feature models are difficult to build and evolve. Features need to be identified, grouped, organized in a hierarchy, and mapped to software assets. Also, dependencies between features need to be declared. While feature models have been the subject of three decades of research, resulting in many feature-modeling notations together with automated analysis and configuration techniques, a generic set of principles for engineering feature models is still missing. It is not even clear whether feature models could be engineered using recurrent principles. Our work shows that such principles in fact exist. We analyzed feature-modeling practices elicited from ten interviews conducted with industrial practitioners and from 31 relevant papers. We synthesized a set of 34 principles covering eight different phases of feature modeling, from planning, through model construction, to model maintenance and evolution. Grounded in empirical evidence, these principles provide practical, context-specific advice on how to perform feature modeling, describe what information sources to consider, and highlight common characteristics of feature models. We believe that our principles can support researchers and practitioners in enhancing feature-modeling tooling, synthesis, and analysis techniques, as well as in scoping future research.
Understanding GCC Builtins to Develop Better Tools
Manuel Rigger, Stefan Marr,
Bram Adams, and Hanspeter Mössenböck
(JKU Linz, Austria; University of Kent, UK; Polytechnique Montréal, Canada)
C programs can use compiler builtins to provide functionality that the C language lacks. On Linux, GCC provides several thousand builtins that are also supported by other mature compilers, such as Clang and ICC. Maintainers of other tools lack guidance on whether and which builtins should be implemented to support popular projects. To assist tool developers who want to support GCC builtins, we analyzed builtin use in 4,913 C projects from GitHub. We found that 37% of these projects relied on at least one builtin. Supporting an increasing proportion of projects requires support of an exponentially increasing number of builtins; however, implementing only 10 builtins already covers over 30% of the projects. Since we found that many builtins in our corpus remained unused, the effort needed to support 90% of the projects is moderate, requiring about 110 builtins to be implemented. For each project, we analyzed the evolution of builtin use over time and found that the majority of projects mostly added builtins. This suggests that builtins are not a legacy feature and must be supported in future tools. Systematic testing of builtin support in existing tools revealed that many lacked support for builtins either partially or completely; we also discovered incorrect implementations in various tools, including the formally verified CompCert compiler.
Artifacts Available
Artifacts Reusable
Assessing the Quality of the Steps to Reproduce in Bug Reports
Oscar Chaparro,
Carlos Bernal-Cárdenas, Jing Lu,
Kevin Moran, Andrian Marcus,
Massimiliano Di Penta,
Denys Poshyvanyk, and
Vincent Ng
(College of William and Mary, USA; University of Texas at Dallas, USA; University of Sannio, Italy)
A major problem with user-written bug reports, indicated by developers and documented by researchers, is the often low quality of the reported steps to reproduce the bugs. Low-quality steps to reproduce lead to excessive manual effort spent on bug triage and resolution. This paper proposes Euler, an approach that automatically identifies and assesses the quality of the steps to reproduce in a bug report, providing feedback to the reporters, which they can use to improve the bug report. The feedback provided by Euler was assessed by external evaluators and the results indicate that Euler correctly identified 98% of the existing steps to reproduce and 58% of the missing ones, while 73% of its quality annotations are correct.
A Learning-Based Approach for Automatic Construction of Domain Glossary from Source Code and Documentation
Chong Wang,
Xin Peng, Mingwei Liu, Zhenchang Xing, Xuefang Bai, Bing Xie, and Tuo Wang
(Fudan University, China; Australian National University, Australia; Peking University, China)
A domain glossary that organizes domain-specific concepts and their aliases and relations is essential for knowledge acquisition and software development. Existing approaches use linguistic heuristics or term-frequency-based statistics to identify domain-specific terms from software documentation, and thus the accuracy is often low. In this paper, we propose a learning-based approach for automatic construction of a domain glossary from source code and software documentation. The approach uses a set of high-quality seed terms identified from code identifiers and natural language concept definitions to train a domain-specific prediction model to recognize glossary terms based on the lexical and semantic context of the sentences mentioning domain-specific concepts. It then merges the aliases of the same concepts to their canonical names, selects a set of explanation sentences for each concept, and identifies “is a”, “has a”, and “related to” relations between the concepts. We apply our approach to the deep learning and Hadoop domains and harvest 5,382 and 2,069 concepts together with 16,962 and 6,815 relations respectively. Our evaluation validates the accuracy of the extracted domain glossary and its usefulness for the fusion and acquisition of knowledge from different documents of different projects.
On Using Machine Learning to Identify Knowledge in API Reference Documentation
Davide Fucci, Alireza Mollaalizadehbahnemiri, and
Walid Maalej
(University of Hamburg, Germany)
Using API reference documentation like JavaDoc is an integral part of software development. Previous research introduced a grounded taxonomy that organizes API documentation knowledge in 12 types, including knowledge about the Functionality, Structure, and Quality of an API. We study how well modern text classification approaches can automatically identify documentation containing specific knowledge types. We compared conventional machine learning (k-NN and SVM) with deep learning approaches trained on manually-annotated Java and .NET API documentation (n = 5,574). When classifying the knowledge types individually (i.e., multiple binary classifiers) the best AUPRC was up to 87
Artifacts Available
Generating Query-Specific Class API Summaries
Mingwei Liu,
Xin Peng, Andrian Marcus, Zhenchang Xing, Wenkai Xie, Shuangshuang Xing, and Yang Liu
(Fudan University, China; University of Texas at Dallas, USA; Australian National University, Australia)
Source code summaries are concise representations, in the form of text and/or code, of complex code elements and are meant to help developers gain a quick understanding that in turn helps them perform specific tasks. Generation of summaries that are task-specific is still a challenge in the automatic code summarization field. We propose an approach for generating on-demand, extrinsic hybrid summaries for API classes, relevant to a programming task, formulated as a natural language query. The summaries include the most relevant sentences extracted from the API reference documentation and the most relevant methods.
External evaluators assessed the summaries generated for classes retrieved from JDK and Android libraries for several programming tasks. The majority found that the summaries are complete, concise, and readable.
A comparison with summaries produced by three baseline approaches revealed that the information present only in our summaries is more relevant than that present only in the baseline summaries. Finally, an extrinsic evaluation study showed that the summaries help users evaluate the correctness of API retrieval results faster and more accurately.
Semantic Relation Based Expansion of Abbreviations
Yanjie Jiang, Hui Liu, and
Lu Zhang
(Beijing Institute of Technology, China; Peking University, China)
Identifiers account for 70% of source code in terms of characters, and thus the quality of such identifiers is critical for program comprehension and software maintenance. For various reasons, however, many identifiers contain abbreviations, which reduces the readability and maintainability of source code. To this end, a number of approaches have been proposed to expand abbreviations in identifiers. However, such approaches are either inaccurate or confined to specific identifiers. To address this problem, in this paper we propose a generic and accurate approach to expand identifier abbreviations. The key insight of the approach is that abbreviations in the name of a software entity e have a good chance of finding their full terms in the names of software entities that are semantically related to e. Consequently, the proposed approach builds a knowledge graph to represent such entities and their relationships with e, and searches the graph for full terms. The optimal searching strategy for the graph could be learned automatically from a corpus of manually expanded abbreviations. We evaluate the proposed approach on nine well-known open-source projects. Results of our k-fold evaluation suggest that the proposed approach improves the state of the art. It improves precision significantly from 29% to 85%, and recall from 29% to 77%. Evaluation results also suggest that the proposed generic approach is even better than the state-of-the-art parameter-specific approach in expanding parameter abbreviations, improving F1 score significantly from 75% to 87%.
Diversity-Based Web Test Generation
Matteo Biagiola, Andrea Stocco, Filippo Ricca, and Paolo Tonella
(Fondazione Bruno Kessler, Italy; USI Lugano, Switzerland; University of Genoa, Italy)
Existing web test generators derive test paths from a navigational model of the web application, completed with either manually or randomly generated input values. However, manual test data selection is costly, while random generation often results in infeasible input sequences, which are rejected by the application under test. Random and search-based generation can achieve the desired level of model coverage only after a large number of test execution attempts, each slowed down by the need to interact with the browser during test execution. In this work, we present a novel web test generation algorithm that pre-selects the most promising candidate test cases based on their diversity from previously generated tests. As such, only the test cases that explore diverse behaviours of the application are considered for in-browser execution. We have implemented our approach in a tool called DIG. Our empirical evaluation on six real-world web applications shows that DIG achieves higher coverage and fault detection rates significantly earlier than crawling-based and search-based web test generators.
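Diversity-driven pre-selection of the kind described above can be sketched with a greedy farthest-first loop: among candidate test paths, repeatedly pick the one farthest from those already selected, so only diverse candidates pay the cost of in-browser execution. The distance metric (Jaccard over exercised actions) and the greedy scheme are our illustrative choices, not necessarily DIG's.

```python
# Sketch: pre-select diverse test candidates so that only tests
# exploring different behaviours are run in the (slow) browser.

def jaccard_distance(a, b):
    a, b = set(a), set(b)
    return 1.0 - len(a & b) / len(a | b)

def select_diverse(candidates, budget):
    selected = [candidates[0]]           # start from an arbitrary seed
    while len(selected) < budget:
        # Farthest-first: maximize the distance to the nearest selected test.
        best = max(
            (c for c in candidates if c not in selected),
            key=lambda c: min(jaccard_distance(c, s) for s in selected),
        )
        selected.append(best)
    return selected

candidates = [
    ("login", "browse"),
    ("login", "browse", "logout"),       # near-duplicate of the first
    ("search", "add_to_cart"),
    ("register",),
]
print(select_diverse(candidates, budget=2))
```

With a budget of two, the near-duplicate path is skipped in favor of a behaviourally different one, which is the effect the abstract attributes to diversity-based selection.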
Web Test Dependency Detection
Matteo Biagiola, Andrea Stocco,
Ali Mesbah, Filippo Ricca, and Paolo Tonella
(Fondazione Bruno Kessler, Italy; USI Lugano, Switzerland; University of British Columbia, Canada; University of Genoa, Italy)
E2E web test suites are prone to test dependencies due to the heterogeneous multi-tiered nature of modern web apps, which makes it difficult for developers to create isolated program states for each test case. In this paper, we present the first approach for detecting and validating test dependencies present in E2E web test suites. Our approach employs string analysis to extract an approximated set of dependencies from the test code. It then filters potential false dependencies through natural language processing of test names. Finally, it validates all dependencies, and uses a novel recovery algorithm to ensure no true dependencies are missed in the final test dependency graph. Our approach is implemented in a tool called TEDD and evaluated on the test suites of six open-source web apps. Our results show that TEDD can correctly detect and validate test dependencies up to 72% faster than the baseline with the original test ordering in which the graph contains all possible dependencies. The test dependency graphs produced by TEDD enable test execution parallelization, with a speed-up factor of up to 7×.
Testing Scratch Programs Automatically
Andreas Stahlbauer, Marvin Kreis, and Gordon Fraser
(University of Passau, Germany)
Block-based programming environments like Scratch foster engagement with computer programming and are used by millions of young learners. Scratch allows learners to quickly create entertaining programs and games, while eliminating syntactical program errors that could interfere with progress. However, functional programming errors may still lead to incorrect programs, and learners and their teachers need to identify and understand these errors. This is currently an entirely manual process. In this paper, we introduce a formal testing framework that describes the problem of Scratch testing in detail. We instantiate this formal framework with the Whisker tool, which provides automated and property-based testing functionality for Scratch programs. Empirical evaluation on real student and teacher programs demonstrates that Whisker can successfully test Scratch programs, and automatically achieves an average of 95.25% code coverage. Although well-known testing problems such as test flakiness also exist in the scenario of Scratch testing, we show that automated and property-based testing can accurately reproduce and replace the manually and laboriously produced grading efforts of a teacher, and opens up new possibilities to support learners of programming in their struggles.
A Large-Scale Empirical Study of Compiler Errors in Continuous Integration
Chen Zhang, Bihuan Chen, Linlin Chen,
Xin Peng, and Wenyun Zhao
(Fudan University, China)
Continuous Integration (CI) is a widely-used software development practice to reduce risks. CI builds often break, and a large amount of effort is put into troubleshooting broken builds. Although compiler errors have been recognized as one of the most frequent types of build failures, little is known about the common types, fix effort, and fix patterns of compiler errors that occur in CI builds of open-source projects. To fill such a gap, we present a large-scale empirical study on 6,854,271 CI builds from 3,799 open-source Java projects hosted on GitHub. Using the build data, we measured the frequency of broken builds caused by compiler errors, investigated the ten most common compiler error types, and reported their fix time. We manually analyzed 325 broken builds to summarize fix patterns of the ten most common compiler error types. Our findings help to characterize and understand compiler errors during CI and provide practical implications to developers, tool builders and researchers.
A Statistics-Based Performance Testing Methodology for Cloud Applications
Sen He, Glenna Manns, John Saunders, Wei Wang,
Lori Pollock, and Mary Lou Soffa
(University of Texas at San Antonio, USA; University of Virginia, USA; University of Delaware, USA)
The low cost of resource ownership and flexibility have led users to increasingly port their applications to the cloud. To fully realize the cost benefits of cloud services, users usually need to reliably know the execution performance of their applications. However, due to the random performance fluctuations experienced by cloud applications, the black-box nature of public clouds and the cloud usage costs, testing on clouds to acquire accurate performance results is extremely difficult. In this paper, we present a novel cloud performance testing methodology called PT4Cloud. By employing non-parametric statistical approaches of likelihood theory and the bootstrap method, PT4Cloud provides reliable stop conditions to obtain highly accurate performance distributions with confidence bands. These statistical approaches also allow users to specify intuitive accuracy goals and easily trade between accuracy and testing cost. We evaluated PT4Cloud with 33 benchmark configurations on Amazon Web Service and Chameleon clouds. When compared with performance data obtained from extensive performance tests, PT4Cloud provides testing results with 95.4% accuracy on average while reducing the number of test runs by 62%. We also propose two test execution reduction techniques for PT4Cloud, which can reduce the number of test runs by 90.1% while retaining an average accuracy of 91%. We compared our technique to three other techniques and found that our results are much more accurate.
Artifacts Available
Artifacts Reusable
How Bad Can a Bug Get? An Empirical Analysis of Software Failures in the OpenStack Cloud Computing Platform
Domenico Cotroneo, Luigi De Simone, Pietro Liguori, Roberto Natella, and Nematollah Bidokhti
(Federico II University of Naples, Italy; FutureWei Technologies, USA)
Cloud management systems provide abstractions and APIs for programmatically configuring cloud infrastructures. Unfortunately, residual software bugs in these systems can potentially lead to high-severity failures, such as prolonged outages and data losses. In this paper, we investigate the impact of failures in the context of the widespread OpenStack cloud management system, by performing fault injection and by analyzing the impact of the resulting failures in terms of fail-stop behavior, failure detection through logging, and failure propagation across components. The analysis points out that most of the failures are not timely detected and notified; moreover, many of these failures can silently propagate over time and through components of the cloud management system, which calls for more thorough run-time checks and fault containment.
Artifacts Available
Artifacts Reusable
Towards More Efficient Meta-heuristic Algorithms for Combinatorial Test Generation
Jinkun Lin, Shaowei Cai, Chuan Luo, Qingwei Lin, and Hongyu Zhang
(Institute of Software at Chinese Academy of Sciences, China; Microsoft Research, China; University of Newcastle, Australia)
Combinatorial interaction testing (CIT) is a popular approach to detecting faults in highly configurable software systems. The core task of CIT is to generate a small test suite called a t-way covering array (CA), where t is the covering strength. Many meta-heuristic algorithms have been proposed to solve the constrained covering array generating (CCAG) problem. A major drawback of existing algorithms is that they usually need considerable time to obtain a good-quality solution, which hinders the wider applications of such algorithms. We observe that the high time consumption of existing meta-heuristic algorithms for CCAG is mainly due to the procedure of score computation. In this work, we propose a much more efficient method for score computation. The score computation method is applied to a state-of-the-art algorithm TCA, showing significant improvements. The new score computation method opens a way to utilize algorithmic ideas relying on scores which were not affordable previously. We integrate a gradient descent search step to further improve the algorithm, leading to a new algorithm called FastCA. Experiments on a broad range of real-world benchmarks and synthetic benchmarks show that FastCA significantly outperforms state-of-the-art algorithms for CCAG, in terms of both the size of the obtained covering arrays and the run time.
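The covering-array task itself, and the scoring that dominates run time, can be illustrated with a brute-force greedy builder for t = 2: repeatedly add the test whose score (number of still-uncovered parameter-value pairs it covers) is highest. Real CCAG solvers such as TCA and FastCA instead do local search with the efficient score computation the paper contributes; this sketch only shows what is being scored.

```python
# Sketch: greedy construction of a 2-way (pairwise) covering array.
# The `key` lambda is exactly the "score" of a candidate test: how many
# uncovered parameter-value pairs it would cover.

from itertools import combinations, product

def pairwise_suite(domains):
    # domains: one list of possible values per parameter.
    uncovered = {
        ((i, vi), (j, vj))
        for i, j in combinations(range(len(domains)), 2)
        for vi in domains[i] for vj in domains[j]
    }
    suite = []
    while uncovered:
        best = max(
            product(*domains),
            key=lambda t: sum(
                (((i, t[i]), (j, t[j])) in uncovered)
                for i, j in combinations(range(len(t)), 2)
            ),
        )
        suite.append(best)
        uncovered -= {
            ((i, best[i]), (j, best[j]))
            for i, j in combinations(range(len(best)), 2)
        }
    return suite

# 3 binary options: 8 exhaustive configurations, but pairwise needs only 4.
suite = pairwise_suite([[0, 1], [0, 1], [0, 1]])
print(len(suite), suite)
```

Because the score is recomputed for every candidate in every round, it dominates the cost even in this toy; making that computation cheap is precisely what enables FastCA's extra search steps.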
Compiler Bug Isolation via Effective Witness Test Program Generation
Junjie Chen, Jiaqi Han, Peiyi Sun, Lingming Zhang,
Dan Hao, and
Lu Zhang
(Tianjin University, China; Peking University, China; University of Texas at Dallas, USA)
Compiler bugs are extremely harmful, but are notoriously difficult to debug because compiler bugs usually produce little debugging information. Given a bug-triggering test program for a compiler, hundreds of compiler files are usually involved during compilation, and thus are suspect buggy files. Although many automated bug isolation techniques exist, they are not applicable to compilers due to scalability or effectiveness problems. To solve this problem, in this paper, we transform the compiler bug isolation problem into a search problem, i.e., searching for a set of effective witness test programs that are able to eliminate innocent compiler files from suspects. Based on this intuition, we propose an automated compiler bug isolation technique, DiWi, which (1) proposes a heuristic-based search strategy to generate such a set of effective witness test programs via applying our designed witnessing mutation rules to the given failing test program, and (2) compares their coverage to isolate bugs following the practice of spectrum-based bug isolation. The experimental results on 90 real bugs from popular GCC and LLVM compilers show that DiWi effectively isolates 66.67%/78.89% bugs within Top-10/Top-20 compiler files, significantly outperforming state-of-the-art bug isolation techniques.
Concolic Testing with Adaptively Changing Search Heuristics
Sooyoung Cha and Hakjoo Oh
(Korea University, South Korea)
We present Chameleon, a new approach for adaptively changing search heuristics during concolic testing. Search heuristics play a central role in concolic testing as they mitigate the path-explosion problem by focusing on particular program paths that are likely to increase code coverage as quickly as possible. A variety of techniques for search heuristics have been proposed over the past decade. However, existing approaches are limited in that they use the same search heuristics throughout the entire testing process, which is inherently insufficient to exercise various execution paths. Chameleon overcomes this limitation by adapting search heuristics on the fly via an algorithm that learns new search heuristics based on the knowledge accumulated during concolic testing. Experimental results show that the transition from the traditional non-adaptive approaches to ours greatly improves the practicality of concolic testing in terms of both code coverage and bug-finding.
Symbolic Execution-Driven Extraction of the Parallel Execution Plans of Spark Applications
Luciano Baresi,
Giovanni Denaro, and Giovanni Quattrocchi
(Politecnico di Milano, Italy; University of Milano-Bicocca, Italy)
The execution of Spark applications is based on the execution order and parallelism of the different jobs, given data and available resources. Spark reifies these dependencies in a graph that we refer to as the (parallel) execution plan of the application. All the approaches that have studied the estimation of the execution times and the dynamic provisioning of resources for this kind of application have always assumed that the execution plan is unique, given the computing resources at hand. This assumption is at least simplistic for applications that include conditional branches or loops and limits the precision of the prediction techniques.
This paper introduces SEEPEP, a novel technique based on symbolic execution and search-based test generation, that: i) automatically extracts the possible execution plans of a Spark application, along with dedicated launchers with properly synthesized data that can be used for profiling, and ii) tunes the allocation of resources at runtime based on the knowledge of the execution plans for which the path conditions hold. The assessment we carried out shows that SEEPEP can effectively complement dynaSpark, an extension of Spark with dynamic resource provisioning capabilities, to help predict the execution duration and the allocation of resources.
Generating Effective Test Cases for Self-Driving Cars from Police Reports
Alessio Gambi, Tri Huynh, and Gordon Fraser
(University of Passau, Germany; Saarland University, Germany; CISPA, Germany)
Autonomous driving carries the promise to drastically reduce the number of car accidents; however, recently reported fatal crashes involving self-driving cars show that such an important goal is not yet achieved. This calls for better testing of the software controlling self-driving cars, which is difficult because it requires producing challenging driving scenarios. To better test self-driving car software, we propose to specifically test car crash scenarios, which are critical par excellence. Since real car crashes are difficult to test in field operation, we recreate them as physically accurate simulations in an environment that can be used for testing self-driving car software. To cope with the scarcity of sensory data collected during real car crashes which does not enable a full reproduction, we extract the information to recreate real car crashes from the police reports which document them. Our extensive evaluation, consisting of a user study involving 34 participants and a quantitative analysis of the quality of the generated tests, shows that we can generate accurate simulations of car crashes in a matter of minutes. Compared to tests which implement non critical driving scenarios, our tests effectively stressed the test subject in different ways and exposed several shortcomings in its implementation.
Article Search
Preference-Wise Testing for Android Applications
Yifei Lu, Minxue Pan, Juan Zhai, Tian Zhang, and Xuandong Li
(Nanjing University, China)
Preferences, the setting options provided by Android, are an essential part of Android apps. Preferences allow users to change app features and behaviors dynamically, and therefore need to be thoroughly tested. Unfortunately, the specific preferences used in test cases are typically not explicitly specified, forcing testers to manually set options or blindly try different option combinations. To effectively test the impacts of different preference options, this paper presents PREFEST, a preference-wise enhanced automatic testing approach for Android apps. Given a set of test cases, PREFEST can locate the preferences that may affect the test cases with a combined static and dynamic analysis of the app under test, and execute these test cases only under the necessary option combinations. The evaluation shows that PREFEST improves code coverage by 6.8% and branch coverage by 12.3%, and finds five more real bugs compared to testing with the original test cases. The test cost is reduced by 99%, in both the number of test cases and the testing time, compared to testing under pairwise combination of options.
Article Search
Bisecting Commits and Modeling Commit Risk during Testing
Armin Najafi, Peter C. Rigby, and
Weiyi Shang
(Concordia University, Canada)
Software testing is one of the costliest stages in the software development life cycle. One approach to reducing the test execution cost is to group changes and test them as a batch (i.e. batch testing). However, when tests fail in a batch, commits in the batch need to be re-tested to identify the cause of the failure, i.e. the culprit commit. The re-testing is typically done through bisection (i.e. a binary search through the commits in a batch). Intuitively, the effectiveness of batch testing highly depends on the size of the batch. Larger batches require fewer initial test runs, but have a higher chance of a test failure that can lead to expensive test re-runs to find the culprit. We are unaware of research that investigates and simulates the impact of batch sizes on the cost of testing in industry.
In this work, we first conduct empirical studies on the effectiveness of batch testing in three large-scale industrial software systems at Ericsson. Using 9 months of testing data, we simulate batch sizes from 1 to 20 and find the most cost-effective batch size for each project. Our results show that batch testing saves 72% of test executions compared to testing each commit individually. In a second simulation, we incorporate flaky tests that pass and fail on the same commit, as they are a significant source of additional test executions on large projects. We model the degree of flakiness for each project and find that test flakiness reduces the cost savings to 42%. In a third simulation, we guide bisection to reduce the likelihood of batch-testing failures. We model the riskiness of each commit in a batch using a bug model and a test execution history model. The risky commits are tested individually, while the less risky commits are tested in a single larger batch. Culprit predictions with our approach reduce test executions by up to 9% compared to Ericsson’s current bisection approach.
The results have been adopted by developers at Ericsson and a tool to guide bisection is in the process of being added to Ericsson’s continuous integration pipeline.
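The batch-versus-bisection trade-off can be sketched with a back-of-the-envelope cost model (a simplification assuming at most one culprit per failing batch and an idealized binary search; the function names are illustrative, not the paper's simulator):

```python
import math

def runs_per_commit(n_commits):
    # Baseline: every commit is tested individually.
    return n_commits

def runs_batched(n_commits, batch_size, batch_failure_rate):
    # One run per batch; each failing batch triggers a bisection that
    # needs about log2(batch_size) extra runs to isolate the culprit.
    n_batches = math.ceil(n_commits / batch_size)
    failing = n_batches * batch_failure_rate
    return n_batches + failing * math.ceil(math.log2(batch_size))

# With 1000 commits, batches of 8, and 10% of batches failing,
# batching needs far fewer test runs than per-commit testing:
baseline = runs_per_commit(1000)          # 1000 runs
batched = runs_batched(1000, 8, 0.10)     # 125 + 12.5 * 3 = 162.5 runs
```

Larger batches shrink the first term but inflate the failure rate (and hence the bisection term), which is why the paper searches for the most cost-effective batch size per project.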
Article Search
White-Box Testing of Big Data Analytics with Complex User-Defined Functions
Muhammad Ali Gulzar, Shaghayegh Mardani, Madanlal Musuvathi, and Miryung Kim
(University of California at Los Angeles, USA; Microsoft Research, USA)
Data-intensive scalable computing (DISC) systems such as Google’s MapReduce, Apache Hadoop, and Apache Spark are being leveraged to process massive quantities of data in the cloud. Modern DISC applications pose new challenges for exhaustive, automatic testing because, unlike SQL queries, they combine dataflow operators with prevalent, complex user-defined functions (UDFs). We design a new white-box testing approach, called BigTest, to reason about the internal semantics of UDFs in tandem with the equivalence classes created by each dataflow and relational operator.
Our evaluation shows that, despite ultra-large-scale input data sizes, real-world DISC applications are often significantly skewed and inadequate in terms of test coverage, leaving 34% of Joint Dataflow and UDF (JDU) paths untested. BigTest shows the potential to reduce the data size for local testing by a factor of 10^5 to 10^8 while revealing 2X more manually-injected faults than the previous approach. Our experiment shows that only a few data records (on the order of tens) are actually required to achieve the same JDU coverage as the entire production data. The reduction in test data also provides a CPU time saving of 194X on average, demonstrating that interactive and fast local testing is feasible for big data analytics, obviating the need to test applications on huge production data.
Article Search
Empirical Review of Java Program Repair Tools: A Large-Scale Experiment on 2,141 Bugs and 23,551 Repair Attempts
Thomas Durieux, Fernanda Madeiral, Matias Martinez, and Rui Abreu
(University of Lisbon, Portugal; INESC-ID, Portugal; Federal University of Uberlândia, Brazil; Polytechnic University of Hauts-de-France, France)
In the past decade, research on test-suite-based automatic program repair has grown significantly. Each year, new approaches and implementations are featured in major software engineering venues. However, most of those approaches are evaluated on a single benchmark of bugs, and are rarely reproduced by other researchers. In this paper, we present a large-scale experiment using 11 Java test-suite-based repair tools and 2,141 bugs from 5 benchmarks. Our goal is to better understand the current state of automatic program repair tools on a large diversity of benchmarks. Our investigation is guided by the hypothesis that the repairability of repair tools might not generalize across different benchmarks. We found that the 11 tools 1) are able to generate patches for 21% of the bugs from the 5 benchmarks, and 2) have better performance on Defects4J compared to other benchmarks, generating patches for 47% of the bugs from Defects4J compared to 10-30% of the bugs from the other benchmarks. Our experiment comprises 23,551 repair attempts, which we used to find causes of non-patch generation. These causes are reported in this paper and can help repair tool designers improve their approaches and tools.
Preprint
Info
Artifacts Available
Artifacts Reusable
iFixR: Bug Report driven Program Repair
Anil Koyuncu, Kui Liu, Tegawendé F. Bissyandé, Dongsun Kim, Martin Monperrus,
Jacques Klein, and Yves Le Traon
(University of Luxembourg, Luxembourg; Furiosa A.I., South Korea; KTH, Sweden)
Issue tracking systems are commonly used in modern software development for collecting feedback from users and developers. An ultimate automation target of software maintenance is then the systematization of patch generation for user-reported bugs. Although this ambition is aligned with the momentum of automated program repair, the literature has, so far, mostly focused on generate-and-validate setups where fault localization and patch generation are driven by a well-defined test suite. On the one hand, however, the common (yet strong) assumption of the existence of relevant test cases does not hold in practice for most development settings: many bugs are reported without the available test suite being able to reveal them. On the other hand, for many projects, the number of bug reports generally outstrips the resources available to triage them. Towards increasing the adoption of patch generation tools by practitioners, we investigate a new repair pipeline, iFixR, driven by bug reports: (1) bug reports are fed to an IR-based fault localizer; (2) patches are generated from fix patterns and validated via regression testing; (3) a prioritized list of generated patches is proposed to developers. We evaluate iFixR on the Defects4J dataset, which we enriched (i.e., faults are linked to bug reports) and carefully reorganized (i.e., the timeline of test cases is naturally split). iFixR generates genuine/plausible patches for 21/44 Defects4J faults with its IR-based fault localizer. iFixR accurately places a genuine/plausible patch among its top-5 recommendations for 8/13 of these faults (without using future test cases in generation-and-validation).
Article Search
Artifacts Available
Artifacts Reusable
Exploring and Exploiting the Correlations between Bug-Inducing and Bug-Fixing Commits
Ming Wen, Rongxin Wu, Yepang Liu, Yongqiang Tian, Xuan Xie,
Shing-Chi Cheung, and Zhendong Su
(Hong Kong University of Science and Technology, China; Xiamen University, China; Southern University of Science and Technology, China; Sun Yat-sen University, China; ETH Zurich, Switzerland)
Bug-inducing commits provide important information for understanding when and how bugs were introduced. They have therefore been extensively investigated by existing studies and frequently leveraged to facilitate bug fixing in industrial practice. Motivated by the importance of bug-inducing commits in software debugging, we conduct the first systematic empirical study to explore the correlations between bug-inducing and bug-fixing commits in terms of code elements and modifications. To facilitate the study, we collected the inducing and fixing commits for 333 bugs from seven large open-source projects. The empirical findings reveal important and significant correlations between a bug’s inducing and fixing commits. We further exploit the usefulness of such correlation findings from two aspects. First, they explain why the SZZ algorithm, the most widely adopted approach to collecting bug-inducing commits, is imprecise. In view of SZZ’s imprecision, we revisited the findings of previous studies based on SZZ, and found that 8 out of 10 previous findings are significantly affected by it. Second, the correlations shed light on the design of automated debugging techniques. For demonstration, we designed approaches that exploit the correlations with respect to statements and change actions. Our experiments on Defects4J show that our approaches can significantly boost the performance of fault localization and also advance existing APR techniques.
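One source of the SZZ imprecision the study examines is easy to see in a minimal sketch of the classic SZZ idea (a toy model, not the authors' implementation; the commit names and the flattened blame history are hypothetical):

```python
def szz_inducers(deleted_lines, blame):
    """Classic SZZ: blame each line the fix deleted or changed; the
    commits that last touched those lines are flagged as bug-inducing."""
    return {blame[line] for line in deleted_lines}

# Hypothetical history: commit c1_feature introduced the bug, and a
# later commit c2_refactor reformatted the buggy line without
# changing its behaviour.
blame = {"buggy_line": "c2_refactor", "other_line": "c1_feature"}
fix_deleted = ["buggy_line"]  # lines removed by the bug-fixing commit

inducers = szz_inducers(fix_deleted, blame)
# SZZ blames the cosmetic refactor c2_refactor, not the true inducer
# c1_feature; correlations between inducing and fixing commits can
# help filter such false positives.
```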
Article Search
Info
Effects of Explicit Feature Traceability on Program Comprehension
Jacob Krüger, Gül Çalıklı, Thorsten Berger, Thomas Leich, and
Gunter Saake
(University of Magdeburg, Germany; Chalmers University of Technology, Sweden; University of Gothenburg, Sweden; Harz University of Applied Sciences, Germany; METOP, Germany)
Developers spend a substantial amount of their time on program comprehension. To improve their comprehension and refresh their memory, developers need to communicate with other developers, read the documentation, and analyze the source code. Many studies show that developers focus primarily on the source code and that small improvements can have a strong impact. As such, it is crucial to bring the code itself into a more comprehensible form. One technique for this purpose is the use of explicit feature traces, which make it easy to identify a program’s functionalities. To improve our empirical understanding of the effects of feature traces, we report an online experiment with 49 professional software developers. We studied the impact of explicit feature traces, namely annotations and decomposition, on program comprehension and compared them to the same code without traces. Besides this experiment, we also asked our participants about their opinions in order to combine quantitative and qualitative data. Our results indicate that, as opposed to purely object-oriented code: (1) annotations can have positive effects on program comprehension; (2) decomposition can have a negative impact on bug localization; and (3) our participants perceive both techniques as beneficial. Moreover, none of the three code versions yields significant improvements in task completion time. Overall, our results indicate that lightweight traceability, such as using annotations, provides immediate benefits to developers during software development and maintenance without extensive training or tooling; and can improve current industrial practices that rely on heavyweight traceability tools (e.g., DOORS) and retroactive fulfillment of standards (e.g., ISO-26262, DO-178B).
Preprint
Artifacts Available
What the Fork: A Study of Inefficient and Efficient Forking Practices in Social Coding
Shurui Zhou,
Bogdan Vasilescu, and
Christian Kästner
(Carnegie Mellon University, USA)
Forking and pull requests have been widely used in open-source communities as a uniform development and contribution mechanism, giving developers the flexibility to modify their own fork without affecting others before attempting to contribute back. However, not all projects use forks efficiently; many experience lost and duplicate contributions and fragmented communities. In this paper, we explore how open-source projects on GitHub differ with regard to forking inefficiencies. First, we observed that different communities experience these inefficiencies to widely different degrees and interviewed practitioners to understand why. Then, using multiple regression modeling, we analyzed which context factors correlate with fewer inefficiencies. We found that better modularity and centralized management are associated with more contributions and a higher fraction of accepted pull requests, suggesting specific best practices that project maintainers can adopt to reduce forking-related inefficiencies in their communities.
Article Search
Info
ServDroid: Detecting Service Usage Inefficiencies in Android Applications
Wei Song, Jing Zhang, and Jeff Huang
(Nanjing University of Science and Technology, China; Texas A&M University, USA)
Services in Android applications are frequently-used components for performing time-consuming operations in the background. While services play a crucial role in the app performance, our study shows that service uses in practice are not as efficient as expected, e.g., they tend to cause unnecessary resource occupation and/or energy consumption. Moreover, as service usage inefficiencies do not manifest with immediate failures, e.g., app crashes, existing testing-based approaches fall short in finding them. In this paper, we identify four anti-patterns of such service usage inefficiency bugs, including premature create, late destroy, premature destroy, and service leak, and present a static analysis technique, ServDroid, to automatically and effectively detect them based on the anti-patterns. We have applied ServDroid to a large collection of popular real-world Android apps. Our results show that, surprisingly, service usage inefficiencies are prevalent and can severely impact the app performance.
Preprint
Info
Together Strong: Cooperative Android App Analysis
Felix Pauck and Heike Wehrheim
(University of Paderborn, Germany)
Recent years have seen the development of numerous tools for the analysis of taint flows in Android apps. Taint analyses aim at detecting data leaks, accidentally or by purpose programmed into apps. Often, such tools specialize in the treatment of specific features impeding precise taint analysis (like reflection or inter-app communication). This multitude of tools, their specific applicability and their various combination options complicate the selection of a tool (or multiple tools) when faced with an analysis instance, even for knowledgeable users, and hence hinders the successful adoption of taint analyses.
In this work, we thus present CoDiDroid, a framework for cooperative Android app analysis. CoDiDroid (1) allows users to ask questions about flows in apps in varying degrees of detail, (2) automatically generates subtasks for answering such questions, (3) distributes tasks onto analysis tools (currently DroidRA, FlowDroid, HornDroid, IC3 and two novel tools) and (4) at the end merges tool answers on subtasks into an overall answer. Thereby, users are freed from having to learn about the use and functionality of all these tools while still being able to leverage their capabilities. Moreover, we experimentally show that cooperation among tools pays off with respect to effectiveness, precision and scalability.
Article Search
Info
Artifacts Available
A Framework for Writing Trigger-Action Todo Comments in Executable Format
Pengyu Nie, Rishabh Rai, Junyi Jessy Li, Sarfraz Khurshid, Raymond J. Mooney, and
Milos Gligoric
(University of Texas at Austin, USA)
Natural language elements, e.g., todo comments, are frequently used to communicate among developers and to describe tasks that need to be performed (actions) when specific conditions hold on artifacts related to the code repository (triggers), e.g., from the Apache Struts project: “remove expectedJDK15 and if() after switching to Java 1.6”. As projects evolve, development processes change, and development teams reorganize, these comments, because of their informal nature, frequently become irrelevant or forgotten.
We present the first framework, dubbed TrigIt, to specify trigger-action todo comments in executable format. Thus, actions are executed automatically when triggers evaluate to true. TrigIt specifications are written in the host language (e.g., Java) and are evaluated as part of the build process. The triggers are specified as query statements over abstract syntax trees, abstract representation of build configuration scripts, issue tracking systems, and system clock time. The actions are either notifications to developers or code transformation steps. We implemented TrigIt for the Java programming language and migrated 44 existing trigger-action comments from several popular open-source projects. Evaluation of TrigIt, via a user study, showed that users find TrigIt easy to learn and use. TrigIt has the potential to enforce more discipline in writing and maintaining comments in large code repositories.
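The trigger-action idea can be approximated in a few lines of host-language code (a Python sketch of the concept only; TrigIt itself targets Java and expresses triggers as query statements over ASTs and build scripts, so the helper below is hypothetical):

```python
# Hypothetical trigger-action "todo": the trigger is a predicate over
# repository state, the action a notification or code transformation.
def todo(trigger, action):
    """Run as part of the build: fire the action when the trigger holds."""
    if trigger():
        action()

# The informal comment "remove expectedJDK15 and if() after switching
# to Java 1.6" becomes executable: when the build targets Java 1.6 or
# later, the developer is notified instead of the comment rotting.
java_version = (1, 6)  # stand-in for a value read from the build config
notifications = []
todo(trigger=lambda: java_version >= (1, 6),
     action=lambda: notifications.append("remove expectedJDK15 workaround"))
```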
Article Search
Decomposing the Rationale of Code Commits: The Software Developer’s Perspective
Khadijah Al Safwan and Francisco Servant
(Virginia Tech, USA)
Communicating the rationale behind decisions is essential for the success of software engineering projects. In particular, understanding the rationale of code commits is an important and often difficult task. We posit that part of such difficulty lies in rationale often being treated as a single piece of information. In this paper, we set to discover the breakdown of components in which developers decompose the rationale of code commits in the context of software maintenance, and to understand their experience with it and with its individual components. For this goal, we apply a mixed-methods approach, interviewing 20 software developers to ask them how they decompose rationale, and surveying an additional 24 developers to understand their experiences needing, finding, and recording those components. We found that developers decompose the rationale of code commits into 15 components, each of which is differently needed, found, and recorded. These components are: goal, need, benefits, constraints, alternatives, selected alternative, dependencies, committer, time, location, modifications, explanation of modifications, validation, maturity stage, and side effects. Our findings provide multiple implications. Educators can now disseminate the multiple dimensions and importance of the rationale of code commits. For practitioners, our decomposition of rationale defines a “common vocabulary” to use when discussing rationale of code commits, which we expect to strengthen the quality of their rationale sharing and documentation process. For researchers, our findings enable techniques for automatically assessing, improving, and generating rationale of code commits to specifically target the components that developers need.
Article Search
Info
Artifacts Available
Monitoring-Aware IDEs
Jos Winter,
Maurício Aniche, Jürgen Cito, and Arie van Deursen
(Adyen, Netherlands; Delft University of Technology, Netherlands; Massachusetts Institute of Technology, USA)
Engineering modern large-scale software requires software developers to not solely focus on writing code, but also to continuously examine monitoring data to reason about the dynamic behavior of their systems. These additional monitoring responsibilities for developers have only emerged recently, in the light of DevOps culture. Interestingly, software development activities happen mainly in the IDE, while reasoning about production monitoring happens in separate monitoring tools. We propose an approach that integrates monitoring signals into the development environment and workflow. We conjecture that an IDE with such capability improves the performance of developers as time spent continuously context switching from development to monitoring would be eliminated. This paper takes a first step towards understanding the benefits of a possible monitoring-aware IDE. We implemented a prototype of a Monitoring-Aware IDE, connected to the monitoring systems of Adyen, a large-scale payment company that performs intense monitoring in their software systems. Given our results, we firmly believe that monitoring-aware IDEs can play an essential role in improving how developers perform monitoring.
Preprint
Going Big: A Large-Scale Study on What Big Data Developers Ask
Mehdi Bagherzadeh and
Raffi Khatchadourian
(Oakland University, USA; City University of New York, USA)
Software developers are increasingly required to write big data code. However, they find big data software development challenging. To help these developers, it is necessary to understand the big data topics they are interested in and the difficulty of finding answers to questions on these topics. In this work, we conduct a large-scale study on Stack Overflow to understand the interests and difficulties of big data developers. To conduct the study, we develop a set of big data tags to extract big data posts from Stack Overflow; use topic modeling to group these posts into big data topics; group similar topics into categories to construct a topic hierarchy; analyze the popularity and difficulty of topics and their correlations; and discuss the implications of our findings for the practice, research, and education of big data software development, investigating their agreement with the findings of previous work.
Article Search
Why Aren’t Regular Expressions a Lingua Franca? An Empirical Study on the Re-use and Portability of Regular Expressions
James C. Davis, Louis G. Michael IV, Christy A. Coghlan, Francisco Servant, and Dongyoon Lee
(Virginia Tech, USA)
This paper explores the extent to which regular expressions (regexes) are portable across programming languages. Many languages offer similar regex syntaxes, and it would be natural to assume that regexes can be ported across language boundaries. But can regexes be copy/pasted across language boundaries while retaining their semantic and performance characteristics?
In our survey of 158 professional software developers, most indicated that they re-use regexes across language boundaries, and about half reported that they believe regexes are a universal language. We experimentally evaluated the riskiness of this practice using a novel polyglot regex corpus: 537,806 regexes from 193,524 projects written in JavaScript, Java, PHP, Python, Ruby, Go, Perl, and Rust. Using this corpus, we explored two hitherto-unstudied regex portability problems: logic errors due to semantic differences, and security vulnerabilities due to performance differences.
We report that developers’ belief in a regex lingua franca is understandable but unfounded. Though most regexes compile across language boundaries, 15% exhibit semantic differences and 10% exhibit performance differences across languages. We explained these differences using regex documentation, and further illuminated our findings by investigating regex engine implementations. Along the way, we found bugs in the regex engines of JavaScript-V8, Python, Ruby, and Rust, and potential semantic and performance regex bugs in thousands of modules.
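A concrete instance of such a semantic difference, checkable from Python (the JavaScript behavior is stated in a comment rather than executed; this is an illustration in the spirit of the study, not one of its reported cases):

```python
import re

# In Python 3, \d matches any Unicode decimal digit by default,
# while JavaScript's /\d/ matches ASCII 0-9 only. Copy/pasting this
# regex across the two languages silently changes what it accepts.
pattern = re.compile(r"^\d+$")

assert pattern.match("42")       # matches in both languages
assert pattern.match("٤٢")       # Arabic-Indic digits: Python yes, JS no
assert not pattern.match("forty")
```

Python offers `re.ASCII` to opt back into ASCII-only classes, which is one documented way to restore portable behavior.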
Preprint
Artifacts Available
Artifacts Reusable
Nodest: Feedback-Driven Static Analysis of Node.js Applications
Benjamin Barslev Nielsen, Behnaz Hassanshahi, and François Gauthier
(Oracle Labs, Australia; Aarhus University, Denmark)
Node.js provides the ability to write JavaScript programs for the server side and has become a popular language for developing web applications. Node.js allows direct access to the underlying filesystem, operating system resources, and databases, but does not provide any security mechanism such as sandboxing of untrusted code, and injection vulnerabilities are now commonly reported in Node.js modules. Existing static dataflow analysis techniques do not scale to Node.js applications when searching for injection vulnerabilities because even small Node.js web applications typically depend on many third-party modules. We present a new feedback-driven static analysis that scales well to detect injection vulnerabilities in Node.js applications. The key idea behind our new technique is that not all third-party modules need to be analyzed to detect an injection vulnerability. Results of running our analysis, Nodest, on real-world Node.js applications show that the technique scales to large applications and finds previously known as well as new vulnerabilities. In particular, Nodest finds 63 true positive taint flows in a set of our benchmarks, whereas a state-of-the-art static analysis reports only 3. Moreover, our analysis scales to Express, the most popular Node.js web framework, and reports non-trivial injection vulnerabilities.
Article Search
DeepStellar: Model-Based Quantitative Analysis of Stateful Deep Learning Systems
Xiaoning Du, Xiaofei Xie, Yi Li, Lei Ma,
Yang Liu, and Jianjun Zhao
(Nanyang Technological University, Singapore; Kyushu University, Japan; Zhejiang Sci-Tech University, China)
Deep Learning (DL) has achieved tremendous success in many cutting-edge applications. However, state-of-the-art DL systems still suffer from quality issues. While some recent progress has been made on the analysis of feed-forward DL systems, little study has been done on Recurrent Neural Network (RNN)-based stateful DL systems, which are widely used in audio, natural language, and video processing. In this paper, we take the first step towards the quantitative analysis of RNN-based DL systems. We model an RNN as an abstract state transition system to characterize its internal behaviors. Based on the abstract model, we design two trace similarity metrics and five coverage criteria that enable the quantitative analysis of RNNs. We further propose two algorithms powered by these quantitative measures for adversarial sample detection and coverage-guided test generation.
We evaluate DeepStellar on four RNN-based systems covering image classification and automated speech recognition. The results demonstrate that the abstract model is useful in capturing the internal behaviors of RNNs, and confirm that (1) the similarity metrics can effectively capture the differences between samples even with very small perturbations (achieving 97% accuracy for detecting adversarial samples) and (2) the coverage criteria are useful in revealing erroneous behaviors (generating three times more adversarial samples than random testing and hundreds of times more than the unrolling approach).
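The state abstraction can be illustrated with a toy discretization (a sketch assuming a simple per-dimension grid over hidden-state vectors; DeepStellar's actual abstraction and its metrics are richer than this):

```python
# Map a concrete hidden-state vector to an abstract state by binning
# each dimension into grid cells of a fixed width.
def abstract_state(hidden, cell=0.5):
    return tuple(int(h // cell) for h in hidden)

# An execution trace of hidden states induces abstract transitions.
def transitions(trace):
    states = [abstract_state(h) for h in trace]
    return set(zip(states, states[1:]))

# Two 2-D hidden-state traces from a toy stateful model:
t1 = [(0.1, 0.2), (0.6, 0.2), (0.6, 0.9)]
t2 = [(0.1, 0.1), (0.6, 0.2), (0.6, 0.9)]

# A transition-coverage-style measure: how many distinct abstract
# transitions the test set exercised.
covered = transitions(t1) | transitions(t2)
```

Trace similarity can likewise be defined over the sequences of visited abstract states, and coverage over the fraction of abstract states or transitions exercised.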
Article Search
REINAM: Reinforcement Learning for Input-Grammar Inference
Zhengkai Wu, Evan Johnson, Wei Yang, Osbert Bastani, Dawn Song, Jian Peng, and
Tao Xie
(University of Illinois at Urbana-Champaign, USA; University of Texas at Dallas, USA; University of Pennsylvania, USA; University of California at Berkeley, USA)
Program input grammars (i.e., grammars encoding the language of valid program inputs) facilitate a wide range of applications in software engineering, such as symbolic execution and delta debugging. Grammars synthesized by existing approaches can cover only a small part of the valid input space, mainly due to unanalyzable code (e.g., native code) in programs and the lack of high-quality, high-variety seed inputs. To address these challenges, we present REINAM, a reinforcement-learning approach for synthesizing probabilistic context-free program input grammars without any seed inputs. REINAM uses an industrial symbolic execution engine to generate an initial set of inputs for the given target program, and then uses an iterative process of grammar generalization to proactively generate additional inputs and infer grammars generalized from the initial seeds. To efficiently search the huge space of candidate generalization operators, REINAM formulates the search as a reinforcement learning problem. Our evaluation on eleven real-world benchmarks shows that REINAM outperforms an existing state-of-the-art approach in precision and recall of the synthesized grammars, and that fuzz testing based on REINAM substantially increases the coverage of the space of valid inputs. For some benchmarks, REINAM synthesizes a grammar covering the entire valid input space without decreasing the accuracy of the grammar.
Article Search
Info
Boosting Operational DNN Testing Efficiency through Conditioning
Zenan Li, Xiaoxing Ma,
Chang Xu, Chun Cao, Jingwei Xu, and Jian Lü
(Nanjing University, China)
With the increasing adoption of Deep Neural Network (DNN) models as integral parts of software systems, efficient operational testing of DNNs is much in demand to ensure these models’ actual performance in field conditions. A challenge is that the testing often needs to produce precise results with a very limited budget for labeling data collected in field.
Viewing software testing as a practice of reliability estimation through statistical sampling, we re-interpret the idea behind conventional structural coverages as conditioning for variance reduction. With this insight, we propose an efficient DNN testing method based on conditioning on the representation learned by the DNN model under test. The representation is defined by the probability distribution of the outputs of the neurons in the last hidden layer of the model. To sample from this high-dimensional distribution, in which the operational data are sparsely distributed, we design an algorithm leveraging cross-entropy minimization.
Experiments with various DNN models and datasets were conducted to evaluate the general efficiency of the approach. The results show that, compared with simple random sampling, this approach requires only about half of the labeled inputs to achieve the same level of precision.
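The variance-reduction argument behind conditioning can be illustrated with a textbook stratified-sampling computation (a toy example with hypothetical numbers, not the paper's cross-entropy algorithm):

```python
# Estimate a model's accuracy (the mean of 0/1 correctness values)
# from a budget of n labeled samples.

def srs_variance(values, n):
    # Variance of the simple-random-sampling estimator of the mean.
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return var / n

def stratified_variance(strata, n):
    # Proportional allocation: stratum h gets n_h = n * w_h samples,
    # contributing w_h^2 * s_h^2 / n_h to the estimator variance.
    total = sum(len(s) for s in strata)
    var = 0.0
    for s in strata:
        w = len(s) / total
        m = sum(s) / len(s)
        s_var = sum((v - m) ** 2 for v in s) / len(s)
        var += (w ** 2) * s_var / (n * w)
    return var

population = [1] * 80 + [0] * 20   # 80% accuracy overall
strata = [[1] * 80, [0] * 20]      # conditioned on a learned representation

# With the same budget n=10, conditioning drives the estimator
# variance to zero here because each stratum is homogeneous.
```

Conditioning on the last-hidden-layer representation plays the role of the strata: the more homogeneous each conditioned region is, the fewer labels are needed for the same precision.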
Article Search
A Comprehensive Study on Deep Learning Bug Characteristics
Md Johirul Islam, Giang Nguyen, Rangeet Pan, and Hridesh Rajan
(Iowa State University, USA)
Deep learning has gained substantial popularity in recent years. Developers mainly rely on libraries and tools to add deep learning capabilities to their software. What kinds of bugs are frequently found in such software? What are the root causes of such bugs? What impacts do such bugs have? Which stages of the deep learning pipeline are more bug-prone? Are there any antipatterns? Understanding such characteristics of bugs in deep learning software has the potential to foster the development of better deep learning platforms, debugging mechanisms, and development practices, and to encourage the development of analysis and verification frameworks. Therefore, we study 2716 high-quality posts from Stack Overflow and 500 bug-fix commits from GitHub about five popular deep learning libraries (Caffe, Keras, TensorFlow, Theano, and Torch) to understand the types of bugs, their root causes and impacts, the bug-prone stages of the deep learning pipeline, and whether there are common antipatterns in this buggy software. The key findings of our study include: data bugs and logic bugs are the most severe bug types in deep learning software, appearing more than 48% of the time; the major root causes of these bugs are Incorrect Model Parameter (IPS) and Structural Inefficiency (SI), showing up more than 43% of the time. We have also found that bugs in the usage of deep learning libraries follow some common antipatterns.
Article Search
Just Fuzz It: Solving Floating-Point Constraints using Coverage-Guided Fuzzing
Daniel Liew,
Cristian Cadar,
Alastair F. Donaldson, and J. Ryan Stinnett
(Imperial College London, UK; Mozilla, USA)
We investigate the use of coverage-guided fuzzing as a means of proving satisfiability of SMT formulas over finite variable domains, with specific application to floating-point constraints. We show how an SMT formula can be encoded as a program containing a location that is reachable if and only if the program’s input corresponds to a satisfying assignment to the formula. A coverage-guided fuzzer can then be used to search for an input that reaches the location, yielding a satisfying assignment. We have implemented this idea in a tool, Just Fuzz-it Solver (JFS), and we present a large experimental evaluation showing that JFS is both competitive with and complementary to state-of-the-art SMT solvers with respect to solving floating-point constraints, and that the coverage-guided approach of JFS provides significant benefit over naive fuzzing in the floating-point domain. Applied in a portfolio manner, the JFS approach thus has the potential to complement traditional SMT
solvers for program analysis tasks that involve reasoning about floating-point constraints.
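The encoding idea is simple to sketch. JFS itself emits a C program explored by a coverage-guided fuzzer; the sketch below is a hypothetical Python analogue with a plain random-search loop standing in for the fuzzer, and the formula is made up.

```python
import random
import struct

# A hypothetical SMT formula over a double-precision variable x:
#     (x > 1.0) AND (x * x < 10.0)
# encoded as a program with a target location that is reachable
# if and only if the input bytes decode to a satisfying assignment.
def encoded_program(data: bytes) -> bool:
    (x,) = struct.unpack("<d", data[:8])  # raw fuzzer bytes -> free variable
    if x > 1.0:           # each conjunct becomes a branch, so partial
        if x * x < 10.0:  # progress shows up as new branch coverage
            return True   # target reached <=> formula is satisfiable
    return False

# Naive random search standing in for the coverage-guided fuzzer;
# JFS's coverage feedback would typically reach the target far faster.
random.seed(1)
witness = None
for _ in range(200000):
    data = struct.pack("<d", random.uniform(-100.0, 100.0))
    if encoded_program(data):
        (witness,) = struct.unpack("<d", data)
        break
```

Any input that reaches the target decodes directly into a model of the formula, which is exactly how a fuzzer-found input becomes a satisfying assignment.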
Article Search
Artifacts Available
Cerebro: Context-Aware Adaptive Fuzzing for Effective Vulnerability Detection
Yuekang Li, Yinxing Xue, Hongxu Chen, Xiuheng Wu, Cen Zhang, Xiaofei Xie, Haijun Wang, and
Yang Liu
(University of Science and Technology of China, China; Nanyang Technological University, Singapore; Zhejiang Sci-Tech University, China)
Existing greybox fuzzers mainly utilize program coverage as the goal to guide the fuzzing process.
To maximize their outputs, coverage-based greybox fuzzers need to evaluate the quality of seeds properly, which involves making two decisions: 1) which is the most promising seed to fuzz next (seed prioritization), and 2) how much effort should be spent on the current seed (power scheduling).
In this paper, we present our fuzzer, Cerebro, to address the above challenges.
For the seed prioritization problem, we propose an online multi-objective based algorithm to balance various metrics such as code complexity, coverage, execution time, etc.
To address the power scheduling problem, we introduce the concept of input potential to measure the complexity of uncovered code and propose a cost-effective algorithm to update it dynamically.
Unlike previous approaches where the fuzzer evaluates an input solely based on the execution traces that it has covered, Cerebro is able to foresee the benefits of fuzzing the input by adaptively evaluating its input potential.
We perform a thorough evaluation of Cerebro on 8 different real-world programs.
The experiments show that Cerebro can find more vulnerabilities and achieve better coverage than state-of-the-art fuzzers such as AFL and AFLFast.
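Cerebro’s actual prioritization is an online multi-objective algorithm; as a rough illustration only, a scalarized version of the seed-scoring idea might look like the following (all metric names and numbers are hypothetical).

```python
# Hypothetical seed queue: each seed carries the kinds of metrics a
# Cerebro-style fuzzer might weigh (numbers are made up for illustration).
seeds = [
    {"id": "s1", "new_cov": 12, "complexity": 3, "exec_ms": 5},
    {"id": "s2", "new_cov": 12, "complexity": 9, "exec_ms": 50},
    {"id": "s3", "new_cov": 2,  "complexity": 1, "exec_ms": 2},
]

def score(seed):
    # Crude scalarization of the multi-objective trade-off: reward
    # coverage and code complexity, discount slow executions.
    return (seed["new_cov"] + seed["complexity"]) / seed["exec_ms"]

# Seed prioritization: fuzz the highest-scoring seed next.
next_seed = max(seeds, key=score)

# Power scheduling: give the chosen seed energy proportional to its score
# (a stand-in for dynamically updated "input potential").
energy = int(10 * score(next_seed))
```

The point of the sketch is the shape of the decision, not the weights: a fast seed with decent coverage beats a slow seed with slightly better metrics.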
Article Search
iFixFlakies: A Framework for Automatically Fixing Order-Dependent Flaky Tests
August Shi,
Wing Lam, Reed Oei,
Tao Xie, and
Darko Marinov
(University of Illinois at Urbana-Champaign, USA)
Regression testing provides important pass or fail signals that developers use to make decisions after code changes. However, flaky tests,
which pass or fail even when the code has not changed, can mislead
developers. A common kind of flaky tests is order-dependent tests,
which pass or fail depending on the order in which the tests are run.
Fixing order-dependent tests is often tedious and time-consuming.
We propose iFixFlakies, a framework for automatically fixing
order-dependent tests. The key insight in iFixFlakies is that test
suites often already have tests, which we call helpers, whose logic
resets or sets the states for order-dependent tests to pass. iFixFlakies
searches a test suite for helpers that make the order-dependent tests
pass and then recommends patches for the order-dependent tests
using code from these helpers. Our evaluation on 110 truly order-dependent tests from a public dataset shows that 58 of them have
helpers, and iFixFlakies can fix all 58. We opened pull requests for 56
order-dependent tests (2 of 58 were already fixed), and developers
have already accepted pull requests for 21 of them, with all the
remaining ones still pending.
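The helper idea can be illustrated with a toy test suite (all names are hypothetical; iFixFlakies works on real Java/JUnit suites and recommends patches built from the helper’s code, which this sketch omits).

```python
# Shared state that produces an order-dependent test (hypothetical).
state = {"count": 0}

def test_polluter():        # mutates shared state and leaves it dirty
    state["count"] += 1

def test_helper():          # its logic happens to reset the shared state
    state["count"] = 0

def test_victim():          # order-dependent: fails if run after the polluter
    assert state["count"] == 0

def run_order(tests):
    """Run a test order in a fresh state; report whether it passes."""
    state["count"] = 0
    try:
        for t in tests:
            t()
        return True
    except AssertionError:
        return False

# iFixFlakies-style search: find an existing test whose body makes the
# failing order pass; its code becomes the basis of the recommended patch.
failing_order = [test_polluter, test_victim]
candidates = [test_polluter, test_helper]
helper = next(t for t in candidates
              if run_order([test_polluter, t, test_victim]))
```

Inserting the discovered helper’s logic into the victim’s setup is what turns the search result into a concrete fix.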
Article Search
AggrePlay: Efficient Record and Replay of Multi-threaded Programs
Ernest Pobee and W. K. Chan
(City University of Hong Kong, China)
Deterministic replay is challenging and often incurs high memory and runtime overheads. Previous techniques often reproduce program outputs deterministically only after several replay iterations, or may produce a non-deterministic sequence of outputs to external sources. In this paper, we propose AggrePlay, a deterministic replay technique based on recording read-write interleavings, leveraging thread-local determinism and summarized read values. During the record phase, AggrePlay records a read count vector clock for each thread on each memory location. In the replay phase, each thread checks the logged vector clock against the current read count before a write event. We present an experiment and analyze the results using the Splash2x benchmark suite as well as two real-world applications. The experimental results show that, on average, AggrePlay achieves a greater reduction in compressed log size and a 56% lower runtime slowdown during the record phase, as well as a 41.58% higher probability of successful replay than existing work.
Article Search
The Review Linkage Graph for Code Review Analytics: A Recovery Approach and Empirical Study
Toshiki Hirao, Shane McIntosh, Akinori Ihara, and Kenichi Matsumoto
(NAIST, Japan; McGill University, Canada; Wakayama University, Japan)
Modern Code Review (MCR) is a pillar of contemporary quality assurance approaches, where developers discuss and improve code changes prior to integration. Since review interactions (e.g., comments, revisions) are archived, analytics approaches like reviewer recommendation and review outcome prediction have been proposed to support the MCR process. These approaches assume that reviews evolve and are adjudicated independently; yet in practice, reviews can be interdependent.
In this paper, we set out to better understand the impact of review linkage on code review analytics. To do so, we extract review linkage graphs where nodes represent reviews, while edges represent recovered links between reviews. Through a quantitative analysis of six software communities, we observe that (a) linked reviews occur regularly, with linked review rates of 25% in OpenStack, 17% in Chromium, and 3%–8% in Android, Qt, Eclipse, and LibreOffice; and (b) linkage has become more prevalent over time. Through qualitative analysis, we discover that links span 16 types that belong to five categories. To automate link category recovery, we train classifiers to label links according to the surrounding document content. Those classifiers achieve F1-scores of 0.71–0.79, at least doubling the F1-scores of a ZeroR baseline. Finally, we show that the F1-scores of reviewer recommenders can be improved by 37%–88% (5–14 percentage points) by incorporating information from linked reviews that is available at prediction time. Indeed, review linkage should be exploited by future code review analytics.
Preprint
Artifacts Available
Mitigating Power Side Channels during Compilation
Jingbo Wang, Chungha Sung, and Chao Wang
(University of Southern California, USA)
The code generation modules inside modern compilers, which use a
limited number of CPU registers to store a large number of program
variables, may introduce side-channel leaks even in software equipped
with state-of-the-art countermeasures. We propose a program analysis
and transformation based method to eliminate such leaks. Our method
has a type-based technique for detecting leaks, which leverages
Datalog-based declarative analysis and domain-specific optimizations
to achieve high efficiency and accuracy.
It also has a mitigation technique for the compiler’s backend, more
specifically the register allocation modules, to ensure that leaky
intermediate computation results are stored in different CPU registers
or memory locations.
We have implemented and evaluated our method in LLVM for the x86
instruction set architecture. Our experiments on cryptographic
software show that the method is effective in removing the side
channel while being efficient, i.e., our mitigated code is more
compact and runs faster than code mitigated using state-of-the-art
techniques.
Preprint
Maximal Multi-layer Specification Synthesis
Yanju Chen, Ruben Martins, and Yu Feng
(University of California at Santa Barbara, USA; Carnegie Mellon University, USA)
There has been significant interest in applying programming-by-example to automate repetitive and tedious tasks. However, due to the incomplete nature of input-output examples, a synthesizer may generate programs that pass the examples but do not match the user intent. In this paper, we propose MARS, a novel synthesis framework that takes as input a multi-layer specification composed of input-output examples, a textual description, and partial code snippets that capture the user intent. To accurately capture the user intent from the noisy and ambiguous description, we propose a hybrid model that combines the power of an LSTM-based sequence-to-sequence model with the Apriori algorithm for mining association rules through unsupervised learning. We reduce multi-layer specification synthesis to a Max-SMT problem, where hard constraints encode well-typed concrete programs and soft constraints encode the user intent learned by the hybrid model. We instantiate our hybrid model in the data wrangling domain and compare its performance against Morpheus, a state-of-the-art synthesizer for data wrangling tasks. Our experiments demonstrate that our approach outperforms Morpheus in terms of running time and solved benchmarks. For challenging benchmarks, our approach can suggest candidates with rankings that are an order of magnitude better than those of Morpheus, which leads to running times that are 15x faster than Morpheus.
Article Search
Phoenix: Automated Data-Driven Synthesis of Repairs for Static Analysis Violations
Rohan Bavishi, Hiroaki Yoshida, and Mukul R. Prasad
(University of California at Berkeley, USA; Fujitsu Labs, USA)
Traditional automatic program repair (APR) tools rely on a test suite as a repair specification. But test suites, even when available, are not of specification quality, limiting the performance, and hence the viability, of test-suite-based repair. On the other hand, static analysis-based bug finding tools are seeing increasing adoption in industry but still face challenges, since the reported violations are viewed as not easily actionable. We propose a novel solution that addresses both of these challenges through a technique for automatically generating high-quality patches for static analysis violations by learning from examples. Our approach uses the static analyzer as an oracle and does not require a test suite. We realize our solution in a system, Phoenix, that implements a fully automated pipeline that mines and cleans patches for static analysis violations from the wild, learns generalized executable repair strategies as programs in a novel Domain Specific Language (DSL), and then instantiates concrete repairs from them on new unseen violations. Using Phoenix we mine a corpus of 5,389 unique violations and patches from 517 GitHub projects. In a cross-validation study on this corpus, Phoenix successfully produced 4,596 bug fixes, with a recall of 85% and a precision of 54%. When applied to the latest revisions of a further 5 GitHub projects, Phoenix produced 94 correct patches for previously unknown bugs, 19 of which have already been accepted and merged by the development teams. To the best of our knowledge, this constitutes by far the largest application of any automatic patch generation technology to large-scale real-world systems.
Article Search
Black Box Fairness Testing of Machine Learning Models
Aniya Aggarwal, Pranay Lohia, Seema Nagar, Kuntal Dey, and
Diptikalyan Saha
(IBM Research, India)
Any given AI system cannot be accepted unless its trustworthiness is proven. An important characteristic of a trustworthy AI system is the absence of algorithmic bias. ‘Individual discrimination’ exists when a given individual, differing from another only in ‘protected
attributes’ (e.g., age, gender, race), receives a different decision outcome from a given machine learning (ML) model than the other individual. The current work addresses the problem of detecting the presence of individual discrimination in
given ML models. Detection of individual discrimination is test-intensive in a black-box setting, which is not feasible for
non-trivial systems. We propose a methodology for the automatic generation of test inputs for the task of detecting individual discrimination. Our approach combines two well-established techniques, symbolic execution and local explainability, for effective test case generation.
We empirically show that our approach to generating test cases is very effective compared to the best-known benchmark systems that we examine.
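The notion of individual discrimination being tested can be sketched directly (the model, attributes, and thresholds below are hypothetical; the paper’s contribution is generating such test pairs via symbolic execution and local explainability, which this sketch omits).

```python
# A hypothetical loan-decision model that leaks the protected attribute.
def model(age, gender, income):
    return income > 50000 or (gender == "m" and income > 40000)

def individual_discrimination(model, sample, protected, values):
    """Flip only the protected attribute; any change in outcome is
    individual discrimination for this sample."""
    outcomes = {v: model(**{**sample, protected: v}) for v in values}
    return len(set(outcomes.values())) > 1

# A generated test input: two individuals identical except for gender.
sample = {"age": 30, "gender": "f", "income": 45000}
biased = individual_discrimination(model, sample, "gender", ["m", "f"])
```

The hard part, which the paper addresses, is finding samples like this one efficiently rather than enumerating the input space.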
Article Search
Java Reflection API: Revealing the Dark Side of the Mirror
Felipe Pontes,
Rohit Gheyi, Sabrina Souto, Alessandro Garcia, and
Márcio Ribeiro
(Federal University of Campina Grande, Brazil; State University of Paraíba, Brazil; PUC-Rio, Brazil; Federal University of Alagoas, Brazil)
Developers of widely used Java Virtual Machines (JVMs) implement and test the Java Reflection API based on its Javadoc, which is specified in natural language. However, there is limited knowledge on whether Java Reflection API developers are able to systematically reveal i) underdetermined specifications; and ii) non-conformances between their implementation and the Javadoc. Moreover, current automatic test suite generators cannot be used to detect them. To better understand the problem, we analyze the test suites of two widely used JVMs, and we conduct a survey with 130 developers who use the Java Reflection API to see whether the Javadoc impacts their understanding. We also propose a technique to detect underdetermined specifications and non-conformances between the Javadoc and the implementations of the Java Reflection API. It automatically creates test cases and executes them using different JVMs. Then, we manually execute some steps to identify underdetermined specifications and to confirm whether a non-conformance candidate is indeed a bug. We evaluate our technique on 439 input programs. Our technique identifies underdetermined specification and non-conformance candidates in 32 Java Reflection API public methods of 7 classes. We report underdetermined specification candidates in 12 Java Reflection API methods. Java Reflection API specifiers accept 3 underdetermined specification candidates (25%). We also report 24 non-conformance candidates to the Eclipse OpenJ9 JVM, and 7 to the Oracle JVM. Eclipse OpenJ9 JVM developers accept and fix 21 candidates (87.5%), and Oracle JVM developers accept 5 and fix 4 non-conformance candidates.
Article Search
A Conceptual Replication of Continuous Integration Pain Points in the Context of Travis CI
David Gray Widder, Michael Hilton,
Christian Kästner, and
Bogdan Vasilescu
(Carnegie Mellon University, USA)
Continuous integration (CI) is an established software quality assurance practice, and the focus of much prior research with a diverse range of methods and populations. In this paper, we first conduct a literature review of 37 papers on CI pain points. We then conduct a conceptual replication study on results from these papers using a triangulation design consisting of a survey with 132 responses, 12 interviews, and two logistic regressions predicting Travis CI abandonment and switching on a dataset of 6,239 GitHub projects. We report and discuss which past results we were able to replicate, those for which we found conflicting evidence, those for which we did not find evidence, and the implications of these findings.
Article Search
Info
Ethnographic Research in Software Engineering: A Critical Review and Checklist
He Zhang, Xin Huang, Xin Zhou, Huang Huang, and Muhammad Ali Babar
(Nanjing University, China; University of Adelaide, Australia)
The Software Engineering (SE) community has recently been investing a significant amount of effort in qualitative research to study the human and social aspects of SE processes, practices, and technologies. Ethnography is one of the major qualitative research methods, and is based on a constructivist paradigm that differs from the hypothetico-deductive research model usually used in SE. Hence, the adoption of the ethnographic research method in SE can present significant challenges in terms of sufficient understanding of its methodological requirements and the logistics of its application. It is important to systematically identify and understand the various aspects of adopting ethnography in SE and to provide effective guidance. We carried out an empirical inquiry by integrating a systematic literature review and a confirmatory survey. By reviewing the ethnographic studies reported in 111 identified papers and 26 doctoral theses, and analyzing the authors’ responses for 29 of those papers, we revealed several unique insights. These identified insights were then transformed into a preliminary checklist that helps improve the state of the practice of using ethnography in SE. This study also identifies the areas where methodological improvements of ethnography are needed in SE.
Article Search
Achilles’ Heel of Plug-and-Play Software Architectures: A Grounded Theory Based Approach
Joanna C. S. Santos, Adriana Sejfia, Taylor Corrello, Smruthi Gadenkanahalli, and Mehdi Mirakhorli
(Rochester Institute of Technology, USA)
Through a set of well-defined interfaces, plug-and-play architectures enable additional functionalities to be added to or removed from a system at runtime. However, plug-ins can also increase the application’s attack surface or introduce untrusted behavior into the system. In this paper, we (1) use a grounded theory-based approach to conduct an empirical study of common vulnerabilities in plug-and-play architectures; (2) conduct a systematic literature survey and evaluate the extent to which the results of the empirical study are novel or supported by the literature; (3) evaluate the practicality of the findings by interviewing practitioners with several years of experience in plug-and-play systems. By analyzing Chromium, Thunderbird, Firefox, Pidgin, WordPress, Apache OFBiz, and OpenMRS, we found a total of 303 vulnerabilities rooted in extensibility design decisions and observed that these plugin-related vulnerabilities fell into 16 different vulnerability types. For these 16 vulnerability types, we identified 19 mitigation procedures for fixing them. The literature review supported 12 vulnerability types and 8 mitigation techniques discovered in our empirical study, and indicated that 5 mitigation techniques were not covered in our empirical study. Furthermore, it indicated that 4 vulnerability types and 11 mitigation techniques discovered in our empirical study were not covered in the literature. The interviews with practitioners confirmed the relevance of the findings and highlighted ways that the results of this empirical study can have an impact in practice.
Preprint
Info
Latent Error Prediction and Fault Localization for Microservice Applications by Learning from System Trace Logs
Xiang Zhou,
Xin Peng,
Tao Xie,
Jun Sun, Chao Ji, Dewei Liu, Qilin Xiang, and Chuan He
(Fudan University, China; University of Illinois at Urbana-Champaign, USA; Singapore Management University, Singapore)
In the production environment, a large part of microservice failures are related to the complex and dynamic interactions and runtime environments, such as those related to multiple instances, environmental configurations, and asynchronous interactions of microservices. Due to the complexity and dynamism of these failures, it is often hard to reproduce and diagnose them in testing environments. It is desirable yet still challenging that these failures can be detected and the faults located at runtime in the production environment to allow developers to resolve them efficiently. To address this challenge, in this paper, we propose MEPFL, an approach to latent error prediction and fault localization for microservice applications by learning from system trace logs. Based on a set of features defined on the system trace logs, MEPFL trains prediction models at both the trace level and the microservice level using the system trace logs collected from automatic executions of the target application and its faulty versions produced by fault injection. The prediction models thus can be used in the production environment to predict latent errors, faulty microservices, and fault types for trace instances captured at runtime. We implement MEPFL based on the infrastructure systems of container orchestrator and service mesh, and conduct a series of experimental studies with two open-source microservice applications (one of them being the largest open-source microservice application to the best of our knowledge). The results indicate that MEPFL can achieve high accuracy in intra-application prediction of latent errors, faulty microservices, and fault types, and outperforms a state-of-the-art approach of failure diagnosis for distributed systems. The results also show that MEPFL can effectively predict latent errors caused by real-world fault cases.
Preprint
The Importance of Accounting for Real-World Labelling When Predicting Software Vulnerabilities
Matthieu Jimenez, Renaud Rwemalika, Mike Papadakis,
Federica Sarro, Yves Le Traon, and
Mark Harman
(University of Luxembourg, Luxembourg; University College London, UK; Facebook, UK)
Previous work on vulnerability prediction assumes that predictive models are trained with respect to perfect labelling information (including labels from future, as yet undiscovered vulnerabilities). In this paper we present results from a comprehensive empirical study of 1,898 real-world vulnerabilities reported in 74 releases of three security-critical open source systems (Linux Kernel, OpenSSL and Wireshark). Our study investigates the effectiveness of three previously proposed vulnerability prediction approaches, in two settings: with and without the unrealistic labelling assumption. The results reveal that the unrealistic labelling assumption can profoundly mislead the scientific conclusions drawn; the highly effective and deployable prediction results it suggests vanish when we fully account for realistically available labelling in the experimental methodology. More precisely, MCC mean values of predictive effectiveness drop from 0.77, 0.65 and 0.43 to 0.08, 0.22 and 0.10 for Linux Kernel, OpenSSL and Wireshark, respectively. Similar results are also obtained for precision, recall and other assessments of predictive efficacy. The community therefore needs to upgrade its experimental and empirical methodology for vulnerability prediction evaluation and development to ensure robust and actionable scientific findings.
Article Search
Detecting Concurrency Memory Corruption Vulnerabilities
Yan Cai, Biyun Zhu, Ruijie Meng, Hao Yun,
Liang He, Purui Su, and Bin Liang
(Institute of Software at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China; Renmin University of China, China)
Memory corruption vulnerabilities can occur in multithreaded executions, known as concurrency vulnerabilities in this paper. Due to non-deterministic multithreaded executions, they are extremely difficult to detect. Recently, researchers have tried to apply data race detectors to detect concurrency vulnerabilities. Unfortunately, these detectors are ineffective at detecting concurrency vulnerabilities. For example, most (90%) data races are benign. However, concurrency vulnerabilities are harmful and can usually be exploited to launch attacks. Techniques based on the maximal causal model rely on constraint solvers to predict schedulings; they can miss concurrency vulnerabilities in practice. Our insight is that a concurrency vulnerability is more related to the orders of events that can be reversed in different executions, no matter whether the corresponding accesses form data races. We then define exchangeable events to identify pairs of events whose execution orders can probably be reversed in different executions. We further propose algorithms to detect three major kinds of concurrency vulnerabilities. To overcome the potential imprecision of exchangeable events, we also adopt a validation step to isolate real vulnerabilities. We implemented our algorithms in a tool, ConVul, and applied it to 10 known concurrency vulnerabilities and the MySQL database server. Compared with three widely used race detectors and one detector based on the maximal causal model, ConVul was significantly more effective, detecting 9 of the 10 known vulnerabilities and 6 zero-day vulnerabilities in MySQL (four of which have been confirmed). The other detectors detected at most 3 of these 16 known and zero-day vulnerabilities.
Article Search
Locating Vulnerabilities in Binaries via Memory Layout Recovering
Haijun Wang, Xiaofei Xie,
Shang-Wei Lin,
Yun Lin, Yuekang Li, Shengchao Qin,
Yang Liu, and Ting Liu
(Shenzhen University, China; Nanyang Technological University, Singapore; National University of Singapore, Singapore; Teesside University, UK; Xi’an Jiaotong University, China)
Locating vulnerabilities is an important task for security auditing, exploit writing, and code hardening. However, it is challenging to locate vulnerabilities in binary code, because most program semantics (e.g., the boundaries of an array) is missing after compilation. Without program semantics, it is difficult to determine whether a memory access exceeds its valid boundaries in binary code. In this work, we propose an approach to locate vulnerabilities based on memory layout recovery. First, we collect a set of passed executions and one failed execution. Then, for the passed and failed executions, we restore their program semantics by recovering fine-grained memory layouts based on the memory addressing model. With the memory layouts recovered in passed executions as a reference, we can locate vulnerabilities in the failed execution by memory layout identification and comparison. Our experiments show that the proposed approach is effective at locating vulnerabilities in 24 out of 25 of DARPA’s CGC programs (96%), and can effectively classify 453 program crashes (in 5 Linux programs) into 19 groups based on their root causes.
Article Search
Storm: Program Reduction for Testing and Debugging Probabilistic Programming Systems
Saikat Dutta, Wenxian Zhang, Zixin Huang, and Sasa Misailovic
(University of Illinois at Urbana-Champaign, USA)
Probabilistic programming languages offer an intuitive way to model uncertainty by representing complex probability models as simple probabilistic programs. Probabilistic programming systems (PP systems) hide the complexity of inference algorithms away from the program developer. Unfortunately, if a failure occurs during the run of a PP system, a developer typically has very little support in finding the part of the probabilistic program that causes the failure in the system.
This paper presents Storm, a novel general framework for reducing probabilistic programs. Given a probabilistic program (with associated data and inference arguments) that causes a failure in
a PP system, Storm finds a smaller version of the program, data, and arguments that cause the same failure. Storm leverages both generic code and data transformations from compiler testing and
domain-specific, probabilistic transformations. The paper presents new transformations that reduce the complexity of statements and expressions, reduce data size, and simplify inference arguments
(e.g., the number of iterations of the inference algorithm).
We evaluated Storm on 47 programs that caused failures in two popular probabilistic programming systems, Stan and Pyro. Our experimental results show Storm’s effectiveness. For Stan, our minimized programs have 49% less code, 67% less data, and 96% fewer iterations. For Pyro, our minimized programs have 58% less code, 96% less data, and 99% fewer iterations. We also show the benefits of Storm when debugging probabilistic programs.
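The core reduce-while-preserving-the-failure loop can be sketched as follows. This is a hypothetical, greedy statement-deletion pass only; Storm’s real transformations are richer (data reduction, inference-argument simplification, probabilistic rewrites) and its oracle is a rerun of the PP system.

```python
def fails(program):
    # Hypothetical oracle: the PP system crashes whenever these two
    # statements appear together (a stand-in for rerunning Stan/Pyro).
    return "sample_stmt" in program and "bad_prior" in program

def reduce_program(program):
    """Greedy one-statement-at-a-time reduction: keep deleting any
    statement whose removal still reproduces the original failure."""
    changed = True
    while changed:
        changed = False
        for i in range(len(program)):
            candidate = program[:i] + program[i + 1:]
            if fails(candidate):    # failure preserved -> accept deletion
                program = candidate
                changed = True
                break
    return program

original = ["data_decl", "sample_stmt", "loop", "bad_prior", "print"]
minimal = reduce_program(original)
```

The invariant is the important part: every accepted transformation must still trigger the same failure, so the final program is a smaller witness of the same bug.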
Article Search
NullAway: Practical Type-Based Null Safety for Java
Subarno Banerjee, Lazaro Clapp, and Manu Sridharan
(University of Michigan, USA; Uber Technologies, USA; University of California at Riverside, USA)
NullPointerExceptions (NPEs) are a key source of crashes in modern Java programs. Previous work has shown how such errors can be prevented at compile time via code annotations and pluggable type checking. However, such systems have been difficult to deploy on large-scale software projects, due to significant build-time overhead and/or a high annotation burden. This paper presents NullAway, a new type-based null safety checker for Java that overcomes these issues. NullAway has been carefully engineered for low overhead, so it can run as part of every build. Further, NullAway reduces annotation burden through targeted unsound assumptions, aiming for no false negatives in practice on checked code. Our evaluation shows that NullAway has significantly lower build-time overhead (1.15×) than comparable tools (2.8–5.1×). Further, on a corpus of production crash data for widely used Android apps built with NullAway, the remaining NPEs were due to unchecked third-party libraries (64%), deliberate error suppressions (17%), or reflection and other forms of post-checking code modification (17%), and never due to NullAway’s unsound assumptions for checked code.
Article Search
Artifacts Available
Artifacts Reusable
Automatically Detecting Missing Cleanup for Ungraceful Exits
Zhouyang Jia, Shanshan Li,
Tingting Yu, Xiangke Liao, and Ji Wang
(National University of Defense Technology, China; University of Kentucky, USA)
Software encounters ungraceful exits due to either bugs in the interrupt/signal handler code or the intention of developers to debug the software. Users may suffer from “weird” problems caused by the leftovers of ungraceful exits. A common practice to fix these problems is rebooting, which wipes away the stale state of the software. This solution, however, is heavyweight and often leads to poor user experience because it requires restarting other normal processes. In this paper, we design SafeExit, a tool that can automatically detect and pinpoint the root causes of problems caused by ungraceful exits, which can help users fix the problems using lightweight solutions. Specifically, SafeExit checks the program’s exit behaviors in the case of an interrupted execution against its expected exit behaviors to detect the missing cleanup behaviors required for avoiding an ungraceful exit. The expected behaviors are obtained by monitoring the program’s exit under a normal execution. We apply SafeExit to 38 programs across 10 domains. SafeExit finds 133 types of cleanup behaviors in 36 programs and detects 2861 missing behaviors in 292 interrupted executions. To predict missing behaviors for unseen input scenarios, SafeExit trains prediction models using a set of sampled input scenarios. The results show that SafeExit is accurate, with an average F-measure of 92.5%.
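The expected-versus-interrupted comparison at the heart of this approach reduces to a set difference, sketched here with hypothetical action names (SafeExit itself monitors real process exits; this only shows the diffing step).

```python
# Cleanup actions observed when the program exits normally vs. when it is
# interrupted (hypothetical traces a SafeExit-like monitor might record).
normal_exit = ["flush_buffers", "remove_lockfile", "restore_terminal"]
interrupted_exit = ["flush_buffers"]

def missing_cleanup(expected, observed):
    """Expected exit behaviors absent from the interrupted run are the
    root-cause candidates for the leftover 'weird' state."""
    return [action for action in expected if action not in observed]

leftovers = missing_cleanup(normal_exit, interrupted_exit)
```

Each flagged action points the user at a lightweight fix (e.g., deleting a stale lock file) instead of a full reboot.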
Article Search
Finding and Understanding Bugs in Software Model Checkers
Chengyu Zhang, Ting Su, Yichen Yan, Fuyuan Zhang, Geguang Pu, and Zhendong Su
(East China Normal University, China; ETH Zurich, Switzerland; MPI-SWS, Germany)
Software Model Checking (SMC) is a well-known automatic program
verification technique and frequently adopted for checking
safety-critical software. Thus, the reliability of SMC tools
themselves (i.e., software model checkers) is critical. However, little
work exists on validating software model checkers, an important
problem that this paper tackles by introducing a practical,
automated fuzzing technique. For its simplicity and
generality, we focus on control-flow reachability (e.g., whether or how
many times a branch is reached) and address two specific challenges
for effective fuzzing: oracle and scalability. Given a
deterministic program, we (1) leverage its concrete executions to
synthesize valid branch reachability properties (thus solving the
oracle problem) and (2) fuse such individual properties into a single
safety property (thus improving the scalability of fuzzing and
reducing manual inspection). We have realized our approach as the
MCFuzz tool and applied it to extensively test three state-of-the-art C
software model checkers, CPAchecker, CBMC, and SeaHorn. MCFuzz has
found 62 unique bugs in all three model checkers; 58 have been confirmed,
and 20 have been fixed. We have further analyzed and
categorized these diverse bugs, and summarized several
lessons for building reliable and robust model checkers. Our testing
effort has been well received by the model checker developers, and has
also led to improved tool usability and documentation.
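The oracle idea above can be paraphrased as: run the deterministic program concretely, record how often each branch is taken, and fuse those observations into one safety property that the model checker must agree with. A hypothetical sketch of that fusing step (trace format and function names are ours):

```python
from collections import Counter

def observed_branch_counts(trace):
    """Count how often each branch id appears in a concrete execution trace."""
    return Counter(trace)

def fused_property_holds(trace, claimed_counts):
    """The fused safety property: every per-branch reachability count the
    checker claims must match the concrete run. A single combined check
    replaces one property per branch, which is the scalability point."""
    return observed_branch_counts(trace) == Counter(claimed_counts)

trace = ["b1", "b2", "b1"]                               # b1 taken twice, b2 once
print(fused_property_holds(trace, {"b1": 2, "b2": 1}))   # True
print(fused_property_holds(trace, {"b1": 1, "b2": 1}))   # False: a checker
# claiming b1 is reached once contradicts the concrete execution
```

In the actual tool the fused property is encoded back into the C program (e.g., via counters and a single assertion) so each model checker can be tested against it directly.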
Article Search
A Segmented Memory Model for Symbolic Execution
Timotej Kapus and
Cristian Cadar
(Imperial College London, UK)
Symbolic execution is an effective technique for exploring paths in a
program and reasoning about all possible values on those paths.
However, the technique still struggles with code that uses complex
heap data structures, in which a pointer is allowed to refer to more
than one memory object. In such cases, symbolic execution typically
forks execution into multiple states, one for each object to which the
pointer could refer.
In this paper, we propose a technique that avoids this expensive
forking by using a segmented memory model. In this model,
memory is split into segments, so that each symbolic pointer refers to
objects in a single segment. The size of each segment is bounded by a
threshold, in order to avoid expensive constraints. This results in a
memory model in which forking due to symbolic pointer dereferences is
significantly reduced, and often eliminated entirely.
We evaluate our segmented memory model on a mix of whole program benchmarks
(such as m4 and make) and library benchmarks (such as SQLite), and
observe significant decreases in execution time and memory usage.
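The benefit of the segmented model can be seen by counting states: a flat model forks one state per object a symbolic pointer may reference, while the segmented model forks once per segment. A toy illustration (the grouping function and names are invented; the real implementation's details are omitted):

```python
def states_forked_flat(pointees):
    """Flat memory model: one forked state per object the symbolic
    pointer may refer to."""
    return len(pointees)

def states_forked_segmented(pointees, segment_of):
    """Segmented model: one state per *segment*; objects a pointer may
    alias are co-located in one segment, so the common case is one state."""
    return len({segment_of[obj] for obj in pointees})

# A symbolic pointer that may point to any node of a 4-node linked list:
pointees = ["node0", "node1", "node2", "node3"]
segment_of = {obj: "list_segment" for obj in pointees}  # all co-located
print(states_forked_flat(pointees))                   # 4
print(states_forked_segmented(pointees, segment_of))  # 1
```

The pointer's possible targets become a constraint over offsets within one segment rather than a case split over states, which is where the reported time and memory savings come from.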
Article Search
Artifacts Available
Artifacts Reusable
Releasing Fast and Slow: An Exploratory Case Study at ING
Elvan Kula, Ayushi Rastogi, Hennie Huijgens, Arie van Deursen, and
Georgios Gousios
(Delft University of Technology, Netherlands; ING Bank, Netherlands)
The appeal of delivering new features faster has led many software projects to adopt rapid releases. However, the effects of this practice are not well understood. This paper presents an exploratory case study of rapid releases at ING, a large banking company that develops software solutions in-house, to characterize rapid releases. Since 2011, ING has shifted to a rapid release model. This switch has resulted in a mixed environment of 611 teams releasing relatively fast and slow. We followed a mixed-methods approach in which we conducted a survey with 461 participants and corroborated their perceptions with 2 years of code quality data and 1 year of release delay data. Our research shows that: rapid releases are more commonly delayed than their non-rapid counterparts, but their delays are shorter; rapid releases can be beneficial in terms of reviewing and user-perceived quality; rapidly released software tends to have higher code churn, higher test coverage, and lower average complexity; and the challenges of rapid releases relate to managing dependencies and certain code aspects, e.g., design debt.
Preprint
SAR: Learning Cross-Language API Mappings with Little Knowledge
Nghi D. Q. Bui,
Yijun Yu, and Lingxiao Jiang
(Singapore Management University, Singapore; Open University, UK)
To save effort, developers often translate programs from one programming language to another instead of implementing them from scratch. Translating application programming interfaces (APIs) used in one language to functionally equivalent ones available in another language is an important aspect of program translation. Existing approaches facilitate the translation by automatically identifying the API mappings across programming languages. However, these approaches still require a large amount of parallel corpora, ranging from pairs of APIs or code fragments that are functionally equivalent to similar code comments.
To minimize the need for parallel corpora, this paper aims at an automated approach that can map APIs across languages with much less a priori knowledge than other approaches. The approach is based on a realization of the notion of domain adaptation, combined with code embedding, to better align two vector spaces. Taking as input large sets of programs, our approach first generates numeric vector representations of the programs (including the APIs used in each language), and it adapts generative adversarial networks (GANs) to align the vectors across the spaces of the two languages. For a better alignment, we initialize the GAN with parameters derived from API mapping seeds that can be identified accurately with a simple automatic signature-based matching heuristic. Cross-language API mappings can then be identified via nearest-neighbor queries in the aligned vector spaces. We have implemented the approach (SAR, named after its three main technical components) in a prototype for mapping APIs across Java and C# programs. Our evaluation on about 2 million Java files and 1 million C# files shows that the approach achieves 54% and 82% mapping accuracy in its top-1 and top-10 API mapping results with only 174 automatically identified seeds, which is more accurate than other approaches that use the same number of seeds or many more.
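Once the two embedding spaces are aligned, the final step described above is an ordinary nearest-neighbor query. A small sketch of such a query by cosine similarity (the API names and vectors below are made up for illustration, not from SAR's data):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def top_k_mappings(java_vec, cs_api_vectors, k=2):
    """Rank candidate C# APIs by cosine similarity to a Java API's
    vector in the aligned space."""
    ranked = sorted(cs_api_vectors,
                    key=lambda api: cosine(java_vec, cs_api_vectors[api]),
                    reverse=True)
    return ranked[:k]

# Toy aligned vectors (invented):
cs_apis = {
    "StringBuilder.Append": [0.9, 0.1],
    "List.Add":             [0.1, 0.9],
    "Console.WriteLine":    [0.5, 0.5],
}
java_vec = [0.8, 0.2]   # e.g., the aligned vector of Java's StringBuilder.append
print(top_k_mappings(java_vec, cs_apis))
# ['StringBuilder.Append', 'Console.WriteLine']
```

The heavy lifting in the paper is producing the alignment itself (GAN initialization from signature-matched seeds); the query step shown here is deliberately simple.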
Preprint
Info
Artifacts Available
Artifacts Reusable
Robust Log-Based Anomaly Detection on Unstable Log Data
Xu Zhang, Yong Xu, Qingwei Lin, Bo Qiao, Hongyu Zhang, Yingnong Dang, Chunyu Xie, Xinsheng Yang, Qian Cheng, Ze Li, Junjie Chen, Xiaoting He, Randolph Yao, Jian-Guang Lou, Murali Chintalapati, Furao Shen, and Dongmei Zhang
(Microsoft Research, China; Nanjing University, China; University of Newcastle, Australia; Microsoft, USA; Tianjin University, China)
Logs are widely used by large and complex software-intensive systems for troubleshooting, and there have been many studies on log-based anomaly detection. To detect anomalies, existing methods mainly construct a detection model using log event data extracted from historical logs. However, we find that these methods do not work well in practice. They make a closed-world assumption: that log data is stable over time and that the set of distinct log events is known. Our empirical study shows that, in practice, log data often contains previously unseen log events or log sequences. The instability of log data comes from two sources: 1) the evolution of logging statements, and 2) processing noise in log data. In this paper, we propose a new log-based anomaly detection approach, called LogRobust. LogRobust extracts semantic information from log events and represents them as semantic vectors. It then detects anomalies by utilizing an attention-based Bi-LSTM model, which can capture the contextual information in log sequences and automatically learn the importance of different log events. In this way, LogRobust is able to identify and handle unstable log events and sequences. We have evaluated LogRobust using logs collected from the Hadoop system and an actual online service system of Microsoft. The experimental results show that the proposed approach effectively addresses the problem of log instability and achieves accurate and robust results on real-world, ever-changing log data.
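The key robustness idea is representing a log event by the semantics of its words rather than by an opaque event id, so an unseen-but-similar event lands near a known one. A toy sketch with hand-made two-dimensional word vectors (the real approach uses pretrained word embeddings and weighted aggregation, not this invented table):

```python
import math

# Invented toy word vectors; a real system would use pretrained embeddings.
word_vec = {
    "connection": [1.0, 0.0],
    "failed":     [0.0, 1.0],
    "error":      [0.1, 0.9],
    "job":        [-1.0, 0.0],
    "completed":  [0.0, -1.0],
}

def event_vector(template):
    """Semantic vector of a log event: the average of its word vectors."""
    words = template.lower().split()
    return [sum(word_vec[w][i] for w in words) / len(words) for i in range(2)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

seen   = event_vector("Connection failed")   # event present in training logs
unseen = event_vector("Connection error")    # new wording, same meaning
other  = event_vector("Job completed")
print(cosine(seen, unseen) > cosine(seen, other))   # True
```

Because the unseen event's vector is close to a known one, the downstream sequence model can treat it sensibly instead of failing on an out-of-vocabulary event id.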
Article Search
Pinpointing Performance Inefficiencies in Java
Pengfei Su, Qingsen Wang, Milind Chabbi, and Xu Liu
(College of William and Mary, USA; Scalable Machines Research, USA)
Many performance inefficiencies, such as inappropriate choice of algorithms or data structures, developers’ inattention to performance, and missed compiler optimizations, show up as wasteful memory operations. Wasteful memory operations are those that produce/consume data to/from memory that could have been avoided. We present JXPerf, a lightweight performance analysis tool for pinpointing wasteful memory operations in Java programs. Traditional bytecode instrumentation for such analysis (1) introduces prohibitive overheads and (2) misses inefficiencies in machine code generation. JXPerf overcomes both of these problems. JXPerf uses hardware performance monitoring units to sample memory locations accessed by a program and uses hardware debug registers to monitor subsequent accesses to the same memory. The result is a lightweight measurement at the machine code level with attribution of inefficiencies to their provenance: machine and source code within full calling contexts. JXPerf introduces only 7% runtime overhead and 7% memory overhead, making it useful in production. Guided by JXPerf, we optimize several Java applications by improving code generation and choosing superior data structures and algorithms, which yield significant speedups.
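One classic instance of a wasteful memory operation is a dead store: a write that is overwritten before it is ever read. JXPerf finds such patterns with hardware sampling and debug registers; here is a purely illustrative software sketch of the same detection rule over an access trace (the trace format is invented):

```python
def dead_stores(trace):
    """Indices of stores whose value is overwritten with no intervening
    read of the same address. Trace entries: ("load" | "store", addr)."""
    last_store = {}   # addr -> index of the most recent not-yet-read store
    dead = []
    for i, (op, addr) in enumerate(trace):
        if op == "store":
            if addr in last_store:
                dead.append(last_store[addr])   # previous store never read
            last_store[addr] = i
        else:  # load
            last_store.pop(addr, None)          # that store was consumed
    return dead

trace = [
    ("store", 0x10),   # index 0: dead, overwritten at index 2 without a read
    ("store", 0x20),
    ("store", 0x10),
    ("load",  0x10),
]
print(dead_stores(trace))   # [0]
```

The tool's contribution is doing this kind of check at the machine-code level with ~7% overhead via sampling, instead of tracing every access as this sketch does.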
Article Search
Understanding Flaky Tests: The Developer’s Perspective
Moritz Eck, Fabio Palomba, Marco Castelluccio, and Alberto Bacchelli
(University of Zurich, Switzerland; Mozilla, UK)
Flaky tests are software tests that exhibit a seemingly random outcome (pass or fail) despite exercising unchanged code. In this work, we examine the perceptions of software developers about the nature, relevance, and challenges of flaky tests.
We asked 21 professional developers to classify 200 flaky tests they previously fixed, in terms of the nature and the origin of the flakiness, as well as of the fixing effort. We also examined developers’ fixing strategies.
Subsequently, we conducted an online survey with 121 developers with a median industrial programming experience of five years.
Our research shows that: flakiness is due to several different causes, four of which have never been reported before, despite being the most costly to fix; flakiness is perceived as significant by the vast majority of developers, regardless of their team’s size and project’s domain, and it can affect resource allocation, scheduling, and the perceived reliability of the test suite; and the challenges developers report facing mostly concern reproducing the flaky behavior and identifying the cause of the flakiness.
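A canonical flakiness pattern in this literature is the async wait: a test assumes a background task has finished after a fixed sleep. A self-contained sketch of the fragile pattern and the usual fix, blocking on an explicit completion signal instead of sleeping (the example is ours, not from the paper's dataset):

```python
import threading

done = threading.Event()

def background_task():
    # Stands in for work whose duration varies from run to run.
    done.set()

# Flaky pattern (shown as a comment, not executed):
#   time.sleep(0.001); assert done.is_set()
# fails whenever the thread happens to be scheduled late.

# Robust pattern: block on the completion signal with a generous timeout.
t = threading.Thread(target=background_task)
t.start()
finished = done.wait(timeout=5.0)   # waits only as long as actually needed
t.join()
print(finished)   # True
```

The fix removes the timing assumption entirely, which matches the survey finding that reproducing timing-dependent behavior is the hardest part of debugging flaky tests.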
Public preprint [http://arxiv.org/abs/1907.01466], data and materials [https://doi.org/10.5281/zenodo.3265785].
Article Search
SEntiMoji: An Emoji-Powered Learning Approach for Sentiment Analysis in Software Engineering
Zhenpeng Chen, Yanbin Cao, Xuan Lu, Qiaozhu Mei, and Xuanzhe Liu
(Peking University, China; University of Michigan, USA)
Sentiment analysis has various application scenarios in software engineering (SE), such as detecting developers’ emotions in commit messages and identifying their opinions on Q&A forums. However, commonly used out-of-the-box sentiment analysis tools cannot obtain reliable results on SE tasks, and misunderstanding of technical jargon has been shown to be the main reason. Researchers therefore have to utilize labeled SE-related texts to customize sentiment analysis for SE tasks via a variety of algorithms. However, such scarce labeled data can cover only very limited expressions and thus cannot guarantee the analysis quality. To address this problem, we turn to easily available emoji usage data for help. More specifically, we employ emotional emojis as noisy labels of sentiments and propose a representation learning approach that uses both Tweets and GitHub posts containing emojis to learn sentiment-aware representations for SE-related texts. These emoji-labeled posts can not only supply the technical jargon, but also incorporate more general sentiment patterns shared across domains. They, together with the labeled data, are used to learn the final sentiment classifier. Compared to the existing sentiment analysis methods used in SE, the proposed approach achieves significant improvements on representative benchmark datasets. Through further contrast experiments, we find that the Tweets make a key contribution to the power of our approach. This finding informs future research not to pursue domain-specific resources unilaterally, but to transfer knowledge from the open domain through ubiquitous signals such as emojis.
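The distant-supervision step described above can be pictured simply: any post containing a clearly emotional emoji receives that emoji's polarity as a noisy label. A toy sketch (the emoji-to-polarity table and function are invented for illustration, not SEntiMoji's actual labeling scheme):

```python
# Invented mapping from emotional emojis to noisy sentiment labels.
EMOJI_POLARITY = {"😄": "positive", "🎉": "positive",
                  "😞": "negative", "😡": "negative"}

def noisy_label(post):
    """Label a post by the first emotional emoji it contains, else None.
    Labels are noisy by construction: sarcasm and mixed emotions introduce
    errors that the downstream representation learning must tolerate."""
    for ch in post:
        if ch in EMOJI_POLARITY:
            return EMOJI_POLARITY[ch]
    return None

print(noisy_label("Finally fixed that race condition 🎉"))   # positive
print(noisy_label("This API keeps breaking 😡"))             # negative
print(noisy_label("Refactored the parser today"))            # None
```

The appeal is scale: emoji-bearing posts are abundant on Twitter and GitHub, so the noisy labels cover far more expressions, including technical jargon, than any manually labeled SE dataset.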
Preprint