How should compilers explain problems to developers?

This week I presented the final part of my dissertation, How should compilers explain problems to developers?, at the Foundations of Software Engineering conference in Lake Buena Vista, Florida. This work is relevant to communities, such as Elm and Rust, that are invested in improving the usability of the error messages their tools produce.

The abstract of the paper follows:

Compilers primarily give feedback about problems to developers through the use of error messages. Unfortunately, developers routinely find these messages to be confusing and unhelpful. In this paper, we postulate that because error messages present poor explanations, theories of explanation—such as Toulmin’s model of argument—can be applied to improve their quality. To understand how compilers should present explanations to developers, we conducted a comparative evaluation with 68 professional software developers and an empirical study of compiler error messages found in Stack Overflow questions across seven different programming languages.

Our findings suggest that, given a pair of error messages, developers significantly prefer the error message that employs proper argument structure over a deficient argument structure when neither offers a resolution—but will accept a deficient argument structure if it provides a resolution to the problem. Human-authored explanations on Stack Overflow converge to one of the three argument structures: those that provide a resolution to the error, simple arguments, and extended arguments that provide additional evidence for the problem. Finally, we contribute three practical design principles to inform the design and evaluation of compiler error messages.

Microsoft in Redmond

I’m delighted to announce that as of today, I’ve started the next phase of my research career at Microsoft. I’ll be working with the PROSE team in a dual role: as a Researcher at the intersection of programming languages and human-computer interaction, and as a Research Software Engineer to improve developer experiences across Microsoft’s software development tools.

Software Engineering Research for the Post-apocalypse

Designing for Dystopia

I presented our short paper, Designing for Dystopia: Software Engineering Research for the Post-apocalypse, at the Visions and Reflections Track in Foundations of Software Engineering conference at Seattle, Washington.

The abstract of the paper follows:

Software engineering researchers have a tendency to be optimistic about the future. Though useful, optimism bias bolsters unrealistic expectations towards desirable outcomes. We argue that explicitly framing software engineering research through pessimistic futures, or dystopias, will mitigate optimism bias and engender more diverse and thought-provoking research directions. We demonstrate through three pop culture dystopias, Battlestar Galactica, Fallout 3, and Children of Men, how reflecting on dystopian scenarios provides research opportunities as well as implications, such as making research accessible to non-experts, that are relevant to our present.

Check it out before the world ends.

The Bones of the System: A Case Study of Logging and Telemetry at Microsoft

Our full paper, The Bones of the System: A Case Study of Logging and Telemetry at Microsoft, has been accepted to the International Conference on Software Engineering, Software Engineering in Practice Track (ICSE SEIP 2016). ICSE is hosted this year in Austin, Texas.

The abstract of the paper follows:

Large software organizations are transitioning to event data platforms as they culturally shift to better support data-driven decision making. This paper offers a case study at Microsoft during such a transition. Through qualitative interviews of 28 participants, and a quantitative survey of 1,823 respondents, we catalog a diverse set of activities that leverage event data sources, identify challenges in conducting these activities, and describe tensions that emerge in data-driven cultures as event data flow through these activities within the organization. We find that the use of event data span every job role in our interviews and survey, that different perspectives on event data create tensions between roles or teams, and that professionals report social and technical challenges across activities.

I am delighted to have been able to collaborate with Microsoft Research for this study. Thanks to Robert DeLine, Steven Drucker, and Danyel Fisher, the co-authors of the paper.

Challenges in Using Event Data

Challenges in Using Event Data

Timeful: My E-mail Policy

Due to increased demands on my time and the need to minimize distractions, I am posting an official policy on my use of e-mail.

This policy is effective 11/27/2015, and is available at:

E-mails sent to my personal account or University account are now batched. This means that your e-mail will be intentionally delayed against one of the following time boundaries (all listed times are in Eastern time):

  • Monday through Friday: 9 AM, 1 PM, 3 PM, 7 PM. If you send an e-mail after 7 PM, I will receive your e-mail at 9 AM the following day.

  • Saturday and Sunday: 9 PM only. Weekends are reserved for time with my family.

If you need to contact me urgently, you may send me a text message at 251-454-1579.


Social Media Addendum

  • I check Facebook once a week, usually on Friday night.

I ♥ Hacker News

I Heart HN

Our short paper, I ♥ Hacker News: Expanding Qualitative Research Findings by Analyzing Social News Websites, has been accepted to the joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2015, New Ideas). ESEC/FSE is hosted this year in Bergamo, Italy.

The abstract of the paper follows:

Grounded theory is an important research method in empirical software engineering, but it is also time consuming, tedious, and complex. This makes it difficult for researchers to assess if threats, such as missing themes or sample bias, have inadvertently materialized. To better assess such threats, our new idea is that we can automatically extract knowledge from social news websites, such as Hacker News, to easily replicate existing grounded theory research — and then compare the results. We conduct a replication study on static analysis tool adoption using Hacker News. We confirm that even a basic replication and analysis using social news websites can offer additional insights to existing themes in studies, while also identifying new themes. For example, we identified that security was not a theme discovered in the original study on tool adoption. As a long-term vision, we consider techniques from the discipline of knowledge discovery to make this replication process more automatic.

Improving Error Notification Comprehension in IDEs by Supporting Developer Self-Explanations

Explanatory interface mockup, conceptualized in the Eclipse IDE.

My second graduate consortium submission, Improving Error Notification Comprehension in IDEs by Supporting Developer Self-Explanations, has been accepted to the IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) in Atlanta, Georgia. The abstract of the paper follows:

Despite the advanced static analysis techniques available to compilers, error notifications as presented by modern IDEs remain perplexing for developers to resolve. My thesis postulates that tools fail to adequately support self-explanation, a core metacognitive process necessary to comprehend notifications. The contribution of my work will bridge the gap between the presentation of tools and interpretation by developers by enabling IDEs to present the information they compute in a way that supports developer self-explanation.

You can compare this submission with my prior VL/HCC Graduate Consortium.

Fuse: A Reproducible, Extendable, Internet-scale Corpus of Spreadsheets

Fuse Distribution

Our paper, Fuse: A Reproducible, Extendable, Internet-scale Corpus of Spreadsheets, has been accepted to the 12th Working Conference on Mining Software Repositories.

Fuse is obtained by filtering through 1.9 petabytes of raw data from Common Crawl, using Amazon Web Services. See our Fuse Spreadsheet Corpus project page for details on obtaining the spreadsheets and using the spreadsheet metadata.

The abstract of the paper follows:

Spreadsheets are perhaps the most ubiquitous form of end-user programming software. This paper describes a corpus, called Fuse, containing 2,127,284 URLs that return spreadsheets (and their HTTP server responses), and 249,376 unique spreadsheets, contained within a public web archive of over 26.83 billion pages. Obtained using nearly 60,000 hours of computation, the resulting corpus exhibits several useful properties over prior spreadsheet corpora, including reproducibility and extendability. Our corpus is unencumbered by any license agreements, available to all, and intended for wide usage by end-user software engineering researchers. In this paper, we detail the data and the spreadsheet extraction process, describe the data schema, and discuss the trade-offs of Fuse with other corpora.

Microsoft Research Internship

Microsoft Research

I’m happy to announce that I’ve accepted an offer to intern at Microsoft Research this summer, from June 1 through August 21, in Redmond, Washington. I’ll be working under the direction of Rob DeLine, Principal Researcher and Group Manager of Human Interactions in Programming (HIP).

Human Interactions in Programming HIP works at the intersection of HCI, CSCW, and Software Engineering. The group uses a human-centered approach to develop tools to support software engineers and teams.

Can Social Screencasting Help Developers Learn New Tools?

Screencasting Tool Usages
Our short paper, Can Social Screencasting Help Developers Learn New Tools?, has been accepted to the 8th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE 2015). The workshop is colocated with ICSE, and will be hosted in Florence, Italy.

The lead author of the paper is Kubick Lubick. The abstract of the paper follows:

An effective way to learn about software development tools is by
directly observing peers’ workflows. However, these tool knowledge transfer events happen infrequently because developers must be both colocated and available. We explore an online social screencasting system that removes the dependencies of colocation and availability while maintaining the beneficial tool knowledge transfer of peer observation. Our results from a formative study indicate these online observations happen more frequently than in-person observations, but their effects are only temporary. We conclude that while peer observation facilitates online knowledge transfer, it is not the only component — other social factors may be involved.