Clarifying Meaning with Venn-Diagram Thinking and Concrete Scenarios

Recently, my team was discussing our team mission, strategy, etc. and the question arose whether we should write down our team’s values. My teammate George was skeptical: in reality, when your back is up against it, won’t the decisions you make be based on whatever it takes for your team to survive, rather than some high-minded set of written values? My response was, “What if you replaced our whole team with a brand new set of people and gave them the same mission? Would they make the same decisions we do? If not, doesn’t that say that our values are something distinct from our mission?” You could see the light bulb go off in his head, and he immediately said yes, we should write down our team’s values.

After the meeting, George pointed out that this is a strength of mine: to put an idea into such a clear perspective that what was once a confusing and controversial idea becomes clear and easy to come to consensus on.

This is a skill I picked up in grad school, but I can’t say I learned it explicitly. It was just something that senior folks in the department did, and the rest of us picked it up slowly through osmosis. We gradually learned that being able to put your finger on an important part of a problem was a kind of superpower, and could accelerate progress towards key insights. I distinctly remember when one of the senior professors in the department (Mitch Wand) told me I had asked a really good question during a research talk and I realized I had picked up this superpower too.

On reflecting on what this skill even is, I think it’s this: I clarify meaning through Venn-diagram thinking and concrete scenarios.

Let me explain with an example. Suppose I’m a software engineer in a meeting with my team, and the tech lead says “From here out, our focus is on reliability”. Immediately my internal ambiguity alarm starts firing: this is a vague statement and can mean very different things to different people. Should we drop any and all work that is not reliability-focused? Or should we adjust our ratio of work so that, say, 75% of our work is on reliability but we still leave 25% for new features? 

I think of these various interpretations as overlapping sets in a Venn diagram (see figure 1): in many situations, the decision is the same (either way, at least 75% of the work is on reliability), but not always.

Figure 1: Overlapping interpretations of the same statement

So the goal is to figure out which of these overlapping sets they’re referring to: A or B. And the way I do that is to think of a concrete scenario that’s in A, but not in B (see figure 2). So in this situation I might ask the tech lead, “Focusing on reliability can mean a lot of different things. For example, that new feature the product manager just asked for is a top-priority request from one of our largest customers, but it has nothing to do with reliability. Are we still going to work on it?” Once I’ve put my finger on a concrete example of the ambiguity in the statement “our focus is on reliability”, the conversation usually moves on to a productive discussion of what is really meant, so that the whole team ends up sharing the same interpretation.

Figure 2: A concrete scenario in one interpretation but not the other

Once I learned how to apply this skill, I started to see just how many situations it applies to: defining team priorities, writing a mission statement, explaining the overarching philosophy in a design, assigning ownership of project components, and more.

A key part of the skill is detecting that there’s some ambiguous meaning to clarify in the first place. Unfortunately, I don’t have any advice better than to apply the above method in your head as you listen to someone describe an idea and notice whether something seems to be missing. See if there are concrete scenarios where a statement might have multiple interpretations, and if so, try to tease out which one is meant.

Even though it wasn’t the direct goal of my PhD, this technique was one of the most important skills I picked up in grad school. But no one ever sat down and explained it to me, so it took a long time to learn. I hope that laying out the process I follow here helps others pick up the idea faster. I really do believe this is a big point of leverage for senior folks in tech, so hopefully it’s useful to many of you out there.

Career Mission: Reasoning About Programs

For almost as long as I’ve been programming, I’ve had a gut feeling that writing software should be easier. But I couldn’t name this feeling and didn’t understand it. I couldn’t describe it to others. So for a long time, this vague idea languished and remained as nothing more than a feeling.

Over the years, this feeling has grown into a calling. It’s something I have a hard time not thinking about or working on. Now that I’ve been done with my PhD for a while, I decided it was time to understand this idea and figure out exactly what it was that I wanted to work on.

This post is my attempt to transform that gut feeling into a coherent idea that I can explain to others: specifically, a “mission statement” for my career. There’s no better tool than writing for turning muddled thinking into crisply defined ideas, so that’s what I’m doing here.

Secondarily, I’m also writing this post in the hopes that other people might have tips to share on how to do this kind of work in industry, or pointers to topics I should look into.

My career mission is “to give software developers knowledge and tools to help them reason about their programs”. Read on if you’d like to know what I mean by this, why it’s where I want to focus my life’s work, and what I see as my next steps.

What This Mission Means

Let’s unpack some of the terms in that mission statement.

“Reason About”

First, what exactly do I mean by “reason about”? I mean that a programmer should be able to hold some model of their program in their head that helps them understand how the program works and predict what it will do. This is effectively what is known in science as “predictive power”.

To make this concrete, here are some examples of the kinds of things that help a programmer reason about their program in the above sense (a short code sketch follows the list).

  • Lexical scope: helps the programmer determine what variables are in scope with just a simple scan of the abstract syntax tree.
  • Structured programming: enables the program to be subdivided hierarchically into “chunks”, so that the programmer can infer facts about individual chunks and then compose those facts into a single fact about the larger whole (i.e. it enables “compositional reasoning” about program chunks).
  • Static type systems (especially sound ones): automatically check the programmer’s intent against what they wrote, ensuring that certain kinds of errors can never happen at run time.
  • Rust’s borrowing system: prevents certain classes of errors not caught by traditional type systems, like use-after-free and data races (with the exception of code marked with “unsafe”).
  • Abstraction and information hiding: enable the programmer to predict how some chunk of code interacts with the rest of the program by reasoning only about that chunk’s interface and specification, rather than its entire implementation.
  • Proof assistants: automate some of the deductive reasoning about programs to help the programmer prove certain theorems about their programs. The theorems might be freedom from a certain kind of error, or a guarantee that the program implements a specification, or anything else the proof assistant’s logic supports.
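
As a tiny Rust sketch of the kind of predictive power I mean (this example is my own, not tied to any one item above): because the return type below admits the possibility of failure, the compiler forces every caller to handle it, so I can predict, without running anything, that a missing value never silently flows into the rest of the program.

    // Sketch (my own example): the Option return type is part of the model in my
    // head. The compiler rejects any caller that forgets the None case, so the
    // "value might be missing" scenario can never be silently ignored at run time.
    fn parse_port(s: &str) -> Option<u16> {
        s.trim().parse().ok()
    }

    fn main() {
        match parse_port("8080") {
            Some(port) => println!("connecting on port {}", port),
            None => println!("not a valid port number"),
        }
    }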

Conversely, here are some examples of tools and practices that a programmer might use to gain confidence in their program but that don’t enable the kind of reasoning I want to work on.

  • Testing (e.g. unit tests, integration tests, load tests, etc.): tells the programmer whether the program has some desired behavior, but only after executing it. An over-reliance on testing leads to a kind of guess-and-check style of programming, where the programmer writes some code without understanding it, then just runs the test to see what happens. (I’ll talk more below about why I think this is a bad idea.) You see this most often with students just learning to program, but occasionally in professional programmers, too.
  • Runtime verification: checks for certain error conditions while the program is running, but by the definition of “runtime” it can’t predict what will happen.
  • Model checking: predicts whether the program will obey some specification (e.g. the program will never deadlock), but doesn’t necessarily help the programmer understand why that’s the case. If the only possible results from running a model checker are just “yes” or “no” (or “unable to determine”), then relatively little information is gained. In this sense, model checking is similar to testing in terms of the reasoning ability it provides: it can say something is wrong in the program, but it can’t say where the problem is. (Some model checkers provide concrete counterexamples in the “no” case, which might change how I categorize them.)

These examples and counterexamples are separated mostly along static/dynamic lines, but model checking is an exception to this rule. It’s generally considered a static technique, but it doesn’t increase a programmer’s understanding of their program in the way I’m looking for. It might be that what I’m looking for is really static, compositional reasoning, but I need to think through some more examples before I’m convinced that that’s true.

“Knowledge and Tools”

Now that we’ve covered what I mean by “reason about”, let’s talk about the “knowledge and tools” aspect of this mission statement. I wanted to explicitly call those out as distinct concerns because while we know that tools can help a programmer reason about their program, it’s equally important to know that their ability to reason about their program is also determined by how they design and write their program. Some program structures are much easier to reason about than others (consider spaghetti code versus a well-structured program), and some tools work only if the program is written in a certain way. Programmers need to know how the design of a program affects reasoning ability, and that can’t come (exclusively) from a tool: programmers have to know this from the start and use this knowledge while creating their programs.

Concretely for me, listing both “knowledge” and “tools” in my mission statement reminds me that this mission involves not just tool-building, but a certain amount of teaching as well.

Why This Particular Mission?

Software developers need to reason about their programs in order to do any kind of meaningful work. While it’s sometimes possible to write code that “works” without really understanding it (e.g. with the “guess-and-check” approach mentioned in the above bullet about testing), that’s usually a bad idea. Here’s why:

  • Test cases can tell you that your program works for those specific cases, but without a predictive model, you can’t be sure that the program will do the right thing in all cases.
  • If you ever need to modify your program, you’ll have no idea what should be modified unless you understand how it works.
  • When something goes wrong in production (as it inevitably will), you’ll need to understand how the program is supposed to work if you want to have any hope of finding the root cause of the issue.

By default, we try to do this kind of reasoning in our heads. For the really tough problems, we might break out the pencil and paper. That’s how it happened historically, and that’s how most of us do it when starting out today.

But as Fred Brooks wrote in No Silver Bullet in 1987, “software entities are more complex for their size than perhaps any other human construct.” Douglas Crockford made a similar observation in JavaScript: The Good Parts in 2008, saying “computer programs are the most complex things that humans make.” And not only that: programs are getting more complex every year, with the shift from single-machine to distributed programs, programs that have to run on heterogeneous platforms (desktop computer, mobile phone, smart TV), and other advances in modern software.

Human reasoning alone can’t keep up with this rise in complexity. Some people are able to hold more information in their head than others, but we all hit our limit at a certain point. Software complexity will only continue to surpass the ability of the human mind to reason about it.

Computers, on the other hand, are designed precisely for these kinds of heavy-but-mechanical symbol-processing tasks. Even a run-of-the-mill laptop today is orders of magnitude faster and less error-prone at this kind of work than the best human reasoner. We need computers to automate some of this work for us if we want to have any hope of continuing to reason about modern-day software.

The good news is that there is a lot of untapped potential in this space. The last couple of decades have seen a number of advances from formal-methods/programming-language/software-engineering research trickle down into industry, like SAT-solver-based verification and more advanced type systems such as the ones in Rust and Haskell. Other techniques like dependent type systems appear to be becoming more usable, and there seems to be a growing interest in this kind of work from software developers. I expect this trend to continue.

My hope is that in some future year, a programmer will be able to sit down at their desk and write a program in concert with a kind of uber proof assistant. The programmer would set goals and guide the overall direction of the work, while the proof assistant would take care of the mechanical reasoning tasks and answer the programmer’s questions about their code. By offloading the heavy burden of reasoning over to the computer, the programmer’s brain would be free to take on the more creative and intuitive tasks. Thus the human and the machine would complement each other’s skills, leading to an end result that neither could have achieved on their own. I’m heavily inspired in this by J. C. R. Licklider’s Man-Computer Symbiosis and Thomas A. Limoncelli’s Automation Should Be Like Iron Man, Not Ultron.

What Are My Next Steps?

Even with the specific definitions used above, my mission statement is still pretty broad. That’s intentional: I’m trying to identify the kind of work I want to do and that I think would be valuable to the world, while remaining able to adapt that purpose to different environments and contexts depending on where I am in my career.

That said, there are a number of general concepts in this area that would be good to know regardless of the specific project I’m working on. Here are a few things I plan to do in the short-ish term:

  • Finish reading The Little Typer and do the exercises, as a way to learn the basics of dependent types.
  • Read (or at least skim) Ringer et al.’s recent survey of proof engineering, QED at Large.
  • Learn about compositional reasoning in general, and specifically how to design programs that are amenable to it. I’m sure there’s a ton of work in this area, but I’m hopeful there are some common/fundamental ideas I should know about (suggestions?).
  • Maybe experiment with a theorem prover like Coq or Agda, since I don’t actually have that experience yet. I’ve heard that Software Foundations and Programming Language Foundations in Agda are both good places to start.

Finally, I’d also like to figure out how to do this kind of work in industry. Academia wasn’t the right environment for me, and I prefer to build useful tools that get into developers’ hands quickly rather than seek out fundamental truths that can take a long time to find their way into practice. But much of my mission sounds more like a classical research program than an industry project, so I’m still figuring out how to actually do this work outside of the typical academic setting. Perhaps there’s room for a kind of tech-transfer role in this area, but I don’t know what that looks like yet or how to make it happen. (Note that I intend to stay at Google for the time being, but who knows what could happen in the longer term?)

Do you have suggestions for additional topics I should learn about? Ideas on how to do this kind of work in an industry setting? Let me know! I’m @SchuCodes on Twitter (I’ll start a thread about this post), or you can just email me at jonathan.p.schuster@gmail.com. Thanks!

Reading Classic Software Engineering Research in Industry: Successes and Challenges

Over the summer, I started a group at my local Google office to read through the papers in Seminal Papers in Software Engineering: The Carnegie Mellon Canonical Collection. This is a collection of about 25 of the most influential papers (and other writings) in software engineering research, curated by the Carnegie Mellon software engineering faculty. The goal of the reading group is to get a sense for what the academic side of software engineering can teach us practitioners and what universal truths we can extract and use in our work. (As a side benefit, it’s also been a great forcing function for me to read more papers in my post-academic life.)

Having run the group for half a year now, I figured this was a good time to stop and reflect on how it has gone so far. (Disclaimer: what follows are my opinions alone, not necessarily those of the group.)

How It Works

Every month we read the next paper in the list ahead of time, then we meet to discuss it. The reading order is based on this Twitter conversation with Jonathan Aldrich, which is roughly the order he and others use when they teach a class based on this list. To add some variety, every 6 months the group picks a paper not in the CMU collection (preferably one with an author from an under-represented group) and discusses that instead.

To help jump-start the discussion, we started posting our questions about the papers ahead of time on an internal tool called Dory. Dory lets everyone up-vote or down-vote each question, so it provides an aggregate signal for what the group would like to discuss. If the conversation during the meeting comes to a lull, we can use some of the top questions from Dory to keep things going.

What Has Gone Well

  • Posting questions on Dory ahead of time has helped give us a collective sense of our general impression of each paper and where we got confused. It has also helped structure and guide the conversation overall.
  • Reading these papers has gotten me thinking more about some fundamental questions in software development, like how to better divide programs into chunks and create clean abstraction boundaries between them. This probably affects my day-to-day work more than I realize, at least at a subconscious level.
  • I’ve been exposed to a new paper (Dilemmas in a general theory of planning) that I really like. It introduces the notion of a “wicked problem”: basically, a social problem that can’t easily be solved by simple math and engineering optimizations. It touches on a lot of ideas, such as how we even define success for such a problem, how we get feedback while working on an ambiguous problem, and how we make plans in these situations. I’d like to read it at least one more time to really wrap my head around it.

Challenges So Far

  • We usually don’t know much historical context around each paper, and without that it’s sometimes difficult to understand what makes that paper significant. For example, in the Levels and Abstraction chapter from Siewiorek et al.’s Computer Structures, we struggled to extract a more meaningful message than “organizing your system into a hierarchy helps with abstraction”. Was that less obvious to a developer back then than it is today? Or did we perhaps miss some more subtle and insightful message in that reading? The editors’ notes in the CMU tech report help with this a little, but without someone more well-versed in the area guiding us through, it’s hard to get more than a shallow reading of some papers.
  • Similarly, it’s sometimes hard to see why a paper might be important to us as working software engineers, rather than to researchers in software engineering. And maybe sometimes it isn’t relevant to us at all: after all, this list was originally intended to introduce new researchers to the field. Tony Hoare’s Proofs of correctness of data representations was one paper where we had this difficulty. While you can generalize some of the principles there to reasoning about correctness in general, none of us in the group use formal methods day-to-day (aside from automated tools that hide the details), and we never have to do something this rigorous. Without a clear value proposition for working developers, it was hard to muster the motivation to wade through all of the formalisms in that paper.
  • Our group-chosen paper, Margaret Hamilton’s Universal Systems Language: Lessons Learned From Apollo, turned out to be something of a dud for us. Partially this was because of the two previous points, but it also just seemed like a poorly written paper that never quite explained what kind of thing the Universal Systems Language even is (a programming language? A specification language? Something else?). While we want to add some variety beyond the CMU-curated list of papers, it’s also tricky for us to select a good paper ahead of time without knowing the space better.
  • Less significantly, some of the readings are hard to find from a legitimate source. We eventually found “Proofs of correctness of data representations” in a compilation of Hoare’s papers, and we ended up skipping a couple of book chapters, since it didn’t make sense for each of us to buy multiple textbooks just to read a chapter or two from each. I don’t think this will have a very negative effect in the end, since we can still find most of the readings, but it’s unfortunate that we might miss out on a few of them.

Conclusion

All in all, I’m happy that running this group is forcing me to read more research papers, and I’m glad that it’s exposing us to some new ideas. But at the same time, I’d like to figure out a way to extract more value and find deeper insights from these papers than we are right now.

I wonder whether tips for running Great Books programs could help here. I’ve never been through a Great Books program myself, but this feels similar in that we’re studying the original sources for big ideas and trying to wrap our heads around them. Advice for people who go through a Great Books program without a more experienced guide might be relevant.

If you have suggestions, I’m happy to hear them!

P.S. to Googlers: I’m planning to open this group up to people outside the US-CAM office. If you’re interested, check out go/cam-se-research-reading-group.

Highlights from Curry On 2019

I had the privilege to attend Curry On in London earlier this week, and it was a great conference. The idea of the conference is to get people from both academia and industry together to meet and talk about programming languages and related technologies.

For those who weren’t able to attend, I thought I’d summarize my favorite talks. You can also watch the recordings; I link to the video for each of my highlighted talks below.

Simon Marlow: Glean: Facts About Code

(video)

Simon Marlow (of Haskell fame) talked about the system Facebook is building to host a repository of information about their code, called Glean. The idea is to put all information about the code into one place: the code itself, type/static analysis information, test results, bugs, and more.

The central organizing idea of Glean is that it stores “facts” about the code. Whenever the codebase changes (e.g. someone commits a change), new facts are computed, and those new facts are stored as “diffs” against the old facts (this reminds me a lot of how I understand git works). Thus, Glean is an incremental database.

They’re currently using Glean at Facebook for dead-code detection, code navigation, and other miscellaneous IDE features. It sounds, however, like they want to push this idea to the limits and see how much value they can get out of making a centralized knowledge base for their code. I personally think our current tools don’t do nearly enough for us in terms of helping us understand our code, so I’m curious to see what comes out of this in the future.

Frank Luan and Celeste Barnaby: Using ML for Code Discovery at Facebook

(video)

Frank and Celeste presented on some of the ways Facebook is using machine learning to give developers more information about their code (and once again I’m happy to see more people working on this). My major takeaway was about their “neural search” work. The idea is that a process crawls over the entire Facebook codebase and records all “words” found in the code (where a “word” can either be a word found in comments, or a part of a variable or function name). Then after passing that information through some sort of neural network, they have a system that can determine what snippets of code are likely relevant to what words (even words that might not appear around that snippet, just based on related words found elsewhere in the codebase).

In the end, they have an automated system that can respond to Stack-Overflow-like queries like “how do I open a TCP socket?” with a relevant snippet of code. From what I recall, they said the code snippet is relevant about half the time. So that sounds like something you wouldn’t want to rely on exclusively, but I could see such a system complementing the human-sourced answers in Stack Overflow with automated answers of its own.

Edwin Brady: Idris 2: Type-Driven Development of Idris

(video)

I’ve heard Edwin’s talks are great, but this was my first chance to see one. He didn’t disappoint. This is a guy who clearly just really enjoys working on this stuff, and that energy and sense of fun really comes across in his talks.

The talk at Curry On was about Idris 2, the successor to his dependently typed language Idris. The big feature he focused on was type-driven code synthesis. So if you have a function that takes as input some argument of an algebraic data type (say, a List type with the usual Cons/Nil definition), the underlying Idris 2 system can generate the case-split for you, then search for an expression that fits the expected return type. In certain cases, depending on the type, that resulting code might be exactly what you want.

According to Edwin, this pattern of case-split-then-search-for-code is quite common. He’d like to see if he can automate much of this process into a kind of “programmable tactics” for writing code. Hard to say if that would go anywhere or not at this point, but it’s an intriguing idea that sounds valuable if it works out.

He admits that the programs he’s demoing are completely toy programs, so who knows if this approach would work for programming larger programs, or even moderately sized ones. It was a really fun talk, though, and one that gave me yet another good reason to learn about dependently typed programming.

Adelbert Chang: Microservice Architecture: A Programming Languages Perspective

(video)

Adelbert started off by talking about the kinds of problems one must handle when deploying and managing services in a microservice architecture. For example, how do you make sure all of the dependencies for your service are ready to go when you deploy your service? How do you handle service maintenance, or network partitions? How do you update a service’s message format without breaking the services that rely on the previous format?

The solution he’s been using is a project called Nelson, which is a kind of deployment management layer/runtime for your whole microservice system. Whenever you bring up your system, Nelson knows about the dependencies between different services and can bring them up in the right order. Similarly, when you want to upgrade a service, it can bring the service dependencies up/down in the proper order without the programmer having to think about it. If a service dependency is no longer being used, the management layer notices and shuts it down automatically.

Adelbert pointed out that much of this system is similar to what you get from a language runtime like the JVM. For instance, the part about removing unused services is just like garbage collection. The philosophy here, as I understand it (and one I agree with), is that we can get a lot of leverage by treating the entire “system” as one big program and applying analogous techniques from the language-implementation world, like garbage collection, packages, and type-checking.

Adelbert is currently looking into interface description languages, or “IDLs”, to add better notions of type-checking between services to these systems, including subtyping (which is crucial for upgrading services). He also mentioned a paper Verifying Interfaces Between Container-Based Components based on some of these ideas, and he’s working with some of the authors of that paper to take the ideas further.

I’m really excited about this space. Reasoning about distributed systems using techniques from programming languages was the general theme of my dissertation, so I’m really glad to see other people working on these problems. I’m excited to see what they’ll come out with next.

Lin Clark & Till Schneidereit: Bringing WebAssembly Outside the Web with WASI

(video)

Lin and Till capped off the first day with a keynote on the WebAssembly System Interface, or WASI. The thesis here is that two of the biggest benefits of WebAssembly are portability and the sandboxed security model. Those benefits aren’t restricted to the web alone, though: desktop applications can benefit, too. This is the idea of WASI: it’s a simple interface for running WebAssembly programs outside the context of a web browser and allowing supervised access to operating-system components like the file system or network connections.

I’m especially glad to see the object capability model taking hold in this space. For those not familiar, the idea of object capabilities is that to access something like, say, the file system, you must hold a capability token that grants you that access. Typically, the user running the system grants the program certain capabilities (much like the permissions granted to an app on an Android phone), and then the program has to hand off those tokens to any dependencies that need to access the relevant systems. This avoids the “ambient authority” problem, where a malicious component can gain access to systems it shouldn’t have access to merely because it’s running as part of a program executing in the context of some particular user.

One slogan I found useful was that we should be able to “read the label” on a piece of software in order to know what it does. Today typically the only way of truly knowing what a program might try to do is to read the code, which doesn’t scale. This nutrition-label analogy sounds like a good approach and gets the idea across.

Aside from the technical content of the talk, I continue to be astounded by Lin’s skill at conveying complex technical information. Having spent much of the last two years writing about my research and having put together several different research/technical talks during grad school, I have a deep understanding of how difficult it can be to find a way to communicate the ideas you might have in your head in a simple way. Lin’s slides are beautifully simple and tell the story using a narrative and “characters” that just about anyone can connect to. She’ll be a role model for me the next time I have to give a talk, and I recommend everyone check out more of her cartoons on web technology at Mozilla Hacks.

Cynthia Solomon: Logo, A Computer Language to Grow With

(video)

Cynthia started off day 2 with a fascinating keynote on the language Logo and some of the works it went on to inspire. A large part of the idea of designing Logo (and the curriculum around it) was to focus on the general underlying ideas of computing and ways to teach them rather than surface issues of a particular language: things like procedural thinking, debugging, project-based learning, talking about what you are doing, having objects to play with, and anthropomorphic thinking. Logo was not just a programming language, but a culture: a way of thinking about computers and learning how to talk about what you’re thinking about.

If you haven’t heard of Logo before, the general idea is that you control a little robotic “turtle” and tell it to walk forward, turn, walk backwards, etc., while also giving commands to raise or lower a pen so that the turtle is drawing images as it goes. What I had never realized, but I suppose should have been obvious in retrospect, was that the turtle was an actual, physical object! Cynthia showed videos of the turtle, which looked like a big upside-down bucket on wheels, drawing a picture on a piece of paper while the students who wrote the program watched their results come to life. I had always assumed the turtle was just a metaphor and/or something you saw on-screen in a simulator, so it was literally a revelation to see videos of the actual turtle running around.

Cynthia went through a lot of the history around Logo, both before and after. I don’t have time to recap it all, but I recommend watching the video if you’re interested. Some of the projects that Logo went on to inspire were a “Lego Logo” robotics system from Lego (this must have either been Mindstorms or some precursor), ButtonBox which gave 3-year-olds a board of buttons to use to program a turtle robot, and the popular learn-to-program language Scratch.

Tomas Petricek: The Search for Fundamental Software Engineering Principles

(video)

Spoiler alert: this was my favorite talk of all of Curry On. Tomas asked the question “what should we be teaching students about software engineering that will still be relevant in 100 years?”. It’s a hard question. In computer science, we at least have some fundamental ideas of computing like Turing machines, the halting problem, and Big-O analysis that underpin much of the field, but when it comes to actually doing something with that knowledge (i.e. software engineering), we lack the same kind of fundamental ideas. In fact, it’s hard to say anything concrete about a piece of software at all: how long will it take to build? How much will it cost? Will it produce the right answers? If not, how often will it be wrong? Can it crash? Most of these are not “formal” questions in the sense that questions about science or math are formal, so they’re much harder to answer definitively.

As a result, to the extent that we teach students anything in school about software engineering, it tends to be based on trends that may or may not be around in 20-30 years (e.g. design patterns, tools like git, the agile methodology, language peculiarities, etc.). I would quibble a little with this and say that many schools at least try to teach some more fundamental principles or ideas, but since none of us really know what those are, about the best we can do is teach what we have found from experience to be “best practices” in industry.

Tomas’ idea is that we should start by studying the context in which many of these best practices developed. In many cases, the decisions made sense only given that context, so we should try to understand that context before adapting the practices to our own situation.

Tomas then got into the complexities of software, and where they come from. An insightful question he presented was “which is more complex: building a chess program that can defeat a grandmaster, or writing an accounting system that can calculate VAT in the UK and France?”. Research often focuses on the former, because although it’s a hard problem it’s easier to specify, while the latter is more common in our day-to-day work. Much of the work that goes into software engineering is about managing this complexity, and finding ways to mitigate the risk it entails. My overall sense was that Tomas sees the way forward as finding ways to tame this complexity and risk to a manageable point, rather than completely eliminating it. He seems to believe (and I’m generally convinced) that software engineering will likely never be a problem that can be cleanly defined like the principles of computer science mentioned above, but we can study what kinds of issues happen in the aggregate (e.g. broken deadlines, security breaches) and find ways to deal with them.

Overall, I’d say his thesis was that we should be studying software engineering more like a social science than a physical one, and using the techniques from those fields. In particular, he emphasizes understanding the history and trying to find useful metaphors from other fields like urban planning.

On reflecting on this later (and talking to Tomas after the talk), I wondered if we do yet have any software engineering principles that will always be relevant. My thought was that abstraction is such a principle. Time and again, over several generations of programming languages, we see abstraction as the key mechanism for reasoning about software without understanding every tiny detail. Abstraction is what allows software to scale without collapsing in on itself, so I believe we’ll still be teaching about abstraction 100 years from today.

All in all, this talk was near and dear to my heart. I’ve long felt that teaching people to be software engineers by having them major in computer science is like asking mechanical engineers to major in physics. Yes, you need to understand the underlying science to do your work, but the work requires a much better knowledge of how to apply the ideas of that science in practice, and what can go wrong in the real world. I hope more researchers follow Tomas’ lead and come up with better answers for us.

Tudor Girba: Moldable Development

(video)

Tudor gave a description and demo of the Glamorous Toolkit that he and his team use for what he calls “moldable development”. The idea is that as software engineers, much of our time is spent not on writing code per se, but on trying to understand what has already been written. The only method most development tools provide for doing that is reading the code, line by line. According to Tudor, this doesn’t scale. Instead, we need a moldable development environment: one that allows us to write the tools we need to understand our particular programs in a quick and easy way.

Tudor showed several examples of this, such as a color picker for visualizing different colors, a tree diagram, and a graph + arrows diagram for visualizing the flow of messages. Key to all of this is that creating these tools is easy and lightweight. The toolkit is built on top of a Smalltalk environment, which is itself built with this kind of idea in mind. The video of Tudor’s talk gives a much better idea of the kinds of tools and development methodology he’s proposing than I can describe in words alone.

I like the idea, but I’m skeptical that this approach can really make the understanding of code that much faster. Most of his demos seemed to be about visualizing data, but when I’m trying to understand a piece of code, I find that it’s usually the algorithm that’s hard to understand. Algorithms don’t usually have an obvious visual representation in the way data does, so I’m not convinced the moldable approach works there. That said, (a) I’d be happy to be proven wrong, and (b) this idea would still be great for visualizing lots of different kinds of data.

Alastair Donaldson: GraphicsFuzz: Metamorphic Testing for Graphics Shader Compilers

(video)

Alastair gave a talk on metamorphic testing, a term I had heard before but had never really understood. It starts with the problem that it’s often difficult to come up with an extensive list of test cases for something like a compiler, especially if you’re really trying to eliminate every bug. Metamorphic testing offers a complementary approach: instead of trying to come up with test programs and expected results manually, start with some known program, then modify it in a way that should produce an “equivalent” result. This might be something like wrapping a block of code with an “if” test that’s always true, or expanding a number into some expression that computes that number. If you compile and run both the original program and the mutated one, and they produce different results, then you have a bug in your compiler.
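
As a toy sketch of the idea (written in Rust rather than shader code, and far simpler than what GraphicsFuzz actually does; the function names are mine):

    // The mutant wraps the same body in an always-true "if" and expands the
    // constant 1 into an expression that computes 1, so the two functions
    // should agree on every input.
    fn original(x: i32) -> i32 {
        x * x + 1
    }

    fn mutant(x: i32) -> i32 {
        if x == x {
            x * x + (3 - 2)
        } else {
            unreachable!()
        }
    }

    fn main() {
        for x in -1000..=1000 {
            // Any disagreement points to a bug in whatever compiled or ran them.
            assert_eq!(original(x), mutant(x));
        }
    }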

The benefit of this approach is that it’s easy to generate lots of such “equivalent” programs randomly, so a computer can easily come up with a large number of test cases to stress test your compiler. Alastair and his team used this approach to test shader compilers for Android from a number of different vendors, and they found many bugs using this approach.

I liked that this approach offers an easy way to explore many corner cases of programs that manual generation of test cases would otherwise overlook. The approach isn’t necessarily applicable to all scenarios, and I don’t know that I have a way to use it in my work today, but it’s a tool worth keeping in my testing toolbox.

Nate Foster: Hyperproperties

This was actually a talk at the associated Papers We Love event the day before the conference itself started. Nate gave a great talk on a relatively recent (2010) idea called “hyperproperties”. The idea is a generalization of the notion of “properties” from the world of model checking. If you’ve heard of safety properties and liveness properties before, then you’ve got the right idea.

Suppose we can model the execution of some process as a (possibly infinite) sequence of states. Think of each state as the state your program is in when you run it in a debugger and click the “next step” button. I’ll call this state-sequence a “trace”. We might want to specify what it means for a trace to be “good”: e.g., whether it always avoids deadlock between two threads, or whether every incoming request is eventually processed. For every property like that we might come up with, there is a corresponding set of traces that have that property. Thus, a trace is “good” if it’s in the set described by that property, and so the set of traces effectively is the property.

Researchers have studied this framework for describing properties for the last several decades, and it has proved useful for verifying certain kinds of program behavior. In particular, the most commonly used kinds of properties are safety properties, which say that some particular bad thing never happens (e.g. “this program will never crash”), and liveness properties, which say that some particular good thing eventually happens (e.g. “Function foo always terminates”).

Some things we’d like to say about our programs can’t be expressed as properties of individual traces, though. For example, maybe you have a latency requirement that 99% of all requests are handled within 100 ms. No single trace can satisfy or violate that requirement on its own; instead, this kind of behavior is defined in terms of a set of traces. This is the idea of hyperproperties: a hyperproperty is defined as the set of all sets of traces that exhibit some behavior of interest.
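
In symbols (the notation here is mine): if Ψ is the set of all possible traces and we model a system by the set T ⊆ Ψ of traces it can produce, then

    % A property is a set of traces; a system satisfies it when all of its
    % traces are in the set. A hyperproperty is a set of sets of traces; a
    % system satisfies it when its whole trace set is one of those sets.
    \[
      P \subseteq \Psi, \qquad T \text{ satisfies } P \iff T \subseteq P
    \]
    \[
      H \subseteq \mathcal{P}(\Psi), \qquad T \text{ satisfies } H \iff T \in H
    \]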

This work is still in a relatively early stage. Most of the existing work, as I understand it, is around formalizing this idea, coming up with theories around hyperproperties, and thinking about how they might be used. For example, there’s work in the world of properties that says that every property is the intersection of a safety and a liveness property, and it turns out that a similar result holds for hyperproperties.

I’m curious to explore hyperproperties further and see whether there might be interesting applications, though. I’m working at Google now, and a big concern in the distributed system I work on is the SLOs (“service level objectives”) that have a similar flavor to the 99% latency requirement above. Perhaps there could be some value in thinking of these SLOs as hyperproperties, although it could equally be the case that the work is still too young to be applicable in such a scenario.

Conclusion

I had a great time at Curry On, as well as the colocated Papers We Love event. The conference had the perfect mix of academic and industry talks and participants, and I saw and heard a lot about how to make software engineering better. We may not have solved every issue, but I hope there are many more events like this in the future so those of us who care deeply about these issues can get together, talk, and try to improve the situation a little bit at a time.

What is Type Soundness?

UPDATE (26 March 2017): In class, most people said the below example is not an example of unsoundness because the program will throw a well-known exception rather than allow undefined behavior. Whether this is sound or not may depend on your point of view, as Cornell professor Ross Tate explains, although the short version is that the Java designers generally consider this sound. I may discuss the definition of soundness we used in class and explore this apparent gap between “the type system guarantees what you expect from the annotations” and “no undefined behavior occurs” in a future post, but for now I recommend waiting for a summary of our class discussion on the PRL blog.

As part of this semester’s History of Programming Languages course at Northeastern, Matthias tasked the students (and people like me who just sit in from time to time) to figure out how they would explain type soundness to the typical software developer who’s not familiar with programming languages research or formal semantics. The in-class discussion is tomorrow, so watch the PRL blog for a summary of the full class discussion. In the meantime, here’s what I came up with:

Type soundness is a formal means of proving that your language’s type system won’t “lie” to you. For an example of what I mean by lying, take the following Java snippet:
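
(A minimal version of the kind of snippet I mean; the names strings and s are just for illustration, while objects is the variable discussed below.)

    String[] strings = new String[1];
    Object[] objects = strings;   // the type-checker allows this: Java arrays are covariant
    objects[0] = 42;              // line 3: insert an integer into the array
    String s = strings[0];        // retrieve that item and treat it as a String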

This snippet type-checks, but it appears to do something weird: line 3 inserts an integer into an array, yet when we retrieve that item from the array, somehow we can magically treat it as a string! That shouldn’t happen: the whole point of the type system is to ensure that if we have a variable of type String, then it definitely holds a string.

It turns out that at run-time, when an item is inserted into an array, Java will check that the type of the item matches the actual type of the array and throw an ArrayStoreException if they don’t match.

This is an example of how Java’s type system is unsound: the type-checker confirms that “objects” is an array of Objects (into which we expect to be able to store any kind of object), but in reality it can only store strings.

A sound type system is one that guarantees a type-related error will not occur if the program type-checks. Exactly what a “type-related error” is depends on the particular language, but generally it means applying an operation to the wrong kind of arguments, such as inserting an integer into an array of strings, or concatenating a boolean onto a string.

Sound type systems reduce the amount of reasoning programmers have to do by ensuring that certain kinds of executions will never happen. An unsound type system, on the other hand, still helps find bugs, but the programmer can’t rely on the type-checker to prove that no type-related bugs exist in their program.

Getting Started in Programming Languages

A student in Northeastern’s Fundamentals of Computing course, in which I’m a TA, recently asked me how to learn what the “state of the art” is in the study of programming languages. It’s a long road from Fundamentals to reading and understanding today’s research, but this post is my answer to how to start down that path.

First, a distinction: to study programming languages, many people will point you to a book on compilers or interpreters. While those make up a large fraction of the work in programming languages, they are only one piece of the puzzle. The heart of (the study of) programming languages is formal semantics: the idea that one can mathematically specify the meaning of expressions and constructs in a programming language to avoid the ambiguity that otherwise arises from plain-English descriptions. Equipped with a formal semantics, one can develop multiple techniques to run programs (compilers, interpreters, hybrid approaches like JITs, etc.), new kinds of optimizations, and manual and automatic techniques to find bugs or prove properties of programs, all while remaining faithful to the intended definition of the programming language. I recommend first learning how to build and run your own language, then moving on to learning formal semantics to get a solid foundation for understanding current research.

With that in mind, here are some resources for the new (or not so new) computer scientist wishing to learn more about programming languages:

Essentials of Programming Languages: EOPL, as it’s better known, introduces readers to the internal workings of programming languages by describing small programming languages and creating an interpreter for each one. The book is very hands-on, with lots of exercises for the reader to modify the interpreters with new features. It touches on the ideas of reasoning about languages and formal semantics, but mostly sticks to the interpreter-as-semantics approach.

Jonathan Turner’s Reading List: Turner is an engineer on Mozilla’s Rust team and recently posted his reading list for getting up-to-speed on programming languages. The list starts with some resources on how to build interpreters and compilers, but also points to more academic material later.

Types and Programming Languages: TAPL (rhymes with “apple”), as it’s better known, has a solid introduction to formal semantics in the first few chapters and would be my pick for a starting point on formal semantics. The remainder of the book deals with type systems, which form only one part of programming languages, but it’s the canonical reference if you’re looking to learn about types.

Semantics Engineering with PLT Redex: The PhD-level programming languages course here at Northeastern uses the Redex book, and I found it to be a good introduction. The tool itself (Redex) is a great way to experiment with semantics, including reduction relations (roughly, the part of the semantics that says how the program runs) and type systems. You could use this book as a substitute for TAPL (at least for learning the basics of formal semantics), or you could use Redex to experiment with the languages described in TAPL.

10PL: This is a list compiled by Northeastern’s PL faculty of (roughly) ten academic papers that they think every student in PL should know. Not all of them are PL papers themselves, and they don’t form a full foundation on their own, but they form a kind of “great books” list for PL. Benjamin Pierce, the author of TAPL, also has a similar list (although with a slightly more type-heavy and theoretical bent).

That list is more than enough to get you started. I omitted resources for learning about formal methods and software engineering, two fields that overlap heavily with PL, but you may be interested in learning about them, too. For more information, I recommend talking to students or faculty in PL at your school, joining (or starting) a PL reading group, or eventually even applying to grad school if you’re so inclined.

Good luck!

Lifetime Parameters in Rust

Not long ago, when I first started working with Rust, lifetime parameters were one of the trickiest aspects to wrap my head around. What was this strange <'a> thing I saw at the beginning of some function and struct definitions, and what was it for?

I’m writing this post to solidify my own understanding of this unfamiliar feature, and hopefully it helps you, too. In it, I’ll cover:

  • the problem lifetime parameters solve
  • the places in Rust where lifetime parameters can be used, and what each location represents
  • the use of lifetimes as bounds on both generic type parameters and other lifetime parameters
  • the 'static lifetime
  • lifetime elision (a set of rules that allow you to omit explicit lifetime parameters in all but the most ambiguous cases)

I assume the reader is familiar with the basics of ownership, borrowing, and references in Rust, as well as generic types; if not, the book is an excellent reference.

Let’s get started!

The Problem

Imagine that we have a hypothetical version of Rust without lifetimes on reference types, and we try to run this program:
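
Something along these lines (the field types and values are illustrative; the layout puts the statements discussed below on lines 15, 16, and 18):

    struct Coordinate {
        x: i32,
        y: i32,
    }

    // In our hypothetical lifetime-free Rust, this hands back a plain reference into c.
    fn get_x(c: &Coordinate) -> &i32 {
        &c.x
    }

    fn main() {
        let x;
        {
            let c = Coordinate { x: 1, y: 2 };
            x = get_x(&c);      // line 15: x borrows memory owned by c
        }                       // line 16: c goes out of scope, so its memory is freed

        println!("x is {}", x); // line 18: x points to freed memory
    }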

At line 15, x would be assigned a reference to memory owned by the coordinate c (through the call to get_x). But at line 16, c would go out of scope, so its memory would be freed. That means that at line 18, x would point to invalid memory—a classic use-after-free error.

So why didn’t the compiler catch this? The problem is that without any extra information on the type signature for get_x, the compiler doesn’t know that the returned reference’s lifetime should be the same as the input reference’s. We could have, say, written a version of get_x that returns a reference with a completely different lifetime, like a reference to a constant.

A similar issue can occur with references stored inside structs. Consider a similar struct, RefCoordinate, that instead stores references for its x and y fields, and suppose we tried the following:
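
For example (again, the concrete values are illustrative; the relevant statements fall on lines 13 and 15):

    // Hypothetical Rust again: a struct holding references, written without lifetimes.
    struct RefCoordinate {
        x: &i32,
        y: &i32,
    }

    fn main() {
        let c;
        {
            let example_x = 1;
            let example_y = 2;
            c = RefCoordinate { x: &example_x, y: &example_y };
        }                           // line 13: example_x and example_y are freed here

        println!("x is {}", c.x);   // line 15: c.x refers to freed memory
    }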

Again, running this program would cause a use-after-free error. In this case, example_x and example_y are freed when the block ends at line 13, so c.x at line 15 would reference a freed location in memory. For the compiler to catch this error before we run the program, we need to establish some sort of relationship between the lifetime of a RefCoordinate and of the references it contains.

As you’ve probably already guessed, lifetime parameters solve both problems.

The Basics of Explicit Lifetimes and Lifetime Parameters

Although they are not always made explicit, every reference type in Rust has an associated lifetime describing the scope in which the reference can be used.

In cases like the above, we do want to make the lifetimes explicit so that we can express, for example, that the reference returned by get_x has the same lifetime as the input argument, c. That would give the compiler enough information to know that the use of x at line 18 is invalid.

Rust does not provide a way to write down concrete lifetimes (e.g. there’s no way to express “the lifetime starting at line 13 and ending at line 16”), but typically that’s not what we want, anyway. Instead, we want to abstract over all possible lifetimes; this is what lifetime parameters provide.

Lifetime parameters are special generic parameters on items like functions and structs, much like generic type parameters. The parameters provide names for our lifetimes that can be used as explicit lifetimes in our reference types. To make this concrete, here is the program after adding the proper lifetime parameters to get_x:
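
The annotated signature would look something like this (the body is unchanged):

    fn get_x<'a>(c: &'a Coordinate) -> &'a i32 {
        &c.x
    }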

Let’s break this down one step at a time. In general, 'a means “the lifetime a”, and &'a T is the type for a reference of type T with lifetime 'a (or &'a mut T for a mutable reference).

get_x has a single lifetime parameter, 'a, declared immediately after the name. c’s declared type says that it is a reference whose lifetime is 'a. This declaration does not change c’s lifetime; it only gives it a name so we can refer to it later. Finally, the return type &'a i32 says that get_x returns a reference to an i32, whose lifetime is the same as the input argument’s lifetime. In other words, from the caller’s perspective the returned reference is only usable as long as the input c is still in scope. With this information (and replacing our hypothetical, lifetime-ignoring Rust with the real thing), the compiler is able to report that our use of get_x is invalid, because c does not live long enough.

We could write the code like this instead to get a working program:
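
For example, by keeping c in scope for as long as we use x (the values are again just for illustration):

    fn main() {
        let c = Coordinate { x: 1, y: 2 };
        let x = get_x(&c);
        println!("x is {}", x); // fine: c is still in scope, so the borrow is still valid
    }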

Structs work similarly, as seen in this lifetime-ified example of RefCoordinate:
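
Something like this:

    struct RefCoordinate<'a> {
        x: &'a i32,
        y: &'a i32,
    }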

The struct RefCoordinate is now parameterized over a lifetime 'a, and its references are each declared to have that same lifetime 'a. In effect, this expresses the constraint that an instance of the struct may not outlive the references it contains.

Lifetimes on Other Items

Traits can also have lifetime parameters, with the same syntax and meaning as structs:
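
For example, a trait version of our getter might look like this (the trait name GetXRef reappears below; the method name is my own invention):

    trait GetXRef<'a> {
        fn get_x_ref(&self) -> &'a i32;
    }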

Finally, when implementing methods in an impl block, the block itself can declare lifetime parameters to be used in the named struct or trait:
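
For instance, a sketch of implementing that trait for RefCoordinate:

    impl<'a> GetXRef<'a> for RefCoordinate<'a> {
        fn get_x_ref(&self) -> &'a i32 {
            self.x
        }
    }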

In general, lifetime parameters are declared as generic parameters on items like functions, structs, traits, and impls, and they are used both in reference types and as arguments to items that take generic parameters (as with GetXRef and RefCoordinate in the above example).

Bounded Lifetimes, Lifetimes as Bounds

If you’re familiar with advanced uses of generic types, you know that they can have bounds, as in fn foo<T: S>(x: T). Here the provided T must be a type that implements S.

Similarly, lifetimes can act as bounds to other lifetimes. A parameter like <'a: 'b> means “I expect some lifetime 'a that is at least as long as 'b”.
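For instance (a made-up example), a function can hand back the longer-lived of two references at the shorter lifetime:

```rust
// 'a: 'b means 'a lasts at least as long as 'b, so an &'a i32 can be
// used wherever an &'b i32 is expected.
fn pick_first<'a: 'b, 'b>(first: &'a i32, _second: &'b i32) -> &'b i32 {
    first
}
```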

Additionally, lifetimes can act as bounds on types. A type parameter such as <T: 'a> means “I expect some type T such that all references in T have a lifetime at least as long as 'a”.
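For instance (again a made-up example), a struct holding a reference to some generic T can require that T contain nothing shorter-lived than 'a:

```rust
// T: 'a says any references inside T live at least as long as 'a,
// which makes it safe to hold a &'a T. (Recent editions of Rust can
// infer this particular bound, but it can still be written explicitly.)
struct Holder<'a, T: 'a> {
    value: &'a T,
}
```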

Bounds in both cases can be combined using the + operator just as with normal type bounds.

The 'static Lifetime

There is one particular lifetime in Rust that can be named without abstracting over it: the static lifetime, 'static. This is the lifetime for references that are alive throughout the entire execution of a program, such as references to string literals and global constants.
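For example:

```rust
// String literals live for the entire program, so they have type
// &'static str, as do references to global constants.
static GREETING: &'static str = "Hello, world!";

fn greeting() -> &'static str {
    GREETING
}
```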

Lifetime Elision

Lifetime parameters provide the compiler with the extra information it requires to prevent use-after-free issues, but writing explicit lifetimes for every single reference type can be cumbersome, especially for obvious cases like this:
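Something like this, say (an illustrative case of my own, where the single lifetime parameter is entirely mechanical):

```rust
fn trim_spaces<'a>(s: &'a str) -> &'a str {
    s.trim()
}
```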

We would rather write the following and have the compiler figure out where to insert the lifetime annotations:
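That is, the same function with the lifetime left out:

```rust
fn trim_spaces(s: &str) -> &str {
    s.trim()
}
```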

Fortunately, Rust has a simple set of rules, called the elision rules, that allow the programmer to elide (leave out) lifetime parameters in functions in the obvious cases (structs, however, must always use explicit lifetimes for their fields). The rules talk about input and output lifetimes, which are simply the lifetimes that appear in argument types and in return types, respectively. The rules are (as quoted from the book):

  • Each elided lifetime in a function’s arguments becomes a distinct lifetime parameter.
  • If there is exactly one input lifetime, elided or not, that lifetime is assigned to all elided lifetimes in the return values of that function.
  • If there are multiple input lifetimes, but one of them is &self or &mut self, the lifetime of self is assigned to all elided output lifetimes.

Eliding any other output lifetimes is a compile-time error. The book has many more examples to help you understand the elision rules.
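For instance (a hypothetical method on the Coordinate struct from the earlier sketch), the third rule is what lets methods that return references out of &self drop their lifetimes entirely:

```rust
impl Coordinate {
    // Elided form; by the rules, this reads as
    // fn x_ref<'a>(&'a self) -> &'a i32.
    fn x_ref(&self) -> &i32 {
        &self.x
    }
}
```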

Summary

  • Lifetime parameters abstract over the lifetimes used in type annotations and allow the programmer to express relationships between lifetimes.
  • Lifetime parameters are declared alongside generic type parameters on functions, structs, traits, and impl blocks, and can be used in reference types, struct types, and trait uses. They can also act as bounds on type and lifetime parameters.
  • The 'static lifetime extends over the whole program and is used for constants and string literals.
  • A simple set of lifetime elision rules allows programmers to omit explicit lifetimes in most cases.

Hopefully this has helped clear up one of the more confusing aspects of Rust’s type system. If you have questions I didn’t answer, please sound off in the comments. Happy hacking!

The PLAI Philosophy: An Interpretation

As a PhD student in PL who came from four years of industrial software development, lately I’ve been trying to figure out what working programmers should know from the study of programming languages. As part of that investigation I’ve been looking over some of the better books on programming languages to see what their take is. In the introduction to Shriram Krishnamurthi’s Programming Languages: Application and Interpretation, Shriram forgoes a description of the book’s philosophy in favor of a link to a video of the first lecture for the course. What follows is my own interpretation of the philosophy from that video, as an attempt to both put some of that material down into written, searchable form and discover my own thoughts on the matter.

(Disclaimer: this is my interpretation of what was said in the video, not necessarily what was meant, so any mistakes in the below description are my own)

Inspiration from Science

To the extent that computer science is a science, we should look to science for ways to approach our own field. One of the primary tasks of science is to come up with categories to better understand similarities and differences between objects or living things. Think of taxonomies in biology, for example, that might classify an organism as a plant, animal, or a member of some other kingdom.

We programmers often do this with programming languages at a high level. We categorize them as “functional”, “object-oriented”, “imperative”, “scripting”, and all sorts of other categories. However, this categorization breaks down upon further inspection: the categories are not well-defined, and many languages fit into more than one category (for example, Scala proponents often emphasize that it is well-suited for both object-oriented and functional programming).

Perhaps this kind of categorization would be useful at a different level. Another use of science is to study the smallest building blocks of objects, such as atoms or DNA base pairs, and determine how they combine to gain a better understanding of the larger structure. This approach is helpful in studying programming languages. By understanding the individual building blocks of the language (e.g. values, expression forms, types) and the rules for putting them together (i.e. the semantics), we gain a much deeper understanding of the language as a whole. Furthermore, at this level it can be useful to categorize the small building blocks or semantic rules in different ways, as the similarities and differences are more restricted and well-defined than for programming languages as a whole. The divide between call-by-name and call-by-value semantics is one such categorization familiar to PL researchers.

Programming Language = Core + Syntactic Sugar

The philosophy of the course is that every language has a small “essence” to it, made up of these small building blocks, and that the rest of the language is just syntactic sugar. That is, the non-core features of the language can be expressed in terms of the core features (if perhaps in a verbose manner). Finding and understanding that core provides a framework for understanding the rest of the language, and helps language designers understand the effect new features will have on their language.
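To make this concrete with an example of my own (the details of the real desugaring differ slightly): Rust’s for loop is essentially sugar over the more primitive loop, match, and iterator machinery.

```rust
fn main() {
    // The sugared form...
    for x in 0..3 {
        println!("{}", x);
    }

    // ...is roughly what this desugared core form expresses.
    let mut iter = (0..3).into_iter();
    loop {
        match iter.next() {
            Some(x) => println!("{}", x),
            None => break,
        }
    }
}
```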

The course has students write a desugaring process for the language under study, then write an interpreter for the remaining core language to better understand its semantics. The goal is to end up finding the “essence” of the language. One difficulty is that the implementer of both the desugaring process and the interpreter must also prove that their language implementation has the same semantics as the official language (whether that’s a formal specification, a reference implementation, or what have you). In other words, when one runs a program under the new implementation and gets an answer, that answer must be the same as the one predicted by the reference language.

Later on in the video, Shriram calls this a “recipe for [language] design”: start with the right core, think about the next necessary features, and add them in a way that works well with the core.

Motivation

One might ask, “Why study programming languages at all?”. After all, only a very select few work on the commonly-used programming languages.

The problem is this: no one sets out to build a programming language. They start by writing something simple, such as a configuration file format. Then a colleague asks them for a way to save repeated computations, so they introduce variables. Then another colleague asks for a conditional construct, so they introduce “if”. Then another colleague asks for another feature, and so on and so on until the tiny configuration language has morphed into a full-blown programming language.

Every programmer will run into this situation at some point. In other words, every programmer will eventually end up writing a programming language, whether they want to or not. The key difference is that those who don’t understand programming languages will build theirs badly.

Or as Shriram put it, “we have a social obligation” to make sure the programming languages of the future are well-constructed, by teaching programmers the fundamentals of PL.

My Take

When considering what the take-aways are for most working developers, I think the most important part is the focus on understanding the core of a language and how it works. Even disregarding any issues around language engineering, if a programmer understands how their language works at that level, they will have a much better grasp on how all the pieces fit together and will write code that’s less prone to errors caused by weird corner cases of their language.

Certainly I think the core-first recipe for language design is the right way to go, and although I still have my doubts on how useful this is for all programmers, the discussion around configuration languages that grow out of control is well-taken. Also, the idea of a social obligation to ensure better semantics for “the next big thing” resonated with me—even if we can’t find a reliable way to become the designer of the next hot language, we can do our best to make sure the person who gets there knows what they’re doing.

One issue I noted is that it’s not always clear what the core of a given programming language is. There might be many opinions on which features make up the core, a point touched on briefly in the lecture when it mentions that there are usually multiple ways to desugar a given feature. Finding the right core is probably not as important as finding a core, though, and my hypothesis is that most desugarings will give some insight into the workings of the language.

All in all, I think this is a great philosophy for those interested in knowing a bit more about programming languages to improve their own programming, and all programmers would be well-served by at least taking this approach to understanding their own language for day-to-day use.

Thoughts on Net Neutrality

With the FCC’s rules being struck down and arguments between Netflix/Cogent/Comcast/Verizon in the news lately, I thought this would be a good time to revisit net neutrality. I’ve generally been in favor of the idea ever since I first heard about the issue (fewer restrictions on content creators and users of the Internet is, overall, a good thing), but I’ve never stopped to really think through all the details. Having studied computer networks for the last couple of years, I figure this is as good a time as any to analyze the issue and come up with a more concrete and well-thought-out opinion.

The Issue

First off, for those who aren’t familiar, what is net neutrality, and what’s the whole debate about? At a high level, net neutrality is the principle that all traffic on the Internet should be treated equally by those who carry that traffic from one place to another. So no matter if you’re browsing Wikipedia, listening to music on Pandora, or watching movies on Netflix, all of the networks between you and the computers serving your content should treat your packets the same (for some definition of “the same”—more on that later).

The Internet started with this principle (at least understood, if not explicitly stated by early members), and carriers have largely followed it ever since. However, with the rise of bandwidth-heavy applications like streaming video and peer-to-peer file sharing, carriers are looking for more ways to ease the load on their saturated networks. One of the proposed methods is to throttle certain types of traffic in preference to others, based on criteria like the type or source of the content. Even past that point, many are worried that ISPs will actually start charging users to access certain sites, or that they will slow certain sites down to a crawl unless the customer pays up for “fast lane” access to that site. Alternatively, there’s also the possibility that ISPs charge the content providers themselves for the right to have their traffic carried to users, since certain providers (e.g. Netflix, YouTube) take up a large fraction of the available bandwidth. For example, in 2005, CEO of SBC Edward Whitacre stated in an interview that companies like Google, Yahoo! and Vonage would like to “use my pipes free, but I ain’t going to let them do that because we have spent this capital and we have to have a return on it.” This requirement to pay for access to users also brings up the fear that ISPs will start blocking or throttling services that compete with their own (e.g. Comcast slowing down Netflix to convince customers to use its own Xfinity video streaming).

Are these fears well-founded? Well, perhaps. In the US, we haven’t seen any cases of an ISP outright blocking a legal website (to my knowledge), but back in 2007, it was discovered that Comcast was throttling BitTorrent traffic to essentially unusable levels because of the load it caused on their network. An agreement was eventually reached, but many worry that this sort of situation could become more commonplace. Much more recently, in the last few months we’ve seen multiple disagreements (1, 2) between Netflix and ISPs like Comcast and Verizon about who should pay for what when it comes to giving users better levels of service on their Netflix streams (more on this in a later article).

Today, there are two primary questions: (1) should we still have network neutrality at all? and (2) if so, should we regulate it? There’s also the more nuanced issue of “exactly what do we mean by ‘net neutrality’?”.

My Take

Having researched the issue and thought about arguments from each side, my position is that the federal government should regulate ISPs to follow the principle of network neutrality. I define network neutrality as the combination of the following rules:

* ISPs can only charge a fee to those who connect directly to their networks, not to the sources or destinations of the traffic they carry.
* ISPs can allocate resources differently for different kinds of applications (e.g. video streaming versus web browsing), but not according to the source or destination of that traffic (with the exception of allocating bandwidth to customers according to their pricing plan).
* ISPs can charge differently for different levels of service, but the level of service must be the same across all the rest of the Internet, not on a per-site basis. The same offerings must be available to all customers.

These rules are largely inspired by a net neutrality definition from Tim Berners-Lee (inventor of the World Wide Web): “If I pay to connect to the Net with a certain quality of service, and you pay to connect with that or greater quality of service, then we can communicate at that level.” These rules also allow exceptions for things like illegal activity and security attacks, to the extent that reasonable efforts to prevent those problems require violating the rules.

My primary rationale is that if ISPs start charging users or content providers different amounts to access different parts of the Internet, then I believe (as many do) that the Internet will become much more closed off and we won’t see nearly the free-flowing exchange of ideas and content that we do now. This becomes especially hard on content providers who end up having to pay every last ISP just to reach all of their subscribers.

Furthermore, customers are already paying ISPs for the cost of maintaining their network and carrying traffic. It shouldn’t matter what type of traffic that is. Figuring out how to deliver the user’s requested content is the ISP’s problem, not the user’s.

Opposing arguments

I’d like to take some time to address some of the trickier arguments that took me a while to fully answer.

Why should content providers have free access to an ISP’s “pipes”?

The Edward Whitacre argument is one that’s tricky to address at first, but much less so once you understand the overall payment structure of the Internet. The short answer is that content providers don’t use ISPs’ “pipes” for free, although the payment may be indirect. This is because of two forms of connection agreements between networks, known as “peering” and “transit”. Ars Technica has a good writeup on this, but I’ll give a quick overview.

Transit is similar to the relationship between a customer and an ISP: when network A pays network B for transit, A expects B to carry any and all of A’s traffic to the rest of the Internet. The cost is often in terms of dollars per amount of data transferred.

Peering, on the other hand, is a more mutual agreement, and traditionally (though not always) does not involve payment from either side. When two networks agree to peer with one another, they agree to exchange traffic directly between their customers, but not between their upstream connections. For example, if Verizon and Comcast peer with each other, a Verizon customer’s computer can communicate with a Comcast customer’s computer, but that same Verizon customer’s computer could not use Comcast’s transit provider to access the rest of the Internet.

Given that every network must connect to its neighbors through either peering or transit, any network that is carrying traffic from one place to another must have already made some agreement to carry that traffic. For example, if traffic from a YouTube video is traveling through AT&T’s network on the way to a Comcast customer, there must have been some agreements (paid or not) between the networks between YouTube and AT&T to allow the traffic to flow. So even if YouTube’s transit provider did not pay AT&T directly, there’s still a chain of payments between said transit provider and AT&T (with the possible exception of peering connections).

Networks are indeed expensive to maintain and upgrade, but the users and content providers creating the traffic flowing across those networks are certainly paying their fair share.

People who download lots of videos slow down network access for everyone else

I’m okay with traffic management that keeps certain kinds of traffic from taking up all the bandwidth available to other users. This does not violate my principle of net neutrality. Ideally, ISPs would take the total bandwidth available at various points in the network and divide it evenly among the customers who are currently sending and receiving packets, so that regardless of what type of traffic a user generates, they get the same bandwidth as everyone else. That said, this may not be feasible given the complexity of most networks and the difficulty of configuring network management policies, but it’s an ideal to shoot for.

The real solution is to have more competition among ISPs—then the free market would encourage net neutrality as users buy access from ISPs that follow the principle.

Would “real” competition help? Well, maybe. I don’t know of any place in the US that has more than a small handful of available providers (often just one or two), so we don’t have actual experience to extrapolate from. A big problem, though, is that it’s hard to encourage competition in the Internet access space. Laying cables across cities and rural areas to connect residences and businesses to the Internet is a large expense, and not many companies are willing to take that kind of risk.

What are some of the options for encouraging competition? We could force all ISPs to open access to their cable lines to other ISPs, lowering the barrier to entry. However, the owning ISPs rightfully expect to make profits off of the lines they dug, so we shouldn’t force them to open their cable access. We could allow them to charge licensing fees for the access right, but without forcing them to do so, there’s no guarantee that would happen. We could require all future network lines to offer licensed access, but that doesn’t do anything for all the existing lines connecting most of the US already.

We could also have the government take over building the physical networks instead, but then we run into the usual problem with government-run projects: less incentive to upgrade and innovate.

I don’t see a great way forward here, but I’d love to know if there are other good ideas out there.

Some users/content providers might want to pay extra for a guaranteed quality of service (QoS) level

This type of argument can be made for those who use the Internet for mission-critical, drop-sensitive tasks. For example, one could imagine a military network that needs to route a video feed from soldiers on the ground to commanders on the other side of the globe. The military might be willing to pay more for access to a high-speed line with very little contention.

This is still doable, but I think it should happen only at the direct-connection level: if you want a fast connection, you pay your ISP for it, your ISP pays someone else for a fast connection, and so on up the chain. Forcing the end customer to pay every middleman along the way seems impractical.

It’s possible that this is less efficient than offering specifically outlined QoS levels that a user can pay for, but I’m not convinced that such levels are impossible to implement within a net neutrality framework. Even if they are, I think it’s a good trade-off to make to maintain the openness of the Internet.

Network neutrality is a good principle, but regulating it would cause more issues than it solves

This is perhaps my main concern. I haven’t had a chance to research this yet, so expect it to be the topic of a future article.

Conclusion

Net neutrality is a great principle to help keep the Internet an open environment that allows for rapid exchange of information, services, or just the latest funny cat pictures. It can take several forms, though, and the arguments against it (or against regulating it) should not be trivially dismissed. My own position is that we should regulate it to prevent abuses from ISPs, but allow for reasonable traffic prioritization. I encourage you to learn more, keep an eye on the news, and if you agree that this is a principle we need to enforce, contact your legislators when the issue comes up again.

I’m sure there are plenty more issues about net neutrality than I’ve covered here, so I’m happy to continue the discussion in the comments below.