Chris Tonkinson


© 2021 Chris Tonkinson

Rust, Part 1: Welcome to the 21st Century


Rust, from Mozilla Research, has shipped a 1.0 beta.

Rust Logo

Image Credit: Nylithius

Welcome to the 21st century. To celebrate this great milestone — the advent of modern, civilized society — this is the first in a series of articles looking at Rust from various theoretical and practical perspectives.

What is Rust? Straight from the horse's mouth:

Rust is a systems programming language that runs blazingly fast, prevents almost all crashes*, and eliminates data races.


Rust has been on my radar for some time now, but I’ve lacked the motivation to intensely investigate it. Unfortunately, Forge doesn’t turn a profit simply by virtue of me learning. Then, last week, I heard Steve Klabnik and Yehuda Katz on the Changelog talking about Rust following the recent 1.0 beta milestone. On or about that same day, the Ruby Talk mailing list got hit with some spam. This gave me an idea, which I hope to be writing about soon. Nevertheless, it was the kick-in-the-pants I needed to dedicate some time to studying Rust. I’m glad I did.

You should investigate Rust as well. It’s not completely production ready just yet, but by the time it is, Rust will eat the collective lunches of other major systems programming languages. C is the mother of all systems languages. C++, D, Go, et al. are all fantastic languages in their own ways; these kinds of evolutionary improvements are critical. But Rust is a revolutionary improvement for the industry.

Compiled

Your CPU cannot directly run the source code you write. Your C++ or Python program needs to be translated by a language processor into native instructions that the CPU can execute. In a language like C or C++, the compiler is that language processor. It parses your source code and writes an executable file consisting of machine code instructions logically equivalent to your source. When you run your program, there is no overhead - those instructions are executed by the CPU directly. Can't get any faster than that. But when you make a change to your source code, you have to re-compile a new executable. The output code is wickedly efficient, but you pay for it with that compilation step every time you want to make a change.

By contrast, in a language like Perl or Ruby, the language processor is called an interpreter - it parses your source code, and then executes instructions which are logically equivalent to it. The interpreter itself has already been compiled; that is, it already contains the machine code for any instruction the language can generate. So you get to skip the explicit "compile" step when you make changes, but that abstraction comes at the cost of a hefty runtime performance penalty: the translation step still happens, in a sense, but now it happens every time you run the code, whether or not anything changed.

Rust follows the former pattern: compilation. You have to compile the code when you make changes, but there isn't a heavy runtime* with tons of overhead incurred when you execute the program.

*Rust does actually have a tiny runtime, and it does perform some minor safety checks, but A) these are few and far between, and B) they’re compiled into the native code. For the purposes of a high level overview, we’ll hand-wave the performance penalty.

Memory Safe

The causes and effects of memory safety in Rust are a big deal. That’s the whole point. It’s what this is all about.

Low-level languages like C allow you to twiddle individual bits within the computer's memory. You can allocate some memory on the heap and have multiple variables (called pointers) linked to that memory for easy access. The problem is twofold - first, machines are stupid, and second, so are programmers.

Machine Stupidity - The computer will do exactly what you tell it to do, every time, all the time, and twice on Sundays. It’s stupid, it just follows directions, nothing more.

Programmer Stupidity - What a programmer thinks they told the computer to do, and what they actually told the computer to do frequently diverge. When they do, the computer does what it’s told, not what the programmer intended. That’s called a bug, and it’s proof positive that programmers are stupid.

Computers (which are stupid) can’t prevent programmers (who are stupid) from writing bugs. So programmers will continue writing bugs until we reach the Singularity, at which point the computers will kill all the humans, and the problem will sort itself out. The real danger until then, however, is that bugs in memory management code are arguably some of the worst kinds of software bugs imaginable.

Memory bugs almost always precede a crash. Not log-a-message-and-move-on, not gracefully-reset, but a full-on nose dive fiery death. A segmentation fault — or “seg fault” — almost always betrays a vulnerability leading to privilege escalation, code execution, or some equally terrifying security nightmare. These are the vulnerabilities that regularly propagate malware, enable identity theft, empower spying and espionage, drain bank accounts, ruin performance, corrupt files, and murder kittens. If you’re sick of updating Java and Flash, then you’re sick of memory bugs.

Rust imposes rules, suggests conventions, and provides tools not found (at least not all in the same place at the same time) in other languages, which practically eliminate the entire category of memory bugs. In a systems language, that's a big deal. A momentous deal. The combination of bare-metal performance and pointers has meant for decades "be careful what you do, or you will learn what SIGSEGV means." And for decades we've struggled to deal with that, and we've failed. Time and time again. Pwn2Own this year was just another high-profile demonstration of this failure. What's more (and what's novel) is that Rust does all of this without a runtime GC - most of it is baked into the compilation step. At this point, the only runtime overhead of which I'm aware is sequence bounds checking (which just automates what you should already be doing in C or elsewhere).

This is just the “why”, though. More on the “what”, “when” and “where” in a future post.

Multi-paradigm

Imperative / Procedural
fn main() {
    let x = 5;
    println!("The value of x is {}.", x);
}
Structured
fn main() {
    for y in 0..10 {
        println!("The value of y is {}.", y);
    }
}
Object Oriented
struct Shape {
    height: i32,
    width: i32,
}

fn main() {
    let square = Shape { height: 2, width: 2 };
    println!("The square is {}x{}.", square.width, square.height);
}
Functional
fn main() {
    let z = 2;
    println!("The value of z is {}.", match z {
        1 => "one",
        2 => "two",
        3 => "three",
        _ => "many",
    });
}

Rust also includes support for generics, iterators, and closures:

fn main() {
  let mut nums: Vec<i32> = Vec::new();
  nums.push(1);
  nums.push(2);
  nums.push(3);

  println!("nums sum: {}",
      nums
          .iter()
          .fold(0, |acc, &x| acc + x)
  );
}

For those who know how painful it is to write Ruby one day:

puts %w(alice bob mallory).join(", ")

And find yourself with the same problem the next day in C++:

#include <iostream>
#include <string>
#include <vector>

int main() {
  std::vector<std::string> names;  // note: "names();" would declare a function
  names.push_back("alice");
  names.push_back("bob");
  names.push_back("mallory");

  size_t counter = 0;
  for (const auto& name : names) {
    std::cout << name;
    if (++counter < names.size()) std::cout << ", ";
  }
  std::cout << "\n";
}

You can appreciate the productivity of such high-level features, paid for with but a few extra syntactic and semantic rules. Finally, we have reconciled native performance with developer happiness. You begin to forget that this code is even compiled - rustc is pretty quick (although no other compiler can touch Go in terms of speed).

Community

Rust and Cargo both observe Semantic versioning. ‘Nuff said.

Rust comes with a Gem-inspired package management system called Cargo. If you’re familiar with Gems, Pip, Composer, NPM, or the like, Cargo will make immediate sense. It even leverages a formalization of the familiar .ini-esque configuration syntax called TOML instead of inventing something crazy of its own.
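As a sketch (the package name and dependency here are hypothetical), a minimal Cargo manifest looks something like this:

```toml
[package]
name = "hello_rust"        # hypothetical crate name
version = "0.1.0"          # Cargo expects semantic versioning
authors = ["Jane Doe <jane@example.com>"]

[dependencies]
rand = "0.3"               # example dependency pulled from crates.io
```

That's the whole file - no invented DSL, just TOML sections for metadata and dependencies.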

Rust core follows a six-week release cycle, with stable, beta, and nightly channels. You know exactly what you're getting.

Triple slash /// comments are supported for a built-in documentation generator (rustdoc) that uses Markdown. That’s nice (I’m a huge Markdown fan) but the killer feature here is that Rust is able to automatically generate runnable tests from fenced code blocks. Nigh magical. This means that if you’re maintaining examples in your autogen’d documentation (which is a great convention regardless), you get small unit tests “for free” that will automatically break if your docs no longer match your code. For example (from the Book):

/// ```
/// use std::rc::Rc;
///
/// let five = Rc::new(5);
/// ```

Will run a test such as:

fn main() {
    use std::rc::Rc;
    let five = Rc::new(5);
}

This type of attention to "meta" isn't earth-shattering - a lot of the paradigms Rust uses are common practice in the JavaScript, Ruby, and other communities. What's unique here is that they're all on by default, they're all baked into the language itself, and they're being applied to a systems programming language which actually does something new and substantially useful.


As I mentioned at the top, welcome to the 21st century. Watch for future installments as I dive into some detailed analysis, provide simple tutorials, and prognosticate about what it’s like using Rust as a daily driver.