Sunday, June 21, 2020

Rust, Minecraft, and the Fragility of Software

I've been using Rust as my main language for the last year or so.  Recently I went back to C++ for something else, and was struck by the difference.  I had been gradually liking Rust more and more as I used it, but switching back and forth between Rust and C++ really opens your eyes to the advantages that Rust has.

Rust's type system, and the way that it weaves references, heap allocation, stack allocation, and the like into the types themselves, is really powerful.  Once you've gotten used to it, it makes a great deal of sense compared with the often opaque nature of C's pointers.

Yes, C++ has std::unique_ptr<>, and has been trying to incorporate these concepts in the standard library, but it's not nearly as simple to use as Rust's default mode of moves and borrows.

In particular, the Rust compiler is a great ally.  And it's continually getting better.  Its ability to catch "use after move" errors and reference lifetime issues (e.g. use after free) is amazing.

But that, to me, is not the best part.

The best part is a standard library that has the notions of Option<T> and Result<T, E> deeply embedded into it.  Option<T> is an enum type, generic on type T.  It has two variants:  None, and Some(T).  None is like null, except that to the compiler it's a distinct variant of the type, not a value that can silently stand in for a T.  You can't treat an Option<T> variable like it's a T, because it might be None.  But it's easy to use, especially with match expressions, map(), and the like.
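To make that concrete, here is a minimal sketch of handling an Option<T> with match and map().  The function names and the score data are made up for illustration; only Option, match, and map() come from Rust itself.

```rust
// Hypothetical helper: look up a score by name. Returns None when the
// name is not present, instead of a null or a sentinel value.
fn find_score(scores: &[(&str, u32)], name: &str) -> Option<u32> {
    scores.iter().find(|(n, _)| *n == name).map(|(_, s)| *s)
}

// The compiler requires both variants to be handled before a value is used.
fn describe(score: Option<u32>) -> String {
    match score {
        Some(s) => format!("score: {}", s),
        None => String::from("no score"),
    }
}

fn main() {
    let scores = [("alice", 10), ("bob", 7)];

    // map() transforms the inner value only when there is one.
    let doubled = find_score(&scores, "alice").map(|s| s * 2);
    println!("{}", describe(doubled));

    // A missing name flows through as None rather than crashing.
    println!("{}", describe(find_score(&scores, "carol")));
}
```

Writing `find_score(&scores, "alice") * 2` directly would not compile, because an Option<u32> is not a u32.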

Checked math exists, is A Thing, especially when dealing with time.  And that's subtle, but it's what got me thinking about this (that and explaining to my 9yo why computers have limits to the sizes of numbers that they can use).
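As one small example of checked math with time, std::time::Duration has checked_sub() and checked_add().  The sketch below assumes a hypothetical "time budget" scenario; the remaining() function is invented for illustration.

```rust
use std::time::Duration;

// Hypothetical helper: how much of a time budget is left?
// Returns None if more time has elapsed than the budget allowed,
// since a Duration cannot be negative.
fn remaining(budget: Duration, elapsed: Duration) -> Option<Duration> {
    budget.checked_sub(elapsed)
}

fn main() {
    let budget = Duration::from_secs(5);

    // Within budget: Some(remaining time).
    println!("{:?}", remaining(budget, Duration::from_secs(2)));

    // Over budget: None, not a wrapped-around huge value.
    println!("{:?}", remaining(budget, Duration::from_secs(10)));

    // The same idea applies at the upper limit.
    println!("{:?}", Duration::MAX.checked_add(Duration::from_secs(1)));
}
```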

Mathematical overflow is one of those things that we tend not to think about, except in the brief moment of choosing the type for a variable or member, and then much later, when nonsense values suddenly show up and you realize that something has gone terribly wrong because something has overflowed.

Rust has a bunch of operations that are checked, and return an Option<T>, allowing them to return None instead of nonsense.  And since that None isn't a null, but an enum variant that you're forced to contend with, the compiler won't let you pretend it's Some(T).
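Here is a minimal sketch of the difference, using the integer methods checked_mul() and wrapping_mul().  The area() function and its numbers are hypothetical.  (By default, a plain `*` that overflows panics in debug builds and wraps in release builds.)

```rust
// Hypothetical helper: area of a rectangular region, or None if the
// product does not fit in a u32.
fn area(width: u32, depth: u32) -> Option<u32> {
    width.checked_mul(depth)
}

fn main() {
    // Fits comfortably: Some(1_000_000).
    assert_eq!(area(1_000, 1_000), Some(1_000_000));

    // 100_000 * 100_000 = 10^10, which is larger than u32::MAX.
    // Wrapping arithmetic silently produces a nonsense value...
    assert_eq!(100_000u32.wrapping_mul(100_000), 1_410_065_408);

    // ...while the checked version reports the problem as None.
    assert_eq!(area(100_000, 100_000), None);

    // The caller has to decide what None means before using the result.
    match area(100_000, 100_000) {
        Some(a) => println!("area = {}", a),
        None => println!("area does not fit in a u32"),
    }
}
```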

Unfortunately, that can lead to some clunky code when trying to do math (say converting Fahrenheit to Celsius), if each step is being checked for overflow.
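A sketch of what that looks like for the Fahrenheit conversion, using integer math and C = (F - 32) * 5 / 9.  Both function names are invented for illustration, and the integer division truncates toward zero, so this is not a precise conversion.

```rust
// Every step checked by hand: correct, but a lot of ceremony for
// one line of arithmetic.
fn f_to_c_verbose(f: i32) -> Option<i32> {
    let shifted = match f.checked_sub(32) {
        Some(v) => v,
        None => return None,
    };
    let scaled = match shifted.checked_mul(5) {
        Some(v) => v,
        None => return None,
    };
    scaled.checked_div(9)
}

// The same checks using the `?` operator, which returns None early
// from the function as soon as any step overflows.
fn f_to_c(f: i32) -> Option<i32> {
    f.checked_sub(32)?.checked_mul(5)?.checked_div(9)
}

fn main() {
    println!("{:?}", f_to_c(212));      // boiling point of water
    println!("{:?}", f_to_c(i32::MAX)); // the multiply overflows
    println!("{:?}", f_to_c_verbose(i32::MIN)); // the subtract overflows
}
```

The `?` operator takes some of the edge off, but each operation still has to be spelled out as a method call rather than written with ordinary operators.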

But that clunkiness lays bare the fragility that underlies a lot of software.

We assume so much in the software that we write is safe, and for the most part it is.  Until it isn't.

Another example, and what got me started on this line of thought, was my 9yo asking about the Far Lands in Minecraft, a world generation bug that occurred at high values along the X and Z coordinates (the ground plane).  And it occurred to me that this was likely due to overflows, or the imprecision of floating point at large values (which also shows up in Minecraft).
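The floating point side of this is easy to demonstrate.  The sketch below is not Minecraft's code and says nothing about how the Far Lands actually arise; it only shows, with a hypothetical step() function, that a 32-bit float runs out of precision at large magnitudes.

```rust
// Hypothetical helper: move a position along one axis by some delta.
fn step(position: f32, delta: f32) -> f32 {
    position + delta
}

fn main() {
    // Near the origin, a step of 1.0 behaves as expected.
    assert_eq!(step(1.0, 1.0), 2.0);

    // An f32 has a 24-bit significand. At 2^24 = 16_777_216 the gap
    // between adjacent representable values is 2.0, so adding 1.0
    // rounds straight back to the starting value.
    let far = 16_777_216.0_f32;
    assert_eq!(step(far, 1.0), far);

    // Smaller steps vanish too.
    assert_eq!(step(far, 0.5), far);

    println!("a step of 1.0 at {} leaves the position unchanged", far);
}
```

Unlike integer overflow, nothing here returns None or panics; the addition simply stops having any effect.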

I've long been aware of these issues, but I've also treated them as special cases.  By making some choices early on, one can mostly ignore them.  I mean, 640KB should be enough for anyone, right?

But these examples, along with using Rust, have really been making me re-evaluate just how often we make these assumptions, and how fragile most software is, especially if it ever faces inputs the developer didn't expect.  And not just user inputs:  corrupt I/O readings, packet errors, etc. can be nefarious in embedded work.

Rust certainly isn't perfect.  As I mentioned earlier, the checked math routines are clunky to use, and for the most part, aren't the default.  Tools like proptest exist, which can help set up the bounds for limiting bad inputs to your functions, but it's still a bunch of work to always be thinking about what these limits and error potentials mean.

But as compilers get better, especially with expressive type systems like Rust has, I'm hoping that we'll get to a point where we can catch these sorts of errors at compile-time, and as a result, get closer to a place where we can categorically remove classes of errors from programs.
