PrettySize 0.3 release and a weakness in rust’s type system

In which we discuss the limitations of rust's support for commutative mathematical operations and how to preserve backwards compatibility in the face of breaking changes

I’m happy to announce that a new version of size, the PrettySize.NET port for rust, has been released and includes a number of highly requested features and improvements.

The last major release of the size crate was 0.1.2, released in December of 2018. It was feature complete with regard to its original purpose: the (automatic) textual formatting of file sizes for human-readable printing/display purposes. It would automatically take a file size, pick the appropriate unit (KB, MB, GB, etc.) to display the final result in, and choose a suitable precision for the floating numeric component. It had support for both base-10 (KB, MB, GB, etc.) and base-2 (KiB, MiB, GiB, etc.) types, and the user could choose between them as well as override how the unit was formatted. In short, it did one thing and did it right.

A brief recap of the size crate to date

Some time after its release, a request came in to add support for mathematical operations on strongly-typed Size types (without having to go to an intermediate “raw” bytes value and back). I originally approached it with some gusto, but ended up dismayed by restrictions of the rust type system that made it difficult to write generic code supporting the full gamut of what a user could reasonably expect to be able to do (more on this later).

As the size crate covered the functionality we needed out of it here at NeoSmart Technologies and as there were valid workarounds for composing/calculating sizes (via the .bytes() escape hatch), there wasn’t a pressing need to tackle those limitations. Other projects took priority, and size hasn’t seen any updates since then. But it never sat well with me that I left my git working tree in an unclean state and had open issues languishing unresolved, and from time to time I would think of going back and issuing an update… but you know how that goes.

However, in a recent discussion apropos “genuine limitations of the rust type system that frustrate people that otherwise love rust” I raised the issue that I ran into, and that got me hot and bothered enough to finally tackle a new size release with the requested support for mathematical operators and more.

Rust’s problem with commutative mathematical operations

So what’s this about a problem in rust’s type system? Well, with the caveat that its importance is almost certainly overinflated in my eyes, the issue lies with how mathematical operations (impls of core::ops::{Add, Sub, Mul, Div} and others) are written and how that conflicts with rust’s orphan rule, which forbids you from implementing a foreign trait for a foreign type (both defined in a different crate).

After publishing, I realized that this article heavily uses the LHS and RHS abbreviations, but never defines what they stand for! LHS is “left-hand side” and RHS is “right-hand side” and they denote (for a binary mathematical operation) what side of the operator a token is on. e.g. in 42 apples + 17 oranges, the LHS is “42 apples,” while the RHS has a magnitude of 17 with a unit of “oranges.”

Let’s take a look at how we would normally implement (in rust) support for a mathematical operation like foo * bar where Foo and Bar are different types, both local to the current crate:

use core::ops::Mul;

struct Foo(i32);
struct Bar(i32);

/// Here, Bar is the RHS type and Foo is the LHS type, 
/// i.e. this impl is used for `let prod = foo * bar`
impl Mul<Bar> for Foo {
    type Output = i32;
    
    fn mul(self, other: Bar) -> Self::Output {
        self.0 * other.0
    }
}

/// Here, Foo is the RHS type and Bar is the LHS type, 
/// i.e. this impl is used for `let prod = bar * foo`
impl Mul<Foo> for Bar {
    type Output = i32;
    
    fn mul(self, other: Foo) -> Self::Output {
        self.0 * other.0
    }
}

fn main() {
    println!("{}", Foo(7) * Bar(6));
    println!("{}", Bar(6) * Foo(7));
}

You can try this online in the rust playground.

The above code demonstrates the commutative property of scalar multiplication: foo * bar is the same as bar * foo and returns the same result, both in type (here, i32) and magnitude (42 in this example).

It’s important for a type system to allow (or even require) you to write out separate implementations for each of the two commutative permutations because not all mathematical operations (or even all multiplications) are commutative. For example, while addition is generally commutative, subtraction isn’t (4 - 2 gives a different result from 2 - 4) – and multiplication of matrices M and N may not only give different results for M * N as compared to N * M, one of those operations may be valid while the other is an error!
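To make the matrix remark concrete, here is a minimal sketch using two hypothetical types, Mat2 and Vec2 (neither comes from any crate discussed here). Matrix-by-matrix multiplication is defined in both orders but gives different answers, while matrix-by-vector multiplication is only implemented in one order, so the reverse is rejected at compile time:

```rust
use core::ops::Mul;

/// Hypothetical 2x2 integer matrix, stored row by row.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Mat2([[i32; 2]; 2]);

/// Hypothetical 2-element column vector.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec2([i32; 2]);

/// Matrix * matrix: valid in both orders, but the results differ.
impl Mul for Mat2 {
    type Output = Mat2;

    fn mul(self, rhs: Mat2) -> Mat2 {
        let (a, b) = (self.0, rhs.0);
        let mut out = [[0; 2]; 2];
        for i in 0..2 {
            for j in 0..2 {
                out[i][j] = a[i][0] * b[0][j] + a[i][1] * b[1][j];
            }
        }
        Mat2(out)
    }
}

/// Matrix * column vector: only this order is implemented.
impl Mul<Vec2> for Mat2 {
    type Output = Vec2;

    fn mul(self, v: Vec2) -> Vec2 {
        let m = self.0;
        Vec2([
            m[0][0] * v.0[0] + m[0][1] * v.0[1],
            m[1][0] * v.0[0] + m[1][1] * v.0[1],
        ])
    }
}

fn main() {
    let m = Mat2([[1, 2], [3, 4]]);
    let n = Mat2([[0, 1], [1, 0]]);

    println!("{:?}", m * n); // Mat2([[2, 1], [4, 3]])
    println!("{:?}", n * m); // Mat2([[3, 4], [1, 2]])
    println!("{:?}", m * Vec2([1, 1])); // Vec2([3, 7])

    // `Vec2([1, 1]) * m` does not compile: there is no `impl Mul<Mat2> for Vec2`.
}
```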

So far, we haven’t run into any issues. But let’s say we have a bunch of different types, all of which (for reasons we won’t get into) can be boiled down to an integer equivalent and we want to support commutative multiplication for them all. It sounds like a textbook case for the use of generics: define a trait AsInt, have each type implement it however it likes, then implement core::ops::Mul via the AsInt trait:

use core::ops::Mul;

trait AsInt {
    fn as_int(&self) -> i32;
}

struct Foo(i32);
struct Bar(i32);

impl AsInt for Foo {
    fn as_int(&self) -> i32 { self.0 }
}

impl AsInt for Bar {
    fn as_int(&self) -> i32 { self.0 }
}

impl<Lhs: AsInt, Rhs: AsInt> Mul<Rhs> for Lhs {
    type Output = i32;
    
    fn mul(self, other: Rhs) -> Self::Output {
        self.as_int() * other.as_int()
    }
}

You can try this online in the rust playground.

Unfortunately, this doesn’t compile:

error[E0210]: type parameter `Lhs` must be used as the type parameter for some local type (e.g., `MyStruct<Lhs>`)
  --> src/main.rs:18:6
   |
18 | impl<Lhs: AsInt, Rhs: AsInt> Mul<Rhs> for Lhs {
   |      ^^^ type parameter `Lhs` must be used as the type parameter for some local type
   |
   = note: implementing a foreign trait is only possible if at least one of the types for which it is implemented is local
   = note: only traits defined in the current crate can be implemented for a type parameter

For more information about this error, try `rustc --explain E0210`.
error: could not compile `playground` due to previous error

The problem is that while all the types and the traits in this implementation are indeed local, the rust compiler doesn’t check if we are in violation of the orphan rule by checking which types implement the (local) trait we are implementing (another) trait against – it just checks to see if the implementing type itself is local. You can actually use a generic parameter implementing any (foreign or local) trait in your impl, but you can’t implement against that generic type directly – you can only forward it as a generic parameter to a local type.
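For illustration, this is roughly what “forwarding the generic parameter to a local type” looks like. Int<T> is a hypothetical wrapper introduced only for this sketch; because Int<Lhs> is a local type no matter what Lhs is, the compiler accepts the impl. The price is that every call site has to wrap its operands:

```rust
use core::ops::Mul;

trait AsInt {
    fn as_int(&self) -> i32;
}

struct Foo(i32);
struct Bar(i32);

impl AsInt for Foo {
    fn as_int(&self) -> i32 { self.0 }
}

impl AsInt for Bar {
    fn as_int(&self) -> i32 { self.0 }
}

/// Hypothetical local wrapper: `Int<T>` is local for every `T`.
struct Int<T>(T);

/// Accepted by the orphan rule: the implementing type is `Int<Lhs>`,
/// not the bare type parameter `Lhs`.
impl<Lhs: AsInt, Rhs: AsInt> Mul<Int<Rhs>> for Int<Lhs> {
    type Output = i32;

    fn mul(self, other: Int<Rhs>) -> Self::Output {
        self.0.as_int() * other.0.as_int()
    }
}

fn main() {
    println!("{}", Int(Foo(7)) * Int(Bar(6))); // 42
    println!("{}", Int(Bar(6)) * Int(Foo(7))); // 42
}
```

It compiles and is commutative, but `Int(foo) * Int(bar)` is a long way from the `foo * bar` we were after.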

Sidebar: Quare rust’s orphan rule and its limitations?

The most succinct PLT answer to this is that it’s because “local types” is a closed set (a new local type can never be added without changing your code and its API) while “types implementing local trait” is (or could be) an open set: a downstream user of your crate/library may implement your trait on their type at a later date, and suddenly we could have a conflict. You might be tempted to think “that’s on them,” and I wouldn’t blame you (and might even agree) but the problems don’t stop there – we absolutely need the ability to implement both local and foreign traits, but a crate or library upstream of yours (or even the standard library itself) might implement the same foreign trait against types implementing another foreign trait, and then the conflict would be your problem, in your code.

Of course the rust compiler could be smarter about this and allow a combination of only certain permutations of impl/for local/foreign types/traits to get around these restrictions (e.g. allow implementing anything for a local type, implementing only sealed traits inaccessible to downstream users for foreign types, etc) and while there are open issues and rfcs for some of these, the road to hell is paved with good intentions and there are a thousand pitfalls.1 Long story short, the situation is what it is (for now) for $reasons and until that changes, these restrictions on commutative operations for generic types aren’t going anywhere.

Back to the issue at hand

You can kind of work around this by abusing Deref with an output that’s some intermediate type exposing a reference to an i32 (because we actually have an underlying i32 in this case, and are not just calculating one out of the blue each time), but deref coercion will only get you so far.
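A simplified sketch of the Deref idea, assuming the simplest possible variant where the Deref target is the underlying i32 itself rather than an intermediate type. It shows both what you gain and where it stops: operators do not apply deref coercion to their operands, so the dereference has to be spelled out by the caller.

```rust
use core::ops::Deref;

struct Foo(i32);
struct Bar(i32);

impl Deref for Foo {
    type Target = i32;

    fn deref(&self) -> &i32 { &self.0 }
}

impl Deref for Bar {
    type Target = i32;

    fn deref(&self) -> &i32 { &self.0 }
}

fn main() {
    let (foo, bar) = (Foo(7), Bar(6));

    // Explicit derefs turn both operands into plain `i32` values,
    // so the built-in (commutative) `i32 * i32` is used.
    println!("{}", *foo * *bar); // 42
    println!("{}", *bar * *foo); // 42

    // `foo * bar` still does not compile: no `Mul` impl exists for `Foo`,
    // and the `*` operator will not deref its operands for us.
}
```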

For the particular case of Size, we just need to implement commutative multiplication of Size * number and number * Size so it turns out we can actually side-step this entire debate by manually writing out a million or so different impls, one for each primitive numeric type (macros help here!). Then multiply those by four, because you need to write a separate impl for each of Foo * Bar, Foo * &Bar, &Foo * Bar and &Foo * &Bar. Lots of code, but conceptually simple.
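As a rough sketch of that plan (not the code that ships in size), a single macro_rules! definition can stamp out all eight impls per numeric type: four value/reference combinations for Foo * number and four for number * Foo. Foo and the macro name impl_mul are placeholders, and the operands in main are explicitly typed:

```rust
use core::ops::Mul;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Foo(i64);

/// Hypothetical helper macro: generates every by-value/by-reference
/// permutation of `Foo * num` and `num * Foo` for each listed type.
macro_rules! impl_mul {
    ($($num:ty),* $(,)?) => {$(
        impl Mul<$num> for Foo {
            type Output = Foo;
            fn mul(self, rhs: $num) -> Foo { Foo(self.0 * rhs as i64) }
        }
        impl Mul<&$num> for Foo {
            type Output = Foo;
            fn mul(self, rhs: &$num) -> Foo { Foo(self.0 * *rhs as i64) }
        }
        impl Mul<$num> for &Foo {
            type Output = Foo;
            fn mul(self, rhs: $num) -> Foo { Foo(self.0 * rhs as i64) }
        }
        impl Mul<&$num> for &Foo {
            type Output = Foo;
            fn mul(self, rhs: &$num) -> Foo { Foo(self.0 * *rhs as i64) }
        }
        impl Mul<Foo> for $num {
            type Output = Foo;
            fn mul(self, rhs: Foo) -> Foo { Foo(self as i64 * rhs.0) }
        }
        impl Mul<&Foo> for $num {
            type Output = Foo;
            fn mul(self, rhs: &Foo) -> Foo { Foo(self as i64 * rhs.0) }
        }
        impl Mul<Foo> for &$num {
            type Output = Foo;
            fn mul(self, rhs: Foo) -> Foo { Foo(*self as i64 * rhs.0) }
        }
        impl Mul<&Foo> for &$num {
            type Output = Foo;
            fn mul(self, rhs: &Foo) -> Foo { Foo(*self as i64 * rhs.0) }
        }
    )*};
}

// Only types that convert to i64 without wrapping are listed here.
impl_mul!(u8, u16, u32, i8, i16, i32, i64);

fn main() {
    let foo = Foo(7);
    let n: u16 = 6;

    println!("{:?}", foo * n);   // Foo(42)
    println!("{:?}", &foo * &n); // Foo(42)
    println!("{:?}", n * foo);   // Foo(42)
    println!("{:?}", &n * &foo); // Foo(42)
}
```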

Except it turns out not to be so simple after all. Here’s an example that demonstrates commutative multiplication of a type (but not a reference to a type) with an i32 value:

use core::ops::Mul;

#[derive(Debug, Copy, Clone)]
struct Foo(i32);

impl Mul<Foo> for i32 {
    type Output = Foo;
    
    fn mul(self, other: Foo) -> Self::Output {
        Foo(self * other.0)
    }
}

impl Mul<i32> for Foo {
    type Output = Foo;
    
    fn mul(self, other: i32) -> Self::Output {
        Foo(self.0 * other)
    }
}

fn main() {
    println!("{:?}", Foo(7) * 6);
    println!("{:?}", 6 * Foo(7));
}

You can try this online in the rust playground.

It works great. This time we are returning a strongly-typed Foo rather than an i32 scalar value, commutative multiplication works fine, the code compiles, and prints the expected output.

We originally wanted to make this generic over all primitive numeric types, so that if the user has a num: u8 or a float: f64 lying around, they can just perform the multiplication automatically without getting a type mismatch error like you would with the above if you tried to multiply by some already-typed value that rust can’t coerce/infer to be an i32 (which our impl is specifically for):

fn main() {
    println!("{:?}", Foo(7) * 6_u8);
    println!("{:?}", 6_f32 * Foo(7));
}

You can try this online in the rust playground.

Which gives the following (expected) type errors:

error[E0308]: mismatched types
  --> src/main.rs:23:31
   |
23 |     println!("{:?}", Foo(7) * 6_u8);
   |                               ^^^^ expected `i32`, found `u8`
   |
help: change the type of the numeric literal from `u8` to `i32`
   |
23 |     println!("{:?}", Foo(7) * 6_i32);
   |                                 ~~~

error[E0277]: cannot multiply `f32` by `Foo`
  --> src/main.rs:24:28
   |
24 |     println!("{:?}", 6_f32 * Foo(7));
   |                            ^ no implementation for `f32 * Foo`
   |
   = help: the trait `Mul<Foo>` is not implemented for `f32`

Some errors have detailed explanations: E0277, E0308.
For more information about an error, try `rustc --explain E0277`.

We said we can’t use generics to implement this support, but we can add a second pair of impl Mul for u8 to get this to work, right?

use core::ops::Mul;

#[derive(Debug, Copy, Clone)]
struct Foo(i32);

impl Mul<Foo> for i32 {
    type Output = Foo;
    
    fn mul(self, other: Foo) -> Self::Output {
        Foo(self * other.0)
    }
}

impl Mul<i32> for Foo {
    type Output = Foo;
    
    fn mul(self, other: i32) -> Self::Output {
        Foo(self.0 * other)
    }
}

impl Mul<Foo> for u8 {
    type Output = Foo;
    
    fn mul(self, other: Foo) -> Self::Output {
        Foo(self as i32 * other.0)
    }
}

impl Mul<u8> for Foo {
    type Output = Foo;
    
    fn mul(self, other: u8) -> Self::Output {
        Foo(self.0 * other as i32)
    }
}

fn main() {
    println!("{:?}", Foo(7) * 6_u8);
    println!("{:?}", 6_i32 * Foo(7));
}

You can try this online in the rust playground.

Indeed, this adds support for multiplying a Foo by a u8 or the other way around, just as we wanted. We also still have support for multiplying Foo by i32 (and vice-versa) as well. Great! This is what we wanted, right? Ergonomics +100 achievement unlocked!

Unfortunately, no. While we did add support for multiplying by u8 or i32 typed values, we broke something probably much more important: the ability to multiply by an untyped (or at least, not explicitly typed) literal:

use core::ops::Mul;

#[derive(Debug)]
struct Foo(i32);

impl Mul<Foo> for i32 {
    type Output = Foo;

    fn mul(self, other: Foo) -> Self::Output {
        Foo(self * other.0)
    }
}

impl Mul<Foo> for u8 {
    type Output = Foo;
    
    fn mul(self, other: Foo) -> Self::Output {
        Foo(self as i32 * other.0)
    }
}

fn main() {
    let prod = 7 * Foo(6);
    assert_eq!(prod.0, 42);
}

You can try this online in the rust playground.

This breaks in a rather weird way: you’d expect that if there’s any confusion about what a type is, it’s about whether 7 is an i32 or u8 here. Indeed, that’s what’s happening internally, but that’s not what the error surfaced by the rust compiler says:

error[E0282]: type annotations needed
  --> src/main.rs:39:9
   |
39 |     let prod = 7 * Foo(6);
   |         ^^^^^
   |
   = note: type must be known at this point
help: consider giving `prod` an explicit type
   |
39 |     let prod: _ = 7 * Foo(6);
   |              +++

Weird. We know (or at least, can reasonably surmise) the problem is with the ambiguity in the literal 7 and whether the compiler should invoke the Mul<Foo> for i32 impl or the Mul<Foo> for u8 impl, but the compiler says the problem is actually with the missing return type for the entire operation (which is always Foo because that’s the Mul::Output we have specified for both)! In fact, an older version of the compiler produces a better message:

<snip>

error[E0283]: type annotations needed
  --> src/main.rs:39:19
   |
39 |     let prod1 = 7 * Foo(6);
   |                   ^ cannot infer type for type `{integer}`
   |
note: multiple `impl`s satisfying `{integer}: Mul<Foo>` found
  --> src/main.rs:22:1
   |
22 | impl Mul<Foo> for i32 {
   | ^^^^^^^^^^^^^^^^^^^^^
...
30 | impl Mul<Foo> for u8 {
   | ^^^^^^^^^^^^^^^^^^^^

However, let’s just take the latest rustc at its word and add the missing Foo type to the let prod = ... expression:

fn main() {
    let prod: Foo = 7 * Foo(6);
    assert_eq!(prod.0, 42);
}

And everything magically works! But we didn’t actually solve the problem we were dealing with, we just worked around the resulting compiler error – something each of our users would have to do any time they relied on type inference to multiply a scalar number by a typed Foo (interesting tidbit: this doesn’t happen the other way around, when multiplying a Foo by a scalar value – I’m not sure why, but I’ve opened bugs for these issues: [1], [2]).
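For completeness, here are two other call-site workarounds that resolve the same ambiguity by pinning down the type of the literal instead of the result. Like the annotation above, they only paper over the problem, and every user would have to remember to apply them:

```rust
use core::ops::Mul;

#[derive(Debug, PartialEq)]
struct Foo(i32);

impl Mul<Foo> for i32 {
    type Output = Foo;

    fn mul(self, other: Foo) -> Self::Output {
        Foo(self * other.0)
    }
}

impl Mul<Foo> for u8 {
    type Output = Foo;

    fn mul(self, other: Foo) -> Self::Output {
        Foo(self as i32 * other.0)
    }
}

fn main() {
    // A literal suffix names the LHS type, so exactly one impl applies.
    let a = 7_i32 * Foo(6);
    let b = 7_u8 * Foo(6);

    // Fully-qualified syntax names the impl outright.
    let c = <i32 as Mul<Foo>>::mul(7, Foo(6));

    assert_eq!(a.0, 42);
    assert_eq!(b.0, 42);
    assert_eq!(c.0, 42);
    println!("{:?} {:?} {:?}", a, b, c);
}
```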

To recap:

  • Rust’s orphan rule prevents us from implementing commutative addition/multiplication for types implementing a trait, which isn’t a complete blocker if you’re the only one that’s ever going to be implementing them because you can use macros or good, old copy-and-paste to work around that limitation and implement the operation manually. Half the operations can be generic over RHS (because impl<Rhs: ...> Mul<Rhs> for SpecificType is perfectly legal) but the other half need to be manually spelled out. If you’re doing just two or three types, it’s fairly manageable but since you need 4 * M * N impls in total (accounting for the ref/non-ref permutations), it can quickly spiral into insanity.
  • A (temporary?) bug? quirk? limitation? in the rust compiler stops us from manually implementing commutative operations with the various numeric literals, because even though rust has a (silent) integer inference preference for i32 and a default floating point type of f64, the presence of multiple impls breaks type inference in interesting ways.
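The asymmetry in the first point can be sketched by reusing the AsInt example from earlier: the impl that is generic over the RHS compiles because the implementing type (Foo) is local, while the mirrored direction has to be written once per concrete LHS type.

```rust
use core::ops::Mul;

trait AsInt {
    fn as_int(&self) -> i32;
}

struct Foo(i32);
struct Bar(i32);

impl AsInt for Foo {
    fn as_int(&self) -> i32 { self.0 }
}

impl AsInt for Bar {
    fn as_int(&self) -> i32 { self.0 }
}

/// Legal: generic over the RHS, implemented for one specific local type.
impl<Rhs: AsInt> Mul<Rhs> for Foo {
    type Output = i32;

    fn mul(self, other: Rhs) -> Self::Output {
        self.as_int() * other.as_int()
    }
}

/// The other direction must be spelled out for each LHS type by hand
/// (or by macro); `impl<Lhs: AsInt> Mul<Foo> for Lhs` is rejected.
impl Mul<Foo> for Bar {
    type Output = i32;

    fn mul(self, other: Foo) -> Self::Output {
        self.as_int() * other.as_int()
    }
}

fn main() {
    println!("{}", Foo(7) * Bar(6)); // 42, via the generic impl
    println!("{}", Bar(6) * Foo(7)); // 42, via the hand-written impl
    println!("{}", Foo(7) * Foo(6)); // 42, the generic impl covers Foo too
}
```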

A new (and a newer) size crate

This brings us at long last to today’s announcement regarding a new size crate. Faced with the issues above while attempting to implement commutative mathematical operations, size now features support for the following, implemented via a combination of macros/copy-and-paste and generic impls where possible:

  • Strongly-typed addition and subtraction of Size values, giving Size results. This was implemented directly, as there’s only one type involved, with copy-and-pasted impls for the ref/non-ref cases.
  • Multiplication and division of an LHS Size by an RHS integer or floating value, yielding a Size instance; implemented via generics as impl<T: ..> Mul<T> for Size is perfectly accepted, then copy-and-pasted as needed to handle ref/non-ref permutations.
  • Multiplication of an LHS integer or floating point value by an RHS Size value, yielding a Size result. This could only be implemented for one integer type (i64) and floating type (f64) to prevent the bizarre breakage when an untyped integer/float value is used (with only one possible {float} type and one possible {integer} type, rustc will try to coerce to the matching type of the two automatically). This had to be implemented manually (via macros) as rust’s orphan rule got in the way.
  • Division of an LHS integer by a Size value is not implemented, since it makes no sense (what does 42 / 16 KiB yield?).
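The overall shape of that operator support can be sketched with a hypothetical ByteCount type. This is a stand-in written for this article, not the actual Size implementation: the bound Into<f64> on the generic RHS is an assumption chosen to keep the example short, and the reference permutations and division are omitted.

```rust
use core::ops::{Add, Mul, Sub};

/// Hypothetical stand-in for a strongly-typed size; not the crate's type.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct ByteCount(i64);

impl Add for ByteCount {
    type Output = ByteCount;
    fn add(self, rhs: ByteCount) -> ByteCount { ByteCount(self.0 + rhs.0) }
}

impl Sub for ByteCount {
    type Output = ByteCount;
    fn sub(self, rhs: ByteCount) -> ByteCount { ByteCount(self.0 - rhs.0) }
}

/// ByteCount * number: generic over the RHS, which the orphan rule allows.
impl<T: Into<f64>> Mul<T> for ByteCount {
    type Output = ByteCount;
    fn mul(self, rhs: T) -> ByteCount {
        let factor: f64 = rhs.into();
        ByteCount((self.0 as f64 * factor) as i64)
    }
}

/// number * ByteCount: exactly one integer type on the LHS...
impl Mul<ByteCount> for i64 {
    type Output = ByteCount;
    fn mul(self, rhs: ByteCount) -> ByteCount { ByteCount(self * rhs.0) }
}

/// ...and exactly one floating point type.
impl Mul<ByteCount> for f64 {
    type Output = ByteCount;
    fn mul(self, rhs: ByteCount) -> ByteCount {
        ByteCount((self * rhs.0 as f64) as i64)
    }
}

fn main() {
    let total = ByteCount(1024) + ByteCount(512) - ByteCount(36);
    println!("{:?}", total); // ByteCount(1500)

    // Unsuffixed literals on the LHS: only one integer impl and one float
    // impl exist, so there is nothing for inference to be confused about.
    println!("{:?}", 2 * total);   // ByteCount(3000)
    println!("{:?}", 0.5 * total); // ByteCount(750)

    println!("{:?}", total * 3_u8); // ByteCount(4500)
}
```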

That was pretty much it in terms of the features I’d wanted to implement from a few years back before I was stymied by the rust restrictions/limitations we’ve discussed. But the additions and improvements to size didn’t stop there:

  • As a result of implementing core::ops::Sub, it became necessary to add support for the concept of negative file sizes (something which can only exist in the abstract and wasn’t previously supported). This necessitated a change in the “output type” used by the library, and now the core primitive type returned/expected by the library (generic overloads excepted) is i64 rather than u64.
  • The goal of this crate has changed from “merely” providing formatting for file sizes to encapsulating all operations on sizes in general by providing a strongly-typed size that can expose just the right number of features and functionality while restricting the user from doing things that don’t make sense (such as dividing a scalar integer by a file size, as mentioned above). To that end, it is now possible to directly compare Size types for equality or order (via PartialEq and PartialOrd impls).
  • With its newfound ability to do more than just format file sizes for human readable output, it’s possible to imagine using size in completely different contexts. To that end, the size crate may now be compiled as a no_std library2 which lets you use the basic Size features such as initializing a Size from different units, comparing Size instances, etc but disables features that aren’t meant to be used in embedded or other no_std contexts.
  • The size crate no longer has any dependencies. It previously featured only a single dependency on num_traits (plus its transitive dependency on autocfg) for abstracting over the different primitive numeric types, but the latest releases now use a sealed local trait and some macros to accomplish the same but without any foreign dependencies. Compilation time has been significantly improved as a result.
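The “sealed local trait plus macros” approach from the last point might look something like the following. All of the names here (ToByteCount, to_byte_count, impl_to_byte_count, and this cut-down Size) are made up for the sketch and are not the crate’s real internals:

```rust
mod sealed {
    /// Lives in a private module, so downstream crates cannot name it
    /// and therefore cannot implement `ToByteCount` for their own types.
    pub trait Sealed {}
}

/// Hypothetical local replacement for a numeric-abstraction dependency.
pub trait ToByteCount: sealed::Sealed {
    fn to_byte_count(self) -> i64;
}

/// Hypothetical macro: implements the trait for each primitive listed.
macro_rules! impl_to_byte_count {
    ($($num:ty),* $(,)?) => {$(
        impl sealed::Sealed for $num {}

        impl ToByteCount for $num {
            fn to_byte_count(self) -> i64 { self as i64 }
        }
    )*};
}

impl_to_byte_count!(u8, u16, u32, u64, i8, i16, i32, i64, f32, f64);

/// Cut-down stand-in for the real type, storing a signed byte count.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Size {
    bytes: i64,
}

impl Size {
    pub fn from_bytes<T: ToByteCount>(value: T) -> Self {
        Size { bytes: value.to_byte_count() }
    }

    pub fn bytes(&self) -> i64 {
        self.bytes
    }
}

fn main() {
    println!("{}", Size::from_bytes(42_u8).bytes());      // 42
    println!("{}", Size::from_bytes(1536.7_f64).bytes()); // 1536 (truncated)
    println!("{}", Size::from_bytes(1_u8) < Size::from_bytes(2_u64)); // true
}
```

Because the trait is sealed, the set of implementing types is closed again, which is exactly the property the orphan rule discussion above was worried about.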

The changes above formed the bulk of the size 0.2.0 release. But just as I was about to sit down and write up this article, it struck me that the Size api was not very rusty. The crate (and its basic API) was originally written in 2018 and envisioned somewhat differently from how it turned out. The original idea was to take advantage of rust’s game-changing tagged enums to provide, in addition to pretty printing of file sizes, an interface for converting directly between sizes of different units (almost like a units(1) but in rust).

Rust’s enums seemed perfect for the job, so size 0.1.x and 0.2.x shipped with an API that exposed an enum composed of a strongly-typed unit name and a generic, numeric size, e.g. Size::Bytes(T), Size::Gigabytes(T), Size::Kibibytes(T), etc. But in practice, people aren’t reaching out for the size crate to convert between well-defined base-2/base-10 size units, they’re using it to create strongly-typed Size objects to represent an underlying file size and format it for display. Users requested the ability to perform math/logic on Size types, but users didn’t care for requesting an equivalent Size but with a base unit of gigabytes.

The majority of the code I found on GitHub and in other nooks and crannies of the internet ended up looking like this:

let size = Size::Bytes(some_value);
...
println!("File size: {}", size);

Which, while being perfectly valid rust code and actually conforming to the rust formatting rules and regulations, just didn’t feel rusty and didn’t match the approach that other crates have pretty much clustered around. You don’t get the feeling that Size<T>::Bytes() is an enum so much as it appears to be an unfortunately mis-capitalized method exposed by the Size type. What’s more, the interface was extremely generic-heavy, but the generics were only skin-deep because all operations coming out of a Size stripped the original T and returned values of the intermediate (u64/i64) numeric type instead. While the Size variants were storing the user-picked T, it wasn’t actually used anywhere except as an input to internal calculations completely masked from the end user. It wasn’t intuitive that operations like Size<u8> + Size<f64> were even possible, let alone that they yielded a Size<i64> (regardless of the initial types), and the internal type conversions and changes to precision (one way or the other) were not intuitively exposed.

Enter size 0.3.x with a new and rustier (if not improved) API that should feel more natural to rustaceans around the globe. Size has been changed from an enum to a struct and now exposes functions to model the behavior previously exposed by the old variants. The biggest API change is that the Size type itself is no longer generic and things like Size::Kilobytes(10) are now expressed as Size::from_kilobytes(10) (or, optionally, Size::from_kb(10) instead). It should be more immediately intuited that a numeric conversion is (or at least may be) taking place given the “from” in the function name and the fact that you are not directly instantiating an instance of Size containing a particular numeric type T that is somehow never afterwards seen.
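To show the difference in shape side by side, here is a deliberately tiny mock-up of the two designs. OldSize and NewSize are hypothetical names, and the Into<i64> bound is a simplification chosen for brevity (it excludes u64 and floats, which the real constructors are not claimed to do):

```rust
/// Old shape: a generic enum with one variant per unit.
#[allow(dead_code)]
enum OldSize<T> {
    Bytes(T),
    Kilobytes(T),
}

/// New shape: a non-generic struct that always stores bytes.
#[derive(Debug, Clone, Copy, PartialEq)]
struct NewSize {
    bytes: i64,
}

impl NewSize {
    fn from_bytes<T: Into<i64>>(value: T) -> Self {
        NewSize { bytes: value.into() }
    }

    /// Base-10 kilobytes: 1 KB = 1000 bytes.
    fn from_kilobytes<T: Into<i64>>(value: T) -> Self {
        let n: i64 = value.into();
        NewSize { bytes: n * 1000 }
    }

    /// Short alias for `from_kilobytes`.
    fn from_kb<T: Into<i64>>(value: T) -> Self {
        Self::from_kilobytes(value)
    }

    fn bytes(&self) -> i64 {
        self.bytes
    }
}

fn main() {
    // Old: the `u8` is baked into the type, as `OldSize<u8>`.
    let _old = OldSize::Kilobytes(10_u8);

    // New: the input type is consumed by the conversion and then gone.
    let new = NewSize::from_kilobytes(10_u8);
    println!("{}", new.bytes()); // 10000

    assert_eq!(NewSize::from_kb(10_i32), new);
    assert_eq!(NewSize::from_bytes(10_000_i64), new);
}
```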

One other minor change that may be of interest to other crate developers: there are certain spellings or phrasings specific to each community, and it helps the ecosystem considerably for crate authors to make a conscious effort to adhere to them where possible. For example, while size 0.2.x spelled “lower case” as two words, the rust standard library has it as a single word “lowercase” and so enum members like Style::FullLowerCase have been renamed to Style::FullLowercase to match.

Preserving backwards compatibility in rust

It would seem like a massively breaking change to switch the core Size type from an enum to a struct, let alone to rename virtually all the interface members in such a manner. But if you take the unix “source-compatible” approach rather than focusing on strict ABI compatibility, things are actually not that bad – if you’re willing to break some conventions and bend the rust compiler to your will with liberal usage of #[allow(...)] in carefully chosen places.

After the new Size interface was in place, a second impl Size { ... } was added – this time prefixed with #[doc(hidden)] to keep it out of the documentation – that contained a number of “fakes,” in this case, const functions masquerading as enum variants. Here’s an excerpt of what that looks like:

#[doc(hidden)]
impl Size {
    #![allow(non_snake_case)]

    #[inline]
    #[deprecated(since = "0.3", note = "Use Size::from_bytes() instead")]
    /// Express a size in bytes.
    pub const fn Bytes(t: T) -> Self { Self::from_bytes(t) }

    #[inline]
    #[deprecated(since = "0.3", note = "Use Size::from_kibibytes() instead")]
    /// Express a size in kibibytes. Actual size is 2^10 \* the value.
    pub const fn Kibibytes(t: T) -> Self { Self::from_kibibytes(t) }

    // ...
}

You may not be able to achieve full compatibility with the old API and there are a lot of cases where this won’t cut it, but fortunately for us, they’re not how most users would approach things. For example, someone creating a Size via Size<u64>::Bytes(num.into()) would find that their code no longer compiles, as Size itself is not generic (rather, it’s the function/mock variant Size::Bytes<T> that is generic over T). But luckily for us, that’s not how most people would write that code and the “natural” way of expressing it (Size::Bytes(num as u64)) continues to compile, happily oblivious to the fact that we’re actually calling a function called Bytes() rather than constructing an enum variant Size<T>::Bytes.
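Here is a self-contained version of the trick that you can compile on its own. The Size below is a cut-down stand-in (with an assumed Into<i64> bound, and plain rather than const functions) just to show that an old-style call site keeps compiling against the new struct:

```rust
/// Cut-down stand-in for the new struct-based type.
#[derive(Debug, PartialEq)]
pub struct Size {
    bytes: i64,
}

impl Size {
    /// The new, preferred constructor.
    pub fn from_bytes<T: Into<i64>>(value: T) -> Self {
        Size { bytes: value.into() }
    }

    pub fn bytes(&self) -> i64 {
        self.bytes
    }
}

/// Backwards-compatibility shims, kept apart from the real API.
#[doc(hidden)]
impl Size {
    #![allow(non_snake_case)]

    /// A function that looks like the old enum variant at the call site.
    #[deprecated(since = "0.3.0", note = "Use Size::from_bytes() instead")]
    pub fn Bytes<T: Into<i64>>(t: T) -> Self {
        Self::from_bytes(t)
    }
}

/// Stands in for unmodified downstream code written against the old API.
#[allow(deprecated)]
fn legacy_call_site(num: u32) -> Size {
    Size::Bytes(num)
}

fn main() {
    let old_style = legacy_call_site(1024);
    let new_style = Size::from_bytes(1024_u32);

    assert_eq!(old_style, new_style);
    println!("{}", old_style.bytes()); // 1024
}
```

Without the #[allow(deprecated)], the legacy call still compiles but emits a deprecation warning pointing users at the new name, which is the whole point.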

For the renamed “plain” enums, a similar approach was used to make it seem like FullLowerCase was still a valid member of the Style enum (used to specify how the unit name is formatted when the size is pretty-printed):

enum Style { .... }
impl Style {
    #[doc(hidden)]
    #[allow(non_upper_case_globals)]
    #[deprecated(since = "0.3", note = "Use Style::FullLowercase instead")]
    /// A backwards-compatible alias for [`Style::FullLowercase`]
    pub const FullLowerCase: Style = Style::FullLowercase;
}

In this particular case, it would have been possible to keep the old FullLowerCase enum member around and simply hide it from the docs, since Style remained an enum. But that would mean updating all our match sites to handle both the old and the new name, incurring both a maintenance and a (negligible) runtime cost to keeping the backwards-compatible name around. With this approach, and especially with all the old names kept in a separate impl Style block that only contained shims for the deprecated API, there is almost no cost to keeping the code compatible for a few versions or however long we choose to support the legacy API.
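The same idea in runnable form, with a two-variant Style (the Abbreviated variant and the unit_name function are invented for the example). Note how the match only needs arms for the real variants, since the old name is just a constant that evaluates to one of them:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Style {
    Abbreviated,
    FullLowercase,
}

/// Shims for the deprecated names live in their own impl block.
impl Style {
    #[doc(hidden)]
    #[allow(non_upper_case_globals)]
    #[deprecated(since = "0.3.0", note = "Use Style::FullLowercase instead")]
    /// A backwards-compatible alias for [`Style::FullLowercase`]
    pub const FullLowerCase: Style = Style::FullLowercase;
}

/// Hypothetical formatting helper: no arm is needed for the old name.
fn unit_name(style: Style) -> &'static str {
    match style {
        Style::Abbreviated => "KB",
        Style::FullLowercase => "kilobytes",
    }
}

/// Stands in for downstream code still using the old spelling.
#[allow(deprecated)]
fn legacy_style() -> Style {
    Style::FullLowerCase
}

fn main() {
    assert_eq!(legacy_style(), Style::FullLowercase);
    println!("{}", unit_name(legacy_style()));     // kilobytes
    println!("{}", unit_name(Style::Abbreviated)); // KB
}
```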

Again, this isn’t a magic fix that keeps everything working, but it does handle pretty much all the cases our users were actually using (in this case, calling a function and specifying a Style::Foo variant as a parameter). I highly recommend using GitHub’s (or any other service’s) code search feature to look at how people are using your API before introducing breaking changes or remodeling an API; it really helps to understand how your users approach your crate, which may be quite different than how you originally intended for it to be used.

Using size or contributing

The latest release of the size crate is available on crates.io, and the documentation has been completely overhauled as part of the new 0.2.x and 0.3.x releases. The source code is available on GitHub and is released under the MIT license.

I’ve actually released size 0.4 shortly after publishing this article, mainly to future-proof the API against breaking changes in the future (by breaking it in the here-and-now instead 🤦‍♀️). Unfortunately, I wasn’t able to use any of the methods outlined above to preserve backwards compatibility, and I humbly apologize to everyone affected by this breakage!

You can use size in your rust code today by simply adding a reference to size in your Cargo.toml and placing use size::Size at the top of your rust code:

use size::Size;
use std::fs::File;

fn main() {
    let metadata = File::open("foo.bin").unwrap().metadata().unwrap();
    let file_size = Size::from_bytes(metadata.len());
    println!("{}", file_size); // prints "13.37 MiB"
}


  1. For example, a foreign trait is implemented for impls of a local trait, but one of your types implements both the local trait and some other upstream trait and a later upstream release implements the same foreign trait for all impls of the other foreign trait, and suddenly your type has multiple impls for the same trait. 

  2. Just compile with default features disabled. 
