Using SIMD acceleration in rust to create the world’s fastest tac

NeoSmart Technologies’ open source (MIT-licensed), cross-platform tac (written in rust) has been updated to version 2.0 (GitHub link). This is a major release with enormous performance improvements under-the-hood, primarily featuring handwritten SIMD acceleration of the core logic that allows it to process input up to three times faster than its GNU Coreutils counterpart. (For those that aren’t familiar with it, tac is a command-line utility to reverse the contents of a file.)

This release has been a long time in the making. I had the initial idea of utilizing vectorized instructions to speed up the core line ending search logic during the initial development of tac in 2017, but was put off by the complete lack of stabilized SIMD support in rust at the time. In 2019, after attempting to process a few log files – each of which were over 100 GiB – I decided to revisit the subject and implemented a proof-of-concept shortly thereafter… and that’s when things stalled for a number of reasons.

Continue reading