rewrite: a rust-powered, in-place file rewrite utility

Let’s say you’ve got a terminal open and you want to sort the contents of a file before you email it to a friend. The file can contain anything and be of any length; it doesn’t matter. What do you do?

The obvious answer is to use sort. Sorting the file is as easy as sort myfile, except it doesn’t actually sort the file: it sorts the *contents* of the file and dumps them to the terminal (via stdout). So how do you sort the file “in-place,” so to speak? Again, the obvious answer would be sort myfile > myfile [1], redirecting the output of the sort command back to the file you ultimately want to send sorted.

The only problem is that if you ran the command above, you’d lose your data entirely. You see, the > redirect instructs the shell to replace the file following it with the output of the preceding command, and the shell truncates that file to zero length before sort even starts. By the time sort opens its input, there is nothing left to read. And even where the truncation isn’t immediate, many commands are too smart to read their entire input into memory before writing anything out, which is how they manage huge amounts of data without completely trashing your system memory; they keep a handle to the input file open and read even as they write to stdout [2]. If you did try to sort your file that way, you’d regret it dearly when it came time to email that file and you realized that it was now completely blank and all your work had been lost.
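If you want to see the truncation for yourself without risking real data, here is a small sketch that runs in a throwaway directory. The file name fruit.txt and the variable names are made up for the demo.

```shell
# Work in a scratch directory so no real data is at risk.
demo_dir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$demo_dir/fruit.txt"

# The shell opens (and truncates) the redirect target before sort starts,
# so sort finds an empty file and has nothing to output.
sort "$demo_dir/fruit.txt" > "$demo_dir/fruit.txt"

# Count the bytes left in the file (tr strips the padding some wc builds add).
bytes_left=$(wc -c < "$demo_dir/fruit.txt" | tr -d ' ')
echo "bytes left: $bytes_left"
```

The file still exists afterwards, but it is empty.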

The “right” way of doing this would be to read from one file and write to another. You could do something like this:

#randomly select 2048 words from the dictionary
shuf -n 2048 /usr/share/dict/words > words.txt
sort words.txt > sorted.txt
mv sorted.txt words.txt

And that would do what you want. Except it’s a bit of a mouthful, and it gets a lot harder when it’s not just a one-off command that needs to read and write from/to the same file but rather part of a larger workflow or bash script. Enter rewrite.

rewrite is a dead-simple no-brainer of a command line utility that simply buffers content from stdin and only when the upstream command or commands have finished does it write to the output file. Sorting a file in-place becomes as simple as

sort words.txt | rewrite words.txt

And that’s it. You don’t have to worry about creating or deleting temporary files, and you don’t need to generate random file names if you’re batch processing a thousand files as part of a script. rewrite takes care of the entire thing for you, and it couldn’t be easier to use. And for those who care about such things, rewrite is written in Rust for safety and reliability combined with performance and out-of-the-box cross-platform support.
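To make the buffer-then-write idea concrete, here is a rough shell sketch of the same behavior. It is not how rewrite itself is implemented: rewrite_sketch is a hypothetical stand-in, and buffering in a temporary file is an assumption made to keep the example short. The point is only that the target file is not touched until the upstream command has finished.

```shell
# Hypothetical stand-in for rewrite, NOT its actual implementation:
# buffer all of stdin in a temporary file, then overwrite the target
# only after the upstream command has closed its end of the pipe.
rewrite_sketch() {
    target=$1
    buffer=$(mktemp) || return 1
    # cat returns only at end-of-input, i.e. once upstream is done writing.
    cat > "$buffer" || { rm -f "$buffer"; return 1; }
    # Only now is the target opened for writing (and truncated).
    cat "$buffer" > "$target"
    status=$?
    rm -f "$buffer"
    return $status
}

# Try it on a scratch file.
demo_dir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$demo_dir/fruit.txt"
sort "$demo_dir/fruit.txt" | rewrite_sketch "$demo_dir/fruit.txt"
cat "$demo_dir/fruit.txt"
```

Because sort has already read its input and exited by the time the target is opened for writing, the file ends up holding the sorted lines rather than nothing.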

Installing rewrite is as simple as

cargo install rewrite

(Assuming you have Rust’s cargo utility installed.) If you’re on Windows or Mac, pre-built binaries are also provided below. Of course, rewrite is fully open-source (MIT licensed, no less) and available on GitHub for your forking pleasure.


  1. Don’t do this!

  2. While this is not true for sort, there are more complex commands where terminating with | tee infile.txt > /dev/null results in loss of data, because the input file is being read at the same time the output file is being written.

4 thoughts on “rewrite: a rust-powered, in-place file rewrite utility”

  1. Haha, what a waste. Sort already has the option to redirect output with the -o option.

    arc% cat words.txt
    foo
    bar
    more
    words
    blah
    arc% sort -o words.txt words.txt
    arc% cat words.txt
    bar
    blah
    foo
    more
    words

  2. This is really handy, thank you.

    dangeroushobo: do you not understand what an example is????? What about any other tool that doesn’t have a -o option?

    rewrite is awesome and it’s going in my default install.

  3. @dangeroushobo: obviously workarounds for individual commands exist. There’s one in the post itself (redirect and move).

    Here’s an example that doesn’t work the way you say it should:

    mqudsi@ZBook ~> shuf -n 2048 /usr/share/dict/words > words.txt
    mqudsi@ZBook ~> wc words.txt
     2048  2048 19296 words.txt
    mqudsi@ZBook ~> uniq --help 2>&1 | head -n1
    Usage: uniq [OPTION]... [INPUT [OUTPUT]]
    mqudsi@ZBook ~> uniq words.txt words.txt
    mqudsi@ZBook ~> wc words.txt
    0 0 0 words.txt
    

    The point of rewrite is that it standardizes everything. You don’t need to read the man page to see what option to use to specify the output file. And even when you know what option to specify for writing to an output file, as we’ve just seen, there is no guarantee that it’ll even work when both the input and output files are the same.

    mqudsi@ZBook ~> shuf -n 2048 /usr/share/dict/words > words.txt
    mqudsi@ZBook ~> wc words.txt
     2048  2048 19296 words.txt
    mqudsi@ZBook ~> uniq words.txt | rewrite words.txt
    mqudsi@ZBook ~> wc words.txt
     2048  2048 19296 words.txt
    
