{"id":3894,"date":"2017-02-23T13:42:54","date_gmt":"2017-02-23T19:42:54","guid":{"rendered":"http:\/\/neosmart.net\/blog\/?p=3894"},"modified":"2017-02-24T06:43:18","modified_gmt":"2017-02-24T12:43:18","slug":"rewrite-a-rust-powered-in-place-file-rewrite-utility","status":"publish","type":"post","link":"https:\/\/neosmart.net\/blog\/rewrite-a-rust-powered-in-place-file-rewrite-utility\/","title":{"rendered":"rewrite: a rust-powered, in-place file rewrite utility"},"content":{"rendered":"<p><a href=\"https:\/\/github.com\/neosmart\/rewrite\" rel=\"nofollow\"><img loading=\"lazy\" decoding=\"async\" class=\"alignright wp-image-3898 size-thumbnail colorbox-3894\" src=\"https:\/\/neosmart.net\/blog\/wp-content\/uploads\/forkme-right-150x150.png\" width=\"150\" height=\"150\" srcset=\"https:\/\/neosmart.net\/blog\/wp-content\/uploads\/forkme-right-150x150.png 150w, https:\/\/neosmart.net\/blog\/wp-content\/uploads\/forkme-right-300x300.png 300w, https:\/\/neosmart.net\/blog\/wp-content\/uploads\/forkme-right.png 512w\" sizes=\"auto, (max-width: 150px) 100vw, 150px\" \/><\/a>Let&#8217;s say you&#8217;ve got a terminal open and you want to sort the contents of a file before you email it to a friend. The file can contain anything and it could be of any length, it doesn&#8217;t matter. What do you do?<\/p>\n<p>The obvious answer is to use <code>sort<\/code>. Sorting the file is as easy as <code>sort myfile<\/code> &#8211; except it doesn&#8217;t actually sort the <em>file<\/em>, it sorts the <em>contents<\/em> of the file and dumps them to the command line (via <code>stdout<\/code>). So how do you sort the file &#8220;in-place,&#8221; so to speak? 
Again, the obvious answer would be <code>sort myfile &gt; myfile<\/code>,<sup id=\"rf1-3894\"><a href=\"#fn1-3894\" title=\"Don&rsquo;t do this!\" rel=\"footnote\">1<\/a><\/sup> redirecting the output of the sort command back to the file you want to ultimately send sorted.<\/p>\n<p><!--more--><\/p>\n<p>The only problem is that if you ran the command above, you&#8217;d lose your data entirely. <del>You see, <code>sort<\/code> is too smart to read the entire file into memory, sort it, then spit it out on <code>stdout<\/code>, which is how it can manage sorting huge amounts of data without completely trashing your system memory. It keeps a handle to the input file open and reads even as it writes to <code>stdout<\/code>.<\/del><sup id=\"rf2-3894\"><a href=\"#fn2-3894\" title=\"While not true for sort, there are however complex commands where terminating with | tee infile.txt &gt; \/dev\/null results in loss of data due to the input file being read at the same time the output file is being written.\" rel=\"footnote\">2<\/a><\/sup> You see, the use of <code>&gt;<\/code> to redirect instructs the shell to replace the file following it with the output of the previous command &#8211; truncating it entirely before <code>sort<\/code> ever gets a chance to read it. If you <em>did<\/em> try to sort your file that way, you&#8217;d regret it dearly when it came time to email that file and you realized that <strong>it&#8217;s now completely blank<\/strong> and all your work has been lost.<\/p>\n<p>The &#8220;right&#8221; way of doing this would be to read from one file and write to another. You could do something like this:<\/p>\n<pre># randomly select 2048 words from the dictionary\nshuf -n 2048 \/usr\/share\/dict\/words &gt; words.txt\nsort words.txt &gt; sorted.txt\nmv sorted.txt words.txt<\/pre>\n<p>And that would do what you want. 
Except it&#8217;s a bit of a mouthful, and it gets a lot harder when it&#8217;s not just a one-off command that needs to read and write from\/to the same file but rather part of a larger workflow or bash script. Enter <code>rewrite<\/code>.<\/p>\n<p><code>rewrite<\/code> is a dead-simple no-brainer of a command line utility that simply buffers content from <code>stdin<\/code> and only when the upstream command or commands have finished does it write to the output file. Sorting a file in-place becomes as simple as<\/p>\n<pre>sort words.txt | rewrite words.txt<\/pre>\n<p>And that&#8217;s it. You don&#8217;t have to worry about creating or deleting temporary files, you don&#8217;t need to generate random file names if you&#8217;re batch processing a thousand files as part of a script, etc. <code>rewrite<\/code> takes care of the entire thing for you, and it couldn&#8217;t be easier to use. And for those that care about such things, <code>rewrite<\/code> is written in Rust for safety and reliability combined with performance and out-of-the-box cross-platform support.<\/p>\n<p>Installing <code>rewrite<\/code> is as simple as<\/p>\n<pre>cargo install rewrite<\/pre>\n<p>(Assuming you have Rust&#8217;s <code>cargo<\/code> utility installed.) If you&#8217;re on Windows or Mac, pre-built binaries are also provided below. 
Of course, <code>rewrite<\/code> is fully open-source (MIT licensed, no less) and <a href=\"https:\/\/github.com\/neosmart\/rewrite\" rel=\"nofollow\">available on GitHub<\/a> for your forking pleasure.<\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/neosmart\/rewrite\/releases\/download\/0.1\/rewrite.0.1.for.Windows.x64.zip\" rel=\"nofollow\">rewrite for Windows x64<\/a> (authenticode signed)<\/li>\n<li><a href=\"https:\/\/github.com\/neosmart\/rewrite\/releases\/download\/0.1\/rewrite.0.1.for.Windows.x86.zip\" rel=\"nofollow\">rewrite for Windows x86<\/a> (authenticode signed)<\/li>\n<li><a href=\"https:\/\/github.com\/neosmart\/rewrite\/releases\/download\/0.1\/rewrite.0.1.for.OS.X.zip\" rel=\"nofollow\">rewrite for OS X<\/a> (unsigned binary)<\/li>\n<\/ul>\n<hr class=\"footnotes\"><ol class=\"footnotes\"><li id=\"fn1-3894\"><p>Don&#8217;t do this!&nbsp;<a href=\"#rf1-3894\" class=\"backlink\" title=\"Jump back to footnote 1 in the text.\">&#8617;<\/a><\/p><\/li><li id=\"fn2-3894\"><p>While not true for <code>sort<\/code>, there are however complex commands where terminating with <code>| tee infile.txt &gt; \/dev\/null<\/code> results in loss of data due to the input file being read at the same time the output file is being written.&nbsp;<a href=\"#rf2-3894\" class=\"backlink\" title=\"Jump back to footnote 2 in the text.\">&#8617;<\/a><\/p><\/li><\/ol>","protected":false},"excerpt":{"rendered":"<p>Let&#8217;s say you&#8217;ve got a terminal open and you want to sort the contents of a file before you email it to a friend. The file can contain anything and it could be of any length, it doesn&#8217;t matter. 
What &hellip; <a href=\"https:\/\/neosmart.net\/blog\/rewrite-a-rust-powered-in-place-file-rewrite-utility\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":true,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[448,52,935,936,813],"class_list":["post-3894","post","type-post","status-publish","format-standard","hentry","category-software","tag-freeware","tag-open-source","tag-rewrite","tag-rust","tag-unix"],"aioseo_notices":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4xDa-10O","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/posts\/3894","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/comments?post=3894"}],"version-history":[{"count":14,"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/posts\/3894\/revisions"}],"predecessor-version":[{"id":3910,"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/posts\/3894\/revisions\/3910"}],"wp:attachment":[{"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/media?parent=3894"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/categories?post=3894"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/tags?po
st=3894"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}