EFTA00400459 has been cracked, DBC12.pdf liberated

Three days ago, I wrote a post about stumbling across uncensored/unredacted plain-text content in the DOJ Epstein archives that could theoretically be reconstructed and reversed to reveal the uncensored originals of some of the files. Last night, I succeeded in doing just that, and was able to extract the DBC12 PDF file from the Epstein archive EFTA00400459 document. Don’t get too excited: as expected, the exfiltrated document (available for download here) turned out to be nothing too juicy and is “just” another exemplar of the cronyism and the incestuous circles of financial funding that Epstein and his colleagues engaged in.

There is a lot I could say in this post about the different approaches I took and the interesting rabbit holes I went down in trying to extract valid base64 data from the images included in the PDF; however, I am somewhat exhausted from all this and don’t have the energy to go into it all in as much detail as it deserves, so I’ll just post some bullet points with the main takeaways:

Continue reading

Recreating uncensored Epstein PDFs from raw encoded attachments

Heads-up: An update to this article has been posted.

There have been a lot of complaints about both the competency and the logic behind the latest Epstein archive release by the DoJ: from censoring the names of co-conspirators to censoring pictures of random women in a way that makes individuals look guiltier than they really are, forgetting to redact credentials that made it possible for all of Reddit to log into Epstein’s account and trample over all the evidence, and the complete ineptitude that resulted in most of the latest batch being corrupted thanks to incorrectly converted Quoted-Printable encoding artifacts, it’s safe to say that Pam Bondi’s DoJ did not put its best and brightest on this (admittedly gargantuan) undertaking. But the most damning evidence has all been thoroughly redacted… hasn’t it? Well, maybe not.

I was thinking of writing an article on the mangled quoted-printable encoding the day this latest dump came out in response to all the misinformed musings and conjectures that were littering social media (and my dilly-dallying cost me, as someone beat me to the punch), and spent some time searching through the latest archives looking  for some SMTP headers that I could use in the article when I came across a curious artifact: not only were the emails badly transcoded into plain text, but also some binary attachments were actually included in the dumps in their over-the-wire Content-Transfer-Encoding: base64 format, and the unlucky intern that was assigned to the documents in question didn’t realize the significance of what they were looking at and didn’t see the point in censoring seemingly meaningless page after page of hex content!

Continue reading