Dealing with PDFs

As a student, I frequently have assigned reading. I've found that I retain what I'm reading a lot better when I listen to it, or ideally, read and listen at the same time. Fortunately, with optical character recognition (OCR) and text-to-speech (TTS), that's not too hard. Let me walk you through my admittedly convoluted process.

VFlat Scan

https://play.google.com/store/apps/details?id=com.voyagerx.scanner&hl=en-US

This is a closed-source, freemium scanner app that's frankly kind of a nuisance to use, but it produces the best scans of any app I've tried, so I put up with it.

I scan my books into VFlat (please don't sue me, they're for my own personal use) and export them as multiple photos to get around the PDF export limitations.

KDE Connect

I push the pictures from my phone to my computer using KDE Connect, but any file sharing software will do the trick.

Conversion Script

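function jpgtopdf
    # Combine every JPG in the current directory into one PDF
    convert *.jpg "$argv[1]"
    # Add a searchable text layer, overwriting the PDF in place
    ocrmypdf "$argv[1]" "$argv[1]"
    # Send the finished PDF to my phone
    pushfile "$argv[1]"
    # Clean up the source images
    rm *.jpg
end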

function pushfile
    ntfy pub --file "$argv[1]" my.ntfy.instance/topic
end

Warning: this will delete all jpg files in the folder it's run in. If you aren't familiar with the Linux `rm` command, you probably shouldn't use this.

That's where these little fish functions come in. I stick all of the images for a particular reading into a directory and run jpgtopdf with the name of the PDF I want. It uses ImageMagick to combine the pictures into a PDF and ocrmypdf to recognize the text, sends the result to me with ntfy, and then deletes the original images.
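
If the bare `rm` makes you nervous, here's a rough sketch of a more cautious variant. It's not what I actually run; the name `imgtopdf_safe` is made up for illustration, and it assumes the `pushfile` function from above is already defined. The idea is just to stop at the first step that fails and to delete only the images that actually went into the PDF.

```fish
# Hypothetical, more defensive take on jpgtopdf (illustrative name, not my real function).
# Assumes pushfile (defined earlier in this post), ImageMagick's convert, and ocrmypdf exist.
function imgtopdf_safe
    # Require exactly one argument: the name of the output PDF.
    if test (count $argv) -ne 1
        echo "usage: imgtopdf_safe output.pdf" >&2
        return 1
    end

    # Collect the images up front. With `set`, a glob that matches
    # nothing expands to an empty list instead of raising an error.
    set -l images *.jpg
    if test (count $images) -eq 0
        echo "no .jpg files in "(pwd) >&2
        return 1
    end

    # Run each step only if the previous one succeeded.
    convert $images "$argv[1]"; or return 1
    ocrmypdf "$argv[1]" "$argv[1]"; or return 1
    pushfile "$argv[1]"; or return 1

    # Delete only the files that were combined into the PDF.
    rm -- $images
end
```

You'd call it the same way, for example `imgtopdf_safe chapter3.pdf`. If OCR or the upload fails, the images stay put so you can just try again.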

T2S

https://play.google.com/store/apps/details?id=hesoft.T2S&hl=en-US

Then, I use T2S to read those PDFs aloud using the free TTS engines that come with my phone. The voices are a bit robotic, but unlike Speechify or ElevenLabs, it's free, so it's good enough in my book.

Honorable Mentions

Paperless-ngx

I've been working on integrating Paperless into my workflow, since it handles OCR and managing PDFs all in one place, but I like my current system, and haven't figured out the best way to fit Paperless in.

It also doesn't help that my laptop is a lot beefier than the old computer I'm using as a server and runs OCR way faster.

PDF Arranger

This thing is a Swiss Army knife. Highly recommended.