Mining Text from PDF Files, Part 1: PDF with Text
Intro

I wanted to find out how to mine text from PDF files with R. I'm experimenting with different kinds of PDF content, each of which will get its own blog post. This first post is about PDF files that contain only text. The second will be about extracting text from tables, and in the third I will extract text from a picture inside a PDF file. I'm assuming you're using RStudio as your IDE (Integrated Development Environment).