Here is a question to consider during the current Grand Prix competition.
I wanted to share an observation about using PDFs with LangChain.
When extracting the text from a PDF, I noticed an artifact: spurious gaps appear inside some of the extracted words.
For example (highlighted in red):