If you get an access error on Linux:

javaldx failed! Warning: failed to read path from javaldx LibreOffice 7.3 - Fatal Error: The application cannot be started. User installation could not be completed.
LibreOffice user installation could not be processed due to missing access rights. Please make sure that you have sufficient access rights for the following location and restart LibreOffice.

Add this to LibreOffice parameters:

set args($i(args)) = "-env:UserInstallation=file:///tmp/libreofficehome/"

where /tmp/libreofficehome is any empty folder InterSystems IRIS has write access to.
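For reference, here's roughly how the same parameter fits into a full headless conversion call. This is only a sketch in plain Python: the `soffice` binary name, the `build_soffice_args` and `convert_to_txt` helpers, and the sample paths are my assumptions, not part of the wrapper.

```python
import subprocess
from pathlib import Path


def build_soffice_args(source, outdir, profile_dir="/tmp/libreofficehome"):
    """Build a headless LibreOffice command line (hypothetical helper).

    profile_dir must be an absolute path the calling process can write to.
    """
    # -env:UserInstallation expects a file:// URI, not a plain path
    profile_uri = Path(profile_dir).as_uri() + "/"
    return [
        "soffice",  # assumption: LibreOffice launcher is on PATH under this name
        "-env:UserInstallation=" + profile_uri,
        "--headless",
        "--convert-to", "txt:Text",
        "--outdir", str(outdir),
        str(source),
    ]


def convert_to_txt(source, outdir, profile_dir="/tmp/libreofficehome"):
    """Run the conversion; raises CalledProcessError on a non-zero exit code."""
    Path(profile_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_soffice_args(source, outdir, profile_dir), check=True)
```

Calling `convert_to_txt("/tmp/in.docx", "/tmp/out")` should then write the text file into `/tmp/out`, provided the process can write to both folders.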

If there's a text layer, use LibreOffice to convert to txt (InterSystems IRIS wrapper). For OCR you'll need a third-party tool; for example, Tesseract can be easily used with Embedded Python.
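For the OCR route, a minimal sketch of calling the Tesseract command-line tool from Python could look like this. It assumes the `tesseract` binary is installed and on PATH, and that you already have an image file (a scanned PDF would have to be rasterized first, which is not shown). The `tesseract_command` and `ocr_image` names are hypothetical.

```python
import shutil
import subprocess


def tesseract_command(image_path, lang="eng"):
    """Command line that makes tesseract print recognized text to stdout."""
    # "stdout" as the output base tells tesseract to write to standard output
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Return the text recognized in image_path (hypothetical helper)."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract binary not found on PATH")
    result = subprocess.run(
        tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

The same function body can go into a `[ Language = python ]` class method if you want to call it from ObjectScript.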

UPD: Unfortunately, LibreOffice can't extract text from PDFs. Here's an Embedded Python solution:

Class User.PDF
{

/// zw ##class(User.PDF).GetText("/tmp/example.pdf", .text)
ClassMethod GetText(file, Output text) As %Status
{
  try {
    #dim sc As %Status = $$$OK
    kill text
    set dir = $system.Util.ManagerDirectory()_ "python"
    do ##class(%File).CreateDirectoryChain(dir)
    // pip3 install --target /data/db/mgr/python --ignore-requires-python typing==3.10.0.0
    try {
      set pypdf2 = $system.Python.Import("PyPDF2")
    } catch {
      set cmd = "pip3"
      set args($i(args)) = "install"
      set args($i(args)) = "--target"
      set args($i(args)) = dir
      set args($i(args)) = "PyPDF2==2.10.0"
      set args($i(args)) = "dataclasses"
      set args($i(args)) = "typing-extensions==3.10.0.1" 
      set args($i(args)) = "--upgrade"
      // $ZF(-100) returns the OS exit code, not a %Status, so keep it out of sc
      set rc = $ZF(-100,"", cmd, .args)
      set pypdf2 = $system.Python.Import("PyPDF2")
    }
    return:'$d(pypdf2) $$$ERROR($$$GeneralError, "Unable to load PyPDF2")
    kill pypdf2
    set text = ..GetTextPy(file)
  } catch ex {
    set sc = ex.AsStatus()
  }
  quit sc
}

ClassMethod GetTextPy(file) [ Language = python ]
{
  from PyPDF2 import PdfReader

  reader = PdfReader(file)
  text = ""
  for page in reader.pages:
    text += page.extract_text() + "\n"

  return text
}

}

Try to reload like this:

set importlib = ##class(%SYS.Python).Import("importlib")
do importlib.reload(helloWorld)

Also, this is not IRIS-specific behavior; you'll get the same results in any Python interpreter:

import helloWorld
helloWorld.helloWorld()
>'Hello world'
del helloWorld

# modify helloWorld.py in text editor

import helloWorld
helloWorld.helloWorld()
>'Hello world'
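The session above can be reproduced as a self-contained script. This is just a sketch: the `hello_demo` module and the `demo_reload` function are made-up names, and the module is written to a temporary folder so nothing on disk is touched.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Keep the demo free of stale .pyc files
sys.dont_write_bytecode = True


def demo_reload():
    """Return results before the edit, after re-import, and after reload."""
    with tempfile.TemporaryDirectory() as tmp:
        module_path = Path(tmp) / "hello_demo.py"
        module_path.write_text("def hello():\n    return 'Hello world'\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            import hello_demo
            before = hello_demo.hello()

            # "Modify the file in a text editor"
            module_path.write_text("def hello():\n    return 'Hello again, world'\n")

            # del only removes the local name; the module stays in sys.modules
            del hello_demo
            import hello_demo
            reimported = hello_demo.hello()

            # reload re-executes the source file
            importlib.reload(hello_demo)
            reloaded = hello_demo.hello()
            return before, reimported, reloaded
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("hello_demo", None)


if __name__ == "__main__":
    print(demo_reload())
```

Only the value taken after `importlib.reload` reflects the edited file; the plain re-import is served from `sys.modules`.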

Will it run in the same Windows process?

Yes.

Will there be any issues with multitasking (considering Python doesn't seem very good at this)?

The GIL still exists and applies. If you write async code, it is only executed while control flow is on the Python side of things. You can't spawn an async task in Python, go to InterSystems ObjectScript to do something else, and then come back to a completed Python task.
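In practice that means a Python method should finish its async work before it returns. A minimal sketch of that pattern, assuming a made-up `fetch` coroutine as the workload and a hypothetical `run_all` entry point:

```python
import asyncio


async def fetch(value):
    """Stand-in for real async work (hypothetical)."""
    await asyncio.sleep(0.01)
    return value * 2


def run_all(values):
    """Run all tasks to completion before handing control back to the caller."""
    async def main():
        # Tasks run concurrently, but only while this event loop is running
        return await asyncio.gather(*(fetch(v) for v in values))

    # asyncio.run blocks until main() is done, then closes the loop
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_all([1, 2, 3]))
```

The caller gets the finished results in one call; nothing is left running in the background once `run_all` returns.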

Also, is there a performance penalty to pay for running Embedded Python vs "using IRIS APIs from Python"?

IRIS APIs from Python (Native SDK/Native API) can be invoked either in-shared-memory or over TCP. TCP comes with a performance penalty.

Another question is what Python interpreter Embedded Python is using. Is it an InterSystems one or regular CPython?

CPython.

Version?

Use sys.version to check. Recently it was Python 3.9.5 on Windows and 3.8.10 on Linux.
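To check both the implementation and the version in one go, something like this works in any Python, including an Embedded Python method. The `interpreter_info` name is just for illustration.

```python
import platform
import sys


def interpreter_info():
    """Collect the interpreter implementation and version strings."""
    return {
        "implementation": platform.python_implementation(),  # e.g. "CPython"
        "version": platform.python_version(),
        "full": sys.version,
    }


if __name__ == "__main__":
    for key, value in interpreter_info().items():
        print(key, "=", value)
```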

Except that calling Python is about 10x slower, a

Not really; if anything, it's faster if you need to call it more than once:

 
Code

Results in:

 
Output