Tesseract
See here for example
Get help with:
tesseract.exe --help-extra
Give the output name without an extension; .txt is automatically appended:
tesseract.exe -l eng image.png textfile
Do a whole folder of images (inside a batch file, write %%i instead of %i):
for %i in (*.png) do tesseract.exe -l eng "%~i" "%~ni"
To OCR a PDF, first render its pages to images with Ghostscript, then run Tesseract on those images:
gswin32c -q -dBATCH -dNOPAUSE -sDEVICE=jpeg -dFirstPage=1 -dLastPage=4 -sOutputFile=imagefile_%d.jpg -r300 pdffile.pdf
for %i in (*.jpg) do tesseract.exe -l eng "%~i" "%~ni"
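The same PDF workflow can be scripted instead of typed at the prompt. The following is only a minimal sketch in Python, assuming gswin32c and tesseract.exe are both on the PATH. The function names (ghostscript_command, tesseract_command, ocr_pdf) are made up for illustration and are not part of either tool. Unlike the loop above, it only picks up files named imagefile_*.jpg, so unrelated JPEGs in the folder are left alone.

```python
import subprocess
from pathlib import Path


def ghostscript_command(pdf, first_page, last_page, dpi=300, gs_exe="gswin32c"):
    """Build the Ghostscript call that renders the chosen PDF pages to JPEGs."""
    return [
        gs_exe, "-q", "-dBATCH", "-dNOPAUSE", "-sDEVICE=jpeg",
        f"-dFirstPage={first_page}", f"-dLastPage={last_page}",
        # %d is filled in by Ghostscript with the page counter; no doubling is
        # needed here because the arguments do not pass through cmd.exe.
        "-sOutputFile=imagefile_%d.jpg", f"-r{dpi}", str(pdf),
    ]


def tesseract_command(image, lang="eng", tesseract_exe="tesseract.exe"):
    """Build the Tesseract call; the output base mirrors %~ni (name, no extension)."""
    image = Path(image)
    return [tesseract_exe, "-l", lang, str(image), image.stem]


def ocr_pdf(pdf, first_page, last_page):
    """Render the pages, then OCR each rendered image into a .txt file."""
    subprocess.run(ghostscript_command(pdf, first_page, last_page), check=True)
    for image in sorted(Path(".").glob("imagefile_*.jpg")):
        subprocess.run(tesseract_command(image), check=True)
```

Calling ocr_pdf("pdffile.pdf", 1, 4) should run the same two steps as the commands above; check=True makes the script stop if either program reports an error.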