Quick help would be highly appreciated. I am extracting the text from a TIFF image with Tesseract OCR. The output I am looking for is hOCR (HTML). The content of the output is perfect, but when I open the file in Notepad the formatting looks very disorganized. When I open the same file in Notepad++, the formatting is clean.
The Windows command line I use is given below:
tesseract "Path\image.tiff" "Path\output" hocr
I need your help getting the same organized hOCR format in Notepad, as shown in the enclosed example.



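The problem is not in Tesseract but in Notepad. Most likely the file is written with Unix-style line endings (LF only), which older versions of Notepad do not show as line breaks, so all the text runs together. Use a normal text editor such as Notepad++ or ConTEXT.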
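
If the file really has to be readable in Notepad, one option is to convert its line endings to the Windows style (CRLF). The following is only a minimal sketch, assuming the messy display is caused by LF-only line endings; the helper names `to_crlf` and `convert_file` and the file names in the usage comment are hypothetical, not part of Tesseract.

```python
from pathlib import Path


def to_crlf(data: bytes) -> bytes:
    """Return data with every line ending converted to CRLF."""
    # Normalize to LF first so existing CRLF pairs are not doubled.
    normalized = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return normalized.replace(b"\n", b"\r\n")


def convert_file(src: str, dst: str) -> None:
    """Read src as raw bytes and write a CRLF copy to dst."""
    Path(dst).write_bytes(to_crlf(Path(src).read_bytes()))


# Example usage (hypothetical file names, adjust to your own output file):
# convert_file(r"Path\output.hocr", r"Path\output_crlf.hocr")
```

Working on raw bytes avoids guessing the file's text encoding. Writing to a separate file keeps the original Tesseract output untouched.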