I am using Ghostscript, through the Ghostscript.NET wrapper library, to convert PDF files to text. The conversion worked when I wrote the text to a file ("-sOutputFile=" + outputPipeHandle), but I'd rather write it to a stream ("-o" + outputPipeHandle). Here's my attempt in C#:

            try
            {
                Debug.WriteLine("Installed " + _lastInstalledVersion.LicenseType.ToString() + " Ghostscript " + _lastInstalledVersion.Version.ToString());
                GhostscriptPipedOutput gsPipedOutput = new GhostscriptPipedOutput();
                
                // pipe handle format: %handle%hexvalue
                string outputPipeHandle = "%handle%" + int.Parse(gsPipedOutput.ClientHandle).ToString("X2");

                using (GhostscriptProcessor processor = new GhostscriptProcessor(_lastInstalledVersion, true))
                {
                    processor.Processing += new GhostscriptProcessorProcessingEventHandler(ghostscript_Processing);

                    List<string> switches = new List<string>();
                    switches.Add("-empty");
                    switches.Add("-dQUIET");
                    switches.Add("-dSAFER");
                    switches.Add("-dBATCH");
                    switches.Add("-dNOPAUSE");
                    switches.Add("-dNOPROMPT");
                    if (this.Context.IsPreview)
                    {
                        switches.Add("-dLastPage=5");
                    }
                    switches.Add("-dTextFormat=3");
                    switches.Add("-sDEVICE=txtwrite");
                    switches.Add("-o" + outputPipeHandle);
                    switches.Add("-q");
                    switches.Add("-f");
                    switches.Add(Context.Path);
                        //"-sOutputFile=" + outputPipeHandle

                    try
                    {
                        //processor.Process(switches.ToArray());
                        // THROWS AN ERROR HERE
                        processor.StartProcessing(switches.ToArray(), new ConsoleStdIO(true, true, true));

                        // HERE IS WHERE I HOPE TO GRAB THE TEXT FROM THE PIPED OUTPUT
                        byte[] rawDocumentData = gsPipedOutput.Data;
                        returnVal = System.Text.Encoding.Default.GetString(rawDocumentData);
                        Debug.WriteLine(returnVal);
                    }
                    catch (Exception ex)
                    {
                        returnVal = ex.Message;
                    }
                    finally
                    {
                        gsPipedOutput.Dispose();
                        gsPipedOutput = null;
                    }
                }
            }
            catch (Exception ex)
            {
                Debug.WriteLine(ex.Message);
                returnVal = ex.Message;
            }

            return returnVal;
        }
    }

My errors are as follows. I'm less concerned with the warnings than with the fatal error, "Could not open the file 958", since I expected the output to stay in memory.

   **** Warning: can't process font stream, loading font by the name.
   **** Error reading a content stream. The page may be incomplete.
   **** File did not complete the page properly and may be damaged.
   **** Warning: File has unbalanced q/Q operators (too many q's)
The thread 0x59c0 has exited with code 0 (0x0).
   **** Warning: can't process font stream, loading font by the name.
   **** Error reading a content stream. The page may be incomplete.
   **** File did not complete the page properly and may be damaged.
GPL Ghostscript 9.06: **** Could not open the file 958 .
Error: /ioerror in --showpage--
Operand stack:
   1   true
Execution stack:
   %interp_exit   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   --nostringval--   false   1   %stopped_push   1926   1   3   %oparray_pop   1925   1   3   %oparray_pop   1909   1   3   %oparray_pop   --nostringval--   --nostringval--   5   1   5   --nostringval--   %for_pos_int_continue   --nostringval--   --nostringval--   1809   0   10   %oparray_pop   --nostringval--   --nostringval--
Dictionary stack:
   --dict:1173/1684(ro)(G)--   --dict:1/20(G)--   --dict:82/200(L)--   --dict:82/200(L)--   --dict:109/127(ro)(G)--   --dict:293/300(ro)(G)--   --dict:25/31(L)--   --dict:6/8(L)--   --dict:21/40(L)--   --dict:7/15(L)--
Current allocation mode is local
Last OS error: Bad file descriptor
GPL Ghostscript 9.06: Unrecoverable error, exit code 1
Exception thrown: 'Ghostscript.NET.GhostscriptAPICallException' in Ghostscript.NET.dll

Update

For additional reference, I'm essentially using the PipedOutputSample.cs code from the Ghostscript.NET samples on their website.

The only difference I can see is that I changed the DEVICE to 'txtwrite' instead of 'pdfwrite'.

Update 2

After reading the documentation more carefully, I found this note:

Note that on MS Windows systems, the % character also has a special meaning for the command processor (shell), so you will have to double it.

As I happen to be on Windows, I changed the line that builds outputPipeHandle to:

string outputPipeHandle = "%%handle%%" + int.Parse(gsPipedOutput.ClientHandle).ToString("X2");

With that change I no longer get the error above about not being able to open the file. Instead, my breakpoint on this line is reached:

byte[] rawDocumentData = gsPipedOutput.Data;

However, when I try to copy the value of that Data property into my variable, execution breaks out of my thread entirely, skipping my catch and even my finally block. I don't see any message in the output console explaining why. Could it be some kind of Ghostscript pointer error?

I'm still at a loss. I don't even know what %handle% is supposed to signify or why I need to use it. Without that knowledge, I'm working blind and just trying things in the hope that something works.
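To make clear what string my code actually passes to Ghostscript, here is a minimal sketch of the handle formatting on its own. HandleNameSketch and FromDecimal are hypothetical names used only for this illustration and are not part of Ghostscript.NET. The input "958" is an assumption: I am guessing that the number in the error message is the decimal value of the client handle.

```csharp
using System;
using System.Globalization;

static class HandleNameSketch
{
    // Hypothetical helper, not part of Ghostscript.NET: parses a decimal
    // handle string and appends it in upper-case hex to the "%handle%" prefix,
    // mirroring the expression used in my code above.
    public static string FromDecimal(string decimalHandle)
    {
        int handle = int.Parse(decimalHandle, CultureInfo.InvariantCulture);
        return "%handle%" + handle.ToString("X2", CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        // Assumed input: 958 is taken from the error message above.
        Console.WriteLine(FromDecimal("958")); // prints %handle%3BE
    }
}
```

So with the original code the switch would have been -o%handle%3BE, and with the doubled percent signs it becomes -o%%handle%%3BE, assuming the handle value is the same on both runs.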

Thank you.
