I'm working on a C# program that parses text boxes and other content in Microsoft Word documents, using the Microsoft.Office.Interop.Word library. When I filter the document's shapes by type to find the text boxes, I get a COMException.
The code that causes the problem is as follows:
    var textBoxContents = doc.Shapes.Cast<Word.Shape>()
        .Where(shape =>
        {
            try
            {
                return shape.Type == Microsoft.Office.Core.MsoShapeType.msoTextBox;
            }
            catch (System.Runtime.InteropServices.COMException)
            {
                // Skip this shape if an exception occurs while accessing its Type property
                return false;
            }
        })
        .Select(shape => shape.TextFrame.TextRange)
        .ToList();
When the line shape.Type == Microsoft.Office.Core.MsoShapeType.msoTextBox is executed, the following exception is thrown:
System.Runtime.InteropServices.COMException: 'Unspecified error (Exception from HRESULT: 0x80004005 (E_FAIL))'
I added the try-catch block to handle the error and skip the problematic shapes, but this did not seem to resolve the problem.
Has anyone else had this issue, and if so, how can I access the Shape.Type property correctly without running into this exception? Is there a solution or a better way to use the Microsoft.Office.Interop.Word library to extract text boxes from a Word document?
I would be very grateful for any assistance.