I have used FPDFImageObj_GetImageDataDecoded and FPDFImageObj_GetImageDataRaw, but both failed
I am using pdfium to get the data of PDF pages, but I can't get correct TIFF or JBIG2 data from the FPDFImageObj_GetImageDataDecoded API. Can anyone help me? Thank you very much.
There is a feature request discussing this: https://crbug.com/pdfium/1930 (disclaimer: I'm the reporter)
TL;DR: The functions you mention do provide the main data stream, but for some filters, complementary data would be needed to actually re-construct the image, and pdfium does not provide that data.
For CCITTFaxDecode, which the TIFF format can use: pdfium's public API does not expose the CCITT group, but this would be needed to re-construct the TIFF header, which the PDF format strips. I think the BlackIs1 info would also be needed; possibly more.
JBIG2Decode may optionally use a separate JBIG2Globals stream, which again pdfium does not provide. I had filed a separate bug about this: https://crbug.com/pdfium/1927. However, I guess the raw JBIG2 data might not be very useful except for re-insertion into a PDF. IIRC the way pikepdf handles JBIG2 extraction to files is to just decode the data and re-encode to some other format. From a programmatic POV that's not ideal, but I guess the context is that standalone JBIG2 isn't really supported by end-user apps.
Concerning FPDFImageObj_GetImageDataDecoded(), note that it does not fully decode images; it only applies "simple" filters (see https://crbug.com/pdfium/1203#c7), so the function name is a bit misleading.
For the plain pixel data, you can use FPDFImageObj_GetBitmap(), FPDFBitmap_GetBuffer() & co, but note that FPDF_BITMAP is limited in supported pixel formats and bit depth (e.g. no CMYK, B/W, >8bpc RGB(A), ...).
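To illustrate why the CCITT parameters matter, here is a rough sketch of what re-constructing a TIFF header around a raw CCITT stream could look like. It is not pdfium code and not how any particular library does it. The function name ccitt_tiff_header and its parameters are hypothetical, and the sketch assumes you already know the image width and height plus the K and BlackIs1 decode parameters from somewhere else (e.g. a separate PDF parser), since pdfium does not expose them. The mapping from BlackIs1 to the TIFF photometric tag is also an assumption that presumes a plain grayscale image with the default Decode array; you may need to flip it.

```python
import struct

def ccitt_tiff_header(width, height, data_len, k=0, black_is_1=False):
    """Build a minimal little-endian TIFF header for a single strip of raw
    CCITT data. Hypothetical helper; all inputs must come from outside pdfium."""
    SHORT, LONG = 3, 4  # TIFF field types

    # K < 0 means Group 4; K >= 0 means Group 3 (K > 0 is the 2-D variant).
    compression = 4 if k < 0 else 3
    # ASSUMPTION: BlackIs1=false -> WhiteIsZero (0), BlackIs1=true -> BlackIsZero (1).
    photometric = 1 if black_is_1 else 0

    entries = {
        256: (LONG, width),         # ImageWidth
        257: (LONG, height),        # ImageLength
        258: (SHORT, 1),            # BitsPerSample
        259: (SHORT, compression),  # Compression
        262: (SHORT, photometric),  # PhotometricInterpretation
        273: (LONG, 0),             # StripOffsets, patched below
        278: (LONG, height),        # RowsPerStrip: whole image in one strip
        279: (LONG, data_len),      # StripByteCounts
    }
    if compression == 3:
        # T4Options: bit 0 set means 2-D coding.
        entries[292] = (LONG, 1 if k > 0 else 0)

    # 8-byte file header + entry count + 12 bytes per entry + next-IFD offset.
    header_len = 8 + 2 + 12 * len(entries) + 4
    entries[273] = (LONG, header_len)  # image data starts right after the header

    out = struct.pack("<2sHL", b"II", 42, 8)  # byte order, magic, first IFD offset
    out += struct.pack("<H", len(entries))
    for tag in sorted(entries):               # IFD entries must be in ascending tag order
        field_type, value = entries[tag]
        out += struct.pack("<HHLL", tag, field_type, 1, value)
    out += struct.pack("<L", 0)               # no further IFDs
    return out
```

With the raw stream from FPDFImageObj_GetImageDataRaw() in a bytes object raw, the file content would then be ccitt_tiff_header(width, height, len(raw), k, black_is_1) + raw. Other decode parameters (e.g. EncodedByteAlign) are ignored here, so this may not work for every CCITT image.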