
I'm trying to convert a PDF from version 1.3 to version 1.5 or above. The challenge is that I don't want to change only the header version, as almost all forum posts describe. I want to convert the WHOLE file to the new version.

I read this topic: convert PDF to an older version from a servlet?, but the new version of iText, iText 7, does not support PdfStamper, so I skipped it.

I think I need to create a temporary file, write the PDF to it, replace the original with the temporary file, and delete the temporary file. But how can I do that in Java?

This code only converts the HEADER version! I use iText version 7.

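import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.PdfVersion;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.kernel.pdf.WriterProperties;

// Request PDF 1.7 as the version of the output file.
WriterProperties wp = new WriterProperties();
wp.setPdfVersion(PdfVersion.PDF_1_7);
// Open the source with a reader and a writer, then close to write the destination.
PdfDocument pdfDoc = new PdfDocument(new PdfReader("source"), new PdfWriter("destination", wp));
pdfDoc.close();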

Any suggestions?
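
The temporary-file round trip described above can be sketched with `java.nio.file` alone. This is a minimal sketch, not a definitive implementation: the class name `ReplaceViaTempFile`, the `Converter` callback, and the `rewriteInPlace` helper are hypothetical names introduced for illustration. With iText 7, the callback would presumably wrap the `PdfDocument(reader, writer)` code from the question, with the temporary path as the destination.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ReplaceViaTempFile {

    /** Hypothetical callback: reads {@code source} and writes the converted result to {@code target}. */
    interface Converter {
        void convert(Path source, Path target) throws IOException;
    }

    /** Writes the converted file to a temporary file, then replaces the original with it. */
    static void rewriteInPlace(Path original, Converter converter) throws IOException {
        // Create the temporary file next to the original so the final move
        // stays on the same file system.
        Path dir = original.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "pdf-", ".tmp");
        try {
            converter.convert(original, tmp);
            Files.move(tmp, original, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Cleans up if the conversion failed; does nothing after a successful move.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("demo");
        Path file = dir.resolve("sample.bin");
        Files.writeString(file, "old content");

        // Stand-in conversion for the demo: it just writes different content.
        rewriteInPlace(file, (source, target) -> Files.writeString(target, "new content"));

        System.out.println(Files.readString(file)); // prints "new content"
    }
}
```

Writing to a separate file first means a failed conversion leaves the original untouched.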

[Screenshot: the PDF (version 1.3) as displayed in Firefox]

Here you can download a sample PDF: https://wetransfer.com/downloads/ce2d2f41ac29c36baa2ac895ebc0473c20210922065257/5889b2

Blaz Cus
  • AFAIK PDF is backwards compatible so a valid 1.3 file with a header updated to 1.5 is a valid 1.5 file. Why do you want to re-write the whole file? This sounds like an [XY problem](https://xyproblem.info). – Joachim Sauer Sep 21 '21 at 09:16
  • *"This code only converts the HEADER version! I use iText version 7."* - Just like @Joachim said, PDF is mostly backward compatible. Even without a change of the header, a valid 1.3 PDF file usually is also already a valid 1.4, 1.5, 1.6, and 1.7 PDF file. – mkl Sep 21 '21 at 10:21
  • Thanks, guys! But if you want to open a PDF 1.3 file in Firefox, you cannot! We have some clients who use Firefox to open PDFs. Firefox only supports 1.5 and above :( – Blaz Cus Sep 21 '21 at 10:32
  • Please share a PDF 1.3 which Firefox does not support. We can try and analyze it to find out what the error in FF is. – mkl Sep 21 '21 at 10:45

1 Answer


There is no need to convert PDF version 1.3 to PDF version 1.5 or above, as PDF is designed to be backward compatible. Thus, every PDF 1.3 document is already a PDF 1.4 document as well. And a PDF 1.5 document. And a PDF 1.6 document. ...
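
To see which version a file claims, one can read its header: a PDF file starts with a line of the form `%PDF-1.3`. The following is a minimal sketch in plain Java; the class name `PdfHeaderVersion` and the 1024-byte search window are choices made for this illustration. It only looks at the header; since PDF 1.4 the document catalog may additionally carry a `Version` entry, which this sketch does not inspect.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PdfHeaderVersion {

    // Matches the header marker, e.g. "%PDF-1.3", and captures the version number.
    private static final Pattern HEADER = Pattern.compile("%PDF-(\\d\\.\\d)");

    /** Returns the version declared in the file header, e.g. "1.3", or null if no header is found. */
    static String headerVersion(Path pdf) throws IOException {
        byte[] head = new byte[1024];
        int read;
        try (InputStream in = Files.newInputStream(pdf)) {
            read = in.readNBytes(head, 0, head.length);
        }
        // ISO-8859-1 maps every byte to one char, so binary data cannot break decoding.
        String text = new String(head, 0, read, StandardCharsets.ISO_8859_1);
        Matcher matcher = HEADER.matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) throws IOException {
        // Demo input: a file containing only a header line, not a complete PDF.
        Path sample = Files.createTempFile("header-demo", ".pdf");
        Files.write(sample, "%PDF-1.3\n".getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(headerVersion(sample)); // prints 1.3
        Files.delete(sample);
    }
}
```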

In a comment you explained why you want the version change nonetheless:

*"But if you want to open a PDF 1.3 file in Firefox, you cannot! We have some clients who use Firefox to open PDFs. Firefox only supports 1.5 and above"*

In light of the compatibility discussed above, that does not make sense. But sometimes programs behave in nonsensical ways, so I tested this.

The result: The Firefox 87.0 I have installed here accepts PDF 1.3 and PDF 1.4 files I found among my documents without any issue!

Unfortunately I don't have any PDF 1.2 (or earlier) files around here, so I cannot check support of such files.

Thus, I'm afraid you'll have to go back to analyzing the issue your customers have; it is not as simple as "Firefox only supports 1.5 and above".

(Some ideas: Maybe your PDF 1.3 files actually are broken and Firefox fails to open them because of that; they may be broken already on your side or they may get broken during transfer to your clients. Or maybe your clients have some older Firefox version with some bugs in its PDF viewer.)

A Fix For The Actual Problem

In the comments here, the OP provided example files. Analysis of those files showed that the actual problem is that Firefox cannot properly determine the built-in encoding of the embedded fonts.

To help Firefox in this regard, we can provide an explicit base encoding, so Firefox does not need the built-in encoding.

As you used iText 7 in your question, here is a proof of concept that works with your example PDF:

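import java.util.Map.Entry;

import com.itextpdf.kernel.pdf.PdfDictionary;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfIndirectReference;
import com.itextpdf.kernel.pdf.PdfName;
import com.itextpdf.kernel.pdf.PdfObject;
import com.itextpdf.kernel.pdf.PdfPage;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.PdfResources;
import com.itextpdf.kernel.pdf.PdfWriter;

try (   PdfReader pdfReader = new PdfReader("1100-SD-9000455596.pdf");
        PdfWriter pdfWriter = new PdfWriter("1100-SD-9000455596-Fixed.pdf");
        PdfDocument pdfDocument = new PdfDocument(pdfReader, pdfWriter) ) {
    for (int page = 1; page <= pdfDocument.getNumberOfPages(); page++) {
        PdfPage pdfPage = pdfDocument.getPage(page);
        PdfResources pdfResources = pdfPage.getResources();
        PdfDictionary fonts = pdfResources.getResource(PdfName.Font);
        if (fonts == null) {
            continue; // this page has no font resources
        }
        for (Entry<PdfName, PdfObject> fontEntry : fonts.entrySet()) {
            // Resolve an indirect reference to the actual font object.
            PdfObject fontObject = fontEntry.getValue();
            if (fontObject != null && fontObject.getType() == PdfObject.INDIRECT_REFERENCE) {
                fontObject = ((PdfIndirectReference)fontObject).getRefersTo(true);
            }
            if (fontObject instanceof PdfDictionary) {
                PdfDictionary fontDictionary = (PdfDictionary) fontObject;
                PdfDictionary encodingDictionary = fontDictionary.getAsDictionary(PdfName.Encoding);
                // Only touch encodings that have Differences but no explicit base encoding.
                if (encodingDictionary != null
                        && encodingDictionary.getAsName(PdfName.BaseEncoding) == null
                        && encodingDictionary.getAsArray(PdfName.Differences) != null) {
                    encodingDictionary.put(PdfName.BaseEncoding, PdfName.WinAnsiEncoding);
                }
            }
        }
    }
}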

(FixForFirefox test testFix1100_SD_9000455596)
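
Why a missing base encoding matters can be illustrated without iText. In a PDF encoding dictionary, the **Differences** array lists a character code followed by glyph names for that code and the codes after it; any code not listed falls back to the base encoding, or to the font program's built-in encoding when no **BaseEncoding** is given. The sketch below models that lookup in plain Java. The class `DifferencesDemo`, its methods, and the tiny one-entry base table are simplifications invented for this illustration, not part of iText or of a real encoding table.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DifferencesDemo {

    /**
     * Expands a Differences-style list: an Integer sets the current code, and
     * each following String names the glyph for that code, which then increments.
     */
    static Map<Integer, String> expandDifferences(List<Object> differences) {
        Map<Integer, String> byCode = new HashMap<>();
        int code = 0;
        for (Object item : differences) {
            if (item instanceof Integer) {
                code = (Integer) item;
            } else {
                byCode.put(code++, (String) item);
            }
        }
        return byCode;
    }

    /** Resolves a code: explicit entries first, then the base encoding (if any), else the built-in encoding. */
    static String resolve(int code, Map<Integer, String> explicit, Map<Integer, String> base) {
        String name = explicit.get(code);
        if (name != null) {
            return name;
        }
        if (base != null && base.containsKey(code)) {
            return base.get(code);
        }
        // The viewer has to read the encoding out of the embedded font program.
        return "(font built-in encoding)";
    }

    public static void main(String[] args) {
        Map<Integer, String> explicit =
                expandDifferences(List.<Object>of(32, "space", "exclam", 65, "A"));
        Map<Integer, String> base = Map.of(66, "B"); // stand-in for a real base encoding table

        System.out.println(resolve(33, explicit, null)); // prints exclam
        System.out.println(resolve(66, explicit, null)); // prints (font built-in encoding)
        System.out.println(resolve(66, explicit, base)); // prints B
    }
}
```

With an explicit base encoding, code 66 no longer depends on the embedded font program, which is the effect the proof of concept above aims for.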

mkl
  • Thank you, mkl, for the answer! Well, the problem is somehow in the version. I'm sure because if I open the PDF with Adobe Reader (or any open-source PDF reader), it opens normally. But when I open it in Firefox, the PDF opens but shows no text. When I open the PDF in some PDF editor and save it, I can open it normally in Firefox. But after saving, a new version is set (from 1.3 to 1.6). – Blaz Cus Sep 22 '21 at 05:30
  • @BlazCus That doesn't necessarily indicate that the version is the culprit. Another possibility is that saving the PDF changes some broken or nonstandard data — which apparently Acrobat can handle — into something more correct or standard that Firefox can open properly. – M. Justin Sep 22 '21 at 05:39
  • @BlazCus Indeed, most PDF viewers internally fix a number of errors in PDFs without telling you while failing in the presence of other errors. And which errors are fixed and which are not depends on the viewer in question. For a better analysis please share your original PDF. – mkl Sep 22 '21 at 05:47
  • @mkl I added a PDF example. Try opening it in a PDF reader and then in Firefox. – Blaz Cus Sep 22 '21 at 06:59
  • @BlazCus Indeed the content looks like your screenshot in FF. Inspecting the PDF font objects in question it becomes clear that the glyphs displayed are those for which those font objects contain an explicit entry in their **Encoding** map. It looks like FF cannot determine the built-in encoding maps of the TrueType font programs embedded in your PDF. Extracting and opening those TTFs in FontForge makes a number of error messages appear. I'm not deep enough into font programs to tell whether there actually are relevant errors in the embedded fonts or whether this is a bug of Firefox. – mkl Sep 22 '21 at 09:54
  • @BlazCus Could you also share a copy of the same PDF after re-saving it and so making it supported by Firefox? (I added a note and saved using Acrobat Reader; the result has header version 1.6 but the file still looks broken in FF; apparently your saving worked differently.) – mkl Sep 22 '21 at 09:55
  • @mkl Here is the converted PDF version: https://wetransfer.com/downloads/a5e46c0a0214ae85df8fcc34af63b02a20210922121417/8515db – Blaz Cus Sep 22 '21 at 12:14
  • @BlazCus That editor completely replaced all fonts by different ones (ArialNarrowSAP variants in the original, ArialMT variants in the edit) which in particular were used differently (simple TrueType font with problematic **Encoding** in the original, composite font with **Encoding Identity-H** without problem). Just open the two files in two tabs of Acrobat Reader and switch back and forth - you'll see that everything is moving and the fonts are different. Thus, that is no "PDF 1.3 to PDF 1.7" conversion but a completely different document merely looking similar. – mkl Sep 22 '21 at 13:00
  • @mkl Thank you! The original document is created by SAP ERP. I'm sorry that I cannot upload the whole file as it is in the original because of the data. But if I open the original file with LibreOffice and export it, I can open it in Firefox. I don't know exactly what the editor does with the file, but the result is a PDF that is readable in Firefox.... – Blaz Cus Sep 22 '21 at 13:14
  • @BlazCus All told, therefore, your problem is completely unrelated to the claimed PDF version of the document. Instead it is either an issue with a broken font structure in the PDF or a bug of Firefox. Searching around one soon finds many reports of issues of Firefox with certain fonts, so I would assume that this is simply the latter, a Firefox bug. – mkl Sep 22 '21 at 13:23
  • @BlazCus As an aside, the way you redacted header and footer of the PDF is very inefficient. It is trivial to unhide them. – mkl Sep 22 '21 at 13:28
  • @mkl It sounds like you mean "ineffective", not "inefficient"? – M. Justin Sep 22 '21 at 14:40
  • Well, if it's not effective, it's in particular not efficient.. ;) But essentially, yeah, it's ineffective. – mkl Sep 22 '21 at 14:48
  • @mkl Thank you again for your answer! I will try to change the font and see what happens. :) – Blaz Cus Sep 23 '21 at 05:31
  • @mkl your code WORKS for me!! Thank you!! – Blaz Cus Sep 23 '21 at 07:26
  • @BlazCus Great! In that case please mark the answer as accepted by clicking the check mark at its upper left. – mkl Sep 23 '21 at 07:45