Today, I just wanted to convert a PDF into a PNG image. I know I have done it before, but I forgot how (I only need it once every six months). I hope you don't mind if I use my own blog so that I don't have to look for it again. First, it requires ImageMagick. The following script will do the trick:
import os
import PythonMagick

os.environ["MAGICK_HOME"] = r"path_to_ImageMagick"

file = "my.pdf"
to = file.replace(".pdf", ".png")  # same name, new extension

p = PythonMagick.Image()
p.density('300')  # resolution, set before reading the PDF
p.read(os.path.abspath(file))
p.write(os.path.abspath(to))

# the ImageMagick command line to do it
# cmd = "convert -density 300 -depth 8 -quality 85 {0} {1}".format(file, to)
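In case PythonMagick is not available, the command line quoted in the last comment can also be launched from Python. This is only a minimal sketch: the helper names `build_convert_command` and `convert_with_cli` are mine, and it assumes the `convert` executable from ImageMagick is on the PATH (ImageMagick 7 installs it as `magick`, and on Windows `convert` may resolve to an unrelated system tool).

```python
import subprocess


def build_convert_command(source, target, density=300, depth=8, quality=85):
    """Return the ImageMagick call as an argument list (no shell quoting needed)."""
    return [
        "convert",  # assumption: ImageMagick's convert is the one found on the PATH
        "-density", str(density),
        "-depth", str(depth),
        "-quality", str(quality),
        source,
        target,
    ]


def convert_with_cli(source, target):
    """Run the conversion; raises CalledProcessError if convert fails."""
    subprocess.run(build_convert_command(source, target), check=True)


# Usage (requires ImageMagick to be installed):
# convert_with_cli("my.pdf", "my.png")
```

Passing a list instead of a formatted string avoids trouble with file names that contain spaces.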
If you need to convert a specific page, you can build another PDF containing only that page with the module PyPDF2. Instructions are given in this blog post: PyPDF2: The New Fork of pyPdf. The conversion from PNG to PDF can be handled with the same code:
file = "my.png"
to = file.replace(".png", ".pdf")  # we switch the extension

p = PythonMagick.Image()
p.density('300')
p.read(os.path.abspath(file))
p.write(os.path.abspath(to))
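Coming back to the single page case mentioned earlier, here is a rough sketch of how the page could be isolated with PyPDF2 before the conversion. The function names and the `_pageN` naming scheme are hypothetical, and the class and method names (`PdfFileReader`, `PdfFileWriter`, `getPage`, `addPage`) are those of the older PyPDF2 releases used in this post; PyPDF2 3.x renamed them.

```python
import os


def page_filename(pdf_path, page_number):
    """Hypothetical naming scheme: my.pdf, page 3 -> my_page3.pdf."""
    root, ext = os.path.splitext(pdf_path)
    return "{0}_page{1}{2}".format(root, page_number, ext)


def extract_page(pdf_path, page_number):
    """Write page `page_number` (1-based) to its own PDF and return its path."""
    # Imported here so that the naming helper works without PyPDF2 installed.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    target = page_filename(pdf_path, page_number)
    with open(pdf_path, "rb") as source:
        reader = PdfFileReader(source)
        writer = PdfFileWriter()
        writer.addPage(reader.getPage(page_number - 1))  # getPage is 0-based
        # Write while the source is still open: pages are read lazily.
        with open(target, "wb") as out:
            writer.write(out)
    return target


# Usage (requires PyPDF2 and an existing my.pdf):
# single = extract_page("my.pdf", 3)
```

The resulting one-page PDF can then go through the same PythonMagick script as above.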
Now, if you want to merge multiple PDFs into one (for example, all the PDFs in the current folder):
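import os
from PyPDF2 import PdfFileMerger, PdfFileReader

# every PDF in the current folder, except the result of a previous merge;
# sorted, because os.listdir returns the names in arbitrary order
filenames = sorted(file for file in os.listdir(".")
                   if file.endswith(".pdf") and "merged" not in file)

merger = PdfFileMerger()
for filename in filenames:
    merger.append(PdfFileReader(open(filename, 'rb')))
merger.write("merged.pdf")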
For Windows, PythonMagick can be found here: PythonMagick.