PowerShell Tip: Convert HTML to PDF
There are no native methods to create a pdf file in PowerShell, so I looked into outside sources for converting HTML output to PDF. I ended up using a standalone dll and some .NET calls to achieve my goal.
Introduction
Honestly, I forget why I thought it was important that my report generation scripts be able to create pdf output. I had a fun time figuring out how to make it happen, though. I settled on a third-party dll from CodePlex (supposedly codeplex.com is all about open source projects, but I've yet to see the source for several of the cooler projects hosted there). While watching a movie with the wife, I worked my way backwards from a VB example to come up with a PowerShell alternative. This is how it works.
Details
Here is the only example given for this dll:
Generate PDF with one line of code:
The code seems simple enough: feed the GeneratePdf method some html and a pdf gets created like magic. The minor challenge is taking this .NET-based dll, loading it into memory, creating the correct object from its assembly, and feeding it properly formatted html data so that it can create a pdf.
First, let's define the dll location (the full-blown function provided later tests that it actually exists):
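A minimal sketch of that first step. The file name NReco.PdfGenerator.dll and the choice of looking beside the script are assumptions here, so adjust both to match wherever you extracted the dll.

```powershell
# Assumption: the dll is named NReco.PdfGenerator.dll and sits beside the script.
# $PSScriptRoot is empty in an interactive session, so fall back to the current directory.
$BaseDir = if ($PSScriptRoot) { $PSScriptRoot } else { (Get-Location).Path }
$PdfGenerator = Join-Path -Path $BaseDir -ChildPath 'NReco.PdfGenerator.dll'

# Warn early rather than failing later when the assembly is loaded.
if (-not (Test-Path -Path $PdfGenerator -PathType Leaf)) {
    Write-Warning "PDF generator dll not found at $PdfGenerator"
}
```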
Next, let's load up the assembly and create a new NReco.HtmlToPdfConverter object to work with:
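Roughly, loading the assembly and creating the object could look like the sketch below. I am assuming the full type name is NReco.PdfGenerator.HtmlToPdfConverter and that the dll sits in the current directory; check both against your copy of the dll.

```powershell
# Assumption: the dll lives in the current directory under this name.
$PdfGenerator = Join-Path -Path (Get-Location).Path -ChildPath 'NReco.PdfGenerator.dll'

if (Test-Path -Path $PdfGenerator -PathType Leaf) {
    # Add-Type -Path loads a compiled .NET assembly into the current session.
    Add-Type -Path $PdfGenerator

    # Assumed full type name; verify it against the assembly you downloaded.
    $PdfCreator = New-Object -TypeName NReco.PdfGenerator.HtmlToPdfConverter
}
else {
    Write-Warning "Cannot load $PdfGenerator because the file does not exist."
}
```

`[Reflection.Assembly]::LoadFrom($PdfGenerator)` is an alternative to Add-Type if you want the assembly object back for inspection.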
Finally, use the GeneratePdf method to create a pdf, which we then write to a file in byte mode (pdf output is not simple text, after all).
You will notice that I explicitly cast $html to a string. With the function I provide, you will have to send the html data cast this way. I found that if I pulled in html content from a file, it came back as an array of strings, which the GeneratePdf method would treat individually. This ended up creating pdfs of 100+ pages (a page for every element).
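Something like the following sketch covers the conversion and the byte-mode write. It assumes $PdfCreator already holds the converter object and that GeneratePdf returns a byte array; the sample html and the report.pdf output name are made up for illustration.

```powershell
# Sample input and a hypothetical output path (both are illustrative only).
$html = '<html><body><h1>Hello from PowerShell</h1></body></html>'
$OutFile = Join-Path -Path (Get-Location).Path -ChildPath 'report.pdf'

# Assumption: $PdfCreator was created earlier from the dll.
if ($null -ne $PdfCreator) {
    # GeneratePdf is assumed to return the pdf as a byte array.
    [byte[]] $PdfBytes = $PdfCreator.GeneratePdf([string]$html)

    # WriteAllBytes writes raw bytes, so nothing gets re-encoded as text.
    [System.IO.File]::WriteAllBytes($OutFile, $PdfBytes)
}
```

A full path is used for the output because .NET file methods resolve relative paths against the process working directory, which is not always the same as the current PowerShell location.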
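To see the array behavior for yourself, here is a small self-contained sketch that needs no dll. It writes a three-line temp file (made-up content) and reads it back in different ways. Note that a plain [string] cast joins the lines with a space, while Out-String or -Raw (PowerShell 3.0 and later) keep the line breaks.

```powershell
# Write a small three-line html file to a temp location.
$TempFile = [System.IO.Path]::GetTempFileName()
Set-Content -Path $TempFile -Value '<html>', '<body><p>Hello</p></body>', '</html>'

# Get-Content returns one string per line, i.e. an array.
$AsLines = Get-Content -Path $TempFile

# Casting the array to [string] joins the elements with a space ($OFS default).
$AsCast = [string]$AsLines

# Out-String produces a single string and keeps the line breaks.
$AsString = Get-Content -Path $TempFile | Out-String

# -Raw (PowerShell 3.0+) reads the whole file as one string in a single step.
$AsRaw = Get-Content -Path $TempFile -Raw

Remove-Item -Path $TempFile
```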
If you play around with $PdfCreator you will find that there are several methods included for changing page size and other document properties as well.
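One quick way to do that playing around is Get-Member. The helper below is a hypothetical convenience wrapper (the name Show-ObjectMembers is mine) that lists the properties and methods of whatever object you hand it, so you can pass it $PdfCreator once you have one.

```powershell
# Hypothetical helper: list the properties and methods of any object.
function Show-ObjectMembers {
    param(
        [Parameter(Mandatory = $true)]
        $InputObject
    )

    # -InputObject inspects the object itself rather than enumerating it.
    Get-Member -InputObject $InputObject -MemberType Property, Method |
        Select-Object -Property Name, MemberType, Definition
}

# Usage, assuming $PdfCreator exists in your session:
# Show-ObjectMembers -InputObject $PdfCreator | Format-Table -AutoSize
```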
The Code
Here is the function I came up with to convert html to a pdf. Included at the end is a quick example. If you are going to use this code, you will likely have to unblock the dll after extracting it into the same directory as the script. Also, I had to run PowerGUI as admin; otherwise I didn't have enough permission to load the dll into memory. Enjoy!
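For reference, a function along these lines could be sketched as follows. This is a reconstruction under assumptions, not the original listing: the function name ConvertTo-PdfFile, the dll file name, and the NReco.PdfGenerator.HtmlToPdfConverter type name are all my guesses at reasonable values. It checks that the dll exists, unblocks it (Unblock-File needs PowerShell 3.0 or later), loads it, and writes the pdf bytes to disk.

```powershell
function ConvertTo-PdfFile {
    [CmdletBinding()]
    param(
        # The html to convert, as ONE string (not an array of lines).
        [Parameter(Mandatory = $true)]
        [string] $Html,

        [Parameter(Mandatory = $true)]
        [string] $OutFile,

        # Assumed dll name and location; adjust to wherever you extracted it.
        [string] $PdfGenerator = (Join-Path -Path (Get-Location).Path -ChildPath 'NReco.PdfGenerator.dll')
    )

    if (-not (Test-Path -Path $PdfGenerator -PathType Leaf)) {
        Write-Warning "PDF generator dll not found: $PdfGenerator"
        return $false
    }

    try {
        # Clear the "downloaded from the internet" flag before loading.
        Unblock-File -Path $PdfGenerator -ErrorAction Stop
        Add-Type -Path $PdfGenerator -ErrorAction Stop

        # Assumed type name; verify it against your copy of the dll.
        $PdfCreator = New-Object -TypeName NReco.PdfGenerator.HtmlToPdfConverter
        [byte[]] $PdfBytes = $PdfCreator.GeneratePdf($Html)

        # Resolve relative paths against the PowerShell location, not the .NET one.
        $FullPath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($OutFile)
        [System.IO.File]::WriteAllBytes($FullPath, $PdfBytes)
        return $true
    }
    catch {
        Write-Warning "PDF conversion failed: $($_.Exception.Message)"
        return $false
    }
}

# Quick example: turn a small process report into a pdf.
$Report = Get-Process | Select-Object -First 5 -Property Name, Id | ConvertTo-Html | Out-String
$Ok = ConvertTo-PdfFile -Html $Report -OutFile (Join-Path -Path (Get-Location).Path -ChildPath 'report.pdf')
```

The function returns $true or $false instead of throwing, so a report script can carry on and fall back to plain html when the dll is missing.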