Zachary Loeber

The personal website of Zachary Loeber.

Powershell Tip: Convert HTML to PDF

2014-03-07 4 min read Microsoft Powershell Zachary Loeber

There are no native methods to create a PDF file in PowerShell, so I looked into outside sources for converting HTML output to PDF. I ended up using a standalone DLL and some .NET calls to achieve my goal.

Introduction

Honestly, I have forgotten why I thought it was important for my report generation scripts to be able to create PDF output, but I had a fun time figuring out how to make it happen. I settled on a third-party DLL from CodePlex (supposedly codeplex.com is all about open source projects, but I've yet to see the source for several of the cooler projects hosted there). While watching a movie with the wife, I worked my way backwards from a .NET example to come up with a PowerShell alternative. This is how it works.

Details

Here is the only example given for this DLL:

Generate PDF with one line of code:

var pdfBytes = (new NReco.PdfGenerator.HtmlToPdfConverter()).GeneratePdf(htmlContent);

The code seems simple enough: feed the GeneratePdf method some HTML and a PDF gets created like magic. The minor challenge is taking this .NET-based DLL, loading it into memory, creating the correct object from its assembly, and feeding it properly formatted HTML so that it can create a PDF.

First, let's define the DLL location (the full function provided later tests that the file actually exists):

$PdfGenerator = "$((Get-Location).Path)\NReco.PdfGenerator.dll"

Next, let's load the assembly and create a new NReco.PdfGenerator.HtmlToPdfConverter object to work with:

$Assembly = [Reflection.Assembly]::LoadFrom($PdfGenerator)
$PdfCreator = New-Object NReco.PdfGenerator.HtmlToPdfConverter

Finally, use the GeneratePdf method to create the PDF, which we then write to a file in byte mode (PDF output is not simple text, after all):

$ReportOutput = $PdfCreator.GeneratePdf([string]$HTML)
# Set-Content overwrites an existing file; Add-Content would append to it and corrupt the PDF
Set-Content -Value $ReportOutput -Encoding Byte -Path $FileName

You will notice that I explicitly cast $HTML to a string, and you will have to pass the HTML to the function I provide the same way. I found that if I pulled HTML content in from a file, the result was an array of strings (one per line), and the GeneratePdf method treated each element individually. That produced a PDF of 100+ pages, one page for every element. Casting to [string] joins the array elements into a single string (separated by spaces by default).
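The array behaviour is easy to reproduce without the DLL. The sketch below uses a hypothetical temporary file to show that Get-Content returns one string per line by default, along with two ways to collapse the lines into a single string. The -Raw switch assumes PowerShell 3.0 or later.

```powershell
# Write a small three-line HTML file to a temporary location (hypothetical sample data).
$SampleFile = Join-Path ([System.IO.Path]::GetTempPath()) 'convertto-pdf-sample.html'
Set-Content -Path $SampleFile -Value '<html>', '<body><p>test</p></body>', '</html>'

# Default behaviour: one array element per line of the file.
$HtmlLines = Get-Content -Path $SampleFile
$HtmlLines.Count        # 3

# Option 1: join the lines yourself, keeping the line breaks.
$HtmlJoined = $HtmlLines -join "`n"

# Option 2 (PowerShell 3.0+): read the whole file as one string.
$HtmlRaw = Get-Content -Path $SampleFile -Raw

# Clean up the sample file.
Remove-Item -Path $SampleFile
```

Either $HtmlJoined or $HtmlRaw can then be handed to GeneratePdf as a single document.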

If you play around with $PdfCreator, you will find that it also exposes several members for changing the page size and other document properties.
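A quick way to see what is available is Get-Member. The helper below is a hypothetical convenience function (Get-SettableProperty is my own name, not part of the DLL). It is demonstrated on a StringBuilder so that it runs without the DLL; with the assembly loaded, you would pass $PdfCreator instead.

```powershell
function Get-SettableProperty
{
    param
    (
        [Parameter(Mandatory=$true)]
        $InputObject
    )

    # Get-Member lists the members of the object. A definition that
    # contains 'set;' means the property can be assigned to.
    Get-Member -InputObject $InputObject -MemberType Property |
        Where-Object { $_.Definition -match 'set;' } |
        Select-Object -ExpandProperty Name
}

# Stand-in object so the example runs anywhere.
$StandIn = New-Object System.Text.StringBuilder
Get-SettableProperty -InputObject $StandIn

# With the DLL loaded (assumption): Get-SettableProperty -InputObject $PdfCreator
```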

The Code

Here is the function I came up with to convert HTML to a PDF, with a quick example included at the end. If you are going to use this code, you will likely have to unblock the DLL after extracting it into the same directory as the script. Run the script from that directory as well, because the function looks for the DLL in the current location (Get-Location). I also had to run PowerGUI as admin; otherwise I didn't have enough permission to load the DLL into memory. Enjoy!
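The unblocking step mentioned above might look like the following sketch, run once before the function is used. It assumes PowerShell 3.0 or later (for Unblock-File) and that the DLL sits in the current location; $DllPath is a name introduced only for this illustration.

```powershell
# Assumption: the DLL was extracted into the current location.
$DllPath = Join-Path (Get-Location).Path 'NReco.PdfGenerator.dll'

if (Test-Path $DllPath)
{
    # Unblock-File removes the "downloaded from the internet" mark
    # that can stop .NET from loading the assembly.
    Unblock-File -Path $DllPath
    Write-Output "Unblocked $DllPath"
}
else
{
    Write-Warning "NReco.PdfGenerator.dll was not found in $((Get-Location).Path)"
}
```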

Function ConvertTo-PDF
{
    <#
    .SYNOPSIS
        Converts HTML strings to pdf files.
    .DESCRIPTION
        Converts HTML strings to pdf files.
    .PARAMETER HTML
        HTML to convert to pdf format.
    .PARAMETER FileName
        File name to create as a pdf.

    .EXAMPLE
        $html = 'test'
        try 
        {
            ConvertTo-PDF -HTML $html -FileName 'test.pdf'
            Write-Output 'HTML converted to PDF file test.pdf'
        } 
        catch
        {
            Write-Output 'Something bad happened! :('
        }

        Description:
        ------------------
        Create a pdf file with the content of 'test' if the pdf creation dll is available.

    .NOTES
        Requires   : NReco.PdfGenerator.dll (http://pdfgenerator.codeplex.com/)
        Version    : 1.0 03/07/2014
                     - Initial release
        Author     : Zachary Loeber

        Disclaimer : This script is provided AS IS without warranty of any kind. I 
                     disclaim all implied warranties including, without limitation,
                     any implied warranties of merchantability or of fitness for a 
                     particular purpose. The entire risk arising out of the use or
                     performance of the sample scripts and documentation remains
                     with you. In no event shall I be liable for any damages 
                     whatsoever (including, without limitation, damages for loss of 
                     business profits, business interruption, loss of business 
                     information, or other pecuniary loss) arising out of the use of or 
                     inability to use the script or documentation. 

        Copyright  : I believe in sharing knowledge, so this script and its use is 
                     subject to : http://creativecommons.org/licenses/by-sa/3.0/
    .LINK
        http://zacharyloeber.com/

    .LINK
        http://nl.linkedin.com/in/zloeber
    #>
    [CmdletBinding()]
    param
    (
        [Parameter( HelpMessage="Report body, in HTML format.", 
                    ValueFromPipeline=$true )]
        [string]
        $HTML,
        [Parameter( HelpMessage="Report filename to create." )]
        [string]
        $FileName
    )
    BEGIN
    {
        $DllLoaded = $false
        $PdfGenerator = "$((Get-Location).Path)\NReco.PdfGenerator.dll"
        if (Test-Path $PdfGenerator)
        {
            try
            {
                $Assembly = [Reflection.Assembly]::LoadFrom($PdfGenerator)
                $PdfCreator = New-Object NReco.PdfGenerator.HtmlToPdfConverter
                $DllLoaded = $true
            }
            catch
            {
                Write-Error ('ConvertTo-PDF: Issue loading or using NReco.PdfGenerator.dll: {0}' -f $_.Exception.Message)
            }
        }
        else
        {
            Write-Error ('ConvertTo-PDF: NReco.PdfGenerator.dll was not found.')
        }
    }
    PROCESS
    {
        if ($DllLoaded)
        {
            $ReportOutput = $PdfCreator.GeneratePdf([string]$HTML)
            # Set-Content overwrites an existing file; Add-Content would append to it and corrupt the PDF
            Set-Content -Value $ReportOutput -Encoding Byte -Path $FileName
        }
        else
        {
            Throw 'ConvertTo-PDF: NReco.PdfGenerator.dll is not loaded, so nothing was converted.'
        }
    }
    END
    {}
}

$html = 'test'
try 
{
    ConvertTo-PDF -HTML $html -FileName 'test.pdf'
    Write-Output 'HTML converted to PDF file test.pdf'
} 
catch
{
    Write-Output 'Something bad happened! :('
}
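
For the report-generation use case mentioned in the introduction, ConvertTo-Html output can be fed to the function. ConvertTo-Html also emits an array of strings, so this sketch collapses it with Out-String first. The property selection and the file name are purely illustrative, and the call is skipped if ConvertTo-PDF has not been defined in the session.

```powershell
# Build a small HTML report; ConvertTo-Html returns one string per line.
$ReportLines = Get-Process |
    Select-Object -First 5 -Property Name, Id |
    ConvertTo-Html -Title 'Process Report'

# Collapse the lines into a single string before conversion.
$ReportHtml = $ReportLines | Out-String

if (Get-Command -Name ConvertTo-PDF -ErrorAction SilentlyContinue)
{
    try
    {
        ConvertTo-PDF -HTML $ReportHtml -FileName 'processes.pdf'
        Write-Output 'Report written to processes.pdf'
    }
    catch
    {
        Write-Output ('Conversion failed: {0}' -f $_.Exception.Message)
    }
}
else
{
    Write-Warning 'ConvertTo-PDF is not defined in this session.'
}
```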