How To Make Microsoft Word Documents with PHP

As I pointed out in my previous article, PHP and WMI – Dig deep into Windows with PHP, we PHP devs live in a world where we have to deal with the Windows operating system from time to time. WMI (Windows Management Instrumentation) is one such occasion, and Microsoft Office Interop is another – an even more important and more frequently used one.

In this article, we will see a simple integration between Word and PHP: to generate a Microsoft Word document based on the inputs in an HTML form using PHP (and its Interop extension).

Preparations

First, please make sure a typical WAMP environment has been set up on your Windows development machine. As Interop is purely a Windows feature, we will have to host Apache and PHP under Windows. In this instance, I am using EasyPHP 14.1, which is quite easy to install and configure.

Next, we will have to install Microsoft Office. Its version is not that critical. I am using Office 2013 Pro but any Office version later than 2007 should work.

We then have to make sure the libraries to develop an Interop application (called PIA, Primary Interop Assemblies) are installed. To ascertain this, we can open the Windows Explorer and navigate to: <Windows Directory>\assembly and we will see a bunch of installed PIAs:

We see a Microsoft.Office.Interop.Word entry (underlined in the snapshot). This will be the PIA we use in this demo. Please pay special attention to its “Assembly Name”, “Version” and “Public Key Token”. These are to be used in our PHP scripts very soon.

In this directory, we can also see other PIAs (including the whole Office family) available for programming (not only for PHP, but also for VB.net, C#, etc.).

If the PIA list does not include the whole Microsoft.Office.Interop package, we can either reinstall Office with the PIA features included, or manually download the package from Microsoft and install it. Please consult this MSDN page for detailed instructions.

NOTE: Only Microsoft Office 2010 PIA Redistributable is available to download and install. The PIA version in this package is 14.0.0. Version 15 only comes with Office 2013 installation.

Finally, we have to enable the PHP extension php_com_dotnet.dll in the php.ini file and restart the server.
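
Before moving on, it can help to confirm that the extension is really active after the restart. The snippet below is a minimal sketch of such a check; comDotnetStatus() is a hypothetical helper name introduced only for this illustration, and it assumes PHP 7.2 or later (for PHP_OS_FAMILY).

```php
<?php
// Sketch: report whether the COM/.NET extension is available to this PHP build.
// comDotnetStatus() is a hypothetical helper, not part of PHP itself.
function comDotnetStatus(): array
{
    return [
        // true once php_com_dotnet.dll has been enabled in php.ini
        'extension_loaded' => extension_loaded('com_dotnet'),
        // the DOTNET class used later in this article
        'dotnet_class'     => class_exists('DOTNET', false),
        // Interop only works when PHP itself runs on Windows
        'windows'          => PHP_OS_FAMILY === 'Windows',
    ];
}

foreach (comDotnetStatus() as $check => $ok) {
    echo $check, ': ', $ok ? 'yes' : 'no', PHP_EOL;
}
```

If any of the three lines reports "no", the back end script shown later will most likely fail when it tries to create the Word object.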

Now we can move on to the programming.

The HTML form

As the focus of this demo is on the back end processing, we will create a simple front end with a simple HTML form, which looks like the figure below:

We have a text field for “Name”, a radio button group for “Gender”, a range control for “Age” and a text area for “Message”; and finally, of course, a “Submit” button.

Save this file as “index.html” in a directory under the virtual host’s root directory so that we can access it with a URI like http://test/test/interop.

The back end

The back end PHP file is the focus of our discussion. I will first list the code of this file, and then explain it step by step.

<?php

$inputs = $_POST;
$inputs['printdate']=''; 
// A dummy value to avoid a PHP notice as we don't have "printdate" in the POST variables. 

$assembly = 'Microsoft.Office.Interop.Word, Version=15.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c';
$class = 'Microsoft.Office.Interop.Word.ApplicationClass';

$w = new DOTNET($assembly, $class);
$w->visible = true;

$fn = __DIR__ . '\\template.docx';

$d = $w->Documents->Open($fn);

echo "Document opened.<br><hr>";

$flds = $d->Fields;
$count = $flds->Count;
echo "There are $count fields in this document.<br>";
echo "<ul>";
$mapping = setupfields();

foreach ($flds as $index => $f)
{
    $f->Select();
    $key = $mapping[$index];
    $value = $inputs[$key];
    if ($key == 'gender')
    {
        if ($value == 'm')
            $value = 'Mr.';
        else
            $value = 'Ms.';
    }
    
    if ($key == 'printdate')
        $value = date('Y-m-d H:i:s');

    $w->Selection->TypeText($value);
    echo "<li>Mapping field $index: $key with value $value</li>";
}
echo "</ul>";

echo "Mapping done!<br><hr>";
echo "Printing. Please wait...<br>";

$d->PrintOut();
sleep(3);
echo "Done!";

$w->Quit(false);
$w=null;



function setupfields()
{
    $mapping = array();
    $mapping[0] = 'gender';
    $mapping[1] = 'name';
    $mapping[2] = 'age';
    $mapping[3] = 'msg';
    $mapping[4] = 'printdate';
    

    return $mapping;
}

After setting up the $inputs variable to hold the values posted from our form, and creating a dummy value for printdate – we will discuss why we need this later – we come across these four critical lines:

$assembly = 'Microsoft.Office.Interop.Word, Version=15.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c';
$class = 'Microsoft.Office.Interop.Word.ApplicationClass';

$w = new DOTNET($assembly, $class);
$w->visible = true;

Working with COM/.NET objects in PHP requires instantiating a “class” within an “assembly“. In our case, we are working with Word. Looking back at the first screenshot, we can construct the full signature of the Word PIA:

  • “Name”, “Version”, “Public Key Token” are all taken from the information displayed when we browse to “c:\Windows\assembly“.
  • “Culture” is always neutral.

The class we are to invoke is always the assembly’s name plus “.ApplicationClass“.

With these two parameters set, we will be able to instantiate a Word object.
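
To make the structure of these two strings explicit, here is a small sketch that assembles them from their parts. The helper names wordAssemblyName() and wordApplicationClass() are hypothetical; the default values are simply the ones used in this article, and on another machine they should be replaced with whatever the assembly listing shows.

```php
<?php
// Hypothetical helper: build the full assembly signature from the values
// shown in the assembly listing. Defaults are the ones used in this article.
function wordAssemblyName(
    string $version = '15.0.0.0',
    string $publicKeyToken = '71e9bce111e9429c',
    string $name = 'Microsoft.Office.Interop.Word'
): string {
    return sprintf(
        '%s, Version=%s, Culture=neutral, PublicKeyToken=%s',
        $name,
        $version,
        $publicKeyToken
    );
}

// Hypothetical helper: the class to instantiate is the assembly name
// followed by ".ApplicationClass".
function wordApplicationClass(string $name = 'Microsoft.Office.Interop.Word'): string
{
    return $name . '.ApplicationClass';
}
```

With these in place, the instantiation could be written as new DOTNET(wordAssemblyName(), wordApplicationClass()), which keeps the version and token in one spot if a different Office version is installed.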

This object can stay in the background or we can bring it to the foreground by setting its visible attribute to true.

Next, we open the document to be processed and assign the “document” instance to a $d variable.
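
Since Word is handed a file path, it can be worth checking that the template exists before asking Word to open it. The following is a minimal sketch under that assumption; templatePath() is a hypothetical helper and is not part of the script listed above.

```php
<?php
// Hypothetical helper: build the path to the template and fail early
// if the file is missing, before Word is asked to open it.
function templatePath(string $dir, string $file = 'template.docx'): string
{
    $path = rtrim($dir, '\\/') . DIRECTORY_SEPARATOR . $file;
    if (!is_file($path)) {
        throw new RuntimeException("Template not found: $path");
    }
    return $path;
}
```

In the script above, this would be used as $fn = templatePath(__DIR__); so that a missing template surfaces as a PHP exception instead of an error coming back from Word.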

In that document, to create content based on the inputs from the HTML form, we have a few options.

The least favorable way is to hard code all the contents in PHP and then output them into the Word document. I strongly discourage this for the following reasons:

  1. There will be no flexibility. Any change in the output will require modification of the PHP script.
  2. It violates the separation between control and presentation.
  3. It will drastically increase the lines of code if we are to apply styles to the document contents (alignment, font, style, etc). Programmatically changing styles is too cumbersome.

Another way is to do a “search-replace”. PHP has strong built-in capabilities in doing this. We can create a Word document putting some special delimiters around the placeholder contents that are to be replaced. For example, we can create a document containing something like this:

{{name}}

and in PHP, we can simply replace this with the “Name” value we retrieved from the form submission.

This is straightforward and avoids all the disadvantages of the first option. We just need to find the right delimiter. In effect, we are doing template rendering, except that the template is now a Word document.
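
The search-replace idea can be sketched on a plain string as follows. renderPlaceholders() is a hypothetical helper; getting the text out of a real Word document and writing it back is a separate step that is not shown here.

```php
<?php
// Sketch of the search-and-replace option on a plain string.
// renderPlaceholders() is a hypothetical helper for illustration only.
function renderPlaceholders(string $template, array $values): string
{
    $pairs = [];
    foreach ($values as $key => $value) {
        // Wrap each key in the {{...}} delimiters used in the template.
        $pairs['{{' . $key . '}}'] = (string) $value;
    }
    // strtr() replaces all placeholders in one pass, so a value that
    // happens to contain "{{...}}" is not substituted a second time.
    return strtr($template, $pairs);
}
```

For example, renderPlaceholders('Dear {{name}}', ['name' => 'Taylor']) gives back the string with the placeholder filled in, while placeholders without a matching key are left untouched.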

The third option is my recommendation and is an advanced topic in Word. We will use fields to represent the placeholders, and in our PHP code, we will directly update the fields with respective form values.

This approach is flexible, fast and conforms with Word’s best practices. It also avoids full text search in the documents, which helps performance. Note that this option has its drawbacks too.

Word, ever since its debut, has never supported named indexes for fields. Even though we provided a name for the fields we created in the Word document, we still have to use number subscripts to access each field. This also explains why we have to use a dedicated function (setupfields) to do the manual mapping between the field index and the name of the form fields.
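
The index-to-name mapping and the per-field value rules can be separated from the Word calls, which makes them easier to check on their own. The sketch below mirrors the logic of the script above; fieldMap() and fieldValue() are hypothetical names, and the optional $timestamp parameter is an addition made only so the date can be fixed when testing.

```php
<?php
// Field index => form key, in the order the fields appear in the template.
// Mirrors setupfields() from the script above.
function fieldMap(): array
{
    return ['gender', 'name', 'age', 'msg', 'printdate'];
}

// Hypothetical helper: work out the text to type into one field.
function fieldValue(string $key, array $inputs, ?int $timestamp = null): string
{
    switch ($key) {
        case 'gender':
            // The form posts 'm' for male; anything else becomes "Ms."
            return ($inputs['gender'] ?? '') === 'm' ? 'Mr.' : 'Ms.';
        case 'printdate':
            // Not a form field: generated when the document is filled in.
            return date('Y-m-d H:i:s', $timestamp ?? time());
        default:
            return (string) ($inputs[$key] ?? '');
    }
}
```

Inside the foreach loop, the body would then reduce to looking up $key = fieldMap()[$index] and passing fieldValue($key, $inputs) to TypeText.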

To learn how to insert fields in a Word document (click here for a ready-made version), please consult the relevant Word help topics and manuals. For this demo, we have a document with 5 MERGEFIELD fields. Also, we placed the document in the same directory as the PHP script for easy access.

Please note that the printdate field does not have a corresponding form field. That is why we added a dummy printdate key to the $inputs array. Without it, the script still runs, but PHP raises a notice saying that the index printdate is not present in the $inputs array.
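
A slightly more general way to avoid such notices is to give every expected key a default up front. This is only a sketch of that idea, not what the script above does; normalizeInputs() is a hypothetical helper, and the list of keys is taken from the field mapping used in this demo.

```php
<?php
// Hypothetical helper: keep only the keys the template knows about and
// give each of them a default, so later lookups never hit a missing index.
function normalizeInputs(array $post): array
{
    $defaults = [
        'gender'    => '',
        'name'      => '',
        'age'       => '',
        'msg'       => '',
        'printdate' => '',
    ];
    // Drop unexpected keys, then let submitted values override the defaults.
    $inputs = array_merge($defaults, array_intersect_key($post, $defaults));
    // printdate is always set by the script, never taken from the form.
    $inputs['printdate'] = '';
    return $inputs;
}
```

The first lines of the script could then become $inputs = normalizeInputs($_POST); which also covers the case where a visitor submits the form with a field left out.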

After updating the fields with form values, we will print the document using:

$d->PrintOut();

The PrintOut method has a few optional parameters and we are using its simplest form. This will print one copy to the default printer connected to our Windows machine.

We can also choose to use PrintPreview to take a look at the output before we decide to print the document. In a purely automated environment, we will of course use PrintOut instead.

We have to wait a few seconds before quitting the Word application because the print job needs some time to be fully spooled. Without sleep(3), $w->Quit gets executed immediately and the print job gets killed too.

Finally, we call $w->Quit(false) to close the Word application invoked by our PHP script. The only parameter provided here is to specify if we want to save changes before quitting. We did make changes to the document but we really don’t want to save them because we want to keep a clean template for other users’ input.
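
If the script throws before reaching Quit, the Word process started by PHP may be left running. One way to guard against that is a try/finally wrapper. The sketch below assumes only that the application object has a Quit() method; withWord() is a hypothetical helper and is not part of the original script.

```php
<?php
// Hypothetical wrapper: run $work against a Word-like application object
// and always call Quit(false) afterwards, even if $work throws.
// $factory is expected to return the object, for example:
//   function () use ($assembly, $class) { return new DOTNET($assembly, $class); }
function withWord(callable $factory, callable $work)
{
    $app = $factory();
    try {
        return $work($app);
    } finally {
        // false: do not save changes, so the template stays clean
        $app->Quit(false);
    }
}
```

The field filling, PrintOut and the sleep(3) pause would all go inside the $work callback, so that Quit still runs last.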

After we complete the code, we can load the form page, input some values and submit the form. The below images show the output of the PHP script and also the updated Word document:


Improving the coding speed and understanding more about PIA

PHP is a weakly typed language. A COM object is of type Object. During our PHP coding, there is no way to get a meaningful code insight out of an object, be it a Word Application, a Document, or a Field. We don’t know what properties it has, or what methods it supports.

This will greatly slow down our development speed. To make it faster, I would recommend we develop the functions in C# first and then migrate the code back to PHP. A free C# IDE I would recommend is called “#develop” and can be downloaded here. I prefer this one to the VS series because #develop is smaller, cleaner, and faster.

The migration of C# code to PHP is not scary at all. Let me show you some lines of C# code:

Word.Application w=new Word.Application();
w.Visible=true;
			
String path=Application.StartupPath+"\\template.docx";
			
Word.Document d=w.Documents.Open(path) as Word.Document;
			
Word.Fields flds=d.Fields;
int len=flds.Count;
			
foreach (Word.Field f in flds)
{
	f.Select();
	int i=f.Index;
	w.Selection.TypeText("...");
}

We can see that C# code is almost identical to the PHP code we showed previously. C# is strongly typed so we see a few type casting statements and we have to explicitly give our variables a type.

With variable type given, we can enjoy code insight and code completion so the development speed is much faster.

Another way to speed up our PHP development is to tap into Word macros. We perform the actions we need and record them as a macro. The macro is written in Visual Basic for Applications (VBA), which can also be easily translated to PHP.

Most importantly, Microsoft’s official documentation on the Office PIAs, especially the namespace documentation for each Office application, is always the most detailed reference material. The three most commonly used applications are:

Conclusion

In this article, we demonstrated how to populate a Word document using PHP COM libraries and Microsoft Office Interop capabilities.

Windows and Office are widely used in everyday life. Knowing what Office/Windows and PHP can do together is essential for any PHP + Windows programmer.

With PHP’s COM extension, the door to mastering this combination is opened.

If you are interested in this area of programming, please leave your comments and we will consider having more articles on this topic. I look forward to seeing more real world applications developed using this approach.

Frequently Asked Questions (FAQs) about Creating Microsoft Word Documents with PHP

How Can I Add Images to My Word Document Using PHP?

Adding images to your Word document using PHP is a straightforward process. You can use the addImage() method provided by the PHPWord library, which takes the path to the image file you want to add. You can also pass a style array with options such as width, height, and alignment to customize the appearance of the image in your document.

Can I Create Tables in Word Documents Using PHP?

Yes, you can create tables in Word documents using PHP. The PHPWord library provides a function called addTable() that you can use to create a table in your document. You can then use the addRow() and addCell() functions to add rows and columns to your table. You can also specify the width, height, and alignment of your table and its cells to customize its appearance.

How Can I Apply Styles to Text in My Word Document Using PHP?

Applying styles to text in your Word document using PHP is easy with the PHPWord library. You can use the addText() function to add text to your document, and you can specify a style array as a second parameter to this function. This style array can include properties such as font size, font color, bold, italic, underline, and more. You can also create a style object using the addTitleStyle() or addParagraphStyle() functions and apply it to your text.

Can I Convert HTML to Word Document Using PHP?

Yes, you can convert HTML to a Word document using PHP. The PHPWord library provides a helper called Html::addHtml() that you can use to add HTML content to a section of your document. It parses the HTML content and converts it into elements that can be displayed in a Word document. However, please note that not all HTML tags are supported, and some complex HTML structures may not be correctly converted.

How Can I Save My Word Document to a Specific Location Using PHP?

After creating your Word document using PHP, you can save it to a specific location using the save() method of a writer object, which the PHPWord library creates through IOFactory::createWriter(). This method takes the path where you want to save the document as a parameter. Make sure the target directory exists and is writable; if the document cannot be written there, the save fails.

Can I Add Headers and Footers to My Word Document Using PHP?

Yes, you can add headers and footers to your Word document using PHP. The PHPWord library provides methods called addHeader() and addFooter() that you can use to add headers and footers to a section of your document. You can then use the addText() method to add text to your headers and footers. You can also apply styles to your headers and footers using the same methods as described in the question on applying styles to text above.

How Can I Add Page Breaks to My Word Document Using PHP?

You can add page breaks to your Word document using PHP by using the addPageBreak() function provided by the PHPWord library. This function inserts a page break at the current position in the document, causing all subsequent content to appear on a new page.

Can I Create a Word Document from a Template Using PHP?

Yes, you can create a Word document from a template using PHP. Older versions of the PHPWord library provided a loadTemplate() method for this; current versions use the TemplateProcessor class to load a Word document template. You can then use its setValue() method to replace placeholders in the template with your own content, and saveAs() to write the result to a new file.

How Can I Add Hyperlinks to My Word Document Using PHP?

You can add hyperlinks to your Word document using PHP by using the addLink() function provided by the PHPWord library. This function creates a hyperlink at the current position in the document. You can specify the URL of the hyperlink and the text to be displayed as the hyperlink.

Can I Add Lists to My Word Document Using PHP?

Yes, you can add lists to your Word document using PHP. The PHPWord library provides a method called addListItem() that you can use to add a list item to your document. You can specify the text of the list item and the depth of the item in the list. You can also apply styles to your list items using the same methods as described in the question on applying styles to text above.

Taylor Ren

Taylor is a freelance web and desktop application developer living in Suzhou in Eastern China. Having started with the Borland development tools series (C++Builder, Delphi), published a book on InterBase, and been certified as a Borland Expert in 2003, he shifted to web development with a typical LAMP configuration. Later he started working with jQuery, Symfony, Bootstrap, Dart, etc.