I love PHP. I love MySQL. They are powerful. They are easy to use. They are well documented.
I have no particular aversion to Microsoft Word. As a word processor, and more, it has served me well over the years. It has produced for me innumerable essays, reports, resumes, Engineering Department notices, and letters to Santa. Until now, however, I had never had the pleasure of working with Word as a programmer.
A client wished to perform full-text searches on documents uploaded to her website. As you might expect, the Microsoft Word file format prevents one from simply reading in the text. Still, "No problem," we said. "I think we've heard of some COM platform that will let PHP talk to Word. We can definitely do this." You will notice that this is the moment at which Brooke and I took our first step into Hell.
You see, COM allows any programming language with COM support to interact directly with a Microsoft application, such as Internet Explorer, the shell, or Excel. In PHP, we should be able to run Word, open a document, and read from that document.
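As a rough illustration of what "talking to Word" looks like from PHP, here is a minimal sketch that starts Word, reads one property from it, and shuts it down again. It assumes a Windows machine with Word installed and PHP's com_dotnet extension enabled. The function name word_version is made up for this example, and the nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: start Word through COM and report its version string.
// Returns null when the COM class is unavailable (extension not loaded,
// or not running on Windows).
function word_version(): ?string
{
    if (!class_exists('COM')) {
        return null;
    }
    $word = new COM("word.application"); // launches Word in the background
    $version = (string) $word->Version;  // a property on Word's Application object
    $word->Quit();                       // shut Word down again
    return $version;
}
```

Calling word_version() on a properly configured server should give back a version string. Anywhere else it returns null instead of failing.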
So, we started poking around online, looking for COM documentation and examples of similar implementations. The examples were there, albeit sparse, but the documentation was mostly lacking. When I instantiate a COM handle to Word, what methods are at my disposal? No one would tell me. Furthermore, no one presented an example of opening a document, reading the entire contents, and closing it. It seems simple and universally useful, but it isn't there. Go look: I dare you to try.
And then, I found this PHP class and I experienced an epiphany, a ray of sweet, warm sunlight shining on my cold, bare ass. I could open the Word document with COM in PHP, and then, without reading it, save it as a text file. AND THEN I COULD READ THE TEXT FILE.
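$word = new COM("word.application") or die("Unable to instantiate Word");
// open the document; $filename should be a full path that Word can resolve
$doc = $word->Documents->Open($filename);
// assumes a three-letter extension such as ".doc"
$new_filename = substr($filename,0,-4) . ".txt";
// the '2' parameter specifies saving in txt format
$doc->SaveAs($new_filename, 2);
$doc->Close(false);
$word->Quit();
$word = NULL;
$fh = fopen($new_filename, 'r');
// this is where we exit Hell
$contents = fread($fh, filesize($new_filename));
fclose($fh);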
This method works! It actually works! I can actually have the contents of the Word document! Huzzah.
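For anyone who wants to reuse the trick, here is a minimal sketch of the same steps wrapped in a function. This is one possible arrangement, not the original class. The names text_path_for and word_to_text are hypothetical. It assumes PHP 7 or later, the com_dotnet extension, an installed copy of Word, and that $filename is an absolute path. It also swaps substr() for pathinfo(), so extensions of any length (".docx", for instance) are handled.

```php
<?php
// Hypothetical helper: build the path of the .txt file that sits next to
// the original document, whatever the length of its extension.
function text_path_for(string $filename): string
{
    $info = pathinfo($filename);
    return $info['dirname'] . DIRECTORY_SEPARATOR . $info['filename'] . '.txt';
}

// Hypothetical wrapper: have Word save the document as plain text,
// then read the text file back and delete it.
function word_to_text(string $filename): string
{
    $new_filename = text_path_for($filename);
    $word = new COM("word.application");
    try {
        $doc = $word->Documents->Open($filename);
        $doc->SaveAs($new_filename, 2); // 2 = plain text format
        $doc->Close(false);             // false = do not save changes
    } finally {
        $word->Quit();                  // shut Word down even if a step failed
        $word = null;
    }
    $contents = file_get_contents($new_filename);
    unlink($new_filename);              // remove the temporary text file
    return $contents === false ? '' : $contents;
}
```

The try/finally block is there so that a failed Open or SaveAs does not leave an invisible copy of Word running on the server.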
I am posting this here, with attribution to the aforementioned PHP class for inspiration and for the format parameter to the SaveAs function, in the hope that some other hapless fool, attempting to complete the same task, will find solace in these lines. Feel free to contact me with any questions: I am more than happy to help you defeat the COM demon.
As a closing note, the second half of the task, intelligent full-text search, was rendered trivial, laughably easy, by the MySQL built-in full-text search functions. Thank you, Open Source. You win again.
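To give a flavor of how little is involved, here is a sketch of the search half. It is an assumed setup, not the schema from the actual project. The documents table, its columns, and the search_documents function are all hypothetical, and it uses PDO with MySQL's MATCH ... AGAINST syntax in its default natural-language mode.

```php
<?php
// Hypothetical schema: one row per uploaded document, with the extracted
// text in `contents`. FULLTEXT indexes need MyISAM, or InnoDB on
// MySQL 5.6 and later.
const CREATE_DOCUMENTS_TABLE = <<<'SQL'
CREATE TABLE IF NOT EXISTS documents (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    filename VARCHAR(255) NOT NULL,
    contents LONGTEXT NOT NULL,
    FULLTEXT KEY ft_contents (contents)
)
SQL;

// Natural-language search, best matches first. The search terms are bound
// twice under different names because a named placeholder cannot always be
// reused within one prepared statement.
const SEARCH_DOCUMENTS = <<<'SQL'
SELECT id, filename, MATCH(contents) AGAINST (:q1) AS score
FROM documents
WHERE MATCH(contents) AGAINST (:q2)
ORDER BY score DESC
SQL;

// Hypothetical helper: return the matching rows as associative arrays.
function search_documents(PDO $db, string $terms): array
{
    $stmt = $db->prepare(SEARCH_DOCUMENTS);
    $stmt->execute([':q1' => $terms, ':q2' => $terms]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

With the text from each Word document stored in contents, a call such as search_documents($db, 'quarterly budget') would return the matching documents ordered by relevance score.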