Get additional file properties in Windows

I am trying to pull the word count from a Microsoft Word document into Caché. Is there any way to get the values of the extended file properties without opening the Word document? If I right-click on a Word document (Word does not need to be installed), I can see the additional properties that I want to reference, but I don't know how to access them without calling out to VBA or PowerShell.

 

 

Answers

It's not a file property. A .docx file is just a zip archive, and inside it is a docProps/app.xml file. Here's what it looks like:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Properties>
    <TotalTime>4</TotalTime>
    <Pages>8</Pages>
    <Words>1882</Words>
    <Characters>10731</Characters>
    <Application>Microsoft Office Word</Application>
    <Lines>89</Lines>
    <Paragraphs>25</Paragraphs>
    <CharactersWithSpaces>12588</CharactersWithSpaces>
</Properties>

Explorer reads the app.xml file and gets information from it.

You can do the same, I suppose; here's an article on that.

In your case you don't want to unpack the whole docx, so check this unzip implementation for ObjectScript.

Hi Eduard, many thanks for the response. However, the files I am trying to get the word count for are .doc files, not .docx files. What I don't understand is how Windows pulls these properties even when Word is not installed. Surely there should be a way for COS to tap into the same Windows libraries?

You can use the Apache POI library, for example, to get this information (or call PowerShell, but cursory googling shows that Word is then a requirement, since it goes through COM objects). As for how Explorer gets the .doc info, I honestly have no idea.