Question Jon Astle · Jun 13, 2019

Get additional file properties in Windows

I am trying to pull the word count from a Microsoft Word document into Caché. Is there any way to get the values of the extended file properties without opening the Word document? If I right-click on a Word document (Word does not need to be installed) I can see the additional properties that I want to reference, but I don't know how to access them without calling out to VBA or PowerShell.

Comments

Eduard Lebedyuk · Jun 14, 2019

It's not a file property. A docx file is just a zip archive, and inside it is a docProps/app.xml file. Here's what it looks like:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Properties>
    <TotalTime>4</TotalTime>
    <Pages>8</Pages>
    <Words>1882</Words>
    <Characters>10731</Characters>
    <Application>Microsoft Office Word</Application>
    <Lines>89</Lines>
    <Paragraphs>25</Paragraphs>
    <CharactersWithSpaces>12588</CharactersWithSpaces>
</Properties>

Explorer reads the app.xml file and gets information from it.

You can do the same, I suppose; here's an article on that.

In your case you don't want to unpack the whole docx, so check this unzip implementation for ObjectScript.
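To illustrate reading only docProps/app.xml without unpacking the rest of the archive, here is a minimal sketch in Python rather than ObjectScript (the language choice and the function names read_docx_app_properties and docx_word_count are my own assumptions for illustration, not part of any library). It assumes the file is a valid .docx that contains docProps/app.xml; it will not work on old binary .doc files.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_docx_app_properties(path):
    """Return the child elements of docProps/app.xml as a dict of strings.

    Hypothetical helper for illustration; `path` is a .docx file name
    or a file-like object.
    """
    with zipfile.ZipFile(path) as archive:
        # Open just this one member; nothing is extracted to disk.
        with archive.open("docProps/app.xml") as member:
            root = ET.parse(member).getroot()
    props = {}
    for child in root:
        # Real files qualify tags with a namespace ("{uri}Words"),
        # so keep only the local part of the tag name.
        local_name = child.tag.rsplit("}", 1)[-1]
        props[local_name] = child.text
    return props


def docx_word_count(path):
    """Return the stored <Words> value as an int, or None if it is absent."""
    words = read_docx_app_properties(path).get("Words")
    return int(words) if words is not None else None


# Usage (the file name is a placeholder):
#   print(docx_word_count("report.docx"))
```

Note that the number is whatever Word stored in app.xml when the file was last saved; the sketch does not count the words in the document body itself.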

Jon Astle  Jun 17, 2019 to Eduard Lebedyuk

Hi Eduard, many thanks for the response. However, the files I am trying to get the word count for are .doc files, not .docx files. What I don't understand is how Windows pulls these properties even when Word is not installed. Surely there should be a way for COS to tap into the same Windows libraries?

Eduard Lebedyuk  Jun 17, 2019 to Jon Astle

You can use the Apache POI library, for example, to get this information. Calling PowerShell is another option, but cursory googling suggests that route goes through COM objects, which require Word to be installed. As for how Explorer gets the .doc info, I honestly have no idea.
