Question
· Nov 22, 2021

How to list all files in a given folder (and sub-folders if needed)

Hi Community,

I recently needed to scan some folders and sub-folders to retrieve filenames using Caché ObjectScript (COS), and I implemented it in the following way.

ClassMethod ListDir(
    path = "",
    wildchar = "*",
    recursive As %String(VALUELIST=",y,n") = "y",
    ByRef dirlist)
{
    if path'="" {
        if ##class(%File).DirectoryExists(path) {
            set rs=##class(%ResultSet).%New("%File:FileSet")
            set sc=rs.Execute(path,wildchar,"",1)
            while (rs.Next()) {
                set name=rs.Data("Name")
                set type=rs.Data("Type")
                // if sub-folder, loop once more
                if type="D",recursive="y" {
                    do ..ListDir(name,wildchar,"y",.dirlist)
                }
                // if file, add to list
                if type="F" {
                    set dirlist($i(dirlist))=name
                }
            }
            do rs.Close()
        }
    }
}

The method returns its output in an array (dirlist) passed by reference. Setting the recursive flag to "y" descends into each sub-folder, and providing a value for wildchar restricts the result to the desired extension. There is probably a better way to do this, but it worked for me and results come back fairly quickly. I hope you find it useful.
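For illustration, a call to the method above might look like the following. This is only a sketch: the class name Demo.Files and the folder C:\temp\ are placeholders that do not come from the original post, so substitute the class that actually holds ListDir and a folder that exists on your system.

```objectscript
// Demo.Files and C:\temp\ are hypothetical; use your own class and folder
kill files
do ##class(Demo.Files).ListDir("C:\temp\", "*.txt", "y", .files)
// files holds the number of entries, files(1)..files(n) the full file paths
for i=1:1:$get(files) {
    write files(i), !
}
```

Because dirlist is filled through $i(dirlist), the top node of the array carries the count, which is why the loop can run from 1 to $get(files).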

Product version: Caché 2018.1
$ZV: Cache for Windows (x86-64) 2018.1.1 (Build 312U)

This method (getFiles) is marked as internal in Caché, and yes, it is a typical internal method: using it relies on solid knowledge of the internals :). Besides, it is hidden in IRIS, so its caller has to be rewritten to work on both platforms:

 ClassMethod ListDir2(path = "", wildchar = "*", recursive As %String(VALUELIST=",y,n") = "y", ByRef dirlist)
{
 set pExtension=1
 set pExtension(1)=wildchar

#if $zversion["IRIS"
 set temp=$name(^IRIS.Temp)
#else
 set temp=$name(^CacheTemp)
#endif
 
 set pTempNode=$i(@temp)
 kill @temp@(pTempNode)
   
 do ##class(%SQL.Util.Import).getFiles(path,.pExtension,pTempNode,recursive="y")
 merge dirlist=@temp@(pTempNode)
 kill @temp@(pTempNode)  ;zw dirlist
}

FileSet does a lot of things under the hood. I found that it performs several QueryOpen operations per file, caused by GetFileAttributesEx calls that fetch the file size, modification date and so on. One call should be enough, but FileSet makes 4 calls per file.



$ZSEARCH seems more efficient (especially if you don't need extra file info like size or date). This function is not meant to be called in a recursive context, so special care is needed:

kill FILES
set FILES($i(FILES))="C:\somepath\"
set key = ""
for
{
    set key = $order(FILES(key),1,searchdir)
    quit:key=""
    set filepath=$ZSEARCH(searchdir_"*")
    while filepath'=""
    {
        set filename = ##class(%File).GetFilename(filepath)
        if (filename '= ".") && (filename '= "..") //might exclude more folders
        {
            if ##class(%File).DirectoryExists(filepath)
            {
                set FILES($i(FILES)) = filepath_"\" //search in subfolders
            }
            else
            {
                //do something with filepath
                //...
            }
        }

        set filepath=$ZSEARCH("")
    }
}

$ZSEARCH still does one QueryOpen operation per file (AFAIK it's not needed, since we only need the filename, which is already provided by the preceding QueryDirectory operation via FindFirstFile), but at least it does it only once.

Based on my own measurements, it's at least 5x faster (your results may vary). I am looping through 12,000 files; if you have a smaller dataset, it might not be worth the trouble.

If you need extra file attributes (like size), you can use these methods:

##class(%File).GetFileDateModified(filepath)
##class(%File).GetFileSize(filepath)

Even with those calls in place, it's still faster than FileSet.
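As a rough sketch of how those two calls could sit inside a $ZSEARCH loop, the following scans a single folder (no recursion) and prints the size and modification date of each file. The path is the same placeholder used earlier, and the output format is only an assumption made for illustration.

```objectscript
// Sketch only: "C:\somepath\" is a placeholder path
set filepath = $ZSEARCH("C:\somepath\*")
while filepath '= "" {
    set filename = ##class(%File).GetFilename(filepath)
    // skip "." and "..", and skip sub-folders
    if (filename '= ".") && (filename '= "..") && '##class(%File).DirectoryExists(filepath) {
        set size = ##class(%File).GetFileSize(filepath)
        // GetFileDateModified returns a $HOROLOG-format value
        set modified = ##class(%File).GetFileDateModified(filepath)
        write filepath, " ", size, " bytes, modified ", $zdatetime(modified, 3), !
    }
    set filepath = $ZSEARCH("")
}
```

No error handling is included here; if either call fails for a given file, the returned value may not be a valid size or date, so production code should check it before formatting.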