Written by

Maeston Consulting
Question Tony Alexander · Nov 22, 2021

How to list all files in a given folder (and sub-folders if needed)

Hi Community,

I recently needed to interrogate some folders/sub-folders to retrieve filenames using Caché ObjectScript (COS), and I implemented it in the following way.

ClassMethod ListDir(path = "", wildchar = "*", recursive As %String(VALUELIST=",y,n") = "y", ByRef dirlist)
{
    i path'="" {
        i ##class(%File).DirectoryExists(path) {
            s rs=##class(%ResultSet).%New("%File:FileSet")
            s sc=rs.Execute(path,wildchar,"",1)
            while (rs.Next()) {
                s name=rs.Data("Name")
                s type=rs.Data("Type")
                // if sub-folder loop once more
                i type="D",recursive="y" {
                    d ..ListDir(name,wildchar,"y",.dirlist)
                }
                // if file add to list
                i type="F" {
                    s dirlist($i(dirlist))=name
                }
            }
            d rs.Close()
        }
    }
}

The code returns the output as an array (dirlist). Setting the recursive flag to "y" makes it loop through individual sub-folders, and providing a value for wildchar restricts the result to your desired extension. I'm pretty sure there's a better way to do this, but this worked for me and results are returned fairly quickly. I hope you find it useful.
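For illustration, a call to the method above might look like the sketch below. The class name Demo.Files and the "*.txt" pattern are assumptions made up for this example; substitute whatever class actually holds ListDir.

```objectscript
 // Hypothetical caller: assumes ListDir is defined in a class named Demo.Files
 kill dirlist
 do ##class(Demo.Files).ListDir("C:\Temp\","*.txt","y",.dirlist)
 // The top node dirlist holds the number of files found;
 // dirlist(1)..dirlist(n) hold the names returned by FileSet
 for i=1:1:$get(dirlist) {
     write !,dirlist(i)
 }
```

Killing dirlist first matters because the method appends to whatever the array already contains.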

Product version: Caché 2018.1
$ZV: Cache for Windows (x86-64) 2018.1.1 (Build 312U)

Comments

Vitaliy Serdtsev · Nov 23, 2021

There is a ready-made method for this: getFiles

Usage example:

s path="C:\Temp"
s pExtension=1 s pExtension(1)="*"
s pTempNode=$i(^CacheTemp) k ^CacheTemp(pTempNode)
d ##class(%SQL.Util.Import).getFiles(path,.pExtension,pTempNode,1)
m dirlist=^CacheTemp(pTempNode) k ^CacheTemp(pTempNode)
zw dirlist

Ben Spead  Nov 23, 2021 to Vitaliy Serdtsev

This looks like an excellent resource and I hadn't heard of it before.  Thank you so much for sharing!

Alexey Maslov  Nov 23, 2021 to Ben Spead

The getFiles method is marked as internal in Caché, and yes, it's a typical internal method in that using it relies on solid knowledge of the internals :). Besides, it's hidden in IRIS, and its caller should be rewritten to achieve DBMS independence:

ClassMethod ListDir2(path = "", wildchar = "*", recursive As %String(VALUELIST=",y,n") = "y", ByRef dirlist)
{
 s pExtension=1
 s pExtension(1)=wildchar

#if $zversion["IRIS"
 s temp=$name(^IRIS.Temp)
#else
 s temp=$name(^CacheTemp)
#endif

 s pTempNode=$i(@temp)
 k @temp@(pTempNode)

 d ##class(%SQL.Util.Import).getFiles(path,.pExtension,pTempNode,recursive="y")
 m dirlist=@temp@(pTempNode)
 k @temp@(pTempNode)
 ;zw dirlist
}

Tony Alexander · Nov 23, 2021

I guess I re-invented the wheel a bit, thanks for the info.

Ben Spead  Nov 23, 2021 to Tony Alexander

Don't worry about it - I have been writing ObjectScript code for 18 years and I had no idea about this method - I have always done it the way you did in your example :)  Thanks for asking the question so I could learn something new as well!

Alexey Maslov  Nov 24, 2021 to Tony Alexander

@Tony Alexander,
your method is ~30% faster than the "ready-made" one on a directory tree populated with ~200 files.
Keep re-inventing wheels!

Sidney Levy  Nov 28, 2021 to Alexey Maslov

For the MUMPS veterans: you can also use the $ZSEARCH function, as in this example:

  NEW $NAMESPACE
  SET $NAMESPACE="USER"
  SET file=$ZSEARCH("i*")
   WHILE file'="" {
       WRITE !,file
       SET file=$ZSEARCH("")
   }
   WRITE !,"That is all the matching files"
   QUIT

I think this one will be faster.

David Underhill  Nov 25, 2021 to Tony Alexander

I hadn't heard of that %SQL method either and, as well as speed, the %File method has advantages such as more file details being returned and better filtering with wildcards.

Norman W. Freeman · May 12

FileSet does a lot of things under the hood. I found that it does several QueryOpen operations per file, due to GetFileAttributesEx calls to get file size, modified date and such. One call should be enough, but FileSet makes 4 calls per file.


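$ZSEARCH seems more efficient (especially if you don't need extra file info like size or date). This function is not meant to be called in a recursive context, so special care is needed. The code below collects sub-folders in a list and searches them one after another instead of recursing: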

kill FILES
set FILES($i(FILES))="C:\somepath\"
set key = ""
for
{
    set key = $order(FILES(key),1,searchdir)
    quit:key=""

    set filepath=$ZSEARCH(searchdir_"*")
    while filepath'=""
    {
        set filename = ##class(%File).GetFilename(filepath)
        if (filename '= ".") && (filename '= "..") //might exclude more folders
        {
            if ##class(%File).DirectoryExists(filepath)
            {
                set FILES($i(FILES)) = filepath_"\" //search in subfolders
            }
            else
            {
                //do something with filepath
                //...
            }
        }

        set filepath=$ZSEARCH("")
    }
}

$ZSEARCH still does one QueryOpen operation per file (AFAIK it's not needed, since we only need the filename, which is already provided by the earlier QueryDirectory operation, using FindFirstFile), but at least it does it only once.
Based on my own measurements, it's at least 5x faster (your results may vary). I am looping through 12,000 files; if you have a smaller dataset, it might not be worth the trouble.
If you need extra file attributes (like size), you can use these functions:

##class(%File).GetFileDateModified(filepath)
##class(%File).GetFileSize(filepath)

Even with those calls in place, it's still faster than FileSet.
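As a rough sketch of how those two calls could be combined with $ZSEARCH for a single folder (no recursion), something like the following might work. The variable dir and its placeholder path are assumptions for illustration only.

```objectscript
 // Sketch only: dir is a placeholder path, adjust to your environment
 set dir="C:\somepath\"
 set filepath=$ZSEARCH(dir_"*")
 while filepath'="" {
     // Skip directories (this also skips the "." and ".." entries)
     if '##class(%File).DirectoryExists(filepath) {
         set size=##class(%File).GetFileSize(filepath)
         // The modified date is returned in $HOROLOG format
         set modified=##class(%File).GetFileDateModified(filepath)
         write !,filepath,"  size=",size,"  modified=",modified
     }
     set filepath=$ZSEARCH("")
 }
```

The attribute calls are made only for regular files, so directories cost no extra lookups beyond the DirectoryExists check.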
