I'm trying to get a count of all files in a folder that match a given mask, but I want to avoid the expense of returning each match, or even a list, since there could be tens of thousands of matches.

I could do FindNextFile repeatedly until done, but that's a lot of costly round trips.

Is there a convenience function for this?

This is the code in use now. Since all I need is the count, I'm looking for a less costly way to get there:

string[] files = System.IO.Directory.GetFiles(Path, InFileMask);
if (files.Length == ExpectedCount)
Robert Kerr
  • I doubt that you will beat `FindNextFile`. What makes you think that is slow? http://stackoverflow.com/questions/15351288/is-there-a-faster-alternative-to-enumerating-folders-than-findfirstfile-findnext – David Heffernan Jul 18 '13 at 13:58
  • I was hoping that, since the FS is already keeping a directory of filenames, there would be a fast call for querying that with a match and only returning the count, rather than a list of files. FindNextFile is a lot of round trips I don't need, as I have to keep counting repeatedly since the number of files is constantly changing. – Robert Kerr Jul 18 '13 at 20:46
  • What do you mean "keep counting repeatedly"? Is there more that you are not telling us? – David Heffernan Jul 18 '13 at 20:55
  • Yes, but it's not pertinent. The counting of matching files continues until it meets the expected count. However, the cost of a single count is what needs to be reduced. – Robert Kerr Jul 19 '13 at 13:09

2 Answers

You should be able to achieve this by using the Microsoft Indexing Service. There's also another article here. Unfortunately, it is not always active on every system and is thus not fully reliable.

A better option would be to implement your own indexing service. You can scan the whole PC when the app starts and then rely on a FileSystemWatcher to listen for file changes on the system. This way you can implement very fast file enumeration and return file counts in microseconds, since a count becomes a lookup in your own index rather than a directory scan.
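
As a rough illustration of that idea, here is a minimal sketch that tracks a single folder rather than the whole PC. The class name `MatchingFileCounter` is hypothetical, and the design makes some assumptions: it counts only top-level files, it simply rescans on renames and on watcher errors (FileSystemWatcher can drop events when its internal buffer overflows), and a file created while the initial scan is running could be miscounted until the next rescan.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading;

// Hypothetical helper: keeps a running count of files in one folder
// that match a mask, so that reading the count needs no directory scan.
public sealed class MatchingFileCounter : IDisposable
{
    private readonly string _folder;
    private readonly string _mask;
    private readonly FileSystemWatcher _watcher;
    private int _count;

    public MatchingFileCounter(string folder, string mask)
    {
        _folder = folder;
        _mask = mask;

        // NotifyFilters.FileName limits notifications to file create,
        // delete, and rename events (directory changes are not reported).
        _watcher = new FileSystemWatcher(folder, mask)
        {
            NotifyFilter = NotifyFilters.FileName
        };
        _watcher.Created += (s, e) => Interlocked.Increment(ref _count);
        _watcher.Deleted += (s, e) => Interlocked.Decrement(ref _count);

        // A rename may move a file into or out of the mask, and an error
        // may mean events were lost, so fall back to a full rescan.
        _watcher.Renamed += (s, e) => Rescan();
        _watcher.Error += (s, e) => Rescan();

        Rescan();                            // one initial scan
        _watcher.EnableRaisingEvents = true; // then follow changes
    }

    // Current count of matching files, as far as the watcher knows.
    public int Count => Volatile.Read(ref _count);

    private void Rescan()
    {
        int fresh = Directory.EnumerateFiles(_folder, _mask).Count();
        Interlocked.Exchange(ref _count, fresh);
    }

    public void Dispose() => _watcher.Dispose();
}
```

With something like this, the repeated check from the question becomes `counter.Count == ExpectedCount`. Because events arrive asynchronously, the count should be treated as approximate, and an occasional rescan is a reasonable safeguard.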

An easier approach is to use the excellent Faster Directory Enumeration Tool, which also has mask/filter support that can handle this, e.g.:

var filesCount = FastDirectoryEnumerator.GetFiles(Path, InFileMask).Length;
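
If adding a third-party library is not an option, a built-in alternative worth testing is `Directory.EnumerateFiles` (available since .NET 4), which yields matches lazily instead of building a `string[]`. The sketch below is only an illustration: the helper names `CountMatching` and `HasAtLeast` are hypothetical. On Windows it still walks the directory with the same find-file calls underneath, so it saves the array allocation rather than the enumeration itself.

```csharp
using System;
using System.IO;
using System.Linq;

public static class FileCounting
{
    // Hypothetical helper: counts matches without materializing a string[].
    public static int CountMatching(string folder, string mask)
    {
        return Directory
            .EnumerateFiles(folder, mask, SearchOption.TopDirectoryOnly)
            .Count();
    }

    // Hypothetical helper: stops enumerating as soon as 'expected' matches
    // have been seen, which helps when only a threshold matters.
    public static bool HasAtLeast(string folder, string mask, int expected)
    {
        return Directory
            .EnumerateFiles(folder, mask, SearchOption.TopDirectoryOnly)
            .Take(expected)
            .Count() == expected;
    }
}
```

For example, `FileCounting.CountMatching(Path, InFileMask) == ExpectedCount` would replace the `GetFiles(...).Length` check from the question. Whether this is measurably faster depends on the folder size and should be benchmarked.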
Chibueze Opata
  • Looks like the indexing service will be the way to go. The FasterDirectoryEnumerator has some issues, described in its comments, that would be a problem in this situation. I'll do some testing and come back to this with results! – Robert Kerr Jul 19 '13 at 13:20

If this is really an issue (with tens of thousands of filenames to search, and thousands of different searches to be done), then the simplest approach (without writing huge amounts of code) might be to cache the filenames in an indexed database table and then use SQL to do the count. After the work of copying the filenames to the database and indexing the table, any decent SQL engine should do an excellent job on each individual search. I would use SQLite myself.

The snag will be keeping the cache up to date. How much of a pain this is will depend on exactly what you're attempting to do.

ravenspoint
  • You could use the indexing service, which is already maintaining a database of this information. – Raymond Chen Jul 18 '13 at 14:16
  • @Raymond: How do I query the indexing service for a count of files in a folder matching a filter? I think that's what you're indicating is possible, by your answer. – Robert Kerr Jul 18 '13 at 20:46
  • I just did an MSDN search on "indexing service" and hey look, I found [Windows Search overview](http://msdn.microsoft.com/en-us/library/aa965362(v=vs.85).aspx) and [Querying the index programmatically](http://msdn.microsoft.com/en-us/library/bb266517(v=vs.85).aspx) and [Windows Search SDK samples](http://www.microsoft.com/en-us/download/details.aspx?id=7388). – Raymond Chen Jul 18 '13 at 23:11
  • Yeah, looks like indexing service queries ought to reduce the expensive parts considerably. Will be doing some testing. – Robert Kerr Jul 19 '13 at 13:18