File Enumerator
From Real Software Documentation
Aim
We’re taking a breather from learning new programming concepts, and putting some of these concepts together in a useful project.
A Useful Class: A File Enumerator
We’ve absorbed a lot of concepts in the course so far. We’re going to switch over to using these features to design a very useful class. This will be a flexible file enumerator.
Enumerating is a fancy programmer’s term for going through a set of things one at a time (a similar term you might see is traversing). We will create a class you can point at a folder or a file, and our class will generate an event for each item in that folder.
Design Issues
There are a few major approaches we could take to this project:
- We could create a class that traverses part of the file tree, providing its subclass with an opportunity to process each file as it does so; or
- We could create a class that repeatedly calls a separate file receiver class as it traverses the file tree; or
- We could make the traversal class passive, driven by repeated calls from outside, asking for the next file.
The approach we will take is the first one; if we wish, we can extend it to create the second. The third approach tends to be more cumbersome: the external caller has to keep checking the state of the traversal, for example to find out whether it has finished. Before we go on, you should read about the FolderItem class.
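To make the contrast between the first and third approaches concrete, here is a minimal sketch in Python (not the REALbasic used by the project). It walks a flat list of names rather than a real file tree, and every name in it (Traverser, Collector, PassiveTraverser, has_next, next_item) is hypothetical and chosen only for illustration.

```python
class Traverser:
    """Approach 1: the base class drives the walk and calls a hook."""

    def __init__(self, items):
        self.items = list(items)

    def run(self):
        for item in self.items:
            self.enumerate_item(item)  # the subclass gets its chance here

    def enumerate_item(self, item):
        pass  # the default does nothing; subclasses override this


class Collector(Traverser):
    """A subclass only has to say what happens to each item."""

    def __init__(self, items):
        super().__init__(items)
        self.seen = []

    def enumerate_item(self, item):
        self.seen.append(item)


class PassiveTraverser:
    """Approach 3: the caller has to ask for each item in turn."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def has_next(self):
        return self._pos < len(self._items)

    def next_item(self):
        item = self._items[self._pos]
        self._pos += 1
        return item


collector = Collector(["a.txt", "b.txt"])
collector.run()

passive = PassiveTraverser(["a.txt", "b.txt"])
pulled = []
while passive.has_next():  # the caller must keep checking the state
    pulled.append(passive.next_item())
```

With the first approach the loop lives inside the class; with the third, every caller has to write the has_next loop itself.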
Depth First vs. Breadth First
Files are arranged in a tree structure of folders, sub-folders, sub-sub-folders, and so on. Classes are in a similar tree structure of classes, subclasses, sub-subclasses, and so on. Tree structures for information are very common in computer programming, and dealing with them will often involve enumerating their contents as we are doing in this project. There are two main ways of traversing a tree structure:
- In a depth-first traversal, when we encounter a folder in the current folder, we immediately start enumerating its contents, descending in the same way into any sub-folder we come to, and so on. We get back to the remaining items in a folder only after we have finished enumerating the contents of its sub-folder; or
- In a breadth-first traversal, we enumerate all the files in the current folder, then enumerate the contents of any folders in the current folder.
We will see both of these techniques in action in this lesson’s project.
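The two orders can be compared on a small in-memory tree. The following is a rough Python sketch rather than the project's REALbasic code; the nested-dictionary representation of folders and the function names are assumptions made for the example. The breadth_first function follows the definition given above (finish the current folder, then recurse into its sub-folders).

```python
# A folder is a dict mapping names to children; a file is represented by None.
tree = {
    "a.txt": None,
    "docs": {"b.txt": None, "old": {"c.txt": None}},
    "z.txt": None,
}


def depth_first(folder, prefix=""):
    """Descend into each sub-folder as soon as it is encountered."""
    order = []
    for name, child in folder.items():
        path = prefix + name
        order.append(path)
        if child is not None:  # a folder: enumerate its contents right away
            order.extend(depth_first(child, path + "/"))
    return order


def breadth_first(folder, prefix=""):
    """Enumerate everything in this folder, then visit its sub-folders."""
    order = []
    subfolders = []
    for name, child in folder.items():
        path = prefix + name
        order.append(path)
        if child is not None:  # a folder: remember it for later
            subfolders.append((path, child))
    for path, child in subfolders:
        order.extend(breadth_first(child, path + "/"))
    return order
```

For this tree, depth_first(tree) visits a.txt, docs, docs/b.txt, docs/old, docs/old/c.txt, z.txt, while breadth_first(tree) visits a.txt, docs, z.txt, docs/b.txt, docs/old, docs/old/c.txt.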
The Project
- Open the project 18_FileEnumerator. Open the FileEnumerator class.
Notice that the class has a constructor, one public method, several private methods and three new events.
Two of these events control the enumeration: DepthFirst lets a subclass determine how the enumeration takes place, and Stop lets a subclass interrupt it. Both work just fine with the default return value, so a subclass really only needs to implement one event handler (EnumerateItem).
- Open the FileCounter class, and see how straightforward it is to create a child class that counts the files in a location.
- Go back to the FileEnumerator class and examine the recursive structure of the file enumeration process.
Note how a depth-first enumeration is carried out by recursively enumerating each folder as soon as we come to it, while a breadth-first enumeration is performed by enumerating all the items in a folder first, and then recursively enumerating the contents of any sub-folders.
- Put a breakpoint in the Enumerate method of FileEnumerator. Run the program, and choose a folder with a small number of files and one or two subfolders. Trace through the execution. Then comment out the command in the DepthFirst event handler of the FileCounter class, run the program again, and choose the same folder.
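For readers who want to see the overall shape of such a class outside the project file, here is a minimal Python sketch. It is not the project's REALbasic source: overridable methods stand in for the DepthFirst, Stop and EnumerateItem events, both Boolean hooks are assumed to default to False, and the attribute names (files, folders) are hypothetical.

```python
import os


class FileEnumerator:
    """Walks a folder and calls enumerate_item for each item found in it."""

    def __init__(self, root):
        self.root = root

    # Overridable hooks standing in for the three events.
    def depth_first(self):
        return False  # assumed default: breadth-first

    def stop(self):
        return False  # assumed default: never interrupt

    def enumerate_item(self, path):
        pass  # subclasses do their work here

    def enumerate(self):
        if self._is_folder(self.root):
            self._walk(self.root)
        else:
            self.enumerate_item(self.root)  # pointed at a single file

    @staticmethod
    def _is_folder(path):
        # Symbolic links are treated as plain items and never followed.
        return os.path.isdir(path) and not os.path.islink(path)

    def _walk(self, folder):
        deferred = []
        for name in sorted(os.listdir(folder)):
            if self.stop():
                return
            path = os.path.join(folder, name)
            self.enumerate_item(path)
            if self._is_folder(path):
                if self.depth_first():
                    self._walk(path)       # recurse immediately
                else:
                    deferred.append(path)  # recurse after this folder is done
        for path in deferred:
            if self.stop():
                return
            self._walk(path)


class FileCounter(FileEnumerator):
    """Counts the files and folders found under a location."""

    def __init__(self, root):
        super().__init__(root)
        self.files = 0
        self.folders = 0

    def enumerate_item(self, path):
        if self._is_folder(path):
            self.folders += 1
        else:
            self.files += 1
```

A caller would create FileCounter(some_folder), call enumerate(), and then read the files and folders attributes. Overriding depth_first to return True switches the traversal order without touching the walking code.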
TrueItem
Note that when we pick out a particular item in a folder, we use the TrueItem function rather than the Item function of the FolderItem. The difference lies in the treatment of shortcuts (shortcuts are called aliases on Macintosh): Item directed at a shortcut will work with the file the shortcut points at, while TrueItem will work on the shortcut file itself.
This matters because, if we followed shortcuts to their targets, it would be possible to get stuck in a looped directory structure: a shortcut that points back at one of the folders containing it would cause the enumeration process to run forever.
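The same idea can be illustrated outside REALbasic. In the Python sketch below, a symbolic link plays the role of the shortcut: os.path.isdir on its own looks through the link to its target, so the extra os.path.islink check is what keeps the walk on the link itself. This is only an analogy for Item versus TrueItem, and the function names are hypothetical.

```python
import os


def is_real_folder(path):
    """True only for an actual directory, not a link that points at one."""
    return os.path.isdir(path) and not os.path.islink(path)


def count_entries(folder):
    """Count every item under folder, treating links as plain items."""
    total = 0
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        total += 1
        if is_real_folder(path):  # never descend through a link
            total += count_entries(path)
    return total
```

If is_real_folder used os.path.isdir alone, a link pointing back at a containing folder would make count_entries recurse until it failed.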
It would clearly be better if we supported following shortcuts to their original file, and in a later project we will do that; in fact, we will notify the subclass through an event that we’ve encountered a shortcut, and let the subclass choose whether or not we follow it.
Even better, we will build a class that can’t get stuck in a loop, no matter what the subclass does. However, doing this is a surprising amount of extra work, and we will wind up building quite a bit of infrastructure to make this work efficiently in the coming lessons.
Concepts We’ve Used In This Project
Notice that this project uses recursion. This is a natural way to approach this problem, because the data involved has a recursive structure (a folder is the same type of thing as the folder it is within).
Notice that we’ve built a class that must be subclassed in order to do anything useful. Notice also that we’ve built a flexible class that can be used in a variety of ways. Armed with this class, you can make a variety of small file utilities very quickly. Notice how little extra code is involved in using this class to create a program that can count files and folders, for example.
Further Exercises
There are any number of potential exercises one could undertake from these beginnings. Try to develop a useful file utility for yourself, perhaps one that lists all the files anywhere within a folder.
One nice enhancement would be to give the subclass the ability to decide whether a shortcut or its target should be enumerated. Probably the best way to do that would be to use a special event for an alias: EnumerateAlias(f As FolderItem, g As FolderItem) As Boolean. The event handler would return True to indicate that the target should be enumerated if it is a folder, and False if it should not. You should think about the looping directory structure problem. It isn’t too hard to construct a simple solution, but it is a fair bit more work to devise something that doesn’t slow down very badly as the directory structure you’re searching gets bigger.
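As a starting point for thinking about this exercise, here is one possible shape for a solution, sketched in Python rather than REALbasic. It assumes the first argument of the alias hook is the alias and the second is its target (the source does not say which is which), and it guards against loops by remembering the resolved path of every folder it has entered. The class and method names are hypothetical.

```python
import os


class AliasAwareEnumerator:
    """Enumerator that asks the subclass whether to follow each link."""

    def __init__(self, root):
        self.root = root
        self._visited = set()  # resolved paths of folders already entered

    def enumerate_item(self, path):
        pass  # called for every item that is not a link

    def enumerate_alias(self, alias, target):
        return False  # assumed default: do not follow links

    def enumerate(self):
        self._visited.clear()
        self._walk(self.root)

    def _walk(self, folder):
        key = os.path.realpath(folder)
        if key in self._visited:
            return  # already enumerated: this is what breaks a loop
        self._visited.add(key)
        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            if os.path.islink(path):
                target = os.path.realpath(path)
                follow = self.enumerate_alias(path, target)
                if follow and os.path.isdir(target):
                    self._walk(target)
            else:
                self.enumerate_item(path)
                if os.path.isdir(path):
                    self._walk(path)
```

Keeping the visited folders in a set means each membership check takes roughly constant time on average; scanning a plain list of visited folders instead would be the kind of simple solution that slows down as the directory structure grows.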
