Received 30 January 2016; accepted 20 March 2016; published 23 March 2016
1. Introduction
File systems are among the most important parts of an operating system. It is the structure and method of organization for data in a storage unit. Without it, the content of a disk is nothing but a long stream of meaningless bytes. A file system keeps track of where these bytes are, which ones are related to others, as well as file metadata. Learning about file systems is essential in any computing program. Not only is it very important in the Information Technology field, but the concepts of data structures and algorithms behind them can also be used in other, higher level, applications. File systems, especially modern ones such as exFAT or NTFS, can be difficult for some students to fully understand. This is the impetus for the Zion project.
Zion is a simulator that is targeted to computing students to help them understand I/O operations and experiment “hands on” with file systems. The application is provided with examples of a simple file system and a volume manager, so that students can see the complex tasks that the operating system has to do in order to send and receive I/O packets. Users can run workloads, which represents a sequence of “create,” “delete,” “read,” or “write” operations, and analyze the results of these instructions. On each workload session Zion shows the time that was spent to read and write files, how many I/O operations were performed, along with additional information. Zion displays a diagram of the volume, divided in blocks, where users can see where a particular file was written. The expandability of the Zion framework allows students to implement and run more sophisticated file systems. Users can create a virtual NTFS or Unix File System, for instance, or they can even invent new ones. The goal of this project is to provide students and professors a tool that could potentially be used in laboratory assignments for Operating Systems or Computer Architectures classes.
This paper is organized as follows. Section 2 provides context for Zion by examining existing file system simulators. Section 3 discusses the design of Zion itself, while Section 4 covers the expandability features. Section 5 provides an evaluation of Zion, and Section 6 points to areas of future work.
2. Background and Related Work
Zion is not the only file system simulator available to students. Two applications in particular are very similar: Modern Operating Systems Simulators (MOSS) by Ray Ontko, and Java File System Simulator by Moazan.
MOSS File System Simulator [1] , written in Java, was designed to show how Linux operating systems work behind the scenes. It is a command-based application that has a set of the most important UNIX system calls such as “creat,” “open,” “read,” and “write,” but also “mkdir” and “cp.” This tool allows users to, among other things, simulate the creation of a file by specifying filename, block size, and blocks. MOSS not only shows the block details output in the terminal screen, but it also creates the actual file on the disk. While this tool is very powerful and it implements a great variety of system functions, it does not let the users execute more than one instruction at the time. Users can create batch files to run multiple instructions, but the tool does not show an overall statistics of such operations.
Java Files System Simulation [2] is an application that can run on any machine and, unlike MOSS, it has a graphical user interface. This application is simple to use and it has a very intuitive interface. Java Files System Simulation allows users to create a virtual drive and copy and paste files from the computer. Unfortunately there is not much documentation available on how new file systems can be created. While there is a way to load file systems from the computer (.fs files), it is not clear to what extends users can customize these file systems.
Both MOSS and Java Files System Simulation are very powerful tools. However, they do not allow users to run a workload of instructions and show statistical information regarding the I/O operations. They do not display details such as the amount of bytes that are read and written, or the time that a particular operation takes (MOSS does show the bytes and blocks, but only for one instruction at the time). Zion, on the other hand, has functionality to load several hundreds of instructions, and show useful analytical information (both numerical and graphical) after each workload session. The interfaces provided with the applications, together with sample code, allow students to easily expand Zion and create new files systems or volume managers.
3. Zion File System Simulator
Zion is a Windows application that allows users to run workloads of I/O instructions such as “create,” “delete,” “read,” and “write” through a simulation of a file system. Users can see where files are being stored in a virtual volume, and they can analyze statistical information including the amount of time needed to complete such instructions, the number of I/O operations, as well as the space used in the volume and the disk fragmentation. Zion is an instructive tool and students can experiment with different workloads. For instance, they can stress a file system with several “creates” and “deletes” to see how the file system handles and minimizes volume fragmentation. An additional improvement over existing simulators is the ability to create new volume types and new file systems, and to dynamically load them into the application through dynamic-link libraries (DLLs).
3.1. Usage
A user starts the application by choosing the volume and the file system they want to use, and customizing these components. The implementation of these components is completely independent of the application framework itself, therefore the customization parameters available depend on the specific type of components that have been loaded into Zion. Out of the box Zion comes with the Simple FS and Standard Disk modules, which are two separate DLLs. Simple FS is a very basic version of a FAT file system. It uses a file allocation table to keep track of where each file (first address) is stored on the disk. It has no maximum number of files that can be stored, unlike FAT and NTFS. However, it does not know the concept of folders: Simple FS is a flat file system. Standard Disk is an example of a volume that, like hard disk drives, is divided into blocks and sectors. The size of the disk can be dynamically set by the user using the parameters provided. Zion runs on Windows machine, and it was designed to mimic as much as possible the internals of Microsoft’s most famous operating system. Two of the most important sources of information used to build Zion are “Windows System Programming” [3] and “Windows Internals” [4] .
3.2. Parameters
Zion displays a set of adjustable parameters for the volume and file system. With the Standard Disk, the user can specify the drive size, the size of a sector (currently locked to 512 bytes, like in most real life implementations), and the number of sectors for each block, shown in Figure 1 (the icons used are from [5] ).
To provide a more realistic feeling, the user can also specify values for average seek time, latency, and transfer time. Physical disks need to move their magnetic heads to the right track (seek time), then the disks have to
rotate so that the head can access the specific sector (latency), and then the data has to be written/read (transfer time). Zion lets the user specify the average time that these operations take, and then it uses these parameters to calculate runtime statistics.
3.3. Disk Format
Once the disk parameters have been set, the volume needs to be mounted and then formatted, like in actual computers. This is done by means of the menu buttons “Mount” and “Format” (Figure 2). The flags next to the file system and volume names denote that the mounting and formatting operations were successful (Figure 3). The number of blocks in the disk is calculated based on the disk size and the sector size (Figure 4). A block is the smallest allocation unit of a volume. The format process not only creates these allocation units, but it also marks the blocks that are reserved by the file system (this needs to be included in the file system implementation). Zion then displays the graphical representation of the volume showing the blocks in the disks after the formatting process (Figure 5).
Figure 3. Volume and file systems flags.
Figure 5. Diagram of the volume with statistical data.
3.4. Workload
A workload represents a series of instructions for the I/O manager. These instructions include CREATE, DELETE, READ, and WRITE, and they can be saved in a simple comma separated text file. Below is an example workload.
CREATE, file4.txt
CREATE, file5.txt
CREATE, file2.txt
WRITE, file2.txt, 1878606 bytes
DELETE, file5.txt
READ, file2.txt, 4412717 bytes
CREATE, file6.txt
DELETE, file2.txt
CREATE, file7.txt
WRITE, file4.txt, 1593981 bytes
DELETE, file7.txt
Each instruction has different parameters separated by a comma. The first parameter is the instruction type. The second is the file name, and the third one is the size. Size applies to WRITE and READ only, and it can be in one of the following units (not case sensitive):
“mb” or “megabyte”: 1,000,000 bytes
“mib” or “mebibyte”: 1,048,576 bytes (220 bytes)
“kb” or “kilobyte”: 1000 bytes
“kib” or “kibibyte”: 1024 bytes (210 bytes)
“bytes” (by default, if omitted)
Note: while the tool allows users to use MB/MiB as size units, files that are too big might not be written or read. This is because Zion actually allocates the same amount of data into memory for simulation purposes.
Although workloads can be created with any text editor, Zion also provides a tool to help the user create random instructions for the workload (Figure 6).
In the “Create Workload File” screen, users can indicate what instructions they want to include, the minimum and maximum size for read and write, and the number of entries. Notice that “create” is always selected. This is because the tool needs to have a reference to existing files in order to generate “delete,” “read,” and “write” instructions.
The “Open” button (Figure 7) loads the workload into the application, and the “Workload” tab of the GUI displays the list of these instructions with the total file count and size (Figure 8). Workloads can also be created on the fly by dragging and dropping existing files from your computer. For each file dropped into the list, the system generates two entries: CREATE and WRITE. The WRITE instruction will have the size of the file, and it will contain the actual file data.
3.5. Session
With the volume properly mounted and formatted, and with a workload ready, the user can execute the instructions (Figure 9 and Figure 10).
Running a session consists of looping through each instruction and perform the appropriate operations. Alternatively, users can run a single instruction by selecting it and clicking “Run.” Based on the instructions in the workload, files are created, written, read, and deleted from the virtual volume. Because the list of files at the end of a session may not correspond to the list of files from the workload, the “Files” tab of the GUI shows all the files that are still present on the disk once the session is done (Figure 11). Clicking on the file name will display additional information such as creation, access, and write time (the data available depends on the actual implementation of the file system).
The “Volume Blocks” diagram (Figure 12) highlights the allocation units that are used and available, as well as additional information such as how much space has been taken by files, and the percent of disk fragmentation, although the actual data available depends on the implementation of the chosen file system (Figure 13).
3.6. Session Log and Metrics
Sessions logs and metrics are probably the most important outputs of a workload session (Figure 14 and Figure 15). Depending on user preference (see section “Additional Settings”), the session log can show different types of events. The tool always captures the instruction being executed with a timestamp. In addition, the log can show each time the file system performs one of the four operations (create, delete, read, and write), or when the volume manager writes to or reads from a block. In addition, the session metrics can include how many bytes were read and written, the total and average time spend, the number of I/O operations, and the lookup counts of
Figure 10. Overview of the application with a session of workload in execution.
Figure 11. List of files available on the disk.
Figure 12. Volume blocks after the workload session is complete.
the file system (that is, how many lookups are performed in order to find a free entry in the allocation table, or the next address in chain of a file). The user must be aware that including this additional information can significantly slow down the session run, because the system has to capture and display more data.
The amount of information in the session log and metrics depend on the actual implementation of both file system and volume manager. For instance, the file system has to keep track (and send) the amount of total I/O operations to Zion in order to display it at the end of the workload session. Another example is the StandardDisk implementation of a volume. This particular implementation does not allow the volume manager to append data into blocks. Therefore the file system has to do additional reads and writes in order to append data into an existing file. This explains the discrepancy between the bytes from the session metrics and the size from the workload. A session log can be exported into a text file by clicking “Save Log File”, or Zion can automatically save all session log files.
3.7. Additional Settings
4. Expandability
One of the key strengths of Zion is its expandability. Students can easily create their own file systems or simulate the most popular ones such FAT, NTFS, or Unix File System. All they need to do is create a project in any .NET language (the minimum required is 3.0), import the two interfaces IFileSystem and IVolume, and create their own implementation. Upon startup, Zion scans the folder in which the application is executed and it loads all the available assemblies that have the “FileSystem” or “Volume” in their assembly information.
The interfaces IFileSystem and IVolume were deliberately designed to be as intuitive as possible (see Github Repository). File systems and volumes communicate information to Zion by raising appropriate events (for instance, Write Completed). Zion captures the events and associated data to display metrics
5. Conclusions and Future Work
File systems are an essential component of operating systems, and therefore they are an building block for computer science students. The complexity of file systems can make it challenging for students to learn the various structures and features of file systems available in the industry. Zion was designed to help reduce this learning curve. The goal of the Zion project is to provide a robust, easy to use, and flexible laboratory tool for computer science classes. The Zion File System Simulation application is designed as a learning tool for computing students. It is not intended for measuring accurate performance of file systems and volume managers, although more complex implementations could lead to this. While the tool shows information such as total time and I/O operations, this is just indicative information, and it may not be realistic if the implementation skips steps (such as a circular scan) on certain operations. Students may find it hard to notice differences in metrics from one file
system implementation to another, unless these file systems are substantially different in the way they handle the I/O operations.
Zion is an instructive tool for students who are learning operating systems and computer architectures, and its educational power is at different level. The learning experience could be as little as running workloads and observing the results, reading the code of actual file systems and volume implementations (learning more about data structures and algorithms), or actually creating their own implementations. Zion provides a range of choices to educators.
Zion certainly has room for improvement. Future work could include providing additional file systems and volumes, perhaps a simulation of actual FAT or NTFS. New volume managers could simulate rotating disks using different scheduling algorithms such as Circular Scan or Shortest Seek Time First. Whether used as it is, or expanded with new components, Zion has the potential to become a teaching tool in Operating Systems or Computer Architectures laboratory classes.
The installer, the file system and volume interfaces, as well as part of the source code, are available online (GitHub Repository [6] ).