Abstract- In this paper, we present an efficient hybrid solid state device flash file system (HSSDFFS) for flash memory storage. Flash memory, in the form of NAND and NOR flash, has become a major medium for data storage. Currently, a block-level translation interface is required between an existing file system and flash memory chips because of the physical characteristics of flash memory. HSSDFFS is based on both NOR flash and NAND flash memory. In a conventional NAND flash-based file system, there is a trade-off between life span and durability when small amounts of data are written frequently. Because NAND flash supports only page-level I/O, at least one page is consumed by each synchronous write of a small amount of data, and most of that page is wasted. These wasted pages reduce the utilization and life span of the NAND flash. To alleviate the utilization problem, some NAND flash-based file systems write small amounts of data asynchronously through RAM buffers, but buffering in RAM decreases the durability of the system. Our HSSDFFS eliminates the trade-off between life span and durability. Whenever small amounts of data are appended to a file, it synchronously stores the data as a log in the NOR flash. The merged logs are then flushed to the NAND flash in a page-aligned fashion. The implementation of our HSSDFFS is based on our previous NAND flash-based file system, the Core Flash File System (CFFS).
Keywords- Storage management, file system management, NOR flash, Solid state device, NAND flash and Flash file system.
Nowadays, solid state drives (SSDs) are used in PCs, laptops, notebooks, tablets, mobile devices, and embedded devices. Flash memory has become an increasingly important component of nonvolatile storage media because of its small size, shock resistance, and low power consumption. Among nonvolatile memories, NOR flash memory provides fast random access, but it has a high cost and low density compared with NAND flash memory. In contrast to NOR flash memory, NAND flash memory has the advantages of a large storage capacity and relatively high performance for large read/write requests. SSDs are built from flash memory. Recently, the capacity of a NAND flash memory chip reached 8 GB, and this size is expected to increase quickly. Based on the NAND flash chip, the solid state disk has been developed, and it can be used as a storage system in laptop computers. SSDs offer more convenience and flexibility than hard disk drives (HDDs) in terms of power, durability, access speed, size, and noise. Therefore, NAND flash is widely used for data storage in embedded systems and will also be used in PC-based systems in the near future.
NAND flash memory chips are arranged into blocks; each block has a fixed number of pages, which are the units of read/write. A page is further divided into a data region for storing data and a spare region for storing the status of the data region. In first-generation NAND flash memory, the typical page size was 512 bytes with an additional 16-byte spare region, and a block was 16 KB, composed of 32 pages. As capacity grew, the page size of the next generation became 2 KB with an additional 64 bytes in the spare region, and the block size became 128 KB. Because flash memory is a form of Electrically Erasable Programmable Read-Only Memory (EEPROM), in-place updates are not allowed. This means that, when data is modified, the new data must be written to an available page in another position, and this page is then considered a live page. Consequently, the page that contained the old data is considered a dead page. As time passes, a large portion of flash memory comes to consist of dead pages, and the system must reclaim them for new write operations. The erase operation makes dead pages available again. However, because the unit of an erase operation is a block, which is much larger than a write unit, this mismatch requires the live pages in a block to be copied to another location before the block is erased. This process is called garbage collection. The pros and cons of Solid State Drives (SSD) and Hard Disk Drives (HDD) are compared in Table 1.
| New Solid State Drives - SSD | Traditional Hard Disk Drives - HDD |
|---|---|
| No moving parts - solid state construction | Moving parts - rotating platter with mechanical arm |
| Tolerant of shock and vibration | Damage from shock and vibration |
| Higher cost per GB | Lower cost per GB |
| Very fast random access and read speeds | Slower read/write performance |
| Ideal for lower capacity: 16GB to 128GB | Higher total capacity, up to 1TB |
| Lower power consumption | Higher power consumption |

Table 1. Comparison of SSD and HDD
II. BACKGROUND AND RELATED WORKS
In this section, we present a brief overview of flash memory constraints and previous works on flash-based storage systems.
Flash Memory Constraints
Flash memory has three physical constraints. The first constraint is that in-place updates are not allowed, because any written area must be erased before it can be reprogrammed. The resulting out-of-place updates are responsible for most of the constraints in the design of flash file systems. The second constraint is a size mismatch: the erase unit is much larger than the program unit. The erase unit is called an erase block. In a recent flash chip, the size of an erase block is 128 Kbytes, whereas the write unit is a word or a page of 2 Kbytes. Thus, live data and obsolete data may coexist in an erase block. A cleaning operation such as garbage collection is then required to move the live data into a free erase block before the erasure. The third constraint is the limited life span of an erase block, which is bounded by the number of times it can be erased. Usually, 100,000 erase cycles are guaranteed for each erase block. The erase cycles should be spread evenly across all of the erase blocks in order to level their wear.
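The cleaning operation described above can be illustrated with a minimal sketch. It assumes a 128 Kbyte erase block of 2 Kbyte pages (64 pages per block); the names `EraseBlock` and `garbage_collect` are hypothetical and serve only to show the copy-then-erase sequence, not the behavior of any particular file system.

```python
from dataclasses import dataclass, field

PAGES_PER_BLOCK = 64  # 128 Kbyte erase block / 2 Kbyte page, as in the text

FREE, LIVE, DEAD = "free", "live", "dead"


@dataclass
class EraseBlock:
    """Hypothetical model of one erase block: a page-state list and an erase counter."""
    pages: list = field(default_factory=lambda: [FREE] * PAGES_PER_BLOCK)
    erase_count: int = 0

    def count(self, state):
        return self.pages.count(state)


def garbage_collect(victim, target):
    """Copy the live pages of `victim` into free pages of `target`, then erase `victim`.

    Returns the number of live pages that had to be copied, which is the
    extra work caused by the erase unit being larger than the write unit.
    """
    live = victim.count(LIVE)
    if live > target.count(FREE):
        raise ValueError("target block has too few free pages")
    copied = 0
    for i, state in enumerate(target.pages):
        if copied == live:
            break
        if state == FREE:
            target.pages[i] = LIVE  # out-of-place copy of one live page
            copied += 1
    victim.pages = [FREE] * PAGES_PER_BLOCK  # erasing frees every page at once
    victim.erase_count += 1                  # each erase consumes one cycle of the block's life
    return copied
```

For a victim block holding 10 live and 54 dead pages, the sketch copies 10 pages and spends one erase cycle to reclaim 54 pages.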
Flash-Based Storage System
Many works have focused on efficient flash-based storage systems, and they can be classified into two main strategies. The first strategy involves the creation of a virtual block device layer, called a flash translation layer, between a legacy file system and flash chips. The flash translation layer hides the flash memory constraints from the legacy file system. The other strategy is to implement a flash-aware file system, which is designed specifically for direct use on flash chips without translation layers. Unlike a legacy file system, a flash-aware file system must itself handle the out-of-place update property of flash. Many small embedded devices in a ubiquitous sensor network are equipped with flash memories and flash-aware file systems. The ELF, MicroHash, and TINX systems are designed to store sensed data in the NAND flash of sensor devices. Their most important design constraints are a small memory requirement and low power consumption.
In addition, the Journaling Flash File System 2 (JFFS2) and Yet Another Flash File System (YAFFS) were designed for general embedded systems equipped with several megabytes of NAND flash. These designs, which are based on a traditional log-structured file system, can perform out-of-place updates and cleaning.
However, the JFFS2 and YAFFS systems were designed for small flash memories of no more than several megabytes. In these systems, the entire flash medium must be scanned at mount time. Because the scanning time is proportional to the size of the flash memory, the time required to mount a large flash memory would be intolerable. Another consideration is that every file abstraction and all data structures for resource management must always be loaded into the RAM. In terms of mounting time and memory requirements, JFFS2 and YAFFS are therefore not scalable. In JFFS2, most of the NAND flash area must be scanned to load file system-specific data structures into the RAM. YAFFS, by contrast, scans the entire spare region, together with the part of the data regions that contains metadata, to load its file system structures. To alleviate the scalability problem for large flash memories, the designers of CFFS and JFFS3 have proposed ways to reduce the scanning time and memory requirements.
However, the previous NAND flash-based file systems did not consider the durability and life span under frequent small updates, and they exhibit a trade-off between durability and life span. For a durable system, the file system should be mounted in synchronous mode, which wastes most of a NAND flash page each time even a small amount of data is written. Asynchronous writing, on the other hand, sacrifices data durability.
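The cost of synchronous small writes can be made concrete with a short calculation. The sketch below assumes a 2 Kbyte page and an illustrative workload of 64 appends of 32 bytes each; the function names are hypothetical.

```python
PAGE_SIZE = 2048  # bytes per NAND page, as in the text


def pages_needed(nbytes, page_size=PAGE_SIZE):
    """Number of whole pages required to hold nbytes (ceiling division)."""
    return -(-nbytes // page_size)


def sync_utilization(append_size, page_size=PAGE_SIZE):
    """Fraction of programmed NAND bytes that carry payload when each
    append is written synchronously to its own page(s)."""
    return append_size / (pages_needed(append_size, page_size) * page_size)


def pages_for_appends(n_appends, append_size, page_size=PAGE_SIZE, merged=False):
    """Pages consumed by n_appends appends of append_size bytes.

    merged=False: every append is written synchronously on its own.
    merged=True:  appends are first merged and then written page-aligned.
    """
    if merged:
        return pages_needed(n_appends * append_size, page_size)
    return n_appends * pages_needed(append_size, page_size)
```

Under these assumptions, each synchronous 32-byte append uses 32/2048 of a page (about 1.6%), so 64 such appends consume 64 pages, whereas the same data merged and written page-aligned fits in a single page.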
Flash File System
A more efficient use of flash memory as storage is possible with a file system designed specifically for such devices, without the extra translation layer. One such design is the Journaling Flash File System 2 (JFFS2). The JFFS2 is a log-structured file system that sequentially stores nodes containing data and metadata in every free region of the flash chip. However, the design of the JFFS2 does not fully consider or utilize NAND flash memory characteristics, such as the spare regions and the read/write units. Therefore, the performance of the JFFS2 on NAND flash memory storage suffers, especially in mount time and RAM footprint. When a write operation is performed, the JFFS2 creates a new node containing both the inode and the data for the file, and the corresponding inode's version is increased by one. Therefore, the JFFS2 must scan the entire flash memory at mount time in order to find the inode with the latest version number. Furthermore, a large in-memory footprint is required to maintain all of the node information.
Another, more efficient approach to using flash memory as storage is the Yet Another Flash File System (YAFFS), which is designed specifically for NAND flash memory chips. In YAFFS, each page is marked with a file ID and a chunk number. The file ID denotes the file inode number, and the chunk number is determined by dividing the file position by the page size. These numbers are stored in the spare region of the NAND flash memory. Therefore, the boot-time scan that builds the file structures only needs to read the spare regions; thus, mounting is faster than with the JFFS2. However, YAFFS still requires a full flash scan to determine the flash usage, so the boot scanning time increases linearly with the flash memory size. The overall flash memory-based file system architecture is shown in Figure 1.
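The page tagging rule can be sketched as follows. The sketch follows the description in the text (chunk number equals file position divided by page size) with a 2 Kbyte page; the function names and the dictionary layout of the tag are hypothetical, and the numbering details of a real YAFFS implementation may differ.

```python
PAGE_SIZE = 2048  # bytes per NAND page (assumed)


def chunk_location(file_pos, page_size=PAGE_SIZE):
    """Return (chunk_number, offset_within_chunk) for a byte position in a file."""
    return divmod(file_pos, page_size)


def spare_tag(file_id, file_pos, page_size=PAGE_SIZE):
    """Build the tag kept in the spare region of the page that holds file_pos."""
    chunk, _ = chunk_location(file_pos, page_size)
    return {"file_id": file_id, "chunk": chunk}
```

For example, byte position 5000 of a file falls in chunk 2 at offset 904, so the page holding it would be tagged with the file's ID and chunk number 2.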
Figure 1. Flash memory-based file system architecture.
These two designs, JFFS2 and YAFFS, take the characteristics of flash memory into account and yield better performance than the flash translation methods because no translation layer sits between the file system and the flash memory chip. However, they hardly consider file system characteristics, such as the different access characteristics of metadata and data and the file usage patterns according to file size. These characteristics greatly affect flash memory performance.
The design goals of our flash file system, the Core Flash File System (CFFS), are determined by the different access patterns of metadata and data and by the file usage patterns according to file size. The fundamental file system structure of the CFFS follows that of YAFFS. In the CFFS, each inode occupies an entire page, as in YAFFS. Each inode includes its attributes, such as its i-number, uid, gid, ctime, atime, mtime, and so on. In addition, the inode stores its file name and parent inode pointer; thus, the CFFS does not keep distinct dentries on the media. This reduces additional flash page updates because a separate dentry update is not required. For fast lookup in a directory, the CFFS constructs the dentries in the RAM when the system is booted. File data is stored as chunks whose size is the same as a flash page. Each data chunk is marked with an inode number and a chunk number in the spare region; the chunk number represents the file offset, so it is determined by dividing the file position by the page size. Dedicating one page to each inode requires more storage for metadata than other Unix-like file systems, such as Ext2; however, one page per file is not a significant overhead compared with the large data region. By contrast, if several inodes shared one flash page, as in the Ext2 file system, the update frequency of that page would grow with the number of inodes stored on it, and many flash pages would be consumed because any modification requires the whole page to be rewritten. Therefore, in terms of flash pages consumed, allocating one page per inode is comparable to sharing one flash page among several inodes.
The main feature of the CFFS is the data index entries in the inode structure. Since an entire flash page is used for one inode, numerous index entries can be allocated to point to the data regions. For example, with a 512-byte page, 64 four-byte index entries can exist; with a 2 KB page, 448 four-byte index entries can exist. A four-byte number is sufficient to point to an individual flash page. Using these index entries, the CFFS classifies inodes into two classes: i_class1 maintains direct indexing for all index entries except the final one, and i_class2 maintains indirect indexing for all index entries except the final one. The final index entry is indirectly indexed for i_class1 and double indirectly indexed for i_class2.
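The reach of the two inode classes can be estimated with a short calculation. The entry counts (64 and 448) are taken from the text; the sketch additionally assumes that every indirect page is filled entirely with four-byte pointers, which the text does not state, and the function names are hypothetical.

```python
POINTER_SIZE = 4  # bytes per index entry, as in the text

# Index entries per inode page, as given in the text.
INDEX_ENTRIES = {512: 64, 2048: 448}


def max_pages(page_size, inode_class):
    """Upper bound on the data pages addressable from one inode.

    Assumption: an indirect page holds page_size / 4 pointers and nothing else.
    """
    n = INDEX_ENTRIES[page_size]
    per_indirect = page_size // POINTER_SIZE
    if inode_class == 1:
        # n - 1 direct entries plus one indirect entry.
        return (n - 1) + per_indirect
    if inode_class == 2:
        # n - 1 indirect entries plus one double-indirect entry.
        return (n - 1) * per_indirect + per_indirect ** 2
    raise ValueError("inode_class must be 1 or 2")


def max_file_bytes(page_size, inode_class):
    """Upper bound on the file size, in bytes, under the same assumption."""
    return max_pages(page_size, inode_class) * page_size
```

Under this assumption, with a 2 KB page an i_class1 inode reaches 959 pages (about 1.9 MB) and an i_class2 inode reaches 491,008 pages (about 959 MB), which suggests why small files and large files are assigned to different classes.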
III. HYBRID SOLID STATE DEVICE FLASH FILE SYSTEM
In this section, we describe the architecture, data structures, and operations of our proposed HSSDFFS. We also present an analysis of the life span and the threshold size that determines whether data is logged to the NOR flash or stored in NAND flash. Finally, we discuss performance issues with respect to the shortcomings of the NAND and NOR flash.
Figure 2. The flash solid state devices.
The architecture of a hybrid solid state device is designed around the characteristics of SSDs. Every SSD can be split into two parts, called the hot area and the cold area, with separate read and write areas, as shown in Figure 3. Figure 4 shows how the HSSDFFS uses both types of flash memory as a storage medium. First, the NAND flash consists of multiple erase blocks, which are the units of the erase operation. Each erase block contains a fixed number of pages, which are the units of I/O operation. A page is the basic allocation unit, and each page may contain file data or metadata. Second, the NOR flash also consists of multiple erase blocks, and each erase block has multiple log blocks. A log block, which is reserved for a single file and contains multiple variable-sized logs that belong to that file, is the basic allocation unit of the NOR flash. The size of the log block is defined when the file system is created.
There are two resource management modules, one for each type of flash memory. Resource management covers the management of erase blocks, garbage collection, and the allocation of free pages or log blocks. In addition to these two resource management modules, there is a file management module. For each file, there are three corresponding types of data: metadata, file data, and logs. The file management module maintains the metadata, file data, and logs of each file.
The file metadata is stored in the NAND flash, but the file data can be written to either the NAND flash or the NOR flash. When the file data is small enough, it is transformed into a log and the log is written to the NOR flash. Otherwise, the file data is written to the NAND flash.
Figure 3. The architecture of a hybrid solid state device.
Figure 4. The architecture of a hybrid solid state device flash file system (HSSDFFS).
B. File Operations
File operations normally follow the same process as in the previous implementation of the CFFS, so that access is directed only to the NAND flash. However, when a file is associated with logs, the operations become slightly more complex.
C. Reading a File
When a file is first accessed, the file's metadata is read to check whether the file has logs in the NOR flash. If the log block address in the metadata is invalid, the file does not have logs in the NOR flash and the read operation is directed to the NAND flash. On the other hand, if the file has logs in the NOR flash, the log block is scanned and the corresponding log map is loaded into the RAM before the data or attributes of the file are read. Because each log header contains the length of the log data that follows it, we can locate the next log header. By reading all of the log headers in this way, we can construct a log map, which is a doubly linked list of log descriptors. Once the log map is completely loaded into the RAM, the requested read operation is serviced by reading the logs described in the linked list.
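The header-by-header scan that builds the log map can be sketched as follows. The on-flash layout is an assumption made for illustration only: each log is taken to start with a 6-byte header holding a 16-bit data length and a 32-bit file offset, and a length of 0xFFFF is taken to mark erased (unused) space, since erased flash reads as all ones. The names `HEADER`, `LogDescriptor`, `build_log_map`, and `make_log` are hypothetical, and a plain Python list stands in for the doubly linked list.

```python
import struct
from collections import namedtuple

# Hypothetical header layout: little-endian uint16 data length, uint32 file offset.
HEADER = struct.Struct("<HI")
ERASED_LENGTH = 0xFFFF  # erased flash reads as all ones

LogDescriptor = namedtuple("LogDescriptor", "data_addr length file_offset")


def build_log_map(log_block: bytes):
    """Scan a log block and return its log descriptors in on-flash order."""
    log_map = []
    pos = 0
    while pos + HEADER.size <= len(log_block):
        length, file_offset = HEADER.unpack_from(log_block, pos)
        if length == ERASED_LENGTH:
            break  # reached the unused tail of the log block
        data_addr = pos + HEADER.size
        if data_addr + length > len(log_block):
            raise ValueError("log data runs past the end of the log block")
        log_map.append(LogDescriptor(data_addr, length, file_offset))
        # The length in the header tells us where the next header starts.
        pos = data_addr + length
    return log_map


def make_log(data: bytes, file_offset: int) -> bytes:
    """Encode one log (header followed by data) in the assumed layout."""
    return HEADER.pack(len(data), file_offset) + data
```

A read request can then be served by walking the returned descriptors and copying `length` bytes from `data_addr` for each one.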
D. Writing a Small Amount of Data to a File
When an application appends a small amount of data to a file, the size of the data is checked first. Only data that is smaller than a predetermined threshold is stored as a log in the NOR flash; larger amounts of data are stored in the NAND flash. To write file data as a log, we need to check if the log block address in the metadata is valid. If the log block address is invalid, the file does not have logs in the NOR flash; hence, a new empty log block is allocated to the file and a new log map is constructed in the RAM.
If the log block address is valid, the file has logs in the NOR flash; hence, the log block should be scanned and the corresponding log map should be loaded in the RAM before the data is appended. Finally, the new data is appended at the tail of the log block, a new descriptor of the log is appended to the log map in the RAM, and the file metadata is updated with an address of the newly allocated log block.
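The routing decision for an append can be sketched as follows. The 32-byte threshold, the use of `None` as the invalid log block address, and the names `FileMeta` and `HybridStore` are all assumptions for illustration; the sketch also omits the step of scanning an existing log block to rebuild the log map in the RAM.

```python
from dataclasses import dataclass
from typing import Optional

LOG_THRESHOLD = 32  # bytes; hypothetical value, the text only says "predetermined"


@dataclass
class FileMeta:
    """Hypothetical per-file metadata; None models an invalid log block address."""
    log_block_addr: Optional[int] = None


class HybridStore:
    """Toy model of the two media: NOR log blocks and NAND data writes."""

    def __init__(self):
        self.nor_logs = {}     # log block address -> list of logged byte strings
        self.nand_writes = []  # data written directly to NAND pages
        self._next_block = 0

    def _allocate_log_block(self):
        addr = self._next_block
        self._next_block += 1
        self.nor_logs[addr] = []
        return addr

    def append(self, meta, data):
        """Route an append and return the medium used: 'nor' or 'nand'."""
        if len(data) >= LOG_THRESHOLD:
            # Only data smaller than the threshold is logged; the rest goes to NAND.
            self.nand_writes.append(data)
            return "nand"
        if meta.log_block_addr is None:
            # The file has no logs yet: allocate a log block and record it in the metadata.
            meta.log_block_addr = self._allocate_log_block()
        self.nor_logs[meta.log_block_addr].append(data)  # append at the tail of the log block
        return "nor"
```

In this model, the first small append to a file allocates its log block, later small appends reuse it, and large writes bypass the NOR flash entirely.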
However, as logs are appended, the log block gradually fills up. When the log block holds enough logs, all of the logs in the log block are transferred to data pages of the NAND flash. We refer to this job as flushing. The flushing job consists of loading the file data from the log block into the RAM, allocating free data pages in the NAND flash, writing the loaded file data to the NAND flash, and marking the flushed log block as obsolete. The obsolete log block can be erased later.
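The flushing job can be sketched as follows. The sketch assumes that a flush is triggered when the next log would no longer fit in the log block, ignores log header overhead, and uses a 2 Kbyte page; the names `LogBlock`, `flush_log_block`, and `append_with_flush` are hypothetical.

```python
PAGE_SIZE = 2048  # bytes per NAND page (assumed)


def flush_log_block(logs, page_size=PAGE_SIZE):
    """Merge the logs and cut the result into page-sized pieces for NAND."""
    merged = b"".join(logs)
    return [merged[i:i + page_size] for i in range(0, len(merged), page_size)]


class LogBlock:
    """Toy NOR log block with a fixed capacity in bytes (headers ignored)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.logs = []
        self.used = 0
        self.obsolete = False

    def has_room(self, data):
        return self.used + len(data) <= self.capacity

    def append(self, data):
        if not self.has_room(data):
            raise ValueError("log block is full")
        self.logs.append(data)
        self.used += len(data)


def append_with_flush(block, data, nand_pages, page_size=PAGE_SIZE):
    """Append data to the log block, flushing to NAND first if it is full.

    Returns the log block that now holds the data (a fresh one after a flush).
    """
    if not block.has_room(data):
        nand_pages.extend(flush_log_block(block.logs, page_size))
        block.obsolete = True             # the old block can be erased later
        block = LogBlock(block.capacity)  # continue logging in a new block
    block.append(data)
    return block
```

With a 4 Kbyte log block and 16-byte appends, 256 appends fill the block and the 257th triggers one flush that writes exactly two full NAND pages.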
E. Performance Issues
The NOR flash has faster read access but much slower write and erase times than the NAND flash. We discuss the performance shortcomings of the NOR flash in terms of the erase latency, the write latency, the flushing time, and the wear leveling.
1. Erase Latency
The NOR flash has a much longer erase time than the NAND flash. Erasing one block of 64 Kbytes takes around 0.7 seconds in a typical NOR flash but only 2 milliseconds in a recent NAND flash. The very long erase time greatly reduces the write throughput and the real-time performance of the NOR flash.
However, we can hide the long erase delay by running a cleaning thread in the background. Recent flash chips support two advanced features that make an erase operation preemptible, namely, the erase suspend/resume feature and the simultaneous read/write feature.
2. Write Latency
Generally speaking, the write latency of NOR flash is longer than that of NAND flash, but it depends on the size of the written data. Because NOR flash is byte addressable, the write latency is linearly proportional to the size of the written data. However, the fixed time for programming one page elapses whenever a small amount of data is written in the NAND flash. For instance, the time taken to write one word (2 bytes) is almost the same as the time taken to write one page (2 Kbytes) in the NAND flash. The write latencies for writing data of less than 32 bytes are comparable in both types of flash memory.
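The relationship between the two latencies can be expressed as a simple model: NOR latency grows with the number of words written, whereas NAND latency is a fixed cost per programmed page. The timing values used in the example (12.5 microseconds per NOR word and 200 microseconds per NAND page) are illustrative assumptions chosen so that the break-even point matches the 32 bytes stated above; they are not taken from any datasheet, and the function names are hypothetical.

```python
import math


def nor_write_us(nbytes, us_per_word, word_size=2):
    """NOR latency model: proportional to the number of words programmed."""
    return math.ceil(nbytes / word_size) * us_per_word


def nand_write_us(nbytes, us_per_page, page_size=2048):
    """NAND latency model: a fixed cost for every page programmed."""
    return math.ceil(nbytes / page_size) * us_per_page


def break_even_bytes(us_per_word, us_per_page, word_size=2):
    """Largest write size for which NOR is no slower than one NAND page program."""
    return int(us_per_page // us_per_word) * word_size
```

With the assumed values, `break_even_bytes(12.5, 200)` gives 32 bytes: below that size a NOR log write is at least as fast as programming a NAND page, and above it the NAND page program is faster.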
3. Flushing Time
The flushing process blocks other read or write requests until it is completed, because the flushing must be performed atomically for consistency. The delay depends on the size of a log block. As the log block becomes larger, flushing occurs less often, but each flush takes longer. Therefore, the size of a log block should be chosen carefully in order to meet the given real-time requirements of the system.
4. Wear-Leveling Effects
In our HSSDFFS, every small item of data is sequentially logged in the NOR flash only until it is flushed to the NAND flash. Thus, a complex wear-leveling scheme is unnecessary for the NOR flash. Instead, we allocate empty erase blocks in sequential order, so that the wear of the erase blocks is leveled naturally. Furthermore, the HSSDFFS provides better garbage collection performance on the NAND flash than a conventional flash file system, which generates excessive garbage when small amounts of data are appended synchronously and frequently.
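The effect of sequential allocation on wear can be illustrated with a minimal sketch. It assumes that erase blocks are handed out in circular order and that each allocation costs the block one erase cycle; the class name `SequentialAllocator` is hypothetical.

```python
class SequentialAllocator:
    """Hands out erase blocks in circular order and tracks their wear."""

    def __init__(self, n_blocks):
        self.erase_counts = [0] * n_blocks
        self._next = 0

    def allocate(self):
        """Return the index of the next erase block in sequence."""
        index = self._next
        self._next = (self._next + 1) % len(self.erase_counts)
        # Assumption: every use of a block costs it exactly one erase cycle.
        self.erase_counts[index] += 1
        return index
```

Because the allocator always moves to the next block, the erase counts of any two blocks never differ by more than one, which is the sense in which the wear is leveled without an explicit wear-leveling scheme.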
In a conventional NAND flash-based file system, there is a trade-off between life span and durability. Because NAND flash supports only page-level I/O, a whole page of 2 Kbytes is required in order to synchronously write even a small amount of data. Our HSSDFFS prolongs the life span and enhances the durability of written data more effectively than conventional NAND flash-based file systems. When we append a small amount of data to a file, the data is synchronously stored as a log in the NOR flash. The merged logs are then flushed to the NAND flash in a page-aligned fashion. In this way, we avoid data loss in the event of an unexpected power outage. Furthermore, by increasing the utilization of the NAND flash, we prolong its life span beyond that of a conventional NAND flash-based file system. Finally, the HSSDFFS provides a single combined partition to facilitate its usage in many areas.