HCC has invested a great deal of research, testing and development effort to design truly fail-safe file systems that will always recover from unexpected system events such as power loss or reset. A standard FAT file system is not fail-safe and can therefore become corrupted, which is not normally acceptable in an embedded environment. The fundamental problem is that adding a new entry to a FAT volume consistently requires more than one area of the disk to be modified in a single, uninterrupted action, which is logically impossible to achieve. Although a check-disk program can recover from some of these situations, it normally requires user intervention and decision-making. For product designers who value or depend on the data stored in their embedded devices, a fail-safe system is strongly recommended.
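The multi-area update problem can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `Disk` class, `create_file` and `is_consistent` are hypothetical names, the "FAT" and "directory" are plain dictionaries rather than a real on-disk layout, and each individual write is assumed to be atomic.

```python
class PowerLoss(Exception):
    """Raised by the simulated disk when power fails before a write."""


class Disk:
    """Toy disk: each write is atomic, but power can fail between writes."""

    def __init__(self, writes_before_failure=None):
        self.fat = {}        # cluster number -> next cluster (or "EOF")
        self.directory = {}  # file name -> first cluster
        self.writes_left = writes_before_failure  # None means no failure

    def write(self, area, key, value):
        if self.writes_left is not None:
            if self.writes_left == 0:
                raise PowerLoss()
            self.writes_left -= 1
        area[key] = value


def create_file(disk, name, cluster):
    # Two separate areas must change for the new entry to be consistent.
    disk.write(disk.directory, name, cluster)  # step 1: directory entry
    disk.write(disk.fat, cluster, "EOF")       # step 2: allocation table


def is_consistent(disk):
    # Every directory entry must point at an allocated cluster.
    return all(first in disk.fat for first in disk.directory.values())


disk = Disk(writes_before_failure=1)  # power fails after the first write
try:
    create_file(disk, "log.txt", cluster=2)
except PowerLoss:
    pass

print(is_consistent(disk))  # False: directory entry exists, FAT entry missing
```

Reversing the order of the two writes does not remove the problem in this model; it only changes the symptom to an allocated cluster that no file references.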

Journal-based file systems generally guarantee only the integrity of the metadata and are not always deterministic. A transaction-based file system provides integrity for both file data and metadata, though the commit points are normally system-wide. HCC employs a hybrid approach for its fail-safe file systems, and all of its implementations are transaction-based on a file-by-file basis. This has the advantage that a single file operation can be executed without reference to the state of other files or operations, meaning each application using the file system can operate safely and independently.
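The idea of a per-file commit point can be sketched as follows. This is an illustrative model, not HCC's implementation: the `FileStore` class and its methods are hypothetical, storage is held in memory, and the switch from the old to the new version of a file is simply assumed to be atomic.

```python
class FileStore:
    """Toy store where each file commits independently of the others."""

    def __init__(self):
        self.committed = {}  # file name -> contents that survive a reset
        self.pending = {}    # file name -> uncommitted contents (lost on reset)

    def write(self, name, data):
        # New data goes to a shadow copy; the committed version is untouched.
        base = self.pending.get(name, self.committed.get(name, b""))
        self.pending[name] = base + data

    def commit(self, name):
        # Assumed to be a single atomic switch affecting this one file only.
        if name in self.pending:
            self.committed[name] = self.pending.pop(name)

    def reset(self):
        # Unexpected reset: every uncommitted change is discarded.
        self.pending.clear()

    def read(self, name):
        return self.committed.get(name, b"")


store = FileStore()
store.write("config.bin", b"v2")
store.write("log.txt", b"entry-1")
store.commit("config.bin")  # only config.bin reaches its commit point
store.reset()               # power is lost before log.txt is committed

print(store.read("config.bin"))  # b'v2'
print(store.read("log.txt"))     # b''
```

In this model, committing `config.bin` neither waits for nor disturbs the open transaction on `log.txt`, which is the property that a system-wide commit point does not provide.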

Any file system claiming fail-safety must define what is required of the low-level media driver to guarantee fail-safety. With all HCC fail-safe file systems, the requirements of the low-level driver are clearly defined. This enables designers to create systems that will survive unexpected reset or power failure. It is important to note that in most systems involving flash storage, careful management of the power to the target media is critical. HCC’s experienced team can offer insight into the design of reliable file system solutions.

For all of its fail-safe file systems, HCC has created simulation environments that verify the robustness of the system by resetting it at random and checking its integrity on restart. HCC also develops test harnesses for each system, in which an external controller randomly interrupts power to the target system. To ensure integrity, these tests are run continuously for weeks on multiple hardware configurations.
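A random-reset test of this kind can be approximated in software. The sketch below is not HCC's harness, which cuts real power with an external controller; it is a minimal simulation in which `Media`, `update` and `run_trials` are hypothetical names, the storage is a two-slot record whose sequence number is written last, and each individual media write is assumed to be atomic.

```python
import random


class PowerLoss(Exception):
    """Raised when the simulated power cut interrupts a media write."""


class Media:
    """Toy media with two slots; the sequence number is written last."""

    def __init__(self):
        self.data = [b"initial", b""]
        self.seq = [0, -1]
        self.writes_left = None  # None means power never fails

    def _tick(self):
        # Every media write passes through here, so any write can be cut.
        if self.writes_left is not None:
            if self.writes_left == 0:
                raise PowerLoss()
            self.writes_left -= 1

    def erase(self, slot):
        self._tick()
        self.data[slot] = b""

    def append(self, slot, chunk):
        self._tick()
        self.data[slot] += chunk

    def set_seq(self, slot, value):
        self._tick()
        self.seq[slot] = value


def active_slot(media):
    # On restart, the slot with the higher sequence number is the valid one.
    return 0 if media.seq[0] > media.seq[1] else 1


def update(media, payload, chunk=4):
    active = active_slot(media)
    spare = 1 - active
    media.erase(spare)  # the valid slot is never touched
    for i in range(0, len(payload), chunk):
        media.append(spare, payload[i:i + chunk])
    media.set_seq(spare, media.seq[active] + 1)  # commit point


def run_trials(trials=200, seed=1):
    rng = random.Random(seed)
    media = Media()
    expected = b"initial"
    completed = interrupted = 0
    for n in range(trials):
        new = f"record-{n}".encode()
        media.writes_left = rng.randint(0, 6)  # power fails after this many writes
        try:
            update(media, new)
            completed += 1
            expected = new
        except PowerLoss:
            interrupted += 1  # the update must have had no visible effect
        media.writes_left = None  # power restored, system restarts
        current = media.data[active_slot(media)]
        # Verification on restart: contents are wholly old or wholly new.
        assert current == expected, (n, current, expected)
    return completed, interrupted


completed, interrupted = run_trials()
print(f"{completed} updates completed, {interrupted} interrupted, all verified")
```

The check after each simulated restart is the important part: whatever point the power cut lands on, the record read back must be either the previous value or the new one in full, never a mixture.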