OpenMCDF is a .NET library for working with Compound Document File (CDF) files from C# applications.
It can read and write CDF files, which covers extracting data from them, modifying existing files, and creating new ones from scratch.
Some of its features are:
- Supports read/write operations on streams and storages.
- Allows traversing the file tree structure.
- Complies with versions 3 and 4 of the CDF specifications.
- Uses lazy loading whenever possible to reduce memory usage.
- Offers an intuitive API for working with structured files.
CDF Files
CDF files, also known as OLE Structured Storage, are a binary file format used by many applications. For example:
- Microsoft Office documents in the legacy binary formats (.doc, .xls, .ppt), which were the default before Office 2007
- Outlook .msg messages
- Windows thumbnail cache files (thumbs.db)
- Visual Studio .suo files (solution options)
- Files from many audio/video editing tools (*.aaf, for example)
CDF files are essentially containers that can store various types of data, such as text, images, embedded objects, and metadata. Their advantage is that multiple data components are stored and managed in a single file, which makes the information easier to transport and exchange.
The hierarchical structure of a CDF file consists of two main elements:
- Streams: These are individual data blocks that contain specific information, such as text, images, or embedded objects. Each stream has a unique name that identifies it within the file.
- Storages: These are containers that can hold streams and other storages. They organize information in a tree structure, because storages can be nested to any depth.
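To make the storage/stream tree concrete, here is a minimal sketch that builds a small compound file in memory and walks it. It assumes the OpenMcdf 2.x API (installed as described in the next section), in particular CFStorage.VisitEntries and the CFItem.IsStorage property. The entry names "Summary", "Images", and "Logo" are made up for illustration.

```csharp
using System;
using OpenMcdf; // assumes the OpenMcdf 2.x NuGet package is referenced

class TreeDemo
{
    static void Main()
    {
        // Build a small in-memory compound file: two streams and one nested storage.
        CompoundFile cf = new CompoundFile();
        cf.RootStorage.AddStream("Summary").SetData(new byte[] { 1, 2, 3 });
        CFStorage images = cf.RootStorage.AddStorage("Images");
        images.AddStream("Logo").SetData(new byte[1024]);

        // Visit every entry below the root; 'true' requests a recursive walk
        // that also descends into nested storages.
        cf.RootStorage.VisitEntries(item =>
        {
            string kind = item.IsStorage ? "storage" : "stream";
            Console.WriteLine(kind + ": " + item.Name);
        }, true);

        cf.Close();
    }
}
```

The order in which entries are visited is decided by the library, so code that processes the tree should not depend on it.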
How to Use OpenMCDF
We can easily add the library to a .NET project through the corresponding NuGet package.
Install-Package OpenMCDF
Here are some examples of how to use OpenMCDF, taken from the library’s documentation:
Create a new compound file
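using OpenMcdf;

byte[] b = new byte[10000]; // payload to store
CompoundFile cf = new CompoundFile(); // new, empty compound file
CFStream myStream = cf.RootStorage.AddStream("MyStream"); // stream directly under the root storage
myStream.SetData(b);
cf.Save("MyCompoundFile.cfs"); // write the file to disk
cf.Close();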
Open an existing file and get a data stream
string filename = "report.xls";
CompoundFile cf = new CompoundFile(filename);
// "Workbook" is the stream that holds the workbook data in Excel 97-2003 .xls files
CFStream foundStream = cf.RootStorage.GetStream("Workbook");
byte[] temp = foundStream.GetData();
// Do something with 'temp'
cf.Close();
Add and remove items
CompoundFile cf = new CompoundFile();
CFStorage st = cf.RootStorage.AddStorage("MyStorage");
CFStream sm = st.AddStream("MyStream");
// Delete an item by name from the storage that contains it
st.Delete("MyStream");
Persist changes
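// Commit writes changes back to the file 'cf' was opened from, so open it in update mode ("Existing.cfs" is a placeholder name).
CompoundFile cf = new CompoundFile("Existing.cfs", CFSUpdateMode.Update, CFSConfiguration.Default);
cf.RootStorage.AddStream("MyStream").SetData(buffer); // 'buffer' is a byte[] defined elsewhere
cf.Commit();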
Compress a compound file
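// Removes unallocated (free) sectors to reduce the file size; this compacts the file rather than compressing its data.
CompoundFile.ShrinkCompoundFile("MultipleStorage_Deleted_Compress.cfs");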
OpenMCDF is compatible with .NET Standard 2.0. It is an open-source project, and all the code and documentation are available in the ironfede/openmcdf repository on GitHub.