The "Convert mbox Files to XML" screen asks for the following information:
- Name. This is the name of the email account.
- Directory. This is the account directory. This must already exist and must contain (directly or indirectly) one or more mbox-format files with the .mbox file extension.
- Folder. If no folder is specified, then the program will convert all .mbox files in the account directory or in any of its subdirectories. If the "yes" box is checked and if a folder "archive/deleted items" is specified, then the program will convert only message data from the mbox file with the path "<account_directory>\archive\deleted items.mbox".
- Max size for internal storage of attachments. The upper limit, in bytes, of the size of a message attachment if the attachment is to be included in the main XML file. If a message attachment is larger than this upper limit, it will be written to a separate XML file, and a reference to that separate file will be included in the main XML file. If you choose '0' as the max size for internal storage, then ALL message attachments will be stored externally.
- Subdirectories for external storage. Each externally stored email attachment is written to a separate XML file that is referenced in the main XML file. By default, the external XML files for a given folder are stored in the same directory as the .mbox file for the folder. However, if the "Subdirectories for external storage" box is checked, then the attachment XML files will be distributed among subdirectories of the applicable folder directory. You can use this feature to prevent the creation of directories with thousands of files.
- Split XML into chunks. If "yes" is checked, then the XML will be split into multiple files. The "chunking" is based on message count; you can request that a file contain no more than 1,000 or 5,000 or 10,000 messages.
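The Directory and Folder rules above can be illustrated with a short Python sketch. This is not DArcMail's own code; the function name find_mbox_files is hypothetical, and the sketch assumes only what the list states: with no folder, every .mbox file at any depth under the account directory is selected, and with a folder, only "<account_directory>/<folder>.mbox" is selected.

```python
from pathlib import Path


def find_mbox_files(account_dir, folder=None):
    """Return the .mbox files that a conversion would cover (illustrative only)."""
    root = Path(account_dir)
    if folder:
        # A folder name is the mbox path relative to the account
        # directory, minus the ".mbox" extension.
        return [root / (folder + ".mbox")]
    # No folder given: every .mbox file at any depth below the account directory.
    return sorted(root.rglob("*.mbox"))
```

For example, find_mbox_files(account_dir, "archive/deleted items") yields the single path "<account_directory>/archive/deleted items.mbox", whether or not that file exists; a real tool would presumably check for it first.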
If you convert an entire email account, then the name of the XML file will be the name of the email account with the ".xml" file extension. If the email account name contains characters that are not permitted in a file name, then each occurrence of an illegal character will be replaced with an underscore. Characters that are replaced with an underscore include any character with a numeric value less than decimal 32 (' ') or greater than decimal 127 ('~'), as well as these characters:
/ ? < > \ : * | " ^
The XML file for an account will be created in the account directory.
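The replacement rule can be sketched in Python as follows. The function names are hypothetical and this is not taken from the DArcMail source. The upper bound of 127 is used exactly as the text states it (note that '~' itself has the numeric value 126), and it is exposed as a parameter in case the program's actual bound differs.

```python
# Characters listed in the text as illegal in a file name.
ILLEGAL_CHARS = set('/?<>\\:*|"^')


def sanitize_account_name(name, upper=127):
    """Replace each illegal character with an underscore (illustrative only)."""
    out = []
    for ch in name:
        code = ord(ch)
        # Control characters, characters above the upper bound, and the
        # listed punctuation are all replaced one-for-one.
        if code < 32 or code > upper or ch in ILLEGAL_CHARS:
            out.append("_")
        else:
            out.append(ch)
    return "".join(out)


def account_xml_name(name):
    """File name of the XML file for a whole-account conversion."""
    return sanitize_account_name(name) + ".xml"
```

Under these assumptions, an account named "john/smith:2010" would produce the file name "john_smith_2010.xml". A space (decimal 32) is left unchanged.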
If you export a single folder, then the path of the XML file will be the account directory joined to the folder name joined to the extension ".xml". Since an email folder name is the same as the relative (to the account directory) path of an mbox file less the ".mbox" extension, all characters in an email folder name are legal in a file name.
If you choose to split an XML file into chunks, then each chunk will have a "." plus a "chunk id" before the ".xml" extension, like this:
johnsmith.aa.xml
johnsmith.ab.xml
...
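The naming pattern can be sketched in Python. The text shows only the ids "aa" and "ab", so the assumption here that the ids continue alphabetically ("ac", "ad", ... "zz") is a guess, and the function names are hypothetical.

```python
from itertools import islice, product
from string import ascii_lowercase


def number_of_chunks(message_count, max_per_file):
    """Ceiling division: how many files are needed for message_count messages."""
    return -(-message_count // max_per_file)


def chunk_file_names(base, count, ext=".xml"):
    """Names for the first `count` chunks, assuming two-letter ids in alphabetical order."""
    ids = ("".join(pair) for pair in product(ascii_lowercase, repeat=2))
    return [f"{base}.{cid}{ext}" for cid in islice(ids, count)]
```

With a limit of 1,000 messages per file, an account holding 2,500 messages would need number_of_chunks(2500, 1000) files, and chunk_file_names("johnsmith", 3) gives the corresponding names under the assumed id sequence.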
For each XML file that DArcMailXml produces, it also produces a file with the same path as the XML file but with the extension ".csv" in place of the extension ".xml". The CSV file contains one row of basic information for each message. The file is in "comma-separated values" format so that it can be loaded into a spreadsheet program like Microsoft Excel or OpenOffice Calc. The CSV file has eight columns:
- From
- To
- Date
- Subject
- Message ID
- SHA1 Hash
- Number of errors encountered in parsing the message
- The first error message encountered
If you choose to "chunk" XML output files, then the CSV files will also be "chunked".
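Besides a spreadsheet, the CSV file can be read with a short script. The Python sketch below lists the messages that had parsing errors. It assumes the columns appear in the order listed above, that the file is UTF-8 encoded, and that the first row is a header; none of these details is confirmed by this document, so the header check is a parameter. The field names and the function name are hypothetical.

```python
import csv

# Assumed column order, following the list in the text.
COLUMNS = ["from", "to", "date", "subject", "message_id",
           "sha1", "error_count", "first_error"]


def messages_with_errors(csv_path, has_header=True):
    """Return the rows whose error count is non-zero (illustrative only)."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        if has_header:
            next(reader, None)  # skip the header row, if the file has one
        rows = [dict(zip(COLUMNS, row)) for row in reader]
    return [r for r in rows
            if r.get("error_count", "0").strip() not in ("", "0")]
```

Each returned row is a dictionary, so row["subject"] and row["first_error"] give the subject and the first error message for a problem message.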
Besides the XML output files and the CSV files, the XML conversion process produces a log file, dm_xml.log.txt, in the account directory. The contents of this file are also displayed in a pop-up window when the XML conversion process has completed.