[MirageOS-devel] irmin storage overhead and dedup
Hi Thomas,

I'm trying to figure out what kind of storage overhead and deduplication I get in Irmin.

First, I converted the Google email archive (2.4G) to the IMAP server's Irmin format. After conversion, the Git repository was twice the size of the original archive. I do create some additional structures, such as a per-mailbox index, summary statistics, and per-message flags, so the extra size may come from those, though it seems a bit high. I will have to estimate the expected size of those structures to understand this result.

Next, I dumped 2,000 files of 1M each with random ASCII content into Irmin, which resulted in a Git repository of 950M. I figure Irmin compresses the content, right? To verify this, I dumped 2,000 image files of 2.4M each, with a counter concatenated to make each file's content unique. The repository size for this was 4.6G, which is expected. Then I repeated the last test with identical images, and this time the size was 27M, which is clear proof of deduplication by Irmin.

My question is whether compression in Irmin is configurable. Can it be configured per individual piece of content? For instance, I don't want to compress images, since there is no space to gain and the compression only wastes resources, but I do want to compress text if the compression overhead is reasonable. I can determine the type of content from the MIME type in the IMAP server.

Thanks,
Gregory
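For anyone wanting to reproduce the effect at small scale, here is a minimal Python sketch of the two mechanisms at play, assuming the Irmin Git backend stores content as ordinary Git blob objects (named by the SHA-1 of a `blob <size>\0` header plus the content, and zlib-deflated on disk). The helper names `git_blob_id` and `stored_size`, the 62-symbol alphabet, and the use of `os.urandom` as a stand-in for already-compressed image data are all illustrative assumptions, and packfile delta compression is not modelled.

```python
import hashlib
import os
import random
import string
import zlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object id Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def stored_size(blobs) -> int:
    """Total zlib-compressed size of the distinct blobs (loose-object model)."""
    store = {}
    for content in blobs:
        oid = git_blob_id(content)
        if oid not in store:  # identical content maps to the same id: stored once
            header = b"blob " + str(len(content)).encode("ascii") + b"\0"
            store[oid] = len(zlib.compress(header + content))
    return sum(store.values())


rng = random.Random(0)
alphabet = string.ascii_letters + string.digits  # assumed alphabet for "random ASCII"
text = "".join(rng.choices(alphabet, k=100_000)).encode("ascii")
binary = os.urandom(100_000)  # stand-in for already-compressed image bytes

text_ratio = stored_size([text]) / len(text)        # noticeably below 1
binary_ratio = stored_size([binary]) / len(binary)  # about 1: nothing to gain
identical = stored_size([binary] * 20)              # same as storing one copy
unique = stored_size([binary + str(i).encode() for i in range(20)])

print(f"text ratio:   {text_ratio:.2f}")
print(f"binary ratio: {binary_ratio:.2f}")
print(f"20 identical blobs: {identical} bytes, 20 unique blobs: {unique} bytes")
```

The exact ratio for text depends on the alphabet used to generate it, so this will not necessarily match the 950M figure above; the point is only that random ASCII deflates while incompressible data does not, and that identical content collapses to a single object.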