
29-05-2020, 01:23
Noob
Join Date: Jul 2012
Location: South Africa
Posts: 3,749
Thanks: 2,170
Thanked 11,206 Times in 2,307 Posts
Quote:
Originally Posted by FitGirl
Thanks for returning to the project; deduplication is a very useful feature.
I have an idea that would reduce the RAM required for dedup. You could store rare/large duplicated streams in a temp file while keeping small/frequent dupes in RAM. This way there won't be excessive HDD load, because reads will be rare, and RAM usage stays low. 1-2 GB is a pretty big amount even for machines with 8 GB. For users with 4 GB, installation will be almost impossible once srep and lolz/lzma are taken into account, even with a page file. So reducing or controlling the RAM used is a must, I think.
I'd recommend Halo Reach for testing dedup; it has tons of duplicate streams of different sizes.
|
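To make the RAM/temp-file split concrete, here is a rough Python sketch. It is only an illustration: `TieredDedupStore`, both thresholds and the `expected_refs` argument are made-up names, and the policy (spill a stream only when it is both large and rarely referenced) is my assumption, not something any existing tool does.

```python
import tempfile


class TieredDedupStore:
    """Keeps small or frequently reused streams in RAM, spills the rest to a temp file."""

    def __init__(self, max_ram_stream_size=64 * 1024, min_hot_refs=3):
        # Both thresholds are hypothetical tuning knobs.
        self.max_ram_stream_size = max_ram_stream_size
        self.min_hot_refs = min_hot_refs
        self.ram = {}      # stream_id -> bytes
        self.on_disk = {}  # stream_id -> (offset, length) inside the temp file
        self.spill = tempfile.TemporaryFile()

    def put(self, stream_id, data, expected_refs):
        small = len(data) <= self.max_ram_stream_size
        hot = expected_refs >= self.min_hot_refs
        if small or hot:
            self.ram[stream_id] = data
        else:
            # Large and rare: append to the temp file and remember where it is.
            self.spill.seek(0, 2)
            offset = self.spill.tell()
            self.spill.write(data)
            self.on_disk[stream_id] = (offset, len(data))

    def get(self, stream_id):
        if stream_id in self.ram:
            return self.ram[stream_id]
        offset, length = self.on_disk[stream_id]
        self.spill.seek(offset)
        return self.spill.read(length)

    def ram_bytes(self):
        return sum(len(d) for d in self.ram.values())
```

With this policy the disk is only touched for streams that are both big and seldom reused, which is where the RAM saving is largest and the read penalty is paid least often.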
Quote:
Originally Posted by Gupta
Maybe he can introduce a second phase in compression. Then he can store a forward reference count for each stream, which should ideally reduce the required memory beyond the window size and improve compression too.
HDDs are very slow. I recently upgraded to NVMe-based storage and I can feel the speed.
|
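A small Python sketch of the forward reference count idea, as I read it: a first pass counts how often each stream is referenced again later, so a second pass can drop a stream from memory right after its last use. The function names and the use of plain stream IDs are assumptions for illustration only.

```python
from collections import Counter


def count_forward_refs(stream_ids):
    """Pass 1: for each distinct stream, count how many later duplicates refer back to it."""
    counts = Counter(stream_ids)
    return {sid: n - 1 for sid, n in counts.items()}


def peak_cached_streams(stream_ids):
    """Pass 2: cache a stream only while future references to it remain."""
    remaining = count_forward_refs(stream_ids)
    cache, peak = set(), 0
    for sid in stream_ids:
        if sid in cache:
            remaining[sid] -= 1
            if remaining[sid] == 0:
                cache.discard(sid)  # last reference seen, memory can be released
        elif remaining[sid] > 0:
            cache.add(sid)          # first occurrence that will be needed again
        peak = max(peak, len(cache))
    return peak
```

Streams that never repeat are not cached at all, and repeated ones are evicted as soon as their count reaches zero, so the peak can be much lower than keeping everything until the end.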
Quote:
Originally Posted by panker1992
There is also a match-sorting approach that can reduce the RAM needed. It works as follows.
srep does a very good job of finding matches that are located far apart!
To do that, it builds a dictionary!
If you sort the files you feed to srep, you can actually reduce the RAM it needs and improve its speed.
Sorting as a preprocessing step can speed up the process and cost less RAM! It also removes I/O overhead, because there are NO temp files.
|
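One possible reading of the sorting idea, sketched in Python: order the input files so that likely-similar ones end up next to each other, which keeps match distances short. The sort key (extension, then size, then path) and the `sort_for_dedup` name are my assumptions; the quoted post does not say which ordering is meant.

```python
import os


def sort_for_dedup(paths, sizes):
    """Order files so that likely-similar ones (same extension, similar size) sit together.

    `sizes` maps each path to its size in bytes; it is passed in so the sketch
    does not need to touch the file system.
    """
    def key(path):
        ext = os.path.splitext(path)[1].lower()
        return (ext, sizes[path], path)

    return sorted(paths, key=key)
```

For example, `sort_for_dedup(["b/tex1.dds", "a/snd.wav", "a/tex0.dds"], {"b/tex1.dds": 200, "a/snd.wav": 50, "a/tex0.dds": 100})` puts the two `.dds` files side by side ahead of the `.wav` file.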
Believe me, I have several ideas for reducing memory usage before even relying on virtual memory. Optimisation is my middle name.
|