15-11-2021, 02:34
kj911
NEWPACK compressor

This is an experimental general-purpose lossless data compressor, the genesis of which was the "fast" DSD compression mode developed for WavPack. I discovered early on that several regular compressors (e.g., bzip2) do a surprisingly decent job on DSD (1-bit PCM) audio files. Conversely, after I developed the "fast" DSD mode of WavPack, I discovered that it could often do a decent job on other types of data (like text).

This implementation scans each buffer of data to be compressed and creates a probability model for each byte which is based on the hash of some number of previous bytes in the stream. This model consists of a bitmask representing which hashes are actually present in the file and a symbol probability table for each hash that appeared. These two model components are then compressed recursively and sent to the output file, and then the actual data is encoded using the model with a range coder and appended to the end of the encoded block.
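To make the idea concrete, here is a minimal Python sketch of a hashed-context model. It is not NEWPACK's code: the hash function, the `HASH_BITS` width, and all function names are assumptions picked for illustration, and a Python set stands in for the bitmask. It only builds the tables and estimates what an ideal entropy coder would spend; the recursive model compression and the range coder are left out.

```python
import math
from collections import defaultdict

HASH_BITS = 16  # assumed table-index width, chosen only for this sketch


def context_hash(history: bytes) -> int:
    """Hypothetical hash of the preceding bytes (not NEWPACK's actual function)."""
    h = 0
    for b in history:
        h = ((h + b + 1) * 0x9E3779B1) & 0xFFFFFFFF
    return h >> (32 - HASH_BITS)


def build_model(data: bytes, depth: int):
    """Count, for every context hash that occurs, how often each byte follows it."""
    tables = defaultdict(lambda: [0] * 256)
    for i, sym in enumerate(data):
        ctx = context_hash(data[max(0, i - depth):i])
        tables[ctx][sym] += 1
    used = set(tables)  # stands in for the bitmask of hashes actually present
    return used, tables


def ideal_cost_bits(data: bytes, depth: int, tables) -> float:
    """Bits an ideal entropy coder would spend on the data, given the model."""
    bits = 0.0
    for i, sym in enumerate(data):
        counts = tables[context_hash(data[max(0, i - depth):i])]
        bits -= math.log2(counts[sym] / sum(counts))
    return bits
```

With `depth=0` every byte lands in the same context, so the model degenerates to plain symbol frequencies. Deeper history usually lowers the coding cost of the data itself but produces more tables that have to be stored.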

The history depth is controlled with a single parameter that goes from 0 to 7, with 0 representing no history (i.e., only the frequency of each isolated symbol is stored) and 7 representing the maximum practical history size (up to a 30-bit hash and up to 2^20 probability tables). The default is 5 (24-bit hash and 2^16 tables) for the default block size of 20,000,000 bytes.

Version 0.0.2 adds options to preprocess each block for better compression. First, if the data is interleaved or periodic for some other reason, then deinterleaving the data with a given "stride" can often improve performance. For example, 16-bit stereo audio data works much better deinterleaved with a stride of 4. A particular stride may be explicitly specified with -i<n>, while plain -i scans each block to try to detect any periodicity and uses the detected stride only if it produces better compression.
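A small Python sketch of what deinterleaving with a stride means. The function names are mine and this is not taken from the NEWPACK source; it only shows the byte shuffle and its inverse.

```python
def deinterleave(data: bytes, stride: int) -> bytes:
    """Group bytes by their position within each stride-sized frame."""
    return b"".join(data[k::stride] for k in range(stride))


def interleave(data: bytes, stride: int) -> bytes:
    """Inverse of deinterleave; also handles a partial final frame."""
    out = bytearray(len(data))
    pos = 0
    for k in range(stride):
        n = len(range(k, len(data), stride))  # number of bytes that landed in lane k
        out[k::stride] = data[pos:pos + n]
        pos += n
    return bytes(out)
```

For 16-bit stereo with a stride of 4, this puts all the left-channel low bytes together, then the left high bytes, then the two right-channel lanes, so each lane looks more like itself than like its neighbours.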

The other preprocessing operation is for numerical data (as opposed to symbolic) and performs an arithmetic "delta" operation for each byte. Again, -nn forces this operation, while -n tries it on each block and uses it only if it improves compression.
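A rough Python sketch of a byte-wise delta, assuming the simplest form (each byte minus the previous one, modulo 256, starting from 0). The exact variant NEWPACK uses is not spelled out in the description above, so treat this as an illustration only.

```python
def delta_encode(data: bytes) -> bytes:
    """Replace each byte with its difference from the previous byte (mod 256)."""
    prev = 0
    out = bytearray()
    for b in data:
        out.append((b - prev) & 0xFF)
        prev = b
    return bytes(out)


def delta_decode(data: bytes) -> bytes:
    """Undo delta_encode by accumulating the differences (mod 256)."""
    prev = 0
    out = bytearray()
    for d in data:
        prev = (prev + d) & 0xFF
        out.append(prev)
    return bytes(out)
```

Slowly changing values (sensor readings, sample data) turn into many small, repeated differences, which a probability model can usually code more cheaply than the raw values.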

Finally, there is a command-line option -a to enable the tests for the preprocessors (the usage text lists it as equivalent to -irn). This is rather slow, but it can only improve the compression, because a preprocessor that doesn't actually produce an improvement is not used.
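The "try it and keep it only if it helps" logic can be sketched in Python like this. Note the assumptions: zlib is used purely as a stand-in size estimate (NEWPACK measures with its own coder), and `estimated_size`, `delta` and `choose_preprocessing` are names made up for this example.

```python
import zlib


def estimated_size(block: bytes) -> int:
    """Stand-in cost function: zlib output size, not NEWPACK's own coder."""
    return len(zlib.compress(block, 9))


def delta(block: bytes) -> bytes:
    """Byte-wise difference from the previous byte, modulo 256."""
    return bytes((b - p) & 0xFF for p, b in zip(b"\x00" + block, block))


def choose_preprocessing(block: bytes, candidates: dict) -> str:
    """Return the name of the cheapest candidate, keeping 'none' on ties."""
    best_name, best_size = "none", estimated_size(block)
    for name, transform in candidates.items():
        size = estimated_size(transform(block))
        if size < best_size:
            best_name, best_size = name, size
    return best_name
```

Because the untouched block is always one of the contenders, the chosen result is never larger than the unprocessed one under the chosen cost measure. The price is one extra trial compression per candidate, which is why this kind of search is slow.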

It's important to note that this is very experimental. It may fail to restore some files faithfully, and may do so silently. Executables built on different platforms may not be compatible with each other, and future versions will probably not decompress files created with this version. It's probably full of undefined behavior (UB) and potentially exploitable memory corruption bugs, and it does virtually no sanity checking of incoming data. It is certainly not recommended as a day-to-day compressor.

Quote:
NEWPACK Experimental General-Purpose Lossless Compressor Version 0.0.2
Copyright (c) 2020 David Bryant. All Rights Reserved.

Usage:   NEWPACK [-d] [-options] infile outfile
         specify '-' for stdin or stdout

Options: -d     = decompression (default is compression)
         -a     = use all specialty modes (equivalent to -irn)
         -[1-7] = probability model depth (default = 5 for 20 MB block)
         -0     = no history employed in model (symbol frequency only)
         -e     = exhaustive search (very slow for very little gain)
         -i     = automatically detect interleave and use if better
         -i<n>  = force specified interleave stride (1-16, default = 1)
         -b<n>  = specify block size (1 MB - 100 MB, default = 20 MB)
         -l     = use long blocks (100 MB; history depth set to -6)
         -s     = use short blocks (4 MB; history depth set to -4)
         -t     = use tiny blocks (1 MB; history depth set to -3)
         -n     = try numerical data type preprocessing and use if better
         -nn    = force numerical data type preprocessing (i.e., deltas)
         -r     = try simple run-length encoding and use if better
         -rr    = force simple run-length encoding (really just for testing)
         -m     = decode random output based solely on model
         -vv    = very verbose messaging (include internal details)
         -v     = verbose messaging

Warning: EXPERIMENTAL - DON'T EVEN THINK OF ARCHIVING WITH THIS!!
More info, source code and results: https://github.com/dbry/newpack

My WinXP-compatible binary is attached, along with a differently compiled version.
Attached Files
File Type: rar newpack_0.0.2_x86_winxp.rar (78.6 KB, 30 views)
File Type: zip newpack.zip (25.3 KB, 49 views)
Copyright 2000-2020, FileForums @ https://fileforums.com