I haven't tested your modded tool yet, but I plan to.
If you can detect data with >99% entropy and skip it, and make it somehow comparable to 4x4 but with the ratio of 7z/xz (lzma2), then this tool has a shot!
Btw, you can tune 7z/xz to use as many threads as you want, but yours will have skipping, which will make it faster by default.
Ex:
[External compressor:lzma2]
; Block Size = -ms256m | Multithread = -mmt=8 |
header = 0
packcmd = 7z a -txz -an -m0=lzma2{:option}:fb=273:mf=bt4:mc=100000000:lc=4:lp=0 -mmt=4 -ms1024m -mx9 -si -so <stdin> <stdout>
unpackcmd = 7z x -txz -an -y -si -so <stdin> <stdout>
[External compressor:lzma2_alt]
; Alternative Chunking Method for multithreading decompression
header = 0
packcmd = 7z a -txz -an -m0=lzma2:d512m:c1024m:fb=273:mf=bt4:mc=100000000:lc=4:lp=0 -mmt=4 -mx9 -si -so <stdin> <stdout>
unpackcmd = 7z x -txz -an -y -si -so <stdin> <stdout>
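
To show what I mean by the entropy skipping, here is a rough Python sketch. It is not how your tool does it, just my guess at the idea. The names `entropy_ratio` and `pack_block` are made up, the 0.99 threshold is the ">99%" from above, and it only looks at plain byte frequencies, so a block can score high and still be compressible by LZ matching.

```python
import lzma
import math
from collections import Counter

def entropy_ratio(block: bytes) -> float:
    """Shannon entropy of the byte histogram, as a fraction of the 8 bits/byte maximum."""
    if not block:
        return 0.0
    n = len(block)
    counts = Counter(block)
    bits = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return bits / 8.0

def pack_block(block: bytes, threshold: float = 0.99) -> tuple:
    """Return (stored, payload). Blocks above the threshold are stored as-is (skipped)."""
    if entropy_ratio(block) > threshold:
        return True, block  # looks incompressible, do not waste time on it
    # Assumption: xz container at preset 9 stands in for the 7z/xz lzma2 settings above.
    return False, lzma.compress(block, format=lzma.FORMAT_XZ, preset=9)
```

Per block you would call `pack_block(data)` and write a one-byte flag for `stored` in front of the payload, so the unpacker knows whether to run lzma2 or just copy the bytes through.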