KaktoR, 09-04-2023, 10:07
How to know what compression method a game has

Here I will show you how to find out which compression method a game uses, so you know which precompression you can (or should) apply to achieve a good ratio.

I will keep this fairly superficial, because I don't really want to go deeper down this rabbit hole; otherwise I could write a book about it. Another reason is simply that I don't know everything about this. Furthermore, I will only explain oodle compression here, because oodle is very common nowadays.

First things first. In this showcase I will use xtool.exe + xtoolui.dll, to keep things simple.

https://i.imgur.com/jtBUHUF.png


In the first example I will work with the game "The Last of Us Part 1", simply because it is a current game. I can already tell you that this game uses oodle compression, before doing any testing on the game files. How can I tell without doing anything? Most games that use oodle compression have shipped with a file called "oo2core_x_win64.dll" somewhere in the game folder (where "x" is an integer from 3 to 9). That is the case here as well: if you look in the game's root folder you will find the file "oo2core_9_win64.dll", so you can see right away that oodle compression is used somewhere.
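If you prefer not to browse the folders by hand, the file name check can be scripted. This is only a minimal sketch: the function name `find_oodle_dlls` is made up for this example, and the only thing taken from the post is the file name pattern "oo2core_x_win64.dll".

```python
import re
from pathlib import Path

# File name pattern from the post: oo2core_<x>_win64.dll
OODLE_DLL = re.compile(r"^oo2core_(\d+)_win64\.dll$", re.IGNORECASE)


def find_oodle_dlls(game_dir):
    """Return sorted (x, path) pairs for every oo2core DLL below game_dir."""
    hits = []
    for path in Path(game_dir).rglob("*"):
        match = OODLE_DLL.match(path.name)
        if match and path.is_file():
            hits.append((int(match.group(1)), path))
    return sorted(hits)
```

Call it with the game's root folder, for example `find_oodle_dlls(r"D:\Games\MyGame")` (the path is a placeholder). An empty list does not prove that the game is free of oodle; the library can also be built into the executable, as shown further down.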

OK, but the oodle compression family has several codecs: kraken, mermaid, selkie, hydra and leviathan. How do you know which codec to select? The first step is to scan an input, such as a game file, with the oodle scanner by Razor12911.

In this example I use the file "common.psarc" as an input to test with. So just drag&drop this file onto "oo2scan_7_win64.exe", which opens a cmd window.

https://i.imgur.com/kG30rO9.png

Now you see a list of streams with some information. The important part is the codec used. In the example above it is [1], which means the kraken codec is used.

---------------------------

If you ever feel the need to double-check, note the offset (Pos: 000xxxxx) and jump to it in HxD. As an example, I look up the first oodle stream (Pos: 00013750): go to "Search" -> "Go to..." and enter "13750" as a hex value.

https://i.imgur.com/kj7YOfU.png


As a side note, the kraken header is "8C 06". If you want to dig deeper, you can also find out CSize and DSize (the compressed and decompressed size of this stream), but that would go too deep for this topic.
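The same check can be done without a hex editor. A minimal sketch, assuming nothing beyond what is stated above (the two header bytes "8C 06" and the offset printed by the scanner); the function name is made up for this example.

```python
KRAKEN_HEADER = bytes.fromhex("8C06")  # "8C 06", as stated in the post


def has_kraken_header(path, offset):
    """Return True if the two bytes at `offset` equal the kraken header."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(len(KRAKEN_HEADER)) == KRAKEN_HEADER
```

For the first stream from the screenshot this would be `has_kraken_header("common.psarc", 0x13750)`. Note the `0x` prefix: the scanner prints the position in hex.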


---------------------------

Anyway, let's move on to the easier stuff. We now know that the game uses oodle kraken compression. Copy the file "oo2core_9_win64.dll" from the game folder into the folder where xtool.exe is located, open xtool.exe and set the input to the file "common.psarc". To begin with, you can leave all settings at their defaults. Now press the "Configure" button, which opens a new window. There you select the Oodle option and tick the checkbox "Kraken".

https://i.imgur.com/NOBqfy1.png

Then press "OK" and then "Start" to begin the process. When it is finished, it looks like this.

https://i.imgur.com/WbgL7OK.png

There you can see some helpful information, such as the original and decompressed size of the file, the time it took and the number of streams that were processed. In this case not all streams were processed, but the difference is small enough to ignore. The cause is either an incorrect library (oo2core_x_win64.dll) or something else.

Repeating this step with verbose mode enabled, we will see something like this.

https://i.imgur.com/86AT2Bt.png

In verbose mode we get even more useful information about all the streams, for example which level they use (level means the compression level of the kraken codec: l1 - l9, where l4 is roughly medium compression).

The l# option is sometimes useful to speed up the process a bit, but don't expect big time savings here.

Another useful option is "Number of scan iterations", which increases the number of streams that can potentially be found. If set to n128, for example, you will find a few more streams, which means that the output will be a bit bigger.

Example:
Code:
n32 default option
Streams: 14951 / 14974
Time: 00:00:09 (CPU 00:00:42)
Size: 1.10 GB >> 1.66 GB

n128
Streams: 16320 / 16331
Time: 00:00:16 (CPU 00:01:03)
Size: 1.10 GB >> 1.68 GB
As you can see, n128 finds more streams and the output is a bit larger, but the processing time is also higher. So for a full game you have to ask yourself whether the additional time is worth it.

Another example with n256
Code:
Streams: 16679 / 16686
Time: 00:00:23 (CPU 00:01:27)
Size: 1.10 GB >> 1.69 GB
As you can see here, the time to find and process the streams has more than doubled between the n32 default and n256, while the output size increased only marginally. So it is always a trade-off.
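To put numbers on that trade-off, the three result blocks above can be parsed and compared against the n32 default. This is only a sketch: the parser assumes the exact "Streams / Time / Size" layout shown in this post, and reading the two stream numbers as "processed / total" is my interpretation of that output.

```python
import re


def to_seconds(hms):
    """Convert 'HH:MM:SS' to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_result(text):
    """Extract stream counts, wall time and sizes from one result block."""
    streams = re.search(r"Streams:\s*(\d+)\s*/\s*(\d+)", text)
    time = re.search(r"Time:\s*(\d+:\d+:\d+)", text)
    size = re.search(r"Size:\s*([\d.]+) GB >> ([\d.]+) GB", text)
    return {
        "processed": int(streams.group(1)),  # assumed: streams processed
        "total": int(streams.group(2)),      # assumed: streams found
        "seconds": to_seconds(time.group(1)),
        "in_gb": float(size.group(1)),
        "out_gb": float(size.group(2)),
    }


# Result blocks copied from the post (file "common.psarc").
RUNS = {
    "n32": "Streams: 14951 / 14974\nTime: 00:00:09 (CPU 00:00:42)\nSize: 1.10 GB >> 1.66 GB",
    "n128": "Streams: 16320 / 16331\nTime: 00:00:16 (CPU 00:01:03)\nSize: 1.10 GB >> 1.68 GB",
    "n256": "Streams: 16679 / 16686\nTime: 00:00:23 (CPU 00:01:27)\nSize: 1.10 GB >> 1.69 GB",
}

base = parse_result(RUNS["n32"])
for name, text in RUNS.items():
    run = parse_result(text)
    time_factor = run["seconds"] / base["seconds"]
    extra_mb = (run["out_gb"] - base["out_gb"]) * 1024
    print(f"{name}: {run['processed']}/{run['total']} streams, "
          f"{time_factor:.2f}x time, +{extra_mb:.0f} MB output")
```

Keep in mind that the sizes in the log are rounded to two decimals, so the MB differences are rough.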

Now the question could be: is increasing the scan iterations worth it for the overall output, after deduplication and compression?
Code:
n32 default - srep m3f - fast lzma2 l6
Streams: 14951 / 14974
Time: 00:00:55 (CPU 00:04:58)
Duplicates: 126 (0.00 MB) [0.01 MB >> 7.04 MB]
Srep decompression memory: 209 MB [210 MB*]

Size: 1.10 GB >> 1.66 GB >> 1.65 GB >> 1.35 GB >> 0.99 GB
Code:
n256 - srep m3f - fast lzma2 l6
Streams: 16679 / 16686
Time: 00:01:09 (CPU 00:05:43)
Duplicates: 126 (0.00 MB) [0.01 MB >> 7.04 MB]
Srep decompression memory: 231 MB [232 MB*]

Size: 1.10 GB >> 1.69 GB >> 1.68 GB >> 1.35 GB >> 0.98 GB
Not that much, just a bit: the final size drops from 0.99 GB to 0.98 GB. However, keep in mind that this was a test with fast-lzma2, so to get meaningful results you should test it with stronger lzma or lolz.

As a side note, this game also has several deflate streams behind the "oodle wall", so to speak. A simplified picture: imagine an egg. The eggshell is the oodle compression, behind the shell there are some zlib streams, and the yolk is the raw data. Or the other way around: some of the raw data is compressed with zlib, and then all the data is compressed with oodle.
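The egg picture can be reproduced with a toy example. Oodle itself is not available in Python, so bz2 stands in for the oodle shell here; that swap and the sample data are assumptions made purely for illustration. The point is only to show why unpacking the outer layer exposes zlib streams that can be unpacked again.

```python
import bz2
import zlib

# The "yolk": raw data. Part A gets zlib-compressed by the game, part B does not.
raw_a = b"texture block " * 5000
raw_b = b"plain script text " * 2000

# Inner layer: only some of the raw data is compressed with zlib (deflate).
inner_stream = zlib.compress(raw_a)
stage1 = inner_stream + raw_b

# Outer layer, the "eggshell". bz2 is a stand-in for oodle in this sketch.
shell = bz2.compress(stage1)

# Precompression step 1: remove the shell (what the kraken option does).
after_shell = bz2.decompress(shell)

# Precompression step 2: unpack the deflate stream behind the shell
# (what the preflate option does).
fully_unpacked = zlib.decompress(after_shell[:len(inner_stream)]) \
    + after_shell[len(inner_stream):]

print(len(shell), len(after_shell), len(fully_unpacked))
```

Each step makes the data larger, just like the growing sizes in the results that follow, and the fully unpacked data is what the final compressor works best on.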

Example file "core.psarc"
Code:
kraken
Streams: 28187 / 28213
Time: 00:00:57 (CPU 00:00:58)

Size: 2.12 GB >> 3.37 GB
Code:
kraken+preflate
Streams: 1188 / 1188
Time: 00:01:18 (CPU 00:06:23)

Size: 3.37 GB >> 5.37 GB
This will make a difference in final compression. Let's take a look at this:
Code:
kraken - srep m3f - fast lzma2 l6
Size: 2.12 GB >> 3.37 GB >> 3.30 GB >> 2.52 GB >> 1.95 GB
Code:
kraken+preflate - srep m3f - fast lzma2 l6
Size: 3.37 GB >> 5.37 GB >> 5.36 GB >> 3.27 GB >> 1.90 GB
About 50 MB saved. But here again, this is a test with the fast-lzma2 plugin, so it is better to test such things with stronger lzma or lolz to get meaningful results.
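The saving can be read straight off the two "Size:" lines. A small sketch that parses the chains from this post and compares the final values; the helper name `parse_chain` is made up for this example.

```python
def parse_chain(line):
    """Turn 'Size: a GB >> b GB >> ...' into a list of floats (in GB)."""
    values = line.split(":", 1)[1].split(">>")
    return [float(value.replace("GB", "").strip()) for value in values]


# "Size:" lines copied from the post (file "core.psarc").
kraken = parse_chain("Size: 2.12 GB >> 3.37 GB >> 3.30 GB >> 2.52 GB >> 1.95 GB")
preflate = parse_chain("Size: 3.37 GB >> 5.37 GB >> 5.36 GB >> 3.27 GB >> 1.90 GB")

# The last value of each chain is the final compressed size.
saved_mb = (kraken[-1] - preflate[-1]) * 1024
print(f"kraken only:       {kraken[-1]:.2f} GB")
print(f"kraken + preflate: {preflate[-1]:.2f} GB")
print(f"saved:             about {saved_mb:.0f} MB")
```

With 1024 MB per GB this comes out at roughly 51 MB. Because the log rounds to two decimals, the real figure can be several MB higher or lower.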

Now that we know something about the main data of this game, we can write a method for DiskSpan_GUI.
Personally I would suggest using something like this:
Code:
xtool:mkraken,l4,n256:core_2.9.9+xtool:mpreflate+srep+...
:core_2.9.9 is important in the method command line, because it tells DiskSpan_GUI which oodle library to use for this game. 2.9.9 is the library version the game actually ships with. In most cases this is the correct library, but there are exceptions, such as "Elden Ring", which ships with a different oodle library that does not process the streams correctly, or games that don't have any library outside the executable at all, which leads us to the next case.
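If you test many combinations, it can help to build the method string from its parts. The sketch below is based only on the method strings shown in this post, not on DiskSpan_GUI documentation; the helper names `xtool_stage` and `build_method` are hypothetical, and the trailing "..." is kept as the same placeholder for the rest of the chain.

```python
def xtool_stage(codec, options=(), core=None):
    """Build one 'xtool:m<codec>[,opt...][:core_<version>]' stage."""
    parts = ",".join([f"m{codec}", *options])
    stage = f"xtool:{parts}"
    if core is not None:
        # Tells DiskSpan_GUI which oodle library to use (per the post).
        stage += f":core_{core}"
    return stage


def build_method(*stages):
    """Join the stages with '+', as in the method line above."""
    return "+".join(stages)


method = build_method(
    xtool_stage("kraken", ["l4", "n256"], core="2.9.9"),
    xtool_stage("preflate"),
    "srep",
    "...",  # placeholder for the remaining stages, as in the post
)
print(method)
```

This reproduces the method line above character for character, so you can swap the level, the scan iterations or the core version in one place.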

---------------------------

As a side note, "The Last of Us Part 1" also has some video files in bink2 format. In older titles we just used binkpack (bpk) to compress them, but newer titles use a codec which binkpack cannot process. In those cases you can open the bk2 file in HxD and look at the file header for "KB2n". If the last character is "n", just store these files, or use weak compression with deduplication to squeeze a few MB out of them, if anything at all.
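The header check is easy to script if a game has many bk2 files. A minimal sketch that only tests for the "KB2n" signature mentioned above; the function names are made up, and I make no claim here about what other header values mean.

```python
def bink2_header(path):
    """Return the first four bytes of a bk2 file."""
    with open(path, "rb") as f:
        return f.read(4)


def binkpack_unsupported(path):
    """True if the header is 'KB2n', which binkpack cannot process (per the post)."""
    return bink2_header(path) == b"KB2n"
```

Files for which `binkpack_unsupported` returns True are the ones to store or only compress weakly.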

---------------------------

So, what do you do if a game uses oodle compression but there is no library in the game folder? For this example I will use "Assassin's Creed Valhalla", simply because I have it installed at the moment. In such cases it can be useful to know when development of the game began, or at least to make a guess. Alternatively, go by the release date of the game. The oodle development history can help you pick the correct library version, or at least get close to it.

OK, so "Assassin's Creed Valhalla" has no oodle library. The library code is somewhere inside the game's binaries, so at this point we cannot say for sure which library version to use. But we know that the game was released on 10 November 2020. So we look at the list I linked above. I would say an oodle library v2.8.X should be used, but which one exactly? This is where the library checker, which is part of xtool, comes in.

Here again, first check a file with the oodle scanner. I chose "DataPC.forge" for this.

https://i.imgur.com/3wooheG.png

In this case we see [2], which means the oodle mermaid codec is used. So we open xtoolui again and adjust some settings.

https://i.imgur.com/ZMXfN8h.png

So now we use the library checker. For this just select the drop-down menu and select "Oodle". Now you have to point to a folder, which contains different oodle libraries. For this purpose we have this. Just download the attachment and extract it somewhere.

https://i.imgur.com/U0OOO0l.png

As I said already, my guess is that this game was compressed with an oodle version 2.8.X, but we don't know which version exactly. Now point to the folder ".\Oodle\v2.8.x" and select it, as in the image above.

Now start the process and wait until it's finished. This will take some time because XTool is now checking the file with the different libraries one after another. To find the correct library, look for the one that gives you the best results. Just compare the number of streams, the times and the sizes.

https://i.imgur.com/qNDAuDa.png

As you can see here, it is very likely that a library from v2.8.0 to v2.8.4 was used for the game. It is now up to you which library to use. Additionally, you can check more game files, and if the results are the same, pick any of these libraries. Personally I would use the v2.8.4 library because there were some fixes to the oodle code before v2.8.4.
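If you note the checker results down, the comparison can be turned into a simple ranking. This is only a sketch of one possible rule (most processed streams first, then larger output, then shorter time); it is not how XTool itself decides anything. The library names and all numbers below are invented placeholders, not measurements.

```python
from dataclasses import dataclass


@dataclass
class LibraryResult:
    name: str        # which library was tested
    processed: int   # streams processed
    total: int       # streams found
    seconds: int     # wall time
    out_gb: float    # size after precompression


def rank(results):
    """Best candidate first: more streams, then bigger output, then less time."""
    return sorted(results, key=lambda r: (-r.processed, -r.out_gb, r.seconds))


# Placeholder values for illustration only, NOT real results.
results = [
    LibraryResult("lib_a", processed=100, total=500, seconds=60, out_gb=1.05),
    LibraryResult("lib_b", processed=500, total=500, seconds=45, out_gb=1.50),
    LibraryResult("lib_c", processed=500, total=500, seconds=50, out_gb=1.50),
]

for result in rank(results):
    print(result.name, f"{result.processed}/{result.total}",
          f"{result.seconds}s", f"{result.out_gb:.2f} GB")
```

When several libraries tie on streams and size, as v2.8.0 to v2.8.4 do in the screenshot, the ranking only breaks the tie by time, so the final pick is still a judgment call.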

Code:
Streams: 4615 / 4615
Time: 00:00:46 (CPU 00:03:58)

Size: 1.03 GB >> 1.54 GB
Some games, such as those built with the Anvil Forge engine (Assassin's Creed, for example), have an xtool plugin you can use.

Here is an example with the anvil plugin on the same file as above
Code:
Streams: 60440 / 60440
Time: 00:01:03 (CPU 00:06:10)

Size: 1.03 GB >> 1.84 GB
So here again a command line method for DiskSpan_GUI
Code:
xtool:manvil:core_2.8.4+srep+...
I hope I didn't forget to explain anything here.

Edit: Corrected some things.

Last edited by KaktoR; 15-04-2023 at 11:34.