( Overview of the Entropy of a File )
Simplistically, entropy is disorder or, better in computing terms, the density of information that a data stream can contain. So the more predictable the content of a file, the lower its entropy. The software ENT (Pseudorandom Number Sequence Test Program, http://www.fourmilab.ch/random) performs various statistical tests and reports the following:

1) Entropy
The density of information contained in a file, expressed in bits per character. The maximum entropy is 8: a file with an entropy of 8 is either perfectly random or compressed. For example, a bitmap has an entropy of 4.724502 bits per byte; converted to jpeg it becomes 7.938038 bits per byte. If I compress the bmp with winrar I get 7.996259 bits per byte. A text file in which I wrote the same word a thousand times has Entropy = 2.545246 bits per byte. Compressed with winrar it gives Entropy = 6.747827, and with winrar at maximum compression Entropy = 6.756800.

2) Chi-square Test
Used to study the randomness of data streams. Applied to image files, it can report the data as random. In practice it expresses, as a percentage, how far the data stream deviates from a truly random sequence. If the result is > 99% or < 1%, the data stream is not random. If it is between 95% and 99% or between 1% and 5%, the stream is suspect; intermediate values indicate random data.

3) Arithmetic Mean
Sums all the bytes and divides by the length of the file. The closer the result is to 127.5, the more random the data.

4) Monte Carlo Value for Pi
The closer the value is to pi (3.14...), the more random / compressed the data stream.

5) Correlation Coefficient
How predictable a byte is when the previous one is known. The closer the value is to 1, the more predictable the data; the closer to 0, the more random.

SOME EXAMPLES ON IMAGES:
We analyze a bitmap file ... Quote:
The chi-square gives us a value of 0.01, which says the stream is not random; but since this is a picture, the test is not reliable. The mean is 160, which deviates from 127.5, so the data is not random. The Monte Carlo value is also far from 3.14. The correlation is 0.81: bitmaps always have a correlation close to 1, while random data would give 0. Now let's look at the same bitmap converted to jpg ... Quote:
Here the entropy is very high (7.9): the file is heavily compressed. The Monte Carlo test gives 3.21, quite close to pi, so close to random. The correlation coefficient is close to 0, which tells us the file is compressed.

After this short overview: this GUI natively uses the ENT application to calculate the entropy of a file or of an entire data folder. It provides a report based on the size reduction of the file or folder and its total compression ratio, so you can quickly see whether a file type and/or folder will compress well or poorly.

Classification of the file or group of files:
The scan of the file or folder is divided into 5 blocks, with an entropy range from 1.0 to 7.0 for Deflate and Text, and from 1.0 to 7.5 for Void and Msrsolid. During the scan the arc.groups file is read directly, and files are assigned by extension to the 4 masks "Void, Deflate, Msrsolid and Text" or to the basic method. Files with an entropy higher than 7.0 or 7.5, or with an extension not set in the arc.groups file, are classified under the basic method. I chose the levels 7.0 and 7.5 on the basis of various tests carried out on individual files of various formats: for a file with an entropy between 7.0 and 7.5, strong compression with several different compressors gives a reduction of 20-25%, that is, a compression ratio of 75-80%. Each block contains additional information according to the main method and masks: number of files scanned that belong to it, percentage of size reduction, percentage of compression ratio, and total size of the files added to that block.
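To make the classification rule above more concrete, here is a minimal Python sketch. It is not the code of the GUI or of ENT: the entropy function is the standard Shannon formula in bits per byte, the extension table `EXT_TO_MASK` is a made-up stand-in for what the tool reads from arc.groups, and the helper names are hypothetical. The lower bound of 1.0 is not modeled, since the post only says what happens above the upper limit.

```python
import math
from collections import Counter

# Upper entropy limits per mask, as stated in the post (bits per byte).
MASK_LIMITS = {"text": 7.0, "deflate": 7.0, "void": 7.5, "msrsolid": 7.5}

# Hypothetical extension-to-mask table; the real one comes from arc.groups.
EXT_TO_MASK = {".txt": "text", ".png": "deflate", ".bmp": "msrsolid", ".bin": "void"}


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def classify(extension: str, entropy: float) -> str:
    """Return the mask for a file, or "basic" when the extension is unknown
    or the entropy exceeds the limit of its mask."""
    mask = EXT_TO_MASK.get(extension.lower())
    if mask is None or entropy > MASK_LIMITS[mask]:
        return "basic"
    return mask


def reduction_and_ratio(original_size: int, compressed_size: int) -> tuple[float, float]:
    """Size reduction and compression ratio, both as percentages.
    A ratio of 78% corresponds to a reduction of 22%."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    ratio = 100.0 * compressed_size / original_size
    return 100.0 - ratio, ratio


if __name__ == "__main__":
    sample = b"word " * 1000  # highly repetitive text, so low entropy
    e = shannon_entropy(sample)
    print(f"entropy = {e:.6f} bits per byte -> {classify('.txt', e)}")
```

With this rule a `.txt` file at 7.2 bits per byte falls back to the basic method, while a file in the Void mask at the same entropy stays in its mask, because the two masks have different limits.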
Creating a masked method estimate based on the entropy and the scan of the files:
With a choice of 44 compressors across 4 masks, these are activated or deactivated based on the scan and the entropy of the previously evaluated files. This speeds up and simplifies the creation of the final masked method, with compression estimated to be 90% correct on the compressed files.

LZbench v1.7.1 by Inikep : " Benchmark Compressors LZ77/LZSS/LZMA "
https://github.com/inikep/lzbench
Thanks to Inikep from encode.ru. The application incorporates a modified and adapted version of LZbench for a complete benchmark, on a single file or an entire directory, of 63 compressors of the LZ77/LZSS/LZMA family, with a full report for each file (compression speed in MB/s, decompression speed in MB/s, original size, compressed size, ratio and file name). In the same way there is a final scan with the size reduction and compression ratio, to compare the various compressors and choose among them based on compression speed, decompression speed and ratio for the types of files scanned. In LZBench there is no file size limit: even when using a small amount of memory, the ratio is averaged over the number of parts the file is divided into, giving the overall compression ratio. Quote:
"arc.groups" updated to version 3.0, based on version 2.5 by Panker1992; over 200 popular formats used in gaming were added.

We avoid the vultures: those who do not give credit and thanks for all the work, please do not use or download the application...

UPDATED : BE_Parent_Dir
The file's parent directory is displayed in the masks box. Other minor fixes. Download below.
Last edited by felice2011; 25-04-2017 at 00:59. Reason: Added BE_Parent_Dir