What is a Checksum?
A checksum is a short, fixed-size value computed from the data in a file by a checksum algorithm, often (but not always) a cryptographic hash function. The algorithm is applied to the blocks of data in the file.
A checksum can be used to verify the integrity of the data after a transfer (when you download or copy a file) or to ensure the data has not been modified.
Compare the checksum computed at the sender and receiver ends. If the values are the same, your copy of the file is almost certainly identical to the original and free of errors.
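As a minimal sketch of that sender/receiver comparison, the commands below hash a small file, then hash a faithful copy and a copy with one changed character. The file names and the temporary directory are made up for illustration, and sha256sum (part of GNU coreutils) is assumed to be installed.

```shell
# Work in a throwaway directory so nothing in the current folder is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# "Sender" side: create a file and record its SHA-256 digest.
printf 'hello' > original.txt
sent=$(sha256sum original.txt | cut -d ' ' -f 1)

# "Receiver" side: one faithful copy and one copy with a single changed character.
cp original.txt good_copy.txt
printf 'hellp' > bad_copy.txt
good=$(sha256sum good_copy.txt | cut -d ' ' -f 1)
bad=$(sha256sum bad_copy.txt | cut -d ' ' -f 1)

if [ "$sent" = "$good" ]; then echo "good_copy.txt: digests match"; fi
if [ "$sent" != "$bad" ]; then echo "bad_copy.txt: digests differ"; fi
```

Even a one-character change produces a completely different digest, which is why comparing the two values is enough to spot a damaged copy.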
Commonly used algorithms
- SYSV:
- CRC:
  - Used for detecting random errors and accidental data corruption.
  - Is not a cryptographic hashing algorithm (it uses a linear function based on cyclic redundancy checks).
  - Common variants produce an 8-, 16-, 32- or 64-bit check value; the corresponding generator polynomials are one bit longer (9, 17, 33 or 65 bits).
  - Not intended to be used for cryptographic purposes, since it makes no cryptographic guarantees.
  - Unsuitable for use in digital signatures, because it is easily reversible.
  - Should not be used for security purposes.
  - Collisions are easy to produce: different inputs can give the same result.
  - Invented in 1961 and used in Ethernet and many other standards.
  - A CRC polynomial with a factor of x + 1 is guaranteed to detect all errors that flip an odd number of bits.
  - An n-bit CRC with a nonzero x^0 (constant) term is guaranteed to detect all burst errors of up to n bits.
  - Various other errors can be guaranteed detected by a judicious choice of generating polynomial; see the standard Koopman–Chakravarty reference (preprint, paywall-free) for more details.
- MD5:
  - Cryptographic hash algorithm.
  - Produces a 128-bit (16-byte) hash value, usually written as 32 hexadecimal digits.
  - Was widely used for security purposes, but is now deprecated for them because practical collision attacks exist.
  - Still usable as a checksum against accidental corruption.
- SHA1:
- SHA224:
- SHA256:
- SHA384:
- SHA512:
- BLAKE2B:
- SM3:
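The digest sizes of some of the algorithms listed above can be checked directly. The sketch below assumes the standalone GNU coreutils tools md5sum, sha1sum, sha256sum and sha512sum are installed; the helper name digest_bits is made up for this example.

```shell
# Hash a fixed input with the given tool and report the digest size in bits.
# Each hexadecimal digit encodes 4 bits.
digest_bits() {
    digest=$(printf 'hello' | "$1" | cut -d ' ' -f 1)
    echo $(( ${#digest} * 4 ))
}

for tool in md5sum sha1sum sha256sum sha512sum; do
    echo "$tool: $(digest_bits "$tool") bits"
done
```

The digest length depends only on the algorithm, not on the size of the input.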
Sum
The sum command computes a 16-bit checksum for each given file, or standard input if none are given or for a file of ‘-’. Synopsis:
sum [option]… [file]…
The sum command prints the checksum for each file followed by the number of blocks in the file (rounded up). If at least one file is given, file names are also printed. By default, GNU sum computes checksums using an algorithm compatible with BSD sum and prints file sizes in units of 1024-byte blocks.
The program accepts the following options:
-r: Use the default (BSD-compatible) algorithm. This option is in effect by default unless -s is also given. Default BSD-compatible algorithm:
sum -r alpine.iso
28266 214016 alpine.iso
It is the same as:
sum alpine.iso
28266 214016 alpine.iso
-s or --sysv: Compute checksums using an algorithm compatible with System V sum’s default, and print file sizes in units of 512-byte blocks.
sum -s alpine.iso
7188 428032 alpine.iso
An exit status of zero indicates success, and a nonzero value indicates failure.
The sum command is provided for compatibility; the cksum program is preferable in new applications.
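The two block sizes explain why the System V block count for alpine.iso (428032) is exactly twice the BSD one (214016). A small sketch on a file of known size makes the rounding visible; the file name sample.bin and the 3000-byte size are arbitrary choices for this example.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A 3000-byte file: not a multiple of 512 or 1024, so rounding up is visible.
head -c 3000 /dev/zero > sample.bin

# The second field of the output is the block count.
bsd_blocks=$(sum -r sample.bin | awk '{print $2}')
sysv_blocks=$(sum -s sample.bin | awk '{print $2}')

echo "BSD  (1024-byte blocks): $bsd_blocks"   # 3000 / 1024 rounded up = 3
echo "SysV (512-byte blocks):  $sysv_blocks"  # 3000 / 512 rounded up = 6
```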
Checksum format
Legacy output format
cksum by default prints the POSIX standard CRC checksum for each file along with the number of bytes in the file, and the file name unless no arguments were given. The 32-bit CRC used is based on the polynomial used for CRC error checking in the ISO/IEC 8802-3:1996 standard (Ethernet).
- SystemV format:
cksum --algorithm=sysv alpine.iso
7188 428032 alpine.iso
- BSD format:
cksum --algorithm=bsd alpine.iso
28266 214016 alpine.iso
Tagged output format
With the --algorithm option selecting non-legacy checksums, the cksum command defaults to output of the form: digest_name (file name) = digest
The standalone checksum utilities can select this output mode by using the --tag option.
cksum -a sha256 --tag alpine.iso
SHA256 (alpine.iso) = c66fc1e0470781f8ecbab8eb9cc8d906066171a5e0c6c1ab20aedc7061836d27
Untagged output format
With the --untagged option and the --algorithm option selecting non-legacy checksums, the following output format is used. This is the default output format of the standalone checksum utilities. For each file, the command prints the checksum, a space, a flag indicating binary or text input mode, and the file name. Binary mode is indicated with ‘*’, text mode with ‘ ’ (space). Binary mode is the default on systems where it’s significant, otherwise text mode is the default.
cksum -a sha256 --untagged alpine.iso
c66fc1e0470781f8ecbab8eb9cc8d906066171a5e0c6c1ab20aedc7061836d27 alpine.iso
cksum -a sha256 alpine.iso
SHA256 (alpine.iso) = c66fc1e0470781f8ecbab8eb9cc8d906066171a5e0c6c1ab20aedc7061836d27
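The same two formats can be reproduced with the standalone sha256sum utility, whose defaults are the reverse of cksum's: untagged unless --tag is given. This is a small sketch assuming GNU coreutils; hello.txt is a made-up file containing the five bytes "hello".

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'hello' > hello.txt

# Default (untagged) output of the standalone utility: digest, then file name.
untagged=$(sha256sum hello.txt)
# Tagged output: algorithm name, file name in parentheses, then the digest.
tagged=$(sha256sum --tag hello.txt)

echo "$untagged"
echo "$tagged"
```

Both lines carry the same digest; only the layout around it changes.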
Without --zero, and with non-legacy output formats, if a file name contains a backslash, newline, or carriage return, the line is started with a backslash, and each problematic character in the file name is escaped with a backslash, making the output unambiguous even in the presence of arbitrary file names. Since the backslash character itself is escaped, any other backslash escape sequences are reserved for future use.
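The escaping rule can be observed with a file whose name contains a backslash. This is a sketch assuming GNU sha256sum; the file name odd\name.txt is invented for the example.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create an empty file whose name contains a literal backslash.
: > 'odd\name.txt'

line=$(sha256sum 'odd\name.txt')
printf '%s\n' "$line"

# The first character of the line signals that the file name was escaped.
first=$(printf '%s' "$line" | cut -c 1)
if [ "$first" = '\' ]; then
    echo "line starts with a backslash: the file name is escaped"
fi
```

In the printed line the backslash in the name appears doubled, so a checking tool can recover the original name unambiguously.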
Verify the integrity of a file
I have downloaded the Alpine operating system ISO. To make sure it was downloaded correctly, with no errors or modifications and without being infected by malware on the way, I will compare the checksum provided on the official web page with the one generated from the ISO file.
In this case the provided checksum file has the .sha256 extension, which indicates that it was generated using the SHA-256 algorithm, so I need to use the same algorithm for the comparison.
There are two native commands to verify a checksum file on Ubuntu 24.04:
- cksum command:
cksum -a sha256 -c alpine.iso.sha256
- sha256sum command:
sha256sum -c alpine.iso.sha256
Both give the same output: alpine.iso: OK. This indicates that the checksum computed from the ISO matches the one in the checksum file.
Considerations:
- The checksum file (the text file downloaded for comparison) contains the name of the file it was computed from.
- The algorithm of the checksum file needs to match the algorithm of the checksum command, or the command will throw an error: no properly formatted checksum lines found.
- Checksum files can have a .txt extension, as long as the content has the correct format.
- The original file must be at the path recorded in the checksum file, relative to the folder where you run the checksum command (usually this means both files are in the same folder), or you will receive a No such file or directory error.
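The verification step and its failure mode can be rehearsed on a small stand-in file. In this sketch demo.iso is a made-up file, not a real image, and its checksum file is generated locally instead of being downloaded; sha256sum from GNU coreutils is assumed.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for a downloaded file and its published checksum file.
printf 'hello' > demo.iso
sha256sum demo.iso > demo.iso.sha256

# Unmodified file: the check reports OK and exits with status 0.
if sha256sum -c demo.iso.sha256; then ok_status=0; else ok_status=$?; fi

# Simulate corruption by changing one character: the check reports FAILED
# and exits with a nonzero status.
printf 'hellp' > demo.iso
if sha256sum -c demo.iso.sha256; then bad_status=0; else bad_status=$?; fi

echo "unmodified: exit $ok_status, modified: exit $bad_status"
```

Because the result is also reported through the exit status, the same check can be used in scripts without parsing the OK or FAILED text.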
Useful command options
Checksum using sum on all the .iso files in the current folder:
sum -r *.iso
28266 214016 alpine2.iso
28266 214016 alpine.iso
Checksum using sum on all the files in the directory:
sum -r *
28266 214016 alpine2.iso
28266 214016 alpine.iso