Checksum


What is a Checksum?

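A checksum is a short value computed from a file's data by a checksum algorithm or a hash function. Many of the algorithms in use today are cryptographic hash functions, but simpler ones such as CRC are not.
A checksum can be used to verify the integrity of the data in a data transfer (when you download or copy a file) or to ensure the data has not been modified.
Compute the checksum at the sender and receiver ends and compare the two values. If they are the same, your copy of the file is almost certainly identical to the original and free of transfer errors. A match is evidence that the file is genuine only when the algorithm is cryptographic and the reference checksum comes from a trusted source.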
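
As a minimal sketch of that comparison, assuming GNU coreutils' sha256sum is installed: the file names original.bin and copy.bin are hypothetical stand-ins for the sender's file and the receiver's copy, created in a temporary directory.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
printf 'example payload\n' > "$workdir/original.bin"   # hypothetical "sender" file
cp "$workdir/original.bin" "$workdir/copy.bin"         # hypothetical "receiver" copy

# Keep only the digest, which is the first field of each output line.
sender_sum=$(sha256sum "$workdir/original.bin" | cut -d' ' -f1)
receiver_sum=$(sha256sum "$workdir/copy.bin" | cut -d' ' -f1)

if [ "$sender_sum" = "$receiver_sum" ]; then
    result="match"
else
    result="mismatch"
fi
echo "$result"
rm -r "$workdir"
```

In real use the sender's digest would come from a published checksum file rather than from a second local computation.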

Commonly used algorithms

  • SYSV:
  • CRC:
    • Used for detecting random errors and accidental data corruption.
    • Is not a cryptographic hashing algorithm (it uses a linear function based on polynomial division).
    • Produces a check value that is usually 8, 16, 32 or 64 bits wide (the generator polynomial has one more bit: 9, 17, 33 or 65).
    • Not intended to be used for cryptographic purposes, since it makes no cryptographic guarantees.
    • Unsuitable for use in digital signatures, because it is easily reversible.
    • Should not be used for security purposes; it is neither an encryption algorithm nor a secure hash.
    • Collisions are easy to produce: different inputs can give the same result.
    • Invented in 1961 and used in Ethernet and many other standards.
    • A CRC whose generator polynomial has a factor of x+1 is guaranteed to detect all errors that flip an odd number of bits.
    • An n-bit CRC whose generator polynomial has a nonzero x^0 (constant) term is guaranteed to detect any single burst error of up to n bits.
    • Various other errors can be guaranteed detected by a judicious choice of generating polynomial; see the standard Koopman–Chakravarty reference (preprint, paywall-free) for more details.
  • MD5:
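    • Cryptographic hash algorithm.
    • Produces a 128-bit (16-byte) hash value, usually written as 32 hexadecimal digits.
    • Is a one-way hash, not an encryption algorithm, so it cannot be used to encrypt data.
    • Deprecated for security use because practical collision attacks exist; it can still detect accidental corruption.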
  • SHA1:
  • SHA224:
  • SHA256:
  • SHA384:
  • SHA512:
  • BLAKE2B:
  • SM3:
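
To make the digest sizes concrete, here is a small sketch that hashes the same input with several of the listed algorithms and counts the hexadecimal digits in each result. It assumes the GNU coreutils tools md5sum, sha1sum, sha256sum and sha512sum are available; the helper name digest_len is made up for this example.

```shell
# Hash the same input with several algorithms and count the hex digits
# in each digest (the first field of the tool's output).
digest_len() {
    # $1 is the checksum command to run, e.g. md5sum
    printf 'hello' | "$1" | cut -d' ' -f1 | tr -d '\n' | wc -c | tr -d ' '
}

md5_len=$(digest_len md5sum)        # 128 bits -> 32 hex digits
sha1_len=$(digest_len sha1sum)      # 160 bits -> 40 hex digits
sha256_len=$(digest_len sha256sum)  # 256 bits -> 64 hex digits
sha512_len=$(digest_len sha512sum)  # 512 bits -> 128 hex digits
echo "$md5_len $sha1_len $sha256_len $sha512_len"
```

Each hexadecimal digit encodes 4 bits, so the digit count is the digest width divided by four.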

Sum

The sum command computes a 16-bit checksum for each given file, or standard input if none are given or for a file of ‘-’. Synopsis:

sum [option]… [file]…

The sum command prints the checksum for each file followed by the number of blocks in the file (rounded up). If at least one file is given, file names are also printed. By default, GNU sum computes checksums using an algorithm compatible with BSD sum and prints file sizes in units of 1024-byte blocks.

The program accepts the following options:

  • -r: Use the default (BSD compatible) algorithm. This option is the default unless -s is also given. Example:
sum -r alpine.iso
28266 214016 alpine.iso

It is the same as:

sum alpine.iso
28266 214016 alpine.iso
  • -s or --sysv: Compute checksums using an algorithm compatible with System V sum’s default, and print file sizes in units of 512-byte blocks.
sum -s alpine.iso
7188 428032 alpine.iso

An exit status of zero indicates success, and a nonzero value indicates failure.
The sum command is provided for compatibility; the cksum program is preferable in new applications.
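
The difference in block units is easy to see on input of a known size. The sketch below, which assumes GNU sum plus head and awk, feeds 2048 zero bytes on standard input and prints only the block count (the second field) for each algorithm.

```shell
# 2048 bytes is 2 blocks of 1024 bytes (BSD) or 4 blocks of 512 bytes (System V).
bsd_blocks=$(head -c 2048 /dev/zero | sum -r | awk '{print $2}')
sysv_blocks=$(head -c 2048 /dev/zero | sum -s | awk '{print $2}')
echo "BSD blocks: $bsd_blocks, System V blocks: $sysv_blocks"
```

Reading from standard input keeps the output free of a file name, so the second field is always the block count.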

Checksum format

Legacy output format

cksum by default prints the POSIX standard CRC checksum for each file along with the number of bytes in the file, and the file name unless no arguments were given. The 32-bit CRC used is based on the polynomial used for CRC error checking in the ISO/IEC 8802-3:1996 standard (Ethernet).

  • SystemV format:
cksum --algorithm=sysv alpine.iso
7188 428032 alpine.iso
  • BSD format:
cksum --algorithm=bsd alpine.iso
28266 214016 alpine.iso
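
For the default POSIX CRC output, a quick sketch on standard input shows the two fields that are always present, the CRC and the byte count. The variable names are chosen for this example only.

```shell
# Default cksum output on standard input is "<CRC> <byte count>".
out=$(printf 'hello' | cksum)
crc=$(echo "$out" | awk '{print $1}')     # 32-bit CRC printed as a decimal number
bytes=$(echo "$out" | awk '{print $2}')   # number of bytes read (5 for "hello")
echo "crc=$crc bytes=$bytes"
```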

Tagged output format

With the --algorithm option selecting non-legacy checksums, the cksum command defaults to output of the form: digest_name (file name) = digest

The standalone checksum utilities can select this output mode by using the --tag option.

cksum -a sha256 --tag alpine.iso
SHA256 (alpine.iso) = c66fc1e0470781f8ecbab8eb9cc8d906066171a5e0c6c1ab20aedc7061836d27

Untagged output format

With the --untagged option and the --algorithm option selecting non-legacy checksums, the following output format is used. This is the default output format of the standalone checksum utilities. For each file, we print the checksum, a space, a flag indicating binary or text input mode, and the file name. Binary mode is indicated with ‘*’, text mode with ‘ ’ (space). Binary mode is the default on systems where it’s significant, otherwise text mode is the default.

cksum -a sha256 --untagged alpine.iso
c66fc1e0470781f8ecbab8eb9cc8d906066171a5e0c6c1ab20aedc7061836d27  alpine.iso
cksum -a sha256 alpine.iso
SHA256 (alpine.iso) = c66fc1e0470781f8ecbab8eb9cc8d906066171a5e0c6c1ab20aedc7061836d27

Without --zero, and with non-legacy output formats, if the file name contains a backslash, newline, or carriage return, the line is started with a backslash, and each problematic character in the file name is escaped with a backslash, making the output unambiguous even in the presence of arbitrary file names. Since the backslash character itself is escaped, any other backslash escape sequences are reserved for future use.
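
The two non-legacy formats can be compared side by side with the standalone sha256sum utility, which is untagged by default and tagged with --tag. This is a sketch under the assumption of GNU coreutils on Linux; demo.txt is a hypothetical sample file created in a temporary directory.

```shell
workdir=$(mktemp -d)
printf 'demo\n' > "$workdir/demo.txt"   # hypothetical sample file

# Run inside a subshell so the caller's working directory is unchanged.
tagged=$(cd "$workdir" && sha256sum --tag demo.txt)   # "SHA256 (demo.txt) = <digest>"
untagged=$(cd "$workdir" && sha256sum demo.txt)       # "<digest>  demo.txt"
echo "$tagged"
echo "$untagged"
rm -r "$workdir"
```

In the untagged line the digest is followed by a space and the mode flag, which is a second space in text mode; that is why two spaces appear before the file name.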

Verify the integrity of a file

I have downloaded the Alpine operating system ISO. To make sure it was downloaded correctly, with no errors, modifications or malware introduced on the way, I will compare the checksum provided on the official web page with the one generated from the ISO file.

In this case the provided checksum file has the .sha256 extension, which indicates that it was generated using the sha256 algorithm, so I need to use the same algorithm for the comparison.
There are two native commands to verify a checksum file on Ubuntu 24.04:
cksum command:

cksum -a sha256 -c alpine.iso.sha256

sha256sum command:

sha256sum -c alpine.iso.sha256

Both will give us the same output: alpine.iso: OK. This indicates that both checksums match.

Considerations:

  • The checksum file (the text file downloaded for comparison) contains the name of the file it describes.
  • The algorithm used to create the checksum file needs to match the algorithm of the checksum command, or the command will fail with an error: no properly formatted checksum lines found.
  • Checksum files can have a .txt extension, as long as the content has the correct format.
  • The file name inside the checksum file is resolved relative to the directory where you run the command, so the checksum file and the original file normally need to be in the same folder, or you will receive a No such file or directory error.
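
The whole verification cycle, including a failed check, can be rehearsed on a small file. This is a sketch that assumes GNU sha256sum; demo.iso is a hypothetical stand-in for the downloaded image, and the checksum file is generated locally instead of being downloaded.

```shell
workdir=$(mktemp -d)
printf 'pretend this is an ISO image\n' > "$workdir/demo.iso"   # hypothetical stand-in file

# Create the checksum file; it records the digest and the file name.
(cd "$workdir" && sha256sum demo.iso > demo.iso.sha256)

# Verify from the directory that holds both files.
good=$(cd "$workdir" && sha256sum -c demo.iso.sha256)
echo "$good"

# Simulate corruption, then verify again; sha256sum exits nonzero on a mismatch.
printf 'x' >> "$workdir/demo.iso"
if (cd "$workdir" && sha256sum -c demo.iso.sha256 >/dev/null 2>&1); then
    bad_status=0
else
    bad_status=1
fi
echo "exit status after corruption: $bad_status"
rm -r "$workdir"
```

Checking the exit status rather than parsing the text output is usually the more robust choice in scripts.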

Useful command options

Checksum using sum on all the .iso files in the current folder:

sum -r *.iso
28266 214016 alpine2.iso
28266 214016 alpine.iso

Checksum using sum on all the files in the directory:

sum -r *
28266 214016 alpine2.iso
28266 214016 alpine.iso
