Generate MD5 hash of a file in Golang

April 7, 2016

Why generate an MD5 hash of a file?

On some websites that offer file downloads, you may have noticed a strange string of text next to the download link, labelled as a checksum [1], for example "md5 checksum: dcdfb1413d7fa48c3cab920c0448f236". This may look like a bunch of random characters, but don't be fooled: it's a one-way mathematical hash that corresponds to the file you want to download. The idea is to check the integrity of the downloaded file, that is, to detect whether it was corrupted along the way (more on this below). You generate an MD5 hash of the file you saved to disk and compare it to the MD5 hash published where you downloaded it. If they don't match, something went wrong during the transfer of the file.
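The comparison step can be sketched as follows. This is a minimal illustration that hashes bytes already in memory instead of a file; the helper name matchesChecksum is made up for this example, and ignoring case and surrounding whitespace is an assumption about how published checksums tend to be pasted.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strings"
)

// matchesChecksum (hypothetical helper) reports whether the MD5 hash of data
// equals the published hex checksum. Case and surrounding whitespace in
// published are ignored.
func matchesChecksum(data []byte, published string) bool {
	sum := md5.Sum(data) // md5.Sum returns a [16]byte array
	actual := hex.EncodeToString(sum[:])
	return strings.EqualFold(actual, strings.TrimSpace(published))
}

func main() {
	published := "5eb63bbbe01eeed093cb22bb8f5acdc3" // MD5 of "hello world"

	fmt.Println(matchesChecksum([]byte("hello world"), published)) // true
	fmt.Println(matchesChecksum([]byte("hello worle"), published)) // false
}
```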

Nowadays, with stable broadband internet connections, this is rarely an issue anymore. But back in the days of dial-up internet, errors were common. During transfers, packets could get lost [2] or bits could be misinterpreted by the modem, and because of the avalanche effect [3] even a single changed bit would produce a completely different hash.

It's also good practice in programming, when you transfer files between programs, to check that the received file matches the file that was sent, using an MD5 checksum or another popular algorithm such as CRC [4] or SHA [5].

A Go function

The function below takes one argument, the path (absolute or relative) to the file you want to hash. It returns either an empty string and an error, or a string of 32 hexadecimal characters and a nil error.

import (
	"crypto/md5"
	"encoding/hex"
	"io"
	"os"
)

func hash_file_md5(filePath string) (string, error) {
	// Declare returnMD5String now in case an error has to be returned
	var returnMD5String string

	// Open the file at the given path and check for any error
	file, err := os.Open(filePath)
	if err != nil {
		return returnMD5String, err
	}

	// Close the file when the current function returns
	defer file.Close()

	// Create a new MD5 hash to write to
	hash := md5.New()

	// Copy the file contents into the hash and check for any error
	if _, err := io.Copy(hash, file); err != nil {
		return returnMD5String, err
	}

	// Get the 16-byte hash; Sum(nil) returns exactly md5.Size bytes
	hashInBytes := hash.Sum(nil)

	// Convert the bytes to a hexadecimal string
	returnMD5String = hex.EncodeToString(hashInBytes)

	return returnMD5String, nil
}

Example

This example calculates the MD5 hash of its own executable, the running program.

package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

func hash_file_md5(filePath string) (string, error) {
	var returnMD5String string
	file, err := os.Open(filePath)
	if err != nil {
		return returnMD5String, err
	}
	defer file.Close()
	hash := md5.New()
	if _, err := io.Copy(hash, file); err != nil {
		return returnMD5String, err
	}
	hashInBytes := hash.Sum(nil)
	returnMD5String = hex.EncodeToString(hashInBytes)
	return returnMD5String, nil
}

func main() {
	// os.Args[0] is the name the program was started with
	hash, err := hash_file_md5(os.Args[0])
	if err != nil {
		// Report the failure instead of exiting silently
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println(hash)
}

On Windows with Go 1.5.3 and no special compiler options, this should output something like the following (the hash depends on the compiled binary, so yours will likely differ):

C:/Gotests/md5string.exe  [C:/Gotests]
656fc88239fed34577fca4084cf2add6
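One caveat: os.Args[0] holds whatever name the program was started with, which is not guaranteed to be a path that can be opened, for instance when the program is found through PATH. On Go 1.8 or later, os.Executable is an alternative way to locate the running binary. The following is a sketch of that variant, not part of the original example; the function name selfMD5 is made up.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// selfMD5 (hypothetical helper) returns the hex-encoded MD5 hash of the
// running executable.
func selfMD5() (string, error) {
	// os.Executable (Go 1.8+) returns the path of the current executable.
	path, err := os.Executable()
	if err != nil {
		return "", err
	}

	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := md5.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	sum, err := selfMD5()
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println(sum)
}
```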

References

[1] Checksum - Wikipedia. https://en.wikipedia.org/wiki/Checksum
[2] Packet loss - Wikipedia. https://en.wikipedia.org/wiki/Packet_loss
[3] Avalanche effect - Wikipedia. https://en.wikipedia.org/wiki/Avalanche_effect
[4] Cyclic redundancy check - Wikipedia. https://en.wikipedia.org/wiki/Cyclic_redundancy_check
[5] Secure Hash Algorithms - Wikipedia. https://en.wikipedia.org/wiki/Secure_Hash_Algorithm