Perceptual Hash

Perceptual hashing is based on four steps:

  • (1) Convert the image to grayscale and resize it to 32x32 pixels (n_size_side_preprocess=32).
  • (2) Compute the DCT of the 32x32 image.
  • (3) Keep the top-left n_size_side x n_size_side block of the 32x32 DCT array, that is, the n_hash=n_size_side*n_size_side lowest-frequency coefficients (usually ignoring the first value).
  • (4) Create the hash by comparing each value of that block with the median of the block; each comparison yields one bit.
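The steps above can be sketched in plain Julia. This is a minimal illustration, not the code ImageHashes actually runs. It makes several assumptions: step (1) has already been done (for instance with Gray.(img) and imresize from Images), the DCT is an unnormalized DCT-II built from an explicit cosine matrix, a coefficient maps to 1 when it is strictly greater than the median, and "ignoring the first value" is read as leaving the first coefficient out of the median. The names dct_matrix and sketch_phash_bits are hypothetical.

```julia
using Statistics: median

# Unnormalized DCT-II basis: row k holds the k-th cosine sampled at N points.
dct_matrix(N) = [cos(pi * (2j - 1) * (k - 1) / (2N)) for k in 1:N, j in 1:N]

# Steps (2) to (4) for a square grayscale matrix that is already resized.
function sketch_phash_bits(gray::AbstractMatrix{<:Real}, n_size_side::Integer)
    N = size(gray, 1)
    size(gray, 2) == N || throw(ArgumentError("expected a square matrix"))
    1 < n_size_side <= N || throw(ArgumentError("n_size_side must be in 2:N"))

    C = dct_matrix(N)
    coeffs = C * gray * C'                        # (2) separable 2-D DCT
    block = coeffs[1:n_size_side, 1:n_size_side]  # (3) low-frequency corner
    # Assumption: the first coefficient is excluded from the median only.
    threshold = median(vec(block)[2:end])
    return block .> threshold                     # (4) one bit per coefficient
end

# Synthetic 32x32 "image" standing in for the preprocessed input.
gray = [sin(0.37 * i * j) + 0.01 * i for i in 1:32, j in 1:32]
bits = sketch_phash_bits(gray, 8)
```

Because the DCT is linear and the threshold is a median, multiplying every pixel by a positive constant leaves the bits unchanged, which is one reason this kind of hash tolerates global brightness scaling.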

Example

Let us visualize an image together with its hash and its mathash.

using TestImages
img = testimage("fabio_color_256.png");
img

A "perceptual mathash" (matrix hash) can be created using perceptual_mathash(image, n_size_side), as follows:

using TestImages, Images, ImageHashes
img = testimage("fabio_color_256.png");
mat_hash = perceptual_mathash(img, 8)
mat_hash
8×8 BitMatrix:
 1  1  0  1  1  0  0  0
 0  1  0  0  0  1  1  0
 0  0  1  1  0  1  1  1
 0  1  0  0  0  0  0  0
 1  0  0  0  0  0  0  0
 0  0  0  1  0  0  0  0
 0  0  0  0  1  0  0  0
 0  0  0  0  0  0  0  0

We can visualize the hash as a grayscale image by broadcasting Gray over it, that is, Gray.(mat_hash).

using TestImages, Images, ImageHashes
img = testimage("fabio_color_256.png");
mat_hash = perceptual_mathash(img, 8)
Gray.(mat_hash)

The larger the matrix hash, the more detail of the image it can capture. For example, with n_size_side=28:

using TestImages, Images, ImageHashes
img = testimage("fabio_color_256.png");
mat_hash = perceptual_mathash(img, 28)
Gray.(mat_hash)

A hash (vector hash) for an image can be created with the perceptual_hash function.

using TestImages, ImageHashes
img = testimage("fabio_color_256.png");
img_hash = perceptual_hash(img, 8)
img_hash
0x88d020a482606020
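The vector hash and the matrix hash shown above appear to carry the same 64 bits. The sketch below packs the 8×8 matrix printed earlier into a UInt64 by reading it column by column, with the first element as the most significant bit. This packing order is inferred from the two outputs on this page and is not taken from the library source; the function name pack_bits is hypothetical.

```julia
# Pack up to 64 Bool values into a UInt64, first element = most significant bit.
function pack_bits(bits::AbstractMatrix{Bool})
    length(bits) <= 64 || throw(ArgumentError("at most 64 bits fit in a UInt64"))
    h = zero(UInt64)
    for b in bits              # Julia iterates matrices in column-major order
        h = (h << 1) | UInt64(b)
    end
    return h
end

# The 8×8 matrix hash printed above for fabio_color_256.png.
mat_hash = Bool[
    1 1 0 1 1 0 0 0
    0 1 0 0 0 1 1 0
    0 0 1 1 0 1 1 1
    0 1 0 0 0 0 0 0
    1 0 0 0 0 0 0 0
    0 0 0 1 0 0 0 0
    0 0 0 0 1 0 0 0
    0 0 0 0 0 0 0 0
]

packed = pack_bits(mat_hash)   # 0x88d020a482606020
```

The first column of the matrix reads 1 0 0 0 1 0 0 0, which is 0x88, the leading byte of the hash; the remaining columns give d0, 20, a4, 82, 60, 60 and 20 in the same way.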

Execution time and allocations

using TestImages, ImageHashes, BenchmarkTools
img = testimage("fabio_color_256.png");
benchmark = @benchmark perceptual_hash($img, 8)
benchmark
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  42.354 μs …  17.003 ms  ┊ GC (min … max): 0.00% … 36.51%
 Time  (median):     44.888 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   49.810 μs ± 239.208 μs  ┊ GC (mean ± σ):  2.44% ±  0.51%

  ▁▅▆▇████▇▇▆▆▆▅▅▅▄▄▃▃▂▁▁          ▁▁▂▂▁▁▁▁▁ ▁                 ▃
  ██████████████████████████▇▇▄▆▆▇████████████▇███▇█▆▇▆▇▇▅▆▅▅▇ █
  42.4 μs       Histogram: log(frequency) by time        62 μs <

 Memory estimate: 19.97 KiB, allocs estimate: 34.