Published in Image Processing On Line on 2021-10-15.
Submitted on 2021-05-05, accepted on 2021-08-24.
ISSN 2105-1232 © 2021 IPOL & the authors CC-BY-NC-SA
This article is available online with supplementary materials,
software, datasets and online demo at
https://doi.org/10.5201/ipol.2021.355


Image Forgeries Detection through Mosaic Analysis: the
Intermediate Values Algorithm
Quentin Bammey, Rafael Grompone von Gioi, Jean-Michel Morel
Université Paris-Saclay, ENS Paris-Saclay, Centre Borelli, F-91190 Gif-sur-Yvette, France
{quentin.bammey, rafael.grompone, jean-michel.morel}@ens-paris-saclay.fr
Communicated by Tina Nikoukhah
Demo edited by Tina Nikoukhah and Quentin Bammey

Abstract
Cameras sample each image pixel in one color channel only. The remaining channels are
interpolated from neighboring pixels during demosaicing. This operation leaves traces that can be
exploited to authenticate images and detect forgeries. This paper describes the method introduced
by Choi et al., which exploits the fact that interpolated pixels are more likely to take intermediate
values, to detect in which pattern an image has been sampled. We then use this information
to find regions that are inconsistent with the global image. We attribute a confidence score to
each detection, which can then be thresholded to provide a binary map of detected forgeries.
Source Code
The reviewed source code and documentation for this algorithm are available from the web page
of this article¹. Usage instructions are included in the README.txt file of the archive.
Keywords: image forensics; forgery detection; demosaicing; CFA

1 Introduction
Most cameras do not see full color images². They only sample one color per pixel using a color filter
array (CFA), and must interpolate the missing values, as seen in Figure 1. The most common CFA
is the Bayer matrix, shown in Figure 2a. It is made of a simple 2 × 2 pattern, which samples twice as
many pixels in green as in red or blue. Because interpolation leaves traces, it is sometimes possible
to identify in which pattern an image has been sampled, in other words, to know which of the four
blocks seen in Figure 2b represents the first sampled pixels at the image origin. CFAs other than
the Bayer matrix exist, but they are not commonly used; as a consequence, this article focuses
solely on the Bayer CFA.
If there is a forgery in an image, chances are that the pattern in the manipulated area will not
coincide with the rest of the image. For instance, if part of an image is copied and pasted onto
another (or potentially the same) image, there is a 3/4 chance that the patterns will not be aligned, as
can be seen in Figure 3. If this shift in the periodic pattern of sampled colors can be detected, then
it constitutes important evidence of the presence of a forgery.
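The 3/4 figure can be checked by enumerating the possible offsets of a pasted region. The short sketch below assumes that the four offsets modulo 2 are equally likely, which is our reading of the argument rather than something stated explicitly. It also counts the offsets that change the green diagonal, since a shift of one pixel in both directions maps rggb to bggr and leaves the diagonal unchanged.

```python
from itertools import product

# The Bayer CFA has period 2 in both directions, so only the offset of the
# pasted region modulo 2 matters.
offsets = list(product(range(2), repeat=2))  # (dx mod 2, dy mod 2)

# The pasted pattern stays aligned with the host image only for offset (0, 0).
misaligned = [off for off in offsets if off != (0, 0)]

# The green diagonal changes only when exactly one coordinate is shifted.
diagonal_changed = [off for off in offsets if sum(off) % 2 == 1]

# Assumption: the four offsets are equally likely.
p_misaligned = len(misaligned) / len(offsets)
p_diagonal_changed = len(diagonal_changed) / len(offsets)
print(p_misaligned, p_diagonal_changed)
```

Under this assumption, only half of the copy-paste forgeries shift the green diagonal, while the remaining misaligned quarter can only be seen through the red and blue channels.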
¹ https://doi.org/10.5201/ipol.2021.355
² Some cameras are able to sample full color images by using superposed sensors. Due to the high cost for a very
small improvement in the final image quality, these are extremely rare.

Quentin Bammey, Rafael Grompone von Gioi, Jean-Michel Morel, Image Forgeries Detection through Mosaic Analysis: the Intermediate
Values Algorithm, Image Processing On Line, 11 (2021), pp. 317–343. https://doi.org/10.5201/ipol.2021.355



(a) In a raw image, each pixel is only sampled in one color
channel.

(b) Demosaicing interpolates the missing colors.

Figure 1: Most cameras can only sample one color channel per pixel, and need to interpolate the missing data with one of
many demosaicing algorithms. (Here, the mosaic has been artiﬁcially recreated on a low-resolution image so it can be seen
to the naked eye.)

(a) The Bayer CFA, by far the most commonly used CFA.

rggb grbg

gbrg bggr
(b) The four potential patterns that arise from the Bayer
CFA.
Figure 2: The Bayer color filter array (CFA), and the four potential patterns that arise from it. The pattern identifies the
position of the top-left red-sampled pixel in an image, in other words the modulo 2 offset of the CFA. With a Bayer CFA,
half the pixels are sampled in green, one quarter in red and the other quarter in blue. This is because human vision
perceives finer details in green than in red or blue, hence the need to better reconstruct the green channel.

The method we present here, originally proposed by Choi et al. [5], uses the fact that demosaicing
is basically an interpolation operation. As a consequence, interpolated pixels are more often
intermediate values among their immediate neighbors, as seen in Figure 4. For instance, with the
simple bilinear demosaicing, missing colors are directly averaged from the direct neighbors that were
originally sampled in that color, and are thus always intermediate values.
Of course, this simple behavior is no longer true with more complex algorithms, which interpolate
pixels using more samples among all three channels. Nevertheless, with most algorithms, an
interpolated pixel is still more likely to be an intermediate value than a sampled one.
In this paper, we describe, analyze and expand this method. The original article explains how
to detect in which pattern an image, or part of it, has been sampled. Starting from there, we detect
which regions of an image are inconsistent with the main image, and attribute a confidence score to
this detection. We also propose another way of computing intermediate values, which yields slightly
better results.


(a) Authentic image (b) Forged image
Figure 3: Colors in which pixels are sampled in an authentic and forged image. In the forged area of the second image, there
is a 3/4 probability that the patterns of the authentic and forged area are misaligned, causing a shift in the otherwise-periodic
CFA.

18 56 94 85 76 96 116 104
49 56 63 52 41 64 88 87
80 56 32 19 6 33 60 70
59 62 66 49 32 59 87 88
38 69 100 79 58 86 114 106
40 51 63 74 85 75 66 63
42 34 26 69 112 65 18 21
38 47 57 75 94 72 50 35
(a) Red channel

139 240 154 16 94 56 72 20
92 131 168 76 72 94 24 43
85 24 100 48 102 224 130 72
60 107 160 68 64 122 200 153
92 184 125 0 50 0 133 108
52 155 156 76 136 117 224 127
146 228 111 12 110 108 107 44
56 114 48 90 184 141 52 90
(b) Green channel
Figure 4: Red and green channels of a toy image demosaiced with bilinear interpolation in the rggb pattern. Red values
correspond to positions where the value was interpolated. Highlighted cells correspond to pixels that take an intermediate
value, i.e. that are not a local extremum among their direct neighbors. While sampled pixels can have intermediate values,
many more can be found among interpolated pixels in both the red and green channels. The blue channel, not shown here,
behaves similarly to the red one.

2 Method
During demosaicing, the missing colors of each pixel are interpolated from its neighbors. As a
consequence, a pixel that is interpolated in a given channel is more likely to be an intermediate value,
in other words, to be neither lower than all its direct neighbors nor higher than all of them. This is
especially true with the simplest demosaicing algorithm, bilinear demosaicing, which interpolates
the three channels separately.
The detection method analyzed here counts the intermediate values at the positions corresponding
to each of the four patterns. At the positions of the correct pattern, the pixels are sampled rather than
interpolated, so there should be fewer intermediate values than for the other patterns.
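To illustrate this behavior, the following sketch simulates the green channel of an rggb mosaic on random data and interpolates the missing values bilinearly. It is only a toy experiment under our own assumptions (random pixel values, NumPy, border pixels ignored), not the code of the article. It compares the proportion of intermediate values among interpolated and sampled pixels.

```python
import numpy as np

rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(32, 32)).astype(float)

# In the rggb pattern, green is sampled where x + y is odd.
xx, yy = np.meshgrid(np.arange(32), np.arange(32), indexing="ij")
sampled = (xx + yy) % 2 == 1


def neighbors(a):
    """Stack the four direct neighbors of every interior pixel."""
    return np.stack([a[:-2, 1:-1], a[2:, 1:-1], a[1:-1, :-2], a[1:-1, 2:]])


# Bilinear interpolation of green: the four direct neighbors of a missing
# green pixel are all sampled, so the missing value is their mean.
# Border pixels are left untouched and are ignored below.
green = raw.copy()
interior_interp = ~sampled[1:-1, 1:-1]
green[1:-1, 1:-1][interior_interp] = neighbors(raw).mean(axis=0)[interior_interp]

# Mark the interior pixels that are intermediate values among their neighbors.
nb = neighbors(green)
center = green[1:-1, 1:-1]
intermediate = (nb.min(axis=0) <= center) & (center <= nb.max(axis=0))

frac_interp = intermediate[interior_interp].mean()
frac_sampled = intermediate[~interior_interp].mean()
print(frac_interp, frac_sampled)
```

On such data, every interpolated pixel is an intermediate value, whereas only part of the sampled pixels are. With real images and more complex demosaicing, the gap between the two proportions is expected to be smaller.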

2.1 Intermediate Values Detection
Let $I$ of shape $(X, Y)$ be one color channel of an image. The pixel at location $(x, y)$ is considered an
intermediate value if
$$\min(I_{x-1,y}, I_{x+1,y}, I_{x,y-1}, I_{x,y+1}) \le I_{x,y} \le \max(I_{x-1,y}, I_{x+1,y}, I_{x,y-1}, I_{x,y+1}).$$
We define $M(I)$ as the mask of intermediate values of $I$. Its value is 1 if $(x, y)$ is an intermediate value
of $I$, and 0 otherwise.


If $(x, y)$ is at the border of the image, at least one of $x \pm 1$ and $y \pm 1$ is outside the image boundaries.
To avoid border effects, we would thus have to mask out a 1-pixel border around the image. However,
doing this would cause an imbalance in the number of pixels corresponding to different patterns; in
other words, there would be, in border windows, more pixels corresponding to one pattern than to
another. To solve the imbalance, we mask out a 2-pixel-wide border instead. More formally, the
mask of intermediate values is therefore defined by
$$M(I)_{x,y} \triangleq \begin{cases}
0 & \text{if } x \in \{0, 1, X-2, X-1\} \text{ or } y \in \{0, 1, Y-2, Y-1\},\\
1 & \text{otherwise, if } \min(I_{x-1,y}, I_{x+1,y}, I_{x,y-1}, I_{x,y+1}) \le I_{x,y} \le \max(I_{x-1,y}, I_{x+1,y}, I_{x,y-1}, I_{x,y+1}),\\
0 & \text{otherwise.}
\end{cases}$$
The computation of this mask is described in Algorithm 1.

Algorithm 1: Mark intermediate values (original isotropic version)
function is_intermediate(arr)
    Input arr: Array of size (X, Y), one channel of an image
    Output mask: Array of size (X − 4, Y − 4), intermediate values mask
    mask := 0_(X−4, Y−4)
    for x from 2 to X − 3 and y from 2 to Y − 3 (inclusive) do
        mi := min(arr_{x+1,y}, arr_{x,y−1}, arr_{x−1,y}, arr_{x,y+1})
        ma := max(arr_{x+1,y}, arr_{x,y−1}, arr_{x−1,y}, arr_{x,y+1})
        if mi ≤ arr_{x,y} ≤ ma then
            mask_{x−2,y−2} := 1
    return mask
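The isotropic mask can be computed without explicit loops. The sketch below is one possible vectorized NumPy version, written for illustration and not taken from the reviewed source code. It assumes that the first array index is x and the second is y, as in the notation above, and that the channel is at least 5 × 5.

```python
import numpy as np


def is_intermediate(arr):
    """Isotropic intermediate values mask of one channel, shape (X - 4, Y - 4)."""
    arr = np.asarray(arr)
    # Pixels outside the 2-pixel-wide border.
    center = arr[2:-2, 2:-2]
    neighbors = np.stack([
        arr[1:-3, 2:-2],  # (x - 1, y)
        arr[3:-1, 2:-2],  # (x + 1, y)
        arr[2:-2, 1:-3],  # (x, y - 1)
        arr[2:-2, 3:-1],  # (x, y + 1)
    ])
    mask = (neighbors.min(axis=0) <= center) & (center <= neighbors.max(axis=0))
    return mask.astype(np.uint8)


# Toy 5 x 5 channel: only the central pixel is outside the masked border.
peak = np.zeros((5, 5))
peak[2, 2] = 5  # strict local maximum
slope = peak.copy()
slope[1, 2] = 10  # one neighbor is now larger than the center
print(is_intermediate(peak), is_intermediate(slope))
```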

To limit demosaicing artifacts, many demosaicing algorithms avoid interpolating across strong
gradients, such as edges, and thus often interpolate in only one direction (the one in which
the gradient is smaller). To take this into account, we propose to replace the original isotropic
intermediate values mask with bidirectional filters, which separately consider horizontally and vertically
intermediate values. We define the mask of horizontal intermediate values as
$$M(I)^h_{x,y} \triangleq \begin{cases}
0 & \text{if } x \in \{0, 1, X-2, X-1\} \text{ or } y \in \{0, 1, Y-2, Y-1\},\\
1 & \text{otherwise, if } \min(I_{x-1,y}, I_{x+1,y}) \le I_{x,y} \le \max(I_{x-1,y}, I_{x+1,y}),\\
0 & \text{otherwise.}
\end{cases}$$
Vertical values are computed in a similar way as
$$M(I)^v_{x,y} \triangleq \begin{cases}
0 & \text{if } x \in \{0, 1, X-2, X-1\} \text{ or } y \in \{0, 1, Y-2, Y-1\},\\
1 & \text{otherwise, if } \min(I_{x,y-1}, I_{x,y+1}) \le I_{x,y} \le \max(I_{x,y-1}, I_{x,y+1}),\\
0 & \text{otherwise.}
\end{cases}$$
With this definition, the bidirectional mask of intermediate values is then defined as the mean of the
horizontal and vertical masks by
$$M(I)_{x,y} \triangleq \frac{1}{2}\left(M(I)^h_{x,y} + M(I)^v_{x,y}\right).$$
The mask is therefore null at the border and where a pixel is not an intermediate value in either
direction, equal to 1/2 where the pixel is an intermediate value in exactly one of the two directions,
and equal to 1 when it is an intermediate value both horizontally and vertically. The computation of
the bidirectional mask is detailed in Algorithm 2.


Algorithm 2: Mark intermediate values (bidirectional variant)
function is_intermediate(arr)
    Input arr: Array of size (X, Y), one channel of an image
    Output mask: Array of size (X − 4, Y − 4), intermediate values mask
    mask := 0_(X−4, Y−4)
    for x from 2 to X − 3 and y from 2 to Y − 3 (inclusive) do
        m_h := min(arr_{x−1,y}, arr_{x+1,y})
        M_h := max(arr_{x−1,y}, arr_{x+1,y})
        m_v := min(arr_{x,y−1}, arr_{x,y+1})
        M_v := max(arr_{x,y−1}, arr_{x,y+1})
        if m_h ≤ arr_{x,y} ≤ M_h then
            mask_{x−2,y−2} += 1/2
        if m_v ≤ arr_{x,y} ≤ M_v then
            mask_{x−2,y−2} += 1/2
    return mask
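As with the isotropic mask, the bidirectional variant lends itself to a vectorized formulation. The following NumPy sketch is an illustration under the same indexing assumption (first index x, second index y); the function name `is_intermediate_bidirectional` is ours and is only used to distinguish it from the isotropic version.

```python
import numpy as np


def is_intermediate_bidirectional(arr):
    """Bidirectional intermediate values mask, shape (X - 4, Y - 4), values in {0, 0.5, 1}."""
    arr = np.asarray(arr)
    center = arr[2:-2, 2:-2]
    x_prev, x_next = arr[1:-3, 2:-2], arr[3:-1, 2:-2]  # (x - 1, y), (x + 1, y)
    y_prev, y_next = arr[2:-2, 1:-3], arr[2:-2, 3:-1]  # (x, y - 1), (x, y + 1)
    horizontal = (np.minimum(x_prev, x_next) <= center) & (center <= np.maximum(x_prev, x_next))
    vertical = (np.minimum(y_prev, y_next) <= center) & (center <= np.maximum(y_prev, y_next))
    # Mean of the two directional masks.
    return 0.5 * horizontal + 0.5 * vertical


flat = np.zeros((5, 5))  # intermediate in both directions
peak = flat.copy()
peak[2, 2] = 5           # intermediate in no direction
ridge = peak.copy()
ridge[1, 2] = 10         # intermediate along x only
print(is_intermediate_bidirectional(flat),
      is_intermediate_bidirectional(peak),
      is_intermediate_bidirectional(ridge))
```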

The original isotropic mask and the bidirectional one will be compared in Section 3. For the rest
of this section, we denote by R, G and B the masks of intermediate values obtained on the red,
green and blue channels of the image, respectively. Which of the two methods was used to compute
those masks is irrelevant to the rest of the algorithm.

2.2 Division into Windows
The strategy to ﬁnd forgeries using inconsistencies in the CFA patterns is to ﬁrst ﬁnd in which
pattern the full image has been demosaiced, then to ﬁnd the pattern used in diﬀerent windows of
the image. If the pattern detected in a window is diﬀerent from the one detected for the full image,
then this window is inconsistent with the rest of the image and can be considered as forged.
To improve the precision of detection, we do not simply use adjacent windows, but rather sliding
windows with overlap. The window size W and stride are set as parameters of the algorithm. The
stride determines the number of pixels between the left (or top) border of two consecutive windows,
so that a stride equal to the window size leads to adjacent windows without overlapping, a stride
equal to half the window size leads to a new window starting at the middle of the previous one, etc.
Using a lower stride will not drastically improve the detection, but may help delineate a detected
forgery more precisely, at the cost of a slower algorithm.
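The division into overlapping windows can be sketched with NumPy's `sliding_window_view`. The helper below is hypothetical and only illustrates the role of the window size W and of the stride on a single 2-D mask. It enforces that W is even and that the stride divides W, as required by the method. Keeping the stride even as well, so that every window starts on the same CFA phase, is our own assumption and is only noted in a comment.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def create_sliding_windows(mask, W, stride):
    """Return the windows of a 2-D mask as an array of shape (Xw, Yw, W, W)."""
    if W % 2 != 0:
        raise ValueError("the window size must be even")
    if W % stride != 0:
        raise ValueError("the stride must divide the window size")
    # Assumption: an even stride keeps every window aligned with the
    # 2 x 2 CFA period; this sketch does not enforce it.
    return sliding_window_view(mask, (W, W))[::stride, ::stride]


mask = np.arange(64).reshape(8, 8)
windows = create_sliding_windows(mask, W=4, stride=2)
print(windows.shape)
```

With W = 4 and a stride of 2 on an 8 × 8 mask, each window starts at the middle of the previous one, which gives 3 × 3 windows.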

2.3 Finding the Pattern
The four Bayer patterns can be divided into two subgroups by their diagonal: rggb and bggr share
the ·gg· diagonal, whereas grbg and gbrg share the g··g diagonal (see Figure 5). Because the Bayer
CFA samples twice as many pixels in green as in red or blue, it is easier to find information on the
pattern in the green channel. This is amplified by the fact that many demosaicing algorithms first
interpolate the green channel by itself, but interpolate the red and blue channels using information
from the green channel.

·gg· g··g

rggb bggr grbg gbrg
Figure 5: The four possible sampling patterns can be grouped by the diagonal on which the green channel was sampled:
rggb and bggr share the ·gg· diagonal, whereas grbg and gbrg share the g··g one.

As a consequence, the presented method first tries to detect the diagonal pattern using the green
channel (·gg· or g··g), then uses the red and blue channels to compare the two potential patterns
sharing that diagonal. We denote by R, G and B the masks of intermediate values on the
red, green and blue channels, respectively. (They should not be confused with the R, G, B channels
themselves, which we no longer use in the rest of this paper.) These masks can represent either the
full image or a window of it. To maintain the balance between patterns, the masks must be of even size. For
this reason, the window size must be even, and the last row/column of the full image is removed if
necessary to ensure the evenness of the shape. Here we denote the shape of these masks by (2X, 2Y)
to simplify the notation of the different positions on the CFA. We start by looking at the green channel
for the diagonal grids. The intermediate value count corresponding to the ·gg· pattern is
$$C_{\cdot gg \cdot} \triangleq \sum_{x=0}^{X-1} \sum_{y=0}^{Y-1} \left(G_{2x+1,2y} + G_{2x,2y+1}\right),$$
while the count corresponding to the g··g pattern is
$$C_{g \cdot\cdot g} \triangleq \sum_{x=0}^{X-1} \sum_{y=0}^{Y-1} \left(G_{2x,2y} + G_{2x+1,2y+1}\right).$$
The count difference of the diagonal is then defined as
$$\Delta_{\mathrm{diag}} \triangleq \frac{1}{2XY} \left(C_{\cdot gg \cdot} - C_{g \cdot\cdot g}\right).$$
This difference is positive if the detected diagonal is g··g, and negative if it is ·gg·:
$$D \triangleq \begin{cases}
g{\cdot}{\cdot}g & \text{if } \Delta_{\mathrm{diag}} > 0,\\
{\cdot}gg{\cdot} & \text{if } \Delta_{\mathrm{diag}} < 0,\\
-1 & \text{if } \Delta_{\mathrm{diag}} = 0.
\end{cases}$$
The normalization by $\frac{1}{2XY}$ means that the resulting difference belongs to [−1, 1], and is equal to
±1 if all pixels in one of the patterns are intermediate values, whereas the other pattern has no
intermediate values (XY is the number of 2 × 2 blocks in a mask of shape (2X, 2Y), and we sum
two pixels in each block for each pattern). Note that the ±1 limit is only theoretical: even with
bilinear demosaicing, where all interpolated pixels are intermediate values, sampled pixels can be
intermediate too, for instance where they belong to a slope. As a consequence, the difference will
not reach those values in natural cases.
Once we know the main diagonal, we can compare the two patterns sharing that diagonal. The
green channel does not provide any information on this, so we use the red and blue channels. The
count of intermediate values corresponding to each pattern is
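$$C_{\mathrm{rggb}} \triangleq \sum_{x=0}^{X-1} \sum_{y=0}^{Y-1} \left(R_{2x,2y} + B_{2x+1,2y+1}\right),$$
$$C_{\mathrm{bggr}} \triangleq \sum_{x=0}^{X-1} \sum_{y=0}^{Y-1} \left(R_{2x+1,2y+1} + B_{2x,2y}\right),$$
$$C_{\mathrm{grbg}} \triangleq \sum_{x=0}^{X-1} \sum_{y=0}^{Y-1} \left(R_{2x+1,2y} + B_{2x,2y+1}\right),$$
$$C_{\mathrm{gbrg}} \triangleq \sum_{x=0}^{X-1} \sum_{y=0}^{Y-1} \left(R_{2x,2y+1} + B_{2x+1,2y}\right).$$
The count differences of the two pattern pairs are then defined as
$$\Delta_{\mathrm{rggb-bggr}} \triangleq \frac{1}{2XY} \left(C_{\mathrm{rggb}} - C_{\mathrm{bggr}}\right),$$
$$\Delta_{\mathrm{grbg-gbrg}} \triangleq \frac{1}{2XY} \left(C_{\mathrm{grbg}} - C_{\mathrm{gbrg}}\right),$$
and are then combined into the main grid difference
$$\Delta_{\mathrm{main}} \triangleq \begin{cases}
\Delta_{\mathrm{rggb-bggr}} & \text{if } D = {\cdot}gg{\cdot},\\
\Delta_{\mathrm{grbg-gbrg}} & \text{if } D = g{\cdot}{\cdot}g.
\end{cases}$$
Finally, the main detected grid can be obtained as
$$M \triangleq \begin{cases}
\mathrm{rggb} & \text{if } D = {\cdot}gg{\cdot} \text{ and } \Delta_{\mathrm{main}} < 0,\\
\mathrm{bggr} & \text{if } D = {\cdot}gg{\cdot} \text{ and } \Delta_{\mathrm{main}} > 0,\\
\mathrm{grbg} & \text{if } D = g{\cdot}{\cdot}g \text{ and } \Delta_{\mathrm{main}} < 0,\\
\mathrm{gbrg} & \text{if } D = g{\cdot}{\cdot}g \text{ and } \Delta_{\mathrm{main}} > 0,\\
-1 & \text{otherwise.}
\end{cases}$$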
Both for the diagonal and main grids, if there is strict equality in the two counts detected, no grid
is considered detected. Naturally, if no decision is taken on the diagonal, no main grid is selected
either. The grid detection is detailed in Algorithm 3.
While Δmain is later used to make decisions on forgeries, the two intermediary comparisons
Δrggb−bggr and Δgrbg−gbrg are easier to understand visually, and are thus kept for visualization.
Our implementation of the count diﬀerence computation is slightly diﬀerent from the description
of the original article. In the original article, the difference is not normalized by 1/(2XY). More importantly,
the difference is computed separately in the red and blue channels, and the strongest of the
two is kept, whereas we use their sum. The reason for this is that the original article only tries to
classify in which pattern an image has been sampled, without considering how conﬁdent one can be
in the detection, or how to use it to detect forgeries. When only considering classiﬁcation of an image
or window into the four patterns, both the original article and our implementation provide the same
results. However, adding the normalization and summing the two channels makes it easier for us to
also compute a conﬁdence value for the detections, which will be described in the next subsection.
Finally, we note that even though this algorithm is presented for one window, the grid detection
is obviously performed on all windows simultaneously.

2.4 Forgery Detection
Using the previously-described algorithms, we can compute the intermediate value masks in all
channels, cut them into windows, and detect the diagonal and pattern of the global image and of
each window. With this information, we could simply say that the windows which do not use the
same pattern as the main grid correspond to forged regions. However, doing this creates many false
positives, as the detection is not always correct. As a first approach, if the grid of a window does not
match the global image's grid, we can consider that window as forged with a confidence of |Δmain|
(or |Δdiag| if looking at the diagonals). However, if the threshold is low, isolated detections of a
given grid will be made by mistake. Conversely, in a region with many windows sharing the
same grid, only those above the threshold will be detected, so a high threshold will cause most of
the detections to be missed. In both cases, using a fixed threshold will lead to mistakes that would
be easy to avoid by looking at the map more globally.
We therefore propose to segment the windows into connected components by their grids. In other
words, a connected component is a set of spatially connected windows whose detected pattern is the
same. This segmentation is performed with scikit-image [10]. Components whose detected pattern


Algorithm 3: Find the grid
function find_grid(R, G, B)
    Input R: Array of even size (2X, 2Y), typically as returned by is_intermediate, or a
             sub-window of it, on the red channel
    Input G: Same as above for the green channel
    Input B: Same as above for the blue channel
    Output M: CFA pattern identified by the function (one of rggb, grbg, gbrg, bggr)
    Output D: Diagonal pattern identified by the function (either ·gg· or g··g)
    Output Δmain: Difference of count of intermediate values between the two patterns
                  sharing the same diagonal. Negative if the best pattern is rggb or
                  grbg, positive if the best pattern is bggr or gbrg.
    Output Δdiag: Difference of count of intermediate values between the two diagonal
                  patterns. Negative for ·gg·, positive for g··g.
    Output Δrggb−bggr, Δgrbg−gbrg: Difference of count of intermediate values between two
                  grids sharing the same diagonal.
    # First we select the best diagonal pattern using the green values
    C·gg· := Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} (G_{2x,2y+1} + G_{2x+1,2y})
    Cg··g := Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} (G_{2x,2y} + G_{2x+1,2y+1})
    Δdiag := (C·gg· − Cg··g) / (2XY)
    if Δdiag < 0 then
        D := ·gg·
    else
        D := g··g
    # Compare patterns with the same diagonal
    Crggb := Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} (R_{2x,2y} + B_{2x+1,2y+1})
    Cbggr := Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} (R_{2x+1,2y+1} + B_{2x,2y})
    Cgrbg := Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} (R_{2x+1,2y} + B_{2x,2y+1})
    Cgbrg := Σ_{x=0}^{X−1} Σ_{y=0}^{Y−1} (R_{2x,2y+1} + B_{2x+1,2y})
    Δrggb−bggr := (Crggb − Cbggr) / (2XY)
    Δgrbg−gbrg := (Cgrbg − Cgbrg) / (2XY)
    if D = ·gg· then
        Δmain := Δrggb−bggr
        if Δmain < 0 then
            M := rggb
        else
            M := bggr
    else
        Δmain := Δgrbg−gbrg
        if Δmain < 0 then
            M := grbg
        else
            M := gbrg
    return M, D, Δmain, Δdiag, Δrggb−bggr, Δgrbg−gbrg


is equal to that of the global image are immediately discarded; they are not considered forged as
they agree with the full image. Components whose detected pattern is different are considered
forged, with a confidence value equal to the maximum absolute count difference over
all windows in that component (either |Δmain| or |Δdiag|, depending on whether we are looking at the
full pattern or the diagonal). In other words, the confidence of a component is the confidence of its
most prominent window. The computation of the confidence by connected component is performed
in Algorithm 4.

Algorithm 4: Connected confidence computation
function connected_confidence(G, global_G, Δ)
    Input G: Grid/diagonal detected on each window, shape (X_W, Y_W)
    Input global_G: Grid/diagonal detected on the main image
    Input Δ: Either Δmain or Δdiag
    Output confidence: Confidence that each window is forged, shape (X_W, Y_W)
    labels := label_connected(G, global_G)
    confidence := 0_(X_W, Y_W)
    for label from 0 to max(labels) do
        # ⊙ denotes the Hadamard product
        confidence += (labels = label) ⊙ max((labels = label) ⊙ |Δ|)
    return confidence
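The connected confidence can also be computed by visiting each window only once. The sketch below deliberately replaces the scikit-image labelling with a plain breadth-first flood fill, so that it depends on NumPy only. The use of 4-connectivity and of integer codes for the grids are assumptions made for this illustration, not choices documented in the article.

```python
from collections import deque

import numpy as np


def connected_confidence(grid, global_grid, delta):
    """Give every inconsistent window the largest |delta| of its connected component."""
    grid = np.asarray(grid)
    strength = np.abs(np.asarray(delta, dtype=float))
    confidence = np.zeros(grid.shape, dtype=float)
    # Windows agreeing with the global grid are discarded from the start.
    visited = grid == global_grid
    rows, cols = grid.shape
    for start in zip(*np.nonzero(~visited)):
        if visited[start]:
            continue
        # Flood fill over neighboring windows sharing the same detected grid
        # (4-connectivity is an assumption of this sketch).
        component, queue = [start], deque([start])
        visited[start] = True
        while queue:
            x, y = queue.popleft()
            for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
                if (0 <= nx < rows and 0 <= ny < cols
                        and not visited[nx, ny] and grid[nx, ny] == grid[x, y]):
                    visited[nx, ny] = True
                    component.append((nx, ny))
                    queue.append((nx, ny))
        xs, ys = map(list, zip(*component))
        confidence[xs, ys] = strength[xs, ys].max()
    return confidence


# Hypothetical 3 x 3 map of window grids (integer codes), global grid 0.
grid = [[0, 0, 1], [0, 1, 1], [2, 0, 0]]
delta = [[0.1, 0.1, 0.2], [0.1, -0.3, 0.05], [0.15, 0.1, 0.1]]
confidence = connected_confidence(grid, 0, delta)
print(confidence)
```

In this toy map, the three windows of grid 1 form one component and all receive the confidence 0.3 of its most prominent window, while the isolated window of grid 2 keeps its own value of 0.15.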

We apply this method separately to both the diagonal and the full pattern detection; this yields
two conﬁdence maps, which are then merged by taking their pointwise maximum. Although the full
pattern analysis can encompass the diagonal detection, in many cases the algorithm can only ﬁnd
the diagonal but hesitates on the full pattern. Hence, separating the detections enables the method
to detect signiﬁcant diagonal traces even when the full pattern cannot be detected.
These confidence maps are useful to visualize the detection. However, they do not constitute by
themselves a decision on the detection. Nor can they be used directly as a heat map: the maximal
absolute value of the difference, 1, is never reached in actual cases, and even the most confident
detections will rarely reach a score of 0.3.
To make a final decision on the image, we thus threshold the obtained confidence map by a fixed
threshold γ. This is equivalent to performing hysteresis thresholding with a lower threshold 0 and a
higher threshold γ on each map (M = g) ⊙ |Δmain| for each pattern g except the full image's pattern,
and on (D ≠ d_img) ⊙ |Δdiag|, where d_img is the full image's detected diagonal, ⊙ is the Hadamard product
(pointwise multiplication), and the expression (A = b) is an array of the same shape as A, equal to
1 where A takes the value b and 0 elsewhere.
Finally, all the outputs are resized to have one value per pixel, rather than per window. This
is done with nearest-neighbor interpolation for binary outputs, and with linear interpolation for
continuous outputs. The computation of the forgery map is detailed in Algorithm 5.
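The thresholding and the resizing of binary outputs can be sketched in a few lines. The helper below is hypothetical and covers only the simplest case: it assumes non-overlapping windows (stride equal to W), so that nearest-neighbor resizing amounts to repeating each window value over a W × W block. The threshold value used in the example is arbitrary.

```python
import numpy as np


def threshold_and_upsample(confidence, gamma, W):
    """Threshold a per-window confidence map and expand it to one value per pixel."""
    forged = np.asarray(confidence) > gamma
    # Nearest-neighbor resizing for non-overlapping windows (assumption):
    # each window value is repeated over its W x W block.
    return np.repeat(np.repeat(forged, W, axis=0), W, axis=1)


confidence = np.array([[0.05, 0.30], [0.00, 0.12]])
forged_map = threshold_and_upsample(confidence, gamma=0.1, W=2)
print(forged_map.astype(int))
```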
Overall, the full algorithm can achieve linear complexity in the input size. Indeed, each individual
step is linear, including the connected confidence computation since each block of the image belongs
to at most one component and is thus only processed once. In practice, the vectorized Python
implementation processes all blocks for each component, thus leading to a quadratic worst-case
complexity. Although an optimized computation in another language could offer the optimal worst-case
linear complexity, this is largely irrelevant since the number of inconsistent connected components
usually does not scale linearly with the image size.


Algorithm 5: Global algorithm
function find_forgeries(img, W, stride, γ)
    Input img: Input image, size (X, Y, 3). X and Y must be even (the last row and/or
               column may be cut to ensure this).
    Param W: int, window size
    Param stride: int, distance between the left/top borders of two consecutive windows.
                  Must divide W.
    Param γ: float, higher hysteresis threshold to select relevant inconsistencies.
    Output forged_full: Final map of detected forgeries (pointwise maximum of forged_main
                        and forged_diag)
    Output forged_{main, diag}: Detected forgeries after thresholding, respectively on the
                                full pattern and on the diagonal
    Output confidence_{full, main, diag}: Confidence that each region is a forgery
    Output inconsistent_{full, main, diag}_raw: Binary mask of each region being
                                                inconsistent with the global image, regardless
                                                of significance.
    intermediate := is_intermediate(img)    # applied to each channel
    windows := create_sliding_windows(intermediate, W, stride)
    X_w, Y_w := number of windows per column/row
    # Pattern and diagonal on the global image
    global_M, global_D, _, _, _, _ := find_grid(intermediate[:, :, 0], intermediate[:, :, 1], intermediate[:, :, 2])
    # Pattern and diagonal on each window
    M, D, Δmain, Δdiag := 0_(X_w, Y_w)
    for x from 0 to X_w − 1 and y from 0 to Y_w − 1 do
        M_{x,y}, D_{x,y}, Δmain_{x,y}, Δdiag_{x,y}, _, _ := find_grid(windows_{x,y,0}, windows_{x,y,1}, windows_{x,y,2})
    # Inconsistent regions
    inconsistent_{main, diag}_raw := {M, D} ≠ global_{M, D}
    inconsistent_full_raw := max(inconsistent_diag_raw, inconsistent_main_raw)
    # Connected confidence
    confidence_main := connected_confidence(M, global_M, Δmain)
    confidence_diag := connected_confidence(D, global_D, Δdiag)
    confidence_full := max(confidence_main, confidence_diag)
    # Threshold
    forged_{full, main, diag} := confidence_{full, main, diag} > γ
    return forged_{full, main, diag}, confidence_{full, main, diag}, inconsistent_{full, main, diag}_raw

3 Experiments

To evaluate the ability of this method to detect the CFA pattern correctly, we take 15 images from
the Raise dataset [6], and demosaic them using the 7 algorithms available in LibRaw: Bilinear
interpolation, AAHD, AHD, DCB, DHT, PPG and VNG. Eleven of these images are of size 4948 ×
3280, the other 4 are of size 4310 × 2868. The selected images can be seen in Figure 6.


(a) r002fc3e2t (b) r1ead3024t (c) r1ceba29dt

(d) r0a2ff882t (e) r0a808003t (f) r0a966704t

(g) r0e04cc91t (h) r0ea0825ft (i) r1a0f5585t

(j) r1c9fdcf4t (k) r06aa7dabt (l) r07cfb432t

(m) r07ffdc87t (n) r16da5576t (o) r191f3cdet
Figure 6: These 15 images from the Raise dataset [6] were used in our experiments.


3.1 CFA Pattern Detection
We start by analyzing, at a global scale, whether the method is able to detect the correct pattern
of the 15 images described above. Results can be seen in Table 1. One can see that the algorithm
detects the correct grid in all 15 images when they are demosaiced with bilinear, AHD or DCB
demosaicing. It also works well on PPG- and VNG-demosaiced images, despite a few mistakes in the full
pattern identification (13/15 and 14/15 images correctly identified, respectively). These mistakes are
resolved when using bidirectional filters. When the image is demosaiced with AAHD or DHT, however, the
algorithm fails to detect even the diagonal on almost all images, and consequently also fails on the full
pattern, in both versions of the algorithm.

Demosaicing   Diagonal   Full pattern
AAHD          0/15       0/15
AHD           15/15      15/15
DCB           15/15      15/15
DHT           3/15       3/15
Bilinear      15/15      15/15
PPG           15/15      13/15
VNG           15/15      14/15
(a) Original isotropic intermediate values

Demosaicing   Diagonal   Full pattern
AAHD          0/15       0/15
AHD           15/15      15/15
DCB           15/15      15/15
DHT           2/15       2/15
Bilinear      15/15      15/15
PPG           15/15      15/15
VNG           15/15      15/15
(b) Bidirectional filters for intermediate values
Table 1: Identiﬁcation of the main diagonal and of the full pattern on the 15 images. For each demosaicing algorithm, we
show how many of the 15 images had their diagonal/full pattern correctly detected by the method. In its original version, the
algorithm works very well when the demosaicing is done with AHD, DCB or bilinear demosaicing, with a few errors on the
full pattern against PPG- or VNG-demosaiced images. It fails to detect even the diagonal on AAHD- and DHT-demosaiced
images. Bidirectional ﬁlters for intermediate value computation yield perfect results on PPG and VNG, but still fail against
AAHD- and DHT-demosaiced images.

Looking at the results on Image r0a2ff882t in Figure 7, we can see again that the results
depend on the demosaicing algorithm with which the image was produced. All windows are detected
correctly against DCB and bilinear demosaicing, but the algorithm is confused on the diagonal pattern
in the PPG-demosaiced image, though bidirectional filters for the intermediate value computation partly
alleviate this problem. On the VNG-demosaiced image, there are also false detections on the diagonal
itself. More importantly, the basket of the bike causes errors with AHD, PPG and VNG demosaicing.
This was to be expected with a periodic structure that fools the detection. The result might easily
be misinterpreted as a forgery. As can be seen in Figure 8, however, using bidirectional filters yields
a very low relative confidence for the identification of the basket's grid compared to the rest of the
image. As a consequence, a reasonable thresholding level should still enable one to automatically
discard this false detection.
Most images that are found on the web are JPEG-compressed. It is thus vital to test the
robustness of this algorithm to JPEG compression. JPEG compression quickly discards the highest
frequencies, at which CFA artifacts are located. As a consequence, it would be illusory to expect
results on heavily-compressed images. However, being able to detect the CFA pattern on
low-compression images extends the application range of a CFA grid detection method. We show results
after JPEG compression in Figures 9 and 10. JPEG compression is done with the Pillow library³.
On the two studied images, we can see that even the highest-quality compression of 100 causes many
errors in the pattern detection, though the algorithm remains largely usable, especially when only
³ Alex Clark, Pillow (PIL fork) documentation, https://buildmedia.readthedocs.org/media/pdf/pillow/latest/pillow.pdf


[Figure 7 panels: original image (r0a2ff882t), in the rggb pattern; one column per demosaicing algorithm (AAHD, AHD,
DCB, DHT, Bilinear, PPG, VNG); one row per intermediate value mask (original isotropic, bidirectional).]

Figure 7: Results of the method on 64 × 64 windows, both with the original isotropic intermediate value mask and the proposed
bidirectional one, on one image with the 7 different demosaicing algorithms. Both methods work perfectly on the DCB-
and bilinear-demosaiced images. With the AHD, PPG and VNG methods, both the original isotropic and the bidirectional
filters have trouble discerning between the two patterns sharing the same diagonal, but the bidirectional detection makes
fewer mistakes. Periodically textured regions like the basket can create a localized shift in the detected mosaic, which could
be mistaken for a forgery. With the AAHD and DHT algorithms, the method consistently detects the wrong diagonal.

(a) Isotropic (b) Bidirectional
Figure 8: This ﬁgure shows, on the AHD-demosaiced bicycle image, the diﬀerence of counts of intermediate values cor-
responding to the rggb and bggr patterns, on the red and blue channels. This count is what is used by the algorithm
to decide on a grid. A negative diﬀerence corresponds to the correct rggb pattern, a positive diﬀerence to the incorrect
bggr pattern. The diﬀerence is normalized by dividing it by the size of the block (64 × 64). The texture in the basket
area leads to a locally consistent shift in the position of the intermediate values. The error is slightly less prominent when
a bidirectional mask is used, but is still consistently in favor of the wrong grid.


looking at the diagonal. A quality factor of 100 does not specifically remove the high frequencies;
however, the discretization in the frequency domain already causes a loss of information. At JPEG
quality 98, the algorithm no longer detects the correct pattern, except in the easier case of the bilinear
demosaicing algorithm. However, it can still detect the diagonal of most windows, albeit with a few
errors. Finally, at JPEG quality 95, the algorithm is unable to ﬁnd anything.
All in all, JPEG compression remains the biggest limitation of this method, and of CFA detection
in general.
In Figures 11, 12 and 13, we evaluate the robustness of the method to additive white Gaussian
noise (AWGN). Because AWGN is not spatially correlated, it remains possible to detect the pattern
in most cases with a noise of standard deviation σ = 5 (on [0, 255]-ranged images). More localized
errors are made as the noise level increases, but thanks to the lack of spatial correlation of the
noise (and consequently of the errors), the risk of mistakenly interpreting these as forgeries remains
relatively low. Finally, we can see in Figure 13 that detecting the pattern over AWGN is made easier
by using a larger window size, which averages the noise while keeping the artifacts. Of course, this
comes at the price of potentially missing smaller forgeries.
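As an illustration of this noise experiment, the following minimal sketch adds AWGN of standard deviation σ to an image stored as a list of rows with values in [0, 255]. The function name `add_awgn`, the rounding and clipping back to the 8-bit range, and the fixed seed are assumptions made for this example, not details taken from the reviewed source code.

```python
import random


def add_awgn(image, sigma, seed=0):
    """Return a noisy copy of `image` (list of rows of values in [0, 255]).

    Each pixel receives an independent Gaussian sample of standard
    deviation `sigma`; the result is rounded and clipped to [0, 255]
    (an assumption of this sketch).
    """
    rng = random.Random(seed)
    noisy = []
    for row in image:
        noisy_row = []
        for value in row:
            sample = value + rng.gauss(0.0, sigma)
            noisy_row.append(min(255, max(0, int(round(sample)))))
        noisy.append(noisy_row)
    return noisy


flat = [[128] * 64 for _ in range(64)]    # constant 64 x 64 test window
noisy = add_awgn(flat, sigma=5.0)
mean = sum(map(sum, noisy)) / (64 * 64)   # stays close to 128
```

Because the samples are independent, their average over a window stays close to the noiseless value, which is consistent with larger windows being more robust to this kind of noise.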
Median ﬁltering has often been proposed as a counter-forensics measure to hide forgeries. Al-
though it can be easily detected [8], we evaluate the robustness of the presented method to median
filtering in Figure 14, using a median filter of footprint $\begin{pmatrix} 0 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 0 \end{pmatrix}$. We can see that the results of the
method are completely inverted, because median ﬁltering shifts the intermediate values. As a conse-
quence, images on which the correct diagonal was found before ﬁltering now yield wrong detection,
whereas the method ﬁnds the correct pattern on AAHD- and DHT-demosaiced images, where it was
failing without median ﬁltering. Figure 15 explains this phenomenon with a toy example. Without
further elaborating, we note that only the diagonal detection is affected. As a consequence, if median
filtering has been detected, detections can be made correct again by simply inverting the detected
diagonal.


[Figure 9 panels: columns AAHD, AHD, DCB, DHT, Bilinear, PPG and VNG; rows Uncompressed, JPEG 100, JPEG 98 and JPEG 95, each with the bidirectional and original (isotropic) masks.]

Figure 9: Detection of the method after JPEG compression. Results are shown on Image r07ffdc87t, in the rggb pattern,
uncompressed and submitted to JPEG compression of quality 100, 98 and 95. At JPEG quality 100 (the highest possible),
although the correct pattern is usually found in most blocks of the image, errors between the two dual patterns start to
appear. At JPEG quality 98, the method remains globally able to detect the main diagonal, but cannot distinguish the
dual patterns anymore. At JPEG quality 95, the algorithm is unable to do any detection, even against bilinear demosaicing.
Bidirectional intermediates provide a small boost to JPEG robustness, though it is not enough to make the algorithm reliable
to use on JPEG-compressed images.


[Figure 10 panels: columns AAHD, AHD, DCB, DHT, Bilinear, PPG and VNG; rows Uncompressed, JPEG 100, JPEG 98 and JPEG 95, each with the bidirectional and original (isotropic) masks.]

Figure 10: Detection of the method after JPEG compression. Results are shown on Image r0ea0825ft, in the grbg pattern,
uncompressed and submitted to JPEG compression of quality 100, 98 and 95. On this image, which is more difficult to
analyze than the one in Figure 9, errors are already present in the uncompressed image, and the diagonal is also locally wrong
on the stairs against VNG demosaicing, especially with the original isotropic intermediate values. These errors become more
prominent against other demosaicing methods as well at JPEG quality 100 (highest possible), and detection becomes barely
possible. At JPEG quality 98, contrary to Figure 9, detection is mostly impossible, although the diagonal can still be found
with local mistakes against bilinear and DCB demosaicing if bidirectional filters are used. Again, bidirectional intermediates
provide a consistent, although small, boost to JPEG robustness.


[Figure 11 panels: columns AAHD, AHD, DCB, DHT, Bilinear, PPG and VNG; rows Uncompressed, Noisy σ = 5 and Noisy σ = 10, each with the bidirectional and original (isotropic) masks.]

Figure 11: Robustness of the method to additive white Gaussian noise (AWGN), which can be added to images either for
aesthetic reasons or to maliciously hide manipulations. Image r07cfb432t, in the rggb pattern. We show results against
noise of standard deviation from 0 (noiseless) to 10, with window size 64 × 64. Because the noise is independent of the image,
it does not create locally coherent errors, which would be hard to distinguish from forgeries. However, the probabilities of a
sampled or interpolated pixel being an intermediate value get closer to one another as more noise is added, making the
detection harder.


[Figure 12 panels: columns AAHD, AHD, DCB, DHT, Bilinear, PPG and VNG; rows Uncompressed, Noisy σ = 5 and Noisy σ = 10, each with the bidirectional and original (isotropic) masks.]

Figure 12: Robustness of the method to additive white Gaussian noise (AWGN), which can be added to images either for
aesthetic reasons or to maliciously hide manipulations. Image r1c9fdcf4t, in the rggb pattern. We show results against
noise of standard deviation from 0 (noiseless) to 10, with window size 64 × 64. Because the noise is independent of the
image, it does not create locally coherent errors, which would be hard to distinguish from forgeries. However, the probabilities
of a sampled or interpolated pixel being an intermediate value get closer to one another as more noise is added, making the
detection harder.


[Figure 13 panels: columns AAHD, AHD, DCB, DHT, Bilinear, PPG and VNG; rows σ = 5 with W = 64, W = 128 and W = 256, each with the bidirectional and original (isotropic) masks.]

Figure 13: Robustness of the method to additive white Gaussian noise (AWGN), that can be added to images either for
aesthetic reasons or to maliciously hide manipulations. Image r1c9fdcf4t, in the rggb pattern. Noise standard deviation
5, with window sizes 64 × 64, 128 × 128 and 256 × 256. Because the noise is independent of the image and not spatially
correlated, using bigger windows improves the robustness to it by providing more samples (at the cost of potentially missing
smaller forgeries).


[Figure 14 panels, for each image: columns AAHD, AHD, DCB, DHT, Bilinear, PPG and VNG; rows Base image and Median filter, each with the bidirectional and original (isotropic) masks.]
(a) Image r0e04cc91t, in the rggb pattern.
(b) Image r1a0f5585t, in the rggb pattern.
Figure 14: Results of the method on 64 × 64 blocks on two images, unprocessed and median-filtered. The median filter is
often used as a simple attack to hide forgeries by removing camera traces. It shifts the intermediate values on the green
channel, thus confusing the algorithm on the diagonal pattern. Consequently, with the AAHD and DHT algorithms, which
already shift the green channel intermediate values into the sampled pixels, the algorithm makes a better detection after
median filtering than on the unprocessed image.


139 240 154 16 94 56 72 20
92 131 168 76 72 94 24 43
85 24 100 48 102 224 130 72
60 107 160 68 64 122 200 153
92 184 125 0 50 0 133 108
52 155 156 76 136 117 224 127
146 228 111 12 110 108 107 44
56 114 48 90 184 141 52 90
(a) Original array

139 139 168 94 72 94 56 43
131 131 131 72 94 72 72 43
85 100 100 76 72 122 130 123
92 107 107 64 68 122 133 153
72 125 156 68 50 117 133 133
115 156 125 76 110 117 127 127
146 146 111 90 110 110 107 100
114 114 111 90 141 141 107 90
(b) After median ﬁltering


Figure 15: Values on an array, before and after median ﬁltering. The array corresponds to the green channel of an
image in the ·gg· position demosaiced with bilinear interpolation. Red values correspond to positions where the value was
interpolated. Highlighted cells correspond to pixels that take an intermediate value. Notice the shift of intermediate values
from one diagonal to another: originally, almost all (18 out of the 20) intermediate values are found in the interpolated
pixels, but after median ﬁltering, most (16 out of 29) are located in the sampled (black) position.
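The toy example of Figure 15 can be reproduced with a short sketch. It assumes that a value is intermediate when it lies strictly between the minimum and the maximum of its four direct neighbors, and that in the ·gg· position the interpolated green pixels are those where row + column is even; both are assumptions of this example, chosen because they reproduce the 18 out of 20 count on the original array. Border pixels are left out, since their outside neighbors are not given in the figure. The counts after filtering depend on the exact definition and border handling, so the sketch only checks that the majority moves to the sampled positions.

```python
from statistics import median

# Toy arrays of Figure 15: green channel before (a) and after (b) median filtering.
ORIGINAL = [
    [139, 240, 154, 16, 94, 56, 72, 20],
    [92, 131, 168, 76, 72, 94, 24, 43],
    [85, 24, 100, 48, 102, 224, 130, 72],
    [60, 107, 160, 68, 64, 122, 200, 153],
    [92, 184, 125, 0, 50, 0, 133, 108],
    [52, 155, 156, 76, 136, 117, 224, 127],
    [146, 228, 111, 12, 110, 108, 107, 44],
    [56, 114, 48, 90, 184, 141, 52, 90],
]
FILTERED = [
    [139, 139, 168, 94, 72, 94, 56, 43],
    [131, 131, 131, 72, 94, 72, 72, 43],
    [85, 100, 100, 76, 72, 122, 130, 123],
    [92, 107, 107, 64, 68, 122, 133, 153],
    [72, 125, 156, 68, 50, 117, 133, 133],
    [115, 156, 125, 76, 110, 117, 127, 127],
    [146, 146, 111, 90, 110, 110, 107, 100],
    [114, 114, 111, 90, 141, 141, 107, 90],
]


def cross_neighbors(a, i, j):
    """Values of the four direct neighbors (up, down, left, right)."""
    return [a[i - 1][j], a[i + 1][j], a[i][j - 1], a[i][j + 1]]


def median_filter_interior(a):
    """Cross-footprint median filter; border pixels are copied unchanged."""
    out = [row[:] for row in a]
    for i in range(1, len(a) - 1):
        for j in range(1, len(a[0]) - 1):
            out[i][j] = median(cross_neighbors(a, i, j) + [a[i][j]])
    return out


def count_intermediate(a):
    """Count interior intermediate values at interpolated and sampled positions."""
    counts = {"interpolated": 0, "sampled": 0}
    for i in range(1, len(a) - 1):
        for j in range(1, len(a[0]) - 1):
            neighbors = cross_neighbors(a, i, j)
            if min(neighbors) < a[i][j] < max(neighbors):
                key = "interpolated" if (i + j) % 2 == 0 else "sampled"
                counts[key] += 1
    return counts


recomputed = median_filter_interior(ORIGINAL)
interior_matches = all(
    recomputed[i][j] == FILTERED[i][j] for i in range(1, 7) for j in range(1, 7)
)
before = count_intermediate(ORIGINAL)   # {'interpolated': 18, 'sampled': 2}
after = count_intermediate(FILTERED)    # majority now on the sampled positions
```

On the interior of the array, the recomputed median matches panel (b) of the figure, which supports the cross-shaped footprint used in this experiment.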

3.2 Image Forgery Detection
The ultimate goal of the method is to ﬁnd mosaic inconsistencies in an image. We use forgeries from
the Trace database [2] to evaluate the method. The Trace database consists of 1000 images
taken from the Raise dataset. Two forgery masks are made for each image: the endomask, obtained
by taking a random object from the image’s automatic segmentation, and the exomask, which is
simply the endomask of another image of the set and thus does not correlate with the contents of the
image. The concept of the database is to process the image with two different pipelines, and merge
them with one of the forgery masks. Of the six datasets that are proposed, two are of interest to us:
• in the CFA Grid dataset, the two pipelines are the same, but the pattern of demosaicing changes
(the algorithm is the same). The forgery thus has a diﬀerent CFA pattern than the rest of the
image.
• in the CFA Algorithm dataset, the two pipelines are the same, but the algorithm of demosaicing
changes. A new CFA pattern is also chosen at random for the forged region, with a 1/4 chance
of being the same as the original image’s.
For the quantitative experiments, we use the CFA grid with exomasks dataset. For the qualitative
experiments, we use samples from both the CFA grid and CFA algorithm datasets. Unless otherwise
speciﬁed, quantitative experiments are done with the Matthews Correlation Coeﬃcient (MCC) [9].
This metric varies from -1 for a detection that is complementary to the ground truth, to 1 for a
perfect detection. A score of 0 represents an uninformative result and is the expected performance
of any random classiﬁer. The MCC is more representative than the F1 and IoU scores [3, 4], partly
because it is less dependent on the proportion of positives in the ground truth, which is especially
important given the large variety of forgery mask sizes in the database. It is deﬁned by

$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP) \cdot (TP + FN) \cdot (TN + FP) \cdot (TN + FN)}},$$

where TP, FP, TN and FN respectively represent the numbers of true positives, false positives,
true negatives and false negatives. The score is computed for each image, and
then averaged over each dataset. As most surveyed methods do not provide a binary output but a
continuous heatmap, we weight the confusion matrix using the heatmap. For several results, we also
provide the Intersection over Union (IoU), the F1 score and the Precision and Recall. Quantitative
experimental results can be found in Table 2.
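The metrics above can be sketched as follows. How exactly the heatmap weights the confusion matrix is not detailed here, so this example assumes that a pixel with score h in [0, 1] counts as h positive detections and 1 − h negative detections. The function names and the convention of returning 0 for undefined ratios are also assumptions of this sketch.

```python
import math


def weighted_confusion(heatmap, mask):
    """Soft confusion matrix from a heatmap in [0, 1] and a binary mask.

    Assumption of this sketch: a pixel with score h counts as h positive
    detections and (1 - h) negative detections.
    """
    tp = fp = tn = fn = 0.0
    for h_row, m_row in zip(heatmap, mask):
        for h, m in zip(h_row, m_row):
            if m:
                tp += h
                fn += 1.0 - h
            else:
                fp += h
                tn += 1.0 - h
    return tp, fp, tn, fn


def scores(tp, fp, tn, fn):
    """MCC, IoU, F1, precision and recall; undefined ratios are set to 0."""
    def ratio(num, den):
        return num / den if den > 0 else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
        "iou": ratio(tp, tp + fp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


mask = [[0, 0, 1, 1], [0, 0, 1, 1]]
perfect = scores(*weighted_confusion(mask, mask))
inverted = scores(*weighted_confusion([[1 - m for m in row] for row in mask], mask))
uninformative = scores(*weighted_confusion([[0.5] * 4, [0.5] * 4], mask))
```

With these toy inputs, the perfect detection reaches an MCC of 1, the complementary detection reaches -1, and the constant heatmap yields 0, matching the interpretation of the metric given above.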


Method                   MCC    IoU    F1
Isotropic, raw           0.518  0.490  0.573
Isotropic, γ = 0.1       0.592  0.567  0.622
Bidirectional, raw       0.543  0.515  0.595
Bidirectional, γ = 0.1   0.610  0.584  0.642
Bammey [1]               0.682  0.617  0.702
(a) Results with isotropic and bidirectional intermediate values, raw and with connected confidence both continuous
and thresholded at γ = 0.2, compared with Bammey [1]. Both the presented method and Bammey are used on 32 × 32
windows.

Map         All images  Same diagonal  Different diagonal
Main grid   0.476       0.503          0.461
Diagonal    0.429       0.000          0.671
Combined    0.610       0.501          0.673
(b) Influence of using only main grid inconsistencies, diagonal inconsistencies and their combination (pointwise
maximum of the two detection maps), on the full database, and when only looking at images whose authentic and forged
parts share/do not share the same diagonal. The diagonal is shared in 364 out of the 1000 images of the dataset. By
combining the two maps, the obtained results are almost as good as the diagonal map when the forgeries don’t share
their diagonal, almost as good as the full map when they do share the same diagonal (and thus cannot be detected
by their diagonal), and overall much better than any of the two maps used alone.

Algorithm      All    AAHD   AHD    DCB    DHT    Bilinear  PPG    VNG
#Images        1000   126    138    133    155    154       147    147
Isotropic      0.592  0.372  0.696  0.786  0.305  0.742     0.590  0.657
Bidirectional  0.610  0.375  0.755  0.763  0.350  0.649     0.766  0.613
(c) Results of the presented method depending on how the image was demosaiced. The method is used with bidirectional
filters, on 64 × 64 windows, with hysteresis thresholding and combining the main grid and diagonal inconsistencies. Even
though the method finds the wrong diagonal with the AAHD and DHT algorithms, it is consistent in doing so, and can
thus still detect some forgeries, though not as well as against other demosaicing algorithms.

Threshold  MCC    IoU    F1     Precision  Recall
γ = 0.05   0.543  0.518  0.590  0.570      0.733
γ = 0.1    0.610  0.584  0.642  0.670      0.650
γ = 0.15   0.531  0.513  0.558  0.600      0.535
γ = 0.2    0.382  0.371  0.400  0.433      0.382
(d) Results with different metrics, raw and with confidence thresholding. The method is used with bidirectional filters,
on 64 × 64 windows and combining the main and diagonal inconsistencies. Even though thresholding slightly lowers the
recall, its gain in precision is much larger, thus yielding better MCC, IoU and F1 scores.

Window size  MCC
16           0.592
32           0.610
64           0.412
128          0.163
(e) Results with different window sizes. The method is used with bidirectional filters, continuous normalization
at γ = 0.2, 32 × 32 windows.

Table 2: Quantitative experiments on the Trace database [2]. Where parameters are not specified, these are used:
bidirectional filters, continuous normalization at γ = 0.2, 32 × 32 windows, combined results of the full pattern and diagonal
maps.


In Table 2a, we can see that using bidirectional ﬁlters slightly improves the overall results. This
corroborates the visual results of the previous subsection. Thresholding not only makes the output
of the method easier to interpret, it also significantly improves the scores. Although
this method does not work as well as Bammey [1], we note that it is also much simpler and more
interpretable.
Table 2b shows results from taking only the results of the diagonal detection, the full pattern
or their combination by pointwise maximum. Merging the two maps by pointwise
maximum proves effective: it performs almost as well as the diagonal map on forgeries which do not
share their diagonal, almost as well as the full map on forgeries that do share their diagonal (and
are thus invisible in the diagonal map), and thus performs much better on the overall database than
any of the maps taken separately.
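The merging step itself is a one-liner. The sketch below applies it to two small maps of per-window detection scores; the function name `combine_maps` and the toy values are hypothetical and only serve the illustration.

```python
def combine_maps(grid_map, diag_map):
    """Pointwise maximum of two detection maps of identical shape."""
    return [
        [max(g, d) for g, d in zip(g_row, d_row)]
        for g_row, d_row in zip(grid_map, diag_map)
    ]


# Hypothetical 2 x 3 maps of per-window detection scores.
grid_map = [[0.0, 0.8, 0.1], [0.0, 0.0, 0.2]]
diag_map = [[0.0, 0.0, 0.7], [0.0, 0.0, 0.9]]
combined = combine_maps(grid_map, diag_map)
# combined == [[0.0, 0.8, 0.7], [0.0, 0.0, 0.9]]
```

A window is thus kept as soon as either map flags it, so a forgery that is invisible in the diagonal map can still be recovered from the full-pattern map, and conversely.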
In Table 2c, we present the scores of the method depending on the demosaicing used to process
the image. Unsurprisingly, the method does not work well on AAHD- and DHT-demosaiced images,
as we saw in the previous subsection. However, because it is consistent in detecting the diagonal, it
can still be used to see that two AAHD- or DHT-demosaiced regions use a diﬀerent diagonal. This
will be explored further below. Bidirectional ﬁlters work better than the original isotropic ﬁlters in
most cases. The biggest gaps occur with PPG4 and AHD [7] demosaicing, which explicitly interpolate
in the smoothest direction. On the other hand, isotropic ﬁlters work better with simpler demosaicing
methods such as bilinear demosaicing, which does not try to ﬁnd a better direction for interpolation.
We examine the inﬂuence of the threshold in Table 2d. As fewer windows get detected, a higher
threshold systematically means that the recall is lower. However, a higher threshold does not nec-
essarily improve the precision; the best precision (and best score overall) is achieved with a 0.1
threshold, and higher thresholds yield a lower precision. This can be explained by the fact that the
most conﬁdent detections often correspond to textured areas, where intermediate values are created
by the texture more than the demosaicing, and are thus a source of false positives.
Finally, Table 2e shows the scores with diﬀerent window sizes. While a window size of 32 × 32
yields the best results, this is inherently tied to the database in question. We saw earlier that
increasing the window size would often lead to a better grid identiﬁcation, but this also comes at the
cost of missing small forgeries, and also failing to identify the borders of the bigger ones.
In Figure 16, we investigate the importance of the thresholding. Inconsistent false detections
are usually not found in large connected regions of the same detected grid. As a consequence, even
if some of those detections were to be signiﬁcant, they would not cause a large number of false
detections. On the other hand, a large connected region of windows that share the same inconsistent
grid has a high chance of being actually forged. A confident result on one window of that region is
thus enough to detect the whole region. Of course, this threshold is not fully automatic: it must still be set by the user, and
will not ﬁlter out zones that are strongly detected for reasons other than a forgery, such as saturation
or textured areas.
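A minimal sketch of this thresholding is given below. Windows are grouped into connected components that share the same detected grid, each component takes the maximal confidence of its windows, and components inconsistent with the main grid are kept when that confidence reaches γ. The 4-connectivity, the use of "greater than or equal" for the threshold, the function name and the toy data are assumptions of this example.

```python
from collections import deque


def threshold_components(grids, confidences, main_grid, gamma):
    """Binary map of windows kept as detections.

    Windows are grouped into 4-connected components sharing the same
    detected grid (the connectivity is an assumption of this sketch).
    A component whose grid differs from `main_grid` is kept when the
    maximal confidence of its windows reaches `gamma`.
    """
    rows, cols = len(grids), len(grids[0])
    seen = [[False] * cols for _ in range(rows)]
    output = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            # Breadth-first search over windows with the same detected grid.
            seen[r][c] = True
            component, queue = [], deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols and not seen[ny][nx]
                            and grids[ny][nx] == grids[y][x]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(confidences[y][x] for y, x in component)
            if grids[r][c] != main_grid and best >= gamma:
                for y, x in component:
                    output[y][x] = True
    return output


grids = [
    ["rggb", "rggb", "bggr", "rggb"],
    ["rggb", "grbg", "grbg", "rggb"],
    ["rggb", "grbg", "grbg", "rggb"],
]
confidences = [
    [0.30, 0.20, 0.04, 0.25],
    [0.22, 0.03, 0.15, 0.28],
    [0.27, 0.02, 0.05, 0.31],
]
detected = threshold_components(grids, confidences, main_grid="rggb", gamma=0.1)
```

In this toy case, the isolated bggr window is discarded because its confidence is below the threshold, whereas the whole 2 × 2 grbg block is kept thanks to its single confident window.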
The image in Figure 16b is interesting: as it was demosaiced with AAHD, its diagonal is not
detected correctly (and no conﬁdent detection is thus made in Δgrid). However, because the two
regions do not share their diagonal, the inconsistency is still detected. The region demosaiced in the
grbg pattern is detected as being in the ·gg· diagonal, whereas the region demosaiced in the bggr
pattern is detected as g··g: even though those predictions are wrong, they are still inconsistent with one
another.
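The reasoning can be checked with a small lookup. The sketch below names each diagonal after the positions of the green samples in the four-letter pattern string, and models the AAHD/DHT behavior as a plain swap of the reported diagonal; the helper names and the `inverted` flag are hypothetical.

```python
# Green positions of each 2 x 2 Bayer pattern, named as in the text:
# "·gg·" when green sits on the anti-diagonal, "g··g" on the main diagonal.
DIAGONAL = {"rggb": "·gg·", "bggr": "·gg·", "grbg": "g··g", "gbrg": "g··g"}


def detected_diagonal(pattern, inverted=False):
    """Diagonal reported for `pattern`; `inverted` models the AAHD/DHT swap."""
    diagonal = DIAGONAL[pattern]
    if inverted:
        diagonal = "g··g" if diagonal == "·gg·" else "·gg·"
    return diagonal


def diagonals_differ(pattern_a, pattern_b, inverted=False):
    """True when the two regions are reported with different diagonals."""
    return (detected_diagonal(pattern_a, inverted)
            != detected_diagonal(pattern_b, inverted))


# Figure 16b: grbg (authentic) and bggr (forged), both AAHD-demosaiced.
still_visible = diagonals_differ("grbg", "bggr", inverted=True)
# Two patterns sharing a diagonal (gbrg and grbg) stay indistinguishable
# in the diagonal map, inverted or not.
same_diagonal = not diagonals_differ("gbrg", "grbg", inverted=True)
```

Since the swap is applied to both regions, two different diagonals remain different after inversion, which is why the inconsistency survives a consistently wrong prediction.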
We further explore this phenomenon in Figure 17. As long as the two regions of an AAHD-
demosaiced image do not share the same diagonal, they can still be detected. However, if the
diagonal is wrong, then the algorithm will not look at the correct diﬀerence map to select the full
4 A description of Patterned Pixel Grouping (PPG) by its author, Chuan-Kai Lin, can be found at
https://web.archive.org/web/20160923211135/https://sites.google.com/site/chklin/demosaic/, archived from the
original page on 23rd September 2016.


[Figure 16 panels, for each image: Image, Ground truth, Δdiag, Δgrid, Confidence map, Raw inconsistencies, and outputs at γ = 0.05, γ = 0.1, γ = 0.15 and γ = 0.2.]
(a) Image r06888d38t of the CFA Grid dataset with endomask, demosaiced with the VNG algorithm. The authentic region
was demosaiced in the grbg pattern, the forged region in the gbrg pattern.
(b) Image r1594b4b3t of the CFA Grid dataset with exomask, demosaiced with the AAHD algorithm. The authentic region
was demosaiced in the grbg pattern, the forged region in the bggr pattern.
Figure 16: Inﬂuence of the threshold on the removal of inconsistent false positives. Less-conﬁdent regions of the correct
detection are kept, as long as they are connected to a more conﬁdent window, whereas inconsistent false detections are not
connected to a large region: even if a few of them are conﬁdently detected, most will be ﬁltered out. Raw inconsistencies
highlight every block whose detected pattern is inconsistent with the main image, regardless of signiﬁcance.

pattern. As a consequence, it will be unable to make a consistent decision, and will thus not detect
two different patterns that share the same diagonal. Nevertheless, the unused Δ map still shows
clear traces of the forgery. If one suspects that the diagonal is reversed (for instance because it
is detected confidently over the whole image, whereas the full pattern is inconsistent), inverting the
results of the diagonal, or visually examining the unused Δ map, can thus still reveal the forgery.

In the case of an image whose authentic and forged regions come from different algorithms, one of which is
AAHD or DHT, the diagonal inversion means that if the two regions share the same diagonal (or
even the very same pattern), the forgery could be detected, albeit for the wrong reason. However,
this also means that if the two regions do not share the same diagonal (an inconsistency which is
usually easier to find, as seen in Table 2b), the forgery then becomes invisible. This problem can
be seen in Figure 18.


[Figure 17 panels, for each image: Image, Ground truth, Detected forgeries, Detected diagonal, Δdiag, Δrggb−bggr, Detected grid, Δgrid and Δgrbg−gbrg.]
(a) Image r1d53fccat of the CFA Grid dataset, with endomask. Both regions are demosaiced with the AAHD algorithm,
the authentic region in the gbrg pattern and the forged region in the grbg pattern. Because the method wrongly detects
the ·gg· diagonal over the whole image, it only compares the two patterns sharing that diagonal, and Δgrid = Δrggb−bggr on almost
all the image. As a consequence, no detection can be made on the grid level, and the forgery is not detected. Nevertheless,
it appears clearly on Δgrbg−gbrg, which is not used in the algorithm.
(b) Image r15919202t of the CFA Grid dataset, with endomask. Both regions are demosaiced with the AAHD algorithm,
the authentic region in the grbg pattern, the forged region in the bggr pattern. Although the method finds the wrong
diagonal in both regions, it still finds that the two regions use a different diagonal.
Figure 17: On those two AAHD-demosaiced images, the method ﬁnds the wrong diagonal. On the second image, it can
still detect that the two regions use a diﬀerent diagonal, and detects the forgery. On the ﬁrst image, however, the two
regions share the same diagonal. Using the diagonal as a basis to detect the full pattern, the method is unable to make any
consistent detections on the full pattern when the diagonal is wrong, and thus does not ﬁnd the forgery even though it is
clearly visible when the correct diagonal is used. This shows the limits of using the detected diagonal as a strict basis for
the full detection.


(a) Input image (b) Ground truth (c) Detected inconsistencies (d) Normalized difference between the two diagonal patterns
Figure 18: Image r040b3002t of the CFA Algorithm dataset, with exomask. The authentic region is demosaiced with
the AAHD algorithm in the grbg pattern, the forged region is demosaiced with the DCB algorithm in the bggr pattern.
Because the method consistently finds the wrong diagonal on AAHD-demosaiced images, but detects the correct diagonal
on DCB-demosaiced images, it believes that the two regions share the same diagonal, even though they do not.

4 Conclusion

The presented method counts the number of locally intermediate values corresponding to each po-
tential demosaicing pattern. The correct pattern, which contains all the originally-sampled pixels, is
expected to have fewer intermediate values than the other patterns. The method starts by detecting
which diagonal is used with the green channel, then uses the red and blue channels to distinguish
between the two patterns sharing the detected diagonal. We proposed a different way of computing
intermediate values, which yields slightly better results by exploiting the fact that demosaicing
algorithms often interpolate in only one direction.
Doing this both on the full image and in local windows, we are able to detect locally inconsistent
windows. The diﬀerence between the two compared counts of intermediate values in a window is
normalized to serve as a conﬁdence measure for the detection of this window. Those windows are
then grouped into connected components of windows that share the same detected grid, and the
conﬁdence of a component is set as the maximal conﬁdence of its windows.
This conﬁdence map enables easy visualization of the detections and their signiﬁcance, and thresh-
olding it ﬁlters out most of the incoherent false detections, while keeping the consistent detections.
However, the threshold must be set manually.
Of the seven tested demosaicing algorithms, two cause an inversion of the detected diagonal.
In the green channel, AAHD and DHT actually cause more intermediate values to appear on the
sampled pixels.
The main limitation of this method, and of CFA forgery detection in general, is its low robustness
to JPEG compression. With this method, detections are already more diﬃcult at a quality factor
of 100, and become impossible at quality 95, eﬀectively limiting its applicability range. We also saw
that counter-forensic attacks based on the addition of white noise are quite effective. The classic
median filter attack is instead relatively ineffective, as it ends up inverting the interpolation masks,
and the diagonal can thus still be detected.

Acknowledgment

Work partly funded by the French Ministère des Armées – Direction Générale de l’Armement, and
by grant ANR-16-DEFA-0004 Signature d’Images – ANR/DGA DEFALS challenge.


Image Credits

Raise dataset [6], processed with libraw 5

Trace dataset [2]

References
[1] Q. Bammey, R. Grompone von Gioi, and J-M. Morel, An adaptive neural network for
unsupervised mosaic consistency analysis in image forensics, in Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
https://doi.org/10.1109/CVPR42600.2020.01420.
[2] Q. Bammey, T. Nikoukhah, M. Gardella, R. Grompone von Gioi, M. Colom, and
J-M. Morel, Non-Semantic Evaluation of Image Forensics Tools: Methodology and Database,
in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision,
January 2022. https://arxiv.org/abs/2105.02700.
[3] D. Chicco, Ten quick tips for machine learning in computational biology, BioData Mining, 10
(2017), pp. 35–35. https://doi.org/10.1186/s13040-017-0155-3.
[4] D. Chicco and G. Jurman, The advantages of the Matthews correlation coefficient (MCC)
over F1 score and accuracy in binary classification evaluation, BMC genomics, 21 (2020),
pp. 6–6. https://doi.org/10.1186/s12864-019-6413-7.
[5] C-H. Choi, J-H. Choi, and H-K. Lee, CFA pattern identification of digital cameras using
intermediate value counting, in Proceedings of the Thirteenth ACM Multimedia Workshop on
Multimedia and Security, MM&Sec ’11, New York, NY, USA, 2011, Association for Computing
Machinery, pp. 21–26. https://doi.org/10.1145/2037252.2037258.
[6] D-T. Dang-Nguyen, C. Pasquini, V. Conotter, and G. Boato, Raise: A raw images
dataset for digital image forensics, in Proceedings of the 6th ACM Multimedia Systems
Conference, 2015, pp. 219–224. https://dl.acm.org/doi/pdf/10.1145/2713168.2713194.
[7] K. Hirakawa and T.W. Parks, Adaptive homogeneity-directed demosaicing algorithm, IEEE
Transactions on Image Processing, 14 (2005), pp. 360–369.
https://doi.org/10.1109/TIP.2004.838691.
[8] M. Kirchner and J. Fridrich, On detection of median filtering in digital images, in Media
Forensics and Security II, vol. 7541, International Society for Optics and Photonics, 2010,
p. 754110. https://doi.org/10.1117/12.839100.
[9] B.W. Matthews, Comparison of the predicted and observed secondary structure of T4 phage
lysozyme, Biochimica et Biophysica Acta (BBA) - Protein Structure, 405 (1975), pp. 442–451.
https://doi.org/10.1016/0005-2795(75)90109-9.
[10] S. van der Walt, J.L. Schönberger, J. Nunez-Iglesias, F. Boulogne, J.D. Warner,
N. Yager, E. Gouillart, T. Yu, and the scikit-image contributors, scikit-image:
image processing in Python, PeerJ, 2 (2014), p. e453. https://doi.org/10.7717/peerj.453.

5 LibRaw library, Copyright © 2008-2019 LibRaw LLC, https://www.libraw.org
