Forward and Backward Propagation of CNN Convolutional and Pooling Layers
Date: 2018-09-01
This article covers only the forward and backward propagation of a CNN, mainly for the convolutional and pooling layers; basic background on convolutional networks is not covered.

Notation
If layer $l$ is a convolutional layer:

- $p^{[l]}$: padding
- $s^{[l]}$: stride
- $n_c^{[l]}$: number of filters
- filter size: $k_1^{[l]} \times k_2^{[l]} \times n_c^{[l-1]}$
- weights: $W^{[l]}$, of size $k_1^{[l]} \times k_2^{[l]} \times n_c^{[l-1]} \times n_c^{[l]}$
- bias: $b^{[l]}$, of size $n_c^{[l]}$
- linear output: $z^{[l]}$, of size $n_h^{[l]} \times n_w^{[l]} \times n_c^{[l]}$
- activations: $a^{[l]}$, of size $n_h^{[l]} \times n_w^{[l]} \times n_c^{[l]}$
- input: $a^{[l-1]}$, of size $n_h^{[l-1]} \times n_w^{[l-1]} \times n_c^{[l-1]}$
- output: $a^{[l]}$, of size $n_h^{[l]} \times n_w^{[l]} \times n_c^{[l]}$
$n_h^{[l]}$ and $n_h^{[l-1]}$ satisfy:

$$n_h^{[l]}\,(s^{[l]} - 1) + k_1^{[l]} \leqslant n_h^{[l-1]} + 2p^{[l]}$$

$$n_h^{[l]} = \left\lfloor \frac{n_h^{[l-1]} + 2p^{[l]} - k_1^{[l]}}{s^{[l]}} + 1 \right\rfloor$$

where $\left\lfloor x \right\rfloor$ denotes rounding down (the floor function). $n_w^{[l]}$ and $n_w^{[l-1]}$ satisfy the same relation.
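The output-size formula above can be sketched as a small helper function (the function name and the example numbers are illustrative choices of mine, not from the original post):

```python
def conv_output_size(n_prev, k, p, s):
    """floor((n_prev + 2p - k) / s) + 1, per the formula above."""
    return (n_prev + 2 * p - k) // s + 1

# A 32x32 input with a 3x3 kernel, padding 1, stride 1 keeps its size:
print(conv_output_size(32, 3, 1, 1))  # 32
```

Note that Python's integer floor division `//` implements $\lfloor \cdot \rfloor$ here, since all operands are non-negative.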
Cross-correlation vs. Convolution
Many articles and blog posts call both cross-correlation and convolution "convolution", describing cross-correlation as convolution with a flipped kernel. In my understanding the two are distinct, so this article writes them as two separate expressions rather than introducing a 180-degree flip.
Cross-correlation
For an $h \times w$ image $I$ and a $(k_1 \times k_2)$ kernel $K$, the cross-correlation is defined as:

$$(I \otimes K)_{ij} = \sum\limits_{m = 0}^{k_1 - 1} \sum\limits_{n = 0}^{k_2 - 1} I(i + m, j + n)\, K(m, n)$$
where

$$0 \leqslant i \leqslant h - k_1, \qquad 0 \leqslant j \leqslant w - k_2$$
Note the operator used here and the range of $i$: without padding, cross-correlation produces a smaller matrix, of size $(h - k_1 + 1) \times (w - k_2 + 1)$.
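As a concrete check of the definition above, here is a minimal NumPy sketch of valid (unpadded) 2-D cross-correlation; the function name `cross_correlate2d` and the use of NumPy are my own choices, not from the original post:

```python
import numpy as np

def cross_correlate2d(I, K):
    """Valid (no padding) cross-correlation of image I with kernel K."""
    h, w = I.shape
    k1, k2 = K.shape
    # The output is smaller than the input: (h - k1 + 1) x (w - k2 + 1).
    out = np.zeros((h - k1 + 1, w - k2 + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # (I ⊗ K)_{ij} = sum_{m,n} I(i+m, j+n) * K(m, n)
            out[i, j] = np.sum(I[i:i + k1, j:j + k2] * K)
    return out
```

For example, a $3 \times 3$ image cross-correlated with a $2 \times 2$ kernel yields a $2 \times 2$ output, matching the size formula above.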
Convolution
First, recall the convolution of continuous functions and of one-dimensional sequences, shown below; convolution is commutative:
$$h(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau$$
$$c(n) = \sum\limits_{i = -\infty}^{\infty} a(i)\, b(n - i)$$
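The 1-D discrete convolution above can be checked directly with NumPy's `np.convolve`; the example arrays are my own choice for illustration:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([0.0, 1.0, 0.5])
# c(n) = sum_i a(i) * b(n - i); "full" mode (the default) keeps every n
c = np.convolve(a, b)
print(c)  # [0.  1.  2.5 4.  1.5]
```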
For an $h \times w$ image $I$ and a $(k_1 \times k_2)$ kernel $K$, the convolution is:

$$(I * K)_{ij} = (K * I)_{ij} = \sum\limits_{m = 0}^{k_1 - 1} \sum\limits_{n = 0}^{k_2 - 1} I(i - m, j - n)\, K(m, n)$$
where

$$0 \leqslant i \leqslant h + k_1 - 2, \qquad 0 \leqslant j \leqslant w + k_2 - 2$$
Note that this convolution differs from the cross-correlation above:
- the ranges of $i, j$ are larger, so the resulting matrix is larger
- many terms $I(-x, -y)$ with negative indices appear; they can be understood as (zero) padding
- the kernel is effectively flipped 180 degrees
The full process is illustrated in the figure below.
[Figure: convolution illustration (image unavailable)]
If the padding terms of the convolution are dropped, then convolution and cross-correlation differ only by a 180-degree flip of the kernel.
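Putting the pieces together, here is a minimal NumPy sketch of the full convolution defined above: the negative indices of $I$ are realized as zero padding, and after flipping the kernel 180 degrees the computation reduces to a cross-correlation with the rotated kernel. The function name and NumPy usage are my own choices, not from the original post:

```python
import numpy as np

def convolve2d_full(I, K):
    """Full 2-D convolution: (I * K)_{ij} = sum_{m,n} I(i-m, j-n) K(m, n)."""
    h, w = I.shape
    k1, k2 = K.shape
    # Negative indices of I act like zero padding of width (k1-1, k2-1).
    Ip = np.pad(I, ((k1 - 1, k1 - 1), (k2 - 1, k2 - 1)))
    Kf = np.flip(K)  # 180-degree rotation of the kernel
    # Output is larger than the input: (h + k1 - 1) x (w + k2 - 1).
    out = np.zeros((h + k1 - 1, w + k2 - 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Cross-correlation with the flipped kernel equals convolution.
            out[i, j] = np.sum(Ip[i:i + k1, j:j + k2] * Kf)
    return out
```

For a $2 \times 2$ image and a $2 \times 2$ kernel this produces a $3 \times 3$ output, matching the enlarged index ranges noted above.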