Forward and Backward Propagation in CNN Convolutional and Pooling Layers


title: Forward and Backward Propagation in CNN Convolutional and Pooling Layers
tags: CNN, backpropagation, forward propagation

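This article covers only forward and backward propagation in CNNs, mainly for the convolutional and pooling layers. Basic background on convolutional networks is not covered.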

Notation

If layer $l$ is a convolutional layer:

$p^{[l]}$: padding
$s^{[l]}$: stride
$n_c^{[l]}$: number of filters

Filter size: $k_1^{[l]} \times k_2^{[l]} \times n_c^{[l-1]}$
Weight: $W^{[l]}$, of size $k_1^{[l]} \times k_2^{[l]} \times n_c^{[l-1]} \times n_c^{[l]}$
Bias: $b^{[l]}$, of size $n_c^{[l]}$
Linear output: $z^{[l]}$, of size $n_h^{[l]} \times n_w^{[l]} \times n_c^{[l]}$
Activations: $a^{[l]}$, of size $n_h^{[l]} \times n_w^{[l]} \times n_c^{[l]}$

Input: $a^{[l-1]}$, of size $n_h^{[l-1]} \times n_w^{[l-1]} \times n_c^{[l-1]}$
Output: $a^{[l]}$, of size $n_h^{[l]} \times n_w^{[l]} \times n_c^{[l]}$

$n_h^{[l]}$ and $n_h^{[l-1]}$ satisfy:
$$(n_h^{[l]} - 1)\,s^{[l]} + k_1^{[l]} \leqslant n_h^{[l-1]} + 2p^{[l]}$$
$$n_h^{[l]} = \left\lfloor \frac{n_h^{[l-1]} + 2p^{[l]} - k_1^{[l]}}{s^{[l]}} + 1 \right\rfloor$$
Here $\left\lfloor x \right\rfloor$ denotes rounding down (floor). $n_w^{[l]}$ and $n_w^{[l-1]}$ are related in the same way, with $k_2^{[l]}$ in place of $k_1^{[l]}$.
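As a quick check of the floor formula above, here is a minimal Python sketch. The function name `conv_output_size` and the example layer sizes are illustrative assumptions, not taken from the original post.

```python
def conv_output_size(n_prev: int, k: int, p: int = 0, s: int = 1) -> int:
    """Output length along one axis: floor((n_prev + 2p - k) / s) + 1."""
    if s <= 0:
        raise ValueError("stride must be positive")
    if n_prev + 2 * p < k:
        raise ValueError("kernel is larger than the padded input")
    # Integer floor division implements the floor in the formula.
    return (n_prev + 2 * p - k) // s + 1


# Hypothetical layer: 32-wide input, kernel 5, padding 2, stride 1.
print(conv_output_size(32, k=5, p=2, s=1))  # 32
# Hypothetical layer: 7-wide input, kernel 3, no padding, stride 2.
print(conv_output_size(7, k=3, p=0, s=2))   # 3
```

The result always satisfies the inequality above: the last window starts at offset `(n - 1) * s` and must end inside the padded input.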

![](https://www.github.com/callMeBigKing/story_writer_note/raw/master/小书匠/1535388857043.png)

Cross-correlation and Convolution

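Many articles and blog posts call both cross-correlation and convolution "convolution", and describe cross-correlation as a flipped convolution. In my understanding the two are different, so this article writes them as two separate expressions and does not introduce a 180-degree flip.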

Cross-correlation

For an image $I$ of size $h \times w$ and a kernel $K$ of size $k_1 \times k_2$, the cross-correlation is defined as:

$$(I \otimes K)_{ij} = \sum_{m=0}^{k_1-1} \sum_{n=0}^{k_2-1} I(i+m, j+n)\,K(m,n)$$

where

$$0 \leqslant i \leqslant h - k_1$$
$$0 \leqslant j \leqslant w - k_2$$

Note the symbol $\otimes$ used here and the range of $i$: without padding, cross-correlation produces a smaller matrix, of size $(h - k_1 + 1) \times (w - k_2 + 1)$.
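The definition above can be sketched directly in plain Python. This is a minimal illustration, assuming zero-based indexing, no padding and stride 1. The function name `cross_correlate2d` and the sample values are made up for the example.

```python
def cross_correlate2d(image, kernel):
    """(I ⊗ K)[i][j] = sum over m, n of I[i + m][j + n] * K[m][n]."""
    h, w = len(image), len(image[0])
    k1, k2 = len(kernel), len(kernel[0])
    out_h, out_w = h - k1 + 1, w - k2 + 1  # output shrinks without padding
    return [
        [
            sum(image[i + m][j + n] * kernel[m][n]
                for m in range(k1) for n in range(k2))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


I = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
K = [[1, 2],
     [3, 4]]
print(cross_correlate2d(I, K))  # [[37, 47], [67, 77]]
```

A 3×3 image with a 2×2 kernel gives a 2×2 result, matching the smaller output size noted above.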

[Figure: Cross-correlation]

Convolution

First, recall the convolution of continuous functions and of one-dimensional sequences, shown below. Convolution is commutative:
$$h(t) = \int_{-\infty}^{\infty} f(\tau)\,g(t-\tau)\,d\tau$$

$$c(n) = \sum_{i=-\infty}^{\infty} a(i)\,b(n-i)$$
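For finite sequences the infinite sum reduces to a few terms. The sketch below assumes that values outside either list are zero. The name `convolve1d` and the sample sequences are illustrative only. It also checks commutativity on one example.

```python
def convolve1d(a, b):
    """c(n) = sum over i of a(i) * b(n - i), with out-of-range terms as zero."""
    out = [0] * (len(a) + len(b) - 1)
    for n in range(len(out)):
        for i in range(len(a)):
            if 0 <= n - i < len(b):  # skip terms where b(n - i) is outside b
                out[n] += a[i] * b[n - i]
    return out


a = [1, 2, 3]
b = [4, 5]
print(convolve1d(a, b))                      # [4, 13, 22, 15]
print(convolve1d(a, b) == convolve1d(b, a))  # True
```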

For an image $I$ of size $h \times w$ and a kernel $K$ of size $k_1 \times k_2$, the convolution is:
$$(I * K)_{ij} = (K * I)_{ij} = \sum_{m=0}^{k_1-1} \sum_{n=0}^{k_2-1} I(i-m, j-n)\,K(m,n)$$
where the bounds on $i$ and $j$ begin $0 \leqslant \dots$ (the rest of this expression is truncated in the source).

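Note: the convolution here differs from the cross-correlation above:

  1. The range of $i, j$ is larger, so the matrix produced by convolution is larger.
  2. Many terms of the form $I(-x, -y)$ appear; these negative indices can be understood as padding.
  3. The kernel is flipped by 180 degrees here.
    The detailed process is shown in the figure below.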

[Figure: illustration of A*B]

![Convolution diagram](https://hosbimkimg.oss-cn-beijing.aliyuncs.com/pic/卷积示意图new.svg "Convolution diagram")
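The three points above can be seen in a small sketch of the convolution sum. It assumes the standard "full" output range, $0 \leqslant i \leqslant h + k_1 - 2$ and $0 \leqslant j \leqslant w + k_2 - 2$. That range is an assumption on my part, not recovered from the original expression. Out-of-range pixels are treated as zero padding. The name `convolve2d_full` and the sample values are hypothetical.

```python
def convolve2d_full(image, kernel):
    """(I * K)[i][j] = sum over m, n of I[i - m][j - n] * K[m][n]."""
    h, w = len(image), len(image[0])
    k1, k2 = len(kernel), len(kernel[0])
    out_h, out_w = h + k1 - 1, w + k2 - 1  # larger than the input

    def pixel(r, c):
        # Negative or too-large indices behave like zero padding.
        return image[r][c] if 0 <= r < h and 0 <= c < w else 0

    return [
        [
            sum(pixel(i - m, j - n) * kernel[m][n]
                for m in range(k1) for n in range(k2))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


I = [[1, 2],
     [3, 4]]
K = [[1, 2],
     [3, 4]]
print(convolve2d_full(I, K))  # [[1, 4, 4], [6, 20, 16], [9, 24, 16]]
```

A 2×2 image with a 2×2 kernel produces a 3×3 output here, whereas cross-correlation without padding would give a single value.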

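If the padding terms of the convolution are discarded, the result is as in the figure below. Convolution and cross-correlation then differ only by a 180-degree flip of the kernel.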

[Figure: convolution]

[Figure: kernel rotated by 180 degrees]
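The 180-degree relationship can be checked numerically. This minimal sketch assumes "discarding the padding terms" means keeping only the outputs where every index $I(i-m, j-n)$ falls inside the image. The helper names `rot180`, `cross_correlate2d` and `convolve2d_valid` are illustrative, not from the original post.

```python
def rot180(kernel):
    """Rotate a 2-D kernel by 180 degrees (reverse rows and columns)."""
    return [row[::-1] for row in kernel[::-1]]


def cross_correlate2d(image, kernel):
    h, w = len(image), len(image[0])
    k1, k2 = len(kernel), len(kernel[0])
    return [
        [
            sum(image[i + m][j + n] * kernel[m][n]
                for m in range(k1) for n in range(k2))
            for j in range(w - k2 + 1)
        ]
        for i in range(h - k1 + 1)
    ]


def convolve2d_valid(image, kernel):
    """Convolution restricted to outputs that need no padding."""
    h, w = len(image), len(image[0])
    k1, k2 = len(kernel), len(kernel[0])
    return [
        [
            sum(image[i - m][j - n] * kernel[m][n]
                for m in range(k1) for n in range(k2))
            for j in range(k2 - 1, w)   # smallest j with j - n >= 0 for all n
        ]
        for i in range(k1 - 1, h)       # smallest i with i - m >= 0 for all m
    ]


I = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
K = [[1, 2],
     [3, 4]]
print(convolve2d_valid(I, K))           # [[23, 33], [53, 63]]
print(cross_correlate2d(I, rot180(K)))  # [[23, 33], [53, 63]]
```

Both calls give the same matrix, so convolving with $K$ (without padding) matches cross-correlating with $K$ rotated by 180 degrees.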
