How 3D convolution really works in CAFFE: a detailed analysis

This post summarizes how 3D convolution is implemented as a 2D matrix multiplication in CAFFE and other popular CNN implementations.

We start by analyzing the code in conv_layer, specifically the Forward_cpu code. Following the example of the first convolution layer, we know it is a 3D tensor operation, but it is converted into a 2D matrix multiplication. The important dimensions are M_, N_, K_.

Why are M_, N_, and K_ important? In Forward_cpu, matrix A in the BLAS call is the weight matrix; it has M rows and K columns. B is the 3D input tensor linearized into a 2D matrix by the im2col operation; its dimension is K by N. (See the documentation for A, B, and BLAS below.)

So the convolution becomes

A (M by K) * B (K by N) = C (M by N)

The code that calculates M_, N_, and K_ is shown below:


// Prepare the matrix multiplication computation.
// Each input will be convolved as a single GEMM.
M_ = num_output_ / group_;
K_ = channels_ * kernel_h_ * kernel_w_ / group_;
N_ = height_out_ * width_out_;
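N_ depends on height_out_ and width_out_, which are not computed in this snippet. They follow the standard convolution output-size formula; a minimal sketch (the helper name is ours, not Caffe's, but the arithmetic matches what the layer computes during setup):

// Standard convolution output-size formula. The helper name is
// hypothetical; Caffe computes these values inline during layer setup.
int conv_out_size(int in_size, int kernel, int pad, int stride) {
  return (in_size + 2 * pad - kernel) / stride + 1;
}
// height_out_ = conv_out_size(height_, kernel_h_, pad_h_, stride_h_);
// width_out_  = conv_out_size(width_,  kernel_w_, pad_w_, stride_w_);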

The Forward_cpu loop in the convolution layer:


for (int n = 0; n < num_; ++n) {
  // im2col transformation: unroll input regions for filtering
  // into column matrix for multiplication.
  im2col_cpu(bottom_data + bottom[i]->offset(n), channels_, height_,
      width_, kernel_h_, kernel_w_, pad_h_, pad_w_, stride_h_, stride_w_,
      col_data);
  // Take inner products for groups.
  for (int g = 0; g < group_; ++g) {
    caffe_cpu_gemm<Dtype>(CblasNoTrans, CblasNoTrans, M_, N_, K_,
        (Dtype)1., weight + weight_offset * g, col_data + col_offset * g,
        (Dtype)0., top_data + (*top)[i]->offset(n) + top_offset * g);
  }
  // Add bias.
  if (bias_term_) {
    caffe_cpu_gemm<Dtype>(CblasNoTrans, CblasNoTrans, num_output_,
        N_, 1, (Dtype)1., this->blobs_[1]->cpu_data(),
        bias_multiplier_.cpu_data(),
        (Dtype)1., top_data + (*top)[i]->offset(n));
  }
}  // The enclosing loop over bottom blobs (index i) is not shown here.
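To make the im2col step concrete, here is a minimal, self-contained sketch of the idea. This is not Caffe's exact im2col_cpu (which is more carefully optimized), but it produces the same K_-by-N_ column matrix: each kernel-sized patch of the (zero-padded) input becomes one column.

// Sketch of im2col: unroll a channels x height x width input into a
// (channels * kernel_h * kernel_w) x (height_out * width_out) matrix
// so that convolution becomes a single matrix multiplication.
void im2col_sketch(const float* data_im, int channels, int height, int width,
                   int kernel_h, int kernel_w, int pad_h, int pad_w,
                   int stride_h, int stride_w, float* data_col) {
  const int height_out = (height + 2 * pad_h - kernel_h) / stride_h + 1;
  const int width_out  = (width  + 2 * pad_w - kernel_w) / stride_w + 1;
  for (int c = 0; c < channels; ++c)
    for (int kh = 0; kh < kernel_h; ++kh)
      for (int kw = 0; kw < kernel_w; ++kw) {
        // One row per (channel, kernel offset) triple: K_ rows in total.
        const int row = (c * kernel_h + kh) * kernel_w + kw;
        for (int oh = 0; oh < height_out; ++oh)
          for (int ow = 0; ow < width_out; ++ow) {
            // One column per output pixel: N_ columns in total.
            const int ih = oh * stride_h - pad_h + kh;
            const int iw = ow * stride_w - pad_w + kw;
            const int col = oh * width_out + ow;
            data_col[row * height_out * width_out + col] =
                (ih >= 0 && ih < height && iw >= 0 && iw < width)
                    ? data_im[(c * height + ih) * width + iw]
                    : 0.f;  // zero padding outside the input
          }
      }
}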

The CAFFE wrapper around the BLAS call:


template<>
void caffe_cpu_gemm<float>(const CBLAS_TRANSPOSE TransA,
    const CBLAS_TRANSPOSE TransB, const int M, const int N, const int K,
    const float alpha, const float* A, const float* B, const float beta,
    float* C) {
  int lda = (TransA == CblasNoTrans) ? K : M;
  int ldb = (TransB == CblasNoTrans) ? N : K;
  cblas_sgemm(CblasRowMajor, TransA, TransB, M, N, K, alpha, A, lda, B,
      ldb, beta, C, N);
}
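For intuition, here is a naive reference implementation of what that cblas_sgemm call computes in the no-transpose, row-major case (a sketch for illustration, not Caffe code):

// C = alpha * A * B + beta * C, with row-major A (M x K), B (K x N),
// C (M x N). Equivalent to the cblas_sgemm call above in the
// no-transpose case, just without the optimized BLAS kernels.
void gemm_ref(int M, int N, int K, float alpha, const float* A,
              const float* B, float beta, float* C) {
  for (int m = 0; m < M; ++m)
    for (int n = 0; n < N; ++n) {
      float acc = 0.f;
      for (int k = 0; k < K; ++k)
        acc += A[m * K + k] * B[k * N + n];
      C[m * N + n] = alpha * acc + beta * C[m * N + n];
    }
}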

Function documentation for cblas_dgemm (from Apple's Accelerate BLAS reference):

https://developer.apple.com/library/mac/documentation/Accelerate/Reference/BLAS_Ref/#//apple_ref/c/func/cblas_dgemm

void cblas_dgemm(const enum CBLAS_ORDER Order, const enum CBLAS_TRANSPOSE TransA,
    const enum CBLAS_TRANSPOSE TransB, const int M, const int N, const int K,
    const double alpha, const double *A, const int lda, const double *B,
    const int ldb, const double beta, double *C, const int ldc);

Parameters

Order: Specifies row-major (C) or column-major (Fortran) data ordering.
TransA: Specifies whether to transpose matrix A.
TransB: Specifies whether to transpose matrix B.
M: Number of rows in matrices A and C.
N: Number of columns in matrices B and C.
K: Number of columns in matrix A; number of rows in matrix B.
alpha: Scaling factor for the product of matrices A and B.
A: Matrix A.
lda: The size of the first dimension of matrix A; if you are passing a matrix A[m][n], the value should be m.
B: Matrix B.
ldb: The size of the first dimension of matrix B; if you are passing a matrix B[m][n], the value should be m.
beta: Scaling factor for matrix C.
C: Matrix C.
ldc: The size of the first dimension of matrix C; if you are passing a matrix C[m][n], the value should be m.

Note that Apple's description of lda/ldb/ldc assumes column-major ordering. With CblasRowMajor, as caffe_cpu_gemm uses above, the leading dimension is the number of columns of each matrix, which is why lda = K, ldb = N, and ldc = N in the no-transpose case.

So, for example, the first convolution layer (these numbers match AlexNet's conv1: 96 filters over 3 input channels with 11 x 11 kernels, producing a 55 x 55 output) has M_ = 96, K_ = 3 * 11 * 11 = 363, and N_ = 55 * 55 = 3025. It is essentially a 96-by-363 matrix (A) times a 363-by-3025 matrix (B), and you get a 96-by-3025 matrix, corresponding to the 96 x 55 x 55 output 3D tensor.
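As a quick sanity check of those numbers (assuming AlexNet-style conv1 parameters: a 227 x 227 x 3 input, 11 x 11 kernels, stride 4, no padding, group_ = 1):

#include <cstdio>

int main() {
  // Hypothetical conv1 parameters; only the resulting M_, K_, N_ are
  // taken from the post itself.
  const int num_output = 96, channels = 3, kernel = 11, stride = 4;
  const int input = 227;
  const int out = (input - kernel) / stride + 1;  // 55
  const int M = num_output;                       // 96
  const int K = channels * kernel * kernel;       // 363
  const int N = out * out;                        // 3025
  std::printf("M_=%d K_=%d N_=%d -> output tensor %d x %d x %d\n",
              M, K, N, M, out, out);
  return 0;
}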
