encoding.functions¶

batchnormtrain¶

encoding.functions.batchnormtrain(input, gamma, beta, mean, std)[source]

Applies Batch Normalization over a 2d or 3d input that is seen as a mini-batch.

$y = \frac{x - \mu[x]}{ \sqrt{var[x] + \epsilon}} * \gamma + \beta$
Shape:
• Input: $$(N, C)$$ or $$(N, C, L)$$
• Output: $$(N, C)$$ or $$(N, C, L)$$ (same shape as input)
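The formula above can be sketched in plain PyTorch. This is a hedged reference implementation of the normalization arithmetic, not the library's CUDA kernel; the helper name `batchnorm_train_ref` is made up for illustration.

```python
import torch

def batchnorm_train_ref(x, gamma, beta, mean, std):
    """Sketch of the batch-norm formula: normalize each channel of a
    (N, C) or (N, C, L) input with the given per-channel mean and std,
    then scale by gamma and shift by beta."""
    # Reshape per-channel statistics so they broadcast over N (and L).
    shape = (1, -1) + (1,) * (x.dim() - 2)
    return (x - mean.view(shape)) / std.view(shape) * gamma.view(shape) + beta.view(shape)

# Example on a (N, C) input, normalizing with the batch statistics.
x = torch.randn(4, 3)
mean, std = x.mean(dim=0), x.std(dim=0, unbiased=False)
gamma, beta = torch.ones(3), torch.zeros(3)
y = batchnorm_train_ref(x, gamma, beta, mean, std)
```

With identity gamma/beta and the batch's own statistics, the output is zero-mean and unit-variance per channel.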

batchnormeval¶

encoding.functions.batchnormeval(input, gamma, beta, mean, std)[source]

Applies Batch Normalization over a 2d or 3d input that is seen as a mini-batch, using the provided mean and std (evaluation mode).

dilatedavgpool2d¶

encoding.functions.dilatedavgpool2d(input, kernel_size, stride=None, padding=0, dilation=1)[source]

Dilated Average Pool 2d, for dilation of DenseNet.

Reference:

Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal. “Context Encoding for Semantic Segmentation.” CVPR 2018

Applies a 2D average-pooling operation over kh x kw regions, in steps of size sh x sw. The number of output features is equal to the number of input planes.

See DilatedAvgPool2d for details and output shape.

Parameters:
• input – input tensor (minibatch x in_channels x iH x iW)
• kernel_size – size of the pooling region, a single number or a tuple (kh x kw)
• stride – stride of the pooling operation, a single number or a tuple (sh x sw). Default: equal to kernel_size
• padding – implicit zero padding on the input, a single number or a tuple (padh x padw). Default: 0
• dilation – the dilation parameter, similar to Conv2d
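One way to reason about dilated average pooling is as a depthwise convolution with a constant kernel, since Conv2d already supports dilation. The sketch below emulates the operation that way; `dilated_avg_pool2d_ref` is a hypothetical helper (assuming single-int parameters), not the library's kernel.

```python
import torch
import torch.nn.functional as F

def dilated_avg_pool2d_ref(x, kernel_size, stride=None, padding=0, dilation=1):
    """Emulate dilated average pooling: each output element averages
    kernel_size**2 input samples taken at dilation-spaced locations,
    exactly as Conv2d interprets its dilation argument."""
    k = kernel_size
    stride = stride or k  # default stride equals the kernel size
    c = x.size(1)
    # One uniform averaging kernel per channel, applied depthwise (groups=c).
    weight = torch.full((c, 1, k, k), 1.0 / (k * k), dtype=x.dtype)
    return F.conv2d(x, weight, stride=stride, padding=padding,
                    dilation=dilation, groups=c)

x = torch.randn(1, 3, 8, 8)
y = dilated_avg_pool2d_ref(x, kernel_size=2, stride=1, dilation=2)
```

With dilation=1 and the default stride this reduces to ordinary average pooling, which makes the sketch easy to sanity-check against F.avg_pool2d.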

aggregate¶

encoding.functions.aggregate(A, X, C)[source]

Aggregate operation: aggregates the residuals of the inputs ($$X$$) with respect to the codewords ($$C$$), weighted by the assignment weights ($$A$$).

$e_{k} = \sum_{i=1}^{N} a_{ik} (x_i - d_k)$
Shape:
• Input: $$A\in\mathcal{R}^{B\times N\times K}$$ $$X\in\mathcal{R}^{B\times N\times D}$$ $$C\in\mathcal{R}^{K\times D}$$ (where $$B$$ is the batch size, $$N$$ is the total number of features, $$K$$ is the number of codewords, and $$D$$ is the feature dimension.)
• Output: $$E\in\mathcal{R}^{B\times K\times D}$$

Examples

>>> B,N,K,D = 2,3,4,5
>>> A = torch.softmax(torch.randn(B, N, K), dim=-1)
>>> X, C = torch.randn(B, N, D), torch.randn(K, D)
>>> E = encoding.functions.aggregate(A, X, C)
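For reference, the aggregate formula above can be written as a single einsum over the residual tensor. This is a hedged plain-PyTorch sketch (the helper name `aggregate_ref` is made up), not the library's fused kernel.

```python
import torch

def aggregate_ref(A, X, C):
    """Sketch of e_k = sum_i a_ik * (x_i - c_k) per batch element.
    A: (B, N, K) assignment weights, X: (B, N, D) features,
    C: (K, D) codewords. Returns E: (B, K, D)."""
    residual = X.unsqueeze(2) - C               # (B, N, K, D)
    return torch.einsum('bnk,bnkd->bkd', A, residual)

B, N, K, D = 2, 3, 4, 5
A = torch.softmax(torch.randn(B, N, K), dim=-1)
X, C = torch.randn(B, N, D), torch.randn(K, D)
E = aggregate_ref(A, X, C)
```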


scaledL2¶

encoding.functions.scaledL2(X, C, S)[source]

Scaled L2 distance between the input features and the codewords.

$sl_{ik} = s_k \|x_i-c_k\|^2$
Shape:
• Input: $$X\in\mathcal{R}^{B\times N\times D}$$ $$C\in\mathcal{R}^{K\times D}$$ $$S\in \mathcal{R}^K$$ (where $$B$$ is the batch size, $$N$$ is the total number of features, $$K$$ is the number of codewords, and $$D$$ is the feature dimension.)
• Output: $$E\in\mathcal{R}^{B\times N\times K}$$
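The scaled-L2 formula above maps directly onto broadcasting. Below is a hedged sketch (the helper name `scaled_l2_ref` is made up for illustration); the library implements this as a fused kernel.

```python
import torch

def scaled_l2_ref(X, C, S):
    """Sketch of sl_ik = s_k * ||x_i - c_k||^2 per batch element.
    X: (B, N, D) features, C: (K, D) codewords, S: (K,) scales.
    Returns (B, N, K)."""
    # (B, N, 1, D) - (K, D) broadcasts to (B, N, K, D)
    diff = X.unsqueeze(2) - C
    return S * diff.pow(2).sum(dim=-1)

B, N, K, D = 2, 3, 4, 5
X, C, S = torch.randn(B, N, D), torch.randn(K, D), torch.rand(K)
SL = scaled_l2_ref(X, C, S)
```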

sum_square¶

encoding.functions.sum_square(input)[source]

Calculates the per-channel sum of elements and sum of squares of the input, for use in Batch Normalization.
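The two reductions can be sketched in plain PyTorch, assuming the channel dimension is dim 1 and everything else is reduced over. The helper name `sum_square_ref` is hypothetical; the library computes both quantities in one fused pass.

```python
import torch

def sum_square_ref(x):
    """Sketch: per-channel sum and sum of squares of a (N, C) or
    (N, C, L) input, reducing over every dim except the channel dim."""
    dims = [d for d in range(x.dim()) if d != 1]
    return x.sum(dim=dims), x.pow(2).sum(dim=dims)

x = torch.randn(4, 3, 7)
xsum, xsqsum = sum_square_ref(x)
```

From these two sums, batch mean and variance per channel follow as xsum/n and xsqsum/n - (xsum/n)**2 with n the number of reduced elements.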