Deep CV 101
Burness.Duan
UCloud
1.Deep CV Classical Models
2.Deep CV Applications
3.Distributed Deep Learning In UCloud
Outline
Deep CV Classical Models
1.LeNet
2.AlexNet
3.GoogLeNet
4.VGG
5.Deep Residual Network
LeNet
LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document
recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
LeNet
https://github.com/spark-mler/WorkWithTensorflow/blob/master/CV_model/lenet/lenet.py
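A minimal LeNet-style sketch in TFlearn, following the layer sizes described in the speaker notes (C1 through F6); the linked lenet.py may differ in details such as activations and the optimizer.

import tflearn
from tflearn.layers.core import input_data, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.estimator import regression

network = input_data(shape=[None, 32, 32, 1])                          # 32*32 grayscale input
network = conv_2d(network, 6, 5, padding='valid', activation='tanh')   # C1: 6 maps, 5*5 kernels -> 28*28*6
network = max_pool_2d(network, 2)                                      # S2: 2*2 subsampling -> 14*14*6
network = conv_2d(network, 16, 5, padding='valid', activation='tanh')  # C3: 16 maps, 5*5 kernels -> 10*10*16
network = max_pool_2d(network, 2)                                      # S4: 2*2 subsampling -> 5*5*16
network = conv_2d(network, 120, 5, padding='valid', activation='tanh') # C5: 120 maps -> 1*1*120
network = fully_connected(network, 84, activation='tanh')              # F6
network = fully_connected(network, 10, activation='softmax')           # 10-way output
network = regression(network, optimizer='sgd',
                     loss='categorical_crossentropy', learning_rate=0.01)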
AlexNet
Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural
networks[C]//Advances in neural information processing systems. 2012: 1097-1105.
AlexNet Tricks
ReLU on CIFAR-10
Local Response Normalization
Dropout
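A hedged sketch of where these three tricks sit in a TFlearn convolution stack (only a fragment of AlexNet; the filter sizes follow the first two convolutional layers, and the two-GPU split of the original is ignored).

import tflearn
from tflearn.layers.core import input_data, fully_connected, dropout
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.normalization import local_response_normalization

network = input_data(shape=[None, 227, 227, 3])
network = conv_2d(network, 96, 11, strides=4, activation='relu')   # ReLU non-linearity
network = max_pool_2d(network, 3, strides=2)
network = local_response_normalization(network)                    # LRN after pooling
network = conv_2d(network, 256, 5, activation='relu')
network = max_pool_2d(network, 3, strides=2)
network = local_response_normalization(network)
network = fully_connected(network, 4096, activation='relu')
network = dropout(network, 0.5)                                     # Dropout on the FC layers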
AlexNet
GoogLeNet
Shortcomings
1. A bigger network means a larger number of parameters, which makes the
enlarged network more prone to over-fitting.
2. Uniformly increasing the network size also dramatically increases the
use of computational resources.
3. Computing infrastructures are very inefficient at numerical
calculation on non-uniform sparse data structures.
Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]//Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition. 2015: 1-9.
GoogLeNet
Inception Module
GoogLeNet
Inception Module With Dimension Reduction
Lin M, Chen Q, Yan S. Network in network[J]. arXiv preprint arXiv:1312.4400, 2013.
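A hedged sketch of one Inception module with 1x1 dimension reduction, written against TFlearn's conv_2d/merge API in the style of the googlenet.py example referenced on the next slide; the filter counts roughly follow the "inception (3a)" block and are illustrative.

from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.merge_ops import merge

def inception_block(incoming):
    branch_1x1 = conv_2d(incoming, 64, 1, activation='relu')
    branch_3x3_reduce = conv_2d(incoming, 96, 1, activation='relu')   # 1x1 reduction before 3x3
    branch_3x3 = conv_2d(branch_3x3_reduce, 128, 3, activation='relu')
    branch_5x5_reduce = conv_2d(incoming, 16, 1, activation='relu')   # 1x1 reduction before 5x5
    branch_5x5 = conv_2d(branch_5x5_reduce, 32, 5, activation='relu')
    branch_pool = max_pool_2d(incoming, 3, strides=1)                 # 3x3 pooling, same spatial size
    branch_pool_1x1 = conv_2d(branch_pool, 32, 1, activation='relu')  # reduce after pooling
    # Concatenate all branches along the channel axis
    return merge([branch_1x1, branch_3x3, branch_5x5, branch_pool_1x1],
                 mode='concat', axis=3)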
GoogLeNet
Overall architecture of GoogLeNet
GoogLeNet
GoogLeNet With TFlearn
https://github.com/tflearn/tflearn/blob/master/examples/images/googlenet.py
VGG
VGG With TFlearn
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.estimator import regression

network = input_data(shape=[None, 224, 224, 3])
network = conv_2d(network, 64, 3, activation='relu')
network = conv_2d(network, 64, 3, activation='relu')
network = max_pool_2d(network, 2, strides=2)
network = conv_2d(network, 128, 3, activation='relu')
network = conv_2d(network, 128, 3, activation='relu')
network = max_pool_2d(network, 2, strides=2)
network = conv_2d(network, 256, 3, activation='relu')
network = conv_2d(network, 256, 3, activation='relu')
network = conv_2d(network, 256, 3, activation='relu')
network = max_pool_2d(network, 2, strides=2)
network = conv_2d(network, 512, 3, activation='relu')
network = conv_2d(network, 512, 3, activation='relu')
network = conv_2d(network, 512, 3, activation='relu')
network = max_pool_2d(network, 2, strides=2)
network = conv_2d(network, 512, 3, activation='relu')
network = conv_2d(network, 512, 3, activation='relu')
network = conv_2d(network, 512, 3, activation='relu')
network = max_pool_2d(network, 2, strides=2)
network = fully_connected(network, 4096, activation='relu')
network = dropout(network, 0.5)
network = fully_connected(network, 4096, activation='relu')
network = dropout(network, 0.5)
network = fully_connected(network, 17, activation='softmax')
network = regression(network, optimizer='rmsprop',
                     loss='categorical_crossentropy', learning_rate=0.001)
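Training this network with TFlearn then takes only a few more lines; the sketch below follows TFlearn's VGG example and assumes the Oxford Flowers-17 dataset (17 classes, matching the softmax above). The checkpoint path and run_id are placeholders.

import tflearn
import tflearn.datasets.oxflower17 as oxflower17

# Oxford Flowers-17: 17 classes, images resized to 224x224 by the loader
X, Y = oxflower17.load_data(one_hot=True)

model = tflearn.DNN(network, checkpoint_path='model_vgg',
                    max_checkpoints=1, tensorboard_verbose=0)
model.fit(X, Y, n_epoch=500, shuffle=True, show_metric=True,
          batch_size=32, snapshot_epoch=False, run_id='vgg_oxflowers17')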
Deep Residual Network
With the network depth increasing, accuracy gets saturated
(which might be unsurprising) and then degrades rapidly;
such degradation is not caused by overfitting.
Deep Residual Network
F(x):=H(x)-x
import tensorflow as tf
import tflearn
from tflearn.layers.conv import conv_2d

def residual_block(incoming, nb_blocks, out_channels, downsample=False,
                   downsample_strides=2, activation='relu', batch_norm=True,
                   bias=True, weights_init='variance_scaling',
                   bias_init='zeros', regularizer='L2', weight_decay=0.0001,
                   trainable=True, restore=True, reuse=False, scope=None,
                   name="ResidualBlock"):
    resnet = incoming
    in_channels = incoming.get_shape().as_list()[-1]
    with tf.variable_op_scope([incoming], scope, name, reuse=reuse) as scope:
        name = scope.name
        for i in range(nb_blocks):
            identity = resnet
            if not downsample:
                downsample_strides = 1
            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)
            resnet = conv_2d(resnet, out_channels, 3,
                             downsample_strides, 'same', 'linear', bias, weights_init,
                             bias_init, regularizer, weight_decay, trainable, restore)
            if batch_norm:
                resnet = tflearn.batch_normalization(resnet)
            resnet = tflearn.activation(resnet, activation)
            resnet = conv_2d(resnet, out_channels, 3, 1, 'same',
                             'linear', bias, weights_init, bias_init, regularizer,
                             weight_decay, trainable, restore)
            # Downsampling
            if downsample_strides > 1:
                identity = tflearn.avg_pool_2d(identity, 1, downsample_strides)
            # Projection to new dimension (zero-padding the channel axis)
            if in_channels != out_channels:
                ch = (out_channels - in_channels) // 2
                identity = tf.pad(identity, [[0, 0], [0, 0], [0, 0], [ch, ch]])
                in_channels = out_channels
            resnet = resnet + identity
    return resnet
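A usage sketch: stacking residual_block into a small CIFAR-10-style residual network, following TFlearn's residual network example; the depth n and the hyper-parameters are illustrative.

import tflearn
from tflearn.layers.core import input_data, fully_connected
from tflearn.layers.conv import conv_2d
from tflearn.layers.estimator import regression

n = 3  # 6n+2 weighted layers in total
net = input_data(shape=[None, 32, 32, 3])
net = conv_2d(net, 16, 3, regularizer='L2', weight_decay=0.0001)
net = residual_block(net, n, 16)
net = residual_block(net, 1, 32, downsample=True)   # halve spatial size, double channels
net = residual_block(net, n - 1, 32)
net = residual_block(net, 1, 64, downsample=True)
net = residual_block(net, n - 1, 64)
net = tflearn.batch_normalization(net)
net = tflearn.activation(net, 'relu')
net = tflearn.global_avg_pool(net)
net = fully_connected(net, 10, activation='softmax')
net = regression(net, optimizer='momentum',
                 loss='categorical_crossentropy', learning_rate=0.1)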
Deep Residual Network
Tools For TensorFlow
Keras, TFlearn, TF-slim, Learn, TensorLayer
Next: Applications
Deep CV Applications
1.Image Classification
2.Neural Style
3.Txt2Img, img2txt
Image Classification
1.Training from scratch
2.Retrain from pre-trained model
3.Load a pre-trained model, freeze some layers’
weights, and retrain the other layers
Training from scratch
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# FLAGS.data_dir: path to the MNIST data (defined elsewhere, e.g. via argparse)
mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
sess = tf.InteractiveSession()
tf.initialize_all_variables().run()
for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy,
               feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
[Diagram: TRAIN DATA → MODEL (randomly initialized params); evaluated on TEST DATA / TEST LABELS]
Retrain from pre-trained model
[Diagram: TRAIN DATA → MODEL (pre-trained params); evaluated on TEST DATA / TEST LABELS]
1. Load pre-trained parameters from a model file (.pb)
2. Change your model to the new task (new class number)
3. Fit the train data to update all the weights
4. Fit your test data for inference
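A hedged sketch of this workflow in plain TensorFlow. It restores a checkpoint rather than the .pb file mentioned above, because a checkpoint keeps the restored weights as trainable variables; build_backbone, the 'backbone' variable scope, the 2048-dimensional feature size and NUM_NEW_CLASSES are placeholders, not APIs from the slides.

import tensorflow as tf

NUM_NEW_CLASSES = 5   # hypothetical class count for the new task

images = tf.placeholder(tf.float32, [None, 299, 299, 3])
labels = tf.placeholder(tf.float32, [None, NUM_NEW_CLASSES])

# build_backbone is a placeholder that re-creates the pre-trained network's graph
# under the 'backbone' variable scope and returns a [batch, 2048] feature tensor.
features = build_backbone(images)

W = tf.Variable(tf.truncated_normal([2048, NUM_NEW_CLASSES], stddev=0.01))
b = tf.Variable(tf.zeros([NUM_NEW_CLASSES]))
logits = tf.matmul(features, W) + b   # new head sized for the new class number

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss)  # updates all weights

# Restore only the backbone variables from the pre-trained checkpoint,
# then initialize the new head and train on the new data.
backbone_saver = tf.train.Saver(var_list=tf.get_collection(
    tf.GraphKeys.TRAINABLE_VARIABLES, scope='backbone'))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    backbone_saver.restore(sess, 'pretrained_model.ckpt')
    # ... feed batches of (images, labels) into train_step, then run logits on test data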
Retrain from pre-trained model (frozen)
[Diagram: TRAIN DATA → MODEL (pre-trained params, lower layers FROZEN); evaluated on TEST DATA / TEST LABELS]
1. Load pre-trained parameters from a model file (.pb)
2. Freeze some layers’ weights
3. Change the network’s class number
4. Fit your train data to update the unfrozen layers’ weights
5. Fit your test data for inference
https://github.com/spark-mler/WorkWithTensorflow/tree/master/cv_bot/models/pretrain_inference
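A minimal sketch of the frozen variant in plain TensorFlow: import the serialized GraphDef (.pb), use it as a fixed feature extractor, and train only a new softmax head. The file name, the tensor name 'pool_3/_reshape:0' and the 2048-dimensional bottleneck follow the common Inception classify_image graph and are assumptions; the repository linked above may organize this differently.

import tensorflow as tf

NUM_NEW_CLASSES = 5   # hypothetical class count for the new task

# Load the serialized GraphDef; its weights become constants, i.e. frozen.
with tf.gfile.FastGFile('classify_image_graph_def.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
bottleneck, = tf.import_graph_def(graph_def,
                                  return_elements=['pool_3/_reshape:0'])

y_ = tf.placeholder(tf.float32, [None, NUM_NEW_CLASSES])
W = tf.Variable(tf.truncated_normal([2048, NUM_NEW_CLASSES], stddev=0.001))
b = tf.Variable(tf.zeros([NUM_NEW_CLASSES]))
logits = tf.matmul(bottleneck, W) + b   # new head on top of the frozen features

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_))
# The imported graph contributes no trainable variables, so minimize()
# only updates W and b, the unfrozen part of the network.
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)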
Neural Style
1.MRF-Based
2.CNN-Based
3.MRF and CNN-Based
4.Fast Neural Style
MRF-Based
Freeman W T, Liu C. Markov random fields for super-resolution and texture synthesis[J].
Advances in Markov Random Fields for Vision and Image Processing, 2011, 1: 155-165.
Efros A A, Leung T K. Texture synthesis by non-parametric sampling[C]//Computer Vision, 1999.
The Proceedings of the Seventh IEEE International Conference on. IEEE, 1999, 2: 1033-1038.
CNN-Based
Gatys L A, Ecker A S, Bethge M. A neural algorithm of artistic style[J]. arXiv preprint
arXiv:1508.06576, 2015.
https://github.com/anishathalye/neural-style
https://github.com/jcjohnson/neural-style
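The core of the Gatys et al. method is a content loss on deeper VGG activations plus a style loss built from Gram matrices of feature maps. A minimal sketch of the style part, assuming the feature tensors come from a fixed pre-trained VGG and have static shapes; equal layer weights are a simplification.

import tensorflow as tf

def gram_matrix(features):
    # features: [1, height, width, channels] activations from one VGG layer
    shape = features.get_shape().as_list()
    h, w, c = shape[1], shape[2], shape[3]
    flat = tf.reshape(features, [h * w, c])          # flatten spatial dimensions
    gram = tf.matmul(flat, flat, transpose_a=True)   # (c, c) feature correlations
    return gram / (h * w * c)

def style_loss(style_features, generated_features):
    # Both arguments are lists of activations from the same VGG layers
    # (e.g. conv1_1 ... conv5_1); equal layer weights are assumed here.
    losses = [tf.reduce_mean(tf.square(gram_matrix(s) - gram_matrix(g)))
              for s, g in zip(style_features, generated_features)]
    return tf.add_n(losses) / len(losses)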
MRF And CNN-Based
Li C, Wand M. Combining Markov Random Fields and Convolutional Neural Networks for
Image Synthesis[J]. arXiv preprint arXiv:1601.04589, 2016
https://github.com/chuanli11/CNNMRF
MRF And CNN-Based
Fast Neural Style
Johnson J, Alahi A, Fei-Fei L. Perceptual losses for real-time style transfer and
super-resolution[J]. arXiv preprint arXiv:1603.08155, 2016.
https://github.com/burness/neural_style_tensorflow/tree/master/fast_neural_style
Fast Neural Style
TextImage
1.Text-to-Image
2.Image-to-Text
Text-to-Image
Reed S, Akata Z, Yan X, et al. Generative adversarial text to image synthesis[J].
arXiv preprint arXiv:1605.05396, 2016.
https://github.com/paarthneekhara/text-to-image
Generative Adversarial Network
A generator G and a discriminator D compete in a two-player minimax game:
G: fool the discriminator
D: distinguish real training data from synthetic images
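Formally, this is the standard GAN minimax objective; in the text-to-image setting of Reed et al., both G and D are additionally conditioned on a text embedding φ(t):

min_G max_D V(D, G) = E_{x∼p_data}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(G(z)))]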
Text-to-Image
Generative Adversarial Network
G: ℝ^Z × ℝ^T → ℝ^D
D: ℝ^T × ℝ^D → {0, 1}
(Z: noise dimension, T: text embedding dimension, D: image dimension)
Text-to-Image
Results
the flower has yellow petals and the center of it is brown
the flower shown has yellow anther red pistil and bright red petals
this flower has petals that are yellow, white and purple and has dark lines
the petals on this flower are white with a yellow center
this flower has a lot of small round pink petals
this flower is orange in color, and has petals that are ruffled and rounded
Image-to-Text
Vinyals O, Toshev A, Bengio S, et al. Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning
Challenge[J]. 2016.
https://github.com/tensorflow/models/tree/master/im2txt
https://github.com/tensorflow/models/issues/480
https://github.com/tensorflow/models/pull/485/commits/c6a4f783080c5310ce0e3244daa31af57df12def
Image-to-Text
Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing
internal covariate shift[J]. arXiv preprint arXiv:1502.03167, 2015.
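For reference, batch normalization standardizes each activation with mini-batch statistics and then applies a learned scale and shift:

x̂_i = (x_i − μ_B) / √(σ_B² + ε),  y_i = γ·x̂_i + β

where μ_B and σ_B² are the mean and variance over the mini-batch, ε is a small constant for numerical stability, and γ, β are learned parameters.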
Distributed Deep Learning In UCloud
UCloud Multi Node Weight Update
[Architecture and workflow diagrams for distributed training on UCloud; see the speaker notes for slides #38 to #41]
Editor's Notes

  • #4: 1.LeNet 2.AlexNet 3.GoogLeNet 4.VGG 5.Deep Residual Network
  • #5: Input: 32*32. C1: 5*5*6 (5*5 kernels, 6 channels) → 28*28*6. S2: subsampling, max-pool 2*2 → 14*14*6. C3: 5*5*16 (5*5 kernels, 16 channels) → 10*10*16. S4: subsampling, max-pool 2*2 → 5*5*16. C5: 5*5*120 (5*5 kernels, 120 channels) → 1*1*120. F6: 120*84. Output: 84*10.
  • #6: (Same note as #5.)
  • #7: First convolutional layer: the input image is 227*227*3 (the paper seems slightly off here, giving 224*224*3); 96 kernels (96, 11, 11, 3) slide with a stride of 4 pixels, producing 55*55 response maps; after response normalization (actually Local Response Normalization, discussed later) and pooling (the pooling in Caffe's AlexNet seems to differ a bit from the paper; AlexNet used two GPUs, which is why the first convolutional layer appears split into two parts in the figure), with pool_size=(3,3) and a stride of 2 pixels, we get 96 feature maps of 27*27. The second convolutional layer uses 256 kernels (again split across two GPUs, 128 kernels of 5*5*48 each), with pad_size=(2,2) and a stride of 1 pixel (thanks to a reader for pointing this out), producing 27*27 responses; after LRN and pooling with a 3*3 window and a stride of 2 pixels, we get 256 feature maps of 13*13. The third and fourth layers have neither LRN nor pooling, and the fifth layer has only pooling: the third layer uses 384 kernels (3*3*256, pad_size=(1,1) giving 256*15*15; kernel size (3,3), stride 1 pixel, giving 384*13*13); the fourth layer uses 384 kernels (pad_size=(1,1) giving 256*15*15, kernel size (3,3), stride 1 pixel, giving 384*13*13); the fifth layer uses 256 kernels (pad_size=(1,1) giving 384*15*15, kernel_size (3,3) giving 256*13*13, pool_size=(3,3) with a stride of 2 pixels giving 256*6*6). Fully connected layers: the first two have 4096 neurons each, and the final softmax outputs 1000 classes (ImageNet); note that in the Caffe graph the fully connected layers include relu, dropout and innerProduct.
  • #8: Number of iterations needed by a four-layer convolutional network on CIFAR-10 to reach 25% training error with tanh vs. ReLU. With ReLU f(x)=max(0,x), the values after the activation no longer lie in a bounded range as with tanh or sigmoid, so a normalization step is usually applied after ReLU; LRN is the method proposed in the paper (I am not certain it was first proposed there), inspired by the neuroscience concept of "lateral inhibition", i.e. the effect an active neuron has on its neighbours. Response normalization reduces the top-1 and top-5 error rates by 1.4% and 1.2%.
  • #9: (Same note as #8.)
  • #11: The Inception module was proposed mainly because convolution kernels of different sizes can capture the information of different clusters in the image; for computational convenience the paper uses 1*1, 3*3 and 5*5 kernels, plus a 3*3 max-pooling branch. There is, however, a big computational pitfall: the number of output filters of each Inception module is the sum of the filters of all its branches, so after several layers the model becomes huge, and the naive Inception depends heavily on computational resources.
  • #12: Same as #11, with the addition: as mentioned earlier for the Network-in-Network model, 1*1 convolutions can reduce dimensionality effectively (expressing as much information as possible with fewer filters), so the paper proposes the "Inception module with dimension reduction", which reduces the number of filters, and hence the model complexity, without losing the model's representational power.
  • #14: Bug: downloading the graph PNG with Chrome produces a blank image; switching to Firefox works.
  • #15: VGGNet is the work of Oxford's Visual Geometry Group for ILSVRC 2014. Its main contribution is showing that increasing network depth affects the final performance to some extent; the paper improves performance by gradually increasing depth. It may look a bit brute-force, without many clever tricks, but it really works, and many pre-trained pipelines use VGG models (mainly VGG-16 and VGG-19). Compared with other methods VGG has a very large parameter space: the final model is over 500 MB, while AlexNet is about 200 MB and GoogLeNet even smaller, so training a VGG model usually takes much longer; fortunately public pre-trained models make it very convenient to use.
  • #16: , and adding more layers to a suitably deep model leads to higher training error, as reported in [11, 42] and thoroughly verified by our experiments. Fig. 1 shows a typical example. The existence of this constructed solution indicates that a deeper model should produce no higher training error than its shallower counterpart. But experiments show that our current solvers on hand are unable to find solutions that
  • #17: Let the underlying mapping be H(x); the stacked nonlinear layers are made to fit F(x) := H(x) − x, and optimizing this residual is easier than optimizing H(x) directly. F(x) + x is easily implemented with "shortcut connections".
  • #18: Example network architectures for ImageNet. Left: the VGG-19 model [41] (19.6 billion FLOPs) as a reference. Middle: a plain network with 34 parameter layers (3.6 billion FLOPs). Right: a residual network with 34 parameter layers (3.6 billion FLOPs). The dotted shortcuts increase dimensions. Table 1 shows more details and other variants. About the left figure: TensorFlow has a bug here; when downsampling, the CPU version cannot run, but the GPU version is fine.
  • #19: TFlearn and TensorFlow. Keras supports both Theano and TensorFlow as backends, which makes reading its underlying source code troublesome. TensorLayer is written by Chinese developers; its code architecture looks a bit messy and it has relatively few followers. Advantages of TFlearn: 1. it lets researchers and developers configure deep network models quickly; 2. it helps beginners learn TensorFlow code, since the TF API is large and complex, and reading the TFlearn source is a good way to learn the everyday TF APIs. TF-slim and Learn are in the official contrib source, but I do not find them pleasant to use.
  • #21: There are usually three ways to do image classification: 1. train some network architecture from scratch for image recognition; 2. retrain a pre-trained model: change the classes, then fit the data to train the model; 3. also using a pre-trained model, load the parameters of all layers, freeze some of them, and fit the data to train the model.
  • #22: (Same note as #21.)
  • #23: (Same note as #21.)
  • #24: Same as #21, with the addition: update the unfrozen layers, e.g. the final softmax layer.
  • #25: Neural style is a classic topic in image synthesis. The earliest image synthesis was mainly based on MRF methods, then CNN-based methods, also MRF+CNN, and then fast neural style.
  • #26: MRF is a classic method for image segmentation and image synthesis. The intuitive meaning of a Markov random field is that x_s is only influenced by its surrounding points x_r and is independent of all other points. In MRF-based texture synthesis, for the neighborhood of the pixel currently being synthesized (or the boundary of the current texture patch), all pixels (or patches) in the sample image are searched to find the one with the best matching neighborhood (or boundary), which is then copied into the result as the best approximation for the current pixel (or patch). The MRF model assumes textures have local statistical properties, i.e. any part of a texture is fully determined by its surroundings (its neighborhood), which is a fairly objective view of texture. Markov models capture the higher-order statistics of texture well; the problem is they only consider local influences and cannot do much at the global, semantic level of the image.
  • #27: Using a pre-trained model, the key is how to define the content loss and the style loss. Content Reconstruction: the lower part of the figure shows content reconstruction at CNN layers a, b, c, d, e; note that the part labelled Content Representations at the start is not the original image (think of it as the image as seen by the computer, e.g. by a classifier, so visualizing it may not reveal the content at all) but the image data after passing through the pre-trained VGG network model, which is mainly an object recognition model and is used here to generate the image's content representations. Once this is understood, the rest is easier: content is reconstructed through the five convolutional layers; the authors found that content reconstruction works well in the first three layers, while layers d and e lose some detail and keep more high-level information. Style Reconstruction: reconstructing style is more complicated, since style is hard to model. The style representation is generated like the content representation, also by the VGG network, but a, b, c, d, e are handled differently: style reconstruction is computed on different subsets of the CNN layers, namely conv1_1 (a), [conv1_1, conv2_1] (b), [conv1_1, conv2_1, conv3_1], [conv1_1, conv2_1, conv3_1, conv4_1], [conv1_1, conv2_1, conv3_1, conv4_1, conv5_1]. Reconstructing style this way matches the image's own style at multiple scales while ignoring the global arrangement of the scene.
  • #28: Problem with the MRF model: local statistics cannot model the global information of complex images. Combined MRF and CNN: We transfer the style of x_s into the layout of x_c by making the high-level neural encoding of x similar to x_c, but using local patches similar to those of x_s. The latter is the MRF prior that maintains the encoding of the style. 1. The content encoding is still based on the neural network, as in the original method; 2. the style is encoded with MRF-prior local patches, unlike the neural style method.
  • #29: (Same note as #28.)
  • #30: Fast neural style uses an image transformation network and a loss network. The image transformation network is a deep residual conv network that directly transforms the input (content) image into a stylized image; the loss network's parameters are fixed. The loss network has the same structure as in A Neural Algorithm of Artistic Style, but its parameters are not updated; it is only used to compute the content loss and style loss, the so-called perceptual loss. The authors' explanation is that a convolutional model pre-trained for image classification has already learned perceptual and semantic information (scene and semantics), so the whole loss network is only there to compute the content and style losses; unlike A Neural Algorithm of Artistic Style, what gets updated here are the parameters of the transformation network in front. So, over the whole architecture, the input image passes through the transform network to get the transformed image, the corresponding loss is computed, and the whole system minimizes this loss to update the transform network.
  • #31: Chicago\hoovertowernight\sh\sh2 * [composition_vii, la_muse, starry_night, the_wave]
  • #32: (Same note as #25.)
  • #33: A GAN trains a generator network and a discriminator network simultaneously. The generator takes a noise variable z and outputs a fake image G(z; θg); the discriminator takes an image x (real image or fake image) and outputs a binary confidence D(x; θd) indicating whether the input is a natural image or a forged one. Ideally, the discriminator D should judge as accurately as possible whether the input is a real image or some kind of fake, while the generator G should try its best to fool D into judging all of its generated fakes as real images.
  • #34: The generator network uses a noisy text-encoder vector (hybrid of a character-level convnet with a recurrent neural network) to generate the image, trying to fool the discriminator network; the discriminator takes real/fake inputs and learns to score the fakes.
  • #35: Trained on UCloud GPUs, 300+ epochs.
  • #36: NIC: Neural Image Caption. Bug: building with the source from GitHub, there is a tf.gfile error when preprocessing the images into TFRecord, fixed by using open(). Training takes too long overall; the results have not finished running yet.
  • #37: p(S_t | I, S_0, ..., S_{t-1}) can easily be modeled by a recurrent neural network, where θ are the parameters of our model, I is an image, and S its correct transcription. S represents any sentence and its length is unbounded, so an RNN (LSTM) is used; the image is represented with a CNN (with batch normalization), changing the model from the CVPR'15 version. Other details may require reading the source code, which is interesting; Li Jian's team at Ctrip previously shared work on this topic. The image CNN (batch normalization) model output is then fed into the LSTM for the NLP-side training.
  • #38: The command line calls the relevant APIs to drive the manage module, which starts the PS/worker nodes; they fetch the code and data from UFile, train the model, and upload the result to UFile when training is done.
  • #39: How to do deep learning on UCloud? 1. Collect the training data and put it in your bucket on UFile; 2. Write the code: A. mainly define build_model in the code; B. define how to run the model, run_model; 3. Test locally that it runs; 4. Upload the code and launch the online multi-node parallel job.
  • #40: (Same note as #39.)
  • #41: Define how to run the model.