Neural style transfer is the process of creating a new image by blending two images: a content image, such as a personal photograph, and a style image, typically an artwork whose look we like. The goal is to preserve as much of the photograph's content as possible while re-rendering it in the style of the artwork. As an experiential AI Development Company, Oodles AI elaborates on the process of neural style transfer using deep learning algorithms.
Step 1: Capture
We first need to figure out how to capture the content and style features of the two images so that we can combine them into a result that looks pleasing to the eye. Convolutional neural networks such as VGG-16 already capture these features, given that they can classify and recognize an extensive variety of images (millions of them) with very high accuracy. We simply need to look deeper into their layers and understand what they are doing.
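As a quick illustration of "looking deeper into the layers" (separate from the training code later in this article), the sketch below exposes a few intermediate layers of a pretrained VGG-16 using tf.keras; the layer names are chosen purely for demonstration and are not the ones the project itself uses.
import numpy as np
import tensorflow as tf

# A minimal sketch: expose a few intermediate layers of a pretrained VGG-16
# so we can inspect the features each one captures. Layer names follow
# tf.keras' VGG16 naming and are illustrative choices only.
vgg16 = tf.keras.applications.VGG16(weights='imagenet', include_top=False)
layer_names = ['block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv2']
feature_extractor = tf.keras.Model(
    inputs=vgg16.input,
    outputs=[vgg16.get_layer(name).output for name in layer_names])

# Any (batch, height, width, 3) image works; a random one stands in here.
image = np.random.uniform(0, 255, size=(1, 256, 256, 3)).astype('float32')
activations = feature_extractor.predict(
    tf.keras.applications.vgg16.preprocess_input(image))
for name, act in zip(layer_names, activations):
    print(name, act.shape)  # deeper layers -> smaller grids, more channels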
Step 2: Layer
While training on images, suppose we pick the first layer and start inspecting some of its units (neurons). Since we are only in the first layer, each unit responds to just a small patch of the image and captures rather low-level features.
Visualizing those units, it looks as if the first neuron is interested in diagonal lines, the third and fourth in vertical and diagonal lines, and the eighth clearly likes the color green. Notice that all of these are tiny patches of the image; the layer is capturing low-level features.
In the next layer, the neurons start to detect more: the second picks up thin vertical lines, the sixth and seventh start capturing round shapes, and the fourteenth is fixated on the color yellow.
A layer deeper, the units start detecting more interesting things: the sixth is activated most by round shapes that look like tires, the tenth is harder to interpret but prefers orange and round shapes, and the eleventh starts detecting people.
So the deeper we go, the more of the image each neuron responds to, and the higher-level the features become (the second neuron in the fifth layer is really into dogs), whereas the early layers capture only small, low-level parts of the image.
This gives great insight into what deep convolutional layers are learning and, coming back to style transfer, it shows us how to generate art from one image while keeping the content of the other.
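If you want to reproduce this kind of inspection yourself, a rough sketch is shown below: it runs an image through a pretrained VGG-16 (again via tf.keras for brevity) and plots a handful of channels from one intermediate layer as heat maps. The layer name and channel indices are arbitrary illustrative choices, and a real photograph would replace the random input.
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# Peek at what individual units respond to by plotting a few channels of one
# intermediate layer as heat maps.
vgg16 = tf.keras.applications.VGG16(weights='imagenet', include_top=False)
extractor = tf.keras.Model(vgg16.input, vgg16.get_layer('block3_conv1').output)

image = np.random.uniform(0, 255, size=(1, 224, 224, 3)).astype('float32')
acts = extractor.predict(tf.keras.applications.vgg16.preprocess_input(image))[0]

for i, channel in enumerate([0, 5, 13, 27]):
    plt.subplot(1, 4, i + 1)
    plt.imshow(acts[:, :, channel], cmap='viridis')
    plt.axis('off')
plt.show()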
Step 3: Implement
We simply need to generate a new image that, when fed to the neural network as input, produces roughly the same activation values as the content image (the photograph) and the style image (the artwork).
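To make the idea concrete before the full training code, here is a minimal, self-contained sketch of the two losses involved. The gram_matrix helper mirrors the Gram-matrix computation used in Optimize.py below; the toy activation arrays and the loss weights are placeholders, not values taken from the project.
import numpy as np

def gram_matrix(features):
    # features: (positions, channels) activations from one layer; the Gram
    # matrix records which channels co-activate, which is what we call "style".
    return features.T @ features / features.size

# Toy activations standing in for real VGG feature maps (assumption: a 16x16
# spatial grid with 64 channels, flattened to shape (256, 64)).
content_act = np.random.rand(256, 64)
style_act = np.random.rand(256, 64)
generated_act = np.random.rand(256, 64)

# Content loss: the generated image should reproduce the photo's activations.
content_loss = np.sum((generated_act - content_act) ** 2) / content_act.size

# Style loss: the generated image's Gram matrix should match the artwork's.
style_gram = gram_matrix(style_act)
style_loss = np.sum((gram_matrix(generated_act) - style_gram) ** 2) / style_gram.size

# Weighted sum, as in the training script below (weights here are illustrative).
total_loss = 7.5 * content_loss + 100.0 * style_loss
print(content_loss, style_loss, total_loss)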
Optimize.py
from __future__ import print_function
import functools
import vgg, pdb, time
import tensorflow as tf, numpy as np, os
import transform
from utils import get_img
STYLE_LAYERS = ('relu1_1', 'relu2_1', 'relu3_1', 'relu4_1', 'relu5_1')
CONTENT_LAYER = 'relu4_2'
DEVICES = 'CUDA_VISIBLE_DEVICES'
# np arr, np arr
def optimize(content_targets, style_target, content_weight, style_weight,
             tv_weight, vgg_path, epochs=2, print_iterations=1000,
             batch_size=4, save_path='saver/fns.ckpt', slow=False,
             learning_rate=1e-3, debug=False):
    if slow:
        batch_size = 1
    mod = len(content_targets) % batch_size
    if mod > 0:
        print("Train set has been trimmed slightly..")
        content_targets = content_targets[:-mod]

    style_features = {}

    batch_shape = (batch_size, 256, 256, 3)
    style_shape = (1,) + style_target.shape
    print(style_shape)

    # precompute style features
    with tf.Graph().as_default(), tf.device('/cpu:0'), tf.Session() as sess:
        style_image = tf.placeholder(tf.float32, shape=style_shape, name='style_image')
        style_image_pre = vgg.preprocess(style_image)
        net = vgg.net(vgg_path, style_image_pre)
        style_pre = np.array([style_target])
        for layer in STYLE_LAYERS:
            features = net[layer].eval(feed_dict={style_image: style_pre})
            features = np.reshape(features, (-1, features.shape[3]))
            gram = np.matmul(features.T, features) / features.size
            style_features[layer] = gram

    with tf.Graph().as_default(), tf.Session() as sess:
        X_content = tf.placeholder(tf.float32, shape=batch_shape, name="X_content")
        X_pre = vgg.preprocess(X_content)

        # precompute content features
        content_features = {}
        content_net = vgg.net(vgg_path, X_pre)
        content_features[CONTENT_LAYER] = content_net[CONTENT_LAYER]

        if slow:
            preds = tf.Variable(
                tf.random_normal(X_content.get_shape()) * 0.256
            )
            preds_pre = preds
        else:
            preds = transform.net(X_content / 255.0)
            preds_pre = vgg.preprocess(preds)

        net = vgg.net(vgg_path, preds_pre)

        content_size = _tensor_size(content_features[CONTENT_LAYER]) * batch_size
        assert _tensor_size(content_features[CONTENT_LAYER]) == _tensor_size(net[CONTENT_LAYER])
        content_loss = content_weight * (2 * tf.nn.l2_loss(
            net[CONTENT_LAYER] - content_features[CONTENT_LAYER]) / content_size
        )

        style_losses = []
        for style_layer in STYLE_LAYERS:
            layer = net[style_layer]
            bs, height, width, filters = map(lambda i: i.value, layer.get_shape())
            size = height * width * filters
            feats = tf.reshape(layer, (bs, height * width, filters))
            feats_T = tf.transpose(feats, perm=[0, 2, 1])
            grams = tf.matmul(feats_T, feats) / size
            style_gram = style_features[style_layer]
            style_losses.append(2 * tf.nn.l2_loss(grams - style_gram) / style_gram.size)
        style_loss = style_weight * functools.reduce(tf.add, style_losses) / batch_size

        # total variation denoising
        tv_y_size = _tensor_size(preds[:, 1:, :, :])
        tv_x_size = _tensor_size(preds[:, :, 1:, :])
        y_tv = tf.nn.l2_loss(preds[:, 1:, :, :] - preds[:, :batch_shape[1]-1, :, :])
        x_tv = tf.nn.l2_loss(preds[:, :, 1:, :] - preds[:, :, :batch_shape[2]-1, :])
        tv_loss = tv_weight * 2 * (x_tv / tv_x_size + y_tv / tv_y_size) / batch_size

        # overall loss
        loss = content_loss + style_loss + tv_loss

        train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss)
        sess.run(tf.global_variables_initializer())
        import random
        uid = random.randint(1, 100)
        print("UID: %s" % uid)
        for epoch in range(epochs):
            num_examples = len(content_targets)
            iterations = 0
            while iterations * batch_size < num_examples:
                start_time = time.time()
                curr = iterations * batch_size
                step = curr + batch_size
                X_batch = np.zeros(batch_shape, dtype=np.float32)
                for j, img_p in enumerate(content_targets[curr:step]):
                    X_batch[j] = get_img(img_p, (256, 256, 3)).astype(np.float32)

                iterations += 1
                assert X_batch.shape[0] == batch_size

                feed_dict = {
                    X_content: X_batch
                }

                train_step.run(feed_dict=feed_dict)
                end_time = time.time()
                delta_time = end_time - start_time
                if debug:
                    print("UID: %s, batch time: %s" % (uid, delta_time))

                is_print_iter = int(iterations) % print_iterations == 0
                if slow:
                    is_print_iter = epoch % print_iterations == 0
                is_last = epoch == epochs - 1 and iterations * batch_size >= num_examples
                should_print = is_print_iter or is_last
                if should_print:
                    to_get = [style_loss, content_loss, tv_loss, loss, preds]
                    test_feed_dict = {
                        X_content: X_batch
                    }

                    tup = sess.run(to_get, feed_dict=test_feed_dict)
                    _style_loss, _content_loss, _tv_loss, _loss, _preds = tup
                    losses = (_style_loss, _content_loss, _tv_loss, _loss)
                    if slow:
                        _preds = vgg.unprocess(_preds)
                    else:
                        saver = tf.train.Saver()
                        res = saver.save(sess, save_path)
                    yield (_preds, losses, iterations, epoch)

def _tensor_size(tensor):
    from operator import mul
    return functools.reduce(mul, (d.value for d in tensor.get_shape()[1:]), 1)
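For context, optimize() is a generator: it periodically yields the current predictions and losses while training the feed-forward network. A hypothetical driver loop might look like the sketch below; the directory names, style image path, VGG weights path, and loss weights are placeholders rather than values prescribed by this article.
import os
from utils import get_img, list_files

# Hypothetical driver for the optimize() generator defined above.
train_dir = 'data/train'                          # folder of content photographs (placeholder)
content_imgs = [os.path.join(train_dir, f) for f in list_files(train_dir)]
style_img = get_img('styles/artwork.jpg')         # the artwork to learn (placeholder)
vgg_path = 'data/vgg_weights.mat'                 # pretrained VGG weights file (placeholder)

for preds, losses, iteration, epoch in optimize(content_imgs, style_img,
                                                content_weight=7.5,
                                                style_weight=100.0,
                                                tv_weight=200.0,
                                                vgg_path=vgg_path,
                                                epochs=2, batch_size=4):
    style_loss, content_loss, tv_loss, total_loss = losses
    print('epoch %d iter %d  total loss %s' % (epoch, iteration, total_loss))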
Transform.py
import tensorflow as tf, pdb
WEIGHTS_INIT_STDEV = .1

def net(image):
    conv1 = _conv_layer(image, 32, 9, 1)
    conv2 = _conv_layer(conv1, 64, 3, 2)
    conv3 = _conv_layer(conv2, 128, 3, 2)
    resid1 = _residual_block(conv3, 3)
    resid2 = _residual_block(resid1, 3)
    resid3 = _residual_block(resid2, 3)
    resid4 = _residual_block(resid3, 3)
    resid5 = _residual_block(resid4, 3)
    conv_t1 = _conv_tranpose_layer(resid5, 64, 3, 2)
    conv_t2 = _conv_tranpose_layer(conv_t1, 32, 3, 2)
    conv_t3 = _conv_layer(conv_t2, 3, 9, 1, relu=False)
    preds = tf.nn.tanh(conv_t3) * 150 + 255./2
    return preds

def _conv_layer(net, num_filters, filter_size, strides, relu=True):
    weights_init = _conv_init_vars(net, num_filters, filter_size)
    strides_shape = [1, strides, strides, 1]
    net = tf.nn.conv2d(net, weights_init, strides_shape, padding='SAME')
    net = _instance_norm(net)
    if relu:
        net = tf.nn.relu(net)
    return net

def _conv_tranpose_layer(net, num_filters, filter_size, strides):
    weights_init = _conv_init_vars(net, num_filters, filter_size, transpose=True)

    batch_size, rows, cols, in_channels = [i.value for i in net.get_shape()]
    new_rows, new_cols = int(rows * strides), int(cols * strides)
    # new_shape = #tf.pack([tf.shape(net)[0], new_rows, new_cols, num_filters])

    new_shape = [batch_size, new_rows, new_cols, num_filters]
    tf_shape = tf.stack(new_shape)
    strides_shape = [1, strides, strides, 1]

    net = tf.nn.conv2d_transpose(net, weights_init, tf_shape, strides_shape, padding='SAME')
    net = _instance_norm(net)
    return tf.nn.relu(net)

def _residual_block(net, filter_size=3):
    tmp = _conv_layer(net, 128, filter_size, 1)
    return net + _conv_layer(tmp, 128, filter_size, 1, relu=False)

def _instance_norm(net, train=True):
    batch, rows, cols, channels = [i.value for i in net.get_shape()]
    var_shape = [channels]
    mu, sigma_sq = tf.nn.moments(net, [1, 2], keep_dims=True)
    shift = tf.Variable(tf.zeros(var_shape))
    scale = tf.Variable(tf.ones(var_shape))
    epsilon = 1e-3
    normalized = (net - mu) / (sigma_sq + epsilon)**(.5)
    return scale * normalized + shift

def _conv_init_vars(net, out_channels, filter_size, transpose=False):
    _, rows, cols, in_channels = [i.value for i in net.get_shape()]
    if not transpose:
        weights_shape = [filter_size, filter_size, in_channels, out_channels]
    else:
        weights_shape = [filter_size, filter_size, out_channels, in_channels]
    weights_init = tf.Variable(tf.truncated_normal(weights_shape, stddev=WEIGHTS_INIT_STDEV, seed=1), dtype=tf.float32)
    return weights_init
Utils.py
import scipy.misc, numpy as np, os, sys

def save_img(out_path, img):
    img = np.clip(img, 0, 255).astype(np.uint8)
    scipy.misc.imsave(out_path, img)

def scale_img(style_path, style_scale):
    scale = float(style_scale)
    o0, o1, o2 = scipy.misc.imread(style_path, mode='RGB').shape
    new_shape = (int(o0 * scale), int(o1 * scale), o2)
    style_target = get_img(style_path, img_size=new_shape)
    return style_target

def get_img(src, img_size=False):
    img = scipy.misc.imread(src, mode='RGB') # misc.imresize(, (256, 256, 3))
    if not (len(img.shape) == 3 and img.shape[2] == 3):
        img = np.dstack((img, img, img))
    if img_size != False:
        img = scipy.misc.imresize(img, img_size)
    return img

def exists(p, msg):
    assert os.path.exists(p), msg

def list_files(in_path):
    files = []
    for (dirpath, dirnames, filenames) in os.walk(in_path):
        files.extend(filenames)
        break
    return files
Evaluate.py
from __future__ import print_function
import sys
sys.path.insert(0, 'src')
import transform, numpy as np, os
import scipy.misc
import tensorflow as tf
from utils import save_img, get_img, exists, list_files
from argparse import ArgumentParser
from collections import defaultdict
import time
import json
import subprocess
from moviepy.video.io.VideoFileClip import VideoFileClip
import moviepy.video.io.ffmpeg_writer as ffmpeg_writer
BATCH_SIZE = 4
DEVICE = '/gpu:0'
# get img_shape
def ffwd(data_in, paths_out, checkpoint_dir, device_t='/gpu:0', batch_size=4):
    assert len(paths_out) > 0
    is_paths = type(data_in[0]) == str
    if is_paths:
        assert len(data_in) == len(paths_out)
        img_shape = get_img(data_in[0]).shape
    else:
        assert data_in.shape[0] == len(paths_out)
        img_shape = data_in[0].shape

    g = tf.Graph()
    batch_size = min(len(paths_out), batch_size)
    curr_num = 0
    soft_config = tf.ConfigProto(allow_soft_placement=True)
    soft_config.gpu_options.allow_growth = True
    with g.as_default(), g.device(device_t), \
            tf.Session(config=soft_config) as sess:
        batch_shape = (batch_size,) + img_shape
        img_placeholder = tf.placeholder(tf.float32, shape=batch_shape,
                                         name='img_placeholder')

        preds = transform.net(img_placeholder)
        saver = tf.train.Saver()
        if os.path.isdir(checkpoint_dir):
            ckpt = tf.train.get_checkpoint_state(checkpoint_dir)
            if ckpt and ckpt.model_checkpoint_path:
                saver.restore(sess, ckpt.model_checkpoint_path)
            else:
                raise Exception("No checkpoint found...")
        else:
            saver.restore(sess, checkpoint_dir)

        num_iters = int(len(paths_out) / batch_size)
        for i in range(num_iters):
            pos = i * batch_size
            curr_batch_out = paths_out[pos:pos+batch_size]
            if is_paths:
                curr_batch_in = data_in[pos:pos+batch_size]
                X = np.zeros(batch_shape, dtype=np.float32)
                for j, path_in in enumerate(curr_batch_in):
                    img = get_img(path_in)
                    assert img.shape == img_shape, \
                        'Images have different dimensions. ' + \
                        'Resize images or use --allow-different-dimensions.'
                    X[j] = img
            else:
                X = data_in[pos:pos+batch_size]

            _preds = sess.run(preds, feed_dict={img_placeholder: X})
            for j, path_out in enumerate(curr_batch_out):
                save_img(path_out, _preds[j])

    remaining_in = data_in[num_iters*batch_size:]
    remaining_out = paths_out[num_iters*batch_size:]
    if len(remaining_in) > 0:
        ffwd(remaining_in, remaining_out, checkpoint_dir,
             device_t=device_t, batch_size=1)
def ffwd_to_img(in_path, out_path, checkpoint_dir, device='/cpu:0'):
    paths_in, paths_out = [in_path], [out_path]
    ffwd(paths_in, paths_out, checkpoint_dir, batch_size=1, device_t=device)

def ffwd_different_dimensions(in_path, out_path, checkpoint_dir,
                              device_t=DEVICE, batch_size=4):
    in_path_of_shape = defaultdict(list)
    out_path_of_shape = defaultdict(list)
    for i in range(len(in_path)):
        in_image = in_path[i]
        out_image = out_path[i]
        shape = "%dx%dx%d" % get_img(in_image).shape
        in_path_of_shape[shape].append(in_image)
        out_path_of_shape[shape].append(out_image)
    for shape in in_path_of_shape:
        print('Processing images of shape %s' % shape)
        ffwd(in_path_of_shape[shape], out_path_of_shape[shape],
             checkpoint_dir, device_t, batch_size)
def build_parser():
    parser = ArgumentParser()
    parser.add_argument('--checkpoint', type=str,
                        dest='checkpoint_dir',
                        help='dir or .ckpt file to load checkpoint from',
                        metavar='CHECKPOINT', required=True)

    parser.add_argument('--in-path', type=str,
                        dest='in_path', help='dir or file to transform',
                        metavar='IN_PATH', required=True)

    help_out = 'destination (dir or file) of transformed file or files'
    parser.add_argument('--out-path', type=str,
                        dest='out_path', help=help_out, metavar='OUT_PATH',
                        required=True)

    parser.add_argument('--device', type=str,
                        dest='device', help='device to perform compute on',
                        metavar='DEVICE', default=DEVICE)

    parser.add_argument('--batch-size', type=int,
                        dest='batch_size', help='batch size for feedforwarding',
                        metavar='BATCH_SIZE', default=BATCH_SIZE)

    parser.add_argument('--allow-different-dimensions', action='store_true',
                        dest='allow_different_dimensions',
                        help='allow different image dimensions')

    return parser

def check_opts(opts):
    exists(opts.checkpoint_dir, 'Checkpoint not found!')
    exists(opts.in_path, 'In path not found!')
    if os.path.isdir(opts.out_path):
        exists(opts.out_path, 'out dir not found!')
        assert opts.batch_size > 0
def main():
    parser = build_parser()
    opts = parser.parse_args()
    check_opts(opts)

    if not os.path.isdir(opts.in_path):
        if os.path.exists(opts.out_path) and os.path.isdir(opts.out_path):
            out_path = \
                os.path.join(opts.out_path, os.path.basename(opts.in_path))
        else:
            out_path = opts.out_path

        ffwd_to_img(opts.in_path, out_path, opts.checkpoint_dir,
                    device=opts.device)
    else:
        files = list_files(opts.in_path)
        full_in = [os.path.join(opts.in_path, x) for x in files]
        full_out = [os.path.join(opts.out_path, x) for x in files]
        if opts.allow_different_dimensions:
            ffwd_different_dimensions(full_in, full_out, opts.checkpoint_dir,
                                      device_t=opts.device, batch_size=opts.batch_size)
        else:
            ffwd(full_in, full_out, opts.checkpoint_dir, device_t=opts.device,
                 batch_size=opts.batch_size)

if __name__ == '__main__':
    main()
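Once a style network has been trained and its checkpoint saved, the feed-forward script above can be run from the command line using the flags defined in build_parser. Assuming the script is saved as evaluate.py, invocations might look like the following; all paths are placeholders.
python evaluate.py --checkpoint ckpt/style.ckpt --in-path input/photo.jpg --out-path output/stylized.jpg
python evaluate.py --checkpoint ckpt/style.ckpt --in-path input_dir/ --out-path output_dir/ --allow-different-dimensions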
We, at Oodles, are a team of seasoned AI developers and data analysts building dynamic AI solutions for businesses and enterprises. Under machine learning and deep learning, our expertise spans predictive analytics, document analysis, sentiment analysis, image processing, and object detection models.