Parts of this post have been revised and elaborated for better understanding; however, I acknowledge that the following post is based on the TensorFlow tutorial provided in:
For more detailed explanations and background knowledge regarding the code and dataset, you can always consult the link.
I. Introduction
We all know what images are, so we can skip the “what are — ?” part in this post.
That leaves the most important question.
Why do we want to import image files?
Before answering that question, I have a more fundamental one: what is the final goal of AI? Our future AI would think like a human being, or imitate the mechanism of the human mind. For example, AlphaGo showed the public where our AI projects are heading. The game of Go demands the maximum of human intelligence: intuition, strategic thinking, and learning ability, not to mention calculating skill.
Many of you will have heard the story: the Go summit with Lee Sedol, the single victory of the human player, and so on. The victories of AlphaGo proved that AI can do more than just imitate the human mind. The summit was more than 3 years ago, and AI has kept innovating since.
The easy part of training AlphaGo is that everyone knows the rules of Go, despite the game's complexity. (Actually, I have no clue how the game works, but the rules are known facts anyway.)
However, when it comes to images, it gets more complicated. We all see things. Humans, dogs, cats, and birds see and collect information from moving images every second, with every breath taken, except for blind people. (FYI, I searched and found out that ‘blind’ is a perfectly PC word.)
Nonetheless, can anyone explain how we see and process image data? That is still a challenging question for neuroscientists. Ironically, that explains the necessity of converting image data into processable data for TensorFlow. For AI to have a human-like mechanism, image recognition is a crucial and very basic ability. In the TensorFlow tutorials, you will see that ‘Images’ has its own section, which implies its high importance. So, how do we deal with images in machine learning?
To boost your interest, let’s see what Google can do with image data. Have you ever used image search in Google?
Let’s say you took a picture of a painting in a museum but forgot whose drawing it was, its title and everything.
All you have to do is to go to images.google.com and drop your image file in the search box. Then what you have is this:
Or, you would be surprised to see what Google can do at photos.google.com. It can search for a certain face or surroundings (parks, beaches, Seoul, etc.) and organize photos accordingly. The technology has achieved so much, and we need to understand the basics of the mechanism.
As you can imagine, image data analysis is a very long story. What we are going to do in this post is just load image data and convert it to a tf.data.Dataset for future procedures.
II. Technical Setup
from __future__ import absolute_import, division, print_function, unicode_literals

try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass

import tensorflow as tf
Regarding the try/except lines, we are sticking to the code from our previous posts, but the TensorFlow tutorial is still being updated, so we should stay tuned for its final edition.
So, for now, in our notebook,
tf.__version__
will retrieve ‘2.0.0-rc2’
AUTOTUNE = tf.data.experimental.AUTOTUNE
- AUTOTUNE is a new face to us. It lets the tf.data runtime tune a value (here, the level of parallelism) dynamically at runtime for better efficiency. However, at this level, let’s not spend too much time on it.
import IPython.display as display
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
- IPython.display will be used to display images later on.
- from PIL import Image : from PIL (short for Python Imaging Library, a.k.a. Pillow) we import Image to load, manage, and save image files in diverse formats.
III. Retrieve the Images
1 — Download image files
We will be using image data offered by Google.
import pathlib

data_dir = tf.keras.utils.get_file(
    origin='https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',
    fname='flower_photos', untar=True)
data_dir = pathlib.Path(data_dir)
- pathlib is a module that offers classes representing filesystem paths with semantics appropriate for different operating systems. (from docs.python.org)
- We will use tf.keras.utils.get_file to download a file named fname='flower_photos'.
Recap) train_file_path = tf.keras.utils.get_file("train.csv", TRAIN_DATA_URL) from the load CSV post.
- Now, we want to see the explicit filesystem path, using pathlib.Path. Then, data_dir will output PosixPath(‘/root/.keras/datasets/flower_photos’).
- Our flower_photos folder will be in the directory ‘.keras/datasets’.
FYI) .keras is a hidden directory; we don’t normally see it, but it does exist.
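As a quick sketch of pathlib semantics (using the data_dir we just created), note especially the / operator, which joins path components; we will use it again when building file patterns:
print(data_dir)            # PosixPath('/root/.keras/datasets/flower_photos')
print(data_dir / 'roses')  # the / operator joins path components
print(str(data_dir))       # plain string: '/root/.keras/datasets/flower_photos'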
2 — Let’s look deeper in our dataset
I downloaded the dataset directly to my laptop, as usual, to show you explicitly what is in it. (Not recommended.)
In the ‘flower_photos’ folder there are 5 folders, ‘daisy’, ‘roses’, ‘sunflowers’, ‘dandelion’, ‘tulips’, and one ‘LICENSE.txt’ file.
And each folder has around 600 to 800 photos of flowers that correspond to the folder name.
3 — Back to Tensorflow
Now forget about the folders. Up to 1 — , we have a copy of the flower photos. We want to know what the flower_photos directory looks like in the Colab notebook.
First, how many images do we have?
image_count = len(list(data_dir.glob('*/*.jpg')))
image_count
#output: 3670
- In ‘*/*.jpg’, each * is a wildcard that matches anything. So we are looking for .jpg files with any file name inside any sub-directory.
- data_dir.glob will collect whatever matches that pattern inside data_dir.
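To see the pattern in action, here is a minimal sketch that prints the first few matches (the exact paths on your machine may differ, and the order is not guaranteed):
for path in list(data_dir.glob('*/*.jpg'))[:3]:
  print(path)  # each is a PosixPath to one .jpg under data_dir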
Then, what sub-directories do we have? (spoiler: ‘daisy’, ‘roses’, ‘sunflowers’, ‘dandelion’, ‘tulips’, plus the ‘LICENSE.txt’ file)
print([item for item in data_dir.glob('*') if item.name != "LICENSE.txt"])
#output:
[PosixPath('/root/.keras/datasets/flower_photos/sunflowers'), PosixPath('/root/.keras/datasets/flower_photos/daisy'), PosixPath('/root/.keras/datasets/flower_photos/tulips'), PosixPath('/root/.keras/datasets/flower_photos/dandelion'), PosixPath('/root/.keras/datasets/flower_photos/roses')]
- != means not equal to; we are not interested in the text file.
- There are 5 ‘items’ in data_dir; the usual name for such an item is ‘sub-directory’.
- However, what we want to know is the class name, so we will use item.name rather than item, and we will convert the result to a NumPy array for future processing.
So, we have,
CLASS_NAMES = np.array([item.name for item in data_dir.glob('*') if item.name != "LICENSE.txt"])
CLASS_NAMES
array([‘sunflowers’, ‘daisy’, ‘tulips’, ‘dandelion’, ‘roses’], dtype=’<U10')
Now, we can conclude that the flower_photos directory contains 5 sub-directories, one sub-directory per class.
We can check how many photos each class has.
a=len(list(data_dir.glob('sunflowers/*.jpg'))) #a=699
b=len(list(data_dir.glob('daisy/*.jpg'))) #b=633
c=len(list(data_dir.glob('tulips/*.jpg'))) #c=799
d=len(list(data_dir.glob('dandelion/*.jpg'))) #d=898
e=len(list(data_dir.glob('roses/*.jpg'))) #e=641
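The same five counts can be collected in one loop over CLASS_NAMES, a compact sketch equivalent to the lines above:
for name in CLASS_NAMES:
  print(name, len(list(data_dir.glob(name + '/*.jpg'))))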
Or maybe we want to see some photos of roses.
roses = list(data_dir.glob('roses/*'))
A photo in the roses class will have a path that looks like PosixPath(‘/root/.keras/datasets/flower_photos/roses/3265902330_d8b1e44545.jpg’).
for image_path in roses[:3]:
  display.display(Image.open(str(image_path)))
You will be able to see 3 photos from the roses directory.
IV. Load using tf.data.Dataset
Do you remember that we used tf.data.Dataset.from_tensor_slices in the load NumPy data and pandas.DataFrame posts?
Recap) dataset = tf.data.Dataset.from_tensor_slices((df.values, target.values))
or train_dataset = tf.data.Dataset.from_tensor_slices((train_examples, train_labels))
Again, we will load the images using tf.data.Dataset.
Actually, there is another way to load images, keras.preprocessing; however, for efficiency reasons it is not highly recommended. Although it is omitted in this post, you can always visit the TensorFlow tutorial.
Before anything else, I’d like to define some variables that were used in the keras.preprocessing part and will continue to be used with tf.data.Dataset.
BATCH_SIZE = 32
IMG_HEIGHT = 224
IMG_WIDTH = 224
1 — Create a dataset of the file paths.
list_ds = tf.data.Dataset.list_files(str(data_dir/'*/*'))
To understand it, let’s see what ‘list_ds’ looks like.
for f in list_ds.take(1):
  print(f)
tf.Tensor(b’/root/.keras/datasets/flower_photos/sunflowers/14460075029_5cd715bb72_m.jpg’, shape=(), dtype=string)
What is list_ds?
Let’s go back to data_dir.
data_dir was PosixPath(‘/root/.keras/datasets/flower_photos’).
Then, list_ds:
list_ds = tf.data.Dataset.list_files(str(data_dir/'*/*'))
1. str(data_dir/'*/*') ‘stringlizes’ (data_dir / '*/*') into ‘/root/.keras/datasets/flower_photos/*/*’. The first asterisk is a blank for the class name (the sub-directory), and the second one is for the name of the photo.
2. tf.data.Dataset.list_files(...) saves those strings into a dataset of string tensors, one per matching file path. (Note that list_files shuffles the file paths by default.)
Now, we can understand the result:
tf.Tensor(b’/root/.keras/datasets/flower_photos/sunflowers/14460075029_5cd715bb72_m.jpg’, shape=(), dtype=string)
2 — Convert file path to (image, label) pair
To do that, we will define 3 functions.
i) Get labels from a file_path: get_label() function.
def get_label(file_path):
  parts = tf.strings.split(file_path, '/')
  return parts[-2] == CLASS_NAMES
tf.strings.split splits the file_path, taking ‘/’ as the delimiter. The pieces, i.e. the path components, are saved into ‘parts’.
In the case of ‘/root/.keras/datasets/flower_photos/sunflowers/14460075029_5cd715bb72_m.jpg’,
parts will hold [root, .keras, datasets, …, sunflowers, 14460075029_5cd715bb72_m.jpg].
Among the components of that list, we are interested in the label, i.e. the class directory ‘sunflowers’. That is why we index parts[-2].
Then the function returns True or False for each element of the CLASS_NAMES array, depending on whether parts[-2] coincides with it.
If the result of the get_label() function is [False False False True False], then, since CLASS_NAMES is array([‘sunflowers’, ‘daisy’, ‘tulips’, ‘dandelion’, ‘roses’]), we can conclude that our photo is in the dandelion class.
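To check our understanding, here is a minimal sketch (the file name below is hypothetical; get_label only parses the string, so the file does not even have to exist):
label = get_label('/root/.keras/datasets/flower_photos/dandelion/some_photo.jpg')  # hypothetical path
print(label.numpy())                  # [False False False  True False]
print(CLASS_NAMES[label.numpy()][0])  # 'dandelion', recovered via a NumPy boolean mask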
ii) Decode image: decode_img() function.
Decoding is a vague concept, but here it basically means converting the human version of an image into a computer version. Our aim with decode_img is to convert an image file into a grid of numbers according to its RGB content.
def decode_img(img):
  img = tf.image.decode_jpeg(img, channels=3)  # color images
  img = tf.image.convert_image_dtype(img, tf.float32)
  # convert uint8 tensor to floats in the [0, 1] range
  return tf.image.resize(img, [IMG_WIDTH, IMG_HEIGHT])
  # resize the image to 224×224
iii) Combine get_label() and decode_img() so that we can get an (image, label) pair for a given file_path: process_path() function.
def process_path(file_path):
  label = get_label(file_path)
  img = tf.io.read_file(file_path)
  img = decode_img(img)
  return img, label
There are two lines of code that we have not looked into yet.
def decode_img(img):
  img = tf.image.decode_jpeg(img, channels=3)
The line above converts the compressed string into a 3D uint8 tensor. This is needed because our process_path function uses tf.io.read_file(file_path), which reads and outputs the entire contents of the input ‘as a string’.
def process_path(file_path):
  img = tf.io.read_file(file_path)
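Before mapping process_path over the whole dataset, we can sanity-check it eagerly on one rose photo, a minimal sketch reusing the roses list from section III:
img, label = process_path(str(roses[0]))
print(img.shape)      # (224, 224, 3)
print(label.numpy())  # [False False False False  True]; 'roses' is the last entry of CLASS_NAMES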
3 — Create a dataset of (image, label) pairs.
We will use Dataset.map, with num_parallel_calls set so that multiple images are loaded and processed in parallel.
labeled_ds = list_ds.map(process_path, num_parallel_calls=AUTOTUNE)
Let’s check what is in labeled_ds.
for image, label in labeled_ds.take(1):
  print("Image shape: ", image.numpy().shape)
  print("Label: ", label.numpy())
Image shape: (224, 224, 3)
Label: [False False False True False]
- We resized the image in decode_img() to 224×224, and the 3 is for the RGB color channels.
- Label: [False False False True False]. Our example is in the dandelion class.
V. Prepare for training
For efficiency purposes, we will be using the tf.data API.
def prepare_for_training(ds, cache=True, shuffle_buffer_size=1000):
  if cache:
    if isinstance(cache, str):
      ds = ds.cache(cache)
    else:
      ds = ds.cache()
  ds = ds.shuffle(buffer_size=shuffle_buffer_size)
  ds = ds.repeat()  # repeat forever
  ds = ds.batch(BATCH_SIZE)
  ds = ds.prefetch(buffer_size=AUTOTUNE)
  return ds
- The isinstance(cache, str) call returns True if cache is of string type.
- cache saves time (we will check this later) because the data is kept in memory (or in a file, when cache is a string path) and does not need to be loaded and decoded more than once. (In the Colab environment, it is kept in the Colab runtime’s memory.)
- prefetch will prepare your next batch in the background while the model is training.
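One more thing before the next cell: show_batch is called below, but it was defined in the keras.preprocessing section of the tutorial that we skipped, so our notebook does not have it yet. Here is a version along the lines of the tutorial’s; it plots a 5×5 grid of images with their class names as titles:
def show_batch(image_batch, label_batch):
  plt.figure(figsize=(10, 10))
  for n in range(25):
    plt.subplot(5, 5, n+1)
    plt.imshow(image_batch[n])
    plt.title(CLASS_NAMES[label_batch[n] == 1][0].title())
    plt.axis('off')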
train_ds = prepare_for_training(labeled_ds)
image_batch, label_batch = next(iter(train_ds))
show_batch(image_batch.numpy(), label_batch.numpy())
VI. Performance
We will check performance by measuring time spent in loading images.
import time
default_timeit_steps = 1000

def timeit(ds, steps=default_timeit_steps):
  start = time.time()
  it = iter(ds)
  for i in range(steps):
    batch = next(it)
    if i%10 == 0:
      print('.', end='')
  print()
  end = time.time()
  duration = end - start
  print("{} batches: {} s".format(steps, duration))
  print("{:0.5f} Images/s".format(BATCH_SIZE*steps/duration))
The timeit() function is no more than a technical way to measure the time spent, so just take it as given. Nothing to be scared of! 😜
1 — compare keras.preprocessing and tf.data
(Note: train_data_gen below comes from the keras.preprocessing section we skipped; its definition is in the tutorial.)
#keras.preprocessing
timeit(train_data_gen)
1000 batches: 97.74200057983398 s
327.39252 Images/s
#tf.data
timeit(train_ds)
1000 batches: 16.27811074256897 s
1965.83010 Images/s
Can you see that the tf.data ‘data generator’ is about 6 times faster?
2 — use of .cache is quite a big deal.
#without cache
uncached_ds = prepare_for_training(labeled_ds, cache=False)
timeit(uncached_ds)
1000 batches: 81.07855153083801 s
394.67898 Images/s
filecache_ds = prepare_for_training(labeled_ds, cache="./flowers.tfcache")
timeit(filecache_ds)
1000 batches: 47.67451047897339 s
671.21822 Images/s
Can you see that using .cache makes loading around 2 times faster?
This is all for loading images into tf.data. Are you interested in loading a Pandas DataFrame into TensorFlow? Then try this.
Or interested in loading CSV files into Tensorflow? Try this.
If you have any comments or questions I would be honored to answer them.
Thank you always for your interest. 😍
Hope your October is mindful.🧘🏻♀️
P.S. The painting is Four Trees by Egon Schiele (1917). What is your idea of the painting?