TensorFlow Datasets
As presented in Chollet, Chapter 08: Intro to DL for Computer Vision
For more, read the docs.
The tf.data API creates efficient input pipelines for models. The core class is tf.data.Dataset.
A Dataset object is an iterable: you can use it in a for loop or pass it directly to the fit method of a model.
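As a minimal sketch of both uses, the example below builds a Dataset from in-memory NumPy arrays, loops over it, and then passes it to fit. The toy data, the array shapes, and the one-layer model are assumptions made for illustration; they are not taken from Chollet's chapter.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: 100 samples with 4 features each, plus binary labels.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 4)).astype("float32")
labels = rng.integers(0, 2, size=(100, 1)).astype("float32")

# Each element of the dataset is one (features, label) pair.
dataset = tf.data.Dataset.from_tensor_slices((features, labels))

# Use 1: iterate over the dataset in a for loop.
for i, (x, y) in enumerate(dataset):
    print(x.shape, y.shape)
    if i >= 2:
        break

# Use 2: pass the dataset directly to fit.
# fit expects batches of (inputs, targets), so the dataset is batched first.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="rmsprop", loss="binary_crossentropy")
history = model.fit(dataset.batch(32), epochs=1, verbose=0)
```

Because the dataset already yields the targets alongside the inputs, fit is called without separate y or batch_size arguments.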
It handles features that would be painful to implement yourself, such as asynchronous data prefetching.
It exposes a functional API for modifying a dataset: each method returns a new Dataset rather than changing the original in place.
It has a range of useful methods like:
dataset.batch(32): batches the data.
dataset.shuffle(buffer_size): shuffles elements within a buffer.
dataset.prefetch(buffer_size): prepares a buffer of elements ahead of time, so the next elements are ready while the current ones are being processed. Chollet describes this as prefetching into GPU memory.
dataset.map(callable): applies an arbitrary transformation to each element of the dataset. The callable takes a single element yielded by the dataset. You will use this a lot, for example when reshaping elements.
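The methods above can be chained into a single pipeline. The sketch below is one possible arrangement, not a prescribed order: the toy array, the (4, 4) target shape, and the buffer sizes are assumptions chosen for illustration.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: 1000 flat vectors of 16 values each.
data = np.arange(1000 * 16, dtype="float32").reshape(1000, 16)
dataset = tf.data.Dataset.from_tensor_slices(data)

pipeline = (
    dataset
    # map: reshape each element from (16,) to (4, 4).
    .map(lambda x: tf.reshape(x, (4, 4)))
    # shuffle: a buffer as large as the dataset shuffles across all elements.
    .shuffle(buffer_size=1000)
    # batch: group elements into batches of 32.
    .batch(32)
    # prefetch: let tf.data choose how many batches to prepare in advance.
    .prefetch(tf.data.AUTOTUNE)
)

for batch in pipeline.take(1):
    print(batch.shape)  # (32, 4, 4)
```

Each call returns a new Dataset, so the original dataset still yields unshuffled elements of shape (16,).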