1. What Is TensorFlow?
TensorFlow is an open-source machine learning framework developed by Google. It provides tools, libraries, and resources for creating neural networks and deploying them in production.
Key Features
- Flexible Architecture: Build with high-level Keras APIs or low-level operations.
- Cross-Platform: Deploy on servers, mobile devices, browsers, and edge devices.
- Automatic Differentiation: Compute gradients automatically for training.
- Production-Ready: Serving, monitoring, and deployment tools.
- Large Ecosystem: Community support, pre-trained models, and extensions.
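To make the automatic differentiation feature concrete, here is a minimal sketch using tf.GradientTape. The function y = x^2 + 3x and the evaluation point x = 2.0 are arbitrary choices for illustration; analytically the derivative is 2x + 3 = 7.

```python
import tensorflow as tf

# Illustrative function: y = x^2 + 3x, evaluated at x = 2.0.
x = tf.Variable(2.0)

with tf.GradientTape() as tape:
    # Operations on a tf.Variable are recorded on the tape automatically.
    y = x * x + 3.0 * x

# dy/dx = 2x + 3, which is 7 at x = 2.
grad = tape.gradient(y, x)
grad_value = float(grad)
print(grad_value)
```

The same mechanism is what Keras relies on during training to compute gradients of the loss with respect to the model weights.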
2. How TensorFlow Is Used
- Image classification for objects and patterns
- Natural language processing such as translation and sentiment analysis
- Time series forecasting for stocks, weather, and demand
- Recommendation systems for products and content
- Reinforcement learning for games and robotics
- Generative models like GANs and VAEs
3. Main Components for Neural Networks
3.1 Tensors
Tensors are multi-dimensional arrays and the core data structure in TensorFlow.
import tensorflow as tf
# Creating tensors of increasing rank
scalar = tf.constant(3.0)                 # rank 0, shape ()
vector = tf.constant([1.0, 2.0, 3.0])     # rank 1, shape (3,)
matrix = tf.constant([[1, 2], [3, 4]])    # rank 2, shape (2, 2)
tensor_3d = tf.constant([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])  # rank 3, shape (2, 2, 2)
print(f"Scalar shape: {scalar.shape}")
print(f"Vector shape: {vector.shape}")
print(f"Matrix shape: {matrix.shape}")
print(f"3D tensor shape: {tensor_3d.shape}")
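Beyond shape, a tensor also carries a dtype and can be converted to and from NumPy arrays. The sketch below is one way to inspect these properties; the variable names are illustrative only.

```python
import numpy as np
import tensorflow as tf

matrix = tf.constant([[1, 2], [3, 4]])    # integer literals give an int32 tensor
as_float = tf.cast(matrix, tf.float32)    # explicit cast; dtypes are not mixed implicitly

rank = int(tf.rank(matrix))               # number of dimensions
product = tf.matmul(as_float, as_float)   # matrix product of the 2x2 matrix with itself
as_numpy = product.numpy()                # eager tensors convert to NumPy arrays
from_numpy = tf.constant(np.ones((2, 3), dtype=np.float32))  # and NumPy arrays convert back

print(matrix.dtype, rank, from_numpy.shape)
print(as_numpy)
```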
3.2 Keras API (tf.keras)
Keras is the high-level API for building neural networks. In TensorFlow 2.x, Keras is integrated as tf.keras and is the recommended approach.
4. Essential Methods for Building Neural Networks
Sequential API
The Sequential API builds models as a linear stack of layers.
from tensorflow import keras
from tensorflow.keras import layers
# Create a Sequential model
model = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=(784,)),
    layers.Dropout(0.2),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.2),
    layers.Dense(10, activation='softmax')
])
# Alternative: Add layers one by one
model = keras.Sequential()
model.add(layers.Dense(64, activation='relu', input_shape=(784,)))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(10, activation='softmax'))
Functional API
Use the Functional API for multiple inputs/outputs and shared layers.
inputs = keras.Input(shape=(784,))
x = layers.Dense(64, activation='relu')(inputs)
x = layers.Dropout(0.2)(x)
x = layers.Dense(64, activation='relu')(x)
outputs = layers.Dense(10, activation='softmax')(x)
model = keras.Model(inputs=inputs, outputs=outputs)
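The single-input model above could also be written with the Sequential API. The following sketch shows a case that needs the Functional API: two inputs, one shared layer, and two outputs. The layer sizes and the names input_a, input_b, shared_dense, class_output, and score_output are hypothetical and chosen only for illustration.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Two inputs of the same width pass through one shared Dense layer,
# so both branches use the same weights.
input_a = keras.Input(shape=(16,), name="input_a")
input_b = keras.Input(shape=(16,), name="input_b")
shared_dense = layers.Dense(8, activation="relu", name="shared_dense")

encoded_a = shared_dense(input_a)
encoded_b = shared_dense(input_b)
merged = layers.Concatenate()([encoded_a, encoded_b])

# Two output heads built on the merged representation.
class_output = layers.Dense(3, activation="softmax", name="class_output")(merged)
score_output = layers.Dense(1, name="score_output")(merged)

model = keras.Model(inputs=[input_a, input_b], outputs=[class_output, score_output])

# Run a random batch through the model to check the output shapes.
batch_a = np.random.rand(4, 16).astype("float32")
batch_b = np.random.rand(4, 16).astype("float32")
class_probs, scores = model.predict([batch_a, batch_b], verbose=0)
print(class_probs.shape, scores.shape)
```

Because shared_dense is created once and called twice, its parameters are counted only once in the model.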
5. Common Layer Types
Dense Layers (Fully Connected)
# Dense layer: y = activation(W*x + b)
layer = layers.Dense(
    units=128,
    activation='relu',
    use_bias=True,
    kernel_initializer='glorot_uniform',
    bias_initializer='zeros'
)
Convolutional Layers (for Image Data)
model = keras.Sequential([
    layers.Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])
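It can help to trace how the spatial dimensions shrink through the convolutional stack above. The sketch below applies the same feature-extraction layers one at a time to a dummy 28x28 single-channel image and records each output shape. It assumes the default 'valid' padding and default pooling strides.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Same feature-extraction layers as the model above, without the Dense head.
conv_stack = [
    layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Flatten(),
]

x = tf.zeros((1, 28, 28, 1))  # one dummy image: (batch, height, width, channels)
shapes = []
for layer in conv_stack:
    x = layer(x)
    shapes.append(tuple(x.shape))
    print(f"{layer.name}: {tuple(x.shape)}")
```

Each 3x3 convolution without padding trims two pixels from height and width, and each 2x2 pooling halves them (rounding down), so the flattened feature vector has 5 * 5 * 64 = 1600 elements.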
Recurrent Layers (for Sequential Data)
model = keras.Sequential([
    # return_sequences=True passes the full sequence to the next recurrent layer
    layers.LSTM(64, return_sequences=True, input_shape=(None, 10)),
    layers.LSTM(32),
    layers.Dense(10, activation='softmax')
])
# GRU as an alternative to LSTM
model = keras.Sequential([
    layers.GRU(64, return_sequences=True, input_shape=(None, 10)),
    layers.GRU(32),
    layers.Dense(10)  # no activation: outputs raw logits, so use a loss with from_logits=True
])
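The effect of return_sequences is easiest to see from output shapes. This sketch feeds a dummy batch (2 sequences, 5 timesteps, 10 features; sizes chosen arbitrarily) through two stacked LSTM layers.

```python
import tensorflow as tf
from tensorflow.keras import layers

batch = tf.zeros((2, 5, 10))  # (batch, timesteps, features)

# return_sequences=True keeps one output vector per timestep.
all_steps = layers.LSTM(64, return_sequences=True)(batch)

# The default (return_sequences=False) keeps only the final timestep.
last_step = layers.LSTM(32)(all_steps)

all_steps_shape = tuple(all_steps.shape)
last_step_shape = tuple(last_step.shape)
print(all_steps_shape, last_step_shape)
```

A recurrent layer that feeds another recurrent layer needs the full sequence, which is why the first layer in each stack sets return_sequences=True.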
6. Complete Neural Network Example
This example covers the full workflow: loading data, building the model, training, evaluating, and saving.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
# Load and preprocess data (MNIST dataset)
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Normalize pixel values to [0, 1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
# Flatten images from 28x28 to 784
x_train = x_train.reshape(-1, 784)
x_test = x_test.reshape(-1, 784)
# Build the model
model = keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=(784,)),
    layers.BatchNormalization(),
    layers.Dropout(0.3),
    layers.Dense(64, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(0.3),
    layers.Dense(10, activation='softmax')
])
# Display model architecture
model.summary()
# Compile the model
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
# Train the model
history = model.fit(
    x_train, y_train,
    batch_size=128,
    epochs=10,
    validation_split=0.2,
    verbose=1
)
# Evaluate the model
test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=0)
print(f"Test accuracy: {test_accuracy:.4f}")
# Make predictions
predictions = model.predict(x_test[:5])
print(f"Predicted classes: {np.argmax(predictions, axis=1)}")
print(f"Actual classes: {y_test[:5]}")
# Save the model (native Keras format; older TensorFlow 2.x releases use the legacy HDF5 '.h5' format)
model.save('mnist_model.keras')
# Load the model
loaded_model = keras.models.load_model('mnist_model.keras')
7. Key Methods Reference
Model Creation
| Method | Description |
|---|---|
| keras.Sequential() | Creates a linear stack of layers. |
| keras.Model() | Creates a model from inputs and outputs (Functional API). |
| model.add() | Adds a layer to a Sequential model. |
Model Compilation
| Method | Description |
|---|---|
| model.compile() | Configures the model for training. |
Common optimizers: adam, sgd, rmsprop, adagrad
Common loss functions: binary_crossentropy, categorical_crossentropy, sparse_categorical_crossentropy, mse, mae
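The choice between categorical_crossentropy and sparse_categorical_crossentropy depends only on the label format: one-hot vectors versus integer class indices. The sketch below, using made-up probabilities for two samples, suggests that both yield the same value for the same predictions.

```python
import numpy as np
from tensorflow import keras

# Made-up softmax outputs for two samples and three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]], dtype="float32")

int_labels = np.array([0, 1])                                        # integer class indices
one_hot_labels = keras.utils.to_categorical(int_labels, num_classes=3)  # one-hot rows

# Sparse variant takes integer labels; the categorical variant takes one-hot labels.
sparse_loss = float(keras.losses.SparseCategoricalCrossentropy()(int_labels, probs))
dense_loss = float(keras.losses.CategoricalCrossentropy()(one_hot_labels, probs))

print(sparse_loss, dense_loss)
```

Both values are the mean of -ln(0.7) and -ln(0.8), roughly 0.29. Integer labels avoid materializing one-hot arrays, which matters when the number of classes is large.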
Model Training
| Method | Description |
|---|---|
| model.fit() | Trains the model on data. |
| model.fit_generator() | Trains using a data generator (deprecated; model.fit() accepts generators directly). |
| model.train_on_batch() | Single gradient update on one batch. |
Model Evaluation
| Method | Description |
|---|---|
| model.evaluate() | Evaluates model on test data. |
| model.predict() | Generates predictions for input samples. |
| model.test_on_batch() | Tests model on a single batch. |
Model Inspection
| Method | Description |
|---|---|
| model.summary() | Prints model architecture. |
| model.get_weights() | Returns the weights of the model. |
| model.set_weights() | Sets the weights of the model. |
| model.count_params() | Counts total parameters. |
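As a rough illustration of the inspection methods, the sketch below copies weights between two identically structured models and checks the parameter count. The helper build_model and the layer sizes are hypothetical.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_model():
    """Small illustrative model; the sizes are arbitrary."""
    return keras.Sequential([
        keras.Input(shape=(4,)),
        layers.Dense(3, activation="relu"),
        layers.Dense(2),
    ])

source = build_model()
target = build_model()  # same architecture, different random initialization

# get_weights() returns a list of NumPy arrays; set_weights() expects the same layout.
target.set_weights(source.get_weights())

# (4*3 weights + 3 biases) + (3*2 weights + 2 biases) = 23 parameters.
n_params = source.count_params()

x = np.ones((1, 4), dtype="float32")
same_output = np.allclose(source.predict(x, verbose=0), target.predict(x, verbose=0))
print(n_params, same_output)
```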
Model Persistence
| Method | Description |
|---|---|
| model.save() | Saves the entire model to a file. |
| keras.models.load_model() | Loads a saved model. |
| model.save_weights() | Saves only the weights. |
| model.load_weights() | Loads weights from a file. |
8. Advanced Features
Custom Training Loops
# Custom training loop with GradientTape
optimizer = keras.optimizers.Adam()
loss_fn = keras.losses.SparseCategoricalCrossentropy()
# Assumes `model` is already built and `train_dataset` is a batched tf.data.Dataset of (x, y) pairs
@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        predictions = model(x, training=True)
        loss = loss_fn(y, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss
# Training loop
for epoch in range(10):
    for batch, (x_batch, y_batch) in enumerate(train_dataset):
        loss = train_step(x_batch, y_batch)
        if batch % 100 == 0:
            print(f"Epoch {epoch}, Batch {batch}, Loss: {float(loss):.4f}")
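The training loop above iterates over a train_dataset that the example does not define. One plausible way to build it is with tf.data, sketched here with random stand-in arrays shaped like flattened MNIST. The data, buffer size, and batch size are assumptions for illustration, not values from the original example.

```python
import numpy as np
import tensorflow as tf

# Stand-in data with MNIST-like shapes (hypothetical; replace with real arrays).
x_train = np.random.rand(1000, 784).astype("float32")
y_train = np.random.randint(0, 10, size=(1000,))

train_dataset = (
    tf.data.Dataset.from_tensor_slices((x_train, y_train))
    .shuffle(buffer_size=1000)   # reshuffled on every pass over the data
    .batch(128)                  # the last batch may be smaller
    .prefetch(tf.data.AUTOTUNE)  # overlap data preparation with training
)

num_batches = sum(1 for _ in train_dataset)
first_x, first_y = next(iter(train_dataset))
print(num_batches, first_x.shape, first_y.shape)
```

With 1000 samples and a batch size of 128, each epoch yields 8 batches, the last one holding the remaining 104 samples.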
Callbacks
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
callbacks = [
    # Stop training when validation loss stops improving
    EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True
    ),
    # Save the best model during training (older TensorFlow 2.x releases use the legacy '.h5' format)
    ModelCheckpoint(
        filepath='best_model.keras',
        monitor='val_accuracy',
        save_best_only=True
    ),
    # Reduce learning rate when metric plateaus
    ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,
        patience=3
    )
]
model.fit(x_train, y_train, epochs=50, validation_split=0.2, callbacks=callbacks)
9. Activation Functions
| Activation | Use Case | Method |
|---|---|---|
| ReLU | Hidden layers (most common) | activation='relu' |
| Leaky ReLU | Hidden layers (prevents dying ReLU) | layers.LeakyReLU() |
| Sigmoid | Binary classification output | activation='sigmoid' |
| Softmax | Multi-class classification output | activation='softmax' |
| Tanh | Hidden layers (centered at 0) | activation='tanh' |
| Linear | Regression output | activation='linear' |
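To compare the activations in the table numerically, the sketch below applies each one to the same three inputs. The input values and the Leaky ReLU slope of 0.1 are arbitrary choices for illustration.

```python
import tensorflow as tf

z = tf.constant([-2.0, 0.0, 2.0])

relu_out = tf.nn.relu(z).numpy()                    # negatives clipped to 0
leaky_out = tf.nn.leaky_relu(z, alpha=0.1).numpy()  # negatives scaled by alpha
sigmoid_out = tf.math.sigmoid(z).numpy()            # squashed into (0, 1)
tanh_out = tf.math.tanh(z).numpy()                  # squashed into (-1, 1), centered at 0
softmax_out = tf.nn.softmax(z).numpy()              # non-negative, sums to 1

for name, values in [("relu", relu_out), ("leaky_relu", leaky_out),
                     ("sigmoid", sigmoid_out), ("tanh", tanh_out),
                     ("softmax", softmax_out)]:
    print(f"{name}: {values}")
```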
10. Best Practices
- Start simple and gradually increase complexity.
- Normalize input data to consistent ranges.
- Use ReLU for hidden layers and softmax for multi-class outputs.
- Add Dropout and L2 regularization to prevent overfitting.
- Use batch normalization for stability and faster training.
- Monitor validation loss with callbacks.
- Experiment with learning rates (default 0.001 for Adam).
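- Use GPU acceleration where available. The standard tensorflow package has included GPU support since version 2.1; the separate tensorflow-gpu package is deprecated.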
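Several of these practices can be combined in one model definition. The sketch below adds L2 weight regularization and Dropout to a small classifier and sets the Adam learning rate explicitly. The penalty strength 1e-4, the dropout rate, and the layer sizes are illustrative assumptions rather than recommended values.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    keras.Input(shape=(784,)),
    # L2 penalty on the kernel weights is added to the training loss.
    layers.Dense(128, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.3),
    layers.Dense(10, activation="softmax"),
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),  # explicit, so it is easy to tune
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# One regularization term is tracked, for the single regularized layer.
num_penalties = len(model.losses)
print(num_penalties)
```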
11. Common Pitfalls
- Vanishing gradients: use ReLU instead of sigmoid or tanh in deep networks.
- Exploding gradients: apply gradient clipping or batch normalization.
- Overfitting: add dropout, reduce model complexity, or use more data.
- Underfitting: increase model capacity or train longer.
- Class imbalance: use class weights or resampling.
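Two of the remedies above map to short snippets: class weights for imbalance and gradient clipping for exploding gradients. The sketch below computes inverse-frequency class weights for a made-up 90/10 label split and creates an optimizer with norm clipping. The weighting formula is one common heuristic, and the clipnorm value of 1.0 is an assumption.

```python
import numpy as np
from tensorflow import keras

# Made-up imbalanced labels: 90 samples of class 0, 10 of class 1.
labels = np.array([0] * 90 + [1] * 10)
counts = np.bincount(labels)

# Inverse-frequency heuristic: n_samples / (n_classes * class_count).
class_weight = {
    int(cls): float(len(labels) / (len(counts) * count))
    for cls, count in enumerate(counts)
}
print(class_weight)

# Clip each gradient tensor to an L2 norm of at most 1.0 before applying the update.
optimizer = keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)
```

The resulting dictionary can be passed as the class_weight argument of model.fit(), and the optimizer as the optimizer argument of model.compile().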