TensorFlow for Neural Networks

1. What Is TensorFlow?

TensorFlow is an open-source machine learning framework, originally developed by Google, that provides tools, libraries, and resources for creating neural networks and deploying them in production.

Key Features

Flexible Architecture: Build with high-level Keras APIs or low-level operations.
Cross-Platform: Deploy on servers, mobile devices, browsers, and edge devices.
Automatic Differentiation: Compute gradients automatically for training.
Production-Ready: Serving, monitoring, and deployment tools.
Large Ecosystem: Community support, pre-trained models, and extensions.
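
The automatic differentiation feature listed above can be illustrated with tf.GradientTape. This is a minimal sketch: the function y = x^2 + 2x and the point x = 3 are arbitrary choices made for illustration, picked so the result can be checked by hand (dy/dx = 2x + 2 = 8).

```python
import tensorflow as tf

# Arbitrary example function: y = x**2 + 2*x, evaluated at x = 3.0.
x = tf.Variable(3.0)

# Operations executed inside the tape context are recorded for differentiation.
with tf.GradientTape() as tape:
    y = x ** 2 + 2.0 * x

# Ask the tape for dy/dx; analytically this is 2*x + 2 = 8.
dy_dx = tape.gradient(y, x)
print(float(dy_dx))
```

The same mechanism is what computes the gradients of a loss with respect to a model's weights during training.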

2. How TensorFlow Is Used

Image classification for objects and patterns
Natural language processing such as translation and sentiment analysis
Time series forecasting for stocks, weather, and demand
Recommendation systems for products and content
Reinforcement learning for games and robotics
Generative models like GANs and VAEs

3. Main Components for Neural Networks

3.1 Tensors

Tensors are multi-dimensional arrays and the core data structure in TensorFlow.

import tensorflow as tf
import numpy as np

# Creating tensors
scalar = tf.constant(3.0)
vector = tf.constant([1.0, 2.0, 3.0])
matrix = tf.constant([[1, 2], [3, 4]])
tensor_3d = tf.constant([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

print(f"Scalar shape: {scalar.shape}")        # ()
print(f"Vector shape: {vector.shape}")        # (3,)
print(f"Matrix shape: {matrix.shape}")        # (2, 2)
print(f"3D tensor shape: {tensor_3d.shape}")  # (2, 2, 2)
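
Beyond shapes, it may help to see a few common tensor operations. The sketch below uses small made-up values (not from the original text) to show dtype inspection, casting, matrix multiplication, broadcasting, and reshaping. Note that tf.constant infers int32 from Python integers, and by default TensorFlow does not mix dtypes automatically, so an explicit tf.cast is used.

```python
import tensorflow as tf

# Illustrative values chosen so the results are easy to verify by hand.
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])   # inferred dtype: float32
b = tf.constant([[10.0], [20.0]])           # shape (2, 1)

# Python integers produce an int32 tensor; cast before mixing with floats.
int_matrix = tf.constant([[1, 2], [3, 4]])
float_matrix = tf.cast(int_matrix, tf.float32)

# Matrix multiplication: (2, 2) @ (2, 1) -> (2, 1)
product = tf.matmul(a, b)

# Broadcasting: the length-2 vector is added to every row of `a`.
shifted = a + tf.constant([100.0, 200.0])

# Reshape the 2x2 matrix into a flat vector of 4 elements.
flat = tf.reshape(a, (4,))

print(product.numpy())   # convert to a NumPy array for inspection
print(shifted.numpy())
print(flat.numpy())
```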

3.2 Keras API (tf.keras)

Keras is the high-level API for building neural networks. In TensorFlow 2.x, Keras is integrated as tf.keras and is the recommended approach.

4. Essential Methods for Building Neural Networks

Sequential API

The Sequential API builds models as a linear stack of layers.

from tensorflow import keras
from tensorflow.keras import layers

# Create a Sequential model
model = keras.Sequential([
  layers.Dense(64, activation='relu', input_shape=(784,)),
  layers.Dropout(0.2),
  layers.Dense(64, activation='relu'),
  layers.Dropout(0.2),
  layers.Dense(10, activation='softmax')
])

# Alternative: add layers one by one (a shorter stack is shown here)
model = keras.Sequential()
model.add(layers.Dense(64, activation='relu', input_shape=(784,)))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(10, activation='softmax'))

Functional API

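The Functional API supports architectures that a linear stack cannot express, such as models with multiple inputs or outputs and models with shared layers. The example below rebuilds the single-input classifier from the Sequential section in functional style.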

inputs = keras.Input(shape=(784,))
x = layers.Dense(64, activation='relu')(inputs)
x = layers.Dropout(0.2)(x)
x = layers.Dense(64, activation='relu')(x)
outputs = layers.Dense(10, activation='softmax')(x)
model = keras.Model(inputs=inputs, outputs=outputs)
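
The model above still has a single input and output. As a rough sketch of the multi-input case, the example below merges two branches before the classifier. The 5-feature "metadata" input, the layer sizes, and all variable names are hypothetical and chosen only for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Two hypothetical inputs: a flattened image and a small metadata vector.
image_input = keras.Input(shape=(784,), name="image")
meta_input = keras.Input(shape=(5,), name="metadata")

# Each input gets its own branch.
image_branch = layers.Dense(64, activation="relu")(image_input)
meta_branch = layers.Dense(8, activation="relu")(meta_input)

# Merge the branches along the feature axis: 64 + 8 = 72 features.
merged = layers.Concatenate()([image_branch, meta_branch])
class_output = layers.Dense(10, activation="softmax")(merged)

two_input_model = keras.Model(
    inputs=[image_input, meta_input],
    outputs=class_output,
)
two_input_model.summary()
```

When training such a model, fit() would be given a list (or dict keyed by input name) with one array per input.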

5. Common Layer Types

Dense Layers (Fully Connected)

# Dense layer: output = activation(dot(input, kernel) + bias)
layer = layers.Dense(
  units=128,
  activation='relu',
  use_bias=True,
  kernel_initializer='glorot_uniform',
  bias_initializer='zeros'
)
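
To make the Dense computation concrete, here is a NumPy sketch of what a single dense layer with ReLU computes. It is an illustration of the formula only, not Keras's implementation; the shapes and random values are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=(4, 3))       # batch of 4 samples with 3 features each
kernel = rng.normal(size=(3, 2))  # weight matrix of shape (input_dim, units)
bias = np.zeros(2)                # one bias per unit, initialized to zeros

# Affine transform followed by ReLU: max(0, x @ kernel + bias)
pre_activation = x @ kernel + bias
y = np.maximum(0.0, pre_activation)

print(y.shape)  # one row per sample, one column per unit
```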

Convolutional Layers (for Image Data)

model = keras.Sequential([
  layers.Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)),
  layers.MaxPooling2D(pool_size=(2, 2)),
  layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
  layers.MaxPooling2D(pool_size=(2, 2)),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(10, activation='softmax')
])
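
The spatial size of the feature maps in the model above can be worked out by hand. The sketch below assumes the Keras defaults used there (padding='valid', convolution stride 1, pooling stride equal to the pool size); the helper function names are hypothetical.

```python
def conv_output_size(size, kernel, stride=1):
    """Output length of a 'valid' convolution along one spatial axis."""
    return (size - kernel) // stride + 1


def pool_output_size(size, pool, stride=None):
    """Output length of 'valid' pooling; stride defaults to the pool size."""
    stride = stride or pool
    return (size - pool) // stride + 1


size = 28
size = conv_output_size(size, 3)   # Conv2D 3x3       -> 26
size = pool_output_size(size, 2)   # MaxPooling2D 2x2 -> 13
size = conv_output_size(size, 3)   # Conv2D 3x3       -> 11
size = pool_output_size(size, 2)   # MaxPooling2D 2x2 -> 5

# Flatten: height * width * channels of the last convolution (64 filters)
flattened = size * size * 64
print(size, flattened)  # 5 1600
```

So the first Dense layer in that model receives 1600 features per image.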

Recurrent Layers (for Sequential Data)

model = keras.Sequential([
  layers.LSTM(64, return_sequences=True, input_shape=(None, 10)),
  layers.LSTM(32),
  layers.Dense(10, activation='softmax')
])

# GRU as an alternative to LSTM
model = keras.Sequential([
  layers.GRU(64, return_sequences=True, input_shape=(None, 10)),
  layers.GRU(32),
  layers.Dense(10)  # no activation: outputs raw logits, so use a loss with from_logits=True
])
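
The return_sequences flag controls whether a recurrent layer emits an output for every timestep or only for the last one, which is why the first layer in each stack sets it to True. A small sketch with a made-up batch (2 sequences, 7 timesteps, 10 features) shows the resulting shapes:

```python
import numpy as np
from tensorflow.keras import layers

# Hypothetical input: 2 sequences, 7 timesteps, 10 features per timestep.
batch = np.zeros((2, 7, 10), dtype="float32")

# return_sequences=True keeps the time axis so another recurrent layer can follow.
sequence_output = layers.LSTM(64, return_sequences=True)(batch)

# The default (return_sequences=False) returns only the final timestep's output.
final_output = layers.LSTM(32)(sequence_output)

print(sequence_output.shape)  # (2, 7, 64)
print(final_output.shape)     # (2, 32)
```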

6. Complete Neural Network Example

This example walks through the full workflow: loading data, building the model, training, evaluating, and saving.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

# Load and preprocess data (MNIST dataset)
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalize pixel values to [0, 1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Flatten images from 28x28 to 784
x_train = x_train.reshape(-1, 784)
x_test = x_test.reshape(-1, 784)

# Build the model
model = keras.Sequential([
  layers.Dense(128, activation='relu', input_shape=(784,)),
  layers.BatchNormalization(),
  layers.Dropout(0.3),
  layers.Dense(64, activation='relu'),
  layers.BatchNormalization(),
  layers.Dropout(0.3),
  layers.Dense(10, activation='softmax')
])

# Display model architecture
model.summary()

# Compile the model
model.compile(
  optimizer='adam',
  loss='sparse_categorical_crossentropy',
  metrics=['accuracy']
)

# Train the model
history = model.fit(
  x_train, y_train,
  batch_size=128,
  epochs=10,
  validation_split=0.2,
  verbose=1
)

# Evaluate the model
test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=0)
print(f"Test accuracy: {test_accuracy:.4f}")

# Make predictions
predictions = model.predict(x_test[:5])
print(f"Predicted classes: {np.argmax(predictions, axis=1)}")
print(f"Actual classes: {y_test[:5]}")

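# Save the model (recent TensorFlow releases recommend the native .keras format;
# the legacy HDF5 format remains available by using a .h5 extension)
model.save('mnist_model.keras')

# Load the model
loaded_model = keras.models.load_model('mnist_model.keras')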

7. Key Methods Reference

Model Creation

Method	Description
`keras.Sequential()`	Creates a linear stack of layers.
`keras.Model()`	Creates a model from inputs and outputs (Functional API).
`model.add()`	Adds a layer to a Sequential model.

Model Compilation

Method	Description
`model.compile()`	Configures the model for training.

Common optimizers: adam, sgd, rmsprop, adagrad

Common loss functions: binary_crossentropy, categorical_crossentropy, sparse_categorical_crossentropy, mse, mae
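
The difference between categorical_crossentropy and sparse_categorical_crossentropy is only in the label format: the former expects one-hot vectors, the latter integer class indices. The NumPy sketch below illustrates this with made-up probabilities; it shows the underlying formula rather than Keras's implementation.

```python
import numpy as np

# Made-up predicted probabilities for 2 samples over 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])

# The same ground truth in both label formats.
sparse_labels = np.array([0, 1])         # integer class indices
one_hot_labels = np.eye(3)[sparse_labels]  # [[1, 0, 0], [0, 1, 0]]

# Sparse form: pick the probability of the true class directly.
sparse_ce = -np.log(probs[np.arange(len(sparse_labels)), sparse_labels]).mean()

# One-hot form: sum over classes, where only the true class contributes.
categorical_ce = -(one_hot_labels * np.log(probs)).sum(axis=1).mean()

print(sparse_ce, categorical_ce)  # both are about 0.2899
```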

Model Training

Method	Description
`model.fit()`	Trains the model on data.
`model.fit_generator()`	Legacy generator-based training; deprecated in TensorFlow 2.x because `model.fit()` accepts generators directly.
`model.train_on_batch()`	Single gradient update on one batch.

Model Evaluation

Method	Description
`model.evaluate()`	Evaluates model on test data.
`model.predict()`	Generates predictions for input samples.
`model.test_on_batch()`	Tests model on a single batch.

Model Inspection

Method	Description
`model.summary()`	Prints model architecture.
`model.get_weights()`	Returns the weights of the model.
`model.set_weights()`	Sets the weights of the model.
`model.count_params()`	Counts total parameters.
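
As a cross-check on what `model.count_params()` reports, the total can be worked out by hand. The sketch below does this for the MNIST model in Section 6. The helper names are hypothetical, and it assumes the default BatchNormalization layer, which stores four vectors per feature (gamma, beta, moving mean, moving variance).

```python
def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units


def batchnorm_params(features):
    """gamma, beta, moving mean, and moving variance: four values per feature."""
    return 4 * features


total = (
    dense_params(784, 128)    # 100480
    + batchnorm_params(128)   # 512
    + dense_params(128, 64)   # 8256
    + batchnorm_params(64)    # 256
    + dense_params(64, 10)    # 650
)
print(total)  # 110154
```

Dropout layers contribute no parameters, and the moving statistics of batch normalization are counted but not trainable.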

Model Persistence

Method	Description
`model.save()`	Saves the entire model to a file.
`keras.models.load_model()`	Loads a saved model.
`model.save_weights()`	Saves only the weights.
`model.load_weights()`	Loads weights from a file.

8. Advanced Features

Custom Training Loops

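# Custom training loop with GradientTape
# Assumes `model`, `x_train`, and `y_train` from the complete example in Section 6.
optimizer = keras.optimizers.Adam()
loss_fn = keras.losses.SparseCategoricalCrossentropy()

# Shuffle and batch the training data with tf.data
train_dataset = (
  tf.data.Dataset.from_tensor_slices((x_train, y_train))
  .shuffle(buffer_size=10000)
  .batch(128)
)

@tf.function
def train_step(x, y):
  # Record the forward pass so gradients can be computed
  with tf.GradientTape() as tape:
    predictions = model(x, training=True)
    loss = loss_fn(y, predictions)
  # Backpropagate and apply one optimizer update
  gradients = tape.gradient(loss, model.trainable_variables)
  optimizer.apply_gradients(zip(gradients, model.trainable_variables))
  return loss

# Training loop
for epoch in range(10):
  for batch, (x_batch, y_batch) in enumerate(train_dataset):
    loss = train_step(x_batch, y_batch)
    if batch % 100 == 0:
      print(f"Epoch {epoch}, Batch {batch}, Loss: {float(loss):.4f}")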

Callbacks

from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau

callbacks = [
  # Stop training when validation loss stops improving
  EarlyStopping(
    monitor='val_loss',
    patience=5,
    restore_best_weights=True
  ),
  # Save the best model during training (native .keras format;
  # a .h5 extension selects the legacy HDF5 format instead)
  ModelCheckpoint(
    filepath='best_model.keras',
    monitor='val_accuracy',
    save_best_only=True
  ),
  # Reduce learning rate when metric plateaus
  ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.5,
    patience=3
  )
]

model.fit(x_train, y_train, epochs=50, validation_split=0.2, callbacks=callbacks)
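
The patience setting can be easier to reason about with a small simulation. The function below is a simplified, hypothetical illustration of the idea behind EarlyStopping, not the Keras implementation (it ignores options such as min_delta and restore_best_weights), and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the epoch index at which training would stop, or None."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses: the best value (0.60) is reached at epoch 2.
val_losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.60, 0.63]

print(early_stopping_epoch(val_losses, patience=3))  # 5
print(early_stopping_epoch(val_losses, patience=5))  # None
```

With patience=3, training stops three epochs after the last improvement; with patience=5, this short run ends before the limit is reached.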

9. Activation Functions

Activation	Use Case	Method
ReLU	Hidden layers (most common)	`activation='relu'`
Leaky ReLU	Hidden layers (prevents dying ReLU)	`layers.LeakyReLU()`
Sigmoid	Binary classification output	`activation='sigmoid'`
Softmax	Multi-class classification output	`activation='softmax'`
Tanh	Hidden layers (centered at 0)	`activation='tanh'`
Linear	Regression output	`activation='linear'`
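
For reference, the activations in the table can be written in a few lines of NumPy. This is a sketch of the mathematical definitions only; the leaky ReLU slope of 0.1 is an arbitrary value chosen for illustration, not a library default.

```python
import numpy as np


def relu(x):
    return np.maximum(0.0, x)


def leaky_relu(x, alpha=0.1):
    # Small slope `alpha` for negative inputs keeps the gradient non-zero.
    return np.where(x > 0, x, alpha * x)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


z = np.array([-2.0, 0.0, 3.0])
print(relu(z))        # negatives become 0
print(leaky_relu(z))  # negatives scaled by alpha
print(sigmoid(z))     # values in (0, 1)
print(np.tanh(z))     # values in (-1, 1), centered at 0
print(softmax(z))     # non-negative values that sum to 1
```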

10. Best Practices

Start simple and gradually increase complexity.
Normalize input data to consistent ranges.
Use ReLU for hidden layers and softmax for multi-class outputs.
Add Dropout and L2 regularization to reduce overfitting.
Use batch normalization for stability and faster training.
Monitor validation loss with callbacks.
Experiment with learning rates (default 0.001 for Adam).
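Use GPU acceleration where available. The standard tensorflow package includes GPU support; the separate tensorflow-gpu package is deprecated.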
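
Several of these practices can be combined in one short sketch. The example below checks which GPUs TensorFlow can see and builds a small model with L2 regularization, batch normalization, dropout, and an explicit Adam learning rate. The layer sizes and the regularization factor of 1e-4 are assumptions chosen for illustration, not recommendations from the text.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# An empty list means TensorFlow will run on the CPU.
gpus = tf.config.list_physical_devices("GPU")
print(f"GPUs visible to TensorFlow: {len(gpus)}")

regularized_model = keras.Sequential([
    keras.Input(shape=(784,)),
    layers.Dense(
        64,
        activation="relu",
        kernel_regularizer=keras.regularizers.L2(1e-4),  # illustrative factor
    ),
    layers.BatchNormalization(),
    layers.Dropout(0.3),
    layers.Dense(10, activation="softmax"),
])

# Set the learning rate explicitly so it is easy to experiment with.
regularized_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```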

11. Common Pitfalls

Vanishing gradients: prefer ReLU-family activations over sigmoid or tanh in the hidden layers of deep networks.
Exploding gradients: apply gradient clipping (for example, the clipnorm or clipvalue optimizer arguments) or batch normalization.
Overfitting: add dropout, reduce model complexity, or use more data.
Underfitting: increase model capacity or train longer.
Class imbalance: use class weights or resampling.
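
For the class imbalance case, one common heuristic is to weight each class inversely to its frequency. The sketch below computes such weights with NumPy for a made-up label array; the resulting dictionary is in the form accepted by the class_weight argument of model.fit(). The weighting formula is one reasonable choice among several, not the only option.

```python
import numpy as np

# Made-up imbalanced labels: 90 samples of class 0, 10 samples of class 1.
labels = np.array([0] * 90 + [1] * 10)

classes, counts = np.unique(labels, return_counts=True)

# "Balanced" heuristic: n_samples / (n_classes * samples_in_class)
weights = len(labels) / (len(classes) * counts)

class_weight = {int(c): float(w) for c, w in zip(classes, weights)}
print(class_weight)  # class 1 gets a weight of 5.0, class 0 about 0.56

# Usage (assuming a compiled `model` and training arrays exist):
# model.fit(x_train, y_train, epochs=10, class_weight=class_weight)
```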