Part 3: Building a Neural Network Layer

With 2+ years of experience in web backend development, I now specialize in AI engineering, building intelligent systems and scalable solutions. Passionate about crafting innovative software, I love exploring new technologies, experimenting with AI models, and bringing ideas to life. Always learning, always building.

Now that our dataset is ready, we can start building the components of our neural network.

But before we jump into the code, let’s pause for a moment to define what deep learning actually is. Understanding the "why" and "what" behind our implementation is as important as writing the code itself.

What is Deep Learning?

Deep learning is a method for teaching machines to learn patterns by stacking many mathematical transformations, known as layers.

Each layer consists of:

  • Weights: Parameters that determine how strongly each input feature influences the layer's output.

  • Biases: Offsets added to the weighted inputs, so the output can shift even when the inputs are zero.

During training, these weights and biases are adjusted so the network "fits" the data. You can think of a neural network as a giant formula built from smaller formulas. It might feel a bit abstract at first, but it will start to make sense as we build each piece.
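To make the "formula built from smaller formulas" idea concrete, here is a minimal sketch in plain Rust, without Burn. The names `affine` and `two_layer` and the weight and bias values are made up for illustration. Real networks also place a non-linear function between layers, which this sketch leaves out.

```rust
/// One "small formula": scale the input by a weight, then shift it by a bias.
fn affine(x: f32, weight: f32, bias: f32) -> f32 {
    weight * x + bias
}

/// Two small formulas stacked: the output of the first feeds the second.
/// The weights and biases here are arbitrary illustration values.
fn two_layer(x: f32) -> f32 {
    let hidden = affine(x, 2.0, 1.0); // 2x + 1
    affine(hidden, 3.0, -1.0) // 3 * hidden - 1
}

fn main() {
    // For x = 1.0: hidden = 2 * 1 + 1 = 3, output = 3 * 3 - 1 = 8
    println!("{}", two_layer(1.0));
}
```

Training would adjust the four numbers passed to `affine` instead of leaving them fixed.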

Now let's start writing some code. The Layer is the first thing we will build.

Define a Layer

Create a new file, model.rs. Our implementation of the Layer struct looks like this:

use burn::module::Param;
use burn::prelude::{Backend, Tensor};

#[derive(Debug)]
pub struct Layer<B: Backend> {
    weight: Param<Tensor<B, 2>>,
    bias: Param<Tensor<B, 1>>,
}

As previously mentioned, our Layer struct consists of weight and bias fields.

In this implementation, we wrap these fields in Param, which is Burn’s specialized type for learnable parameters. By using Param, we can easily initialize and update the parameter values.

Why this shape?

If the layer expects d_input features and produces d_out features, the weights should have shape [d_input, d_out] and the biases should have shape [d_out].

This will make sense later when we discuss matrix multiplication and the forward pass.
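As a preview, here is a rough sketch of why these shapes line up. It assumes the forward pass computes input · weight + bias, which is the usual convention for this layout. It uses plain nested vectors rather than Burn tensors, and the function name `forward` is hypothetical.

```rust
/// Computes `input · weight + bias` for one batch.
/// Shapes: input [batch, d_input], weight [d_input, d_out], bias [d_out].
/// The result has shape [batch, d_out].
fn forward(input: &[Vec<f32>], weight: &[Vec<f32>], bias: &[f32]) -> Vec<Vec<f32>> {
    let d_input = weight.len();
    let d_out = bias.len();
    input
        .iter()
        .map(|row| {
            // The multiplication only works if input columns match weight rows.
            assert_eq!(row.len(), d_input, "input columns must equal weight rows");
            (0..d_out)
                .map(|j| {
                    let dot: f32 = (0..d_input).map(|i| row[i] * weight[i][j]).sum();
                    // One bias value per output feature.
                    dot + bias[j]
                })
                .collect()
        })
        .collect()
}

fn main() {
    let input = vec![vec![1.0, 2.0, 3.0]]; // shape [1, 3]
    let weight = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 1.0]]; // shape [3, 2]
    let bias = vec![0.5, -1.0]; // shape [2]
    let output = forward(&input, &weight, &bias); // shape [1, 2]
    println!("{:?}", output);
}
```

Each of the d_out columns of the weight matrix produces one output feature, which is why the bias needs exactly d_out entries.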

Build an Initializer

The next thing we will build is the initializer.

Neural-network layers contain parameters (weights and biases) and these parameters must be initialized before we can use them.
Initialization plays a big role in training stability, but for now, we will start with the simplest possible strategy so we can focus on understanding the mechanics.

Let's create a new file called initializer.rs.

Step 1. Define the initializer type

We start with a simple enum that describes two basic initialization strategies: Zeroes and Ones, which set every value to 0 and 1, respectively.

#[derive(Clone)]
pub enum Initializer {
    Zeroes,
    Ones,
}

Keep in mind that these are not used in real-world training, but they are good examples for learning. We will add more realistic initializers later.
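One commonly cited reason constant initializers are avoided in practice is symmetry: if every weight is the same, every output unit computes the same value. The sketch below illustrates this with plain Rust vectors. The helpers `layer_output` and `constant_weight` are hypothetical and not part of Burn.

```rust
/// Output of one dense layer for a single input vector.
/// weight has shape [d_input, d_out], bias has shape [d_out].
fn layer_output(input: &[f32], weight: &[Vec<f32>], bias: &[f32]) -> Vec<f32> {
    (0..bias.len())
        .map(|j| {
            let dot: f32 = input.iter().zip(weight).map(|(x, row)| x * row[j]).sum();
            dot + bias[j]
        })
        .collect()
}

/// Builds a [d_input, d_out] weight matrix filled with one constant value.
fn constant_weight(value: f32, d_input: usize, d_out: usize) -> Vec<Vec<f32>> {
    vec![vec![value; d_out]; d_input]
}

fn main() {
    let input = [0.2, -1.0, 3.0];
    let ones = constant_weight(1.0, 3, 4);
    let out = layer_output(&input, &ones, &[1.0; 4]);
    // Every output unit sees identical weights, so all four values are equal.
    assert!(out.windows(2).all(|pair| pair[0] == pair[1]));
    println!("{:?}", out);
}
```

Because the units start out identical, there is nothing to tell them apart, which is why random initialization is the usual choice.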

Step 2. Implement the init_with function

Next, we implement a method that actually creates a tensor using the chosen initializer.

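use burn::module::{Param, ParamId};
use burn::prelude::{Backend, Shape, Tensor};

impl Initializer {
    pub fn init_with<B: Backend, const D: usize, S: Into<Shape>>(
        &self,
        shape: S,
        device: &B::Device,
    ) -> Param<Tensor<B, D>> {
        // Owned copies: the closure below takes `shape` and `config` by move,
        // and the device is passed to Param::uninitialized by value.
        let device = device.clone();
        let shape: Shape = shape.into();
        let config = self.clone();

        Param::uninitialized(
            ParamId::new(),
            // Builds the tensor on the device handed to the closure.
            move |device, _| {
                let tensor = match config {
                    Initializer::Zeroes => Tensor::<B, D>::zeros(shape, &device),
                    Initializer::Ones => Tensor::<B, D>::ones(shape, &device),
                };

                tensor
            },
            device,
            false,
        )
    }
}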

Let's break this down piece by piece.

Generics

pub fn init_with<B: Backend, const D: usize, S: Into<Shape>>

This is a flexible initializer because:

  • B is the tensor backend (for example NdArray on the CPU, or a GPU backend)

  • D is the rank of the tensor (1D, 2D, 3D…)

  • S is anything that can be converted into a Shape

This allows you to create any tensor shape with a single function.
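The same pattern of generics can be tried out in plain Rust. The sketch below is only an illustration: `DemoShape`, `DemoTensor`, and `filled` are hypothetical stand-ins, not Burn types, and they only mimic the roles that `Shape`, `Tensor`, and `init_with` play in the signature above.

```rust
/// Hypothetical stand-in for a shape type: just a list of dimension sizes.
#[derive(Debug, PartialEq)]
struct DemoShape {
    dims: Vec<usize>,
}

// Any fixed-size array of dimension sizes converts into a DemoShape.
impl<const N: usize> From<[usize; N]> for DemoShape {
    fn from(dims: [usize; N]) -> Self {
        Self { dims: dims.to_vec() }
    }
}

// So does a Vec of dimension sizes.
impl From<Vec<usize>> for DemoShape {
    fn from(dims: Vec<usize>) -> Self {
        Self { dims }
    }
}

/// Hypothetical tensor-like buffer whose rank D is part of its type.
struct DemoTensor<const D: usize> {
    dims: [usize; D],
    data: Vec<f32>,
}

/// D is the rank, S is anything convertible into a DemoShape.
fn filled<const D: usize, S: Into<DemoShape>>(shape: S, value: f32) -> DemoTensor<D> {
    let shape: DemoShape = shape.into();
    // Panics if the number of dimensions does not match the rank D.
    let dims: [usize; D] = shape
        .dims
        .try_into()
        .expect("shape length must equal rank D");
    let len: usize = dims.iter().product();
    DemoTensor { dims, data: vec![value; len] }
}

fn main() {
    // Same turbofish style as the article: rank first, then the shape type.
    let matrix = filled::<2, [usize; 2]>([3, 2], 0.0);
    let vector = filled::<1, Vec<usize>>(vec![10], 1.0);
    println!("{:?} holds {} values", matrix.dims, matrix.data.len());
    println!("{:?} holds {} values", vector.dims, vector.data.len());
}
```

One function covers both the rank-2 and the rank-1 case, and both an array and a Vec are accepted as the shape argument.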

Arguments

  • shape tells us the dimension sizes, for example [3, 2] or [10]

  • device determines where the tensor lives (CPU / GPU)

Initialization

Burn initializes params using a closure:

Param::uninitialized(
    ParamId::new(),
    move |device, _| {
        let tensor = match config {
            Initializer::Zeroes => Tensor::<B, D>::zeros(shape, &device),
            Initializer::Ones => Tensor::<B, D>::ones(shape, &device),
        };

        tensor
    },
    device,
    false,
)

This closure:

  1. Receives the device

  2. Creates the actual tensor using the chosen initializer

  3. Returns it as the parameter value
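The closure-based design can be sketched with the standard library alone. This assumes the closure is deferred until the value is first read, which is what the name `uninitialized` suggests. `LazyParam` is a hypothetical type written for this illustration, not how Burn implements `Param`.

```rust
use std::cell::{Cell, OnceCell};

/// Hypothetical lazily initialized parameter: stores a closure first,
/// and only builds the actual values when they are requested.
struct LazyParam {
    init: Cell<Option<Box<dyn FnOnce() -> Vec<f32>>>>,
    value: OnceCell<Vec<f32>>,
}

impl LazyParam {
    /// Stores the closure without running it.
    fn uninitialized(init: impl FnOnce() -> Vec<f32> + 'static) -> Self {
        Self {
            init: Cell::new(Some(Box::new(init))),
            value: OnceCell::new(),
        }
    }

    fn is_initialized(&self) -> bool {
        self.value.get().is_some()
    }

    /// Runs the closure on first access, then reuses the stored value.
    fn val(&self) -> &Vec<f32> {
        self.value.get_or_init(|| {
            let init = self.init.take().expect("initializer already consumed");
            init()
        })
    }
}

fn main() {
    let param = LazyParam::uninitialized(|| vec![0.0; 4]);
    assert!(!param.is_initialized()); // nothing has been built yet
    let values = param.val(); // the closure runs here
    assert_eq!(values.len(), 4);
    assert!(param.is_initialized());
}
```

Deferring the work this way means that describing a parameter and allocating its memory are separate steps.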

Step 3. Create a layer with an initializer

Add an init_with constructor to the Layer struct:

use crate::initializer::Initializer;

impl<B: Backend> Layer<B> {
    pub fn init_with(
        initializer: &Initializer,
        d_input: usize,
        d_out: usize,
        device: &B::Device,
    ) -> Self {
        // `device` is already a reference, so it is passed through as-is.
        let weight = initializer.init_with::<B, 2, [usize; 2]>([d_input, d_out], device);
        let bias = initializer.init_with::<B, 1, [usize; 1]>([d_out], device);
        Self { weight, bias }
    }
}

This creates:

  • weight tensor → shape [d_input, d_out]

  • bias tensor → shape [d_out]

Both are filled by whichever initializer the caller passes in.
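These two shapes also determine how many learnable values a layer holds. A small sketch, with a hypothetical helper `param_count` that is not part of the article's code:

```rust
/// Number of learnable values in a layer with a [d_input, d_out] weight
/// and a [d_out] bias.
fn param_count(d_input: usize, d_out: usize) -> usize {
    d_input * d_out + d_out
}

fn main() {
    // A layer mapping 3 features to 2 outputs: 3 * 2 weights + 2 biases = 8.
    println!("{}", param_count(3, 2));
}
```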

Run the code

In your main function, you can now create a layer and print its raw tensor values:

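// Declare the two files created in this part as modules of the binary crate.
mod initializer;
mod model;

use crate::initializer::Initializer;
use crate::model::Layer;
use burn::backend::NdArray;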

fn main() {
    type B = NdArray;
    let zero_tensors =
        Initializer::Zeroes.init_with::<B, 2, [usize; 2]>([2, 2], &Default::default());
    let one_tensors = Initializer::Ones.init_with::<B, 2, [usize; 2]>([2, 2], &Default::default());

    println!(
        "Zero Tensors: {:?}",
        zero_tensors.val().to_data().to_vec::<f32>().unwrap()
    );
    println!(
        "One Tensors: {:?}",
        one_tensors.val().to_data().to_vec::<f32>().unwrap()
    );

    let layer: Layer<B> = Layer::init_with(&Initializer::Ones, 1, 1, &Default::default());
    println!("Layer: {:?}", layer);
}

Running this will show the actual numbers stored in the weight and bias tensors.

Right now they are all 1.0 or 0.0, depending on your initializer.
In future parts, we will replace this with proper random initialization.

Conclusion

We now have a fully functional, flexible initializer component that:

  • creates tensors of any shape

  • works with any backend

  • supports multiple initialization strategies

This initializer will be used when constructing layers so that weights and biases start with controlled values.

In the next part, we’ll load the data and start preparing tensors for training.