Part 3: Building a Neural Network Layer
Now that our dataset is ready, we can start building the components of our neural network.
But before we jump into the code, let’s pause for a moment to define what deep learning actually is. Understanding the "why" and "what" behind our implementation is as important as writing the code itself.
What is Deep Learning?
Deep learning is a method for teaching machines to learn patterns by stacking many mathematical transformations, known as layers.
Each layer consists of:
Weights: Parameters that determine how strongly each input influences the output.
Biases: Values added to the output that let the layer shift its result independently of the input.
During training, these weights and biases are adjusted so the network "fits" the data. You can think of a neural network as a giant formula built from smaller formulas. It might feel a bit abstract at first, but it will start to make sense as we build each piece.
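To make the "formula built from smaller formulas" idea a little more concrete, here is one common way to write it down. The symbols are assumptions chosen for illustration and do not come from the code in this series: x is the input, W a weight matrix, and b a bias vector. The second formula stacks two such layers and leaves out the non-linear function that real networks place between them.

```latex
y = xW + b
\qquad
f(x) = \bigl(xW_1 + b_1\bigr)W_2 + b_2
```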
Now let's start writing some code. The Layer is the first thing we will build.
Define a Layer
Create a new file, model.rs. Our implementation of the Layer struct looks like this:
use burn::module::Param;
use burn::prelude::{Backend, Tensor};

#[derive(Debug)]
pub struct Layer<B: Backend> {
    // Rank-2 tensor of learnable weights.
    weight: Param<Tensor<B, 2>>,
    // Rank-1 tensor of learnable biases.
    bias: Param<Tensor<B, 1>>,
}
As previously mentioned, our Layer struct consists of weight and bias fields.
In this implementation, we wrap these fields in Param, which is Burn’s specialized type for learnable parameters. By using Param, we can easily initialize and update the parameter values.
Why this shape?
If the layer expects d_input features and produces d_out features, the weights should have shape [d_input, d_out] and the biases should have shape [d_out].
This will make sense later when we discuss matrix multiplication and the forward pass.
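As a preview, here is a minimal sketch of why those shapes fit together. It uses plain Rust vectors instead of Burn tensors, and the forward function is a hypothetical helper written only for this illustration. Each of the d_input input features needs one weight per output, and each of the d_out outputs needs one bias.

```rust
/// Hypothetical helper (not part of Burn): computes y = x * W + b for one input row.
/// `weight` has shape [d_input, d_out] and `bias` has shape [d_out].
fn forward(x: &[f32], weight: &[Vec<f32>], bias: &[f32]) -> Vec<f32> {
    let d_input = weight.len();
    let d_out = bias.len();
    assert_eq!(x.len(), d_input, "input must have d_input features");
    assert!(
        weight.iter().all(|row| row.len() == d_out),
        "each weight row must have d_out columns"
    );

    // Start from the bias, then add each input feature's contribution.
    let mut y = bias.to_vec();
    for (x_i, row) in x.iter().zip(weight) {
        for (y_j, w_ij) in y.iter_mut().zip(row) {
            *y_j += x_i * w_ij;
        }
    }
    y
}
```

Calling forward with 3 input features, a weight matrix of 3 rows and 2 columns, and 2 biases returns 2 output values. If the shapes do not line up, the assertions fail.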
Build an Initializer
The next thing we will build is the initializer.
Neural network layers contain parameters (weights and biases), and these parameters must be initialized before we can use them.
Initialization plays a big role in training stability, but for now, we will start with the simplest possible strategy so we can focus on understanding the mechanics.
Let's create a new file called initializer.rs.
Step 1. Define the initializer type
We start with a simple enum that describes two basic initialization strategies: Zeroes and Ones, which set every value to 0 and 1 respectively.
#[derive(Clone)]
pub enum Initializer {
    /// Fill every element with 0.
    Zeroes,
    /// Fill every element with 1.
    Ones,
}
Keep in mind that these are not used in real-world training, but they are good examples for learning. We will add more realistic initializers later.
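One reason constant initializers are a poor fit for real training is that every output unit starts out computing exactly the same value. The sketch below shows this with plain Rust vectors rather than Burn tensors. Both functions are hypothetical helpers written for this illustration.

```rust
/// Computes one output per column of `weight`:
/// y[j] = bias[j] + sum over i of x[i] * weight[i][j].
fn layer_output(x: &[f32], weight: &[Vec<f32>], bias: &[f32]) -> Vec<f32> {
    (0..bias.len())
        .map(|j| {
            let weighted_sum: f32 = x.iter().zip(weight).map(|(x_i, row)| x_i * row[j]).sum();
            bias[j] + weighted_sum
        })
        .collect()
}

/// Hypothetical helper: builds a layer whose weights and biases all equal `value`
/// (0.0 mirrors `Zeroes`, 1.0 mirrors `Ones`) and applies it to `x`.
fn constant_layer_output(x: &[f32], d_out: usize, value: f32) -> Vec<f32> {
    let weight = vec![vec![value; d_out]; x.len()];
    let bias = vec![value; d_out];
    layer_output(x, &weight, &bias)
}
```

With value set to 1.0, every output equals the sum of the inputs plus one. With 0.0, every output is zero no matter what the input is. Because the outputs are indistinguishable, the units have nothing that lets them learn different things, which is the problem random initialization addresses.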
Step 2. Implement the init_with function
Next, we implement a method that actually creates a tensor using the chosen initializer.
use burn::module::{Param, ParamId};
use burn::prelude::{Backend, Shape, Tensor};

impl Initializer {
    pub fn init_with<B: Backend, const D: usize, S: Into<Shape>>(
        &self,
        shape: S,
        device: &B::Device,
    ) -> Param<Tensor<B, D>> {
        // Take owned copies so they can be moved into the closure and the Param.
        let device = device.clone();
        let shape: Shape = shape.into();
        let config = self.clone();
        Param::uninitialized(
            ParamId::new(),
            // The closure builds the tensor for the chosen strategy.
            move |device, _| {
                let tensor = match config {
                    Initializer::Zeroes => Tensor::<B, D>::zeros(shape, &device),
                    Initializer::Ones => Tensor::<B, D>::ones(shape, &device),
                };
                tensor
            },
            device,
            false,
        )
    }
}
Let's break this down piece by piece.
Generics
pub fn init_with<B: Backend, const D: usize, S: Into<Shape>>
This is a flexible initializer because:
B is the tensor backend (for example NdArray on the CPU, or a GPU backend)
D is the rank of the tensor (1D, 2D, 3D…)
S is anything that can be converted into a Shape
This allows you to create any tensor shape with a single function.
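If const generics and Into bounds are new to you, the following sketch shows the same pattern in plain Rust. MiniShape and num_elements are hypothetical names invented for this illustration. Burn's real Shape type is richer than this stand-in.

```rust
/// Hypothetical stand-in for a shape type; not Burn's `Shape`.
#[derive(Debug)]
struct MiniShape {
    dims: Vec<usize>,
}

// Fixed-size arrays such as [3, 2] convert into a MiniShape...
impl<const N: usize> From<[usize; N]> for MiniShape {
    fn from(dims: [usize; N]) -> Self {
        MiniShape { dims: dims.to_vec() }
    }
}

// ...and so do vectors.
impl From<Vec<usize>> for MiniShape {
    fn from(dims: Vec<usize>) -> Self {
        MiniShape { dims }
    }
}

/// Counts the elements of a rank-`D` tensor. `D` is fixed at compile time,
/// while the dimension sizes arrive at run time through `Into<MiniShape>`.
fn num_elements<const D: usize, S: Into<MiniShape>>(shape: S) -> usize {
    let shape: MiniShape = shape.into();
    assert_eq!(shape.dims.len(), D, "shape must have exactly D dimensions");
    shape.dims.iter().product()
}
```

A call such as num_elements::<2, [usize; 2]>([3, 2]) has the same form as the init_with calls later in this part: the rank and the shape type are spelled out in the turbofish, and the sizes are passed as a value.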
Arguments
shape tells us the dimension sizes, for example [3, 2] or [10]
device determines where the tensor lives (CPU / GPU)
Initialization
Burn initializes params using a closure:
Param::uninitialized(
    ParamId::new(),
    move |device, _| {
        let tensor = match config {
            Initializer::Zeroes => Tensor::<B, D>::zeros(shape, &device),
            Initializer::Ones => Tensor::<B, D>::ones(shape, &device),
        };
        tensor
    },
    device,
    false,
)
This closure:
Receives the device
Creates the actual tensor using the chosen initializer
Returns it as the parameter value
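Passing a closure instead of a finished tensor suggests that the value is meant to be created lazily, when it is first needed. The sketch below shows that general pattern with the standard library only. It assumes this reading of the API, and LazyValue and lazy_ones are hypothetical names, not Burn types.

```rust
use std::cell::{Cell, OnceCell};

/// Hypothetical stand-in for a lazily initialized parameter (not Burn's `Param`).
struct LazyValue<T, F: FnOnce() -> T> {
    cell: OnceCell<T>,
    init: Cell<Option<F>>,
}

impl<T, F: FnOnce() -> T> LazyValue<T, F> {
    /// Stores the closure without running it.
    fn new(init: F) -> Self {
        LazyValue { cell: OnceCell::new(), init: Cell::new(Some(init)) }
    }

    /// Reports whether the closure has already produced a value.
    fn is_initialized(&self) -> bool {
        self.cell.get().is_some()
    }

    /// Runs the closure on first access and returns the cached value afterwards.
    fn val(&self) -> &T {
        self.cell.get_or_init(|| {
            let init = self.init.take().expect("initializer runs at most once");
            init()
        })
    }
}

/// Builds a lazy vector of ones, loosely mirroring the `Ones` strategy.
fn lazy_ones(len: usize) -> LazyValue<Vec<f32>, impl FnOnce() -> Vec<f32>> {
    LazyValue::new(move || vec![1.0; len])
}
```

Nothing is allocated when lazy_ones returns. The vector only comes into existence on the first call to val, and later calls reuse it.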
Step 3. Create a layer with an initializer
Add an init_with constructor to the Layer struct in model.rs:
use crate::initializer::Initializer;

impl<B: Backend> Layer<B> {
    pub fn init_with(
        initializer: &Initializer,
        d_input: usize,
        d_out: usize,
        device: &B::Device,
    ) -> Self {
        // `device` is already a reference, so it is passed through as-is.
        let weight = initializer.init_with::<B, 2, [usize; 2]>([d_input, d_out], device);
        let bias = initializer.init_with::<B, 1, [usize; 1]>([d_out], device);
        Self { weight, bias }
    }
}
This creates the following tensors, using whatever initializer the caller passes in:
weight tensor → shape [d_input, d_out]
bias tensor → shape [d_out]
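These two shapes also tell us how many learnable values a layer holds. A small sketch in plain Rust, where layer_shapes and parameter_count are hypothetical helpers that are not part of the project code:

```rust
/// Hypothetical helper: the weight and bias shapes for a layer.
fn layer_shapes(d_input: usize, d_out: usize) -> ([usize; 2], [usize; 1]) {
    ([d_input, d_out], [d_out])
}

/// Hypothetical helper: total number of learnable values in the layer.
fn parameter_count(d_input: usize, d_out: usize) -> usize {
    let (weight_shape, bias_shape) = layer_shapes(d_input, d_out);
    // One weight per (input, output) pair, plus one bias per output.
    weight_shape[0] * weight_shape[1] + bias_shape[0]
}
```

For example, a layer with 3 inputs and 2 outputs holds 3 * 2 weights and 2 biases, 8 values in total.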
Run the code
In your main function, you can now create a layer and print its raw tensor values:
// Declare the modules created in this part (skip these if they are already declared).
mod initializer;
mod model;

use crate::initializer::Initializer;
use crate::model::Layer;
use burn::backend::NdArray;

fn main() {
    type B = NdArray;

    // Initialize two standalone 2x2 parameters.
    let zero_tensors =
        Initializer::Zeroes.init_with::<B, 2, [usize; 2]>([2, 2], &Default::default());
    let one_tensors = Initializer::Ones.init_with::<B, 2, [usize; 2]>([2, 2], &Default::default());

    // Read the values back as plain f32 vectors.
    println!(
        "Zero Tensors: {:?}",
        zero_tensors.val().to_data().to_vec::<f32>().unwrap()
    );
    println!(
        "One Tensors: {:?}",
        one_tensors.val().to_data().to_vec::<f32>().unwrap()
    );

    // Build a layer with one input feature and one output feature.
    let layer: Layer<B> = Layer::init_with(&Initializer::Ones, 1, 1, &Default::default());
    println!("Layer: {:?}", layer);
}
Running this will show the actual numbers stored in the weight and bias tensors.
Right now they are all 1.0 or 0.0, depending on your initializer.
In future parts, we will replace this with proper random initialization.
Conclusion
We now have a fully functional, flexible initializer component that:
creates tensors of any shape
works with any backend
supports multiple initialization strategies
This initializer will be used when constructing layers so that weights and biases start with controlled values.
In the next part, we’ll load the data and start preparing tensors for training.