Václav Tran authored
Milestone
Read Articles and Sources
- Were RNNs All We Needed?
- Coding a Recurrent Neural Network (RNN) from scratch using PyTorch
- PyTorch RNN Class Documentation
- MAPE
Brief Assignment Summary
The project focuses on predicting the parameters of sinusoidal waves using neural networks with the minGRU architecture. The core task is to predict the amplitude (A) and frequency (ω) from discretized sine-wave samples. The waves are defined as y(t) = A sin(ωt) with t ∈ [0, 2π], sampled at 100 points.
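For illustration, the generation step could look like the following minimal sketch. The function name and the uniform sampling of A and ω are assumptions made for the example; the constants match the training configuration listed below.

```python
import torch

def generate_dataset(n_samples=1000, n_points=100, max_amp=10.0,
                     max_freq=10.0, noise_std=0.0):
    """Sample (A, ω) and discretize y(t) = A·sin(ωt) on [0, 2π].
    Uniform parameter sampling is an assumption for this sketch."""
    t = torch.linspace(0, 2 * torch.pi, n_points)        # (n_points,)
    amp = torch.rand(n_samples, 1) * max_amp             # A in (0, MAX_AMP)
    freq = torch.rand(n_samples, 1) * max_freq           # ω in (0, MAX_FREQ)
    waves = amp * torch.sin(freq * t)                    # (n_samples, n_points)
    waves = waves + noise_std * torch.randn_like(waves)  # optional Gaussian noise
    targets = torch.cat([amp, freq], dim=1)              # (n_samples, 2): (A, ω)
    return waves, targets
```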
Current Progress
- Implemented dataset generation of sinusoidal waves
- Implemented a multi-layer minGRU architecture that allows parallel computation of hidden states in log space (see the sketch after this list)
- Trained minGRU with various levels of network depth - using AdamW optimizer and MSE criterion
- So far I have only experimented with varying network depth; experiments with the number of neurons in the hidden layers were also carried out, but unfortunately I did not save those results.
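For reference, below is a minimal sketch of the log-space parallel-scan formulation from "Were RNNs All We Needed?"; the class and helper names are illustrative, and the repository's implementation may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def log_g(x):
    # log of g(x) = x + 0.5 for x >= 0, sigmoid(x) otherwise;
    # g keeps candidate hidden states strictly positive so logs exist.
    return torch.where(x >= 0, (F.relu(x) + 0.5).log(), -F.softplus(-x))

def parallel_scan_log(log_coeffs, log_values):
    # Solves h_t = a_t * h_{t-1} + b_t for all t at once, in log space.
    # log_coeffs: (B, T, H) = log a_t; log_values: (B, T+1, H) = log [h_0, b_1..b_T].
    a_star = F.pad(torch.cumsum(log_coeffs, dim=1), (0, 0, 1, 0))
    log_h = a_star + torch.logcumsumexp(log_values - a_star, dim=1)
    return torch.exp(log_h)[:, 1:]  # drop h_0 -> (B, T, H)

class MinGRU(nn.Module):
    # h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t, with z_t and h̃_t depending
    # only on x_t, so the recurrence is linear in h and admits a parallel scan.
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.linear_z = nn.Linear(input_size, hidden_size)
        self.linear_h = nn.Linear(input_size, hidden_size)

    def forward(self, x, h0):
        # x: (B, T, input_size); h0: (B, 1, hidden_size), strictly positive
        k = self.linear_z(x)
        log_z = -F.softplus(-k)            # log σ(k)  = log z_t
        log_one_minus_z = -F.softplus(k)   # log σ(-k) = log (1 - z_t)
        log_tilde_h = log_g(self.linear_h(x))
        return parallel_scan_log(
            log_one_minus_z,
            torch.cat([log_g(h0), log_z + log_tilde_h], dim=1),
        )
```

Because the gates depend only on x_t, all hidden states can be computed at once with a prefix scan instead of a sequential loop; taking logs turns the cumulative products into cumulative sums, which is numerically stabler over long sequences.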
Results
Training Configuration
The model was trained with the following hyperparameters (a training-loop sketch using these values follows the list):
- Dataset size (N): 1000
- Maximum amplitude (MAX_AMP): 10
- Maximum frequency (MAX_FREQ): 10
- Points per sequence (N_POINTS): 100
- Batch size: 16
- Learning rate: 0.001
- Number of epochs: 100
- Hidden size: 32
- Noise standard deviation: 0
- Train/val split: 0.8 / 0.2
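As a rough illustration, a training run with these settings could look like the sketch below. SineRegressor is a hypothetical wrapper reusing MinGRU and generate_dataset from the earlier sketches; the actual output head in the repository may differ.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class SineRegressor(nn.Module):
    # Hypothetical wrapper: stacks MinGRU layers and maps the final
    # hidden state to the two targets (A, ω).
    def __init__(self, hidden_size=32, n_layers=4):
        super().__init__()
        sizes = [1] + [hidden_size] * n_layers
        self.layers = nn.ModuleList(
            MinGRU(i, o) for i, o in zip(sizes[:-1], sizes[1:])
        )
        self.head = nn.Linear(hidden_size, 2)

    def forward(self, x):                   # x: (B, 100) wave samples
        h = x.unsqueeze(-1)                 # (B, T, 1): one scalar per step
        for layer in self.layers:
            h0 = torch.ones(h.size(0), 1, layer.linear_h.out_features)
            h = layer(h, h0)                # positive initial state required
        return self.head(h[:, -1])          # predict (A, ω)

waves, targets = generate_dataset()         # from the earlier sketch
n_train = int(0.8 * len(waves))             # 0.8 / 0.2 train/val split
train_loader = DataLoader(TensorDataset(waves[:n_train], targets[:n_train]),
                          batch_size=16, shuffle=True)

model = SineRegressor(hidden_size=32, n_layers=4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

for epoch in range(100):
    for batch_waves, batch_targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(batch_waves), batch_targets)
        loss.backward()
        optimizer.step()
```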
So far, the architecture has been evaluated at three network depths:
- 1 layer
- 4 layers
- 8 layers
Current Results
The model was evaluated with three different network depths (1, 4, and 8 layers), with all other hyperparameters held constant. Mean absolute percentage error (MAPE) was used as the evaluation metric (lower is better). Best model selection is based on the arithmetic mean of amplitude and frequency MAPE on the validation set.
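As a sketch, the metric and the selection score could be computed as follows; val_waves and val_targets denote the held-out 20% split from the training sketch, and the eps guard against near-zero targets is an assumption.

```python
import torch

def mape(pred, target, eps=1e-8):
    # Mean absolute percentage error in percent; eps guards near-zero targets.
    return 100.0 * ((pred - target).abs() / (target.abs() + eps)).mean()

val_waves, val_targets = waves[n_train:], targets[n_train:]  # held-out 20%
with torch.no_grad():
    val_pred = model(val_waves)                  # (N_val, 2): predicted (A, ω)

amp_mape = mape(val_pred[:, 0], val_targets[:, 0])
freq_mape = mape(val_pred[:, 1], val_targets[:, 1])
selection_score = 0.5 * (amp_mape + freq_mape)   # lower is better
```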
1 Layer:
- Training Loss: 23.70
- Validation Loss: 15.20
- Amplitude MAPE: Train 1.01% / Val 0.86%
- Frequency MAPE: Train 1.16% / Val 2.81%
4 Layers:
- Training Loss: 0.17
- Validation Loss: 0.10
- Amplitude MAPE: Train 0.10% / Val 0.09%
- Frequency MAPE: Train 0.25% / Val 0.09%
8 Layers:
- Training Loss: 0.17
- Validation Loss: 0.08
- Amplitude MAPE: Train 0.40% / Val 0.05%
- Frequency MAPE: Train 2.55% / Val 0.06%
The 4-layer model achieved the best overall performance, followed by the 8-layer model, which has a substantially higher frequency MAPE on the training set (2.55% vs. 0.25%). Both significantly outperformed the single-layer model (1.84% average validation MAPE). The deeper architectures demonstrated substantially better parameter prediction accuracy.
Project Repository: [Link to repository]
Next steps include:
- Hyperparameter optimization experiments
- Testing robustness to input noise
- Possibly a comparison with other recurrent models (regular GRU, LSTM, RNN)