Hi all,

We run an AI lab, Pulseinnovas, and our lab has developed a new architecture that outperforms Google's Transformer architecture. On our benchmark dataset, the Transformer took 20k epochs of training to generalize, while our model did it in just 650 epochs, cutting training cost by an estimated 98-99%. I have attached our benchmark run results.
In the attached plot, the blue line attaining the accuracy plateau is Google's Transformer architecture and the orange line is our model; the x axis shows the number of epochs and the y axis shows test accuracy.
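As a quick sanity check on the numbers quoted above, here is a minimal sketch of the epoch-reduction arithmetic (variable names are illustrative; note that the raw epoch counts alone imply a reduction of just under 97%, so the 98-99% figure presumably also factors in per-epoch cost differences):

```python
# Sanity check on the epoch counts quoted in the post.
baseline_epochs = 20_000  # Transformer epochs to generalize (from the post)
our_epochs = 650          # our model's epochs (from the post)

reduction = 1 - our_epochs / baseline_epochs
print(f"Epoch reduction: {reduction:.2%}")  # roughly 96.75%
```

Fewer epochs only translate directly into lower training cost if the per-epoch compute of both models is comparable, so wall-clock or FLOP comparisons would strengthen the claim.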
