Sure. The neural network in this project recognizes handwritten digits. The input is an 81x81 2D grid. A few fully-connected layers classify those input pixels into 10 possible outputs, one for each digit in our number system (0 through 9), and assign each output a probability.
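To make the "pixels in, 10 probabilities out" idea concrete, here is a rough sketch of a forward pass in plain Python. It is not the project's actual model: the single hidden layer, its size (`HIDDEN_SIZE`), the ReLU activation, and the random weights are all assumptions for illustration, so the prediction it makes is meaningless. Only the 81x81 input and the 10 outputs come from the description above.

```python
import math
import random

INPUT_SIZE = 81 * 81   # the 81x81 grid, flattened into one long list
HIDDEN_SIZE = 16       # hypothetical; the real layer sizes are not stated
NUM_CLASSES = 10       # digits 0-9


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully-connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Compute y = W x + b: every output sees every input."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    """Assumed activation: keep positive values, zero out the rest."""
    return [max(0.0, v) for v in x]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(grid, hidden, output):
    """Flatten the grid, run both layers, return (digit, probabilities)."""
    pixels = [p for row in grid for p in row]
    probs = softmax(dense(relu(dense(pixels, hidden)), output))
    return probs.index(max(probs)), probs


rng = random.Random(0)
hidden = make_layer(INPUT_SIZE, HIDDEN_SIZE, rng)
output = make_layer(HIDDEN_SIZE, NUM_CLASSES, rng)

# A random 81x81 grid stands in for a drawn digit.
grid = [[rng.random() for _ in range(81)] for _ in range(81)]
digit, probs = classify(grid, hidden, output)
print(digit, [round(p, 3) for p in probs])
```

In a trained model the weights would come from training rather than `make_layer`, and the most probable output would be the guessed digit.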
The model is trained with PyTorch, a machine learning framework, and exported to ONNX. The ONNX file is then loaded by a backend server written in Python, which communicates with Godot.
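As a rough idea of what the backend's job could look like, here is a sketch of a request handler. The message format is purely an assumption, since how the server and Godot talk to each other isn't described here: the JSON shape, the `pixels` and `digit` field names, `handle_request`, and `fake_classify` are all hypothetical. In the real backend, the stand-in classifier would be replaced by a call into the exported ONNX model.

```python
import json

EXPECTED_PIXELS = 81 * 81  # one value per cell of the 81x81 grid


def handle_request(raw, classify):
    """Decode one JSON request, classify it, and encode a JSON reply.

    `classify` takes a flat list of pixel values and returns 10 probabilities.
    """
    try:
        pixels = json.loads(raw)["pixels"]
    except (ValueError, KeyError, TypeError):
        return json.dumps({"error": "expected a JSON object with a 'pixels' list"})
    if not isinstance(pixels, list) or len(pixels) != EXPECTED_PIXELS:
        return json.dumps({"error": f"expected {EXPECTED_PIXELS} pixel values"})
    probs = classify(pixels)
    # The guessed digit is the index of the highest probability.
    digit = max(range(len(probs)), key=probs.__getitem__)
    return json.dumps({"digit": digit, "probabilities": probs})


def fake_classify(pixels):
    """Stand-in for the ONNX model call: always favours digit 3."""
    return [0.7 if d == 3 else 0.3 / 9 for d in range(10)]


request = json.dumps({"pixels": [0.0] * EXPECTED_PIXELS})
reply = handle_request(request, fake_classify)
print(json.loads(reply)["digit"])
```

Keeping the handler separate from the transport (socket, HTTP, or whatever is actually used) means it can be tested without a running server.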
The math behind fully-connected layers is a little complicated. If you want to get into the details of how exactly a machine "learns", go watch some YouTube tutorials on back-propagation and statistical modeling.
The tricky part here is embedding an entire video (Bad Apple in this case) into the weights of the fully-connected layers. Training the model under those custom constraints was really hard. The model only guesses a human's handwriting correctly about 60% of the time, based on the augmented dataset it was tested on.
I think it's a Lemmy parsing issue that the word "The" and the repo link got merged.