Transformers
16-824 Visual Learning and Recognition: Homework 3 · Spring 2023 · GITHUB About The Project Implemented and trained different components of a Transformer decoder for image captioning using a subset of the COCO dataset. Additionally, a Vision Transformer (ViT) was implemented for classification on CIFAR10. Built With Python NumPy Pytorch Results For the entire report, please refer to the Documentation