Transformers | Jiyoon Park

16-824 Visual Learning and Recognition: Homework 3 · Spring 2023 · GITHUB

About The Project

Implemented and trained different components of a Transformer decoder for image captioning using a subset of the COCO dataset. Additionally, a Vision Transformer (ViT) was implemented for classification on CIFAR10.

Built With

Python
NumPy
Pytorch

Results

For the entire report, please refer to the Documentation

About The Project#

Built With#

Results#

About The Project

Built With

Results