Deep Learning

Has the AI-Era come to video games already?

With all the noise ChatGPT has made, many industries have noticed the value of LLM technologies. Unsurprisingly, the video game industry is one of them. In this blog post, I introduce several cool demos/WIPs that I’ve recently found, and share my opinions on why they might have a profound influence on the future of the video game industry.

Shaojie Jiang

Aug 13, 2023 8 min read Deep Learning, NLP, LLM, video games

One source of LLM hallucination is exposure bias

With the release of the closed-source ChatGPT and GPT-4 and the open-source LLaMA models, LLM development has seen tremendous improvements in recent months. While we are excited that these LLMs are capable of many tasks, we have also noticed again and again that they hallucinate content.

Shaojie Jiang

Aug 9, 2023 3 min read paper reading notes, Deep Learning, NLP, LLM, hallucination, information retrieval

Transformer Align Model

Jointly Learning to Align and Translate with Transformer Models

Shaojie Jiang

May 16, 2020 2 min read paper reading notes, Deep Learning, NLP

Compressive Transformers

Built on top of Transformer-XL, the Compressive Transformer condenses old memories (hidden states) and stores them in a compressed memory buffer before completely discarding them. This model is suitable for long-range sequence learning but may add too much computational burden for tasks that only involve short sequences.

Shaojie Jiang

May 12, 2020 3 min read paper reading notes, Deep Learning, NLP

Visualizing the Loss Landscape of Neural Nets

What characterizes a neural model that is easier to train and generalizes better?

Shaojie Jiang

May 6, 2020 3 min read paper reading notes, Deep Learning

Adaptive Computation Time

My notes for the paper: Adaptive Computation Time for Recurrent Neural Networks. Additive vs. multiplicative halting probability. Multiplicative: in the paper (footnote 1), the authors thoroughly discuss their considerations for deciding the computation time.

Shaojie Jiang

Apr 28, 2020 2 min read paper reading notes, Deep Learning

A Hub for Transformer Blogs and Papers

This is a growing list of pointers to useful blog posts and papers related to transformers. Transformers explained. Blog: The Illustrated Transformer has many intuitive animations of how transformer models work. Blog: Universal Transformers introduces the idea of recurrence among layers. Blog: Transformer vs RNN and CNN for Translation Task. GNNs: similarities and differences. Blog: Transformers are Graph Neural Networks bridges transformer models and Graph Neural Networks. Transformer improvements. Blog: DeepMind Releases a New Architecture and a New Dataset to Improve Long-Term Memory in Deep Learning Systems. Neural Turing Machine + transformer?

Shaojie Jiang

Mar 2, 2020 1 min read Deep Learning

What's New in XLNet?

In this post, I will try to understand what makes XLNet better than BERT.

Shaojie Jiang

Last updated on Jul 3, 2019 4 min read NLP, Deep Learning
