Ruminations on excerpts of research papers, blogs and books

Moments I felt the AI

Since 2020, there have been only a few moments when I was really fascinated by AI, so I thought I'd jot them down here.

  • When ChatGPT (3.5) came out, I was intrigued by it (obviously), but more as a piece of software whose internals I needed to know, rather than something that felt like "AI". Nonetheless, it kick-started my mild obsession with going through books and courses to understand the how.

  • While travelling on a train, my friend mentioned the newly released GPT-4V, with vision capabilities. He then took a picture of the floor and asked GPT-4V to describe it. That moment was fuel for me to keep learning about these models (though I didn't really aim to become a model trainer or anything of the sort: just pure curiosity).

  • YouTube was my go-to entertainment site, and there I came upon GothamChess' AlphaZero vs Stockfish analysis. The match was thrilling (credit to GothamChess' extremely entertaining commentary) and really excited me about the paradigm of reinforcement learning, even though both RL and AlphaZero were old by that point (AlphaZero came out in 2017). Before that, I had bought a book on the chess engine, but it failed to leave me awestruck the way Levy did (the book focuses heavily on chess rather than AI).

  • I was well into the lore of deep learning, LLMs, and the AI paradigm when I came upon a blog that explores the entire LLM training process from the ground up. The blog does this simply by tracking the progress of a new player in the AI league: DeepSeek. Just reading the blog, along with the (then) 16 papers DeepSeek had released, got me hooked. It wasn't just transformers; it was distributed training, creative data preprocessing, and much else (as evident from their recent open-source week tweets) that set them apart, not to mention that everything is open source. Probably my favourite blog on LLMs to date.
