Sleeper cell —
OpenAI’s GPT-2 running locally in Microsoft Excel teaches the basics of how LLMs work.
Benj Edwards
– Mar 15, 2024 8:56 pm UTC
It seems like AI large language models (LLMs) are everywhere these days due to the rise of ChatGPT. Now, a software developer named Ishan Anand has managed to cram a precursor to ChatGPT called GPT-2—originally released in 2019 after some trepidation from OpenAI—into a working Microsoft Excel spreadsheet. It’s freely available and is designed to educate people about how LLMs work.
“By using a spreadsheet anyone (even non-developers) can explore and play directly with how a ‘real’ transformer works under the hood with minimal abstractions to get in the way,” writes Anand on the official website for the sheet, which he calls “Spreadsheets-are-all-you-need.” It’s a nod to the 2017 research paper “Attention is All You Need” that first described the Transformer architecture that has been foundational to how LLMs work.
Anand packed GPT-2 into an XLSB file, Microsoft Excel’s binary format, and it requires the latest version of Excel to run (it won’t work in the web version). The spreadsheet is completely local and makes no API calls to cloud AI services.
Even though the spreadsheet contains a complete AI language model, you can’t chat with it like you can with ChatGPT. Instead, users type words into one set of cells and see the model’s predictions appear in others almost instantly. Recall that language models like GPT-2 were designed to do next-token prediction, which means they try to complete an input (called a prompt, which is encoded into chunks called tokens) with the most likely text. The completion might continue a sentence or extend any other kind of text, such as software code. Different sheets in Anand’s Excel file let users see what is going on under the hood while these predictions take place.
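For readers who want to see that same prediction step outside of Excel, here is a minimal sketch in Python using the Hugging Face transformers library. It is our illustration, not part of Anand’s spreadsheet, but it loads the same 124-million-parameter GPT-2 model and asks it for the single most likely next token:

```python
# A minimal sketch of next-token prediction with GPT-2 Small, via the
# Hugging Face "transformers" library rather than Excel. This is our
# illustration, not part of Anand's spreadsheet.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # "gpt2" = GPT-2 Small (124M)
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The rain in Spain falls mainly on the"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, num_tokens, 50257)

# The final position's scores rank every vocabulary token as a possible
# continuation; the highest-scoring one is the model's prediction.
next_id = logits[0, -1].argmax().item()
print(repr(tokenizer.decode([next_id])))
```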
Spreadsheets-are-all-you-need only supports 10 tokens of input. That’s tiny compared to the 128,000-token context window of GPT-4 Turbo, but it’s enough to demonstrate some basic principles of how LLMs work, which Anand has detailed in a series of free tutorial videos he has uploaded to YouTube.
A video of Ishan Anand demonstrating “Spreadsheets-are-all-you-need” in a YouTube tutorial.
In an interview with Ars Technica, Anand says he started the project so he could satisfy his own curiosity and understand the Transformer in detail. “Modern AI is so different from the AI I learned when I was getting my CS degree that I felt I needed to go back to the fundamentals to truly have a mental model for how it worked.”
He says he was originally going to re-create GPT-2 in JavaScript, but he loves spreadsheets—he calls himself “a spreadsheet addict.” He drew inspiration from data scientist Jeremy Howard’s fast.ai course and former OpenAI engineer Andrej Karpathy’s AI tutorials on YouTube.
“I walked away from Karpathy’s videos realizing GPT is mostly just a big computational graph (like a spreadsheet),” he says, “And [I] loved how Jeremy often uses spreadsheets in his course to make the material more approachable. After watching those two, it suddenly clicked that it might be possible to do the whole GPT-2 model in a spreadsheet.”
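To make that analogy concrete, here is a toy example (ours, not code from Anand’s sheet) of the kind of arithmetic a single attention head performs, written as a handful of NumPy array operations. Every step is the sort of multiply-and-sum a grid of spreadsheet cells can express:

```python
# A toy version of the "big computational graph" idea: one attention head
# reduced to plain array math. The dimensions and values are made up for
# illustration, and causal masking is omitted for brevity.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 4, 8                      # 4 tokens, 8-dimensional embeddings
x = rng.standard_normal((seq_len, d))  # one embedding row per token
Wq = rng.standard_normal((d, d))       # learned projection matrices
Wk = rng.standard_normal((d, d))
Wv = rng.standard_normal((d, d))

Q, K, V = x @ Wq, x @ Wk, x @ Wv       # queries, keys, values: matrix products
scores = Q @ K.T / np.sqrt(d)          # how strongly each token attends to each other
weights = np.exp(scores)
weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
out = weights @ V                      # weighted mix of values: the head's output
print(out.shape)                       # (4, 8) -- one output row per token
```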
We asked: Did he have any difficulty implementing an LLM in a spreadsheet? “The actual algorithm for GPT2 is mostly a lot of math operations which is perfect for a spreadsheet,” he says. “In fact, the hardest piece is where the words are converted into numbers (a process called tokenization) because it’s text processing and the only part that isn’t math. It would have been easier to do that part in a traditional programming language than in a spreadsheet.”
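For the curious, the tokenization step Anand describes can be reproduced in a few lines of Python with OpenAI’s tiktoken library, which ships the same byte-pair encoding GPT-2 uses. This sketch is our illustration, not code from the spreadsheet:

```python
# A sketch of GPT-2's tokenization step using OpenAI's "tiktoken"
# library, which provides the same byte-pair encoding GPT-2 uses.
# Our illustration, not code from Anand's spreadsheet.
import tiktoken

enc = tiktoken.get_encoding("gpt2")
ids = enc.encode("Spreadsheets are all you need")
print(ids)                             # a short list of integer token IDs
print([enc.decode([i]) for i in ids])  # the text chunk each ID stands for
```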
When Anand needed assistance, he naturally got a little help from GPT-2’s descendant: “Notably ChatGPT itself was very helpful in the process in terms [of] helping me solve thorny issues I would come across or understanding various stages of the algorithm, but it would also hallucinate so I had to double-check it a lot.”
GPT-2 rides again
This whole feat is possible because OpenAI released the neural network weights and source code for GPT-2 in November 2019. It’s especially interesting to see that model baked into an educational spreadsheet because when GPT-2 was announced in February 2019, OpenAI was afraid to release it—the company saw the potential that GPT-2 might be “used to generate deceptive, biased, or abusive language at scale.”
Still, OpenAI did release the full model (including the weights files needed to run it locally) that November, but its next major model, GPT-3, which launched in 2020, has not received an open-weights release. A variation of GPT-3 later formed the basis for the initial version of ChatGPT, launched in 2022.
A video of Anand demonstrating “Spreadsheets-are-all-you-need” at AI Tinkerers Seattle, October 2023.
Anand’s spreadsheet implementation runs “GPT-2 Small,” which, unlike the full 1.5-billion-parameter version of GPT-2, clocks in at 124 million parameters. (Parameters are numerical values in AI models that store patterns learned from training data.) Compared to the 175 billion parameters in GPT-3 (and even larger models), it probably would not qualify as a “large” language model if released today. But in 2019, GPT-2 was considered state-of-the-art.
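That 124-million figure falls directly out of GPT-2 Small’s published dimensions. Here is a quick back-of-the-envelope tally in Python (our arithmetic, not anything from Anand’s sheet):

```python
# A back-of-the-envelope tally of GPT-2 Small's parameters from its
# published dimensions: 12 layers, 768-wide embeddings, a 50,257-token
# vocabulary, and 1,024 positions. Our arithmetic, not Anand's.
vocab, ctx, d, layers = 50257, 1024, 768, 12

embeddings = vocab * d + ctx * d             # token + position embedding tables
attention = 4 * d * d + 4 * d                # Q, K, V, and output projections (+ biases)
mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)  # feed-forward up- and down-projections
norms = 4 * d                                # two layer norms per block (scale + bias)
per_block = attention + mlp + norms

total = embeddings + layers * per_block + 2 * d  # + final layer norm
print(f"{total:,}")                              # 124,439,808
```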
You can download the GPT-2-infused spreadsheet on GitHub, though be aware that it’s about 1.2GB. Because of its complexity, Anand says it can frequently lock up or crash Excel, especially on a Mac; he recommends running the sheet on Windows. “It is highly recommended to use the manual calculation mode in Excel and the Windows version of Excel (either on Windows directly or via Parallels on a Mac),” he writes on his website.
And before you ask, Google Sheets is currently out of the question: “This project actually started on Google Sheets, but the full 124M model was too big and [I] switched to Excel,” Anand writes. “I’m still exploring ways to make this work in Google Sheets, but it is unlikely to fit into a single file as it can with Excel.”