Thoughts on Groq

By Zhenyi Tan

(Note: This post is not actually about Groq.)

I have this tendency to brush off a topic if it doesn’t catch my interest. One of those topics was AI. I mean, come on, you can go to any tech news website and see headlines about new AI breakthroughs multiple times a day.

Two days ago, I was having lunch with a friend, and he said, “Did you hear about Groq? I saw their demo video. Their AI can chat like us, and it’s all happening in real time.”

I hadn’t seen the video, but my first thought was, “So what? Siri can chat like us in real time too, although not very well.” My second thought was that maybe the demo was staged. Like maybe there was a supercomputer somewhere doing all the work. As if being skeptical made me a smarter person.

My friend didn’t push it, and we moved on. But then I kept seeing Groq everywhere. So I decided to check it out, in case it was a bigger deal than I thought.

Turns out, it’s a bigger deal than I thought. The oversimplified version is this: if you use a bunch of simple chips with static RAM (the kind you find in CPU caches) to run models compiled with a custom compiler, you can generate text very quickly, like 10 times faster than with a GPU.
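If you want a feel for why the memory choice matters, here’s a rough back-of-envelope sketch. Generating one token means streaming the model’s weights through the chip roughly once, so the speed ceiling is about memory bandwidth divided by model size. Static RAM is much faster than the HBM a GPU reads from, so the ceiling goes up. All the numbers below are illustrative assumptions, not Groq’s actual specs:

```python
# Back-of-envelope: token generation is memory-bandwidth bound.
# Every number here is an assumption for illustration only.

def tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Rough ceiling on decode speed: each generated token requires
    streaming all model weights through the compute units once."""
    return bandwidth_bytes_per_s / model_bytes

MODEL_BYTES = 70e9   # a hypothetical 70B-parameter model at 1 byte per weight
GPU_HBM_BW = 2e12    # ~2 TB/s HBM bandwidth, ballpark for a modern GPU
SRAM_BW = 20e12      # aggregate on-chip SRAM bandwidth, assumed 10x HBM

print(f"GPU-ish ceiling:  {tokens_per_second(MODEL_BYTES, GPU_HBM_BW):.0f} tokens/s")
print(f"SRAM-ish ceiling: {tokens_per_second(MODEL_BYTES, SRAM_BW):.0f} tokens/s")
```

Under those made-up numbers, the SRAM setup comes out about 10 times faster, which is the whole pitch.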

Due to the implications for Nvidia and the GPU industry, I decided to share it with my business-savvy friend. But… he was just as skeptical as I was. And that kind of ticked me off. I even told him, “If you’re gonna keep asking these kinds of questions, then maybe we shouldn’t talk about this at all.”

Luckily, it didn’t turn into an argument, and we ended up having a pretty good discussion. But looking back, I realize I had no right to be angry because I was no different from my friend.

So… sorry, man, to both my lunch friend and my business-savvy friend.