People don't understand AI
Sun Nov 09 2025
By B. Hassan
Introduction
There’s no doubt that AI has improved massively over the past few years. We went from relatively small, single-purpose models with a few thousand parameters to multi-purpose LLMs like ChatGPT and Claude with trillions of parameters.
The revolution becomes even clearer when you zoom out and look at lesser-known models. Generative AI has made searching for specific graphic design assets almost obsolete. From voice cloning to image, video, and even 3D model generators, the range of generative AI applications is staggering.
Take AlphaFold, for example. It solved one of biology’s toughest puzzles: the problem of protein folding. We’ve long understood DNA and which codons code for which amino acids, even mapping entire genomes for many organisms. However, the inability to predict how amino acid chains fold into 3D structures has long held back biological research; each chain can fold in countless unpredictable ways.
AlphaFold can take a simple amino acid sequence and predict its 3D shape with remarkable accuracy. That lets scientists infer what the protein does and even design entirely novel proteins. Today, it’s already being used to help design novel drugs, including work toward a universal snake antivenom.
But for all these breakthroughs, I think AI has also been overhyped. Between sci-fi doomsday talk, exaggerated fears of job loss, and general confusion about the underlying functions of AI, the conversation has drifted away from reality.
So, let’s clear up a few misconceptions.
The underlying algorithms
At its core, an AI model is just a set of numbers called weights or parameters. Each parameter is a floating-point number that influences how the model responds to input. The higher the precision and the more parameters you have, the higher the potential quality of the model, but also the more expensive it is to train and run.
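To make that concrete, here is a minimal sketch (in Python with NumPy, with made-up layer sizes purely for illustration) of what “a set of numbers” means in practice: the model is literally a few arrays of floats, and running it is just matrix multiplication.

```python
import numpy as np

# A toy "model": nothing but arrays of floating-point parameters.
# The shapes here (4 inputs -> 8 hidden -> 2 outputs) are arbitrary.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # 4*8 + 8 = 40 parameters
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)   # 8*2 + 2 = 18 parameters

def forward(x):
    """Running the model is just matrix multiplies plus a nonlinearity."""
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

n_params = sum(p.size for p in (W1, b1, W2, b2))
print(n_params)               # 58 parameters; frontier LLMs have trillions
print(forward(np.ones(4)))    # the "answer" is whatever these numbers produce
```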
You might think: “Why not just scale up to a quadrillion parameters and get the perfect AI?” If only it were that easy.
Modern frontier models have around 1.8 trillion parameters using 32-bit floats. That means the raw model weights alone would take around 7.2 terabytes of storage. If we use model quantization to decrease the precision of each parameter to 16 or even 8 bits, we get a 3.6-terabyte or a 1.8-terabyte model, respectively.
Generally, the whole model needs to be loaded into a GPU’s VRAM. Sure, there are techniques like model sharding and offloading that split the weights into chunks and load them piece by piece, reducing the memory demand on any single GPU. However, shuttling those chunks back and forth adds a lot of overhead and can slow down generation significantly.
To put that in perspective, an NVIDIA H100 GPU, the gold standard for AI training, has 80 GB of VRAM. Just holding the weights of a single 1.8-trillion-parameter model at 16-bit precision would require about 45 H100s (roughly 90 at full 32-bit precision). No surprise NVIDIA became the world’s most valuable company as of this writing.
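The numbers above fall out of simple arithmetic. Here’s a quick back-of-the-envelope sketch (Python, using the rumored 1.8-trillion-parameter figure and counting only the raw weights) that you can tweak yourself:

```python
import math

# Back-of-the-envelope math behind the figures above (round numbers,
# weights only; ignores activations, KV cache, and other runtime overhead).
params = 1.8e12           # ~1.8 trillion parameters (assumed)
h100_vram_bytes = 80e9    # 80 GB of VRAM per H100

for bits in (32, 16, 8):
    weight_bytes = params * bits / 8
    gpus = math.ceil(weight_bytes / h100_vram_bytes)
    print(f"{bits}-bit weights: {weight_bytes / 1e12:.1f} TB -> at least {gpus} H100s")

# 32-bit: 7.2 TB -> 90 H100s, 16-bit: 3.6 TB -> 45 H100s, 8-bit: 1.8 TB -> 23 H100s
```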
Backpropagation and Gradient Descent
Now that we’ve looked at the big picture, let’s zoom in on what’s actually happening inside these models and how they are trained. At the heart of this process are the backpropagation and gradient descent algorithms.
Every AI model starts with random parameters. If we had to tune them manually at a rate of one second per parameter, it would take over 63,000 years to set the two trillion parameters of a modern model. This is clearly not practical.
On top of that, we generally don’t even know what value each parameter should take to produce the desired result. We need an algorithm that sets these parameters for us: backpropagation combined with gradient descent.
So instead, we let math do the work.
I won’t go too deep into these algorithms, but the main idea is that we work backwards. We take training data similar to what we want the model to generate, turn it into vectors, compute the model’s current output, and measure how far it is from the desired output. Backpropagation then gives us a gradient: for each parameter, the direction and size of the nudge that reduces that error. We nudge the parameters slightly in that direction, then repeat with different training data over millions of iterations, each small correction gradually pulling the model’s output toward what we want.
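Here is a minimal sketch of that loop (plain NumPy, a tiny linear model, and made-up data, not how frontier models are actually implemented) to show that beneath the jargon the recipe is just “compute error, compute gradient, nudge, repeat”:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy "training data": inputs X and the outputs we want the model to produce.
# The hidden rule is y = 3*x1 - 2*x2 + 1, which the model must discover.
X = rng.normal(size=(256, 2))
y_true = 3 * X[:, 0] - 2 * X[:, 1] + 1

# The model: just parameters, initialized randomly.
w = rng.normal(size=2)
b = 0.0
lr = 0.1  # learning rate: how big each nudge is

for step in range(500):
    y_pred = X @ w + b              # current output
    error = y_pred - y_true         # how far off we are
    loss = np.mean(error ** 2)      # one number summarizing the error

    # Gradients: which direction (and how strongly) to nudge each parameter
    # to reduce the loss. For this tiny model they have a closed form;
    # backpropagation computes the same thing automatically for deep networks.
    grad_w = 2 * X.T @ error / len(X)
    grad_b = 2 * error.mean()

    w -= lr * grad_w                # the nudge
    b -= lr * grad_b

print(w, b)   # converges to roughly [3, -2] and 1
```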
At its core, AI is just an averaging machine. It uses a huge number of computations and matrix multiplications to generate the most likely answer an average user is looking for, based on what it has seen in its training data (encoded in its embedding space).
If that sounds complex, I assure you it isn’t once you implement it yourself; it just requires a basic understanding of calculus and linear algebra, i.e. high-school-level mathematics. And if you want a better explanation, or you want to dive into the details, I’d suggest starting with this amazing 3blue1brown deep learning series.
The real challenge isn’t the math; it’s the scale. Training modern models takes thousands of GPUs and terabytes of curated data. And as any gamer will tell you, GPUs are expensive. And if you spend any time on the internet (Twitter, I’m looking at you), you realize there is also a severe shortage of high-quality data.
The golden rule of AI
As anyone who has tried training even a simple AI model will tell you, the golden rule of AI is “garbage in, garbage out”. As much as people seem to understand this rule, it tends to get lost in conversations.
First of all, contrary to what many people believe, most AI models aren’t trained directly on your data. Most of the data scattered across the internet is, frankly, useless. There are occasional gems, but most of it is either too dumb or too extreme. And you definitely don’t want to train your AI model on Twitter data, or it will claim it is MechaHitler and start posting Nazi tweets (oh wait, that already happened).
Secondly, and most importantly, with the current training algorithms, an AI model’s output can’t surpass the data it was trained on! Put more precisely, AI models can’t truly go beyond the statistical boundaries of their training data. They interpolate and combine what they’ve seen rather than invent something fundamentally new.
Sure, a lot can happen when you have two trillion parameters, and certain emergent behaviors do appear. These are well documented and seem to emerge without being explicitly present in the training data, but the point stands: the model as a whole can’t be better than its training data.
Take AlphaFold, for example, which in my opinion is one of the best uses of AI. The model didn’t discover anything fundamentally new: we already understood how proteins fold in principle, and we had extensive databases mapping amino acid sequences to protein structures. The problem was that you had to sift through terabytes of data to work out how a given protein might fold, and you still might not get a correct answer. AI helped with automation and pattern recognition; ultimately, it didn’t invent anything new.
With the current training algorithms, despite what many people may think, AI won’t tell us how to cure cancer, but it can summarize the latest cancer research.
The second part of this rule is that the more often the model sees something in its training data, the more it is emphasized (the bigger the cumulative nudge toward that piece of data during gradient descent), and so the better the model understands it.
And if you are a programmer, you may have noticed this already: AI can produce high-quality code for well-established algorithms and frameworks, but it struggles with niche frameworks and niche languages, and it struggles to create new algorithms. The model simply hasn’t seen enough examples in its training data to understand them.
That’s why most AI-generated websites look the same: when making a good website, you don’t want the average website an AI model has seen a thousand times, you want an amazing, unique one. It’s also why AI is really good at solving multiple-choice questions; most of them have been copied over and over and are everywhere in its training data, so it understands them.
That’s also why AI progress has plateaued recently, at least in terms of what models feel capable of. The bottleneck isn’t the data or the compute; sure, those make it extremely challenging to get started, but the real bottleneck is the current algorithms. Most AI advancements in the past couple of months have been superficial: faster output, larger context windows, additional abilities like agents, or a more encompassing knowledge base. The underlying models don’t feel any different. It’s hard to remember the last model release that shook the world.
Conclusion
I hope I didn’t dunk on AI too hard in this article. It is mostly based on my experience running and working with AI models, with a sprinkling of trust-me-bro science. My point isn’t that you should abandon AI or never use it; it’s to instill some common sense and help people understand when AI is the right tool and when it isn’t.
For example, AI is currently miles ahead of a traditional Google search. It is a fantastic search engine, diving as deep as you need and summarizing as much as you want, and it feels far more responsive and context-aware than any traditional search engine. It is certainly better than being yelled at by random strangers on StackOverflow (yeah, RIP StackOverflow).
AI models currently solve a distribution problem: they are really good at generating well-known algorithms you don’t fancy implementing yourself, or functions that are just tedious to write by hand. These are scattered all over the internet with thorough, detailed explanations, so AI understands them really well.
However, AI models struggle if you let them generate a full website and then keep adding features. Every beginner to-do app looks the same, but as a codebase grows it becomes more and more unique, and AI falls apart once you steer away from the average.
Above all and most importantly, don’t give in to learned helplessness and give up on learning because “AI will take our jobs”. With the current algorithms, AI may replace average and below-average workers, but there will always be a place for pioneers and innovators.
If you take one thing from this article, let it be this: don’t be content with being average, and don’t be a jack of all trades but a master of none. Average is exactly what AI replaces. Instead, become a pioneer in a specific field and you will always have a place in the market; unless newer AI algorithms are developed and AI takes over the world, but then you’d have bigger things to worry about.