I think one thing missing from this post is an attempt to answer: what are the highest-priority AI-related problems that the industry should seek to tackle?
Karpathy hints at one major capability unlock being UI generation: instead of interacting only through text, the AI can present different interfaces depending on the kind of problem. That seems like a severely underexplored problem domain. Who are the key figures innovating in this space so far?
In the most recent Demis interview, he suggests that one of the key problems that must be solved is online / continuous learning.
Aside from that, another major issue is probably reducing hallucinations and increasing reliability. Ideally you should be able to deploy an LLM to work on a problem domain, and if it encounters an unexpected scenario, it reaches out to you to figure out what to do. But for standard problems it should function reliably 100% of the time.
Notable omission: 2025 is also when the ghosts started haunting the training data. Half of X replies are now LLMs responding to LLMs. The call is coming from inside the dataset.
xposted to https://x.com/karpathy/status/2002118205729562949