We are still in the brute-force era. A lot of unnecessary compute is happening, and that will keep shrinking. But the bigger headache as things get cheaper and smarter is that people start running faster and faster in different directions. So the cost and complexity shift to keeping things in sync as the team grows.
Yes
It's already happening: see Qwen/Gemma-sized models in the sub-36B category.