70 comments

  • kenferry 2 hours ago

    Seems like engagement bait or a thought exercise more than a realistic project.

    > "But I need to debug!"

    > Do you debug JVM bytecode? V8's internals? No. You debug at your abstraction layer. If that layer is natural language, debugging becomes: "Hey Claude, the login is failing for users with + in their email."

    Folks can get away without reading assembly only when the compiler is reliable. English -> code compilation by LLMs is not reliable. It will become more reliable, but (a) it isn’t now, so I guess this is a project to “provoke thought”; (b) you’re going to need several nines of reliability, which I would bet against in any sane timeframe; (c) English isn’t well specified enough to have “correct” compilation, so it's unclear whether “several nines of reliability” is even theoretically possible.

  • knlb 2 hours ago

    > Do you debug JVM bytecode? V8's internals? No. You debug at your abstraction layer

    In the fullness of time, you end up having to. Or at least I have. Which is why I always dislike additional layers and transforms at this point.

    (e.g. when I think about React Native on Android, I hear "now I'll have to be excellent at React/JavaScript and Android/Java/Kotlin and C++ to be able to debug the bridge", not "I can get away with just JavaScript".)

      XJ6w9dTdM 2 hours ago

      Exactly, yes, that's what I was going to comment. You sometimes need to debug at every layer. All abstractions end up leaking in some way. It's often worth it, but it does not save us from the extra cognitive load and from learning the layers underneath.

      I'm not necessarily against the approach shown here, reducing tokens for more efficient LLM generation; but if this catches on, humans will read and write it, will write debuggers and tooling for it, etc. It will definitely not be a perfectly hidden layer underneath.

      But why not, for programming models, just select tokens that map concisely to existing programming languages? Would that not be as effective?

  • zaptheimpaler 3 minutes ago

    I get the idea, but this language seems to be terrible for humans, while not having a lot of benefits for LLMs besides keeping keywords in single tokens. And I bet like 1 or 2 layers into an LLM, the problem of a keyword being two tokens doesn't really matter.

  • synalx 2 hours ago

    One major disadvantage here is the lack of training data on a "new" language, even if it's more efficient. At least in the short term, this means needing to teach the LLM your language in the context window.

    I've spent a good bit of time exploring this space in the context of web frameworks and templating languages. One technique that's been highly effective is starting with a _very_ minimal language with only the most basic concepts. Describe that to the LLM, ask it to solve a small-scale problem (which the language is likely not yet capable of doing), and see what kinds of APIs or syntax it hallucinates. Then add that to your language, and repeat. Obviously there's room for adjustment along the way, but we've found this process is able to cut many, many lines from the system prompts that are otherwise needed to explain new syntax styles to the LLM.
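
    In case it's useful to anyone, here's a rough sketch of one round of that loop in Python. `ask_llm` is a hypothetical stand-in for whatever model API you actually use, and the spec update between rounds is the manual part:

        def ask_llm(prompt: str) -> str:
            # Hypothetical placeholder: swap in whatever model API you actually use.
            return "<model output>"

        def probe_language(spec: str, problems: list[str]) -> list[str]:
            """One round: collect the model's attempts so a human can mine them for hallucinated syntax."""
            attempts = []
            for problem in problems:
                attempts.append(ask_llm(
                    f"Here is the language so far:\n{spec}\n\n"
                    f"Using only that language, solve:\n{problem}"
                ))
            # Manual step between rounds: read the attempts, fold any invented-but-sensible
            # syntax or APIs back into the spec, then probe again with harder problems.
            return attempts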

  • gnanagurusrgs 2 hours ago

    Creator here. This started as a dumb question while using Claude Code: "Why is Claude writing TypeScript I'm supposed to read?"

    40% of code is now machine-written. That number's only going up. So I spent some weekends asking: what would an intermediate language look like if we stopped pretending humans are the authors?

    NERD is the experiment.

    Bootstrap compiler works, compiles to native via LLVM. It's rough, probably wrong in interesting ways, but it runs. Could be a terrible idea. Could be onto something. Either way, it was a fun rabbit hole.

    Contributors welcome if this seems interesting to you - early stage, lots to figure out: https://github.com/Nerd-Lang/nerd-lang-core

    Happy to chat about design decisions or argue about whether this makes any sense at all.

      wmoxam 2 hours ago

      > 40% of code is now machine-written

      How did you arrive at that number?

        gnanagurusrgs 2 hours ago

        Ran a simple test with the examples you find in the project. Will publish those benchmarks... actually, that makes me think I should probably do a public test suite showing the results. :)

          wmoxam 2 hours ago

          Ok, now I'm even more confused. 40% of what code is machine written?

        andrepd 2 hours ago

        The same way LLMs arrive at things? :)

      tyre 25 minutes ago

      I love the idea! I’m glad you did this.

      What about something like Clojure? It’s already pretty succinct and Claude knows it quite well.

      Plus there are heavily documented libraries that it knows how to use and are in its training data.

      liqilin1567 38 minutes ago

      I like the idea, but it is going to be a very, very long journey to develop a completely new machine-friendly language like this while LLMs still have so many limitations.

      wilsonnb3 2 hours ago

      > "Why is Claude writing TypeScript I'm supposed to read?" 40% of code is now machine-written. That number's only going up.

      How much of the code is read by humans, though? I think using languages that LLMs work well with, like TS or Python, makes a lot of sense but the chosen language still needs to be readable by humans.

        sublinear 28 minutes ago

        Why do people keep saying LLMs work well with high level scripting languages?

        I've never had a good result. Just tons of silent bugs that are obvious to those experienced with Python, JS/TS, etc. and subtle to everyone else.

          alienbaby 17 minutes ago

          Perhaps they are being more successful in their use of LLMs than you are?

      kevml 2 hours ago

      It’s an interesting thought experiment! My first questions boil down to the security and auditability of the code. How easy is it for a human to comprehend the code?

        gnanagurusrgs 2 hours ago

        It is still very visible and auditable, and one of the features I'm hoping to add soon is a more visual layer that exposes the nooks and crannies of the code. Regardless, the plain code itself is readable and visible; it's just not as friendly for humans as other languages.

  • dlenski 2 hours ago

    This is a 21st-century equivalent of leaving short words ("of", "the", "in") out of telegrams because telegraph operators charged by the word. That caused plenty of problems in comprehension… this is probably much worse because it's being applied to extremely complex and highly structured messages.

    It seems like a short-sighted solution to a problem that is either transient or negligible in the long run. "Make code nearly unreadable to deal with inefficient tokenization and/or a weird cost model for LLMs."

    I strongly question the idea that code can be effectively audited by humans if it can't be read by humans.

  • al_borland an hour ago

    > Do you debug JVM bytecode? V8's internals? No. You debug at your abstraction layer. If that layer is natural language, debugging becomes: "Hey Claude, the login is failing for users with + in their email."

    I’ve run into countless situations where this simply doesn’t work. I once had a simple off-by-one error and the AI could not fix it. I tried explaining the end result of what I was seeing, as implied by this example, with no luck. I then found why it was happening myself and explained the exact problem and where it was, and the AI still couldn’t do it. It was sloshing back and forth between various solutions and compounding complexity that didn’t help the issue. I ended up manually fixing the problem in the code.

    The AI needs to be nearly flawless before this is viable. I feel like we are still a long way away from that.

  • perons 2 hours ago

    That looks to me like Forth with extra steps and less clarity? Not sure why I'd choose it over something with the same semantic advantages ("terse English", but in a programming language) when it's just aggressively worse for a human operator to debug.

  • DSMan195276 2 hours ago

    > Do you debug JVM bytecode? V8's internals? No.

    I can't speak for the author, but I do often do this. IMO it's a misleading comparison, though: you don't have to debug those things because the compiler rarely outputs code that doesn't match what you provided. It's not so simple for an LLM.

  • wilsonnb3 2 hours ago

    > Do you debug JVM bytecode? V8's internals? No. You debug at your abstraction layer. If that layer is natural language, debugging becomes: "Hey Claude, the login is failing for users with + in their email."

    I debug at my abstraction layer because I can trust that my compiler actually works. LLMs are fundamentally different and need to produce human-readable code.

  • xlbuttplug2 2 hours ago

    I have the exact opposite prediction. LLMs may end up writing most code, but humans will still want to review what's being written. We should instead be making life easier for humans with more verbose syntax since it's all the same to an LLM. Information dense code is fun to write but not so much to read.

  • nrhrjrjrjtntbt 2 hours ago

    Feels like a dead-end optimisation à la the bitter lesson.

    No LLM has seen enough of this language vs. Python, and context is now going to be mostly wordy, not codey (e.g. docs, specs, etc.).

      norir 2 hours ago

      I suspect this is wrong. If you are correct, that implies to me that LLMs are not intelligent and are just exceptionally well tuned to echo back their training data. It makes no sense to me that a superior intelligence would be unable to trivially learn a new language syntax and apply its semantic knowledge to the new syntax. So I believe that either LLMs will improve to the point that they will easily pick up a new language or we will realize that LLMs themselves are the dead end.

        tyushk 40 minutes ago

        I don't think your ultimatum holds. Even assuming LLMs are capable of learning beyond their training data, that just leads back to the purpose of practice in education. Even if you provided a full, unambiguous language spec to a model, and the model were capable of intelligently understanding it, should you expect its performance with your new language to match the petabytes of Python "practice" a model comes with?

  • ksec 2 minutes ago

    My take on the timeline:

    1950s: Machine code

    1960s: Assembly

    1970s: C

    1980s: C++

    1990s: Java, Python

    2000s: Frameworks

    2020s: AI writes, humans review

  • killingtime74 37 minutes ago

    Why not write in LLVM IR then? Or JVM/CLR bytecode? It makes no sense to make it unreadable while it still needs to be compiled.

  • tom_ an hour ago

    The space-separated function call examples could do with a 2+-ary example, I think. How do you do something like pow(x+y, 1/z)? I guess it must be "math pow x plus y 1 over z"? But then, the sqrt example has a level of nesting removed, and perhaps that's supposed to be generally the case, and actually it'd have to be "math pow a b" and you need to set up a and b accordingly. I'm possibly just old-fashioned and out of touch.

  • Animats an hour ago

    The question is whether this language can be well understood by LLMs. Lack of declarations seems a mistake. The LLM will have to infer types, which is not something LLMs are good at. Most of the documentation is missing, so there's no way to tell how this handles data structures.

    A programming language for LLMs isn't a bad idea, but this doesn't look like a good one.

  • agnishom an hour ago

    > LLMs tokenize English words efficiently. Symbols like {, }, === fragment into multiple tokens. Words like "plus", "minus", "if" are single tokens.

    The insight seems flawed. I think LLMs are just as capable of understanding these symbols as tokens as they are of understanding English words. I'm not convinced that this is a better idea than writing code with a ton of comments.
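
    For anyone who wants to sanity-check the token-count claim themselves, a quick sketch using OpenAI's tiktoken package (cl100k_base encoding; Claude's tokenizer differs, so treat the counts as indicative only):

        import tiktoken  # pip install tiktoken

        enc = tiktoken.get_encoding("cl100k_base")

        # Compare how symbols and keyword-like words split into tokens.
        for snippet in ["===", "=>", "{", "plus", "minus", "if", "function"]:
            ids = enc.encode(snippet)
            print(f"{snippet!r}: {len(ids)} token(s) {ids}")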

  • munchler 2 hours ago

    By this logic, shouldn’t you be prompting an LLM to design the language and write the compiler itself?

      jychang 2 hours ago

      Easy to produce != easy to consume, and vice versa

      Optimizing a language for LLM consumption and generation (probably) doesn't mean you want an LLM designing it.

  • dgreensp 2 hours ago

    A curly brace is multiple tokens? Even in models trained to read and write code? Even if true, I’m not sure how much that matters, but if it does, it can be fixed.

    Imagine saying existing human languages like English are “inefficient” for LLMs so we need to invent a new language. The whole thing LLMs are good at is producing output that resembles their training data, right?

  • CGamesPlay 2 hours ago

    If you're going to set TypeScript as the bar, why not a bidirectional transpile-to-NERD layer? That way you get to see how the LLM handles your experiment, don't have to write a whole new language, and can integrate with an existing ecosystem for free.

  • unsaved159 an hour ago

    Ironically, the getting-started guide (quite long) is still to be executed by a human, apparently. I'd expect an LLM-first approach, such as: "Insert this prompt into Cursor, press Enter, and everything will be installed; you'll see Hello World on your screen".

  • ekinertac 2 hours ago

    The real question isn't "should AI write readable code" but "where in the stack does human comprehension become necessary?" We already have layers where machine-optimized formats dominate (bytecode, machine code, optimized IR). The source layer stays readable because it's the interface where human judgment enters.

    Maybe AI should write code that's more readable than what humans write: more consistent naming, clearer structure, better comments, precisely because humans only "skim". Optimize for skimmability and debuggability, not keystroke efficiency.

  • azhenley 2 hours ago

    Aren't there many programming languages not built for humans? They're built for compilers.

  • mehmetkose 2 hours ago

    ‘So why make AI write in a format optimized for human readers who aren't reading?’ Well, you'll do it when you need to, sooner or later. But I like the idea anyway.

  • measurablefunc an hour ago

    I can't tell if this is parody or not. It seems like it's parody.

  • thealistra an hour ago

    For the bootstrap C lexer and parser, was hand-rolling really necessary? Lex and yacc exist for a reason.

  • joegibbs 2 hours ago

    Would it make more sense to instead train a model that tokenises the syntax of languages differently, so that whitespace isn’t counted, keywords are each a single token, and so on?

      __MatrixMan__ 2 hours ago

      After watching models struggle with string replacement in files, I've started to wonder if they'd be better off making those alterations in a Lisp, where it's normal to manipulate code not as a string but as a syntax tree.
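
      The same idea works outside of a Lisp too; here's a tiny Python sketch that edits the syntax tree instead of doing a string replacement:

          import ast

          source = "def add(a, b):\n    return a + b\n"
          tree = ast.parse(source)

          # Rename the function by rewriting the tree rather than the text,
          # so the edit can't accidentally hit a comment or a string literal.
          for node in ast.walk(tree):
              if isinstance(node, ast.FunctionDef) and node.name == "add":
                  node.name = "total"

          print(ast.unparse(tree))  # requires Python 3.9+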

  • diath an hour ago

    The entire point of LLM-assisted development is to audit the code generated by AI and to further instruct it to improve it or fix its shortcomings - kind of like being a senior dev doing a code review on a colleague's merge request. In fact, as developers, we usually read code more than we write it, which is also why you should prefer simple and verbose code over clever code in large codebases. This seems like it would instead be aimed at pure vibecoded slop.

    > Do you debug JVM bytecode? V8's internals?

    People do debug assembly generated by compilers, to look for miscompilations and missed optimization opportunities and to compare different approaches.

      alienbaby 8 minutes ago

      This describes where we are at now. I don't think it's the entire point. I think the point is to get it writing code at a level and quantity where it just becomes better and more efficient to let it do its thing and handle problems discovered at runtime.

  • kace91 2 hours ago

    >NERD is what source code becomes when humans stop pretending they need to write it.

    It is so annoying to realise mid-read that a piece of text was written by an LLM.

    It’s the same feeling as bothering to answer a call to hear a spam recording.

      throwaway150 an hour ago

      I don't know how you can be so sure that sentence was written by an LLM. I can imagine it is perfectly possible that a human could've written that. I mean, some day I might write a sentence just like that.

      I think HN should really ban complaints about LLM-written text. They are annoying at best and a discouraging insinuation at worst, and the insinuation is really offensive when it's false and the author in fact wrote the sentence with their own brain.

      I don't know if this sentence was written by an LLM or not, but people will definitely use LLMs to revise and refine posts. No amount of complaining will stop this. It is the new reality. It's a trend that will only continue to grow. These incessant complaints about LLM-written text don't help and they make the comment threads really boring. HN should really introduce a rule to ban such complaints, just like it bans complaints about tangential annoyances like article or website formats, name collisions, or back-button breakage.

        jjj123 an hour ago

        The funny thing is I’ve never seen an author of a post chime in and say “hey! I wrote this entirely myself” on an AI accusation. I either see sheepish admission with a “sorry, I’ll do better next time” or no response at all.

        Not saying the commenters never get it wrong, but I’ve seen them get it provably right a bunch of times.

          throwaway150 an hour ago

          > The funny thing is I’ve never seen an author of a post chime in and say “hey! I wrote this entirely myself” on an AI accusation.

          I've seen this happen many times here on HN where the one accused comes back and says that they did in fact write it themselves.

          Example: https://news.ycombinator.com/item?id=45824197

        AdieuToLogic an hour ago

        > I don't know if this sentence was written by LLM or not but people will definitely use LLMs to revise and refine posts. No amount of complaining will stop this. It is the new reality. It's a trend that will only continue to grow.

        Using an LLM to generate a post with the implication it is the author's own thoughts is the quintessential definition of intellectual laziness.

        One might as well argue that plagiarism is perfectly fine when writing a paper in school.

          throwaway150 an hour ago

          > Using an LLM to generate a post

          You are talking about an entirely different situation that I purposely avoided in my comment.

        muhaccount an hour ago

        I agree with you in principle, but agree with the OP in practice. "NERD is what source code becomes when humans stop pretending they need to write it" means nothing. Once I read that, I realized there was very little to zero review of the outputs by a human, so it's all potential hyperbole. It flips an internal switch to be more skeptical... for me, at least.

        If its output isn't massaged by a team, then I appreciate the callouts until the stack is mature/proven. Doesn't make it better/worse... just a different level of scrutiny.

  • dented42 2 hours ago

    I can’t be alone in this, but this seems like a supremely terrible idea. I wholeheartedly reject the idea that any sizeable portion of one’s code base should specifically /not/ be human-interpretable as a design choice.

    There’s a chance this is a joke, but even if it is, I don’t wanna give the AI tech bros more terrible ideas; they have enough. ;)

      mkoubaa 2 hours ago

      Some people are only capable of learning the hard way

  • porcoda an hour ago

    Seems the TL;DR is “squished out most of the structural and semantic features of languages to reduce tokens, and trivial computations still work”. Beyond that, there's nothing much to see here.

  • globalnode 2 hours ago

    I was going to try and resist posting for as long as possible in 2026 (self-dare) and here I am on day 1 -- this is a pretty bad idea. Are you going to trust the LLM to write software you depend on with your life? Your money? Your whatever? Worst idea so far of 2026. Where's the accountability when things go wrong?

      000ooo000 an hour ago

      LLMs are doing for ideas what social media did for opinions.

  • croes an hour ago

    > You debug at your abstraction layer. If that layer is natural language, debugging becomes: "Hey Claude, the login is failing for users with + in their email."

    That sounds like step 2 before step 1. First you get complaints that login doesn’t work, then you find out it’s the + sign while you are debugging.

  • ForHackernews 2 hours ago

    Assembly already exists.

  • johnnyfived an hour ago

    Oh boy, a competitor to the well-renowned TOON format? I'm so surprised this stuff is even entertained here, but the HN crowd is probably behind on some of the AI memes.

  • tayo42 an hour ago

    > 67% fewer tokens than TypeScript. Same functionality.

    Doesn't TypeScript have types? The example seems to not have types?

  • dragonwriter 2 hours ago

    Uh, I think every form of machine code (physical or virtual, with some pedagogic exceptions) beats NERD as an earlier “programming language not built for humans”.

  • 4b11b4 41 minutes ago

    no

  • itsthecourier 2 hours ago

    Tokens are tokens; shorter or larger, they are tokens.

    In that sense I don't see how this is more succinct than Python.

    It is more succinct than TypeScript and C#, of course, but we need to compete with the laconic languages.

    In that sense you will end up with the CISC vs. RISC dilemma from the CPU wars. You will find the ability to compress even more comes from adding new tokens that compress repetitive tasks, like sha256 being a single token. I feel that's a way to compress even more.

      3836293648 2 hours ago

      Because LLM tokens don't map cleanly to what the compiler sees as a token. If coding is all LLMs will be good for, this will surely change.

  • behnamoh 2 hours ago

    That's not true; the first not-for-humans language is Pel: https://arxiv.org/abs/2505.13453