3 comments

  • pierridotite 2 hours ago

    A quick note on the implementation details for those interested in compilers:

    The hardest part wasn't the AD itself, but managing memory safety during the "growth" phase. Since NOMA compiles to native code (LLVM), I had to ensure that when a weight buffer gets realloc'd (moved in memory):

    1. The gradient tape updates its pointers.

    2. The optimizer state (Adam moments) is correctly mapped to the new indices.

    The benchmark I linked shows the result: "Preserving" this state allows the model to continue converging immediately after resizing, whereas "Resetting" it causes a massive performance regression.
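
    To make the "Preserving" vs "Resetting" distinction concrete, here is a minimal sketch in plain Python. It is not NOMA's actual code: the `GrowableAdam` class and its `grow` and `step` methods are hypothetical names invented for illustration, and a Python list stands in for the native weight buffer. It assumes growth only appends new parameters, so existing indices stay valid.

```python
import math


class GrowableAdam:
    """Hypothetical Adam optimizer over a flat parameter list that can grow."""

    def __init__(self, params, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
        self.params = list(params)
        self.m = [0.0] * len(self.params)  # first-moment estimates
        self.v = [0.0] * len(self.params)  # second-moment estimates
        self.t = 0                         # step counter for bias correction
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps

    def grow(self, new_values, preserve=True):
        """Append parameters; building a new list mimics a realloc that moves the buffer."""
        new_values = list(new_values)
        self.params = self.params + new_values
        if preserve:
            # Old moments keep their indices; only the new slots start at zero.
            self.m = self.m + [0.0] * len(new_values)
            self.v = self.v + [0.0] * len(new_values)
        else:
            # "Resetting": discard all accumulated optimizer state.
            self.m = [0.0] * len(self.params)
            self.v = [0.0] * len(self.params)
            self.t = 0

    def step(self, grads):
        """Apply one standard Adam update given one gradient per parameter."""
        self.t += 1
        for i, g in enumerate(grads):
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            self.params[i] -= self.lr * m_hat / (math.sqrt(v_hat) + self.eps)


opt = GrowableAdam([1.0, 2.0])
opt.step([0.5, -0.5])
opt.grow([0.0], preserve=True)   # moments for indices 0 and 1 survive the resize
opt.step([0.5, -0.5, 0.1])
```

    One simplification to be aware of: the sketch shares a single step counter across all parameters, so the bias correction for freshly added slots is slightly off for the first few steps after a resize. A real implementation might track that per parameter group.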

    I'm specifically curious whether anyone here has experience handling SSA phi nodes during reverse-mode AD on the control-flow graph. That's my next big hurdle for supporting complex control flow.

  • cylicium 2 hours ago

    Hey! I saw a post you made on Reddit. How can I help if I want to contribute?

      pierridotite 2 hours ago

      Join our Discord :) We can help you find an issue, or an implementation that could be useful as a demo or something similar x)