Anthropic in early talks to buy DRAM-less AI inference chips from UK startup — Fractile's SRAM architecture reduces need for pricey memory during extreme pricing and shortage crunch

1d 6h ago by piefed.social/u/Rekall_Incorporated in hardware from www.tomshardware.com

The Claude developer is exploring a fourth chip supplier alongside Nvidia, Google, and Amazon.
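For scale, here is a rough back-of-envelope sketch of why holding a model entirely in on-chip SRAM is the hard part. The parameter count, quantisation level, and SRAM-per-die figure are all illustrative assumptions, not Fractile specs.

```python
# Back-of-envelope: why "DRAM-less" inference is the hard part.
# All numbers below are illustrative assumptions, not Fractile specs.

GIB = 1024**3

def weight_footprint_gib(params_billion: float, bytes_per_param: float) -> float:
    """GiB needed just to hold the model weights."""
    return params_billion * 1e9 * bytes_per_param / GIB

# Assumed example: a 70B-parameter model quantised to 8 bits per weight.
model_gib = weight_footprint_gib(params_billion=70, bytes_per_param=1.0)

# On-chip SRAM per accelerator die is typically hundreds of MiB, not tens
# of GiB; 256 MiB is an assumed round number for illustration.
sram_gib = 256 / 1024

print(f"weights: {model_gib:.0f} GiB, SRAM per die: {sram_gib * 1024:.0f} MiB")
print(f"dies needed to hold the weights entirely in SRAM: {model_gib / sram_gib:.0f}")
```

That ratio (hundreds of dies for one model, under these assumptions) is why SRAM-first designs lean on aggressive quantisation, sharding across many dies, or both; the payoff is that on-chip SRAM beats external DRAM on bandwidth and latency by a wide margin.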

Anything that reduces the footprint of LLMs is welcome, however...

  • making LLM compute cheaper in datacentres won't mean lower total power/cooling/space/water consumption; just as adding highway lanes induces more traffic, it will simply mean more usage as it gets cheaper (and a short-term bump in margins for the LLM owners; see the toy model after this list)
  • these are still highly specialised chips that will remain bound up in mega-scale datacentre deployments
    • what happens if there is a paradigm shift in the underlying compute architecture? Loads of junk servers, and no applications able to make use of such a glut
    • these chips do nothing to push LLMs out of the datacentre and into non-corporate hands, which is the only place where we might see fewer privacy concerns, less corporate control, etc.
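To make the induced-demand point concrete, here is a toy constant-elasticity sketch; the elasticity value is an assumed illustration, not measured data.

```python
# Toy Jevons-paradox model for the induced-demand point above. Assumes
# energy per token falls in proportion to cost per token, and demand for
# tokens follows constant price elasticity. The elasticity value is an
# illustrative assumption, not measured data.

def total_energy_ratio(cost_ratio: float, elasticity: float = -1.5) -> float:
    """Total energy vs. today when cost (and energy) per token falls to cost_ratio.
    tokens scale as cost_ratio**elasticity; energy/token scales as cost_ratio."""
    return cost_ratio ** (elasticity + 1)

for cost_ratio in (1.0, 0.5, 0.25):
    print(f"cost per token x{cost_ratio:.2f} -> total energy x{total_energy_ratio(cost_ratio):.2f}")
```

With the assumed elasticity of -1.5, cutting cost per token to a quarter doubles total energy use; if demand turned out inelastic (|elasticity| < 1), total energy would instead fall, so the "more lanes, more traffic" claim hinges on inference demand being strongly price-elastic.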

If we're stuck with the current compute/corporate paradigms, at least having alternatives nibble at the unhealthy dominance of Nvidia and the cloud giants is some small benefit.