How to Get 1.5 TFlops of FP32 Performance on a Single M1 CPU Core
128-byte vector registers, goddamn
You Want Modules, Not Microservices
I had a whole blog post draft about software architecture supervening upon service architecture but fuck writing
I’ve been going through a lot lately and it’s pretty much entirely incomprehensible to everyone I know.