Much faster than stable PyTorch 2.3 (46% on A100, as per the tweet), and even further ahead of PyTorch 2.2, which was the stable version a couple of weeks ago. The gap also widens further in llm.c's favor when the comparison is on H100 instead of A100, or on multiple GPUs instead of a single one.
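For context on the percentages thrown around in this thread: "A is X% faster than B" and "B is Y% slower than A" are not symmetric figures. A minimal sketch of the conversion (the numbers are illustrative, not measurements from the thread):

```python
# Convert "A is X% faster than B" into "B is Y% slower than A".
# Illustrative arithmetic only; not benchmark data from the thread.

def slower_fraction(speedup: float) -> float:
    """If A is `speedup` faster than B (e.g. 0.46 for 46% faster),
    return how much slower B is relative to A."""
    return 1 - 1 / (1 + speedup)

# "46% faster" means the slower side does ~31.5% less work
# in the same wall-clock time, not 46% less:
print(f"{slower_fraction(0.46):.3f}")  # → 0.315
```

This is why a "46% faster" claim in one comment and an "only 7% slower" claim in another are not directly comparable without knowing which side is the baseline.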
I’d be happier with 93% of PyTorch's speed if it worked across multiple GPU manufacturers.
Yeah, I'm sure that's what anyone trying to build some kind of AI startup that's managed to acquire a small handful of A100s, or even better H100s, thinks too. "Those cards sure were expensive, but ethically, I'd rather the software run slower to give me future imaginary options than to get the most out of the hardware I just bought."
That... wasn't the original intention of the project. It was to create a C version of the PyTorch code that could train GPT-2.
It’s pretty impressive that PyTorch is only 7% slower than this, given that it can be used so generally.
Created over a period of about 4 weeks by random people all over the internet.