By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), January 2, 2021 1:21 pm
Room: Moderated Discussions
Jukka Larja (roskakori2006.delete@this.gmail.com) on January 1, 2021 10:28 pm wrote:
>
> So yeah, I do very much agree AMD has superior offering. ECC doesn't really matter here though.
ECC absolutely matters.
ECC availability matters a lot - exactly because Intel has been instrumental in killing the whole ECC industry with its horribly bad market segmentation.
Go out and search for ECC DIMMs - they're really hard to find. Yes - probably entirely thanks to AMD - it may have gotten slightly better lately, but that's exactly my point.
Intel has been detrimental to the whole industry and to users because of their bad and misguided policies wrt ECC. Seriously.
And if you don't believe me, then just look at multiple generations of rowhammer, where each time Intel and memory manufacturers bleated about how it's going to be fixed next time.
Narrator: "No it wasn't".
And yes, that was - again - entirely about the misguided and arse-backwards policy of "consumers don't need ECC", which made the market for ECC memory go away.
The arguments against ECC were always complete and utter garbage. Now even the memory manufacturers are starting to do ECC internally because they finally owned up to the fact that they absolutely have to.
And the memory manufacturers claim it's because of economics and lower power. And they are lying bastards - let me once again point to row-hammer about how those problems have existed for several generations already, but these f*ckers happily sold broken hardware to consumers and claimed it was an "attack", when it always was "we're cutting corners".
How many times has a row-hammer-like bit-flip happened just by pure bad luck on real non-attack loads? We will never know. Because Intel was pushing shit to consumers.
And I absolutely guarantee they happened. The "modern DRAM is so reliable that it doesn't need ECC" was always a bedtime story for children that had been dropped on their heads a bit too many times.
We have decades of odd random kernel oopses that could never be explained and were likely due to bad memory. And if it causes a kernel oops, I can guarantee that there are several orders of magnitude more cases where it just caused a bit-flip that just never ended up being so critical.
Yes, I'm pissed off about it. You can find me complaining about this literally for decades now. I don't want to say "I was right". I want this fixed, and I want ECC.
And AMD did it. Intel didn't.
> I don't really see AMD's unofficial ECC support being a big deal.
I disagree. The difference between "the market for working memory actually exists" and "screw consumers over by selling them subtly unreliable hardware" is an absolutely enormous one.
And the fact that it's "unofficial" for AMD doesn't matter. It works. And it allows the markets to - admittedly probably very slowly - start fixing themselves.
But I blame Intel, because they were the big fish in the pond, and they were the ones that caused the ECC market to basically implode over a couple of decades.
ECC DRAM (or just parity) used to be standard and easily accessible back when. ECC and parity aren't a new thing. The market for them was literally killed by bad Intel policies.
And don't let people tell you that DRAM got so reliable that it wasn't needed. That was never ever really true. See above.
Linus