GPUHammer Flips Memory Bits on Nvidia A6000 GPUs: Rowhammer Attack Returns
The Rowhammer attack on computer memory has made a comeback, this time targeting Nvidia's RTX A6000 GPUs. Researchers from the University of Toronto have shown that an attacker can flip bits in the GPU's memory using Rowhammer. This is particularly concerning for organizations running AI applications in the cloud, because a flipped bit in a model's weights can sharply reduce the accuracy of a machine learning model.
Researchers Chris (Shaopeng) Lin, Joyce Qu, and Gururaj Saileshwar reported the issue to Nvidia in January 2025. Their paper, "GPUHammer: Rowhammer Attacks on GPU Memories are Practical," is scheduled to be presented at USENIX Security 2025. It demonstrates Rowhammer-induced bit flips on Nvidia A6000 GPUs with GDDR6 memory, the first time this type of attack has succeeded against these GPUs.
The Rowhammer attack dates back to 2014, when computer scientists from Carnegie Mellon University and Intel published a paper describing how repeatedly accessing the same row in a DRAM chip can flip bits stored in neighboring rows, corrupting data. The attack generally requires the attacker and victim to be tenants on the same hardware, and the attacker needs enough privileges to run the attack code. However, there is a variant that operates over the network under certain conditions.
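The race between hammering and refresh can be pictured with a toy model. The sketch below is only an illustration, not how real DRAM behaves in detail: the function name `victim_bit_flips`, the single "disturbance" counter, and every number used are assumptions made up for this example, not measured values.

```python
def victim_bit_flips(activations: int, refresh_interval: int, flip_threshold: int) -> bool:
    """Toy model: does a victim row next to a hammered row flip a bit?

    Each activation of the aggressor row adds one unit of disturbance to the
    victim row. A refresh after every `refresh_interval` activations restores
    the victim's charge and resets the disturbance to zero. A bit flips if the
    disturbance reaches `flip_threshold` before a refresh arrives.
    """
    disturbance = 0
    for n in range(1, activations + 1):
        disturbance += 1
        if disturbance >= flip_threshold:
            return True  # the victim row was disturbed enough to flip a bit
        if n % refresh_interval == 0:
            disturbance = 0  # refresh arrived in time
    return False


# Hypothetical numbers: refresh is too slow, so hammering wins the race.
print(victim_bit_flips(activations=100_000, refresh_interval=50_000, flip_threshold=20_000))
# Hypothetical numbers: refresh is frequent enough to prevent the flip.
print(victim_bit_flips(activations=100_000, refresh_interval=10_000, flip_threshold=20_000))
```

In this simplified picture, a defense has to refresh likely victim rows before the disturbance reaches the threshold, and an attack has to reach the threshold before the refresh.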
The researchers demonstrated that they can use GPUHammer to alter the weights of a deep neural network to make AI model inference less accurate, an attack technique referred to as Terminal Brain Damage. They showed that in their proof-of-concept attack, they were able to degrade the accuracy of machine-learning models by up to 80 percent, despite the presence of a defense called Target Row Refresh in GDDR6 memory.
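To see why a single flipped bit can matter so much, consider how a floating-point weight is stored. The following is a minimal sketch, not the researchers' code: it assumes a weight stored in IEEE 754 half precision (FP16), and the helper name `flip_bit_fp16` and the example weight of 0.5 are chosen purely for illustration.

```python
import struct


def flip_bit_fp16(value: float, bit: int) -> float:
    """Return `value`, stored as IEEE 754 half precision, with one bit flipped.

    Bit 15 is the sign, bits 14-10 are the exponent, bits 9-0 are the mantissa.
    """
    if not 0 <= bit < 16:
        raise ValueError("bit must be in the range 0..15")
    (raw,) = struct.unpack("<H", struct.pack("<e", value))  # float -> 16 raw bits
    (flipped,) = struct.unpack("<e", struct.pack("<H", raw ^ (1 << bit)))
    return flipped


weight = 0.5  # hypothetical model weight
print(flip_bit_fp16(weight, 9))   # top mantissa bit: 0.5 -> 0.75
print(flip_bit_fp16(weight, 14))  # top exponent bit: 0.5 -> 32768.0
```

A flip in the mantissa nudges the weight slightly, while a flip in the most significant exponent bit turns a small weight into a very large one, which can dominate every computation the weight takes part in.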
Nvidia has issued a security advisory to inform customers about the threat and has identified a mitigation: enabling error-correcting code (ECC) on the GPU's memory. To enable ECC, users can run the command `nvidia-smi -e 1` and then reboot their system. However, this comes with a performance hit of around 10 percent and a reduction in memory capacity of around 6.25 percent.
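As a rough worked example of what those costs mean in practice, the sketch below applies the two figures quoted above to a card with 48 GB of memory, the capacity of the RTX A6000. The function name `ecc_cost` is hypothetical, and treating the 10 percent performance hit as a flat 10 percent drop in throughput is a simplifying assumption; real workloads will vary.

```python
def ecc_cost(total_gb: float, capacity_loss: float = 0.0625, slowdown: float = 0.10):
    """Estimate usable memory and relative throughput with ECC enabled.

    `capacity_loss` and `slowdown` default to the approximate figures quoted
    in the article; both are fractions between 0 and 1.
    """
    usable_gb = total_gb * (1 - capacity_loss)
    relative_throughput = 1 - slowdown
    return usable_gb, relative_throughput


usable, throughput = ecc_cost(48)  # 48 GB card
print(f"usable memory: {usable} GB")                 # usable memory: 45.0 GB
print(f"relative throughput: {throughput:.0%}")      # relative throughput: 90%
```

On these assumptions, enabling ECC leaves about 45 GB of the card's 48 GB available and roughly nine-tenths of its original throughput.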
For organizations that run AI workloads on shared cloud GPUs, GPUHammer means weighing the cost of enabling ECC against the risk of silently corrupted models. These organizations should also follow Nvidia's security advisories for any further guidance.
Key Takeaways:
- Nvidia A6000 GPUs are vulnerable to the Rowhammer attack, which can lead to significant accuracy issues with machine learning models.
- The attack was discovered by researchers from the University of Toronto and is scheduled to be presented at USENIX Security 2025.
- The researchers induced bit flips in the GDDR6 memory of Nvidia A6000 GPUs despite the memory's Target Row Refresh defense.
- Nvidia has identified a mitigation: enabling error-correcting code (ECC), which helps protect against the attack but costs around 10 percent in performance and around 6.25 percent in memory capacity.