One component of AMD’s marketing angle for the Radeon RX 500 series is that it’s Polaris Refined. These are still the same Polaris 10 GPUs as in the Radeon RX 400 series, but AMD wants to highlight the small improvements that they have made and gained in the past year. For RX 400 owners this doesn’t amount to much – your cards are still shiny and chrome – but it serves to help differentiate the new cards from the old cards. And for the owners of cards like the R9 280 and 380 series that AMD is trying to convince to upgrade, it’s AMD’s justification for why they should want to upgrade now after having passed on the RX 400 series.
There are two elements to Polaris Refined: silicon improvements, and a new memory clock state. The former in turn comprises both the improvements in fab yields and quality that AMD has enjoyed over the past year, and a new revision of Polaris 10.
In the case of fab yields, all of the revised Polaris chips are being manufactured on what AMD is calling the “Latest Generation FinFET 14” process. This is a bit of a mouthful, but in short it’s AMD calling attention to the improvements partners GlobalFoundries and Samsung have made to their 14nm LPP processes in the last year. Yields are up and overall chip quality is better, which improves the average performance (clockspeed & power) characteristics of the chips. Both foundries have also been making other undisclosed, small tweaks to their lines to further boost chip quality. It’s not a new fab process (it’s still 14nm LPP) but it’s an improvement over where Polaris 10 production started nearly a year ago.
Typically these kinds of yearly gains would simply be rolled into a product line without any fanfare – these improvements are gradual over time anyhow, not a binary event – but for the RX 500 series AMD wants to call attention to them to explain why clockspeeds are improved versus the RX 400 series cards released last year. Though to be clear here, the difference isn’t dramatic; the gains from a year’s optimization to a manufacturing line are a fraction of a full node improvement.
Meanwhile AMD is also releasing a new revision of Polaris 10, which is being used in the RX 580/570 launch. These revised chips have received further tweaking to reach higher clockspeeds, allowing AMD to reliably clock up a bit higher and/or reduce power consumption a bit. The new revision also fixes a couple of minor issues with the GPUs. Specifically, AMD is adding a new mid-power memory clock state so that applications that require memory clocks faster than idle – primarily mixed-resolution multi-monitor and video decoding – no longer cause the memory to clock up to its most power-demanding speeds, keeping overall power consumption down.
One thing to note here is that while AMD’s chip quality has improved through the combination of manufacturing improvements and revised silicon, for the desktop AMD is investing all of those gains into improving clockspeeds. This is why the TBPs have gone up by 30-35W over the RX 480 and RX 470.
Power Consumption: By the Numbers
Since all of AMD’s optimizations are focused on bringing down power consumption, let’s take a look at that now. There are a few different things we can look at here, and I’ll start with what’s probably the most burning question: just how much better is the new revision of Polaris 10 over the old revision?
To test this, I’ve taken the Radeon RX 580 sample AMD sent over – PowerColor’s Red Devil RX 580 – and underclocked it to the same clockspeeds as the RX 480. It should be noted however that this process is a bit more complex than just underclocking to the RX 480’s official boost clock of 1266MHz. Because the RX 480 power throttles under both FurMark and Crysis 3, it’s necessary to match the RX 480’s specific clockspeeds in those scenarios.
After doing so, what we find are mixed results.
Load Power Testing, Normalized GPU Clockspeeds (Power Draw at Wall)
Card | FurMark (740MHz) | Crysis 3 (1230MHz)
Radeon RX 480 | 231W | 301W
Radeon RX 580 | 205W | 314W
Even after dialing the RX 580 down to 1230MHz for Crysis 3 to match the reference RX 480, power consumption at the wall is still 13W higher than the RX 480 (314W versus 301W). Performance is the same, so the RX 580 isn’t doing more work, but nonetheless power consumption at a system level is still a bit higher.
On the other hand, turning the RX 580 down to 740MHz to match the RX 480 on FurMark (power viruses cause significant throttling), we find the RX 580 ahead by a rather shocking 26W. Power consumption at the wall is 205W, versus 231W for the RX 480.
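To put the two results side by side, the wall readings can be reduced to absolute and relative deltas. The following is a minimal Python sketch that uses only the figures from the table above; the dictionary layout and the `delta_vs_rx480` helper are illustrative names of my own, not part of any tooling used for this testing.

```python
# Wall power readings in watts, copied from the normalized-clockspeed table.
wall_power = {
    "FurMark (740MHz)": {"RX 480": 231, "RX 580": 205},
    "Crysis 3 (1230MHz)": {"RX 480": 301, "RX 580": 314},
}


def delta_vs_rx480(readings):
    """Return (watts, percent) change of the RX 580 relative to the RX 480.

    Negative values mean the RX 580 system drew less power at the wall.
    """
    diff = readings["RX 580"] - readings["RX 480"]
    return diff, 100.0 * diff / readings["RX 480"]


for test, readings in wall_power.items():
    watts, pct = delta_vs_rx480(readings)
    print(f"{test}: {watts:+d}W ({pct:+.1f}%)")
```

Keep in mind that these are whole-system readings, so the percentages likely understate the change at the card itself, since the rest of the system contributes to both numbers.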
Broadly speaking, although FurMark isn’t always the best tool for load power measurement on cross-vendor cards, it has proven to be very reliable when looking at cards based on the same architecture. It suffers from a very specific limitation: it will push a card to its TDP limit, and this can vary among manufacturers, but even with that it typically gives you a consistent and sane metric to compare like-cards.
Consequently I tend to favor the FurMark numbers here. However it doesn’t change the fact that the power consumption numbers under Crysis 3 point the other way, and paint the RX 580 as being worse. So they can’t both be right, can they?
As it stands, I suspect we’re getting into the area of random variation – with a sample size of 1 on each Radeon card, the random variations in quality from GPU to GPU are drowning out the actual data. It’s entirely possible we’re looking at a worse-than-average RX 480 and a better-than-average RX 580, especially as the latter has been binned for factory overclocking. However I’m not ready to rule out that something more complex may be going on here: that the improvements to Polaris 10’s power curve aren’t linear or consistent. It may be that AMD’s greatest gains are at lower clockspeeds and voltages, and that those improvements taper off at higher clockspeeds and voltages.
But for the moment, I’m ruling it a push. The FurMark data is interesting, but without Crysis 3 being in agreement it’s not enough to say anything definitive.
That New Memory State
Finally, let’s take a look at the specific benefits AMD is touting for the new memory state that the company has included with the new Polaris 10 revision. The new mid-power state allows the memory to be clocked at 4Gbps GDDR5 on the RX 580. The other power states on the RX 580 (and the RX 480) are 1.2Gbps (idle) and 8Gbps (full load), so on the RX 480 if AMD ever needed to increase the memory clocks above idle, their only option was to go to full clocks, which on GDDR5 is relatively expensive.
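The value of the extra state is easiest to see as a selection problem: pick the slowest memory state that still satisfies the workload. The sketch below is a hypothetical Python illustration of that idea, not AMD’s actual driver or firmware policy, which isn’t disclosed. The state lists come from the figures above, while the `pick_memory_state` function and the 3Gbps requirement are assumptions made up for the example.

```python
# Memory data rates in Gbps, as listed in the text above.
RX480_STATES = [1.2, 8.0]        # idle, full load
RX580_STATES = [1.2, 4.0, 8.0]   # idle, new mid-power state, full load


def pick_memory_state(states, required_gbps):
    """Return the slowest state that meets the required data rate.

    Hypothetical policy for illustration only. If no state is fast
    enough, fall back to the fastest one available.
    """
    for rate in sorted(states):
        if rate >= required_gbps:
            return rate
    return max(states)


# Assumed workload needing more than idle but far less than full speed.
required = 3.0  # made-up number, not a measured requirement
print("RX 480:", pick_memory_state(RX480_STATES, required), "Gbps")
print("RX 580:", pick_memory_state(RX580_STATES, required), "Gbps")
```

Under that assumed requirement the RX 480 list has nowhere to go but 8Gbps, while the RX 580 list can stop at 4Gbps, which is the behavior AMD is describing for the new revision.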
The two scenarios AMD is looking to address with this new memory clock state are multi-monitor configurations and video playback. In the case of the former, mismatched monitors would require the RX 480 to go to its full memory clocks even when idling. Due to the timing differences, the higher memory clock is needed to avoid flickering. Matched monitors avoid this problem, as they have identical timings. Otherwise in the case of video playback, while AMD has their fixed function decoder to offload most of the work, it still generates a lot of video data, which can require the memory to jump to a higher clock state to keep up. Though the video playback scenario is particularly complex as the GPU clock itself can also jump up if the video decoder needs a higher performance state for itself.
Putting this to the test, I ran both the RX 480 and RX 580 through a mix of multi-monitor and video playback scenarios.
Multi-Monitor Power Testing (Power Draw at Wall)
Card | Single Monitor (1080p) | Multi-Monitor Matched (1080p + 1080p) | Multi-Monitor Mismatched (1080p + 1440p)
Radeon RX 480 | 76W | 76W | 100W
Radeon RX 580 | 74W | 74W | 100W
GeForce GTX 1060 6GB | 73W | 73W | 73W
Starting with the multi-monitor testing, the results were not what I was expecting. While AMD tells me that this should trigger the new mid-power state, I haven’t been able to successfully trigger it. With matched monitors the RX 580 can go to full idle, just like the RX 480. Otherwise with mismatched monitors, it always goes to 8Gbps, skipping past 4Gbps and never returning. Even with a few different monitors, the results were always the same. Due to the quick launch I haven’t had time to further debug the issue, so I’m not sure if it’s related to the monitors or if it’s something specific to the Red Devil RX 580.
Video Playback Power Testing (Power Draw at Wall)
Card | Idle | High Bitrate H.264 | High Bitrate HEVC
Radeon RX 480 | 76W | 125W | 125W
Radeon RX 580 | 74W | 90W | 93W
GeForce GTX 1060 6GB | 73W | 96W | 96W
On the plus side however, AMD’s new memory state worked as expected with video playback. Whereas the RX 480 would have to settle for an 8Gbps memory clock when playing back high-bitrate H.264 and HEVC video in Media Player Classic – Home Cinema, the RX 580 would settle at 4Gbps. In fact the RX 580 actually performed a bit better than expected; the RX 480 would typically have to go to higher core clock speeds as well, compounding the power cost. As a result power consumption at the wall was notably lower on the RX 580 than the RX 480.
And just for reference, this is actually a bit better than the GeForce GTX 1060 6GB. NVIDIA’s midrange card goes to its maximum memory clock in the same tests, and as a result power consumption at the wall was a few watts higher than the RX 580.
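Since each system idles at a slightly different level, it can help to look at the cost of playback over idle rather than the raw wall numbers. The following Python sketch does that subtraction using only the figures from the video playback table; the `playback_overhead` helper and the dictionary keys are illustrative names of my own.

```python
# Wall power readings in watts, copied from the video playback table.
playback = {
    "Radeon RX 480": {"idle": 76, "h264": 125, "hevc": 125},
    "Radeon RX 580": {"idle": 74, "h264": 90, "hevc": 93},
    "GeForce GTX 1060 6GB": {"idle": 73, "h264": 96, "hevc": 96},
}


def playback_overhead(row):
    """Return the extra wall power (watts) over idle for each codec."""
    return {codec: row[codec] - row["idle"] for codec in ("h264", "hevc")}


for card, row in playback.items():
    extra = playback_overhead(row)
    print(f"{card}: H.264 +{extra['h264']}W, HEVC +{extra['hevc']}W over idle")
```

Measured this way, the RX 580 system adds 16W to 19W over idle for playback, against 49W for the RX 480 and 23W for the GTX 1060 6GB.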