Nvidia's GTX 1080 redefines high-end gaming performance

GTX1080-Feature

Nvidia formally announced the GTX 1080 in Austin just over a week ago, but it held back on the GPU deep dive at the initial presentation. While the GTX 1080 isn't scheduled to launch until May 27, Nvidia has lifted the curtain on the GPU's performance and technical advances. We've covered some of the card's improvements in VR and overall positioning, plus new technologies like Ansel that give gamers more creative freedom, so we'll be focusing on areas where Nvidia hadn't disclosed as much information. Unfortunately, Nvidia was unable to supply us with a GTX 1080 in time for launch, so we'll have to defer benchmarks and performance comparisons for another day.

Pascal is an evolutionary step forward from Maxwell, and many of the technologies that debuted in the GeForce 9xx family have been refined, improved, and enhanced for the GTX 1080. The base GPU packs 2560 cores in 20 SM blocks, with 128 cores per block. There are 160 texture units and 64 ROPs, which is interesting: one common theory was that AMD and Nvidia would both ship substantially more ROPs this year to ensure that they didn't become fill-rate limited in VR or at higher resolutions. Nvidia chose to deal with this another way, which we'll explore shortly.

Pascal-Diagram

The GTX 1080 packs 25% more cores and 25% more texture units than the GTX 980 it replaces, along with a much higher base clock (1.61GHz vs. 1.1GHz for Maxwell) and significantly faster RAM (320GB/s of memory bandwidth, compared to 224GB/s for Maxwell). On paper, the 1080 looks much more like the 980 Ti. In practice, it often outperforms that card.
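The ratios quoted above fall out of simple arithmetic. As a rough Python sketch: the GTX 980's core and texture-unit counts are not quoted in this article, so they are back-calculated here from the stated 25% increase, and all of the names are purely illustrative.

```python
# GTX 1080 figures quoted in the article.
SM_BLOCKS = 20
CORES_PER_SM = 128
TEXTURE_UNITS_1080 = 160
BASE_CLOCK_GHZ_1080 = 1.61
BANDWIDTH_GBS_1080 = 320

# GTX 980 (Maxwell) figures quoted in the article.
BASE_CLOCK_GHZ_980 = 1.1
BANDWIDTH_GBS_980 = 224


def percent_increase(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new / old - 1) * 100


# Total core count is SM blocks times cores per block.
cores_1080 = SM_BLOCKS * CORES_PER_SM

# Assumption: GTX 980 counts derived from "25% more", not quoted in the text.
cores_980 = round(cores_1080 / 1.25)
texture_units_980 = round(TEXTURE_UNITS_1080 / 1.25)

clock_gain = percent_increase(BASE_CLOCK_GHZ_1080, BASE_CLOCK_GHZ_980)
bandwidth_gain = percent_increase(BANDWIDTH_GBS_1080, BANDWIDTH_GBS_980)

print(f"GTX 1080 cores: {cores_1080}, implied GTX 980 cores: {cores_980}")
print(f"Base clock gain: {clock_gain:.0f}%, bandwidth gain: {bandwidth_gain:.0f}%")
```

Combining more cores with a much higher base clock is what lets the smaller chip land in 980 Ti territory on paper.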

Memory bandwidth, memory compression

Pascal uses GDDR5X to increase its memory bandwidth, but Nvidia chose to stick with a 256-bit memory bus rather than the 384-bit bus that the GTX 980 Ti and GTX Titan X use. The secret sauce in Nvidia's recipe? A more advanced version of the same delta color compression techniques that Maxwell used.

PascalMemory2

When Nvidia launched Maxwell, it claimed that Maxwell reduced its memory bandwidth needs by 19-30% over Kepler, depending on the game in question. Pascal delivers a further 11-28% improvement over Maxwell, again depending on the game in question. Considering that the GTX 1080 already offers 43% more bandwidth than the GTX 980, the added color compression is icing on the cake, or a smart way to ensure the card can scale to 4K or even beyond, depending on your point of view.
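To see where the raw bandwidth numbers come from and what the compression gain might be worth, here is a back-of-the-envelope Python sketch. Two things in it are assumptions rather than figures from this article: the per-pin data rates (10 Gbps for the 1080's GDDR5X, 7 Gbps for the 980's GDDR5), and the decision to treat the 11-28% compression improvement as a straight multiplier on effective bandwidth.

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bus pins * per-pin rate / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


def effective_bandwidth_gbs(raw_gbs, compression_gain):
    """Simplified model: compression gain applied as a bandwidth multiplier."""
    return raw_gbs * (1 + compression_gain)


# Assumption: per-pin data rates are not stated in the article.
gtx1080_raw = peak_bandwidth_gbs(256, 10)  # 256-bit bus, GDDR5X
gtx980_raw = peak_bandwidth_gbs(256, 7)    # 256-bit bus, GDDR5

# Range implied by the quoted 11-28% improvement over Maxwell.
low = effective_bandwidth_gbs(gtx1080_raw, 0.11)
high = effective_bandwidth_gbs(gtx1080_raw, 0.28)

print(f"Raw: {gtx1080_raw:.0f} GB/s vs {gtx980_raw:.0f} GB/s")
print(f"Effective (Maxwell-equivalent): {low:.1f} to {high:.1f} GB/s")
```

Under this simplified model the 256-bit card behaves, in the best case, like one with well over 400GB/s of Maxwell-style bandwidth, which is why Nvidia could skip the wider bus.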

Maxwell-Compression

The outrageously pink image above shows the difference between Maxwell and Pascal's color compression. You can see that Maxwell already does a pretty good job of compressing the frame, but there are still a significant number of areas where Maxwell couldn't compress the data.

Pascal-Compression

Here's Pascal's version of the same frame. If you're thinking "Wow, that's really pink," you're on the right track. What this means is that Pascal can extract memory compression savings from a much larger percentage of the frame than Maxwell could. In the long run there's an inevitable diminishing marginal return to saving bandwidth via compression, but the feature worked extremely well in Maxwell and should give Pascal an additional edge in 4K gaming.
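Nvidia hasn't published the details of its compression hardware, so the following is only a toy Python illustration of the general idea behind delta compression, not Nvidia's actual algorithm. It stores one anchor value per block plus the differences from that anchor; the block size, the single 8-bit channel, and the function names are all hypothetical, and a real format would also need header bits to record the delta width.

```python
def bits_needed(delta):
    """Bits required to hold a signed delta in two's complement."""
    if delta >= 0:
        return delta.bit_length() + 1
    return (-delta - 1).bit_length() + 1


def compressed_bits(block, bits_per_pixel=8):
    """Size of a block stored as one anchor value plus fixed-width deltas."""
    anchor = block[0]
    deltas = [pixel - anchor for pixel in block[1:]]
    width = max((bits_needed(d) for d in deltas), default=0)
    return bits_per_pixel + width * len(deltas)


def is_compressible(block, bits_per_pixel=8):
    """True if the anchor-plus-deltas form is smaller than the raw block."""
    return compressed_bits(block, bits_per_pixel) < bits_per_pixel * len(block)


# A smooth gradient compresses well; a noisy block does not.
smooth = [100, 101, 102, 101, 100, 99, 100, 101]
noisy = [0, 255, 3, 250, 7, 240, 12, 230]

print(compressed_bits(smooth), is_compressible(smooth))
print(compressed_bits(noisy), is_compressible(noisy))
```

In terms of the screenshots, the pink regions are the blocks where a test like `is_compressible` succeeds; Pascal's extra compression modes simply make it succeed more often.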

Display support, media encode/decode, and SLI bridges

Nvidia has announced that the GTX 1080 will support a maximum resolution of 7680×4320 at 60Hz if two DisplayPort 1.3 connectors are used to drive the display. The GPU is only certified for DP 1.2 but is listed as DP 1.3 and 1.4 "ready." HDMI 2.0b and HDCP 2.2 are both supported as well. Media standard support has a few new bells and whistles that previous Maxwell cards lacked. Pascal now supports full encode and decode in both H.265 and 10-bit H.265. 12-bit H.265 (decode only) is also supported, as is hardware decode for Google's VP9 codec.
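A quick calculation shows why 8K at 60Hz needs two connectors. This Python sketch assumes 24 bits per pixel and ignores blanking intervals, and it uses the DisplayPort 1.3 HBR3 link figures (four lanes at 8.1 Gbit/s with 8b/10b encoding), none of which are stated in this article.

```python
import math


def video_data_rate_gbps(width, height, refresh_hz, bits_per_pixel=24):
    """Uncompressed pixel data rate in Gbit/s, ignoring blanking intervals."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


# Assumption: DisplayPort 1.3 HBR3 link parameters, not quoted in the article.
DP13_LANES = 4
DP13_LANE_RATE_GBPS = 8.1
# 8b/10b encoding leaves 8 payload bits for every 10 bits on the wire.
DP13_PAYLOAD_GBPS = DP13_LANES * DP13_LANE_RATE_GBPS * 8 / 10

needed = video_data_rate_gbps(7680, 4320, 60)
connectors_needed = math.ceil(needed / DP13_PAYLOAD_GBPS)

print(f"8K60 needs about {needed:.1f} Gbit/s")
print(f"One DP 1.3 link carries about {DP13_PAYLOAD_GBPS:.2f} Gbit/s")
print(f"Connectors needed: {connectors_needed}")
```

A single link falls well short of the roughly 48 Gbit/s the panel needs, while two links together clear it with a little headroom.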

Those of you who are familiar with multi-GPU configurations are also aware that Nvidia has previously used SLI bridges to connect one GPU to another. AMD abandoned this approach back in 2013 when it launched Hawaii; AMD GPUs now connect directly over the PCI Express 3.0 bus. Nvidia is still sticking with bridges for Pascal and GP104, but this time it's introducing a new, higher-bandwidth bridge standard for modern GPUs. Existing bridges should function well up to 2560×1440 @ 60Hz, but if you want >60Hz refresh rates or to run SLI in 4K or 5K mode, you'll see peak performance if you use newer bridges (Nvidia did note that its LED bridges are still rated for anything up to 5K). It's not entirely clear if older "stiff" bridges are limited the same way as the older "floppy" bridges (cross-GPU bandwidth was lower on the flexible bridges than on their "stiff" counterparts).

Mordor-SLI

This slide shows the difference in Shadow of Mordor between the old and new bridge. It's important to note that this dramatic difference was captured in 4K Surround mode, which means three 4K displays running the same game for a total resolution of 11520×2160. While the new bridges are much smoother than the old ones, the game itself doesn't maintain a playable frame rate at these resolutions and detail settings. Lower resolutions and detail settings might not show the same gains.

Asynchronous compute

Asynchronous compute has been a hotly debated topic ever since Ashes of the Singularity debuted and showed AMD holding an advantage over Nvidia, ostensibly due to this particular capability. While that situation is rather more nuanced and game-specific, there are going to be a number of questions regarding how Pascal stacks up to the competition.

According to Nvidia, GP104 improves on Maxwell in some significant ways. Maxwell was only capable of performing draw-level preemption and could only switch to a different workload at a draw call boundary. What this meant practically was that there were significant penalties to running a mixed compute + graphics workload, and we saw that reflected in Maxwell's performance when significant asynchronous workloads were running.

Pascal-Async

Unlike Maxwell, Pascal can perform much finer-grained preemption. In graphics workloads, it can preempt at the pixel level, flush the shader pipeline, and switch to compute. In compute workloads, it can swap at the instruction level and return to doing graphics work. Nvidia claims that this takes 100 microseconds or less, and while the company didn't offer competing figures for Maxwell, it should be significantly faster than what we saw last generation.
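The practical difference between the two preemption models is how long a compute task waits before the GPU can switch to it. The toy Python model below is only a sketch: the draw call durations are invented for illustration, and the only figure taken from Nvidia is the claimed 100-microsecond upper bound for Pascal.

```python
from bisect import bisect_left
from itertools import accumulate

# Nvidia's claimed upper bound for a Pascal preemption, in microseconds.
PASCAL_PREEMPT_BOUND_US = 100


def draw_level_wait_us(draw_durations_us, request_time_us):
    """Maxwell-style model: wait until the draw call in flight finishes."""
    ends = list(accumulate(draw_durations_us))
    index = bisect_left(ends, request_time_us)
    if index == len(ends):
        return 0  # every draw call has already completed
    return ends[index] - request_time_us


def fine_grained_wait_us(draw_durations_us, request_time_us):
    """Pascal-style model: never wait longer than the preemption bound."""
    remaining = draw_level_wait_us(draw_durations_us, request_time_us)
    return min(remaining, PASCAL_PREEMPT_BOUND_US)


# Hypothetical frame: three draw calls back to back, durations in microseconds.
draws = [200, 1500, 300]

# A compute request arrives 250 microseconds into the frame, mid-draw.
print(draw_level_wait_us(draws, 250))
print(fine_grained_wait_us(draws, 250))
```

In this made-up frame, a request that lands early in a long draw call waits more than a millisecond under draw-level preemption, but no more than the 100-microsecond bound under the finer-grained model.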

Asynchronous compute isn't a feature most games rely on yet (Ashes of the Singularity is something of an exception), and we can't deep dive into the question until we've got hardware. What I can say is that while Pascal significantly improves on Maxwell's capabilities, it doesn't offer the same set of compute capabilities that GCN does. The larger question is whether or not the difference between what the two companies support will have an impact on future DirectX 12 titles. The fact that Nvidia holds an estimated 75-80% of the gaming market is itself a powerful argument that developers should focus on building engines that cater to Nvidia's architectures and GPU capabilities more so than AMD's. At the same time, however, some developers have predicted that game engines may shift workloads towards compute engines no matter what, and that could potentially work in AMD's favor in future DX12 titles.

I've always recommended evaluating GPUs based on the games you're playing now, not the titles you might be playing in 12-24 months, and it's difficult to predict how game engines might change in the next few years. At minimum, the changes Nvidia has made to Pascal should significantly reduce any asynchronous compute penalty relative to Maxwell. At best, we should see Pascal picking up some performance improvements from async compute, including in games where Maxwell took a performance hit.

The graph above shows how far the GTX 1080 has come relative to its predecessor, but there's one caveat worth mentioning. According to Oxide, asynchronous compute is disabled on Nvidia cards by default, which means these test results may not tell us if Pascal actually benefits from async compute just yet. One final note: When Nvidia demoed its async compute capability in Austin, it did so using DirectX 11, not DX12. We weren't able to find more information on why it chose to demo using the older API, or what the performance ramifications were for that scenario.

Wrapping it all up

Pascal is a significant leap forward for Nvidia, thanks to a combination of higher clocks, increased core counts, and improved efficiency. The company is forecasting significant gains over and above the GTX 980 in both traditional gaming and VR, with particularly impressive boosts arriving for VR titles. While we can't speak to that specifically just yet, the on-paper gains are substantial.

Pascal-Perf

One difference about this launch, however, is that AMD and Nvidia are taking very different approaches to the market. Nvidia has chosen to launch high-end parts first, with the GTX 1080 and 1070 taking over for the 980 Ti, 980, and GTX 970. AMD, in contrast, will launch an efficiency-focused GPU first, with Polaris 10 and 11 targeting the budget and mainstream segments in both mobile and desktop. This is the first time in a long while that the two companies have taken this approach, and it'll be interesting to see how they compare in their respective brackets. It's still not clear if Pascal's VR performance gains will require substantial optimization or not, but VR enthusiasts who held off buying a new GPU when Oculus and Vive launched should be well-rewarded for their patience.

Current reviews show the GTX 1080 outperforming the GTX 980 by 25-35%, which is in line with our expectations. This launch is going to put serious pressure on AMD to reduce the price of its Fury products: the non-Founders variant of the 1080 will sell for $600, which puts it head-to-head against the Fury X, while the Founders Edition is $700. If the GTX 1070 beats Fury and Nano at $370, AMD will have to pull its prices down to compensate.

If you want to see additional performance data, Hot Hardware has a full review of the card, as do Ars Technica and our sister site Computer Shopper.