Meet DGX-1: Nvidia's Stupidly Powerful Supercomputer For Deep Learning

"250 Servers in a box." That's how Nvidia describes the DGX-1 -- the world's first commercially available supercomputer specifically built for deep learning. Packing in eight Tesla P100 GPUs that are capable of delivering up to 170 teraflops at peak performance, it is hands-down the most powerful system Nvidia has ever brought to market. We took some snapshots of this AI behemoth on the GTC showroom floor. Feast yer eyes!

The DGX-1 is a pre-built supercomputer boasting eight 16GB Tesla GPUs, a 7TB SSD, dual 10GbE and quad InfiniBand 100Gb networking, an NVLink Hybrid Cube Mesh and a pair of Xeon processors. In terms of raw computing power, Nvidia claims an astonishing 12x speed-up over its comparable hardware from the previous year. According to Nvidia, it provides the throughput of 250 CPU-based servers, networking, cables and racks: all in a single box.
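
As a rough sanity check on those headline figures, the arithmetic can be sketched in a few lines of Python. The numbers are the ones quoted above; the helper name `per_gpu_share` is purely illustrative, and splitting the system-wide peak evenly across the GPUs is an assumption.

```python
# Back-of-the-envelope figures for the DGX-1, using numbers quoted in the article.
NUM_GPUS = 8
MEMORY_PER_GPU_GB = 16
SYSTEM_PEAK_TFLOPS = 170  # Nvidia's quoted peak for the whole box


def per_gpu_share(total: float, gpus: int = NUM_GPUS) -> float:
    """Split a system-wide figure evenly across the GPUs (an assumption)."""
    return total / gpus


total_gpu_memory_gb = NUM_GPUS * MEMORY_PER_GPU_GB
peak_per_gpu_tflops = per_gpu_share(SYSTEM_PEAK_TFLOPS)

print(f"Total GPU memory: {total_gpu_memory_gb} GB")
print(f"Peak per GPU: {peak_per_gpu_tflops:.2f} teraflops")
```

That works out to 128GB of GPU memory and a little over 21 teraflops per GPU. The per-GPU figure is about double the single-precision number quoted for the P100, which suggests the 170-teraflop peak refers to half-precision (FP16) arithmetic.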

 Nvidia deep learning Supercomputer

In addition to the aforementioned hardware, the unit comes pre-loaded with deep learning software and development tools for speedier deployment. This includes Nvidia's Deep Learning GPU Training System (DIGITS), CUDA Deep Neural Network library (cuDNN) version 5, Caffe, Theano, Torch and a range of cloud management tools, software updates and a repository for containerized applications.

Together, the hardware and software arm researchers and data scientists with the requisite power for deep learning on a massive scale. This will allow AI systems to be trained much faster than ever before.

As Nvidia CEO Jen-Hsun Huang explained during the DGX-1's unveiling: "Data scientists and AI researchers today spend far too much time on home-brewed high performance computing solutions. The DGX-1 is easy to deploy and was created for one purpose: to unlock the powers of superhuman capabilities and apply them to problems that were once unsolvable."

Most of the power behind the DGX-1 comes from its eight Tesla P100 GPUs. As the first full GPUs based on Nvidia's Pascal architecture, these units are just as formidable as they look. Built on TSMC's 16nm FinFET manufacturing process, each GPU sports a core clock of 1328MHz, high-capacity HBM2 memory, 720GB/s of memory bandwidth, 64 FP32 CUDA cores per streaming multiprocessor (3,584 in total) and a die that packs in 15.3 billion transistors.

Each GPU provides 10.6 teraflops of single-precision floating point performance, a 3.7 teraflop boost over the enthusiast-level Titan X. But the real performance comes from NVLink interconnect support, which allows multiple GPUs to connect directly to each other for maximum application scalability. This is like PCI Express on steroids.
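
For the curious, the per-GPU teraflop figure can be approximated as cores x operations per cycle x clock speed. The sketch below is a rough estimate rather than an official calculation: the 3,584-core total and the 1,480MHz boost clock are assumptions taken from Nvidia's published P100 specifications, and `peak_tflops` is a made-up helper name.

```python
# Rough peak-throughput estimate: cores x FLOPs per core per cycle x clock.
FP32_CORES = 3584             # assumed total FP32 CUDA cores on the Tesla P100
FLOPS_PER_CORE_PER_CYCLE = 2  # one fused multiply-add counts as two operations
BASE_CLOCK_MHZ = 1328         # core clock quoted in the article
BOOST_CLOCK_MHZ = 1480        # assumed boost clock


def peak_tflops(cores: int, clock_mhz: float) -> float:
    """Theoretical peak in teraflops for a given core count and clock."""
    return cores * FLOPS_PER_CORE_PER_CYCLE * clock_mhz * 1e6 / 1e12


base = peak_tflops(FP32_CORES, BASE_CLOCK_MHZ)
boost = peak_tflops(FP32_CORES, BOOST_CLOCK_MHZ)
titan_x = 10.6 - 3.7  # Titan X figure implied by the article's comparison

print(f"P100 at base clock:  {base:.1f} teraflops")
print(f"P100 at boost clock: {boost:.1f} teraflops")
print(f"Implied Titan X:     {titan_x:.1f} teraflops")
```

On these assumptions, the quoted 10.6 teraflops lines up with the boost clock; at the 1328MHz core clock the same arithmetic gives roughly 9.5 teraflops.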

If you're wondering how this thing would fare as a gaming system, forget about it: Tesla GPUs are aimed strictly at enterprise customers -- they don't even come with HDMI or DisplayPort outputs, which means you can't connect them to a monitor. Plus, there's the $129,000 price tag to worry about.

Stanford University, UC Berkeley, NYU and the University of Oxford will be among the first institutions to get DGX-1s. Nvidia will also be partnering with Massachusetts General Hospital to bring the power of DGX-1 to medical research, specifically in the areas of radiology, pathology and genomics.

Gizmodo travelled to GTC 2016 in San Jose, California as a guest of Nvidia.



    Someone please explain this to me.
    Other than it being powerfully cool.

      Ok here is an example.

      I am a data engineer for a travel company.
      I want to work out
      - when people buy tickets
      - where people of age groups travel to
      - where people who live in certain areas travel to
      - how long people travel for
      Etc etc

      The reason I want to know this is simply because I want to understand why people travel and then be able to market something to these customers so they can purchase more tickets

      Now I can do a lot of analysis and calculations and rules etc
      But this Nvidia machine can crunch a lot of numbers for you almost instantly

      Think about it; rather than doing all the maths and logic which you spent months and months on and cost you lots of money to hire people, you can just buy this machine and begin feeding it data

      The example I gave you probably doesn't need the nvidia server but companies like American Express and MasterCard would
      Other places like Google etc

      It's pretty cool

    Can this even run Crysis?

      ...came to the comments just to see this.

      Thanks for making me believe in humanity again.

      I'm trying to start something similar, but with the Rift. Gimme some support, would ya!?

    I wonder how many keys per second this monster can try when cracking a WPA key?

      256-bit will take about 17 years with this beast.

    This would run great for Minecraft and Pokemon.
