New vendor possible, but supercomputer could miss 2012 deadline


CHAMPAIGN — Even with IBM pulling out of the Blue Waters project here, industry analysts say the leading-edge supercomputer could very well go on with another vendor — but quite possibly at a higher cost, and perhaps missing its fall 2012 deadline.

The building, with its advanced water cooling system — expected to become an industry standard — as well as a massive electrical infrastructure, has been largely completed for about a year, but on Aug. 6 IBM announced tersely that it was pulling out of the project and would return the $30 million it has been paid for its powerful servers.

The servers were members of the POWER7 family, which helped IBM's Watson score a publicity coup on "Jeopardy."

NCSA officials have said problems started showing up in April.

The National Center for Supercomputing Applications at the University of Illinois is the leader in a consortium of institutions that want to share the enormous sustained 1 petaflop (a thousand trillion floating point operations per second) computing power.

But NCSA officials have tried to be quiet about the future of the taxpayer-funded $300-million-plus project because of a lengthy non-disclosure agreement with IBM.

"We do have conversations with vendors, and a high level of confidence we can work with a new vendor," said John Melchi, who leads the administration directorate at NCSA.

Melchi said early last week that NCSA still expected to meet the deadline. The News-Gazette has filed a Freedom of Information request for documents relating to the matter.

On Friday, Melchi reiterated that the project should still work, noting that NCSA and the National Science Foundation have partnered since 1986.

"Since 2007 we have been partnering with NSF to build a sustained petascale computing resource for the nation's academic scientists and engineers. We have been asked by NSF to replan the Blue Waters project in light of IBM's recent termination of their contract to build a POWER7 system for us," he wrote in an email.

Melchi said NSF has asked NCSA to propose an alternative vendor.

"NCSA believes there are excellent, innovative technologies available that we can bring online in accordance with our current schedule. Keep in mind that in 2007 there were four proposals submitted to NSF in hopes of getting the contract. Computer vendors are contacting us daily offering their assistance in putting together an alternative supercomputer solution for the Blue Waters project," he said.

"The problem was that IBM had more difficulty developing the innovative technology in the Power7-IH system than anticipated, which drove up the cost to both develop and manufacture the system," Melchi added.

The potential vendors include U.S. companies such as NVIDIA, Cray and SGI.

"Cray could do it tomorrow, and I think SGI might also be able to," said Timothy Prickett Morgan, an industry veteran in England and the U.S. who has a high-tech blog at http://www.theregister.co.uk.

Seattle-based Cray Inc., for its part, is optimistic that it can handle the project.

Cray, which pioneered its supercomputer in 1976, later hit a bad patch. But it has rebounded and now has equipment in three of the world's Top 10 Supercomputers, according to http://www.top500.org, an independent scientific rating site.

Paul Hiemstra of Cray investor relations wasn't hesitant to say his company has the goods.

"We don't know the technical details yet, but we certainly could do something," he said. "We've had a pretty good win streak lately."

IBM says on its supercomputer site that its products have "the most installed aggregate performance." Its Roadrunner supercomputer was the first to reach a sustained performance of 1 petaflop, as measured by the benchmark used by http://www.top500.org.

While Hiemstra was optimistic about being able to do the work, he was inconclusive about whether Cray could meet the 2012 deadline.

"I can't really comment on that," he said.

NVIDIA spokesman Ken Brown said the company would not be commenting at this time. SGI did not return calls for comment.

In the 2007 plan, the National Science Foundation agreed to spend $208 million on the project; the UI and state of Illinois pledged another $100 million.

That's probably not enough money, says Horst Simon, the deputy director of Lawrence Berkeley National Laboratory, who has also worked on NASA's high speed computation projects.

Simon said NSF has a history of slightly "low-balling" supercomputer projects and expecting corporations or government bodies to make up the difference.

That has worked in the past, he said. Simon compares the situation to Formula One racing.

"No one makes money on their Formula One cars. But there's a significant chance to boost prestige with your name on the car. There's also a certain macho aspect to the top 500 (supercomputers). The Japanese are building their K machine now as a national rallying point," he said.

The Formula One formula worked before 2007, when NSF drew up many of the Blue Waters details. But the national economy has changed the game for both governments and corporations, according to Simon.

The state of Illinois would no longer have the money to bail out an expensive project, he said, and things aren't any better at IBM.

Also, other costs were hard to control, he said.

"In order to reach their goal, very significant investments had to be made in cooling and energy efficiency," he said.

"In my mind, it was clear from the beginning that the architecture NCSA had chosen was well-suited, but came with a cost. If you really want sustained petaflop computing, you need powerful memory, powerful interconnects, lots of electrical power."

Achieving supremacy became an expensive goal.

"The contract went bad," Morgan believes, "because IBM can't afford to lose the money; it's not going to use Blue Waters as a loss leader."

Morgan said, "Blue Waters has a hell of a lot of computing power and the NCSA guys had every expectation that this thing would have worked. This has to be a manufacturing problem, costing more than the $300 million to build."

Joanna Brewer of IBM Systems Group External Relations said Friday she could make no comment on the NSF issue.

NSF spokeswoman Lisa-Joy Zgorski did not return a call.

The Japanese earthquake and the global financial crisis may have played a role too, by hurting the value of the dollar paid for material.

Morgan said it's also possible that the Japanese earthquake may have struck a blow at IBM's suppliers, including memory-maker Hitachi.

Comments


Paul Wood wrote on August 14, 2011 at 5:08 pm

Still looking for insiders who might have a little more info ... pwood@news-gazette.com

birdfarmer wrote on August 15, 2011 at 10:08 am

Paul, Don't worry about who did who wrong, there is plenty of blame to go around. The issue now should be whether the NSF/Nation should cut its losses and move on. This project was a double-down on the previously controversial NSF PACI project that wasted hundreds of millions of dollars to fund two competing supercomputer centers that arguably did not produce science worthy of the investment. So now, the remnants of $200M could buy a computer maybe not even in the top 10 to let maybe a dozen researchers in the Nation do simulations, while sucking 25MW of electrical power. Would that $200M, spread out among the other NSF Directorates to maybe fund K-12 education rather than BIG SCIENCE produce more value to the Nation? Now is the time to rethink the value of this project as was intelligently done with the Texas Supercollider fiasco.

Lostinspace wrote on August 15, 2011 at 2:08 pm

Interesting stuff. How much UI money/involvement (exclusive of grants)?

Paul Wood wrote on August 15, 2011 at 2:08 pm

That's interesting. Someone else told me to expect 20 megawatts consumption ....

birdfarmer wrote on August 15, 2011 at 2:08 pm

http://www.nsf.gov/bfa/lfo/seminars/pub/TOWNSNSF_LargeFacilitiesWSTowns.pdf

page 31 and 32 for power draw ...

... and look at the "on-line" date on page 27: July 2011 (last month)

asparagus wrote on August 15, 2011 at 2:08 pm

It is time to pull the plug on this fiasco. NCSA has been wasting money for years on dubious research with little real impact to the state. Let's take some millions of dollars and reinvest them in things that matter like education proper. If NCSA disappeared tomorrow, the university would be much better off.

Paul Wood wrote on August 15, 2011 at 2:08 pm

thank you, birdfarmer, that's helpful