Enterprise 10000 (Starfire)


History of the Enterprise 10000 (Starfire)

  • This system came out of the Cray Business Systems Division (BSD)
  • The E10000 extends the Enterprise line that was announced in April 1996, and is a result of the July 1996 acquisition of the Cray Research SPARC(TM)/Solaris business. With the acquisition came a 250-person organization, the CS6400 64-way system based on SuperSPARC(TM) technology, and the almost-completed design for what is now being announced as the Enterprise 10000. The Cray organization has been integrated into SMCC as a product group called Business Systems.
  • Q. What is the future of the CS6400 system?
  • A. The CS6400 will be formally withdrawn by calendar Q2 1997. Existing systems will be supported for five years and can be maintained by SunService(SM). An upgrade program allows a trade-up from the CS6400 to the E10000. In addition, Sun has sufficient material available to allow expansions to existing CS6400s for at least one year.


  • Processors

  • 16 to 64 UltraSPARC-II processors (codename Blackbird): 250MHz with 4MB E$, 336MHz with 4MB E$, or 400MHz with 4MB E$
  • Blackbird is a 0.35-micron processor.
  • The processor has 44-bit virtual to 41-bit physical addressing capability
  • The on-chip cache is 16KB each for data and instructions
  • The FPU has two execution units capable of two FLOPS per clock cycle
  • Upgrades will be to either 300MHz or 333MHz
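The processor figures above imply a simple peak-FLOPS ceiling. As a sanity check, a minimal sketch assuming the maximum 64-processor, 400MHz configuration and the two-FLOPS-per-clock FPU quoted above (spec-sheet peak, not measured performance):

```python
# Peak floating-point ceiling for a maximally configured E10000.
# These are spec-sheet numbers; sustained throughput is much lower.
CPUS = 64                # maximum processor count
CLOCK_HZ = 400e6         # fastest quoted UltraSPARC-II clock
FLOPS_PER_CYCLE = 2      # FPU retires two floating-point ops per clock

peak_gflops = CPUS * CLOCK_HZ * FLOPS_PER_CYCLE / 1e9
print(f"peak: {peak_gflops:.1f} GFLOPS")  # peak: 51.2 GFLOPS
```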


  • System Board

  • Up to 16 system boards
  • Each system board has a mezzanine card for SBus
  • There are 2 SBuses with a total of 4 SBus slots
  • Each SBus runs at 25MHz with a 64-bit data path, which comes to 200MB/sec total bandwidth, about ~100MB/sec sustained
  • Each system board provides 400MB/sec of I/O bandwidth
  • Each Starfire system can provide 400MB/sec * 16 boards, or 6.4GB/sec, of I/O bandwidth
  • There are NO onboard SCSI or Ethernet ports.
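The bus figures above chain together multiplicatively. A quick sketch recomputing them (peak numbers from the list; roughly half is the sustained estimate given above):

```python
# Recompute the quoted SBus and system-wide I/O bandwidth figures.
SBUS_CLOCK_HZ = 25e6       # each SBus runs at 25 MHz
SBUS_WIDTH_BYTES = 8       # 64-bit data path
SBUSES_PER_BOARD = 2
BOARDS = 16                # maximum system boards

per_sbus_mb = SBUS_CLOCK_HZ * SBUS_WIDTH_BYTES / 1e6   # 200 MB/s peak
per_board_mb = per_sbus_mb * SBUSES_PER_BOARD          # 400 MB/s per board
system_gb = per_board_mb * BOARDS / 1000               # 6.4 GB/s system-wide
```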


  • UPA

  • The E10000 uses the Gigaplane-XB interconnect between System Boards. This is a router-based interconnect scheme that, in this version of the E10000, scales up to 10.5 GBps using an 83.3-MHz internal clock frequency. Latency is constant at about 600 ns regardless of the load on the interconnect. The Gigaplane-XB has been designed and tested to run at an internal clock frequency of 100 MHz, giving a data transfer rate of 12.8 GBps; this is a future growth path for the E10000.
  • There are two centerplane support boards in every Starfire, one for each half of the centerplane. These boards provide power and clocks to the centerplane.


  • Domains

  • The E10000 may be dynamically reconfigured as independent computers using the Dynamic System Domain feature. The system has been architected for 16 domains, and five are available in the initial version of the E10000. Each domain has its own copy of Solaris and its own boot disk and hostid. There is software isolation between domains and some degree of hardware isolation too.
  • An example of the use of System Domains would be to divide the E10000 into a Transaction Domain and a Batch Domain. In the daytime, the Transaction Domain can be expanded at the expense of the Batch Domain, with the reverse occurring at night.
  • System Domains are a feature borrowed from the mainframe world (LPARs). They give the customer the flexibility to configure the E10000 to meet the needs of the business at any specific time period, and to change it later as business conditions change. They also allow the E10000 to be divided into a number of independent and secure systems with isolation between them.


  • Gigaplane

  • The Gigaplane-XB is built with two data sections and four address paths. There is error correction on the data and the addresses, which should take care of most transient errors. The Gigaplane-XB is implemented with active logic on the centerplane.
  • Should one data section have a hard failure, the E10000 will come back up following an auto reboot with the remaining section carrying all the traffic. The net data bandwidth will be halved, but the E10000 will continue to provide service to users. The address paths degrade from four to three to two to one should there be a hard failure. Replacement of an E10000 centerplane can be scheduled for the next maintenance period.


  • Memory

  • Each memory bank delivers one complete cache line of 72 bytes per access: 64 bytes of data plus ECC. Two types of SIMMs are available, using 16Mbit or 64Mbit DRAM technology; the SIMM sizes are 32MB and 128MB. There are 4 memory banks on a system board, with 8 SIMMs per bank.
  • Maximum of 4GB of memory per system board.
  • SIMM types cannot be mixed on a system board but can be mixed in a system
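The bank and SIMM counts above determine the per-board and system maximums. A small sketch using the larger (128MB) SIMMs:

```python
# Maximum memory from the SIMM geometry quoted above.
SIMMS_PER_BANK = 8
BANKS_PER_BOARD = 4
SIMM_MB = 128              # larger of the two SIMM sizes
BOARDS = 16                # maximum system boards

board_mb = SIMMS_PER_BANK * BANKS_PER_BOARD * SIMM_MB   # 4096 MB = 4 GB/board
system_gb = board_mb * BOARDS // 1024                   # 64 GB system maximum
```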


  • Control Board

  • All "housekeeping functions" are on a Control Board, which can be optionally redundant.
  • Each E10000 requires a Control Board and two may be configured for redundancy. Should the Control Board fail, the spare will be automatically configured into the system following an auto reboot. The failed board can be on-line hot swapped later.
  • It provides the Ethernet twisted-pair 10BaseT connection to the SSP
  • It provides the JTAG interface to the system boards
  • It provides the central clock distribution for Starfire
  • It monitors all cooling fans
  • It controls the remote switching of any I/O expansion cabinets
  • It provides the fail-safe logic that monitors Starfire's temperature and removes power from the system and I/O cabinets if the upper threshold is exceeded.
  • The control board plugs into the centerplane. The redundant control board plugs into the centerplane from the opposite side. A reboot is required to switch between control boards.


  • Hostview

  • Hostview is the GUI that is used to configure and administer a Starfire.


  • Power and Physical information

  • The E10000 has fault-tolerant power and cooling and redundant AC line feeds. These components are also on-line replaceable. These are RAS features that preserve the high availability of the E10000 should any of these components fail. The SSP logs the failure for future replacement.
  • The system can have up to 5 AC inputs.
  • These are 220V 30-amp connectors, with a max draw of 24 amps per line
  • 52,000 BTU/hr heat output
  • 70" high x 30" wide x 34" deep, with a 5" styling panel
  • 1400 pounds fully loaded
  • The fans are both below and above the system boards.
  • The base system comes with 3 AC inputs
  • This can handle up to a 32-processor system
  • The 4th AC input is needed for systems above that.
  • The 5th AC input is used for the I/O space
  • Do NOT put the OS drives internal to the Starfire: the 5th AC input is not part of the load sharing and is not redundant.
  • The AC inputs go into 48V bulk supplies.
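The AC feed capacity above can be cross-checked against the quoted heat output. A rough sketch (the 0.293 W per BTU/hr conversion is standard; the comparison itself is a back-of-the-envelope check, not from the spec):

```python
# Cross-check AC feed capacity against the quoted heat output.
VOLTS = 220
MAX_AMPS = 24              # max draw per line
BASE_LINES = 3             # base-system AC inputs (load-sharing)

per_line_w = VOLTS * MAX_AMPS               # 5280 W per feed
base_capacity_w = per_line_w * BASE_LINES   # 15840 W across the base feeds
heat_w = 52_000 * 0.293                     # 52,000 BTU/hr ~= 15.2 kW dissipated
```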


  • Power Control Unit

  • A PCU may be installed in the I/O space on a Starfire. It is used to control I/O expansion units. Up to 5 PCUs may be installed in a Starfire.


  • Operating System

  • Starfire comes with Solaris, SSP, and Alternate Pathing (AP) software.
  • Licensing is for unlimited Solaris users


  • SSP

  • The E10000 is configured with an integral System Service Processor (SSP), based on a SPARCstation(TM) 5 workstation with CD-ROM and management software. The SSP is used for normal Solaris administration, but also controls system booting, monitors the hardware for problems and is fundamental to the E10000's advanced RAS capability. It also uses SNMP-based messaging, which provides the framework that allows remote monitoring packages to include the E10000 in their management. The premise behind the SSP is that it makes sense to use a separate independent vehicle for monitoring and controlling the system. In this way, the SSP can diagnose the E10000 with no compromises.


  • RAS Features

  • A major capability for the E10000 is its RAS features. Dynamic Reconfiguration (DR) allows System Boards to be swapped while the system remains on line. Dynamic System Domains (DSD) allow the E10000 to be dynamically reconfigured as multiple smaller computers. There is fault tolerant power and cooling. In fact, an E10000 can be configured so there are no single points of failure that would cause the system to be down longer than the auto-reboot time.
  • Feature: Auto-reboot for all software hangs or panics is controlled by the SSP. Benefit: The system will only be down for the time taken to auto reboot (configuration-dependent, but not more than 30 minutes); "lights out" operation is possible.
  • Feature: The E10000 has fault-tolerant power and cooling and redundant AC line feeds. These components are also on-line replaceable. Benefit: These RAS features preserve the high availability of the E10000 should any of these components fail. The SSP logs the failure for future replacement.
  • Feature: The E10000 has been designed with a Dynamic Reconfiguration feature that allows on-line swapping of System Boards without a required auto reboot. DR execution is controlled by the SSP. Benefit: This enables concurrent servicing of the E10000; it can be used to repair a failure or to upgrade the system while processing continues.


  • SUNTRUST

  • By configuring a Starfire correctly you can improve its availability from 99.5% to 99.95%. This is the difference between about 44 hours and about 4.4 hours of downtime per year.
  • Important uptime configuration issues:
  • Extra system boards
  • control board
  • Power and Cooling
  • System Service Processor
  • on the system
  • system domains
  • I/O redundancy
  • your data with RAID 5 or 1
  • your sysadmins
  • monitors
  • to platinum service
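The availability percentages quoted above map directly to hours of downtime per year. A minimal sketch recomputing them:

```python
# Convert an availability percentage to expected downtime per year.
HOURS_PER_YEAR = 24 * 365   # 8760

def downtime_hours(availability_pct):
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

base = downtime_hours(99.5)     # ~43.8 hours/year
tuned = downtime_hours(99.95)   # ~4.4 hours/year
```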


  • TPC-D

  • Sun's Ultra Enterprise 10000 server set record results in a critical TPC-D measure: 300-gigabyte database performance.
  • At 300 gigabytes, the Ultra Enterprise 10000 server in a 64-processor configuration has TPC-D power of 1787.9 QppD@300GB and a TPC-D throughput of 1122.3 QthD@300GB, for a TPC-D price/performance of $3,562 QphD@300GB.
  • Previously, numbers near this level were achieved only in more expensive clustered configurations.
  • Availability

  • First Order Date: January 22, 1997
  • Volume Shipment Date: March 1997



  • 400MHz HPC Benchmarks (Internet link)
  • Starfire Benchmarks (Internet link)
  • Main Starfire Page (Internet link)