74GB WD Raptors in RAID 0 Questions.

The two-drive array shows a 100% increase in failure rate compared to a single drive. The ten-drive array shows a 900% increase in failure rate compared to a single drive.

Throw your formula away. It is not statistically sound.

S-

Correct me if I am wrong, and I apologize if I overlooked the answer.

I believe I learned in my elementary math classes that



100% failure rate = not working at all.

If it is not working at all, why should it be an option?
 

bruce2,

Better take that elementary math class again. ;)

A 100% increase in failure rate is not a 100% failure rate.

If a drive type had an average failure rate of 1 failure every 50,000 hours, what is a 100% increase in failure rate? What is 100% of 1? It's 1. So a 100% increase in failure rate would be 2 failures every 50,000 hours.

S-
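As a minimal illustration of the arithmetic above (the 1-failure-per-50,000-hours figure is just the example number from the post, not a manufacturer spec), a short Python sketch:

# A 100% "increase" in failure rate doubles the rate; it does not mean a 100% failure rate.
base_rate = 1 / 50_000  # example: 1 failure per 50,000 hours

def increased_rate(rate, percent_increase):
    """Failure rate after a given percentage increase."""
    return rate * (1 + percent_increase / 100.0)

print(increased_rate(base_rate, 100) * 50_000)  # 2.0 failures per 50,000 hours (two-drive array)
print(increased_rate(base_rate, 900) * 50_000)  # 10.0 failures per 50,000 hours (ten-drive array)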
 
It does not go up 100% because you're using each drive less. You are just figuring event risk. With 2 drives it's 50%: you are on each drive half the time. I think you are confusing how increased probability of failure works.
Your formula assumes the motor and bearings never fail. That's crazy. Most drive failures I have seen are related to the main spindle motor, the bearings, or other components that are always in use while the drive is powered on. Head stepper motors don't fail too often.

So even if we don't use MTBF, your formula is still useless since the drives are not off when they are not being accessed.

S-
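A rough sketch of the two competing assumptions, using an arbitrary illustrative per-drive probability (p = 0.05 is made up for the example, not a real drive statistic):

p = 0.05  # hypothetical chance a single always-powered drive fails in some period

# Assumption in the quoted post: risk scales with each drive's share of the I/O,
# so two striped drives each carry roughly half the per-drive risk.
risk_io_model = 1 - (1 - p / 2) ** 2      # ~0.049, about the same as one drive

# Assumption argued above: spindle, bearings, and PCB are stressed whenever the
# drive is powered, regardless of I/O, so each drive keeps the full risk p.
risk_power_model = 1 - (1 - p) ** 2       # ~0.098, roughly double one drive

print(risk_io_model, risk_power_model)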
 

bruce2,

Better take that elementary math class again. ;)

A 100% increase in failure rate is not a 100% failure rate.

If a drive type had an average failure rate of 1 failure every 50,000 hours, what is a 100% increase in failure rate? What is 100% of 1? It's 1. So a 100% increase in failure rate would be 2 failures every 50,000 hours.

S-

Sorry, I overlooked the "increase". Thanks for the heads-up about taking my math class again. :o
 

In my server room I have over 50 hard drives. On my old SCSI workstation I had 18 hard drives. They were in RAID 1 and 10, with hot-spare failover drives. I did complete tape backups of the data and system state. A drive failure did not bring down a server or my workstation; it only slowed down the RAID array for the time it took to fail over to the spare drive. Replacing a hot-swap drive is done while the server or workstation keeps running. On the HP servers, they just moved the data back from the spare to the replacement drive. I had three replacement drives shipped bad. The server detected them and did not move the data back from the spare. My downtime was zero.

All the statistics you mention don't predict failure. They tell you the probability compared to other drives. That is really meaningless. A drive can fail at any time for any reason. I had a Seagate MFM 80MB hard drive last through 18 years of use before it failed. I had a Seagate Savvio last hours. I had new HP replacement drives fail out of the box. You view statistics and probability as gospel. I don't even look at them in the server room or at home. I build so that it takes multiple failures to bring down my server or workstation.
 

Now we are on to something! I agree that a drive can fail at any time for any reason. That is why I say that if you go from 1 drive to 2 in a storage system, that system is twice as likely to have a drive failure. Go from 1 to 3 drives and that system is 3 times as likely to have a failure.

S-
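A small sketch of that claim, assuming independent drives with the same failure probability (the value of p is arbitrary and only for illustration):

p = 0.02  # hypothetical per-drive failure probability over some fixed period

def p_any_failure(n, p):
    """Probability that at least one of n independent drives fails."""
    return 1 - (1 - p) ** n

for n in (1, 2, 3, 10):
    print(n, round(p_any_failure(n, p), 4))

# The expected number of drive failures scales exactly with n; the chance of at
# least one failure is close to n * p while p is small, which is where the
# "twice as likely" / "3 times as likely" figures come from.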
 

I would suggest taking a course in probability and statistics for engineering. It sounds correct, but it's not mathematically correct. You don't double or triple your risk with 2 or 3 drives. I have had every math course offered for a Bachelor's degree in Math. By your calculations, server rooms would be doomed. A 95% increase in risk would be huge. You would have an almost guaranteed failure event after increasing your risk by 95%. You would have only a 5% probability of continued success over the life of the drive.

The chances of a motor failing are very, very low compared to a PCB or head failure. Your chance of a power- or heat-related failure is far higher than a motor failure on a hard drive. In 20 years of using hard drives, I have not had a motor fail. I have had PCB failures and head failures. The servo motors don't fail that often. I have used MFM, RLL, IDE, SCSI, and SAS drives since the early 1980s. I know the engineers at Seagate. When a drive fails, I send it directly to them to find the cause. Most of the time it's a heat-related failure of the PCB. I have also crashed heads due to drive vibration in the chassis. The servo motors on SCSI and SAS drives will outlive the platters and heads.
 

Customers replace disk drives at rates far higher than those suggested by the estimated mean time between failure (MTBF) supplied by drive vendors, according to a study of about 100,000 drives conducted by Carnegie Mellon University.


The study, presented last month at the 5th USENIX Conference on File and Storage Technologies in San Jose, also shows no evidence that Fibre Channel (FC) drives are any more reliable than less expensive but slower performing Serial ATA (SATA) drives.
That surprising comparison of FC and SATA reliability could speed the trend away from FC to SATA drives for applications such as near-line storage and backup, where storage capacity and cost are more important than sheer performance, analysts said.
At the same conference, another study of more than 100,000 drives in data centers run by Google indicated that temperature seems to have little effect on drive reliability, even as vendors and customers struggle to keep temperature down in their tightly packed data centers. Together, the results show how little information customers have to predict the reliability of disk drives in actual operating conditions and how to choose among various drive types.
Real World vs. Data Sheets

The Carnegie Mellon study examined large production systems, including high-performance computing sites and Internet services sites running SCSI, FC and SATA drives. The data sheets for those drives listed MTBF between 1 million to 1.5 million hours, which the study said should mean annual failure rates "of at most 0.88%." However, the study showed typical annual replacement rates of between 2% and 4%, "and up to 13% observed on some systems."


Study: Hard Drive Failure Rates Much Higher Than Makers Estimate - PC World
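For reference, the article's "at most 0.88%" figure follows directly from the quoted MTBF numbers if you assume drives are powered 24/7 and fail at a constant rate; a quick Python sketch:

HOURS_PER_YEAR = 24 * 365  # 8760

def annual_failure_rate(mtbf_hours):
    """Approximate annualized failure rate implied by a datasheet MTBF."""
    return HOURS_PER_YEAR / mtbf_hours

print(annual_failure_rate(1_000_000))  # ~0.0088 -> about 0.88% per year
print(annual_failure_rate(1_500_000))  # ~0.0058 -> about 0.58% per year

# The study's observed 2-4% (and up to 13%) annual replacement rates are several
# times higher than these datasheet-implied figures.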
 

I would agree. In a server room you have optimal power and cooling. You also have drives getting more use. Heat and power issues kill the drive. Most users don't cool the room along with the chassis. They don't have clean power going into the power supply to begin with. I usually get a Tripp Lite line conditioner for every home computer I build for users. I also recommend a dedicated circuit.
 

I would suggest taking a course in probability and statistics for engineering. It sounds correct, but it's not mathematically correct. You don't double or triple your risk with 2 or 3 drives. I have had every math course offered for a Bachelor's degree in Math. By your calculations, server rooms would be doomed. A 95% increase in risk would be huge. You would have an almost guaranteed failure event after increasing your risk by 95%. You would have only a 5% probability of continued success over the life of the drive.
You have taken every math course offered "for a Bachelor's degree in Math" and you think that a 95% increase in the chance of failure leaves you a 5% probability of no failure? You must not have been paying much attention in class. Did you fail those courses?

A 95% increase in risk is not even a doubling of the risk. Let's say there was a 1 in 10,000 chance of getting hit by a car if you crossed Easy Street. If I increase the chances of getting hit by 95%, you have a 1.95 in 10,000 chance of getting hit. A 200% increase in risk would mean a 3 in 10,000 chance of getting hit.

Everything is a system, and RAID arrays are no different. The more devices in an array, the more likely there will be a failure. RAID 0 arrays are bad from a reliability standpoint. As you said, a drive can fail at any time for any reason. Reads or writes do not have to be happening for a drive to fail. If you are using the same drives to make up the array, going from 1 drive to 2 doubles the risk of failure because now you have twice as many components that can fail. Going from 1 to 3 triples the risk.

You don't use RAID 0 arrays if you care about reliability because of the problem mentioned above. If one drive fails, the whole array is down. That's why larger arrays are built using a RAID level that does not fail because of a single drive failure.

S-
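To illustrate the last point, here is a minimal sketch contrasting RAID 0 with a two-way mirror, assuming independent drives, an arbitrary per-drive failure probability, and ignoring rebuild windows:

p = 0.03  # hypothetical per-drive failure probability over some period

def raid0_failure(n, p):
    """RAID 0: the array is lost if ANY of the n drives fails."""
    return 1 - (1 - p) ** n

def raid1_failure(n, p):
    """RAID 1 (n-way mirror): the array is lost only if ALL n copies fail."""
    return p ** n

print(raid0_failure(2, p))  # ~0.059 -- about double the single-drive risk
print(raid1_failure(2, p))  # 0.0009 -- far below the single-drive risk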
 
Using RAID 0 is like putting all your Data in a room full of oily rags with a blowtorch. :)
 

Pretty much......

S-
 

@xguntherc: I bet you're sorry you started this one.... :eek:

Later :sarc: Ted
 
