From PostgreSQL wiki

Revision as of 04:44, 26 November 2009 by Ringerc (Talk | contribs)


The common generalization is that SCSI disks are fast, reliable, and expensive, while IDE drives, including both PATA and SATA designs, are slow, unreliable, and cheap. The truth is a bit more complicated. Note that "SCSI" as used here nowadays normally means SAS ("Serial Attached SCSI").

SCSI Disks

  • Often have higher available maximum RPM (10K or 15K).
  • Maximum capacity is lower (73-300GB are popular sizes).
  • Cost/MB is high.
  • Tend to be made with more expensive and possibly more reliable components, and believed to go through more thorough testing. The data from studies by Carnegie Mellon and Google don't show any significant bias toward SCSI being more reliable. But the data from Network Appliance suggests "SATA disks have an order of magnitude higher probability of developing checksum mismatches than Fibre Channel disks".
  • Usually default to the write cache being turned off, ensuring reliable database operation.
  • According to some sources such as NetApp, the firmware in SCSI devices tends to be better optimized for RAID use: it returns errors promptly so that data can be reconstructed from partner devices, whereas desktop-oriented SATA devices often struggle internally to repair the damage instead. Similarly, the SAS protocol uses a command set with more facilities for error reporting and recovery than the SATA interface provides.

ATA Disks

  • Most drives have slower RPMs (7200 is standard, some 10K designs like the Western Digital Raptor).
  • Maximum capacity is higher (2TB available). This is achieved by putting more platters into the drive. More platters means more heat and moving parts, and all other things being equal that can contribute to a higher failure rate. More platters can also mean slower seeks and generally slower performance in cases where the read and write heads are heavier; on the flip side, in cases where you're moving across the whole disk more platters may reduce average seek time.
  • Cost/MB is low. A fair performance comparison will recognize that while individual SCSI disks may be faster, if you can put more disks into the system because they're cheaper, the aggregate performance of the ATA-based solution may be better. This is limited by server space: you may reach the upper limit on disk expansion before you can add enough SATA disks to pull ahead.
  • Always default to the write cache being enabled. Good (S)ATA RAID controllers will turn it off for you if set up correctly. It's possible to disable the cache on disks via the operating system, but relying on this can be dangerous: there are reports of drives that don't turn off caching regardless, and of cases where the write cache turns back on if the drive is reset. A better technique is to turn the cache off using the diagnostic tools most manufacturers provide, so that it defaults to off even after a reset. This requires some discipline on your part, to make sure it happens even when disks are replaced (the time around disk replacement after a drive failure tends to be stressful).
  • Consumer-oriented SATA disks can have large changes made to the drive firmware after the model is released, and in many cases it's not possible to revert a newer drive to an earlier firmware version. In some disk array configurations, matching drive firmware across all the disks in the array can be important. You can therefore end up unable to purchase a compatible replacement drive for a given model even though that model is still on the market. SCSI disks are intended for servers, and this sort of RAID compatibility is a design requirement for them. There are some models of SATA disks branded with terms like "server", "enterprise", or "RAID edition" that aim for similar firmware stability, but these drives tend to be significantly more expensive than consumer SATA disks, which closes a portion of the SATA/SAS price gap.
  • Consumer drives will also go out of their way to retry and attempt to correct read errors. In a RAID configuration you don't want that: it can lead to a timeout, and you'd prefer to just read the known good copy from another drive. Avoiding these long retries is another difference you typically find in the "enterprise" oriented SATA drives and in all SAS disks.

If you get a good ATA controller, one that always turns off the individual disk caches for you, it's possible to build a reliable database system around ATA drives. But if you're just using the controller that comes integrated with your motherboard, unless you're very careful to validate that the write cache is disabled you risk database corruption if there's a crash.
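As one way to make that validation routine, the following is a minimal sketch in Python that assumes a Linux host with the `hdparm` utility installed and sufficient privileges to query the drive. The function names are hypothetical, and the parsing assumes the usual `write-caching =  1 (on)` style of output from `hdparm -W`. Note that this only reports what the drive claims; as described above, some drives reportedly ignore the setting or re-enable the cache after a reset, so a check like this complements a real crash test rather than replacing it.

```python
import re
import subprocess


def parse_write_cache_state(hdparm_output):
    """Interpret the text printed by `hdparm -W <device>`.

    Returns True if the drive reports write caching on, False if it
    reports it off, and None if no write-caching line was found.
    """
    match = re.search(r"write-caching\s*=\s*(\d)", hdparm_output)
    if match is None:
        return None
    return match.group(1) == "1"


def write_cache_enabled(device):
    """Ask the drive for its write cache state (hypothetical helper).

    Assumes Linux with hdparm available; usually needs root.
    """
    result = subprocess.run(
        ["hdparm", "-W", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_write_cache_state(result.stdout)
```

Running `write_cache_enabled("/dev/sda")` at boot and after every disk replacement, and refusing to start the database when it does not return `False`, is one way to enforce the discipline described above. The device path is only an example.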

SCSI-based setups generally avoid this issue by having sane defaults for database use. You will also likely get higher transfer rates, and in particular better seek performance, from an individual SCSI disk than from a single ATA one. But in cases where you can throw more disks at the problem, being able to purchase more ATA disks per dollar may result in a system that's considerably faster than the same amount spent on SCSI hardware.
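The disks-per-dollar tradeoff can be roughed out with a back-of-the-envelope model. The sketch below, in Python, estimates random I/O operations per second for one disk as the inverse of average seek time plus average rotational latency (half a revolution), then multiplies by the number of disks that fit within both the budget and the available drive bays. The function names are hypothetical, and any prices, seek times, or bay counts you feed in are your own assumptions; the model ignores controller caches, RAID overhead, and sequential transfer rates.

```python
def random_iops(rpm, avg_seek_ms):
    """Rough random IOPS for a single disk.

    Average rotational latency is half a revolution:
    0.5 * (60,000 ms per minute / rpm).
    """
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)


def aggregate_iops(budget, price_per_disk, max_bays, rpm, avg_seek_ms):
    """Return (disk_count, total_iops) for one drive type.

    The disk count is capped by both the budget and the number of
    drive bays, reflecting the server space limit.
    """
    disks = min(int(budget // price_per_disk), max_bays)
    return disks, disks * random_iops(rpm, avg_seek_ms)
```

Comparing `aggregate_iops(...)` for a 15K RPM drive against a cheaper 7200 RPM drive under the same budget shows both effects from the text: with enough bays the cheaper drives can come out ahead on aggregate IOPS, while a small `max_bays` caps the disk count before they can catch up.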


Recommended SATA Controllers

  • 3ware units are generally agreed to be solid, with some concerns about their RAID 5 performance.
  • LSI MegaRAID controllers (but not the SATA 150-2) are considered very reliable but somewhat slower than the other vendors listed here.
  • Areca controllers are very fast, but harder to obtain and since they're newer they're not as time-tested.

Helpful vendors of SATA RAID systems

Derived from and

Studies on drive reliability

