An Intel 530-Series 240GB mSATA SSD is installed in an Intel DQ77MK motherboard mounted in an Antec Sonata-II tower case. FreeBSD 10.1-RELEASE (amd64) is installed on the SSD.
The SSD disconnects at random, resulting in a series of AHCI timeout messages and ultimately a kernel panic.
Usually, if not always, the SSD is no longer visible in the BIOS after the reboot until the system is power-cycled. (One small variation: there are other, non-SSD SATA disks attached, and if the system is powered down before FreeBSD panics, the device boot order is retained in the BIOS. If the system is allowed to panic and reboot, the boot order is lost and the SSD is no longer listed as the first boot device.)
Initially, in an attempt to mitigate this problem, the SATA channel for the SSD was configured for reduced-speed operation in FreeBSD (in /boot/loader.conf).
Kernel log (/var/log/messages) with that setting in place:
kernel: ada2 at ahcich4 bus 0 scbus5 target 0 lun 0
kernel: ada2: <INTEL SSDMCEAW240A4 DC33> ATA-9 SATA 3.x device
kernel: ada2: Serial Number CVDA414203E8240M
kernel: ada2: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
kernel: ada2: Command Queueing enabled
kernel: ada2: 228936MB (468862128 512 byte sectors: 16H 63S/T 16383C)
kernel: ada2: Previously was known as ad12
This is still the current setting, but the issue persists.
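As a quick cross-check that the speed limit is in effect, the attach messages can be parsed to compare the SATA revision the drive advertises with the one actually negotiated. The following is a minimal Python sketch that assumes the log format shown above; the function name link_summary and the regular expressions are illustrative and not part of any FreeBSD tool.

```python
import re

# Sample lines copied from the kernel log in this post.
LOG = """\
kernel: ada2 at ahcich4 bus 0 scbus5 target 0 lun 0
kernel: ada2: <INTEL SSDMCEAW240A4 DC33> ATA-9 SATA 3.x device
kernel: ada2: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
"""

# Patterns written against the three line shapes above (an assumption:
# other FreeBSD versions may word these messages differently).
ATTACH_RE = re.compile(r"(?P<dev>ada\d+) at (?P<chan>ahcich\d+) ")
SUPPORTED_RE = re.compile(r"(?P<dev>ada\d+): <.*> \S+ SATA (?P<rev>\d\.x) device")
LINK_RE = re.compile(
    r"(?P<dev>ada\d+): (?P<rate>[\d.]+)MB/s transfers \(SATA (?P<rev>\d\.x),"
)


def link_summary(log_text):
    """Collect channel, advertised and negotiated SATA revision per device."""
    summary = {}
    for line in log_text.splitlines():
        m = ATTACH_RE.search(line)
        if m:
            summary.setdefault(m["dev"], {})["channel"] = m["chan"]
            continue
        m = SUPPORTED_RE.search(line)
        if m:
            summary.setdefault(m["dev"], {})["supported"] = m["rev"]
            continue
        m = LINK_RE.search(line)
        if m:
            entry = summary.setdefault(m["dev"], {})
            entry["negotiated"] = m["rev"]
            entry["rate_mb_s"] = float(m["rate"])
    return summary


for dev, info in link_summary(LOG).items():
    print(dev, info)
```

For the log in this post, the summary shows a drive that advertises SATA 3.x but is linked at SATA 1.x on ahcich4, which is consistent with the reduced-speed setting being applied.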
The system was also booted into Linux from a DVD. Creating an ext3 filesystem on the SSD and writing to it failed in a similar way.
Since finding other reports of problems with this device, I have looked at the temperatures using 'smartctl'. The overheating issue that has been identified for this device may indeed be present, as this excerpt suggests:
# smartctl -l scttemp /dev/ada2
Current Temperature: 34 Celsius
Power Cycle Min/Max Temperature: -20/67 Celsius
Lifetime Min/Max Temperature: -20/76 Celsius
Under/Over Temperature Limit Count: 0/0
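The excerpt can also be read programmatically, which makes it easier to compare readings over time. The sketch below parses the four lines shown above and flags any recorded maximum above a threshold. The 70 C threshold is an assumption for illustration only (check the drive's datasheet for the real operating limit), and the names parse_scttemp and maxima_over are hypothetical.

```python
import re

# Excerpt copied from the smartctl output in this post.
SCTTEMP = """\
Current Temperature: 34 Celsius
Power Cycle Min/Max Temperature: -20/67 Celsius
Lifetime Min/Max Temperature: -20/76 Celsius
Under/Over Temperature Limit Count: 0/0
"""

# Assumed limit, for illustration only; not taken from the post.
ASSUMED_MAX_OPERATING_C = 70


def parse_scttemp(text):
    """Extract the temperature fields from 'smartctl -l scttemp' output."""
    readings = {}
    m = re.search(r"Current Temperature:\s+(-?\d+) Celsius", text)
    if m:
        readings["current"] = int(m.group(1))
    for key, label in (("power_cycle", "Power Cycle"), ("lifetime", "Lifetime")):
        m = re.search(
            label + r" Min/Max Temperature:\s+(-?\d+)/(-?\d+) Celsius", text
        )
        if m:
            readings[key] = (int(m.group(1)), int(m.group(2)))
    m = re.search(r"Under/Over Temperature Limit Count:\s+(\d+)/(\d+)", text)
    if m:
        readings["limit_counts"] = (int(m.group(1)), int(m.group(2)))
    return readings


def maxima_over(readings, limit_c):
    """Return the names of the min/max fields whose maximum exceeds limit_c."""
    return [
        key
        for key in ("power_cycle", "lifetime")
        if key in readings and readings[key][1] > limit_c
    ]


readings = parse_scttemp(SCTTEMP)
print(readings)
print("maxima above assumed limit:", maxima_over(readings, ASSUMED_MAX_OPERATING_C))
```

With the excerpt above, only the lifetime maximum (76) exceeds the assumed threshold, while the drive's own over-temperature counter still reads 0.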
The SSD sits in the exhaust airflow from the graphics card heatsink/fan. A thermometer showed this exhaust air to be around 45°C. The SSD probably does not benefit greatly from the existing case cooling (a PSU exhaust fan and a 120mm rear case exhaust fan) because of its position within the case and its position and orientation on the motherboard: it lies parallel and close to the board, air crossflow is low due to the large internal volume of the case, and the SATA connectors and other components significantly occlude what airflow there is.
I have now installed an 80mm fan (resting loosely in place) to blow air from the bottom of the case directly onto the SSD. Occasional checks of the "Current" temperature using 'smartctl' suggest that it is effectively reducing the average temperature of the SSD. (For some reason the temperature history log displayed by the 'smartctl -l scttemp' command no longer seems to update reliably. However, the last four temperatures shown in the attached ada2_smartctl_scttemp.log were recorded after the additional fan was installed and seem to reflect the success of the additional cooling.)
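Since the drive's own history log is not updating reliably, one workaround is to sample the "Current Temperature" line at a fixed interval and keep the timestamps externally. The following is a minimal sketch, assuming smartctl is on the PATH and the script runs with sufficient privileges; it uses the same 'smartctl -l scttemp /dev/ada2' command as above, and the function names are hypothetical.

```python
import re
import subprocess
import time


def read_scttemp(device="/dev/ada2"):
    """Run 'smartctl -l scttemp <device>' and return its standard output."""
    result = subprocess.run(
        ["smartctl", "-l", "scttemp", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl can exit nonzero while still printing data
    )
    return result.stdout


def current_temperature(text):
    """Return the 'Current Temperature' value in Celsius, or None if absent."""
    m = re.search(r"Current Temperature:\s+(-?\d+) Celsius", text)
    return int(m.group(1)) if m else None


def sample(count, interval_s, reader=read_scttemp, clock=time.time, sleep=time.sleep):
    """Take `count` readings `interval_s` seconds apart as (timestamp, temp) pairs.

    reader, clock and sleep are injectable so the loop can be exercised
    without a real drive.
    """
    samples = []
    for i in range(count):
        samples.append((clock(), current_temperature(reader())))
        if i < count - 1:
            sleep(interval_s)
    return samples
```

Calling sample(60, 60), for example, would collect an hour of one-minute readings. A None entry marks a sample where the drive did not answer, which would itself be useful for timing a disconnect against the temperature trend.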
Despite the additional cooling, the problem persists. It recurred about 24 hours after the fan was installed, and there was a succession of faults today, triggered repeatedly by the same action: using the FreeBSD 'pkg' command to update/install a particular port, which required a ~38MB download. Each attempt to run the pkg command to install that port resulted in the SSD disconnecting before the 'fetch' of the pkg file completed, at perhaps 50-80% completion. After three consecutive failures, the 'pkg fetch' command was used to write the downloaded pkg file to a different partition (the /usr partition rather than the default /var partition), but the SSD again disconnected. The port was ultimately installed successfully by downloading the pkg file to a non-SSD drive and completing the installation from there.
The most curious aspect of this scenario is that each download attempt of the ~38MB file ran at ~150kB/s, with a correspondingly low average write rate to the SSD. In the final, successful install of the pkg file, over 200MB of files were written to the SSD very rapidly, yet the drive did not fault. Moreover, many dozens of other port installs have completed without incident.
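To test whether a slow, sustained trickle of writes is itself the trigger, the download pattern can be approximated without the network. At ~150kB/s a ~38MB file takes roughly four minutes to write, so the sketch below writes random data to a file at a throttled rate. This is only an approximation of what 'pkg' and 'fetch' do: syncing after every chunk is an assumption made here to push the data to the device promptly, and the function name and test path are hypothetical.

```python
import os
import time


def throttled_write(path, total_bytes, rate_bytes_per_s,
                    chunk_bytes=16 * 1024, sync_each_chunk=True,
                    sleep=time.sleep):
    """Write total_bytes of random data to path at about rate_bytes_per_s.

    Returns the number of bytes written. The rate is approximate because
    the time spent in write/fsync is not subtracted from the delay.
    """
    written = 0
    delay = chunk_bytes / rate_bytes_per_s  # pause after each chunk
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(os.urandom(n))
            if sync_each_chunk:
                f.flush()
                os.fsync(f.fileno())  # assumption: force data out to the device
            written += n
            sleep(delay)
    return written


# Example (not run here): ~38MB at ~150kB/s to a hypothetical path on the SSD.
# throttled_write("/var/tmp/ssd_trickle_test.bin", 38_000_000, 150_000)
```

If the drive disconnects during such a run but survives the same volume written at full speed, that would point toward link idle or power-state behaviour rather than temperature or write volume, though the sketch alone cannot confirm either.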
One final point of interest is that, prior to the acquisition of the Intel SSD, a KingSpec 32GB device was installed and exhibited essentially the same symptoms, although probably much more rapidly. At the time I put this down to it being a poor-quality product, but the experience with the Intel device suggests that something else may be at play.
I have a SATA-to-mSATA adapter on order and will try the SSD with that to see if eliminating the mSATA port provides any improvement. In the meantime, is there anything else I can do to validate the condition of the SSD or resolve this problem?