veracity

telling it like it is

January 26, 2010

Linux kernel ATA dmesg errors with Crucial M225 SSD

Filed under: — Vishal Rao @ 11:04 pm

One-line executive summary: If you see slowness, hangs, filesystem corruption and/or lots of ATA errors in dmesg, try adding “libata.force=noncq” to your Linux kernel boot options.

I recently bought a Crucial branded 128 GB solid state disk (SSD) model CT128M225 of the M225 range because they’re the hot new thing to get.
See http://www.codinghorror.com/blog/archives/001304.html

The disk worked well with my copy of Windows 7 RC, but to my dismay Linux spewed a lot of disk error messages, as you can see here: http://pastebin.ubuntu.com/347122/

I tried the usual STFW-ing and asked around on the Linux forums, the Crucial.com forums and a few other tech forums, but to no avail. Apparently I’m one of the first to run an SSD on Linux, no way! There was some info about SMART errors with SSDs, but the suggested workarounds didn’t work for me.

I’d almost resigned myself to the idea that either my SSD model would not work with current Linux versions or *shock* perhaps my particular disk was defective, but then I took another, closer look at the logs.

I searched again, this time for one of the error messages, “failed command READ FPDMA QUEUED”, and saw references to NCQ (Native Command Queuing). After some more digging I found the Linux kernel’s ATA library option “libata.force=noncq”, which disables NCQ. When I tried it, it seemed to resolve these issues!
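If you want to check your own machine for the same symptoms, here is a rough Python sketch. It counts NCQ command failures in a saved dmesg log and reports whether the running kernel was booted with the noncq override. The function names are made up for this example, and the exact wording of the dmesg line varies between kernel versions, so treat the regular expression as an assumption rather than a definitive match.

```python
import re
import sys

# Assumed wording of the dmesg line; the colon is optional because
# kernel versions differ in how they print it.
NCQ_FAILURE = re.compile(r"failed command:?\s+(READ|WRITE) FPDMA QUEUED")


def count_ncq_failures(log_text):
    """Count log lines that report a failed NCQ read or write command."""
    return sum(1 for line in log_text.splitlines() if NCQ_FAILURE.search(line))


def noncq_forced(cmdline):
    """Return True if a libata.force= entry on the kernel command line asks for noncq."""
    prefix = "libata.force="
    for token in cmdline.split():
        if not token.startswith(prefix):
            continue
        # The value is a comma-separated list of "[ID:]VAL" entries,
        # for example "noncq" or "1.00:noncq".
        for entry in token[len(prefix):].split(","):
            if entry.rsplit(":", 1)[-1] == "noncq":
                return True
    return False


if __name__ == "__main__":
    # Optional argument: path to a file holding saved dmesg output.
    if len(sys.argv) > 1:
        try:
            with open(sys.argv[1], errors="replace") as fh:
                print("NCQ command failures in log:", count_ncq_failures(fh.read()))
        except OSError as exc:
            print("could not read log file:", exc)
    try:
        with open("/proc/cmdline") as fh:
            print("libata.force=noncq active:", noncq_forced(fh.read()))
    except OSError:
        print("could not read /proc/cmdline (not a Linux system?)")
```

Save the output of dmesg to a file and pass that file as the only argument. A non-zero count before adding the boot option and zero after a reboot with it would match what I saw.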

Some other users (of Kingston, OCZ and Intel SSDs) mentioned they did not face such issues, and going by what I read online, Intel apparently has a good, solid NCQ implementation.

Curious as to whether the root cause was a defect in my SSD’s firmware or a bug in the Linux kernel, I visited the online Linux cross-reference site at http://lxr.linux.no/ and did some browsing. I found the file drivers/ata/libata-core.c, where I noticed a line referencing an OCZ SSD and a flag called ATA_HORKAGE_NONCQ.

It didn’t take too long to put two and two together, and I spent the rest of the day patching three lines of code into the kernel to automatically detect my SSD model range and disable NCQ, so as to avoid these problems in the first place. Hopefully it will benefit other unsuspecting or less tech-savvy users, for whom this could be a serious problem.
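To give a feel for what such a blacklist entry does, here is a toy model in Python. It is not kernel code and not the text of my patch. The model pattern, the model strings and the helper names are assumptions for illustration, and the kernel's own string matching may behave differently from Python's fnmatch.

```python
from fnmatch import fnmatchcase

# Stand-in bit for the kernel's ATA_HORKAGE_NONCQ flag.
HORKAGE_NONCQ = 1 << 2

# Each entry: (model pattern, firmware pattern or None for "any firmware", quirk flags).
# The pattern below is a guess meant to cover the whole M225 range.
BLACKLIST = [
    ("CRUCIAL_CT*M225", None, HORKAGE_NONCQ),
]


def quirks_for(model, firmware):
    """Return the quirk flags of the first blacklist entry matching this drive, or 0."""
    for model_pat, fw_pat, horkage in BLACKLIST:
        model_ok = fnmatchcase(model, model_pat)
        fw_ok = fw_pat is None or fnmatchcase(firmware, fw_pat)
        if model_ok and fw_ok:
            return horkage
    return 0


def ncq_allowed(model, firmware):
    """NCQ stays enabled unless the drive's quirks include the no-NCQ flag."""
    return not (quirks_for(model, firmware) & HORKAGE_NONCQ)
```

The idea is that the decision gets made when the drive is detected, based on the model string it reports, so nobody has to know about the boot option at all.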

By coincidence, I had recently upgraded my slow 512 kbps DSL connection to 8 Mbps in preparation for attending “Ubuntu Developer Week”, a series of online Ubuntu classroom sessions about bug fixing and the like. This allowed me to quickly download all the development tools and source code, make the patch, build the source and binary package and upload it.

I sent an email to the Linux kernel mailing list with the trivial patch, but it looks like it’s going to be tough to get these hardcore developers to accept it: http://lkml.org/lkml/2010/1/26/185

Anyway, I visited my nearly five-year-old Ubuntu Launchpad account at https://launchpad.net/~vishalrao, which was gathering dust, signed the Ubuntu Code of Conduct, added my OpenPGP signing key, created a “kernels” PPA (personal package archive) and uploaded my patched kernel source. Let’s see if it builds or not! Check back here: https://launchpad.net/~vishalrao/+archive/kernels

3 Comments »

  1. [...] mind Another way to get good help with Linux is to fix the source yourself See my blog post: Linux kernel ATA dmesg errors with Crucial M225 SSD | veracity "Thou shalt not follow the null pointer for at its end madness and chaos lie." [...]

    Pingback by Want fast and good help in Linux? Then please keep this in mind — January 26, 2010 @ 11:11 pm

  2. [...] LKML: Vishal Rao: Re: [PATCH] ata: Disable NCQ for Crucial M225 brand SSDs Also blogged about it: Linux kernel ATA dmesg errors with Crucial M225 SSD | veracity Kernel source uploaded to my Ubuntu Launchpad PPA, this is my third attempt at upload and build, [...]

    Pingback by The SSDs-with-Linux thread. - Open Source and Linux | TechEnclave — January 26, 2010 @ 11:18 pm

  3. Afternoon/Morning/Evening,

    Are you happy with the performance of your M225 in linux?

    hdparm -t /dev/sda for me gives nothing higher than 120MB/s.

    Was wondering which scheduler you’re using with your drive and whether you’ve had to tweak system settings further?

    Thanks

    Comment by Tony — May 5, 2010 @ 7:55 pm
