You do not have a performance problem. You simply aren't understanding what ATTO is telling you.
1) You need to understand what block sizes (what it calls "Transfer Size") ATTO is testing. It tests using 512 byte, 1KByte, 2KByte, 4KByte, 8KByte, 16KByte, 32KByte, 64KByte, 128KByte, 256KByte, 512KByte, 1MByte, 2MByte, 4MByte, and 8MByte blocks. What this literally means is that, for each size, the ATTO software attempts to read or write a block of data of that size in a single request. Example: let's say a program needs to read 1MByte from the disk. It can do so by issuing multiple read requests of smaller amounts (e.g. two 512KB reads, four 256KB reads, eight 128KB reads, etc.), or it can try to do it "all in one swoop". Doing it "all in one swoop" IS NOT necessarily the optimal way to do things (see Item #2 below); how well it works depends greatly on the device, controller, and OS tunings.
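To make the 1MByte example concrete, here is a minimal Python sketch. It is not how ATTO works internally, and the function names `requests_needed` and `read_in_blocks` are made up for illustration. The first function only does the arithmetic from the example above; the second reads a file with unbuffered reads of a chosen transfer size and counts how many requests it took.

```python
import math

ONE_MIB = 1024 * 1024


def requests_needed(total_bytes, transfer_size):
    """How many requests it takes to move total_bytes at a given transfer size."""
    if transfer_size <= 0:
        raise ValueError("transfer_size must be positive")
    return math.ceil(total_bytes / transfer_size)


def read_in_blocks(path, transfer_size):
    """Read a whole file using unbuffered reads of transfer_size bytes.

    Returns (bytes_read, requests_issued). The final, empty read that
    signals end-of-file is not counted as a request.
    """
    total = 0
    requests = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(transfer_size)
            if not chunk:
                break
            total += len(chunk)
            requests += 1
    return total, requests


# The same 1MByte read, expressed at different transfer sizes.
for size in (512 * 1024, 256 * 1024, 128 * 1024, 4 * 1024, 512):
    print(f"{size:>7} byte transfers -> {requests_needed(ONE_MIB, size)} requests")
```

Keep in mind that a loop like this goes through the OS cache, so timing it tells you very little about the drive itself. It's only meant to show what "transfer size" means.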
Most software does not use 512KByte blocks when reading/writing. Most software uses anywhere between 512 byte and 32KByte blocks. Likewise, filesystems also operate within similar constraints (see "NTFS cluster size", which only goes up to 64KByte anyway).
As such, I usually adjust ATTO to only test 512 byte to 128KByte transfer sizes. Anything above that is silly, because the underlying filesystem will split the reads/writes up into separate sections anyway given that cluster sizes don't exceed 64KByte.
So basically, you need to learn how to use ATTO correctly and not just blindly run every single benchmark application you can find on the Internet. Using these utilities is 100% pointless unless you *actually understand* what the tool is doing internally AND what the data is telling you. I say that with as much sincerity as possible. Please don't just blindly run benchmark utilities and say "lulz i don't see super large numbers omg!!!".
2) You're assuming that performance is going to gradually increase with block size, or "taper off" at a specific block size. That is simply not the case, for all sorts of reasons (cache levels, wear levelling on an SSD, the amount of cache on the controller as well as on the device/disk itself, etc.).
Bottom line: your SSD is performing absolutely 100% perfectly. AHCI has nothing to do with it, and SATA port speed has nothing to do with it either.
Now, finally, you said "alignment is 1024, think this is OK". 1024 *what*? Sectors? You need to explain in detail exactly how you did the alignment. Are you running Windows 7? If so, you shouldn't have to do any kind of partition alignment. If you're on XP, you do have to do partition alignment, but the value you choose depends on which utility you used to do the alignment. With DISKPAR (not DISKPART), an alignment offset of 128 sectors is acceptable and ensures that the MBR/PBR has room for the partition tables as well. Please see this guide on OCZ's forum for how to properly align a partition using DISKPAR:
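Separately from that guide, here is a rough Python sketch of why the unit matters when someone says "alignment is 1024". It just converts an offset to bytes and checks whether it lands on a boundary. The 512-byte sector size and the 4KByte boundary are assumptions for illustration (check what your drive actually uses), and the helper names are made up.

```python
SECTOR_BYTES = 512    # assumption: 512-byte logical sectors
BOUNDARY = 4 * 1024   # assumption: 4KByte boundary, purely as an example

UNIT_BYTES = {"bytes": 1, "sectors": SECTOR_BYTES, "KB": 1024}


def offset_bytes(value, unit):
    """Convert a partition offset given in some unit to bytes."""
    return value * UNIT_BYTES[unit]


def is_aligned(offset, boundary=BOUNDARY):
    """True if the byte offset falls exactly on a multiple of boundary."""
    return offset % boundary == 0


candidates = [
    (63, "sectors"),    # classic XP default partition start
    (128, "sectors"),   # the DISKPAR value mentioned above
    (1024, "sectors"),  # "1024" if it means sectors
    (1024, "KB"),       # "1024" if it means KBytes
]

for value, unit in candidates:
    off = offset_bytes(value, unit)
    print(f"{value:>5} {unit:<7} = {off:>8} bytes, aligned: {is_aligned(off)}")
```

Under these assumptions 128 sectors works out to 65536 bytes (64KByte), while "1024" is either 512KByte or 1MByte depending on the unit, which is exactly why the unit needs to be stated.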