A couple of small optimizations that shouldn't make a difference but do, such as: addressing tracks as a single array; caching a single byte instead of re-reading the array (note that caching 32 bits did not improve things but actually made them worse); harsher interrupt disabling; whitespace changes in main...
Plus, early bail out from DriveLoopReadNoFlux.
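
For illustration only, a minimal sketch of the first two items (flat track addressing and a single-byte cache). The names `track_data`, `TRACK_COUNT`, `TRACK_BYTES` and `count_set_bits` are hypothetical and are not taken from this code base; the geometry values are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical geometry; the real values are not stated in this commit. */
#define TRACK_COUNT 4
#define TRACK_BYTES 8

/* All tracks live in one flat buffer instead of a two-dimensional array. */
static uint8_t track_data[TRACK_COUNT * TRACK_BYTES];

/* Count the set bits of one track, reading each byte from the array once. */
static unsigned count_set_bits(unsigned track)
{
    assert(track < TRACK_COUNT);

    /* One multiplication up front, then plain pointer indexing. */
    const uint8_t *p = &track_data[(size_t)track * TRACK_BYTES];
    unsigned total = 0;

    for (size_t i = 0; i < TRACK_BYTES; i++) {
        uint8_t cached = p[i];  /* single-byte cache, reused for all 8 bits */
        for (int bit = 0; bit < 8; bit++) {
            total += (cached >> bit) & 1u;
        }
    }
    return total;
}
```

Whether a local copy like `cached` helps depends on the compiler and target; as noted above, widening the cache to 32 bits made things worse here.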
Specialized drive read loops for the most common cases, which give a vast speed improvement in some scenarios. Improved saving speed. Removed some more code that's not really required right now.
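
A rough sketch of the general idea behind specialized loops, not the actual drive read code: the per-iteration option check is hoisted out of the hot loop by keeping one loop per common case and picking it once. All function names below are hypothetical, and the byte sum only stands in for the real per-byte work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Generic path: re-tests the option for every byte. */
static uint32_t sum_generic(const uint8_t *buf, size_t len, bool invert)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t b = buf[i];
        sum += invert ? (uint8_t)~b : b;
    }
    return sum;
}

/* Specialized path for the non-inverted case: no branch in the loop. */
static uint32_t sum_plain(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}

/* Specialized path for the inverted case: no branch in the loop. */
static uint32_t sum_inverted(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += (uint8_t)~buf[i];
    }
    return sum;
}

/* Decide once which loop to run, with an early bail out for empty input. */
static uint32_t sum_dispatch(const uint8_t *buf, size_t len, bool invert)
{
    if (len == 0) {
        return 0;
    }
    return invert ? sum_inverted(buf, len) : sum_plain(buf, len);
}
```

The trade-off is more code for fewer branches in the hot path, so it only pays off for the cases that actually dominate.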