You may be able to use a METS structure map to document the playing order, in much the same way that structure maps are used to document page order. This is probably the best way to formally document that information, as METS is such a widely used standard in the archival/preservation community.
Unfortunately, that won't help you much with automating playlist generation for software applications, since players don't read METS directly. But you could probably write a METS-to-playlist transform, or eventually create a METS reader plugin for VLC or something similar.
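As a rough illustration of what such a transform might look like, here is a minimal Python sketch that turns a METS document into an extended M3U playlist. It assumes each track is a `div` in the structure map carrying an `ORDER` attribute and an `fptr` pointing at a `file` in the file section, and that each file has a single `FLocat` with an `xlink:href`. The function name `mets_to_m3u`, the sample document, and the file names are made up for the example; real METS profiles vary, so treat this as a starting point rather than a general-purpose converter.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical sample: two tracks whose divs are deliberately listed out of order.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                            xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec><mets:fileGrp>
    <mets:file ID="F1"><mets:FLocat LOCTYPE="URL" xlink:href="a.flac"/></mets:file>
    <mets:file ID="F2"><mets:FLocat LOCTYPE="URL" xlink:href="b.flac"/></mets:file>
  </mets:fileGrp></mets:fileSec>
  <mets:structMap TYPE="physical"><mets:div TYPE="album">
    <mets:div ORDER="2" LABEL="Second"><mets:fptr FILEID="F1"/></mets:div>
    <mets:div ORDER="1" LABEL="First"><mets:fptr FILEID="F2"/></mets:div>
  </mets:div></mets:structMap>
</mets:mets>"""


def mets_to_m3u(mets_xml: str) -> str:
    """Build an extended M3U playlist from the ORDER attributes in a METS document."""
    root = ET.fromstring(mets_xml)

    # Map each file ID to the location given in its first FLocat child.
    locations = {}
    for file_el in root.iter(f"{{{METS_NS}}}file"):
        flocat = file_el.find(f"{{{METS_NS}}}FLocat")
        if flocat is not None:
            locations[file_el.get("ID")] = flocat.get(f"{{{XLINK_NS}}}href")

    # Collect (order, label, href) for every div that has both ORDER and an fptr.
    tracks = []
    for div in root.iter(f"{{{METS_NS}}}div"):
        order = div.get("ORDER")
        fptr = div.find(f"{{{METS_NS}}}fptr")
        if order is None or fptr is None:
            continue
        href = locations.get(fptr.get("FILEID"))
        if href:
            tracks.append((int(order), div.get("LABEL", ""), href))

    tracks.sort(key=lambda track: track[0])

    lines = ["#EXTM3U"]
    for _, label, href in tracks:
        if label:
            # -1 marks the duration as unknown.
            lines.append(f"#EXTINF:-1,{label}")
        lines.append(href)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(mets_to_m3u(SAMPLE_METS), end="")
```

The playlist is sorted by `ORDER` rather than by document position, so the METS file stays the authoritative record and the playlist can be regenerated whenever it changes.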
The other recommendation I'd make is to pick a target player that you aim to support, or a few, but set a limit. I personally don't think you should expect to support every audio player out there. So if some players don't support your chosen method of documenting track order, that ought to be OK, provided your supported player(s) do.