In that sense, no underlying physical state could be said to hold “more” information than any other, right?
In an information-theoretic sense, a message can have lower or higher information content, and entropy is derived from that: it's the expected information content over the distribution. But it only makes sense for a fixed distribution, where a more likely outcome has lower information content. So I think you could have a physical state holding more information, if it's a less likely state for some fixed definition of likeliness.
This would probably be closer to an actual link between informational entropy and physical entropy: a given microstate has lower physical entropy when it is a less likely state (e.g. half-squished cup of coffee), and that state would have higher information content if we considered the state as the message. This intuitively makes sense, because physical entropy is in some sense the ability of a system to undergo change, so indeed a low-entropy system is "more useful", just like a message with higher information content is "more useful".
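To make that concrete, here's a rough Python sketch, not a rigorous treatment. It assumes a toy system of my own choosing: N = 10 fair coins, where the "state" is the number of heads (strictly a macrostate, since every individual microstate of fair coins is equally likely), the "fixed definition of likeliness" is the uniform distribution over all 2^N coin configurations, and Boltzmann's constant is set to 1. All function names are made up for illustration.

```python
import math


def surprisal_bits(p: float) -> float:
    """Information content, in bits, of an outcome with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


def shannon_entropy_bits(probs) -> float:
    """Entropy = expected information content over a fixed distribution."""
    return sum(p * surprisal_bits(p) for p in probs if p > 0)


N = 10  # number of fair coins in the toy system (an assumption)


def multiplicity(k: int) -> int:
    """Number of coin configurations that show exactly k heads."""
    return math.comb(N, k)


def state_probability(k: int) -> float:
    """Probability of seeing k heads when all 2**N configurations are equally likely."""
    return multiplicity(k) / 2**N


def boltzmann_entropy(k: int) -> float:
    """Physical-style entropy S = ln(multiplicity), with k_B set to 1."""
    return math.log(multiplicity(k))


# Less likely states: lower physical entropy, higher information content.
for k in (0, 2, 5):
    p = state_probability(k)
    print(f"heads={k:2d}  p={p:.4f}  S={boltzmann_entropy(k):.3f}  "
          f"info={surprisal_bits(p):.3f} bits")
```

In this toy setup the two quantities move in opposite directions: the all-tails state has the lowest entropy and the highest information content, while the half-heads state has the highest entropy and the lowest information content. Whether that carries over cleanly to a real physical system depends on how you pick the distribution, which is the "fixed definition of likeliness" caveat.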
The definition of Big-O literally contains a clause that says the function is non-zero (for sufficiently large x) so please go fuck yourself