Well, seems like we’re done tweaking finally, right?
I like those changes. I just realized one problem with adding \n\r\t to the tokenizer, though: the string is rebuilt with a space character between tokens, so every \n, \r, and \t gets turned into a space. That could be a problem if you imagine this function being called just before nl2br(). It would mess things up for people expecting their newlines, carriage returns, and tabs to still be there.
I suggest going back to the space-only tokenizer for the first pass, then checking whether the string contains any newlines, carriage returns, or tabs. If it does, the second, third, and fourth passes would handle those other whitespace characters one at a time, so each pass puts back the same character it split on.
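
Here’s a rough sketch of what I mean, not a drop-in patch. The function name map_tokens and the $transform callback are made up for illustration; $transform just stands in for whatever per-word work the real function does. It also nests the splits (space, then \n, then \r, then \t) instead of running four separate passes over the whole string, which I’m assuming comes out the same as long as each split is rejoined with the character it was split on.

```php
<?php
/**
 * Hypothetical helper (not the original function): apply $transform to
 * every token while leaving the whitespace characters untouched.
 */
function map_tokens(string $text, callable $transform, array $delimiters = [" ", "\n", "\r", "\t"]): string
{
    // No delimiters left, so $text is a single token (it may be empty).
    if ($delimiters === []) {
        return $transform($text);
    }

    // Take the next delimiter; $delimiters is a local copy, so the
    // caller's array is not modified.
    $delimiter = array_shift($delimiters);
    $pieces = explode($delimiter, $text);

    foreach ($pieces as $i => $piece) {
        // Split each piece further on the remaining delimiters.
        $pieces[$i] = map_tokens($piece, $transform, $delimiters);
    }

    // Rejoin with the same character we split on, not always a space.
    return implode($delimiter, $pieces);
}

// strtoupper is only a stand-in for the real per-word processing.
$result = map_tokens("one two\nthree\tfour\r\nfive", 'strtoupper');
var_dump($result === "ONE TWO\nTHREE\tFOUR\r\nFIVE"); // bool(true)
```

Since the newlines survive, passing the result to nl2br() afterwards should still behave the way people expect. Note that $transform also gets called with empty strings when two whitespace characters sit next to each other, so it has to cope with that.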
I can’t think of any other way of doing it, can you?
