If you can detect them, you can replace them. See <sandr>.
I used the procedure in message #11 to check for invalid UTF-8 characters, but I found some text, produced from urltext, that uses the following escape sequence and was not detected:

Should I just replace anything that starts with &# and ends with ;?
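
Matching everything between &# and ; is probably too broad, since it would also catch ordinary text that happens to contain those characters. A narrower approach is to match only well-formed numeric character references (decimal like `&#233;` or hexadecimal like `&#xE9;`) and decode them. Below is a minimal sketch in Python, assuming the sequences in question are HTML-style numeric references; the names `NUMERIC_REF` and `decode_numeric_refs` are made up for illustration and are not part of any tool mentioned in this thread.

```python
import re

# Matches only well-formed numeric character references:
#   hexadecimal: &#xE9;   decimal: &#233;
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]{1,6})|([0-9]{1,7}));")


def decode_numeric_refs(text: str) -> str:
    """Replace numeric character references with the characters they name.

    References that do not map to a valid Unicode scalar value
    (zero, surrogates, or beyond U+10FFFF) are left untouched.
    """

    def _replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            return match.group(0)  # not a valid character, keep as-is
        return chr(code)

    return NUMERIC_REF.sub(_replace, text)


if __name__ == "__main__":
    print(decode_numeric_refs("caf&#233; and &#x41;"))
```

Text such as `&#foo;` or `a &# b; c` does not match the pattern and passes through unchanged, which is the main reason to prefer this over a blanket "starts with &# and ends with ;" rule. To delete the references instead of decoding them, `NUMERIC_REF.sub("", text)` would do.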