Is it possible to create an invalid UTF8 string using JavaScript?
Every solution I've found relies on String.fromCharCode, which generates undefined rather than an invalid string. I've seen mention of errors being generated by ill-formed UTF8 strings (e.g. https://developer.mozilla.org/en-US/docs/Web/API/WebSocket#send()), but I can't figure out how you would actually create one.
- The error mentioned there is not about UTF-8 strings, and javascript typically does not use UTF-8 to represent strings internally. – pvg Commented Sep 11, 2017 at 1:20
- @pvg: Thanks for pointing out the mistake. Not sure why I assumed UTF8 was the javascript encoding. My question should have been more specific: How can you create a string that contains unpaired surrogates? – Mattia Commented Sep 11, 2017 at 9:37
- I'm not entirely sure, and the docs seem pretty vague, although it's possible to reach into the bowels of javascript strings and do a lot of strange things without things instantly catching fire. i.imgur.com/sWVE0IY.png – pvg Commented Sep 11, 2017 at 12:14
2 Answers
One way to generate an invalid UTF-8 string with JavaScript is to take an emoji and remove the last byte.
For example, this will be an invalid UTF-8 string:
const invalidUtf8 = '
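Since the snippet above is cut off, here is a minimal sketch of the same idea, not the answerer's original code. It assumes a runtime with TextEncoder and TextDecoder (modern browsers, recent Node.js), and every variable name is made up for illustration. As the comments point out, JavaScript strings are sequences of UTF-16 code units, so cutting an emoji in half at the string level leaves an unpaired surrogate, while truly invalid UTF-8 only shows up once you work with bytes.

```javascript
// 1. String level: an emoji outside the BMP occupies two UTF-16 code units.
//    Keeping only the first one leaves an unpaired (lone) high surrogate.
const emoji = '\u{1F600}';                // one code point, two code units
const loneSurrogate = emoji.slice(0, 1);  // only the high surrogate 0xD83D remains

// encodeURIComponent refuses to encode a lone surrogate and throws URIError.
let uriErrorThrown = false;
try {
  encodeURIComponent(loneSurrogate);
} catch (err) {
  uriErrorThrown = err instanceof URIError;
}

// 2. Byte level: encode the emoji as UTF-8 (4 bytes) and drop the last byte.
const validBytes = new TextEncoder().encode(emoji);
const truncatedBytes = validBytes.slice(0, -1);  // incomplete 4-byte sequence

// A strict decoder ({ fatal: true }) rejects the truncated sequence with a TypeError.
let decodeErrorThrown = false;
try {
  new TextDecoder('utf-8', { fatal: true }).decode(truncatedBytes);
} catch (err) {
  decodeErrorThrown = err instanceof TypeError;
}

console.log(loneSurrogate.charCodeAt(0).toString(16)); // d83d
console.log(uriErrorThrown, decodeErrorThrown);        // true true
```

Without fatal: true, TextDecoder does not throw and substitutes U+FFFD for the malformed bytes, which may be why the invalid input is easy to miss.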