Input | ||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Input was | โ๐๐๐ญ๐ฎ๐ซ๐๐ฅโ | |||||||||||||||||||||||||||
Interpretation | Doesn't look like any specific reference. Will describe given string as-is | |||||||||||||||||||||||||||
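Constituent codepoints | U+201C LEFT DOUBLE QUOTATION MARK, U+1D40D MATHEMATICAL BOLD CAPITAL N, U+1D41A MATHEMATICAL BOLD SMALL A, U+1D42D MATHEMATICAL BOLD SMALL T, U+1D42E MATHEMATICAL BOLD SMALL U, U+1D42B MATHEMATICAL BOLD SMALL R, U+1D41A MATHEMATICAL BOLD SMALL A, U+1D425 MATHEMATICAL BOLD SMALL L, U+201D RIGHT DOUBLE QUOTATION MARK | |||||||||||||||||||||||||||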
Search | ||||||||||||||||||||||||||||
Name search | no interesting words to search for | |||||||||||||||||||||||||||
Unicode string properties | ||||||||||||||||||||||||||||
Normalization | Original: U+201C U+1D40D U+1D41A U+1D42D U+1D42E U+1D42B U+1D41A U+1D425 U+201D (“𝐍𝐚𝐭𝐮𝐫𝐚𝐥”). NFC (canonical compose): no change. NFKC (compatibility compose): U+201C U+004E U+0061 U+0074 U+0075 U+0072 U+0061 U+006C U+201D (“Natural”); individual characters are: LEFT DOUBLE QUOTATION MARK, LATIN CAPITAL LETTER N, LATIN SMALL LETTER A, LATIN SMALL LETTER T, LATIN SMALL LETTER U, LATIN SMALL LETTER R, LATIN SMALL LETTER A, LATIN SMALL LETTER L, RIGHT DOUBLE QUOTATION MARK. NFD (canonical decompose): no change. NFKD (compatibility decompose): U+201C U+004E U+0061 U+0074 U+0075 U+0072 U+0061 U+006C U+201D (“Natural”); the individual characters are the same as for NFKC. | |||||||||||||||||||||||||||
Encodings that can encode this properly | utf_8 utf_16 utf_32 gb18030 | |||||||||||||||||||||||||||
Encodings that will mangle your text | ascii latin_1 iso8859_2 iso8859_3 iso8859_4 iso8859_5 iso8859_6 iso8859_7 iso8859_8 iso8859_9 iso8859_10 iso8859_13 iso8859_14 iso8859_15 iso2022_jp iso2022_jp_1 iso2022_jp_2 iso2022_jp_2004 iso2022_jp_3 iso2022_jp_ext iso2022_kr gb2312 gbk big5 big5hkscs euc_jp euc_jis_2004 euc_jisx0213 euc_kr hz johab koi8_r koi8_u mac_cyrillic mac_greek mac_iceland mac_latin2 mac_roman mac_turkish ptcp154 shift_jis shift_jis_2004 shift_jisx0213 cp037 cp424 cp437 cp500 cp737 cp775 cp850 cp852 cp855 cp856 cp857 cp860 cp861 cp862 cp863 cp864 cp865 cp866 cp869 cp874 cp875 cp932 cp949 cp950 cp1006 cp1026 cp1140 cp1250 cp1251 cp1252 cp1253 cp1254 cp1255 cp1256 cp1257 cp1258 | |||||||||||||||||||||||||||
String encoding | ||||||||||||||||||||||||||||
String stuff | ||||||||||||||||||||||||||||
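HTML/XML numeric entities | All characters except a-zA-Z0-9 and space are encoded, which is a little overzealous. Hexadecimal: &#x201C;&#x1D40D;&#x1D41A;&#x1D42D;&#x1D42E;&#x1D42B;&#x1D41A;&#x1D425;&#x201D; Decimal: &#8220;&#119821;&#119834;&#119853;&#119854;&#119851;&#119834;&#119845;&#8221; | |||||||||||||||||||||||||||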
UTF8 bytestring | as hex: e2809cf09d908df09d909af09d90adf09d90aef09d90abf09d909af09d90a5e2809d (UTF8 bytestring length is 34) | |||||||||||||||||||||||||||
URL-encoded UTF8 | %E2%80%9C%F0%9D%90%8D%F0%9D%90%9A%F0%9D%90%AD%F0%9D%90%AE%F0%9D%90%AB%F0%9D%90%9A%F0%9D%90%A5%E2%80%9D | |||||||||||||||||||||||||||
JavaScript ~ES3 | "\u201c\ud835\udc0d\ud835\udc1a\ud835\udc2d\ud835\udc2e\ud835\udc2b\ud835\udc1a\ud835\udc25\u201d" This string contains codepoints above U+FFFF, which are written as surrogate pairs (JavaScript strings are sequences of UTF-16 code units, and ES before ES6 had no escape syntax for a whole codepoint above U+FFFF) | |||||||||||||||||||||||||||
ES6 | "\u{201C}\u{1D40D}\u{1D41A}\u{1D42D}\u{1D42E}\u{1D42B}\u{1D41A}\u{1D425}\u{201D}" | |||||||||||||||||||||||||||
Python py2 | Unicode string: u'\u201c\U0001d40d\U0001d41a\U0001d42d\U0001d42e\U0001d42b\U0001d41a\U0001d425\u201d' UTF8 bytestring: '\xe2\x80\x9c\xf0\x9d\x90\x8d\xf0\x9d\x90\x9a\xf0\x9d\x90\xad\xf0\x9d\x90\xae\xf0\x9d\x90\xab\xf0\x9d\x90\x9a\xf0\x9d\x90\xa5\xe2\x80\x9d' | |||||||||||||||||||||||||||
py3 | Unicode string: '\u201c\U0001d40d\U0001d41a\U0001d42d\U0001d42e\U0001d42b\U0001d41a\U0001d425\u201d' UTF8 bytestring: b'\xe2\x80\x9c\xf0\x9d\x90\x8d\xf0\x9d\x90\x9a\xf0\x9d\x90\xad\xf0\x9d\x90\xae\xf0\x9d\x90\xab\xf0\x9d\x90\x9a\xf0\x9d\x90\xa5\xe2\x80\x9d' | |||||||||||||||||||||||||||
Ruby | "\u{201c}\u{1d40d}\u{1d41a}\u{1d42d}\u{1d42e}\u{1d42b}\u{1d41a}\u{1d425}\u{201d}" (double quotes are required; Ruby does not process \u escapes in single-quoted strings) | |||||||||||||||||||||||||||
CSS (in :before/:after) | '\201C\1D40D\1D41A\1D42D\1D42E\1D42B\1D41A\1D425\201D' | |||||||||||||||||||||||||||
TeX (experiment) | String in TeX: {\textquotedblleft}𝐍𝐚𝐭𝐮𝐫𝐚𝐥{\textquotedblright} | |||||||||||||||||||||||||||
Emoji (experiment; TODO) | ||||||||||||||||||||||||||||
Has | No | |||||||||||||||||||||||||||
CJK (experiment; TODO) | ||||||||||||||||||||||||||||
Has | No | |||||||||||||||||||||||||||
Font info (experiment; TODO) | ||||||||||||||||||||||||||||
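
Several of the rows above (normalization, encodability, UTF-8 bytes, URL encoding, numeric entities, surrogate-pair escapes) can be reproduced with the Python 3 standard library. The following is a minimal sketch under that assumption, not the code this page actually runs; the helper names `codepoints`, `can_encode`, `numeric_entities`, and `surrogate_escapes` are hypothetical and introduced only for illustration.

```python
import unicodedata
from urllib.parse import quote

# The input string from the table, written with py3 escapes.
s = "\u201c\U0001d40d\U0001d41a\U0001d42d\U0001d42e\U0001d42b\U0001d41a\U0001d425\u201d"


def codepoints(text):
    """Return the codepoints of text in U+XXXX notation (hypothetical helper)."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def can_encode(text, encoding):
    """True if the named codec can represent every character of text."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def numeric_entities(text, hexadecimal=True):
    """Encode every character as an HTML/XML numeric character reference."""
    fmt = "&#x{:X};" if hexadecimal else "&#{:d};"
    return "".join(fmt.format(ord(ch)) for ch in text)


def surrogate_escapes(text):
    """Write text as pre-ES6 JavaScript \\uXXXX escapes, one per UTF-16 code unit."""
    units = text.encode("utf-16-be")  # big-endian, no byte order mark
    return "".join(
        "\\u{:04x}".format(int.from_bytes(units[i:i + 2], "big"))
        for i in range(0, len(units), 2)
    )


if __name__ == "__main__":
    print("Original:", codepoints(s))
    for form in ("NFC", "NFKC", "NFD", "NFKD"):
        normalized = unicodedata.normalize(form, s)
        print(form, "no change" if normalized == s else codepoints(normalized))
    print("Names:", [unicodedata.name(ch) for ch in s])
    utf8 = s.encode("utf-8")
    print("UTF-8 hex:", utf8.hex(), "length", len(utf8))
    print("URL-encoded:", quote(s))
    print("Hex entities:", numeric_entities(s))
    print("JS escapes:", surrogate_escapes(s))
    for enc in ("utf_8", "utf_16", "gb18030", "ascii", "latin_1", "cp1252"):
        print(enc, "ok" if can_encode(s, enc) else "would mangle")
```

The compatibility forms (NFKC, NFKD) map the mathematical bold letters to plain ASCII letters, while the canonical forms (NFC, NFD) leave the string untouched, which is why only the former change the string.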