Unicode Explorer: The Universal Language of Text

The Unicode Lab

Type anything to see how computers actually "see" your text.

Under the Hood Analysis

Sample input: "Hello 🌍"

Legend: ASCII (1 byte), 2-Byte, 3-Byte, 4-Byte (Emoji)

H        U+0048   Basic Latin (ASCII)                    UTF-8: 0x48                 Dec: 72
e        U+0065   Basic Latin (ASCII)                    UTF-8: 0x65                 Dec: 101
l        U+006C   Basic Latin (ASCII)                    UTF-8: 0x6C                 Dec: 108
l        U+006C   Basic Latin (ASCII)                    UTF-8: 0x6C                 Dec: 108
o        U+006F   Basic Latin (ASCII)                    UTF-8: 0x6F                 Dec: 111
(space)  U+0020   Basic Latin (ASCII)                    UTF-8: 0x20                 Dec: 32
🌍       U+1F30D  Miscellaneous Symbols and Pictographs  UTF-8: 0xF0 0x9F 0x8C 0x8D  Dec: 127757

7 characters, 10 bytes (UTF-8)
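
The readout above can be reproduced in a few lines of code. What follows is a minimal Python sketch, not the page's actual implementation: the function name `analyze` is hypothetical, and it reports only the code point, UTF-8 bytes, and decimal value (block names are left out).

```python
def analyze(text):
    """Return one (char, code point, UTF-8 hex bytes, decimal) row per character."""
    rows = []
    for ch in text:
        encoded = ch.encode("utf-8")  # the bytes actually stored for this character
        hex_bytes = " ".join(f"0x{b:02X}" for b in encoded)
        rows.append((ch, f"U+{ord(ch):04X}", hex_bytes, ord(ch)))
    return rows


sample = "Hello \U0001F30D"  # "Hello" + space + globe emoji
for ch, code_point, hex_bytes, dec in analyze(sample):
    print(f"{ch!r:6} {code_point:8} UTF-8: {hex_bytes:20} Dec: {dec}")

# Character count and byte count differ because the emoji needs 4 bytes.
print(len(sample), "characters,", len(sample.encode("utf-8")), "bytes (UTF-8)")
```

Note that `len()` counts code points here, which matches the "7 characters" figure for this sample but would not match what a reader perceives as one character for multi-code-point emoji.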

How Unicode Works

1. The Code Point

Think of Unicode as a giant spreadsheet. Every character gets a row number. This number is called a Code Point, written as U+ followed by the number in hexadecimal, for example U+1234. "A" is always row 65 (U+0041, since hexadecimal 41 is decimal 65), and "💀" is row 128,128 (U+1F480).
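
To make the "row number" idea concrete, here is a small Python sketch using the built-in `ord()` and `chr()` functions. The helper name `u_plus` is made up for this illustration.

```python
def u_plus(ch):
    """Format a character's code point in the usual U+XXXX notation."""
    return f"U+{ord(ch):04X}"


print(ord("A"), u_plus("A"))                      # 65 U+0041
print(ord("\U0001F480"), u_plus("\U0001F480"))    # 128128 U+1F480 (skull emoji)

# chr() goes the other way: from row number back to character.
print(chr(65))        # A
print(chr(0x1F480))   # the skull emoji
```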

2. Encoding (UTF-8)

Computers don't store "row numbers" directly; they store bytes. UTF-8 is the most widely used way to turn Code Points into bytes. It's clever: it uses 1 byte for each of the 128 ASCII characters (so plain ASCII text is already valid UTF-8), and expands to 2, 3, or 4 bytes for other scripts, symbols, and emoji. Because so much of the web's markup is ASCII, this keeps pages compact compared with fixed-width encodings that spend 4 bytes on every character.
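
The byte-length rule can be written out by hand. The sketch below is for illustration only (in practice you would just call `str.encode("utf-8")`); the function name `utf8_encode` is hypothetical, and the loop checks it against Python's built-in encoder.

```python
def utf8_encode(code_point):
    """Encode one code point into UTF-8 bytes by hand."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if code_point < 0x80:        # 1 byte:  0xxxxxxx (identical to ASCII)
        return bytes([code_point])
    if code_point < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# One sample from each size class: A, e-acute, a CJK character, the globe emoji.
for ch in "A\u00E9\u6C49\U0001F30D":
    encoded = utf8_encode(ord(ch))
    assert encoded == ch.encode("utf-8")  # agrees with the built-in encoder
    print(f"U+{ord(ch):04X}", len(encoded), "byte(s)")
```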

3. Rendering (Fonts)

Unicode tells the computer what the character is, but not how it looks. That's the job of a Font. If you see a square box (□) or a boxed question mark, it usually doesn't mean the Unicode is broken; it usually means your current font doesn't have a drawing (a glyph) for that specific Code Point.

ASCII vs. Unicode

The Old Way (ASCII)

Only 128 characters total.
English letters, digits, basic punctuation, and control codes only.
No accents (é, ñ), no other scripts (汉, Ω), no emojis.
Incompatible regional extensions caused "Mojibake" (garbled text) when sharing files between countries.
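
Mojibake is easy to reproduce: it appears whenever bytes written in one encoding are read back with another. The Python sketch below is only an illustration; the word "café" and the choice of Latin-1 as the "wrong" decoder are assumptions picked for the demo.

```python
original = "caf\u00E9"                 # "café" with a precomposed e-acute

raw = original.encode("utf-8")         # bytes as saved by the sender
garbled = raw.decode("latin-1")        # receiver guesses the wrong encoding
print(garbled)                         # the accent turns into two junk characters

# Plain ASCII cannot represent the accent at all.
try:
    original.encode("ascii")
except UnicodeEncodeError as exc:
    print("ASCII cannot encode:", exc.object[exc.start:exc.end])

# Decoding with the encoding that was actually used restores the text.
print(raw.decode("utf-8") == original)
```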

The Unicode Way

Over 150,000 characters.
Covers virtually all written languages (living and dead).
Includes math symbols, musical notation, and emojis.
The foundation of the modern internet.

Did You Know?

The "Ghost" Characters

Unicode includes characters that are invisible but change how text works. For example, the "Zero Width Joiner" (ZWJ) acts like digital glue. It combines "Man" + "ZWJ" + "Woman" + "ZWJ" + "Boy" to create the single family emoji 👨‍👩‍👦.
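
The "digital glue" can be seen by assembling the family emoji from its parts. This is a small Python sketch; the variable names are illustrative, and whether the result draws as one picture depends on your font and platform.

```python
import unicodedata

ZWJ = "\u200D"                 # Zero Width Joiner
man = "\U0001F468"
woman = "\U0001F469"
boy = "\U0001F466"

family = man + ZWJ + woman + ZWJ + boy
print(family)                                # one family emoji, if the font supports it
print(unicodedata.name(ZWJ))                 # ZERO WIDTH JOINER
print([f"U+{ord(c):04X}" for c in family])   # the five code points behind it
print(len(family), "code points,", len(family.encode("utf-8")), "bytes (UTF-8)")
```

One visible symbol is stored here as 5 code points and 18 bytes (4 + 3 + 4 + 3 + 4).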

Private Use Areas

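Some ranges in Unicode are intentionally left unassigned and always will be, such as U+E000 to U+F8FF. Anyone can use these Code Points for their own symbols, but the meaning is private to whoever defines it. Companies and font makers use them for custom icons; Apple, for example, places its logo at U+F8FF, which shows up as something else (or as a blank box) on other platforms.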
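
You can check whether a character sits in a Private Use Area by looking at its general category, which is "Co" for private-use Code Points. The Python sketch below uses the standard `unicodedata` module; the helper name `is_private_use` is made up for this example.

```python
import unicodedata


def is_private_use(ch):
    """True if the character's general category is Co (private use)."""
    return unicodedata.category(ch) == "Co"


for ch in ("\uE000", "\uF8FF", "A"):
    # Private-use code points have no official name, so supply a fallback.
    name = unicodedata.name(ch, "<no official name>")
    print(f"U+{ord(ch):04X}", is_private_use(ch), name)
```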

From Stone Tablets to Smart Phones

What is Unicode?

The Short Answer