Why links break on spaces and ampersands, what %20 really means, and exactly when to encode a value.
Paste a link with a space or an ampersand into the wrong place and it breaks — the browser stops reading at the space, or the part after the & vanishes. The fix is URL encoding, also called percent-encoding, and once you understand the handful of rules behind it, a whole category of “why is my link broken?” problems disappears. This article explains what URL encoding is, why it exists, and exactly when you need it.
A URL is not free text. It is a structured address, and certain characters have special jobs inside it: ? starts the query string, & separates one parameter from the next, / divides path segments, # marks a fragment, and a space is not allowed at all. If your data contains one of those characters — say a search term with a space, or a name with an ampersand — the URL parser cannot tell your data apart from the structure. Percent-encoding solves this by replacing the troublesome character with a % followed by its two-digit hexadecimal byte value.
Every character is stored as one or more bytes, and percent-encoding writes each of those bytes as % plus two hex digits. A space is byte 32, which is 20 in hex, so a space becomes %20. An ampersand is %26, a question mark %3F, a forward slash %2F. Characters outside ASCII (accented letters, emoji, non-Latin scripts) are first turned into their UTF-8 bytes, then each byte is percent-encoded, which is why a single é can become %C3%A9.
| Character | Encoded | Why it matters |
|---|---|---|
| space | %20 | Spaces are not allowed in URLs |
| & | %26 | Otherwise read as a parameter separator |
| ? | %3F | Otherwise starts the query string |
| / | %2F | Otherwise read as a path divider |
| # | %23 | Otherwise starts the fragment |
| + | %2B | In query strings, + can mean a space |
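The byte-to-hex rule can be sketched in a few lines of Python. The function name `percent_encode` is made up for this illustration, and the sketch assumes the strictest policy: everything except unreserved ASCII (letters, digits, and `-._~`) gets encoded. Real code would normally call a library function instead.

```python
def percent_encode(text: str) -> str:
    """Percent-encode every byte except unreserved ASCII (illustrative helper)."""
    unreserved = set(
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        b"abcdefghijklmnopqrstuvwxyz"
        b"0123456789-._~"
    )
    out = []
    # Work on UTF-8 bytes, so non-ASCII characters expand to several %XX groups.
    for byte in text.encode("utf-8"):
        if byte in unreserved:
            out.append(chr(byte))
        else:
            out.append(f"%{byte:02X}")  # % plus two uppercase hex digits
    return "".join(out)


for ch in [" ", "&", "?", "/", "#", "+", "\u00e9"]:
    print(repr(ch), "->", percent_encode(ch))
```

Run as written, this should reproduce the table above and print %C3%A9 for é.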
Here is a trap that catches almost everyone. In the query string portion of a URL, historically a + was used to represent a space (a leftover from HTML form submissions). So %20 and + can both decode to a space depending on context, but a literal plus sign must be written %2B. If you ever see a search query arrive with pluses where the spaces should be, this is why. When in doubt, encode spaces as %20 everywhere — it is unambiguous in both the path and the query.
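Python's standard library happens to expose both conventions, which makes the difference easy to see. This is a small sketch; the sample value `1 + 1` is invented for the illustration.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

value = "1 + 1"  # sample data containing both spaces and a literal plus

# Plain percent-encoding: space -> %20, plus -> %2B
print(quote(value, safe=""))   # 1%20%2B%201

# Form-style encoding: space -> +, plus -> %2B
print(quote_plus(value))       # 1+%2B+1

# Decoding depends on which convention the reader assumes.
print(unquote("a+b"))          # a+b  (plus left alone)
print(unquote_plus("a+b"))     # a b  (plus read as a space)
print(unquote_plus("a%2Bb"))   # a+b  (encoded plus survives)
```

Both decoders turn %20 back into a space, which is why %20 is the safer choice when you do not know how the value will be read.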
Encode the values you drop into a URL: a search term, a filename, a redirect target, anything a user typed. Do not blanket-encode an entire finished URL, because that would turn the real ? and / that hold it together into %3F and %2F and break it. The rule of thumb: encode each piece of data before you assemble it into the URL, never the whole thing afterwards.
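As a rough sketch of that rule of thumb in Python: each value is encoded on its own and only then joined with the structural characters. The host, path, filename, and parameter names below are all hypothetical.

```python
from urllib.parse import quote, urlencode

base = "https://example.com"             # hypothetical host
filename = "annual report 2024.pdf"      # hypothetical user-supplied data
params = {"q": "tom & jerry", "next": "/home?tab=1"}

# Encode each piece, then assemble. safe="" means even "/" in a value is encoded.
path = "/files/" + quote(filename, safe="")
query = urlencode(params, quote_via=quote)  # quote_via=quote gives %20, not +
url = f"{base}{path}?{query}"
print(url)
# https://example.com/files/annual%20report%202024.pdf?q=tom%20%26%20jerry&next=%2Fhome%3Ftab%3D1

# Encoding the finished URL instead mangles its structure.
broken = quote(url, safe="")
print(broken[:30])
# https%3A%2F%2Fexample.com%2Ffi
```

The real ? and & between the pieces stay intact because they were never passed through the encoder.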
Suppose you want to link to a search for the phrase tom & jerry cartoons. Encoded as a query value it becomes:
https://example.com/search?q=tom%20%26%20jerry%20cartoons
The space is %20 and the ampersand is %26, so the server reads the whole phrase as one value instead of stopping at “tom” and treating “jerry cartoons” as a separate broken parameter. Decode it and you get the original phrase back, exactly.
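To see the receiving side, here is a small sketch that uses Python's `urllib.parse` as a stand-in for whatever parser a server actually runs. Other parsers may treat the unencoded case slightly differently.

```python
from urllib.parse import parse_qs, urlsplit

encoded_url = "https://example.com/search?q=tom%20%26%20jerry%20cartoons"
good = parse_qs(urlsplit(encoded_url).query)
print(good)  # {'q': ['tom & jerry cartoons']}

# The same phrase dropped into the query string without encoding.
raw_query = "q=tom & jerry cartoons"
bad = parse_qs(raw_query)
print(bad)   # {'q': ['tom ']}
```

In the unencoded case this parser splits on the &, keeps only "tom " as the value of q, and silently drops the rest because it has no = sign.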
If a value passes through two systems that each encode it, you can end up with double encoding: the % from the first pass gets encoded into %25 on the second, so %20 becomes %2520. The symptom is literal %20 text showing up on a page where a space should be. The fix is to encode exactly once, at the boundary where data enters the URL, and to decode exactly once when reading it back.
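Double encoding is easy to reproduce. A minimal sketch in Python, with an invented sample value:

```python
from urllib.parse import quote, unquote

value = "a b"                      # sample data

once = quote(value, safe="")       # first system encodes
twice = quote(once, safe="")       # second system encodes again
print(once)    # a%20b
print(twice)   # a%2520b

# A single decode now leaves the literal text %20 behind.
print(unquote(twice))              # a%20b
print(unquote(unquote(twice)))     # a b
```

Decoding twice recovers the value here, but it is not a safe general fix: data that legitimately contains a % sequence would be corrupted, so encoding once is the better repair.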
URL encoding is one of a family. Base64 encodes arbitrary binary data into a safe text alphabet (handy for embedding images or tokens), and HTML entity encoding (&amp;amp; for an ampersand) protects characters inside web pages rather than URLs. They solve related problems in different places, so it is worth knowing which one a given context needs.
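For a side-by-side feel, this sketch runs one invented sample value through all three encodings using Python's standard library:

```python
import base64
import html
from urllib.parse import quote

value = "tom & jerry"  # sample data

url_form = quote(value, safe="")    # for a URL component
html_form = html.escape(value)      # for text inside an HTML page
b64_form = base64.b64encode(value.encode("utf-8")).decode("ascii")  # binary-safe text

print(url_form)    # tom%20%26%20jerry
print(html_form)   # tom &amp; jerry
print(b64_form)    # dG9tICYgamVycnk=
```

The three outputs are not interchangeable: each is only correct in the context it was designed for.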
URL encoding exists because URLs reserve a few characters for structure, and your data sometimes contains those same characters. Encode each value before placing it in the URL, decode it once on the way out, prefer %20 for spaces, and watch for double-encoding when a value travels through several systems. Get those right and broken links stop being a mystery.