@dajare
Last active October 17, 2025 08:32
SE Extra Bits

This file SUPPLEMENTS SEMoS; it does not replace it.

Typography, etc.

  • No-break space: U+00A0 (the word joiner proper is U+2060) [1]
  • Hair space: U+200A
  • Non-breaking hyphen: U+2011
  • 2-em dash: ⸺ (U+2E3A)
  • 3-em dash: ⸻ (U+2E3B)
  • ligatures: ÆæŒœ
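
Because invisible characters rarely survive copy/paste from a web page, it can be safer to generate them from their code points. A minimal Python sketch, assuming the standard `unicodedata` module; the `describe` helper and the `CODE_POINTS` list are illustrative names, not part of any SE tooling:

```python
import unicodedata

# Code points for the spaces, hyphen, and dashes listed above,
# plus U+2060 (word joiner) for comparison with U+00A0.
CODE_POINTS = [0x00A0, 0x2060, 0x200A, 0x2011, 0x2E3A, 0x2E3B]


def describe(code_point: int) -> str:
    """Return 'U+XXXX NAME' for a code point (hypothetical helper)."""
    return f"U+{code_point:04X} {unicodedata.name(chr(code_point))}"


for cp in CODE_POINTS:
    print(describe(cp))
```

`chr(0x200A)` then yields the actual hair space for pasting into a file or a search field.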

Semantics examples

  • <abbr epub:type="z3998:personal-name">J. G.</abbr>
  • <abbr epub:type="z3998:given-name">R. A.</abbr> Johnson
  • <abbr epub:type="z3998:surname">J.</abbr>
  • <abbr epub:type="z3998:initialism z3998:name-title">Q.C.</abbr>
    ·
  • epub:type="se:name.publication.newspaper"
			book
			essay
			journal
			newspaper
			magazine
			pamphlet
			paper
			play
			poem
			short-story
  • epub:type="se:name.vessel.ship"
  • epub:type="se:name.visual-art.film"
  • epub:type="se:name.person.full-name"
  • epub:type="z3998:verse" | poem song hymn lyrics
    ·
  • xml:lang="de", "fr", "la" | xml:lang="grc-Latn"

CSS

Letters

/* for salutations in headers; assumes the epub namespace is declared with @namespace */
header > [epub|type~="z3998:salutation"]{
	text-align: initial;
}

Block-level italics

@namespace xml "http://www.w3.org/XML/1998/namespace";
 . . .
/* blocklevel lang italics */
blockquote[xml|lang]{
	font-style: italic;
}
/* end blocklevel lang italics */

Tricks

  • find 4-digit years: (?!(2012|2007|1999|3998))\d{4} (the lookahead skips numbers that occur in namespace and vocabulary URIs) / HT: Robin
  • find two CAPITAL letters, each followed by a full stop, with or without a space between them: [A-Z]\.[ ]?[A-Z]\. [2]
  • use regex across multiple lines: ^keyword.*$(\r\n|\r|\n)? [3]
  • for case-sensitive regex on GitHub code search: /(?-i)myCaseSensitiveSearchString/

Footnotes

  1. The gist won't respect the different kinds of spaces: in my local text file, I have the actual space character available for copy/paste.

  2. This helps when adding semantics by hand: it catches both unspaced strings like M.P. and spaced initials like G. K.

  3. For example, search for <h2.*$(\r\n|\r|\n)?.*?<p to match an <h2 line followed directly by a line containing <p.
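
The consecutive-lines search from footnote 3 can be sketched in Python as follows. This assumes `re.MULTILINE` is the closest equivalent to an editor's per-line handling of `$`; the XHTML fragment and the name `HEADING_THEN_P` are invented for the example:

```python
import re

# Pattern from footnote 3; MULTILINE lets $ match at the end of each line.
HEADING_THEN_P = re.compile(r"<h2.*$(\r\n|\r|\n)?.*?<p", re.MULTILINE)

# Hypothetical fragment: a heading followed directly by a paragraph.
xhtml = "<section>\n\t<h2>Title</h2>\n\t<p>First paragraph.</p>\n</section>\n"

match = HEADING_THEN_P.search(xhtml)
if match:
    # repr() makes the matched line break and tab visible.
    print(repr(match.group(0)))
```

Since `.` does not cross line breaks by default, the pattern only matches when `<p` appears on the line immediately after the `<h2` line (or on the same line).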
