@daveydee33
Last active January 23, 2026 09:57
Extract an HTML tag value using sed with regular expression

Extract the content from inside an HTML tag with sed and a regex

Example: given <title>this is it</title>, we want to extract just "this is it".

echo '<title>this is it</title>' | sed -nE 's/<title>(.*)<\/title>/\1/p'
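The same substitution can be wrapped in a small shell function so that the tag name becomes a parameter. This is only a sketch: `extract_tag` is a hypothetical helper name, the leading and trailing `.*` are an addition to drop any text around the tag, and it assumes the tag has no attributes and opens and closes on the same line.

```shell
# extract_tag TAG: for each input line containing <TAG>...</TAG>,
# print only the text between the two tags.
# Hypothetical helper; assumes no attributes and no line breaks inside the tag.
extract_tag() {
  tag=$1
  # -n suppresses the default output; the trailing p prints only lines where
  # the substitution matched. Using | as the delimiter avoids escaping the
  # slash in the closing tag.
  sed -nE "s|.*<${tag}>(.*)</${tag}>.*|\1|p"
}

echo '<head><title>this is it</title></head>' | extract_tag title
```

Lines that do not contain the tag produce no output, which makes the function safe to run over a whole file.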

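Regex to find and select an HTML attribute, such as a "class" name, even if its value spans multiple lines. In most regex engines . does not match a newline by default, so the lazy form (class=".*?") only matches within a single line unless dot-all mode is enabled. A negated character class matches across line breaks without any flag:

(class="[^"]*")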
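From the command line, a similar match can be sketched with `grep`. The example below is an illustration, not part of the original notes: `find_class_attrs` is a hypothetical helper, and it uses the negated character class `[^"]*` because `grep -E` has no lazy quantifier. Since `grep` works line by line, the input is first flattened with `tr` so that an attribute wrapped over two lines is seen in one piece.

```shell
# Sample HTML where the class attribute value wraps onto a second line.
html='<div class="card
  card--wide" id="main">'

# find_class_attrs: print every class="..." attribute found on stdin.
# Hypothetical helper; newlines are turned into spaces before matching,
# and -o prints only the matched part of the input.
find_class_attrs() {
  tr '\n' ' ' | grep -oE 'class="[^"]*"'
}

printf '%s\n' "$html" | find_class_attrs
```

Note that the printed attribute has its line break replaced by a space, so this is suited to inspecting attributes rather than editing files in place.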

Add a whitespace character to pad all content within brackets

I used this for a global find/replace across many files, to reformat things like:

{{this}}

to

{{ this }}

Using gnu-sed instead of the OS X sed because of compatibility issues with -i in the Mac version of sed, which requires a backup-suffix argument after -i. (brew install gnu-sed)

find . -type f -name '*.njk' | while IFS= read -r file;
do
  # Quote "$file" so paths containing spaces are handled correctly.
  gsed -i 's/{{\(\S\)/{{ \1/g' "$file";
  gsed -i 's/\(\S\)}}/\1 }}/g' "$file";
done
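Before editing files in place, it can help to try the two substitutions on a stream. The sketch below is a portable variant, not the commands from the loop above: `pad_braces` is a hypothetical helper, and it swaps the GNU-only `\S` for the POSIX class `[^[:space:]]` with `sed -E`, so it should behave the same under GNU sed and the Mac version.

```shell
# pad_braces: read text on stdin and write it with a space added
# just inside {{ and }} wherever one is missing.
# Hypothetical helper; [^[:space:]] stands in for the GNU-only \S.
pad_braces() {
  sed -E \
    -e 's/\{\{([^[:space:]])/{{ \1/g' \
    -e 's/([^[:space:]])\}\}/\1 }}/g'
}

# Already padded expressions are left untouched.
echo '<p>{{this}} and {{ that }}</p>' | pad_braces
```

Because each pattern requires a non-whitespace character next to the braces, running it a second time over the same text changes nothing.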
@kanngard

Note that "Extract the content from inside an HTML tag with sed and a regex" only works if the tag occurs just once. In this case it works, but if you apply it to other tags that can occur multiple times, you will not get the expected result.
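To illustrate this point: `(.*)` is greedy, so with two tags on one line it captures everything from the first opening tag to the last closing tag. One possible workaround, sketched here and not taken from the gist, is to match `[^<]*` with `grep -o` so that each tag pair is found separately. `extract_all` is a hypothetical helper, and it assumes the tag has no attributes and no nested markup.

```shell
line='<li>one</li><li>two</li>'

# Greedy (.*) spans from the first <li> to the last </li>,
# so the inner </li><li> leaks into the result.
echo "$line" | sed -nE 's/<li>(.*)<\/li>/\1/p'

# extract_all TAG: print the text of every <TAG>...</TAG> pair, one per line.
# Hypothetical helper; assumes no attributes and no nested tags.
extract_all() {
  tag=$1
  # grep -o emits each matching pair on its own line; sed then strips the tags.
  grep -oE "<${tag}>[^<]*</${tag}>" | sed -E "s|</?${tag}>||g"
}

echo "$line" | extract_all li
```

For anything beyond simple, flat markup, a real HTML parser is the more reliable choice.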
