@comstock
Last active March 20, 2020 22:50
parsing DRS reports with AWK

Generate a list of DRS URNs from a DRS deposit report using AWK

"STILL IMAGE" DRS report example:

(Assuming the JP2 files are the only ones in the report to have delivery URNs.)

cat 1584126841-SLPC2033658088718609638479.txt | grep "JP2" | awk '{print "http://nrs.harvard.edu/" $11}' > list_of_image_URNs.txt
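The column number ($11 here) depends on the layout of the report. A minimal sketch for checking which column holds the URN, assuming whitespace-separated columns: it prints every field of the first JP2 line next to its index. The file names `sample_line.txt` and `field_numbers.txt` and the sample row are hypothetical stand-ins, not real DRS data.

```shell
# Hypothetical one-line report; real DRS report rows have more columns.
printf '%s\n' 'example_0001.jp2 urn-3:EXAMPLE:0001 JP2' > sample_line.txt

# Take the first line containing "JP2" and print each field with its index,
# so you can see which $N to use in the awk command.
grep "JP2" sample_line.txt | head -n 1 \
  | awk '{for (i = 1; i <= NF; i++) print i ": " $i}' > field_numbers.txt

cat field_numbers.txt
```

Run against a real report, the index printed next to the URN is the number to put after the `$` in the awk command.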

"PDS OBJECT" DRS report examples:

  1. The following generates a list of links to the JP2 files referenced in the report.

cat 1584478201-AMFoodAndDrink0066389503652647254379.txt | grep "JP2" | awk '{print "http://nrs.harvard.edu/" $2}' > list_of_image_URNs.txt

  2. The following generates a list of links for all XML files, which we presume to be METS files that provide access to complete objects.

cat 1584478201-AMFoodAndDrink0066389503652647254379.txt | grep "xml" | awk '{print "http://nrs.harvard.edu/" $2}' > list_of_object_URNs.txt

PDS example explained:

  • cat is a command that reads a file and writes its contents to standard output.

  • 1584478201-AMFoodAndDrink0066389503652647254379.txt is an example DRS report.

  • | is called a PIPE. cat feeds the DRS report through the PIPE to grep, which keeps only the lines containing "xml" and passes them through the next | to AWK.

  • awk '{print "http://nrs.harvard.edu/" $2}' tells AWK to concatenate "http://nrs.harvard.edu/" with the value of the second column of each line it receives. (DRS confirmation reports are organized into columns, which AWK splits on whitespace by default.)

  • > list_of_object_URNs.txt directs the output of AWK to a text file, here list_of_object_URNs.txt. If the file already exists, it is overwritten.
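The pipeline explained above can also be written as a single AWK command, since AWK can filter lines itself. A minimal sketch, assuming whitespace-separated columns with the URN in the second one: `sample_report.txt` and its two rows are made up for illustration and do not reflect the real DRS report layout.

```shell
# Hypothetical report; real DRS reports have more columns and real URNs.
printf '%s\n' \
  'example_0001.jp2 urn-3:EXAMPLE:0001 JP2' \
  'example_mets.xml urn-3:EXAMPLE:0002 XML' > sample_report.txt

# /xml/ keeps only lines containing "xml" (the job grep did above);
# $2 is the second whitespace-separated column.
awk '/xml/ {print "http://nrs.harvard.edu/" $2}' sample_report.txt > list_of_object_URNs.txt

cat list_of_object_URNs.txt
```

Like grep "xml", the /xml/ pattern is case-sensitive and matches anywhere on the line, so it is worth spot-checking the output against the report.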

Installation notes

Windows

On Windows 10 you can install a Linux terminal (for example, through the Windows Subsystem for Linux), which provides cat, grep, and AWK.

Mac

macOS already includes a version of AWK in its Terminal. You can also install AWK via the Homebrew package manager.
