The URLs were found by loading a Google Doc, opening the Chrome inspector, and recording network activity,
then going to File -> Version History -> See Version History.
URL form is https://docs.google.com/document/d/<DOCUMENT ID>/revisions/tiles?id=<DOCUMENT ID>&start=1&showDetailedRevisions=false&filterNamed=false&token=<TOKEN>&includes_info_params=true
Obtained by recording network activity whilst loading the revision history.
Returns JSON of the form seen in "List.json".
Expanding a revision gave: https://docs.google.com/document/d/1namGVTADAlbFkF2QdrcHdr1ilxmOt_6dYJ8Wx5PBsTg/revisions/tiles?id=1namGVTADAlbFkF2QdrcHdr1ilxmOt_6dYJ8Wx5PBsTg&start=3&end=26&showDetailedRevisions=true&filterNamed=false&token=AC4w5VjCEgQV5tpwQNS5gYEEJ3__xGqcQA%3A1535553491010&includes_info_params=true with content in "Expanded.json"
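As a rough illustration, the two observed `revisions/tiles` URLs can be rebuilt from their query parameters. This is only a sketch: `tiles_url` is a hypothetical helper name, the parameter order simply mirrors the captured requests, and the token still has to be copied from a logged-in browser session.

```python
from urllib.parse import urlencode


def tiles_url(doc_id, token, start=1, end=None, detailed=False):
    """Build a revisions/tiles URL in the form observed in the network log.

    Hypothetical helper: only the parameters seen in the captured requests
    are reproduced, and nothing here obtains a token.
    """
    params = {"id": doc_id, "start": start}
    if end is not None:
        # 'end' only appeared when a revision group was expanded.
        params["end"] = end
    params["showDetailedRevisions"] = "true" if detailed else "false"
    params["filterNamed"] = "false"
    params["token"] = token  # urlencode escapes ':' as %3A, as in the capture
    params["includes_info_params"] = "true"
    base = f"https://docs.google.com/document/d/{doc_id}/revisions/tiles"
    return f"{base}?{urlencode(params)}"
```

Calling `tiles_url(doc_id, token)` gives the list form, while `tiles_url(doc_id, token, start=3, end=26, detailed=True)` gives the expanded form.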
URL form is https://docs.google.com/document/u/0/d/<DOCUMENT ID>/showrevision?id=<DOCUMENT ID>&end=<END REVISION NUM>&start=<START REVISION NUM>
Obtained by monitoring the network and selecting a revision in the revision history.
Returns JSON of the form seen in "Revisions.json". This JSON also appears to always start with )]}'
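Because of that leading `)]}'`, the response body cannot be handed straight to a JSON parser. A minimal sketch of stripping it first, assuming the prefix is always exactly those four characters (the function name is made up for illustration):

```python
import json

GUARD_PREFIX = ")]}'"  # the four characters observed at the start of the body


def parse_revision_response(body):
    """Strip the observed prefix, if present, and parse the rest as JSON."""
    text = body.lstrip()
    if text.startswith(GUARD_PREFIX):
        text = text[len(GUARD_PREFIX):]
    # json.loads tolerates the whitespace or newline left after the prefix.
    return json.loads(text)
```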
Data in chunkedSnapshot is split up into chunks. Each chunk consists of a list of entries. Possible entries are detailed below.
For each 'entry':
- `ty` indicates the type
  - `as` being metadata
  - `is` being content data
- `st` indicates the type of metadata. Only appears on metadata entries
  - `document` is for information about the document as a whole
  - `headings` is for information about the headings
  - `language` is for information about the document language
  - `paragraph` is for information about a specific paragraph
  - `text` is for information about a specific section of text. Comparable to div?
- `si` indicates the character this entry starts at (not applicable for all metadata)
- `ei` indicates the character this entry ends at (not applicable for all metadata)
For the content data the following keys have been seen:
- `ibi` seems to indicate the starting index of that chunk of content
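To make the entry layout concrete, here is a small sketch that walks a snapshot and separates metadata entries from content entries using `ty`, then groups the metadata by `st`. It assumes `chunkedSnapshot` decodes to a list of chunks, each a list of dict entries, as described above. The function names are invented for illustration.

```python
def split_entries(chunked_snapshot):
    """Separate entries into (metadata, content) lists based on 'ty'."""
    metadata, content = [], []
    for chunk in chunked_snapshot:
        for entry in chunk:
            kind = entry.get("ty")
            if kind == "as":    # metadata entry
                metadata.append(entry)
            elif kind == "is":  # content data entry
                content.append(entry)
            # Entries with any other 'ty' value are ignored in this sketch.
    return metadata, content


def metadata_by_type(metadata):
    """Group metadata entries by their 'st' value (document, text, ...)."""
    grouped = {}
    for entry in metadata:
        grouped.setdefault(entry.get("st"), []).append(entry)
    return grouped
```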
For each of these metadata types, the keys indicated are assumed to be part of `sm` unless otherwise specified:
- language data types
  - `lgs_l` seems to indicate the language code, e.g. 'en'
- revision_diff
  - `revdiff_aid` is the key of the author that made the revision. A null editor is given by `""`
  - `revdiff_dt` sometimes matches `revdiff_aid`. Possibly indicates addition/removal.
  - Further testing has shown no way to link the user ID from this API to that from the official API. Additionally, the colour assigned to each editor is the only unique constant between the revisions themselves and the revision list.
- text
  - Not all elements may be present.
  - If the element is appended with `_i`, e.g. `ts_fgc_i`, it may indicate if the value should be inherited. It's not clear where the value would be inherited from. Possibly this would be from the prior entry, with a fallback to the given value if that is not available.
  - `ts_fgc` is the color of the text. Given as a hex code, e.g. `#000000`
  - `ts_bgc` is the color of the background surrounding the text (i.e. highlight color). Given as a hex code, e.g. `#000000`
  - `ts_fs` is the font size
  - `ts_ff` is the font face. Given as a font name, e.g. `Arial`
  - `ts_un` appears to be a flag for if the text is underlined
  - `ts_it` appears to be a flag for if the text is italicised
  - `ts_bt` appears to be a flag for if the text is bolded
  - `ts_st` appears to be a flag for if the text is struck through
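The inheritance guess above can be written down as a small sketch. It is not confirmed behaviour: it encodes only the hypothesis that a truthy `<key>_i` means "take the value from the prior entry, falling back to the given value". The function name and the `previous` argument are made up for illustration.

```python
STYLE_KEYS = ("ts_fgc", "ts_bgc", "ts_fs", "ts_ff",
              "ts_un", "ts_it", "ts_bt", "ts_st")


def resolve_text_style(sm, previous=None):
    """Resolve a text entry's style under the *assumed* '_i' inherit rule."""
    previous = previous or {}
    resolved = {}
    for key in STYLE_KEYS:
        inherit = bool(sm.get(key + "_i"))
        if inherit and key in previous:
            # Hypothesis: inherit from the prior entry when flagged.
            resolved[key] = previous[key]
        elif key in sm:
            # Fallback: use the value given on this entry.
            resolved[key] = sm[key]
        # Keys absent from both are skipped; not all elements are present.
    return resolved
```

Comparing its output against how the document actually renders would be one way to test whether the hypothesis holds.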
- headings
  - The top level key here appears to be the level of the heading the following styles apply to.
    - `hs_h1`, `hs_h2`, `hs_h3`, `hs_h4`, `hs_h5`, & `hs_h6` appear to be heading levels 1 -> 6
    - `hs_t` appears to be the title
    - `hs_nt` appears to be the normal text style
    - `hs_st` appears to be the subtitle style
  - `sdef_ts` seems to indicate data about the text style
    - `ts_fgc` is the color of the text. Given as a hex code, e.g. `#000000`
    - `ts_bgc` is the color of the background surrounding the text (i.e. highlight color). Given as a hex code, e.g. `#000000`
    - `ts_fs` is the font size
    - `ts_ff` is the font face. Given as a font name, e.g. `Arial`
    - `ts_un` appears to be a flag for if the text is underlined
    - `ts_it` appears to be a flag for if the text is italicised
    - `ts_bt` appears to be a flag for if the text is bolded
    - `ts_st` appears to be a flag for if the text is struck through
  - `sdef_ps` appears to indicate data about other paragraph-level settings
    - `ps_hdid` appears to be the id for that type of heading. Observed values include `h.4wm2lu96oxp8`, `h.cfkguxvzv5jl` & `h.pli4mhndnqfr`
More info can probably be gleaned from prettyJson.json.
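As a sketch of how the headings metadata might be read, the snippet below pulls the `ps_hdid` value for each heading level. It assumes the `sm` of a headings entry is a dict keyed by level (`hs_h1`, `hs_t`, ...) whose values contain `sdef_ts` and `sdef_ps` dicts. That nesting is inferred from the notes above and not verified, and the function name is hypothetical.

```python
def heading_ids(headings_sm):
    """Map each heading-level key to its ps_hdid, where one is present."""
    ids = {}
    for level, style in headings_sm.items():
        if not isinstance(style, dict):
            continue  # skip anything that does not look like a style block
        paragraph_style = style.get("sdef_ps") or {}
        hdid = paragraph_style.get("ps_hdid")
        if hdid is not None:
            ids[level] = hdid
    return ids
```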
Any idea how one might infer the page number for a particular character number? That is, if I know the
`si` and `ei` for a word in the text, is there a way to infer the page number from that?