Improvisational jpt: Processing Apple Developer JSON transcript files

If you’ve downloaded the Developer app for the Mac there’s a trove of JSON transcripts cached in your home folder at ~/Library/Group Containers/group.developer.apple.wwdc/Library/Caches/Transcripts

Being curious I took a look at them using my JSON Power Tool jpt. In it’s default mode it will “pretty print” JSON or in Javascript parlance “stringify” them with a two space indent per nesting level. Inside it can be seen the transcripts are arrays of arrays inisde a uniquely named object. The 1st entry of the array is the time in seconds and the 2nd entry is the string we want.

Arrays of arrays

One of jpt’s cool features is that it supports the venerable yet nascent JSONPath query syntax. Using JSONPath we can use the recursive operator .. to go straight to the transcript object without needing determine the unique name of the parent object, then we want all the array within there [*]and inside those array we want the second entry of the 0 based array [1]. The query looks like this $..transcript[*][1]

Just text please

The query is single quoted so the shell doesn’t interpret the $ as the beginning of a variable name. The -T option for jpt it outputs text without quotes. The default output mode for jpt is JSON (double quoted strings). I added all sorts of other niceties to the script, as you’ll see below. The results are output to your ~/Desktop in a folder called Developer Transcripts

#!/bin/bash
: <<-LICENSE_BLOCK
Developer Transcript Extractor Copyright (c) 2022 Joel Bruner. Licensed under the MIT License. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
LICENSE_BLOCK

############
# VARIABLES #
############

destinationFolderName="Developer Transcripts"
destinationPath="${HOME}/Desktop/${destinationFolderName}"

#you'll need to download the Developer app and launch it: https://apps.apple.com/us/app/apple-developer/id640199958
transcriptPaths=$(find "${HOME}/Library/Group Containers/group.developer.apple.wwdc/Library/Caches/Transcripts/ByID" -name '*json')
contentsJSON="${HOME}/Library/Group Containers/group.developer.apple.wwdc/Library/Caches/contents.json"

########
# MAIN #
########

#either jpt should be installed or the function jpt.min can be pasted in here
if ! which jpt &>/dev/null; then
	echo "Please install jpt or embed jpt.min in this script: https://github.com/brunerd/jpt"
	exit 1
fi

#ensure the destination folder exists
[ ! -d "${destinationPath}" ] && mkdir "${destinationPath}"

#ignore spaces in file paths
IFS=$'\n'

#loop through each transcript json file
for transcriptPath in ${transcriptPaths}; do
	#id is just the file name without the path and extension
	id=$(cut -d. -f1 <<< "${transcriptPath##*/}")
	#a couple of nice-to-haves
	title=$(jpt -T '$.contents[?(@.id == "'"${id}"'")].title' "${contentsJSON}")
	description=$(jpt -T '$.contents[?(@.id == "'"${id}"'")].description' "${contentsJSON}")
	#change \ (disallowed in Unix) to : (Disallowed in Finder byt allowed in Unix)
	title=${title//\//:}
	url=$(jpt -T '$.contents[?(@.id == "'"${id}"'")].webPermalink' "${contentsJSON}")

	#"wwdc" always has the year in the id but not tech-talks or insights
	if ! grep -q -i wwdc <<< "$id"; then
		year="$(jpt '$.contents[?(@.id == "'"${id}"'")].originalPublishingDate' "${contentsJSON}" | date -j -r 1593018000 +"%Y")-"
		filename="${year}${id} - ${title}.txt"
	else
		filename="${id} - ${title}.txt"
	fi

	#put the ID and Title, the URL and Description at the top of the transcript
	echo "${id} - ${title}" > "${destinationPath}"/"${filename}"
	echo -e "${url}\n" >> "${destinationPath}"/"${filename}"
	echo -e "Description:\n${description}\n\nTranscript:" >> "${destinationPath}"/"${filename}"
	
	#append the transcript extract
	jpt -T '$..transcript[*][1]' "${transcriptPath}" >> "${destinationPath}"/"${filename}"

	#just echo out our progress
	echo "${destinationPath}"/"${filename}"
done

The final result is sortable folder of text files that you can easily QuickLook through.

Some serviceably readable contents!

So there you go! Some surprise JSON transcript files from the Apple Developer app, made me wonder how someone would turn them into human readable files. It turned out it was a fun and practical use of jpt and it’s support for JSONPath. You can download an installer package from the Releases page to try it out.

ljt, a little JSON tool for your shell script

ljt, the Little JSON Tool, is a concise and focused tool for quickly getting values from JSON and nothing else. It can be used standalone or embedded in your shell scripts. It requires only macOS 10.11+ or a *nix distro with jsc installed. It has no other dependencies.

You might have also seen my other project jpt, the JSON Power Tool. It too can be used standalone or embedded in a script however its features and size (64k) might be overkill in some case and I get it! I thought the same thing too after I looked at work of Mathew Warren, Paul Galow, and Richard Purves. Sometimes you don’t need to process JSON Text Sequences, use advanced JSONPath querying, modify JSON, or encode the output in a myriad of ways. Maybe all you need is something to retrieve a JSON value in your shell script.

Where jpt was an exercise in maximalism, ljt is one of essential minimalism – or at least as minimal as this CliftonStrengths Maximizer can go! πŸ€“ The minified version is mere 1.2k and offers a bit more security and functionality than a one-liner.

ljt features:
β€’ Query using JSON Pointer or JSONPath (canonical only, no filters, unions, etc)
β€’ Javascript code injection prevention for both JSON and the query
β€’ Multiple input methods: file redirection, file path, here doc, here string and Unix pipe
β€’ Output of JSON strings as regular text and JSON for all others
β€’ Maximum input size of 2GB and max output of 720MB (Note: functions that take JSON data as an environment variable or an argument are limited to a maximum of 1MB of data)
β€’ Zero and non-zero exit statuses for success and failure, respectively

Swing by the ljt Github page and check it out. There are two versions of the tool one is fully commented for studying and hacking (ljt), the other is a “minified” version without any comments meant for embedding into your shell scripts (ljt.min). The Releases page has a macOS pkg package to install and run ljt as a standalone utility.

Thanks for reading and happy scripting!

jpt 1.0 can deal with multiple JSON texts

jpt 1.0 can now deal with multiple JSON texts whether they are proper JSON Sequences or mutants like JSON Lines and NDJSON or just plain old concatenated JSON texts in one file. Even better it gives as good as it gets, it can also output in those formats as well. Let’s take a look!

Multiple JSON Text Options:
  -M "<value>" - options for working with multiple JSON texts
     S - Output JSON Text Sequences strictly conforming to RFC 7464 (default)
     N - Output newline delimited JSON texts
     C - Output concatenated JSON texts
     A - Gather JSON texts into an array, post-query and post-patching
     a - Gather JSON texts into an array, pre-query and pre-patching 

What is a JSON Text Sequence? It’s the standardized way (RFC 7464) to combine multiple JSON texts using the Record Separator (RS) control character (0x1E) as a delimiter.

The genius of JSON Text Sequences is that newlines are not made to be significant like they are in JSON Lines/NDJSON which introduce a fragility to what should be ignorable whitespace. These formats break the JSON specification. JSON sequences however are easily parsed and if needed more easily read by humans since the JSON texts do not need to be on a single line.

Let’s look at a simple JSON sequence. We can create the record separator character with the ANSI-C shell syntax $'\x1e' additional JSON-Seq requires that numbers be terminated with a newline (\n) for consistency I did that for each JSON text.

% jpt <<< $'\x1e1\n\x1e"a"\n\x1etrue\n'         
1
"a"
true

Now you might say that looks just like JSON Lines/NDJSON but what you don’t see are the RS characters, they are stripped out in Terminal. However if we send the output to a file or pipe it into bbedit you will see them appear as ΒΏ

RS (0x1e) characters in bbedit

With jpt 1.0 we can now choose to convert multiple JSON text (whether sequences, lines, or concatenated) into an array, either pre or post query and patching.

#-Ma and -MA without a query or patch are functionally the same
% jpt -Ma <<< $'\x1e1\n\x1e"a"\n\x1etrue\n'
[
  1,
  "a",
  true
]

Now let’s look at 3 separate JSON texts of arrays and see how -Ma and -MA differ

#an array of the 3 arrays is created pre-query
#thus /2 corresponds to the last JSON text
#-Ma is most useful to get a specific JSON text
% jpt -Ma /2 <<< $'[1,2,3]\n[4,5,6]\n[7,8,9]\n'
[
  7,
  8,
  9
]

#the query /2 is run first on all 3 texts
#then those results 3, 6, and 9 are gathered into an array
% jpt -MA /2 <<< $'[1,2,3]\n[4,5,6]\n[7,8,9]\n'
[
  3,
  6,
  9
]

One of thing about JSON Lines and NDJSON is they take otherwise ignorable whitespace like newline and make it something significant, the introduce a brittleness to the otherwise robust JSON spec. Let’s see how jpt handles it

#-MN will output NDJSON and correct the missing newline
% jpt -MN <<< $'{"a":1}\n{"b":2}{"c":3}'
{"a":1}
{"b":2}
{"c":3}

As you can see a simple line based NDJSON parser would have failed with the last object, jpt treats anything that doesn’t parse as a possibly concatenated JSON.

You can also output concatenated JSON… but why?! Maybe you have your reasons, if so, use -MC

#no RS characters with concatenated JSON
jpt -MC <<< $'{"a":1}\n{"b":2}{"c":3}'
{
  "a": 1
}
{
  "b": 2
}
{
  "c": 3
}

If you work with multiple JSON texts such as JSON logs then jpt might be a useful addition to your arsenal of tools. It can be downloaded from my jpt Releases page at Github.

jpt: see JSON differently

JSON is wonderfully efficient at encoding logical structure and hierarchy into a textual form, but sometimes our eyes and brains need some help making sense of that. Let’s see how jpt can help!

The Shape of JSON

JSON uses a small and simple palette of brackets and curly braces, to express nest-able structures like objects and arrays. However once inside those structures it’s hard to visualize the lineage of the members.

Using jpt however, one can get a better idea of what the shape of a JSON document is.

For example object-person.json:

{
  "person": {
    "name": "Lindsay Bassett",
    "nicknames": [
      "Lindy",
      "Linds"
    ],
    "heightInInches": 66,
    "head": {
      "hair": {
        "color": "light blond",
        "length": "short",
        "style": "A-line"
      },
      "eyes": "green"
    },
    "friendly":true
  }
}

Using jpt -L we can output what I am currently calling “JSONPath Object Literals”. With JPOL the entire path is spelled out for each item as well as it’s value expressed as a JSON object. You can also roundtrip JPOL right back into jpt and you’ll get the JSON again.

% jpt -L object-person.json
$={}
$["person"]={}
$["person"]["name"]="Lindsay Bassett"
$["person"]["nicknames"]=[]
$["person"]["nicknames"][0]="Lindy"
$["person"]["nicknames"][1]="Linds"
$["person"]["heightInInches"]=66
$["person"]["head"]={}
$["person"]["head"]["hair"]={}
$["person"]["head"]["hair"]["color"]="light blond"
$["person"]["head"]["hair"]["length"]="short"
$["person"]["head"]["hair"]["style"]="A-line"
$["person"]["head"]["eyes"]="green"
$["person"]["friendly"]=true

#roundtrip back into jpt (-i0 means zero indents/single line)
jpt -L object-person.json | jpt -i0
{"person":{"name":"Lindsay Bassett","nicknames":["Lindy","Linds"],"heightInInches":66,"head":{"hair":{"color":"light blond","length":"short","style":"A-line"},"eyes":"green"},"friendly":true}}

If you were to fire up jsc or some other JavaScript interpreter, declare $ as a variable using var $ and then copy and paste the output above and you will have this exact object.

Perhaps though you prefer single quotes? Add -q to use single quotes for the property names but values remain double quoted.

#-q will single quote JSONPath only, strings remain double quoted 

% jpt -qL object-person.json         
$={}
$['person']={}
$['person']['name']="Lindsay Bassett"
$['person']['nicknames']=[]
$['person']['nicknames'][0]="Lindy"
$['person']['nicknames'][1]="Linds"
$['person']['heightInInches']=66
$['person']['head']={}
$['person']['head']['hair']={}
$['person']['head']['hair']['color']="light blond"
$['person']['head']['hair']['length']="short"
$['person']['head']['hair']['style']="A-line"
$['person']['head']['eyes']="green"
$['person']['friendly']=true

#-Q will single quote both
% jpt -QL object-person.json
$={}
$['person']={}
$['person']['name']='Lindsay Bassett'
$['person']['nicknames']=[]
$['person']['nicknames'][0]='Lindy'
$['person']['nicknames'][1]='Linds'
$['person']['height in inches']=66
$['person']['head']={}
$['person']['head']['hair']={}
$['person']['head']['hair']['color']='light blond'
$['person']['head']['hair']['length']='short'
$['person']['head']['hair']['style']='A-line'
$['person']['head']['eyes']='green'
$['person']['friendly']=true

If you don’t like bracket notation use -d for dot notation (when the name permits). -Q will single quote both values and key names if using brackets. To save a few lines you can use -P to only print primitives (string, number, booleans, null) and will omit those empty object and array declarations. This too can roundtrip back into jpt, it will infer the structure type (object or array) based on the key type (string or number when in brackets).

% jpt -dQLP object-person.json       
$.person.name='Lindsay Bassett'
$.person.nicknames[0]='Lindy'
$.person.nicknames[1]='Linds'
$.person.heightInInches=66
$.person.head.hair.color='light blond'
$.person.head.hair.length='short'
$.person.head.hair.style='A-line'
$.person.head.eyes='green'
$.person.friendly=true

#roundtrip into jpt, structure types can be inferred
% jpt -dQLP object-person.json | jpt -i0    
{"person":{"name":"Lindsay Bassett","nicknames":["Lindy","Linds"],"heightInInches":66,"head":{"hair":{"color":"light blond","length":"short","style":"A-line"},"eyes":"green"},"friendly":true}}

Now perhaps we don’t care so much about the data but about the structure? Then we can use the -J (JSONPath) or -R (JSON Pointer) output combined with -K or -k for the key names only

#-K key names are double quoted
% jpt -JK /Users/brunerd/Documents/TestFiles/object-person.json
$
  "person"
    "name"
    "nicknames"
      0
      1
    "height in inches"
    "head"
      "hair"
        "color"
        "length"
        "style"
      "eyes"
    "friendly"

#-q/-Q can change the quoting to singles of course
#-C will show the constructor type of every element
% jpt -JKqC /Users/brunerd/Documents/TestFiles/object-person.json
$: Object
  'person': Object
    'name': String
    'nicknames': Array
      0: String
      1: String
    'height in inches': Number
    'head': Object
      'hair': Object
        'color': String
        'length': String
        'style': String
      'eyes': String
    'friendly': Boolean


#-k for unquotes key names, -C for constructor, -i1 for 1 space indents 

% jpt -JkC -i1 /Users/brunerd/Documents/TestFiles/object-person.json
$: Object
 person: Object
  name: String
  nicknames: Array
   0: String
   1: String
  height in inches: Number
  head: Object
   hair: Object
    color: String
    length: String
    style: String
   eyes: String
  friendly: Boolean

It’s been a lot of fun creating jpt and discovery new ways to express JSON using an emerging notation like JSONPath (which is going through the RFC process now). If you’d like to see JSON a different way download jpt and give it a try!

Helping truncated JSON data with jpt 1.0

jpt 1.0 now has the ability to detect and help truncation in strings, arrays and objects. The -D option can detect this in newline delimited and concatenated JSON and -H will help the JSON by closing up strings, arrays, and objects. For example, here’s two JSON text concatenated both with truncation:

jpt -i0 -DH <<< $'{"a":[1,2 {"b":[3,4,'
{"a":[1,2]}
{"b":[3,4,null]}

#heredoc to more clearly show the input data
% jpt -i0 -DH <<EOF
heredoc> {"a":[1,2
heredoc> {"b":[3,4,
heredoc> EOF
{"a":[1,2]}
{"b":[3,4,null]}

If there is a trailing comma, a null will be put there to signify the fact there was an additional value before truncation occured, in all cases one should assume some data has been lost. The -i0 option is for zero indents, single line JSON however it is not NDJSON. When multiple JSON texts are input, as above, JSON Sequence (RFC 7464) are the default output. JSON Sequences are simply a record separator (RS) character at the beginning of JSON text. RS characters do not appear in the Terminal. If you pipe the output to bbedit or redirect to a file you’ll see them. To get some commentary on the repairs, use the -c option:

jpt -i0 -cDH <<< $'{"a":[1,2 {"b":[3,4,'         
{"a":[1,2]}
{"b":[3,4,null]}
--> Original error: SyntaxError: JSON Parse error: Expected ']'
--> Byte 10: Truncation Detected
--> Truncation help for JSON Text (1), bytes: 1-10
--> Truncation help after byte 10, array end: ]
--> Truncation help after byte 10, object end: }
--> Truncation help for JSON Text (2), bytes: 11-21
--> Truncation help after byte 21, missing value and array end: null]
--> Truncation help after byte 21, object end: }
--> Bytes 1-10: JSON Text (1)
--> Bytes 11-21: JSON Text (2)

echo $?                                 
1

As you can see -c can get pretty verbose depending on the issues. If there is any output with -c the exit code of jpt will be 1, if there is nothing to fix it will exit 0. This can be useful if you need to know if the source JSON needed any help or not.

jpt 1.0 is there when your JSON needs some help, not just when it’s perfectly formatted. If that sounds cool give it a try, download the Mac .pkg at github.com/brunerd/jpt/releases

jpt 1.0 text encoding fun

Besides JSON, jpt (the JSON power tool) can also output strings and numbers in a variety of encodings that the sysadmin or programmer might find useful. Let’s look at the encoding options from the output of jpt -h

% jpt -h
...
-T textual output of all data (omits property names and indices)
  Options:
	-e Print escaped characters literally: \b \f \n \r \t \v and \\ (escapes formats only)
	-i "<value>" indent spaces (0-10) or character string for each level of indent
	-n Print null values as the string 'null' (pre-encoding)

	-E "<value>" encoding options for -T output:

	  Encodes string characters below 0x20 and above 0x7E with pass-through for all else:
		x 	"\x" prefixed hexadecimal UTF-8 strings
		O 	"\nnn" style octal for UTF-8 strings
		0 	"\0nnn" style octal for UTF-8 strings
		u 	"\u" prefixed Unicode for UTF-16 strings
		U 	"\U "prefixed Unicode Code Point strings
		E 	"\u{...}" prefixed ES2016 Unicode Code Point strings
		W 	"%nn" Web encoded UTF-8 string using encodeURI (respects scheme and domain of URL)
		w 	"%nn" Web encoded UTF-8 string using encodeURIComponent (encodes all components URL)

		  -A encodes ALL characters
	
	  Encodes both strings and numbers with pass-through for all else:
		h 	"0x" prefixed lowercase hexadecimal, UTF-8 strings
		H 	"0x" prefixed uppercase hexadecimal, UTF-8 strings
		o 	"0o" prefixed octal, UTF-8 strings
		6 	"0b" prefixed binary, 16 bit _ spaced numbers and UTF-16 strings
		B 	"0b" prefixed binary, 8 bit _ spaced numbers and UTF-16 strings
		b 	"0b" prefixed binary, 8 bit _ spaced numbers and UTF-8 strings

		  -U whitespace is left untouched (not encoded)

Strings

While the above conversion modes will do both number and string types, these options will work only on strings (numbers and booleans pass-through). If you work with with shell scripts these techniques may be useful.

If you store shell scripts in a database that’s not using utf8mb4 table and column encodings then you won’t be able to include snazzy emojis to catch your user’s attention! In fact this WordPress install was so old (almost 15 years!) the default encoding was still latin1_swedish_ci, which an odd but surprisingly common default for many old blogs. Also if you store your scripts in Jamf (still in v10.35 as of this writing) it uses latin1 encoding and your 4 byte characters will get mangled. Below you can see in Jamf things look good while editing, fails once saved, and the eventual workaround is to use an coding like \x escaped hex (octal is an alternate)

Let’s use the red “octagonal sign” emoji, which is a stop sign to most everyone around the world, with the exception of Japan and Libya (thanks Google image search). Let’s look at some of the way πŸ›‘ can be encoded in a shell script

#reliable \x hex notation for bash and zsh
% jpt -STEx <<< "Alert πŸ›‘"
Alert \xf0\x9f\x9b\x91

#above string can be  in both bash and zsh
% echo $'Alert \xf0\x9f\x9b\x91'
Alert πŸ›‘

#also reliable, \nnn octal notation
% jpt -STEO <<< "Alert πŸ›‘"
Alert \360\237\233\221

#works in both bash and zsh
% echo $'Alert \360\237\233\221'
Alert πŸ›‘

#\0nnn octal notation
% jpt -STE0 <<< "Alert πŸ›‘"
Alert \0360\0237\0233\0221

#use with shell builtin echo -e and ALWAYS in double quotes
#zsh does NOT require -e but bash DOES, safest to use -e
% echo -e "Alert \0360\0237\0233\0221"
Alert πŸ›‘

#-EU code point for zsh only
% jpt -STEU <<< "Alert πŸ›‘"
Alert \U0001f6d1

#use in C-style quotes in zsh
% echo $'Alert \U0001f6d1'
Alert πŸ›‘

The -w/-W flags can encode characters for use in URLs

#web/percent encoded output in the case of non-URLs -W and -w are the same
% jpt -STEW <<< πŸ›‘  
%F0%9F%9B%91

#-W URL example (encodeURI)
jpt -STEW <<< http://site.local/page.php?wow=πŸ›‘
http://site.local/page.php?wow=%F0%9F%9B%91

#-w will encode everything (encodeURIComponent)
% jpt -STEw <<< http://site.local/page.php?wow=πŸ›‘
http%3A%2F%2Fsite.local%2Fpage.php%3Fwow%3D%F0%9F%9B%91

And a couple other oddballs…

#text output -T (no quotes), -Eu for \u encoding
#not so useful for the shell scripter
#zsh CANNOT handle multi-byte \u character pairs
% jpt -S -T -Eu <<< "Alert πŸ›‘"
Alert \ud83d\uded1

#-EE for an Javascript ES2016 style code point
% jpt -STEE <<< "Alert πŸ›‘"
Alert \u{1f6d1}

You can also \u encode all characters above 0x7E in JSON with the -u flag

#JSON output (not using -T)
% jpt <<< '"Alert πŸ›‘"'
"Alert πŸ›‘"

#use -S to treat input as a string without requiring hard " quotes enclosing
% jpt -S <<< 'Alert πŸ›‘'
"Alert πŸ›‘"

#use -u for JSON output to encode any character above 0x7E
% jpt -Su <<< 'Alert πŸ›‘'
"Alert \ud83d\uded1"

#this will apply to all strings, key names and values
% jpt -u <<< '{"πŸ›‘":"stop", "message":"Alert πŸ›‘"}' 
{
  "\ud83d\uded1": "stop",
  "message": "Alert \ud83d\uded1"
}

Whew! I think I covered them all. If there are newlines, tabs and other invisibles you can choose to output them or leave them encoded when you are outputting to text with -T

#JSON in, JSON out
jpt <<< '"Hello\n\tWorld"'
"Hello\n\tWorld"

#ANSI-C string in, -S to treat as string despite lack of " with JSON out
% jpt -S <<< $'Hello\n\tWorld' 
"Hello\n\tWorld"

#JSON in, text out: -T alone prints whitespace characters
% jpt -T <<< '"Hello\n\tWorld"'
Hello
	World

#use the -e option with -T to encode whitespace
% jpt -Te <<< '"Hello\n\tWorld"'
Hello\n\tWorld

Numbers

Let’s start simply with some numbers. First up is hex notation in the style of 0xhh and 0XHH. This encoding has been around since ES1, use the -Eh and -EH respectively to do so. All alternate output (i.e. not JSON) needs the -T option. In shell you can combine multiple options/flags together except only the last flag can have an argument, like -E does below.

#-EH uppercase hex
% jpt -TEH <<< [255,256,4095,4096] 
0xFFa
0x100
0xFFF
0x1000

#-Eh lowecase hex
% jpt -TEh <<< [255,256,4095,4096]
0xff
0x100
0xfff
0x1000

Next up are ye olde octals. Use the -Eo option to convert numbers to ye olde octals except using the more modern 0o prefix introduced in ES6

-Eo ES6 octals
% jpt -TEo <<< [255,256,4095,4096]    
0o377
0o400
0o7777
0o10000

Binary notation debuted in the ES6 spec, it used a 0b prefix and allows for _ underscore separators

#-E6 16 bit wide binary
% jpt -TE6 <<< [255,256,4095,4096]
0b0000000011111111
0b0000000100000000
0b0000111111111111
0b0001000000000000

#-EB 16 bit minimum width with _ separator per 8
% jpt -TEB <<< [255,256,4095,4096]
0b00000000_11111111
0b00000001_00000000
0b00001111_11111111
0b00010000_00000000

#-Eb 8 bit minimum width with _ separator per 8
% jpt -TEb <<< [15,16,255,256,4095,4096]
0b00001111
0b00010000
0b11111111
0b00000001_00000000
0b00001111_11111111
0b00010000_00000000

If you need to encode strings or numbers for use in scripting or programming, then jpt might be a handy utility for you and your Mac and if your *nix has jsc then it should work also. Check the jpt Releases page for Mac installer package download.

jpt 1.0 and JSON5 rehab

JSON5 is not JSON nor is an IETF standard but jpt 1.0 can rehabilitate it back to standards compliant JSON. Let’s take the example from its web page.

{
  // comments
  unquoted: 'and you can quote me on that',
  singleQuotes: 'I can use "double quotes" here',
  lineBreaks: "Look, Mom! \
No \\n's!",
  hexadecimal: 0xdecaf,
  leadingDecimalPoint: .8675309, andTrailing: 8675309.,
  positiveSign: +1,
  trailingComma: 'in objects', andIn: ['arrays',],
  "backwardsCompatible": "with JSON",
}

Now let’s send it through jpt

% jpt json5_example.txt        
{
  "unquoted": "and you can quote me on that",
  "singleQuotes": "I can use \"double quotes\" here",
  "lineBreaks": "Look, Mom! No \\n's!",
  "hexadecimal": 912559,
  "leadingDecimalPoint": 0.8675309,
  "andTrailing": 8675309,
  "positiveSign": 1,
  "trailingComma": "in objects",
  "andIn": [
    "arrays"
  ],
  "backwardsCompatible": "with JSON"
}

That last line is a bit misleading. JSON5 texts are totally not backward compatible with JSON parsers. What they mean is that JSON5 parsers are backward compatible with JSON. FYI – Every single JSON5 addition is not compatible with JSON, I should know I had to write parsing rules and behaviors to deal with it! Let’s use the -c flag to comment on all the work that went into un-junking the above example

% jpt -c ./json5_example.txt 
{
  "unquoted": "and you can quote me on that",
  "singleQuotes": "I can use \"double quotes\" here",
  "lineBreaks": "Look, Mom! No \\n's!",
  "hexadecimal": 912559,
  "leadingDecimalPoint": 0.8675309,
  "andTrailing": 8675309,
  "positiveSign": 1,
  "trailingComma": "in objects",
  "andIn": [
    "arrays"
  ],
  "backwardsCompatible": "with JSON"
}
--> Original error: SyntaxError: JSON Parse error: Unrecognized token '/'
--> Byte 5: // line comment (JSON5)
--> Byte 19: unquoted object key name (JSON5)
--> Byte 29: single quoted string (JSON5)
--> Byte 63: unquoted object key name (JSON5)
--> Byte 77: single quoted string (JSON5)
--> Byte 113: unquoted object key name (JSON5)
--> Byte 138: \ escaped newline in string (JSON5)
--> Byte 153: unquoted object key name (JSON5)
--> Byte 164: 0x hex value (JSON5)
--> Byte 177: unquoted object key name (JSON5)
--> Byte 198: decimal point wihtout preceding digit (JSON5)
--> Byte 208: unquoted object key name (JSON5)
--> Byte 228: trailing decimal lacking mantissa (JSON5)
--> Byte 233: unquoted object key name (JSON5)
--> Byte 247: explicit plus + for value (JSON5)
--> Byte 253: unquoted object key name (JSON5)
--> Byte 268: single quoted string (JSON5)
--> Byte 282: unquoted object key name (JSON5)
--> Byte 290: single quoted string (JSON5)
--> Byte 296: trailing comma (JSON5)
--> Byte 336: trailing comma (JSON5)

Yeah there’s so much wrong with JSON5 vs JSON its not funny. It’s baffling that Apple is acknowledging it by adding JSON5 parsing to their frameworks! At least they don’t generate it! Despite not being funny, let’s try anyway, here’s some more JSON5 example to show some of the other ways it can lead you astray.

{
	"jpt_lineBreakHandling": "Extended parsing to ignore tabs \
	after escaped newlines.\nIf JSON5 is supposed to be human readable \
	shouldn't tabs be included in the escaping?",
	"leadingDecimalPoint": .8675309, 
	"andTrailingDecimal": 8675309.,
	"negativeLeadingDecimalPoint": -.1234, 

	//Infinity converts to null, use -I for string
	"+Infinity": +Infinity,
	"-Infinity": -Infinity,
	"Infinity": Infinity,
	//null is not a number now is it? done.
	"NaN": NaN,
}

Running this through jpt we’ll get pure JSON output

% jpt ./json5_examples_brunerd.txt 
{
  "jpt_lineBreakHandling": "Extended parsing to ignore tabs after escaped newlines.\nIf JSON5 is supposed to be human readable shouldn't tabs be included in the escaping?",
  "leadingDecimalPoint": 0.8675309,
  "andTrailingDecimal": 8675309,
  "negativeLeadingDecimalPoint": -0.1234,
  "+Infinity": null,
  "-Infinity": null,
  "Infinity": null,
  "NaN": null
}


#-I to convert to signed Infinity strings instead of null
#-8 to convert NaN to string "NaN" (-N was taken, like NaN 8 is like Infinity but not quite :)
% jpt -I8 ./json5_examples_brunerd.txt 
{
  "jpt_lineBreakHandling": "Extended parsing to ignore tabs after escaped newlines.\nIf JSON5 is supposed to be human readable shouldn't tabs be included in the escaping?",
  "leadingDecimalPoint": 0.8675309,
  "andTrailingDecimal": 8675309,
  "negativeLeadingDecimalPoint": -0.1234,
  "+Infinity": "+Infinity",
  "-Infinity": "-Infinity",
  "Infinity": "Infinity",
  "NaN": "NaN"
}

The fact that they added the data type of +/- Infinity and NaN really complicates the conversion since JSON doesn’t have equivalents. By default Infinity converts to null but -I will convert it to the string "Infinity" with signedness if specified. NaN is also converted to null by default but the -8 flag will convert it to the string "NaN". Why 8? Because it’s sort like infinity but it’s not, just like NaN.

If you have need to rehab JSON5 back to JSON then jpt might be able to help you out. It can run on Mac all the way back to OS X Tiger (10.4) all the way up to the newest macOS Monterey (12) and on other *nixes with jsc installed. Check the jpt Releases page

merging JSON objects with jpt 1.0

A cool new feature for jpt 1.0 is the ability perform additional merging operations beyond JSON Merge Patch (RFC 7396). To start with let’s check out a JSON Merge Patch operation, using the mergepatch operation:

#key "a" is overwritten with the -v value, "b" is removed because null
% jpt -i0 -o mergepatch -v '{"a":"🟒","b":null}' <<< '{"a":"πŸ”΄","b":"🟑"}'                                                                           
{"a":"🟒"}

# use -u to encode all characters above 0x7E with \u escaping
% jpt -u -i0 -o mergepatch -v '{"a":"🟒","b":null}' <<< '{"a":"πŸ”΄","b":"🟑"}'                                                                           
{"a":"\ud83d\udfe2"}

Only one key survives, since the Merge Patch spec says that any key with a null value will delete that key in the input JSON. Now perhaps you might want null to survive, if so, use the merge operation:

jpt -i0 -o merge -v '{"a":"🟒","b":null}' <<< '{"a":"πŸ”΄","b":"🟑"}' 
{"a":"🟒","b":null}

Now for the more advanced modes. -o merge0 will only merge in keys that do not exist. Existing keys remain untouched. To order the keys alphabetically use -O (the letter O)

% jpt -O -o merge0 -v '{"common":"🟒","uniqToMergeData":"🟒"}' <<< '{"common":"πŸ”΄","uniqToSource":"πŸ”΄"}'
{
  "common": "πŸ”΄",
  "uniqToMergeData": "🟒",
  "uniqToSource": "πŸ”΄"
}

As you can see only the uniqToMergeData key comes through, the other two keys common and uniqToSource do not change. Useful if you are upgrading an object to a newer format and want to retain existing key values and only add the new keys with empty or default values.

merge1 let’s us do the inverse of merge0 and merge only the keys that exist in the target JSON. Keys that do not exist do not merge over.

% jpt -o merge1 -v '{"uniqToMergeData":"🟒","common":"🟒"}' <<< '{"uniqToSource":"πŸ”΄","common":"πŸ”΄"}'
{
  "uniqToSource": "πŸ”΄",
  "common": "🟒"
}

If you like pictures (I sure do) and are a visual learner you can think of merge0 and merge1 like the Exclude and Intersect Shape Areas in Photoshop.

I hope these new merge modes could be useful to you, if you have a Mac (or *nix with jsc installed) download jpt from my GitHub Releases page and give it a go!

jpt 1.0, the JSON power tool in 2022

I’m happy to announce that jpt, the JSON power tool, has been updated to 1.0 and is available on my GitHub Releases page! It’s been over a year since the last release and I’ve been of working on, learning, and pondering what a really useful JSON tool could be and then working like heck to make it a reality.

jpt is a command line utility for formatting, querying, and modifying JSON. It can run on any version of macOS and most any Unix/Linux with jsc (JavaScriptCore) installed, otherwise it’s dependency free! For systems administrators and MacAdmins who need first class JSON tooling in their own shell scripts it can be embedded as a shell function in either bash or zsh. jpt was built to stand the test of time, needing only jsc and a shell, it will work on the newest macOS all the way back to OS X 10.4 Tiger!

There’s a year’s worth of bug fixes, improvements, and new features, let’s take a look at some of them.

Notable Features and Improvement

β€’Β jpt can now input and output multi-JSON documents like JSON Text Sequences (RFC 7464) plus other non-standard formats like NDJSON and JSON Lines and concatenated JSON. See: jpt 1.0 can deal with multiple JSON texts

β€’ Truncated JSON can be de detected in concatenated/lines JSON and repaired by closing up open strings, arrays and objects. See: Helping truncated JSON data with jpt 1.0

β€’ jpt can now fully rehabilitate JSON5 back to standard JSON. As I learned last year, there’s more mutant JSON than I realized! See: jpt 1.0 and JSON5 rehab

β€’ jpt can query JSON using both JSON Pointer, IETF RFC 6901 and JSONPath, a nascent draft RFC with powerful recursive search and filtering features.

β€’ jpt can manipulate JSON using JSONPatch, JSON Merge Patch, and some new merge modes not found anywhere else. See: merging JSON objects with jpt 1.0

β€’ shell scripters and programmers may find it useful that jpt has extensive string and number encoding capabilities. From classic encodings like hex, octal, and binary to more recent ones like URL/”percent” encoding and Unicode code points. See: jpt 1.0 text encoding fun

β€’ Need help finding visualizing JSON in new ways? jpt can output the structure of JSON in ways not seen, like JSONPath Object Literals which are simply the JSONPath, an equal sign, and the JSON value (e.g. $.ok="got it"). This simple and powerful declarative syntax can help find the exact path to a particular value and can also be used to quickly prototype a JSON object. See: jpt: see JSON differently

Try it out

There’s a lot to explore in jpt 1.0, download it from my GitHub with a macOS installer package in the Releases page. Check out my other jpt blog entries here at brunerd. As the IETF JSONPath standard takes shape I’ll be updating jpt to accomodate that standard. Until then, I hope this tool helps you in your JSON work, play, or research!


jpt is unphased by macOS’s malformed JSON

A funny thing happened on the way to writing a new blog entry. Instead of finding an example JSON file that ships with macOS, I stumbled upon a slow creep of malformed JSON files that started in OS X El Capitan and continues into Big Sur. While I can’t do much about deviant JSON files in macOS, I can control how jpt handles them. So like any good Teachable Momentβ„’ and life lesson, I’ve turned lemons into lemonade and out of challenge comes innovation: jpt v0.9.6 is now nonplussed by these uncouth JSON files, with their comments and trailing commas. Like water off ducks back, jpt keeps on keepin’ on, unphased.

Finding the Misfits

To find these misfits I wrote made a script jsonParsingReportCSV-macOS. This will attempt to parse the JSON it finds using the json_pp binary that ships from Apple, the results are output as CSV and you can then easily sort it to spot troublemakers.

I’ve uploaded a couple annotated version of those searches along with the kind of deviation they exhibit: Malformed-macOS-10.15.csv and Malformed-macOS-11.1.csv

What’s their damage, man?

The two most prevalent kinds of JSON malformation I found in macOS were: trailing commas and comments. Trailing commas meaning the last item in an array or object should not have a comma after it. While Javascript has always allowed trailing commas in array literals and later in object literals (ES5), then in function parameters (ES2017), JSON has never allowed trailing commas nor comments. Neither are in the JSON spec (RFC8259).

The biggest group of offenders were the USD generated JSONs within USDKit (a private framework) and Model I/O. USD is interesting because of it’s lineage and purpose: It’s Pixar’s open source software for combining multiple 3D elements within a common scene. Or as Pixar’s page put it: “Universal Scene Description (USD) is the first publicly available software that addresses the need to robustly and scalably interchange and augment arbitrary 3D scenes that may be composed from many elemental assets.” Keep an eye on this framework πŸ•Ά

JSON Lines (Don’t Do It)

An outlier in the mix is defaultConfig.json within CoreAnalytics.framework, it is what some call “JSON Lines” or “concatenated JSON”, where each line is its own a JSON value or object. While doing some research on JSON Lines I came upon an intriguing comment in this post by rurban:

“The delimiters don’t get in the way, they protect you from MITM and other attacks. It’s a security feature, not a bug.”

Reini Urban on JSON vs. “concatenated JSON”

Taking a look at Reini’s GitHub and the level of his work, I’m going to give his statement some credence. To that end, I really didn’t want to add another output mode nor change query behavior to accommodate JSON Lines. My current compromise is to simply to attempt conversion of a JSON Lines file into an array: tack on commas and ensconce in square brackets. It Just Worksβ„’ if every line is truly a self contained JSON object. If you want to be made aware of this, use the -c option so jpt it will “complain” on stderr and exit with code 1.

The Rabbit Hole of CoreAnalytics

Within the CoreAnalytics private framework is defaultConfig.json, it is a collection of 4019 objects currently that describe the events and actions Apple is interested in collecting usage information on. They want to know if you’re actually using all those new features in CarPlay, Reminders, Photos, eGPU App usage, etc… If you want to see all the key names try this: jpt $..addTransform.name /System/Library/PrivateFrameworks/CoreAnalytics.framework/Versions/A/Resources/defaultConfig.json 

Do you actually use this stuff? Apple wants to know if it’s worth their while…

A Comment on JSON Comments

Comments in JSON has inspired debate far and wide, this post by getify sums it up very nicely:

A JSON encoder MUST NOT output comments. A JSON decoder MAY accept and ignore comments.

Douglas Crockford in some old Yahoo group that even Wayback can’t find anymore

So that’s just what jpt does now, it will never encode comments but it will strip out any and all comments it recognizes automatically, no switches needed. Which style comments? All of them (or most of them):

Commentkindlineage
# commentlineshell, Perl, PHP, Python, R, Ruby, MySQL
// commentlineC(99), Javascript, Java, Swift
/* comment */blockC, Java, Javascript, Swift
; commentlineassembly language (ASM)
-- commentlineAda, Applescript, Haskell, Lua, SQL
<!-- comment -->blockXML
*** comment ***linemacOS’s ionodecache.json

If you need a test file you can use my worstJSONEver.json. A most horrible JSON file if there ever was one! jpt soldiers on, use -c to see what it does to clean the input (-i0 for no level indents). It shows the initial parse error and each attempt at recovery.

Fixing the Misfits

There is no fixing, there is only acceptance. Since most of these files are in SIP protected areas of macOS so it will take an act of God, a well filed bug report, for Apple to correct these in a future update. But they usually only act if there’s a demonstrable bug, in this case it’s more of an observation about the condition of the JSON files than a clear issue. If someone is really bothered, they could disable SIP and go in and fix them, however future macOS updates will likely revert these changes. Since I haven’t tried this, who knows, macOS may get ornery and trip OS integrity protections, rendering your system un-bootable, YMMV πŸ€·β€β™‚οΈ. The real question is if they impact how macOS works and I’m guessing they don’t. The apps and frameworks which use these off-spec JSON file must be doing something to correct the issues before parsing. Rest assured though, if you attempt to parse one of these files with jpt there will be no issues, it Just Worksβ„’. However if you would like to know it did something, use -c to “complain” about comments, commas, JSON Lines, and hard wrapping. It will send those notices to stderr and exit with a status of 1.

jpt: malformed JSON? Bring it on.

So after all this I don’t have a good example page for jpt’s multitude of features and usage (yet) but what I do have is a JSON Power Tool with more “dummy-proof” features than it did before. I’ll take that as a win! Check out the latest release of jpt at my Github.

jpt (JSON Power Tool) is a non-compiled tool that can manipulate and query JSON in a variety of ways, it runs on macOS (10.4 – 11.0) PPC/Intel/Apple Silicon; Linux with jsc; Windows with Linux subsystem for Windows and jsc. It can be used standalone or embedded within your shell scripts. I hope you find it useful. Thanks for reading!