Module tokenizer

Strict incremental JSON tokenizer.

The tokenizer accepts byte chunks through Tokenizer::write and emits borrowed Token views into the buffered input. Whitespace and punctuation are emitted as tokens as well, so that an identity sink can reproduce the original byte stream exactly.
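The round-trip property can be illustrated with a small standalone sketch. It does not use this module's API: `SketchToken`, `sketch_tokens`, and `identity_sink` are hypothetical names invented for the example, the input is a complete slice rather than incremental chunks, and no strict validation is performed. The only point is that when every byte belongs to exactly one borrowed token, concatenating the tokens gives back the input.

```rust
/// Hypothetical token type for this sketch; not the module's real `Token`.
#[derive(Debug, PartialEq, Clone, Copy)]
enum SketchToken<'a> {
    Whitespace(&'a [u8]),
    Punct(u8),
    /// A string including its surrounding quotes.
    String(&'a [u8]),
    /// A number, `true`, `false`, or `null` (not validated here).
    Scalar(&'a [u8]),
}

fn is_ws(b: u8) -> bool {
    matches!(b, b' ' | b'\t' | b'\n' | b'\r')
}

fn is_punct(b: u8) -> bool {
    matches!(b, b'{' | b'}' | b'[' | b']' | b':' | b',')
}

/// Splits `input` into tokens that borrow from it and cover every byte.
fn sketch_tokens(input: &[u8]) -> Vec<SketchToken<'_>> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let start = i;
        let first = input[i];
        if is_ws(first) {
            while i < input.len() && is_ws(input[i]) {
                i += 1;
            }
            out.push(SketchToken::Whitespace(&input[start..i]));
        } else if is_punct(first) {
            i += 1;
            out.push(SketchToken::Punct(first));
        } else if first == b'"' {
            i += 1;
            while i < input.len() && input[i] != b'"' {
                // A backslash escapes the following byte.
                i += if input[i] == b'\\' { 2 } else { 1 };
            }
            // Include the closing quote, clamped for unterminated input.
            i = (i + 1).min(input.len());
            out.push(SketchToken::String(&input[start..i]));
        } else {
            while i < input.len()
                && !is_ws(input[i])
                && !is_punct(input[i])
                && input[i] != b'"'
            {
                i += 1;
            }
            out.push(SketchToken::Scalar(&input[start..i]));
        }
    }
    out
}

/// Identity sink: writes every token back out unchanged.
fn identity_sink(tokens: &[SketchToken<'_>]) -> Vec<u8> {
    let mut out = Vec::new();
    for token in tokens {
        match *token {
            SketchToken::Whitespace(bytes)
            | SketchToken::String(bytes)
            | SketchToken::Scalar(bytes) => out.extend_from_slice(bytes),
            SketchToken::Punct(byte) => out.push(byte),
        }
    }
    out
}
```

Because the tokens only borrow slices of the input, the sink copies bytes once, when it writes them out. A sink that dropped `Whitespace` tokens instead would act as a minifier.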

The shape follows JSON’s small grammar, shown here in an ASCII form inspired by the diagrams at https://www.json.org/json-en.html:

json
  -> ws value ws

value
  -> object
  -> array
  -> string
  -> number
  -> "true"
  -> "false"
  -> "null"

object
  -> "{" ws "}"
  -> "{" members "}"

members
  -> member
  -> member "," members

member
  -> ws string ws ":" element

array
  -> "[" ws "]"
  -> "[" elements "]"

elements
  -> element
  -> element "," elements

element
  -> ws value ws
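To show how the productions map onto code, here is a minimal recursive-descent recognizer with one method per rule. It is a sketch under stated assumptions, not this module's implementation: `Cursor` and `is_json` are hypothetical names, the input must be complete rather than chunked, the `string` and `number` rules (which the grammar above leaves undefined) are simplified, and escape sequences and control characters inside strings are not validated.

```rust
/// Hypothetical cursor over a complete input; not part of this module.
struct Cursor<'a> {
    bytes: &'a [u8],
    pos: usize,
}

impl<'a> Cursor<'a> {
    fn peek(&self) -> Option<u8> {
        self.bytes.get(self.pos).copied()
    }
    fn eat(&mut self, b: u8) -> bool {
        let hit = self.peek() == Some(b);
        if hit {
            self.pos += 1;
        }
        hit
    }
    fn eat_lit(&mut self, lit: &[u8]) -> bool {
        let hit = self.bytes.get(self.pos..).map_or(false, |r| r.starts_with(lit));
        if hit {
            self.pos += lit.len();
        }
        hit
    }
    fn ws(&mut self) {
        while matches!(self.peek(), Some(b' ' | b'\t' | b'\n' | b'\r')) {
            self.pos += 1;
        }
    }
    fn digits(&mut self) -> bool {
        let start = self.pos;
        while matches!(self.peek(), Some(b'0'..=b'9')) {
            self.pos += 1;
        }
        self.pos > start
    }
    // element -> ws value ws
    fn element(&mut self) -> bool {
        self.ws();
        let ok = self.value();
        self.ws();
        ok
    }
    // value -> object | array | string | number | "true" | "false" | "null"
    fn value(&mut self) -> bool {
        match self.peek() {
            Some(b'{') => self.object(),
            Some(b'[') => self.array(),
            Some(b'"') => self.string(),
            Some(b't') => self.eat_lit(b"true"),
            Some(b'f') => self.eat_lit(b"false"),
            Some(b'n') => self.eat_lit(b"null"),
            _ => self.number(),
        }
    }
    // object -> "{" ws "}" | "{" members "}"
    fn object(&mut self) -> bool {
        self.pos += 1; // consume "{"
        self.ws();
        if self.eat(b'}') {
            return true;
        }
        loop {
            // member -> ws string ws ":" element
            self.ws();
            if !self.string() {
                return false;
            }
            self.ws();
            if !self.eat(b':') || !self.element() {
                return false;
            }
            // members -> member | member "," members
            if !self.eat(b',') {
                return self.eat(b'}');
            }
        }
    }
    // array -> "[" ws "]" | "[" elements "]"
    fn array(&mut self) -> bool {
        self.pos += 1; // consume "["
        self.ws();
        if self.eat(b']') {
            return true;
        }
        loop {
            // elements -> element | element "," elements
            if !self.element() {
                return false;
            }
            if !self.eat(b',') {
                return self.eat(b']');
            }
        }
    }
    // Simplified: any byte may follow a backslash.
    fn string(&mut self) -> bool {
        if !self.eat(b'"') {
            return false;
        }
        while let Some(b) = self.peek() {
            self.pos += 1;
            match b {
                b'"' => return true,
                b'\\' => self.pos += 1,
                _ => {}
            }
        }
        false
    }
    // "-"? ("0" | nonzero digits) fraction? exponent?
    fn number(&mut self) -> bool {
        self.eat(b'-');
        if !self.eat(b'0') && !self.digits() {
            return false;
        }
        if self.eat(b'.') && !self.digits() {
            return false;
        }
        if self.eat(b'e') || self.eat(b'E') {
            let _sign = self.eat(b'+') || self.eat(b'-');
            return self.digits();
        }
        true
    }
}

/// json -> ws value ws, with no trailing bytes allowed.
fn is_json(input: &[u8]) -> bool {
    let mut cursor = Cursor { bytes: input, pos: 0 };
    cursor.element() && cursor.pos == input.len()
}
```

The right-recursive `members` and `elements` rules are written as loops, so a long array does not deepen the call stack; only nesting does. A trailing comma such as `[1,]` is rejected because a further `element` is required after every `","`.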

Structs

JsonNumber
Borrowed JSON number token.
JsonString
Borrowed JSON string token.
Tokenizer
Incremental strict JSON tokenizer.

Enums

Token
A borrowed JSON token.

Constants

DEFAULT_MAX_BUFFERED_BYTES
Default maximum buffered JSON input before tokenization must make progress.

Traits

TokenSink
Consumes JSON tokens emitted by Tokenizer.

Functions

tokenize
Tokenizes a complete byte slice.