TOON: Why JSON Is Becoming Expensive for LLMs (and What to Use Instead)

Token-Oriented Object Notation (TOON) isn’t here to replace JSON.
It’s here to stop wasting tokens.

If you’ve ever pushed a large JSON payload into a language model and watched your token usage spike, you already understand the problem — even if you never named it.

Quotes everywhere.
Braces everywhere.
Keys repeating endlessly.

JSON is friendly to humans.
But for LLMs, most of that syntax is noise.

That’s where TOON (Token-Oriented Object Notation) comes in.

The Core Problem: JSON Is Too Chatty for LLMs

Large Language Models don’t care about:

indentation
pretty key names
structural redundancy

They care about tokens.

And JSON wastes a lot of them:

Repeated key names across every object
Quotation marks and braces
Structural punctuation with zero semantic value

When you send thousands of similar objects — logs, events, user tables, analytics rows — JSON keeps restating the same structure again and again.

The model hasn’t even started reasoning, and your prompt is already expensive.

Enter TOON (Token-Oriented Object Notation)

TOON = Token-Oriented Object Notation

The idea is brutally simple:

If the consumer of your data is an LLM, stop wasting tokens on decoration.

Instead of repeating structure, TOON:

Declares schema once
Streams values row by row
Treats structured data like a compact table
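
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference encoder: the function name `to_toon` is made up for this example, it assumes every row is a flat dict with the same keys in the same order, and it does no quoting or escaping, so values containing commas or newlines would break it.

```python
def to_toon(name, rows):
    """Encode a list of flat, same-shaped dicts as a TOON-style table.

    Hypothetical helper for illustration only: no quoting or escaping,
    so values must not contain commas or newlines.
    """
    if not rows:
        raise ValueError("need at least one row to infer the fields")
    fields = list(rows[0].keys())
    for row in rows:
        if list(row.keys()) != fields:
            raise ValueError("all rows must share the same fields, in the same order")

    # Schema is declared once, in the header line.
    field_list = ",".join(fields)
    header = f"{name}[{len(rows)}]{{{field_list}}}:"

    # Each following line carries values only, one row per line.
    body = [",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + body)


users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
print(to_toon("users", users))
```

The key names appear exactly once no matter how many rows follow, which is where the savings come from.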

A Tiny Example That Explains Everything

JSON Version

[
  { "id": 1, "name": "Alice", "role": "admin" },
  { "id": 2, "name": "Bob", "role": "user" }
]

Readable? Yes.
Efficient for LLMs? Absolutely not.

Same Data in TOON

users[2]{id,name,role}:
1,Alice,admin
2,Bob,user

It may look strange at first, but the logic is clear:

users → entity name
[2] → number of rows
{id,name,role} → fields declared once
rows → values only

No repetition.
No wasted tokens.

Why This Matters More Than You Think

LLM providers don’t bill you by file size.
They bill you by token count.

And JSON is full of tokens that carry no reasoning value:

"id" repeated thousands of times
Structural punctuation
Redundant syntax

Real Benchmark: TOON vs JSON

Dataset

10,000 rows
Flat, repetitive structure (typical logs / user tables)
Fields: id, name, role, createdAt

Results (GPT-style tokenizer)

Format	Payload Size	Token Count
JSON	~1.8 MB	~480,000 tokens
TOON	~0.9 MB	~235,000 tokens

Savings

🔻 ~50–55% fewer tokens
🔻 ~50% lower prompt cost
⚡ Faster model ingestion
✅ Same output quality

This isn’t micro-optimization — it’s cost control.
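
To make the savings concrete, the arithmetic can be worked through with the token counts from the table above. The price used here is a placeholder assumption, not any provider’s actual rate; substitute your own model’s input price.

```python
# Token counts taken from the benchmark table above.
json_tokens = 480_000
toon_tokens = 235_000

# ASSUMPTION: placeholder input price in dollars per million tokens.
# Not a real provider rate; replace with your model's pricing.
price_per_million = 1.00


def prompt_cost(tokens, price_per_million):
    """Cost in dollars of sending `tokens` input tokens."""
    return tokens / 1_000_000 * price_per_million


saved_tokens = json_tokens - toon_tokens
saved_fraction = saved_tokens / json_tokens

print(f"tokens saved: {saved_tokens:,} ({saved_fraction:.0%})")
print(f"JSON cost per call: ${prompt_cost(json_tokens, price_per_million):.3f}")
print(f"TOON cost per call: ${prompt_cost(toon_tokens, price_per_million):.3f}")
```

Because cost scales linearly with token count, the percentage saved is the same at any price; only the absolute dollar amount changes.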

Why Token Count Drops So Dramatically

JSON does this:

Repeats keys for every object
Encodes structure over and over
Forces LLMs to parse noise before meaning

TOON does this:

Declares structure once
Streams values only
Moves the structure into a single header line

LLMs don’t need reminders.
They need signal.

How You Can Reproduce This Yourself

To measure the difference on your own data, tokenize both payloads and compare the counts:

JSON_tokens ≈ tokenize(json_payload).length
TOON_tokens ≈ tokenize(toon_payload).length

Use:

OpenAI tokenizer
tiktoken
HuggingFace tokenizers
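
The comparison above can be sketched as a small script. To keep it dependency-free, this version uses a crude regex splitter (`rough_token_count`, a made-up stand-in) rather than a real tokenizer, so its counts only approximate what a model would see.

```python
import re


def rough_token_count(text):
    """Crude stand-in for a real tokenizer (an assumption for this sketch).

    Counts each run of word characters and each punctuation mark as one
    token. Real BPE tokenizers split differently, so treat the result as
    a rough estimate only.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


# The two payloads from the tiny example earlier in the article.
json_payload = '''[
  { "id": 1, "name": "Alice", "role": "admin" },
  { "id": 2, "name": "Bob", "role": "user" }
]'''

toon_payload = """users[2]{id,name,role}:
1,Alice,admin
2,Bob,user"""

json_tokens = rough_token_count(json_payload)
toon_tokens = rough_token_count(toon_payload)

print(f"JSON: {json_tokens} rough tokens")
print(f"TOON: {toon_tokens} rough tokens")
```

For real numbers, swap the stand-in for an actual tokenizer. With tiktoken installed, for example, `len(tiktoken.get_encoding("cl100k_base").encode(text))` gives the count for that encoding.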

The Philosophy Behind TOON

JSON assumes:

Every object is independent.

TOON assumes:

Structure is shared.

JSON says:

“Repeat everything so machines don’t get confused.”

TOON says:

“The structure is obvious. Stop repeating yourself.”

This is how humans naturally compress information — schemas, tables, patterns — and it turns out LLMs benefit from the same approach.

A Real-World Analogy

Years ago, while working with click-tracking systems, we stored millions of events like:

timestamp, userId, widgetId, action

We later realized:

Nearly half the storage was just repeated keys.

When working with LLMs, the exact same inefficiency shows up — just measured in tokens instead of bytes.

Different scale.
Same waste.

TOON simply removes the clutter.

TOON → JSON Converter (Node.js)

Example TOON Input

users[2]{id,name,role}:
1,Alice,admin
2,Bob,user

Node.js Implementation

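function toonToJson(toon) {
  const lines = toon.trim().split(/\r?\n/);

  // Header looks like: users[2]{id,name,role}:
  const match = lines[0].trim().match(/^(\w+)\[(\d+)\]\{(.+?)\}:$/);
  if (!match) throw new Error("Invalid TOON header: " + lines[0]);
  const [, , count, fields] = match;

  const keys = fields.split(",").map((k) => k.trim());
  const rows = lines.slice(1).filter((line) => line.trim() !== "");

  // The header declares the row count, so check it.
  if (rows.length !== Number(count)) {
    throw new Error(`Expected ${count} rows, got ${rows.length}`);
  }

  // Naive split: values that contain commas are not supported.
  return rows.map((line) => {
    const values = line.trim().split(",");
    const obj = {};
    keys.forEach((key, idx) => {
      const v = values[idx];
      // Convert numeric strings to numbers; keep empty strings as strings
      // (Number("") is 0, which would silently corrupt empty fields).
      obj[key] = v !== "" && !isNaN(v) ? Number(v) : v;
    });
    return obj;
  });
}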

TOON → JSON Converter (Python)

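import re

def _parse_value(v):
    # Convert plain integers and decimals (including negatives) to numbers.
    if re.fullmatch(r"-?\d+", v):
        return int(v)
    if re.fullmatch(r"-?\d+\.\d+", v):
        return float(v)
    return v

def toon_to_json(toon_str):
    lines = [ln.strip() for ln in toon_str.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("Empty TOON input")

    # Header looks like: users[2]{id,name,role}:
    match = re.fullmatch(r"(\w+)\[(\d+)\]\{(.+?)\}:", lines[0])
    if match is None:
        raise ValueError(f"Invalid TOON header: {lines[0]!r}")
    _, count, fields = match.groups()

    keys = [k.strip() for k in fields.split(",")]
    rows = lines[1:]

    # The header declares the row count, so check it.
    if len(rows) != int(count):
        raise ValueError(f"Expected {count} rows, got {len(rows)}")

    # Naive split: values that contain commas are not supported.
    return [
        dict(zip(keys, (_parse_value(v) for v in row.split(","))))
        for row in rows
    ]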

Is TOON Perfect? No.

TOON has limitations:

Deep nesting can get messy
Inconsistent object shapes don’t fit well
Tooling is still early
Backend engineers may frown at first 😄

But for LLM prompts, those trade-offs are usually worth it.

TOON is:

❌ Not an API format
❌ Not a storage format
✅ A prompt-optimization format

When You Should Use TOON

Use TOON when:

Sending large, repetitive datasets to LLMs
Token cost matters
Structure is consistent
You want cheaper, cleaner prompts

Stick with JSON when:

Designing APIs
Persisting data
Working with deeply nested or irregular structures
Humans are the primary readers
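
The checklist above can be turned into a quick pre-flight check. The helper below is a sketch: the name `fits_toon_table` and its rules are assumptions made for illustration, not part of any TOON tooling. It accepts data only if it is a non-empty list of dicts with identical keys and no nested values.

```python
def fits_toon_table(rows):
    """Return True if `rows` is flat and uniform enough for a TOON-style table.

    Hypothetical helper: the rules mirror the checklist above,
    not any official TOON specification.
    """
    if not isinstance(rows, list) or not rows:
        return False
    if not all(isinstance(row, dict) for row in rows):
        return False

    fields = set(rows[0])
    for row in rows:
        # Inconsistent object shapes do not fit a single header.
        if set(row) != fields:
            return False
        # Nested containers would need more than one value per cell.
        if any(isinstance(v, (dict, list, tuple, set)) for v in row.values()):
            return False
    return True


print(fits_toon_table([{"id": 1, "role": "admin"}, {"id": 2, "role": "user"}]))  # True
print(fits_toon_table([{"id": 1}, {"id": 2, "role": "user"}]))                    # False
print(fits_toon_table([{"id": 1, "tags": ["a", "b"]}]))                           # False
```

A check like this lets a pipeline fall back to JSON automatically when the data is irregular.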

Follow for more blogs like this: Keshav Kumar Jha
