LandKit

Schema Markup Validator

Fetch any page, extract every JSON-LD schema block, and check for missing required fields and broken syntax that block rich results.

Why schema validation matters

1. Schema errors silently break rich results

Google does not warn you when a single missing field knocks your page out of star ratings, FAQ snippets, or product carousels. The eligibility check happens server-side and you only notice when traffic drops.

2. AI engines like ChatGPT use schema to understand pages

Generative engines parse JSON-LD to ground their answers. Pages with clean Article, Product, or FAQ schema get cited more often than pages without structured data.

3. Google's own testing tool only shows samples

The Rich Results Test only validates a few schema types. Our validator inspects every JSON-LD block on the page, including Organization, BreadcrumbList, WebSite, and custom types.

Deep dive

How to debug structured data in 2026: a schema validator workflow that catches what Google misses

By Nikhil Kumar, Founder of LandKit. Last updated May 2026.

You shipped a schema. The Rich Results Test went green. Six weeks later, zero rich snippets. Sound familiar?

A schema validator only does its job if you understand what it does not check. The Google Rich Results Test confirms eligibility for one set of Google features. The schema.org markup validator confirms vocabulary compliance. Neither one tells you whether your JSON-LD will earn rich snippets in production, get lifted into a Perplexity answer, or survive the next CMS update. In a 2026 audit of 5,000 production sites, Digital Applied found 71% of sites deploy schema, but only 22% pass the Rich Results Test cleanly across every detected type. The gap is the entire game.

Why does my schema pass the Rich Results Test but I still get no rich snippets?

Passing the Rich Results Test confirms your page is eligible for a rich result, not that Google will show one. Google still applies content-quality, trust, and competitive ranking filters after eligibility passes. According to the official Search Console documentation, the test verifies that the structured data is correctly formatted, not that the page will be selected for display. Eligibility is a floor, not a guarantee.

The Rich Results Test runs on a single URL. Google's actual decision to show a rich snippet depends on the whole site.

I have watched valid Article schema sit dormant for two months on a brand-new domain because Google had not yet built enough trust to surface the rich result.

The other big trap: the Rich Results Test only validates schema types Google supports for its own SERP features. If you marked up a Service schema for a B2B agency page, the tool will say "no items detected" even though the JSON-LD is technically perfect. That is not a bug. Google does not show a rich snippet for Service, so the tool does not check it.

Three signals that you have a non-validation problem:

  • The same template works on older pages but not new ones (trust and crawl budget)
  • Search Console "Rich result" status report shows the page as valid but never gets impressions (algorithmic suppression)
  • The page outranks competitors but they get the rich snippet (page-content quality mismatch)

In each case the fix sits outside your JSON-LD.

What are the most common JSON-LD errors I see in 2026?

The Digital Applied 2026 audit broke schema failures into five patterns that together account for 90% of errors: missing required properties (38%), invalid ISO-8601 dates (24%), wrong @type for the page content (12%), missing image dimensions (9%), and duplicate @id values (7%). Most of these are fixable in under five minutes once a schema validator flags them. The hard part is catching them before you ship.

Here is my live error checklist when I am debugging a broken page.

1. Missing required fields. Article needs headline, image, datePublished, author, and publisher. Drop one and Google's Rich Results Test flags the entire item invalid.

2. Wrong date format. Dates must be ISO-8601 (2026-05-07 or full datetime with timezone). May 7, 2026 is not valid. Neither is 5/7/26.

3. Camel case violations. It is datePublished, not date_published or DatePublished. Schema.org property names are exact.

4. Nested objects flattened to strings. publisher must be an Organization object with its own @type and a nested logo ImageObject. Setting "publisher": "LandKit" silently fails Rich Results eligibility even though the JSON parses.

5. Wrong @type for the page. Putting Article schema on a homepage is the most common version of this. Putting Recipe schema on a woodworking tutorial is the most-cited version inside Google's structured data policies.

6. Invisible content. Marking up a price, rating, or author who is not visible to a human reader violates Google's policy and can trigger a manual action. Quoting Google's docs directly: "Don't mark up content that is not visible to readers of the page."

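The first four are syntax and structure, and a validator catches them. The fifth and sixth are about whether the markup matches the page. A validator will happily pass Recipe schema on a woodworking tutorial, so those two get caught by a manual review of your own page.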
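The syntax-level items in the checklist above can be sketched as a small linter. This is a minimal illustration, not how any particular validator works: the function names are hypothetical, the required-field list simply mirrors item 1, and it only covers missing fields, date format, property-name casing, and a string publisher.

```python
import json
from datetime import date, datetime

# Assumption: this list mirrors checklist item 1 above, nothing more.
ARTICLE_REQUIRED = ("headline", "image", "datePublished", "author", "publisher")


def is_iso8601(value):
    """Return True for an ISO-8601 date or datetime string such as 2026-05-07."""
    if not isinstance(value, str):
        return False
    # Older Python versions reject a trailing "Z", so normalise it first.
    candidate = value[:-1] + "+00:00" if value.endswith("Z") else value
    for parser in (date.fromisoformat, datetime.fromisoformat):
        try:
            parser(candidate)
            return True
        except ValueError:
            continue
    return False


def lint_article(node):
    """Return a list of human-readable problems for one Article node."""
    problems = []
    for field in ARTICLE_REQUIRED:
        if field not in node:
            problems.append(f"missing required field: {field}")
    # snake_case or capitalised keys are usually typos of schema.org names.
    for key in node:
        if not key.startswith("@") and ("_" in key or key[:1].isupper()):
            problems.append(f"suspicious property name: {key}")
    if "datePublished" in node and not is_iso8601(node["datePublished"]):
        problems.append(f"datePublished is not ISO-8601: {node['datePublished']!r}")
    publisher = node.get("publisher")
    if publisher is not None and not isinstance(publisher, dict):
        problems.append("publisher is not an object, expected an Organization")
    return problems


# Hypothetical broken markup combining errors 1, 3, and 4 from the checklist.
broken = json.loads("""{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example",
  "date_published": "May 7, 2026",
  "publisher": "LandKit"
}""")
for problem in lint_article(broken):
    print(problem)
```

A check like this cannot tell whether the @type fits the page or whether the marked-up content is visible, so it complements a manual review rather than replacing one.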

How do I actually validate JSON-LD before I ship?

My pre-ship validation workflow has five steps and it takes about ten minutes per template. Validate the raw JSON, validate against schema.org vocabulary, validate against Google's eligibility rules, view-source on a staging URL to confirm the JSON-LD ships in the initial HTML, and finally inspect the live URL inside Google Search Console after deploy. Skip any step and you ship blind. The cost of skipping is measured in weeks of lost rich-result eligibility, not minutes.

Here is the literal sequence I run before pushing any new template.

Step 1: paste-and-validate the JSON. Strip the schema out of the page and run it through a schema validator or jsonlint. This catches missing commas, unescaped quotes, and trailing commas.

Step 2: schema.org vocabulary check. Run the same JSON through validator.schema.org. This catches invalid property names, wrong data types, and any vocabulary you invented. A 2024 Search Engine Journal breakdown listed mixed encoding formats and duplicate aggregateRating itemprops as the two errors that survive most internal QA.

Step 3: Google eligibility check. Paste the page URL or the rendered HTML into the Rich Results Test. This is the only tool that tells you which Google rich-result features your markup is eligible for.

Step 4: view-source the staging URL. Load the staging URL, then open it with view-source: in your browser. The JSON-LD must be present in the raw HTML response. If it only appears after JavaScript runs, you have a rendering problem (covered below).

Step 5: post-deploy URL Inspection in Search Console. After deploy, run URL Inspection on the live URL inside Google Search Console. This shows what Googlebot actually crawled. If the rendered HTML does not match what your validator saw, your structured data is not in production no matter what the test tools say.

The whole workflow is boring on purpose. Boring catches the bugs that cost you six weeks of debugging.
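Steps 1 and 4 can be partly automated. The sketch below, using only the Python standard library, pulls every application/ld+json block out of a raw HTML string and tries to parse each one. The class and function names are hypothetical, and the sample HTML is made up; in practice the raw_html string would be the unrendered response body from your staging URL.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_jsonld(raw_html):
    """Return (parsed_blocks, errors) for the JSON-LD found in raw HTML."""
    parser = JsonLdExtractor()
    parser.feed(raw_html)
    parser.close()
    parsed, errors = [], []
    for index, text in enumerate(parser.blocks):
        try:
            parsed.append(json.loads(text))
        except json.JSONDecodeError as exc:
            errors.append(f"block {index}: {exc.msg} (line {exc.lineno})")
    return parsed, errors


# Made-up page: one valid block, one with a trailing comma.
sample_html = """
<html><head>
<script type="application/ld+json">{"@type": "WebSite", "name": "Example"}</script>
<script type="application/ld+json">{"@type": "Article", "headline": "Hi",}</script>
</head><body></body></html>
"""
blocks, problems = extract_jsonld(sample_html)
print(len(blocks), "valid block(s)")
for problem in problems:
    print(problem)
```

Because the parser never executes JavaScript, an empty result on a page that shows schema in the browser points at the rendering problem described in step 4.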

Schema markup validator vs Google Rich Results Test: which one do I actually need?

You need both, and they answer different questions. The Schema Markup Validator is hosted by schema.org, the vocabulary project founded by Google, Microsoft, Yahoo, and Yandex. It checks whether your JSON-LD complies with the schema.org vocabulary, with no opinion on whether Google will use it. The Google Rich Results Test only checks the subset of types that trigger Google's own SERP features. According to Google's official structured data documentation, Google explicitly recommends starting with the Rich Results Test for eligibility, then using the Schema Markup Validator for general syntax compliance.

The history matters. Google announced the retirement of the original Structured Data Testing Tool in 2020, then moved its general-purpose schema.org validation to validator.schema.org in 2021. The Rich Results Test became the eligibility-only successor.

Here is how I decide which one to use.

Validation question | Use this tool | What it tells you
Will Google show this as a rich snippet? | Google Rich Results Test | Eligibility for Google SERP features
Is my JSON-LD valid against the schema.org vocabulary? | validator.schema.org | Syntax + vocabulary compliance
Did Bing index my schema correctly? | Bing URL Inspection (Webmaster Tools) | Bing-specific crawl + parse
Will my markup survive in production after JS renders? | Search Console URL Inspection | What Googlebot actually saw
Does my schema match the visible page content? | Manual side-by-side review | Policy compliance, manual-action risk

If you are short on time, run the Rich Results Test first. It is the only one that connects directly to traffic outcomes. Then run validator.schema.org for any schema type Google does not show as a rich result (Service, LocalBusiness variants, custom types). That second tool is the only way to validate non-Google-supported schema, which matters more in 2026 than it did in 2022 because of AI citations.

How does schema markup actually affect AI citations in ChatGPT and Perplexity?

Schema markup increases the rate at which AI engines lift your page into generated answers, but the effect is uneven by schema type and platform. The Digital Applied 2026 audit found a +0.34 Pearson correlation between valid schema and AI-citation rate across 5,000 sites, with Article plus BreadcrumbList producing a +47% citation lift versus a no-schema baseline and Product plus Offer producing +29% on commercial queries. Reuters and Search Engine Land have both noted the relationship is correlational, not causal: schema is one quality signal among many.

The skeptical version: a December 2024 Search/Atlas study found no correlation between raw schema coverage and citation rates. Comprehensive schema did not consistently outperform minimal schema in their dataset.

I read these together as a permission slip to focus on quality over coverage. Mark up the entities and facts that genuinely describe the page, with valid syntax and visible content. Do not pile on twelve schema types because more is better. It is not.

The platform-by-platform pattern I have seen inside LandKit's AI-citation tracking, across about 1,200 monitored brand queries on ChatGPT, Claude, Perplexity, and Gemini AI Overviews:

  • FAQPage schema pages get cited about 3x as often on conversational queries as the same content without schema
  • Article plus Person author schema shows up more often in Perplexity citations than Article alone
  • Organization plus sameAs arrays linking to LinkedIn, Crunchbase, and Wikidata reads as more authoritative to Claude and Gemini

That third one is the biggest quick win for any brand starting from zero. The sameAs array is how AI engines disambiguate your company from another company with the same name.
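For reference, an Organization entity with a sameAs array might be built like this. It is only a sketch: the helper name is hypothetical, and every URL below is a placeholder, not a real profile.

```python
import json


def organization_jsonld(name, url, same_as):
    """Build an Organization node with a de-duplicated sameAs array."""
    # dict.fromkeys keeps the first occurrence of each link and preserves order.
    profiles = list(dict.fromkeys(link.strip() for link in same_as if link.strip()))
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": profiles,
    }


# Placeholder values; swap in your own profile URLs.
org = organization_jsonld(
    "Example Co",
    "https://www.example.com/",
    [
        "https://www.linkedin.com/company/example-co",
        "https://www.crunchbase.com/organization/example-co",
        "https://www.wikidata.org/wiki/Q0",
        "https://www.linkedin.com/company/example-co",  # duplicate, dropped
    ],
)
print(json.dumps(org, indent=2))
```

Each sameAs link should point at a profile that is actually about the same company, since the point of the array is disambiguation.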

Why does my schema break when I render it with JavaScript or Google Tag Manager?

JSON-LD injected by JavaScript renders fine for the Rich Results Test but can fail in production indexing because Google processes JavaScript pages in three phases: crawling, rendering, and indexing. Pages frequently get indexed after the first-pass HTML crawl, before the rendering queue catches up. If your schema only exists after JavaScript executes, the indexed version of the page contains no schema at all. Quoting Google's JavaScript SEO documentation directly: "When using structured data on your pages, you can use JavaScript to generate the required JSON-LD and inject it into the page. Make sure to test your implementation to avoid issues."

Three failure modes I see constantly with JS-rendered schema.

Mode 1: GTM tag fires after document complete. The schema arrives in the rendered DOM but not the initial HTML. The Rich Results Test renders the page and sees it. The problem is the indexing queue: Google often indexes the first-pass HTML and does not wait for the second pass.

Mode 2: SSR mismatched with client-side hydration. Next.js or Nuxt SSR injects valid JSON-LD on the server, then a client-side hook unmounts and re-mounts it differently. The result is duplicate or contradictory schema in the DOM.

Mode 3: GTM blocked by robots.txt or trigger conditions excluding bots. A too-aggressive robots.txt rule blocks the JS bundle. Googlebot cannot execute the script. Schema never appears.

The fix for all three is the same: render the JSON-LD server-side, in the initial HTML response, where possible. Use view-source: on the live URL and search the raw response for application/ld+json. If it is not there in the initial HTML, you have a rendering problem regardless of what any test tool says.
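Rendering server-side mostly means serialising the schema into a script tag while the HTML is being built. Here is a minimal, framework-agnostic sketch; the function name is hypothetical, and how the returned string reaches your template depends on your stack.

```python
import json


def render_jsonld_script(data):
    """Serialise a dict into a <script type="application/ld+json"> tag."""
    payload = json.dumps(data, ensure_ascii=False, separators=(",", ":"))
    # Escape "<" so a string value containing "</script>" cannot close the
    # tag early. "\u003c" is a valid JSON escape and decodes back to "<".
    payload = payload.replace("<", "\\u003c")
    return f'<script type="application/ld+json">{payload}</script>'


article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Why a stray </script> breaks pages",
    "datePublished": "2026-05-07",
}
tag = render_jsonld_script(article)
print(tag)

# The same check as view-source: the marker must be in the server response.
assert "application/ld+json" in tag
```

Because the tag is part of the initial response, it is present whether or not the crawler ever runs your JavaScript.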

If you must inject schema client-side, validate after every deploy with Search Console URL Inspection and watch the rendered HTML tab specifically.

What schema failures damage AI citations even when Google says everything is fine?

The schema errors that pass Google's validator but quietly hurt AI citations are the ones that violate vocabulary semantics or strand your page outside the entity graph. AI engines cross-reference your sameAs array against Wikidata and LinkedIn, your author against published Person entities, and your mainEntityOfPage against the rest of your site graph. A 2025 Schema App analysis found that pages with disconnected entities (no sameAs, no mainEntityOfPage, no BreadcrumbList) saw lower retrieval frequency in AI answers even when their per-page schema validated cleanly.

Five silent failures I check for after every schema deploy.

Empty or missing sameAs arrays on Organization and Person entities. This is the single biggest miss. AI engines need at least 2-3 authoritative profile URLs to confirm "this person/org is the same one I already know about."

mainEntityOfPage pointing to the wrong canonical. If your Article says it is the main entity of https://site.com/page but your canonical tag says https://www.site.com/page/, the entity graph fractures. Run a canonical tag checker at the same time you validate schema.

BreadcrumbList missing or pointing to a flat structure. Crumbs are how AI engines understand your site's information architecture. A page with no breadcrumb schema looks like an island.

author set as a string, not a Person object. "author": "Nikhil Kumar" is technically valid. {"@type": "Person", "name": "Nikhil Kumar", "url": "...", "sameAs": [...]} is what gets your byline lifted into AI answers as a credentialed source.

Schema describing different content than the visible page. Google calls this a policy violation. AI engines simply ignore it because their cross-encoder rerankers detect the mismatch and downweight the chunk.

None of these will fail the Rich Results Test. All of them measurably hurt your AI citation rate over 60-90 days.
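Two of these checks are easy to script: the mainEntityOfPage-versus-canonical comparison and the author checks. The sketch below is illustrative only. The function names are hypothetical, the two-link minimum is taken from the 2-3 figure above, and the URL comparison is deliberately strict so that a www or trailing-slash difference is flagged. The BreadcrumbList and visible-content checks need the whole page and are left out.

```python
def as_list(value):
    """JSON-LD allows a single value or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def audit_entity_links(node, canonical_url, min_profiles=2):
    """Return problems linking one Article node to the wider entity graph."""
    problems = []

    # mainEntityOfPage may be a plain URL or an object carrying an @id.
    main = node.get("mainEntityOfPage")
    main_id = main.get("@id") if isinstance(main, dict) else main
    if not isinstance(main_id, str):
        problems.append("mainEntityOfPage is missing or has no @id")
    elif main_id.strip() != canonical_url.strip():
        problems.append(
            f"mainEntityOfPage {main_id!r} does not match canonical {canonical_url!r}"
        )

    for author in as_list(node.get("author")):
        if isinstance(author, str):
            problems.append(f"author {author!r} is a string, not a Person object")
        elif isinstance(author, dict):
            if len(as_list(author.get("sameAs"))) < min_profiles:
                problems.append(
                    f"author {author.get('name')!r} has fewer than "
                    f"{min_profiles} sameAs links"
                )
    return problems


# The mismatch from the example above: no www and no trailing slash.
article = {
    "@type": "Article",
    "mainEntityOfPage": "https://site.com/page",
    "author": "Nikhil Kumar",
}
for problem in audit_entity_links(article, "https://www.site.com/page/"):
    print(problem)
```

The canonical URL is passed in as an argument here; in a real audit it would come from the page's canonical tag.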

What should I monitor in Search Console after I ship new schema?

Search Console's enhancement reports show, by rich-result type, the count of valid items, items with errors, items with warnings, and impressions tied to each. The single most important post-deploy metric is the delta between "valid items" and "valid items with warnings" because warnings often degrade silently into errors as Google's vocabulary tightens. According to Google's official rich-result documentation, warnings indicate that the property is not required but its absence reduces the chance of getting a rich result. Treat them as soft errors and fix them within 30 days.

I monitor four reports inside Search Console after every schema change.

The "Enhancement" reports for each rich-result type the site is eligible for. Filter to "with warnings" and resolve those first.

The "Pages" report filtered by "Crawled, currently not indexed." If a new schema page sits there 14+ days, you have an indexing problem that will block rich-result eligibility.

The URL Inspection tool, run manually on each new template after deploy. The "View crawled page" tab shows what Googlebot saw, which is the only ground truth that matters.

The "Performance" report filtered by "Search appearance" to compare rich-result impressions and clicks against the period before deploy.

For non-Google traffic, also check Bing Webmaster Tools' URL inspection. It surfaces markup errors specific to Bing's parser, which matters because Bing is the live search index that powers ChatGPT browsing.

Frequently asked questions

Why does my schema pass the Rich Results Test but I still don't get rich snippets?

Passing means eligible, not selected. Google applies content-quality, site-trust, and competitive ranking filters on top of eligibility. A new domain can wait 4-8 weeks before Google trusts the site enough to display rich snippets even when every validator returns green. Check the Search Console rich-result status report. If the page is valid but has zero impressions for 30+ days, the issue is trust or competition, not your JSON-LD.

What is the difference between Schema Markup Validator and Google Rich Results Test?

Schema Markup Validator (validator.schema.org) checks whether your JSON-LD complies with the schema.org vocabulary across all schema types. It is hosted by schema.org, the project founded by Google, Microsoft, Yahoo, and Yandex. Google Rich Results Test only validates the schema types that trigger Google's own SERP features, ignoring everything else. Use the Rich Results Test for Google eligibility and the Schema Markup Validator for general syntax and any non-Google-supported types like Service or LocalBusiness variants.

Does FAQ schema still help SEO in 2026?

FAQ rich results were restricted in August 2023 to "well-known, authoritative government and health websites" only, per Google's official announcement on Search Central Blog. For everyone else, FAQ rich snippets stopped showing in Google search. The schema itself is still worth deploying because Frase and other studies show FAQPage schema gets cited at higher rates inside ChatGPT, Perplexity, and Gemini AI Overviews. The Google rich result is gone. The AI-citation lift is real.

How do I validate JSON-LD that's injected by Google Tag Manager?

Run the Rich Results Test in URL mode (not Code Snippet mode) so it executes JavaScript before parsing. Then run Search Console's URL Inspection on the live URL after deploy and check the "View crawled page" tab to see what Googlebot indexed. If the rendered HTML contains the schema but the indexed HTML does not, your GTM tag fires too late or your schema is being injected after Google indexes the first-pass HTML. Move the schema server-side to fix it.

Can a schema validator catch invisible-content policy violations?

No. A schema validator only checks JSON syntax and vocabulary compliance. It cannot tell whether the content described by your JSON-LD is actually visible to a human reader on the page. Google's structured data policies explicitly require that marked-up content also be visible. Catching this requires a manual side-by-side review of your JSON-LD and your rendered page. If your aggregateRating is in the schema but not on the page, you risk a manual action that revokes rich-result eligibility for the entire site.

What's the fastest schema validator workflow if I'm shipping a new template today?

Five steps in this order: paste the JSON-LD into a syntax validator to catch missing commas, unescaped quotes, and trailing commas, run it through validator.schema.org for vocabulary compliance, run the page URL through Google's Rich Results Test for eligibility, view-source the staging URL to confirm the JSON-LD ships in the initial HTML response, and run Search Console URL Inspection on the live URL after deploy. Total time: about 10 minutes. Skipping any step is how you ship broken schema and lose six weeks finding out.

Ship the validation, not the prayer

Schema validation is not a one-time pre-flight check. It is a deployment habit, the same way you would not push code without running tests.

Run validation on every template change, not just the first deploy. CMS updates, theme tweaks, and plugin upgrades silently break schema. The Digital Applied audit found that sites that skipped continuous validation regressed to a broken state within 12 months of initial deployment.

Build a 10-minute pre-ship checklist. Run a schema validator and a schema markup generator on the same template, then pair them with a canonical tag checker and an XML sitemap validator. Together they catch most of the on-page technical issues that block both rich snippets and AI citations. Run the same workflow every quarter on your highest-traffic templates. The schema you shipped six months ago is probably not the schema Google is parsing today.

Nikhil Kumar is the founder of LandKit, the SEO and AI visibility growth OS that tracks brand mentions across ChatGPT, Claude, Gemini, and Perplexity. He builds tools that help solo founders and small teams compete with better-resourced SEO competitors. Connect on LinkedIn at https://www.linkedin.com/in/nikhonit/.