A Practical Guide to Handling Markdown/MDX Safely with Claude Code

What goes wrong when you treat Markdown as just a string

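A published Markdown/MDX article is more than a draft. It carries frontmatter, an SEO description, a heading hierarchy, auto-generated anchors, code fences, internal links, official external links, multilingual routes, and sometimes raw HTML. If you hand it to Claude Code with nothing more than “clean up this article,” the prose may improve while the slug changes, a CTA disappears, one language file is left as a thin summary, or a code fence breaks.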

The safe principle is clear: let Claude Code edit the prose, and let machines check the structure. Read Markdown and MDX as an AST (abstract syntax tree), validate frontmatter as data, and treat HTML output as an XSS boundary. Check multilingual files as one set.

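The references for this article were checked as of June 2, 2026. For how unified processes content, the unified guide and the syntax trees guide are the baseline. For Markdown parsing, see remark and remark-parse. For MDX syntax, the MDX docs are the official reference. Frontmatter is read with gray-matter, and raw HTML security is checked against both rehype-sanitize and the OWASP XSS Prevention Cheat Sheet. For the scope of Claude Code tasks, the Claude Code overview and settings pages are good references.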

flowchart LR
  A["MDX file"] --> B["frontmatter"]
  B --> C["schema validation"]
  A --> D["remark / MDX AST"]
  D --> E["headings, fences, links"]
  D --> F["rehype HTML pipeline"]
  F --> G["sanitize"]
  C --> H["locale and build checks"]
  E --> H
  G --> H

Pick the parser that fits the job first

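The first instruction to give Claude Code is the toolchain. If you only say “parse the Markdown,” you will most likely get a short regular expression. Regular expressions are useful for finding specific strings, but they are weak at judging the structure of a published article.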
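
To make that weakness concrete, here is a small dependency-free sketch. The sample document and the `countHeadingsOutsideFences` helper are hypothetical and exist only for this illustration. The helper tracks fence state by hand, which shows what the regex misses, but it is not a substitute for `remark-parse`.

```javascript
// A three-backtick fence, built with repeat() so the sample stays readable.
const FENCE = "`".repeat(3);

// Hypothetical sample: one real heading and one "heading" inside a code block.
const sample = [
  "## Real heading",
  "",
  FENCE + "md",
  "## fake heading",
  FENCE,
  "",
  "Body text.",
].join("\n");

// Naive approach: every line that starts with "## " counts as a heading.
const naiveHeadings = sample.match(/^## .+$/gm) ?? [];

// Slightly better: skip lines while inside a fenced block.
// This still ignores indented code, tilde fences, and MDX, which is why
// an AST is the safer tool for real articles.
function countHeadingsOutsideFences(text) {
  let insideFence = false;
  const found = [];
  for (const line of text.split("\n")) {
    if (line.startsWith(FENCE)) {
      insideFence = !insideFence;
      continue;
    }
    if (!insideFence && /^## .+$/.test(line)) found.push(line.slice(3));
  }
  return found;
}

console.log(naiveHeadings.length); // 2: the fake heading is counted
console.log(countHeadingsOutsideFences(sample)); // [ 'Real heading' ]
```

Every edge case the hand-rolled scanner would need to learn is one the parser already handles, which is the practical argument for naming the toolchain up front.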

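Goal	Recommended tool	Risky shortcut
Reading headings, links, and code fences	`remark-parse` plus AST traversal	Searching the source for `^##` only
Processing `.mdx` with JSX	`remark-mdx` or the MDX compiler	Using a Markdown-only parser
Rendering HTML	Hand off to rehype via `remark-rehype`	Assembling HTML as strings
Allowing raw HTML	`rehype-raw` followed by `rehype-sanitize`	Only turning on `allowDangerousHtml`
Reading frontmatter	`gray-matter` plus schema validation	Splitting the YAML by hand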

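The advantage of an AST is that it distinguishes meaning. A ## fake heading inside a code block must not end up in the table of contents. Props on an MDX component need to be treated differently from ordinary links. tags: Claude Code, Markdown is a string, not an array, so it should fail frontmatter validation.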
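
The tags case can be sketched as a plain type check. The `validateTags` function and both sample objects below are hypothetical; they only mimic the shape of data a YAML parser would return for each spelling of the field.

```javascript
// Hypothetical frontmatter objects, shaped like what a YAML parser returns.
// `tags: Claude Code, Markdown` parses as one string, not a list.
const broken = { title: "Post", tags: "Claude Code, Markdown" };
const valid = { title: "Post", tags: ["Claude Code", "Markdown"] };

// Returns a list of problems; an empty list means the tags field is usable.
function validateTags(data) {
  const problems = [];
  if (!Array.isArray(data.tags)) {
    problems.push(`tags must be an array, got ${typeof data.tags}`);
    return problems;
  }
  if (data.tags.length === 0) problems.push("tags must not be empty");
  data.tags.forEach((tag, index) => {
    if (typeof tag !== "string" || tag.trim() === "") {
      problems.push(`tags[${index}] must be a non-empty string`);
    }
  });
  return problems;
}

console.log(validateTags(broken)); // [ 'tags must be an array, got string' ]
console.log(validateTags(valid)); // []
```

Returning a list rather than throwing lets one run report every frontmatter problem at once, which is easier to feed back into a Claude Code task.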

Four use cases that come up in practice

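The first is refreshing a published blog post. You review the title, description, updatedDate, official links, internal links, code examples, and monetization CTA together. Within ClaudeCodeLab, link naturally to internal articles such as CLAUDE.md best practices or Claude Code web scraping, but leave every other slug untouched.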

The second is turning documentation sites and help centers into MDX components. Callouts, tabs, pricing cards, FAQs, and configuration examples fit MDX well, but once Markdown and JSX are mixed, regex-based tools break quickly.

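The third is multilingual publishing. If the Japanese original has real depth while the English, Chinese, Spanish, and Indonesian versions are thin summaries, local SEO and reader trust both suffer. Every locale needs concrete examples, failure modes, copyable code, official links, internal links, a CTA, and verification notes.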
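
One way to spot a thin locale is to compare structural counts against the source language instead of comparing character counts, which vary widely between scripts. The sketch below is a rough heuristic, not an established metric: the counts, the `findThinLocales` name, and the 0.7 threshold are all assumptions made for illustration.

```javascript
// Hypothetical per-locale counts, e.g. collected from an AST pass.
const stats = {
  ja: { headings: 12, codeBlocks: 6, links: 9 },
  en: { headings: 12, codeBlocks: 6, links: 8 },
  id: { headings: 4, codeBlocks: 1, links: 2 },
};

// Flags locales whose structure falls below `minRatio` of the source locale.
// The 0.7 default is an arbitrary assumption; tune it for your own content.
function findThinLocales(allStats, sourceLang, minRatio = 0.7) {
  const source = allStats[sourceLang];
  const thin = [];
  for (const [lang, counts] of Object.entries(allStats)) {
    if (lang === sourceLang) continue;
    for (const key of Object.keys(source)) {
      const value = counts[key] ?? 0; // a missing count is treated as zero
      if (value < source[key] * minRatio) {
        thin.push(`${lang}: ${key} ${value} vs ${source[key]}`);
      }
    }
  }
  return thin;
}

console.log(findThinLocales(stats, "ja"));
```

With the sample data, only the `id` locale is flagged, once for each of the three counts. A passing result does not prove the translation is good; it only rules out the most obvious kind of summary-only file.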

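The fourth is running product and training materials. Gumroad pages, training announcements, free PDF delivery, and email resources often reuse Markdown. The closer a page is to a purchase or a consultation, the more a code fence and link check doubles as a trust check.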

A minimal setup you can copy and run

The following examples assume Node.js 18 or later and ESM. Before adding them to a real repository, try them in a small demo folder first.

mkdir mdx-audit-demo
cd mdx-audit-demo
npm init -y
npm pkg set type=module
npm install unified remark-parse remark-mdx remark-gfm gray-matter
npm install unist-util-visit github-slugger
npm install remark-rehype rehype-raw rehype-sanitize rehype-stringify
mkdir tools

Example 1: Checking frontmatter, headings, code fences, and links

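This script reads frontmatter with gray-matter and parses the body into an AST with remark and the MDX parser. In one pass it validates required fields, description length, and code fence languages, requires at least one internal and one external link, and reports each heading with its slug.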

// tools/audit-mdx.mjs
import fs from "node:fs/promises";
import matter from "gray-matter";
import GithubSlugger from "github-slugger";
import { unified } from "unified";
import remarkParse from "remark-parse";
import remarkMdx from "remark-mdx";
import remarkGfm from "remark-gfm";
import { visit } from "unist-util-visit";

const file = process.argv[2];
if (!file) {
  throw new Error("Usage: node tools/audit-mdx.mjs article.mdx");
}

const source = await fs.readFile(file, "utf8");
const { data, content } = matter(source);
const errors = [];
const links = { internal: [], external: [] };
const headings = [];
const codeBlocks = [];

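// Required string fields. pubDate is checked separately because gray-matter's
// YAML parser turns an unquoted 2026-06-02 into a Date object, not a string.
for (const key of ["title", "description", "heroImage", "lang"]) {
  if (typeof data[key] !== "string" || data[key].trim() === "") {
    errors.push(`frontmatter.${key} is required`);
  }
}

const { pubDate } = data;
const hasPubDate =
  (pubDate instanceof Date && !Number.isNaN(pubDate.getTime())) ||
  (typeof pubDate === "string" && pubDate.trim() !== "");
if (!hasPubDate) errors.push("frontmatter.pubDate is required");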

if ([...String(data.description ?? "")].length > 120) {
  errors.push("description must be 120 characters or fewer");
}

if (!Array.isArray(data.tags) || data.tags.length === 0) {
  errors.push("frontmatter.tags must be a non-empty array");
}

const tree = unified()
  .use(remarkParse)
  .use(remarkMdx)
  .use(remarkGfm)
  .parse(content);

const slugger = new GithubSlugger();

visit(tree, (node) => {
  if (node.type === "heading") {
    const text = plainText(node);
    headings.push({ depth: node.depth, text, slug: slugger.slug(text) });
  }

  if (node.type === "code") {
    codeBlocks.push({ lang: node.lang || "", meta: node.meta || "" });
    if (!node.lang) errors.push("code fence is missing a language");
  }

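  if (node.type === "link") {
    const url = String(node.url || "");
    // Match full schemes only, and keep protocol-relative URLs (//host/...)
    // out of the internal list.
    if (/^https?:\/\//.test(url)) links.external.push(url);
    else if (url.startsWith("/") && !url.startsWith("//")) links.internal.push(url);
  }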
});

if (links.internal.length === 0) errors.push("missing internal link");
if (links.external.length === 0) errors.push("missing external link");

if (errors.length > 0) {
  console.error(errors.map((error) => `- ${error}`).join("\n"));
  process.exit(1);
}

console.log(JSON.stringify({ headings, codeBlocks, links }, null, 2));

function plainText(node) {
  if (typeof node.value === "string") return node.value;
  if (!Array.isArray(node.children)) return "";
  return node.children.map(plainText).join("");
}

Start with a single file. Add the script to CI only after confirming it produces no false positives.

node tools/audit-mdx.mjs site/src/content/blog-ko/example.mdx
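
The description check in the audit script counts characters with `[...text].length` rather than `text.length`. The short sketch below shows why that matters for non-ASCII descriptions; the sample string and the `isWithinLimit` helper are hypothetical.

```javascript
// Hypothetical description that mixes Hangul with an emoji.
const description = "한글 설명 😀";

// .length counts UTF-16 code units, so the emoji counts as 2.
const codeUnits = description.length;

// Spreading iterates by code point, so the emoji counts as 1.
const codePoints = [...description].length;

// Same rule as the audit script: limit by code points, not code units.
function isWithinLimit(text, limit = 120) {
  return [...text].length <= limit;
}

console.log(codeUnits, codePoints); // 8 7
console.log(isWithinLimit(description)); // true
```

Code point counting is still an approximation: emoji built from several code points count more than once, so treat the limit as a guard rail rather than an exact display width.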

Example 2: Safe HTML conversion

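If you do not need raw HTML, not allowing it is the safest option. If you must allow it, parse it into the HTML AST with rehype-raw and immediately pass it through rehype-sanitize so that only what the schema allows remains. Turning on allowDangerousHtml alone is not a security design.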

// tools/markdown-to-safe-html.mjs
import fs from "node:fs/promises";
import { unified } from "unified";
import remarkParse from "remark-parse";
import remarkGfm from "remark-gfm";
import remarkRehype from "remark-rehype";
import rehypeRaw from "rehype-raw";
import rehypeSanitize, { defaultSchema } from "rehype-sanitize";
import rehypeStringify from "rehype-stringify";

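// Pass a Markdown body without frontmatter: this pipeline does not strip it.
const file = process.argv[2];
if (!file) {
  throw new Error("Usage: node tools/markdown-to-safe-html.mjs article.md");
}
const markdown = await fs.readFile(file, "utf8");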
const schema = {
  ...defaultSchema,
  attributes: {
    ...defaultSchema.attributes,
    code: [["className", /^language-/]],
  },
};

const html = await unified()
  .use(remarkParse)
  .use(remarkGfm)
  .use(remarkRehype, { allowDangerousHtml: true })
  .use(rehypeRaw)
  .use(rehypeSanitize, schema)
  .use(rehypeStringify)
  .process(markdown);

console.log(String(html));

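Order is the key. rehype-raw parses raw HTML back into the tree, and rehype-sanitize then removes tags and attributes that are not allowed. Without the second step, dangerous attributes can survive into the rendered output.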
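
To show what "only what the schema allows remains" means, here is a toy allowlist walk over a minimal hast-like tree. This is not how rehype-sanitize is implemented and must not replace it. The `allowed` map, the `sanitizeNode` function, and the sample tree are hypothetical, and the toy skips things a real sanitizer also has to do, such as vetting URL protocols like `javascript:`.

```javascript
// Toy allowlist: tag name -> allowed property names. Hypothetical, and far
// smaller than the default schema that rehype-sanitize ships with.
const allowed = new Map([
  ["p", []],
  ["a", ["href"]],
  ["code", ["className"]],
]);

// Walks a minimal hast-like tree and keeps only allowlisted tags and props.
// Disallowed elements are dropped together with their children.
function sanitizeNode(node) {
  if (node.type === "text") return node;
  if (node.type !== "element" || !allowed.has(node.tagName)) return null;
  const source = node.properties ?? {};
  const properties = {};
  for (const name of allowed.get(node.tagName)) {
    if (source[name] !== undefined) properties[name] = source[name];
  }
  const children = (node.children ?? []).map(sanitizeNode).filter(Boolean);
  return { type: "element", tagName: node.tagName, properties, children };
}

// Sample tree, as raw HTML might look after being parsed into elements.
const tree = {
  type: "element",
  tagName: "p",
  properties: {},
  children: [
    {
      type: "element",
      tagName: "a",
      properties: { href: "/docs", onClick: "steal()" },
      children: [{ type: "text", value: "docs" }],
    },
    {
      type: "element",
      tagName: "script",
      properties: {},
      children: [{ type: "text", value: "alert(1)" }],
    },
  ],
};

const clean = sanitizeNode(tree);
console.log(JSON.stringify(clean, null, 2));
```

In the output the `script` element is gone and the link keeps only `href`. The walk can only do this because the raw HTML was already parsed into elements, which is the job rehype-raw does before the sanitizer runs.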

Example 3: Checking 10 locale files

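Multilingual articles have to be checked as a set. The script below confirms that the same slug exists in every locale, that lang and heroImage are preserved, that updatedDate and description length are correct, and that each file contains at least one external and one internal link.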

// tools/check-locales.mjs
import fs from "node:fs";
import path from "node:path";
import matter from "gray-matter";

const slug = "claude-code-markdown-processing.mdx";
const expectedHero = "/images/hero/hero-077.png";
const locales = [
  ["ja", "site/src/content/blog"],
  ["en", "site/src/content/blog-en"],
  ["zh", "site/src/content/blog-zh"],
  ["ko", "site/src/content/blog-ko"],
  ["es", "site/src/content/blog-es"],
  ["fr", "site/src/content/blog-fr"],
  ["de", "site/src/content/blog-de"],
  ["pt", "site/src/content/blog-pt"],
  ["hi", "site/src/content/blog-hi"],
  ["id", "site/src/content/blog-id"],
];

const errors = [];

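for (const [lang, dir] of locales) {
  const file = path.join(dir, slug);
  // Report a missing locale as an error instead of crashing on readFileSync.
  if (!fs.existsSync(file)) {
    errors.push(`${lang}: missing file ${file}`);
    continue;
  }
  const source = fs.readFileSync(file, "utf8");
  const { data, content } = matter(source);
  if (data.lang !== lang) errors.push(`${lang}: lang mismatch`);
  if (data.heroImage !== expectedHero) errors.push(`${lang}: hero changed`);
  // gray-matter may return a Date for an unquoted YAML date, so normalize.
  const updated =
    data.updatedDate instanceof Date
      ? data.updatedDate.toISOString().slice(0, 10)
      : String(data.updatedDate ?? "");
  if (updated !== "2026-06-02") {
    errors.push(`${lang}: updatedDate mismatch`);
  }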
  if ([...String(data.description ?? "")].length > 120) {
    errors.push(`${lang}: description too long`);
  }
  if (!content.includes("https://")) errors.push(`${lang}: no external link`);
  if (!content.includes("](/")) errors.push(`${lang}: no internal link`);
}

if (errors.length > 0) {
  console.error(errors.map((error) => `- ${error}`).join("\n"));
  process.exit(1);
}

console.log("locale set is consistent");

Concrete failure modes

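Failure	Result	Prevention
Headings read with a regex	A fake heading inside a code block ends up in the table of contents	Traverse `heading` nodes only
`tags` becomes a string	Filters and related posts break	Frontmatter type validation
Slug generation is inconsistent	Anchor links break per language	Use the same slugger everywhere
Raw HTML is trusted	XSS risk reaches the page	Schema-based sanitizing
External links are not checked	A moved official doc goes unnoticed	HEAD/GET check before publishing
Prompt scope is too broad	Other contributors' files get modified	Pin `owned_files`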

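Failure modes should go directly into the task instructions. “No regex-only heading parsing, preserve heroImage, description of 120 characters or fewer, do not modify other slugs, raw HTML must be sanitized” is far more actionable than “make it high quality.”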
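
The HEAD/GET check from the table can be sketched as follows. This is a minimal version under stated assumptions: it relies on the global `fetch` available in Node.js 18 and later, the `checkLinks` name is hypothetical, and the demo run uses a stubbed fetch with made-up URLs and statuses so that it works without network access.

```javascript
// Checks each URL with a HEAD request and falls back to GET when the
// server rejects HEAD. `fetchImpl` is injectable so the function can be
// exercised without network access.
async function checkLinks(urls, fetchImpl = globalThis.fetch) {
  const broken = [];
  for (const url of urls) {
    try {
      let response = await fetchImpl(url, { method: "HEAD" });
      if (response.status === 405 || response.status === 501) {
        response = await fetchImpl(url, { method: "GET" });
      }
      if (response.status >= 400) {
        broken.push({ url, status: response.status });
      }
    } catch (error) {
      broken.push({ url, status: "network error" });
    }
  }
  return broken;
}

// Stubbed fetch for a dry run; the URLs and statuses are made up.
const fakeStatuses = {
  "https://example.com/ok": 200,
  "https://example.com/gone": 404,
};
const fakeFetch = async (url) => ({ status: fakeStatuses[url] ?? 500 });

checkLinks(Object.keys(fakeStatuses), fakeFetch).then((broken) => {
  console.log(broken); // [ { url: 'https://example.com/gone', status: 404 } ]
});
```

Because `fetch` follows redirects by default, a page that has moved can still return 200. Comparing `response.url` with the requested URL is one way to notice that an official doc has relocated.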

A safe prompt for Claude Code

task: "Refresh one published MDX article"
owned_files:
  - "site/src/content/blog-ko/claude-code-markdown-processing.mdx"
preserve:
  - "slug path"
  - "heroImage"
  - "unrelated dirty files"
required:
  - "updatedDate: 2026-06-02"
  - "description <= 120 characters"
  - "AST-based Markdown checks"
  - "official external links"
  - "internal links and monetization CTA"
forbidden:
  - "regex-only heading parsing"
  - "raw HTML without sanitization"
  - "thin locale summaries"
verification:
  - "node scripts/check-code-fences.mjs"
  - "node scripts/check-updated-article-quality.mjs"
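
The `owned_files` rule can also be enforced after the fact. The sketch below assumes you already have a list of changed paths, for example the output of `git diff --name-only` split into lines. The `findOutOfScope` helper and the sample change list are hypothetical.

```javascript
// Paths the task is allowed to touch, mirroring `owned_files` in the prompt.
const ownedFiles = [
  "site/src/content/blog-ko/claude-code-markdown-processing.mdx",
];

// Hypothetical list of changed paths, e.g. from `git diff --name-only`.
const changedFiles = [
  "site/src/content/blog-ko/claude-code-markdown-processing.mdx",
  "site/src/content/blog-en/claude-code-markdown-processing.mdx",
];

// Returns every changed path that is not in the owned list.
function findOutOfScope(changed, owned) {
  const allowedPaths = new Set(owned);
  return changed.filter((file) => !allowedPaths.has(file));
}

const violations = findOutOfScope(changedFiles, ownedFiles);
console.log(violations);
```

Here the English locale file is reported as out of scope. Failing the run when `violations` is not empty turns the prompt's scope rule into something a script can verify.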

Pre-publish checks and CTA

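Before publishing, you need both local scripts and human review. Scripts check structure, metadata, code fences, links, and the depth of the article. A person checks whether the translation reads naturally, whether paragraphs are too long on mobile, and whether the CTA fits the context.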

node tools/audit-mdx.mjs site/src/content/blog-ko/claude-code-markdown-processing.mdx
node tools/check-locales.mjs
node scripts/check-code-fences.mjs
node scripts/check-updated-article-quality.mjs

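If you are starting on your own, the free Claude Code cheatsheet is a good way to lock in the basic commands. If you need repeatable review and writing prompts, you can use the Claude Code prompt templates. If your team wants to sort out permissions, CI, locale operations, and publishing review, Claude Code training and consultation is the next step.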

Actual verification results

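For this update, Masa defined the failure conditions before writing anything. An overlong description, a missing updatedDate, a changed heroImage, a code fence without a language, thin locales, and outdated official links were all put on the checklist. After that, handing the work to Claude Code with the AST and the frontmatter schema as the baseline made the review far more concrete. Finally, node scripts/check-code-fences.mjs and node scripts/check-updated-article-quality.mjs were run to confirm structure and quality. What matters when rewriting a published article is not only writing more sentences, but first pinning down, in the prompt and in scripts, the contracts that must not break.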