XML to PDF Converter

Transform your XML data into professional and shareable PDF documents.

XML to PDF Converter — The Complete 2025 Guide

Converting XML data into a polished, printable PDF is a common need in publishing, enterprise reporting, invoicing, and data archiving. XML stores structured data; PDF delivers a fixed-layout, widely shareable document. This guide covers the main ways to transform XML into PDF so that the output is both well designed and repeatable from code: tools, step-by-step workflows, code examples, styling tips, automation patterns, and troubleshooting.


Why Convert XML to PDF?

There are many reasons organizations convert XML into PDF:

  • Readable reports: Turn machine-friendly data into human-friendly documents for stakeholders.
  • Printing and archiving: PDFs are ideal for legal records and long-term archives.
  • Automated document generation: Generate invoices, receipts, contracts, and certificates from XML sources.
  • Distribution: PDFs preserve fonts, layout, and pagination across devices.

Common Approaches (Overview)

There are three main approaches to convert XML → PDF:

  1. XSL-FO pipeline: Use XSLT to transform XML into XSL-FO, then render to PDF with a processor like Apache FOP or Antenna House.
  2. XML → HTML → PDF: Transform XML into HTML/CSS and then render to PDF with engines such as PrinceXML, wkhtmltopdf, or WeasyPrint.
  3. Programmatic generation: Use libraries (ReportLab in Python, iText in Java/.NET) to build PDFs directly from parsed XML.

Each method has trade-offs: XSL-FO is highly controllable with powerful pagination features; HTML/CSS is easier to style for web designers; programmatic generation gives ultimate control at the code level.


Method 1 — XSL-FO with Apache FOP (Recommended for structured documents)

XSL-FO (Extensible Stylesheet Language — Formatting Objects) is the traditional XML-to-PDF route. Workflow:

  1. Write an XSLT stylesheet that transforms your XML into XSL-FO.
  2. Pass the resulting .fo to an FO processor such as Apache FOP to produce PDF.

Simple XSL-FO example (invoice fragment)

<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="A4" page-height="29.7cm" page-width="21cm" margin="2cm">
      <fo:region-body />
      <fo:region-after extent="1.5cm" />
    </fo:simple-page-master>
  </fo:layout-master-set>

  <fo:page-sequence master-reference="A4">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="14pt" font-weight="bold">Invoice #12345</fo:block>
      <fo:block>Date: 2025-10-25</fo:block>

      <fo:table table-layout="fixed" width="100%" margin-top="10pt">
        <fo:table-column column-width="60%"/>
        <fo:table-column column-width="20%"/>
        <fo:table-column column-width="20%"/>
        <fo:table-body>
          <fo:table-row>
            <fo:table-cell><fo:block>Product A</fo:block></fo:table-cell>
            <fo:table-cell><fo:block text-align="center">2</fo:block></fo:table-cell>
            <fo:table-cell><fo:block text-align="right">$50.00</fo:block></fo:table-cell>
          </fo:table-row>
        </fo:table-body>
      </fo:table>

    </fo:flow>
  </fo:page-sequence>
</fo:root>
  

Run Apache FOP

fop -fo input.fo -pdf output.pdf

Pros: precise control over pagination, page-numbering, footnotes, table handling. Cons: XSL-FO learning curve; styling is more XML-like than CSS.
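To make the mapping from source XML to formatting objects concrete, here is a minimal sketch that builds the fo:table-body from the example above out of item data. It deliberately uses Python's standard-library ElementTree instead of XSLT, so it illustrates the output of step 1 rather than the XSLT route itself. The <items>/<item> schema with name, qty, and price children and the helper names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)  # serialize with the conventional fo: prefix


def fo(tag):
    """Return the namespace-qualified name of an XSL-FO element."""
    return f"{{{FO_NS}}}{tag}"


def items_to_fo_rows(xml_text):
    """Build an fo:table-body with one fo:table-row per <item> (hypothetical schema)."""
    source = ET.fromstring(xml_text)
    body = ET.Element(fo("table-body"))
    for item in source.iter("item"):
        row = ET.SubElement(body, fo("table-row"))
        for field in ("name", "qty", "price"):
            cell = ET.SubElement(row, fo("table-cell"))
            block = ET.SubElement(cell, fo("block"))
            block.text = item.findtext(field, default="")
    return body


SAMPLE = """<items>
  <item><name>Product A</name><qty>2</qty><price>$50.00</price></item>
  <item><name>Product B</name><qty>1</qty><price>$19.99</price></item>
</items>"""

table_body = items_to_fo_rows(SAMPLE)
print(ET.tostring(table_body, encoding="unicode"))
```

The resulting element can be placed inside an fo:table in a full FO document. In a real XSL-FO pipeline the same mapping would normally live in an xsl:for-each or a template rule.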


Method 2 — Transform XML to HTML/CSS then render to PDF

If designers prefer CSS, convert XML → HTML (via XSLT) and style with CSS. Then use a renderer:

  • PrinceXML — professional, powerful CSS Paged Media support (page-breaks, running headers/footers).
  • wkhtmltopdf: uses Qt WebKit, good for many HTML layouts; note that the project is no longer actively maintained.
  • WeasyPrint — Python-based, modern CSS support.

Workflow

  1. Create XSLT that transforms XML into semantic HTML (use classes/IDs for styling).
  2. Apply CSS for print media (use @page rules for margins and headers).
  3. Render HTML → PDF using Prince/wkhtmltopdf/WeasyPrint.
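Steps 1 and 2 of this workflow can be sketched as follows. This is only an illustration: it uses Python's standard library in place of XSLT, and the <item> schema (name and price children), the class names, and the CSS values are assumptions rather than requirements of any renderer.

```python
import html
import xml.etree.ElementTree as ET

# Print-oriented CSS: the @page rule sets paper size and margins.
PRINT_CSS = """
@page { size: A4; margin: 2cm; }
table { width: 100%; border-collapse: collapse; }
tr { page-break-inside: avoid; }
td.num { text-align: right; }
"""


def xml_to_html(xml_text, title):
    """Turn <item> elements (hypothetical schema) into a semantic HTML table."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        # Escape text so characters such as & and < stay valid in HTML.
        name = html.escape(item.findtext("name", default=""))
        price = html.escape(item.findtext("price", default=""))
        rows.append(f'<tr><td>{name}</td><td class="num">{price}</td></tr>')
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title>'
        f"<style>{PRINT_CSS}</style></head><body>"
        f'<h1 class="doc-title">{html.escape(title)}</h1>'
        '<table class="items"><thead><tr><th>Item</th><th>Price</th></tr></thead>'
        f"<tbody>{''.join(rows)}</tbody></table></body></html>"
    )


page = xml_to_html(
    "<items><item><name>A &amp; B</name><price>5.00</price></item></items>",
    "Invoice",
)
print(page)
```

The string returned by xml_to_html can be written to a file and passed to whichever HTML-to-PDF renderer you use.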

Example (command using wkhtmltopdf)

wkhtmltopdf --enable-local-file-access input.html output.pdf

Pros: CSS styling is familiar to web developers, and layouts are quick to iterate on in a browser. Cons: some print features (such as precise footnote handling) are harder to achieve than in XSL-FO, and output differs between rendering engines.
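If the rendering step is scripted, the three renderers can sit behind one small function. The sketch below assumes their usual command-line forms (prince with -o, weasyprint and wkhtmltopdf with positional input and output). The preference order and the function names are illustrative choices, not part of any tool.

```python
import shutil
import subprocess

# Command-line form for each renderer: input HTML first, output PDF last.
RENDERERS = {
    "prince": lambda src, dst: ["prince", src, "-o", dst],
    "weasyprint": lambda src, dst: ["weasyprint", src, dst],
    "wkhtmltopdf": lambda src, dst: [
        "wkhtmltopdf", "--enable-local-file-access", src, dst
    ],
}


def build_command(renderer, html_path, pdf_path):
    """Return the argument list for the named renderer."""
    try:
        return RENDERERS[renderer](html_path, pdf_path)
    except KeyError:
        raise ValueError(f"unknown renderer: {renderer}") from None


def render(html_path, pdf_path, preferred=("prince", "weasyprint", "wkhtmltopdf")):
    """Run the first renderer found on PATH and return its name."""
    for name in preferred:
        if shutil.which(name):
            subprocess.run(build_command(name, html_path, pdf_path), check=True)
            return name
    raise RuntimeError("no HTML-to-PDF renderer found on PATH")
```

Calling render("input.html", "output.pdf") raises if no renderer is installed or if the chosen renderer exits with an error, which makes failures visible to a calling script.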


Method 3 — Programmatic PDF Generation (ReportLab, iText)

When you need absolute control or want to embed business logic into PDF generation, use libraries:

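  • Python: ReportLab (low-level canvas API plus the higher-level Platypus layout layer); xhtml2pdf (formerly Pisa) and WeasyPrint if you would rather generate HTML and convert it
  • Java: iText (AGPL or commercial license since version 5), Apache PDFBox (PDF creation and manipulation)
  • .NET: iText for .NET (successor to iTextSharp), PDFsharp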

Typical flow: parse XML with a DOM/SAX parser, map fields to PDF elements, draw text/images/tables programmatically.

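# Python (minimal sketch; assumes <item> elements with <name> and <price> children)
from lxml import etree
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.platypus import SimpleDocTemplate, Paragraph, Table
# Parse the XML, map each <item> to a table row, build the story, then render it
rows = [[i.findtext('name', ''), i.findtext('price', '')] for i in etree.parse('invoice.xml').iter('item')]
story = [Paragraph('Invoice', getSampleStyleSheet()['Title']), Table([['Item', 'Price']] + rows)]
SimpleDocTemplate('invoice.pdf').build(story)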
  

Pros: total control; integrates business logic. Cons: more development effort for complex layouts (pagination, flowing tables).


Step-by-Step Example: XML → XSLT → Apache FOP (Automated)

  1. XML data: your source (invoice.xml).
  2. XSLT: transform invoice.xml → invoice.fo (XSL-FO).
  3. FO Processor: run Apache FOP to produce invoice.pdf.

Shell automation

xsltproc transform.xslt invoice.xml > invoice.fo
fop invoice.fo invoice.pdf
  

Or call these steps from a script or CI pipeline. Many enterprises wrap the pipeline inside microservices or serverless functions for on-demand PDF generation.


Styling & Layout Tips

  • Fonts: embed fonts to ensure consistent rendering (Apache FOP registers fonts in its configuration file; CSS pipelines use @font-face).
  • Pagination: plan content flow; for headers and footers, use fo:static-content in region-before/region-after in XSL-FO or @page margin boxes in CSS.
  • Tables: avoid very wide tables; specify table-layout and column widths; allow row-breaks across pages where needed.
  • Images: ensure high-resolution images for print (300 DPI) and provide absolute/relative paths accessible to renderer.
  • Localization: handle numeric/date formats and RTL languages in your transform.
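The 300 DPI guideline for images comes down to simple arithmetic: pixels = printed size in inches × DPI. The helper names below are made up for illustration.

```python
CM_PER_INCH = 2.54


def required_pixels(print_size_cm, dpi=300):
    """Pixels needed along one dimension to print at the given DPI."""
    return round(print_size_cm / CM_PER_INCH * dpi)


def effective_dpi(pixels, print_size_cm):
    """Resolution an image actually reaches when printed at the given size."""
    return pixels / (print_size_cm / CM_PER_INCH)


# An image spanning the full 21 cm width of an A4 page at 300 DPI:
print(required_pixels(21.0))  # 2480
```

An image only 1240 pixels wide stretched across the same 21 cm would reach about 150 DPI, which tends to look soft in print.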

Programmatic Example — XML to PDF using Python + XSLT + FOP

#!/usr/bin/env python3
import subprocess
from lxml import etree

# Transform XML -> FO using XSLT
xml = etree.parse('invoice.xml')
xslt = etree.parse('invoice.xslt')
transform = etree.XSLT(xslt)
fo = transform(xml)
with open('invoice.fo', 'wb') as f:
    f.write(etree.tostring(fo, pretty_print=True, encoding='utf-8'))

# Run Apache FOP (must be installed)
subprocess.run(['fop', 'invoice.fo', 'invoice.pdf'], check=True)
  

This gives a repeatable, scriptable pipeline suitable for automation.


Batch Processing & Automation

For high volume, implement:

  • Message queues (RabbitMQ, SQS) to accept XML jobs.
  • Worker processes that run XSLT + FO or HTML → PDF conversions.
  • Monitoring, retries, and manual review of renders that fail or produce suspect output.
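The queue, worker, and retry pattern can be sketched in a single process with Python's standard library. Here queue.Queue stands in for a real broker such as RabbitMQ or SQS, and convert is a placeholder for whatever XSLT + FO or HTML → PDF step you run. All names are illustrative assumptions.

```python
import queue
import threading


def run_workers(jobs, convert, num_workers=4, max_attempts=3):
    """Convert jobs concurrently; return (done, failed) dicts keyed by job id."""
    pending = queue.Queue()
    for job_id, payload in jobs.items():
        pending.put((job_id, payload, 1))  # third field is the attempt number
    done, failed = {}, {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                job_id, payload, attempt = pending.get_nowait()
            except queue.Empty:
                return  # nothing left for this worker
            try:
                result = convert(payload)
            except Exception as exc:
                if attempt < max_attempts:
                    # Requeue for another attempt; this worker keeps looping,
                    # so the retried job is always picked up by someone.
                    pending.put((job_id, payload, attempt + 1))
                else:
                    with lock:
                        failed[job_id] = str(exc)
            else:
                with lock:
                    done[job_id] = result

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return done, failed


# Demonstration with a fake converter: one job fails once, one always fails.
attempts = {}


def fake_convert(xml_name):
    attempts[xml_name] = attempts.get(xml_name, 0) + 1
    if xml_name == "bad.xml":
        raise ValueError("malformed")
    if xml_name == "flaky.xml" and attempts[xml_name] < 2:
        raise RuntimeError("transient")
    return xml_name.replace(".xml", ".pdf")


done, failed = run_workers(
    {"a": "good.xml", "b": "flaky.xml", "c": "bad.xml"}, fake_convert, num_workers=2
)
print(done, failed)
```

Jobs that end up in failed are the natural candidates for monitoring alerts or manual review.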

Tooling Summary

  • Apache FOP: free, XSL-FO to PDF, production-proven.
  • PrinceXML: commercial, best CSS Paged Media support.
  • wkhtmltopdf: open-source, WebKit-based HTML to PDF.
  • WeasyPrint: Python-based HTML/CSS to PDF.
  • ReportLab: programmatic PDFs in Python.
  • iText / iTextSharp: Java/.NET programmatic PDF library (AGPL or commercial license).

Common Problems & Troubleshooting

  • Poor pagination: use explicit page masters and keep/break properties such as keep-together and break-before (XSL-FO), or CSS page-break rules.
  • Missing fonts or garbled characters: embed fonts and ensure UTF-8 encoding.
  • Images not found: use absolute paths or ensure renderer has access to image files.
  • Performance: precompile XSLTs where possible and scale worker nodes for concurrency.

Privacy & Security

When generating PDFs from sensitive XML (invoices, medical records, PII):

  • Prefer on-premise or private cloud rendering (avoid public online converters).
  • Encrypt transport (HTTPS) and store final PDFs securely.
  • Limit retention of intermediate artifacts (delete FO files after generation if not needed).
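One way to limit retention of intermediate files is to write them into a temporary directory that is removed as soon as rendering finishes. This is a minimal sketch: render is any callable that takes the FO path and the PDF path, fop_render shows what a real one could look like (it assumes the fop command is installed), and the other names are illustrative.

```python
import subprocess
import tempfile
from pathlib import Path


def render_with_cleanup(fo_bytes, pdf_path, render):
    """Write FO into a private temp dir, render it, then delete the temp dir."""
    with tempfile.TemporaryDirectory(prefix="xml2pdf-") as tmp:
        fo_path = Path(tmp) / "document.fo"
        fo_path.write_bytes(fo_bytes)
        render(fo_path, Path(pdf_path))
    # The directory and the FO file are gone once the with-block exits,
    # even if render() raised an exception.
    return fo_path  # returned only so callers can confirm it no longer exists


def fop_render(fo_path, pdf_path):
    """Real renderer: requires Apache FOP on PATH (not run in the demo below)."""
    subprocess.run(["fop", "-fo", str(fo_path), "-pdf", str(pdf_path)], check=True)


# Demo with a stand-in renderer so the sketch runs without FOP installed.
def fake_render(fo_path, pdf_path):
    pdf_path.write_bytes(b"%PDF-placeholder " + fo_path.read_bytes())


with tempfile.TemporaryDirectory() as out_dir:
    pdf = Path(out_dir) / "invoice.pdf"
    leftover = render_with_cleanup(b"<fo:root/>", pdf, fake_render)
    pdf_written = pdf.exists()
    fo_removed = not leftover.exists()

print(pdf_written, fo_removed)  # True True
```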

FAQs

Q: Which approach should I choose — XSL-FO or HTML?

A: If you need precise pagination, footnotes, or print-quality control, XSL-FO (Apache FOP) is best. If your team is already strong with HTML/CSS and wants rapid styling, use HTML → Prince/WeasyPrint/wkhtmltopdf.

Q: Can I convert XML to PDF for dynamic templates (invoices, certificates)?

A: Yes — parameterize your XSLT or HTML templates and generate per-document PDFs in an automated pipeline.
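As a rough illustration of a parameterized template, the sketch below fills an HTML certificate per record with Python's string.Template. This is a stand-in for XSLT parameters, not the XSLT mechanism itself, and the field names (name, course, date) are assumptions.

```python
import html
from string import Template

# $name, $course and $date are placeholders filled in per document.
CERTIFICATE = Template(
    "<html><body><h1>Certificate</h1>"
    "<p>Awarded to <strong>$name</strong> for $course on $date.</p>"
    "</body></html>"
)


def render_certificate(record):
    """Return HTML for one certificate; raises KeyError if a field is missing."""
    safe = {key: html.escape(str(value)) for key, value in record.items()}
    return CERTIFICATE.substitute(safe)


records = [
    {"name": "Ada <Lovelace>", "course": "XSLT Basics", "date": "2025-10-25"},
    {"name": "Alan Turing", "course": "XSL-FO Layout", "date": "2025-10-26"},
]
pages = [render_certificate(record) for record in records]
print(pages[0])
```

Each generated page can then be rendered to its own PDF. Using substitute rather than safe_substitute means an incomplete record fails loudly instead of producing a certificate with a blank field.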

Q: Do I need to learn XSL-FO?

A: Not always — if your team prefers CSS, transform XML to HTML + CSS. But XSL-FO gives very fine control for print workflows.


Final Thoughts

Converting XML to PDF is a solved but nuanced problem: pick the right tool for your needs. Use XSL-FO and Apache FOP for print-grade documents and complicated pagination. Use HTML/CSS pipelines (Prince, wkhtmltopdf, WeasyPrint) for designer-friendly output. Choose programmatic libraries for highly customized or logic-heavy generation. Automate the pipeline for scale, secure it for privacy, and validate output against your business requirements.
