Extract text from any file or URL

Name: ParseJet
Author: ParseJet

Free online tool to convert PDF to text, get YouTube transcripts, and scrape web pages. One API for 25+ formats — power your AI agents or use it directly. Free API key included.

Free: 3 requests/day, no signup. Sign up for 300 credits/month free.

One tool for every text extraction task

Stop installing separate libraries for each format. ParseJet handles them all.

PDF to Text Converter

Extract text from PDF files instantly. Handles scanned documents, multi-page reports, and complex layouts. Convert PDF to plain text or markdown with one click.

YouTube Transcript Generator

Get the full transcript of any YouTube video. Supports all languages, auto-generated and manual captions. Perfect for content repurposing, research, and note-taking.

Web Page Scraper

Extract the main content from any web page URL. Automatically removes navigation, ads, and boilerplate. Returns clean, readable text from any website.

Document Parser

Parse Word documents (DOCX), Excel spreadsheets (XLSX), PowerPoint presentations (PPTX), and CSV files. Extract structured text from any Office document format.

Image to Text (OCR)

Extract text from images using OCR. Supports JPG, PNG, GIF, WebP, and TIFF formats. Read text from screenshots, photos of documents, and scanned pages.

Audio & Video Transcription

Transcribe audio files (MP3, WAV, M4A) and extract audio from video files (MP4, MKV, AVI) for transcription. Convert spoken content to searchable text.

25+ formats supported

One endpoint. Every file type. Structured text output.

PDF

DOCX

XLSX

PPTX

CSV

TXT

HTML

Markdown

JSON

XML

EPUB

YouTube

Web Pages

MP3 / Audio

MP4 / Video

JPG / Images

RSS / Atom

OPML

Notebooks

How it works

Paste or upload

Drop a URL or file. ParseJet auto-detects the format — PDF, DOCX, YouTube link, web page, image, audio, or any of 25+ supported types.

Extract

Text, title, and metadata are extracted automatically. Get clean, structured output regardless of the input format.

Use the text

Copy the result for your project, or integrate via the ParseJet API to automate text extraction at scale.
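
The three steps above map onto a small client-side helper. The sketch below only decides which of the two endpoints used in this page's examples a given input should go to (`/v1/parse/auto/url` for links, `/v1/parse/auto/file` for local files). The function name `endpoint_for` is hypothetical, and format detection itself is left to ParseJet.

```python
from urllib.parse import urlparse

API_BASE = "https://api.parsejet.com/v1/parse/auto"


def endpoint_for(source: str) -> str:
    """Pick the auto-detect endpoint for a URL or a local file path.

    Hypothetical helper: anything with an http(s) scheme is treated as a URL,
    everything else as a file to upload.
    """
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return f"{API_BASE}/url"
    return f"{API_BASE}/file"


print(endpoint_for("https://example.com"))
print(endpoint_for("report.pdf"))
```

Keeping the routing rule this small is deliberate: the client only chooses between "link" and "upload", and everything format-specific stays on the server side.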

Why ParseJet?

Compare building your own parsing pipeline with using ParseJet.

Do It Yourself

✗ Install 5-10 separate libraries (pdfplumber, yt-dlp, trafilatura, python-docx...)
✗ Handle binary dependencies (ffmpeg, poppler, tesseract)
✗ Write format detection and routing logic
✗ Deal with version conflicts and platform issues
✗ Maintain and update each parser separately
✗ 50-200 lines of code per format

With ParseJet

✓ One HTTP endpoint for all 25+ formats
✓ Zero dependencies to install
✓ Auto-detection — just send the file or URL
✓ Always up-to-date parsers maintained for you
✓ Consistent JSON response for every format
✓ 3-5 lines of code total
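
Because every format returns the same JSON shape, the response can be parsed into a single type. A minimal sketch, assuming the response carries the `text`, `title`, and `source_type` fields used in the JavaScript example on this page. The `ParseResult` class and the sample payload are illustrative, and real responses may include more fields.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParseResult:
    text: str
    title: Optional[str]
    source_type: Optional[str]

    @classmethod
    def from_json(cls, payload: str) -> "ParseResult":
        data = json.loads(payload)
        # Assumption: only `text` is treated as required here.
        return cls(
            text=data["text"],
            title=data.get("title"),
            source_type=data.get("source_type"),
        )


# Made-up sample payload, not real API output.
sample = '{"text": "Hello world", "title": "Example", "source_type": "pdf"}'
result = ParseResult.from_json(sample)
print(result.source_type, len(result.text))
```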

Integrate in minutes

Works with any language. No SDK required — just HTTP.

cURL

curl -X POST https://api.parsejet.com/v1/parse/auto/url \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Python

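import httpx

resp = httpx.post(
    "https://api.parsejet.com/v1/parse/auto/url",
    json={"url": "https://youtube.com/watch?v=dQw4w9WgXcQ"},
    timeout=60.0,  # parsing may take longer than httpx's 5-second default
)
resp.raise_for_status()  # surface 4xx/5xx errors instead of a KeyError below
print(resp.json()["text"])  # Full transcript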

JavaScript

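// Assumption: the upload field is named "file"; check the API docs.
const formData = new FormData();
formData.append("file", fileInput.files[0]); // fileInput: an <input type="file"> element
const res = await fetch("https://api.parsejet.com/v1/parse/auto/file", {
  method: "POST",
  body: formData, // FormData with your PDF
});
const { text, title, source_type } = await res.json();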

Built for AI agents

Give your AI the ability to read any document or URL. One API call, structured text output.

Claude & Claude Code

Use ParseJet as an MCP server or HTTP tool. Let Claude extract text from PDFs, web pages, and documents during conversations.

ChatGPT & GPT Agents

Add ParseJet as a custom action in GPTs. Your agent can parse any file or URL and reason over the extracted text.

Gemini & Google AI

Integrate via function calling. ParseJet handles the parsing so Gemini can focus on understanding the content.

LangChain & LlamaIndex

Use ParseJet as a document loader. One endpoint replaces dozens of format-specific loaders in your RAG pipeline.

OpenClaw & Open Source Agents

Any AI agent that can make HTTP requests can use ParseJet. Supports the Machine Payments Protocol (MPP) for autonomous pay-per-request.

Custom AI Workflows

Build automated pipelines with n8n, Make, or Zapier. ParseJet extracts text, your AI processes it. No code required.
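
Most of the integrations above come down to exposing one function to the model and forwarding its arguments to ParseJet. The sketch below shows that pattern with a generic JSON-Schema tool description and a handler that builds (but does not send) the HTTP request. The tool name `parse_url`, the schema wording, and the `build_request` helper are hypothetical. Each agent framework has its own registration format, and only the endpoint and the `{"url": ...}` body come from the examples on this page.

```python
import json
import urllib.request

# Hypothetical tool description in the JSON-Schema style used by
# function-calling APIs; adapt the wrapper to your framework.
PARSE_URL_TOOL = {
    "name": "parse_url",
    "description": "Extract text from a web page, YouTube video, or remote file.",
    "parameters": {
        "type": "object",
        "properties": {"url": {"type": "string", "description": "URL to parse"}},
        "required": ["url"],
    },
}


def build_request(arguments: dict) -> urllib.request.Request:
    """Turn the model's tool-call arguments into a ParseJet request."""
    body = json.dumps({"url": arguments["url"]}).encode("utf-8")
    return urllib.request.Request(
        "https://api.parsejet.com/v1/parse/auto/url",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request({"url": "https://example.com"})
print(req.get_method(), req.full_url)
# To execute the call: urllib.request.urlopen(req), then json.load() the response.
```

Separating request construction from sending keeps the handler easy to test and lets the agent runtime decide how to handle timeouts, retries, and rate limits.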

Want to automate this?

ParseJet API gives you the same parsing power via a single HTTP endpoint. No ffmpeg, no poppler, no tesseract — just one API call.

curl -X POST https://api.parsejet.com/v1/parse/auto/url \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com"}'

Frequently asked questions

How do I extract text from a PDF file?

Upload your PDF to ParseJet or use the API: POST /v1/parse/auto/file with your PDF. ParseJet extracts all text content, preserving structure and handling multi-page documents. Works with scanned PDFs via OCR too.

How do I get a transcript of a YouTube video?

Paste the YouTube URL into ParseJet or call POST /v1/parse/youtube with the video URL. ParseJet returns the full transcript with timestamps. Supports auto-generated captions in 100+ languages.

Can I convert PDF to Markdown?

Yes. Add ?output_format=markdown to your request. ParseJet detects headings, lists, tables, and code blocks in your PDF and converts them to clean Markdown syntax.
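
A minimal sketch of adding that query parameter from Python. The `output_format=markdown` parameter is taken from the answer above; applying it to the `/v1/parse/auto/file` endpoint and the helper name `with_output_format` are assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_output_format(endpoint: str, output_format: str = "markdown") -> str:
    """Return `endpoint` with output_format set, keeping other query params."""
    parts = urlsplit(endpoint)
    query = dict(parse_qsl(parts.query))
    query["output_format"] = output_format
    return urlunsplit(parts._replace(query=urlencode(query)))


url = with_output_format("https://api.parsejet.com/v1/parse/auto/file")
print(url)
```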

Is ParseJet free to use?

Yes. You get 3 free requests per day with no signup. Create a free account for 300 requests per month. Paid plans start at $19/month for 3,000 requests.
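
To make those tiers concrete, the sketch below picks the cheapest listed tier for a given monthly volume and works out the per-request price of the paid plan. Only the three figures stated above are used. The function name is hypothetical, the anonymous tier is approximated as 3 × 30 = 90 requests in a 30-day month, and plans above 3,000 requests are not covered here.

```python
def cheapest_tier(requests_per_month: int) -> str:
    """Map a monthly request volume to the cheapest tier listed in this FAQ."""
    if requests_per_month <= 3 * 30:  # about 90/month, if spread at most 3 per day
        return "anonymous"
    if requests_per_month <= 300:
        return "free account"
    if requests_per_month <= 3000:
        return "paid ($19/month)"
    return "not listed here"


print(cheapest_tier(250))
print(round(19 / 3000, 4))  # dollars per request on the $19 plan at full usage
```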

What file formats does ParseJet support?

ParseJet supports 25+ formats: PDF, DOCX, XLSX, PPTX, CSV, TXT, HTML, Markdown, JSON, XML, EPUB, YouTube videos, web pages, MP3, WAV, M4A (audio), MP4, MKV, AVI (video), JPG, PNG, GIF (images), RSS, Atom, OPML feeds, Jupyter notebooks, and email files.

Do I need an API key?

No. Anonymous access works for testing (3 requests/day). For production use, create a free API key at parsejet.com — you get 300 requests per month at no cost.

How does ParseJet compare to pdfplumber or trafilatura?

ParseJet replaces multiple libraries with one API. Instead of installing pdfplumber for PDFs, trafilatura for web pages, yt-dlp for YouTube, and python-docx for Word files, you make one HTTP call to ParseJet and it handles everything.

Can AI agents use ParseJet?

Yes. ParseJet supports the Machine Payments Protocol (MPP) for pay-per-request access without accounts. AI agents can also use anonymous access (3 requests/day) or API keys for higher limits.

Start extracting text for free

No signup required. Parse your first file in seconds.
