You have a PDF. You need the words inside it. This guide walks you through every step of the converter sitting above this page, explains what each setting does, and answers the questions people ask most.
Before You Start
Two things to check before you upload.
First, your PDF needs to contain real typed text. Open it and try to highlight a word with your cursor. If the highlight works, your file has a text layer and this converter reads it fine. If nothing highlights, your PDF is a scan — it stores pages as images. This tool reads text layers, not images. A scanned PDF gives you an empty result.
Second, keep your file under 10 MB. Most standard PDFs sit well under that. A 40-page report with charts and tables usually lands between 1 and 4 MB. If yours is bigger, split it first, then convert each part.
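If you convert files regularly, a quick size check before uploading saves a failed attempt. The sketch below is only an illustration: the function name `fits_upload_limit` is made up for this example, and it assumes the 10 MB limit means 10 × 1024 × 1024 bytes, which this guide does not spell out.

```python
import os

# Assumption: "10 MB" is treated as 10 MiB. Adjust if the tool counts differently.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def fits_upload_limit(path: str, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True when the file at `path` is strictly smaller than `limit` bytes."""
    return os.path.getsize(path) < limit
```

Call `fits_upload_limit("report.pdf")` on your own file; a `False` result means split or compress the PDF first.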
Step 1: Upload Your PDF
Click the upload box at the top of this page. A file picker opens. Select your PDF and confirm.
You can also drag the file straight onto the upload box. Drop it anywhere inside the dotted border. The file name and size appear below the box once the upload registers.
You will see the Convert button activate. Before you click it, look at the two settings below the upload area.
Step 2: Choose Your Settings
The converter gives you two options. Here is what each one does.
Preserve line breaks
Leave this ticked. It tells the converter to keep paragraph spacing and line breaks from the original document. Your output mirrors the structure of the source PDF.
Turn it off if you are feeding the text into a script or data pipeline. A continuous block of text with no line breaks is easier for code to process, and it suits language models, search indexers, and any downstream NLP tool that handles its own tokenization.
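If you already downloaded a file with line breaks preserved and later need the continuous version, you can flatten it yourself. This is a minimal sketch of that idea, not the converter's own logic; `flatten_line_breaks` is a hypothetical helper name.

```python
import re


def flatten_line_breaks(text: str) -> str:
    """Collapse every run of whitespace, newlines included, into a single space."""
    return re.sub(r"\s+", " ", text).strip()


print(flatten_line_breaks("Total due:\n1,200 USD"))  # prints: Total due: 1,200 USD
```

The practical benefit is that a phrase split across two lines in the PDF becomes one unbroken string, so a plain search or regex can match it.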
Add page markers
Tick this if you need to know which page each chunk of text came from. The output adds a label — “Page 1”, “Page 2”, and so on — before each section. Useful for legal documents, research papers, or anything where the source page matters for citation or review.
Leave it unticked if you just need the text and do not care about page positions.
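With page markers switched on, a short script can split the output back into pages. The sketch below assumes each marker sits on its own line and reads exactly "Page 1", "Page 2", and so on; check a real output file before relying on that. The names `PAGE_MARKER` and `split_by_page` are invented for this example.

```python
import re

# Assumption: a marker is a whole line of the form "Page <number>".
PAGE_MARKER = re.compile(r"^Page (\d+)[ \t]*$", re.MULTILINE)


def split_by_page(text: str) -> dict[int, str]:
    """Map each page number to the text that follows its marker."""
    # re.split with a capture group yields: [preamble, "1", body1, "2", body2, ...]
    parts = PAGE_MARKER.split(text)
    pages = {}
    for i in range(1, len(parts) - 1, 2):
        pages[int(parts[i])] = parts[i + 1].strip()
    return pages
```

One caveat: a line in the document body that happens to read "Page 3" on its own would be mistaken for a marker.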
Step 3: Click Convert to Text
Press the button. It goes grey and shows a spinner while the server processes your file.
Most files finish in two to five seconds. A 200-page PDF with dense text takes longer, closer to fifteen to twenty seconds. The page does not reload. The result appears below the button when it is ready.
The stats bar above the output box shows you three numbers: pages found, word count, and total character count. Check these against your source document to confirm the extraction looks right.
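To double-check the stats bar against a downloaded .txt file, you can recount locally. This sketch counts whitespace-separated words, which is an assumption; the guide does not say how the converter counts, so small differences are normal. `text_stats` is a hypothetical name.

```python
def text_stats(text: str) -> dict[str, int]:
    """Count whitespace-separated words and total characters in `text`."""
    return {"words": len(text.split()), "characters": len(text)}


print(text_stats("one two\nthree"))  # prints: {'words': 3, 'characters': 13}
```

A large gap between your count and the stats bar is worth a closer look; a difference of a few words usually is not.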
Step 4: Copy or Download Your Text
You have two options once the result appears.
Click Copy to put all the extracted text on your clipboard. Then paste it wherever you need it — a document, an email, a text editor, anywhere.
Click Download to save a .txt file to your device. The file takes the name of your original PDF with a .txt extension. It saves to your default download folder.
If you want to convert another file, click New. The form resets and you start fresh.
Reading Your Output
The text box shows everything the converter extracted. Scroll through it before you use it.
For most clean, text-based PDFs, the output reads exactly as expected. Paragraphs line up, headings appear in the right place, and the flow matches the source document.
A few situations produce output that needs a small cleanup pass.
Multi-column layouts sometimes extract left-column and right-column text separately instead of in reading order. The converter reads text as the PDF stores it internally. A two-column document may give you all of column one, then all of column two.
Tables extract as plain text, meaning rows and columns lose their grid structure. You get the cell values, but not the table formatting.
Footnotes and headers sometimes appear mid-paragraph if the PDF places them at unusual positions in the text layer.
None of this is a conversion error. It reflects how the PDF file stores its content. A small manual edit fixes it in most cases.
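For tables, part of that manual edit can sometimes be scripted. The sketch below only helps when the extracted rows keep gaps of two or more spaces between cell values, which this guide does not promise; treat it as a starting point. `split_row` is an illustrative name.

```python
import re


def split_row(line: str) -> list[str]:
    """Split one extracted line into cells wherever two or more spaces appear."""
    return [cell for cell in re.split(r"\s{2,}", line.strip()) if cell]


print(split_row("Widget A    12    4.50"))  # prints: ['Widget A', '12', '4.50']
```

Splitting on two or more spaces, instead of one, keeps multi-word cells such as "Widget A" together.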
What People Use PDF to Text Extraction For
Plain text is the most portable format a document can be in. Once your PDF content is a .txt file, almost everything can read it, process it, or accept it as input.
Feeding content into AI tools and language models. Text copied straight out of a PDF viewer and pasted into ChatGPT, Claude, or any LLM tends to work poorly. The .txt output from this converter gives you clean text with its line breaks intact, which is easier for a model to follow. Teams building RAG (Retrieval-Augmented Generation) pipelines often extract PDFs to plain text as a first preprocessing step before chunking and embedding.
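As a rough illustration of the chunking step, the sketch below groups paragraphs into chunks under a character budget. It assumes paragraphs in the .txt file are separated by blank lines, and the name `chunk_paragraphs` and the 1000-character default are arbitrary choices for this example. Real pipelines often chunk by tokens instead.

```python
def chunk_paragraphs(text: str, max_chars: int = 1000) -> list[str]:
    """Group blank-line-separated paragraphs into chunks of at most `max_chars`.

    A single paragraph longer than `max_chars` is kept whole as its own chunk.
    """
    chunks = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Splitting on paragraph boundaries keeps each chunk readable on its own, which tends to matter more for retrieval than hitting an exact size.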
Searching and archiving document content. PDFs are not always indexed by search engines or internal search tools. A plain .txt version of a contract, report, or manual makes the content searchable and grep-able without opening the original file.
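Once several documents are saved as .txt, a few lines of code can search all of them. This is a minimal sketch assuming the files sit in one folder and are UTF-8 encoded; `find_in_text_files` is a made-up name.

```python
from pathlib import Path


def find_in_text_files(folder: str, needle: str) -> list[tuple[str, int, str]]:
    """Return (file name, line number, line) for each case-insensitive match."""
    hits = []
    for path in sorted(Path(folder).glob("*.txt")):
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        for number, line in enumerate(lines, start=1):
            if needle.lower() in line.lower():
                hits.append((path.name, number, line.strip()))
    return hits
```

For example, `find_in_text_files("contracts", "termination")` lists every line mentioning termination across the folder, with the file and line number for each.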
Editing content without a Word file. If you received a PDF and need to rework its text — rewrite a policy, update a report, strip out boilerplate — extracting to text is faster than copying paragraph by paragraph from the PDF viewer.
Data pipelines and scripting. Developers processing document text in Python, Node, or any scripting language work far more easily with a .txt file than a PDF. Unticking “Preserve line breaks” produces a single continuous block, which is often exactly what a parser or regex expects.
Accessibility. Screen readers and text-to-speech tools handle plain text better than PDF. Converting a document to .txt before running it through assistive software produces cleaner output, particularly for PDFs with custom fonts or unusual encoding.
What the Two Extraction Methods Mean for You
The converter has two extraction engines. It tries the more accurate one first and falls back to the second only when the first is unavailable.
The first is pdftotext, part of the open source Poppler library. When it runs, you get the most accurate extraction available for standard PDFs. Poppler handles complex spacing, mixed fonts, and large documents well.
The second is a built-in PHP parser. It kicks in when pdftotext is not available on the server. It reads the raw content streams inside the PDF and pulls the text directly. It is slightly less precise on documents with unusual layouts, but it handles most standard PDFs without issues.
You do not choose between them. The tool selects automatically based on what is available. Both methods produce clean, usable text for standard documents.
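The selection logic described above follows a common "try the best tool, else fall back" pattern. The sketch below shows that pattern in Python purely as an illustration; the converter itself is described as PHP, and this is not its code. The function names and the `fallback` and `which` parameters are assumptions made for the example. The only external call is Poppler's `pdftotext` command, where `-` as the output file sends the text to standard output.

```python
import shutil
import subprocess
from typing import Callable


def extract_with_pdftotext(pdf_path: str) -> str:
    """Run Poppler's pdftotext and return the text it writes to stdout."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout.decode("utf-8", errors="replace")


def extract_text(
    pdf_path: str,
    fallback: Callable[[str], str],
    which: Callable[[str], object] = shutil.which,
) -> str:
    """Use pdftotext when it is installed; otherwise call `fallback`."""
    if which("pdftotext") is not None:
        return extract_with_pdftotext(pdf_path)
    return fallback(pdf_path)
```

Passing `which` as a parameter is only there to make the branch easy to test without Poppler installed; `fallback` stands in for whatever secondary parser is available.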
Three Common Mistakes to Avoid
Uploading a scanned PDF and expecting text output. Check the file first. Try to highlight a word. If nothing selects, you have a scan, not a text-layer PDF.
Ignoring the line break setting and then wondering why the output is a wall of text. Tick “Preserve line breaks” for documents you plan to read or paste into a document editor. Turn it off for documents you plan to process with code or feed into an AI tool that handles its own tokenization.
Converting a large image-heavy PDF and hitting the size limit. Compress or split the file first. The 10 MB limit exists because large image-heavy PDFs carry very little extractable text relative to their file size. A 40 MB presentation packed with embedded graphics is well over the limit, even though the text inside it may amount to only a small fraction of that size.
FAQs
My PDF looks fine on screen but the converter returns no text. What went wrong?
Most likely, your PDF is a scan. Scanned PDFs store pages as images inside the file container. The converter reads text layers, not images. Open your PDF and try to click and drag to highlight a word. If nothing selects, the file is image-based. Run it through OCR software first to create a searchable PDF, then convert it here.
The extracted text is jumbled and out of order. How do I fix this?
The PDF stores its text in an internal order that does not always match what you see on screen. Multi-column layouts, sidebars, and complex table structures are the usual causes. Tick “Preserve line breaks” if you have not already. This gives the output the best chance of matching the visual layout. For documents with heavy formatting, a short manual edit after conversion is the fastest fix.
Can I convert a password-protected PDF?
No. A locked PDF blocks all reading access, including text extraction. Remove the password first. Open the PDF in your PDF reader, enter the correct password, then save an unlocked copy using “Save as” or “Export”. Upload that unlocked file here.
Does the converter keep my file after processing?
No. The file goes to the server for processing and gets discarded after the text comes back to you. Nothing is stored, indexed, or shared. If you handle sensitive documents, check your organisation’s data policies before uploading to any external tool.
The word count in the stats bar looks too low. Some text is missing.
A few things cause this. Some PDFs use custom or embedded fonts that do not map to standard characters. The converter extracts what it reads, and non-standard font encoding can cause some characters to drop. PDFs with heavy graphic design layered over text sometimes obscure the text layer. Check the output against a specific page in the source PDF to identify where the gap is. If the missing content sits in a heavily designed section, the PDF may store it as an image rather than text.
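If you converted with “Add page markers” ticked, a per-page word count can point you to the pages where text went missing. The sketch assumes markers are whole lines reading "Page 1", "Page 2", and so on, and the threshold of 20 words is an arbitrary starting value. `words_per_page` and `sparse_pages` are hypothetical names.

```python
import re

# Assumption: a marker is a whole line of the form "Page <number>".
PAGE_MARKER = re.compile(r"^Page (\d+)[ \t]*$", re.MULTILINE)


def words_per_page(text: str) -> dict[int, int]:
    """Count whitespace-separated words following each page marker."""
    parts = PAGE_MARKER.split(text)
    return {
        int(parts[i]): len(parts[i + 1].split())
        for i in range(1, len(parts) - 1, 2)
    }


def sparse_pages(text: str, min_words: int = 20) -> list[int]:
    """List page numbers whose word count falls below `min_words`."""
    return [page for page, count in words_per_page(text).items() if count < min_words]
```

Pages flagged this way are the ones to compare against the source PDF first. A title page or a full-page figure will also show up as sparse, so treat the list as a hint, not a verdict.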
Can I convert multiple PDFs at once?
The converter processes one file at a time. Click New after each conversion to reset the form. There is no limit to how many files you convert in a session.
Does this tool work on my phone?
Yes. The converter runs in any modern mobile browser. The upload box, settings, and output text box all function on a phone screen. Tap the upload area to open your file picker, or share a PDF directly from your phone’s files app to the browser tab. The Download button saves the .txt file to your phone’s default downloads location.
The PDF has images and diagrams. Will those appear in the output?
No. This converter extracts text only. Images, charts, diagrams, and embedded graphics do not transfer to the .txt output. You get the words stored in the document's text layer. You do not get the visuals, and any wording that exists only inside an image stays behind with it.
Related Tools
Already extracted your text and need to do more with your documents?
- Merge multiple PDFs into one before converting with a PDF Merger
- Reduce a large PDF file size before upload with a PDF Compressor
- Turn a Word document into a PDF with a DOCX to PDF Converter


