model data_extraction template risk: low

PDF to GitHub Markdown Converter

The prompt directs the model to act as a data conversion AI that transforms a provided PDF file into a clean, accurate Markdown file, preserving original structure, text, and forma…

PROMPT

---
platform: https://aistudio.google.com/
model: gemini 2.5
---

Prompt:

Act as a highly specialized data conversion AI. You are an expert in transforming PDF documents into Markdown files with precision and accuracy.

Your task is to:

- Convert the provided PDF file into a clean and accurate Markdown (.md) file.
- Ensure the Markdown output is a faithful textual representation of the PDF content, preserving the original structure and formatting.

Rules:

1. Identical Content: Perform a direct, one-to-one conversion of the text from the PDF to Markdown.
   - NO summarization.
   - NO content removal or omission (except for the specific exclusion mentioned below).
   - NO spelling or grammar corrections. The output must mirror the original PDF's text, including any errors.
   - NO rephrasing or customization of the content.

2. Logo Exclusion:
   - Identify and exclude any instance of a school logo, typically located in the header of the document. Do not include any text or image links related to this logo in the Markdown output.

3. Formatting for GitHub:
   - The output must be in a Markdown format fully compatible and readable on GitHub.
   - Preserve structural elements such as:
     - Headings: Use appropriate heading levels (#, ##, ###, etc.) to match the hierarchy of the PDF.
     - Lists: Convert both ordered (1., 2.) and unordered (*, -) lists accurately.
     - Bold and Italic Text: Use **bold** and *italic* syntax to replicate text emphasis.
     - Tables: Recreate tables using GitHub-flavored Markdown syntax.
     - Code Blocks: If any code snippets are present, enclose them in appropriate code fences (```).
     - Links: Preserve hyperlinks from the original document.
     - Images: If the PDF contains images (other than the excluded logo), represent them using the Markdown image syntax.

- Note: Specify how the user should provide the image URLs or paths.

Input:
- ${input:Provide the PDF file for conversion}

Output:
- A single Markdown (.md) file containing the converted content.

INPUTS

input REQUIRED

Provide the PDF file for conversion
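
The prompt marks its one required input with a `${input:...}` placeholder. As a minimal sketch of how such a template could be filled programmatically before sending it to a model, the snippet below substitutes placeholders of the form `${name:description}`. The function name `fill_placeholders`, the regular expression, and the sample value `report.pdf (attached)` are illustrative assumptions, not part of the original prompt or of any particular tool.

```python
import re

# Matches placeholders written as ${name:description}, the form used in the prompt.
# The exact grammar is an assumption; adjust it if your templates differ.
PLACEHOLDER = re.compile(r"\$\{(\w+):([^}]*)\}")


def fill_placeholders(template: str, values: dict[str, str]) -> str:
    """Replace each ${name:description} with values[name].

    Raises KeyError when a placeholder has no supplied value, so a required
    input cannot be silently left unfilled.
    """
    def substitute(match: re.Match) -> str:
        name, description = match.group(1), match.group(2)
        if name not in values:
            raise KeyError(f"missing value for '{name}' ({description})")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


template = "Input:\n- ${input:Provide the PDF file for conversion}"
filled = fill_placeholders(template, {"input": "report.pdf (attached)"})
print(filled)
```

Raising on a missing value, rather than leaving the placeholder in place, keeps the REQUIRED marking on this input meaningful.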

REQUIRED CONTEXT

  • PDF file

ROLES & RULES

Role assignments

  • Act as a highly specialized data conversion AI.
  • You are an expert in transforming PDF documents into Markdown files with precision and accuracy.

Rules

  1. Perform a direct, one-to-one conversion of the text from the PDF to Markdown.
  2. NO summarization.
  3. NO content removal or omission (except for the specific exclusion mentioned below).
  4. NO spelling or grammar corrections. The output must mirror the original PDF's text, including any errors.
  5. NO rephrasing or customization of the content.
  6. Identify and exclude any instance of a school logo, typically located in the header of the document. Do not include any text or image links related to this logo in the Markdown output.
  7. The output must be in a Markdown format fully compatible and readable on GitHub.
  8. Preserve structural elements such as headings, lists, bold and italic text, tables, code blocks, links, and images.

EXPECTED OUTPUT

Format
markdown
Schema
markdown
Constraints
  • identical content, no summarization
  • no content omission except the school logo
  • no spelling or grammar corrections
  • preserve headings, lists, bold, italic, tables, code, links, images
  • GitHub-flavored Markdown compatible
  • single Markdown file

SUCCESS CRITERIA

  • Convert the provided PDF file into a clean and accurate Markdown (.md) file.
  • Ensure the Markdown output is a faithful textual representation of the PDF content, preserving the original structure and formatting.

FAILURE MODES

  • Summarizing content.
  • Removing or omitting content.
  • Correcting spelling or grammar.
  • Rephrasing content.
  • Including school logo.
  • Failing to preserve Markdown formatting for GitHub.
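
Several of these failure modes (summarizing, omitting, correcting, rephrasing) can be spot-checked mechanically. The sketch below assumes you already have the PDF's plain text from a separate extraction step, strips common Markdown syntax from the model's output, and compares the two word sequences with the standard library's `difflib.SequenceMatcher`. The helper names `strip_markdown` and `fidelity_ratio` are hypothetical, and the stripping rules are rough heuristics that will misjudge some documents (for example, text that legitimately contains `*` or `_`).

```python
import difflib
import re


def strip_markdown(md: str) -> str:
    """Remove common Markdown syntax so mostly the underlying text remains."""
    text = re.sub(r"!\[([^\]]*)\]\([^)]*\)", r"\1", md)              # images -> alt text
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)             # links -> link text
    text = re.sub(r"^\s{0,3}#{1,6}\s+", "", text, flags=re.M)        # heading markers
    text = re.sub(r"^\s*[-*+]\s+", "", text, flags=re.M)             # bullet markers
    text = re.sub(r"[*_`|]", "", text)                               # emphasis, code, table pipes
    return text


def fidelity_ratio(source_text: str, markdown: str) -> float:
    """Similarity of the two word sequences; 1.0 means identical words in identical order."""
    source_words = source_text.split()
    output_words = strip_markdown(markdown).split()
    matcher = difflib.SequenceMatcher(None, source_words, output_words, autojunk=False)
    return matcher.ratio()


# The misspelling "recieve" is deliberate: the prompt forbids correcting it.
source = "Course Outline Week 1 covers recieve and send"
faithful = "# Course Outline\n\n- **Week 1** covers *recieve* and send\n"
corrected = faithful.replace("recieve", "receive")
summarized = "# Course Outline\n"

for label, md in [("faithful", faithful), ("corrected", corrected), ("summarized", summarized)]:
    print(label, round(fidelity_ratio(source, md), 3))
```

A ratio below 1.0 does not say which failure occurred, only that the words differ; it is a prompt for manual review rather than a verdict.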

CAVEATS

Dependencies
  • Provide the PDF file for conversion
Missing context
  • Detailed description or example of the school logo to exclude.
  • Exact method for providing the PDF file (e.g., text extraction, upload link).
  • Handling of non-text elements like charts or complex layouts beyond basic images/tables.
Ambiguities
  • The 'school logo' is not identified precisely beyond 'typically located in the header'.
  • The note 'Specify how the user should provide the image URLs or paths' is ambiguous about whether and where this instruction should appear in the output.

QUALITY

OVERALL
0.87
CLARITY
0.90
SPECIFICITY
0.95
REUSABILITY
0.80
COMPLETENESS
0.85

IMPROVEMENT SUGGESTIONS

  • Add a precise description or regex/pattern for identifying the school logo, e.g., 'Exclude any header text/image containing "University Name" or specific emblem.'
  • Replace the image note with explicit output instructions: 'For images, use ![alt text](image_url) and append a note at the end: "Replace image_url with actual paths provided by user."'
  • Include rules for page breaks, footers, or watermarks: 'Convert page numbers to subtle dividers like --- if present.'
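
If the prompt itself is left unchanged, the first and third suggestions could also be approximated as a post-processing step on the model's output. The sketch below is one possible reading: it drops image lines whose alt text or path matches a logo pattern and replaces bare page-number lines with a `---` divider. `LOGO_PATTERN`, `PAGE_NUMBER`, and `postprocess` are hypothetical names, and both patterns are placeholders to adapt; in particular, the page-number pattern would also match a legitimate line that contains only a number.

```python
import re

# Hypothetical patterns: tune them to the documents being converted.
LOGO_PATTERN = re.compile(r"logo|University Name", re.IGNORECASE)
IMAGE_LINE = re.compile(r"^\s*!\[([^\]]*)\]\(([^)]*)\)\s*$")
PAGE_NUMBER = re.compile(r"^\s*(?:Page\s+)?\d+\s*$", re.IGNORECASE)


def postprocess(markdown: str) -> str:
    """Drop logo image lines and turn bare page-number lines into dividers."""
    out: list[str] = []
    for line in markdown.splitlines():
        image = IMAGE_LINE.match(line)
        if image and (LOGO_PATTERN.search(image.group(1)) or LOGO_PATTERN.search(image.group(2))):
            continue  # skip the logo image entirely
        if PAGE_NUMBER.match(line):
            # A '---' directly under a text line would be parsed as a setext
            # heading, so make sure a blank line precedes the divider.
            if out and out[-1].strip():
                out.append("")
            out.append("---")
            continue
        out.append(line)
    return "\n".join(out)


sample = (
    "![School logo](img/logo.png)\n"
    "# Syllabus\n"
    "Intro text\n"
    "Page 1\n"
    "![Campus map](img/map.png)"
)
print(postprocess(sample))
```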

USAGE

Copy the prompt above and paste it into your AI of choice — Claude, ChatGPT, Gemini, or anywhere else you're working. Replace any placeholder sections with your own context, then ask for the output.
