🚀 Production-ready HTML to PDF microservice using Docker + Puppeteer. Deploy on Render, Koyeb, Railway. Fast, secure, scalable PDF generation API.

Tactition/WebAssets-Pdf-Generator

📄 WebAssets PDF Generator

Production-Ready HTML to PDF Microservice
Serverless • Scalable • Lightning Fast

🚀 Quick Start • 📖 Documentation • 🎯 Deploy • 💬 Support

🌟 Overview

WebAssets PDF Generator is a production-grade, Dockerized microservice that converts HTML pages to high-quality PDFs using the Chrome/Chromium rendering engine. Built for reliability and performance, it handles everything from simple invoices to complex, multi-page reports with pixel-perfect accuracy.

✨ Key Features

Core Capabilities

  • 🎨 Pixel-Perfect Rendering - Uses real Chrome engine for accurate PDF generation
  • ⚑ Blazing Fast - Optimized Docker image with <1 minute cold starts
  • πŸ”’ Secure by Default - API key authentication and isolated execution
  • 🌍 Deploy Anywhere - Run on Render, Koyeb, Railway, AWS, or any Docker platform
  • πŸ“± Responsive Support - Generates PDFs from mobile-optimized pages
  • 🎯 Zero Configuration - Works out of the box with sensible defaults

Production-Ready Queue System ⭐ NEW

  • 🚦 Concurrent Job Management - Limits simultaneous PDF generations (max 2 on free tier)
  • πŸ“‹ Fair Job Queuing - First-come-first-served processing prevents crashes
  • ⏱️ Queue Wait Tracking - Monitor how long jobs wait before processing
  • πŸ’Ύ Job History - Stores last 100 jobs with full metrics
  • πŸ”„ Automatic Retries - Graceful failure handling with detailed error logs

Enhanced Monitoring Dashboard ⭐ NEW

  • 📊 Real-time Queue Status - See jobs waiting, processing, and completed
  • 📈 Daily Analytics - Success rates, average duration, file sizes
  • 💽 Memory Tracking - Per-job memory usage monitoring
  • 📉 Performance Metrics - Detailed render times and queue wait statistics
  • ⚠️ Error Logging - Comprehensive error tracking per job
  • 🔄 Auto-refresh - Dashboard updates every 5 seconds

🎯 Use Cases

Industry Application
🏒 SaaS Platforms Generate invoices, reports, contracts
✈️ Travel & Hospitality Create booking confirmations, itineraries, vouchers
πŸ₯ Healthcare Export patient records, prescriptions, lab reports
πŸ“Š Analytics Dashboards Convert charts and graphs to printable PDFs
πŸ“§ Email Marketing Archive campaigns as PDF backups
πŸŽ“ Education Generate certificates, transcripts, course materials

🚀 Quick Start

Prerequisites

  • Docker installed on your system
  • Node.js 18+ (for local development)

Option 1: Using Docker (Recommended)

# Clone the repository
git clone https://github.com/Tactition/webassets-pdf-generator.git
cd webassets-pdf-generator

# Build the Docker image
docker build -t pdf-generator .

# Run the container
docker run -p 3000:3000 \
  -e PDF_SECRET_KEY=your_secret_key_here \
  pdf-generator

Visit http://localhost:3000 to see the dashboard!

Option 2: Local Development

# Install dependencies
npm install

# Set environment variable
export PDF_SECRET_KEY=your_secret_key_here

# Start the server
npm start

📡 API Reference

Submit PDF Generation Job (Queue-based)

Endpoint: POST /generate

Request:

{
  "targetUrl": "https://example.com/invoice?id=123",
  "secretKey": "your_secret_key_here",
  "filename": "invoice_123.pdf"
}

Response: Job queued (immediate response)

{
  "status": "queued",
  "jobId": "abc-123-def-456",
  "position": 3,
  "estimatedWait": "~12 seconds",
  "statusUrl": "/status/abc-123-def-456",
  "downloadUrl": "/download/abc-123-def-456"
}
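
A client may want to check its inputs before submitting a job. The sketch below is one possible way to do that in Node.js 18+ (which ships a global `fetch`). The helper names `buildGenerateRequest` and `submitJob` are hypothetical and not part of the service, and appending a `.pdf` extension is an assumption about what callers want.

```javascript
// Hypothetical client-side helper (not part of the service): validates the
// inputs and builds the JSON body that POST /generate expects.
function buildGenerateRequest(targetUrl, secretKey, filename) {
  const url = new URL(targetUrl); // throws a TypeError on malformed URLs
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }
  if (!secretKey) {
    throw new Error('secretKey is required');
  }
  // Assumption: add a .pdf extension when the caller leaves it off.
  const safeName = filename.toLowerCase().endsWith('.pdf') ? filename : `${filename}.pdf`;
  return { targetUrl: url.href, secretKey, filename: safeName };
}

// Sends the body to POST /generate and returns the parsed "queued" response.
async function submitJob(baseUrl, body) {
  const res = await fetch(`${baseUrl}/generate`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`Submit failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

Usage would look like `submitJob('https://your-service.com', buildGenerateRequest(url, key, 'invoice_123'))`, with the returned `jobId` feeding the status and download endpoints.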

Check Job Status

Endpoint: GET /status/:jobId

Response (Queued):

{
  "status": "queued",
  "message": "Job is in queue or processing"
}

Response (Completed):

{
  "status": "completed",
  "jobId": "abc-123-def-456",
  "filename": "invoice_123.pdf",
  "duration": 3.24,
  "fileSize": 245678,
  "memoryUsed": 128
}

Response (Failed):

{
  "status": "failed",
  "error": "net::ERR_CONNECTION_REFUSED at http://..."
}

Download Generated PDF

Endpoint: GET /download/:jobId

Response: Binary PDF file (available for 5 minutes after completion)

Status Codes:

  • 200 - PDF ready, downloading
  • 404 - Job not found or PDF expired
  • 202 - Job still processing, try again later
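
The three status codes above map naturally onto three client actions. The following is a minimal sketch of that mapping; the function name and the action labels are illustrative assumptions, not something the service defines.

```javascript
// Maps the documented /download/:jobId status codes to a client action.
// The action names are hypothetical labels chosen for this sketch.
function interpretDownloadStatus(httpStatus) {
  switch (httpStatus) {
    case 200:
      return { action: 'save', retry: false };     // PDF body is in the response
    case 202:
      return { action: 'wait', retry: true };      // still processing, poll again
    case 404:
      return { action: 'resubmit', retry: false }; // unknown job or PDF expired
    default:
      return { action: 'error', retry: false };    // anything undocumented
  }
}
```

Treating 404 as "resubmit" rather than "retry" reflects the 5-minute retention window: once a PDF has expired, polling again will not bring it back.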

Full Workflow Example

# Step 1: Submit job (-s keeps curl's progress meter out of the output)
JOB_RESPONSE=$(curl -s -X POST https://your-service.com/generate \
  -H "Content-Type: application/json" \
  -d '{
    "targetUrl": "https://example.com/invoice",
    "secretKey": "your_secret_key_here",
    "filename": "invoice.pdf"
  }')

# Extract job ID
JOB_ID=$(echo "$JOB_RESPONSE" | jq -r '.jobId')

# Step 2: Poll for completion
while true; do
  STATUS=$(curl -s "https://your-service.com/status/$JOB_ID" | jq -r '.status')
  if [ "$STATUS" = "completed" ]; then
    break
  elif [ "$STATUS" = "failed" ]; then
    echo "Job failed"
    exit 1
  fi
  sleep 1
done

# Step 3: Download PDF
curl -s "https://your-service.com/download/$JOB_ID" --output invoice.pdf
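
The shell loop above polls forever if a job never leaves the queue. A client written in Node.js could bound the wait with a deadline, roughly as sketched here. `waitForJob` and `httpFetchStatus` are hypothetical names, and the 1-second interval and 2-minute timeout are arbitrary defaults rather than values taken from the service.

```javascript
// Polls a status source until the job completes, fails, or the deadline passes.
// `fetchStatus` is injected so the loop can be exercised without a network.
async function waitForJob(jobId, fetchStatus, { intervalMs = 1000, timeoutMs = 120000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const body = await fetchStatus(jobId);
    if (body.status === 'completed') {
      return body;
    }
    if (body.status === 'failed') {
      throw new Error(`Job ${jobId} failed: ${body.error}`);
    }
    // Any other status (e.g. "queued") means: wait and ask again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Timed out waiting for job ${jobId}`);
}

// Example fetcher using Node 18+ global fetch; the base URL is a placeholder.
const httpFetchStatus = (jobId) =>
  fetch(`https://your-service.com/status/${jobId}`).then((res) => res.json());
```

A call such as `await waitForJob(jobId, httpFetchStatus)` then either returns the completed status object or throws, so the caller never hangs indefinitely.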

Health Check

Endpoint: GET /health

Response:

{
  "status": "healthy",
  "uptime": 12345,
  "queue": {
    "size": 3,
    "pending": 2
  }
}
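
An uptime monitor could reduce this payload to a simple verdict. The sketch below assumes `queue.size` counts waiting jobs and `queue.pending` counts running jobs (the meaning those names have in p-queue). The README does not spell this out, and the `maxQueueSize` threshold is a made-up default.

```javascript
// Summarizes a /health payload. Field meanings are assumed, see the note above.
function summarizeHealth(payload, maxQueueSize = 10) {
  const size = payload.queue?.size ?? 0;       // jobs waiting (assumed)
  const pending = payload.queue?.pending ?? 0; // jobs running (assumed)
  return {
    healthy: payload.status === 'healthy',
    backlog: size + pending,
    overloaded: size > maxQueueSize,           // hypothetical alert threshold
  };
}
```

For the sample response above this yields a backlog of 5 and `overloaded: false`.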

🎨 Dashboard Preview

The service includes a production-grade monitoring dashboard built with Tailwind CSS:

Real-time Queue Metrics

  • 📊 Queue Size - Jobs waiting to be processed
  • ⚙️ Active Jobs - Currently processing (X/2)
  • ✅ Success Rate - Percentage of successful completions
  • ❌ Failed Today - Count of failed jobs

Daily Performance Analytics

  • ⏱️ Avg Duration - Average PDF generation time
  • 📦 Avg File Size - Average output PDF size
  • ✅ Completed Jobs - Total successful today
  • 💾 Current Memory - Real-time memory usage

Detailed Job History Table

Each job displays:

  • ⏰ Timestamp
  • ✅/❌ Status (Success/Failed)
  • 📄 Filename
  • ⏱️ Duration
  • 📦 File Size
  • 💽 Memory Used
  • ⏳ Queue Wait Time
  • ⚠️ Error Message (if failed)

Dashboard auto-refreshes every 5 seconds to show live updates.
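
The daily figures on the dashboard can be derived from the job history with a few reductions. The following is only a sketch of that arithmetic: the record fields (`status`, `duration`, `fileSize`) mirror the `/status` response, but the shape of the service's in-memory job records is an assumption, and `summarizeJobs` is a hypothetical name.

```javascript
// Aggregates a list of job records into dashboard-style metrics.
function summarizeJobs(jobs) {
  const completed = jobs.filter((job) => job.status === 'completed');
  const failed = jobs.filter((job) => job.status === 'failed');
  const finished = completed.length + failed.length;
  const mean = (values) =>
    values.length ? values.reduce((sum, v) => sum + v, 0) / values.length : 0;
  return {
    successRate: finished ? (completed.length / finished) * 100 : 0, // percent
    completed: completed.length,
    failed: failed.length,
    avgDuration: mean(completed.map((job) => job.duration)), // seconds
    avgFileSize: mean(completed.map((job) => job.fileSize)), // bytes
  };
}
```

Jobs that are still queued are ignored, so the success rate only reflects jobs that have actually finished.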


🌐 Deployment

Deploy to Render.com (Free Tier)

  1. Fork this repository
  2. Go to render.com β†’ New Web Service
  3. Connect your GitHub repository
  4. Render auto-detects the Dockerfile
  5. Set environment variable: PDF_SECRET_KEY
  6. Click Create Web Service

βœ… Done! Your service will be live at https://your-app.onrender.com

Deploy to Koyeb

  1. Push this repo to GitHub
  2. Go to koyeb.com → Create App
  3. Select GitHub → Choose your repository
  4. Build method: Docker
  5. Port: 3000
  6. Set environment: PDF_SECRET_KEY
  7. Deploy!

Deploy to Railway

  1. Visit railway.app
  2. Click New Project → Deploy from GitHub
  3. Railway auto-detects Dockerfile
  4. Add environment variable: PDF_SECRET_KEY
  5. Get your URL from the dashboard

⚙️ Configuration

Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| PDF_SECRET_KEY | ✅ Yes | - | Authentication key for API requests |
| PORT | ❌ No | 3000 | Server port |

Queue Configuration

The queue system is configured in server.js with these defaults:

| Setting | Value | Description |
|---|---|---|
| Concurrency | 2 | Max simultaneous PDF generations |
| Timeout | 60000ms | Max time per job (60 seconds) |
| Job Retention | 5 minutes | How long completed PDFs are cached |
| History Size | 100 jobs | Max jobs stored in memory |

To modify queue settings, edit server.js:

const pdfQueue = new PQueue({ 
    concurrency: 2,        // Increase for more powerful servers
    timeout: 60000,        // Adjust timeout as needed
    throwOnTimeout: true
});

⚠️ Note: On free-tier hosting, keep concurrency at 2 to prevent memory exhaustion.
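
To see what the `concurrency` setting does, it can help to look at a stripped-down limiter. The class below is not the service's code and not PQueue itself. It is a minimal sketch of the same idea: tasks start in first-come-first-served order, and never more than `concurrency` run at once.

```javascript
// Minimal FIFO concurrency limiter, for illustration only.
class SimpleQueue {
  constructor(concurrency) {
    this.concurrency = concurrency;
    this.running = 0;
    this.waiting = [];
  }

  // Queues an async task and returns a promise for its result.
  add(task) {
    return new Promise((resolve, reject) => {
      this.waiting.push({ task, resolve, reject });
      this.next();
    });
  }

  // Starts waiting tasks, oldest first, while there is spare capacity.
  next() {
    while (this.running < this.concurrency && this.waiting.length > 0) {
      const { task, resolve, reject } = this.waiting.shift();
      this.running += 1;
      Promise.resolve()
        .then(task)
        .then(resolve, reject)
        .finally(() => {
          this.running -= 1;
          this.next(); // a slot freed up, so start the next waiting task
        });
    }
  }
}
```

With `new SimpleQueue(2)`, a burst of ten `add()` calls runs two tasks immediately and holds the other eight in `waiting`, which is the behavior that keeps memory bounded on small instances.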

Puppeteer Options

Customize PDF generation by modifying server.js:

const pdfBuffer = await page.pdf({
    format: 'A4',           // A4, Letter, Legal, A3
    printBackground: true,  // Include CSS backgrounds
    margin: {               // Page margins
        top: '10mm',
        right: '10mm',
        bottom: '10mm',
        left: '10mm'
    },
    displayHeaderFooter: true,
    headerTemplate: '<div>Custom Header</div>',
    footerTemplate: '<div>Page <span class="pageNumber"></span></div>'
});
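
Rather than editing the options inline for every variation, the defaults could be kept in one place and merged with per-request overrides. The sketch below assumes such a helper. `DEFAULT_PDF_OPTIONS` and `buildPdfOptions` are hypothetical names that do not exist in server.js, while `format`, `printBackground`, `margin`, and `landscape` are standard Puppeteer PDF options.

```javascript
// Defaults copied from the example above.
const DEFAULT_PDF_OPTIONS = {
  format: 'A4',
  printBackground: true,
  margin: { top: '10mm', right: '10mm', bottom: '10mm', left: '10mm' },
};

// Merges caller overrides into the defaults. Margins are merged per side so
// that overriding one side keeps the other three.
function buildPdfOptions(overrides = {}) {
  return {
    ...DEFAULT_PDF_OPTIONS,
    ...overrides,
    margin: { ...DEFAULT_PDF_OPTIONS.margin, ...(overrides.margin ?? {}) },
  };
}
```

The result can be passed straight to Puppeteer, for example `await page.pdf(buildPdfOptions({ format: 'Letter', landscape: true }))`.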

🛠️ Integration Examples

PHP Integration

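$base_url = 'https://your-service.com';
$data = json_encode([
    'targetUrl' => 'https://example.com/invoice',
    'secretKey' => 'your_secret_key_here',
    'filename' => 'invoice.pdf'
]);

// Step 1: Submit the job; the response is JSON with a jobId, not the PDF itself
$ch = curl_init($base_url . '/generate');
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$job = json_decode(curl_exec($ch), true);
curl_close($ch);

// Step 2: Poll the status endpoint until the job leaves the queue
do {
    sleep(1);
    $status = json_decode(file_get_contents($base_url . '/status/' . $job['jobId']), true);
} while (($status['status'] ?? null) === 'queued');

if (($status['status'] ?? null) !== 'completed') {
    http_response_code(502);
    exit('PDF generation failed: ' . ($status['error'] ?? 'unknown error'));
}

// Step 3: Download the PDF and send it to the browser
$pdf_binary = file_get_contents($base_url . '/download/' . $job['jobId']);

header('Content-Type: application/pdf');
header('Content-Disposition: attachment; filename="invoice.pdf"');
echo $pdf_binary;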

Python Integration

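import time
import requests

BASE_URL = 'https://your-service.com'

# Submit the job; the response is JSON with a jobId, not the PDF itself
job = requests.post(f'{BASE_URL}/generate', json={
    'targetUrl': 'https://example.com/page',
    'secretKey': 'your_secret_key_here',
    'filename': 'output.pdf'
}).json()

# Poll until the job leaves the queue, then download the result
while (status := requests.get(f"{BASE_URL}/status/{job['jobId']}").json())['status'] == 'queued':
    time.sleep(1)
if status['status'] != 'completed':
    raise RuntimeError(status.get('error', 'PDF generation failed'))

with open('output.pdf', 'wb') as f:
    f.write(requests.get(f"{BASE_URL}/download/{job['jobId']}").content)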

JavaScript/Node.js Integration

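const axios = require('axios');
const fs = require('fs');
const BASE_URL = 'https://your-service.com';

async function generatePdf() {
    // Submit the job; the response is JSON with a jobId, not the PDF itself
    const { data: job } = await axios.post(`${BASE_URL}/generate`, {
        targetUrl: 'https://example.com/page',
        secretKey: 'your_secret_key_here',
        filename: 'output.pdf'
    });
    // Poll until the job leaves the queue
    let status;
    do {
        await new Promise((resolve) => setTimeout(resolve, 1000));
        ({ data: status } = await axios.get(`${BASE_URL}/status/${job.jobId}`));
    } while (status.status === 'queued');
    if (status.status !== 'completed') throw new Error(status.error || 'PDF generation failed');
    // Download the PDF as binary data
    const pdf = await axios.get(`${BASE_URL}/download/${job.jobId}`, { responseType: 'arraybuffer' });
    fs.writeFileSync('output.pdf', pdf.data);
}

generatePdf().catch(console.error);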

🐳 Docker Architecture

This service uses the official Puppeteer Docker image which includes:

  • βœ… Pre-installed Chromium browser
  • βœ… All required system dependencies
  • βœ… Optimized for <1 minute builds
  • βœ… Regular security updates

Image Size: ~200MB (compressed)
Build Time: ~60 seconds
Cold Start: ~5-10 seconds


🔒 Security Best Practices

  1. Use Strong Secret Keys: Generate random 32+ character keys
  2. Enable HTTPS: Always use SSL for production deployments
  3. Rate Limiting: Implement rate limiting on your reverse proxy
  4. Whitelist IPs: Restrict access to known IP addresses if possible
  5. Monitor Logs: Watch for suspicious activity in dashboard
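
For the first recommendation, Node's built-in `crypto` module can generate a suitable key. The sketch below also shows a constant-time comparison, which avoids leaking key contents through timing differences. How the service itself compares keys is not shown in this README, so `keysMatch` is offered as an assumption about good practice, not a description of server.js.

```javascript
const nodeCrypto = require('node:crypto');

// 32 random bytes rendered as hex gives a 64-character key,
// comfortably above the 32-character minimum suggested above.
function generateSecretKey(bytes = 32) {
  return nodeCrypto.randomBytes(bytes).toString('hex');
}

// Constant-time comparison. Hashing both inputs first gives buffers of equal
// length, which timingSafeEqual requires.
function keysMatch(provided, expected) {
  const a = nodeCrypto.createHash('sha256').update(String(provided)).digest();
  const b = nodeCrypto.createHash('sha256').update(String(expected)).digest();
  return nodeCrypto.timingSafeEqual(a, b);
}
```

The output of `generateSecretKey()` can be used directly as the value of `PDF_SECRET_KEY`.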

📊 Performance Benchmarks

With Queue System (Production)

| Metric | Free Tier | Paid Tier |
|---|---|---|
| Max Concurrent Jobs | 2 simultaneous | 4-8 simultaneous |
| Average Generation Time | 2-5 seconds | 2-4 seconds |
| Queue Wait Time | ~4s per queued job | ~2s per queued job |
| Memory Usage | ~120MB per job | ~120MB per job |
| Max PDF Size | Unlimited | Unlimited |
| Jobs Processed/Hour | ~720 jobs | ~1800 jobs |
| Crash Resilience | ✅ 100% (queue prevents overload) | ✅ 100% |
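
The queue wait figures translate into a simple linear estimate. The sketch below assumes the wait grows by a fixed number of seconds per job ahead in the queue, which is a simplification rather than the service's actual formula. It does reproduce the `"position": 3` and `"estimatedWait": "~12 seconds"` pair from the sample `/generate` response. `estimateWaitSeconds` is a hypothetical name.

```javascript
// Rough wait estimate: queue position multiplied by seconds per queued job.
// Defaults to the free-tier figure of ~4 s; pass 2 for the paid-tier figure.
function estimateWaitSeconds(position, secondsPerJob = 4) {
  if (!Number.isInteger(position) || position < 0) {
    throw new RangeError('position must be a non-negative integer');
  }
  return position * secondsPerJob;
}
```

For example, `estimateWaitSeconds(3)` gives 12 on the free tier and `estimateWaitSeconds(3, 2)` gives 6 on the paid tier.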

Without Queue (Old - Not Recommended)

⚠️ Concurrent Requests: 10+ simultaneous = Server Crash
✅ With Queue: Unlimited requests = All Successfully Processed


🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

  • Puppeteer Team - For the amazing headless Chrome library
  • Docker Community - For simplified deployment workflows
  • Open Source Contributors - For inspiration and support

👨‍💻 About the Author

Zahid

Entrepreneur & Self-Taught Software Developer

📍 Srinagar, Kashmir

Zahid is a self-taught software developer and entrepreneur passionate about building scalable, production-ready solutions. Through WebAssets, he creates tools that empower businesses to automate workflows and enhance productivity.

"Building software that solves real problems, one line of code at a time."

🏒 WebAssets

WebAssets is a software solutions company specializing in:

  • πŸ”§ Custom SaaS development
  • ☁️ Cloud-native microservices
  • πŸ€– Workflow automation tools
  • πŸ“Š Business intelligence dashboards

Get in Touch:
πŸ“§ Email: webassets.tech@gmail.com
πŸ“± Phone: +91 788 980 4942
🌐 Instagram: @webassets.tech
πŸŽ₯ YouTube: @WebAssetsTech
πŸ’Ό LinkedIn: Zahid
🧡 Threads: @webassets.tech


🔧 Troubleshooting

Queue-Related Issues

Job stuck in queue forever:

  • Check dashboard to see if service is processing jobs
  • Restart the service (jobs in queue will be lost)
  • Check Koyeb/Render logs for errors

"PDF not found or expired" error:

  • PDFs are cached for 5 minutes after completion
  • Download immediately after job completes
  • If expired, submit a new generation request
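
A client that records when a job completed can check the 5-minute window locally before attempting a download. This is a minimal sketch under that assumption: the service does not expose a completion timestamp in the responses shown here, so `completedAtMs` would be the client's own record, and `isDownloadAvailable` is a hypothetical name.

```javascript
// Retention window for completed PDFs, as documented above.
const RETENTION_MS = 5 * 60 * 1000;

// Returns true while a PDF completed at `completedAtMs` should still be cached.
function isDownloadAvailable(completedAtMs, nowMs = Date.now()) {
  return nowMs - completedAtMs < RETENTION_MS;
}
```

When this returns false, the client can skip the download attempt and submit a new generation request right away.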

High queue wait times:

  • Normal on free tier with 2 concurrent jobs
  • Upgrade to paid tier for higher concurrency
  • Or consider self-hosting with more resources

Memory exceeded errors:

  • Reduce concurrency in server.js (from 2 to 1)
  • Limit complexity of HTML pages being converted
  • Upgrade to hosting with more RAM

Jobs failing silently:

  • Check /status/:jobId for error messages
  • Verify targetUrl is publicly accessible
  • Check Chrome console errors in logs

💬 Support

Need help? Have questions?


⭐ Star this repo if you find it helpful!

Made with ❤️ by Zahid @ WebAssets
