Introduction
In this blog post, we'll walk through how to convert a PowerPoint presentation (PPTX) to a PDF and then convert that PDF into a series of images using Java. We'll use Apache POI to read the PPTX file and render each slide to an image, iText to assemble those images into a PDF, and PDFBox to render the PDF pages to PNG files. Because each slide is embedded as an image, the text in the resulting PDF is not selectable. Let's get started!
Prerequisites
Before we begin, make sure you have the following set up:
Java Development Kit (JDK) installed.
Maven for dependency management.
An IDE such as IntelliJ IDEA or Eclipse.
Maven Dependencies
First, let's define the project dependencies. Add the following to the pom.xml file of your Maven project:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.example</groupId>
    <artifactId>ppt-to-pdf</artifactId>
    <version>1.0-SNAPSHOT</version>
    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
    </properties>
    <dependencies>
        <!-- Apache POI for reading PPTX files -->
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi</artifactId>
            <version>5.2.3</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>5.2.3</version>
        </dependency>
        <!-- iText for creating PDF files -->
        <dependency>
            <groupId>com.itextpdf</groupId>
            <artifactId>itextpdf</artifactId>
            <version>5.5.13.3</version>
        </dependency>
        <!-- PDFBox for rendering PDF files -->
        <dependency>
            <groupId>org.apache.pdfbox</groupId>
            <artifactId>pdfbox</artifactId>
            <version>2.0.27</version>
        </dependency>
        <!-- pdfbox-tools provides ImageIOUtil; the bundled pdfbox-app jar would only duplicate these classes -->
        <dependency>
            <groupId>org.apache.pdfbox</groupId>
            <artifactId>pdfbox-tools</artifactId>
            <version>2.0.27</version>
        </dependency>
        <!-- Log4j2 dependencies -->
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.20.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>2.20.0</version>
        </dependency>
    </dependencies>
</project>
Code Explanation
Convert PPTX to PDF
We'll start by converting the PowerPoint presentation to a PDF. The complete program, including the PDF-to-image step explained later, looks like this:
import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFSlide;
import com.itextpdf.text.Document;
import com.itextpdf.text.Image;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfWriter;
import javax.imageio.ImageIO;
import java.awt.Graphics2D;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
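import java.io.IOException;
// PDFBox classes used by the PDF-to-image step
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.pdfbox.tools.imageio.ImageIOUtil;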
public class Main {
    public static void main(String[] args) throws IOException {
        String pptxFileName = "Sample.pptx";
        String pdfFileName = "presentation.pdf";
        // Convert PPTX to PDF
        try {
            FileInputStream inputStream = new FileInputStream(new File(pptxFileName));
            XMLSlideShow ppt = new XMLSlideShow(inputStream);
            Document pdfDocument = new Document(new Rectangle((float) ppt.getPageSize().getWidth(), (float) ppt.getPageSize().getHeight()));
            PdfWriter writer = PdfWriter.getInstance(pdfDocument, new FileOutputStream(pdfFileName));
            pdfDocument.open();
            for (XSLFSlide slide : ppt.getSlides()) {
                BufferedImage img = new BufferedImage((int) ppt.getPageSize().getWidth(), (int) ppt.getPageSize().getHeight(), BufferedImage.TYPE_INT_RGB);
                Graphics2D graphics = img.createGraphics();
                graphics.setRenderingHint(java.awt.RenderingHints.KEY_ANTIALIASING, java.awt.RenderingHints.VALUE_ANTIALIAS_ON);
                graphics.setRenderingHint(java.awt.RenderingHints.KEY_RENDERING, java.awt.RenderingHints.VALUE_RENDER_QUALITY);
                graphics.setRenderingHint(java.awt.RenderingHints.KEY_INTERPOLATION, java.awt.RenderingHints.VALUE_INTERPOLATION_BICUBIC);
                // Set background color
                graphics.setColor(java.awt.Color.WHITE);
                graphics.fill(new Rectangle2D.Float(0, 0, img.getWidth(), img.getHeight()));
                // Draw slide content on the graphics
                slide.draw(graphics);
                // Convert graphics to image
                ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
                ImageIO.write(img, "png", byteArrayOutputStream);
                byte[] bytes = byteArrayOutputStream.toByteArray();
                // Add image to PDF document
                Image pdfImage = Image.getInstance(bytes);
                float pdfWidth = pdfDocument.getPageSize().getWidth();
                float pdfHeight = pdfDocument.getPageSize().getHeight();
                float imgWidth = pdfImage.getWidth();
                float imgHeight = pdfImage.getHeight();
                float scaleX = pdfWidth / imgWidth;
                float scaleY = pdfHeight / imgHeight;
                float scale = Math.min(scaleX, scaleY);
                pdfImage.scaleAbsolute(imgWidth * scale, imgHeight * scale);
                pdfImage.setAbsolutePosition(
                        (pdfWidth - imgWidth * scale) / 2,
                        (pdfHeight - imgHeight * scale) / 2
                );
                pdfDocument.add(pdfImage);
                pdfDocument.newPage(); // Start a new page for the next slide
                graphics.dispose();
            }
            pdfDocument.close();
            System.out.println("PPTX converted to PDF successfully!");
        } catch (Exception e) {
            e.printStackTrace();
        }
        // Convert PDF to images
        try (PDDocument document = PDDocument.load(new File(pdfFileName))) {
            PDFRenderer pdfRenderer = new PDFRenderer(document);
            for (int page = 0; page < document.getNumberOfPages(); ++page) {
                BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 300);
                String outputFileName = pdfFileName.replace(".pdf", "") + "-" + (page + 1) + ".png";
                ImageIOUtil.writeImage(bim, outputFileName, 300);
            }
            System.out.println("PDF converted to images successfully!");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
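One thing worth knowing about the listing above: each BufferedImage is created at the slide's page size, which POI reports in points, so every slide is rasterized at one pixel per point (roughly 72 DPI). Rendering the PDF at 300 DPI afterwards enlarges that raster rather than adding detail. A possible way around this is to draw onto a larger image through a scaled graphics context. The sketch below uses plain AWT only; the class name ScaledRender, the render method, and the factor 2.0 are hypothetical, and a filled ellipse stands in for slide.draw(graphics).

```java
import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.geom.Ellipse2D;
import java.awt.image.BufferedImage;

class ScaledRender {

    // Creates an image 'scale' times larger than the page and scales the
    // graphics context, so drawing code can keep using page coordinates.
    static BufferedImage render(Dimension pageSize, double scale) {
        int width = (int) Math.ceil(pageSize.width * scale);
        int height = (int) Math.ceil(pageSize.height * scale);
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
            // White background, painted in pixel coordinates before scaling
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            g.scale(scale, scale);
            // Placeholder for slide.draw(g): a shape given in page coordinates
            g.setColor(Color.BLUE);
            g.fill(new Ellipse2D.Double(0, 0, pageSize.width, pageSize.height));
        } finally {
            g.dispose();
        }
        return img;
    }

    public static void main(String[] args) {
        // Assumed 720 x 540 point page, rendered at twice the resolution
        BufferedImage img = render(new Dimension(720, 540), 2.0);
        System.out.println(img.getWidth() + " x " + img.getHeight()); // prints 1440 x 1080
    }
}
```

With an image produced this way, the scale-to-fit logic in the listing shrinks it back to the page size in the PDF while the extra pixels are kept for later rendering.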
Step-by-Step Explanation
Step 1: Reading the PPTX File
We start by reading the PPTX file into an XMLSlideShow object.
FileInputStream inputStream = new FileInputStream(new File(pptxFileName));
XMLSlideShow ppt = new XMLSlideShow(inputStream);
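The two lines above never close the input stream. In a short-lived main method that is harmless, but in a longer-running application it is usually safer to open the stream in a try-with-resources block (XMLSlideShow is closeable as well and could be declared in the same block). The following JDK-only sketch shows the pattern; the class StreamHandling, the method countBytes, and the temporary file standing in for Sample.pptx are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class StreamHandling {

    // Reads the whole file and returns its size in bytes. The stream is
    // closed automatically when the try block exits, even on an exception.
    static long countBytes(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            long total = 0;
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for Sample.pptx so the sketch runs anywhere
        Path temp = Files.createTempFile("sample", ".bin");
        try {
            Files.write(temp, new byte[] {1, 2, 3, 4, 5});
            System.out.println(countBytes(temp) + " bytes read"); // prints 5 bytes read
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```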
Step 2: Creating a PDF Document
Next, we create a new PDF document with the same dimensions as the PPTX slides.
Document pdfDocument = new Document(new Rectangle((float) ppt.getPageSize().getWidth(), (float) ppt.getPageSize().getHeight()));
PdfWriter writer = PdfWriter.getInstance(pdfDocument, new FileOutputStream(pdfFileName));
pdfDocument.open();
Step 3: Processing Each Slide
For each slide in the PPTX file, we create a BufferedImage, draw the slide content onto this image, convert it to a byte array, and add it to the PDF document.
for (XSLFSlide slide : ppt.getSlides()) {
    BufferedImage img = new BufferedImage((int) ppt.getPageSize().getWidth(), (int) ppt.getPageSize().getHeight(), BufferedImage.TYPE_INT_RGB);
    Graphics2D graphics = img.createGraphics();
    graphics.setRenderingHint(java.awt.RenderingHints.KEY_ANTIALIASING, java.awt.RenderingHints.VALUE_ANTIALIAS_ON);
    graphics.setRenderingHint(java.awt.RenderingHints.KEY_RENDERING, java.awt.RenderingHints.VALUE_RENDER_QUALITY);
    graphics.setRenderingHint(java.awt.RenderingHints.KEY_INTERPOLATION, java.awt.RenderingHints.VALUE_INTERPOLATION_BICUBIC);
    // Set background color
    graphics.setColor(java.awt.Color.WHITE);
    graphics.fill(new Rectangle2D.Float(0, 0, img.getWidth(), img.getHeight()));
    // Draw slide content on the graphics
    slide.draw(graphics);
    // Convert graphics to image
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    ImageIO.write(img, "png", byteArrayOutputStream);
    byte[] bytes = byteArrayOutputStream.toByteArray();
    // Add image to PDF document
    Image pdfImage = Image.getInstance(bytes);
    float pdfWidth = pdfDocument.getPageSize().getWidth();
    float pdfHeight = pdfDocument.getPageSize().getHeight();
    float imgWidth = pdfImage.getWidth();
    float imgHeight = pdfImage.getHeight();
    float scaleX = pdfWidth / imgWidth;
    float scaleY = pdfHeight / imgHeight;
    float scale = Math.min(scaleX, scaleY);
    pdfImage.scaleAbsolute(imgWidth * scale, imgHeight * scale);
    pdfImage.setAbsolutePosition(
            (pdfWidth - imgWidth * scale) / 2,
            (pdfHeight - imgHeight * scale) / 2
    );
    pdfDocument.add(pdfImage);
    pdfDocument.newPage(); // Start a new page for the next slide
    graphics.dispose();
}
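The scaling arithmetic at the end of the loop can be looked at in isolation. Taking the smaller of scaleX and scaleY keeps the aspect ratio while making sure the image fits inside the page, and the two subtractions center it. In the loop above the image is created at the page size, so the scale works out to 1; the logic starts to matter once the image and the page differ. The sketch below repeats the same arithmetic as a standalone method; the class FitToPage, the method fit, and the sample sizes are hypothetical.

```java
import java.util.Arrays;

class FitToPage {

    // Returns {scaledWidth, scaledHeight, x, y}: the largest size that fits
    // inside the page while keeping the aspect ratio, centered on the page.
    static float[] fit(float pageWidth, float pageHeight, float imgWidth, float imgHeight) {
        float scale = Math.min(pageWidth / imgWidth, pageHeight / imgHeight);
        float w = imgWidth * scale;
        float h = imgHeight * scale;
        return new float[] {w, h, (pageWidth - w) / 2, (pageHeight - h) / 2};
    }

    public static void main(String[] args) {
        // Same aspect ratio: a 1440 x 1080 image fills a 720 x 540 page exactly
        System.out.println(Arrays.toString(fit(720f, 540f, 1440f, 1080f)));
        // prints [720.0, 540.0, 0.0, 0.0]

        // A 16:9 image on the same 4:3 page is limited by width and centered vertically
        System.out.println(Arrays.toString(fit(720f, 540f, 1920f, 1080f)));
        // prints [720.0, 405.0, 0.0, 67.5]
    }
}
```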
Step 4: Closing the PDF Document
Finally, we close the PDF document, which writes the remaining content to the file and completes the conversion.
pdfDocument.close();
Convert PDF to Images
Next, we'll convert the PDF to images using PDFBox.
Step 1: Loading the PDF File
We start by loading the PDF document into a PDDocument object.
try (PDDocument document = PDDocument.load(new File(pdfFileName))) {
PDFRenderer pdfRenderer = new PDFRenderer(document);
Step 2: Rendering Each Page to an Image
We render each page of the PDF document to a BufferedImage at 300 DPI and save it as a PNG file.
for (int page = 0; page < document.getNumberOfPages(); ++page) {
    BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 300);
    String outputFileName = pdfFileName.replace(".pdf", "") + "-" + (page + 1) + ".png";
    ImageIOUtil.writeImage(bim, outputFileName, 300);
}
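To get a feel for what the loop above produces, it helps to work out the image size and the file names by hand. PDF page sizes are measured in points, with 72 points per inch, so a page rendered at a given DPI is roughly points * dpi / 72 pixels wide; the exact rounding is up to PDFBox and may differ by a pixel. The sketch below is a JDK-only approximation: the class RenderMath and both helper methods are hypothetical, and outputName strips only a trailing ".pdf", whereas the replace call in the loop removes every occurrence of ".pdf" in the name.

```java
class RenderMath {

    // Approximate pixel size of a length given in points at the requested DPI
    static int pixelsFor(float points, float dpi) {
        return (int) Math.floor(points * dpi / 72f);
    }

    // Builds the output name for a zero-based page index,
    // for example "presentation.pdf" and 0 give "presentation-1.png"
    static String outputName(String pdfFileName, int pageIndex) {
        String base = pdfFileName.endsWith(".pdf")
                ? pdfFileName.substring(0, pdfFileName.length() - 4)
                : pdfFileName;
        return base + "-" + (pageIndex + 1) + ".png";
    }

    public static void main(String[] args) {
        // Assumed 720 x 540 point page rendered at 300 DPI
        System.out.println(pixelsFor(720f, 300f) + " x " + pixelsFor(540f, 300f)); // prints 3000 x 2250
        System.out.println(outputName("presentation.pdf", 0)); // prints presentation-1.png
    }
}
```

At 300 DPI each page image is therefore several megapixels, so a lower DPI may be a reasonable choice when the images are only needed as previews.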
Conclusion
In this blog post, we demonstrated how to convert a PowerPoint presentation to a PDF and then convert that PDF into images using Java. By leveraging Apache POI, iText, and PDFBox libraries, we can efficiently handle these conversions in our Java applications. I hope you found this guide helpful and that it inspires you to explore further possibilities with these powerful libraries.
Feel free to leave comments or questions below!