I am writing a Python program that takes two PDF files: a question paper and an answer paper. The program will initially allow the user to preview both papers.

Here is my previous question (solved) relating to the above description

The link for the question paper I'll be using: https://www.cambridgeinternational.org/Images/520512-june-2021-question-paper-21.pdf

Once the user confirms the selected files with a button, a new window will open in which each multiple-choice question in the paper is individually boxed. This is the functionality I am currently working on.

Below is my current attempt; I am still trying to figure out the overall concept.

import tkinter as tk
from tkinter import filedialog, messagebox
from PyPDF2 import PdfReader
import re

def box_questions(pdf_path, canvas):
    # PdfReader accepts a file path directly, so no file handle is left open.
    pdf = PdfReader(pdf_path)

    for page in pdf.pages:
        # Fall back to an empty string if no text could be extracted.
        page_text = page.extract_text() or ""
        questions = []
        start_index = 0

        # Find lines that begin with a number followed by a period, e.g. "12.".
        for match in re.finditer(r'^\d+\.', page_text, re.MULTILINE):
            end_index = match.end()
            questions.append((start_index, end_index))
            start_index = end_index

        for start, end in questions:
            x1 = 10
            # NOTE: start and end are character offsets into page_text, not
            # page coordinates, so y1 is only a placeholder position.
            y1 = (start + end) // 2
            x2 = canvas.winfo_width() - 10
            y2 = y1 + 20
            canvas.create_rectangle(x1, y1, x2, y2, outline="red", width=2)

root = tk.Tk()
root.title("PDF Questions Highlighter")

canvas = tk.Canvas(root, width=800, height=600)
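canvas.pack()
# Let Tk compute the layout so canvas.winfo_width() reports the real width.
root.update_idletasks()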

pdf_path = filedialog.askopenfilename(title="Select PDF File")
if pdf_path:
    box_questions(pdf_path, canvas)
else:
    messagebox.showerror("Error", "Please select a PDF file.")

root.mainloop()
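
One thing I have noticed about the attempt above: `match.end()` is a character offset into the extracted text, not a vertical position on the page, so it probably cannot serve as a canvas y-coordinate. Below is a minimal sketch of a possible alternative, not a finished solution. It assumes each line of text can be obtained together with its y-position in PDF coordinates (recent PyPDF2 releases expose a `visitor_text` callback on `extract_text()` that reports a text matrix per fragment; check the documentation for the installed version). The helper names `question_regions` and `pdf_to_canvas_rect`, the `(y, text)` input format, and the question-start pattern are all hypothetical choices made for illustration.

```python
import re

# Assumption: a question starts on a line beginning with its number,
# optionally followed by a period, then whitespace and more text.
QUESTION_START = re.compile(r'^\s*(\d+)\.?\s+\S')


def question_regions(lines, page_bottom=0.0):
    """Group positioned lines into one vertical span per question.

    `lines` is a list of (y, text) tuples in PDF coordinates, where a
    larger y means higher on the page. Returns a list of
    (question_number, y_top, y_bottom) tuples, ordered top to bottom.
    """
    # Sort from the top of the page to the bottom.
    ordered = sorted(lines, key=lambda item: -item[0])

    # Collect the y-position of every line that looks like a question start.
    starts = []
    for y, text in ordered:
        match = QUESTION_START.match(text)
        if match:
            starts.append((int(match.group(1)), y))

    # Each question extends down to the next question start, and the
    # last one extends down to page_bottom.
    regions = []
    for i, (number, y_top) in enumerate(starts):
        y_bottom = starts[i + 1][1] if i + 1 < len(starts) else page_bottom
        regions.append((number, y_top, y_bottom))
    return regions


def pdf_to_canvas_rect(x0, y_top, x1, y_bottom, page_height, scale=1.0):
    """Convert a PDF-space rectangle to Tkinter canvas coordinates.

    PDF pages place the origin at the bottom-left with y growing upward,
    while a Tkinter canvas places it at the top-left with y growing
    downward, so the y values are flipped against page_height.
    """
    return (
        x0 * scale,
        (page_height - y_top) * scale,
        x1 * scale,
        (page_height - y_bottom) * scale,
    )
```

Each region could then be drawn with something like `canvas.create_rectangle(*pdf_to_canvas_rect(x0, y_top, x1, y_bottom, page_height, scale), outline="red", width=2)`. Two caveats: the reported y is typically the text baseline, so the top edge may need a small upward offset, and the pattern will also match ordinary lines that happen to begin with a number, so it would need tuning against the actual paper.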

I would appreciate any comments or suggestions.
