Stamping archive numbers onto PDFs (Python)

1. Install the dependencies (pip install pandas PyPDF2 reportlab openpyxl). Reading a legacy .xls workbook with pandas additionally requires xlrd.

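2. Prepare an Excel sheet: column A holds the ID card number, column B holds the archive number. The first row must be a header row. Store column A as text: Excel keeps only 15 significant digits of a number, so an 18-digit ID typed as a number is silently altered.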
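
The matching in step 4 is an exact string comparison, so stray spaces, a lower-case check character, or a cell read back as a float will cause a miss. A minimal sketch of how the two columns could be cleaned first, assuming the rows are already available as (ID, archive number) pairs. The helper names normalize_id and build_mapping and the sample values are made up for illustration and are not part of the script below.

```python
def normalize_id(value):
    """Return an ID number as a clean string (hypothetical helper)."""
    text = str(value).strip().upper()   # a trailing 'x' check character becomes 'X'
    if text.endswith(".0"):             # a numeric cell read back as a float, e.g. 123.0
        text = text[:-2]
    return text


def build_mapping(rows):
    """Build {id_number: archive_number} and collect IDs that occur more than once."""
    mapping, duplicates = {}, []
    for id_value, code in rows:
        id_number = normalize_id(id_value)
        if id_number in mapping:
            duplicates.append(id_number)
        mapping[id_number] = str(code).strip()  # the last row wins, as with dict(zip(...))
    return mapping, duplicates


# Made-up sample rows: padded text, a float cell, and a duplicate of that cell
rows = [(" 44010119900101123x ", "A-001"), (123.0, "A-002"), ("123", "A-003")]
mapping, duplicates = build_mapping(rows)
print(mapping)
print(duplicates)
```

Reporting the duplicates is a design choice: a dict silently keeps only the last archive number for a repeated ID, which is easy to overlook in a long sheet.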

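3. Name each PDF in the folder after its ID card number (for example, <ID card number>.pdf). PDFs in subfolders are processed as well.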
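
Because the script in step 4 overwrites the PDFs, it can help to check the file names first without touching any file. A minimal dry-run sketch, using only the standard library: split_pdfs is a hypothetical helper, and the temporary folder and file names exist only for the demonstration.

```python
import tempfile
from pathlib import Path


def split_pdfs(root, known_ids):
    """Return (matched, unmatched) lists of PDF paths relative to root."""
    matched, unmatched = [], []
    for pdf in sorted(Path(root).rglob("*")):
        if not pdf.is_file() or pdf.suffix.lower() != ".pdf":
            continue
        # The file name without its extension must equal an ID from the sheet
        target = matched if pdf.stem in known_ids else unmatched
        target.append(pdf.relative_to(root).as_posix())
    return matched, unmatched


# Demonstration on a throwaway folder with one matching and one stray PDF
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sub").mkdir()
    (Path(tmp) / "44010119900101123X.pdf").touch()
    (Path(tmp) / "sub" / "scan.PDF").touch()
    matched, unmatched = split_pdfs(tmp, {"44010119900101123X"})

print("would be stamped:", matched)
print("no row in the sheet:", unmatched)
```

In real use, known_ids would be the keys of the ID-to-number mapping built from the Excel sheet, and root would be the working folder.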

4. The script (Python)

import os
import pandas as pd
from PyPDF2 import PdfReader, PdfWriter
from reportlab.pdfgen import canvas
from reportlab.lib.units import mm
from io import BytesIO

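# Work in the current directory
root_folder = os.getcwd()

# Find the Excel workbook automatically. Skip Excel lock files ("~$...") and
# the report this script writes ('未匹配文件.xlsx'), so a rerun still picks
# up the mapping sheet. With several workbooks, the first by name is used.
excel_files = sorted(
    f for f in os.listdir(root_folder)
    if f.lower().endswith(('.xls', '.xlsx'))
    and not f.startswith('~$')
    and f != '未匹配文件.xlsx'
)
if not excel_files:
    raise SystemExit("No .xls/.xlsx file found in " + root_folder)
excel_file = excel_files[0]
# dtype=str keeps both columns as text, so numbers are not turned into floats
df = pd.read_excel(os.path.join(root_folder, excel_file), dtype=str)

# Map ID card number -> archive number (column A -> column B)
id_to_code = dict(zip(df.iloc[:, 0].astype(str).str.strip(), df.iloc[:, 1].astype(str).str.strip()))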

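# PDFs that have no matching row in the sheet
unmatched = []

# Walk every subdirectory and file
for folder_path, _, files in os.walk(root_folder):
    for filename in files:
        if filename.lower().endswith('.pdf'):
            pdf_path = os.path.join(folder_path, filename)
            # The file name without its extension is the ID card number
            id_number = os.path.splitext(filename)[0].strip()

            if id_number in id_to_code:
                code = id_to_code[id_number]

                reader = PdfReader(pdf_path)
                writer = PdfWriter()

                for page in reader.pages:
                    width = float(page.mediabox.width)
                    height = float(page.mediabox.height)

                    # Draw the archive number on a blank overlay of the same size.
                    # Helvetica has no CJK glyphs, so the number should be ASCII.
                    packet = BytesIO()
                    c = canvas.Canvas(packet, pagesize=(width, height))
                    c.setFont("Helvetica", 10)
                    # Right-aligned, 10 mm in from the top-right corner
                    c.drawRightString(width - 10 * mm, height - 10 * mm, code)
                    c.save()
                    packet.seek(0)

                    # Merge the overlay onto the page
                    watermark = PdfReader(packet)
                    page.merge_page(watermark.pages[0])
                    writer.add_page(page)

                # Overwrite the original PDF in place. Running the script
                # a second time stamps the number again, so keep a backup.
                with open(pdf_path, 'wb') as f:
                    writer.write(f)
            else:
                unmatched.append([os.path.relpath(pdf_path, root_folder)])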

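# Export the unmatched files to an Excel report, if there are any.
# The report is named '未匹配文件.xlsx' ("unmatched files").
if unmatched:
    unmatched_df = pd.DataFrame(unmatched, columns=['Unmatched PDF files'])
    unmatched_df.to_excel(os.path.join(root_folder, '未匹配文件.xlsx'), index=False)

print("Done.")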
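
The script writes each stamped PDF straight over the original, so an error in the middle of a write could leave a damaged file. One possible safeguard, sketched here with the standard library only, is to write to a temporary file in the same folder and swap it into place once the write has finished. replace_atomically is a hypothetical helper and not part of the script above. It assumes the write callback accepts a binary file object, which is how writer.write(f) is already called in the script.

```python
import os
import tempfile


def replace_atomically(path, write_fn):
    """Write through write_fn(file_object) to a temp file, then swap it into place."""
    folder = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp", dir=folder)
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            write_fn(tmp_file)
        # The original is replaced only after the new content is fully written
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temp file and leave the original untouched
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Demonstration with plain bytes standing in for PDF content
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "demo.pdf")
    with open(target, "wb") as f:
        f.write(b"old")
    replace_atomically(target, lambda f: f.write(b"new"))
    with open(target, "rb") as f:
        result = f.read()

print(result)
```

In the script, the final with open(pdf_path, 'wb') block could then become replace_atomically(pdf_path, writer.write), under the assumption stated above.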
