check and run again.
Thanks, Deviprasad. You're running two programs:
- Program 1: Downloads and caches packages (Debian and Python), installs them, and runs Python code.
- Program 2: Uses only the cached packages from Google Drive and runs the same Python code offline.
❌ Error Root Causes:
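- cp: cannot stat '/content/drive/MyDrive/Lib/archives/partial/*.deb'
  → This means no .deb files were downloaded or stored during apt-get with -o=dir::cache. Note that apt only keeps in-progress downloads in partial/; completed .deb files land in the parent archives directory, so this glob can match nothing even when the download worked.
- dpkg: cannot access archive '/content/drive/MyDrive/Lib/debs/*.deb'
  → This is a result of the cp failing earlier: the debs folder is empty, so the shell passes the unmatched pattern to dpkg literally.
- PDFInfoNotInstalledError from pdf2image
  → This means pdfinfo (from poppler-utils) is not installed or not found in PATH in Program 2.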
✅ Solution Strategy
To fix this, change how you cache .deb files in Program 1, and ensure pdfinfo exists and is executable in Program 2.
✅ Program 1: Install & Cache All Required Packages
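A minimal sketch of the download-and-cache step, assuming Drive is already mounted and the cell runs as root (as it does on Colab). The DEB_DIR path is taken from the dpkg error above; the helper names and the tesseract-ocr entry in the package list are assumptions for illustration, so adjust them to what your code actually needs.

```python
import subprocess
from pathlib import Path

# Directory taken from the dpkg error message; adjust to your Drive layout.
DEB_DIR = Path("/content/drive/MyDrive/Lib/debs")
# Assumed package list: poppler-utils provides pdfinfo, tesseract-ocr backs pytesseract.
APT_PACKAGES = ["poppler-utils", "tesseract-ocr"]


def apt_cmd(deb_dir, packages, download_only):
    """Build an apt-get install command that uses deb_dir as its archive cache."""
    cmd = ["apt-get", "install", "-y", "-o", f"Dir::Cache::Archives={deb_dir}"]
    if download_only:
        cmd.append("--download-only")
    return cmd + list(packages)


def cache_and_install_debs(deb_dir=DEB_DIR, packages=APT_PACKAGES):
    """Download .deb files into deb_dir, verify they exist, then install them."""
    # apt keeps in-progress downloads in a partial/ subdirectory of the cache.
    (deb_dir / "partial").mkdir(parents=True, exist_ok=True)
    subprocess.run(["apt-get", "update"], check=True)
    subprocess.run(apt_cmd(deb_dir, packages, download_only=True), check=True)
    debs = sorted(deb_dir.glob("*.deb"))
    if not debs:
        # Fail here instead of discovering the empty cache in Program 2.
        raise RuntimeError(f"No .deb files were cached in {deb_dir}")
    subprocess.run(apt_cmd(deb_dir, packages, download_only=False), check=True)
    return debs
```

Call cache_and_install_debs() once in Program 1 while you are online. If a package is already installed on the VM, apt-get has nothing to download for it, so run this on a fresh runtime.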
✅ Cache Python packages:
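For the Python side, something like the following should work. It is a sketch: WHEEL_DIR is a hypothetical folder name (your Drive layout may differ), and the package list is inferred from the error and the test snippet rather than from your actual requirements.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical cache folder; pick any directory on your mounted Drive.
WHEEL_DIR = Path("/content/drive/MyDrive/Lib/wheels")
# Assumed from the error message and the test snippet.
PY_PACKAGES = ["pdf2image", "pytesseract"]


def pip_download_cmd(wheel_dir, packages):
    """Build a pip command that saves packages and their dependencies to wheel_dir."""
    return [sys.executable, "-m", "pip", "download",
            "--dest", str(wheel_dir), *packages]


def cache_and_install_wheels(wheel_dir=WHEEL_DIR, packages=PY_PACKAGES):
    """Download the distributions to Drive, then install them in this session."""
    wheel_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(pip_download_cmd(wheel_dir, packages), check=True)
    subprocess.run([sys.executable, "-m", "pip", "install", *packages], check=True)
    # Return what was cached so the caller can confirm the folder is not empty.
    return sorted(p.name for p in wheel_dir.iterdir())
```

Using sys.executable keeps pip tied to the interpreter the notebook is running, which avoids installing into a different Python.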
✅ Program 2: Install from Cached Packages
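A possible offline install step, sketched under the assumption that the .deb files sit directly in the debs folder named in the dpkg error. Expanding the glob in Python means an empty cache fails with a clear message instead of dpkg receiving the literal '*.deb' pattern. The function names are illustrative.

```python
import shutil
import subprocess
from pathlib import Path

# Directory taken from the dpkg error message; adjust to your Drive layout.
DEB_DIR = Path("/content/drive/MyDrive/Lib/debs")


def dpkg_install_cmd(deb_dir):
    """Build a dpkg command listing every cached .deb, or fail if there are none."""
    debs = sorted(str(p) for p in Path(deb_dir).glob("*.deb"))
    if not debs:
        raise FileNotFoundError(f"No .deb files found in {deb_dir}")
    return ["dpkg", "-i", *debs]


def install_cached_debs(deb_dir=DEB_DIR):
    """Install all cached packages and confirm that pdfinfo is now on PATH."""
    subprocess.run(dpkg_install_cmd(deb_dir), check=True)
    pdfinfo = shutil.which("pdfinfo")
    if pdfinfo is None:
        raise RuntimeError("pdfinfo is still missing; is poppler-utils in the cache?")
    return pdfinfo
```

Passing all the .deb files to a single dpkg -i call lets dpkg sort out the order between a package and its cached dependencies.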
Install Python packages offline:
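A sketch of the offline pip install, assuming the packages were saved earlier to a Drive folder. WHEEL_DIR is a hypothetical path and the package list is an assumption; --no-index stops pip from contacting PyPI, and --find-links points it at the cached files instead.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical cache folder; use whichever directory Program 1 wrote to.
WHEEL_DIR = Path("/content/drive/MyDrive/Lib/wheels")
# Assumed from the error message and the test snippet.
PY_PACKAGES = ["pdf2image", "pytesseract"]


def pip_offline_cmd(wheel_dir, packages):
    """Build a pip command that installs only from files found in wheel_dir."""
    return [sys.executable, "-m", "pip", "install",
            "--no-index", "--find-links", str(wheel_dir), *packages]


def install_cached_wheels(wheel_dir=WHEEL_DIR, packages=PY_PACKAGES):
    """Install the Python packages without any network access."""
    if not wheel_dir.is_dir():
        raise FileNotFoundError(f"Cache folder not found: {wheel_dir}")
    subprocess.run(pip_offline_cmd(wheel_dir, packages), check=True)
```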
🧪 Test PDF Functionality (same xyz Python code)
Here’s a test snippet that uses pdf2image and pytesseract:
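Since I don't have your exact code, this is a stand-in sketch rather than your program. It checks for the external tools first, so a missing pdfinfo shows up as a readable message instead of PDFInfoNotInstalledError. The function names and the default dpi value are assumptions.

```python
import shutil


def missing_tools(tools=("pdfinfo", "pdftoppm", "tesseract")):
    """Return the external binaries that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def ocr_pdf(pdf_path, dpi=200):
    """Convert each page of a PDF to an image and return the OCR text per page."""
    missing = missing_tools()
    if missing:
        raise RuntimeError(f"Missing system tools: {', '.join(missing)}")
    # Imported here so the tool check above works even before the wheels are installed.
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(pdf_path, dpi=dpi)
    return [pytesseract.image_to_string(page) for page in pages]
```

Run missing_tools() right after the install step in Program 2; an empty list means poppler-utils and Tesseract were installed correctly. Then call ocr_pdf() with the path to one of your own PDFs.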
✅ Summary:
- Use --download-only and specify dir::cache::archives directly to get .deb files.
- pdfinfo must be installed via poppler-utils, which your second program must install from .deb, not via apt-get (which won't work offline).
- pdf2image relies on pdfinfo, so without poppler-utils properly installed, you'll always get PDFInfoNotInstalledError.
Would you like me to give you a single Colab-friendly setup that handles both modes (online and offline) with checks?