check and run again.

- May 03, 2025

Thanks, Deviprasad. You're running two programs:

Program 1: Downloads and caches packages (Debian and Python), installs them, runs Python code.
Program 2: Uses only cached packages from Google Drive and runs the same Python code offline.

❌ Error Root Causes:

cp: cannot stat '/content/drive/MyDrive/Lib/archives/partial/*.deb'
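→ This means the glob matched nothing under archives/partial/. With -o=dir::cache, apt-get keeps only in-progress downloads in partial/ and moves finished .deb files up into archives/, so either the files landed one level higher or nothing was downloaded at all (for example, because the packages were already installed).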
dpkg: cannot access archive '/content/drive/MyDrive/Lib/debs/*.deb'
→ This is a result of the cp failing earlier.
PDFInfoNotInstalledError from pdf2image
→ This means pdfinfo (from poppler-utils) is not installed or not found in PATH in Program 2.
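
A quick way to confirm the third cause before running any OCR code is to check whether the executables are actually on PATH. The sketch below is only an illustration: the helper name `missing_tools` is hypothetical, and the default tool list (pdfinfo and pdftoppm from poppler-utils, plus tesseract) is an assumption about what this workflow needs.

```python
import shutil

def missing_tools(tools=("pdfinfo", "pdftoppm", "tesseract")):
    """Return the names of required executables that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]

missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required tools are available.")
```

If `pdfinfo` shows up in the missing list in Program 2, the poppler-utils .deb was not installed from the cache.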

✅ Solution Strategy

To fix this, change how you cache .deb files in Program 1, and ensure pdfinfo exists and is executable in Program 2.

✅ Program 1: Install & Cache All Required Packages

bash
# Create necessary cache and deb storage folders
!mkdir -p /content/drive/MyDrive/Lib/debs

# Update APT index
!apt-get update -qq

# Download .deb files only (no install yet)
!apt-get -qq --download-only -o=dir::cache::archives="/content/drive/MyDrive/Lib/debs" install poppler-utils tesseract-ocr-tel

# Install downloaded .deb files to make them available now
!dpkg -i /content/drive/MyDrive/Lib/debs/*.deb || true

# Fix any missing dependencies
!apt-get -f install -y
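
Because the original failure was an empty glob, it may help to confirm that .deb files really landed in the Drive folder before relying on them in Program 2. This is a minimal sketch; `cached_debs` is a hypothetical helper, and the folder path is the one used in the commands above.

```python
from pathlib import Path

def cached_debs(deb_dir):
    """Return sorted .deb file names found directly inside deb_dir.

    Returns an empty list if the folder does not exist.
    """
    folder = Path(deb_dir)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob("*.deb"))

DEB_DIR = "/content/drive/MyDrive/Lib/debs"  # same folder as in the commands above
debs = cached_debs(DEB_DIR)
print(f"{len(debs)} .deb file(s) cached in {DEB_DIR}")
for name in debs:
    print("  ", name)
```

A count of zero here means Program 2's `dpkg -i .../*.deb` will fail in the same way as before.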

✅ Cache Python packages:

bash
!pip download -d /content/drive/MyDrive/Lib/ Pillow pytesseract pdf2image

✅ Program 2: Install from Cached Packages

bash
# Install .deb files from Google Drive
!dpkg -i /content/drive/MyDrive/Lib/debs/*.deb || true

# Fix dependencies if needed (this step needs network access if a dependency is missing from the cache)
!apt-get -f install -y

Install Python packages offline:

bash
!pip install --no-index --find-links=/content/drive/MyDrive/Lib/ Pillow pytesseract pdf2image
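
With `--no-index`, pip can only use files that are already in the `--find-links` folder, so a missing wheel makes the install fail. The sketch below checks the cache folder for each required project by filename prefix. `missing_from_cache` and `normalize` are hypothetical helpers, and prefix matching is a rough heuristic (it does not check versions or platform tags).

```python
import re
from pathlib import Path

def normalize(name):
    """Lowercase a name and treat runs of '-', '_' and '.' as one '-' (as PEP 503 does)."""
    return re.sub(r"[-_.]+", "-", name).lower()

def missing_from_cache(cache_dir, required):
    """Return the required project names that have no wheel or sdist in cache_dir."""
    folder = Path(cache_dir)
    files = []
    if folder.is_dir():
        files = [
            normalize(f.name)
            for f in folder.iterdir()
            if f.name.endswith((".whl", ".tar.gz", ".zip"))
        ]
    # A cached file counts as a match if its name starts with "<project>-".
    return [
        r for r in required
        if not any(name.startswith(normalize(r) + "-") for name in files)
    ]

CACHE_DIR = "/content/drive/MyDrive/Lib/"  # same folder as in the pip commands above
print("Missing from cache:", missing_from_cache(CACHE_DIR, ["Pillow", "pytesseract", "pdf2image"]))
```

An empty list suggests the three top-level packages are cached; their own dependencies are not checked here.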

🧪 Test PDF Functionality (same `xyz` Python code)

Here’s a test snippet that uses pdf2image and pytesseract:

python
from pdf2image import convert_from_path
import pytesseract

images = convert_from_path("sample.pdf")
for i, img in enumerate(images):
    text = pytesseract.image_to_string(img, lang='tel')
    print(f"Page {i+1} Text:\n{text}")
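
Since the snippet passes `lang='tel'`, it also depends on the Telugu language data from tesseract-ocr-tel being installed. One way to check is to ask the tesseract CLI for its language list with `--list-langs`. This is a sketch under assumptions: the helper names are hypothetical, and the parser assumes the output is a header line starting with "List of available languages" followed by one language code per line.

```python
import subprocess

def parse_list_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Skip the header line; every remaining line is treated as a language code.
    return [
        line for line in lines
        if not line.lower().startswith("list of available languages")
    ]

def tesseract_has_language(lang="tel"):
    """Return True if the tesseract CLI reports `lang`; False if tesseract is missing."""
    try:
        result = subprocess.run(
            ["tesseract", "--list-langs"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return False
    # Check both streams in case the list is written to stderr.
    return lang in parse_list_langs(result.stdout + "\n" + result.stderr)

print("Telugu data available:", tesseract_has_language("tel"))
```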

✅ Summary:

Use --download-only and specify dir::cache::archives directly to get .deb files.
pdfinfo must be installed via poppler-utils, which your second program must install from .deb, not via apt-get (which won't work offline).
pdf2image relies on pdfinfo — so without poppler-utils properly installed, you'll always get PDFInfoNotInstalledError.

Would you like me to give you a single Colab-friendly setup that handles both modes (online and offline) with checks?
