OCR code and key file: the steps I followed, which worked as below.

- July 31, 2024

ChatGPT steps

Prompt: Tell me, step by step, how to do the following activities: To perform Optical Character Recognition (OCR) on a PDF file using Google Cloud Vision, you can follow these steps: Activate OCR Services on Google Cloud Platform (GCP): Enable the Vision API: Go to the GCP Console, select your project, and enable the Vision API from the API Library. Create a Service Account: Navigate to IAM & Admin > Service Accounts, create a new service account, and download the JSON key file for authentication.

To perform Optical Character Recognition (OCR) on a PDF file using Google Cloud Vision, follow these steps:

Step 1: Activate OCR Services on Google Cloud Platform (GCP)

1. Enable the Vision API:

Go to the Google Cloud Console.
Select your project or create a new one.
Navigate to APIs & Services > Library.
Search for "Vision API" and click on it.
Click Enable to activate the API.

2. Create a Service Account:

Go to IAM & Admin > Service Accounts.
Click Create Service Account.
Enter a name and description for the service account, then click Create.
Assign the role Project > Editor (or a more restrictive role if preferred), then click Continue.
Click Done.
Find the created service account in the list, click the three dots in the Actions column, and select Manage keys.
Click Add key > Create new key.
Select JSON and click Create to download the key file.
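
Before using the downloaded key, it can help to confirm the file is intact. The following is a minimal sketch, assuming the key is a JSON file containing the fields commonly present in a service-account key; the helper name `check_key_file` and the field list are illustrative assumptions, not part of any Google library.

```python
import json

# Fields a service-account key file is expected to contain (assumed list,
# chosen for illustration; adjust if your key format differs).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")

def check_key_file(path):
    """Return a list of problems found in the key file (empty if none)."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # Report every required field that is absent from the file
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if data.get("type") != "service_account":
        problems.append("type is not 'service_account'")
    return problems
```

Calling `check_key_file('/content/credentials_doc.json')` and getting an empty list back suggests the file at least has the expected shape; it does not prove the key is still valid on the GCP side.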

---------------------------------------------------------

The downloaded key file looks like this (private key values redacted):

{
  "type": "service_account",
  "project_id": "premium-bastion-430411-j1",
  "private_key_id": "REDACTED",
  "private_key": "-----BEGIN PRIVATE KEY-----\nREDACTED\n-----END PRIVATE KEY-----\n",
  "client_email": "ocr-serviceaccount@premium-bastion-430411-j1.iam.gserviceaccount.com",
  "client_id": "111306160424609019264",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/ocr-serviceaccount%40premium-bastion-430411-j1.iam.gserviceaccount.com",
  "universe_domain": "googleapis.com"
}

--------------------------
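
When a key file's structure needs to be pasted into notes like these, the secret fields can be masked first. This is a minimal sketch, assuming `private_key` and `private_key_id` are the fields to hide; the helper name `redact_key` and the placeholder string are hypothetical.

```python
import json

# Fields treated as secret in this sketch (an assumption; extend as needed).
SENSITIVE_FIELDS = ("private_key", "private_key_id")

def redact_key(data, placeholder="REDACTED"):
    """Return a copy of the key dict with sensitive values replaced."""
    return {k: (placeholder if k in SENSITIVE_FIELDS else v) for k, v in data.items()}

# Made-up example values, not a real key
example = {
    "type": "service_account",
    "private_key_id": "abc123",
    "private_key": "-----BEGIN PRIVATE KEY-----...",
}
print(json.dumps(redact_key(example), indent=2))
```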

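# Colab shell commands: confirm that the Python packages and poppler-utils
# (required by pdf2image) are installed.
!pip show google-cloud-vision pdf2image
!apt-cache policy poppler-utils
# OCR: convert a PDF document to text.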
import io
import os
from google.cloud import vision
from google.cloud.vision_v1 import types
from pdf2image import convert_from_path
from google.oauth2 import service_account

# Paths to the files
credentials_path = '/content/credentials_doc.json'
pdf_file_path = '/content/Theerpu.pdf'
output_text_file_path = '/content/output_doc_text.txt'

# Verify the file existence
if os.path.exists(credentials_path):
    print("Credentials file found.")
else:
    print("Credentials file not found. Please check the path.")

if os.path.exists(pdf_file_path):
    print("PDF file found.")
else:
    print("PDF file not found. Please check the path.")

# Authenticate using service account
credentials = service_account.Credentials.from_service_account_file(credentials_path)
client = vision.ImageAnnotatorClient(credentials=credentials)

# Convert PDF to images
images = convert_from_path(pdf_file_path, dpi=300)

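# Function to perform OCR on a single page image (a PIL image from pdf2image)
def perform_ocr(page_image):
    # Serialize the page to PNG bytes, the form the Vision API accepts
    buffer = io.BytesIO()
    page_image.save(buffer, format='PNG')
    vision_image = types.Image(content=buffer.getvalue())
    response = client.document_text_detection(image=vision_image)
    # The API reports per-request failures in response.error
    if response.error.message:
        raise RuntimeError(f"Vision API error: {response.error.message}")
    return response.full_text_annotation.text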

# Process each image and save the text
with open(output_text_file_path, 'w', encoding='utf-8') as output_file:
    for i, image in enumerate(images):
        text = perform_ocr(image)
        output_file.write(f"Page {i + 1}\n")
        output_file.write(text)
        output_file.write("\n\n")

print(f'OCR text written to file "{output_text_file_path}"')
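
The output file written above marks each page with a "Page N" line. If the pages are needed separately later, a sketch like the following could split the text back apart. The function name `split_pages` is hypothetical, and it assumes no OCR line consists solely of the words "Page N", since such a line would be mistaken for a marker.

```python
import re

def split_pages(text):
    """Map page number -> page text, using the 'Page N' marker lines."""
    pages = {}
    current = None
    for line in text.splitlines():
        match = re.fullmatch(r"Page (\d+)", line)
        if match:
            # Start collecting lines for a new page
            current = int(match.group(1))
            pages[current] = []
        elif current is not None:
            pages[current].append(line)
    # Join each page's lines and trim the blank separator lines
    return {n: "\n".join(lines).strip() for n, lines in pages.items()}
```

For example, reading `/content/output_doc_text.txt` with `open(..., encoding='utf-8').read()` and passing the result to `split_pages` would give a dictionary keyed by page number.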
