Browser to MP3: Java code (Python alternative)

- April 19, 2025

Can we prepare the same code in Java? How do we download the required library files? Please give step-by-step information. The Microsoft Learn page "Language support - Speech service - Azure AI services" says Java is supported as well, so please use it as a reference.

Yes, Deviprasad! You can prepare the same functionality in Java using the Microsoft Azure Speech SDK, which supports Text-to-Speech (TTS) in various languages, including Telugu (te-IN-ShrutiNeural).

Below is a step-by-step guide on how to do this in Java, including how to download and use required libraries:

✅ 1. Pre-requisites

Java JDK (version 8 or higher):
Install from: https://www.oracle.com/java/technologies/javase-downloads.html
Maven or Gradle for dependency management (Maven preferred here)
Azure Speech Key & Region:
Sign up here: https://portal.azure.com → Create a Speech resource and get:
- SPEECH_KEY
- SPEECH_REGION

✅ 2. Create Maven Java Project

Use any IDE (e.g., IntelliJ, Eclipse) or just command line.

bash
mvn archetype:generate -DgroupId=com.example.azuretts -DartifactId=azure-tts-java -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
cd azure-tts-java

✅ 3. Add Azure SDK Dependency

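Edit your pom.xml. The quickstart archetype already creates a <dependencies> element (for JUnit), so put the <dependency> entry below inside it rather than adding a second <dependencies> element. The <repositories> block is optional, since Maven Central is the default repository: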

xml
<dependencies>
    <dependency>
        <groupId>com.microsoft.cognitiveservices.speech</groupId>
        <artifactId>client-sdk</artifactId>
        <version>1.31.0</version>
    </dependency>
</dependencies>

<repositories>
    <repository>
        <id>maven-central</id>
        <url>https://repo.maven.apache.org/maven2</url>
    </repository>
</repositories>

✅ 4. Java Code to Convert Telugu Text to MP3

Create this under src/main/java/com/example/azuretts/App.java:

java
package com.example.azuretts;

import com.microsoft.cognitiveservices.speech.*;
import com.microsoft.cognitiveservices.speech.audio.AudioConfig;

import java.nio.file.Files;
import java.nio.file.Paths;

public class App {

    // Replace with your Azure Speech key and region
    private static final String SPEECH_KEY = "YOUR_SPEECH_KEY";
    private static final String SPEECH_REGION = "YOUR_REGION";
    private static final String VOICE = "te-IN-ShrutiNeural";

    public static void main(String[] args) throws Exception {
        String inputText = "శ్లోకం 11 ..... గాలికి త్రోయబడదు"; // Replace with real text
        String outputFile = "output_telugu.mp3";

        SpeechConfig config = SpeechConfig.fromSubscription(SPEECH_KEY, SPEECH_REGION);
        config.setSpeechSynthesisVoiceName(VOICE);
        // The default output is WAV; request MP3 explicitly so the file matches its extension
        config.setSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Audio16Khz32KBitRateMonoMp3);

        // Split after a full stop that is followed by whitespace
        String[] sentences = inputText.split("(?<=\\.)\\s+");

        // A null AudioConfig keeps the audio in memory instead of playing it or writing a file
        SpeechSynthesizer synthesizer = new SpeechSynthesizer(config, (AudioConfig) null);
        try {
            SpeechSynthesisResult result = synthesizer.SpeakSsmlAsync(buildSsml(sentences)).get();

            if (result.getReason() == ResultReason.SynthesizingAudioCompleted) {
                // Save the MP3 bytes returned by the service
                Files.write(Paths.get(outputFile), result.getAudioData());
                System.out.println("✅ Audio file saved: " + outputFile);
            } else {
                // Error text is exposed through the cancellation details, not the result itself
                SpeechSynthesisCancellationDetails details =
                        SpeechSynthesisCancellationDetails.fromResult(result);
                System.err.println("❌ Error: " + details.getReason() + " " + details.getErrorDetails());
            }
            result.close();
        } finally {
            synthesizer.close();
            config.close();
        }
    }

    // Builds one SSML document with a 1-second pause between sentences.
    // Encoded MP3 cannot be padded with raw PCM/WAV silence, so the pause is
    // requested from the service with <break> instead of being generated locally.
    private static String buildSsml(String[] sentences) {
        StringBuilder ssml = new StringBuilder();
        ssml.append("<speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\" xml:lang=\"te-IN\">");
        ssml.append("<voice name=\"").append(VOICE).append("\">");
        boolean first = true;
        for (String raw : sentences) {
            String sentence = raw.trim();
            if (sentence.isEmpty()) {
                continue;
            }
            if (!first) {
                ssml.append("<break time=\"1s\"/>");
            }
            ssml.append(escapeXml(sentence));
            first = false;
        }
        return ssml.append("</voice></speak>").toString();
    }

    // Escapes the characters that are not allowed as plain text inside XML
    private static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;")
                   .replace("'", "&apos;");
    }
}
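
If WAV output is acceptable instead of MP3, audio synthesized sentence by sentence can also be joined locally. A WAV file has a single RIFF header followed by the samples, so the pieces have to be raw PCM with one header written for the whole file; appending complete WAV files (each with its own header) back to back does not give a valid file. The sketch below assumes 16 kHz, 16-bit mono PCM chunks, for example what the SDK returns when the output format is set to SpeechSynthesisOutputFormat.Raw16Khz16BitMonoPcm. The class and method names (WavAssembler, silence, toWav) are hypothetical helpers for illustration and are not part of the Speech SDK.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class WavAssembler {
    static final int SAMPLE_RATE = 16000;   // assumption: 16 kHz
    static final int BYTES_PER_SAMPLE = 2;  // assumption: 16-bit mono

    /** Raw PCM silence: zero-valued samples, no header. */
    static byte[] silence(int durationMs) {
        int numSamples = (int) ((long) durationMs * SAMPLE_RATE / 1000);
        return new byte[numSamples * BYTES_PER_SAMPLE];
    }

    /** Joins raw PCM chunks with a pause between them and prepends one 44-byte WAV header. */
    static byte[] toWav(List<byte[]> pcmChunks, int pauseMs) {
        ByteArrayOutputStream pcm = new ByteArrayOutputStream();
        for (int i = 0; i < pcmChunks.size(); i++) {
            byte[] chunk = pcmChunks.get(i);
            pcm.write(chunk, 0, chunk.length);
            if (i < pcmChunks.size() - 1) {      // no pause after the last chunk
                byte[] gap = silence(pauseMs);
                pcm.write(gap, 0, gap.length);
            }
        }
        byte[] data = pcm.toByteArray();

        // WAV header fields are little-endian
        ByteBuffer wav = ByteBuffer.allocate(44 + data.length).order(ByteOrder.LITTLE_ENDIAN);
        wav.put(ascii("RIFF")).putInt(36 + data.length).put(ascii("WAVE"));
        wav.put(ascii("fmt ")).putInt(16)
           .putShort((short) 1)                      // audio format: PCM
           .putShort((short) 1)                      // channels: mono
           .putInt(SAMPLE_RATE)
           .putInt(SAMPLE_RATE * BYTES_PER_SAMPLE)   // byte rate
           .putShort((short) BYTES_PER_SAMPLE)       // block align
           .putShort((short) 16);                    // bits per sample
        wav.put(ascii("data")).putInt(data.length).put(data);
        return wav.array();
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Two dummy 100 ms chunks stand in for synthesized sentences
        byte[] wav = toWav(Arrays.asList(new byte[3200], new byte[3200]), 1000);
        System.out.println("WAV size in bytes: " + wav.length);
    }
}
```

With real audio, each chunk would be the byte array returned by result.getAudioData() for one sentence, and the array returned by toWav would be written to a .wav file.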

✅ 5. Run the Code

bash
mvn compile
mvn exec:java -Dexec.mainClass="com.example.azuretts.App"

Make sure to replace YOUR_SPEECH_KEY and YOUR_REGION.
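
As an alternative to editing the key into the source file, the two values can be read from environment variables so the key does not end up in version control. This is a minimal sketch, assuming the variables are named SPEECH_KEY and SPEECH_REGION as in the prerequisites; the SpeechSettings class is a hypothetical helper, not an SDK type.

```java
import java.util.Map;

final class SpeechSettings {
    final String key;
    final String region;

    private SpeechSettings(String key, String region) {
        this.key = key;
        this.region = region;
    }

    /** Reads both values from the given environment map and fails fast if one is missing. */
    static SpeechSettings fromEnv(Map<String, String> env) {
        return new SpeechSettings(require(env, "SPEECH_KEY"), require(env, "SPEECH_REGION"));
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        SpeechSettings settings = fromEnv(System.getenv());
        // Print only the region; the key should never be logged
        System.out.println("Using region: " + settings.region);
    }
}
```

In App.java the values could then be passed on as SpeechConfig.fromSubscription(settings.key, settings.region).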

📚 Reference:

Microsoft Learn Docs: Speech SDK for Java (Text-to-Speech)
