TextToSpeech、playEarcon 和 .wav 文件

发布于 2025-01-05 16:16:44 字数 4633 浏览 0 评论 0原文

在我的一个应用程序中，我有一项活动，该活动通过语音合成字母数字参考字符串、字母/数字，例如“ABC123”听起来像“Ay，bee，sea，一二三”。由于这是一组有限的声音，我认为最好通过使用 playEarcon 方法播放预先录制的数字和字母的 .wav 文件，使 TTS 引擎能够在没有互联网连接的情况下工作。

我已将所有 36 个 wav 文件放在 res/raw 文件夹中，并在初始化 TTS 引擎时将资源 id 映射到字母。这很有效，但是 .apk 现在要大得多，因为 wav 文件未压缩地存储在 apk 中。我想让 apk 的大小更小。

在另一个问题的答案它指出 wav 文件不参与压缩。（我不明白为什么，因为它们通常会压缩到原始版本的 40% 左右）检查 apk 的内部结构，这似乎是真的。

由于代码中没有引用资源文件的扩展名，我尝试将 wav 重命名为 .waw、.abc、.spc。所有这些都被压缩，但不幸的是，除非扩展名是 .wav，否则 playEarcon 方法在调用时不会产生声音。

简而言之，我想强制 TTS 引擎播放没有 wav 扩展名的文件，或者说服它压缩 .wav 文件。

所有建议将不胜感激。无论如何，我在下面发布了最小的可演示代码示例。我的工作文件名为 gb_a.wav、gb_b.wav 等。如果扩展名更改，它们就会停止发声。

public class WavSpeakerActivity extends Activity implements
        RadioGroup.OnCheckedChangeListener, TextToSpeech.OnInitListener {

    static final int mGBLetterResIds[] = { R.raw.gb_a, R.raw.gb_b, R.raw.gb_c,
            R.raw.gb_d, R.raw.gb_e, R.raw.gb_f, R.raw.gb_g, R.raw.gb_h,
            R.raw.gb_i, R.raw.gb_j, R.raw.gb_k, R.raw.gb_l, R.raw.gb_m,
            R.raw.gb_n, R.raw.gb_o, R.raw.gb_p, R.raw.gb_q, R.raw.gb_r,
            R.raw.gb_s, R.raw.gb_t, R.raw.gb_u, R.raw.gb_v, R.raw.gb_w,
            R.raw.gb_x, R.raw.gb_y, R.raw.gb_z };
    static final int mGBNumberResIds[] = { R.raw.gb_zero, R.raw.gb_one,
            R.raw.gb_two, R.raw.gb_three, R.raw.gb_four, R.raw.gb_five,
            R.raw.gb_six, R.raw.gb_seven, R.raw.gb_eight, R.raw.gb_nine };

    static final String mGbStr = "GB";
    static final String mAlphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static final String mNumbers = "0123456789";
    private String mPpackageName = null;
    private String mTextToSpeak = null;
    private RadioGroup mRadioGroup = null;// two buttons one sets letters, the other numbers
    private TextToSpeech mTts = null;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        mTts = new TextToSpeech(this, this);
        mRadioGroup = (RadioGroup) findViewById(R.id.radioGroup1);
        mRadioGroup.setOnCheckedChangeListener(this);
    }

    @Override
    protected void onResume() {
        super.onResume();
        RadioGroup rg = (RadioGroup) findViewById(R.id.radioGroup1);
        switchText(rg);
        mPpackageName = getPackageName();
    }

    @Override
    public void onDestroy() {
        // Don't forget to shutdown speech engine
        if (mTts != null) {
            mTts.stop();
            mTts.shutdown();
        }
        super.onDestroy();
    }

    private void switchText(RadioGroup rg) {
        // select letters or digits as the String to speak
        int checkedButton = rg.getCheckedRadioButtonId();
        switch (checkedButton) {
            case R.id.alphabet:
                mTextToSpeak = mAlphabet;
                break;
            case R.id.numbers:
                mTextToSpeak = mNumbers;
                break;
        }
    }

    public void myClickHandler(View target) {
        // Just the one button has been clicked - the 'Speak' one
        String earconKey;
        String lang = Locale.UK.getCountry(); // will be "GB", just have UK in this small example
        mTts.setLanguage(Locale.UK); // skip error checking for brevity's sake
        String text = mTextToSpeak.replaceAll("\\s", "");// remove spaces (if any)
        char c;
        for (int i = 0; i < text.length(); i++) {
            c = text.charAt(i);
            if ( Character.isLetter(c) || Character.isDigit(c) ) {
                earconKey = lang + Character.toString(c); // GBA, GBB..GBZ, GB0.. GB9
                mTts.playEarcon(earconKey, TextToSpeech.QUEUE_ADD, null);
            }
        }
    }

    @Override
    public void onInit(int status) {
        // doesn't seem we need to check status or setLanguage if we're just playing earcons
        mapEarCons(); // map letter/digit sounds to resource ids
    }

    private void mapEarCons() {
        String key;
        for (char c = 'A'; c <= 'Z' ; c++){
            key = mGbStr + Character.toString(c); // GBA, GBB .. GBZ
            mTts.addEarcon(key, mPpackageName, mGBLetterResIds[c - 'A'] );// add it
        }
        for (int i = 0 ; i <= 9; i++){
            key = mGbStr + Integer.toString(i); // GB0, GB1 .. GB9
            mTts.addEarcon(key, mPpackageName, mGBNumberResIds[i] );
        }
    }

    @Override
    public void onCheckedChanged(RadioGroup rg, int arg1) { switchText(rg); }
}

。。

原文

In one of my applications I have an activity which speech synthesises alphanumeric reference strings, letter/number by letter/number e.g "ABC123" sounds as "Ay, bee, sea, one two three". As this is a limited set of sounds I thought it would be good to enable the TTS engine to work without an internet connection by playing prerecorded .wav files of the numbers and letters using the playEarcon method.

I have placed all 36 wav files in the res/raw folder and mapped the resource ids to the letters when initialising the TTS engine. This works well, however the .apk is now much larger as the wav files are stored uncompressed in the apk. I would like to make the apk's size smaller.

In the answer to another question it states that wav files are excluded from compression. ( I don't see why, as they typically zip down to about 40% of the original) One inspecting the apk's internals, this appears to be true.

As the extension of the resource files is not referred to in the code, I tried renaming the wavs, to variously .waw, .abc, .spc. All of these get compressed but unfortunately the playEarcon method produces no sound when invoked unless the extension is .wav.

In short I would like to coerce the TTS engine into playing files without a wav extension, or persuade it to compress the .wav files.

All suggestions will be gratefully received. For what it's worth I'm posting the smallest demonstrable code sample below. My working files are named gb_a.wav, gb_b.wav etc. If the extension is changed, they stop sounding.

public class WavSpeakerActivity extends Activity implements
        RadioGroup.OnCheckedChangeListener, TextToSpeech.OnInitListener {

    static final int mGBLetterResIds[] = { R.raw.gb_a, R.raw.gb_b, R.raw.gb_c,
            R.raw.gb_d, R.raw.gb_e, R.raw.gb_f, R.raw.gb_g, R.raw.gb_h,
            R.raw.gb_i, R.raw.gb_j, R.raw.gb_k, R.raw.gb_l, R.raw.gb_m,
            R.raw.gb_n, R.raw.gb_o, R.raw.gb_p, R.raw.gb_q, R.raw.gb_r,
            R.raw.gb_s, R.raw.gb_t, R.raw.gb_u, R.raw.gb_v, R.raw.gb_w,
            R.raw.gb_x, R.raw.gb_y, R.raw.gb_z };
    static final int mGBNumberResIds[] = { R.raw.gb_zero, R.raw.gb_one,
            R.raw.gb_two, R.raw.gb_three, R.raw.gb_four, R.raw.gb_five,
            R.raw.gb_six, R.raw.gb_seven, R.raw.gb_eight, R.raw.gb_nine };

    static final String mGbStr = "GB";
    static final String mAlphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static final String mNumbers = "0123456789";
    private String mPpackageName = null;
    private String mTextToSpeak = null;
    private RadioGroup mRadioGroup = null;// two buttons one sets letters, the other numbers
    private TextToSpeech mTts = null;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        mTts = new TextToSpeech(this, this);
        mRadioGroup = (RadioGroup) findViewById(R.id.radioGroup1);
        mRadioGroup.setOnCheckedChangeListener(this);
    }

    @Override
    protected void onResume() {
        super.onResume();
        RadioGroup rg = (RadioGroup) findViewById(R.id.radioGroup1);
        switchText(rg);
        mPpackageName = getPackageName();
    }

    @Override
    public void onDestroy() {
        // Don't forget to shutdown speech engine
        if (mTts != null) {
            mTts.stop();
            mTts.shutdown();
        }
        super.onDestroy();
    }

    private void switchText(RadioGroup rg) {
        // select letters or digits as the String to speak
        int checkedButton = rg.getCheckedRadioButtonId();
        switch (checkedButton) {
            case R.id.alphabet:
                mTextToSpeak = mAlphabet;
                break;
            case R.id.numbers:
                mTextToSpeak = mNumbers;
                break;
        }
    }

    public void myClickHandler(View target) {
        // Just the one button has been clicked - the 'Speak' one
        String earconKey;
        String lang = Locale.UK.getCountry(); // will be "GB", just have UK in this small example
        mTts.setLanguage(Locale.UK); // skip error checking for brevity's sake
        String text = mTextToSpeak.replaceAll("\\s", "");// remove spaces (if any)
        char c;
        for (int i = 0; i < text.length(); i++) {
            c = text.charAt(i);
            if ( Character.isLetter(c) || Character.isDigit(c) ) {
                earconKey = lang + Character.toString(c); // GBA, GBB..GBZ, GB0.. GB9
                mTts.playEarcon(earconKey, TextToSpeech.QUEUE_ADD, null);
            }
        }
    }

    @Override
    public void onInit(int status) {
        // doesn't seem we need to check status or setLanguage if we're just playing earcons
        mapEarCons(); // map letter/digit sounds to resource ids
    }

    private void mapEarCons() {
        String key;
        for (char c = 'A'; c <= 'Z' ; c++){
            key = mGbStr + Character.toString(c); // GBA, GBB .. GBZ
            mTts.addEarcon(key, mPpackageName, mGBLetterResIds[c - 'A'] );// add it
        }
        for (int i = 0 ; i <= 9; i++){
            key = mGbStr + Integer.toString(i); // GB0, GB1 .. GB9
            mTts.addEarcon(key, mPpackageName, mGBNumberResIds[i] );
        }
    }

    @Override
    public void onCheckedChanged(RadioGroup rg, int arg1) { switchText(rg); }
}

.
.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

鱼窥荷 2025-01-12 16:16:44

为什么不尝试压缩 wav 文件：根据维基百科 wav 文件是数据容器。最常用于未压缩的 PCM 声音，但也可用于存储使用各种编解码器压缩的数据（您可以找到 wFormatTags 的可能值（存储数据的类型），例如此处）。
您可以将资源保存到本地文件系统，并使用 addEarcon(String Earcon, String filename) 而不是 addEarcon(String Earcon, String packagename, int resourceId)。
您可以使用 aapt 和 -0 wav 命令行开关来制作包含从压缩类型中排除的 wav 文件的 apk。