Converting Chinese Audio Recordings to Text in C#

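Suppose we have a Chinese-language recording in .mp3 or .wav format and want to extract its spoken content as text. The approach below runs entirely offline, so there is no need to register for an Azure Speech API key.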

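1. First, install two packages from NuGet: NAudio 2.2.1 and Whisper.net 1.7.3. Whisper.net also needs a native runtime package (for example Whisper.net.Runtime) in order to load and run the model.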

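2. Next, download the ggml-medium.bin model file from Hugging Face. Note that the code in step 4 loads a file named ggml-medium-q8_0.bin (the 8-bit quantized variant of the medium model); whichever file you download, the file name in the code must match it.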
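As an alternative to downloading the file by hand, Whisper.net includes a downloader helper. The following is a minimal sketch, assuming the Whisper.net 1.7.3 API in which `WhisperGgmlDownloader.GetGgmlModelAsync` is a static method (later releases appear to expose it through `WhisperGgmlDownloader.Default` instead). The class name `ModelFetcher` and the method `EnsureModelAsync` are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Whisper.net.Ggml;

public static class ModelFetcher
{
    // Returns the path of the quantized medium model, downloading it first if it is missing.
    public static async Task<string> EnsureModelAsync()
    {
        string modelPath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "ggml-medium-q8_0.bin");
        if (File.Exists(modelPath))
        {
            return modelPath;
        }

        // Download to a temporary file first so that an interrupted download
        // never leaves a truncated model under the final name.
        string tempPath = modelPath + ".part";

        // Assumption: static downloader API as in Whisper.net 1.7.3.
        using (Stream modelStream = await WhisperGgmlDownloader.GetGgmlModelAsync(GgmlType.Medium, QuantizationType.Q8_0))
        using (FileStream fileStream = File.Create(tempPath))
        {
            await modelStream.CopyToAsync(fileStream);
        }

        File.Move(tempPath, modelPath);
        return modelPath;
    }
}
```

The model file is large, so the first call can take a while; later calls return immediately because the file already exists.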

3. The code. Whisper models only accept 16 kHz audio, so audio files with other sample rates must first be converted to 16,000 Hz. The code for this conversion is as follows:

using NAudio.Wave;
using System;

public class AudioResampler
{
    public static void ConvertTo16Khz(string inputFile, string outputFile)
    {
        // Open the source audio file (WaveFileReader reads .wav files only)
        using (var reader = new WaveFileReader(inputFile))
        {
            // Target format: 16 kHz, mono, 16-bit
            var targetFormat = new WaveFormat(16000, 1); // 16000 Hz, mono, 16-bit

            // Create a conversion stream; WaveFormatConversionStream does the resampling
            using (var conversionStream = new WaveFormatConversionStream(targetFormat, reader))
            {
                // Write the converted audio data to a new file
                WaveFileWriter.CreateWaveFile(outputFile, conversionStream);
                Console.WriteLine("File converted to 16 kHz format");
            }
        }
    }
}

// Usage example
class Program
{
    static void Main(string[] args)
    {
        string inputFile = @"path_to_input_file.wav";  // input file path
        string outputFile = @"path_to_output_file_16khz.wav";  // output file path
        AudioResampler.ConvertTo16Khz(inputFile, outputFile);
    }
}
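The converter above opens its input with WaveFileReader, so it handles .wav input only. For .mp3 input, one possible variation is sketched below. It assumes NAudio 2.2.1 running on Windows, where `AudioFileReader` can decode MP3, and it swaps in NAudio's managed `WdlResamplingSampleProvider` in place of `WaveFormatConversionStream`. The class name `AudioConverter` and the method `ConvertAnyTo16KhzMono` are hypothetical.

```csharp
using NAudio.Wave;
using NAudio.Wave.SampleProviders;

public static class AudioConverter
{
    // Converts a .wav or .mp3 file to a 16 kHz, mono, 16-bit PCM .wav file.
    public static void ConvertAnyTo16KhzMono(string inputFile, string outputFile)
    {
        // AudioFileReader picks a decoder from the file extension and
        // exposes the audio as 32-bit floating point samples.
        using (var reader = new AudioFileReader(inputFile))
        {
            ISampleProvider samples = reader;

            // Mix stereo down to mono; StereoToMonoSampleProvider accepts
            // two-channel input only, so mono input is passed through as-is.
            if (samples.WaveFormat.Channels == 2)
            {
                samples = new StereoToMonoSampleProvider(samples);
            }

            // Resample to 16000 Hz with the fully managed WDL resampler.
            var resampled = new WdlResamplingSampleProvider(samples, 16000);

            // CreateWaveFile16 converts the float samples to 16-bit PCM while writing.
            WaveFileWriter.CreateWaveFile16(outputFile, resampled);
        }
    }
}
```

A call such as `AudioConverter.ConvertAnyTo16KhzMono("input.mp3", "output_16khz.wav")` would then produce a file suitable for the transcription step. Input with more than two channels is not handled by this sketch.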

4. Next is the transcription code in full:

using System;
using System.IO;
using System.Threading.Tasks;
using Whisper.net;

// Place this method inside a class of your application.
public async Task Analyze()
{
    // Model file, expected in the application's base directory
    string modelFilePath = System.IO.Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "ggml-medium-q8_0.bin");
    // Initialize the Whisper factory and processor; both are disposed when the method returns
    using var whisperFactory = WhisperFactory.FromPath(modelFilePath);
    using var processor = whisperFactory.CreateBuilder()
        .WithLanguage("zh") // set the recognition language to Chinese
        .Build();
    try
    {
        string audioFileName = "path_to_output_file_16khz.wav";
        string audioFilePath = System.IO.Path.Combine(AppDomain.CurrentDomain.BaseDirectory, audioFileName);
        // Open the audio file
        using var audioStream = File.OpenRead(audioFilePath);

        // Process the audio file and print each recognized segment
        Console.WriteLine("Transcribing audio file...");

        await foreach (SegmentData result in processor.ProcessAsync(audioStream, default))
        {
            Console.WriteLine($"{result.Start}->{result.End}: {result.Text}");
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine($"An error occurred: {ex.Message}");
    }
    Console.WriteLine("Press any key to exit...");
}

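Pay attention to the absolute path of the ggml-medium-q8_0.bin file: the code expects it in the application's base directory. How to obtain the file is described in step 2.

string modelFilePath = System.IO.Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "ggml-medium-q8_0.bin");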
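The transcription code prints each segment to the console. To keep the recognized text, the segments can be collected and written to a file instead. The following is a minimal sketch under the same assumptions (Whisper.net 1.7.3, a 16 kHz .wav input); the class `TranscriptWriter`, the method `TranscribeToFileAsync`, and its parameters are hypothetical names.

```csharp
using System.IO;
using System.Text;
using System.Threading.Tasks;
using Whisper.net;

public static class TranscriptWriter
{
    // Transcribes a 16 kHz .wav file and saves the recognized text as UTF-8.
    public static async Task<string> TranscribeToFileAsync(string modelPath, string wavPath, string textPath)
    {
        using var factory = WhisperFactory.FromPath(modelPath);
        using var processor = factory.CreateBuilder()
            .WithLanguage("zh") // recognize Chinese
            .Build();
        using var audioStream = File.OpenRead(wavPath);

        // Collect the text of every segment, one segment per line.
        var builder = new StringBuilder();
        await foreach (SegmentData segment in processor.ProcessAsync(audioStream))
        {
            builder.AppendLine(segment.Text);
        }

        string transcript = builder.ToString();
        await File.WriteAllTextAsync(textPath, transcript, Encoding.UTF8);
        return transcript;
    }
}
```

Writing the file as UTF-8 keeps the Chinese characters intact when the transcript is opened in other tools.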

December 26, 2024 • Asp.net