Calling the Baidu AI API from Python: Recognizing Text in Images

Update 2019-05-12: I have built a GUI for this. The code is still being cleaned up; leave a comment if you would like a copy.

Step 1: Go to the Baidu AI website

https://ai.baidu.com/

Register an account and create an application.

Step 2: Install the Baidu AI library

pip install baidu-aip

This assumes a Python environment is already installed.

Step 3: Call the Baidu AI API

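from aip import AipOcr
#import configparser

# Fill in the three credentials of your application here
api_id = ''
api_key = ''
secret_key = ''
client = AipOcr(api_id, api_key, secret_key)    # everything up to here is boilerplate
with open('logo.jpg', 'rb') as f:
    img = f.read()
# basicGeneral is the general-purpose OCR call (50,000 calls per day at the time of writing);
# the high-accuracy variant is basicAccurate. See the aipocr module documentation linked below.
text = client.basicGeneral(img)
# The response is a dict parsed from JSON; the recognized lines are under 'words_result'.
# Fall back to an empty list in case the key is missing.
for each in text.get('words_result', []):
    print(each.get('words'))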
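The loop above only prints whatever comes back. Here is a minimal sketch of a more defensive version that works on the plain response dict, so you can try it without a network call. It assumes that a failed call returns a dict carrying error_code and error_msg instead of words_result; extract_lines is a hypothetical helper name, not part of the SDK.

```python
def extract_lines(response):
    """Return the recognized text lines from an OCR response dict.

    Assumption: an error response has 'error_code' / 'error_msg' keys
    and no 'words_result' key.
    """
    if 'error_code' in response:
        raise RuntimeError(
            f"OCR failed: {response.get('error_code')} {response.get('error_msg')}"
        )
    return [item.get('words', '') for item in response.get('words_result', [])]


# Hand-written sample in the shape the loop above expects
sample = {'words_result': [{'words': 'hello'}, {'words': 'world'}]}
print(extract_lines(sample))
```

With the real client you would call extract_lines(client.basicGeneral(img)).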


The Python SDK can be installed in either of the following ways:

If pip is installed, run pip install baidu-aip.
If setuptools is installed, run python setup.py install from the SDK source directory.
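To confirm the install worked before running the experiments, a quick check like the following sketch may help. It only relies on the fact that the package is imported as aip (as in the examples here); sdk_installed is a hypothetical helper name.

```python
import importlib.util


def sdk_installed(module_name='aip'):
    """Return True if the given top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None


if not sdk_installed():
    print("baidu-aip is not installed; run: pip install baidu-aip")
```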

Experiment 1: Speech synthesis (text to speech)

from aip import AipSpeech

""" Your APPID, AK and SK """
APP_ID = 'your registered APP_ID'
API_KEY = 'your registered API_KEY'
SECRET_KEY = 'your registered SECRET_KEY'

client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)

result  = client.synthesis('中南山下活死人墓预知后事如何请听下回分解', 'zh', 1, {
    'vol': 5,
    'spd': 3,
    'pit': 9,
    'per': 3,
})
# On success synthesis() returns the audio as bytes; on failure it returns a dict
# with an error code (see the error code reference at the end of this post)
if not isinstance(result, dict):
    with open('audio.mp3', 'wb') as f:
        f.write(result)
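The check above silently does nothing when synthesis fails. A small sketch that surfaces the error instead is shown below. It uses stand-in bytes rather than a real synthesis() result so it runs offline; save_audio and the demo file name are hypothetical.

```python
import os
import tempfile


def save_audio(result, path):
    """Write synthesized audio to path; raise if the SDK returned an error dict."""
    if isinstance(result, dict):
        raise RuntimeError(f"synthesis failed: {result}")
    with open(path, 'wb') as f:
        f.write(result)
    return path


# Stand-in bytes instead of a real synthesis() result
demo_path = os.path.join(tempfile.gettempdir(), 'demo_audio.mp3')
save_audio(b'fake mp3 bytes', demo_path)
print(os.path.getsize(demo_path))
```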

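Parameter notes: the positional arguments are the text, the language ('zh') and the client type (1). In the options dict, vol is the volume, spd the speaking speed, pit the pitch, and per the voice to use.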

Experiment 2: Speech recognition (speech to text) followed by speech synthesis

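from aip import AipSpeech
import os
import subprocess


""" Your APPID, AK and SK """
APP_ID = 'same as above'
API_KEY = 'same as above'
SECRET_KEY = 'same as above'

client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)

# Speech recognition (audio to text)

# Convert the file to PCM, then read it
def get_file_content(filePath):

    # Convert any input format automatically (just pass a different local file below).
    # Passing the arguments as a list keeps file names with spaces intact.
    pcm_path = f"{filePath}.pcm"
    subprocess.run(["ffmpeg", "-y", "-i", filePath, "-acodec", "pcm_s16le",
                    "-f", "s16le", "-ac", "1", "-ar", "16000", pcm_path], check=True)

    with open(pcm_path, 'rb') as fp:
        return fp.read()

# Recognize a local file
res=client.asr(get_file_content('ea.m4a'), 'pcm', 16000, {
    'dev_pid': 1536,
})

# The text transcribed from the audio
print(res['result'][0])

# Speech synthesis (turn the text back into audio)
result  = client.synthesis(res['result'][0], 'zh', 1, {
    'vol': 5,
    'spd': 3,
    'pit': 9,
    'per': 3,
})
# On success synthesis() returns the audio as bytes; on failure it returns a dict
# with an error code
if not isinstance(result, dict):
    with open('audio.mp3', 'wb') as f:
        f.write(result)

# Open the file with the system default player (this form works on Windows only)
os.system("audio.mp3")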
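Passing a bare file name to os.system() only opens the default player on Windows. A rough sketch of a cross-platform alternative follows. It assumes the open command on macOS and xdg-open on a Linux desktop; open_command and play are hypothetical helper names.

```python
import subprocess
import sys


def open_command(path, platform=sys.platform):
    """Build the command that opens path with the default application."""
    if platform.startswith('win'):
        # 'start' is a cmd built-in; the empty string is its window-title argument
        return ['cmd', '/c', 'start', '', path]
    if platform == 'darwin':
        return ['open', path]
    # Assumption: a Linux desktop with xdg-utils installed
    return ['xdg-open', path]


def play(path):
    """Open the audio file with whatever player the system associates with it."""
    subprocess.run(open_command(path), check=True)


print(open_command('audio.mp3', 'darwin'))
```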

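Notes:

  1. The speech recognition API accepts only a small set of audio formats. 16 kHz mono 16-bit PCM is the safest choice, and this command converts almost any input to it:

    ffmpeg -y -i audio.wav -acodec pcm_s16le -f s16le -ac 1 -ar 16000 audio.pcm

     Here audio.wav is the source file and audio.pcm is the converted output.

  2. os.system() runs a system command directly.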
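To make the flags in note 1 concrete, here is a small sketch that builds the same ffmpeg command as an argument list and works out how long a raw PCM clip is from its size. At 16000 samples per second, 2 bytes per sample and 1 channel, one second of audio is 32000 bytes. The helper names are hypothetical, and convert_to_pcm assumes ffmpeg is on your PATH.

```python
import subprocess


def ffmpeg_pcm_command(src, dst, sample_rate=16000):
    """Build the conversion command from note 1 as an argument list."""
    return ['ffmpeg', '-y', '-i', src, '-acodec', 'pcm_s16le',
            '-f', 's16le', '-ac', '1', '-ar', str(sample_rate), dst]


def convert_to_pcm(src, dst):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(ffmpeg_pcm_command(src, dst), check=True)


def pcm_duration_seconds(num_bytes, sample_rate=16000, channels=1, sample_width=2):
    """Length of raw PCM audio: bytes / (rate * channels * bytes per sample)."""
    return num_bytes / (sample_rate * channels * sample_width)


print(' '.join(ffmpeg_pcm_command('audio.wav', 'audio.pcm')))
print(pcm_duration_seconds(64000))
```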

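Preparing to call the API

1. Go to:
https://console.bce.baidu.com/?_=1535519624081&fromai=1#/aip/overview
2. Under Products and Services, choose Text Recognition and click Create Application.
3. Select Text Recognition and finish creating the application.
4. Open the application you created. It shows three values: AppID, API Key and Secret Key. Put them into the three corresponding places in the examples.
5. Study the examples, then write your own program to fit your needs.
6. This forum thread is also a useful reference: http://ai.baidu.com/forum/topic/show/867951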
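Rather than pasting the three values from step 4 into every script, you could read them from environment variables. The sketch below is one way to do that. The variable names BAIDU_APP_ID, BAIDU_API_KEY and BAIDU_SECRET_KEY are made up for illustration, and load_credentials is a hypothetical helper.

```python
import os


def load_credentials(env=os.environ):
    """Read the three credentials from environment variables.

    The variable names are hypothetical; choose any names you like.
    """
    names = ('BAIDU_APP_ID', 'BAIDU_API_KEY', 'BAIDU_SECRET_KEY')
    missing = [n for n in names if not env.get(n)]
    if missing:
        raise KeyError('missing environment variables: ' + ', '.join(missing))
    return tuple(env[n] for n in names)


# Demonstration with a plain dict standing in for os.environ
app_id, api_key, secret_key = load_credentials({
    'BAIDU_APP_ID': '123', 'BAIDU_API_KEY': 'ak', 'BAIDU_SECRET_KEY': 'sk',
})
print(app_id)
```

The returned tuple can be unpacked straight into AipOcr(...) or AipSpeech(...).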

aipocr module reference

http://ai.baidu.com/docs#/OCR-Python-SDK/top

Method 2: Call the REST API directly, following the official documentation

Example
(animal recognition)
1. Get an access_token (valid for 30 days)

import requests

api_key = ''
secret_key = ''
url = 'https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=' + str(api_key) + '&client_secret=' + str(secret_key)

# Parse the JSON response with .json() rather than eval(), which would execute
# whatever text the server sends back
res = requests.get(url).json()
access_token = res['access_token']
print(access_token)
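Since the token lasts 30 days, there is no need to request a new one on every run. The sketch below builds the token URL with proper escaping and checks whether a stored token is still usable. It assumes the token response includes an expires_in field giving the lifetime in seconds. The helper names and the one-hour safety margin are my own choices.

```python
import time
from urllib.parse import urlencode

TOKEN_ENDPOINT = 'https://aip.baidubce.com/oauth/2.0/token'


def build_token_url(api_key, secret_key):
    """Build the token URL, escaping any special characters in the keys."""
    query = urlencode({
        'grant_type': 'client_credentials',
        'client_id': api_key,
        'client_secret': secret_key,
    })
    return f'{TOKEN_ENDPOINT}?{query}'


def token_is_fresh(obtained_at, expires_in, now=None, margin=3600):
    """True while the token is still valid, minus a safety margin in seconds.

    Assumption: expires_in is the lifetime in seconds from the token response.
    """
    if now is None:
        now = time.time()
    return now < obtained_at + expires_in - margin


print(build_token_url('my key', 'my+secret'))
```

Store the time you obtained the token alongside it, and request a new one only when token_is_fresh returns False.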

2. Call the API

import requests
import base64

# Same credentials as in step 1
api_key = ''
secret_key = ''
url = 'https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=' + str(api_key) + '&client_secret=' + str(secret_key)

res = requests.get(url).json()
access_token = res['access_token']
#print(access_token)
animal = 'https://aip.baidubce.com/rest/2.0/image-classify/v1/animal?access_token=' + str(access_token)
header = {
    'Content-Type':'application/x-www-form-urlencoded'
}
data = {}
# The image is sent as a base64-encoded string
with open('animal.jpg', 'rb') as f:
    image = base64.b64encode(f.read())
    data["image"] = str(image, 'utf-8')

res2 = requests.post(url=animal, data=data, headers=header).json()
for each in res2['result']:
    print(each['name'], '\tsimilarity:', each['score'])

Output:
狮子  similarity: 0.997508
白狮  similarity: 9.78454e-05
中国藏獒    similarity: 4.06185e-05
藏獒  similarity: 3.36377e-05
美洲狮     similarity: 3.19272e-05
狮虎兽     similarity: 2.50163e-05

(The API returns the animal names in Chinese; the top result, 狮子, means "lion".)
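Most of the candidates in that output have a negligible score. As a sketch, the helper below keeps only the plausible ones and formats the score as a percentage. top_matches and the 0.01 cutoff are hypothetical, and the sample dict copies two rows from the output above.

```python
def top_matches(response, min_score=0.01):
    """Return (name, percent-string) pairs whose score is at least min_score."""
    matches = []
    for item in response.get('result', []):
        score = float(item['score'])  # tolerate scores delivered as strings
        if score >= min_score:
            matches.append((item['name'], f'{score:.2%}'))
    return matches


sample = {'result': [
    {'name': '狮子', 'score': 0.997508},
    {'name': '白狮', 'score': 9.78454e-05},
]}
print(top_matches(sample))
```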

Speech recognition example for reference

from aip import AipSpeech

APP_ID = ''
API_KEY = ''
SECRET_KEY = ''

client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)

# Read the file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

# Recognize a local file
result = client.asr(get_file_content('a.wav'), 'wav', 16000, {
    'dev_pid': 1536,
})
print(result)
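The example prints the raw response. To pull out just the text, something like the following sketch could be used. It assumes a successful response has err_no equal to 0 and a result list, while a failed one carries a non-zero err_no and an err_msg. first_transcript is a hypothetical helper name.

```python
def first_transcript(res):
    """Return the best transcript from an asr() response, or raise on failure.

    Assumption: success means err_no == 0 and a non-empty 'result' list.
    """
    if res.get('err_no', 0) != 0:
        raise RuntimeError(
            f"recognition failed: {res.get('err_no')} {res.get('err_msg')}"
        )
    results = res.get('result') or []
    if not results:
        raise RuntimeError('recognition returned no text')
    return results[0]


# Hand-written sample response, not real API output
sample = {'err_no': 0, 'err_msg': 'success.', 'result': ['hello world']}
print(first_transcript(sample))
```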

Face search example for reference

Note that you need to create a face library first; see: https://baijiahao.baidu.com/s?id=1597732584611121837&wfr=spider&for=pc

import requests
import base64

api_key = ''
secret_key = ''
url = 'https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=' + str(api_key) + '&client_secret=' + str(secret_key)

res = requests.get(url).json()
access_token = res['access_token']
face_search = 'https://aip.baidubce.com/rest/2.0/face/v3/search' + '?access_token=' + str(access_token)
header = {
    'Content-Type':'application/json'
}
data = {
    "image_type":"BASE64",
    "group_id_list":"group_1",  # create this face library in your face recognition application and add photos to it first
    # a few celebrity photos were added here
}
with open('3.jpg', 'rb') as f:
    image = base64.b64encode(f.read())
    data["image"] = str(image, 'utf-8')

# json=data sends the body as JSON, matching the Content-Type header;
# .json() parses the response into a dict
res2 = requests.post(url=face_search, json=data, headers=header).json()

print("Match:", res2["result"]["user_list"][0]["user_id"])
print("Score:", res2["result"]["user_list"][0]["score"])

Output:
Match: gakki
Score: 96.145095825195
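Indexing user_list[0] directly will fail when the search finds nobody. A more cautious sketch is below. It assumes a failed search comes back with a non-zero error_code and an empty result, and it treats scores under 80 as "no match". That threshold is an assumption you should tune for your own data, and best_face_match is a hypothetical helper name.

```python
def best_face_match(response, threshold=80.0):
    """Return (user_id, score) of the best match, or None if nothing qualifies.

    Assumptions: a non-zero error_code means failure; threshold is a
    hypothetical cutoff, not an official value.
    """
    if response.get('error_code', 0) != 0:
        return None
    result = response.get('result') or {}
    users = result.get('user_list') or []
    if not users:
        return None
    best = max(users, key=lambda u: u['score'])
    if best['score'] < threshold:
        return None
    return best['user_id'], best['score']


# Hand-written sample shaped like the response used above
sample = {'error_code': 0, 'result': {'user_list': [
    {'group_id': 'group_1', 'user_id': 'gakki', 'score': 96.145095825195},
]}}
print(best_face_match(sample))
```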

Official documentation reference

http://ai.baidu.com/docs#/Auth/top

API error messages and URL reference

http://aixiaoshuai.mydoc.io/?t=234527

Author:

Likes Go (the board game) and programming.