OpenCV + Qt 5.12.2: Text Detection and Text Recognition

Original
Author: 何其不顾四月天
Published: 2023-03-19 21:46:50
Column: 四月天的专栏

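Preface

It has been a while since my last update; I only published four posts last year. Recently I have been working on live-streaming services for audio and video, which meant learning and building up experience all over again. The pandemic was also fairly severe last year, and after the lockdown lifted I stayed busy. Now that I have some time again, I have picked OpenCV back up. The code for this post was actually finished right after the previous post on OpenCV camera handling; I just never wrote it up, so I am doing that now.

Introduction

Both text detection and text recognition are implemented with OpenCV's own extension (contrib) modules. The basic flow follows the "OpenCV text detection and recognition module" articles listed below; my part was integrating Qt with OpenCV and packaging the result as a project. The rough workflow is: select an image, select a function, save the image.

Searching both inside and outside China turned up several nearly identical documents whose original source I could not determine, so I list them all:

OpenCV 文字檢測與識別模塊 - 台部落 / OpenCV 文字检测与识别模块 - CSDN

OPENCV 文字检测与识别模块 - 灰信网

The documents are essentially the same. The CSDN and 灰信网 copies are identical. The 台部落 copy differs only in its resource paths: it links to the original model locations, while CSDN and 灰信网 both point to the same network drive. The 台部落 and CSDN authors share the same name, so the 灰信网 copy is presumably the repost.

Resource paths

Building was covered in the previous two posts; see: OpenCv4.4.0+Qt5.12.2+OpenCv-Contrib-4.4.0

The resources needed for this post are as follows:

Text detection

The resource files are described in textDetector.hpp, lines 37-39: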

Code language: c
/** @brief TextDetectorCNN class provides the functionallity of text bounding box detection.
 This class is representing to find bounding boxes of text words given an input image.
 This class uses OpenCV dnn module to load pre-trained model described in @cite LiaoSBWL17.
 The original repository with the modified SSD Caffe version: https://github.com/MhLiao/TextBoxes.
 Model can be downloaded from [DropBox](https://www.dropbox.com/s/g8pjzv2de9gty8g/TextBoxes_icdar13.caffemodel?dl=0).
 Modified .prototxt file with the model description can be found in `opencv_contrib/modules/text/samples/textbox.prototxt`.
 */

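textbox.prototxt - located in the local contrib module directory; follow the path given in the comment above.

TextBoxes_icdar13.caffemodel - download from the DropBox link given in the comment above.

Text recognition

The required resources are described on the relevant pages (OpenCV.org, text_recognition_cnn.cpp), although they only list the download paths. The original blog post quotes them as follows: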

Code language: c++
    cout << "   Demo of text recognition CNN for text detection." << endl
         << "   Max Jaderberg et al.: Reading Text in the Wild with Convolutional Neural Networks, IJCV 2015"<<endl<<endl
         << "   Usage: " << progFname << " <output_file> <input_image>" << endl
         << "   Caffe Model files (textbox.prototxt, TextBoxes_icdar13.caffemodel)"<<endl
         << "     must be in the current directory. See the documentation of text::TextDetectorCNN class to get download links." << endl
         << "   Obtaining text recognition Caffe Model files in linux shell:" << endl
         << "   wget http://nicolaou.homouniversalis.org/assets/vgg_text/dictnet_vgg.caffemodel" << endl
         << "   wget http://nicolaou.homouniversalis.org/assets/vgg_text/dictnet_vgg_deploy.prototxt" << endl
         << "   wget http://nicolaou.homouniversalis.org/assets/vgg_text/dictnet_vgg_labels.txt" <<endl << endl;

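These paths are no longer valid.

Only snapshots of vgg_text remain, and they contain just the two smaller files; the model file itself is no longer available. In the end I fell back on the CSDN author's copy and downloaded it from Baidu Netdisk, which was a painful process.

The other resource files are mostly found under the module's own directory: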

Code language: shell
trained_classifierNM1.xml
trained_classifierNM2.xml
OCRHMM_transitions_table.xml
OCRHMM_knn_model_data.xml.gz
trained_classifier_erGrouping.xml

They are located in:

Code language: shell
opencv_contrib-4.4.0\modules\text\samples

Some other image resources can also be found in that directory.
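The header declares a fileExists() helper whose body is not listed in this post. A minimal sketch, assuming it only needs to test whether a file can be opened for reading from the current directory; missingResources is a hypothetical helper added here for illustration and is not part of the project:

```cpp
#include <fstream>
#include <string>
#include <vector>

// Returns true if the file can be opened for reading.
bool fileExists(const std::string &filename)
{
    std::ifstream f(filename.c_str());
    return f.good();
}

// Hypothetical helper: returns the names from `required` that cannot be opened,
// so the caller can report every missing resource instead of only the first one.
std::vector<std::string> missingResources(const std::vector<std::string> &required)
{
    std::vector<std::string> missing;
    for (const std::string &name : required)
    {
        if (!fileExists(name))
            missing.push_back(name);
    }
    return missing;
}
```

Passing the five file names listed above to missingResources before starting recognition would tell the user exactly which files still need to be copied next to the executable.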

Code

Header file:

Code language: c++
#ifndef MAINWINDOW_H
#define MAINWINDOW_H

#include <QMainWindow>
#include <iostream>
#include <fstream>
#include <vector>
#include <opencv2/opencv.hpp>
#include <opencv2/text.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/dnn.hpp>
#include <opencv2/features2d.hpp>

class ParallelExtracCSER: public cv::ParallelLoopBody
{
private:
    std::vector<cv::Mat> &channels;
    std::vector<std::vector<cv::text::ERStat>> &regions;
    std::vector<cv::Ptr<cv::text::ERFilter>> erFiter_1;
    std::vector<cv::Ptr<cv::text::ERFilter>> erFiter_2;
public:
    ParallelExtracCSER(std::vector<cv::Mat> &_channels, std::vector<std::vector<cv::text::ERStat>> &_regions,
                       std::vector<cv::Ptr<cv::text::ERFilter>> _erFiter_1, std::vector<cv::Ptr<cv::text::ERFilter>> _erFiter_2)
        : channels(_channels), regions(_regions), erFiter_1(_erFiter_1), erFiter_2(_erFiter_2){}
    virtual void operator()( const cv::Range &r) const CV_OVERRIDE
    {
        for(int c = r.start; c < r.end; c++)
        {
            erFiter_1[c]->run(channels[c], regions[c]);
            erFiter_2[c]->run(channels[c], regions[c]);
        }
    }
    ParallelExtracCSER & operator=(const ParallelExtracCSER &a);
};

template  <class T>
class ParallelOCR: public cv::ParallelLoopBody
{
private:
    std::vector<cv::Mat> &detections;
    std::vector<std::string> &outputs;
    std::vector<std::vector<cv::Rect> > &boxes;
    std::vector<std::vector<std::string> > &words;
    std::vector<std::vector<float> > &confidences;
    std::vector<cv::Ptr<T> > &ocrs;
public:
    ParallelOCR(std::vector<cv::Mat> &_detections, std::vector<std::string> &_outputs, std::vector< std::vector<cv::Rect> > &_boxes,
                std::vector< std::vector<std::string> > &_words, std::vector< std::vector<float> > &_confidences,
                std::vector< cv::Ptr<T> > &_ocrs):detections(_detections),outputs(_outputs),boxes(_boxes),words(_words),confidences(_confidences),ocrs(_ocrs)
    {}

    virtual void operator()(const cv::Range &r) const CV_OVERRIDE
    {
        for(int c=r.start; c < r.end; c++)
        {
            ocrs[c%ocrs.size()]->run(detections[c], outputs[c], &boxes[c], &words[c], &confidences[c], cv::text::OCR_LEVEL_WORD);
        }
    }
    ParallelOCR & operator=(const ParallelOCR &a);
};

namespace Ui {
class MainWindow;
}

class MainWindow : public QMainWindow
{
    Q_OBJECT

public:
    explicit MainWindow(QWidget *parent = nullptr);
    ~MainWindow();

private:
    Ui::MainWindow *ui;
    void WindowInit();
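    std::string sourcePath;
    QString currentActive; // tag of the last operation; slot_saveImage() reads it, but its declaration is missing from the original listing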
    void showImage(cv::Mat &image);
    bool fileExists(const std::string &filename);
    void textboxDraw(cv::Mat src, std::vector<cv::Rect> &groups, std::vector<float> &probs, std::vector<int> &indexes);
    bool isRepetitive(const std::string &s);
    void erDraw(std::vector<cv::Mat> &channels, std::vector<std::vector<cv::text::ERStat>> &regions, std::vector<cv::Vec2i> group, cv::Mat segmentation);

public slots:
    void slot_importImage();
    void slot_saveImage();
    void slot_textDetector();
    void slot_textRecognizer();
};


#endif // MAINWINDOW_H

MainWindow is the main control (Ctrl) class. The other two classes, ParallelExtracCSER and ParallelOCR, are worker classes that implement the core processing.
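ParallelOCR maps detection index c to the engine ocrs[c % ocrs.size()], and slot_textRecognizer() feeds detections to cv::parallel_for_ in batches of numOcrs, so no two detections in the same batch share an OCR engine. The index arithmetic can be sketched without OpenCV; makeBatches and engineFor are hypothetical names used only for illustration:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Splits [0, total) into consecutive half-open ranges [begin, end)
// that each hold at most batchSize items.
std::vector<std::pair<int, int> > makeBatches(int total, int batchSize)
{
    std::vector<std::pair<int, int> > batches;
    if (batchSize <= 0)
        return batches;
    for (int i = 0; i < total; i += batchSize)
        batches.push_back(std::make_pair(i, std::min(i + batchSize, total)));
    return batches;
}

// Index of the engine that handles detection c when engines are reused round-robin.
std::size_t engineFor(int c, std::size_t engineCount)
{
    return static_cast<std::size_t>(c) % engineCount;
}
```

With 23 detections and 10 engines the batches are [0, 10), [10, 20) and [20, 23). Because every batch starts at a multiple of the engine count, the indices inside one batch map to distinct engines.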

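Function implementations

Slot functions

These correspond to the four main functions: image import, image save, text detection, and text recognition.

1. slot_importImage()
Code language: c++
void MainWindow::slot_importImage()
{
    // An empty path means the user cancelled the dialog.
    QString imagePath = QFileDialog::getOpenFileName(this,"Select image","./","Images (*.png *.jpg *.jpeg)");
    if(imagePath.isEmpty())
        return;

    QImage image;
    if(!image.load(imagePath))
    {
        qDebug() << "Failed to load image:" << imagePath;
        return;
    }
    qDebug() << "Image loaded:" << imagePath;
    sourcePath = QDir::toNativeSeparators(imagePath).toStdString();
    qDebug() << "Image path:" << QDir::toNativeSeparators(imagePath);
    int imageWidth = image.width();
    int imageHeight = image.height();

    // Fit the image within 640x480 while keeping its aspect ratio.
    // Multiply before dividing so integer division does not truncate the ratio to zero.
    if(imageWidth > 640)
    {
        imageHeight = std::max(1, imageHeight * 640 / imageWidth);
        imageWidth = 640;
    }

    if(imageHeight > 480)
    {
        imageWidth = std::max(1, imageWidth * 480 / imageHeight);
        imageHeight = 480;
    }

    image = image.scaled(imageWidth, imageHeight, Qt::IgnoreAspectRatio, Qt::SmoothTransformation);
    this->resize(imageWidth*2+2,imageHeight);
    ui->label_source->setPixmap(QPixmap::fromImage(image));
}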
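2. slot_saveImage()
Code language: c++
void MainWindow::slot_saveImage()
{
    if(currentActive.isEmpty() || sourcePath.empty())
    {
        qDebug() << "currentActive is empty:" << currentActive.isEmpty() << " sourcePath is empty:" << sourcePath.empty();
        return;
    }
    // QFileInfo (requires #include <QFileInfo>) splits the path reliably even when
    // a directory name or the file name contains extra dots.
    QFileInfo info(QDir::fromNativeSeparators(QString::fromStdString(sourcePath)));
    if(info.suffix().isEmpty())
    {
        qDebug() << info.filePath() << " has no file extension, so the image format is unknown";
        return;
    }
    // label_result has no pixmap until a detection or recognition result has been shown.
    const QPixmap *result = ui->label_result->pixmap();
    if(result == nullptr || result->isNull())
    {
        qDebug() << "no result image to save";
        return;
    }
    // "dir/photo.png" becomes "dir/photo_<currentActive>.png"
    QString saveName = info.path() + "/" + info.completeBaseName() + "_" + currentActive + "." + info.suffix();
    if(result->save(saveName, info.suffix().toStdString().c_str()))
    {
        qDebug() << saveName << " save success.";
    }
    else
    {
        qDebug() << saveName << " save fail.";
    }
}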
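The output name is the source path with _<currentActive> inserted before the extension. A plain C++ sketch of that rule, which looks for the last dot after the last path separator so that dots in directory names are not mistaken for the extension; makeSaveName is a hypothetical helper, and the tag value in the usage note is invented for illustration:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: inserts "_<tag>" before the file extension.
// Returns an empty string when the file name has no usable extension.
std::string makeSaveName(const std::string &sourcePath, const std::string &tag)
{
    const std::size_t sep = sourcePath.find_last_of("/\\");
    const std::size_t dot = sourcePath.find_last_of('.');
    const std::size_t nameStart = (sep == std::string::npos) ? 0 : sep + 1;
    // The dot must belong to the file name (not a directory), must not be
    // the first character of the file name, and must not be the last character.
    if (dot == std::string::npos || dot <= nameStart || dot + 1 == sourcePath.size())
        return std::string();
    return sourcePath.substr(0, dot) + "_" + tag + sourcePath.substr(dot);
}
```

For example, makeSaveName("my.dir/img.v2.jpg", "ocr") gives "my.dir/img.v2_ocr.jpg", whereas splitting the same path on its first dot would cut it inside the directory name.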
3. slot_textDetector()
Code language: c++
void MainWindow::slot_textDetector()
{
    const std::string modelArch = "textbox.prototxt" ;
    const std::string moddelWeights = "TextBoxes_icdar13.caffemodel";
    if(!fileExists(modelArch) || !fileExists(moddelWeights))
    {
        qDebug() << "Model files not found in the current directory. Aborting!";
        return;
    }

    if(sourcePath.empty())
    {
        qDebug() << "Invalid image path; please check that the image exists!";
        return;
    }
    cv::Mat image = cv::imread(sourcePath, cv::IMREAD_COLOR);
    if(image.empty())
    {
        qDebug() << "image is empty" << sourcePath.c_str();
        return;
    }

    qDebug() << "Starting Text Box Demo";
    cv::Ptr<cv::text::TextDetectorCNN> textSpotter = cv::text::TextDetectorCNN::create(modelArch, moddelWeights);
    std::vector<cv::Rect> bbox;
    std::vector<float> outProbabillities;
    textSpotter->detect(image, bbox, outProbabillities);
    std::vector<int> indexes;
    cv::dnn::NMSBoxes(bbox, outProbabillities, 0.4f, 0.5f, indexes);

    cv::Mat imageCopy = image.clone();
//    float threshold = 0.5;
//    for(int i = 0; i < bbox.size(); i++)
//    {
//        if(outProbabillities[i] > threshold)
//        {
//            cv::Rect rect = bbox[i];
//            cv::rectangle(imageCopy,rect,cv::Scalar(255,0,0),2);
//        }
//    }
    textboxDraw(imageCopy, bbox, outProbabillities, indexes);
    showImage(imageCopy);

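    imageCopy = image.clone();
    // The word recognizer needs its own model files; stop early if any is missing.
    if(!fileExists("dictnet_vgg_deploy.prototxt") || !fileExists("dictnet_vgg.caffemodel") || !fileExists("dictnet_vgg_labels.txt"))
    {
        qDebug() << "dictnet_vgg model files not found in the current directory. Aborting!";
        return;
    }
    cv::Ptr<cv::text::OCRHolisticWordRecognizer> wordSpotter =
            cv::text::OCRHolisticWordRecognizer::create("dictnet_vgg_deploy.prototxt", "dictnet_vgg.caffemodel", "dictnet_vgg_labels.txt");
    const cv::Rect imageBounds(0, 0, image.cols, image.rows);
    for(size_t i = 0; i < indexes.size(); i++)
    {
        // Clip the box to the image so the ROI below cannot fall outside it.
        cv::Rect currrentBox = bbox[indexes[i]] & imageBounds;
        if(currrentBox.empty())
            continue;

        cv::Mat wordImg;
        cv::cvtColor(image(currrentBox),wordImg, cv::COLOR_BGR2GRAY);
        std::string word;
        std::vector<float> confs;
        wordSpotter->run(wordImg, word, nullptr, nullptr, &confs);
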
        rectangle(imageCopy, currrentBox, cv::Scalar( 0, 255, 255 ), 2, cv::LINE_AA);

        int baseLine = 0;
        cv::Size labelSize = cv::getTextSize(word, cv::FONT_HERSHEY_PLAIN, 1, 1, &baseLine);
        int yLeftBottom = std::max(currrentBox.y, labelSize.height);
        rectangle(imageCopy, cv::Point(currrentBox.x, yLeftBottom - labelSize.height),
                  cv::Point(currrentBox.x +labelSize.width, yLeftBottom + baseLine), cv::Scalar( 255, 255, 255 ), cv::FILLED);

        putText(imageCopy, word, cv::Point(currrentBox.x , yLeftBottom), cv::FONT_HERSHEY_PLAIN, 1, cv::Scalar( 0,0,0 ), 1, cv::LINE_AA);
    }
    showImage(imageCopy);
}
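The call cv::dnn::NMSBoxes(bbox, outProbabillities, 0.4f, 0.5f, indexes) filters out low-scoring boxes (score threshold 0.4) and then suppresses boxes that overlap a better-scoring box too strongly (overlap threshold 0.5). A simplified, OpenCV-free sketch of greedy non-maximum suppression follows; the Box struct and function names are illustrative, and details such as tie-breaking may differ from OpenCV's implementation:

```cpp
#include <algorithm>
#include <vector>

struct Box { int x, y, width, height; };

// Intersection-over-union of two axis-aligned boxes.
double iou(const Box &a, const Box &b)
{
    const int x1 = std::max(a.x, b.x);
    const int y1 = std::max(a.y, b.y);
    const int x2 = std::min(a.x + a.width, b.x + b.width);
    const int y2 = std::min(a.y + a.height, b.y + b.height);
    const double inter = static_cast<double>(std::max(0, x2 - x1)) * std::max(0, y2 - y1);
    const double uni = static_cast<double>(a.width) * a.height
                     + static_cast<double>(b.width) * b.height - inter;
    return uni > 0.0 ? inter / uni : 0.0;
}

// Returns the indices of the kept boxes, highest score first.
std::vector<int> greedyNms(const std::vector<Box> &boxes, const std::vector<float> &scores,
                           float scoreThreshold, float iouThreshold)
{
    // 1. Keep only candidates above the score threshold, sorted by descending score.
    std::vector<int> order;
    for (int i = 0; i < static_cast<int>(boxes.size()); ++i)
    {
        if (scores[i] > scoreThreshold)
            order.push_back(i);
    }
    std::stable_sort(order.begin(), order.end(),
                     [&scores](int a, int b) { return scores[a] > scores[b]; });

    // 2. Accept a candidate only if it does not overlap an accepted box too much.
    std::vector<int> kept;
    for (int candidate : order)
    {
        bool suppressed = false;
        for (int k : kept)
        {
            if (iou(boxes[candidate], boxes[k]) > iouThreshold)
            {
                suppressed = true;
                break;
            }
        }
        if (!suppressed)
            kept.push_back(candidate);
    }
    return kept;
}
```

The returned indices play the same role as indexes in the slot above: later code only visits bbox[indexes[i]], so suppressed boxes are neither drawn nor passed to the word recognizer.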
4. slot_textRecognizer()
Code language: c++
void MainWindow::slot_textRecognizer()
{
    if(sourcePath.empty())
    {
        qDebug() << "Invalid image path; please check that the image exists!";
        return;
    }
    cv::Mat image = cv::imread(sourcePath, cv::IMREAD_COLOR);
    if(image.empty())
    {
        qDebug() << "image is empty" << sourcePath.c_str();
        return;
    }

    bool downsize = false;
    int RegionType = 1;
    int GroupingAlgorithm = 0;
    int Recongnition = 0;
    cv::String regionTypeString[2] = {"ERStats","MSER"};
    cv::String GroupingAlgorithmsStr[2] = {"exhaustive_search", "multioriented"};
    cv::String recognitionsStr[2] = {"Tesseract", "NM_chain_features + KNN"};

    std::vector<cv::Mat> channels;
    std::vector<std::vector<cv::text::ERStat>> regions(2);

    cv::Mat gray,outImage;
    // Create ERFilter objects with the 1st and 2nd stage default classifiers
    // since er algorithm is not reentrant we need one filter for channel
    std::vector< cv::Ptr<cv::text::ERFilter> > erFilters1;
    std::vector< cv::Ptr<cv::text::ERFilter> > erFilters2;

    if(!fileExists("trained_classifierNM1.xml") || !fileExists("trained_classifierNM2.xml")
            || !fileExists("OCRHMM_transitions_table.xml") || !fileExists("OCRHMM_knn_model_data.xml.gz") || !fileExists("trained_classifier_erGrouping.xml"))
    {
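        qDebug() << "Required files (trained_classifierNM1.xml, trained_classifierNM2.xml, OCRHMM_transitions_table.xml, OCRHMM_knn_model_data.xml.gz, trained_classifier_erGrouping.xml) not found in the current directory!";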
        return;
    }

    for(int i = 0; i<2; i++ )
    {
        cv::Ptr<cv::text::ERFilter> erFilter1 = createERFilterNM1(cv::text::loadClassifierNM1("trained_classifierNM1.xml"), 8, 0.00015f, 0.13f, 0.2f, true, 0.1f);
        cv::Ptr<cv::text::ERFilter> erFilter2 = createERFilterNM2(cv::text::loadClassifierNM2("trained_classifierNM2.xml"), 0.5);
        erFilters1.push_back(erFilter1);
        erFilters2.push_back(erFilter2);
    }

    int numOcrs = 10;
    std::vector<cv::Ptr<cv::text::OCRTesseract>> ocrs;
    for(int o = 0; o < numOcrs; o++)
    {
        ocrs.push_back(cv::text::OCRTesseract::create());
    }

    cv::Mat transitionP;
    std::string filename = "OCRHMM_transitions_table.xml";
    cv::FileStorage fs(filename, cv::FileStorage::READ);
    fs["transition_probabilities"] >> transitionP;
    fs.release();

    cv::Mat emissionP = cv::Mat::eye(62, 62, CV_64FC1);
    std::string voc = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    std::vector< cv::Ptr<cv::text::OCRHMMDecoder>> decoders;

    for(int o = 0; o <numOcrs; o++)
    {
        decoders.push_back(cv::text::OCRHMMDecoder::create(cv::text::loadOCRHMMClassifierNM("OCRHMM_knn_model_data.xml.gz"),
                          voc, transitionP, emissionP));
    }

    double tAll = (double)cv::getTickCount();

    // Note: this resizes to the image's current size, so it has no effect even when downsize is true.
    if(downsize)
        cv::resize(image,image,cv::Size(image.size().width,image.size().height),0,0,cv::INTER_LINEAR_EXACT);
    cv::cvtColor(image, gray, cv::COLOR_BGR2GRAY);
    channels.clear();
    channels.push_back(gray);
    channels.push_back(255 - gray);

    regions[0].clear();
    regions[1].clear();

    switch (RegionType) {
    case 0:
        cv::parallel_for_(cv::Range(0, (int)channels.size()), ParallelExtracCSER(channels, regions, erFilters1, erFilters2));
        break;
    case 1:
    {
        std::vector<std::vector<cv::Point>> contours;
        std::vector<cv::Rect> bboxes;
        cv::Ptr<cv::MSER> mesr = cv::MSER::create(21, (int)(0.00002*gray.cols*gray.rows), (int)(0.05*gray.cols * gray.rows), 1, 0.7);
        mesr->detectRegions(gray, contours, bboxes);

        if(contours.size() > 0)
            MSERsToERStats(gray, contours, regions);
    }
    break;
    }

    std::vector< std::vector<cv::Vec2i>> nmRegionGroups;
    std::vector<cv::Rect> nmBoxes;
    switch (GroupingAlgorithm) {
    case 0:
        cv::text::erGrouping(image, channels, regions, nmRegionGroups, nmBoxes, cv::text::ERGROUPING_ORIENTATION_HORIZ);
        break;
    case 1:
        cv::text::erGrouping(image, channels, regions, nmRegionGroups, nmBoxes, cv::text::ERGROUPING_ORIENTATION_ANY, "trained_classifier_erGrouping.xml", 0.5);
        break;
    }

    /*Text Recognition (OCR)*/
    // outImage is still empty at this point, so derive the bar height from the input image.
    int bottom_bar_height = image.rows/7 ;
    cv::copyMakeBorder(image, outImage, 0, bottom_bar_height, 0, 0, cv::BORDER_CONSTANT, cv::Scalar(150, 150, 150));
    float scale_font = (float)(bottom_bar_height /85.0);
    std::vector<std::string> words_detection;
    float min_confidence1 = 0.f, min_confidence2 = 0.f;

    if (Recongnition == 0)
    {
        min_confidence1 = 51.f;
        min_confidence2 = 60.f;
    }

    std::vector<cv::Mat> detections;

    for (int i=0; i<(int)nmBoxes.size(); i++)
    {
        rectangle(outImage, nmBoxes[i].tl(), nmBoxes[i].br(), cv::Scalar(255,255,0),3);

        cv::Mat group_img = cv::Mat::zeros(image.rows+2, image.cols+2, CV_8UC1);
        erDraw(channels, regions, nmRegionGroups[i], group_img);
        group_img(nmBoxes[i]).copyTo(group_img);
        copyMakeBorder(group_img,group_img,15,15,15,15,cv::BORDER_CONSTANT,cv::Scalar(0));
        detections.push_back(group_img);
    }
    std::vector<std::string> outputs((int)detections.size());
    std::vector< std::vector<cv::Rect> > boxes((int)detections.size());
    std::vector< std::vector<std::string> > words((int)detections.size());
    std::vector< std::vector<float> > confidences((int)detections.size());
    // parallel process detections in batches of ocrs.size() (== num_ocrs)
    for (int i=0; i<(int)detections.size(); i=i+(int)numOcrs)
    {
        cv::Range r;
        if (i+(int)numOcrs <= (int)detections.size())
            r = cv::Range(i,i+(int)numOcrs);
        else
            r = cv::Range(i,(int)detections.size());

        switch(Recongnition)
        {
        case 0: // Tesseract
            qDebug() << "+++++";
            cv::parallel_for_(r, ParallelOCR<cv::text::OCRTesseract>(detections, outputs, boxes, words, confidences, ocrs));
            qDebug() << "---";
            break;
        case 1: // NM_chain_features + KNN
            cv::parallel_for_(r, ParallelOCR<cv::text::OCRHMMDecoder>(detections, outputs, boxes, words, confidences, decoders));
            break;
        }
    }
    for(auto &it : outputs)
    {
        qDebug() << QString::fromStdString(it);
    }
    for (int i=0; i<(int)detections.size(); i++)
    {
        outputs[i].erase(remove(outputs[i].begin(), outputs[i].end(), '\n'), outputs[i].end());
        //cout << "OCR output = \"" << outputs[i] << "\" length = " << outputs[i].size() << endl;
        if (outputs[i].size() < 3)
            continue;

        for (int j=0; j<(int)boxes[i].size(); j++)
        {
            boxes[i][j].x += nmBoxes[i].x-15;
            boxes[i][j].y += nmBoxes[i].y-15;

            //cout << "  word = " << words[j] << "\t confidence = " << confidences[j] << endl;
            if ((words[i][j].size() < 2) || (confidences[i][j] < min_confidence1) ||
                    ((words[i][j].size()==2) && (words[i][j][0] == words[i][j][1])) ||
                    ((words[i][j].size()< 4) && (confidences[i][j] < min_confidence2)) ||
                    isRepetitive(words[i][j]))
                continue;
            words_detection.push_back(words[i][j]);
            rectangle(outImage, boxes[i][j].tl(), boxes[i][j].br(), cv::Scalar(255,0,255),3);
            cv::Size word_size = getTextSize(words[i][j], cv::FONT_HERSHEY_SIMPLEX, (double)scale_font, (int)(3*scale_font), nullptr);
            cv::rectangle(outImage, boxes[i][j].tl()-cv::Point(3,word_size.height+3), boxes[i][j].tl()+cv::Point(word_size.width,0), cv::Scalar(255,0,255),-1);
            cv::putText(outImage, words[i][j], boxes[i][j].tl()-cv::Point(1,1), cv::FONT_HERSHEY_SIMPLEX, scale_font, cv::Scalar(255,255,255),(int)(3*scale_font));
        }
    }
    tAll = ((double)cv::getTickCount() - tAll)*1000/cv::getTickFrequency();
    int text_thickness = 1+(outImage.rows/500);
    std::string fps_info = cv::format("%2.1f Fps. %dx%d", (float)(1000 / tAll), image.cols, image.rows);
    cv::putText(outImage, fps_info, cv::Point( 10,outImage.rows-5 ), cv::FONT_HERSHEY_DUPLEX, scale_font, cv::Scalar(255,0,0), text_thickness);
    cv::putText(outImage, regionTypeString[RegionType], cv::Point((int)(outImage.cols*0.5), outImage.rows - (int)(bottom_bar_height/ 1.5)), cv::FONT_HERSHEY_DUPLEX, scale_font, cv::Scalar(255,0,0), text_thickness);
    cv::putText(outImage, GroupingAlgorithmsStr[GroupingAlgorithm], cv::Point((int)(outImage.cols*0.5),outImage.rows-((int)(bottom_bar_height /3)+4) ), cv::FONT_HERSHEY_DUPLEX, scale_font, cv::Scalar(255,0,0), text_thickness);
    // Use the recognition-method names here, not the region-type names.
    cv::putText(outImage, recognitionsStr[Recongnition], cv::Point((int)(outImage.cols*0.5),outImage.rows-5 ), cv::FONT_HERSHEY_DUPLEX, scale_font, cv::Scalar(255,0,0), text_thickness);
    showImage(outImage);
}
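The body of isRepetitive() is not listed in this post. The sketch below is an assumption about what it does, modelled on the helper of the same name in the opencv_contrib text samples: it flags strings dominated by the look-alike characters 'i', 'l' and 'I', or by a single repeated character. acceptWord is a hypothetical wrapper that restates the filter conditions from the drawing loop above:

```cpp
#include <string>

// Heuristic: returns true when the string looks like OCR noise.
bool isRepetitive(const std::string &s)
{
    if (s.empty())
        return false;
    int lookAlike = 0, sameAsFirst = 0, sameAsLast = 0;
    const char first = s[0];
    const char last = s[s.size() - 1];
    for (char ch : s)
    {
        if (ch == 'i' || ch == 'l' || ch == 'I')
            ++lookAlike;
        if (ch == first)
            ++sameAsFirst;
        if (ch == last)
            ++sameAsLast;
    }
    const int n = static_cast<int>(s.size());
    return lookAlike > (n + 1) / 2 || sameAsFirst == n || sameAsLast > (n * 2) / 3;
}

// Hypothetical wrapper: returns true if a recognized word should be kept and drawn.
bool acceptWord(const std::string &word, float confidence,
                float minConfidence1, float minConfidence2)
{
    if (word.size() < 2 || confidence < minConfidence1)
        return false;                      // too short or too uncertain
    if (word.size() == 2 && word[0] == word[1])
        return false;                      // two identical characters
    if (word.size() < 4 && confidence < minConfidence2)
        return false;                      // short words need higher confidence
    return !isRepetitive(word);
}
```

With the Tesseract thresholds used above (51 and 60), a three-letter word at confidence 55 is rejected while a five-letter word at the same confidence is kept.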

Control function

Code language: c++
void MainWindow::WindowInit()
{
    // Set up the menus
    QMenu* file = ui->menuBar->addMenu(QString("File"));
    QAction* importImage = file->addAction(QString("Select image"));
    QAction* saveImage = file->addAction(QString("Save"));

    QMenu* function = ui->menuBar->addMenu(QString("Functions"));
    QAction* textDetector = function->addAction(QString("Text detection"));
    QAction* textRecognizer = function->addAction(QString("Text recognition"));

    // Connect the signals to the slot functions
    connect(importImage,&QAction::triggered,this,&MainWindow::slot_importImage);
    connect(saveImage,&QAction::triggered,this,&MainWindow::slot_saveImage);
    connect(textDetector,&QAction::triggered,this,&MainWindow::slot_textDetector);
    connect(textRecognizer,&QAction::triggered,this,&MainWindow::slot_textRecognizer);
}

Qt image display function

This function displays an image and scales it down to fit the display area.

Code language: c++
void MainWindow::showImage(cv::Mat &image)
{
    cv::Mat outImage;
    cv::cvtColor(image, outImage, cv::COLOR_BGR2RGB);
    QImage qImage = QImage((const unsigned char*)(outImage.data),outImage.cols,outImage.rows,outImage.step,QImage::Format_RGB888);
    int imageWidth = qImage.width();
    int imageHeight = qImage.height();

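    // Fit within 640x480 while keeping the aspect ratio.
    // Multiply before dividing so integer division does not truncate the ratio to zero.
    if(imageWidth > 640)
    {
        imageHeight = std::max(1, imageHeight * 640 / imageWidth);
        imageWidth = 640;
    }

    if(imageHeight > 480)
    {
        imageWidth = std::max(1, imageWidth * 480 / imageHeight);
        imageHeight = 480;
    }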

    qImage = qImage.scaled(imageWidth, imageHeight, Qt::IgnoreAspectRatio, Qt::SmoothTransformation);
    ui->label_result->setPixmap(QPixmap::fromImage(qImage));
}
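Both slot_importImage() and showImage() shrink the image to fit a 640x480 area while keeping its aspect ratio. The arithmetic can be checked in isolation; fitWithin is a hypothetical helper. It multiplies before dividing, because an integer expression such as (640*10 / width) truncates to zero once width exceeds 6400:

```cpp
#include <algorithm>
#include <utility>

// Shrinks (width, height) to fit inside maxWidth x maxHeight, preserving the aspect ratio.
// Sizes that already fit are returned unchanged; each side is kept at 1 pixel or more.
std::pair<int, int> fitWithin(int width, int height, int maxWidth, int maxHeight)
{
    if (width > maxWidth)
    {
        height = std::max(1, static_cast<int>(static_cast<long long>(height) * maxWidth / width));
        width = maxWidth;
    }
    if (height > maxHeight)
    {
        width = std::max(1, static_cast<int>(static_cast<long long>(width) * maxHeight / height));
        height = maxHeight;
    }
    return std::make_pair(width, height);
}
```

For example, a 1280x720 image becomes 640x360, and a 600x1200 portrait image becomes 240x480.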

Text drawing

Code language: c++
void MainWindow::textboxDraw(cv::Mat src, std::vector<cv::Rect>& groups, std::vector<float>& probs, std::vector<int>& indexes)
{
    for (size_t i = 0; i < indexes.size(); i++)
    {
        if (src.type() == CV_8UC3)
        {
            cv::Rect currrentBox = groups[indexes[i]];
            cv::rectangle(src, currrentBox, cv::Scalar( 0, 255, 255 ), 2, cv::LINE_AA);
            cv::String cvlabel = cv::format("%.2f", probs[indexes[i]]);
            qDebug() << "text box: " << currrentBox.size().width << " " <<currrentBox.size().height << " confidence: " << probs[indexes[i]] << "\n";

            int baseLine = 0;
            cv::Size labelSize = getTextSize(cvlabel, cv::FONT_HERSHEY_PLAIN, 1, 1, &baseLine);
            int yLeftBottom = std::max(currrentBox.y, labelSize.height);
            cv::rectangle(src, cv::Point(currrentBox.x, yLeftBottom - labelSize.height),
                      cv::Point(currrentBox.x + labelSize.width, yLeftBottom + baseLine), cv::Scalar( 255, 255, 255 ), cv::FILLED);

            cv::putText(src, cvlabel, cv::Point(currrentBox.x, yLeftBottom), cv::FONT_HERSHEY_PLAIN, 1, cv::Scalar( 0,0,0 ), 1, cv::LINE_AA);
        }
        else
            cv::rectangle(src, groups[indexes[i]], cv::Scalar( 255 ), 3, 8 ); // index through indexes, as in the colour branch
    }
}
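textboxDraw() places the confidence label at the top-left of each box and clamps its baseline with std::max(currrentBox.y, labelSize.height), so the label of a box near the top edge is shifted down and stays inside the image. A sketch of that geometry with plain structs; labelBackground and the struct names are illustrative only:

```cpp
#include <algorithm>

struct Point { int x, y; };
struct LabelRect { Point topLeft, bottomRight; };

// Computes the filled background rectangle for a label.
// boxX, boxY: top-left corner of the text box.
// labelWidth, labelHeight, baseLine: values reported by the text-size query.
LabelRect labelBackground(int boxX, int boxY, int labelWidth, int labelHeight, int baseLine)
{
    // The text baseline sits at the box top, but never higher than the label's own height.
    const int yLeftBottom = std::max(boxY, labelHeight);
    LabelRect r;
    r.topLeft.x = boxX;
    r.topLeft.y = yLeftBottom - labelHeight;
    r.bottomRight.x = boxX + labelWidth;
    r.bottomRight.y = yLeftBottom + baseLine;
    return r;
}
```

For a box at y = 100 with a 10-pixel-high label, the background spans y = 90 to y = 100 + baseLine, just above the box. For a box at y = 3 the background starts at y = 0 instead of going negative.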

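Source code

The basic flow is as described above, and the relevant functions have been annotated. For more detailed explanations, see the blog posts referenced earlier; I will not repeat them here.

Source code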

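Original content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

In case of infringement, please contact cloudcommunity@tencent.com for removal.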
