Posted by Yida Wang

Passion for OpenCV and other Open Source Communities

Hello, I am Yida Wang, a first-year Master's student in Pattern Recognition at the PRIS Lab in the SICE school of BUPT. I have had wonderful experiences with OpenCV and several other open source communities.

I won 1st prize in the Scilab Open Source Contest 2014 and was named Excellent Developer in CSDN Summer of Code 2014 in the BladeRF community. All certificates can be provided if necessary.

For the WINE project, I developed a PCANet structure that automatically matches open source figures with Microsoft figures, so that an editor on the Linux platform can pick the correct substitute for a figure copied from Microsoft Office.

This time I think it is convenient to use Caffe and OpenCV for deep learning and other CV tasks in pattern recognition applications.

I also have an Nvidia K40 GPU donated by Nvidia for research, so I think I can extend the package for GPU computation!

Motivation for Proposal

Framework in Computer Vision - Deep Learning Project

I have been concentrating for some time on deep learning code with OpenCV and other open source projects such as Caffe and cuDNN. My project has two parts: the 1st is a CNN structure with the BP (back-propagation) algorithm eliminated, which is far less time-consuming than a traditional CNN, and the 2nd is a complete deep learning structure for image recognition based on Caffe.

New Idea for a Fast CNN

My idea is based on the thriving pattern recognition method called "Deep Neural Networks", which Google used last year to build its impressive recognition structure. Because deep neural networks perform so well in pattern recognition, useful tools such as Caffe have already been implemented for research. Some of that code could be modified and embedded in OpenCV for the development of DNN.

At the same time, the problem lies in the dependence on hardware such as powerful GPUs, which are not easy for ordinary people to obtain. The time cost comes mainly from 2 points: the random initialization of the convolutional kernels and the countless back-propagation steps.

I have been studying a powerful single-direction fast CNN called PCANet for research and commercial use.

Here are some benefits of such a structure:

1. The structure does not use the BP algorithm of DNNs, which costs many operations.

2. It uses some small PCA filters instead. The filters look much like the filters obtained by training a normal CNN.

3. My structure could be used in the future to initialize a CNN and so reduce its training time.

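As a rough illustration of the PCA filters mentioned in the list above, the sketch below learns a few small filters from image patches in plain Python. It is a minimal sketch under stated assumptions, not the CellNet code: the function names (`extract_patches`, `covariance`, `top_eigenvectors`, `learn_pca_filters`) are hypothetical, and power iteration with Gram-Schmidt orthogonalization stands in for a full eigen-decomposition.

```python
import math
import random


def extract_patches(image, k):
    """Return every k x k patch of a 2-D list, flattened with its mean removed."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            patch = [image[r + i][c + j] for i in range(k) for j in range(k)]
            mean = sum(patch) / len(patch)
            patches.append([v - mean for v in patch])
    return patches


def covariance(patches):
    """Return the d x d matrix (1/n) * sum of p p^T over all patch vectors p."""
    n, d = len(patches), len(patches[0])
    cov = [[0.0] * d for _ in range(d)]
    for p in patches:
        for i in range(d):
            for j in range(d):
                cov[i][j] += p[i] * p[j] / n
    return cov


def top_eigenvectors(matrix, count, iterations=200, seed=0):
    """Approximate the leading `count` eigenvectors by power iteration.

    Each new vector is kept orthogonal to the earlier ones (Gram-Schmidt).
    Assumes the matrix has at least `count` non-zero eigenvalues.
    """
    rng = random.Random(seed)
    d = len(matrix)
    found = []
    for _ in range(count):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        for _ in range(iterations):
            w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
            for u in found:
                dot = sum(a * b for a, b in zip(w, u))
                w = [a - dot * b for a, b in zip(w, u)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        found.append(v)
    return found


def learn_pca_filters(images, k, num_filters):
    """Learn `num_filters` k x k filters as principal directions of the patches."""
    patches = []
    for image in images:
        patches.extend(extract_patches(image, k))
    vectors = top_eigenvectors(covariance(patches), num_filters)
    return [[vec[i * k:(i + 1) * k] for i in range(k)] for vec in vectors]
```

No back-propagation appears anywhere in this procedure: the filters come from a single pass over the patch statistics, which is where the time saving is expected to come from. A call such as `learn_pca_filters(images, 3, 2)` returns two 3 x 3 filters of unit norm.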

Combination of Caffe and PCANet in WINE

The "blob" structure defined in Caffe suits the layer-based CNN well, and the initial values of the convolution kernels and bias in a layer could be replaced by PCANet filters. A modified structure combining Caffe and PCANet could therefore be strong in both speed of convergence and recognition stability, with the help of OpenCV.
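To make the initialization idea more concrete, here is a minimal sketch of packing 2-D filters into a flat buffer with the (num_output, channels, height, width) layout that Caffe uses for convolution weights, together with a zero bias. It does not call Caffe itself; `pack_filters_as_blob` is a hypothetical helper, and repeating the same filter for every input channel is an assumption made only for this example.

```python
def pack_filters_as_blob(filters, channels=1):
    """Pack k x k filters into a flat row-major (num_output, channels, k, k) buffer.

    Returns (shape, data, bias). The same 2-D filter is copied into every
    input channel, which is an illustrative assumption.
    """
    num_output = len(filters)
    k = len(filters[0])
    shape = (num_output, channels, k, k)
    data = []
    for f in filters:
        for _ in range(channels):
            for row in f:
                data.extend(float(v) for v in row)
    bias = [0.0] * num_output  # start every output channel with a zero bias
    return shape, data, bias


shape, data, bias = pack_filters_as_blob([[[1, 2], [3, 4]], [[5, 6], [7, 8]]], channels=2)
print(shape)  # (2, 2, 2, 2)
```

The resulting buffer would then be copied into the weight and bias storage of the first convolution layer before training starts, so that training begins from PCA filters rather than random values.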


So my idea consists of 2 stages: implementing PCANet and modifying Caffe for OpenCV.

The 1st direction: Fast-CNN (CellNet)

My 1st idea is an implementation of PCANet, whose local filters take a different structure depending on the database they are learned from, while keeping the stability and generality of a general CNN.

Here is the general model of PCANet:

[Figure: general model of PCANet]

I have been using this algorithm on one of the most difficult face databases, "LFW", and it achieves a 92% recognition rate with this algorithm alone.
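For readers unfamiliar with PCANet, the feature vector used for recognition is, in the published PCANet description, produced by binary hashing of the filter responses followed by block-wise histograms. The sketch below illustrates that output stage only. The function names and the non-overlapping block layout are assumptions for this example, not the CellNet implementation.

```python
def binary_hash(response_maps):
    """Combine L same-sized 2-D response maps into one integer map.

    Map number `bit` contributes 2**bit wherever its response is positive.
    """
    rows, cols = len(response_maps[0]), len(response_maps[0][0])
    hashed = [[0] * cols for _ in range(rows)]
    for bit, fmap in enumerate(response_maps):
        for r in range(rows):
            for c in range(cols):
                if fmap[r][c] > 0:
                    hashed[r][c] += 1 << bit
    return hashed


def block_histograms(hashed, num_bits, block):
    """Concatenate histograms of non-overlapping block x block tiles."""
    bins = 1 << num_bits
    rows, cols = len(hashed), len(hashed[0])
    feature = []
    for r0 in range(0, rows - block + 1, block):
        for c0 in range(0, cols - block + 1, block):
            hist = [0] * bins
            for r in range(r0, r0 + block):
                for c in range(c0, c0 + block):
                    hist[hashed[r][c]] += 1
            feature.extend(hist)
    return feature


maps = [[[1, -1], [-1, 1]], [[1, 1], [-1, -1]]]
hashed = binary_hash(maps)
print(hashed)                          # [[3, 2], [0, 1]]
print(block_histograms(hashed, 2, 2))  # [1, 1, 1, 1]
```

Both steps are simple counting operations, so this stage adds little cost on top of the convolutions with the PCA filters.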


My friend Qian Hong recently guided me in helping him solve a general figure recognition problem: reducing the human labor needed to select, from an open source figure database, the figure most similar to each official Microsoft figure. I used PCANet with particular parameters for this problem and got the results partly shown below:

Target database

[Figure: target database]

Open source database

[Figure: open source database]

The matching results show the accuracy of PCANet. They are all extracted from the raw database, which contains many similarly shaped candidates, and my algorithm can still select the best substitute.
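Selecting the best substitute can be pictured as a nearest-neighbour search over feature vectors. The sketch below is a minimal illustration under assumptions: the features are taken to be non-negative histograms, the chi-square distance is one common choice for comparing histograms and is not confirmed as the metric used here, and `best_match` is a hypothetical helper.

```python
import math


def chi_square(a, b):
    """Chi-square distance between two non-negative histograms of equal length."""
    total = 0.0
    for x, y in zip(a, b):
        if x + y > 0:
            total += (x - y) ** 2 / (x + y)
    return total


def best_match(target, candidates):
    """Return (index, distance) of the candidate closest to the target."""
    best_index, best_dist = -1, math.inf
    for i, cand in enumerate(candidates):
        dist = chi_square(target, cand)
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index, best_dist


index, dist = best_match([2, 0, 2], [[0, 2, 2], [2, 0, 1], [1, 1, 1]])
print(index)  # 1
```

Each target figure would be compared against every figure in the open source database in this way, and the candidate with the smallest distance kept as the substitute.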


I have now modified the structure for general visual object classification problems and joined the detection, feature extraction and classification steps into a single extended neural network, both for my own study and for the project itself. By porting my previous Matlab code for CellNet to C++ with the help of OpenCV (especially for matrix operations), I can now extract some raw features from a photo through the network. The performance is still as good as I mentioned a few weeks ago, but I am concerned about another confusion related to what you have told me.

The 2nd direction: Modifying Caffe with OpenCV in WINE

In fact, besides studying this artificial architecture, I am also researching normal neural networks, which leads to the 2nd part I just described. Deep learning is powerful indeed; it could exceed human capabilities in the future thanks to Big Data and fast 2-D computation (including GPUs). But the complex forward and backward passes are very costly and will remain so in the future. So I am still searching for a better solution between one-way and two-way networks.

I noticed that OpenCV has posted that it would be cool if OpenCV could load and run deep networks trained with popular DNN packages like Caffe or Torch. I am glad to hear that, because I am also doing some experiments on CNNs with Caffe.

Then there are some main tasks to be done in the future:

1. Modify Caffe's data structures with OpenCV, especially the "blob" structure defined in Caffe. The layers are indeed clearer with the help of "blob", but the basic element could be more flexible with an OpenCV data structure.

2. Some prerequisites, including BLAS and Boost, should be simplified. We may need only some basic dependencies to build a CNN.

3. Whether to include GPU computation is still under consideration because time is short.
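For the first task, it helps to recall how a Caffe blob is laid out: a flat, row-major buffer indexed as (n, c, h, w), so each (n, c) plane is a contiguous height x width block. The sketch below shows the offset arithmetic and how one plane can be read out as a 2-D matrix. `blob_offset` and `plane_as_matrix` are hypothetical names for this example, and the plane is copied here, whereas a C++ version could presumably wrap the same memory in a `cv::Mat` header without copying.

```python
def blob_offset(shape, n, c, h, w):
    """Flat index of element (n, c, h, w) in a row-major N x C x H x W buffer."""
    _, channels, height, width = shape
    return ((n * channels + c) * height + h) * width + w


def plane_as_matrix(data, shape, n, c):
    """Copy the (n, c) plane out of the flat buffer as a list of rows."""
    _, _, height, width = shape
    start = blob_offset(shape, n, c, 0, 0)
    return [data[start + r * width:start + (r + 1) * width] for r in range(height)]


shape = (2, 3, 2, 2)
data = list(range(24))
print(blob_offset(shape, 1, 2, 1, 0))       # 22
print(plane_as_matrix(data, shape, 1, 0))   # [[12, 13], [14, 15]]
```

Because every plane is contiguous, per-channel matrix operations could work directly on blob memory, which is what makes replacing the basic element with an OpenCV structure plausible.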


Conclusion

As described in detail above, my idea is composed of a fast DNN, which is almost complete, and a Caffe-based DNN. I have attached the paper describing the fast DNN, called "PCANet". The source code is published on GitHub under the name CellNet at the URL:

https://github.com/Wangyida/CellNet

As for the Caffe-based DNN, I think it is much easier to develop a CPU version first rather than a GPU version. I am worried that the remaining time might not be enough to apply OpenCV to Caffe without bugs, though Caffe already makes some use of OpenCV.

I wonder if the idea is meaningful enough, especially the 1st part?