资讯王 posted on 2015-6-6 14:39:42

[Tutorial] Android OCR with the tess-two library, easy to pick up: Image to Text

Simple Android OCR App Tutorial: the steps to build your own Android OCR app are as follows.

1. First, download the tess-two zip file:
https://github.com/rmtheis/tess-two

2. After unzipping, import it into Eclipse:
File -> Import -> Existing Projects into Workspace -> tess-two directory

3. Add the Android NDK build to the tess-two project
Note: if you skip this step, the liblept.so shared library is not built and the app fails with:
java.lang.UnsatisfiedLinkError: Couldn't load lept from loader dalvik.system.PathClassLoader

a. Download the NDK from https://developer.android.com/tools/sdk/ndk/index.html
b. Then extract android-ndk-r10e-windows-x86_64.exe
c. Go to Window -> Preferences -> Android -> NDK and point it to your android-ndk-r10e folder

d. Right click the tess-two project -> Properties -> Android -> tick "Is Library"
e. In the same Properties dialog, go to Builders -> click New -> select Program -> click OK
f. Enter the name "tess ndk". For Location, click Browse File System and choose android-ndk-r10e\ndk-build.cmd. For Working Directory, click Browse Workspace and select tess-two, which gives ${workspace_loc:/tess-two}

Refer to https://droidcomp.wordpress.com/2012/08/04/building-the-tesseract-ndk-library-for-android/
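A quick way to tell whether the NDK build actually produced the native libraries is to try loading them yourself before creating TessBaseAPI. The sketch below is only an illustration: the class name NativeLibCheck and the method canLoad are hypothetical, and the library name "lept" is taken from the error message above.

```java
class NativeLibCheck {
    /** Returns true if the named native library can be loaded, false otherwise. */
    static boolean canLoad(String libName) {
        try {
            // On Android this looks for lib<libName>.so packaged with the app.
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Same error type as in the note above: the .so was not built or not packaged.
            System.err.println("Cannot load native library '" + libName + "': " + e.getMessage());
            return false;
        }
    }
}
```

For example, logging NativeLibCheck.canLoad("lept") when the app starts shows whether step 3 worked; false means the NDK builder did not run or its output was not picked up.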

4. Download the photo capture example app from http://labs.makemachine.net/2010/03/simple-android-photo-capture/
Unzip it, then File -> Import -> Existing Projects into Workspace -> PhotoCaptureExample

a. Right click the PhotoCaptureExample project -> Properties -> Android -> Library -> Add -> select the tess-two project
b. In AndroidManifest.xml, add:
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
    <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"/>
c. In PhotoCaptureExample.java, replace the code that runs after a photo is taken (it starts with the "onPhotoTaken" log line) with the code below:
    // Needs these imports in addition to the ones already in the file:
    // android.graphics.Matrix, android.media.ExifInterface, android.widget.Toast,
    // android.content.Context, java.io.IOException,
    // com.googlecode.tesseract.android.TessBaseAPI
    Log.i( "MakeMachine", "onPhotoTaken" );

    _taken = true;

    // Downsample by 4 to keep memory use low
    BitmapFactory.Options options = new BitmapFactory.Options();
    options.inSampleSize = 4;

    Bitmap bitmap = BitmapFactory.decodeFile( _path, options );

    _image.setImageBitmap(bitmap);

    //_field.setVisibility( View.GONE );

    // _path = path to the image to be OCRed.
    // Read its EXIF orientation; stay with "normal" if the file cannot be read,
    // so a failed read no longer causes a NullPointerException.
    int exifOrientation = ExifInterface.ORIENTATION_NORMAL;
    try {
        ExifInterface exif = new ExifInterface(_path);
        exifOrientation = exif.getAttributeInt(
                ExifInterface.TAG_ORIENTATION,
                ExifInterface.ORIENTATION_NORMAL);
    } catch (IOException e) {
        e.printStackTrace();
    }

    int rotate = 0;

    switch (exifOrientation) {
    case ExifInterface.ORIENTATION_ROTATE_90:
        rotate = 90;
        break;
    case ExifInterface.ORIENTATION_ROTATE_180:
        rotate = 180;
        break;
    case ExifInterface.ORIENTATION_ROTATE_270:
        rotate = 270;
        break;
    }

    // Rotate the bitmap so the text is upright
    if (rotate != 0) {
        int w = bitmap.getWidth();
        int h = bitmap.getHeight();

        Matrix mtx = new Matrix();
        mtx.preRotate(rotate);

        bitmap = Bitmap.createBitmap(bitmap, 0, 0, w, h, mtx, false);
    }
    // Convert to ARGB_8888, required by tess
    bitmap = bitmap.copy(Bitmap.Config.ARGB_8888, true);

    TessBaseAPI baseApi = new TessBaseAPI();
    String lang = "eng";
    // datapath = the folder that CONTAINS the tessdata subfolder (same folder as in steps d and e).
    // Do not pass the tessdata folder or the eng.traineddata file itself.
    String datapath = "/mnt/sdcard/tesseract/";
    baseApi.init(datapath, lang);
    baseApi.setImage(bitmap);
    String recognizedText = baseApi.getUTF8Text();
    baseApi.end();

    // Show the recognized text in a toast and in the text field
    Context context = getApplicationContext();
    int duration = Toast.LENGTH_SHORT;

    Toast toast = Toast.makeText(context, recognizedText, duration);
    toast.show();
    _field.setText(recognizedText);
d. Change the image path so the photo is saved under the tesseract folder:
    //_path = Environment.getExternalStorageDirectory() + "/images/make_machine_example.jpg";
    _path = "/mnt/sdcard/tesseract/images/make_machine_example.jpg";
e. On your phone, make sure the language pack exists at /mnt/sdcard/tesseract/tessdata/eng.traineddata.
If it does not, download it from https://code.google.com/p/tesseract-ocr/downloads/list,
unzip eng.traineddata.gz, then copy eng.traineddata into /mnt/sdcard/tesseract/tessdata
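If you would rather unpack the language pack in code than by hand, the standard java.util.zip classes are enough. This is a minimal sketch under assumptions, not part of the original tutorial: the class name TrainedDataUnpacker and the method gunzip are hypothetical, and it assumes the .gz file has already been copied somewhere the app can read.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;

class TrainedDataUnpacker {
    /** Decompresses gzFile into tessdataDir, dropping the ".gz" suffix from the file name. */
    static File gunzip(File gzFile, File tessdataDir) throws IOException {
        if (!tessdataDir.isDirectory() && !tessdataDir.mkdirs()) {
            throw new IOException("Cannot create " + tessdataDir);
        }
        String name = gzFile.getName();
        if (name.endsWith(".gz")) {
            name = name.substring(0, name.length() - 3);
        }
        File out = new File(tessdataDir, name);
        // Copy the decompressed bytes; both streams are closed automatically.
        try (InputStream raw = new FileInputStream(gzFile);
             InputStream in = new GZIPInputStream(raw);
             OutputStream os = new FileOutputStream(out)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                os.write(buf, 0, n);
            }
        }
        return out;
    }
}
```

Called with eng.traineddata.gz and the /mnt/sdcard/tesseract/tessdata folder, this would leave eng.traineddata in the place step e expects.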

Refer to http://gaut.am/making-an-ocr-android-app-using-tesseract/
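The orientation handling in step 4c boils down to a small lookup from the EXIF orientation value to a rotation angle. Here is a minimal sketch in plain Java, assuming the standard EXIF values (1, 3, 6, 8) that the android.media.ExifInterface constants use; the class name ExifRotation and the method toDegrees are hypothetical.

```java
class ExifRotation {
    // Standard EXIF Orientation tag values, as used by android.media.ExifInterface.
    static final int ORIENTATION_NORMAL = 1;
    static final int ORIENTATION_ROTATE_180 = 3;
    static final int ORIENTATION_ROTATE_90 = 6;
    static final int ORIENTATION_ROTATE_270 = 8;

    /** Returns how many degrees the bitmap must be rotated to appear upright. */
    static int toDegrees(int exifOrientation) {
        switch (exifOrientation) {
            case ORIENTATION_ROTATE_90:
                return 90;
            case ORIENTATION_ROTATE_180:
                return 180;
            case ORIENTATION_ROTATE_270:
                return 270;
            default:
                // Normal, undefined, and mirrored orientations are left unrotated here.
                return 0;
        }
    }
}
```

Keeping the lookup in its own method makes it easy to check without a device, and the result can be passed straight to Matrix.preRotate().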

5. Done



FAQ
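1. Data path must contain subfolder tessdata!
The path you pass to init() must not itself end in tessdata. Pass the parent folder (for example /mnt/sdcard/tesseract/), which contains the tessdata subfolder.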

2. Android: could not initialize Tesseract API with language=eng
After downloading, remember to unzip eng.traineddata.gz and copy only eng.traineddata into the tessdata folder

3. Crash at TessBaseAPI baseApi = new TessBaseAPI()
Make sure the NDK is installed and configured correctly (step 3)

4. Fatal signal 11 in tesseract
After downloading, remember to unzip eng.traineddata.gz and copy only eng.traineddata into the tessdata folder
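FAQ items 1, 2 and 4 all come down to the data folder layout, so it can help to check it before calling init(). The sketch below is an assumption-laden illustration rather than anything tess-two provides: the class name TessDataCheck and the method check are hypothetical, and it detects a file that is still compressed by looking for the gzip magic bytes 0x1f 0x8b.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class TessDataCheck {
    /** Returns null if the layout looks usable, otherwise a message describing the problem. */
    static String check(File dataPath, String lang) throws IOException {
        File tessdata = new File(dataPath, "tessdata");
        if (!tessdata.isDirectory()) {
            if ("tessdata".equals(dataPath.getName())) {
                // FAQ 1: the tessdata folder itself was passed instead of its parent.
                return "Pass the parent of tessdata, not tessdata itself: " + dataPath;
            }
            return "Missing folder: " + tessdata;
        }
        File trained = new File(tessdata, lang + ".traineddata");
        if (!trained.isFile()) {
            return "Missing file: " + trained;
        }
        // FAQ 2 and 4: a file that was renamed but never unzipped starts with 0x1f 0x8b.
        try (InputStream in = new FileInputStream(trained)) {
            int b0 = in.read();
            int b1 = in.read();
            if (b0 == 0x1f && b1 == 0x8b) {
                return "File is still gzip-compressed: " + trained;
            }
        }
        return null;
    }
}
```

Calling TessDataCheck.check(new File("/mnt/sdcard/tesseract/"), "eng") and logging a non-null result before baseApi.init() turns these failures into a readable message.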