Metadata-Version: 2.1
Name: ocr4all_pixel_classifier
Version: 0.2.1
Summary: UNKNOWN
Home-page: https://gitlab2.informatik.uni-wuerzburg.de/chw71yx/page-segmentation.git
Author: Christoph Wick, Alexander Hartelt, Alexander Gehrke
Author-email: christoph.wick@informatik.uni-wuerzburg.de, alexander.hartelt@informatik.uni-wuerzburg.de, alexander.gehrke@informatik.uni-wuerzburg.de
License: UNKNOWN
Description: # OCR4All Pixel Classifier
        
        ## Requirements
        
        Python dependencies are specified in `requirements.txt` / `setup.py`.
        
        Install the package via pip as either `ocr4all_pixel_classifier[tf_cpu]` to
        use the CPU version of TensorFlow or `ocr4all_pixel_classifier[tf_gpu]` to use
        the GPU (CUDA) version of TensorFlow. For the latter, your system should be
        set up with CUDA 9 and cuDNN 7.
        
        ## Usage
        
        ### Pixel classifier
        
        #### Classification
        
        To run a model on some input images, use `ocr4all-pixel-classifier predict`:
        
        ```sh
        ocr4all-pixel-classifier predict --load PATH_TO_MODEL \
        	--output OUTPUT_PATH \
        	--binary PATH_TO_BINARY_IMAGES \
        	--images PATH_TO_SOURCE_IMAGES \
        	--norm PATH_TO_NORMALIZATIONS
        ```
        (`ocr4all-pixel-classifier` is an alias for `ocr4all-pixel-classifier predict`)
        
        This will create three folders at the output path:
        - `color`: the classification as color image, with pixel color corresponding to
        	the class for that pixel
        - `inverted`: inverted binary image with classification of foreground pixels
        	only (i.e. background is black, foreground is white or class color)
        - `overlay`: classification image layered transparently over the original image
        
        #### Training
        
        For training, you first have to create dataset files. A dataset file is a JSON
        file containing three arrays: `train`, `test` and `eval`, holding the training,
        test and evaluation data (called train/validation/test in other publications).
        The JSON file uses the following format:
        
        ```json
        {
        	"train": [
        		//datasets here
        	],
        	"test": [
        		//datasets here
        	],
        	"eval": [
        		//datasets here
        	]
        }
        ```
        
        A dataset describes a single input image and consists of several paths: the
        original image, a binarized version and the mask (pixel color corresponds to
        class). Furthermore, the line height of the page in pixels must be specified:
        ```json
        {
        	"binary_path": "/path/to/image/binary/filename.bin.png",
        	"image_path":  "/path/to/image/color/filename.jpg",
        	"mask_path":  "/path/to/image/mask/filename_MASK.png",
        	"line_height_px": 18
        }
        ```
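        
        As a rough illustration of the two formats above, the following Python sketch
        assembles a dataset file by hand using only the standard library. The helper
        names `make_entry` and `make_dataset_file` are hypothetical and not part of
        this package; the paths are the placeholders from the example above.
        
        ```python
        import json
        
        
        def make_entry(binary_path, image_path, mask_path, line_height_px):
            """Build one dataset entry with the four keys described above."""
            return {
                "binary_path": str(binary_path),
                "image_path": str(image_path),
                "mask_path": str(mask_path),
                "line_height_px": int(line_height_px),
            }
        
        
        def make_dataset_file(train, test, eval_):
            """Group dataset entries into the three arrays of a dataset file."""
            return {"train": list(train), "test": list(test), "eval": list(eval_)}
        
        
        # Hypothetical helper usage: one training image, empty test/eval arrays.
        entry = make_entry(
            "/path/to/image/binary/filename.bin.png",
            "/path/to/image/color/filename.jpg",
            "/path/to/image/mask/filename_MASK.png",
            18,
        )
        dataset = make_dataset_file(train=[entry], test=[], eval_=[])
        
        # Serialize to JSON text; write this string to e.g. DATASET_FILE.json.
        dataset_text = json.dumps(dataset, indent="\t")
        print(dataset_text)
        ```
        
        Unlike the format illustration above, the generated file contains no `//`
        comments, so it can be read by any strict JSON parser.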
        
        The generation of dataset files can be automated using `ocr4all-pixel-classifier
        create-dataset-file`. Refer to the command's `--help` output for further
        information.
        
        To start the training:
        
        ```sh
        ocr4all-pixel-classifier train \
            --train DATASET_FILE.json --test DATASET_FILE.json --eval DATASET_FILE.json \
            --output MODEL_TARGET_PATH \
            --n_iter 5000
        ```
        The parameters `--train`, `--test` and `--eval` may be followed by any number of
        dataset files or patterns (shell globbing).
        
        Run `ocr4all-pixel-classifier train --help` for further parameters that
        affect the training procedure.
        
        You can combine several dataset files into a _split file_. The format of the
        split file is:
        
        ```json
        {
        	"label": "name of split",
        	"train": [
        		"/path/to/dataset1.json",
        		"/path/to/dataset2.json",
        		...
        	],
        	"test": [
        		//dataset paths here
        	],
        	"eval": [
        		//dataset paths here
        	]
        }
        ```
        To use a split file, add the `--split_file` parameter.
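        
        A split file in the format above could be generated with a short Python
        script such as the following sketch. The helper name `make_split` is
        hypothetical and not part of this package; the dataset paths are the
        placeholders from the format description.
        
        ```python
        import json
        
        
        def make_split(label, train, test, eval_):
            """Build a split file structure from lists of dataset file paths."""
            return {
                "label": label,
                "train": [str(p) for p in train],
                "test": [str(p) for p in test],
                "eval": [str(p) for p in eval_],
            }
        
        
        # Hypothetical helper usage: two dataset files for training, none for test/eval.
        split = make_split(
            "name of split",
            train=["/path/to/dataset1.json", "/path/to/dataset2.json"],
            test=[],
            eval_=[],
        )
        
        # Serialize to JSON text; write this string to the file passed to --split_file.
        split_text = json.dumps(split, indent="\t")
        print(split_text)
        ```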
        
        ### Examples
        
        See the examples for [dataset generation](examples/dataset-creation-example.sh) and [training](examples/model-training-example.sh).
        
        ### `ocr4all-pixel-classifier compute-image-normalizations` / `ocrd_compute_normalizations`
        
        Calculate image normalizations, i.e. scaling factors based on average line
        height.
        
        Required arguments:
        
        - `--input_dir`: location of images
        - `--output_dir`: target location of norm files
        
        Optional arguments:
        
        - `--average_all`: average height over all images
        - `--inverse`
        
Keywords: OCR,page segmentation,pixel classifier
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Description-Content-Type: text/markdown
