
1. Title of Database: Optical Recognition of Handwritten Digits

2. Source:
	E. Alpaydin, C. Kaynak
	Department of Computer Engineering
	Bogazici University, 80815 Istanbul Turkey
	alpaydin@boun.edu.tr
	July 1998

3. Past Usage:
	C. Kaynak (1995) Methods of Combining Multiple Classifiers and Their
	Applications to Handwritten Digit Recognition, 
	MSc Thesis, Institute of Graduate Studies in Science and 
	Engineering, Bogazici University.

	E. Alpaydin, C. Kaynak (1998) Cascading Classifiers, Kybernetika,
	to appear. ftp://ftp.icsi.berkeley.edu/pub/ai/ethem/kyb.ps.Z

4. Relevant Information:
	We used preprocessing programs made available by NIST to extract
	normalized bitmaps of handwritten digits from a preprinted form. From
	a total of 43 people, 30 contributed to the training set and different
	13 to the test set. 32x32 bitmaps are divided into nonoverlapping 
	blocks of 4x4 and the number of on pixels are counted in each block.
	This generates an input matrix of 8x8 where each element is an 
	integer in the range 0..16. This reduces dimensionality and gives 
	invariance to small distortions.

	For info on NIST preprocessing routines, see 
	M. D. Garris, J. L. Blue, G. T. Candela, D. L. Dimmick, J. Geist, 
	P. J. Grother, S. A. Janet, and C. L. Wilson, NIST Form-Based 
	Handprint Recognition System, NISTIR 5469, 1994.

5. Number of Instances
	optdigits.tra	Training	3823
	
	The way we used the dataset was to use half of training for 
	actual training, one-fourth for validation and one-fourth
	for writer-dependent testing. The test set was used for 
	writer-independent testing and is the actual quality measure.

6. Number of Attributes
	64 input+1 class attribute

7. For Each Attribute:
	All input attributes are integers in the range 0..16.
	The last attribute is the class code 0..9

8. Missing Attribute Values
	None

9. Class Distribution
	Class:	No of examples in training set
	0:  376
	1:  389
	2:  380
	3:  389
	4:  387
	5:  376
	6:  377
	7:  387
	8:  380
	9:  382


