Keras CNN training to recognize captchas: low loss but low accuracy
I want to train a model that can recognize captchas like this one:

[captcha image]

I want to recognize each character in the picture, so I created the CNN model below.

First, I have a model like this:



print("Creating CNN model...")
a = Input((40, 80, 3))
out = a
out = Conv2D(filters=32, kernel_size=(3, 3), padding='same', activation='relu')(out)
#out = Conv2D(filters=32, kernel_size=(3, 3), activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu')(out)
#out = Conv2D(filters=64, kernel_size=(3, 3), activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=128, kernel_size=(3, 3), padding='same', activation='relu')(out)
#out = Conv2D(filters=128, kernel_size=(3, 3), activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=256, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2) , dim_ordering="th")(out)
out = Flatten()(out)
out = Dropout(0.3)(out)
out = Dense(1024, activation='relu')(out)
out = Dropout(0.3)(out)
out = Dense(512, activation='relu')(out)
out = Dropout(0.3)(out)
out = [Dense(36, name='digit1', activation='softmax')(out),
Dense(36, name='digit2', activation='softmax')(out),
Dense(36, name='digit3', activation='softmax')(out),
Dense(36, name='digit4', activation='softmax')(out)]

model = Model(inputs=a, outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()


[model.summary() output]
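
For context, a four-output model like this expects its labels as a list of four one-hot arrays, one per softmax head. A minimal sketch of the label encoding and training call, assuming 4-character labels over 0-9/a-z (the helper and variable names are illustrative, not from the original post):

import numpy as np

CHARS = "0123456789abcdefghijklmnopqrstuvwxyz"  # 36 classes, assumed alphabet

def encode_labels(texts):
    # turn e.g. "a3f9" into four one-hot vectors of length 36
    y = [np.zeros((len(texts), 36), dtype="float32") for _ in range(4)]
    for i, text in enumerate(texts):
        for pos, ch in enumerate(text):
            y[pos][i, CHARS.index(ch)] = 1.0
    return y

# X_train: array of shape (N, 40, 80, 3); train_texts: list of N 4-char strings
# y_train = encode_labels(train_texts)
# model.fit(X_train, y_train, epochs=50, validation_data=(X_val, encode_labels(val_texts)))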



When I train this model, the validation loss is high but the validation accuracy is not that bad, like this:



Epoch 47/50
1000/1000 [==============================] - 2s 2ms/step - loss: 0.9421 - digit1_loss: 0.2456 - digit2_loss: 0.2657 - digit3_loss: 0.2316 - digit4_loss: 0.1992 - digit1_acc: 0.9400 - digit2_acc: 0.9190 - digit3_acc: 0.9330 - digit4_acc: 0.9300 - val_loss: 5.0476 - val_digit1_loss: 0.5545 - val_digit2_loss: 1.8687 - val_digit3_loss: 1.8951 - val_digit4_loss: 0.7294 - val_digit1_acc: 0.8300 - val_digit2_acc: 0.5500 - val_digit3_acc: 0.5200 - val_digit4_acc: 0.8000

Epoch 00047: val_digit4_acc did not improve from 0.83000
Epoch 48/50
1000/1000 [==============================] - 2s 2ms/step - loss: 0.9154 - digit1_loss: 0.1681 - digit2_loss: 0.2992 - digit3_loss: 0.2556 - digit4_loss: 0.1924 - digit1_acc: 0.9520 - digit2_acc: 0.9180 - digit3_acc: 0.9220 - digit4_acc: 0.9370 - val_loss: 4.6983 - val_digit1_loss: 0.4929 - val_digit2_loss: 1.8220 - val_digit3_loss: 1.6665 - val_digit4_loss: 0.7170 - val_digit1_acc: 0.8700 - val_digit2_acc: 0.5300 - val_digit3_acc: 0.5900 - val_digit4_acc: 0.8300

Epoch 00048: val_digit4_acc improved from 0.83000 to 0.83000, saving model to cnn_model.hdf5
Epoch 49/50
1000/1000 [==============================] - 2s 2ms/step - loss: 0.8703 - digit1_loss: 0.1813 - digit2_loss: 0.2374 - digit3_loss: 0.2537 - digit4_loss: 0.1979 - digit1_acc: 0.9450 - digit2_acc: 0.9240 - digit3_acc: 0.9250 - digit4_acc: 0.9400 - val_loss: 4.6405 - val_digit1_loss: 0.4936 - val_digit2_loss: 1.8665 - val_digit3_loss: 1.5744 - val_digit4_loss: 0.7060 - val_digit1_acc: 0.8700 - val_digit2_acc: 0.5000 - val_digit3_acc: 0.5900 - val_digit4_acc: 0.7800

Epoch 00049: val_digit4_acc did not improve from 0.83000
Epoch 50/50
1000/1000 [==============================] - 2s 2ms/step - loss: 0.9112 - digit1_loss: 0.2036 - digit2_loss: 0.2543 - digit3_loss: 0.2222 - digit4_loss: 0.2312 - digit1_acc: 0.9360 - digit2_acc: 0.9170 - digit3_acc: 0.9290 - digit4_acc: 0.9330 - val_loss: 4.9354 - val_digit1_loss: 0.5632 - val_digit2_loss: 1.8869 - val_digit3_loss: 1.7899 - val_digit4_loss: 0.6954 - val_digit1_acc: 0.8600 - val_digit2_acc: 0.5000 - val_digit3_acc: 0.5700 - val_digit4_acc: 0.7900


Then I ran model.evaluate on the validation set and got this loss and accuracy (the loss is high, and the accuracy looks high too):



Test loss: 4.9354219818115235
Test accuracy: 0.5632282853126526


Second, I changed the model like this (taking out the two fully connected hidden layers):



# Create CNN model (same as above, but with the two Dense hidden layers removed)
print("Creating CNN model...")
a = Input((40, 80, 3))
out = a
out = Conv2D(filters=32, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=128, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=256, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Flatten()(out)
out = Dropout(0.3)(out)
#out = Dense(1024, activation='relu')(out)
#out = Dropout(0.3)(out)
#out = Dense(512, activation='relu')(out)
#out = Dropout(0.3)(out)
out = [Dense(36, name='digit1', activation='softmax')(out),
       Dense(36, name='digit2', activation='softmax')(out),
       Dense(36, name='digit3', activation='softmax')(out),
       Dense(36, name='digit4', activation='softmax')(out)]

model = Model(inputs=a, outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()


[model.summary() output]



When I train this model, it reaches a low loss and good accuracy on the validation set, like this:



Epoch 47/50
1000/1000 [==============================] - 1s 1ms/step - loss: 0.0536 - digit1_loss: 0.0023 - digit2_loss: 0.0231 - digit3_loss: 0.0230 - digit4_loss: 0.0052 - digit1_acc: 1.0000 - digit2_acc: 0.9980 - digit3_acc: 0.9990 - digit4_acc: 0.9990 - val_loss: 1.2679 - val_digit1_loss: 0.1059 - val_digit2_loss: 0.6560 - val_digit3_loss: 0.4402 - val_digit4_loss: 0.0658 - val_digit1_acc: 0.9600 - val_digit2_acc: 0.8200 - val_digit3_acc: 0.8900 - val_digit4_acc: 0.9900

Epoch 00047: val_digit4_acc did not improve from 0.99000
Epoch 48/50
1000/1000 [==============================] - 1s 1ms/step - loss: 0.0686 - digit1_loss: 0.0044 - digit2_loss: 0.0269 - digit3_loss: 0.0238 - digit4_loss: 0.0136 - digit1_acc: 0.9990 - digit2_acc: 0.9980 - digit3_acc: 0.9980 - digit4_acc: 0.9950 - val_loss: 1.2249 - val_digit1_loss: 0.1170 - val_digit2_loss: 0.6593 - val_digit3_loss: 0.4152 - val_digit4_loss: 0.0334 - val_digit1_acc: 0.9500 - val_digit2_acc: 0.8200 - val_digit3_acc: 0.8800 - val_digit4_acc: 0.9900

Epoch 00048: val_digit4_acc did not improve from 0.99000
Epoch 49/50
1000/1000 [==============================] - 1s 1ms/step - loss: 0.0736 - digit1_loss: 0.0087 - digit2_loss: 0.0309 - digit3_loss: 0.0268 - digit4_loss: 0.0071 - digit1_acc: 0.9980 - digit2_acc: 0.9940 - digit3_acc: 0.9950 - digit4_acc: 0.9970 - val_loss: 1.3238 - val_digit1_loss: 0.1229 - val_digit2_loss: 0.6496 - val_digit3_loss: 0.4951 - val_digit4_loss: 0.0562 - val_digit1_acc: 0.9500 - val_digit2_acc: 0.8400 - val_digit3_acc: 0.8500 - val_digit4_acc: 0.9900

Epoch 00049: val_digit4_acc did not improve from 0.99000
Epoch 50/50
1000/1000 [==============================] - 1s 1ms/step - loss: 0.0935 - digit1_loss: 0.0050 - digit2_loss: 0.0354 - digit3_loss: 0.0499 - digit4_loss: 0.0032 - digit1_acc: 0.9980 - digit2_acc: 0.9910 - digit3_acc: 0.9890 - digit4_acc: 0.9990 - val_loss: 1.7740 - val_digit1_loss: 0.1539 - val_digit2_loss: 0.9237 - val_digit3_loss: 0.6273 - val_digit4_loss: 0.0690 - val_digit1_acc: 0.9300 - val_digit2_acc: 0.7900 - val_digit3_acc: 0.8400 - val_digit4_acc: 0.9900


But when I run model.evaluate on the validation set, I get this loss and accuracy (lower loss, but also a much lower accuracy):



Test loss: 1.773969256877899
Test accuracy: 0.1539082833379507


Why do I get lower accuracy even though the loss is lower?
Or am I doing something wrong when I call the evaluate method?
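
One thing worth checking about the evaluate call: for a multi-output Keras model compiled with metrics=['accuracy'], model.evaluate returns a flat list of numbers aligned with model.metrics_names, i.e. the total loss first, then each head's loss, then each head's accuracy. So scores[1] is digit1_loss, not an accuracy; notably, the "Test accuracy" values printed above (0.5632... and 0.1539...) exactly match val_digit1_loss from the final epoch of each run. A sketch of an unambiguous printout (X_val and y_val are placeholder names):

scores = model.evaluate(X_val, y_val, verbose=0)
# metrics_names is e.g. ['loss', 'digit1_loss', ..., 'digit4_loss',
#                        'digit1_acc', ..., 'digit4_acc']
for name, value in zip(model.metrics_names, scores):
    print(name, value)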

python tensorflow machine-learning keras deep-learning

asked Jan 1 at 8:06 · Sun Hao Lun (305)

  • Try a different approach to this problem. I'd suggest encoding the input image using a CNN as you do in your example, then using the obtained fixed-length embedding vector as input to an RNN with LSTM units. The RNN will generate a fixed-length (four, in your case) character sequence. Here is a nice guide on how it could be implemented in Keras: blog.keras.io/…

    – constt
    Jan 1 at 9:21
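
A minimal sketch of that CNN-encoder → LSTM-decoder idea (layer sizes and variable names are illustrative assumptions, not taken from the linked guide):

from keras.models import Model
from keras.layers import (Input, Conv2D, MaxPooling2D, Flatten, Dense,
                          RepeatVector, LSTM, TimeDistributed)

inp = Input((40, 80, 3))
x = Conv2D(32, (3, 3), padding='same', activation='relu')(inp)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(64, (3, 3), padding='same', activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
x = Flatten()(x)
x = Dense(128, activation='relu')(x)       # fixed-length image embedding
x = RepeatVector(4)(x)                     # one timestep per output character
x = LSTM(128, return_sequences=True)(x)    # decode a 4-step sequence
out = TimeDistributed(Dense(36, activation='softmax'))(x)  # 36 classes per step

seq_model = Model(inp, out)
seq_model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
# here the labels are a single (N, 4, 36) array rather than a list of four arrays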
  • OK Thanks~ I will try that.

    – Sun Hao Lun
    Jan 2 at 6:59
1 Answer
I would focus on the training data. To me, it looks like the model is overfitting. The second possibility is a problem with the train, cross-validation, and test sets.

Please check:

  • Are they really independent? Especially the train and cross-validation sets.

  • How large are your sets? Do you have enough variance in the cross-validation set?






answered Jan 1 at 16:50 · OndraN (74)

  • Thanks for your response. I have checked that the training set and validation set are independent. I used about 1000 images for training and about 100 for validation. Are they too small to train the model?

    – Sun Hao Lun
    Jan 2 at 7:06

  • It depends on the type of problem. In your case, I think it is on the edge. 1000 isn't much, especially if the shapes of the characters really vary across the set. I would expect that the network didn't see enough data to generalize, and that the cross-validation set is somehow correlated with the training set (similar kinds of letter corruption, a similar order of letters, etc.), which can make cross-validation easier than evaluation.

    – OndraN
    Jan 2 at 16:21

  • You are able to evaluate the cross-validation set, so I expect the problem is in the data and not in the code (you probably evaluate the cross-validation set the same way as the evaluation set). I am a Ph.D. candidate focused on audio and text; so far I have only played with picture processing for fun, so I could still be wrong.

    – OndraN
    Jan 2 at 16:22

  • Here is my approach to troubleshooting, which is independent of the type of problem: 1. Check the histogram of characters for the train, cross-validation, and evaluation sets. It should be balanced (especially train and cross-validation), or you will find some bias in the data (which is bad for training). 2. Check the histogram of characters for every position, to be sure you don't have, say, the same last character throughout the training and cross-validation sets. This should also be balanced (see the sketch after this list).

    – OndraN
    Jan 2 at 16:22

  • 3. Jackknifing: mix all the data together, create new training, cross-validation, and evaluation sets, and check the results.

    – OndraN
    Jan 2 at 16:22
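
A quick way to run the per-position balance check suggested above (a minimal sketch; the label lists are hypothetical examples):

from collections import Counter

def position_histograms(texts):
    # count character frequencies at each of the 4 captcha positions
    counters = [Counter() for _ in range(4)]
    for text in texts:
        for pos, ch in enumerate(text):
            counters[pos][ch] += 1
    return counters

train_texts = ["a3f9", "7bk2", "a3f1"]  # hypothetical label strings
val_texts = ["a3f9", "x0p4"]

for name, texts in [("train", train_texts), ("val", val_texts)]:
    for pos, counter in enumerate(position_histograms(texts)):
        print(name, "position", pos + 1, counter.most_common(5))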