Keras CNN training to recognize captcha: getting low loss but low accuracy
I want to train a model that can recognize captchas like this. I want to recognize each character in the picture, so I created the CNN model below.
First, I have a model like this:
print("Creating CNN model...")
a = Input((40, 80, 3))
out = a
out = Conv2D(filters=32, kernel_size=(3, 3), padding='same', activation='relu')(out)
#out = Conv2D(filters=32, kernel_size=(3, 3), activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu')(out)
#out = Conv2D(filters=64, kernel_size=(3, 3), activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=128, kernel_size=(3, 3), padding='same', activation='relu')(out)
#out = Conv2D(filters=128, kernel_size=(3, 3), activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=256, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2) , dim_ordering="th")(out)
out = Flatten()(out)
out = Dropout(0.3)(out)
out = Dense(1024, activation='relu')(out)
out = Dropout(0.3)(out)
out = Dense(512, activation='relu')(out)
out = Dropout(0.3)(out)
out = [Dense(36, name='digit1', activation='softmax')(out),
Dense(36, name='digit2', activation='softmax')(out),
Dense(36, name='digit3', activation='softmax')(out),
Dense(36, name='digit4', activation='softmax')(out)]
model = Model(inputs=a, outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
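For completeness, a four-output model like this is presumably trained by passing the labels as a list of four one-hot arrays. A minimal sketch, assuming integer labels y_train/y_val of shape (N, 4) with values 0..35 (these variable names are placeholders, not from the original post):

from keras.utils import to_categorical

# Hypothetical shapes: x_train (N, 40, 80, 3), y_train (N, 4) with ints in 0..35.
y_train_list = [to_categorical(y_train[:, i], num_classes=36) for i in range(4)]
y_val_list = [to_categorical(y_val[:, i], num_classes=36) for i in range(4)]
model.fit(x_train, y_train_list,
          validation_data=(x_val, y_val_list),
          epochs=50, batch_size=32)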
While training the model, it shows a high loss but not-too-bad accuracy on the validation set, like this:
Epoch 47/50
1000/1000 [==============================] - 2s 2ms/step - loss: 0.9421 - digit1_loss: 0.2456 - digit2_loss: 0.2657 - digit3_loss: 0.2316 - digit4_loss: 0.1992 - digit1_acc: 0.9400 - digit2_acc: 0.9190 - digit3_acc: 0.9330 - digit4_acc: 0.9300 - val_loss: 5.0476 - val_digit1_loss: 0.5545 - val_digit2_loss: 1.8687 - val_digit3_loss: 1.8951 - val_digit4_loss: 0.7294 - val_digit1_acc: 0.8300 - val_digit2_acc: 0.5500 - val_digit3_acc: 0.5200 - val_digit4_acc: 0.8000
Epoch 00047: val_digit4_acc did not improve from 0.83000
Epoch 48/50
1000/1000 [==============================] - 2s 2ms/step - loss: 0.9154 - digit1_loss: 0.1681 - digit2_loss: 0.2992 - digit3_loss: 0.2556 - digit4_loss: 0.1924 - digit1_acc: 0.9520 - digit2_acc: 0.9180 - digit3_acc: 0.9220 - digit4_acc: 0.9370 - val_loss: 4.6983 - val_digit1_loss: 0.4929 - val_digit2_loss: 1.8220 - val_digit3_loss: 1.6665 - val_digit4_loss: 0.7170 - val_digit1_acc: 0.8700 - val_digit2_acc: 0.5300 - val_digit3_acc: 0.5900 - val_digit4_acc: 0.8300
Epoch 00048: val_digit4_acc improved from 0.83000 to 0.83000, saving model to cnn_model.hdf5
Epoch 49/50
1000/1000 [==============================] - 2s 2ms/step - loss: 0.8703 - digit1_loss: 0.1813 - digit2_loss: 0.2374 - digit3_loss: 0.2537 - digit4_loss: 0.1979 - digit1_acc: 0.9450 - digit2_acc: 0.9240 - digit3_acc: 0.9250 - digit4_acc: 0.9400 - val_loss: 4.6405 - val_digit1_loss: 0.4936 - val_digit2_loss: 1.8665 - val_digit3_loss: 1.5744 - val_digit4_loss: 0.7060 - val_digit1_acc: 0.8700 - val_digit2_acc: 0.5000 - val_digit3_acc: 0.5900 - val_digit4_acc: 0.7800
Epoch 00049: val_digit4_acc did not improve from 0.83000
Epoch 50/50
1000/1000 [==============================] - 2s 2ms/step - loss: 0.9112 - digit1_loss: 0.2036 - digit2_loss: 0.2543 - digit3_loss: 0.2222 - digit4_loss: 0.2312 - digit1_acc: 0.9360 - digit2_acc: 0.9170 - digit3_acc: 0.9290 - digit4_acc: 0.9330 - val_loss: 4.9354 - val_digit1_loss: 0.5632 - val_digit2_loss: 1.8869 - val_digit3_loss: 1.7899 - val_digit4_loss: 0.6954 - val_digit1_acc: 0.8600 - val_digit2_acc: 0.5000 - val_digit3_acc: 0.5700 - val_digit4_acc: 0.7900
Then I use model.evaluate on the validation set, and I get this loss and accuracy (the loss is high, and the accuracy is high too):
Test loss: 4.9354219818115235
Test accuracy: 0.5632282853126526
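For reference, the evaluate call looks roughly like this (x_val and y_val_list are the placeholder names from the sketch above). Note that for a multi-output model, model.evaluate returns the total loss first, then each output's loss, then each output's metric, so an index like scores[1] is digit1_loss rather than an accuracy:

scores = model.evaluate(x_val, y_val_list, verbose=0)
# For this model the returned list is ordered as:
# [loss, digit1_loss, digit2_loss, digit3_loss, digit4_loss,
#  digit1_acc, digit2_acc, digit3_acc, digit4_acc]
print(dict(zip(model.metrics_names, scores)))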
Second, I change the model like this (taking out the two dense hidden layers):
# Create CNN model (same as above, but with the two dense hidden layers removed)
print("Creating CNN model...")
a = Input((40, 80, 3))
out = a
out = Conv2D(filters=32, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=128, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=256, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2), dim_ordering="th")(out)  # same deprecated argument as above
out = Flatten()(out)
out = Dropout(0.3)(out)
#out = Dense(1024, activation='relu')(out)
#out = Dropout(0.3)(out)
#out = Dense(512, activation='relu')(out)
#out = Dropout(0.3)(out)
out = [Dense(36, name='digit1', activation='softmax')(out),
       Dense(36, name='digit2', activation='softmax')(out),
       Dense(36, name='digit3', activation='softmax')(out),
       Dense(36, name='digit4', activation='softmax')(out)]
model = Model(inputs=a, outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
While training this model, it shows a low loss and good accuracy on the validation set, like this:
Epoch 47/50
1000/1000 [==============================] - 1s 1ms/step - loss: 0.0536 - digit1_loss: 0.0023 - digit2_loss: 0.0231 - digit3_loss: 0.0230 - digit4_loss: 0.0052 - digit1_acc: 1.0000 - digit2_acc: 0.9980 - digit3_acc: 0.9990 - digit4_acc: 0.9990 - val_loss: 1.2679 - val_digit1_loss: 0.1059 - val_digit2_loss: 0.6560 - val_digit3_loss: 0.4402 - val_digit4_loss: 0.0658 - val_digit1_acc: 0.9600 - val_digit2_acc: 0.8200 - val_digit3_acc: 0.8900 - val_digit4_acc: 0.9900
Epoch 00047: val_digit4_acc did not improve from 0.99000
Epoch 48/50
1000/1000 [==============================] - 1s 1ms/step - loss: 0.0686 - digit1_loss: 0.0044 - digit2_loss: 0.0269 - digit3_loss: 0.0238 - digit4_loss: 0.0136 - digit1_acc: 0.9990 - digit2_acc: 0.9980 - digit3_acc: 0.9980 - digit4_acc: 0.9950 - val_loss: 1.2249 - val_digit1_loss: 0.1170 - val_digit2_loss: 0.6593 - val_digit3_loss: 0.4152 - val_digit4_loss: 0.0334 - val_digit1_acc: 0.9500 - val_digit2_acc: 0.8200 - val_digit3_acc: 0.8800 - val_digit4_acc: 0.9900
Epoch 00048: val_digit4_acc did not improve from 0.99000
Epoch 49/50
1000/1000 [==============================] - 1s 1ms/step - loss: 0.0736 - digit1_loss: 0.0087 - digit2_loss: 0.0309 - digit3_loss: 0.0268 - digit4_loss: 0.0071 - digit1_acc: 0.9980 - digit2_acc: 0.9940 - digit3_acc: 0.9950 - digit4_acc: 0.9970 - val_loss: 1.3238 - val_digit1_loss: 0.1229 - val_digit2_loss: 0.6496 - val_digit3_loss: 0.4951 - val_digit4_loss: 0.0562 - val_digit1_acc: 0.9500 - val_digit2_acc: 0.8400 - val_digit3_acc: 0.8500 - val_digit4_acc: 0.9900
Epoch 00049: val_digit4_acc did not improve from 0.99000
Epoch 50/50
1000/1000 [==============================] - 1s 1ms/step - loss: 0.0935 - digit1_loss: 0.0050 - digit2_loss: 0.0354 - digit3_loss: 0.0499 - digit4_loss: 0.0032 - digit1_acc: 0.9980 - digit2_acc: 0.9910 - digit3_acc: 0.9890 - digit4_acc: 0.9990 - val_loss: 1.7740 - val_digit1_loss: 0.1539 - val_digit2_loss: 0.9237 - val_digit3_loss: 0.6273 - val_digit4_loss: 0.0690 - val_digit1_acc: 0.9300 - val_digit2_acc: 0.7900 - val_digit3_acc: 0.8400 - val_digit4_acc: 0.9900
But when I use model.evaluate on the validation set, I get this loss and accuracy (lower loss, but also lower accuracy):
Test loss: 1.773969256877899
Test accuracy: 0.1539082833379507
Why don't I get higher accuracy after getting lower loss? Or is something wrong with how I use the evaluate method?
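One way to cross-check evaluate is to compute the accuracies directly from model.predict; a minimal sketch using the same placeholder names as above:

import numpy as np

preds = model.predict(x_val)                                    # list of four (N, 36) arrays
pred_idx = np.stack([p.argmax(axis=1) for p in preds], axis=1)  # (N, 4) predicted classes
per_digit_acc = (pred_idx == y_val).mean(axis=0)                # accuracy at each position
whole_captcha_acc = (pred_idx == y_val).all(axis=1).mean()      # all four characters correct
print(per_digit_acc, whole_captcha_acc)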
python tensorflow machine-learning keras deep-learning
Try a different approach to this problem. I'd suggest encoding the input image with a CNN, as you do in your example, then feeding the resulting fixed-length embedding vector into an RNN with LSTM units. The RNN will generate a fixed-length (four, in your case) character sequence. Here is a nice guide on how it could be implemented in Keras: blog.keras.io/…
– constt
Jan 1 at 9:21
OK Thanks~ I will try that.
– Sun Hao Lun
Jan 2 at 6:59
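A minimal sketch of the CNN-encoder-to-LSTM-decoder idea constt describes above, under the assumption that the four characters are predicted as a length-4 sequence (all layer sizes here are illustrative, not from the original post):

from keras.models import Model
from keras.layers import (Input, Conv2D, MaxPooling2D, Flatten, Dense,
                          RepeatVector, LSTM, TimeDistributed)

inp = Input((40, 80, 3))
x = Conv2D(32, (3, 3), padding='same', activation='relu')(inp)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(64, (3, 3), padding='same', activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
x = Flatten()(x)
embedding = Dense(256, activation='relu')(x)  # fixed-length image embedding

x = RepeatVector(4)(embedding)                # one time step per character
x = LSTM(128, return_sequences=True)(x)
# A 36-way softmax at each of the 4 time steps; labels have shape (N, 4, 36).
out = TimeDistributed(Dense(36, activation='softmax'))(x)

rnn_model = Model(inp, out)
rnn_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])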
asked Jan 1 at 8:06 by Sun Hao Lun
1 Answer
I would focus on the training data. To me, it looks like the model is overtrained. The second possibility is a problem with the train, cross-validation, and test sets.
Please check:
Are they really independent? Especially the train and cross-validation sets.
How large are your sets? Do you have enough variance in the cross-validation set?
answered Jan 1 at 16:50 by OndraN
Thanks for your response. I have checked that the training set and validation set are independent. I used about 1000 images for training and about 100 for validation. Are they too small to train the model?
– Sun Hao Lun
Jan 2 at 7:06
It depends on the type of problem. In your case, I think it is on the edge. 1000 isn't much, especially if the shapes of the characters really vary across the set. I would expect that the network didn't see enough data to generalize, and maybe the cross-validation set is somehow correlated with the training set (a similar type of letter corruption, a similar order of letters, etc.), which can make cross-validation easier than evaluation.
– OndraN
Jan 2 at 16:21
You are able to evaluate the cross-validation set, so I expect the problem is in the data and not in the code (you probably evaluate the cross-validation set the same way as the evaluation set). I am a Ph.D. candidate focused on audio and text; so far I have only played with image processing for fun, so I could still be wrong.
– OndraN
Jan 2 at 16:22
Here is my approach to troubleshooting, i.e. what I would try (it is independent of the type of problem): 1. Check the histogram of characters for the train, cross-validation, and evaluation sets. It should be balanced (especially for train and cross-validation), or you will find some bias in the data (which is bad for training). 2. Check the histogram of characters at every position, to be sure the characters at each position (e.g. the last one) are not biased in the training and cross-validation sets. This should also be balanced.
– OndraN
Jan 2 at 16:22
3. Jackknifing: mix all the data together, create new training, cross-validation, and evaluation sets, and check the results.
– OndraN
Jan 2 at 16:22
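A quick sketch of the per-position character histogram check OndraN suggests, assuming the labels are available as 4-character strings (train_labels and val_labels are placeholder names):

from collections import Counter

def position_histograms(labels, n_positions=4):
    """Count how often each character occurs at each position."""
    hists = [Counter() for _ in range(n_positions)]
    for word in labels:
        for i, ch in enumerate(word):
            hists[i][ch] += 1
    return hists

for name, labels in [("train", train_labels), ("val", val_labels)]:
    for i, hist in enumerate(position_histograms(labels)):
        print(name, "position", i + 1, hist.most_common(5))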