Keras CNN training to recognize captchas: low loss but low accuracy
I want to train a model that can recognize captchas like this one:

[captcha image]

I want to recognize each character in the picture, so I created the CNN model below.

First, I have a model like this:



print("Creating CNN model...")
a = Input((40, 80, 3))
out = a
out = Conv2D(filters=32, kernel_size=(3, 3), padding='same', activation='relu')(out)
#out = Conv2D(filters=32, kernel_size=(3, 3), activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu')(out)
#out = Conv2D(filters=64, kernel_size=(3, 3), activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=128, kernel_size=(3, 3), padding='same', activation='relu')(out)
#out = Conv2D(filters=128, kernel_size=(3, 3), activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=256, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2) , dim_ordering="th")(out)
out = Flatten()(out)
out = Dropout(0.3)(out)
out = Dense(1024, activation='relu')(out)
out = Dropout(0.3)(out)
out = Dense(512, activation='relu')(out)
out = Dropout(0.3)(out)
out = [Dense(36, name='digit1', activation='softmax')(out),
Dense(36, name='digit2', activation='softmax')(out),
Dense(36, name='digit3', activation='softmax')(out),
Dense(36, name='digit4', activation='softmax')(out)]

model = Model(inputs=a, outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()


[model.summary() output]
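
For context, a four-output model like this expects its labels as a list of four one-hot arrays, one per softmax head. A minimal sketch of the label encoding and training call, assuming 4-character labels over 0-9/a-z (the helper and variable names are illustrative, not from the original post):

import numpy as np

CHARS = "0123456789abcdefghijklmnopqrstuvwxyz"  # 36 classes, assumed alphabet

def encode_labels(texts):
    # turn e.g. "a3f9" into four one-hot vectors of length 36
    y = [np.zeros((len(texts), 36), dtype="float32") for _ in range(4)]
    for i, text in enumerate(texts):
        for pos, ch in enumerate(text):
            y[pos][i, CHARS.index(ch)] = 1.0
    return y

# X_train: array of shape (N, 40, 80, 3); train_texts: list of N 4-char strings
# y_train = encode_labels(train_texts)
# model.fit(X_train, y_train, epochs=50, validation_data=(X_val, encode_labels(val_texts)))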



When I train this model, the validation loss is high but the validation accuracy is not that bad, like this:



Epoch 47/50
1000/1000 [==============================] - 2s 2ms/step - loss: 0.9421 - digit1_loss: 0.2456 - digit2_loss: 0.2657 - digit3_loss: 0.2316 - digit4_loss: 0.1992 - digit1_acc: 0.9400 - digit2_acc: 0.9190 - digit3_acc: 0.9330 - digit4_acc: 0.9300 - val_loss: 5.0476 - val_digit1_loss: 0.5545 - val_digit2_loss: 1.8687 - val_digit3_loss: 1.8951 - val_digit4_loss: 0.7294 - val_digit1_acc: 0.8300 - val_digit2_acc: 0.5500 - val_digit3_acc: 0.5200 - val_digit4_acc: 0.8000

Epoch 00047: val_digit4_acc did not improve from 0.83000
Epoch 48/50
1000/1000 [==============================] - 2s 2ms/step - loss: 0.9154 - digit1_loss: 0.1681 - digit2_loss: 0.2992 - digit3_loss: 0.2556 - digit4_loss: 0.1924 - digit1_acc: 0.9520 - digit2_acc: 0.9180 - digit3_acc: 0.9220 - digit4_acc: 0.9370 - val_loss: 4.6983 - val_digit1_loss: 0.4929 - val_digit2_loss: 1.8220 - val_digit3_loss: 1.6665 - val_digit4_loss: 0.7170 - val_digit1_acc: 0.8700 - val_digit2_acc: 0.5300 - val_digit3_acc: 0.5900 - val_digit4_acc: 0.8300

Epoch 00048: val_digit4_acc improved from 0.83000 to 0.83000, saving model to cnn_model.hdf5
Epoch 49/50
1000/1000 [==============================] - 2s 2ms/step - loss: 0.8703 - digit1_loss: 0.1813 - digit2_loss: 0.2374 - digit3_loss: 0.2537 - digit4_loss: 0.1979 - digit1_acc: 0.9450 - digit2_acc: 0.9240 - digit3_acc: 0.9250 - digit4_acc: 0.9400 - val_loss: 4.6405 - val_digit1_loss: 0.4936 - val_digit2_loss: 1.8665 - val_digit3_loss: 1.5744 - val_digit4_loss: 0.7060 - val_digit1_acc: 0.8700 - val_digit2_acc: 0.5000 - val_digit3_acc: 0.5900 - val_digit4_acc: 0.7800

Epoch 00049: val_digit4_acc did not improve from 0.83000
Epoch 50/50
1000/1000 [==============================] - 2s 2ms/step - loss: 0.9112 - digit1_loss: 0.2036 - digit2_loss: 0.2543 - digit3_loss: 0.2222 - digit4_loss: 0.2312 - digit1_acc: 0.9360 - digit2_acc: 0.9170 - digit3_acc: 0.9290 - digit4_acc: 0.9330 - val_loss: 4.9354 - val_digit1_loss: 0.5632 - val_digit2_loss: 1.8869 - val_digit3_loss: 1.7899 - val_digit4_loss: 0.6954 - val_digit1_acc: 0.8600 - val_digit2_acc: 0.5000 - val_digit3_acc: 0.5700 - val_digit4_acc: 0.7900


Then I ran model.evaluate on the validation set and got this loss and accuracy (the loss is high, and the accuracy looks high too):



Test loss: 4.9354219818115235
Test accuracy: 0.5632282853126526


Second, I changed the model like this (taking out the two fully connected hidden layers):



# Create CNN model (same as above, but with the two Dense hidden layers removed)
print("Creating CNN model...")
a = Input((40, 80, 3))
out = a
out = Conv2D(filters=32, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=128, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Dropout(0.3)(out)
out = Conv2D(filters=256, kernel_size=(3, 3), padding='same', activation='relu')(out)
out = BatchNormalization()(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Flatten()(out)
out = Dropout(0.3)(out)
#out = Dense(1024, activation='relu')(out)
#out = Dropout(0.3)(out)
#out = Dense(512, activation='relu')(out)
#out = Dropout(0.3)(out)
out = [Dense(36, name='digit1', activation='softmax')(out),
       Dense(36, name='digit2', activation='softmax')(out),
       Dense(36, name='digit3', activation='softmax')(out),
       Dense(36, name='digit4', activation='softmax')(out)]

model = Model(inputs=a, outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()


[model.summary() output]



When I train this model, it reaches a low loss and good accuracy on the validation set, like this:



Epoch 47/50
1000/1000 [==============================] - 1s 1ms/step - loss: 0.0536 - digit1_loss: 0.0023 - digit2_loss: 0.0231 - digit3_loss: 0.0230 - digit4_loss: 0.0052 - digit1_acc: 1.0000 - digit2_acc: 0.9980 - digit3_acc: 0.9990 - digit4_acc: 0.9990 - val_loss: 1.2679 - val_digit1_loss: 0.1059 - val_digit2_loss: 0.6560 - val_digit3_loss: 0.4402 - val_digit4_loss: 0.0658 - val_digit1_acc: 0.9600 - val_digit2_acc: 0.8200 - val_digit3_acc: 0.8900 - val_digit4_acc: 0.9900

Epoch 00047: val_digit4_acc did not improve from 0.99000
Epoch 48/50
1000/1000 [==============================] - 1s 1ms/step - loss: 0.0686 - digit1_loss: 0.0044 - digit2_loss: 0.0269 - digit3_loss: 0.0238 - digit4_loss: 0.0136 - digit1_acc: 0.9990 - digit2_acc: 0.9980 - digit3_acc: 0.9980 - digit4_acc: 0.9950 - val_loss: 1.2249 - val_digit1_loss: 0.1170 - val_digit2_loss: 0.6593 - val_digit3_loss: 0.4152 - val_digit4_loss: 0.0334 - val_digit1_acc: 0.9500 - val_digit2_acc: 0.8200 - val_digit3_acc: 0.8800 - val_digit4_acc: 0.9900

Epoch 00048: val_digit4_acc did not improve from 0.99000
Epoch 49/50
1000/1000 [==============================] - 1s 1ms/step - loss: 0.0736 - digit1_loss: 0.0087 - digit2_loss: 0.0309 - digit3_loss: 0.0268 - digit4_loss: 0.0071 - digit1_acc: 0.9980 - digit2_acc: 0.9940 - digit3_acc: 0.9950 - digit4_acc: 0.9970 - val_loss: 1.3238 - val_digit1_loss: 0.1229 - val_digit2_loss: 0.6496 - val_digit3_loss: 0.4951 - val_digit4_loss: 0.0562 - val_digit1_acc: 0.9500 - val_digit2_acc: 0.8400 - val_digit3_acc: 0.8500 - val_digit4_acc: 0.9900

Epoch 00049: val_digit4_acc did not improve from 0.99000
Epoch 50/50
1000/1000 [==============================] - 1s 1ms/step - loss: 0.0935 - digit1_loss: 0.0050 - digit2_loss: 0.0354 - digit3_loss: 0.0499 - digit4_loss: 0.0032 - digit1_acc: 0.9980 - digit2_acc: 0.9910 - digit3_acc: 0.9890 - digit4_acc: 0.9990 - val_loss: 1.7740 - val_digit1_loss: 0.1539 - val_digit2_loss: 0.9237 - val_digit3_loss: 0.6273 - val_digit4_loss: 0.0690 - val_digit1_acc: 0.9300 - val_digit2_acc: 0.7900 - val_digit3_acc: 0.8400 - val_digit4_acc: 0.9900


But when I run model.evaluate on the validation set, I get this loss and accuracy (lower loss, but also a much lower accuracy):



Test loss: 1.773969256877899
Test accuracy: 0.1539082833379507


Why do I get lower accuracy even though the loss is lower?
Or am I doing something wrong when I call the evaluate method?
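
One thing worth checking about the evaluate call: for a multi-output Keras model compiled with metrics=['accuracy'], model.evaluate returns a flat list of numbers aligned with model.metrics_names, i.e. the total loss first, then each head's loss, then each head's accuracy. So scores[1] is digit1_loss, not an accuracy; notably, the "Test accuracy" values printed above (0.5632... and 0.1539...) exactly match val_digit1_loss from the final epoch of each run. A sketch of an unambiguous printout (X_val and y_val are placeholder names):

scores = model.evaluate(X_val, y_val, verbose=0)
# metrics_names is e.g. ['loss', 'digit1_loss', ..., 'digit4_loss',
#                        'digit1_acc', ..., 'digit4_acc']
for name, value in zip(model.metrics_names, scores):
    print(name, value)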

python tensorflow machine-learning keras deep-learning

asked Jan 1 at 8:06 · Sun Hao Lun (305)

  • Try a different approach to this problem. I'd suggest encoding the input image using a CNN as you do in your example, then using the obtained fixed-length embedding vector as input to an RNN with LSTM units. The RNN will generate a fixed-length (four, in your case) character sequence. Here is a nice guide on how it could be implemented in Keras: blog.keras.io/…

    – constt
    Jan 1 at 9:21
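
A minimal sketch of that CNN-encoder → LSTM-decoder idea (layer sizes and variable names are illustrative assumptions, not taken from the linked guide):

from keras.models import Model
from keras.layers import (Input, Conv2D, MaxPooling2D, Flatten, Dense,
                          RepeatVector, LSTM, TimeDistributed)

inp = Input((40, 80, 3))
x = Conv2D(32, (3, 3), padding='same', activation='relu')(inp)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(64, (3, 3), padding='same', activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
x = Flatten()(x)
x = Dense(128, activation='relu')(x)       # fixed-length image embedding
x = RepeatVector(4)(x)                     # one timestep per output character
x = LSTM(128, return_sequences=True)(x)    # decode a 4-step sequence
out = TimeDistributed(Dense(36, activation='softmax'))(x)  # 36 classes per step

seq_model = Model(inp, out)
seq_model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
# here the labels are a single (N, 4, 36) array rather than a list of four arrays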
  • OK Thanks~ I will try that.

    – Sun Hao Lun
    Jan 2 at 6:59
1 Answer
I would focus on the training data. To me, it looks like the model is overfitting. The second possibility is a problem with the train, cross-validation, and test sets.

Please check:

  • Are they really independent? Especially the train and cross-validation sets.

  • How large are your sets? Do you have enough variance in the cross-validation set?






answered Jan 1 at 16:50 · OndraN (74)

  • Thanks for your response. I have checked that the training set and validation set are independent. I used about 1000 images for training and about 100 for validation. Are they too small to train the model?

    – Sun Hao Lun
    Jan 2 at 7:06

  • It depends on the type of problem. In your case, I think it is on the edge. 1000 isn't much, especially if the shapes of the characters really vary across the set. I would expect that the network didn't see enough data to generalize, and that the cross-validation set is somehow correlated with the training set (similar kinds of letter corruption, a similar order of letters, etc.), which can make cross-validation easier than evaluation.

    – OndraN
    Jan 2 at 16:21

  • You are able to evaluate the cross-validation set, so I expect the problem is in the data and not in the code (you probably evaluate the cross-validation set the same way as the evaluation set). I am a Ph.D. candidate focused on audio and text; so far I have only played with picture processing for fun, so I could still be wrong.

    – OndraN
    Jan 2 at 16:22

  • Here is my approach to troubleshooting, which is independent of the type of problem: 1. Check the histogram of characters for the train, cross-validation, and evaluation sets. It should be balanced (especially train and cross-validation), or you will find some bias in the data (which is bad for training). 2. Check the histogram of characters for every position, to be sure you don't have, say, the same last character throughout the training and cross-validation sets. This should also be balanced (see the sketch after this list).

    – OndraN
    Jan 2 at 16:22

  • 3. Jackknifing: mix all the data together, create new training, cross-validation, and evaluation sets, and check the results.

    – OndraN
    Jan 2 at 16:22
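
A quick way to run the per-position balance check suggested above (a minimal sketch; the label lists are hypothetical examples):

from collections import Counter

def position_histograms(texts):
    # count character frequencies at each of the 4 captcha positions
    counters = [Counter() for _ in range(4)]
    for text in texts:
        for pos, ch in enumerate(text):
            counters[pos][ch] += 1
    return counters

train_texts = ["a3f9", "7bk2", "a3f1"]  # hypothetical label strings
val_texts = ["a3f9", "x0p4"]

for name, texts in [("train", train_texts), ("val", val_texts)]:
    for pos, counter in enumerate(position_histograms(texts)):
        print(name, "position", pos + 1, counter.most_common(5))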