Description: We use a Siamese network with three inputs: an anchor sample, a positive sample, and a negative sample. A CNN extracts a fine feature, a 1×1×1024 tensor, from each sample. All three samples pass through the same CNN, so its weights are shared. We compute the Euclidean distance between pairs of fine features and define the triplet loss from the anchor-positive and anchor-negative distances.
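A minimal sketch of a triplet loss built from Euclidean distances, for illustration only. The description does not state the exact form of the loss, so the hinge form and the margin value of 1.0 are assumptions, and the function names are hypothetical. Short toy vectors stand in for the 1024-dimensional fine features.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (assumed form; the margin is an assumption)."""
    d_ap = euclidean_distance(anchor, positive)  # anchor-positive distance
    d_an = euclidean_distance(anchor, negative)  # anchor-negative distance
    # The loss is zero once the negative is farther than the positive by the margin.
    return max(0.0, d_ap - d_an + margin)


# Toy 4-dimensional features standing in for the flattened 1x1x1024 tensors.
anchor = [0.0, 0.0, 0.0, 0.0]
positive = [0.0, 3.0, 4.0, 0.0]  # distance 5 from the anchor
negative = [1.0, 0.0, 0.0, 0.0]  # distance 1 from the anchor
loss = triplet_loss(anchor, positive, negative)
```

In this toy case the positive is farther from the anchor than the negative, so the loss is positive and training would pull the positive closer and push the negative away.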
We also train a classification task: an FC layer with 4135 units (the total number of classes) is connected to the fine feature described above, and a softmax BCE loss is defined on its output.
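One possible reading of the "softmax BCE loss" is sketched here: the FC layer's outputs (logits) are passed through a softmax, and a binary cross-entropy is averaged over the classes against a one-hot target. This interpretation is an assumption, as are the function names and the clipping constant; the description does not specify these details.

```python
import math

NUM_CLASSES = 4135  # number of FC units, taken from the description


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bce_on_softmax(logits, target_index, eps=1e-12):
    """Mean binary cross-entropy between softmax outputs and a one-hot target.

    Assumed interpretation of the "softmax BCE loss" in the description.
    """
    probs = softmax(logits)
    loss = 0.0
    for i, p in enumerate(probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        y = 1.0 if i == target_index else 0.0
        loss -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return loss / len(probs)


# Toy example with 4 classes instead of NUM_CLASSES.
toy_logits = [5.0, 0.0, 0.0, 0.0]
loss_correct = bce_on_softmax(toy_logits, target_index=0)
loss_wrong = bce_on_softmax(toy_logits, target_index=1)
```

The loss is smaller when the largest logit matches the target class, which is the behavior the classification branch relies on.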
The total loss is the sum of the triplet loss and the 3 BCE losses. We train the model with Adam, using a learning rate of 0.001 for the first 20 epochs and 0.0001 for the last 20 epochs.
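The loss combination and the two-stage learning rate can be sketched as follows. The function names are hypothetical, the sum is assumed to be unweighted, and the total of 40 epochs (0-indexed) is inferred from "first 20" plus "last 20" rather than stated outright.

```python
def total_loss(triplet, bce_losses):
    """Unweighted sum of the triplet loss and the three BCE losses."""
    if len(bce_losses) != 3:
        raise ValueError("expected exactly 3 BCE losses")
    return triplet + sum(bce_losses)


def learning_rate(epoch):
    """Step schedule: 0.001 for epochs 0-19, 0.0001 for epochs 20-39.

    Assumes 40 epochs in total, counted from 0.
    """
    if not 0 <= epoch < 40:
        raise ValueError("epoch must be in [0, 40)")
    return 0.001 if epoch < 20 else 0.0001


# Example: combine toy loss values and list the schedule boundaries.
combined = total_loss(0.5, [1.0, 2.0, 3.0])
boundary_rates = [learning_rate(e) for e in (0, 19, 20, 39)]
```

With a real optimizer, the value returned by `learning_rate` would be applied to Adam at the start of each epoch.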
For inference, we use only the softmax classification output.
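As a rough illustration of classification-only inference, the predicted class can be taken as the index of the largest FC output for a single input; the triplet branch is not needed. The function name is hypothetical, and taking the arg-max as the final prediction is an assumption.

```python
def predict_class(logits):
    """Return the index of the largest logit.

    Softmax is monotonic, so the arg-max of the logits equals the
    arg-max of the softmax probabilities.
    """
    return max(range(len(logits)), key=lambda i: logits[i])


# Toy example with 3 classes instead of 4135.
predicted = predict_class([0.1, 2.0, -1.0])
```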