Deep Learning - A Number Of Naive Questions About Caffe
Solution 1:
Let's take a look at one of the examples provided with BVLC/caffe: bvlc_reference_caffenet.
You'll notice that there are in fact 3 '.prototxt' files:

- train_val.prototxt: this file describes the net architecture for the training phase.
- deploy.prototxt: this file describes the net architecture for test time ("deployment").
- solver.prototxt: this file is very small and contains the "meta parameters" for training, e.g., the learning rate policy, regularization, etc.
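To give a feel for what those "meta parameters" look like, here is a minimal solver.prototxt sketch. The parameter names are real Caffe solver fields, but the values and file names are illustrative, not taken from the bvlc_reference_caffenet example:

```protobuf
# Hypothetical solver.prototxt sketch -- values are illustrative only
net: "train_val.prototxt"   # net architecture used for training
base_lr: 0.01               # initial learning rate
lr_policy: "step"           # learning rate policy
gamma: 0.1                  # multiply lr by 0.1 ...
stepsize: 100000            # ... every 100k iterations
momentum: 0.9
weight_decay: 0.0005        # L2 regularization strength
max_iter: 450000            # total training iterations
snapshot: 10000             # save a .caffemodel every 10k iterations
snapshot_prefix: "my_model"
solver_mode: GPU
```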
The net architectures represented by train_val.prototxt and deploy.prototxt should be mostly similar. There are a few main differences between the two:
- Input data: during training one usually uses a predefined set of inputs for training/validation. Therefore, train_val usually contains an explicit input layer, e.g., an "HDF5Data" layer or a "Data" layer. On the other hand, deploy usually does not know in advance what inputs it will get; it only contains a statement:

```
input: "data"
input_shape { dim: 10 dim: 3 dim: 227 dim: 227 }
```

that declares what input the net expects and what its dimensions should be. Alternatively, one can put an "Input" layer:

```
layer {
  name: "input"
  type: "Input"
  top: "data"
  input_param { shape { dim: 10 dim: 3 dim: 227 dim: 227 } }
}
```

- Input labels: during training we supply the net with the "ground truth" expected outputs; this information is obviously not available during deploy.
- Loss layers: during training one must define a loss layer. This layer tells the solver in what direction it should tune the parameters at each iteration, by comparing the net's current prediction to the expected "ground truth". The gradient of the loss is back-propagated to the rest of the net, and this is what drives the learning process. During deploy there is no loss and no back-propagation.
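The three differences above can be made concrete with a side-by-side sketch. The layer types ("Data", "Input", "SoftmaxWithLoss", "Softmax") are real Caffe layers, but the layer names, blob names, and data paths are hypothetical:

```protobuf
# train_val.prototxt (sketch): labeled input + loss layer
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"   # ground-truth labels, only available at training time
  data_param { source: "train_lmdb" batch_size: 64 backend: LMDB }
}
# ... the shared convolution / pooling / inner-product layers ...
layer {
  name: "loss"
  type: "SoftmaxWithLoss"   # drives back-propagation during training
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}

# deploy.prototxt (sketch): unlabeled input, no loss
layer {
  name: "input"
  type: "Input"
  top: "data"
  input_param { shape { dim: 10 dim: 3 dim: 227 dim: 227 } }
}
# ... the same shared layers, with weights loaded from a .caffemodel ...
layer {
  name: "prob"
  type: "Softmax"   # prediction only, no labels and no back-propagation
  bottom: "fc8"
  top: "prob"
}
```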
In caffe, you supply a train_val.prototxt describing the net, the train/val datasets, and the loss. In addition, you supply a solver.prototxt describing the meta parameters for training. The output of the training process is a .caffemodel binary file containing the trained parameters of the net.
Once the net is trained, you can use deploy.prototxt together with the .caffemodel parameters to predict outputs for new and unseen inputs.
Solution 2:
Yes, but there are different types of .prototxt files. For example:
https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_train_test.prototxt
This one is for the training and testing network.
For command-line training you can use a solver file, which is also a .prototxt file, for example:
https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_solver.prototxt
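Command-line training takes only the solver file as input, since the solver file itself points at the network definition. A typical invocation, assuming Caffe was built in ./build, looks like:

```
# Train LeNet on MNIST; snapshots (*.caffemodel) are written at the
# intervals and prefix specified inside lenet_solver.prototxt
./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt
```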