How to add an L2 regularization term to my loss function












I’m going to compare training with and without regularization, so I want to write two custom loss functions.



My loss function with L2 norm:



[image of the loss function with the L2 regularization term]
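(The image itself is not recoverable; presumably it shows the usual cross-entropy objective plus an L2 penalty weighted by the alpha mentioned in the comments, something like)

$$L(w) \;=\; \frac{1}{N}\sum_{i=1}^{N} \mathrm{CE}\big(f_w(x_i),\, y_i\big) \;+\; \frac{\alpha}{2}\,\lVert w \rVert_2^2$$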



import torch.nn as nn
import torch.optim as optim

###NET
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.layer3 = nn.Sequential(
            nn.Conv2d(32, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(4))
        self.fc = nn.Linear(32 * 32 * 32, 11)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out

net = CNN()

###OPTIMIZER
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=LR, momentum=MOMENTUM)



1. How can I add an L2 norm term to my loss function?

2. If I want to write the loss function myself (without using optim.SGD) and do the gradient descent via autograd, how can I do that?




Thanks for your help!










  • You don't need to write two different loss functions if you want to try with and without regularization. You just need to write the one with regularization, and set the damping parameter alpha to zero when you want to try without regularization. Please edit and write the loss function with regularization so we can guide you.

    – Kefeng91
    May 3 '18 at 8:32













  • @Kefeng91 criterion = nn.CrossEntropyLoss(); optimizer = optim.SGD(net.parameters(), lr=LR, weight_decay=0.01) -- I was advised that I can add a 'weight_decay' parameter. If I set it to zero, the loss has no regularization, and if I set it to a nonzero value I get the regularization I need. Right? (See the weight_decay sketch after these comments.)

    – Weimin Chan
    May 3 '18 at 8:51








  • What I meant is that SO is not a "please write the code for me" forum. You should first try to write the forward function yourself and then come back to us with more details about what you tried.

    – Kefeng91
    May 3 '18 at 8:56






  • @Kefeng91 Sure, there's been some misunderstanding. I wrote the net myself, but I don't know how to write a custom loss function; that is just a tiny part of my code. And I'm sorry, I didn't notice that this is essentially the standard structure of most networks.

    – Weimin Chan
    May 3 '18 at 9:10











  • Given the equation of the entropy, if you set alpha to zero, you will have no regularization. If you set alpha to anything else, you will have regularization.

    – Kefeng91
    May 3 '18 at 9:15
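As an aside to the comment thread above: for SGD, the weight_decay argument adds weight_decay * w to each parameter's gradient, which corresponds to adding (weight_decay / 2) * ||w||^2 to the loss, so toggling it should be equivalent to switching an explicit L2 term on and off. A minimal sketch, assuming net, LR and MOMENTUM from the question (0.01 is just the value mentioned in the comments, not a recommendation):

import torch.optim as optim

# baseline: no regularization (weight_decay defaults to 0)
optimizer_plain = optim.SGD(net.parameters(), lr=LR, momentum=MOMENTUM)

# L2 regularization handled by the optimizer instead of the loss
optimizer_l2 = optim.SGD(net.parameters(), lr=LR, momentum=MOMENTUM, weight_decay=0.01)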
















python pytorch






asked May 3 '18 at 7:34 by Weimin Chan, edited May 3 '18 at 9:26













1 Answer
You can explicitly compute the norm of the weights yourself, and add it to the loss.



reg = 0.
for param in net.parameters():  # iterate over the instance net, not the class CNN
    reg += 0.5 * (param ** 2).sum()  # you can replace it with abs().sum() to get L1 regularization
loss = criterion(net(x), y) + reg_lambda * reg  # make the regularization part of the loss
loss.backward()  # continue as usual


See this thread for more info.
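Regarding the asker's second question (doing the update step without optim.SGD), here is a minimal sketch of a manual SGD step built on the snippet above; it assumes net, criterion, a batch (x, y), reg_lambda, and a learning rate lr are defined as in the question:

import torch

def training_step(net, criterion, x, y, reg_lambda, lr):
    # data term plus an explicit L2 penalty over all parameters
    reg = sum(0.5 * (p ** 2).sum() for p in net.parameters())
    loss = criterion(net(x), y) + reg_lambda * reg

    net.zero_grad()        # clear old gradients
    loss.backward()        # autograd fills p.grad for every parameter

    with torch.no_grad():  # update parameters outside the autograd graph
        for p in net.parameters():
            p -= lr * p.grad

    return loss.item()

Calling this once per mini-batch plays the role of optimizer.step(); setting reg_lambda to 0 gives the unregularized baseline for the comparison.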






answered Jan 2 at 6:52 by Shai































