How to add an L2 regularization term to my loss function












I’m going to compare training with and without regularization, so I want to write two custom loss functions.



My loss function with L2 norm:



[image of the loss function with the L2 regularization term]
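(The image itself is not recoverable; presumably it shows the usual cross-entropy objective plus an L2 penalty weighted by the alpha mentioned in the comments, something like)

$$L(w) \;=\; \frac{1}{N}\sum_{i=1}^{N} \mathrm{CE}\big(f_w(x_i),\, y_i\big) \;+\; \frac{\alpha}{2}\,\lVert w \rVert_2^2$$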



import torch.nn as nn
import torch.optim as optim

###NET
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.layer3 = nn.Sequential(
            nn.Conv2d(32, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(4))
        self.fc = nn.Linear(32 * 32 * 32, 11)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out

net = CNN()

###OPTIMIZER
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=LR, momentum=MOMENTUM)



1. How can I add an L2 norm term to my loss function?

2. If I want to write the loss function myself (without using optim.SGD) and do the gradient descent via autograd, how can I do that?




Thanks for your help!










  • You don't need to write two different loss functions if you want to try with and without regularization. You just need to write the one with regularization, and set the damping parameter alpha to zero when you want to try without regularization. Please edit and write the loss function with regularization so we can guide you.

    – Kefeng91
    May 3 '18 at 8:32













  • @Kefeng91 criterion = nn.CrossEntropyLoss(); optimizer = optim.SGD(net.parameters(), lr=LR, weight_decay=0.01) -- I was advised that I can add a 'weight_decay' parameter. If I set it to zero, the loss has no regularization, and if I set it to a nonzero value I get the regularization I need. Right? (See the weight_decay sketch after these comments.)

    – Weimin Chan
    May 3 '18 at 8:51








  • What I meant is that SO is not a "please write the code for me" forum. You should first try to write the forward function yourself and then come back to us with more details about what you tried.

    – Kefeng91
    May 3 '18 at 8:56






  • @Kefeng91 Sure, there's been some misunderstanding. I wrote the net myself, but I don't know how to write a custom loss function; that is just a tiny part of my code. And I'm sorry, I didn't notice that this is essentially the standard structure of most networks.

    – Weimin Chan
    May 3 '18 at 9:10











  • Given the equation of the entropy, if you set alpha to zero, you will have no regularization. If you set alpha to anything else, you will have regularization.

    – Kefeng91
    May 3 '18 at 9:15
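As an aside to the comment thread above: for SGD, the weight_decay argument adds weight_decay * w to each parameter's gradient, which corresponds to adding (weight_decay / 2) * ||w||^2 to the loss, so toggling it should be equivalent to switching an explicit L2 term on and off. A minimal sketch, assuming net, LR and MOMENTUM from the question (0.01 is just the value mentioned in the comments, not a recommendation):

import torch.optim as optim

# baseline: no regularization (weight_decay defaults to 0)
optimizer_plain = optim.SGD(net.parameters(), lr=LR, momentum=MOMENTUM)

# L2 regularization handled by the optimizer instead of the loss
optimizer_l2 = optim.SGD(net.parameters(), lr=LR, momentum=MOMENTUM, weight_decay=0.01)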
















python pytorch






asked May 3 '18 at 7:34 by Weimin Chan, edited May 3 '18 at 9:26













1 Answer
You can explicitly compute the norm of the weights yourself, and add it to the loss.



reg = 0.
for param in net.parameters():  # iterate over the instance net, not the class CNN
    reg += 0.5 * (param ** 2).sum()  # you can replace it with abs().sum() to get L1 regularization
loss = criterion(net(x), y) + reg_lambda * reg  # make the regularization part of the loss
loss.backward()  # continue as usual


See this thread for more info.
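Regarding the asker's second question (doing the update step without optim.SGD), here is a minimal sketch of a manual SGD step built on the snippet above; it assumes net, criterion, a batch (x, y), reg_lambda, and a learning rate lr are defined as in the question:

import torch

def training_step(net, criterion, x, y, reg_lambda, lr):
    # data term plus an explicit L2 penalty over all parameters
    reg = sum(0.5 * (p ** 2).sum() for p in net.parameters())
    loss = criterion(net(x), y) + reg_lambda * reg

    net.zero_grad()        # clear old gradients
    loss.backward()        # autograd fills p.grad for every parameter

    with torch.no_grad():  # update parameters outside the autograd graph
        for p in net.parameters():
            p -= lr * p.grad

    return loss.item()

Calling this once per mini-batch plays the role of optimizer.step(); setting reg_lambda to 0 gives the unregularized baseline for the comparison.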






answered Jan 2 at 6:52 by Shai































