Finding Objects Most Similar to Other Objects as a “Linear Combination”












I have a small collection of objects, $\{o_i\}_{i=1}^{20},$ each with several properties (all quantitative, or able to be made quantitative) whose values differ from object to object. So $o_1$ has, say, $10$ properties, $\{p_{1j}\}_{j=1}^{10},$ and similarly for the other $19$ objects. I have $5$ more objects that are of the same general kind as the original $20,$ but with different properties again. Unfortunately, the property categories are not even the same across all $25$ objects: some of the properties are common to all the objects, and some are not.

There is one important property, which I'll call "usage", that is common to all $25$ objects. Let the usage of object $i$ be written as $u_i$. Note that $u_i\ge 0\;\forall\,i.$

What I would like to do is find a way to write each of the $5$ new usages as a linear combination of the $20$ original object usages, with objects among the $20$ that are more similar contributing more.

For example, let's take one of the new objects, $o_{21}$. I would like to write
$$u_{21}=\sum_{i=1}^{20}a_i u_i,\qquad 0\le a_i\le 1\;\forall\,i,\qquad \text{s.t.}\;\sum_{i=1}^{20}a_i=1.$$
Now suppose that $o_7$ is the most similar to $o_{21}:$ then $a_7$ should be larger than all the other $a_{i}$'s. It is not important that this "linear combination" allow me to predict the properties of $o_{21}.$ It's only important that $\sum_{i=1}^{20}a_i=1$ and that objects more similar to $o_{21}$ have correspondingly larger $a_i$'s.

The target variables here are, for each of the $5$ new objects, the $a_i$'s that satisfy the above criteria. The $u_i$ are known for the $20$ original objects and unknown for the $5$ new objects, so the $u_i$ are also target variables for the new objects. However, since knowing the $a_i$ will determine the $u_i$ for the new objects, the immediate goal of this question is to find the $a_i$'s.

Now it's not too difficult to find out which of the $20$ original objects are "closest" to, say, $o_{21}:$ drop the properties not common to all objects, normalize the remaining quantitative ones, and use the Euclidean distance. (I have no a priori notion of which properties might be more important than others, so I'd rather treat them all on an equal footing.) If I divided each of these distances by the sum of all the distances, I would get the opposite of what I want: the "closer" objects would have correspondingly smaller $a_i$'s.

So it comes down to this question: what sort of function would be good to switch this around, so that the closer objects get larger $a_i$'s? Subtract each distance from the maximum distance and then normalize?
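The recipe described in the question (normalize the common properties, take Euclidean distances, then reverse the ordering so that closer objects weigh more) can be sketched in Python. This is only a minimal sketch under stated assumptions, not an established answer: the function name `similarity_weights`, the scheme names, and the parameters `eps` and `bandwidth` are hypothetical, z-scoring is assumed as the normalization, and the data are made up. The `"max_minus"` scheme is the proposal from the question; `"inverse"` and `"gaussian"` are two common alternatives swapped in for comparison.

```python
import math
from statistics import mean, pstdev


def similarity_weights(old, new, scheme="inverse", eps=1e-12, bandwidth=1.0):
    """Weights a_i >= 0 that sum to 1 and grow as object i gets closer to `new`.

    old: list of property vectors (common properties only), one per original object
    new: property vector of the new object, same property order
    """
    columns = list(zip(*old))
    mu = [mean(c) for c in columns]
    # A constant column carries no information; use 1.0 to avoid dividing by 0.
    sigma = [pstdev(c) or 1.0 for c in columns]

    def zscore(v):
        return [(x - m) / s for x, m, s in zip(v, mu, sigma)]

    z_new = zscore(new)
    d = [math.dist(zscore(row), z_new) for row in old]  # Euclidean distances

    if scheme == "max_minus":      # the proposal in the question
        raw = [max(d) - di for di in d]
    elif scheme == "inverse":      # inverse-distance weighting
        raw = [1.0 / (di + eps) for di in d]
    elif scheme == "gaussian":     # smooth decay controlled by `bandwidth`
        raw = [math.exp(-(di / bandwidth) ** 2) for di in d]
    else:
        raise ValueError(f"unknown scheme: {scheme}")

    total = sum(raw)
    if total == 0:                 # degenerate case: every raw weight is 0
        return [1.0 / len(d)] * len(d)
    return [r / total for r in raw]


# Made-up data: 4 original objects with 2 common properties and known usages.
old = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 45.0]]
usage = [5.0, 7.0, 9.0, 20.0]
new = [2.1, 21.0]  # lies closest to the second object

a = similarity_weights(old, new, scheme="inverse")
u_new = sum(ai * ui for ai, ui in zip(a, usage))  # weighted-average usage estimate
```

Because the weights are nonnegative and sum to $1$, the estimate always lies between the smallest and largest known usage. Note that `"max_minus"` gives the farthest object a weight of exactly zero, which may or may not be wanted if the goal is to keep some information from distant objects.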












  • The first problem that I can see here is that there's no reason why your original objects should span the vector space of properties.
    – user3482749
    Jan 19 at 20:39










  • That's unimportant in my application. This application is somewhat like Natural Language Processing, e.g., when I want to find the word most similar in meaning to another. In a situation like that, it doesn't matter whether the set of words you have to work with spans the space of all meanings. You just want the closest one. In this case, I don't want to lose the information that the objects farther away give me, hence the linear combination. I could just go with the closest, but in this application, I think the linear combination gives me a bit more finesse.
    – Adrian Keister
    Jan 19 at 20:42










  • It isn't, though: the very first thing that you do is write your new object as a linear combination of the old ones. If the old objects don't span, how can you know that's even possible?
    – user3482749
    Jan 19 at 20:43










  • You're quite right; I already know it's not possible in any exact sense. The "linear combination" is only there as a way of expressing the idea that $o_{21}$ is like all these $20$ objects, but it's more like $o_7$ than $o_9,$ and it's more like $o_{10}$ than $o_{3}$.
    – Adrian Keister
    Jan 19 at 20:46










  • Perhaps, then, you could instead edit the question to say what you mean. Because right now, it's essentially impossible to answer.
    – user3482749
    Jan 19 at 20:53
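
The reading reached in these comments, a similarity-weighted average of usages rather than an exact linear combination of objects, could be written as follows. The symbols $d_i$ (the distance from $o_{21}$ to $o_i$ over the common, normalized properties), $\hat u_{21}$ (the resulting usage estimate), and the weight function $w$ are notation assumed here for illustration; they do not appear in the original post.
$$\hat u_{21}=\sum_{i=1}^{20}a_i u_i,\qquad a_i=\frac{w(d_i)}{\sum_{j=1}^{20}w(d_j)},\qquad w\ge 0\ \text{non-increasing}.$$
Provided at least one $w(d_j)>0$, any such $w$ gives $0\le a_i\le 1$ and $\sum_{i=1}^{20}a_i=1,$ and $d_i<d_k$ implies $a_i\ge a_k.$ No spanning condition is needed, because only the scalar usages are being averaged. The choice suggested in the question corresponds to $w(d)=d_{\max}-d,$ which assigns weight exactly $0$ to the farthest object; choices such as $w(d)=1/d$ (for $d>0$) or $w(d)=e^{-d^2/h^2}$ with a chosen scale $h>0$ keep every weight strictly positive.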























Tags: norm, data-analysis














asked Jan 19 at 20:27, edited Jan 19 at 20:58
Adrian Keister





















