Why uppercase for $X$ and lowercase for $y$?
Why is it that most of the time (in many websites, articles and demonstrations) the feature variable (the columns) is denoted by an uppercase 'X', whereas the target variable is a lowercase 'y'?
It looks more like a coding standard to me.
For example:
X = df.iloc[:, :-1]  # every column except the last: the features
y = df.iloc[:, -1]   # the last column: the target
I'm just curious, because I hardly ever use a single letter to name a variable that stores meaningful data.
machine-learning classification python cross-validation scikit-learn
edited Feb 19 at 12:20
amoeba
asked Jan 27 at 15:39
ranit.b
5
Consistency with linear algebra notation, I guess? The features usually form a matrix (typically denoted by an uppercase letter), whereas the labels are usually 1-D, forming a column vector (typically denoted by a lowercase letter).
– galoosh33
Jan 27 at 15:49
1
"I hardly ever use just a single letter to represent a variable storing meaningful data." - the problem is that in common math typesetting, product signs are not written. Thus, it is unclear whether an expression $pi$ represents a single variable called "pi", or the product of two separate variables "p" and "i", $pi=p\times i$. To avoid this confusion, math-heavy disciplines very rarely use variables containing multiple letters. (When you implement an algorithm, yes, it is very good practice to replace single-letter variables by multi-letter ones, if only for easier search and replace.)
– Stephan Kolassa
Jan 27 at 21:21
@StephanKolassa But in programming, a variable named pi never means a product of p and i (unless, of course, you write pi = p * i or similar). Even in languages that allow juxtaposition for a product, the factors are spaced out (p i, for example). (I personally think that allowing omission of the product symbol was one of the biggest blunders in the history of math notation, since it restricts variable names to one letter, so more or less meaningful designations are not possible without using indices; and the 1-letter names quickly run out, which forces us to use Greek, Gothic etc.)
– trolley813
Jan 28 at 7:58
1
@trolley813: the OP was explicitly asking about "websites, articles or demonstration", not programming - but then went on to note (correctly) that this is not the convention in programming. There is simply a category confusion here, which I pointed out.
– Stephan Kolassa
Jan 28 at 9:02
2 Answers
The question of why $X$ and $y$ are popular choices in mathematical notation has been answered on the History of Science and Mathematics SE site: Why are X and Y commonly used as mathematical placeholders? (In short: because Descartes said so!)
In terms of linear algebra, it is extremely common to use capital Latin letters for matrices (e.g. the design matrix $X$) and lowercase Latin letters for vectors (e.g. the response vector $y$). Standard textbooks on the use of matrices in statistics (e.g. Matrix Algebra Useful for Statistics by Searle, Matrix Algebra From a Statistician's Perspective by Harville, and Matrix Algebra: Theory, Computations, and Applications in Statistics by Gentle) use this convention too, so it has become a standard way to denote things.
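To make the matrix/vector distinction concrete, here is a minimal sketch using NumPy. The toy numbers, the array name `data` and the coefficient vector `beta` are made up for illustration; the slicing mirrors the `df.iloc` lines in the question.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 3 feature columns, target in the last column.
data = np.array([
    [1.2, 0.5, 2.0, 0.0],
    [3.4, 0.1, 4.0, 1.0],
    [5.6, 0.9, 6.0, 0.0],
    [7.8, 0.3, 8.0, 1.0],
])

X = data[:, :-1]  # design matrix: one row per sample, one column per feature
y = data[:, -1]   # response vector: one entry per sample

print(X.shape)  # (4, 3)
print(y.shape)  # (4,)

# Hypothetical coefficient vector, only to show that the matrix-vector
# product X @ beta has the same shape as y.
beta = np.array([1.0, 0.0, -1.0])
print((X @ beta).shape)  # (4,)
```

So the uppercase name holds a 2-D object and the lowercase name a 1-D one, which is exactly the matrix/vector convention carried over into code.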
edited Jan 27 at 20:41
answered Jan 27 at 17:51
usεr11852
Before you collect any data values on the feature and target variables, these variables can be considered random variables, provided that a random mechanism is used to select the subjects who will generate these values. In that case, the correct notation for these variables is Y and X (i.e., uppercase letters for both).
Recall that the value of a random variable is unknown before the data are collected, though its long-run behaviour can be predicted using probability laws. Once we collect the data, however, that value becomes known.
After you have collected all the desired data values on the feature and target variables, you can use lowercase notation to denote the collection of data values corresponding to the target variable (y) and the feature variables (x). If you have a single feature variable, x is a vector of data values. If you have multiple feature variables, x is a matrix of data values with one column per feature variable. Usually, y is a vector of data values.
So the uppercase notation refers to "random (hence unknown)", while the lowercase notation refers to "known". Alternatively, the uppercase notation refers to "before collecting the data", while the lowercase notation refers to "after collecting the data".
Sadly, the literature is not at all consistent in its use of this notation, which is why you see the (y,X) notation you mention in your question.
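As a small illustration of the random-versus-observed distinction described above (this is standard probability notation, not something taken from the question), a regression function can be written with uppercase letters for the random variables and a lowercase letter for the value being conditioned on:

$$
m(x) = \operatorname{E}[\,Y \mid X = x\,]
$$

Here $Y$ and $X$ are random, while $x$ is a fixed, known value; the name $m$ is just a label chosen for this sketch.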
answered Jan 27 at 17:27
Isabella Ghement