Convert SQL timestamp column to date format column of Python dataframe
I have a data export in MS Excel format.
This file has a column with dates in "dd.mm.yyyy 00:00:00" format.
I read the file with this code:
df = pd.read_excel('data_from_db.xlsx')
I receive a frame where the date column has "object" type. I then convert this column to a date type with this command:
df['Date_Column'] = pd.to_datetime(df['Date_Column'])
That gives me "datetime64[ns]" type.
But this command does not always work correctly. I get rows with muddled dates:
- some rows come out as "yyyy.mm.dd",
- others as "yyyy.dd.mm".
How should I correctly convert an Excel column in "dd.mm.yyyy 00:00:00" format to a pandas dataframe column with a date type and "dd.mm.yyyy" format?
P.S. I also noticed an oddity: some values in the raw date column have str type, others float. I can't make sense of this, because the raw table is an export from a database.
python excel pandas timestamp date-formatting
Hey there, welcome to StackOverflow! Please provide some more information, e.g. a sample of your data_from_db.xlsx. Have you checked the date format inside the spreadsheet, are they all 'dd.mm.yyyy'?
– Finwood
Nov 20 '18 at 9:57
@Finwood thank you for your attention - I updated the question with a table image link.
– Kate
Nov 21 '18 at 13:22
@OleV.V. thank you for your advice - I've corrected the tag
– Kate
Nov 21 '18 at 13:22
asked Nov 20 '18 at 9:39 by Kate, edited Nov 21 '18 at 13:20
1 Answer
Without specifying a format, pd.to_datetime has to guess from the data how a date string is to be interpreted. With default parameters this fails for the second and third row of your data:
In [5]: date_of_hire = pd.Series(['18.01.2018 0:00:00',
                                  '01.02.2018 0:00:00',
                                  '06.11.2018 0:00:00'])
In [6]: pd.to_datetime(date_of_hire)
Out[6]:
0 2018-01-18
1 2018-01-02
2 2018-06-11
dtype: datetime64[ns]
The quickest solution would be to pass dayfirst=True:
In [7]: pd.to_datetime(date_of_hire, dayfirst=True)
Out[7]:
0 2018-01-18
1 2018-02-01
2 2018-11-06
dtype: datetime64[ns]
If you know the complete format of your data, you can specify it directly. This only works if every value matches the format exactly; if a row lacks the time, for example, the conversion will fail.
In [8]: pd.to_datetime(date_of_hire, format='%d.%m.%Y %H:%M:%S')
Out[8]:
0 2018-01-18
1 2018-02-01
2 2018-11-06
dtype: datetime64[ns]
If you know little about the date format except that it is consistent, pandas can infer the format from the data beforehand:
In [9]: pd.to_datetime(date_of_hire, infer_datetime_format=True)
Out[9]:
0 2018-01-18
1 2018-02-01
2 2018-11-06
dtype: datetime64[ns]
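The question also asks for a "dd.mm.yyyy" representation. A datetime column carries no display format of its own, so one possible approach, sketched here with the sample values from above, is to keep the parsed column for calculations and derive a string column with Series.dt.strftime only for display or export. The column name Date_Text is a made-up name for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Date_Column': ['18.01.2018 0:00:00',
                                   '01.02.2018 0:00:00',
                                   '06.11.2018 0:00:00']})

# Parse with the explicit day-first format; the column becomes a datetime dtype.
df['Date_Column'] = pd.to_datetime(df['Date_Column'],
                                   format='%d.%m.%Y %H:%M:%S')

# Derive a text column in dd.mm.yyyy form (Date_Text is a hypothetical name);
# it holds strings, not datetimes, so it is meant for display or export only.
df['Date_Text'] = df['Date_Column'].dt.strftime('%d.%m.%Y')
print(df)
```

Keeping both columns avoids re-parsing the strings whenever the dates are needed for sorting or arithmetic.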
Thank you @Finwood for your comprehensive information. Your answer is really helpful! Thanks to you and StackOverflow :)
– Kate
Nov 27 '18 at 10:01
In this case, please mark the answer as accepted: stackoverflow.com/help/someone-answers
– Finwood
Nov 30 '18 at 5:56
answered Nov 22 '18 at 16:05 by Finwood