Optimize and speed up MySQL query selection
I'm trying to figure out the best way to optimize my current selection query on a MySQL database.
I have two MySQL tables with a one-to-many relationship. One is the user table, which contains the unique list of users and has around 22k rows. The other is the linedata table, which contains all the coordinates for each user and has around 490k rows.
We can assume the foreign key between the two tables is the id value. In the user table the id is also the auto-increment primary key, while in the linedata table it is not a primary key, because a user can have several rows there.
The CREATE TABLE statements
CREATE TABLE `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`isActive` tinyint(4) NOT NULL,
`userId` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`gender` varchar(45) COLLATE utf8_unicode_ci NOT NULL,
`age` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=21938 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `linedata` (
`id` int(11) NOT NULL,
`userId` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`timestamp` datetime NOT NULL,
`x` float NOT NULL,
`y` float NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
The selection query
SELECT u.id,
       u.isActive,
       u.userId,
       u.name,
       u.gender,
       u.age,
       GROUP_CONCAT(CONCAT_WS(', ', timestamp, x, y)
                    ORDER BY timestamp ASC SEPARATOR '; ') AS linedata_0
FROM user u
JOIN linedata l ON u.id = l.id
WHERE DATEDIFF(l.timestamp, '2018-02-28T20:00:00.000Z') >= 0
  AND DATEDIFF(l.timestamp, '2018-11-20T09:20:08.218Z') <= 0
GROUP BY userId;
The EXPLAIN output
+-------+---------------+-----------+-----------+-------------------+-----------+---------------+-----------+-----------+------------------------------------------------------------+
| ID | SELECT_TYPE | TABLE | TYPE | POSSIBLE_KEYS | KEY | KEY_LEN | REF | ROWS | EXTRA |
+-------+---------------+-----------+-----------+-------------------+-----------+---------------+-----------+-----------+------------------------------------------------------------+
| 1 | SIMPLE | l | ALL | NULL | NULL | NULL | NULL | 491157 | "Using where; Using temporary; Using filesort" |
+-------+---------------+-----------+-----------+-------------------+-----------+---------------+-----------+-----------+------------------------------------------------------------+
| 1 | SIMPLE | u | eq_ref | PRIMARY | PRIMARY | 4 | l.id | 1 | NULL |
+-------+---------------+-----------+-----------+-------------------+-----------+---------------+-----------+-----------+------------------------------------------------------------+
The selection query works if, for example, I add another WHERE condition to filter specific users. Let's say I want to select just 200 users: then I get around 14 seconds of execution time, and around 7 seconds if I select just the first 100 users. But with only the datetime range condition the query seems to run without ever finishing. Any suggestions?
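(For illustration, the kind of extra per-user filter meant above might look like the following; the exact condition is not shown in the question, so the id range here is hypothetical.)
SELECT u.id, u.isActive, u.userId, u.name, u.gender, u.age,
       GROUP_CONCAT(CONCAT_WS(', ', timestamp, x, y)
                    ORDER BY timestamp ASC SEPARATOR '; ') AS linedata_0
FROM user u
JOIN linedata l ON u.id = l.id
WHERE u.id BETWEEN 1 AND 200          -- hypothetical filter on the first 200 users
  AND DATEDIFF(l.timestamp, '2018-02-28T20:00:00.000Z') >= 0
  AND DATEDIFF(l.timestamp, '2018-11-20T09:20:08.218Z') <= 0
GROUP BY userId;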
UPDATE
After following Rick's suggestions, the query now runs in around 14 seconds. Below is the EXPLAIN EXTENDED output:
id,select_type,table,type,possible_keys,key,key_len,ref,rows,filtered,Extra
1,PRIMARY,u,index,PRIMARY,PRIMARY,4,NULL,21959,100.00,NULL
1,PRIMARY,l,ref,id_timestamp_index,id_timestamp_index,4,u.id,14,100.00,"Using index condition"
2,"DEPENDENT SUBQUERY",NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,"No tables used"
I have changed some of the tables' columns. The id in the user table can now be joined with userId in the linedata table, and both are integers. Only the userId column in the user table is still a string, because it is a long identifier like 0000309ab2912b2fd34350d7e6c079846bb6c5e1f97d3ccb053d15061433e77a_0.
As a quick example, the user and linedata tables now look like this:
+----+----------+--------+------+--------+-----+
| id | isActive | userId | name | gender | age |
+----+----------+--------+------+--------+-----+
| 1  | 1        | x4by4d | john | m      | 22  |
| 2  | 1        | 3ub3ub | bob  | m      | 50  |
+----+----------+--------+------+--------+-----+
+----+--------+-----------+----+----+
| id | userId | timestamp | x  | y  |
+----+--------+-----------+----+----+
| 1  | 1      | somedate  | 30 | 10 |
| 2  | 1      | somedate  | 45 | 15 |
| 3  | 1      | somedate  | 50 | 20 |
| 4  | 2      | somedate  | 20 | 5  |
| 5  | 2      | somedate  | 25 | 10 |
+----+--------+-----------+----+----+
I have added a compound index made of userId and timestamp in the linedata table.
Instead of keeping an auto-increment id as the primary key of the linedata table, what if I add a composite primary key made of userId + timestamp? Would that increase performance or not? (A sketch of that alternative is shown below.)
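For reference, a sketch of what that composite-primary-key change might look like; this is assumed DDL, not taken from the actual schema, and it presumes no two rows share the same userId and timestamp:
-- drop the surrogate auto-increment key and promote (userId, timestamp) to primary key
ALTER TABLE linedata
  MODIFY id INT NOT NULL,               -- remove AUTO_INCREMENT from the surrogate key first
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (userId, timestamp);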
mysql select query-optimization rdbms
asked Nov 20 '18 at 13:26 – UgoL
edited Nov 21 '18 at 21:36 – Rick James
1 Answer
I need to help you fix several bugs before discussing performance.
First of all, '2018-02-28T20:00:00.000Z' won't work in MySQL. It needs to be '2018-02-28 20:00:00.000', and something needs to be done about the timezone.
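One way to handle the timezone part could be to convert the UTC boundary values ('Z' suffix) before comparing them with the local DATETIME column. A sketch, where the target offset '+01:00' is an assumption standing in for the server's actual zone:
-- shift the UTC boundaries into the server's local offset before using them in the WHERE clause
SELECT CONVERT_TZ('2018-02-28 20:00:00', '+00:00', '+01:00') AS local_from,
       CONVERT_TZ('2018-11-20 09:20:08', '+00:00', '+01:00') AS local_to;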
Then, don't "hide a column in a function". That is DATEDIFF(l.timestamp ...)
cannot use any indexing on timestamp
.
So, instead of
WHERE DATEDIFF(l.timestamp, '2018-02-28T20:00:00.000Z') >= 0
AND DATEDIFF(l.timestamp, '2018-11-20T09:20:08.218Z') <= 0
do something like
WHERE l.timestamp >= '2018-02-28 20:00:00.000'
AND l.timestamp < '2018-11-20 09:20:08.218'
I'm confused about the two tables. Both have id and userId, yet you join on id. Perhaps instead of
CREATE TABLE `linedata` (
`id` int(11) NOT NULL,
`userId` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
...
you meant
CREATE TABLE `linedata` (
`id` int(11) NOT NULL AUTO_INCREMENT, -- (the id for `linedata`)
`userId` int NOT NULL, -- to link to the other table
...
PRIMARY KEY(id)
...
Then there could be several linedata rows for each user.
At that point, this
JOIN linedata l ON u.id=l.id
becomes
JOIN linedata l ON u.id=l.userid
Now, for performance: linedata needs INDEX(userid, timestamp) - in that order.
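A minimal sketch of adding that index (the index name idx_userid_timestamp is an assumption):
-- compound index so the join on userId can also range-scan timestamp
ALTER TABLE linedata
  ADD INDEX idx_userid_timestamp (userId, timestamp);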
Now, think about the output. You are asking for up to 22K rows, with possibly hundreds of "ts,x,y" strung together in one of the columns. What will receive this much data? Will it choke on it?
And GROUP_CONCAT has a default limit of 1024 bytes. That will allow for about 50 points. If a 'user' can be in more than 50 spots in 9 days, consider increasing group_concat_max_len before running the query.
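For example, a session-level setting (the 1 MB value is just an illustrative choice):
-- raise the per-result length cap for GROUP_CONCAT in this session only
SET SESSION group_concat_max_len = 1024 * 1024;   -- 1 MB; the default is 1024 bytes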
To make it work even faster, reformulate it this way:
SELECT u.id, u.isActive, u.userId, u.name, u.gender, u.age,
( SELECT GROUP_CONCAT(CONCAT_WS(', ',timestamp, x, y)
ORDER BY timestamp ASC
SEPARATOR '; ')
) as linedata_0
FROM user u
JOIN linedata l ON u.id = l.userid
WHERE l.timestamp >= '2018-02-28 20:00:00.000'
AND l.timestamp < '2018-11-20 09:20:08.218';
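As printed above, the subquery has no FROM clause and the join stays in the outer query, so the exact intent is a bit ambiguous. A completed form of the same correlated-subquery idea, returning one row per user, might look like this (a sketch of one plausible reading, not the answerer's exact statement):
SELECT  u.id, u.isActive, u.userId, u.name, u.gender, u.age,
        ( SELECT GROUP_CONCAT(CONCAT_WS(', ', l.timestamp, l.x, l.y)
                              ORDER BY l.timestamp ASC
                              SEPARATOR '; ')
            FROM linedata l
           WHERE l.userId = u.id
             AND l.timestamp >= '2018-02-28 20:00:00.000'
             AND l.timestamp <  '2018-11-20 09:20:08.218'
        ) AS linedata_0
FROM user u;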
Another thing. You probably want to be able to look up a user by name; so add INDEX(name).
Oh, what the heck is the VARCHAR(255) for userID?? Ids are normally integers.
This covers most of what I was going to suggest. Note, I think the OP included their accurate table structures, it's just that there is no PK in the linedata table, and the id column there does in fact contain the same value as the id column in the user table. The userId looks like an alternative identifier, for some purpose. Likely, your suggested index of (userid, timestamp) could instead be handled by just making a PRIMARY KEY out of (id, timestamp), as a user is unlikely to be allowed to have more than one location at a time.
– Willem Renzema
Nov 21 '18 at 5:41
What @WillemRenzema says is right. I didn't have a PK for the linedata table, but I have now created an auto-increment PK called id. And I joined the id of the user table (auto-increment PK) with the userId value, which is now an integer too and matches correctly.
– UgoL
Nov 21 '18 at 10:22
@Rick James thank you for your answer. I have corrected certain points, especially the personId type. I have generated an integer value that corresponds to the unique id of the user table now. What I think cannot work properly is your query suggestion, because I get just one result row, corresponding to just one user. In theory I should get around 21k rows with the concatenation of each timestamp and coordinates. I think you are missing the final group by u.id. Am I right? And if I add it, the benchmark for getting all the data is around 15 secs. I have updated the EXPLAIN statement.
– UgoL
Nov 21 '18 at 10:31
@UgoL - Good start. But before I go any further, please clarify personId versus userId -- are they the same? Do they uniquely identify each user? Why 255? The EXPLAIN and timing are still bad because of a lack of index(es) on linedata; please fix. (And maybe some other things that Willem or I mentioned.)
– Rick James
Nov 21 '18 at 17:36
@RickJames yes, forgive me for the confusion.. personId is userId. Let me update the question with an example to show everything better, especially the relationship between userId and id, because I made some changes. What do you mean by lack of indexes on linedata? It should be correct, I suppose, but I'll show that part too in the updated question.
– UgoL
Nov 21 '18 at 17:44