Efficient substring Search in DynamoDB
This is the context of my situation:
- I have a huge DB in dynamoDB with 250.000 items. (Example) table
- I want to be able to "substring search" through 3 attributes, getting the list of all items that match the substrings.
- The attributes i want to be able to search can have the same value among different items.
- My hash key is an id (the only attribute that really differentiates the items).
- I'm using react native as a client
- My schema has these "query types" queries
Where I am:
I first tried querying with the listCaballos query adding the user input as a filter to the query, and using the nextToken recursively to go over the whole table (without using secondary indexes), but it took 6 minutes to go through the table and return the items.
I know secondary indexes help to partition and then order the items through chosen keys (which makes it fast), buuuut I read that that forces the user to make an exact search (not a substring kind of search), and that's not what I need.
I've heard Elastic Search might help.
Any suggestions?
Thanks!
search substring amazon-dynamodb
|
show 7 more comments
This is the context of my situation:
- I have a huge DB in dynamoDB with 250.000 items. (Example) table
- I want to be able to "substring search" through 3 attributes, getting the list of all items that match the substrings.
- The attributes i want to be able to search can have the same value among different items.
- My hash key is an id (the only attribute that really differentiates the items).
- I'm using react native as a client
- My schema has these "query types" queries
Where I am:
I first tried querying with the listCaballos query adding the user input as a filter to the query, and using the nextToken recursively to go over the whole table (without using secondary indexes), but it took 6 minutes to go through the table and return the items.
I know secondary indexes help to partition and then order the items through chosen keys (which makes it fast), buuuut I read that that forces the user to make an exact search (not a substring kind of search), and that's not what I need.
I've heard Elastic Search might help.
Any suggestions?
Thanks!
search substring amazon-dynamodb
How large is the table in bytes?
– cementblocks
Nov 20 '18 at 20:52
You basically have to scan the table to get the results.
– cementblocks
Nov 20 '18 at 20:53
Is this an ad hoc query or a new, regular data access pattern?
– Matthew Pope
Nov 21 '18 at 2:37
@MatthewPope Im not sure what you mean, but I added more info to the question in order to (maybe) answer you.
– Diego Leonvendagar
Nov 21 '18 at 3:57
@DiegoLeonvendagar Is this a one-time search or something you will need to do frequently? The answer to that question makes a difference to the recommended solution.
– Matthew Pope
Nov 21 '18 at 4:23
|
show 7 more comments
This is the context of my situation:
- I have a huge DB in dynamoDB with 250.000 items. (Example) table
- I want to be able to "substring search" through 3 attributes, getting the list of all items that match the substrings.
- The attributes i want to be able to search can have the same value among different items.
- My hash key is an id (the only attribute that really differentiates the items).
- I'm using react native as a client
- My schema has these "query types" queries
Where I am:
I first tried querying with the listCaballos query adding the user input as a filter to the query, and using the nextToken recursively to go over the whole table (without using secondary indexes), but it took 6 minutes to go through the table and return the items.
I know secondary indexes help to partition and then order the items through chosen keys (which makes it fast), buuuut I read that that forces the user to make an exact search (not a substring kind of search), and that's not what I need.
I've heard Elastic Search might help.
Any suggestions?
Thanks!
search substring amazon-dynamodb
This is the context of my situation:
- I have a huge DB in dynamoDB with 250.000 items. (Example) table
- I want to be able to "substring search" through 3 attributes, getting the list of all items that match the substrings.
- The attributes i want to be able to search can have the same value among different items.
- My hash key is an id (the only attribute that really differentiates the items).
- I'm using react native as a client
- My schema has these "query types" queries
Where I am:
I first tried querying with the listCaballos query adding the user input as a filter to the query, and using the nextToken recursively to go over the whole table (without using secondary indexes), but it took 6 minutes to go through the table and return the items.
I know secondary indexes help to partition and then order the items through chosen keys (which makes it fast), buuuut I read that that forces the user to make an exact search (not a substring kind of search), and that's not what I need.
I've heard Elastic Search might help.
Any suggestions?
Thanks!
search substring amazon-dynamodb
search substring amazon-dynamodb
edited Nov 21 '18 at 3:53
Diego Leonvendagar
asked Nov 20 '18 at 18:14


Diego LeonvendagarDiego Leonvendagar
53
53
How large is the table in bytes?
– cementblocks
Nov 20 '18 at 20:52
You basically have to scan the table to get the results.
– cementblocks
Nov 20 '18 at 20:53
Is this an ad hoc query or a new, regular data access pattern?
– Matthew Pope
Nov 21 '18 at 2:37
@MatthewPope Im not sure what you mean, but I added more info to the question in order to (maybe) answer you.
– Diego Leonvendagar
Nov 21 '18 at 3:57
@DiegoLeonvendagar Is this a one-time search or something you will need to do frequently? The answer to that question makes a difference to the recommended solution.
– Matthew Pope
Nov 21 '18 at 4:23
|
show 7 more comments
How large is the table in bytes?
– cementblocks
Nov 20 '18 at 20:52
You basically have to scan the table to get the results.
– cementblocks
Nov 20 '18 at 20:53
Is this an ad hoc query or a new, regular data access pattern?
– Matthew Pope
Nov 21 '18 at 2:37
@MatthewPope Im not sure what you mean, but I added more info to the question in order to (maybe) answer you.
– Diego Leonvendagar
Nov 21 '18 at 3:57
@DiegoLeonvendagar Is this a one-time search or something you will need to do frequently? The answer to that question makes a difference to the recommended solution.
– Matthew Pope
Nov 21 '18 at 4:23
How large is the table in bytes?
– cementblocks
Nov 20 '18 at 20:52
How large is the table in bytes?
– cementblocks
Nov 20 '18 at 20:52
You basically have to scan the table to get the results.
– cementblocks
Nov 20 '18 at 20:53
You basically have to scan the table to get the results.
– cementblocks
Nov 20 '18 at 20:53
Is this an ad hoc query or a new, regular data access pattern?
– Matthew Pope
Nov 21 '18 at 2:37
Is this an ad hoc query or a new, regular data access pattern?
– Matthew Pope
Nov 21 '18 at 2:37
@MatthewPope Im not sure what you mean, but I added more info to the question in order to (maybe) answer you.
– Diego Leonvendagar
Nov 21 '18 at 3:57
@MatthewPope Im not sure what you mean, but I added more info to the question in order to (maybe) answer you.
– Diego Leonvendagar
Nov 21 '18 at 3:57
@DiegoLeonvendagar Is this a one-time search or something you will need to do frequently? The answer to that question makes a difference to the recommended solution.
– Matthew Pope
Nov 21 '18 at 4:23
@DiegoLeonvendagar Is this a one-time search or something you will need to do frequently? The answer to that question makes a difference to the recommended solution.
– Matthew Pope
Nov 21 '18 at 4:23
|
show 7 more comments
2 Answers
2
active
oldest
votes
This is not efficient in DynamoDB. Although you can create secondary indexes to search 'begins_with', substring ('contains') capability is there only for filters which are not efficient in a large data set (Since DynamoDB will use IOPS to query all and then apply the filter).
This kind of a requirement, it is efficient to index the database using another service like AWS ElasticSearch or CloudSearch so that you can apply the query on top of that service and configure continuous indexing.
Getting Started
- Searching DynamoDB Data with Amazon CloudSearch
- Combining DynamoDB and Amazon Elasticsearch with Lambda
- Indexing Amazon DynamoDB Content with Amazon Elasticsearch Service Using AWS Lambda
add a comment |
You will not be able to use secondary indexes to help create a (reasonable) generalized substring search.
There are many ways to solve your problem. Here, I present a few of them, and this is by no means exhaustive.
DynamoDB -> CloudSearch
CloudSearch can provide general search functionality for your data. Basically, you can connect a lambda function to the DynamoDB stream from your table. That lambda function can keep your CloudSearch domain up to date. Here is an overview of this process.
CloudSearch
You could forgo DynamoDB and store this data in CloudSearch. That eliminates the need for the lambda function and means your data is only stored in one place. However, you need to tolerate a higher time to consistency because CloudSearch doesn’t have strongly consistent reads like DynamoDB.
RDS
You could just use a SQL database of some sort. Most of them support a full text search. You can even use AWS Aurora Serverless if you don’t want to manage database instances.
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53399091%2fefficient-substring-search-in-dynamodb%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
This is not efficient in DynamoDB. Although you can create secondary indexes to search 'begins_with', substring ('contains') capability is there only for filters which are not efficient in a large data set (Since DynamoDB will use IOPS to query all and then apply the filter).
This kind of a requirement, it is efficient to index the database using another service like AWS ElasticSearch or CloudSearch so that you can apply the query on top of that service and configure continuous indexing.
Getting Started
- Searching DynamoDB Data with Amazon CloudSearch
- Combining DynamoDB and Amazon Elasticsearch with Lambda
- Indexing Amazon DynamoDB Content with Amazon Elasticsearch Service Using AWS Lambda
add a comment |
This is not efficient in DynamoDB. Although you can create secondary indexes to search 'begins_with', substring ('contains') capability is there only for filters which are not efficient in a large data set (Since DynamoDB will use IOPS to query all and then apply the filter).
This kind of a requirement, it is efficient to index the database using another service like AWS ElasticSearch or CloudSearch so that you can apply the query on top of that service and configure continuous indexing.
Getting Started
- Searching DynamoDB Data with Amazon CloudSearch
- Combining DynamoDB and Amazon Elasticsearch with Lambda
- Indexing Amazon DynamoDB Content with Amazon Elasticsearch Service Using AWS Lambda
add a comment |
This is not efficient in DynamoDB. Although you can create secondary indexes to search 'begins_with', substring ('contains') capability is there only for filters which are not efficient in a large data set (Since DynamoDB will use IOPS to query all and then apply the filter).
This kind of a requirement, it is efficient to index the database using another service like AWS ElasticSearch or CloudSearch so that you can apply the query on top of that service and configure continuous indexing.
Getting Started
- Searching DynamoDB Data with Amazon CloudSearch
- Combining DynamoDB and Amazon Elasticsearch with Lambda
- Indexing Amazon DynamoDB Content with Amazon Elasticsearch Service Using AWS Lambda
This is not efficient in DynamoDB. Although you can create secondary indexes to search 'begins_with', substring ('contains') capability is there only for filters which are not efficient in a large data set (Since DynamoDB will use IOPS to query all and then apply the filter).
This kind of a requirement, it is efficient to index the database using another service like AWS ElasticSearch or CloudSearch so that you can apply the query on top of that service and configure continuous indexing.
Getting Started
- Searching DynamoDB Data with Amazon CloudSearch
- Combining DynamoDB and Amazon Elasticsearch with Lambda
- Indexing Amazon DynamoDB Content with Amazon Elasticsearch Service Using AWS Lambda
edited Nov 23 '18 at 3:49
answered Nov 23 '18 at 3:38


AshanAshan
10.3k21935
10.3k21935
add a comment |
add a comment |
You will not be able to use secondary indexes to help create a (reasonable) generalized substring search.
There are many ways to solve your problem. Here, I present a few of them, and this is by no means exhaustive.
DynamoDB -> CloudSearch
CloudSearch can provide general search functionality for your data. Basically, you can connect a lambda function to the DynamoDB stream from your table. That lambda function can keep your CloudSearch domain up to date. Here is an overview of this process.
CloudSearch
You could forgo DynamoDB and store this data in CloudSearch. That eliminates the need for the lambda function and means your data is only stored in one place. However, you need to tolerate a higher time to consistency because CloudSearch doesn’t have strongly consistent reads like DynamoDB.
RDS
You could just use a SQL database of some sort. Most of them support a full text search. You can even use AWS Aurora Serverless if you don’t want to manage database instances.
add a comment |
You will not be able to use secondary indexes to help create a (reasonable) generalized substring search.
There are many ways to solve your problem. Here, I present a few of them, and this is by no means exhaustive.
DynamoDB -> CloudSearch
CloudSearch can provide general search functionality for your data. Basically, you can connect a lambda function to the DynamoDB stream from your table. That lambda function can keep your CloudSearch domain up to date. Here is an overview of this process.
CloudSearch
You could forgo DynamoDB and store this data in CloudSearch. That eliminates the need for the lambda function and means your data is only stored in one place. However, you need to tolerate a higher time to consistency because CloudSearch doesn’t have strongly consistent reads like DynamoDB.
RDS
You could just use a SQL database of some sort. Most of them support a full text search. You can even use AWS Aurora Serverless if you don’t want to manage database instances.
add a comment |
You will not be able to use secondary indexes to help create a (reasonable) generalized substring search.
There are many ways to solve your problem. Here, I present a few of them, and this is by no means exhaustive.
DynamoDB -> CloudSearch
CloudSearch can provide general search functionality for your data. Basically, you can connect a lambda function to the DynamoDB stream from your table. That lambda function can keep your CloudSearch domain up to date. Here is an overview of this process.
CloudSearch
You could forgo DynamoDB and store this data in CloudSearch. That eliminates the need for the lambda function and means your data is only stored in one place. However, you need to tolerate a higher time to consistency because CloudSearch doesn’t have strongly consistent reads like DynamoDB.
RDS
You could just use a SQL database of some sort. Most of them support a full text search. You can even use AWS Aurora Serverless if you don’t want to manage database instances.
You will not be able to use secondary indexes to help create a (reasonable) generalized substring search.
There are many ways to solve your problem. Here, I present a few of them, and this is by no means exhaustive.
DynamoDB -> CloudSearch
CloudSearch can provide general search functionality for your data. Basically, you can connect a lambda function to the DynamoDB stream from your table. That lambda function can keep your CloudSearch domain up to date. Here is an overview of this process.
CloudSearch
You could forgo DynamoDB and store this data in CloudSearch. That eliminates the need for the lambda function and means your data is only stored in one place. However, you need to tolerate a higher time to consistency because CloudSearch doesn’t have strongly consistent reads like DynamoDB.
RDS
You could just use a SQL database of some sort. Most of them support a full text search. You can even use AWS Aurora Serverless if you don’t want to manage database instances.
answered Nov 22 '18 at 22:44
Matthew PopeMatthew Pope
1,4421612
1,4421612
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53399091%2fefficient-substring-search-in-dynamodb%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
How large is the table in bytes?
– cementblocks
Nov 20 '18 at 20:52
You basically have to scan the table to get the results.
– cementblocks
Nov 20 '18 at 20:53
Is this an ad hoc query or a new, regular data access pattern?
– Matthew Pope
Nov 21 '18 at 2:37
@MatthewPope Im not sure what you mean, but I added more info to the question in order to (maybe) answer you.
– Diego Leonvendagar
Nov 21 '18 at 3:57
@DiegoLeonvendagar Is this a one-time search or something you will need to do frequently? The answer to that question makes a difference to the recommended solution.
– Matthew Pope
Nov 21 '18 at 4:23