Efficient substring Search in DynamoDB












0















This is the context of my situation:




  • I have a huge DB in dynamoDB with 250.000 items. (Example) table

  • I want to be able to "substring search" through 3 attributes, getting the list of all items that match the substrings.

  • The attributes i want to be able to search can have the same value among different items.

  • My hash key is an id (the only attribute that really differentiates the items).

  • I'm using react native as a client

  • My schema has these "query types" queries


Where I am:




  • I first tried querying with the listCaballos query adding the user input as a filter to the query, and using the nextToken recursively to go over the whole table (without using secondary indexes), but it took 6 minutes to go through the table and return the items.


  • I know secondary indexes help to partition and then order the items through chosen keys (which makes it fast), buuuut I read that that forces the user to make an exact search (not a substring kind of search), and that's not what I need.


  • I've heard Elastic Search might help.



Any suggestions?



Thanks!










share|improve this question

























  • How large is the table in bytes?

    – cementblocks
    Nov 20 '18 at 20:52











  • You basically have to scan the table to get the results.

    – cementblocks
    Nov 20 '18 at 20:53











  • Is this an ad hoc query or a new, regular data access pattern?

    – Matthew Pope
    Nov 21 '18 at 2:37











  • @MatthewPope Im not sure what you mean, but I added more info to the question in order to (maybe) answer you.

    – Diego Leonvendagar
    Nov 21 '18 at 3:57











  • @DiegoLeonvendagar Is this a one-time search or something you will need to do frequently? The answer to that question makes a difference to the recommended solution.

    – Matthew Pope
    Nov 21 '18 at 4:23
















0















This is the context of my situation:




  • I have a huge DB in dynamoDB with 250.000 items. (Example) table

  • I want to be able to "substring search" through 3 attributes, getting the list of all items that match the substrings.

  • The attributes i want to be able to search can have the same value among different items.

  • My hash key is an id (the only attribute that really differentiates the items).

  • I'm using react native as a client

  • My schema has these "query types" queries


Where I am:




  • I first tried querying with the listCaballos query adding the user input as a filter to the query, and using the nextToken recursively to go over the whole table (without using secondary indexes), but it took 6 minutes to go through the table and return the items.


  • I know secondary indexes help to partition and then order the items through chosen keys (which makes it fast), buuuut I read that that forces the user to make an exact search (not a substring kind of search), and that's not what I need.


  • I've heard Elastic Search might help.



Any suggestions?



Thanks!










share|improve this question

























  • How large is the table in bytes?

    – cementblocks
    Nov 20 '18 at 20:52











  • You basically have to scan the table to get the results.

    – cementblocks
    Nov 20 '18 at 20:53











  • Is this an ad hoc query or a new, regular data access pattern?

    – Matthew Pope
    Nov 21 '18 at 2:37











  • @MatthewPope Im not sure what you mean, but I added more info to the question in order to (maybe) answer you.

    – Diego Leonvendagar
    Nov 21 '18 at 3:57











  • @DiegoLeonvendagar Is this a one-time search or something you will need to do frequently? The answer to that question makes a difference to the recommended solution.

    – Matthew Pope
    Nov 21 '18 at 4:23














0












0








0








This is the context of my situation:




  • I have a huge DB in dynamoDB with 250.000 items. (Example) table

  • I want to be able to "substring search" through 3 attributes, getting the list of all items that match the substrings.

  • The attributes i want to be able to search can have the same value among different items.

  • My hash key is an id (the only attribute that really differentiates the items).

  • I'm using react native as a client

  • My schema has these "query types" queries


Where I am:




  • I first tried querying with the listCaballos query adding the user input as a filter to the query, and using the nextToken recursively to go over the whole table (without using secondary indexes), but it took 6 minutes to go through the table and return the items.


  • I know secondary indexes help to partition and then order the items through chosen keys (which makes it fast), buuuut I read that that forces the user to make an exact search (not a substring kind of search), and that's not what I need.


  • I've heard Elastic Search might help.



Any suggestions?



Thanks!










share|improve this question
















This is the context of my situation:




  • I have a huge DB in dynamoDB with 250.000 items. (Example) table

  • I want to be able to "substring search" through 3 attributes, getting the list of all items that match the substrings.

  • The attributes i want to be able to search can have the same value among different items.

  • My hash key is an id (the only attribute that really differentiates the items).

  • I'm using react native as a client

  • My schema has these "query types" queries


Where I am:




  • I first tried querying with the listCaballos query adding the user input as a filter to the query, and using the nextToken recursively to go over the whole table (without using secondary indexes), but it took 6 minutes to go through the table and return the items.


  • I know secondary indexes help to partition and then order the items through chosen keys (which makes it fast), buuuut I read that that forces the user to make an exact search (not a substring kind of search), and that's not what I need.


  • I've heard Elastic Search might help.



Any suggestions?



Thanks!







search substring amazon-dynamodb






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 21 '18 at 3:53







Diego Leonvendagar

















asked Nov 20 '18 at 18:14









Diego LeonvendagarDiego Leonvendagar

53




53













  • How large is the table in bytes?

    – cementblocks
    Nov 20 '18 at 20:52











  • You basically have to scan the table to get the results.

    – cementblocks
    Nov 20 '18 at 20:53











  • Is this an ad hoc query or a new, regular data access pattern?

    – Matthew Pope
    Nov 21 '18 at 2:37











  • @MatthewPope Im not sure what you mean, but I added more info to the question in order to (maybe) answer you.

    – Diego Leonvendagar
    Nov 21 '18 at 3:57











  • @DiegoLeonvendagar Is this a one-time search or something you will need to do frequently? The answer to that question makes a difference to the recommended solution.

    – Matthew Pope
    Nov 21 '18 at 4:23



















  • How large is the table in bytes?

    – cementblocks
    Nov 20 '18 at 20:52











  • You basically have to scan the table to get the results.

    – cementblocks
    Nov 20 '18 at 20:53











  • Is this an ad hoc query or a new, regular data access pattern?

    – Matthew Pope
    Nov 21 '18 at 2:37











  • @MatthewPope Im not sure what you mean, but I added more info to the question in order to (maybe) answer you.

    – Diego Leonvendagar
    Nov 21 '18 at 3:57











  • @DiegoLeonvendagar Is this a one-time search or something you will need to do frequently? The answer to that question makes a difference to the recommended solution.

    – Matthew Pope
    Nov 21 '18 at 4:23

















How large is the table in bytes?

– cementblocks
Nov 20 '18 at 20:52





How large is the table in bytes?

– cementblocks
Nov 20 '18 at 20:52













You basically have to scan the table to get the results.

– cementblocks
Nov 20 '18 at 20:53





You basically have to scan the table to get the results.

– cementblocks
Nov 20 '18 at 20:53













Is this an ad hoc query or a new, regular data access pattern?

– Matthew Pope
Nov 21 '18 at 2:37





Is this an ad hoc query or a new, regular data access pattern?

– Matthew Pope
Nov 21 '18 at 2:37













@MatthewPope Im not sure what you mean, but I added more info to the question in order to (maybe) answer you.

– Diego Leonvendagar
Nov 21 '18 at 3:57





@MatthewPope Im not sure what you mean, but I added more info to the question in order to (maybe) answer you.

– Diego Leonvendagar
Nov 21 '18 at 3:57













@DiegoLeonvendagar Is this a one-time search or something you will need to do frequently? The answer to that question makes a difference to the recommended solution.

– Matthew Pope
Nov 21 '18 at 4:23





@DiegoLeonvendagar Is this a one-time search or something you will need to do frequently? The answer to that question makes a difference to the recommended solution.

– Matthew Pope
Nov 21 '18 at 4:23












2 Answers
2






active

oldest

votes


















0














This is not efficient in DynamoDB. Although you can create secondary indexes to search 'begins_with', substring ('contains') capability is there only for filters which are not efficient in a large data set (Since DynamoDB will use IOPS to query all and then apply the filter).



This kind of a requirement, it is efficient to index the database using another service like AWS ElasticSearch or CloudSearch so that you can apply the query on top of that service and configure continuous indexing.



Getting Started




  • Searching DynamoDB Data with Amazon CloudSearch

  • Combining DynamoDB and Amazon Elasticsearch with Lambda

  • Indexing Amazon DynamoDB Content with Amazon Elasticsearch Service Using AWS Lambda






share|improve this answer

































    0














    You will not be able to use secondary indexes to help create a (reasonable) generalized substring search.



    There are many ways to solve your problem. Here, I present a few of them, and this is by no means exhaustive.



    DynamoDB -> CloudSearch



    CloudSearch can provide general search functionality for your data. Basically, you can connect a lambda function to the DynamoDB stream from your table. That lambda function can keep your CloudSearch domain up to date. Here is an overview of this process.



    CloudSearch



    You could forgo DynamoDB and store this data in CloudSearch. That eliminates the need for the lambda function and means your data is only stored in one place. However, you need to tolerate a higher time to consistency because CloudSearch doesn’t have strongly consistent reads like DynamoDB.



    RDS



    You could just use a SQL database of some sort. Most of them support a full text search. You can even use AWS Aurora Serverless if you don’t want to manage database instances.






    share|improve this answer























      Your Answer






      StackExchange.ifUsing("editor", function () {
      StackExchange.using("externalEditor", function () {
      StackExchange.using("snippets", function () {
      StackExchange.snippets.init();
      });
      });
      }, "code-snippets");

      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "1"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53399091%2fefficient-substring-search-in-dynamodb%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      2 Answers
      2






      active

      oldest

      votes








      2 Answers
      2






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      0














      This is not efficient in DynamoDB. Although you can create secondary indexes to search 'begins_with', substring ('contains') capability is there only for filters which are not efficient in a large data set (Since DynamoDB will use IOPS to query all and then apply the filter).



      This kind of a requirement, it is efficient to index the database using another service like AWS ElasticSearch or CloudSearch so that you can apply the query on top of that service and configure continuous indexing.



      Getting Started




      • Searching DynamoDB Data with Amazon CloudSearch

      • Combining DynamoDB and Amazon Elasticsearch with Lambda

      • Indexing Amazon DynamoDB Content with Amazon Elasticsearch Service Using AWS Lambda






      share|improve this answer






























        0














        This is not efficient in DynamoDB. Although you can create secondary indexes to search 'begins_with', substring ('contains') capability is there only for filters which are not efficient in a large data set (Since DynamoDB will use IOPS to query all and then apply the filter).



        This kind of a requirement, it is efficient to index the database using another service like AWS ElasticSearch or CloudSearch so that you can apply the query on top of that service and configure continuous indexing.



        Getting Started




        • Searching DynamoDB Data with Amazon CloudSearch

        • Combining DynamoDB and Amazon Elasticsearch with Lambda

        • Indexing Amazon DynamoDB Content with Amazon Elasticsearch Service Using AWS Lambda






        share|improve this answer




























          0












          0








          0







          This is not efficient in DynamoDB. Although you can create secondary indexes to search 'begins_with', substring ('contains') capability is there only for filters which are not efficient in a large data set (Since DynamoDB will use IOPS to query all and then apply the filter).



          This kind of a requirement, it is efficient to index the database using another service like AWS ElasticSearch or CloudSearch so that you can apply the query on top of that service and configure continuous indexing.



          Getting Started




          • Searching DynamoDB Data with Amazon CloudSearch

          • Combining DynamoDB and Amazon Elasticsearch with Lambda

          • Indexing Amazon DynamoDB Content with Amazon Elasticsearch Service Using AWS Lambda






          share|improve this answer















          This is not efficient in DynamoDB. Although you can create secondary indexes to search 'begins_with', substring ('contains') capability is there only for filters which are not efficient in a large data set (Since DynamoDB will use IOPS to query all and then apply the filter).



          This kind of a requirement, it is efficient to index the database using another service like AWS ElasticSearch or CloudSearch so that you can apply the query on top of that service and configure continuous indexing.



          Getting Started




          • Searching DynamoDB Data with Amazon CloudSearch

          • Combining DynamoDB and Amazon Elasticsearch with Lambda

          • Indexing Amazon DynamoDB Content with Amazon Elasticsearch Service Using AWS Lambda







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Nov 23 '18 at 3:49

























          answered Nov 23 '18 at 3:38









          AshanAshan

          10.3k21935




          10.3k21935

























              0














              You will not be able to use secondary indexes to help create a (reasonable) generalized substring search.



              There are many ways to solve your problem. Here, I present a few of them, and this is by no means exhaustive.



              DynamoDB -> CloudSearch



              CloudSearch can provide general search functionality for your data. Basically, you can connect a lambda function to the DynamoDB stream from your table. That lambda function can keep your CloudSearch domain up to date. Here is an overview of this process.



              CloudSearch



              You could forgo DynamoDB and store this data in CloudSearch. That eliminates the need for the lambda function and means your data is only stored in one place. However, you need to tolerate a higher time to consistency because CloudSearch doesn’t have strongly consistent reads like DynamoDB.



              RDS



              You could just use a SQL database of some sort. Most of them support a full text search. You can even use AWS Aurora Serverless if you don’t want to manage database instances.






              share|improve this answer




























                0














                You will not be able to use secondary indexes to help create a (reasonable) generalized substring search.



                There are many ways to solve your problem. Here, I present a few of them, and this is by no means exhaustive.



                DynamoDB -> CloudSearch



                CloudSearch can provide general search functionality for your data. Basically, you can connect a lambda function to the DynamoDB stream from your table. That lambda function can keep your CloudSearch domain up to date. Here is an overview of this process.



                CloudSearch



                You could forgo DynamoDB and store this data in CloudSearch. That eliminates the need for the lambda function and means your data is only stored in one place. However, you need to tolerate a higher time to consistency because CloudSearch doesn’t have strongly consistent reads like DynamoDB.



                RDS



                You could just use a SQL database of some sort. Most of them support a full text search. You can even use AWS Aurora Serverless if you don’t want to manage database instances.






                share|improve this answer


























                  0












                  0








                  0







                  You will not be able to use secondary indexes to help create a (reasonable) generalized substring search.



                  There are many ways to solve your problem. Here, I present a few of them, and this is by no means exhaustive.



                  DynamoDB -> CloudSearch



                  CloudSearch can provide general search functionality for your data. Basically, you can connect a lambda function to the DynamoDB stream from your table. That lambda function can keep your CloudSearch domain up to date. Here is an overview of this process.



                  CloudSearch



                  You could forgo DynamoDB and store this data in CloudSearch. That eliminates the need for the lambda function and means your data is only stored in one place. However, you need to tolerate a higher time to consistency because CloudSearch doesn’t have strongly consistent reads like DynamoDB.



                  RDS



                  You could just use a SQL database of some sort. Most of them support a full text search. You can even use AWS Aurora Serverless if you don’t want to manage database instances.






                  share|improve this answer













                  You will not be able to use secondary indexes to help create a (reasonable) generalized substring search.



                  There are many ways to solve your problem. Here, I present a few of them, and this is by no means exhaustive.



                  DynamoDB -> CloudSearch



                  CloudSearch can provide general search functionality for your data. Basically, you can connect a lambda function to the DynamoDB stream from your table. That lambda function can keep your CloudSearch domain up to date. Here is an overview of this process.



                  CloudSearch



                  You could forgo DynamoDB and store this data in CloudSearch. That eliminates the need for the lambda function and means your data is only stored in one place. However, you need to tolerate a higher time to consistency because CloudSearch doesn’t have strongly consistent reads like DynamoDB.



                  RDS



                  You could just use a SQL database of some sort. Most of them support a full text search. You can even use AWS Aurora Serverless if you don’t want to manage database instances.







                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Nov 22 '18 at 22:44









                  Matthew PopeMatthew Pope

                  1,4421612




                  1,4421612






























                      draft saved

                      draft discarded




















































                      Thanks for contributing an answer to Stack Overflow!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function () {
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53399091%2fefficient-substring-search-in-dynamodb%23new-answer', 'question_page');
                      }
                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      Popular posts from this blog

                      MongoDB - Not Authorized To Execute Command

                      in spring boot 2.1 many test slices are not allowed anymore due to multiple @BootstrapWith

                      How to fix TextFormField cause rebuild widget in Flutter