What datastructure is the fastest to find the best matching prefix?
up vote
0
down vote
favorite
Context: I'm working on an analyzer for useragent strings (Yauaa) and as part of this analysis I want to make an educated guess what brand of the device should be reported. I have an implementation that I need to rewrite to be a lot more efficient.
Because I do not want to have a complete list of all devices I want to do the detection based on the prefix of the model.
So I have a dataset with prefixes and the brand that is associated:
- "GT-" --> "Samsung"
- "LLD-" --> "Huawei"
And then I want to do a .get("GT-1234124") which should result in "Samsung" because that is the "longest matching prefix".
I had a look at the Trie structure but that seems to be for the opposite situation. What I understand is that you start with a set of values and you can efficiently get all the values that starts with the provided prefix.
If I were to implement this from scratch I would use a tree similar to the Trie but walk around it differently. What I'm looking for is a datastructure that does what I need as fast as possible.
What datastructure do you recommend for this usecase?
Is there an existing (proven) implementation I can use?
data-structures binary-search-tree prefix prefix-tree
add a comment |
up vote
0
down vote
favorite
Context: I'm working on an analyzer for useragent strings (Yauaa) and as part of this analysis I want to make an educated guess what brand of the device should be reported. I have an implementation that I need to rewrite to be a lot more efficient.
Because I do not want to have a complete list of all devices I want to do the detection based on the prefix of the model.
So I have a dataset with prefixes and the brand that is associated:
- "GT-" --> "Samsung"
- "LLD-" --> "Huawei"
And then I want to do a .get("GT-1234124") which should result in "Samsung" because that is the "longest matching prefix".
I had a look at the Trie structure but that seems to be for the opposite situation. What I understand is that you start with a set of values and you can efficiently get all the values that starts with the provided prefix.
If I were to implement this from scratch I would use a tree similar to the Trie but walk around it differently. What I'm looking for is a datastructure that does what I need as fast as possible.
What datastructure do you recommend for this usecase?
Is there an existing (proven) implementation I can use?
data-structures binary-search-tree prefix prefix-tree
add a comment |
up vote
0
down vote
favorite
up vote
0
down vote
favorite
Context: I'm working on an analyzer for useragent strings (Yauaa) and as part of this analysis I want to make an educated guess what brand of the device should be reported. I have an implementation that I need to rewrite to be a lot more efficient.
Because I do not want to have a complete list of all devices I want to do the detection based on the prefix of the model.
So I have a dataset with prefixes and the brand that is associated:
- "GT-" --> "Samsung"
- "LLD-" --> "Huawei"
And then I want to do a .get("GT-1234124") which should result in "Samsung" because that is the "longest matching prefix".
I had a look at the Trie structure but that seems to be for the opposite situation. What I understand is that you start with a set of values and you can efficiently get all the values that starts with the provided prefix.
If I were to implement this from scratch I would use a tree similar to the Trie but walk around it differently. What I'm looking for is a datastructure that does what I need as fast as possible.
What datastructure do you recommend for this usecase?
Is there an existing (proven) implementation I can use?
data-structures binary-search-tree prefix prefix-tree
Context: I'm working on an analyzer for useragent strings (Yauaa) and as part of this analysis I want to make an educated guess what brand of the device should be reported. I have an implementation that I need to rewrite to be a lot more efficient.
Because I do not want to have a complete list of all devices I want to do the detection based on the prefix of the model.
So I have a dataset with prefixes and the brand that is associated:
- "GT-" --> "Samsung"
- "LLD-" --> "Huawei"
And then I want to do a .get("GT-1234124") which should result in "Samsung" because that is the "longest matching prefix".
I had a look at the Trie structure but that seems to be for the opposite situation. What I understand is that you start with a set of values and you can efficiently get all the values that starts with the provided prefix.
If I were to implement this from scratch I would use a tree similar to the Trie but walk around it differently. What I'm looking for is a datastructure that does what I need as fast as possible.
What datastructure do you recommend for this usecase?
Is there an existing (proven) implementation I can use?
data-structures binary-search-tree prefix prefix-tree
data-structures binary-search-tree prefix prefix-tree
asked 3 hours ago
Niels Basjes
6,42273852
6,42273852
add a comment |
add a comment |
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53371532%2fwhat-datastructure-is-the-fastest-to-find-the-best-matching-prefix%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown