KSQL (Confluent) vs. Hive Kafka SQL (Hortonworks)
What are the differences?
Which one is better?
When should each be used?
Hive Kafka SQL
KSQL
hive apache-kafka hortonworks-data-platform confluent ksql
asked Jan 1 at 8:20
sharan jain
1 Answer
Installation
KSQL uses Kafka Streams and does not depend on Hive, only on Kafka and ZooKeeper.
Hive-Kafka requires Kafka, a HiveServer, and an RDBMS (MySQL, Postgres, etc.).
Ecosystem
For external integrations, Hive-Kafka does not offer Confluent Avro Schema Registry integration. It might (eventually?) offer Hortonworks Schema Registry integration, though.
Hortonworks' suite of tools around NiFi, Spark, Kafka, SMM, Atlas, Ranger, Hive-Streaming, etc. is probably all well tested together.
Confluent partners with other companies to ensure proper integrations between their Platform and tools other than Kafka.
Interface
AFAIK, Hive-Kafka is only a query engine. It will not create or maintain KStream/KTable instances like KSQL does, and it will always require a scan of the Kafka topic. It also has no native REST interface for submitting queries, so the only option for external access would be JDBC/ODBC.
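To illustrate the REST point: the KSQL server accepts statements over HTTP on its `/ksql` endpoint, whereas Hive-Kafka goes through JDBC/ODBC. Below is a minimal sketch of building such a request in Python. The server address `http://localhost:8088` and the helper name `build_ksql_request` are assumptions for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumption: a KSQL server listening on its usual local port.
KSQL_URL = "http://localhost:8088"


def build_ksql_request(statement, base_url=KSQL_URL):
    """Build (but do not send) a POST request for the KSQL server's /ksql endpoint."""
    # The endpoint expects a JSON body with the statement text and
    # optional per-request streams properties.
    body = json.dumps({"ksql": statement, "streamsProperties": {}}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/ksql",
        data=body,
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


req = build_ksql_request("LIST STREAMS;")
# To actually submit it against a running server:
# response = urllib.request.urlopen(req).read()
```

Any HTTP client can do the same, which is what makes KSQL easier to drive from scripts than a JDBC-only engine.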
For a UI, Hive works well with HUE or Ambari Views, which are both open source, but KSQL primarily only has Confluent Control Center, which is a paid-for solution.
"Better" is an opinion, but if you already have Hive, I see no reason not to use Hive-Kafka.
IMO, KSQL can complement Hive-Kafka by defining new topics as both tables and streams, as well as transforming/filtering Confluent's Avro format into JSON that Hive-Kafka can natively understand. From there you can join existing Hive data (HDFS, S3, HBase, etc.) with Hive-Kafka data, though there will likely be performance impacts from that.
Similarly, you can take Hive-Kafka topics and translate them into Avro in KSQL using the Schema Registry, so that other tools like Kafka Connect or NiFi get a more efficient wire format (binary Avro vs. JSON).
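As a rough sketch of the two conversions described above, the statements involved are `CREATE STREAM ... AS SELECT` queries that only change the value format. The Python helper below just builds those statement strings; the stream names (`orders_avro`, `orders_json`, `orders_avro_out`) and the function name are hypothetical, and the source streams are assumed to be already registered in KSQL.

```python
def convert_format_statement(source_stream, target_stream, value_format):
    """Return a KSQL CREATE STREAM ... AS SELECT statement that re-serializes a stream."""
    # Only the two formats discussed in this answer are allowed in this sketch.
    if value_format not in ("AVRO", "JSON"):
        raise ValueError("unsupported format: " + value_format)
    return (
        f"CREATE STREAM {target_stream} "
        f"WITH (VALUE_FORMAT='{value_format}') "
        f"AS SELECT * FROM {source_stream};"
    )


# Confluent Avro -> JSON, so Hive-Kafka can read the resulting topic.
avro_to_json = convert_format_statement("orders_avro", "orders_json", "JSON")

# JSON -> Avro (via the Schema Registry), for Kafka Connect, NiFi, etc.
json_to_avro = convert_format_statement("orders_json", "orders_avro_out", "AVRO")
```

The resulting strings can be pasted into the KSQL CLI or submitted to the server over its REST interface.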
And FWIW, look at the comments section of your first link:
This integration is very different from KSQL.
- The primary use case here is to allow users to actually unleash full SQL query use cases against any Kafka topic.
https://github.com/apache/hive/tree/master/kafka-handler#query-table
- You can use it to atomically move data in and out of Kafka itself.
https://github.com/apache/hive/tree/master/kafka-handler#query-table
- Query the Kafka Stream as part of the entire Data warehouse like ORC/Parquet tables, Druid Tables, HDFS, S3… etc.
edited Jan 5 at 20:10
answered Jan 3 at 17:22
cricket_007