How to read a CSV file and then save it as JSON in Spark Scala?

-3

I am trying to read a CSV file that has around 7 million rows and 22 columns.

How can I save it as a JSON file after reading the CSV into a Spark DataFrame?

edited Nov 22 '18 at 16:54

James Z


asked Nov 22 '18 at 8:58

Sayan Sahoo


scala apache-spark apache-spark-sql

1 Answer
Read the CSV file as a DataFrame:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[2]").appName("test").getOrCreate()

val df = spark.read.csv("path to csv")

Now you can perform operations on df and save it as JSON:

df.write.json("output path")

Hope this helps!
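
A slightly fuller sketch of the same read-then-write flow, in case it is useful. It assumes the CSV has a header row, which the question does not state; the object name CsvToJson and both paths are placeholders, not anything from the original post. Note that write.json produces a directory of part-*.json files with one JSON object per line, not a single .json file.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CsvToJson {
  // Reads a CSV file into a DataFrame and writes it back out as JSON.
  // inputPath and outputPath are placeholders supplied by the caller.
  def convert(spark: SparkSession, inputPath: String, outputPath: String): Unit = {
    val df: DataFrame = spark.read
      .option("header", "true")      // assumption: the first row holds column names
      .option("inferSchema", "true") // infers column types; costs an extra pass over the data
      .csv(inputPath)

    df.write
      .mode("overwrite") // replace the output directory if it already exists
      .json(outputPath)  // writes a directory of part-*.json files
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("csv-to-json")
      .getOrCreate()
    try convert(spark, args(0), args(1))
    finally spark.stop()
  }
}
```

Without the header option, Spark names the columns _c0, _c1, and so on, and those names become the JSON keys.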

answered Nov 22 '18 at 9:18

Shankar Koirala


I tried that, but it throws a SparkException and an IOException, and the error says "Job is aborted while writing the rows". I don't know why. Can you help? I'm new to Spark, so I'm finding it difficult to understand.

– Sayan Sahoo
Nov 22 '18 at 9:40

Why didn't you share what issue you faced and what you tried? Can you share the error log?

– Shankar Koirala
Nov 22 '18 at 10:00

ERROR Utils: Aborting task java.io.IOException: (null) entry in command string: null chmod 0644 D:sample.json_temporary_temporaryattempt_20181122150723_0003_m_000000_0part-00000-448b77ae-c17d-45fe-bba0-a6495fd5c6bd-c000.json at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:762) at org.apache.hadoop.util.Shell.execCommand(Shell.java:859) at org.apache.hadoop.util.Shell.execCommand(Shell.java:842) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:661)

– Sayan Sahoo
Nov 22 '18 at 10:26

Did you already check stackoverflow.com/questions/48010634/…?

– Shankar Koirala
Nov 22 '18 at 12:52

Thank you, the issue is resolved now. :)

– Sayan Sahoo
Nov 22 '18 at 13:35
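
The thread does not say which fix resolved the error. The stack trace shows Hadoop's local filesystem trying to run chmod on a Windows path, and a commonly reported cause of that message is that Hadoop cannot find winutils.exe under its home directory. The sketch below is only a diagnostic under that assumption: WinutilsCheck and its methods are hypothetical helpers, and the lookup order (the hadoop.home.dir system property first, then the HADOOP_HOME environment variable) roughly mirrors what Hadoop's Shell class does.

```scala
import java.io.File

object WinutilsCheck {
  // Hypothetical helper: resolve the Hadoop home directory, preferring the
  // hadoop.home.dir system property over the HADOOP_HOME environment variable.
  def hadoopHome(props: Map[String, String], env: Map[String, String]): Option[String] =
    props.get("hadoop.home.dir").orElse(env.get("HADOOP_HOME")).filter(_.nonEmpty)

  // winutils.exe is expected in the bin subdirectory of the Hadoop home.
  def winutilsPath(home: String): File =
    new File(new File(home, "bin"), "winutils.exe")

  def main(args: Array[String]): Unit =
    hadoopHome(sys.props.toMap, sys.env) match {
      case Some(home) =>
        val exe = winutilsPath(home)
        println(s"$exe exists: ${exe.isFile}")
      case None =>
        println("Neither hadoop.home.dir nor HADOOP_HOME is set")
    }
}
```

If the check reports that the file is missing, setting HADOOP_HOME (or hadoop.home.dir before the SparkSession is created) to a directory whose bin folder contains a winutils.exe matching the Hadoop version is the usual thing to try.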
