Extracting multiple JSON files into one dataframe
I am trying to merge multiple JSON files into one data frame and, despite trying all the approaches I found on SO, it keeps failing.
The files contain sensor data. The stages I've completed are:
1. Unzip the files, which produces JSON files saved as '.txt' files
2. Remove the old zip files
3. Parse the '.txt' files to remove some bugs in the content: random three-letter + comma combinations at the start of some lines, e.g. 'prm,{...'
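The prefix-stripping in step 3 could be sketched roughly like this (the function name and the exact prefix pattern are my assumptions; adjust the regex to whatever the real bugs look like):

```r
# Strip a three-letter + comma prefix (e.g. "prm,") that appears
# before the opening brace on some lines of a '.txt' file
clean_file <- function(path) {
  lines <- readLines(path, warn = FALSE)
  lines <- sub('^[a-z]{3},(\\{)', '\\1', lines)  # "prm,{..." -> "{..."
  writeLines(lines, path)
}
```

This could then be applied to every file with lapply(file.list, clean_file) before streaming them in.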
I have code that will turn them into data frames individually:
stream <- stream_in(file("1.txt"))
flat <- flatten(stream)
df_it <- as.data.frame(flat)
But when I put it into a function:
df_loop <- function(x) {
  stream <- stream_in(x)
  flat <- flatten(stream)
  df_it <- as.data.frame(flat)
  df_it
}
and then try to run the files through it:
df_all <- sapply(file.list, df_loop)
I get:
Error: Argument 'con' must be a connection.
I have also tried to merge the JSON files with rbind.fill and merge, to no avail. I'm not really sure where I'm going so terribly wrong, so I would appreciate any help.
r json
Is file.list a list of file paths? In that case you need to do stream <- stream_in(file(x)) in your function.
– Vivek Kalyanarangan, Jan 3 at 5:21
That worked a treat, but would you help me understand why?
– Foothill_trudger, Jan 3 at 7:12
Added an answer, please check.
– Vivek Kalyanarangan, Jan 3 at 7:24
asked Jan 3 at 4:35 by Foothill_trudger
edited Jan 3 at 7:25 by Uwe Keim
1 Answer
You need a small change in your function. Change the first line to:
stream <- stream_in(file(x))
Explanation
Start by analysing your original implementation:
stream <- stream_in(file("1.txt"))
Here "1.txt" is the file path, which is passed as an input parameter to the file() function. A quick ?file will tell you that file() is a
Function to create, open and close connections, i.e., “generalized files”, such as possibly compressed files, URLs, pipes, etc.
Now if you look up ?stream_in() you will find that it is a
function that implements line-by-line processing of JSON data over a connection, such as a socket, url, file or pipe
The keywords here being socket, url, file or pipe.
Your file.list is just a list of file paths (character strings, to be specific). But for stream_in() to work, you need to pass it a connection object, which is what the file() function returns when given a file path as a string. Chaining that together, you needed to do stream_in(file("/path/to/file.txt")).
Once you make that change, your sapply iterates over each path, creates the connection object, and passes it as input to stream_in().
Hope that helps!
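Combining the fix with the original goal of one merged data frame, a minimal end-to-end sketch might look like this (the file contents and directory are illustrative; it assumes the jsonlite and plyr packages are available):

```r
library(jsonlite)  # stream_in(), flatten()
library(plyr)      # rbind.fill()

# Two tiny NDJSON files stand in for the cleaned sensor '.txt' files
dir <- file.path(tempdir(), "sensor_demo")
dir.create(dir, showWarnings = FALSE)
writeLines('{"id":1,"temp":20.5}', file.path(dir, "a.txt"))
writeLines('{"id":2,"temp":21.0,"humidity":55}', file.path(dir, "b.txt"))
file.list <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)

df_loop <- function(x) {
  stream <- stream_in(file(x), verbose = FALSE)  # file() turns the path into a connection
  as.data.frame(flatten(stream))
}

# lapply keeps each result as a data frame (sapply would try to simplify),
# and rbind.fill pads columns missing from some files with NA
df_list <- lapply(file.list, df_loop)
df_all  <- do.call(rbind.fill, df_list)
```

The rbind.fill step matters because sensor files often disagree on which fields they contain; rows from files lacking a column simply get NA there.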
Thank you - much appreciated! Will get back to work trying to merge them with rbind.fill or something similar.
– Foothill_trudger, Jan 3 at 10:32
You are welcome :-)
– Vivek Kalyanarangan, Jan 3 at 10:34
I've followed your advice but now merging into one data frame seems to crash. What do you think I'm missing to stream_in the files, flatten them and append them to one large data frame?
– Foothill_trudger, Jan 8 at 5:55