Extracting multiple JSON files into one dataframe
I am trying to merge multiple JSON files into one data frame and, despite trying all the approaches I've found on SO, it keeps failing.



The files provide sensor data. The stages I've completed are:



1. Unzip the files - this produces JSON files saved as '.txt' files
2. Remove the old zip files
3. Parse the '.txt' files to remove some bugs in the content - random 3-letter + comma combos at the start of some lines, e.g. 'prm,{...' (see the sketch after this list)
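
For reference, step 3 is roughly equivalent to this sketch (an assumption on my part: the stray prefix is always three lowercase letters plus a comma immediately before the opening brace):

# Hypothetical cleanup pass over the unzipped '.txt' files.
# Assumes prefixes look like "prm," right before a "{" at line start.
for (f in list.files(pattern = "\\.txt$")) {
  lines <- readLines(f)
  lines <- sub("^[a-z]{3},(?=\\{)", "", lines, perl = TRUE)
  writeLines(lines, f)
}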


I've got code which will turn them into data frames individually:



library(jsonlite)                   # stream_in() and flatten() come from jsonlite

stream <- stream_in(file("1.txt"))  # wrap the path in a connection, then stream the JSON
flat <- flatten(stream)             # un-nest any nested data frame columns
df_it <- as.data.frame(flat)


But when I put it into a function:



df_loop <- function(x) {
  stream <- stream_in(x)
  flat <- flatten(stream)
  df_it <- as.data.frame(flat)
  df_it
}


And then try to run the files through it:



df_all <- sapply(file.list, df_loop)


I get:



Error: Argument 'con' must be a connection.


I've then tried to merge the resulting data frames with rbind.fill and merge, to no avail.



Not really sure where I'm going so terribly wrong, so I'd appreciate any help.

r json

asked Jan 3 at 4:35 by Foothill_trudger · edited Jan 3 at 7:25 by Uwe Keim
  • is file.list a list of file paths? In that case you need to do stream <- stream_in(file(x)) in your function
    – Vivek Kalyanarangan, Jan 3 at 5:21

  • That worked a treat but would you help me understand why?
    – Foothill_trudger, Jan 3 at 7:12

  • Added an answer - please check.
    – Vivek Kalyanarangan, Jan 3 at 7:24
1 Answer
You need a small change in your function. Change the stream_in line to:



stream <- stream_in(file(x))


Explanation



Start by analyzing your original implementation:



stream <- stream_in(file("1.txt"))


The 1.txt here is a file path that is passed as an input to the file() function. A quick ?file will tell you that file() is a




Function to create, open and close connections, i.e., “generalized
files”, such as possibly compressed files, URLs, pipes, etc.




Now, if you look at ?stream_in, you will find that it is a




function that implements line-by-line processing of JSON data over a
connection, such as a socket, url, file or pipe




The key phrase here is connection - a socket, url, file or pipe.
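
You can see the distinction in the console (a quick check, assuming 1.txt exists in the working directory):

con <- file("1.txt")  # a character path goes in, a connection object comes out
class(con)
#> [1] "file"       "connection"
close(con)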



Your file.list is just a vector of file paths - character strings, to be specific. But for stream_in() to work, you need to pass it a connection, which is what the file() function returns when given a file path as a string.



Chaining that together, you needed to do stream_in(file("/path/to/file.txt")).



Once you do that, your sapply iterates over each path, creates the connection, and passes it as input to stream_in().
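
Putting it all together, a minimal end-to-end sketch might look like this (assuming file.list is a character vector of paths to newline-delimited JSON files):

library(jsonlite)

df_loop <- function(x) {
  stream <- stream_in(file(x))  # file() turns the path into a connection
  as.data.frame(flatten(stream))
}

# lapply() keeps each result as a data frame; sapply() may try to simplify
# the results into a matrix, which is not what you want here.
df_list <- lapply(file.list, df_loop)

# Bind the rows, filling columns missing from some files with NA.
# jsonlite::rbind_pages() is built for this; plyr::rbind.fill(df_list) also works.
df_all <- rbind_pages(df_list)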



Hope that helps!

answered Jan 3 at 7:24 by Vivek Kalyanarangan
  • Thank you - much appreciated! Will get back to work trying to merge them with rbind.fill or something similar.
    – Foothill_trudger, Jan 3 at 10:32

  • You are welcome :-)
    – Vivek Kalyanarangan, Jan 3 at 10:34

  • I've followed your advice but now merging into one dataframe seems to crash. What do you think I'm missing to stream_in the files, flatten them and append them to one large data frame?
    – Foothill_trudger, Jan 8 at 5:55
