Fetch part of an HTML page in Java
I'm having trouble understanding how to download only part of an HTML page. I tried the traditional way, using URL::openStream and a BufferedReader, but I'm not sure whether this approach forces me to download the whole page.
The problem: I have a fairly large HTML page and I need to parse two numbers from it, which update at least once a second. With the approach above I can only detect changes once every 2-3 seconds, and I wonder if there is a way to make it faster. So I thought fetching only part of the page might help.
java html inputstreamreader
asked Nov 20 '18 at 10:20


Vlad Doronin
Perhaps you can try Jsoup?
– manfromnowhere
Nov 20 '18 at 10:31
It builds a DOM from the whole page. It's quite fast, but not fast enough.
– Vlad Doronin
Nov 20 '18 at 10:41
2 Answers
I think you should look at how the data is fetched (SSE or WebSocket) and try to subscribe to that service directly. If that is impossible, try a more efficient XML parser. I recommend https://vtd-xml.sourceforge.io/; it can be ~10x faster than the DOM parser that comes with the JDK.
Also be careful with BufferedReader.readLine(), as there is a hidden allocation cost (this is pretty advanced stuff, as you have to think about CPU memory bandwidth, L1 cache misses, etc.) for the strings that you don't really need.
Example using the library I mentioned:
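    import com.ximpleware.AutoPilot;
    import com.ximpleware.VTDGen;
    import com.ximpleware.VTDNav;

    // readAllBytesFromTheURL() is a placeholder for your own download code.
    // parse(), selectXPath() and evalXPath() throw checked exceptions that the caller must handle.
    byte[] pageInBytes = readAllBytesFromTheURL();
    VTDGen vg = new VTDGen();
    vg.setDoc(pageInBytes);
    vg.parse(false); // false = not namespace-aware
    VTDNav vn = vg.getNav();
    AutoPilot ap = new AutoPilot(vn);
    // Jump to the section that we want to process
    ap.selectXPath("/html/body/div");
    if (ap.evalXPath() != -1) {
        // getElementFragment() packs the offset (lower 32 bits) and length (upper 32 bits) into a long
        long fragment = vn.getElementFragment();
        String fileId = vn.toString((int) fragment, (int) (fragment >> 32));
    }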
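To illustrate the readLine() point: here is a rough sketch (not part of VTD-XML) of scanning the raw stream byte by byte for a marker and stopping as soon as the value after it is complete, so no per-line strings are allocated and the rest of the page is never read. The class name MarkerScanner, the method readAfterMarker, and the marker text in the usage note are made up for illustration, and the sketch assumes the marker and the value are plain ASCII.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: finds `marker` in a byte stream and returns the text
// between the marker and the next `terminator`, reading nothing beyond it.
final class MarkerScanner {
    static String readAfterMarker(InputStream in, String marker, char terminator) {
        byte[] m = marker.getBytes(StandardCharsets.US_ASCII); // assumption: ASCII marker
        // KMP failure table, so markers with repeated prefixes are matched correctly.
        int[] fail = new int[m.length];
        for (int i = 1, k = 0; i < m.length; i++) {
            while (k > 0 && m[i] != m[k]) k = fail[k - 1];
            if (m[i] == m[k]) k++;
            fail[i] = k;
        }
        try {
            // Phase 1: consume bytes until the whole marker has been seen.
            int matched = 0;
            while (matched < m.length) {
                int b = in.read();
                if (b == -1) return null; // stream ended before the marker
                while (matched > 0 && b != m[matched]) matched = fail[matched - 1];
                if (b == m[matched]) matched++;
            }
            // Phase 2: collect the value up to (not including) the terminator.
            StringBuilder value = new StringBuilder();
            for (int b = in.read(); b != -1; b = in.read()) {
                if (b == terminator) return value.toString();
                value.append((char) b);
            }
            return null; // stream ended before the terminator
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

With a real page you would probably wrap url.openStream() in a BufferedInputStream, call something like readAfterMarker(in, "id=\"price\">", '<'), and close the stream right after. Whether that saves time depends on how early the numbers appear in the page and on how the server and connection buffer the response.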
edited Nov 20 '18 at 11:22
answered Nov 20 '18 at 11:14


piotr szybicki
Thanks a lot! By the way, the page uses Lightstreamer to fetch data from their servers. I tried to use it directly, which obviously was not successful.
– Vlad Doronin
Nov 20 '18 at 11:35
Cool, can you accept my answer? I'm trolling for points on Stack Overflow :)
– piotr szybicki
Nov 20 '18 at 12:50
Yeah, sure. But VTD didn't work for me. The page has some tokens that VTD cannot parse, so now I'm writing a custom reader. But I tried it on another XML file and it is really fast.
– Vlad Doronin
Nov 20 '18 at 12:57
Can you share your solution when you are done? I'm curious to see what you come up with.
– piotr szybicki
Nov 20 '18 at 14:18
Posted my code in the next answer.
– Vlad Doronin
Nov 21 '18 at 12:06
I wrote a helper to read the URL content. The parser for the elements is in another class.
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URL;
    import java.util.ArrayDeque;
    import java.util.Iterator;
    import java.util.Queue;

    public class HTMLReaderHelper {
        private final URL currentURL;

        HTMLReaderHelper(URL url) {
            currentURL = url;
        }

        // Returns null if the URL stream cannot be opened.
        public CharIterator charIterator() {
            try {
                return new CharIterator();
            } catch (IOException ex) {
                return null;
            }
        }

        public StringIterator stringIterator() {
            return new StringIterator();
        }

        // Reads the response one byte at a time; each byte is treated as one char.
        class CharIterator implements Iterator<Character> {
            private final InputStream urlStream;
            private final Queue<Character> buffer;
            private boolean isValid;

            private CharIterator() throws IOException {
                urlStream = currentURL.openStream();
                isValid = true;
                buffer = new ArrayDeque<>();
            }

            @Override
            public boolean hasNext() {
                if (!buffer.isEmpty()) {
                    return true;
                }
                if (!isValid) {
                    return false;
                }
                try {
                    int c = urlStream.read(); // -1 signals end of stream
                    if (c == -1) {
                        markInvalid();
                        return false;
                    }
                    buffer.add((char) c);
                    return true;
                } catch (IOException ex) {
                    markInvalid();
                    return false;
                }
            }

            // Returns null at end of stream or after an I/O error.
            @Override
            public Character next() {
                return hasNext() ? buffer.remove() : null;
            }

            private void markInvalid() {
                isValid = false;
            }
        }

        // Reads the response line by line, skipping empty lines.
        class StringIterator implements Iterator<String> {
            private final CharIterator charPointer;
            private final Queue<String> buffer;
            private boolean isValid;

            private StringIterator() {
                charPointer = charIterator();
                isValid = charPointer != null;
                buffer = new ArrayDeque<>();
            }

            @Override
            public boolean hasNext() {
                if (!buffer.isEmpty()) {
                    return true;
                }
                String value = next();
                if (value == null) {
                    markInvalid();
                    return false;
                }
                buffer.add(value);
                return true;
            }

            // Returns null when no more lines are available.
            @Override
            public String next() {
                if (!buffer.isEmpty()) {
                    return buffer.remove();
                }
                if (!isValid) {
                    return null;
                }
                Character currentChar = charPointer.next();
                // Skip line terminators left over from the previous line.
                while (currentChar != null && (currentChar == '\n' || currentChar == '\r')) {
                    currentChar = charPointer.next();
                }
                if (currentChar == null) {
                    return null;
                }
                StringBuilder sb = new StringBuilder();
                // Collect characters until the next line terminator or end of stream.
                while (currentChar != null && currentChar != '\n' && currentChar != '\r') {
                    sb.append(currentChar.charValue());
                    currentChar = charPointer.next();
                }
                return sb.toString();
            }

            private void markInvalid() {
                isValid = false;
            }
        }
    }
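The parser class itself is not shown here. Purely as an illustration of how a line iterator like the one above might be consumed, this is a minimal sketch that pulls lines until it has found the requested count of numbers and then stops, leaving the rest of the page unread. The class FirstNumbers, the method take, and the marker string are hypothetical, and the sketch assumes each wanted number directly follows a known marker on its line.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical consumer of a line iterator; not the parser from this answer.
final class FirstNumbers {
    // Optional minus sign, digits, optional decimal part.
    private static final Pattern NUMBER = Pattern.compile("-?\\d+(?:\\.\\d+)?");

    // Returns up to `count` numbers found after `marker`, pulling no more lines than needed.
    static List<Double> take(Iterator<String> lines, String marker, int count) {
        List<Double> found = new ArrayList<>();
        while (found.size() < count && lines.hasNext()) {
            String line = lines.next();
            if (line == null) {
                break; // an iterator that returns null at end of stream
            }
            int idx = line.indexOf(marker);
            if (idx < 0) {
                continue; // not a line we care about
            }
            // Only look at the text after the marker, so digits in tags or ids are ignored.
            Matcher m = NUMBER.matcher(line);
            m.region(idx + marker.length(), line.length());
            if (m.find()) {
                found.add(Double.parseDouble(m.group()));
            }
        }
        return found;
    }
}
```

Something like FirstNumbers.take(helper.stringIterator(), "class=\"quote\">", 2) would then stop requesting lines once both values have been seen; the marker has to be adapted to the actual page.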
answered Nov 21 '18 at 12:05


Vlad Doronin