How to bypass the Oracle ADF loopback script when scraping a website with the PHP cURL library?
I am scraping a website that uses an Oracle ADF loopback script, which keeps redirecting me back to my own page. How can I bypass it?
The following is my PHP code.
<?php
$url = 'https://www.mywebsite.com/faces/index.jspx';
$ch = curl_init($url);
// Store and resend cookies so the session survives across requests
curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__) . '/cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, dirname(__FILE__) . '/cookie.txt');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// CURLOPT_HTTPHEADER expects an array of header lines
$headers = array('User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36');
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$data = curl_exec($ch);
if (curl_errno($ch)) { // check for execution errors before closing the handle
    echo 'Scraper error: ' . curl_error($ch);
    curl_close($ch);
    exit;
}
curl_close($ch);
echo $data;
?>
When I run the above code, I am redirected to the same page,
and some query string parameters are added, like ?_afrLoop=39478247795404&_afrWindowMode=0&_afrWindowId=null
On the actual site, _afrWindowId has a random alphanumeric string, but I am getting null.
After stopping the page redirection manually, I get a page containing the following Oracle loopback script, which causes the redirection. How can I get past it?
Loopback script:
<html lang="el-GR"><head><script>
/*
** Copyright (c) 2008, Oracle and/or its affiliates. All rights reserved.
*/
/**
* This is the loopback script to process the url before the real page loads. It introduces
* a separate round trip. During this first roundtrip, we currently do two things:
* - check the url hash portion, this is for the PPR Navigation.
* - do the new window detection
* the above two are both controled by parameters in web.xml
*
* Since it's very lightweight, so the network latency is the only impact.
*
* here are the list of will-pass-in parameters (these will replace the param in this whole
* pattern:
* viewIdLength view Id length (characters),
* loopbackIdParam loopback Id param name,
* loopbackId loopback Id,
* loopbackIdParamMatchExpr loopback Id match expression,
* windowModeIdParam window mode param name,
* windowModeParamMatchExpr window mode match expression,
* clientWindowIdParam client window Id param name,
* clientWindowIdParamMatchExpr client window Id match expression,
* windowId window Id,
* initPageLaunch initPageLaunch,
* enableNewWindowDetect whether we want to enable new window detection
* jsessionId session Id that needs to be appended to the redirect URL
* enablePPRNav whether we want to enable PPR Navigation
*
*/
var id = null;
var query = null;
var href = document.location.href;
var hashIndex = href.indexOf("#");
var hash = null;
/* process the hash part of the url, split the url */
if (hashIndex > 0)
{
hash = href.substring(hashIndex + 1);
/* only analyze hash when pprNav is on (bug 8832771) */
if (false && hash && hash.length > 0)
{
hash = decodeURIComponent(hash);
if (hash.charAt(0) == "@")
{
query = hash.substring(1);
}
else
{
var state = hash.split("@");
id = state[0];
query = state[1];
}
}
href = href.substring(0, hashIndex);
}
/* process the query part */
var queryIndex = href.indexOf("?");
if (queryIndex > 0)
{
/* only when pprNav is on, we take in the query from the hash portion */
query = (query || (id && id.length>0))? query: href.substring(queryIndex);
href = href.substring(0, queryIndex);
}
var jsessionIndex = href.indexOf(';');
if (jsessionIndex > 0)
{
href = href.substring(0, jsessionIndex);
}
/* we will replace the viewId only when pprNav is turned on (bug 8832771) */
if (false)
{
if (id != null && id.length > 0)
{
href = href.substring(0, href.length - 11) + id;
}
}
var isSet = false;
if (query == null || query.length == 0)
{
query = "?";
}
else if (query.indexOf("_afrLoop=") >= 0)
{
isSet = true;
query = query.replace(/_afrLoop=[^&]*/, "_afrLoop=39279593944826");
}
else
{
query += "&";
}
if (!isSet)
{
query = query += "_afrLoop=39279593944826";
}
/* below is the new window detection logic */
var initWindowName = "_afr_init_"; // temporary window name set to a new window
var windowName = window.name;
// if the window name is "_afr_init_", treat it as redirect case of a new window
if ((true) && (!windowName || windowName==initWindowName ||
windowName!="null"))
{
/* append the _afrWindowMode param */
var windowMode;
if (true)
{
/* this is the initial page launch case,
also this could be that we couldn't detect the real windowId from the server side */
windowMode=0;
}
else if ((href.indexOf("/__ADFvDlg__") > 0) || (query.indexOf("__ADFvDlg__") >= 0))
{
/* this is the dialog case */
windowMode=1;
}
else
{
/* this is the ctrl-N case */
windowMode=2;
}
if (query.indexOf("_afrWindowMode=") >= 0)
{
query = query.replace(/_afrWindowMode=[^&]*/, "_afrWindowMode="+windowMode);
}
else
{
query = query += "&_afrWindowMode="+windowMode;
}
/* append the _afrWindowId param */
var clientWindowId;
/* in case we couldn't detect the windowId from the server side */
if (!windowName || windowName == initWindowName)
{
clientWindowId = "null";
// set window name to an initial name so we can figure out whether a page is loaded from
// cache when doing Ctrl+N with IE
window.name = initWindowName;
}
else
{
clientWindowId = windowName;
}
if (query.indexOf("_afrWindowId=") >= 0)
{
query = query.replace(/_afrWindowId=\w*/, "_afrWindowId="+clientWindowId);
}
else
{
query = query += "&_afrWindowId="+clientWindowId;
}
}
var sess = "";
if (sess.length > 0)
href += sess;
/* if pprNav is on, then the hash portion should have already been processed */
if ((false) || (hash == null))
document.location.replace(href + query);
else
document.location.replace(href + query + "#" + hash);
</script>
</head>
</html>
php curl web-scraping oracle-adf loopback
Would deactivating the loopback functionality on the ADF project work for you?
– MrAdibou
Jan 7 at 10:35
@MrAdibou I cannot deactivate it, because I am scraping another website that I don't own.
– Haritsinh Gohil
Jan 7 at 11:08
edited Jan 1 at 14:01 by Umar Abdullah
asked Jan 1 at 13:31 by Haritsinh Gohil
1 Answer
The right way to crawl ADF pages is to pass a parameter in the URL:
*domain.com*?org.apache.myfaces.trinidad.outputMode=webcrawler
Add it to every GET request the script makes. Keep in mind that in crawler mode the pages will look different, since the output is not meant for human consumption, but it should contain all the raw details you would want to crawl.
This is an old question and the OP may have long since moved on, but I am answering here to help anybody else who hits the same problem.
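Since the parameter is just part of the request URL, it can be sent from PHP cURL like any other query string. Below is a minimal sketch, not a tested recipe for any particular site: `withCrawlerMode` and `fetchAdfPage` are hypothetical helper names introduced here for illustration, and whether a given ADF server honours the parameter is an assumption to verify against the target site.

```php
<?php
// Hypothetical helper (not part of ADF or cURL): appends the crawler-mode
// parameter to a URL, keeping any existing query string and fragment.
function withCrawlerMode(string $url): string
{
    $param = 'org.apache.myfaces.trinidad.outputMode=webcrawler';
    // Split off a fragment, if any, so the parameter lands in the query part
    $fragment = '';
    $hashPos = strpos($url, '#');
    if ($hashPos !== false) {
        $fragment = substr($url, $hashPos);
        $url = substr($url, 0, $hashPos);
    }
    // Use '?' for the first parameter, '&' when a query string already exists
    $separator = (strpos($url, '?') === false) ? '?' : '&';
    return $url . $separator . $param . $fragment;
}

// Hypothetical helper: performs one GET request in crawler mode and returns
// the response body, throwing if cURL reports a failure.
function fetchAdfPage(string $url, string $cookieFile): string
{
    $ch = curl_init(withCrawlerMode($url));
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $data = curl_exec($ch);
    if ($data === false) {
        // Read the error message before the handle is closed
        $error = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException('Scraper error: ' . $error);
    }
    curl_close($ch);
    return $data;
}
```

A call such as `fetchAdfPage('https://www.mywebsite.com/faces/index.jspx', __DIR__ . '/cookie.txt')` would then request the page with the parameter appended. Keeping the URL rewriting in its own function makes it easy to apply to every GET request the scraper issues.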
Ashwin, I am using the PHP cURL library, so I cannot set the output mode as you describe; I think it can be set in ADF but not in PHP.
– Haritsinh Gohil
Mar 1 at 11:59
I am referring to the URL parameter that you make the request with.
– Ashwin Prabhu
Mar 1 at 15:05
edited Mar 1 at 15:06
answered Mar 1 at 9:20 by Ashwin Prabhu