ProxyReaper
The Goal
This program's goal is to maintain a list of non-transparent proxies.
The Story
The initial goal was to bypass sites that:
- prevent you from "not participating" in their money scheme: they filter repeated use of their "services" (rapidwhatever..., mainly hosting of user content) until you cough up some dough for a subscription... don't like that...
- prevent you from automating their services (like scraping "pasteshare" sites before the powers that be get the paste erased)...
Well, you can also use it for privacy (your own ISP can see what's happening if you don't use https, and even then...).
This was largely inspired by https://github.com/xme/oplb (hey Xavier ;)), but I wanted some threading. I sniffed around Perl/POE, but the POE HTTP client didn't really support SOCKS out of the box. And I hadn't done C++ in a long time (funfunfun!).
The Code
https://github.com/Jegeva/proxyReaper
The Notes
Note: valgrind reports some leaked memory, attributed to boost and openssl. If you have an idea, drop me a mail:
XOR this "______~___~_____~___" with "5:5:)>P='3>82>63P<02" (yeah, spam, you know...)
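For the lazy, a minimal C++ sketch of that byte-wise XOR. It assumes both strings have the same length (they do here, 20 characters each). The name xor_strings is made up for this example and is not part of proxyReaper.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// XOR two equal-length strings byte by byte and return the result.
// Hypothetical helper, only here to illustrate the note above.
std::string xor_strings(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) {
        throw std::invalid_argument("inputs must have the same length");
    }
    std::string out(a.size(), '\0');
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = static_cast<char>(a[i] ^ b[i]);
    }
    return out;
}
```

XOR is symmetric, so the order of the two arguments does not matter.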
The Prerequisites
libcurlpp-dev libcurl-dev libboost_regex-dev libboost_iostreams-dev libsqlite3-dev libidn-dev
Caveats
It is compiled with
-Wl,-rpath '-Wl,$$ORIGIN'
meaning the binary looks for libproxyReaperlib.so in its own directory: THIS IS BAD MOJO! Update your LD_LIBRARY_PATH in a wrapper .sh script instead.
This is a bad hack (security-wise, but who cares about that...). You've been warned...
The Usage
(valid for 0.1a)
This needs two external components:
1) a PHP script that dumps the HTTP headers, like this:
proxyReaper;
<?php
echo "IP;".$_SERVER['REMOTE_ADDR']."\n";
echo "VIA;".$_SERVER['HTTP_VIA']."\n";
echo "proxID;".$_SERVER['HTTP_X_PROXY_ID']."\n";
echo "xff;".$_SERVER['HTTP_X_FORWARDED_FOR']."\n";
echo "forw;".$_SERVER['HTTP_FORWARDED']."\n";
?>
I will work on a local server to replace this; I still have to sniff around libupnp...
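To give an idea of how the output of that script could be consumed on the C++ side, here is a rough sketch. It assumes the response body is the proxyReaper; marker line followed by KEY;value lines, exactly as echoed above. The names parse_header_dump and leaks_real_ip are hypothetical, and the leak rule is an assumption made for illustration, not necessarily what main.cpp does.

```cpp
#include <map>
#include <sstream>
#include <string>

// Parse the "KEY;value" lines echoed by the header-dump script into a map.
// Lines without a ';' are skipped; the "proxyReaper;" marker maps to "".
std::map<std::string, std::string> parse_header_dump(const std::string& body) {
    std::map<std::string, std::string> fields;
    std::istringstream in(body);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        const std::string::size_type sep = line.find(';');
        if (sep == std::string::npos) continue;
        fields[line.substr(0, sep)] = line.substr(sep + 1);
    }
    return fields;
}

// ASSUMPTION (not taken from the proxyReaper sources): a proxy is treated as
// leaking when the caller's real address shows up in any echoed field.
// This is a plain substring match, so it errs on the side of reporting a leak.
bool leaks_real_ip(const std::map<std::string, std::string>& fields,
                   const std::string& real_ip) {
    if (real_ip.empty()) return false;
    for (const auto& kv : fields) {
        if (kv.second.find(real_ip) != std::string::npos) return true;
    }
    return false;
}
```

Under that assumption, a proxy would be kept when the dump contains the marker and leaks_real_ip returns false for your own public address.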
2) a script that outputs 1 proxy per line, like the Perl script below:
(yeah, xroxy sucks, they never update, but the guys who want you to pay for proxies will change their "protections" if some parsers are released, and I don't really want to enter an arms race; this one is just here so you can test on your own. Ask me nicely at the space and I can provide some other scripts.)
Take a look at Perl's HTML::Parser, WWW::Mechanize and WWW::Selenium, and play a bit with css/js: if your browser can display it from a source, so can you...
Finite state machines FTW...
#! /usr/bin/perl
use LWP::UserAgent;
use XML::XPath;
use XML::XPath::XMLParser;

my $xroxyUrl = "http://www.xroxy.com/rss";
my $xroxyUA  = "Xroxy-Aggregator PHP v0.3";

my $ua = LWP::UserAgent->new;
$ua->timeout(30);
$ua->agent($xroxyUA);
my $response = $ua->get($xroxyUrl);
if ($response->is_success) {
    # print $response->decoded_content;
    parseXML($response->decoded_content);
}

# Prints one "ip;port;type;" line per proxy node found in the feed.
sub parseXML {
    my $xmlContent = shift;
    my $string;
    return unless defined($xmlContent);
    # Strip the CDATA wrappers so the embedded proxy nodes are parsed as XML.
    $xmlContent =~ s/\<\!\[CDATA\[//g;
    $xmlContent =~ s/\]\]\>//g;
    my $xml = XML::XPath->new(xml => $xmlContent);
    my $nodes = $xml->find('/rss/channel/item/description/proxy');
    foreach my $n ($nodes->get_nodelist) {
        $string  = $n->find('ip')->string_value . ";";
        $string .= $n->find('port')->string_value . ";";
        if ($n->find('type')->string_value =~ "Socks") {
            $string .= lc($n->find('type')->string_value) . ";";
        } else {
            $string .= "http";
            if ($n->find('ssl')->string_value eq "true") {
                $string .= "s";
            }
            $string .= ";";
        }
        print $string . "\n";
    }
}
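The Perl script prints lines shaped like ip;port;type; where type is http, https or a lowercased socks variant. A minimal sketch of reading one such line back in C++ could look like this. ProxyEntry and parse_proxy_line are hypothetical names chosen for this example, not the ones used in proxyReaper.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one line of the source script's output.
struct ProxyEntry {
    std::string host;
    unsigned short port = 0;
    std::string scheme;  // "http", "https" or a socks variant, as printed
};

// Parse "ip;port;type;" into `out`. Returns false on malformed input,
// in which case `out` is left untouched.
bool parse_proxy_line(const std::string& line, ProxyEntry& out) {
    std::vector<std::string> parts;
    std::istringstream in(line);
    std::string part;
    while (std::getline(in, part, ';')) parts.push_back(part);
    if (parts.size() < 3 || parts[0].empty() || parts[2].empty()) return false;

    // Accept only a decimal port in the range 1..65535.
    const std::string& digits = parts[1];
    if (digits.empty() || digits.size() > 5) return false;
    unsigned long port = 0;
    for (char c : digits) {
        if (c < '0' || c > '9') return false;
        port = port * 10 + static_cast<unsigned long>(c - '0');
    }
    if (port == 0 || port > 65535) return false;

    out.host = parts[0];
    out.port = static_cast<unsigned short>(port);
    out.scheme = parts[2];
    return true;
}
```

The host is kept as an opaque string on purpose: this sketch does not check that it is a valid address.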
For now the source script is statically defined in main.cpp; next update: use the files in ~/.proxyReaper/sources/* that are executable.
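That planned lookup is not implemented yet, so the following is only a sketch of one way it might be done on a POSIX system, using opendir, stat and access. The function name executable_sources is hypothetical. The directory has to be passed with ~ already expanded (for instance from the HOME environment variable), since opendir does not do shell expansion.

```cpp
#include <dirent.h>
#include <sys/stat.h>
#include <unistd.h>

#include <algorithm>
#include <string>
#include <vector>

// Return the sorted paths of the regular files in `dir` that the current
// user may execute. A missing or unreadable directory yields an empty list.
std::vector<std::string> executable_sources(const std::string& dir) {
    std::vector<std::string> found;
    DIR* handle = opendir(dir.c_str());
    if (handle == nullptr) return found;
    while (const dirent* entry = readdir(handle)) {
        const std::string path = dir + "/" + entry->d_name;
        struct stat info;
        if (stat(path.c_str(), &info) != 0) continue;
        if (!S_ISREG(info.st_mode)) continue;  // skips ".", ".." and subdirectories
        if (access(path.c_str(), X_OK) != 0) continue;
        found.push_back(path);
    }
    closedir(handle);
    std::sort(found.begin(), found.end());
    return found;
}
```

Each returned path could then be run and its output read line by line, the same way the statically defined script is used today.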
This initial commit is a PoC; I accept patches ;)