ProxyReaper

From Hackerspace Brussels
What:
proxyReaper
Tagline:
harvest proxies
Maintainer(s):
Jegeva

The Goal

This program's goal is to maintain a list of non-transparent proxies, i.e. proxies that don't give away your real address in headers such as Via or X-Forwarded-For.

The Story

The initial goal was to bypass sites that:

-prevent you from 'not participating' in their money scheme by filtering repeated use of their 'services' (rapidwhatever..., mainly hosting of user content) and requiring you to cough up some dough for a subscription... don't like that...

-prevent you from automating their services (like scraping a "pasteshare" before the powers that be want the paste erased)...

Well, you can also use it for privacy (your own ISP will see what's happening if you don't use HTTPS, and even then...)

This was largely inspired by https://github.com/xme/oplb (hey Xavier ;)), but I wanted some threading. I sniffed around Perl/POE, but the POE HTTP client didn't really support SOCKS out of the box. And I hadn't done C++ in a long time (funfunfun!).

The Code

https://github.com/Jegeva/proxyReaper

The Notes

Nota: valgrind says it loses some memory because of boost and openssl. If you have an idea, drop me a mail:

xor this "______~___~_____~___" and "5:5:)>P='3>82>63P<02" (yeah spam you know...)

The Prerequisites

libcurlpp-dev
libcurl-dev
libboost_regex-dev
libboost_iostreams-dev
libsqlite3-dev
libidn-dev

Caveats

It is compiled with

-Wl,-rpath '-Wl,$$ORIGIN'

meaning it searches for libproxyReaperlib.so in the folder the binary sits in (the local folder): THIS IS BAD MOJO! Update your LD_LIBRARY_PATH in a .sh wrapper instead!

This is a bad hack (security-wise, but who cares about that...). You've been warned...

The Usage

(valid for 0.1a)

This needs two external components:

1) a PHP script that dumps the HTTP headers, like this:


proxyReaper;
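<?php
// Dump the client address and the proxy-related request headers, one "KEY;value" per line.
// The "?? ''" keeps missing headers from polluting the output with PHP notices.
echo "IP;" . $_SERVER['REMOTE_ADDR'] . "\n";
echo "VIA;" . ($_SERVER['HTTP_VIA'] ?? '') . "\n";
echo "proxID;" . ($_SERVER['HTTP_X_PROXY_ID'] ?? '') . "\n";
echo "xff;" . ($_SERVER['HTTP_X_FORWARDED_FOR'] ?? '') . "\n";
echo "forw;" . ($_SERVER['HTTP_FORWARDED'] ?? '') . "\n";
?>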

I will work on a local server to replace this; I still have to sniff around libupnp...
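
To make the purpose of that dump a bit more concrete, here is a minimal C++ sketch of how its output could be judged on the client side. This is not taken from proxyReaper's source: the helper names (parseHeaderDump, looksNonTransparent) are made up for illustration, and the rule itself is an assumption (marker line present, no forwarding header set, and the reported IP is not your own).

```cpp
#include <initializer_list>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: turn the "KEY;value" lines of the dump into a map.
std::map<std::string, std::string> parseHeaderDump(const std::string& body) {
    std::map<std::string, std::string> fields;
    std::istringstream in(body);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        const std::string::size_type sep = line.find(';');
        if (sep == std::string::npos) continue;  // not a KEY;value line
        fields[line.substr(0, sep)] = line.substr(sep + 1);
    }
    return fields;
}

// Assumed rule (not confirmed by the project): a proxy looks non-transparent
// when the dump starts with the "proxyReaper;" marker, none of the forwarding
// headers is set, and the address the server saw is not our real one.
bool looksNonTransparent(const std::string& body, const std::string& realIp) {
    if (body.rfind("proxyReaper;", 0) != 0) return false;  // not our script answering
    const std::map<std::string, std::string> f = parseHeaderDump(body);
    for (const char* key : {"VIA", "proxID", "xff", "forw"}) {
        const auto it = f.find(key);
        if (it != f.end() && !it->second.empty()) return false;  // proxy announced itself
    }
    const auto ip = f.find("IP");
    return ip != f.end() && !ip->second.empty() && ip->second != realIp;
}
```

The marker check is there because a proxy that answers with its own page (captive portal, ad page...) would otherwise look perfectly clean.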


2) a script that outputs one proxy per line (as ip;port;type;), like the one below. (Yeah, xroxy sucks, they never update, but the guys who want you to pay for proxies will change their "protections" if some parsers are released, and I don't really want to get into an arms race; it's just here so you can test this on your own.) Ask me nicely at the space and I can provide some other scripts.

Take a look at Perl's HTML::Parser, WWW::Mechanize and WWW::Selenium, and play a bit with CSS/JS: if your browser can display it from a source, so can you...

finite state machines FTW...

#!/usr/bin/perl
# Print one proxy per line as "ip;port;type;" from the xroxy RSS feed.

use strict;
use warnings;

use LWP::UserAgent;
use XML::XPath;
use XML::XPath::XMLParser;

my $xroxyUrl = "http://www.xroxy.com/rss";
my $xroxyUA  = "Xroxy-Aggregator PHP v0.3";

my $ua = LWP::UserAgent->new;
$ua->timeout(30);
$ua->agent($xroxyUA);

my $response = $ua->get($xroxyUrl);
die "fetching $xroxyUrl failed: " . $response->status_line . "\n"
    unless $response->is_success;
parseXML($response->decoded_content);

sub parseXML {
    my $xmlContent = shift;
    return unless defined($xmlContent);
    # Strip the CDATA wrappers so the embedded <proxy> elements become real nodes.
    $xmlContent =~ s/<!\[CDATA\[//g;
    $xmlContent =~ s/\]\]>//g;
    my $xml   = XML::XPath->new(xml => $xmlContent);
    my $nodes = $xml->find('/rss/channel/item/description/proxy');
    foreach my $n ($nodes->get_nodelist) {
        my $string = $n->find('ip')->string_value . ";";
        $string .= $n->find('port')->string_value . ";";
        if ($n->find('type')->string_value =~ /Socks/) {
            # Socks proxies keep the type given by the feed, lowercased.
            $string .= lc($n->find('type')->string_value) . ";";
        } else {
            # Everything else is http, or https when the ssl flag is set.
            $string .= "http";
            $string .= "s" if $n->find('ssl')->string_value eq "true";
            $string .= ";";
        }
        print $string . "\n";
    }
}
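
The script above prints lines shaped like ip;port;type;. As a rough sketch of how the consuming side could read them (C++17 assumed; ProxyEntry, parseProxyLine and toProxyUrl are hypothetical names, not proxyReaper's actual code), something like this would do:

```cpp
#include <charconv>
#include <optional>
#include <sstream>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical record for one line of source-script output.
struct ProxyEntry {
    std::string ip;
    int port;
    std::string type;  // "http", "https" or a lowercased Socks type
};

// Parse one "ip;port;type;" line; returns nothing if the line is malformed.
std::optional<ProxyEntry> parseProxyLine(const std::string& line) {
    std::vector<std::string> parts;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, ';')) parts.push_back(field);
    if (parts.size() < 3 || parts[0].empty() || parts[2].empty()) return std::nullopt;

    // The port must be a plain number in the valid TCP range.
    int port = 0;
    const char* first = parts[1].data();
    const char* last = first + parts[1].size();
    const auto [ptr, ec] = std::from_chars(first, last, port);
    if (ec != std::errc() || ptr != last || port < 1 || port > 65535) return std::nullopt;

    return ProxyEntry{parts[0], port, parts[2]};
}

// Build the "scheme://host:port" form that libcurl accepts for CURLOPT_PROXY.
// Assumption: the type string is already a scheme libcurl knows
// (http, https, socks4, socks5...).
std::string toProxyUrl(const ProxyEntry& p) {
    return p.type + "://" + p.ip + ":" + std::to_string(p.port);
}
```

Rejecting malformed lines early is a deliberate choice here: the source scripts scrape third-party pages, so garbage in their output should be expected rather than trusted.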


For now the source script is statically defined in main.cpp; the next update will pick up the files in ~/.proxyReaper/sources/* that are executable.

This initial commit is a PoC, I accept patches ;)
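
Picking up the executables in a sources folder is not implemented yet, so the following is only a guess at how it might look with C++17's std::filesystem. The function name findExecutableSources is hypothetical, and the caller is assumed to pass the already expanded path (std::filesystem does not expand ~).

```cpp
#include <algorithm>
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: list the regular files in `dir` that have at least one
// execute bit set. A missing or unreadable directory yields an empty list.
std::vector<fs::path> findExecutableSources(const fs::path& dir) {
    std::vector<fs::path> sources;
    const fs::perms anyExec =
        fs::perms::owner_exec | fs::perms::group_exec | fs::perms::others_exec;

    std::error_code ec;
    fs::directory_iterator it(dir, ec);
    const fs::directory_iterator end;
    while (!ec && it != end) {
        std::error_code statEc;
        const fs::file_status st = fs::status(it->path(), statEc);
        if (!statEc && fs::is_regular_file(st) &&
            (st.permissions() & anyExec) != fs::perms::none) {
            sources.push_back(it->path());
        }
        it.increment(ec);  // non-throwing advance; an error ends the loop
    }
    std::sort(sources.begin(), sources.end());  // predictable run order
    return sources;
}
```

Note that this only looks at the permission bits; whether the current user can actually run a given file is only known once you try.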