========================================================================

CGIProxy 2.2.4  (released January 26, 2019)

HTTP/FTP Proxy in a CGI Script

(c) 1996, 1998-2018 by James Marshall, james@jmarshall.com
All rights reserved.  Free for non-commercial use; commercial use
requires a license.

For the latest, see http://www.jmarshall.com/tools/cgiproxy/

========================================================================

This README contains:

  1. INTRODUCTION
  2. LEGAL DISCLAIMER
  3. INSTALLATION
  4. USAGE
  5. HELP IMPROVE THIS PROXY BY TELLING ME
  6. LIMITS AND BUGS

  7. OPTIONS
  8. CHANGES

------------------------------------------------------------------------
1. INTRODUCTION:

This CGI (or other) script acts as an HTTP, HTTPS, or FTP proxy.  Through it,
you can retrieve any resource that is accessible from the server this runs on.
This is useful when your own access is limited, but you can reach a server
that in turn can reach others that you can't.  By default, no user info
(except browser type) is sent to the target server, so you can set up your
own anonymous proxy.

Whenever an HTML resource is retrieved, it's modified so that all links
in it point back through the same proxy, including images, form submissions,
and everything else.  JavaScript and Flash apps are similarly "proxified".
Once you're using the proxy, you can browse normally and (almost) forget
it's there.

CGIProxy can run in four ways: as a CGI script, as a mod_perl script, as a
FastCGI script, or with its own embedded secure HTTP server.  Configurable
options include text-only proxying (to save bandwidth), selective cookie and
script removal, simple ad filtering, access restriction by server, custom
encoding of target URLs and cookies, and more-- there are more than 70 options
so far (see the complete list of configuration options below).  It requires
Perl 5.6.1 or later.

The original seed for this was a program I wrote for Rich Morin's
article in the June 1996 issue of Unix Review, online at
http://www.cfcl.com/tin/P/199606.shtml .

IMPORTANT NOTE ABOUT ANONYMOUS BROWSING:
CGIProxy was originally made for indirect browsing more than anonymity,
but since people are using it for anonymity, I've tried to make it as
anonymous as possible.  Suggestions welcome.  For best anonymity, browse
with JavaScript turned off, or configure CGIProxy to remove script
content (see the options below).   Also, be aware that Flash or other
executable content may be able to reveal you to a server, though CGIProxy
tries to prevent that too.  Please tell me if you find any way that
anonymity can be compromised when using CGIProxy.

------------------------------------------------------------------------
2. LEGAL DISCLAIMER:

Censorship is a controversial subject, and some governments and companies
have rules about what information you should have access to.  If you use
my software to bypass rules that have been imposed on you, you assume all
legal risks and responsibilities involved.  I'm providing the software as
a demonstration and teaching tool, and for when legitimate access is
needed to non-accessible servers.  I won't encourage you to break any
rules, because I would get in trouble if I did.  I can't prevent you from
using this software in illegitimate ways, but I believe the value of it in
its many uses is far too great to let a few miscreants ruin it for
everybody.

------------------------------------------------------------------------
3. INSTALLATION:

*** As of version 2.2.1, CGIProxy now uses an installation wizard, so these
*** instructions have been completely rewritten.  Please tell me about any
*** installation problems you have on any platform, so I can fix them!

*** Starting in version 2.2.3, you can easily upgrade CGIProxy in place with
*** the "upgrade" command-- see UPGRADING below.

You need to install CGIProxy on a machine that's outside of any censoring
firewall, but that is still accessible to people behind the firewall. You do
NOT need to install anything on the browsing machine(s). Once CGIProxy is
installed on a machine, any number of people can use it, if they know its URL.

Subsections within this INSTALLATION section include:

    PREREQUISITES
        AS A CGI SCRIPT
        AS A MOD_PERL SCRIPT
        AS A FASTCGI SCRIPT
        USING CGIPROXY'S EMBEDDED SERVER
    INSTALLING THE CGIPROXY SCRIPT
    UPGRADING CGIPROXY FROM A PREVIOUS VERSION
    UNINSTALLING CGIPROXY
    AFTER YOU INSTALL
    TROUBLESHOOTING
        "BAD REQUEST" ERRORS



PREREQUISITES:

Now's a great time to review security when using CGIProxy, at
http://www.jmarshall.com/tools/cgiproxy/security.html

CGIProxy requires Perl and OpenSSL, but those are already installed on almost
all Unix servers. If you're using a Windows server, install both of these.

There are four ways CGIProxy can run: as a CGI script, as a mod_perl script,
as a FastCGI script, or with its own embedded server. As a CGI script is the
slowest to run, and as a mod_perl or FastCGI script are the fastest to run.
Depending on which of these ways you choose, your Web server needs to be
configured to support it, as detailed in the following sections.

IF YOU'RE INSTALLING ON A CENTOS SERVER:
CentOS doesn't include all standard Perl modules by default, so install those
first by running "yum install epel-release" as root. You can't even run the
installation wizard without first doing this. In addition, the CPAN modules
required by CGIProxy must be installed as root on CentOS, so if the
installation wizard isn't run as root, you'll need to install those modules
separately from the wizard. The wizard will tell you how to do this.

IF YOU'RE INSTALLING ON A WINDOWS SERVER:
The commands below that run nph-proxy.cgi are for Unix, but are almost the same
for the Windows command line. In Windows, make two changes: start a command
with "perl", and replace "/" with "\". You can also remove an initial "./". For
example, instead of running "./nph-proxy.cgi command", run "perl nph-proxy.cgi
command".


AS A CGI SCRIPT:

Your Web server has to be configured to support CGI scripts such that any
executable file ending in ".cgi" is treated as a CGI script. Your webmaster
normally does this. For Apache, you need something like this in the section of
httpd.conf that controls the directory where nph-proxy.cgi will be (normally a
<Directory> or <Files> block):

    AddHandler cgi-script .cgi
    Options +ExecCGI

Apache has all the details at https://httpd.apache.org/docs/2.4/howto/cgi.html


AS A MOD_PERL SCRIPT:

mod_perl is an optional feature of Apache servers. With mod_perl, CGIProxy can
run much faster than as a CGI script. Your webmaster normally configures
mod_perl. If you're configuring your own mod_perl, you can use this block in
your httpd.conf to make all scripts ending in ".pl" use mod_perl:

<Files *.pl>
    Options +ExecCGI
    SetHandler perl-script
    PerlResponseHandler ModPerl::Registry
    PerlOptions +ParseHeaders
    PerlSendHeader Off
    <Files nph-*>
        PerlOptions -ParseHeaders
    </Files>
</Files>

If you know how to configure Apache, you can tweak this as desired. If you use
the block above, then rename nph-proxy.cgi to nph-proxy.pl .


AS A FASTCGI SCRIPT:

FastCGI scripts are similar to CGI scripts in some ways, but run faster than
CGI scripts. FastCGI can be used with most Web servers, including those that
don't support CGI scripts, such as nginx. A FastCGI script runs as multiple
managed processes on a server, and is called by the Web server process.

After installing CGIProxy, configure your Web server to use CGIProxy as a
FastCGI script, using your settings of $SECRET_PATH and $FCGI_SOCKET; you can
view these settings in either the configuration menu or the file cgiproxy.conf
. Examples of how to configure the nginx and Apache servers are just below.

To start the FastCGI process, run the command "cgiproxy/bin/nph-proxy.cgi
start-fcgi" from your home directory.

To stop the FastCGI process in Unix, run the command "killall nph-proxy.cgi".


Configuring nginx with FastCGI:

Include a section like this in your nginx.conf, within the secure server
section:

location /secret/ {
    fastcgi_pass             localhost:8002;
    fastcgi_split_path_info  ^(/secret)(/?.*)$;
    fastcgi_param            SCRIPT_NAME $fastcgi_script_name;
    fastcgi_param            PATH_INFO $fastcgi_path_info;
    include                  fastcgi.conf;      # if not included elsewhere
}

The first three lines need to match your configuration in nph-proxy.cgi :
replace "secret" (in two places!) with your setting of $SECRET_PATH, and
replace "8002" with your setting of $FCGI_SOCKET if you changed it from the
default.


Configuring Apache with FastCGI:

Include a line like this in your httpd.conf:

FastCgiExternalServer /var/www/html/secret -host localhost:8002

Replace "/var/www/html/secret" with your setting of $SECRET_PATH appended to
your DocumentRoot, and replace "8002" with your setting of $FCGI_SOCKET if you
changed it from the default.


USING CGIPROXY'S EMBEDDED SERVER:

CGIProxy includes an embedded Web server, so you don't need an external Web
server to run it. However, you will need a key pair. (All secure servers need
this, but external servers usually have a key pair already installed by your
hosting provider.) A key pair consists of a certificate (visible to the public)
and a private key (which you must keep secret). You can get a key pair for your
server from a certificate authority (CA). Most certificate authorities charge
an annual fee for a key pair, but Let's Encrypt (https://letsencrypt.org)
provides them for free. They also provide the Certbot tool
(https://certbot.eff.org/) to automate the process of managing your
certificate. After you install CGIProxy, you'll need to copy the certificate
and private key files into the "cgiproxy" directory under your home directory.

After installing CGIProxy, start the embedded server by running the command
"cgiproxy/bin/nph-proxy.cgi start-server" from your home directory. After it
starts, it will tell you the URL to access the proxy with (including the actual
port number), and the process ID of the running server.

To stop the embedded server in Unix, run the command "killall nph-proxy.cgi".



INSTALLING CGIPROXY:

*** As of version 2.2.3, the .tar.gz distribution file (the "tarball") has a
different structure than before, so there is one more step (described in step
3 below) to extract the nph-proxy.cgi file.  Alternately, you can directly
download the new script file from
https://jmarshall.com/tools/cgiproxy/nph-proxy.txt, rename it to
"nph-proxy.cgi", and skip step 3 below.

Once the prerequisites are in place, then to install CGIProxy:

    1) Upload the distribution file (the file ending in ".tar.gz") to your Web
       server.
    2) Log in to a shell account on the server.
    3) Unpack the distribution:  On Unix, run the commands:
           tar xzvf cgiproxy.*.tar.gz
           tar xzvf cgiproxy-inner.*.tar.gz
       From that you have two files, nph-proxy.cgi and README.
       README .  nph-proxy.cgi is the program you'll be installing and running.
       That's it-- just one file.
    4) If you want to rename nph-proxy.cgi (see below), do it now.
    5) Run the command "./nph-proxy.cgi install". This does almost everything
       required to install CGIProxy. It runs a simple installation wizard that
       asks you a few questions. Ideally you can run this as root, but it
       should work even if you don't have root access.
    6) If using the embedded server:  Copy both files of your key pair
       (described in "Prerequisites" above) into the "cgiproxy" directory under
       your home directory.  The filenames should match the configuration
       variables $CERTIFICATE_FILE and $PRIVATE_KEY_FILE, which are
       "plain-cert.pem" and "plain-key.pem" unless you changed them.
    7) If installing on a Windows server:  Add a daily or hourly task to purge
       the database in the Task Scheduler, using the command
       "perl \path\to\script\nph-proxy.cgi" (replace with the correct path) and the
       parameter "purge-db".  If you're allowing us to collect simple usage
       statistics (i.e. if $REPORT_USAGE is true), add two more tasks to the
       Task Scheduler:
           1) Every week or month, run "perl \path\to\script\nph-proxy.cgi"
              with the paramter "report-usage".
           2) At midnight at the beginning of every month, run the same command
              with the parameter "count-unique-users".

As part of the wizard, you will see a simple menu that lets you modify any
configuration variables in CGIProxy. (You probably don't need to change
anything.) When you're finished, enter "0" to save your settings and exit the
configuration menu. After installing CGIProxy, you can always run the
configuration menu again by running "./nph-proxy.cgi config".

After the configuration, the wizard will install all the Perl CPAN modules
needed by CGIProxy.  To *only* install these modules, and not do any other
installation tasks, run "./nph-proxy.cgi install-modules".

You can rename nph-proxy.cgi if you want. However, if you're installing it as a
CGI script or under mod_perl, be sure the name still starts with "nph-". (You
can change that requirement for mod_perl, by reconfiguring mod_perl.)



UPGRADING CGIPROXY FROM A PREVIOUS VERSION

To upgrade CGIProxy *from* version 2.2.3 or later, simply run the command
"./nph-proxy.cgi upgrade".  If you want to run this automatically, put it in a
cron job that runs weekly or monthly.  This command only upgrades if there is
a newer version of CGIProxy than what you are currently running.

To upgrade CGIProxy from version 2.2.2 or earlier, the easiest way is to
uninstall CGIProxy and then install the new version, with these steps:

    1) First, get the new version of the script from
       https://jmarshall.com/tools/cgiproxy/nph-proxy.txt , and rename it to
       "nph-proxy.cgi".
    2) If you want to save your previous configuration settings, copy the
       file cgiproxy/cgiproxy.conf to somewhere *outside* of your $PROXY_DIR
       (which is usually "cgiproxy/" under your home directory).
    3) Run "./nph-proxy.cgi uninstall".
    4) Run "./nph-proxy.cgi install".  If you saved your old cgiproxy.conf
       in step 2, then run "./nph-proxy.cgi install --old-config cgiproxy.conf"
       instead.

When running the "upgrade" command, CGIProxy verifies the code signature of
the inner .tar.gz file to ensure it's the same as the original .tar.gz file
from jmarshall.com .



UNINSTALLING CGIPROXY

To uninstall CGIProxy 2.2.3 or later, run "./nph-proxy.cgi uninstall".

To uninstall CGIProxy 2.2.2 or earlier, first get the new version of the
script from https://jmarshall.com/tools/cgiproxy/nph-proxy.txt , and rename
it to "nph-proxy.cgi".  Then run this command with the new copy you just
downloaded:  "./nph-proxy.cgi uninstall".  If it can't figure out exactly
where the earlier version is installed, it will ask you for the complete
path to the old nph-proxy.cgi .

Note that this does *not* remove all traces of CGIProxy-- here are traces that
remain that you may want to delete:

    - If you're using a database other than SQLite, the "cgiproxy" database
      will still be there, along with any special user created for CGIProxy.
    - The installation wizard may have installed various Perl modules, either
      in the system location (if run as root), or under the "perl5/"
      directory under your home directory (if not run as root).
    - You may have configured your web server in order to run CGIProxy.
    - If you're using a Windows server, you will need to remove any related
      tasks from the Task Scheduler. 



AFTER YOU INSTALL:

Please review the security guidelines before doing anything further, at
http://www.jmarshall.com/tools/cgiproxy/security.html

If you want your proxy to be usable by other people, you need to communicate
the proxy URL to them.  Try to use a secure method to do so, or else the proxy
could easily be discovered by censors and blocked, or worse.

IN PARTICULAR, DON'T BE TEMPTED TO POST YOUR PROXY URL TO ANY PUBLIC SITES THAT
LIST AVAILABLE PROXIES.  IF YOU DO, THE CENSORS WILL QUICKLY SEE YOUR PROXY AND
BLOCK IT, OR THERE MAY BE EVEN WORSE CONSEQUENCES.  THESE PROXY-LISTING SITES
MAY BE USEFUL FOR ANONYMITY, BUT FOR GETTING AROUND CENSORSHIP THEY ARE
DANGEROUS!  THEY MAKE THE CENSORS' JOB EASIER.

IN ADDITION, PEOPLE SHOULD ONLY BE USING PROXIES INSTALLED BY PEOPLE OR
ORGANIZATIONS THAT THEY TRUST.

If heavy use of this proxy puts too much load on your server, see "NOTES ON
PERFORMANCE" near the top of the source code. The biggest improvement comes
from running this under mod_perl or FastCGI.



TROUBLESHOOTING:

"Bad Request" errors

These happen when you accumulate too many cookies for the Web server to handle.
To fix it, use a database with CGIProxy. The easiest database to use is
SQLite-- to use it, just set $DB_DRIVER to "SQLite". To use a running database
engine like MariaDB/MySQL or Oracle, you must also create a database account
for CGIProxy to use, and set $DB_USER and $DB_PASS.

IMPORTANT: Whenever you use a database, it will accumulate old data that you
need to purge periodically for both performance and security reasons. To purge
the database, run "./nph-proxy.cgi purge-db" regularly. On a Unix server, do
this automatically with a cron job . The CGIProxy installation wizard normally
adds this cron job for you, but if it doesn't, then you can do it yourself: run
"crontab -e" to edit your list of cron jobs, and add a line like this to purge
the database every morning at e.g. 2:13am:

    13 2 * * * /path/to/script/nph-proxy.cgi purge-db

Replace "/path/to/script/" with the full path to nph-proxy.cgi . If you'd
rather purge the database every hour, change the "2" above to "*".

If you ran "./nph-proxy.cgi install" as root, then do this as root too, but
instead of running "crontab -e", run "crontab -e -u $RUN_AS_USER", where
$RUN_AS_USER is that username from the CGIProxy configuration. This modifies
the list of cron jobs belonging to $RUN_AS_USER .

On a Windows server, use the Task Scheduler to do this instead of a cron job.

------------------------------------------------------------------------
4. USAGE:

Call the script directly to start a browsing session.  Once you've gotten
a page through the proxy, everything it links to will automatically go
through the proxy.  You can bookmark pages you browse to, and your
bookmarks will go through the proxy as they did the first time.

------------------------------------------------------------------------
5. HELP IMPROVE THIS PROXY BY TELLING ME:

1) Any kind of links in HTML, JavaScript, CSS, or Flash that aren't being
converted, including non-standard tags or links.

2) Any method of introducing JavaScript or other executable content that's not
being filtered out.

3) Any file types that contain links that need to be converted, other than
what's handled now (I know about PDFs).

4) Any other ways you can find to compromise anonymity.


Please verify you're using the latest version of CGIProxy before emailing me.

------------------------------------------------------------------------
6. LIMITS AND BUGS:

ANONYMITY MAY NOT BE PERFECT!!  In particular, there may be some holes where
unproxified JavaScript or Flash content can slip through.  If you find any,
please tell me.  For best anonymity, turn JavaScript and Flash off in your
browser (best), and/or configure CGIProxy to remove scripts.

Ideally, use a database to store cookies.  If you can't, then if you browse
to many sites with cookies, CGIProxy may drop some, causing some sites to not
work.  If this happens, delete some or all of your existing cookies (via the
"Manage Cookies" screen) and try again.

I didn't follow the spec on HTTP proxies, and there are violations of the
protocol.  Actually, this whole concept is a violation of the proxy model,
so I'm not too worried.  If any protocol violations cause you problems,
please let me know.

Only HTTP/HTTPS and FTP are supported so far.

========================================================================

7. OPTIONS:

Here's a list of all the configuration options in CGIProxy, sorted into rough
categories.  The default settings are in square [] brackets, and should work
fine for almost all situations, though see the section "OPTIONS RELATED TO YOUR
SERVER/NETWORK ENVIRONMENT" to be sure.  Also, see the sections "FASTCGI
CONFIGURATION OPTIONS", "EMBEDDED SERVER CONFIGURATION OPTIONS", and "DATABASE
CONFIGURATION OPTIONS" if needed.  For more information on any option, see
the comments in the source code where it is set, in the user configuration section.

To configure CGIProxy after installing it, run "./nph-proxy.cgi config".


OPTIONS RELATED TO YOUR SERVER/NETWORK ENVIRONMENT:
----------------------------------------------------

$PROXY_DIR ["cgiproxy"]
    The directory on the server where the program can place files.  A relative
    path will be resolved relative to the home directory of the script owner.

$RUN_AS_USER, $RUN_AS_GROUP [none]
    User ID and group to run as, in case you need to start the embedded server
    as root.  Also used to indicate Web server's user ID and group when running
    as a CGI or mod_perl script.

$SECRET_PATH [randomly-generated]
    If using FastCGI or the embedded server, set this to an alphanumeric
    sequence of characters that is hard to guess.  It will be part of the
    URL of your proxy.

$LOCAL_LIB_DIR ['perl5']
    The directory used by the local::lib module to install Perl (CPAN) modules.
    Only needed if local::lib is used, i.e. if installing Perl modules not as
    root.

To enable access to secure servers:
    Install the separate packages OpenSSL and Net::SSLeay.  Net::SSLeay is
    automatically installed by running "./nph-proxy.cgi install-modules".

To enable compressed (gzip'd) content:
    If you're not using Perl 5.9.4 or later, then install the IO::Compress::Gzip
    Perl module.  It is automatically installed by running "./nph-proxy.cgi install-modules".

$RUNNING_ON_SSL_SERVER ['']
    Set this if the script is running on an SSL server (i.e. accessed with
    an "https://" URL).  Or, the default value of '' means to guess based on
    the server port:  the script assumes SSL unless the server port is 80.

$NOT_RUNNING_AS_NPH [0]
    Set this if the script is not running as an NPH script (not recommended;
    see comments for possible dangers).

$HTTP_PROXY, $SSL_PROXY, $NO_PROXY [none]
    If this script has to use an HTTP proxy (like a firewall), then set
    $HTTP_PROXY to that proxy's host (and port if needed).  Set $SSL_PROXY
    similarly when using an SSL proxy.  $NO_PROXY is a comma-separated list
    of servers or domains that should be accessed directly, i.e. NOT through
    the proxies in $HTTP_PROXY and $SSL_PROXY.  Also see $USE_PASSIVE_FTP_MODE
    below when using a firewall.

$PROXY_AUTH, $SSL_PROXY_AUTH [none]
    If either or both of the proxies in $HTTP_PROXY and $SSL_PROXY require
    authentication, then set these two variables respectively to the required
    credentials.

$SOCKS_PROXY [none]
    To use a SOCKS 5 proxy (like Tor), set this to the SOCKS 5 proxy's host and
    port, or just port number to default to localhost.  It's strongly
    recommended to only use a SOCKS 5 proxy on the same server as CGIProxy, as
    all data is passed in the clear between them.  As of version 2.2.2, you no
    longer need to change @BANNED_NETWORKS to use a SOCKS proxy on localhost.

$SOCKS_USERNAME, $SOCKS_PASSWORD [none]
    If your SOCKS 5 proxy uses username/password authentication, set these.

$USER_FACING_PORT [none]
    If the users see a different port externally than CGIProxy sees internally,
    set this to the port the users see.

$REPORT_USAGE [no default]
    If set to 1 (true), allow us to collect simple usage metrics:  hit count,
    total traffic in bytes, and unique users per month.  All are sorted by
    language used.  We do NOT collect any URLs, IP addresses, or any other
    private information.  This helps us to develop CGIProxy, in multiple ways.
    Note that if this is set, it needs two automatic jobs to be set (cron jobs
    on Unix):  1) "nph-proxy.cgi report-usage" every week or month, and
    2) "nph-proxy.cgi count-unique-users" run at midnight on the first of
    every month.  (The full path to nph-proxy.cgi has to be included in the
    commands.)  On Unix and Mac, the installation wizard automatically adds
    these cron jobs as needed.


FASTCGI CONFIGURATION OPTIONS:
------------------------------

$FCGI_SOCKET [8002]
    The local port to listen on for FastCGI communication.

$FCGI_NUM_PROCESSES [100]
    Number of FastCGI processes to maintain (same as "-n" command-line parameter).

$FCGI_MAX_REQUESTS_PER_PROCESS [1000]
    How many HTTP requests each FastCGI process should handle before being
    restarted (same as "-m" command-line parameter).


EMBEDDED SERVER CONFIGURATION OPTIONS:
--------------------------------------

$CERTIFICATE_FILE ['plain-cert.pem'], $PRIVATE_KEY_FILE ['plain-key.pem']
    Filenames of your certificate (public key) and private key, in PEM format
    and in $PROXY_DIR.

$EMB_USERNAME, $EMB_PASSWORD [none]
    Optional username and password to protect your proxy with.


DATABASE CONFIGURATION OPTIONS:
-------------------------------

$DB_DRIVER ['SQLite']
    If using a database, set this to "SQLite", "MySQL", or "Oracle".
    Leave it commented out if not using a database.

$DB_SERVER [none]
    If your database isn't running on the same server as CGIProxy, or isn't on
    the default port, then set this to the database server and port in the form
    "db_host:db_port".

$DB_NAME ['cgiproxy']
    Name of the database the program can use.

$DB_USER, $DB_PASS [none]
    CGIProxy accesses the database with this username and password.

$USE_DB_FOR_COOKIES [1]
    Set this to true to use the server-side database to store cookies.  If false,
    will use the old method of storing cookies in the browser and sending all
    cookies with each request, which may result in "Bad Request" errors.


COMMON CONFIGURATION OPTIONS:
---------------------------

$TEXT_ONLY [0]
    Allow only text resources through the proxy, to save bandwidth.

$REMOVE_COOKIES [0]
    Ban all cookies to or from all servers.  To allow and ban cookies by
    specific servers, see @ALLOWED_COOKIE_SERVERS and @BANNED_COOKIE_SERVERS.

$REMOVE_SCRIPTS [0]
    Prevent any script content from any server from reaching the browser.
    This includes script statements within HTML pages, external script
    files, etc.  To allow and ban script content by specific servers,
    see @ALLOWED_SCRIPT_SERVERS and @BANNED_SCRIPT_SERVERS.  Anonymity is
    unreliable if you don't either remove scripts, or browse with scripts 
    turned off in your browser.

$FILTER_ADS [0]
    Remove ads from pages, based on the patterns in @BANNED_IMAGE_URL_PATTERNS.
    Also ban ad-related cookies by setting $NO_COOKIE_WITH_IMAGE.

$HIDE_REFERER [0]
    Don't tell servers which link you followed to get to their page.  (Yes,
    it's misspelled on purpose.)

$INSERT_ENTRY_FORM [1]
    At the top of every page, include a small form that lets you enter
    a new URL, change your options, or manage your cookies.

$ALLOW_USER_CONFIG [1]
    Let users set their own $REMOVE_COOKIES, $REMOVE_SCRIPTS, $FILTER_ADS,
    $HIDE_REFERER, and $INSERT_ENTRY_FORM, via checkboxes on the entry form.

sub proxy_encode {}, proxy_decode {}
    (Requires minor programming.)  You can customize the encoding of
    destination URLs by modifying these routines.  The default is a simple
    unobscured URL, but sample obscuring code is included in the comments.
    Note: If you're not removing scripts, then you also need to change
    _proxy_jslib_proxy_encode() and _proxy_jslib_proxy_decode()-- see the
    comments.

sub cookie_encode {}, cookie_decode {}
    (Requires minor programming.)  You can customize the encoding of cookies
    sent to the user's machine by modifying these routines.  The default is a
    simple unobscured cookie, but sample obscuring code is included in the
    comments.
    Note: If you're not removing scripts, then you also need to change
    _proxy_jslib_cookie_encode() and _proxy_jslib_cookie_decode()-- see the
    comments.

@ALLOWED_SERVERS, @BANNED_SERVERS  [empty]
    Allow or ban specific servers from being accessed through the proxy, based
    on their hostname.  Each array is a list of patterns (regular expressions)
    to match, not just single servers.

@BANNED_NETWORKS  [('127', '192.168', '172.16-31', '10', '169.254', '244.0.0')]
    Ban specific IP addresses or networks from being accessed through the
    proxy.  Recommended for security when this script is run on a firewall.
    As of version 2.2.2, a SOCKS proxy is allowed on localhost regardless of
    this setting.

@ALLOWED_COOKIE_SERVERS, @BANNED_COOKIE_SERVERS  [empty]
    Allow or ban cookies from specific servers.  Each array is a list of
    patterns (regular expressions) to match, not just single servers.

@ALLOWED_SCRIPT_SERVERS, @BANNED_SCRIPT_SERVERS  [empty]
    Allow or ban script content from specific servers.  Each array is a list
    of patterns (regular expressions) to match, not just single servers.

@BANNED_IMAGE_URL_PATTERNS  [sample list in source code]
    If $FILTER_ADS is set, then ban images that match any pattern in this list.

$RETURN_EMPTY_GIF [0]
    If an image is banned, then replace it with a 1x1 transparent GIF to
    show blank space instead of a broken image icon.

$NO_COOKIE_WITH_IMAGE [0]
    Ban all cookies that come with images or other non-text resources.  Those
    are usually just Web bugs, to track you for marketing purposes.

$QUIETLY_EXIT_PROXY_SESSION [0]
    (NOT for use with anonymous browsing!!!)  For VPN-like installations, let
    the user browse directly from proxied pages to unproxied pages, with no
    intermediate warning screens.  See the comments for more info.

$PROXIFY_SCRIPTS [1]
    Proxify all supported script content.  Currently, only JavaScript is
    supported.

$PROXIFY_SWF [1]
    Support Flash apps, i.e. reroute all network accesses in them back through
    this program.

$ENCODE_URL_INPUT [1]
    When submitting a URL through either the start form or the top form,
    encode it first by using proxy_encode().

$USER_IP_ADDRESS_TEST ['']
    This lets you call an external test to authorize the user.  See comments
    for more details.

$DESTINATION_SERVER_TEST ['']
    This lets you call an external test to determine if the destination
    server is allowed (as opposed to using @ALLOWED_SERVERS and
    @BANNED_SERVERS).  See comments for more details.

%REDIRECTS [none]
    This lets you automatically redirect a URL to another URL, perhaps to one
    that works better through CGIProxy, such as a mobile site.


INSERTING A STANDARD HEADER INTO EACH PAGE:
-------------------------------------------

$INSERT_HTML  [none]
    Insert your own block of HTML into the top of every page.

$INSERT_FILE  [none]
    Insert the contents of the named file into the top of every page.  Can't
    be used with $INSERT_HTML.

$ANONYMIZE_INSERTION [1]
    If $INSERT_HTML or $INSERT_FILE is used, then anonymize that HTML along
    with the rest of the page.

$FORM_AFTER_INSERTION [0]
    If $INSERT_HTML or $INSERT_FILE is used, and $INSERT_ENTRY_FORM is set,
    then put the URL entry form after the inserted HTML instead of before it.


MINOR OR SELDOM-USED OPTIONS:
-----------------------------

@PROXY_GROUP  [empty]
    This is an experimental feature which may help with load balancing, or
    may have other creative uses.  Cookies won't work if you use this.
    See the comments for further info.

$SESSION_COOKIES_ONLY [0]
    Force all cookies to expire when the current browser closes.

$MINIMIZE_CACHING [0]
    Try to prevent the user's browser from caching, i.e. from storing anything
    locally.  Better privacy, but consumes more bandwidth and seems slower.

$USER_AGENT  [none]
    Tell servers you're using this browser instead of what you're really using.

@TRANSMIT_HTML_IN_PARTS_URLS  [empty]
    Transmit each part of certain HTML pages back to the user as they are
    received, rather than wait for the whole page.  This is set to a list of
    patterns that match URLs for which you want this treatment.

$USE_PASSIVE_FTP_MODE [1]
    When doing FTP transfers, use "passive mode" instead of "non-passive mode".
    Passive mode tends to work better when this script runs behind a firewall,
    but that varies by network.

$SHOW_FTP_WELCOME [1]
    When showing FTP directories, always display the FTP welcome message,
    instead of never displaying it.

$PROXIFY_COMMENTS [0]
    Proxify the inside of HTML comments as if it's not inside comments.

$USE_POST_ON_START [1]
    Use POST instead of GET when submitting the URL entry form.

$REMOVE_TITLES [0]
    Remove titles from HTML pages.

$NO_BROWSE_THROUGH_SELF [0]
    Prevent the script from calling itself.

$NO_LINK_TO_START [0]
    Don't link to the start page from error pages.

$MAX_REQUEST_SIZE [16777216 = 16 Meg]
    (Obscure.) The largest request that can be handled in certain rare
    situations involving password-protected sites.

$ALLOW_UNPROXIFIED_SCRIPTS [0]
    Allow scripts of unsupported type to pass through the proxy.

$COOKIE_PATH_FOLLOWS_SPEC [0]
    When handling cookies with no path, treat it according to the cookie spec.
    If not set, behave as browsers (erroneously) do, i.e. set the path to "/".

$RESPECT_THREE_DOT_RULE [0]
    Restrict cookie domains as they should be, based on how many dots they
    have.  If not set, behave as browsers (erroneously) do, i.e. loosen
    restrictions on cookie domains with two dots.

$ALERT_ON_CSP_VIOLATION [0]
    When there is a Content Security Policy (CSP) violation, show a message
    to the user.  When not set, just show such error messages in the JavaScript
    console.  This is usually just used for testing.

%TIMEOUT_MULTIPLIER_BY_HOST [ "www.facebook.com" => 10 ]
    Multiply timeouts in JavaScript by this much, for the listed servers.
    This can be used to reduce crashes and improve performance on certain
    JavaScript-heavy sites.

$ALLOW_RTMP_PROXY [0]
    (Currently unused.)  Allow the creation of an RTMP proxy process.

========================================================================


8. CH'CH'CH'CH'CHANGES:
-----------------------


2.2.4, released January 26, 2019:
---------------------------------

The top form is no longer inserted into the HTML of each page, which means
it behaves *much* better.  Instead, the top form is now in its own HTML along
with an <iframe> that includes the full visited page.  No longer will the top
form cover up page content, or disappear under page content.

The starting page is now a little easier to use, with language names under
each flag and more obvious button placement.

No longer crashes with Perl 5.22.x .

Closed a couple of potential privacy holes.

Various bug fixes that make most websites work again.



2.2.3, released November 18, 2018:
----------------------------------

There is now an "upgrade" command, which does everything needed to securely
upgrade an installation of CGIProxy to the latest version.  It keeps all
previous configuration settings.  Note that if you are upgrading from version
2.2.2 or earlier, the easiest way is to first run the "uninstall" command, and
then run the "install" command; you can reuse previous config settings by
saving your old cgiproxy.conf and adding the "--old-config my_saved_file"
parameter to the "install" command.  But from 2.2.3 forward, the "upgrade"
command will do everything for upgrading.

There is now an "uninstall" command, to easily uninstall CGIProxy.

There is now an automatic mechanism to *optionally* report simple usage
metrics back to our server:  hourly hit counts per language, hourly traffic
per language, and unique users per month.  We do NOT collect any private info
such as IP addresses or URLs visited.  I think this is safe and uninvasive, but
please email me if you have any concerns.  It helps us a lot, both in seeing
where CGIProxy is used the most, and in our ongoing efforts to get funding for
CGIProxy development.  In any case, you don't have to report these numbers to
us-- it's a question in the installation wizard that sets the $REPORT_USAGE
config variable.

Removed support for -f command-line parameter.

Many bugs fixed and various new technologies supported under the covers,
which makes most complex websites work again.



2.2.2, released July 16, 2017:
------------------------------

The installation wizard has been greatly improved, based on user feedback:
the server's package manager is now used instead of cpan when
installing Perl modules, if available and if installing as root (dnf, yum,
and apt-get are supported); Web server configuration is read better on more
platforms; fewer fatal errors; better messages; at the end, the wizard now
gives the new proxy URL.  Also, the installed script now has one line modified
by the wizard, to set $PROXY_DIR to an absolute path.

$SECRET_PATH is now randomly-generated by default, and is no longer optional.
For CGI and mod_perl installations, it names the subdirectory where the
script file is installed.

All pages are now optimized for mobile usage, as a <meta name="viewport" ...>
is inserted into all of them.

For maximum security, TLS 1.2 is now used if available; if not, successively
older TLS/SSL versions are tried.

Improved caching of both the starting page and the JavaScript library.  Among
other things, the If-Modified-Since: header is respected for those two.

Added a "create-db" command to create the database, e.g. if you change
$DB_DRIVER after installing.

Added a -q ("quiet") option to suppress output when running as FastCGI script.

A SOCKS proxy on localhost is now allowed, even if "127" is set in
@BANNED_NETWORKS .

If the Math::Random::Secure module is installed, that will be used for
randomly-generated strings such as $SECRET_PATH or session cookies.  That
module is not installed by default because loading it adds significant time
to every request, plus it has many dependencies.  But you can install it
if you need cryptographic-quality randomness in those generated strings.

Many bugs and potential privacy holes fixed.



2.2.1, released May 28, 2017:
-----------------------------

CGIProxy now has an installation wizard!  Run "./nph-proxy.cgi install" to
install it (the "init" command is no longer needed or supported).  This
should work on all platforms, and should work if run as either root or
non-root.  PLEASE REPORT ANY PROBLEMS, OR IF ANYTHING IS UNCLEAR!  The more
feedback I get, the better the installation wizard will work on all platforms
in the future.

CGIProxy now has a configuration menu!  Run "./nph-proxy.cgi config" to run
it.  Related to this, CGIProxy now uses an external configuration file,
which will make future upgrades much easier.  There is no longer any need to
edit the script file directly, unless you need to change $PROXY_DIR .

$SECRET_PATH is no longer optional, and is now set to a random alphanumeric
string by the installation wizard.  It is used in all four run methods-- for
CGI and mod_perl installations, a subdirectory is named by it and the script
is copied there during installation.

Added $RUN_AS_GROUP config variable, to go with $RUN_AS_USER.

Many bugs fixed, making many sites work again.



2.1.17, released June 11, 2016:
-------------------------------

Fixed to no longer access Window.prototype, which has been recently disallowed
by most browsers.  This makes *many* websites work again, including Gmail,
YouTube, and facebook.

A few other bug fixes.



2.1.16, released November 19, 2015:
-----------------------------------

Cleaned up installation process.  In particular, $RUN_AS_USER and $PROXY_DIR
should now be set before running "./nph-proxy.cgi init", so that file
ownership and permissions can be set correctly.

Added Dutch localization.



2.1.15, released October 21, 2015:
----------------------------------

Database support is MUCH less error-prone.  The database is created when
running "./nph-proxy.cgi init" rather than when running as a CGI script
or manually, which is really how it should have been done all along.  Better
separation of the various roles and permissions.

Added support for SQLite databases, which means you can now use a database
without running a database engine like MySQL/MariaDB or Oracle.  $DB_DRIVER
now defaults to "SQLite", hopefully making "Bad Request" errors a thing of
the past.

There is now a "./nph-proxy.cgi init" command, which a) creates needed
directories, b) installs all Perl modules (like the "install-modules"
command" has done), and c) creates the database.  "install-modules" still
works, but is usually unnecessary.

Accordingly, the installation instructions at install.html have been
updated.

Added support for Storage objects, mostly.

MANY major fixes to CORS support, XMLHttpRequest support, cookie support,
and other areas, making popular sites work again.



2.1.14, released November 13, 2014:
-----------------------------------

Fixed FastCGI support.  Note that $FCGI_SOCKET is now a listening port number
rather than a Unix-domain socket (which wasn't working well anyway).  Be sure
to fix your setting of $FCGI_SOCKET; more details are in the comments above
it in the configuration section.  Also note that the installation page has
been updated with better FastCGI configuration for nginx and Apache.

Added the $USER_FACING_PORT config variable for when one user-facing web server
calls CGIProxy running under other web server software, or any other case when
the port the user sees isn't the same port that CGIProxy sees.

Many bug fixes, making many more sites work.  Major sites are working again,
and in all major browsers.



2.1.13, released October 6, 2014:
---------------------------------

Complete performance rewrite of the JavaScript-modifying code, i.e. proxify_js()
and related routines.  The delay when viewing YouTube video pages is much shorter,
and the load on the server CPU is much less.  :)

Several bug fixes.



2.1.12, released June 21, 2014:
-------------------------------

Now has fully-tested and working IPv6 support.

Works again with older Socket module.  Note that IPv6 support only works with
Socket 1.94 or later.

Several bug fixes.



2.1.11, released May 31, 2014:
------------------------------

Added CORS support, which among other things makes YouTube work again.

Added a page about CGIProxy security, linked to from the footer on the CGIProxy
start page and all error and similar pages.

Added untested IPv6 support-- please tell me how it works if you have IPv6 access!

Several bug fixes.



2.1.10, release April 5, 2014:
------------------------------

Added Spanish and Polish message localization.

Added support for LZMA-compressed SWF files.  Note that this requires the
Perl module IO::Compress::Lzma, which will be installed if you run
"./nph-proxy.cgi install-modules" again, like you did when you first
installed CGIProxy.

Added %TIMEOUT_MULTIPLIER_BY_HOST option, to tune performance with certain
sites.

Fixed many bugs, making many sites work better.



2.1.9, released January 27, 2014:
---------------------------------

Added German, Italian, Javanese, and Sundanese message localization.

Content Security Policy (CSP) 1.0 is now supported with Firefox and Chrome.
Other browsers, and CSP 1.1, will be added when they support CSP.

Added $ALERT_ON_CSP_VIOLATION option.

Various bugs fixes and workarounds.



2.1.8, released October 22, 2013:
---------------------------------

Added Chinese, French, and Indonesian message localization.

The full Gmail now works through CGIProxy.

YouTube once again works through CGIProxy.

Can now use a SOCKS 5 proxy, such as Tor (recommended only on same server).
Configured with $SOCKS_PROXY, $SOCKS_USERNAME, and $SOCKS_PASSWORD .

Database initialization now works better.

Can now use a remote database by setting the $DB_SERVER config variable.

Many bugs fixed or worked around, and privacy holes closed.

Now once again runs on Perl 5.6.1 (one statement in 2.1.7 required Perl 5.10.0).

Shuffles HTTP request headers to better avoid detection.

$ANONYMIZE_INSERTION now defaults to 1.



2.1.7, released July 26, 2013:
------------------------------

CGIProxy now has message localization:  The user can choose an interface in
Arabic, English, Farsi, Russian, or Turkish.  If you would like support for
other languages, please consider translating CGIProxy's messages-- see
http://www.jmarshall.com/tools/cgiproxy/translate.html for full details.

The full facebook site now works almost fully through CGIProxy, so it's no
longer redirected to the mobile site by default.  If it's slow for you or your
users, see the comments and suggestions above where %REDIRECTS is set.

Running under FastCGI now works on servers other than just nginx.

Resuming partial downloads is now supported, with partial support of the
Range: header.

The JavaScript library (jslib) is now gzipped when possible, to save bandwidth.
Should have done this a while ago.

Fixed error with "-c" in usage message; sorry about that.

Added support for Content-Security-Policy: header, though it's disabled until
the header is better defined and browsers support it.

Many bugs fixed, making many sites work better.



2.1.6, released February 4, 2013:
--------------------------------

Now can run as a FastCGI script.

Now can run without an external HTTP server, by using its own embedded secure
HTTP server.

Installation is easier, as Perl modules can be automatically installed
(including under your home directory) by running
"./nph-proxy.cgi install-modules" from the command line.  See the $LOCAL_LIB_DIR
config option, if you need to install the modules and you're not root.

Windows support has improved.

Documentation has been improved, especially for installation.

Command-line usage is now documented; run "./nph-proxy.cgi -?" for usage.

There are some new config options, mostly for FastCGI support, the embedded
server, and database support.

Some of the configuration section has been rearranged; most potentially
needed config options are now near the top.

Fixed a bug handling spaces in path when using proxy_encode().



2.1.5b, released November 10, 2012:
----------------------------------

Added redirection for Gmail to %REDIRECTS ; redirects to HTML-only version.



2.1.5, released October 21, 2012:
---------------------------------

Now optionally uses a server-side database to store cookies, which fixes
"Bad Request" errors when user has too many cookies.  Can use either MySQL
or Oracle.  Configure this with $DB_DRIVER, $DB_USER, $DB_PASS, and
$USE_DB_FOR_COOKIES .

Now supports a simple mechanism to automatically redirect pages that aren't
handled well by CGIProxy.  For example, www.facebook.com is redirected to
m.facebook.com (mobile), until we can get www.facebook.com working better.
This is configured with the %REDIRECTS hash.

17 bugs fixed, mostly in JavaScript support but some in Flash and HTML support too.



2.1.4, released May 8, 2012:
----------------------------

Fixed a bug with chunked responses, making captcha work better.

Closed some privacy holes in JavaScript and Flash support.

Fixed some small bugs, making more pages work better.

Added "Delete all cookies" link to top of cookie management screen.



2.1.3, released April 27, 2012:
-------------------------------

Improved Flash support, including better support for online video.  No delay
with YouTube anymore.

Improved support for Same-Origin Policy (browser security).

Other security fixes.

Other fixes and workarounds, making more pages work correctly.



2.1.2, released April 8, 2012:
------------------------------

Added "download" link to footer.

Fixed "SSL options" bug.

No longer fails in Windows because of getpwuid() call.

Various security fixes.

Various small fixes to make more pages work better.

Internal: rewrote _proxy_jslib_handle() and _proxy_jslib_assign() more cleanly,
using _proxy_jslib_instanceof() instead of the error-prone _proxy_jslib_object_type()
(which was removed).  Still works around many browser bugs.  :P



2.1.1, released January 19, 2012:
---------------------------------

$ENCODE_URL_INPUT is now on by default, and is fixed to work better in all
major browsers.

Link to CGIProxy home page in the footer is now proxified.

Now supports jQuery-based sites better.

Added support for a few non-standard HTTP request headers.

Works better with captcha.

$NO_COOKIE_WITH_IMAGE is now off by default, to support captcha.

Many small fixes and workarounds, including several privacy and security fixes.



2.1, released December 9, 2011:
-------------------------------

Flash 9 and later is now supported, which among other things means YouTube
works through CGIProxy again.

Changed flag segment of full URLs to something less obvious.

Now supports "data:" URIs.

Now supports ECMAScript (JavaScript) version 5.

Many fixes in JavaScript support and elsewhere.

No longer eats memory when connecting to secure servers.

$PROXIFY_SWF is now set by default.

Now uses the Encode module instead of the old utf8:: stuff.

Added two config options $PROXY_DIR and $ALLOW_RTMP_PROXY, though they're not
used yet.



2.1beta19, released December 25, 2008:
--------------------------------------

Fixed a couple of bugs with cookies, so they (including logins) should work
better.

Fixed so that Safari no longer chokes on StorageList handling.

Various other small fixes.



2.1beta18, released August 10, 2008:
------------------------------------

Certain pages were very slow in MSIE due to the way MSIE implements
Array.pop(), so a lot of code was rewritten to avoid using Array.pop().  Those
pages now function at normal speed in MSIE.  Some pages see an improvement
of 10x or better.

proxify_js() is now about 5% faster due to reworking of $div_ok setting.

Now handles "application/xhtml+xml" content correctly.

$REMOVE_SCRIPTS and $HIDE_REFERER both now default to 0 (false).

Various small fixes, workarounds, and cleanup.



2.1beta17, released March 11, 2008:
-----------------------------------

Fixed a couple of bugs with SWF (Flash) support.  In particular, <param> tags
are now proxified correctly when in certain <object>s, and thus youtube.com now
works when using MSIE.



2.1beta16, released March 3, 2008:
----------------------------------

Includes a ton of fixes and workarounds in JavaScript support, making more
sites work through it.

Includes a fair amount of performance improvement too.

Added experimental support for Shockwave Flash (SWF) apps.  If you set
$PROXIFY_SWF=1, the script will proxify the SWF bytecode so that any
network accesses go back through the same script.  It works with many but
not all Flash apps.  Sometimes it can slow down a page, if the page has e.g.
lots of Flash ads.

Fixed port-handling in HTTP Basic authentication support.

Several other small bug fixes, cleanup, and comments.



2.1beta15, released October 26, 2006:
-------------------------------------

Fixed bug handling "javascript:" URLs that was causing some sites to fail.

Fixed bug in _proxy_jslib_cookie_encode() and _proxy_jslib_cookie_decode(),
in the commented-out rot-13 line.  Cookies should now work when using rot-13.

"about:blank" pages no longer generate the "WARNING:" page.

In the start page, the URL entry field now initially has focus (only if JS
is running).

Added support for Element.getElementsByTagName to _proxy_jslib_handle()
(though it's done in the form of "Node.getElementsByTagName").

Now explicitly defaults the first argument of "Document.open()" to text/html.
This is supposed to happen anyway, but not all browsers do it correctly when
writing to a frame.

Proxification of top-level "return" statements should now work better in a
couple of ways.

Several more small fixes and cleanup, making more sites work.



2.1beta14, released October 16, 2006:
-------------------------------------

Better handling of "javascript:" URLs.

Now correctly handles "delete(...)" .

Various browser bugs/crashes are now trapped by try/catch blocks so they're
not fatal.

More erroneous HTML and JavaScript is now worked around.

Many more small fixes and workarounds.



2.1beta13, released September 12, 2006:
---------------------------------------

The $TRANSMIT_HTML_IN_PARTS config variable has been removed; now you only
need to set @TRANSMIT_HTML_IN_PARTS_URLS .  Use of both was redundant.

Now correctly handles UTF-8 content.

Authorization cookies are now associated with the server and port, rather than
just the server.

Worked around MSIE bug that shifts centered content to the right.

Now works around more erroneous HTML, JavaScript, and browser behavior.

Other minor fixes.



2.1beta12, released June 6, 2006:
---------------------------------

Now optionally supports compression (gzip) of message bodies, if the
Compress::Zlib Perl module is installed.  This also helps with certain server
bugs.

To help with pages that are returned in parts with long delays between those
parts, you can now use the $TRANSMIT_HTML_IN_PARTS and
@TRANSMIT_HTML_IN_PARTS_URLS options to have CGIProxy process and return each
piece of HTML as it receives it rather than wait for the whole page.  This
helps with certain library database queries, for example.

Further improved CSS handling, including better display of the top form.

Improved handling of "&#nnn;" HTML entities.

Better updating of the top form when using frames.

When $TEXT_ONLY is set, there are no longer any wasteful attempts to get the
images (unless $RETURN_EMPTY_GIF is set).  Also, $RETURN_EMPTY_GIF now defaults
to 0 .

Now handles JavaScript's deprecated with() statement.

Now handles Document.referrer .

Cookie values are now allowed to have spaces in them, even though that's
technically illegal; unfortunately, some sites require them to work.  If this
causes any problems, please let me know.

Other minor fixes.



2.1beta11, released March 14, 2006:
-----------------------------------

Improved (though not yet perfect) CSS handling.

Now uses a different system to track which elements have been proxified.
(PLEASE tell me if you find any privacy holes, i.e. when the browser
makes a direct connection to the end server when it should be going though
CGIProxy.)

Fixed a hang when using MSIE and IIS.

HTML (e.g. innerHTML) is no longer doubly-proxified when using "+=".

Improved handling of &#nnn; -type entities.

No longer inserts "<!-- resource has been modified by proxy -->"; it was
becoming more trouble than it's worth.

Better handling of data retrieved through XMLHttpRequest.

Cleaner support of Document.domain .

Many other minor fixes, workarounds, and cleanup, mostly in JavaScript support.



2.1beta10, released December 6, 2005:
-------------------------------------

Now supports UTF-16 pages, if using Perl 5.8.0 or later.

Now supports query-only URLs.

Window.open with a relative URL is now handled correctly.

In JavaScript, (non-standard) octal numbers and characters are now handled.

Now allows (erroneous) commas in cookie values, created by at least one server.

Now allows (erroneous) line terminators in JavaScript string literals if
  preceded by a "\", since browsers seem to allow it.

Many other minor fixes, mostly in JavaScript support.



2.1beta9, released October 27, 2005:
------------------------------------

Cleaned up URL handling in proxy_encode(), proxy_decode(), and their JS
counterparts.  Those routines are now back to their old format in which
they contain only user-configurable statements.  The "do not remove" code
has been moved into wrapper functions; anywhere proxy_encode() is called
should now call wrap_proxy_encode() instead, and the same is true for the
three other related routines.

@BANNED_NETWORKS now includes the whole 127.x.x.x subnet.

Cookies are now handled more like browsers handle them (though not as the
spec calls for), in a couple of ways.

In the JavaScript code, reworked handling of "delete", preincrement, and
predecrement.  Those operators are now handled much better.

Other minor fixes.



2.1beta8, released August 23, 2005:
-----------------------------------

Now more properly encodes all "?", "#", and others in default proxy_encode().
This includes further working around aforementioned Apache bug with PATH_INFO.

Now supports Accept-Language: header in requests, which means that pages will
more often be returned in the expected language.

Added support for query-only URLs, i.e. those beginning with "?".  Don't know
why this never came up before.

A couple fixes and workarounds in JavaScript support.

Other minor fixes.



2.1beta7, released June 28, 2005:
---------------------------------

Added a "Report a bug" link in the top form, which will hopefully result
in more bug reports.

Now supports (erroneous) JavaScript that contains the string "</script"
within it, as browsers do.

Worked around an Apache bug (fixed in 2.0.55) that causes problems when
PATH_INFO contains "//".

Fixed a few bugs in JavaScript handling.

Other minor fixes.



2.1beta6, released May 24, 2005:
--------------------------------

JavaScript is now supported, i.e. JavaScript is modified as necessary to
route all network accesses back through the script.

$PROXIFY_SCRIPTS is now on by default.

Unsupported scripts are now removed by default; to not remove them, use
$ALLOW_UNPROXIFIED_SCRIPTS.

Top form has been put into its own one-cell table, with a white background.
This makes it readable on many pages where it wasn't before.

Checkbox text in top form now uses <label> tags, to allow clicking on text.

Now can encode URLs before submitting them, if you set $ENCODE_URL_INPUT.

Now supports external tests for valid user IP address ($USER_IP_ADDRESS_TEST)
and valid destination server ($DESTINATION_SERVER_TEST).  Both may be either
a command-line program, or a CGI script on a remote server.  These features
were added at the request of (and paid for by) the International Broadcasting
Bureau.

No longer times out after 10 minutes, which will help with large files.

Added some more appropriate values to @BANNED_NETWORKS.

Cookies with no path specified now use a path of "/", even though that
violates the spec, because that's how browsers treat them.  If you want
to follow the spec instead, you can set $COOKIE_PATH_FOLLOWS_SPEC.

Illegal cookie domains that contain only two dots but are not in one of
the seven main TLDs are now allowed by default, because that's how browsers
behave.  You can follow the spec instead by setting $RESPECT_THREE_DOT_RULE.

Many other minor fixes, and workarounds for browser bugs and buggy sites.



2.0.1, released November 19, 2002:
----------------------------------

This release improves compatibility and installability in a few environments;
it doesn't really add new features.  In particular:

An SSL proxy is now supported with the $SSL_PROXY option, analogous to
$HTTP_PROXY.  Authentication for it is handled with $SSL_PROXY_AUTH,
analogous to $PROXY_AUTH.

The $RUNNING_ON_WINDOWS option is no longer needed, and so no longer exists--
relevant code now determines the OS automatically when needed, or is handled
another way.

Sockets now work correctly on BSDI and possibly other systems, where before
the script generated messages like "Address family not supported by protocol
family" and possibly others.  It works now because socket address structures
are created with the general pack_sockaddr_in() and inet_aton() functions from
the Socket module, instead of the traditional hard-coded "pack('S n a4 x8')"
method.  The new way is more "correct", given that we're already loading the
Socket module anyway.  It should help for future IPv6 support too.

All page insertions, including the initial "<!-- resource ... -->" comment, are
now inserted *after* any initial <!doctype> declaration, to avoid confusing
MSIE 6.0, which does not allow comments before an initial <!doctype>.  The
problem showed up as subtle errors in page elements (like fonts, etc.) when
using MSIE; some or all of these problems should go away now.

Cache behavior when using a caching HTTP proxy should be correct now, because
Pragma: and Cache-Control: headers (if available) are now passed through to
the outgoing request.  Before, caches would not always refresh as expected,
so page reloads didn't always work.

Fixed a compilation error when using certain older versions of Perl.



2.0, released September 18, 2002:
---------------------------------

This is a MAJOR release, even though most of the changes are internal.  Some
things about the 2.0 script are fundamentally different from the 1.x series.
Here's a list of changes, roughly categorized:


---- Visible new features and changes: ----

Now supports SSL, i.e. can retrieve pages on secure servers.  For this to
work, the separate packages OpenSSL and Net::SSLeay must be installed.  If
they are not installed, then CGIProxy still works but cannot download pages
from secure servers.  Also, it is strongly recommended to run CGIProxy on a
secure server when SSL is supported, or else secure data will be compromised
on the link between the browser and CGIProxy.

The top entry form now has an "UP" link, which links to the parent directory of
the current URL.  The idea comes from the (quite useful) button in Konqueror.

Handling of top insertions has been cleaned up-- FTP directory listings now
include the correct insertions, and other cleaner behavior.


---- New or changed config options: ----

$RUNNING_ON_SSL_SERVER is now a three-way option:  If it's set to '', then
assume an SSL server if and only if port 443 is being used.  Besides being a
good default, this lets you put the script where it can be served by both a
secure server and a non-secure server.

Perl 4 is no longer supported.  CGIProxy should run fine with Perl 5.004 or
later.  Better yet, upgrade to Perl 5.6.1 or later if you can-- future
versions of CGIProxy may require that for some features.

FTP requests now use passive (PASV) mode by default, though you can use
non-passive mode by setting $USE_PASSIVE_FTP_MODE=0.

$HTTP_PROXY and $NO_PROXY are now used instead of $ENV{'http_proxy'} and
$ENV{'no_proxy'}.

For VPN-like installations, there is now a $QUIETLY_EXIT_PROXY_SESSION option
that allows a smooth transition from browsing an intranet through the proxy
to browsing external sites directly, without getting intermediate warning
screens.  It's not meant for any situation where anonymity is important.  See
the comments where it is set for more details.

Set the new $SESSION_COOKIES_ONLY option to make all cookies expire when the
browser closes.

Set the new $MINIMIZE_CACHING option to minimize any caching that may be done
by the browser, by using appropriate HTTP response headers.  Cacheability of
various responses has also been cleaned up in general.

If for some reason you want content (e.g. HTML) inside <!-- --> comments to
be proxified like the rest, set the new $PROXIFY_COMMENTS option.

Improved the sample list of ad servers in @BANNED_IMAGE_URL_PATTERNS.

Cleaner and slightly different handling of $INSERT_HTML and $INSERT_FILE;
see comments in user config section for details.

$FASTER_HTML_LESS_PRIVACY no longer exists, because it's no longer relevant
with the new HTML-parsing structure (see below).


---- For programmers: ----

SSL support was implemented as a package that implements a tied filehandle,
called "SSL_Handle".  The idea came from the Net::SSLeay::Handle module,
which for a couple reasons wasn't suitable to use directly.  The SSL_Handle
package may be useful in other SSL applications, though this version is not
a full implementation.  It does buffer its input, though, which can make it
a much faster alternative to Net::SSLeay's ssl_read_until() routine.

The entire section of code that modifies an HTML response (roughly 20-25% of
the whole program) has been completely rewritten from scratch, and has a new
structure.  It's MUCH cleaner and slightly smaller than before, and is now
encapsulated in the routine proxify_html().  It handles the heterogeneity of
HTML correctly, i.e. non-HTML content that can exist within HTML (like
scripts, stylesheets, comments, or SGML declarations) is correctly separated
out and handled according to type.  Also, tags are fully parsed into
attributes and rebuilt if needed, instead of hacking it with long regular
expressions as before; this removes a family of potential bugs and a lot of
messy code.  Overall, the results are more accurate, and the new structure is
much more solid, flexible, and extensible, and much easier to work with than
before; it should let us solve any new problems in the "right" way when they
arise.  :) :)

Many other sections of code have been partially or entirely rewritten, and
behavior is generally cleaner.

Global variables are now handled much more cleanly, especially regarding
their persistence when using mod_perl.  Variables are now (almost) completely
divided between UPPER_CASE constants which retain their values between runs,
and lower_case variables which are reset for each run.  After the user config
section is a constant initialization section; both are run only during the
first run of the script, and are skipped for efficiency during subsequent
runs under mod_perl.  The config and initialization sections have been
arranged more carefully than before, for clarity and other reasons.  Several
variables have had their semantics clarified.

NOTE: If you modified the code in an earlier version and used certain config
variables, you should review the new code before inserting the same changes.
For example, a variable like $REMOVE_SCRIPTS should normally be replaced by
$e_remove_scripts; the former is the config setting that never changes
anymore, while the latter reflects the value used for this run of the program
(which the user might change via a checkbox).  There are several similar
variables.  Also, avoid modifying UPPER_CASE variables, because those will
retain their values between runs under mod_perl... unless that's what you
want.


---- Other stuff: ----

The HTTP client in CGIProxy now uses HTTP/1.1 if the browser uses HTTP/1.1.
Before, only certain HTTP/1.1 features like the Host: header were supported;
now, all required HTTP/1.1 client features are supported, so CGIProxy is
"conditionally compliant" with HTTP/1.1 regarding its client functions, as
per the HTTP spec.  CGIProxy can't control all server functions, but for
those that it does (such as the Date: header), it complies with HTTP/1.1.

Improved detection of text vs. non-text when supporting $NO_COOKIE_WITH_IMAGE,
which makes some pages behave better.

Many changes to take advantage of Perl 5, such as "use strict", better
regular expressions, references, and cleaner code all over the place.

Various other bugs fixed, privacy holes closed, performance improvements, UI
improvements, code rearrangement, and cleanup.



1.5.1, released February 7, 2002:
---------------------------------

Headers are no longer split on commas, which among other things means that
cookies work again.  :P

A couple other minor bug fixes.



1.5, released November 26, 2001:
--------------------------------

Many changes this time around, some major, some minor.  The code is about
50% larger.  Here's a list of changes, roughly categorized:


---- Most visible new features and changes: ----

A new cookie management screen lets the user view and selectively delete any
cookies being sent through the proxy.

On pages with frames, any insertion (such as the small URL entry form) is now
in its own top frame.

Cookies may now be encoded with the cookie_encode() and cookie_decode()
routines, similar in concept to proxy_encode() and proxy_decode().


---- Making more pages work: ----

Referrer information, required by some servers, is now optionally sent to
the server.

Certain pages with Flash or other embedded objects now work better.

HTTP URLs which include authentication in them (e.g. "username:password")
are now supported.

Links to "javascript:" URLs are handled in a more friendly way.

Inline frames (i.e. <iframe> tags) are now handled like frames with regard
to insertions.


---- New config options: ----

If you're running this on or inside a firewall, use @BANNED_NETWORKS to ban
access to single hosts and whole networks by IP address.  This is more
reliable than banning by hostname.

Set $REMOVE_TITLES to remove titles from HTML pages.

Error pages now link to the starting page, unless $NO_LINK_TO_START is set.


---- For programmers: ----

There's now a simple framework for handling any arbitrary MIME types that
may need modification (e.g. changing embedded links); see proxify_block().
To add handling for a MIME type, add code in proxify_block() and add your
MIME type into @TYPES_TO_HANDLE (and @OTHER_TYPES_TO_REGISTER if needed).

Additionally, a framework to proxify script content is in place:  All script
content will be routed through the MIME type handler if $PROXIFY_SCRIPTS is
set.  You need to add your own code to proxify_block() for this to actually
do anything.

Parsing of the flag segment of PATH_INFO has been encapsulated into the
routines pack_flags() and unpack_flags().


---- Invisible, or bug fixes: ----

Style sheets are handled much better, including <style> elements, "style"
attributes, and external style sheets.  CSS is explicitly handled, but the
framework is in place for other types.

Certain tags in HTML specify the expected MIME type of the resource they
link to; this is now handled better.

The list of script MIME types is now more complete.

Cookies with the "secure" clause are now supported, though it's never
actually used without SSL support.

Strengthened $NO_COOKIE_WITH_IMAGE support by requiring an appropriate Accept:
header, to guard against certain sneaky Web bugs like at zdnet.com.

The script's URL is now generated using the Host: header if available,
instead of SERVER_NAME; this tends to have better results.

Any resource with an unidentified MIME type is now treated as text/html,
like Netscape does.  This is safest.

<a href> attributes containing character entities are now handled better when
proxy_encode()'ing is used.

FTP URLs with spaces in them are now handled better, in a couple of ways.

Various tags, attributes, and HTTP headers that require special treatment
are now handled more correctly.

Various regex fixes, other minor bug fixes, and cleanup.



1.4.1 and 1.4.1-SSL, released March 8, 2001:
--------------------------------------------

CPU load was decreased 15% with two simple changes that I should have thought
of long ago.

Fixed error with <meta> "refresh" tags that caused proxy to loop through
itself.

Fixed problem with user-chosen URL entry form.



1.4-SSL, released February 22, 2001:
------------------------------------

This is a special version that can retrieve pages from SSL servers.  It
is based on version 1.4, and otherwise works pretty identically to that
release.



1.4, released February 10, 2001:
----------------------------------

You can now optionally insert a compact version of the initial entry form
into the top of every downloaded page, by setting $INSERT_ENTRY_FORM=1.  The
form also displays the URL you're currently viewing.  This is selectable by
the user like the three original user-selectable options.  Frames are handled
correctly, i.e. it's not inserted in frames, because that would be really
ugly.  That was the hard part.

You can also insert your own block of HTML if desired: specify it either as a
fixed string in $INSERT_HTML, or name a file to be inserted in $INSERT_FILE.

For consistency, the URL format was changed slightly-- the initial flags in
PATH_INFO are always there and are now five in number.  So any bookmarks
saved through the proxy will have to be converted or recreated.  If there's
enough demand, I can write a simple converter each time I change the URL
format.

The user-entered hostname is now always lowercased, since host names are
case-insensitive.

Fixed a minor bug in 1.3.2 having to do with PATH_INFO encoding.



1.3.2, released February 3, 2001:
---------------------------------

By popular demand, you can now restrict which servers the proxy can access,
like the online demo does.  This is configured with the lists
@ALLOWED_SERVERS and @BANNED_SERVERS.

For FTP transfers, a "Content-Length:" header is now returned when
guessable.  This lets some browsers show you the percentage progress.

Pseudo-headers created by <meta http-equiv> tags are now handled like real
HTTP headers.  Internally, the handling of HTTP headers has been cleaned up.

If you absolutely, truly, can't run NPH scripts on your server, there is
now an option to run as best as possible as a normal non-NPH CGI script.
For this to work, your server MUST support the "Status:" CGI response
header.  All servers are supposed to support it, but not all do.

There is now a $NO_BROWSE_THROUGH_SELF option which prevents the proxy from
calling itself, which is usually a mistake anyway.

Proxy authentication (the "Proxy-Authorization:" request header) is now
supported in a limited way, with $PROXY_AUTH.

Regexes have been improved to match tag attributes better, and a related
privacy hole was fixed.

URLs with spaces (which are a bad idea anyway) are now more likely to be
handled as expected.



1.3.1, released June 6, 2000:
-----------------------------

Script now runs correctly under mod_perl (requires at least Perl 5.004).

Script now runs correctly on an SSL server, if $RUNNING_ON_SSL_SERVER is set.

Main URL-conversion loop runs almost twice as fast (40% less CPU time), with
a fix I should have noticed a long time ago.

Login for HTTP Basic authentication is now submitted with POST instead of
GET, for better security.

Fixed privacy hole when servers didn't return Content-Type: header.



1.3, released April 8, 2000:
-----------------------------

Anonymity has been improved, especially regarding JavaScript or other
script content.  Before it was an afterthought; now it's being implemented
as completely as possible.  If you know any anonymity holes, please tell
me.  I'm especially interested in knowing any MIME types that identify
scripts.

In particular:

. Much more JavaScript is filtered out than before.  As far as I know, all
  of it is removed:  in <script> blocks, in style sheets, in HTML attributes,
  wherever indicated by HTTP headers, and other places.

. By default, $REMOVE_SCRIPTS is set to true.

. A potential privacy hole from a bug in Internet Explorer is protected.

You can now select which servers to allow scripts from, by setting
@ALLOWED_SCRIPT_SERVERS and @BANNED_SCRIPT_SERVERS.


If several people share a proxy, they can customize their own settings if
you set $ALLOW_USER_CONFIG.

Large files and streaming media are now supported, by transmitting the
data from the server as it arrives, rather than receiving the whole
resource before sending it to the client.  This works for both HTTP and
FTP.

HTTP Basic authentication is now supported.  (I sure hope people use it,
because it's the most elaborate and convoluted hack of the whole program.)

There's an experimental load-balancing feature.  If you set @PROXY_GROUP
to a set of URL's of cooperating proxies, they'll randomly distribute the
load among them.  This may help or hinder privacy, and it may have other
uses too.  Let me know if you find those uses.

For those who like to mess with the code, there are some neat new internal
mechanisms.  Cookies have now been extended to handle multiple tasks (had
to for Basic authentication), and there's a new internally-handled URL
scheme "x-proxy" that lets you plug in whatever magic functionality you
want (had to for Basic authentication).

There is no longer a startproxy.cgi.  It was swallowed by the main script.
It was becoming a vanishingly small percentage of the overall code.

More HTML tags are transformed, whichever non-standard tags people
reported to me (thanks!).

A couple of non-standard HTTP headers with URLs are now transformed
correctly.

If you're running on Windows, you can now set a configuration flag, and
CGIProxy will work around a couple problems on that platform.

$SUPPORT_COOKIES has been reversed, and renamed to $REMOVE_COOKIES.  This
makes it more analogous to $REMOVE_SCRIPTS, and each have their
@ALLOWED...  and @BANNED... server lists.

@BANNED_COOKIE_SERVERS and $NO_COOKIE_WITH_IMAGE have changed slightly--
they now take effect even when $FILTER_ADS isn't set.  They're more
associated with the $REMOVE_COOKIES flag now.

The initial URL-entry form may now submit using POST instead of GET,
based on the setting of $USE_POST_ON_START.  This is because some
filters apparently search outgoing URIs, but not POST request bodies.

FTP now follows symbolic links correctly, and another FTP bug or two were
fixed.



1.2, released September 11, 1999:
---------------------------------

The internal structure was rearranged in a big way, to support multiple
protocols more cleanly.  Previously, HTTP was ingrained throughout; now
it's more modular.

FTP is now supported.

@ALLOWED_COOKIE_SERVERS lets you only accept cookies from certain servers.

@BANNED_COOKIE_SERVERS and @ALLOWED_COOKIE_SERVERS are now lists of Perl
patterns (regular expressions) to match, rather than literal host names.
This lets you allow or forbid whole sets of servers rather than listing
each server individually.  For more information on Perl patterns, read the
Perl documentation.  nph-proxy.cgi has a note in the user config section
that may help enough.

You can remove scripts from HTML pages by setting $REMOVE_SCRIPTS=1.  This
helps with anonymity somewhat by removing some JavaScript (but not all!).
It also removes most popup ads.  :)

The HEAD method is now supported more cleanly.

Rare net_path form of relative URL (i.e. like "//host.com/path/etc") is
now supported, for completeness and safety.

The default lists of cookie and ad servers are a bit better.



1.1, released March 9, 1999:
----------------------------

The whole format of the target URL in PATH_INFO was restructured.  It can
be encoded however the user wishes.  This gets around PATH_INFO clashes in
various servers, solving most problems regarding server incompatibilities
I've heard about.

Cookies are now optionally supported (but off by default).

Banner ads can be filtered out.  Only a simple set of URL patterns are
filtered out by default, but it's easy to add more entries to
@BANNED_IMAGE_URL_PATTERNS.

Cookies from ad servers are filtered out (at least the main ones).  Again,
the default list in @BANNED_COOKIE_SERVERS is simple, but you can easily
add more.

Binary files are no longer getting messed up on Windows.

More HTTP headers are fixed to point back through the proxy.

Under some conditions in 1.0, extra processes would hang around for hours
and drag the system.  Alex Freed added a timeout to solve this for now.  I
can't reproduce the problem, so any info is appreciated.  [9-9-1999: It
may be a bug in older Apaches, fixed by upgrading to Apache 1.3.6 or
better.  Julian Haight reports the same problem with other scripts on
Apache 1.3.3, but not with Apache 1.3.6.]

Internally: code was cleaned up, URL-parsing was improved, and relative
URL calculation was redone.



1.0, released August 3, 1998:
-----------------------------

Initial release.


========================================================================

Last Modified: January 26, 2019
http://www.jmarshall.com/tools/cgiproxy/

