docs: cleanups improvements and fighting sphinx

- Hide links to internal code listings, and link to github instead
- Improve formatting of code/example captions
- Fix outdated documentation of command-line options
- Complete documentation of all events + improved formatting
- tcp_open -> tcp_start, tcp_close -> tcp_end to reduce confusion
This commit is contained in:
Aldo Cortesi
2016-10-12 10:57:05 +13:00
parent 26af9b29fc
commit fdb6a44245
18 changed files with 389 additions and 356 deletions

8
docs/scripting/api.rst Normal file
View File

@@ -0,0 +1,8 @@
.. _api:
API
====
.. automodule:: mitmproxy.models.http
:inherited-members:
:members: HTTPFlow, HTTPRequest, HTTPResponse

View File

@@ -0,0 +1,4 @@
.. _context:
Context
=======

202
docs/scripting/events.rst Normal file
View File

@@ -0,0 +1,202 @@
.. _events:
Events
=======
General
-------
.. list-table::
:widths: 40 60
:header-rows: 0
* - .. py:function:: configure(options, updated)
- Called once on startup, and whenever options change.
*options*
An ``options.Options`` object with the total current configuration
state of mitmproxy.
*updated*
A set of strings indicating which configuration options have been
updated. This contains all options when *configure* is called on
startup, and only changed options subsequently.
* - .. py:function:: done()
- Called once when the script shuts down, either because it's been
unloaded, or because the proxy itself is shutting down.
* - .. py:function:: log(entry)
- Called whenever an event log is added.
*entry*
An ``controller.LogEntry`` object - ``entry.msg`` is the log text,
and ``entry.level`` is the urgency level ("debug", "info", "warn",
"error").
* - .. py:function:: start()
- Called once on startup, before any other events. If you return a
value from this event, it will replace the current addon. This
allows you to, "boot into" an addon implemented as a class instance
from the module level.
* - .. py:function:: tick()
- Called at a regular sub-second interval as long as the addon is
executing.
Connection
----------
.. list-table::
:widths: 40 60
:header-rows: 0
* - .. py:function:: clientconnect(root_layer)
- Called when a client initiates a connection to the proxy. Note that a
connection can correspond to multiple HTTP requests.
*root_layer*
The root layer (see `mitmproxy.protocol` for an explanation what
the root layer is), provides transparent access to all attributes
of the :py:class:`~mitmproxy.proxy.RootContext`. For example,
``root_layer.client_conn.address`` gives the remote address of the
connecting client.
* - .. py:function:: clientdisconnect(root_layer)
- Called when a client disconnects from the proxy.
*root_layer*
The root layer object.
* - .. py:function:: next_layer(layer)
- Called whenever layers are switched. You may change which layer will
be used by returning a new layer object from this event.
*layer*
The next layer, as determined by mitmpmroxy.
* - .. py:function:: serverconnect(server_conn)
- Called before the proxy initiates a connection to the target server.
Note that a connection can correspond to multiple HTTP requests.
*server_conn*
A ``ServerConnection`` object. It is guaranteed to have a non-None
``address`` attribute.
* - .. py:function:: serverdisconnect(server_conn)
- Called when the proxy has closed the server connection.
*server_conn*
A ``ServerConnection`` object.
HTTP Events
-----------
.. list-table::
:widths: 40 60
:header-rows: 0
* - .. py:function:: request(flow)
- Called when a client request has been received.
*flow*
A ``models.HTTPFlow`` object. At this point, the flow is
guaranteed to have a non-None ``request`` attribute.
* - .. py:function:: requestheaders(flow)
- Called when the headers of a client request have been received, but
before the request body is read.
*flow*
A ``models.HTTPFlow`` object. At this point, the flow is
guaranteed to have a non-None ``request`` attribute.
* - .. py:function:: responseheaders(flow)
- Called when the headers of a server response have been received, but
before the response body is read.
*flow*
A ``models.HTTPFlow`` object. At this point, the flow is
guaranteed to have a non-none ``request`` and ``response``
attributes, however the response will have no content.
* - .. py:function:: response(flow)
- Called when a server response has been received.
*flow*
A ``models.HTTPFlow`` object. At this point, the flow is
guaranteed to have a non-none ``request`` and ``response``
attributes. The raw response body will be in ``response.body``,
unless response streaming has been enabled.
* - .. py:function:: error(flow)
- Called when a flow error has occurred, e.g. invalid server responses,
or interrupted connections. This is distinct from a valid server HTTP
error response, which is simply a response with an HTTP error code.
*flow*
The flow containing the error. It is guaranteed to have
non-None ``error`` attribute.
WebSocket Events
-----------------
.. list-table::
:widths: 40 60
:header-rows: 0
* - .. py:function:: websockets_handshake(flow)
- Called when a client wants to establish a WebSockets connection. The
WebSockets-specific headers can be manipulated to manipulate the
handshake. The ``flow`` object is guaranteed to have a non-None
``request`` attribute.
*flow*
The flow containing the HTTP websocket handshake request. The
object is guaranteed to have a non-None ``request`` attribute.
TCP Events
----------
These events are called only if the connection is in :ref:`TCP mode
<tcpproxy>`. So, for instance, TCP events are not called for ordinary HTTP/S
connections.
.. list-table::
:widths: 40 60
:header-rows: 0
* - .. py:function:: tcp_end(flow)
- Called when TCP streaming ends.
*flow*
A ``models.TCPFlow`` object.
* - .. py:function:: tcp_error(flow)
- Called when a TCP error occurs - e.g. the connection closing
unexpectedly.
*flow*
A ``models.TCPFlow`` object.
* - .. py:function:: tcp_message(flow)
- Called a TCP payload is received from the client or server. The
sender and receiver are identifiable. The most recent message will be
``flow.messages[-1]``. The message is user-modifiable.
*flow*
A ``models.TCPFlow`` object.
* - .. py:function:: tcp_start(flow)
- Called when TCP streaming starts.
*flow*
A ``models.TCPFlow`` object.

View File

@@ -1,227 +0,0 @@
.. _inlinescripts:
Inline Scripts
==============
**mitmproxy** has a powerful scripting API that allows you to modify flows
on-the-fly or rewrite previously saved flows locally.
The mitmproxy scripting API is event driven - a script is simply a Python
module that exposes a set of event methods. Here's a complete mitmproxy script
that adds a new header to every HTTP response before it is returned to the
client:
.. literalinclude:: ../../examples/add_header.py
:caption: examples/add_header.py
:language: python
All events that deal with an HTTP request get an instance of :py:class:`~mitmproxy.models.HTTPFlow`,
which we can use to manipulate the response itself.
We can now run this script using mitmdump or mitmproxy as follows:
>>> mitmdump -s add_header.py
The new header will be added to all responses passing through the proxy.
Examples
--------
mitmproxy comes with a variety of example inline scripts, which demonstrate many basic tasks.
We encourage you to either browse them locally or on `GitHub`_.
Events
------
Script Lifecycle Events
^^^^^^^^^^^^^^^^^^^^^^^
.. py:function:: start(context)
Called once on startup, before any other events.
:param List[str] argv: The inline scripts' arguments.
For example, ``mitmproxy -s 'example.py --foo 42'`` sets argv to ``["--foo", "42"]``.
.. py:function:: done(context)
Called once on script shutdown, after any other events.
Connection Events
^^^^^^^^^^^^^^^^^
.. py:function:: clientconnect(context, root_layer)
Called when a client initiates a connection to the proxy. Note that
a connection can correspond to multiple HTTP requests.
.. versionchanged:: 0.14
:param Layer root_layer: The root layer, which provides transparent access to all attributes of the
:py:class:`~mitmproxy.proxy.RootContext`. For example, ``root_layer.client_conn.address``
gives the remote address of the connecting client.
.. py:function:: clientdisconnect(context, root_layer)
Called when a client disconnects from the proxy.
.. versionchanged:: 0.14
:param Layer root_layer: see :py:func:`clientconnect`
.. py:function:: serverconnect(context, server_conn)
Called before the proxy initiates a connection to the target server. Note that
a connection can correspond to multiple HTTP requests.
:param ServerConnection server_conn: The server connection object. It is guaranteed to have a
non-None ``address`` attribute.
.. py:function:: serverdisconnect(context, server_conn)
Called when the proxy has closed the server connection.
.. versionadded:: 0.14
:param ServerConnection server_conn: see :py:func:`serverconnect`
HTTP Events
^^^^^^^^^^^
.. py:function:: request(context, flow)
Called when a client request has been received. The ``flow`` object is
guaranteed to have a non-None ``request`` attribute.
:param HTTPFlow flow: The flow containing the request which has been received.
The object is guaranteed to have a non-None ``request`` attribute.
.. py:function:: responseheaders(context, flow)
Called when the headers of a server response have been received.
This will always be called before the response hook.
:param HTTPFlow flow: The flow containing the request and response.
The object is guaranteed to have non-None ``request`` and
``response`` attributes. ``response.content`` will be ``None``,
as the response body has not been read yet.
.. py:function:: response(context, flow)
Called when a server response has been received.
:param HTTPFlow flow: The flow containing the request and response.
The object is guaranteed to have non-None ``request`` and
``response`` attributes. ``response.body`` will contain the raw response body,
unless response streaming has been enabled.
.. py:function:: error(context, flow)
Called when a flow error has occurred, e.g. invalid server responses, or
interrupted connections. This is distinct from a valid server HTTP error
response, which is simply a response with an HTTP error code.
:param HTTPFlow flow: The flow containing the error.
It is guaranteed to have non-None ``error`` attribute.
WebSockets Events
^^^^^^^^^^^^^^^^^
.. py:function:: websocket_handshake(context, flow)
Called when a client wants to establish a WebSockets connection.
The WebSockets-specific headers can be manipulated to manipulate the handshake.
The ``flow`` object is guaranteed to have a non-None ``request`` attribute.
:param HTTPFlow flow: The flow containing the request which has been received.
The object is guaranteed to have a non-None ``request`` attribute.
TCP Events
^^^^^^^^^^
.. py:function:: tcp_message(context, tcp_msg)
.. warning:: API is subject to change
If the proxy is in :ref:`TCP mode <tcpproxy>`, this event is called when it
receives a TCP payload from the client or server.
The sender and receiver are identifiable. The message is user-modifiable.
:param TcpMessage tcp_msg: see *examples/tcp_message.py*
API
---
The canonical API documentation is the code, which you can browse here, locally or on `GitHub`_.
*Use the Source, Luke!*
The main classes you will deal with in writing mitmproxy scripts are:
:py:class:`mitmproxy.flow.FlowMaster`
- The "heart" of mitmproxy, usually subclassed as :py:class:`mitmproxy.dump.DumpMaster` or
:py:class:`mitmproxy.console.ConsoleMaster`.
:py:class:`~mitmproxy.models.ClientConnection`
- Describes a client connection.
:py:class:`~mitmproxy.models.ServerConnection`
- Describes a server connection.
:py:class:`~mitmproxy.models.HTTPFlow`
- A collection of objects representing a single HTTP transaction.
:py:class:`~mitmproxy.models.HTTPRequest`
- An HTTP request.
:py:class:`~mitmproxy.models.HTTPResponse`
- An HTTP response.
:py:class:`~mitmproxy.models.Error`
- A communications error.
:py:class:`netlib.http.Headers`
- A dictionary-like object for managing HTTP headers.
:py:class:`netlib.certutils.SSLCert`
- Exposes information SSL certificates.
Running scripts in parallel
---------------------------
We have a single flow primitive, so when a script is blocking, other requests are not processed.
While that's usually a very desirable behaviour, blocking scripts can be run threaded by using the
:py:obj:`mitmproxy.script.concurrent` decorator.
**If your script does not block, you should avoid the overhead of the decorator.**
.. literalinclude:: ../../examples/nonblocking.py
:caption: examples/nonblocking.py
:language: python
Make scripts configurable with arguments
----------------------------------------
Sometimes, you want to pass runtime arguments to the inline script. This can be simply done by
surrounding the script call with quotes, e.g. ```mitmdump -s 'script.py --foo 42'``.
The arguments are then exposed in the start event:
.. literalinclude:: ../../examples/modify_response_body.py
:caption: examples/modify_response_body.py
:language: python
Running scripts on saved flows
------------------------------
Sometimes, we want to run a script on :py:class:`~mitmproxy.models.Flow` objects that are already
complete. This happens when you start a script, and then load a saved set of flows from a file
(see the "scripted data transformation" example :ref:`here <mitmdump>`).
It also happens when you run a one-shot script on a single flow through the ``|`` (pipe) shortcut
in mitmproxy.
In this case, there are no client connections, and the events are run in the following order:
**start**, **request**, **responseheaders**, **response**, **error**, **done**.
If the flow doesn't have a **response** or **error** associated with it, the matching events will
be skipped.
Spaces in the script path
-------------------------
By default, spaces are interpreted as a separator between the inline script and its arguments
(e.g. ``-s 'foo.py 42'``). Consequently, the script path needs to be wrapped in a separate pair of
quotes if it contains spaces: ``-s '\'./foo bar/baz.py\' 42'``.
.. _GitHub: https://github.com/mitmproxy/mitmproxy

View File

@@ -1,26 +0,0 @@
FlowMaster
==========
.. note::
We strongly encourage you to use :ref:`inlinescripts` rather than subclassing mitmproxy's FlowMaster.
- Inline Scripts are equally powerful and provide an easier syntax.
- Most examples are written as inline scripts.
- Multiple inline scripts can be used together.
- Inline Scripts can either be executed headless with mitmdump or within the mitmproxy UI.
All of mitmproxy's basic functionality is exposed through the **mitmproxy**
library. The example below shows a simple implementation of the "sticky cookie"
functionality included in the interactive mitmproxy program. Traffic is
monitored for ``Cookie`` and ``Set-Cookie`` headers, and requests are rewritten
to include a previously seen cookie if they don't already have one. In effect,
this lets you log in to a site using your browser, and then make subsequent
requests using a tool like curl, which will then seem to be part of the
authenticated session.
.. literalinclude:: ../../examples/stickycookies
:caption: examples/stickycookies
:language: python

View File

@@ -0,0 +1,79 @@
.. _overview:
Overview
=========
Mitmproxy has a powerful scripting API that allows you to control almost any
aspect of traffic being proxied. In fact, much of mitmproxy's own core
functionality is implemented using the exact same API exposed to scripters (see
:src:`mitmproxy/builtins`).
Scripting is event driven, with named handlers on the script object called at
appropriate points of mitmproxy's operation. Here's a complete mitmproxy script
that adds a new header to every HTTP response before it is returned to the
client:
.. literalinclude:: ../../examples/add_header.py
:caption: :src:`examples/add_header.py`
:language: python
All events that deal with an HTTP request get an instance of
:py:class:`~mitmproxy.models.HTTPFlow`, which we can use to manipulate the
response itself. We can now run this script using mitmdump or mitmproxy as
follows:
>>> mitmdump -s add_header.py
The new header will be added to all responses passing through the proxy.
mitmproxy comes with a variety of example inline scripts, which demonstrate
many basic tasks.
Running scripts in parallel
---------------------------
We have a single flow primitive, so when a script is blocking, other requests are not processed.
While that's usually a very desirable behaviour, blocking scripts can be run threaded by using the
:py:obj:`mitmproxy.script.concurrent` decorator.
**If your script does not block, you should avoid the overhead of the decorator.**
.. literalinclude:: ../../examples/nonblocking.py
:caption: examples/nonblocking.py
:language: python
Make scripts configurable with arguments
----------------------------------------
Sometimes, you want to pass runtime arguments to the inline script. This can be simply done by
surrounding the script call with quotes, e.g. ```mitmdump -s 'script.py --foo 42'``.
The arguments are then exposed in the start event:
.. literalinclude:: ../../examples/modify_response_body.py
:caption: examples/modify_response_body.py
:language: python
Running scripts on saved flows
------------------------------
Sometimes, we want to run a script on :py:class:`~mitmproxy.models.Flow` objects that are already
complete. This happens when you start a script, and then load a saved set of flows from a file
(see the "scripted data transformation" example :ref:`here <mitmdump>`).
It also happens when you run a one-shot script on a single flow through the ``|`` (pipe) shortcut
in mitmproxy.
In this case, there are no client connections, and the events are run in the following order:
**start**, **request**, **responseheaders**, **response**, **error**, **done**.
If the flow doesn't have a **response** or **error** associated with it, the matching events will
be skipped.
Spaces in the script path
-------------------------
By default, spaces are interpreted as a separator between the inline script and its arguments
(e.g. ``-s 'foo.py 42'``). Consequently, the script path needs to be wrapped in a separate pair of
quotes if it contains spaces: ``-s '\'./foo bar/baz.py\' 42'``.
.. _GitHub: https://github.com/mitmproxy/mitmproxy