Metadata-Version: 2.1
Name: auditmatica
Version: 0.1.0
Summary: Audit Archivematica user activities via nginx access logs.
Home-page: https://github.com/artefactual-labs/auditmatica
Author: Artefactual Systems, Inc.
Author-email: info@artefactual.com
License: AGPL 3
Description: # auditmatica
        
        ## About
        
        Audit [Archivematica][am] user activities via nginx access logs.
        
        ## Overview
        
        `auditmatica` is intended to facilitate auditing of user activities in
        Archivematica and the Archivematica Storage Service. It uses nginx access logs
        as its data source, and outputs either logs in [Common Event Format (CEF)][cef]
        or a JSON overview of user activities.
        
        ## Usage
        
        `auditmatica` has two subcommands, `write-cef` and `overview`.
        
        ```
        Usage: auditmatica [OPTIONS] COMMAND [ARGS]...
        
          Auditmatica: Archivematica auditing package
        
        Options:
          --version  Show the version and exit.
          --help     Show this message and exit.
        
        Commands:
          overview   Print overview of user activities from nginx access log.
          write-cef  Write Common Event Format (CEF) log from nginx access log.
        
        ```
        
        ### `write-cef`
        
        To write CEF events, use the `write-cef` subcommand. E.g.:
        
        ```bash
        auditmatica write-cef /path/to/nginx/access.log
        ```
        
        or
        
        ```bash
        cat /var/log/nginx/access.log | auditmatica write-cef
        ```
        
        CEF is a widely used standard for network and security analysis. CEF events
        can be sent to applications for review, monitoring, and visualization via a
        file or over syslog. CEF events written by `auditmatica` include an event name,
        signature (unique identifier), and severity level (0-10), which are determined
        based on details from the nginx access log such as URL, HTTP method, and HTTP
        return code.
        
        A sample CEF event written by `auditmatica` looks like the following:
        
        ```
        CEF:0|Artefactual Systems, Inc.|Archivematica|hosted|16|AIP downloaded from Archival Storage|3|cs1Label=requestClientApplication cs1="Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0" rt=Jan 13 2021 20:01:33 requestMethod=GET request=/archival-storage/download/aip/8fa54cfc-f5c5-4673-b44e-fc514496bad7/ src=172.19.0.1 suser=test msg=UUID:8fa54cfc-f5c5-4673-b44e-fc514496bad7
        ```
        
        For a comprehensive list of Archivematica CEF events, see
        [IMPLEMENTATION.md](IMPLEMENTATION.md).
        
        For more details on CEF, see the [CEF specification][cef].
        
        This command accepts several optional arguments:
        
        ```
        Usage: auditmatica write-cef [OPTIONS] [LOG]
        
          Write Common Event Format (CEF) log from nginx access log.
        
        Options:
          -o, --output PATH       Filepath for output CEF file (default=None, print to
                                  stdout)
        
          -s, --syslog            Write CEF events to syslog instead of file
          --syslog-address TEXT   Address for syslog connection (default='/dev/log')
          --syslog-facility TEXT  Facility for syslog messages (default='USER')
          --syslog-port INTEGER   Port for remote syslog connections
          --ss-base-url TEXT      Override the Storage Service URL to scan for
                                  (default='http://127.0.0.1:62081')
        
          --suppress              Suppress log lines that do not map to a specific
                                  event instead of reverting to default event
        
          -v, --verbose           Enable verbose error message reporting
          --help                  Show this message and exit.
        
        
        ```
        
        #### Storage Service
        
        `auditmatica` looks for Storage Service events in the nginx access log by
        checking each URL to determine if it begins with the expected base URL of the
        Storage Service. By default, this is `http://127.0.0.1:62081`.
        
        To override the Storage Service URL to scan for, use `--ss-base-url`. E.g.:
        
        ```
        auditmatica --ss-base-url http://archivematica.example.com:8000
        ```
        
        #### Output
        
        By default, `auditmatica` writes CEF events to stdout and some end-user facing
        messages to stderr.
        
        To write CEF events to a file, use the `-o/--output` option to specify a
        filepath for the output file. E.g.:
        
        ```
        auditmatica write-cef /path/to/nginx/access.log --output my-output-file.log
        ```
        
        To write CEF events to [syslog][syslog], use the `-s/--syslog` option. By
        default, this will write syslog messages to `/dev/log/` using the `USER`
        facility. The `--syslog-address`, `--syslog-port`, and `--syslog-facility`
        options can be used to customize the syslog connection. E.g.: 
        
        ```bash
        auditmatica write-cef -s \
          --syslog-address localhost \
          --syslog-port 514 \
          --syslog-facility local0 \
          /path/to/nginx/access.log
        ```
        
        `--syslog-port` will only be used if an address other than `/dev/log` is also
        passed with`--syslog-address`.
        
        Valid `--syslog-facility` values are `local0`-`local7`, which are reserved by
        syslog for local use.
        
        ### `overview`
        
        To generate a high-level JSON overview of Archivematica user activities, use
        the `overview` subcommand. E.g.:
        
        ```bash
        cat access.log | auditmatica overview
        ```
        ```
        Usage: auditmatica overview [OPTIONS] [LOG]
        
          Write JSON overview of user activities from nginx access log.
        
        Options:
          --ss-base-url TEXT  Override the Storage Service URL to scan for
                              (default='http://127.0.0.1:62081')
          --help              Show this message and exit.
        
        ```
        
        #### Storage Service
        
        `auditmatica` looks for Storage Service events in the nginx access log by
        checking each URL to determine if it begins with the expected base URL of the
        Storage Service. By default, this is `http://127.0.0.1:62081`.
        
        To override the Storage Service URL to scan for, use `--ss-base-url`. E.g.:
        
        ```
        auditmatica --ss-base-url http://archivematica.example.com:8000
        ```
        
        ## Download
        
        Download from GitHub:
        
        ```bash
        git clone https://github.com/artefactual-labs/auditmatica.git
        ```
        
        ## Install
        
        ### General
        
        `auditmatica` requires Python 3.6+.
        
        Change into the cloned directory and install:
        
        ```bash
        cd auditmatica/
        pip install .
        ```
        
        ### Usernames
        
        Including authenticated usernames in `auditmatica`'s outputs requires some
        additional setup:
        
        1. Enable auditing middleware in Archivematica and the Storage Service via
        environment variables
        ```bash
        # Archivematica 1.13+
        ARCHIVEMATICA_DASHBOARD_DASHBOARD_AUDIT_LOG_MIDDLEWARE: "true"
        
        # Storage Service 0.18+
        SS_AUTH_LOG_MIDDLEWARE: "true"
        ```
        
        2. Restart Archivematica and Storage Service services
        ```bash
        sudo service archivematica-dashboard restart
        sudo service archivematica-storage-service restart
        ```
        
        3. Add the following configuration to the `http` block of the `nginx.conf`
        configuration file (likely `/etc/nginx/nginx.conf`, though this may vary)
        ```bash
        log_format main '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for" user=$upstream_http_x_username';
        access_log "/var/log/nginx/access.log" main;
        ```
        This format must be exact, as it maps directly to a pattern `auditmatica` uses
        to parse nginx access log lines with usernames. If an `access_log` is already
        specified, replace it or name it something other than `main`. The `access_log`
        path (by default `/var/log/nginx/access.log`) can be changed as needed.
        
        4. Optionally, add `proxy_hide_header x-username;` to the nginx `server` blocks
        to prevent the authenticated username from being sent back with each response
        to the client device.
        
        5. Restart nginx service:
        ```bash
        sudo service nginx restart
        ```
        
        If the above is configured correctly, the resulting nginx access log lines
        should look as follows:
        ```
        172.10.5.1 - - [13/Jan/2021:19:53:10 +0000] "GET /backlog/download/2e28b8a9-351c-4da7-92d2-837ac04cd2d9/ HTTP/1.1" 200 19436360 "http://127.0.0.1:62080/backlog/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0" "-" user=test
        ```
        
        ## Development
        
        For development, it may be useful to install `auditmatica` with
        `pip install -e .`, which will apply changes made to the source code
        immediately.
        
        To run all tests with tox: `tox`
        
        Or run tests directly with pytest:
        ```bash
        pip install -r requirements/test.txt
        pytest
        ```
        
        [am]: https://archivematica.org
        [cef]: https://community.microfocus.com/t5/ArcSight-Connectors/ArcSight-Common-Event-Format-CEF-Implementation-Standard/ta-p/1645557
        [syslog]: https://tools.ietf.org/html/rfc5424
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: GNU Affero General Public License v3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
