Metadata-Version: 2.1
Name: baish
Version: 0.1.0a4
Summary: A security-focused tool that uses LLMs to analyze shell scripts
Author-email: curtis <curtis@serverascode.com>
License: GPL-3.0
Project-URL: Homepage, https://github.com/taicodotca/baish
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Security
Classifier: Topic :: System :: Systems Administration
Classifier: Topic :: Utilities
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: langchain>=0.3.8
Requires-Dist: anthropic>=0.39.0
Requires-Dist: groq>=0.12.0
Requires-Dist: yara-python>=4.5.1
Requires-Dist: python-magic>=0.4.27
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: rich>=13.9.4
Requires-Dist: langchain-anthropic>=0.3.0
Requires-Dist: langchain-groq>=0.2.1
Requires-Dist: langchain-core>=0.3.21
Requires-Dist: pydantic>=2.10.2
Requires-Dist: loguru>=0.7.2

![Baish Logo](img/baish.png)

# Baish (Bash AI Shield)

Baish is a security-focused tool that uses Large Language Models (LLMs) and other heuristics to analyze shell scripts before they are executed. It's designed to be a more secure alternative to the common `curl | bash` pattern.

Importantly, Baish is a cybersecurity learning project: the developers implement a relatively narrow solution while learning a lot about the broader problem space, for example how to use LLMs, how to secure them, and how to handle and understand untrusted input.

It may be unlikely that anyone who would `curl | bash` would run `curl | baish --shield | bash` instead (outside, perhaps, of a CI/CD pipeline). That said, as an industry we ultimately run a lot of unknown code, so there may well be uses for this.

The underlying problems are the same in almost every application, and we are trying to use different heuristics in combination with general AI capabilities to build a cybersecurity tool. So there are two parts to the project:

1. Build a tool that uses LLMs to help improve security
2. Understand how to use LLMs themselves more securely

## About TAICO

The [Toronto Artificial Intelligence and Cybersecurity Organization (TAICO)](https://taico.ca) is a group of AI and cybersecurity experts who meet monthly to discuss the latest trends and technologies in the field. Baish is a project of TAICO.

## Caveats and Disclaimers

⚠️ Baish's analysis is not foolproof! This is a proof of concept! To be completely sure that a script is safe, you would have to review and analyze it yourself.

⚠️ Different LLM providers will give different results. One provider and one model may give a script a low risk score, while another model or provider gives a high risk score. You would have to experiment with different providers and models to see which one you trust the most.

⚠️ Baish is in heavy development. Expect breaking changes.

⚠️ Support for local LLMs via Ollama is still experimental and may not work as expected.

## Features

- Accepts files on stdin, à la the `curl | bash` pattern, but instead you would do `curl | baish --shield | bash`
- Can analyze any file, not just shell scripts curled to bash
- Analyzes scripts using various configurable LLMs for potential security risks
- Provides a harm score (1-10) indicating potential dangerous operations (higher is more dangerous)
- Provides a complexity score (1-10) indicating how complex the script is (higher is more complex)
- Saves downloaded scripts for later review 
- Logs all requests and responses from LLMs along with the script ID
- Uses YARA rules and other heuristics to detect potential prompt injection

## Large Language Model Provider Support

Baish currently supports the following providers:

* Groq
* Anthropic
* Experimental support for Ollama for local LLMs (e.g. llama3, mistral)

It is straightforward to add support for other providers (pretty much anything LangChain supports), and contributions are welcome!

## Prerequisites

* An API key from a supported LLM provider.
* Knowing which model from the provider you are going to use.
* Python 3.10 or later
* libmagic (for file type detection)
  * Ubuntu/Debian: `apt install libmagic1`
  * RHEL/CentOS: `dnf install file-libs`
  * macOS: `brew install libmagic`

## Installation

### Virtual Environment

* It's best to create a virtual environment in which to install baish:

```bash
python3 -m venv baish-env
source baish-env/bin/activate
```

### From PyPI

```bash
pip install baish
```

### From Source

* Checkout the repo:

```bash
git clone https://github.com/taicodotca/baish.git
cd baish
```

* Install the dependencies:

```bash
pip install -r requirements.txt
```

* Set the API key for the LLM provider in the `config.yaml` file.  You can also specify a different model, temperature, etc.

* Run baish:

```bash
$ ./baish 
Error: No input provided
Usage: cat script.sh | baish
```

## Usage

* Technically, you can pipe any file to baish, but it's really meant to be used with shell scripts, especially via the `curl evil.com/evil.sh | baish` pattern.

```bash
curl -sSL https://thisisapotentiallyunsafescript.com/script.sh | baish
```

Baish will output the harm score, complexity score, and an explanation for why the script is either safe or not.

### Setting Provider and Model

You can set the provider and model in the `config.yaml` file.   

E.g. `config.yaml`:

```yaml
default_llm: haiku # default model to use
llms:
  haiku: # memorable name
    provider: anthropic # provider name
    model: claude-3-5-haiku-latest # model name
    temperature: 0.1 # temperature

  other_model:
    provider: groq
    model: llama3-70b-8192
    temperature: 0.1
```
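For illustration, here is how a config in this shape can be parsed and the default LLM's settings resolved with PyYAML (a sketch of the structure only, not Baish's actual loading code; `load_llm_config` is a hypothetical helper):

```python
import yaml  # PyYAML, already a baish dependency

EXAMPLE_CONFIG = """
default_llm: haiku
llms:
  haiku:
    provider: anthropic
    model: claude-3-5-haiku-latest
    temperature: 0.1
"""

def load_llm_config(text):
    """Parse the YAML and return the settings dict for the default LLM."""
    cfg = yaml.safe_load(text)
    return cfg["llms"][cfg["default_llm"]]

settings = load_llm_config(EXAMPLE_CONFIG)
print(settings["provider"], settings["model"])
```

The key under `llms` (here `haiku`) is just a memorable name; the `provider` and `model` fields are what select the actual backend.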

### Using Ollama

If using Ollama, you can also specify the base URL, though it will default to `http://localhost:11434` if not specified.

```yaml
other_model:
  provider: ollama
  model: llama3:latest
  url: http://localhost:11434
```
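Before pointing Baish at Ollama, you can sanity-check that the server is reachable. `/api/tags` is Ollama's model-listing endpoint; adjust the URL if you are not using the default port:

```shell
# Probe the local Ollama server; /api/tags lists the installed models.
if curl -fsS http://localhost:11434/api/tags >/dev/null 2>&1; then
  echo "ollama reachable"
else
  echo "ollama not reachable"
fi
```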

Currently our prompt is quite long, and some local models have small default context windows. For example, when using llama3, the prompt length limit defaults to 2048 tokens, so you may see warnings like this:

```
time=2024-12-08T11:22:33.343-05:00 level=WARN source=runner.go:129 msg="truncating input prompt" limit=2048 prompt=2815 keep=25 new=2048
```

You can increase the context window by setting the `num_ctx` parameter inside an interactive `ollama run` session:

```
$ ollama run llama3
>>> /set parameter num_ctx 4096
```
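If you want the larger context window to persist across sessions, Ollama can also bake parameters into a derived model via a Modelfile (a sketch; `llama3-4k` is just an example name):

```
FROM llama3
PARAMETER num_ctx 4096
```

Then run `ollama create llama3-4k -f Modelfile` and set `model: llama3-4k` in your `config.yaml`.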

## Examples

Here are a few examples of real-world scripts that Baish can help you analyze before execution, mostly installers for popular software.

```text
$ curl -fsSL https://ollama.com/install.sh | ./baish
⠙ Analyzing file...
╭────────────────────────────── Baish - Bash AI Shield ───────────────────────────────╮
│ Analysis Results - script_1732984526.sh                                             │
│                                                                                     │
│ Harm Score:       2/10 ████────────────────                                         │
│ Complexity Score: 8/10 ████████████████────                                         │
│ Uses Root:    True                                                                  │
│                                                                                     │
│ File type: text/x-shellscript                                                       │
│                                                                                     │
│ Explanation:                                                                        │
│ This script is a Linux installer for Ollama, a software package. It installs Ollama │
│ on the system, detects the operating system architecture, and installs the          │
│ appropriate version of Ollama. It also checks for and installs NVIDIA CUDA drivers  │
│ if necessary. The script uses various tools and commands to perform these tasks,    │
│ including curl, tar, and dpkg. The script is designed to be run as root and         │
│ modifies the system by installing software and configuring system settings.         │
│                                                                                     │
│ Script saved to: /home/curtis/.baish/scripts/script_1732984526.sh                   │
│ To execute, run: bash /home/curtis/.baish/scripts/script_1732984526.sh              │
│                                                                                     │
│ ⚠️  AI-based analysis is not perfect and should not be considered a complete         │
│ security audit. For complete trust in a script, you should analyze it in detail     │
│ yourself. Baish has downloaded the script so you can review and execute it in your  │
│ own environment.                                                                    │
╰─────────────────────────────────────────────────────────────────────────────────────╯
```

Install rvm:

```
curl -sSL https://get.rvm.io | ./baish --debug
```

Install rust:

```
curl --silent https://sh.rustup.rs | ./baish
```

Install docker:

```
curl -fsSL https://get.docker.com | ./baish --debug
```

### Shield Mode

Baish can also be used in "shield" mode, which will error out if the script is not safe.

```
curl -sSL https://thisisapotentiallyunsafescript.com/script.sh | ./baish -s | bash
```

For example, here is an unsafe script run through baish in shield mode: bash executes the output of baish, which in this case is just an error message:

```bash
$ cat tests/fixtures/secret-upload.sh | ./baish -s | bash
Script unsafe: High risk score detected
```

Or without piping to bash. Note how the output is itself a small "script" that echoes the error message, which bash would then execute:

```bash
$ cat tests/fixtures/secret-upload.sh | ./baish -s
echo "Script unsafe: High risk score detected"
```
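Because the unsafe output is itself a tiny script, a wrapper can inspect baish's output before handing it to bash. The sketch below assumes the unsafe output keeps the `echo "Script unsafe` prefix shown above; `is_unsafe_stub` is a hypothetical helper, not part of baish:

```shell
# Hypothetical helper: succeed if shield-mode output looks like the
# "Script unsafe" stub rather than a real script. The grep prefix is an
# assumption based on the example output above.
is_unsafe_stub() {
  printf '%s\n' "$1" | grep -q '^echo "Script unsafe'
}

out='echo "Script unsafe: High risk score detected"'  # e.g. out="$(cat script.sh | baish -s)"
if is_unsafe_stub "$out"; then
  echo "refusing to execute"
else
  printf '%s\n' "$out" | bash
fi
```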

## Logging and Stored Scripts

Baish logs all requests and responses from LLMs along with the script ID. It also saves the script to disk with the ID so it can be reviewed later.

Below we see the results of one Baish run.

```
$ tree ~/.baish/
/home/ubuntu/.baish/
├── logs
│   └── 2024-12-05_15-50-43_c6f3de91_llm.jsonl
└── scripts
    └── 2024-12-05_15-50-43_c6f3de91_script.sh

3 directories, 2 files
```
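As a sketch for reviewing those logs programmatically (the log's field names aren't documented here, so this only parses the JSON Lines format; `read_llm_log` is a hypothetical helper, not part of baish):

```python
import json
from pathlib import Path

def read_llm_log(path):
    """Parse a Baish LLM log (JSON Lines: one JSON object per line)."""
    records = []
    for line in Path(path).read_text().splitlines():
        if line.strip():  # skip blank lines
            records.append(json.loads(line))
    return records

# Example:
# read_llm_log(Path.home() / ".baish/logs/2024-12-05_15-50-43_c6f3de91_llm.jsonl")
```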

## Known Issues

* LLMs with short context windows, whether local or commercial models, may fail to analyze longer scripts due to prompt length limitations.

## Future Work and TODOs

| Feature | Status | Description | Details |
|---------|--------|-------------|----------|
| JSON Output | DONE | Structured output format | Enables programmatic parsing of Baish results |
| LLM Logging | DONE | Request/response tracking | Log all LLM interactions with script IDs for audit trails |
| Prompt Injection Detection | DONE | YARA-based detection | Use YARA rules to identify potential prompt injection attempts |
| Shield Mode | DONE | Safe execution pipeline | Enable `curl \| baish \| bash` pattern with security controls |
| System Prompts | DONE | LLM prompt configuration | Configure system prompts for supported LLM providers |
| Root Usage Detection | IN PROGRESS | Improve detection | Enhance accuracy of root privilege usage detection |
| End to End Tests | IN PROGRESS | Dockerized tests | Run end to end tests in Docker |
| Atomic Red Team Integration | TODO | Use ART for testing | Use Atomic Red Team tests to validate Baish's detection capabilities against known malicious patterns |
| CI/CD Mode | TODO | Add pipeline integration | Create a specialized mode for CI/CD environments |
| Directory Analysis | TODO | Bulk file scanning | Analyze multiple files and generate comprehensive security reports |
| Custom YARA Rules | TODO | User-defined rules | Allow users to add their own YARA rules for custom threat detection |
| N/A Scoring | TODO | Better non-script handling | Display N/A instead of scores for non-scripts or prompt injection cases |
| Vector DB Memory | TODO | Long-term analysis storage | Implement vector database for historical analysis and pattern recognition |
| LLM Self-evaluation | TODO | Prompt injection checks | Enable LLMs to self-evaluate for prompt injection vulnerabilities |
| Token Length Management | TODO | Better chunking | Improve text chunking for large scripts using LangChain |
| Custom Prompts | TODO | User-defined prompts | Allow users to specify custom analysis prompts |
| Guardrails Integration | TODO | Add guardrails-ai | Integrate with guardrails-ai for additional security checks |
| Script Deobfuscation | TODO | Pre-analysis cleanup | Implement deobfuscation using tools like debash |
| VM Sandbox | TODO | Isolated execution | Run scripts in VM sandbox before actual execution |
| Shell Compilation | TODO | Compiled shell scripts | Support for compiled shell scripts using shc |
| One-time API Keys | TODO | Temporary credentials | Implement single-use API keys for safer execution |
| Base64 Detection | TODO | Encoded content handling | Detect and handle base64 encoded content |
| VirusTotal Integration | TODO | Hash checking | Check script hashes against VirusTotal database |
| VM Detonation | TODO | Dynamic analysis | Execute scripts in isolated environments for behavior analysis |
| Ollama JSON Support | TODO | JSON output | Support for Ollama JSON output format which comes in version 0.5 |
| Fix Debug Logging | TODO | Debug logging | Right now many debug statements are left in the code |
| Results Manager Coverage | TODO | Results Manager | Results Manager should manage logs, results, and scripts |

## Further Reading

* https://www.seancassidy.me/dont-pipe-to-your-shell.html
* https://www.djm.org.uk/posts/protect-yourself-from-non-obvious-dangers-curl-url-pipe-sh/index.html
* https://github.com/djm/pipe-to-sh-poc
* https://www.arp242.net/curl-to-sh.html
* https://github.com/greyhat-academy/malbash
