Metadata-Version: 2.1
Name: synthetic-sample
Version: 1.0.6
Summary: A generator for synthetic sales data
Home-page: https://github.com/amendmentai/synthetic_sample
Author: Makani Cartwright
Author-email: makani@amendmentai.com
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown

# synthetic_sample

synthetic_sample is a data generation application that produces synthetic sales transactions over a time series, including the associated shipment and product data.

## Usage

Sample data is generated by running `synthetic_sample_generator.py` as follows:

``` 
python3 synthetic_sample_generator.py --json_filepath JSON_FILEPATH --output_directory OUTPUT_DIRECTORY --create_records
```
where
- `json_filepath` is the path to the input JSON file (see Request Requirements below)
- `output_directory` is the directory where output data is saved, in CSV format
- `create_records` is a flag indicating that raw record data should also be saved to the output directory. Without this flag, only aggregate output data is saved
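
When the generator is driven from another Python program, the command above can be assembled as an argument list. The following is a minimal sketch, not part of the package: `build_command` is a hypothetical helper, and the file and directory names in the usage note are placeholders. Only the three flags documented above are used.

```python
import sys
from typing import List


def build_command(json_filepath: str, output_directory: str,
                  create_records: bool = False) -> List[str]:
    """Assemble the argument list for synthetic_sample_generator.py.

    Hypothetical helper; it only mirrors the command line shown above.
    """
    command = [
        sys.executable,  # the Python interpreter currently running
        "synthetic_sample_generator.py",
        "--json_filepath", json_filepath,
        "--output_directory", output_directory,
    ]
    if create_records:
        # Without this flag only aggregate output data is saved.
        command.append("--create_records")
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the directory that contains the script, for example with `build_command("request.json", "output", create_records=True)`.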

## Request Requirements
The input must be a JSON object with the following fields:
- Required: 
  - `start_date`: any date in the first period to include, e.g. if 2020/02/15 is provided, the full week containing that date is included
  - `end_date`: any date in the last period to include, e.g. if 2020/02/15 is provided, the full week containing that date is included
  - `annual_growth_factor`: year-over-year growth factor; 10% growth corresponds to a value of 1.1
  - `period_type`: indicates the type of curve to generate; supported values are "month" and "week"
  - at least one of:
    - `total_sales`: total number of sales for the period
    - `total_packages`: total number of packages shipped for the period
    - `total_quantity`: total number of items sold for the period
    - `annual_sales`: annualized number of sales for the period
    - `annual_packages`: annualized number of packages shipped for the period
    - `annual_quantity`: annualized number of items sold for the period
  - `curve_definition`: definition of the curve to create, either as a list of dictionaries (one per feature) or as a string naming the default curve to use
    - If a list of dictionaries is provided, each dictionary must adhere to the following structure:
      - Required Keys:
        - `anchor_type`: Type of annual anchor used to define the feature
          - Possible Values: "holiday", "week_of_year", "month_of_year", "day_of_year"
        - `anchor_point`: Annual point to define the feature
          - Possible values: (string) - holiday name, (int) - week, month, or day of year
        - `anchor_value`: Cumulative percent of total sales (0.0-1.0) completed by the end of the period of the anchor_point
      - Optional Keys:
        - `relative_start`: Number of periods before the anchor_point to define a relative cumulative percent value 
        - `start_value`: Cumulative percent of total sales (0.0-1.0) completed by the end of the period indicated by relative_start
        - `relative_end`: Number of periods after the anchor_point to define a relative cumulative percent value
        - `end_value`: Cumulative percent of total sales (0.0-1.0) completed by the end of the period indicated by relative_end
    - If a string is provided, it must correspond to a default in `synthetic_sample/defaults/curves/{period_type}/{curve_definition}.json`
      - The initial set of available curves is:
        - `modern_brand`
        - `modern_distributor`
        - `traditional_brand`
        - `traditional_distributor`
- Optional: 
  - `default_type`: string indicating the type of defaults to use; the available defaults are stored as JSON in `synthetic_sample/defaults/lib/`
  - `product_distribution`: dictionary of product labels (i.e. SKUs) and their relative weights
  - `week_distribution`: dictionary of weeks of the month (where 1 is the first week and -1 is the last) and their relative weights
  - `weekday_distribution`: dictionary of weekdays (where 0 is Monday and 6 is Sunday) and their relative weights
  - `seasonal_distribution`: dictionary of seasons ("Q1"..."Q4") and their relative weights
  - `modifiers`: list of any modifiers to apply. 
    - "covid": Applies a 33% boost to all periods between 2020/3/26 and 2021/9/1
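
The field rules above can be checked before a request is handed to the generator. The following is a minimal sketch under stated assumptions: `find_request_problems` and the constant names are hypothetical and not part of the package, and the check covers only the presence of the required fields, the "at least one of" volume rule, and the two documented `period_type` values.

```python
from typing import Any, Dict, List

# Field names are taken from the list above; the constants are illustrative.
REQUIRED_FIELDS = ("start_date", "end_date", "annual_growth_factor",
                   "period_type", "curve_definition")
VOLUME_FIELDS = ("total_sales", "total_packages", "total_quantity",
                 "annual_sales", "annual_packages", "annual_quantity")
PERIOD_TYPES = ("month", "week")


def find_request_problems(request: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in request:
            problems.append("missing required field: " + field)
    # At least one total or annualized volume must be supplied.
    if not any(field in request for field in VOLUME_FIELDS):
        problems.append("at least one of " + ", ".join(VOLUME_FIELDS) + " is required")
    period_type = request.get("period_type")
    if period_type is not None and period_type not in PERIOD_TYPES:
        problems.append("unsupported period_type: " + repr(period_type))
    return problems
```

A request loaded with `json.load` can be passed directly to `find_request_problems`; the generator itself may apply further checks that this sketch does not attempt to reproduce.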

### Example:
The request below generates monthly data from 2018-06 through 2020-12.
```JSON
{
  "start_date": "2018-06-01",
  "end_date": "2020-12-31",
  "total_sales": 1000000,
  "total_packages": 1500000,
  "total_quantity": 6000000,
  "annual_growth_factor": 1.15,
  "product_distribution": {
    "AAA-01" : 1,
    "AAA-02" : 2.5,
    "AAA-11" : 5.6,
    "BBB-10" : 0.5,
    "BBB-20" : 1
  },
  "week_distribution": {
    "1": 0.1,
    "-1": 0.5
  },
  "weekday_distribution": {
    "0": 0.0,
    "1": 0.0,
    "2": 0.0,
    "3": 0.0,
    "4": 0.0,
    "5": 2.0,
    "6": 1.0
  },
  "seasonal_distribution": {
    "Q1": 1,
    "Q2": 1,
    "Q3": 1,
    "Q4": 1
  },
  "period_type": "month",
  "curve_definition": [
    {
      "anchor_type": "month_of_year",
      "anchor_point": 1,
      "anchor_value": 0.0424
    },
    {
      "anchor_type": "month_of_year",
      "anchor_point": 2,
      "anchor_value": 0.103
    },
    {
      "anchor_type": "month_of_year",
      "anchor_point": 3,
      "anchor_value": 0.203
    },
    {
      "anchor_type": "month_of_year",
      "anchor_point": 4,
      "anchor_value": 0.3152
    },
    {
      "anchor_type": "month_of_year",
      "anchor_point": 5,
      "anchor_value": 0.4139
    },
    {
      "anchor_type": "month_of_year",
      "anchor_point": 6,
      "anchor_value": 0.4776
    },
    {
      "anchor_type": "month_of_year",
      "anchor_point": 7,
      "anchor_value": 0.5321
    },
    {
      "anchor_type": "month_of_year",
      "anchor_point": 8,
      "anchor_value": 0.5897
    },
    {
      "anchor_type": "month_of_year",
      "anchor_point": 9,
      "anchor_value": 0.6715
    },
    {
      "anchor_type": "month_of_year",
      "anchor_point": 10,
      "anchor_value": 0.7836
    },
    {
      "anchor_type": "month_of_year",
      "anchor_point": 11,
      "anchor_value": 0.9018
    },
    {
      "anchor_type": "month_of_year",
      "anchor_point": 12,
      "anchor_value": 1.0
    }
  ]
}
```
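
Because each `anchor_value` is a cumulative fraction of annual sales, the share attributed to a single month is the difference between consecutive anchors. The following is a minimal sketch of that arithmetic, assuming every period has its own `month_of_year` anchor as in the example above; `period_shares` is a hypothetical helper and does not describe how the package itself builds the curve.

```python
from typing import Dict, List


def period_shares(curve_definition: List[dict]) -> Dict[int, float]:
    """Convert cumulative anchor values into per-period shares.

    Assumes integer anchor points with one anchor per period.
    """
    ordered = sorted(curve_definition, key=lambda feature: feature["anchor_point"])
    shares = {}
    previous = 0.0  # cumulative fraction completed before the first period
    for feature in ordered:
        shares[feature["anchor_point"]] = feature["anchor_value"] - previous
        previous = feature["anchor_value"]
    return shares
```

With the example curve, month 1 receives 0.0424 of annual sales and month 2 receives 0.103 - 0.0424, roughly 0.0606. The shares sum to the last cumulative value, which is 1.0 for a complete year.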


