Metadata-Version: 2.1
Name: nkululeko
Version: 0.40.1
Summary: Do machine learning audio classification experiments
Home-page: https://github.com/felixbur/nkululeko
Author: Felix Burkhardt
Author-email: fxburk@gmail.com
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 3 - Alpha
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE

# Nkululeko

## Description
A project to detect speaker characteristics by machine learning experiments with a high-level interface.

The idea is to have a framework (based on e.g. sklearn and torch) that people who are not experienced programmers can use, as they mainly have to adapt one initialization parameter file per experiment.

* The latest features can be seen at [the ini-file options](./ini_file.md) that are used to control Nkululeko
* Below is a [Hello World example](#helloworld) that should get you set up quickly.
* [Here's a blog post on how to set up nkululeko on your computer.](http://blog.syntheticspeech.de/2021/08/30/how-to-set-up-your-first-nkululeko-project/)
* [Here's a slide presentation about nkululeko](docs/nkululeko.pdf)
* [Here's a video presentation about nkululeko](https://www.youtube.com/watch?v=Ueuetnu7d7M)
* [Here's the 2022 LREC article on nkululeko](http://felix.syntheticspeech.de/publications/Nkululeko_LREC.pdf)

Here are some examples of typical output:

### Confusion matrix
By default, Nkululeko displays results as a confusion matrix; for regression, the continuous values are binned into categories first.

![confusion matrix](images/conf_mat.png)
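
To illustrate the idea of binning regression results for a confusion matrix, here is a minimal sketch in plain Python. It is not Nkululeko's own code: the function names, the age values and the bin edges are all made up for this example.

```python
from bisect import bisect_right


def bin_values(values, edges):
    """Map each continuous value to the index of its bin.

    `edges` are the inner bin boundaries in ascending order,
    so len(edges) + 1 bins result.
    """
    return [bisect_right(edges, v) for v in values]


def confusion_matrix(truth, predicted, n_classes):
    """Count how often true class i (row) was predicted as class j (column)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(truth, predicted):
        matrix[t][p] += 1
    return matrix


# Hypothetical regression targets and predictions (e.g. speaker age in years).
true_age = [23.0, 35.0, 61.0, 47.0, 19.0, 70.0]
pred_age = [27.0, 44.0, 58.0, 39.0, 31.0, 66.0]
edges = [30.0, 50.0]  # three bins: below 30, 30 to below 50, 50 and above

matrix = confusion_matrix(
    bin_values(true_age, edges), bin_values(pred_age, edges), n_classes=3
)
for row in matrix:
    print(row)
```

Each row is a true bin and each column a predicted bin, so off-diagonal counts are predictions that landed in the wrong bin.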

### Epoch progression
The point when overfitting starts can sometimes be seen by looking at the results per epoch:

![epoch progression](images/epoch_progression.png)
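
If you want to find that point programmatically rather than by eye, something like the following sketch could work. It assumes you have one development-set loss value per epoch (lower is better); the function name, the `patience` parameter and the loss values are hypothetical and not part of Nkululeko.

```python
def overfitting_onset(dev_losses, patience=2):
    """Return the index of the best epoch once the dev loss has failed to
    improve for `patience` consecutive epochs, or None if that never happens."""
    best_epoch = 0
    for epoch, loss in enumerate(dev_losses):
        if loss < dev_losses[best_epoch]:
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            return best_epoch
    return None


# Hypothetical dev losses: improving until epoch 3, then getting worse.
dev_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.58, 0.66]
print(overfitting_onset(dev_losses))
```

A larger `patience` makes the check more tolerant of noisy epochs, at the cost of reporting the onset later.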

### Feature importance
Using the *explore* interface, Nkululeko analyses the importance of acoustic features:

![feature importance](images/feat_importance.png)

### Feature distribution
Nkululeko can also show the distribution of specific features per category:

![feature distribution](images/feat_dist.png)

### t-SNE plots
A t-SNE plot can give you an estimate of whether your acoustic features are useful at all:

![tsne plot](images/tsne.png)

### Data distribution
Sometimes you only want to take a look at your data:

![data distribution](images/data_plot.png)

## Installation

Create and activate a virtual Python environment and simply run
```
pip install nkululeko
```

Some examples for *ini*-files (which you use to control nkululeko) are in the [demo folder](https://github.com/felixbur/nkululeko/tree/main/demos).



## Usage
Basically, you specify your experiment in an "ini" file (e.g. *experiment.ini*) and then call Nkululeko to run the experiment like this:
  * ```python -m nkululeko.nkululeko --config experiment.ini```
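
If you would rather start runs from a Python script (for example to run several ini files one after another), the same command can be wrapped with `subprocess`. This is only a sketch: `build_command` and `run_experiments` are helper names invented here, and the ini file name is a placeholder.

```python
import subprocess
import sys
from pathlib import Path


def build_command(config_path):
    """Build the argument list for one Nkululeko run with the current interpreter."""
    return [sys.executable, "-m", "nkululeko.nkululeko", "--config", str(Path(config_path))]


def run_experiments(config_paths):
    """Run each experiment in turn; return a dict mapping config path to exit code."""
    results = {}
    for config_path in config_paths:
        completed = subprocess.run(build_command(config_path))
        results[str(config_path)] = completed.returncode
    return results


# Show the command that would be executed for a placeholder ini file.
print(build_command("experiment.ini"))
```

Using `sys.executable` keeps the run inside the virtual environment that the script itself was started from.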

Alternatively, there is a central "experiment" class that you can use in your own experiment scripts.

There's my [blog](http://blog.syntheticspeech.de/?s=nkululeko) with tutorials:
* [Introduction](http://blog.syntheticspeech.de/2021/08/04/machine-learning-experiment-framework/)
* [Nkululeko FAQ](http://blog.syntheticspeech.de/2022/07/07/nkululeko-faq/)
* [How to set up your first nkululeko project](http://blog.syntheticspeech.de/2021/08/30/how-to-set-up-your-first-nkululeko-project/)
* [Setting up a base nkululeko experiment](http://blog.syntheticspeech.de/2021/10/05/setting-up-a-base-nkululeko-experiment/)
* [How to import a database](http://blog.syntheticspeech.de/2022/01/27/nkululeko-how-to-import-a-database/) 
* [Comparing classifiers and features](http://blog.syntheticspeech.de/2021/10/05/nkululeko-comparing-classifiers-and-features/)
* [Use Praat features](http://blog.syntheticspeech.de/2022/06/27/how-to-use-selected-features-from-praat-with-nkululeko/)
* [Combine feature sets](http://blog.syntheticspeech.de/2022/06/30/how-to-combine-feature-sets-with-nkululeko/)
* [Classifying continuous variables](http://blog.syntheticspeech.de/2022/01/26/nkululeko-classifying-continuous-variables/) 
* [Try out / demo a trained model](http://blog.syntheticspeech.de/2022/01/24/nkululeko-try-out-demo-a-trained-model/) 
* [Plot distributions of feature values](http://blog.syntheticspeech.de/2023/02/16/nkululeko-how-to-plot-distributions-of-feature-values/)
* [Perform cross database experiments](http://blog.syntheticspeech.de/2021/10/05/nkululeko-perform-cross-database-experiments/)
* [Meta parameter optimization](http://blog.syntheticspeech.de/2021/09/03/perform-optimization-with-nkululeko/)
* [How to set up wav2vec embedding](http://blog.syntheticspeech.de/2021/12/03/how-to-set-up-wav2vec-embedding-for-nkululeko/)
* [How to soft-label a database](http://blog.syntheticspeech.de/2022/01/24/how-to-soft-label-a-database-with-nkululeko/) 
* [Re-generate the progressing confusion matrix animation with a different framerate](demos/plot_faster_anim.py)
* [How to limit/filter a dataset](http://blog.syntheticspeech.de/2022/02/22/how-to-limit-a-dataset-with-nkululeko/)
* [Specifying database disk location](http://blog.syntheticspeech.de/2022/02/21/specifying-database-disk-location-with-nkululeko/) 
* [Add dropout with MLP models](http://blog.syntheticspeech.de/2022/02/25/adding-dropout-to-mlp-models-with-nkululeko/)
* [Do cross-validation](http://blog.syntheticspeech.de/2022/03/23/how-to-do-cross-validation-with-nkululeko/)
* [Combine predictions per speaker](http://blog.syntheticspeech.de/2022/03/24/how-to-combine-predictions-per-speaker-with-nkululeko/)
* [Run multiple experiments in one go](http://blog.syntheticspeech.de/2022/03/28/how-to-run-multiple-experiments-in-one-go-with-nkululeko/)
* [Compare several MLP layer layouts with each other](http://blog.syntheticspeech.de/2022/04/11/how-to-compare-several-mlp-layer-layouts-with-each-other/)
* [Import features from outside the software](http://blog.syntheticspeech.de/2022/10/18/how-to-import-features-from-outside-the-nkululeko-software/)
* [Explore feature importance](http://blog.syntheticspeech.de/2023/02/20/nkululeko-show-feature-importance/)


The framework is targeted at the speech domain and supports experiments where different classifiers are combined with different feature extractors.

Here's a rough UML-like sketch of the framework.
![sketch](images/class_diagram.png)

Currently the following classifiers and regressors are implemented (most of them integrated from sklearn):
* SVM, SVR, XGB, XGR, Tree, Tree_regressor, KNN, KNN_regressor, NaiveBayes, GMM

and the following ANNs:
* MLP, CNN (tbd)

Here's [an animation that shows the progress of classification done with nkululeko](https://youtu.be/6Y0M382GjvM)

### Initialization file
You could 
* use a generic main python file (like my_experiment.py), 
* adapt the path to your nkululeko src 
* and then adapt an .ini file (again fitting at least the paths to src and data)
  
Here's [an overview on the ini-file options](./ini_file.md)
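
To give a feeling for what such a file looks like, here is a sketch that writes and re-reads a small ini file with Python's standard `configparser`. Please treat every section and option name below as an illustrative assumption, not as a confirmed list; the overview linked above is the reference for the options that Nkululeko actually understands.

```python
import configparser
import io

# Section and option names are illustrative assumptions, not a confirmed list.
config = configparser.ConfigParser()
config["EXP"] = {"root": "./", "name": "exp_emodb"}
config["DATA"] = {"databases": "['emodb']", "emodb": "./emodb/", "target": "emotion"}
config["FEATS"] = {"type": "os"}
config["MODEL"] = {"type": "xgb"}

# Serialize to ini syntax (use open("my_experiment.ini", "w") to write a real file).
buffer = io.StringIO()
config.write(buffer)
ini_text = buffer.getvalue()
print(ini_text)

# Reading the text back shows how sections and options are accessed.
parsed = configparser.ConfigParser()
parsed.read_string(ini_text)
print(parsed["MODEL"]["type"])
```

Generating the file from a script is mainly useful when you want to vary one option across many experiments; for a single experiment, editing the file by hand is simpler.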

### <a name="helloworld">Hello World example</a>
* NEW [I made a video to show you how to do this on Windows](https://www.youtube.com/watch?v=ytbCnM2iQnc)
* Set up Python on your computer, version >= 3.7
* Open a terminal/commandline/console window
* Test Python by typing ```python```; it should start with version 3 (NOT 2!)
* Create a folder on your computer for this example, let's call it *nkulu_work*
* Download nkululeko and unpack it to this folder, or use "git clone" (preferred, if you know git)
* Make sure the folder is called *nkululeko* and not something else, e.g. *nkululeko_main*
* Get a copy of the [Berlin emodb in audformat](https://tubcloud.tu-berlin.de/s/8Td8kf8NXpD9aKM) and unpack it in the same folder (*nkulu_work*)
* Make sure the folder is called "emodb" and contains the database files directly (not box-in-a-box)
* Also, in the *nkulu_work* folder: 
  * Create a python environment
    * ```python3 -m venv venv```
  * Then, activate it:
    * under linux / mac
      * ```source venv/bin/activate```
    * under Windows
      * ```venv\Scripts\activate.bat```
    * if that worked, you should see a ```(venv)``` in front of your prompt
  * Install the required packages in your environment
    * ```pip install nkululeko```
    * Repeat until all error messages have vanished (or fix them, or try to ignore them)...
* Now you should have two folders in your *nkulu_work* folder:
  * *emodb* and *venv*
* Download a copy of the file [exp_emodb.ini](demos/exp_emodb.ini)
* Run the demo
  * ```python -m nkululeko.nkululeko --config exp_emodb.ini```
* Find the results in the newly created folder exp_emodb 
  * Inspect ```exp_emodb/images/run_0/emodb_xgb_os_0_000_cnf.png```
  * This is the main result of your experiment: a confusion matrix for the emodb emotional categories
* Inspect and play around with the [demo configuration file](demos/exp_emodb.ini) that defined your experiment, then re-run.
* There are many ways to experiment with different classifiers and acoustic feature sets, [all described here](https://github.com/felixbur/nkululeko/blob/main/ini_file.md)
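
If the demo does not start, a quick check of the folder layout can help. The sketch below assumes that you run everything from inside *nkulu_work* and that you saved *exp_emodb.ini* there; `check_workspace` is a helper name invented for this example.

```python
from pathlib import Path


def check_workspace(work_dir, config_name="exp_emodb.ini"):
    """Return a list of problems; an empty list means the layout looks ready."""
    work_dir = Path(work_dir)
    problems = []
    # The Hello World steps expect these two folders next to each other.
    for folder in ("emodb", "venv"):
        if not (work_dir / folder).is_dir():
            problems.append(f"missing folder: {folder}")
    # Assumption: the ini file was saved directly into the work folder.
    if not (work_dir / config_name).is_file():
        problems.append(f"missing file: {config_name}")
    return problems


# Check the current directory and report anything that is missing.
for problem in check_workspace("."):
    print(problem)
```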
  
### Features
* Classifiers: Naive Bayes, KNN, Tree, XGBoost, SVM, MLP
* Feature extractors: Praat, Opensmile, openXBOW BoAW, TRILL embeddings, Wav2vec2 embeddings, audModel embeddings, ...
* Feature scaling
* Label encoding
* Binning (continuous to categorical)
* Online demo interface for trained models 

### Outlook
* Classifiers: CNN
* Feature extractors: mid level descriptors, Mel-spectra

## Licence
Nkululeko can be used under the [MIT license](https://choosealicense.com/licenses/mit/)

Changelog
=========

Version 0.40.1
--------------
* fixed a bug: additional test database was not label encoded

Version 0.40.0
--------------
* added EXPL section and first functionality
* added test module (for test databases)

Version 0.39.0
--------------
* added feature distribution plots
* added plot format

Version 0.38.3
--------------
* added demo mode with list argument

Version 0.38.2
--------------
* fixed a bug concerned with "no_reuse" evaluation

Version 0.38.1
--------------
* demo mode with file argument

Version 0.38.0
--------------
* fixed demo mode

Version 0.37.2
--------------
* mainly replaced pd.append with pd.concat


Version 0.37.1
--------------
* fixed bug preventing praat feature extraction from working

Version 0.37.0
--------------
* fixed bug: csv import did not detect multiindex

Version 0.36.3
--------------
* published as a pypi module

Version 0.36.0
--------------
* added entry nkululeko.py script


Version 0.35.0
--------------
* fixed bug that prevented scaling (normalization)

Version 0.34.2
--------------
* smaller bug fixed concerning the loss_string

Version 0.34.1
--------------
* smaller bug fixes and tried Soft_f1 loss


Version 0.34.0
--------------
* smaller bug fixes and debug outputs

Version 0.33.0
--------------
* added GMM as a model type

Version 0.32.0
--------------
* added audmodel embeddings as features

Version 0.31.0
--------------
* added models: tree and tree_reg
  
Version 0.30.0
--------------
* added models: bayes, knn and knn_reg

Version 0.29.2
--------------
* fixed hello world example


Version 0.29.1
--------------
* bug fix for 0.29


Version 0.29.0
--------------
* added a new FeatureExtractor class to import external data

Version 0.28.2
--------------
* removed some Pandas warnings
* added no_reuse function to database.load()

Version 0.28.1
--------------
* with database.value_counts show only the data that is actually used


Version 0.28.0
--------------
* made "label_data" configuration automatic and added "label_result"


Version 0.27.0
--------------
* added "label_data" configuration to label data with trained model (so now there can be train, dev and test set)

Version 0.26.1
--------------
* Fixed some bugs caused by the multitude of feature sets
* Added possibility to distinguish between absolute and relative paths in csv datasets

Version 0.26.0
--------------
* added the rename_speakers functionality to prevent identical speaker names in datasets

Version 0.25.1
--------------
* fixed bug that no features were chosen if not selected

Version 0.25.0
--------------
* made selectable features universal for feature sets

Version 0.24.0
--------------
* added multiple feature sets (will simply be concatenated)

Version 0.23.0
--------------
* added selectable features for Praat interface

Version 0.22.0
--------------
* added David R. Feinberg's Praat features, praise also to parselmouth

Version 0.21.0
--------------

* Revoked 0.20.0
* Added support for only_test = True, to enable later testing of trained models with new test data

Version 0.20.0
--------------

* implemented reuse of trained and saved models

Version 0.19.0
--------------

* added "max_duration_of_sample" for datasets


Version 0.18.6
--------------

* added support for learning and dropout rate as argument


Version 0.18.5
--------------

* added support for epoch number as argument
  
Version 0.18.4
--------------

* added support for ANN layers as arguments

Version 0.18.3
--------------

* added reuse of test and train file sets
* added parameter to scale continuous target values: target_divide_by


Version 0.18.2
--------------

* added preference of local dataset specs to global ones
  
Version 0.18.1
--------------

* added regression value display for confusion matrices

Version 0.18.0
--------------

* added leave one speaker group out

Version 0.17.2
--------------

* fixed scaler, added robust



Version 0.17.0
--------------

* Added minimum duration for test samples


Version 0.16.4
--------------

* Added possibility to combine predictions per speaker (with mean or mode function)

Version 0.16.3
--------------

* Added minimal sample length for databases


Version 0.16.2
--------------

* Added k-fold-cross-validation for linear classifiers

Version 0.16.1
--------------

* Added leave-one-speaker-out for linear classifiers


Version 0.16.0
--------------

* Added random sample splits

