Analyzing data 
-----------------

Preprocessing
~~~~~~~~~~~~~

To aggregate sample data with mean and standard deviation, a
preprocessing step is required. This is accomplished using the script
``dev/aggregate.py``. The script processes the raw data collected during
benchmarking and computes the necessary statistical measures for
analysis.

Setting Up the Environment
^^^^^^^^^^^^^^^^^^^^^^^^^^

Before running the preprocessing script, ensure that you have a suitable
Python environment. You can create a virtual environment and install the
necessary dependencies using the ``requirements.txt`` file. This ensures
that all required packages are available for the script to run smoothly.

.. code:: sh

   python3 -m venv env
   source env/bin/activate
   pip install -r requirements.txt

Preprocessing Script
^^^^^^^^^^^^^^^^^^^^

The ``dev/aggregate.py`` script requires two arguments to function
correctly:

1. **Path to the TOML File**: This is the configuration file used during
   the benchmarking process. It contains the parameters and tasks that
   were executed.

2. **Output Directory**: This is the directory where the aggregated data
   will be stored after processing.

To run the script, use the following command:

.. code:: sh

   python dev/aggregate.py <path/to/toml> <path/to/output_directory>

For example, the following commands aggregate the demo configuration and
list the resulting directory tree:

.. code:: sh

   python dev/aggregate.py examples/demo.toml aggregated
   tree aggregated

   aggregated/
   ├── dd-1
   │   ├── deep-trace
   │   │   ├── io.csv
   │   │   ├── package-0-core.csv
   │   │   ├── package-0.csv
   │   │   ├── perf.csv
   │   │   ├── stderr
   │   │   ├── stdout
   │   │   └── trace.csv
   │   ├── io.csv
   │   ├── package-0-core.csv
   │   ├── package-0.csv
   │   └── perf.csv
   ├── nbody-1
   │   ├── deep-trace
   │   │   ├── io.csv
   │   │   ├── package-0-core.csv
   │   │   ├── package-0.csv
   │   │   ├── perf.csv
   │   │   ├── stderr
   │   │   ├── stdout
   │   │   └── trace.csv
   │   ├── io.csv
   │   ├── package-0-core.csv
   │   ├── package-0.csv
   │   └── perf.csv
   ├── nbody-2
   │   ├── deep-trace
   │   │   ├── io.csv
   │   │   ├── package-0-core.csv
   │   │   ├── package-0.csv
   │   │   ├── perf.csv
   │   │   ├── stderr
   │   │   ├── stdout
   │   │   └── trace.csv
   │   ├── io.csv
   │   ├── package-0-core.csv
   │   ├── package-0.csv
   │   └── perf.csv
   └── nbody-4
       ├── deep-trace
       │   ├── io.csv
       │   ├── package-0-core.csv
       │   ├── package-0.csv
       │   ├── perf.csv
       │   ├── stderr
       │   ├── stdout
       │   └── trace.csv
       ├── io.csv
       ├── package-0-core.csv
       ├── package-0.csv
       └── perf.csv
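
As an illustration of how this layout could be consumed, the sketch below
walks an output directory and lists the CSV files found directly under each
task directory. The helper name ``list_task_files`` is hypothetical and is
not part of ``dev/aggregate.py``; the sketch assumes only the directory
structure shown above.

.. code:: python

   from pathlib import Path


   def list_task_files(root):
       """Map each task directory name to the CSV files directly inside it."""
       tasks = {}
       for task_dir in sorted(Path(root).iterdir()):
           if task_dir.is_dir():
               # Only top-level CSVs; deep-trace/ is a subdirectory and is skipped.
               tasks[task_dir.name] = sorted(p.name for p in task_dir.glob("*.csv"))
       return tasks


   if __name__ == "__main__":
       root = Path("aggregated")  # output directory from the example above
       if root.is_dir():
           for task, files in list_task_files(root).items():
               print(task, files)

Files under ``deep-trace`` are deliberately left out here because they sit
one level deeper than the per-task aggregates.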

Perf Aggregation
^^^^^^^^^^^^^^^^

Perf output is aggregated by calculating the mean and standard deviation
for each counter. The output appears as follows.

.. code:: sh

   cat /tmp/demo-processed/bonnie++-1-untrusted/perf.csv

   event,counter_mean,counter_std,counter_unit,metric_mean,unit_metric,perc_runtime_mean
   L1-dcache-load-misses,5113360521.6,103267837.73247407,,3.4019999999999997,of all L1-dcache accesses,31.0
   L1-dcache-loads,150305474853.2,205182282.16844067,,1.1228,G/sec,31.0
   L1-dcache-prefetches,1269258560.2,52122983.51175186,,9.4724,M/sec,31.0
   L1-icache-load-misses,2304340677.4,19231482.942564532,,1.078,of all L1-icache accesses,31.0
   L1-icache-loads,214014918517.2,261940713.8728049,,1.5986,G/sec,31.0
   branch-instructions,80574228395.8,171678771.61877853,,601.847,M/sec,31.0
   branch-load-misses,9837754039.2,26359858.744911663,,73.4822,M/sec,31.0
   branch-loads,80564398959.6,163850241.16953686,,601.7755999999999,M/sec,31.0
   branch-misses,9836256318.8,24316591.9320201,,12.206,of all branches,31.0
   ....
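
The per-counter aggregation can be sketched as follows. This is a minimal
illustration, not the code of ``dev/aggregate.py``: the function name
``aggregate_counters``, the input shape (one dictionary per run), and the use
of the sample standard deviation are all assumptions, and the values are made
up.

.. code:: python

   from statistics import mean, stdev


   def aggregate_counters(runs):
       """Return {event: (mean, std)} over a list of per-run counter dicts."""
       events = sorted({event for run in runs for event in run})
       summary = {}
       for event in events:
           values = [run[event] for run in runs if event in run]
           # The sample standard deviation needs at least two values.
           std = stdev(values) if len(values) > 1 else 0.0
           summary[event] = (mean(values), std)
       return summary


   # Hypothetical counter values from three runs of the same task.
   runs = [
       {"branch-misses": 100, "cache-misses": 10},
       {"branch-misses": 110, "cache-misses": 14},
       {"branch-misses": 90, "cache-misses": 12},
   ]
   for event, (counter_mean, counter_std) in aggregate_counters(runs).items():
       print(f"{event},{counter_mean},{counter_std}")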

Energy Measurement Aggregation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Energy measurements are aggregated using the **coalescing window
method**: energy samples are grouped into fixed time intervals, or
**windows**, so that they align with the other data samples. By
default, the window size (``W``) is ``100ms``. The mean and standard
deviation of the energy samples are computed within each window. This
represents the energy data over consistent time intervals and allows
meaningful comparison with the other metrics collected during
benchmarking.

.. code:: sh

   head  /tmp/demo-processed/launch_nbody.sh-1-untrusted/package-0.csv

   ,bin,relative_time,energy (microjoule)
   0,0,0.0,20926838658.2
   1,50018408,500184086.0,22027975500.0
   2,50019136,500191369.0,21457430207.0
   3,50023813,500238137.0,19852073423.0
   4,50026094,500260940.0,20929957912.0
   5,50031162,500311624.0,20389345794.0
   6,100040329,1000403297.0,21462625388.0
   7,100047796,1000477967.0,19856279871.0
   8,100049695,1000496953.0,22029620099.0
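
The windowing idea can be sketched in a few lines. This is an illustrative
sketch only: the function name ``coalesce``, the ``(timestamp, energy)`` input
shape, and the choice of milliseconds as the time unit are assumptions, and it
does not reproduce the exact columns written by ``dev/aggregate.py``.

.. code:: python

   from collections import defaultdict
   from statistics import mean, stdev


   def coalesce(samples, window=100):
       """Group (timestamp, energy) samples into fixed windows.

       ``window`` uses the same unit as the timestamps (assumed here to be
       milliseconds, so 100 corresponds to the default W of 100ms).
       Returns a list of (window_start, mean, std) sorted by window start.
       """
       buckets = defaultdict(list)
       for timestamp, energy in samples:
           buckets[int(timestamp // window)].append(energy)
       rows = []
       for index, values in sorted(buckets.items()):
           # A window with a single sample has no spread to report.
           std = stdev(values) if len(values) > 1 else 0.0
           rows.append((index * window, mean(values), std))
       return rows


   # Hypothetical samples: two fall in the first window, one in the second.
   for row in coalesce([(0, 10.0), (50, 14.0), (120, 20.0)]):
       print(row)

Keying each bucket by the integer window index keeps samples from different
sources comparable, since every source that uses the same ``W`` lands on the
same window boundaries.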


Deep trace analysis
~~~~~~~~~~~~~~~~~~~

The first figure below (generated from ``deep-trace/trace.csv``) presents two histograms of system events over the duration of a Sysbench run (1 GB workload, 8 threads), binned by relative time:

- **Top subplot (System and Disk Events)**  
  Depicts system calls (``sys-read``, ``sys-write``) and disk I/O events (``dsk-read``, ``dsk-write``). Each bar’s height indicates the count of that event type in the corresponding time bin. Notably, ``sys-read`` exhibits high spikes, suggesting periods of more intense read operations.

- **Bottom subplot (Memory Allocation/Free Events)**  
  Shows memory-related operations (``mm-page-alloc``, ``mm-page-free``, ``kmalloc``, ``kfree``). The concentration of allocations at the start indicates setup overhead, while the large cluster of frees at the end points to cleanup and deallocation.
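
The binning behind such histograms can be sketched as below. The input shape
(pairs of relative time and event name) and the function name ``bin_events``
are assumptions made for illustration; the actual column layout of
``trace.csv`` is not described here.

.. code:: python

   from collections import Counter


   def bin_events(events, bin_width):
       """Count events per (bin index, event name).

       ``events`` is an iterable of (relative_time, event_name) pairs and
       ``bin_width`` uses the same time unit as ``relative_time``.
       """
       counts = Counter()
       for relative_time, name in events:
           counts[(int(relative_time // bin_width), name)] += 1
       return counts


   # Hypothetical events with relative times in milliseconds.
   events = [
       (100, "sys-read"),
       (400, "sys-read"),
       (700, "kmalloc"),
       (1200, "sys-write"),
   ]
   for (bin_index, name), count in sorted(bin_events(events, 500).items()):
       print(bin_index, name, count)

Each bar in the plots corresponds to one ``(bin index, event name)`` entry,
with the bar height given by its count.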

.. figure:: ./figures/sysbench-1G-8-untrusted.png
  :width: 400
  :alt: Sysbench executed using Gramine with 1 GB and 8 threads.

  Sysbench executed using Gramine with 1 GB and 8 threads.

.. figure:: ./figures/sysbench-1G-4-untrusted.png
  :width: 400
  :alt: Sysbench executed using Gramine with 1 GB and 4 threads.

  Sysbench executed using Gramine with 1 GB and 4 threads.

.. figure:: ./figures/sysbench-1G-2-untrusted.png
  :width: 400
  :alt: Sysbench executed using Gramine with 1 GB and 2 threads.

  Sysbench executed using Gramine with 1 GB and 2 threads.

.. figure:: ./figures/sysbench-1G-1-untrusted.png
  :width: 400
  :alt: Sysbench executed using Gramine with 1 GB and 1 thread.

  Sysbench executed using Gramine with 1 GB and 1 thread.

.. figure:: ./figures/sysbench-1-no-sgx.png
  :width: 400
  :alt: Sysbench executed without Gramine with 1 thread.

  Sysbench executed without Gramine with 1 thread.

.. figure:: ./figures/sysbench-2-no-sgx.png
  :width: 400
  :alt: Sysbench executed without Gramine with 2 threads.

  Sysbench executed without Gramine with 2 threads.

.. figure:: ./figures/sysbench-4-no-sgx.png
  :width: 400
  :alt: Sysbench executed without Gramine with 4 threads.

  Sysbench executed without Gramine with 4 threads.

.. figure:: ./figures/sysbench-8-no-sgx.png
  :width: 400
  :alt: Sysbench executed without Gramine with 8 threads.

  Sysbench executed without Gramine with 8 threads.


The following plots show the execution of the I/O-bound test. Allocations, reads, and
writes are concentrated at the beginning of the execution, both with and without Gramine.

.. figure:: ./figures/dd-128M-1-encrypted.png
  :width: 400
  :alt: dd executed using encrypted storage in Gramine with 128M

  dd executed using encrypted storage in Gramine with 128M

.. figure:: ./figures/dd-128M-1-untrusted.png
  :width: 400
  :alt: dd executed using unencrypted storage in Gramine with 128M

  dd executed using unencrypted storage in Gramine with 128M

.. figure:: ./figures/dd-1-no-sgx.png
  :width: 400
  :alt: dd executed without Gramine

  dd executed without Gramine

Disk write analysis
~~~~~~~~~~~~~~~~~~~

The bar charts below compare the percentage of sequential and random disk writes between two configurations: ``encrypted`` and ``untrusted``.

- **Sequential Writes (%)** (blue bars):
  The ``untrusted`` configuration exhibits a significantly higher percentage of sequential writes than ``encrypted``.

- **Random Writes (%)** (orange bars):
  The ``encrypted`` configuration has a higher proportion of random writes than ``untrusted``, where random writes are notably lower.
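
One common way to derive such percentages is sketched below, purely as an
illustration. It assumes a write counts as sequential when it starts at the
sector where the previous write ended. That criterion, the
``(start_sector, sector_count)`` input shape, and the function name
``write_pattern`` are assumptions and may differ from how the figures were
actually produced.

.. code:: python

   def write_pattern(writes):
       """Return (sequential_pct, random_pct) for writes in issue order.

       ``writes`` is a list of (start_sector, sector_count) pairs. A write
       is sequential if it begins where the previous write ended; the first
       write has no predecessor and is counted as random.
       """
       if not writes:
           return 0.0, 0.0
       sequential = 0
       expected_next = None
       for start, count in writes:
           if start == expected_next:
               sequential += 1
           expected_next = start + count
       sequential_pct = 100.0 * sequential / len(writes)
       return sequential_pct, 100.0 - sequential_pct


   # Hypothetical trace: three contiguous writes followed by one jump.
   print(write_pattern([(0, 8), (8, 8), (16, 8), (100, 8)]))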

.. figure:: ./figures/disk_write-sgx-dd-128M.png
  :width: 400
  :alt: Disk write analysis for an enclave size of 128 MB

  Disk write analysis for an enclave size of 128 MB


.. figure:: ./figures/disk_write-sgx-dd-64M.png
  :width: 400
  :alt: Disk write analysis for an enclave size of 64 MB

  Disk write analysis for an enclave size of 64 MB

Perf analysis
~~~~~~~~~~~~~

The following plots compare results from the ``perf`` command. Because ``perf`` reads CPU
counters, results are shown for the ``sysbench`` application.
Cache misses are very high during SGX execution. This is largely because TLBs and caches
are flushed when an enclave starts executing.

.. figure:: ./figures/cache-misses-sgx-sysbench-1G.png
  :width: 400
  :alt: Cache misses for sysbench application

  Cache misses for sysbench application

.. figure:: ./figures/cache-references-sgx-sysbench-1G.png
  :width: 400
  :alt: Cache references for sysbench application

  Cache references for sysbench application

Other counters, such as ``branch-misses`` and ``branch-loads``, are balanced across the compared executions.

.. figure:: ./figures/branch-misses-sgx-sysbench-1G.png
  :width: 400
  :alt: Branch misses for sysbench application

  Branch misses for sysbench application

.. figure:: ./figures/branch-loads-sgx-sysbench-1G.png
  :width: 400
  :alt: Branch loads for sysbench application

  Branch loads for sysbench application