# STRING Enrichment Report

This report summarizes functional enrichment of gene clusters, computed with the STRING API, at each tested clustering resolution k.
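For orientation, an enrichment query against the STRING API might look roughly like the sketch below. It is not the code that produced this report: the function names, the default species (9606, human), the `caller_identity` value, and the FDR cutoff of 0.05 are all assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

STRING_ENRICHMENT_URL = "https://string-db.org/api/json/enrichment"


def build_enrichment_request(genes, species=9606, caller_identity="enrichment_report_sketch"):
    """Return (url, encoded_body) for a POST to the STRING enrichment endpoint."""
    params = {
        "identifiers": "\r".join(genes),  # STRING separates identifiers with carriage returns
        "species": species,               # NCBI taxon ID; 9606 (human) is an assumption
        "caller_identity": caller_identity,  # hypothetical name identifying the calling tool
    }
    return STRING_ENRICHMENT_URL, urllib.parse.urlencode(params).encode("utf-8")


def fetch_enrichment(genes, species=9606, fdr_cutoff=0.05):
    """Query STRING for one cluster and keep terms below the (assumed) FDR cutoff."""
    url, body = build_enrichment_request(genes, species)
    with urllib.request.urlopen(url, data=body, timeout=30) as response:
        terms = json.load(response)
    # Each returned term carries, among other fields, "category", "term" and "fdr".
    return [t for t in terms if float(t["fdr"]) < fdr_cutoff]
```

Calling `fetch_enrichment` once per cluster and grouping the returned terms by their `category` field would yield per-source counts of the kind tabulated later in this report.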

Important interpretation notes:
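- Lower k values favor enrichment because clusters are larger, which gives each enrichment test more statistical power.
- Increasing k may *increase the number of enriched clusters* as biologically distinct gene groups separate into their own clusters.
- Loss of enrichment at high k usually reflects low statistical power in small clusters, not necessarily an absence of biological signal.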
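The effect of cluster size on power can be illustrated with a one-sided hypergeometric test, a common choice for over-representation analysis. This is a sketch only and does not claim to reproduce STRING's exact procedure. The background of 20,000 genes, the 200 annotated genes, and the "about a third of the cluster annotated" overlap are hypothetical; the cluster sizes are the median sizes from the overview table.

```python
from math import comb


def hypergeom_tail(overlap, cluster_size, annotated, background):
    """P(X >= overlap) when cluster_size genes are drawn from `background`
    genes, of which `annotated` carry the term of interest."""
    upper = min(cluster_size, annotated)
    total = comb(background, cluster_size)
    hits = sum(
        comb(annotated, i) * comb(background - annotated, cluster_size - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total


BACKGROUND = 20_000  # hypothetical background size
ANNOTATED = 200      # hypothetical number of genes carrying the term

p_values = {}
for size in (15, 6, 4):  # median cluster sizes at k = 3, 8, 11
    overlap = max(1, round(size / 3))  # about a third of the cluster annotated
    p_values[size] = hypergeom_tail(overlap, size, ANNOTATED, BACKGROUND)
    print(f"cluster size {size:>2}, overlap {overlap}: p = {p_values[size]:.3g}")
```

With a similar proportion of annotated genes, the smaller clusters produce much weaker p-values, which is the underpowering the notes above describe.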

## Per-k enrichment overview

|   k |   n_clusters |   median_cluster_size |   n_clusters_enriched |   fraction_enriched |   total_enriched_terms |   median_best_fdr |
|----:|-------------:|----------------------:|----------------------:|--------------------:|-----------------------:|------------------:|
|   3 |            3 |                    15 |                     1 |            0.333333 |                     13 |         1.86e-07  |
|   8 |            8 |                     6 |                     4 |            0.5      |                      4 |         0.0017577 |
|  11 |           10 |                     4 |                     4 |            0.4      |                      5 |         0.0069    |

## Enrichment source breakdown

The table below shows how enrichment is distributed across annotation sources.
This helps distinguish pathway-driven, functional, structural, or literature-driven signal.

|   k | category_label    |   n_terms |   n_clusters |   best_fdr |
|----:|:------------------|----------:|-------------:|-----------:|
|   3 | HPO               |        10 |            1 |   1.86e-07 |
|   3 | PMID              |         3 |            1 |   1.93e-05 |
|   8 | HPO               |         2 |            2 |   1.54e-05 |
|   8 | NetworkNeighborAL |         1 |            1 |   0.0064   |
|   8 | PMID              |         1 |            1 |   4.05e-06 |
|  11 | HPO               |         3 |            2 |   1.54e-05 |
|  11 | NetworkNeighborAL |         1 |            1 |   0.0064   |
|  11 | TISSUES           |         1 |            1 |   0.0449   |
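
As a consistency check, the per-source term counts should add up to `total_enriched_terms` in the per-k overview. The sketch below copies both tables into plain Python structures and also reports, for each k, the share of terms per source and the source with the lowest FDR. The function and variable names are illustrative, not part of the report pipeline.

```python
from collections import defaultdict

# (k, category_label, n_terms, n_clusters, best_fdr), copied from the table above
BREAKDOWN = [
    (3, "HPO", 10, 1, 1.86e-07),
    (3, "PMID", 3, 1, 1.93e-05),
    (8, "HPO", 2, 2, 1.54e-05),
    (8, "NetworkNeighborAL", 1, 1, 0.0064),
    (8, "PMID", 1, 1, 4.05e-06),
    (11, "HPO", 3, 2, 1.54e-05),
    (11, "NetworkNeighborAL", 1, 1, 0.0064),
    (11, "TISSUES", 1, 1, 0.0449),
]

# total_enriched_terms per k, copied from the per-k overview table
OVERVIEW_TOTAL_TERMS = {3: 13, 8: 4, 11: 5}


def summarize_sources(rows):
    """Per k: total terms, share of terms per source, and the source with the best FDR."""
    by_k = defaultdict(list)
    for k, label, n_terms, _n_clusters, best_fdr in rows:
        by_k[k].append((label, n_terms, best_fdr))
    summary = {}
    for k, entries in by_k.items():
        total = sum(n for _, n, _ in entries)
        summary[k] = {
            "total_terms": total,
            "term_share": {label: n / total for label, n, _ in entries},
            "top_source": min(entries, key=lambda e: e[2])[0],
        }
    return summary


summary = summarize_sources(BREAKDOWN)
for k, info in sorted(summary.items()):
    # n_terms should add up to total_enriched_terms; n_clusters need not add up,
    # because one cluster can be enriched in several sources.
    assert info["total_terms"] == OVERVIEW_TOTAL_TERMS[k]
    print(k, info["top_source"], info["term_share"])
```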

## How to use this report

- Look for k values where enrichment spreads across *multiple clusters*.
- Prefer k values where GO or pathway categories dominate over signal that comes only from protein domains or PubMed (PMID) terms.
- Use heatmaps alongside this report to confirm tissue coherence.
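
The first two guidelines above can be turned into a rough ranking of k values, sketched below using the numbers from this report. The sort key (number of enriched clusters, then share of GO or pathway terms, then fraction of clusters enriched) is an assumption of this sketch, as is the set of category labels treated as GO or pathway sources. It is a heuristic aid, not a replacement for inspecting the clusters.

```python
# Category labels treated as GO / pathway sources (an assumption of this sketch)
CURATED = {"Process", "Function", "Component", "KEGG", "RCTM", "WikiPathways"}

# k -> (n_clusters_enriched, fraction_enriched), from the per-k overview table
OVERVIEW = {3: (1, 0.333333), 8: (4, 0.5), 11: (4, 0.4)}

# k -> {category_label: n_terms}, from the enrichment source breakdown table
TERMS = {
    3: {"HPO": 10, "PMID": 3},
    8: {"HPO": 2, "NetworkNeighborAL": 1, "PMID": 1},
    11: {"HPO": 3, "NetworkNeighborAL": 1, "TISSUES": 1},
}


def curated_share(term_counts):
    """Fraction of enriched terms that come from GO / pathway categories."""
    total = sum(term_counts.values())
    curated = sum(n for label, n in term_counts.items() if label in CURATED)
    return curated / total if total else 0.0


def rank_k(overview, terms):
    """Order k values from most to least preferred under the assumed sort key."""
    def score(k):
        n_enriched, fraction = overview[k]
        return (n_enriched, curated_share(terms[k]), fraction)

    return sorted(overview, key=score, reverse=True)


ranking = rank_k(OVERVIEW, TERMS)
shares = {k: curated_share(counts) for k, counts in TERMS.items()}
print("ranking:", ranking)
print("GO / pathway share per k:", shares)
```

In this run no GO or pathway categories appear in the breakdown table, so the share is zero at every k and the ranking is driven entirely by how many clusters are enriched.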
