# STRING Enrichment Report

This report summarizes functional enrichment of gene clusters using the STRING API.
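
For orientation, the sketch below shows how one cluster's gene list might be submitted to STRING's public `enrichment` endpoint. It is not the code that produced this report. The base URL, the `caller_identity` value, the default species (9606, human), and the helper names are assumptions; check the current STRING API documentation before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL; STRING also publishes version-specific hosts.
STRING_API = "https://string-db.org/api"


def build_enrichment_request(genes, species=9606, caller="enrichment_report_example"):
    """Return (url, POST body) for an enrichment query on one cluster.

    `species` and `caller` are illustrative defaults, not values taken
    from this report.
    """
    url = f"{STRING_API}/json/enrichment"
    params = {
        # Multiple identifiers are separated by carriage returns.
        "identifiers": "\r".join(genes),
        "species": species,
        "caller_identity": caller,
    }
    return url, urllib.parse.urlencode(params).encode("utf-8")


def fetch_enrichment(genes, species=9606):
    """POST the query and return the parsed JSON (a list of term records)."""
    url, body = build_enrichment_request(genes, species)
    with urllib.request.urlopen(url, data=body, timeout=30) as resp:
        return json.load(resp)
```

Only `fetch_enrichment` touches the network. Each returned record is expected to carry fields such as `category`, `term`, and `fdr`, which is the kind of information summarized in the tables of this report.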

Important interpretation notes:
- Lower k values produce fewer, larger clusters, and larger gene sets have more statistical power to reach significant enrichment.
- Increasing k may *increase the number of enriched clusters* as biologically distinct groups separate.
- Loss of enrichment at high k may reflect low statistical power in small clusters rather than an absence of biology.
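
The effect of cluster size on power can be illustrated with a hypergeometric over-representation test. This is a minimal sketch: the choice of test, the background size `N`, and the number of annotated genes `K` are all hypothetical and are not taken from this analysis. The cluster sizes 15 and 4 mirror the median cluster sizes reported for k = 3 and for higher k.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k): probability of seeing at least k annotated genes when
    drawing n genes from a background of N, of which K are annotated."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


# Hypothetical background: 20,000 genes, 100 of them carrying the term.
N, K = 20000, 100

# 6 of 15 genes annotated (40% of a large cluster).
p_large = hypergeom_sf(6, N, K, 15)
# 2 of 4 genes annotated (50% of a small cluster).
p_small = hypergeom_sf(2, N, K, 4)

print(f"large cluster p = {p_large:.3g}")
print(f"small cluster p = {p_small:.3g}")
```

Even though the small cluster has the higher annotated fraction, its p-value is orders of magnitude weaker, so it is more likely to fall above the FDR cutoff after multiple-testing correction.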

## Per-k enrichment overview

|   k |   n_clusters |   median_cluster_size |   n_clusters_enriched |   fraction_enriched |   total_enriched_terms |   median_best_fdr |
|----:|-------------:|----------------------:|----------------------:|--------------------:|-----------------------:|------------------:|
|   3 |            3 |                    15 |                     2 |            0.666667 |                     25 |        0.00013557 |
|  11 |            9 |                     4 |                     2 |            0.222222 |                      8 |        0.00275    |
|  15 |            7 |                     4 |                     1 |            0.142857 |                      4 |        0.0023     |
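
As a rough guide to how the columns above relate to each other, the sketch below derives them from per-cluster results. The input layout, the function name `summarize_k`, the 0.05 FDR cutoff, and the example values are assumptions made for illustration; the report's actual pipeline may define these quantities differently.

```python
from statistics import median


def summarize_k(clusters, fdr_cutoff=0.05):
    """Summarize enrichment for one value of k.

    `clusters` is a list of dicts with keys 'size' (number of genes) and
    'term_fdrs' (FDR values of the terms returned for that cluster).
    A cluster counts as enriched if any term passes `fdr_cutoff`.
    """
    passing = [[f for f in c["term_fdrs"] if f <= fdr_cutoff] for c in clusters]
    enriched = [fdrs for fdrs in passing if fdrs]
    best_per_cluster = [min(fdrs) for fdrs in enriched]
    return {
        "n_clusters": len(clusters),
        "median_cluster_size": median(c["size"] for c in clusters),
        "n_clusters_enriched": len(enriched),
        "fraction_enriched": len(enriched) / len(clusters),
        "total_enriched_terms": sum(len(fdrs) for fdrs in enriched),
        # Median over enriched clusters only; None if nothing is enriched.
        "median_best_fdr": median(best_per_cluster) if best_per_cluster else None,
    }


# Hypothetical input: three clusters, one with no significant terms.
example = [
    {"size": 5, "term_fdrs": [0.0023, 0.01]},
    {"size": 4, "term_fdrs": [0.0032]},
    {"size": 3, "term_fdrs": []},
]
summary = summarize_k(example)
print(summary)
```

Under this definition, `median_best_fdr` takes the smallest FDR in each enriched cluster and then the median of those values, so clusters without any enriched term do not contribute.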

## Enrichment source breakdown

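The table below shows how enriched terms are distributed across annotation sources at each k, which helps distinguish pathway-driven, functional, structural, and literature-driven signal.
Category labels follow STRING's source names: HPO is the Human Phenotype Ontology, PMID denotes PubMed publications, SMART domains are protein domain annotations, RCTM is Reactome pathways, and NetworkNeighborAL denotes STRING local network clusters.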

|   k | category_label    |   n_terms |   n_clusters |   best_fdr |
|----:|:------------------|----------:|-------------:|-----------:|
|   3 | HPO               |        21 |            2 |   1.14e-06 |
|   3 | PMID              |         2 |            2 |   0.0043   |
|   3 | SMART domains     |         2 |            1 |   0.0067   |
|  11 | HPO               |         4 |            1 |   0.0023   |
|  11 | SMART domains     |         2 |            1 |   0.0168   |
|  11 | NetworkNeighborAL |         1 |            1 |   0.0047   |
|  11 | RCTM              |         1 |            1 |   0.0032   |
|  15 | HPO               |         4 |            1 |   0.0023   |
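
A breakdown of this shape can be built by grouping enriched terms by category. The sketch below assumes each enriched term is available as a `(cluster_id, category, fdr)` tuple and that `n_clusters` counts distinct clusters with at least one term in the category. Both the input layout and the example rows are hypothetical.

```python
from collections import defaultdict


def category_breakdown(rows):
    """Group enriched terms by annotation category for a single k.

    `rows` is an iterable of (cluster_id, category, fdr) tuples,
    one per enriched term.
    """
    n_terms = defaultdict(int)
    cluster_ids = defaultdict(set)
    best_fdr = {}
    for cluster_id, category, fdr in rows:
        n_terms[category] += 1
        cluster_ids[category].add(cluster_id)
        best_fdr[category] = min(fdr, best_fdr.get(category, fdr))
    return {
        cat: {
            "n_terms": n_terms[cat],
            "n_clusters": len(cluster_ids[cat]),
            "best_fdr": best_fdr[cat],
        }
        for cat in n_terms
    }


# Hypothetical enriched terms from two clusters.
rows = [
    (0, "HPO", 1e-4),
    (0, "HPO", 0.01),
    (1, "HPO", 0.003),
    (1, "RCTM", 0.02),
]
table = category_breakdown(rows)
for category, stats in table.items():
    print(category, stats)
```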

## How to use this report

- Look for k values where enrichment spreads across *multiple clusters*.
- Prefer k values where GO or pathway categories (for example Reactome, `RCTM`) dominate over domain-only or PubMed-only signal.
- Use heatmaps alongside this report to confirm tissue coherence.
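
The first two criteria can be tabulated directly from the numbers in this report, as in the sketch below. The set of labels treated as GO or pathway categories is an assumption (only `RCTM` actually appears in the tables above). The lexicographic ordering is one arbitrary way to combine the criteria, not a recommendation of a particular k.

```python
# Assumed labels for GO and pathway sources; adjust to the labels
# actually used in the category_label column.
GO_OR_PATHWAY = {"Process", "Function", "Component", "KEGG", "RCTM", "WikiPathways"}

# k -> n_clusters_enriched, copied from the per-k overview table.
n_clusters_enriched = {3: 2, 11: 2, 15: 1}

# k -> {category_label: n_terms}, copied from the source breakdown table.
breakdown = {
    3: {"HPO": 21, "PMID": 2, "SMART domains": 2},
    11: {"HPO": 4, "SMART domains": 2, "NetworkNeighborAL": 1, "RCTM": 1},
    15: {"HPO": 4},
}


def go_pathway_fraction(term_counts):
    """Fraction of enriched terms that come from GO or pathway sources."""
    total = sum(term_counts.values())
    hits = sum(n for cat, n in term_counts.items() if cat in GO_OR_PATHWAY)
    return hits / total if total else 0.0


# Sort by number of enriched clusters, then by GO/pathway fraction.
ranked = sorted(
    breakdown,
    key=lambda k: (n_clusters_enriched[k], go_pathway_fraction(breakdown[k])),
    reverse=True,
)

for k in ranked:
    frac = go_pathway_fraction(breakdown[k])
    print(f"k={k}: enriched clusters={n_clusters_enriched[k]}, GO/pathway fraction={frac:.3f}")
```

The third criterion, tissue coherence in the heatmaps, is not captured by these numbers and still needs visual inspection.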
