The cluster explorer of the AllegroMCODE plugin for Cytoscape provides interactive cluster exploration and real-time postprocessing. It is divided into three parts:

  • The cluster exploration tab lets you expand or reduce the cluster size, based on node score, using the Size Threshold Slider.
  • The postprocessing tab lets you modify the cluster by adjusting two parameters, Haircut and Fluff.
  • The Node Attribute Enumerator at the bottom provides a summary of the node attributes and their frequency in the cluster.
[Figure: AllegroMCODE cluster explorer]
[Figure: AllegroMCODE cluster explorer, postprocessing tab]
These three components are explained in greater detail below.
Size Threshold
The slider scale ranges from Min to Max and has an ‘origin’ marker (^) for its starting position. The slider controls the Node Score Cutoff, which is the most influential determinant of cluster size. The origin marker therefore indicates the Node Score Cutoff value originally set in the Finding Parameters section. When the slider is moved, the Node Score Cutoff is set to 0 at Min and 100 at Max. There are, however, two notable differences between the Size Threshold Slider and the Node Score Cutoff in the Finding Parameters section:

  1. During exploration, the cluster is reevaluated without the requirement to satisfy the K-Core parameter. Thus, moving the slider to the left of the initial position allows the cluster to be reduced to only the seed node.
  2. During exploration in the Max direction, the cluster is unaware of other clusters. Whereas the regular cluster-finding task only allows a cluster to expand up to previously found clusters, the slider expands the cluster regardless of adjacent cluster borders. Thus, moving the slider to the right of the initial position allows the cluster to expand to as much as the whole network.

Haircut and Fluff are applied after slider movement if they were turned on in the Postprocessing section.
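
The slider behaviour described above can be sketched in Python. This is only an illustration, not AllegroMCODE's actual code: the function name `explore_cluster`, the input structures, and the inclusion rule (a neighbor joins when its score is above the seed's score reduced by the cutoff percentage) are assumptions made for the example. It applies no K-Core check and ignores other clusters, which is why the result can range from the seed alone to the whole network.

```python
from collections import deque


def explore_cluster(adjacency, scores, seed, slider_value):
    """Grow a cluster outward from `seed` for a slider value in 0..100.

    Assumed rule (hypothetical, for illustration only): a neighbor joins
    when its score is strictly above seed_score * (1 - slider_value / 100).
    No K-Core requirement and no other-cluster borders are considered.
    """
    threshold = scores[seed] * (1.0 - slider_value / 100.0)
    cluster = {seed}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in cluster and scores[neighbor] > threshold:
                cluster.add(neighbor)
                queue.append(neighbor)
    return cluster


# Toy network: A is the seed and has the highest node score.
adjacency = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B", "E"],
    "E": ["D"],
}
scores = {"A": 4.0, "B": 3.0, "C": 3.0, "D": 1.5, "E": 0.5}

only_seed = explore_cluster(adjacency, scores, "A", 0)        # slider at Min
core = explore_cluster(adjacency, scores, "A", 30)            # somewhere in between
whole_network = explore_cluster(adjacency, scores, "A", 100)  # slider at Max
```

In this toy network the slider at Min leaves only the seed, and the slider at Max reaches every node, because each node has a positive score and is connected to the seed.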

In response to the slider, the cluster display is updated with the new cluster’s network graphic and details (number of nodes and edges, and the new cluster score). Since clusters can expand to large and sometimes unreasonable sizes, the layout step may need extra time to complete. When this occurs, a progress bar appears in the Cluster Pane. There is no need to wait for the cluster to be drawn, however: the Cluster Pane remains responsive to the slider’s movements. If the new cluster exceeds 1,000 nodes, a placeholder (“Too big to show”) is drawn instead, since the graphic representation might take too long to compute.

Postprocessing
  • Haircut

    Once a cluster has been found, some nodes that satisfied the Degree Cutoff parameter may be connected to the cluster by only one edge. When Haircut is turned on, AllegroMCODE removes all such singly-connected nodes from clusters. In rare cases, the cluster’s seed node may itself be only singly connected to the cluster and is then removed by Haircut.

    [Figure: Haircut on]
    [Figure: Haircut off]
  • Fluff

    When turned on, AllegroMCODE expands cluster cores by one neighbor shell outwards as shown below, according to the Density Cutoff parameter and after the optional haircut step.

    Density Cutoff

    Node density is calculated by dividing the node’s connections by the maximum number of connections possible for that node. If Fluff is turned on, this parameter controls the neighbor inclusion criteria during ‘fluffing’: only nodes whose node density is no less than the Density Cutoff are included. Fluff expansion occurs after the cluster has already been defined by the algorithm and thus allows clusters to overlap at their edges. A higher value will expand clusters more.

[Figure: Haircut on]
[Figure: Haircut and Fluff on]
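
The two postprocessing steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the plugin's implementation: the function names `haircut` and `fluff` are hypothetical, Haircut is shown as a single pass (the text does not say whether the plugin repeats it), Fluff follows one reading of the Density Cutoff description (a neighbor is added when its own density is no less than the cutoff), and node densities are supplied as precomputed values rather than derived here.

```python
def haircut(adjacency, cluster):
    """Drop nodes joined to the rest of the cluster by exactly one edge.

    Single pass only; whether the plugin repeats the step is an open
    assumption of this sketch.
    """
    kept = set()
    for node in cluster:
        edges_inside = sum(1 for n in adjacency[node] if n in cluster)
        if edges_inside != 1:
            kept.add(node)
    return kept


def fluff(adjacency, cluster, density, density_cutoff):
    """Expand the cluster by one shell of neighbors.

    Assumed rule: a neighbor outside the cluster is added when its
    precomputed density is no less than `density_cutoff`.
    """
    added = set()
    for node in cluster:
        for neighbor in adjacency[node]:
            if neighbor not in cluster and density.get(neighbor, 0.0) >= density_cutoff:
                added.add(neighbor)
    return cluster | added


# Toy network: D hangs off the cluster by one edge; F and G lie outside it.
adjacency = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "F"},
    "D": {"B"},
    "F": {"C", "G"},
    "G": {"F"},
}
density = {"D": 0.2, "F": 0.6, "G": 0.9}  # hypothetical precomputed values

cluster = {"A", "B", "C", "D"}
trimmed = haircut(adjacency, cluster)                               # D is removed
fluffed = fluff(adjacency, trimmed, density, density_cutoff=0.5)    # F is added
```

Haircut runs first and Fluff second, matching the order given in the text. G is not added even though its density is high, because it is two steps away from the cluster and Fluff expands by only one shell.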
Node Attribute Enumerator
The Enumerator provides a numerical summary of node attribute values possessed by the currently explored cluster’s members. It is meant to inform the user of the cluster’s contents and aid in determining the cluster’s functional relevance. All node attributes that are available for the loaded network are listed in the select box. When an attribute selection is made in one exploration, it persists for all cluster explorations within the given result.

The table below the select box has two columns and can be sorted by a column in ascending or descending order:

  1. Value
    This column lists all node attribute values that are possessed by the cluster being explored.
  2. Occurrence
    • This column simply displays the number of nodes possessing the particular attribute value listed in each row.
    • The Occurrence numbers are best interpreted when compared with the number of nodes in the cluster. For example, when enumerating Biological Process GO Terms, it may be a good indicator that the given cluster is biologically relevant if 9 of the 10 cluster members share some specific value.
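
The Value and Occurrence columns amount to a frequency count over the cluster's members. A minimal Python sketch, assuming node attributes are held in a plain dictionary (the function name `enumerate_attribute` and the data layout are hypothetical and not part of the plugin):

```python
from collections import Counter


def enumerate_attribute(cluster, node_attributes, attribute):
    """Return (value, occurrence) rows for one attribute, most frequent first.

    A node with several values for the attribute (for example several
    GO terms) is counted once per distinct value.
    """
    counts = Counter()
    for node in cluster:
        value = node_attributes.get(node, {}).get(attribute)
        if value is None:
            continue  # node has no value for this attribute
        values = value if isinstance(value, (list, tuple, set)) else [value]
        counts.update(set(values))
    # Sort by occurrence (descending), then by value for a stable order.
    return sorted(counts.items(), key=lambda row: (-row[1], row[0]))


# Ten cluster members; nine share the same (made-up) annotation.
cluster = {f"n{i}" for i in range(10)}
node_attributes = {f"n{i}": {"GO Biological Process": "DNA repair"} for i in range(9)}
node_attributes["n9"] = {"GO Biological Process": "transport"}

rows = enumerate_attribute(cluster, node_attributes, "GO Biological Process")
top_value, top_count = rows[0]
share = top_count / len(cluster)  # occurrence relative to cluster size
```

Comparing `top_count` with `len(cluster)` mirrors the advice above: the count is most informative when read against the number of nodes in the cluster.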

In combination with the Size Threshold Slider, the Enumerator can be used to optimize clusters based on functional relevance. As the slider is moved, the Enumerator automatically reports changes in cluster content for the selected attribute.