
Create a Module Network: Identify Regulatory Networks from Expression Data


A module network is a probabilistic model, based on probabilistic graphical models and Bayesian networks, for identifying regulatory modules from gene expression data. The procedure identifies modules of co-regulated genes, their regulators and the conditions under which regulation occurs, generating testable hypotheses in the form 'regulator X regulates module Y under conditions W'. We applied this method to construct a regulatory network underlying the response of yeast to stress.

Step 1: Load expression data


The first step is to load the expression data for which you want to construct a module network. Details on how to load expression data and on the Genomica file formats for expression data are given here. In this tutorial we assume that you load the module network sample expression data. Other expression data are available here.

Step 2: Create the module network


You are now ready to create the module network. Choose Algorithms -> Learn module network. The dialog box should look similar to the following:



Move to the Regulation panel, where you can control various parameters of the learned modules' regulation programs. The default parameters are suitable for many applications. For now, however, since the sample file we use is small, we will use greater lookahead and allow smaller experiment partitions. Change Lookahead depth to 1 and Min experiments per context to 3. The dialog box should look similar to the following:



To create the module network, press the Run button. The module network is now ready.


Step 3: Displaying the module network results


Once a module network has been learned, there are several ways to view the results. First, you can obtain a global bird's-eye view of the results by selecting the birdseye view from the Tab panels. After enlarging the image using the 'Pixel Size' controls in the left control panel, the birdseye view should look similar to:



This birdseye view shows the modules as horizontal strips (in this sample case, there are two such strips and thus two modules). For each module, the arrays are sorted by the regulation program, and each split in the tree is shown as separate blocks divided by yellow lines (the color and thickness of these block boundary lines can be controlled by the 'Border' properties in the left control panel). You can also view the entire structure of the module network tree that was learned by selecting the 'Tree' tab from the main Tab panel. After expanding all levels of the tree, it should look similar to:

Each node in this tree represents a split. By selecting a node in the tree, you can view a particular module. However, for large files with many modules, this tree may become quite large and difficult to navigate. Thus, the preferred way to examine a specific module along with its regulation program is to go back to the birdseye view and select a module that seems interesting by clicking on its expression profile. For example, going back to the birdseye view above, you can select the leftmost yellow-bordered block and then go to the Cluster view. The resulting cluster view should look similar to:



Note that arrays 7-9 are ordered on the leftmost panel since these arrays are the ones that we selected from the birdseye view. In order to view the full regulation program for this module, you need to set the following configurations, if they are not already configured in your version of Genomica. First, in the left control panel, select "By Descendants" from the "Sort Experiments" combo box. Second, from the menu "View -> Cluster", select to show the "Experiments Tree". After choosing these configurations, you can view the regulation program by going to the appropriate level of the tree. In our example, click the up-arrow in the left control panel once. Alternatively, enter the cluster number "5" in the box next to the arrows in the left control panel. You should now be able to see the module along with its full regulation program as follows:



The regulation program shows that the module has two splits. The first is on the gene 'Name 1', splitting arrays 7-9 on the left from arrays 1-6 on the right. The second split uses the gene 'Name 2' to separate the induced arrays 4-6 from arrays 1-3. In general, the regulation program is defined by such decision trees, where each node in the tree represents a query on the value of a regulatory gene. For instance, in our sample the first node represents the query 'Is Name 1 up-regulated?' All arrays for which the answer is 'true' fall on the right-hand side of the split, and all other arrays fall on the left-hand side. Thus, each leaf corresponds to a context containing exactly the arrays whose answers to the queries along the path to that leaf match the branches taken. For more details on how this network was learned, see the Methods section of the original module networks paper.
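The way a regulation program routes arrays to contexts can be sketched in a few lines of Python. This is only an illustration, not Genomica's implementation: the class names, the threshold of 0.0 for "up-regulated", the expression values, and the assumption that 'Name 2' is up-regulated in arrays 4-6 are all hypothetical choices made to mirror the sample described above.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Leaf:
    label: str  # name of the context (a set of arrays) at this leaf


@dataclass
class Split:
    regulator: str          # gene whose expression is queried at this node
    if_false: "Node"        # left branch: regulator is not up-regulated
    if_true: "Node"         # right branch: regulator is up-regulated
    threshold: float = 0.0  # hypothetical cutoff for "up-regulated"


Node = Union[Leaf, Split]


def find_context(node: Node, array: Dict[str, float]) -> str:
    """Follow the queries from the root until a leaf (context) is reached."""
    while isinstance(node, Split):
        up = array[node.regulator] > node.threshold
        node = node.if_true if up else node.if_false
    return node.label


# Two splits, as in the sample: first on 'Name 1', then on 'Name 2'.
program = Split(
    "Name 1",
    if_false=Leaf("Name 1 not up"),
    if_true=Split(
        "Name 2",
        if_false=Leaf("Name 1 up, Name 2 not up"),
        if_true=Leaf("Name 1 up, Name 2 up"),
    ),
)

# Made-up regulator expression values for arrays 1-9 (illustration only).
arrays = {
    i: {
        "Name 1": 1.0 if i <= 6 else -1.0,
        "Name 2": 1.0 if 4 <= i <= 6 else -1.0,
    }
    for i in range(1, 10)
}

# Group the arrays by the context (leaf) they end up in.
contexts: Dict[str, List[int]] = {}
for i, values in arrays.items():
    contexts.setdefault(find_context(program, values), []).append(i)

for label, members in contexts.items():
    print(label, members)
```

With these made-up values, arrays 7-9 take the left branch of the first split, and arrays 1-6 are divided by the second split into arrays 1-3 and arrays 4-6.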

You might also want to obtain other global views of the results. For instance, if you want to see all the modules that are predicted to be regulated by a specific regulator, you can select from the menu "Analyze -> Tree Splits", select "All experiment splits", and get back an Excel-like spreadsheet, which you can then sort by predicted regulators and from which you can select certain rows to examine specific modules. As another global view, you can select from the menu "Analyze -> Tree Splits" and select the "Visual display". This should result in a global view where each row represents one module but, unlike the birdseye view, the arrays are shown in their original order and the expression of each module is shown as the average expression of all its genes. In addition, a second panel for each module shows which genes were selected as its regulators. For our sample file, this view looks similar to:
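The per-module summary used in this visual display, the average expression of a module's genes in each array, can be sketched as follows. The function name, gene names, and expression values are hypothetical and serve only to show the calculation; Genomica computes this internally.

```python
from typing import Dict, List


def module_average(expression: Dict[str, List[float]],
                   module_genes: List[str]) -> List[float]:
    """Average the expression of the module's genes, array by array.

    Arrays stay in their original order (the order of the input lists).
    """
    n_arrays = len(expression[module_genes[0]])
    return [
        sum(expression[gene][j] for gene in module_genes) / len(module_genes)
        for j in range(n_arrays)
    ]


# Made-up expression values for two genes over three arrays.
expression = {
    "gene A": [1.0, 0.0, -2.0],
    "gene B": [3.0, 0.0, -4.0],
}

profile = module_average(expression, ["gene A", "gene B"])
print(profile)
```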



Testing Module Network Robustness


After learning a module network, it is natural to wonder whether any other regulators might have fit the data with similar likelihood. To test the robustness of the regulators of any modules in a learned network, choose Algorithms -> Test regulation program robustness. The dialog box should look similar to the following:



To run the test, press the Run button. The result is a grid like this:



In this example, the results verify that Genes 1 and 2 are significant regulators for the module represented by Cluster 5. For each module, the robustness test randomly resamples K arrays from the original K arrays. (There are nine arrays, or experiments, in this example.) The resampling is done with replacement, so the resampled data set will duplicate the data for some arrays. Genomica learns a regulation program for this data set and then repeats the sampling and learning. (There are 100 trials in this example.) Tallies are kept of how often each regulator appears in a learned regulation program. The more often a regulator is used, the more essential the regulator can be considered.
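The resampling procedure described above can be sketched in Python. This is a minimal illustration under stated assumptions: the function names and the data are hypothetical, and `pick_regulator` is a simple stand-in (it picks the candidate whose expression is most strongly correlated with the module) rather than Genomica's actual regulation program learner.

```python
import random
from collections import Counter
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def pick_regulator(regulators: Dict[str, List[float]],
                   module: List[float], idx: List[int]) -> str:
    """Stand-in learner (an assumption): the best-correlated candidate."""
    sampled_module = [module[i] for i in idx]
    return max(
        regulators,
        key=lambda g: abs(pearson([regulators[g][i] for i in idx],
                                  sampled_module)),
    )


def robustness(regulators: Dict[str, List[float]], module: List[float],
               trials: int = 100, seed: int = 0) -> Counter:
    """Tally how often each regulator is chosen across bootstrap trials."""
    rng = random.Random(seed)
    k = len(module)
    tally: Counter = Counter()
    for _ in range(trials):
        # Resample K arrays from the original K arrays, with replacement.
        idx = [rng.randrange(k) for _ in range(k)]
        tally[pick_regulator(regulators, module, idx)] += 1
    return tally


# Made-up data for nine arrays: one informative and one noisy candidate.
module = [0.0, 0.0, 0.0, 2.0, 2.0, 2.0, -2.0, -2.0, -2.0]
regulators = {
    "Name 1": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0],
    "Noise": [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, 0.1, -0.3, 0.2],
}

tally = robustness(regulators, module, trials=100)
print(tally.most_common())
```

A regulator that is chosen in nearly every trial is robust to which arrays happen to be in the data set, while one that appears only occasionally is likely to be interchangeable with other candidates.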