Experimental Structure and Conditions

Set the experimental structure and the Label-free experiment conditions

This page explains how a user can define the experimental structure of their dataset.


The experimental structure datatable and setting an experimental coordinate

After loading a dataset, the user is presented with an interactive, initially blank data table that has 4 columns for non-LFQ datasets and 5 columns for LFQ datasets. The table lists every raw file the dataset was derived from (one for each MS run), with the file name in the first column. As described in Quickstart, in a typical experiment each raw file is matched to a biological replicate (brep), a technical replicate (trep), a fraction (frac) and, in LFQ experiments only, a condition (cond). PS sets one more attribute on each raw file, “used”, which records whether the file is used in the analysis or not. This information is stored in rawfiles_structure, which is defined as below:

// rawfiles_structure is an array of objects having the following structure:
// {rawfile: ..., biorep: ..., techrep: ..., fraction: ..., used: ...}
// as an example, rawfiles_structure might look like
// 0: {rawfile: "D131106_014.raw", biorep: 1, techrep: 1, fraction: 1, used: true}
// 1: {rawfile: "D131106_016.raw", biorep: 2, techrep: 1, fraction: 1, used: true}
// 2: {rawfile: "D131106_017.raw", biorep: 3, techrep: 1, fraction: 1, used: true}

Notice that the condition information for LFQ experiments is stored separately in RawFileConditions, whose entries have the structure:

// {name: ..., condition: ...}
// e.g. 0: {name: "D131106_014.raw", condition: "Mut"} 
// ...
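
Because the two arrays share only the raw file name, any code that needs a raw file's full coordinate has to join them on that name. The following is a minimal sketch of such a join, not code taken from PS: the helper conditionOf, the variable names and the "WT" condition are illustrative assumptions.

```javascript
// Illustrative data in the shapes described above (values are made up).
const exampleStructure = [
  { rawfile: "D131106_014.raw", biorep: 1, techrep: 1, fraction: 1, used: true },
  { rawfile: "D131106_016.raw", biorep: 2, techrep: 1, fraction: 1, used: true },
];
const exampleConditions = [
  { name: "D131106_014.raw", condition: "Mut" },
  { name: "D131106_016.raw", condition: "WT" }, // "WT" is a hypothetical label
];

// Hypothetical helper: the condition assigned to a raw file,
// or undefined if the file has no entry yet.
function conditionOf(rawfileName, conditions) {
  const entry = conditions.find((c) => c.name === rawfileName);
  return entry ? entry.condition : undefined;
}

// Join the two arrays on the raw file name.
const joined = exampleStructure.map((r) => ({
  ...r,
  condition: conditionOf(r.rawfile, exampleConditions),
}));
console.log(joined.map((r) => `${r.rawfile}: ${r.condition}`));
```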

Let’s take a simple non-LFQ dataset as an example. When the raw files are loaded in step 1, the data table in step 2 is refreshed to display all raw files, with the columns brep, trep and fraction set to “-” as predefined values. At this point rawfiles_structure is an empty array. To assign a brep and trep to a raw file, the user selects it in the data table and types the brep and trep numbers in the box on the right (e.g. 1 - 1). Hitting the Assign button calls assignExpStructCoord in JS, which populates rawfiles_structure with an entry like {rawfile: "D131106_014.raw", biorep: 1, techrep: 1, fraction: 1, used: true}.

The aim of PS is to minimize user interaction, so many raw files can be assigned at once. If several raw files are fractions of the same technical replicate, the user can select them all and assign a brep and trep to them; assignExpStructCoord will then assign successive fraction numbers to all raw files under the same brep-trep pair. Another trick is to leave the trep text box blank and assign one brep to many raw files. In this case PS assigns this brep to all selected raw files and gives them successive treps (all of them get a fraction of 1).

The function in JS is well commented, and its implementation can be described in a few steps. Get all selected raw file table rows and store them in the items array. Then, for each of them, delete its existing entry in rawfiles_structure. Prepare the variables curr_biorep, curr_techrep and curr_fraction: if the user defined both brep and trep, the first selected raw file encountered should get curr_biorep and curr_techrep equal to the defined brep and trep respectively, and fraction 1. An issue here is that the user may already have assigned the same brep-trep pair to other raw files, so the first fraction assigned must be the maximum fraction already assigned for this brep-trep pair, plus 1.

To handle this, PS uses rep_counts, a nested associative array (in JS terms, a nested array of objects) built so that rep_counts["biorep"][def_biorep].techrep[def_techrep].fraction.length - 1 returns the aforementioned maximum. The following block can therefore assign the correct fraction number to the new assignments:

// fractions auto completion
if (def_biorep in rep_counts["biorep"] && def_techrep in rep_counts["biorep"][def_biorep].techrep && !isLabelFree)
{
	// The user has already assigned at least one other raw file to the same brep-trep pair
	curr_fraction = (rep_counts["biorep"][def_biorep].techrep[def_techrep].fraction.length - 1) + rep_offset++;
}
else
{
	curr_fraction = rep_offset++;
}
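
To make the length - 1 trick concrete, here is a small self-contained sketch. It is not the PS implementation: buildRepCounts and nextFractions are hypothetical helpers, the label-free branch is left out, and rep_offset is assumed to start at 1 (which matches the text's statement that the first fraction of a new brep-trep pair is 1).

```javascript
// Hypothetical helper: build a rep_counts-like structure from an array
// shaped like rawfiles_structure. Sparse arrays are indexed by the
// brep / trep / fraction numbers, so index 0 stays empty.
function buildRepCounts(structure) {
  const counts = { biorep: [] };
  for (const { biorep, techrep, fraction } of structure) {
    if (!counts.biorep[biorep]) counts.biorep[biorep] = { techrep: [] };
    const treps = counts.biorep[biorep].techrep;
    if (!treps[techrep]) treps[techrep] = { fraction: [] };
    treps[techrep].fraction[fraction] = 1;
  }
  return counts;
}

// Hypothetical helper: fraction numbers for `howMany` newly selected raw
// files assigned to (biorep, techrep), continuing after the current maximum.
function nextFractions(counts, biorep, techrep, howMany) {
  let repOffset = 1; // assumption: the offset starts at 1
  const known =
    biorep in counts.biorep && techrep in counts.biorep[biorep].techrep;
  // Highest index ever set is length - 1 because index 0 is unused.
  const base = known
    ? counts.biorep[biorep].techrep[techrep].fraction.length - 1
    : 0;
  const result = [];
  for (let i = 0; i < howMany; i++) result.push(base + repOffset++);
  return result;
}

const alreadyAssigned = [
  { rawfile: "a.raw", biorep: 1, techrep: 1, fraction: 1, used: true },
  { rawfile: "b.raw", biorep: 1, techrep: 1, fraction: 2, used: true },
];
const demoCounts = buildRepCounts(alreadyAssigned);
console.log(nextFractions(demoCounts, 1, 1, 3)); // [ 3, 4, 5 ]
console.log(nextFractions(demoCounts, 2, 1, 2)); // [ 1, 2 ]
```

The `in` checks work on sparse arrays because only indices that were actually assigned count as present, which is why an unseen brep or trep falls through to the else branch.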

If the user left the trep text box blank, the auto-completion of treps takes place in a similar way. The structure of rep_counts is quite interesting, if somewhat cryptic. For example in a SILAC experiment with structure of 2 bioreps, 2 tech reps each and 12 fractions each, after setting up the whole structure rep_counts should look like this:

biorep
1:
	techrep
	1:
		fraction
		1:
			1
		2:
			1
		3:
			1
		4:
			1
	2:
		fraction
		1:
			1
		2:
			1
		3:
			1
		4:
			1
2:
	techrep
	1:
		fraction
		1:
			1
		2:
			1
		3:
			1
		4:
			1
	2:
		fraction
		1:
			1
		2:
			1
		3:
			1
		4:
			1

so that rep_counts["biorep"] is an array with two elements (at keys 1 and 2), each of which is an object holding a single member, “techrep”, which is an array built the same way as biorep. Each techrep element in turn holds “fraction”, another similar array whose only difference is that its elements always have the value 1. This array is rebuilt every time rawfiles_structure changes, using the set_reps function. set_reps is quite simple: it iterates through the whole rawfiles_structure array and makes sure that each defined brep, trep and fraction is set in the appropriate position in rep_counts. The procedure described above is not flawless, however. For example, setting 3 raw files to the same brep such as

// 0: {rawfile: "D131106_014.raw", biorep: 1, techrep: 1, fraction: 1, used: true}
// 1: {rawfile: "D131106_016.raw", biorep: 1, techrep: 2, fraction: 1, used: true}
// 2: {rawfile: "D131106_017.raw", biorep: 1, techrep: 3, fraction: 1, used: true}

and then reassigning D131106_016.raw to brep 2, trep 1, frac 1 (used: true), and then assigning a new raw file to biorep 1 while leaving the trep text box blank, will give the following result:

// 0: {rawfile: "D131106_014.raw", biorep: 1, techrep: 1, fraction: 1, used: true}
// 1: {rawfile: "D131106_016.raw", biorep: 2, techrep: 1, fraction: 1, used: true}
// 2: {rawfile: "D131106_017.raw", biorep: 1, techrep: 3, fraction: 1, used: true}
// 3: {rawfile: "D131106_018.raw", biorep: 1, techrep: 4, fraction: 1, used: true}

although one might hope for

// 3: {rawfile: "D131106_018.raw", biorep: 1, techrep: 2, fraction: 1, used: true}

This is not a problem for PS, though, since it supports non-successive bioreps and techreps. It might even be desirable: for example, someone takes 4 tech reps for a specific bio rep and the second turns out to be erroneous. It would be fair to exclude this tech rep from the analysis and name the others 1-3-4 as above.

The same does not hold for fractions, which should stay successive. This is why, after building rawfiles_structure and calling set_reps, the function refresh_fractions is also called. refresh_fractions is simple enough in labelled cases: it implements a double nested iteration over all raw files and successively assigns fraction numbers to all raw files of the same brep and trep in rawfiles_structure. For label-free experiments it does exactly the same thing, with the only difference that it looks up the corresponding condition in RawFileConditions, and in the double nested iteration it also makes sure that the raw files in a group receiving successive fraction numbers share not only the same brep and trep but also the same condition. Finally, refresh_fractions refreshes the displayed fraction number in the raw files data table in Stage 2. assignExpStructCoord is approximately the same for label-free cases.

set_s22btnf is another function that is called (in fact, it is called by set_reps); it checks whether all necessary conditions are met to advance to Stage 3. At the moment these are:

  • All raw files must have been assigned to a brep, trep and frac (or flagged as not used), i.e. rawfiles_structure.length == rawfiles.length
  • At least one raw file should be used for the analysis
  • At least 2 bioreps should be used (this is a limma limitation)
  • For LFQ data, all raw files should be assigned to a condition (or set to used == false)

A known issue here is that in LFQ experiments R returns an error if only one brep is set for a specific condition, and this is not validated by set_s22btnf. This is very rare but should be fixed at some point.
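
The checks listed above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the actual set_s22btnf: canAdvanceToStage3 and conditionsWithTooFewBioreps are hypothetical names, "at least 2 bioreps" is read as two distinct biorep numbers among the used raw files, and the second function shows one possible shape for the per-condition check that is currently missing.

```javascript
// Hypothetical sketch of the four Stage 2 checks.
function canAdvanceToStage3(structure, rawfiles, conditions, isLabelFree) {
  const allAssigned = structure.length === rawfiles.length;
  const used = structure.filter((r) => r.used);
  const anyUsed = used.length > 0;
  // Assumption: "2 bioreps" means two distinct biorep numbers in use.
  const enoughBioreps = new Set(used.map((r) => r.biorep)).size >= 2;
  const withCondition = new Set(conditions.map((c) => c.name));
  const allHaveCondition =
    !isLabelFree || used.every((r) => withCondition.has(r.rawfile));
  return allAssigned && anyUsed && enoughBioreps && allHaveCondition;
}

// Hypothetical extra check: names of conditions whose used raw files
// cover fewer than 2 distinct bioreps.
function conditionsWithTooFewBioreps(structure, conditions) {
  const condByName = new Map(conditions.map((c) => [c.name, c.condition]));
  const brepsPerCond = new Map();
  for (const r of structure) {
    if (!r.used || !condByName.has(r.rawfile)) continue;
    const cond = condByName.get(r.rawfile);
    if (!brepsPerCond.has(cond)) brepsPerCond.set(cond, new Set());
    brepsPerCond.get(cond).add(r.biorep);
  }
  return [...brepsPerCond]
    .filter(([, breps]) => breps.size < 2)
    .map(([cond]) => cond);
}

// Illustrative data: "WT" is covered by a single biorep only.
const demoRawfiles = ["a.raw", "b.raw", "c.raw"];
const demoStructure = [
  { rawfile: "a.raw", biorep: 1, techrep: 1, fraction: 1, used: true },
  { rawfile: "b.raw", biorep: 2, techrep: 1, fraction: 1, used: true },
  { rawfile: "c.raw", biorep: 1, techrep: 1, fraction: 1, used: true },
];
const demoConditions = [
  { name: "a.raw", condition: "Mut" },
  { name: "b.raw", condition: "Mut" },
  { name: "c.raw", condition: "WT" },
];
console.log(canAdvanceToStage3(demoStructure, demoRawfiles, demoConditions, true)); // true
console.log(conditionsWithTooFewBioreps(demoStructure, demoConditions)); // [ 'WT' ]
```

In this example the four existing checks pass even though the "WT" condition has a single brep, which is the situation the known issue describes.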

Setting LFQ conditions

In LFQ datasets, conditions are assigned to the raw files through a separate array, RawFileConditions, which is an array of objects with the structure:

{name: "D131106_014.raw", condition: "Mut"}

A context menu is attached to the raw files data table in rawFileDataTableCMenuInit. This function is easy to read and implements some simple commands such as “select all”. The “Assign condition” option is available only in LFQ experiments and calls the onAssignCondition function, which simply displays the #s2LFQConditions dialog box. In this dialog box, the user can select a previously assigned condition or type in a new one. Hitting the OK button calls the ons2LFQConditionsOK_click function, which stores the selected or typed condition in CondToAdd. It then finds all entries in RawFileConditions that correspond to the selected raw files and erases them to avoid duplicates. After that it iterates through all selected rows in the raw files data table and adds an entry to RawFileConditions with the name of the selected raw file and the condition CondToAdd:

RawFileConditions.push({name: $(items_name).text(), condition: CondToAdd});
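
The erase-then-push pattern described above can be sketched without the DOM as follows. This is a simplified illustration, not the body of ons2LFQConditionsOK_click: assignCondition is a hypothetical helper that takes the selected raw file names as plain strings instead of reading them from the selected table rows, and returns a new array rather than mutating RawFileConditions.

```javascript
// Hypothetical helper: drop any existing entries for the selected raw
// files, then add one entry per selected file with the new condition.
function assignCondition(conditions, selectedNames, condToAdd) {
  const selected = new Set(selectedNames);
  const kept = conditions.filter((c) => !selected.has(c.name));
  for (const name of selectedNames) {
    kept.push({ name, condition: condToAdd });
  }
  return kept;
}

// Illustrative data: b.raw is re-assigned, c.raw is assigned for the first time.
const before = [
  { name: "a.raw", condition: "Mut" },
  { name: "b.raw", condition: "Mut" },
];
const after = assignCondition(before, ["b.raw", "c.raw"], "WT");
console.log(after.map((c) => `${c.name}=${c.condition}`));
// [ 'a.raw=Mut', 'b.raw=WT', 'c.raw=WT' ]
```

Removing the old entries first guarantees that each raw file name appears at most once, so a later lookup by name is unambiguous.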

After adding the entries, ons2LFQConditionsOK_click refreshes the fractions and calls set_s22btnf to check whether the Next button should be enabled. This completes this documentation page. After completing Stage 2 in a regular analysis, the rawfiles_structure and RawFileConditions arrays must be set correctly. In the case of Replication Multiplexing many more global variables must have been set, but this will be discussed later, in advanced procedures.