Late last week, Simon sent me the new code, revised so as to give us a look under the hoods of the principal components. I know, a mixed metaphor, shoot me! The principal component is hardly an automobile; still, a principal component is a kind of vehicle, and is certainly hiding something. Let’s lift the hood! I wrote to Simon asking for help in early March, but because he was teaching and busy, I had to wait. I finished a bunch of simpler analyses (described partly in the St Patrick’s Day post), organized the paper clips, checked my social media dopamine reservoirs, and waited.
The code I waited for runs in MATLAB and is named goScript. Simon wrote the first version when I was in Nottingham. The heart of goScript is to run a principal component analysis on the raw data for a root. Recall that the raw data in question comprises 37 velocity profiles, namely a profile every 5 min for three hours. To the eye, these profiles all seem superimposed (Fig. 1). While it runs, goScript plots the first three components versus time, where the first principal component (almost always) varies in a rough sinusoid or oscillation (Fig. 2).
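For readers who like to see the idea in code, here is a minimal sketch of that kind of analysis. It is written in Python rather than MATLAB and is not goScript itself. The data are synthetic stand-ins: the number of positions, the two-slope shape, the 90-minute wobble, and the noise level are all assumptions made up for illustration. The sketch runs a principal component analysis by way of a singular value decomposition and keeps the scores of the first three components, which are the traces one would plot against time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data, NOT real measurements:
# 37 profiles (one every 5 min for 3 h), each sampled at 200 positions.
n_times, n_pos = 37, 200
t = np.arange(n_times) * 5.0          # minutes
x = np.linspace(0.0, 1.0, n_pos)      # mm along the root

# Assumed two-slope profile shape, scaled by an assumed slow oscillation.
base = np.where(x < 0.5, 0.1 * x, 0.05 + 1.0 * (x - 0.5))
wobble = 0.05 * np.sin(2 * np.pi * t / 90.0)
profiles = base[None, :] * (1.0 + wobble[:, None])
profiles += rng.normal(scale=1e-3, size=profiles.shape)

# PCA via SVD of the mean-centred matrix (rows = time points).
centred = profiles - profiles.mean(axis=0)
U, S, Vt = np.linalg.svd(centred, full_matrices=False)

scores = U * S                        # component scores at each time point
explained = S**2 / np.sum(S**2)       # fraction of variance per component
pc_scores = scores[:, :3]             # first three components versus t
```

In this toy case the first component simply recovers the oscillation that was built in. With real profiles, the rows of `Vt` show which part of the velocity profile each component is tracking.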
Well, I saw that bouncing blue line way back in 2015. And over the years since then, I have tried to explain it; I mean, to find what component of the velocity profile varies. Looking at the overall shape of a velocity profile (Fig. 1), we can discern three zones: the first goes from zero to about 0.4 mm on the x-axis and has velocity increasing gradually; the second goes from about 0.4 to 0.6 mm and has velocity accelerating; and the third goes from about 0.6 mm to the end and has velocity increasing steeply. The slope of the velocity curve equals the relative growth rate of the underlying tissue. The first region corresponds to the meristem, where cells divide and grow slowly. The third region corresponds to the elongation zone, where cells grow rapidly. Indeed, the slope in the third region is nearly ten times that of the first.
Therefore, as a candidate to account for the variation over time, we might nominate either of those slopes (growth rate of the meristem, growth rate of the elongation zone) or the position of the transition between them (a boundary). As I have worked on this project, some of these candidates have been field-tested. For example, Simon added to goScript a few lines of code that run a linear regression on an interval within the elongation zone. However, those explorations were ad hoc. Informative maybe but not rigorous, I doubt even publishable.
To set our numerical analysis on a proper foundation, Simon and I concluded that the velocity profile can be fitted to a pair of regression lines. That is, let math find the best-fit pair of lines through the data. Because the first and third regions are mainly linear (Fig. 1), the fits should be decent. And the intersection, where these two lines meet, can be taken as the transition between the zones, that is, as the boundary.
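One way to picture such a fit is sketched below in Python. I do not know how Simon's MATLAB code finds its pair of lines, so the brute-force approach here is an assumption: try every split point, fit an ordinary least-squares line on each side, and keep the split with the smallest total squared error. The function name `fit_two_lines`, the `min_pts` parameter, and the synthetic profile are all hypothetical.

```python
import numpy as np

def fit_two_lines(x, v, min_pts=5):
    """Best pair of least-squares lines, found by trying every split point."""
    best = None
    for k in range(min_pts, len(x) - min_pts + 1):
        left = np.polyfit(x[:k], v[:k], 1)     # [slope, intercept]
        right = np.polyfit(x[k:], v[k:], 1)
        sse = (np.sum((np.polyval(left, x[:k]) - v[:k]) ** 2)
               + np.sum((np.polyval(right, x[k:]) - v[k:]) ** 2))
        if best is None or sse < best[0]:
            best = (sse, left, right)
    _, (m1, b1), (m2, b2) = best
    # The lines meet where m1*x + b1 == m2*x + b2.
    boundary = (b1 - b2) / (m2 - m1)
    return m1, b1, m2, b2, boundary

# Synthetic profile (made up): slopes 0.1 and 1.0, meeting at x = 0.5 mm.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 101)
v = np.where(x < 0.5, 0.1 * x, 0.05 + 1.0 * (x - 0.5))
v = v + rng.normal(scale=0.002, size=x.size)

m1, b1, m2, b2, boundary = fit_two_lines(x, v)
```

Here `m1` and `m2` play the role of the two growth rates and `boundary` is the position where the fitted lines cross. Real profiles have a curved transition between the linear regions, so the crossing point is a summary of where the transition sits, not a sharp edge.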
This week he sent me the new goScript, appropriately modified, with the ad hockery stripped out (well, commented out, just in case), and a nice pairwise linear regression added. I lost no time (well, not true, I lost a little time) pushing a set of 12 roots through. And by golly the results looked different from the ad hocs, and my previous ideas lost their mooring and floated out to sea. Next, I went to make a kind of smorgasbord plot, to compare results among roots. I discovered that the lines of code saving the new plots were absent. Oops. Plots are displayed while goScript runs but not saved. Fortunately, adding those lines is something my limited coding can handle. But but but one of the missing plots shows a raw velocity profile (from one of the 37 time points) together with its best-fit lines. To my dismay, the regression line for the first slope appeared to be constrained to go through zero. Such a constraint is arbitrary and unneeded. And it will make the fit worse.
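To see why forcing a line through zero can hurt, here is a small Python sketch on made-up numbers. The slope, the offset of 0.02, and the noise level are assumptions chosen for illustration, not values from any root. The comparison itself is general: a line with a free intercept can never have a larger squared error than the best line through the origin, because the through-origin line is just the special case where the intercept is zero.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data for the first zone: a shallow line with a small offset.
x = np.linspace(0.05, 0.4, 40)                              # mm
v = 0.02 + 0.1 * x + rng.normal(scale=0.001, size=x.size)   # assumed values

# Constrained fit, v = m0 * x (forced through the origin).
m0 = np.sum(x * v) / np.sum(x * x)
sse_origin = np.sum((v - m0 * x) ** 2)

# Free fit, v = m * x + b (intercept estimated from the data).
m, b = np.polyfit(x, v, 1)
sse_free = np.sum((v - (m * x + b)) ** 2)
```

In this example the constrained slope `m0` comes out well above the true 0.1, because the line has to tilt to absorb the offset it is not allowed to have. Since the slope is what we read as a growth rate, a bias like that would matter.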
Late on Friday, I had to email Simon, begging that goScript be tweaked again. Sigh. Tantalized because it looks as though the ad hoc analysis misled me. The new analysis is more … ambiguous. But will it change with a better fit? I wonder. With any luck, freeing the fit of that constraint will be easy and I’ll be able to jam this week. We’ll see…