Compare commits

2 commits

Author SHA1 Message Date
4cc087596e added ts for 3 clusters 2023-11-07 09:33:13 +01:00
1ef3d7a0fa worked a bit on coverletter 2023-11-07 09:30:47 +01:00
3 changed files with 66 additions and 8 deletions

Binary file not shown.

@ -118,7 +118,7 @@ The noise resulting from the measurement uncertainty inhibits the accuracy of an
\reply{\sophie{now (find sources)}}
\commentB{Line 48}{What bias?}
\reply{\sophie{now}}
\reply{\sophie{now} }
\commentB{Lines 69-71}{ Maybe this sentence referring to charge states could be moved to/merged with line 83?}
\reply{\sophie{now}}
@ -268,7 +268,7 @@ The noise resulting from the measurement uncertainty inhibits the accuracy of an
\reply{This is a good point, and it is one of the points we want to make. We want to encourage people to also consider measuring charge state composition, so that it becomes more widely available and we can understand the solar wind better. \sophie{the wording here is not yet optimal}}
\commentB{Line 412}{The other properties are also "source-dependent", but the relevant point is that the charge states are preserved during transport to 1 AU.}
\reply{\sophie{now}}
\reply{The phrasing on our side was indeed a bit misleading. We corrected this accordingly. \sophie{TODO}}
\commentB{Figure 5}{ Why is there little evidence of a peak for cluster 3 in all of the left-hand panels? Is this a problem with the color scale? Or do the values have a wide spread?}
\reply{\sophie{now}}
@ -316,7 +316,7 @@ The noise resulting from the measurement uncertainty inhibits the accuracy of an
\commentA{Would a time series figure similar to Figure 3 also be useful to demonstrate how the clusters relate to these three types of solar wind?}
\reply{\sophie{now} In general we think this is a good idea. Since we already have a lot of figures, we decided against including this one. \sophie{maybe one like this in the appendix?}}
\reply{\sophie{now} In general we think this is a good idea. Since we already have a lot of figures, we decided against including this one in the main text and included the figure in the appendix instead. \sophie{TODO}}
\commentB{Line 485}{ What specific properties indicate that cluster 1 is polar coronal hole wind, and similarly for the other clusters (add some supporting references)? Are these inferred from Figure 5 or examination of Figure 3, or both? Although the authors don't consider this interpretation of the data a priority, from the point of view of a skeptical researcher reading the paper, indicating that this analysis does identify different types of solar wind similar to those found from previous studies is a strong selling point for this type of technique. So I would suggest expanding this discussion.}
@ -348,11 +348,11 @@ The noise resulting from the measurement uncertainty inhibits the accuracy of an
\commentB{Line 499}{ Some of the slow high density regions might be associated with the heliospheric plasma sheet and not necessarily due to compression. And compression of the HPS could lead to even higher densities.}
\reply{\sophie{now}}
\reply{We fully agree and added this to the text. \sophie{TODO}}
\commentB{Line 509}{ Showing the solar cycle variation would also be a nice demonstration of the power of the technique to the skeptical researcher. I would encourage the authors to consider expanding this comment.}
\reply{\sophie{now, and we should discuss this again}}
\reply{Thank you for the comment! We are currently investigating the properties that we found, so we won't be able to include a discussion of this in this work. \sophie{now, and we should discuss this again}}
\commentB{Line 513}{ "Both for ..." Does the following comment only apply to the seven cluster case?}
@ -368,7 +368,7 @@ The noise resulting from the measurement uncertainty inhibits the accuracy of an
\commentB{Line 545}{ Explain further why a comparison of the similarity to the reference results using all parameters is a valid way to assess the results using fewer parameters.}
\reply{\sophie{now}}
\reply{If the similarity to the reference clustering is high, we can assume that all the important properties are covered. If the similarity to the reference clustering is low, we can assume that the clustering with some parameters missing cannot capture the physics of this model. With this method we can determine which parameters and combinations of parameters are redundant for the clustering and which are not. As we discussed in the previous section, the classification containing all parameters is, in our opinion, a reasonable one. \sophie{I still need to think a bit about the wording here ... and add it to the text}}
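The comparison described in this reply can be illustrated with a small sketch. The reply does not state which similarity scores are used, so the Rand index below is only an assumed stand-in, and `rand_index`, `reference`, `relabelled`, and `reduced` are hypothetical names with made-up labels.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two points
    in the same cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical labels: a reference clustering using all parameters ...
reference = [0, 0, 1, 1, 2, 2]
# ... the same partition with renamed clusters ...
relabelled = [2, 2, 0, 0, 1, 1]
# ... and a clustering on fewer parameters that merges two clusters.
reduced = [0, 0, 1, 1, 1, 1]

print(rand_index(reference, relabelled))
print(rand_index(reference, reduced))
```

Renaming the clusters leaves the score unchanged, while merging two reference clusters lowers it, which is the behaviour the reply relies on when it treats a low similarity as a sign of missing information.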
\commentB{Line 551}{ Please avoid statements like "it is obvious" when presenting results which are likely to be unfamiliar to the reader and without any explanation in the text of what the figure shows. Explain the results sufficiently so that the reader can identify this "obvious" result. I'm guessing that the reader is supposed to be comparing the distributions of all the various scores for different combinations of parameters, but some guidance would be useful.}
@ -432,7 +432,7 @@ The noise resulting from the measurement uncertainty inhibits the accuracy of an
% end of feature selection 799
\commentB{Line 820}{ "systematic differences..." But this is also the case for other parameters, most obviously $v_{sw}$. Does the dynamic range of a parameter play a role in the "importance" of a parameter? For example, the speed may range over a factor of around two between slow and fast solar wind (say ~300-600 km/s), but the density may vary by a larger factor.}
\reply{\sophie{now}}
\reply{We account for the different dynamic ranges through the scaling. After the scaling, the dynamic range of all parameters is similar.}
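A minimal sketch of the effect this reply describes. The reply does not say which scaling is applied, so the z-score standardization below is an assumption, and `standardize` and the speed and density values are made up for illustration.

```python
import numpy as np


def standardize(columns):
    """Scale each column to zero mean and unit standard deviation.

    Assumes no column is constant (a constant column would divide by zero).
    """
    x = np.asarray(columns, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)


# Hypothetical samples: column 0 is a speed in km/s (factor ~2 range),
# column 1 is a density in cm^-3 (factor ~10 range).
raw = np.array([
    [300.0, 20.0],
    [450.0, 8.0],
    [600.0, 2.0],
])

scaled = standardize(raw)
print(scaled.std(axis=0))  # both columns now have the same spread
```

Under this assumed scaling, a parameter that varies by a factor of ten contributes no more to a distance-based clustering than one that varies by a factor of two.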
\commentB{Line 821}{ Is it true that density is a tracer for slow/coronal hole wind in Xu and Borovsky as they use combinations of a number of parameters and Tp (which is correlated with Vsw)?}

@ -1890,6 +1890,62 @@ def big_plot_label(key):
    return label
def get_error_big_plot(savelabel = '' , labels = (), nclusters = 7):
    data = loadData()
    c = Clusters(data, nclusters = nclusters)
    count = len(all_parameterlist)
    last_length = 1
    old_indeces = []
    all_parameter_comb_old = [it for sub_list in c.all_parameter_comb for it in sub_list]
    for ele,i in zip(all_parameterlist,range(count)):
        order_ele = c.get_order_parameter(ele)
        labellist.append(c.createlabel(order_ele,label = clabel))
        old_indeces.append(all_parameter_comb_old.index(order_ele))
        curr_label = ''
        if last_length < len(ele):
            borders.append(i)
        ticklistlabel.append(curr_label)
        last_length = len(ele)
    if mc_error:
        old_indeces = np.array(old_indeces[:-1])
        median_mc = median_mc[old_indeces]
        percentilelist_mc = percentilelist_mc[old_indeces]
    scorelist = []
    medianlist = []
    percentilelist = []
    if ten_fold:
        medianlist_ten = []
        percentilelist_ten = []
    if off_error is not None:
        o_clabel, o_loadlabel = off_error
        o_err_scores = np.ones((len(labellist), 4, 2))
        for l, label in enumerate(labellist):
            try:
                o_comb_label = label.replace(clabel , '')
                print("results/scoreresults_%s%s_all.pickle"%(o_clabel, o_comb_label+ o_loadlabel))
                o_sclist=pickle.load(bz2.BZ2File("results/scoreresults_%s%s_all.pickle"%(o_clabel, o_comb_label+o_loadlabel), "rb"))
                o_err_scores[l, 0, 0] = np.min(o_sclist[:,0])
                o_err_scores[l, 0, 1] = np.max(o_sclist[:,0])
                o_err_scores[l, 1, 0] = np.min(o_sclist[:,1])
                o_err_scores[l, 1, 1] = np.max(o_sclist[:,1])
                o_err_scores[l, 2, 0] = np.min(o_sclist[:,2])
                o_err_scores[l, 2, 1] = np.max(o_sclist[:,2])
                o_err_scores[l, 3, 0] = np.min(o_sclist[:,3])
                o_err_scores[l, 3, 1] = np.max(o_sclist[:,3])
            except:
                print ("No results file found for:", label )
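The eight assignments to `o_err_scores` above store the minimum and maximum of each of the four score columns. A minimal sketch of the same computation with NumPy axis reductions, assuming the loaded score list is a 2-D array with one row per trial and at least four score columns; `score_ranges` and the demo values are hypothetical and not part of the commit.

```python
import numpy as np


def score_ranges(sclist):
    """Return a (4, 2) array holding [min, max] of the first four score columns."""
    scores = np.asarray(sclist, dtype=float)[:, :4]
    # Reduce over the trial axis, then pair the minima and maxima per score.
    return np.stack([scores.min(axis=0), scores.max(axis=0)], axis=1)


# Made-up scores for two trials and four scores.
demo = np.array([
    [0.2, 0.5, 0.9, 0.1],
    [0.4, 0.3, 0.7, 0.6],
])

ranges = score_ranges(demo)
print(ranges)
```

With such a helper, the body of the `try` block could reduce to a single assignment of the form `o_err_scores[l] = score_ranges(o_sclist)`, provided the array shape assumed here matches the pickled results.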
def big_plot(data,savelabel = "", nclusters=7,clabel ='', xu_scores = False,mc_error = None, para_order = ("dsw","vsw","tsw","colage","B","dO7_6","mcsFe"), loadlabel_scores = '' , ten_fold = False, ntrials = 100, select_scores = 'all',off_error = None ):
'''
Creates the big plot for the paper. The bigExperiment and, if wanted, the Monte Carlo errors should be run beforehand.
@ -2015,7 +2071,7 @@ def big_plot(data,savelabel = "", nclusters=7,clabel ='', xu_scores = False,mc_e
medianlist_ten.append([s1_ten,s2_ten,s3_ten,s4_ten])
percentilelist_ten.append([p1_ten,p2_ten,p3_ten,p4_ten])
medianlist[-1] =[1,1,1,1]
percentilelist[-1]= [[0,0],[0,0],[0,0],[0,0]]
@ -3091,6 +3147,8 @@ if __name__=="__main__":
c = Clusters(data, nclusters = 3, plot = True, ntrials = ntrials)
c.histogram_two_clusters(0,size = 36,x_key ='vsw',y_key ='dO7_6',p1 =('dsw','vsw','tsw','B') ,tf =[2001,2012],p2=('dsw','vsw','tsw','B','dO7_6','mcsFe'), label1 = '_3'+add_label,label2 ='_3'+add_label, savelabel ='', plotlabel = False, scores = True)
c.timeseries(0, label="_3"+add_label, savelabel = '_paper',size=16, keys=["dsw","tsw","B","dO7_6","vsw","colage","mcsFe"], parameters =("dsw","tsw","B","dO7_6","vsw","colage","mcsFe"), tf=[2002.6,2002.9], withICME = True, dayofyear = True, highlightICME = True)
elif int(option) ==43:
"""
here the evaluation for the plot with 7 clusters is done