added ts for 3 clusters

worked a bit on coverletter
2023-11-07 09:33:13 +01:00 · 2023-11-07 09:30:47 +01:00
3 changed files with 66 additions and 8 deletions
--- a/paper/coverletter.synctex.gz
+++ b/paper/coverletter.synctex.gz
--- a/paper/coverletter.tex
+++ b/paper/coverletter.tex
@ -118,7 +118,7 @@ The noise resulting from the measurement uncertainty inhibits the accuracy of an
 \reply{\sophie{now (find sources)}}

 \commentB{Line 48}{What bias?}
-\reply{\sophie{now}}
+\reply{\sophie{now} }

 \commentB{Lines 69-71}{ Maybe this sentence referring to charge states could be moved to/merged with line 83?}
 \reply{\sophie{now}}
@ -268,7 +268,7 @@ The noise resulting from the measurement uncertainty inhibits the accuracy of an
 \reply{This is a good point, and it is one of the points we want to make. We want to encourage people to also measure the charge state composition, so that it becomes more widely available and the solar wind can be understood better. \sophie{the wording here is not optimal yet}}

 \commentB{Line 412}{The other properties are also "source-dependent", but the relevant point is that the charge states are preserved during transport to 1 AU.}
-\reply{\sophie{now}}
+\reply{Our phrasing was indeed a bit misleading. We corrected it accordingly. \sophie{TODO}}

 \commentB{Figure 5}{ Why is there little evidence of a peak for cluster 3 in all of the left-hand panels? Is this a problem with the color scale? Or do the values have a wide spread?}
 \reply{\sophie{now}}
@ -316,7 +316,7 @@ The noise resulting from the measurement uncertainty inhibits the accuracy of an

 \commentA{Would a time series figure similar to Figure 3 also be useful to demonstrate how the clusters relate to these three types of solar wind?}

-\reply{\sophie{now} In general we think this is a good idea. Since we already have a lot of figures we decided against including this one. \sophie{maybe put one like this in the appendix?}}
+\reply{\sophie{now} In general we think this is a good idea. Since the main text already contains many figures, we included this figure in the appendix instead. \sophie{TODO}}

 \commentB{Line 485}{ What specific properties indicate that cluster 1 is polar coronal hole wind, and similarly for the other clusters (add some supporting references)? Are these inferred from Figure 5 or examination of Figure 3, or both? Although the authors don't consider this interpretation of the data a priority, from the point of view of a skeptical researcher reading the paper, indicating that this analysis does identify different types of solar wind similar to those found from previous studies is a strong selling point for this type of technique. So I would suggest expanding this discussion.}

@ -348,11 +348,11 @@ The noise resulting from the measurement uncertainty inhibits the accuracy of an

 \commentB{Line 499}{ Some of the slow high density regions might be associated with the heliospheric plasma sheet and not necessarily due to compression. And compression of the HPS could lead to even higher densities.}

-\reply{\sophie{now}}
+\reply{We fully agree and added this to the text. \sophie{TODO}}

 \commentB{Line 509}{ Showing the solar cycle variation would also be a nice demonstration of the power of the technique to the skeptical researcher. I would encourage the authors to consider expanding this comment.}

-\reply{\sophie{now, and we should discuss this again}}
+\reply{Thank you for the comment! We are still investigating the properties that we found, so we won't be able to include a discussion of this in this work. \sophie{now, and we should discuss this again}}

 \commentB{Line 513}{ "Both for ..." Does the following comment only apply to the seven cluster case?}

@ -368,7 +368,7 @@ The noise resulting from the measurement uncertainty inhibits the accuracy of an

 \commentB{Line 545}{ Explain further why a comparison of the similarity to the reference results using all parameters is a valid way to assess the results using fewer parameters.}

-\reply{\sophie{now}}
+\reply{If the similarity to the reference clustering is high, we can assume that the reduced parameter set covers all the important properties. If the similarity is low, we can assume that the clustering with the missing parameters lacks information needed to capture the physics covered by the reference clustering. In this way we can determine which parameters and parameter combinations are redundant for the clustering and which are not. As discussed in the preceding section, we consider the classification containing all parameters to be a reasonable reference. \sophie{I still need to think a bit about the wording here ... and add it to the text}}

 \commentB{Line 551}{ Please avoid statements like "it is obvious" when presenting results which are likely to be unfamiliar to the reader and without any explanation in the text of what the figure shows. Explain the results sufficiently so that the reader can identify this "obvious" result. I'm guessing that the reader is supposed to be comparing the distributions of all the various scores for different combinations of parameters, but some guidance would be useful.}

@ -432,7 +432,7 @@ The noise resulting from the measurement uncertainty inhibits the accuracy of an
 % end of feature selection 799 
 \commentB{Line 820}{ "systematic differences..." But this is also the case for other parameters, most obviously $v_{sw}$. Does the dynamic range of a parameter play a role in the "importance" of a parameter? For example, the speed may range over a factor of around two between slow and fast solar wind (say ~300-600 km/s), but the density may vary by a larger factor.}

-\reply{\sophie{now}}
+\reply{We account for the different dynamic ranges through the scaling. After the scaling, the dynamic ranges of all parameters are similar.}

 \commentB{Line 821}{ Is it true that density is a tracer for slow/coronal hole wind in Xu and Borovsky as they use combinations of a number of parameters and Tp (which is correlated with Vsw)?}

--- a/reduced_kmeans.py
+++ b/reduced_kmeans.py
@ -1890,6 +1890,62 @@ def big_plot_label(key):

    return label

+def get_error_big_plot(savelabel = '' , labels = (), nclusters = 7):
+    data = loadData()
+    c = Clusters(data, nclusters = nclusters)
+
+    count = len(all_parameterlist)
+    last_length = 1
+    old_indeces = []
+    all_parameter_comb_old = [it for sub_list in c.all_parameter_comb for it in sub_list]
+
+    for ele,i in zip(all_parameterlist,range(count)):
+        order_ele = c.get_order_parameter(ele)
+        labellist.append(c.createlabel(order_ele,label = clabel))
+        old_indeces.append(all_parameter_comb_old.index(order_ele))
+        curr_label = ''
+        if last_length < len(ele):
+            borders.append(i)
+
+        ticklistlabel.append(curr_label)
+        last_length = len(ele)
+
+        if mc_error:
+            old_indeces = np.array(old_indeces[:-1])
+            median_mc = median_mc[old_indeces]
+            percentilelist_mc = percentilelist_mc[old_indeces]
+        scorelist = []
+        medianlist = []
+        percentilelist = []
+        if ten_fold:
+            medianlist_ten = []
+            percentilelist_ten = []
+        if off_error is not None:
+            o_clabel, o_loadlabel = off_error
+            o_err_scores = np.ones((len(labellist), 4, 2))
+
+    for l, label in enumerate(labellist):
+        try:
+            o_comb_label = label.replace(clabel , '')
+            print("results/scoreresults_%s%s_all.pickle"%(o_clabel, o_comb_label+ o_loadlabel))
+            o_sclist=pickle.load(bz2.BZ2File("results/scoreresults_%s%s_all.pickle"%(o_clabel, o_comb_label+o_loadlabel), "rb"))
+
+            o_err_scores[l, 0, 0] = np.min(o_sclist[:,0])
+            o_err_scores[l, 0, 1] = np.max(o_sclist[:,0])
+
+            o_err_scores[l, 1, 0] = np.min(o_sclist[:,1])
+            o_err_scores[l, 1, 1] = np.max(o_sclist[:,1])
+
+            o_err_scores[l, 2, 0] = np.min(o_sclist[:,2])
+            o_err_scores[l, 2, 1] = np.max(o_sclist[:,2])
+
+            o_err_scores[l, 3, 0] = np.min(o_sclist[:,3])
+            o_err_scores[l, 3, 1] = np.max(o_sclist[:,3])
+
+        except OSError:
+            print("No results file found for:", label)
+
+
 def big_plot(data,savelabel = "", nclusters=7,clabel ='', xu_scores = False,mc_error = None, para_order = ("dsw","vsw","tsw","colage","B","dO7_6","mcsFe"), loadlabel_scores = '' , ten_fold = False, ntrials = 100, select_scores = 'all',off_error = None ):
    '''
    Creates the big plot for the paper. The bigExperiment and, if wanted, the Monte Carlo errors must be run beforehand.
@ -2015,7 +2071,7 @@ def big_plot(data,savelabel = "", nclusters=7,clabel ='', xu_scores = False,mc_e
            medianlist_ten.append([s1_ten,s2_ten,s3_ten,s4_ten])
            percentilelist_ten.append([p1_ten,p2_ten,p3_ten,p4_ten])

-    
+
    medianlist[-1] =[1,1,1,1]
    percentilelist[-1]= [[0,0],[0,0],[0,0],[0,0]]

@ -3091,6 +3147,8 @@ if __name__=="__main__":

        c = Clusters(data, nclusters = 3, plot = True, ntrials = ntrials)
        c.histogram_two_clusters(0,size = 36,x_key ='vsw',y_key ='dO7_6',p1 =('dsw','vsw','tsw','B') ,tf =[2001,2012],p2=('dsw','vsw','tsw','B','dO7_6','mcsFe'), label1 = '_3'+add_label,label2 ='_3'+add_label, savelabel ='', plotlabel = False, scores = True)
+        c.timeseries(0, label="_3"+add_label, savelabel = '_paper',size=16, keys=["dsw","tsw","B","dO7_6","vsw","colage","mcsFe"], parameters =("dsw","tsw","B","dO7_6","vsw","colage","mcsFe"),  tf=[2002.6,2002.9], withICME = True, dayofyear = True, highlightICME = True)
+
    elif int(option) ==43:
        """
        here the evaluation for the plot with 7 clusters is done
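Review note on `get_error_big_plot`: the eight repeated `np.min`/`np.max` assignments for the four score columns could be collapsed into one vectorized step. The sketch below is a suggestion, not code from this commit. The helper name `score_ranges` is hypothetical, and the example array is made-up data standing in for one loaded `o_sclist`.

```python
import numpy as np


def score_ranges(scores):
    """Return an array of shape (n_scores, 2) with [min, max] per score column."""
    scores = np.asarray(scores, dtype=float)
    return np.stack([scores.min(axis=0), scores.max(axis=0)], axis=-1)


# Made-up stand-in for one loaded o_sclist: 3 runs x 4 scores.
example = np.array([
    [0.90, 0.80, 0.70, 0.60],
    [0.50, 0.85, 0.75, 0.65],
    [0.70, 0.60, 0.95, 0.55],
])

ranges = score_ranges(example)
print(ranges)
```

Inside the loop over `labellist`, an assignment of the form `o_err_scores[l] = score_ranges(o_sclist[:, :4])` would then fill the same `(4, 2)` slice as the explicit lines, assuming `o_sclist` has at least four score columns as the indexing in the diff suggests.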
Author	SHA1	Message	Date
STeichmann	4cc087596e	added ts for 3 clusters	2023-11-07 09:33:13 +01:00
STeichmann	1ef3d7a0fa	worked a bit on coverletter	2023-11-07 09:30:47 +01:00