annotate flys-artifacts/src/main/java/de/intevation/flys/artifacts/math/Outlier.java @ 3564:e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
flys-artifacts/trunk@5162 c6561f87-3c4e-4783-a992-168aeb5c3f6f
author | Sascha L. Teichmann <sascha.teichmann@intevation.de> |
---|---|
date | Tue, 31 Jul 2012 16:14:17 +0000 |
parents | ab81ffd1343e |
children | b136113dad53 |
rev | line source |
---|---|
2645
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
1 package de.intevation.flys.artifacts.math; |
|
2 |
2646
c11da3540b70
Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2645
diff
changeset
|
3 import org.apache.commons.math.MathException; |
|
4 |
2645
|
5 import org.apache.commons.math.stat.descriptive.moment.Mean; |
|
6 import org.apache.commons.math.stat.descriptive.moment.StandardDeviation; |
|
7 |
|
8 import org.apache.commons.math.distribution.TDistributionImpl; |
|
9 |
3011
ab81ffd1343e
FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2646
diff
changeset
|
10 import java.util.Collections; |
2645
|
11 import java.util.List; |
|
12 import java.util.ArrayList; |
|
13 |
2646
|
14 import org.apache.log4j.Logger; |
|
15 |
2645
|
16 public class Outlier |
|
17 { |
3564
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
18 public static final double EPSILON = 1e-5; |
|
19 |
3011
|
20 public static final double DEFAULT_ALPHA = 0.05; |
|
21 |
2646
|
22 private static Logger log = Logger.getLogger(Outlier.class); |
|
23 |
3011
|
24 public static class IndexedValue |
|
25 implements Comparable<IndexedValue> |
|
26 { |
2645
|
27 protected int index; |
|
28 protected double value; |
|
29 |
|
30 public IndexedValue() { |
|
31 } |
|
32 |
|
33 public IndexedValue(int index, double value) { |
|
34 this.index = index; |
|
35 this.value = value; |
|
36 } |
|
37 |
|
38 public int getIndex() { |
|
39 return index; |
|
40 } |
|
41 |
|
42 public void setIndex(int index) { |
|
43 this.index = index; |
|
44 } |
|
45 |
|
46 public double getValue() { |
|
47 return value; |
|
48 } |
|
49 |
|
50 public void setValue(double value) { |
|
51 this.value = value; |
|
52 } |
3011
|
53 |
|
54 @Override |
|
55 public int compareTo(IndexedValue other) { |
|
56 int diff = index - other.index; |
3564
|
57 if (diff < 0) return -1; |
|
58 return diff > 0 ? +1 : 0; |
3011
|
59 } |
2645
|
60 } // class IndexedValue |
|
61 |
3011
|
62 public static class Outliers { |
|
63 |
|
64 protected List<IndexedValue> retained; |
|
65 protected List<IndexedValue> removed; |
|
66 |
|
67 public Outliers() { |
|
68 } |
|
69 |
|
70 public Outliers( |
|
71 List<IndexedValue> retained, |
|
72 List<IndexedValue> removed |
|
73 ) { |
|
74 this.retained = retained; |
|
75 this.removed = removed; |
|
76 } |
|
77 |
|
78 public boolean hasOutliers() { |
|
79 return !removed.isEmpty(); |
|
80 } |
|
81 |
|
82 public List<IndexedValue> getRetained() { |
|
83 return retained; |
|
84 } |
|
85 |
|
86 public void setRetained(List<IndexedValue> retained) { |
|
87 this.retained = retained; |
|
88 } |
|
89 |
|
90 public List<IndexedValue> getRemoved() { |
|
91 return removed; |
|
92 } |
|
93 |
|
94 public void setRemoved(List<IndexedValue> removed) { |
|
95 this.removed = removed; |
|
96 } |
|
97 } // class Outliers |
|
98 |
2645
|
99 public Outlier() { |
|
100 } |
|
101 |
3011
|
102 public static Outliers findOutliers(List<IndexedValue> inputValues) { |
|
103 return findOutliers(inputValues, DEFAULT_ALPHA); |
|
104 } |
|
105 |
|
106 public static Outliers findOutliers( |
2645
|
107 List<IndexedValue> inputValues, |
|
108 double alpha |
|
109 ) { |
3564
|
110 boolean debug = log.isDebugEnabled(); |
|
111 |
|
112 if (debug) { |
|
113 log.debug("outliers significance: " + alpha); |
|
114 } |
|
115 |
|
116 alpha = 1d - alpha; |
|
117 |
2645
|
118 ArrayList<IndexedValue> outliers = new ArrayList<IndexedValue>(); |
|
119 |
|
120 ArrayList<IndexedValue> values = |
|
121 new ArrayList<IndexedValue>(inputValues); |
|
122 |
|
123 for (;;) { |
|
124 int N = values.size(); |
|
125 |
3564
|
126 if (debug) { |
|
127 log.debug("Values to check: " + N); |
|
128 } |
|
129 |
2645
|
130 if (N < 4) { |
|
131 break; |
|
132 } |
|
133 |
|
134 Mean mean = new Mean(); |
|
135 StandardDeviation std = new StandardDeviation(); |
|
136 |
|
137 for (IndexedValue value: values) { |
|
138 mean.increment(value.getValue()); |
3011
|
139 std .increment(value.getValue()); |
2645
|
140 } |
|
141 |
|
142 double m = mean.getResult(); |
|
143 double s = std.getResult(); |
|
144 |
3564
|
145 if (debug) { |
|
146 log.debug("mean: " + m); |
|
147 log.debug("std dev: " + s); |
|
148 } |
|
149 |
2645
|
150 double maxZ = -Double.MAX_VALUE; |
|
151 int iv = -1; |
2646
|
152 for (int i = N-1; i >= 0; --i) { |
2645
|
153 IndexedValue v = values.get(i); |
3564
|
154 double z = Math.abs(v.getValue()-m); |
|
155 if (debug) { |
|
156 log.debug("z candidate: " + z); |
|
157 } |
2645
|
158 if (z > maxZ) { |
|
159 maxZ = z; |
|
160 iv = i; |
|
161 } |
|
162 } |
|
163 |
3564
|
164 if (Math.abs(s) < EPSILON) { |
|
165 break; |
|
166 } |
|
167 |
|
168 maxZ /= s; |
2645
|
169 |
|
170 TDistributionImpl tdist = new TDistributionImpl(N-2); |
|
171 |
3564
|
172 double t; |
2646
|
173 |
3564
|
174 try { |
|
175 t = tdist.inverseCumulativeProbability(1d - (1d - alpha)/(N+N)); // alpha holds 1 - significance here; take the upper quantile at 1 - significance/(2N) |
2646
|
176 } |
|
177 catch (MathException me) { |
|
178 log.error(me); |
3564
|
179 break; |
|
180 } |
|
181 |
|
182 t *= t; |
|
183 |
|
184 double za = ((N-1)/Math.sqrt(N))*Math.sqrt(t/(N-2d+t)); |
|
185 |
|
186 if (debug) { |
|
187 log.debug("max: " + maxZ + " crit: " + za); |
|
188 } |
|
189 |
|
190 if (maxZ > za) { |
|
191 outliers.add(values.get(iv)); |
|
192 values.remove(iv); |
|
193 } |
|
194 else { |
|
195 if (debug) { |
|
196 log.debug("values left: " + N); |
|
197 } |
|
198 break; |
2646
|
199 } |
2645
|
200 } |
|
201 |
3011
|
202 Collections.sort(outliers); |
2645
|
203 |
3011
|
204 return new Outliers(values, outliers); |
2645
|
205 } |
|
206 } |
|
207 // vim:set ts=4 sw=4 si et sta sts=4 fenc=utf8 : |
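The loop in `findOutliers` compares the largest normalized deviation, G = max |x_i - mean| / s, against a critical value derived from a Student t quantile with N-2 degrees of freedom. The following is a minimal, dependency-free sketch of a single such step, not the code from this revision. The class `GrubbsSketch` and all of its method names are hypothetical. It assumes the sample standard deviation uses the n-1 denominator, and it takes the t quantile as a parameter instead of looking it up, so it runs without Commons Math.

```java
/**
 * Hypothetical sketch of one Grubbs' test step.
 * The t quantile is supplied by the caller (assumption: it is the upper
 * quantile at significance/(2n) with n-2 degrees of freedom).
 */
final class GrubbsSketch {

    private GrubbsSketch() {
    }

    /** Arithmetic mean of the samples. */
    static double mean(double[] xs) {
        double sum = 0d;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Sample standard deviation with n-1 denominator (assumed convention). */
    static double stdDev(double[] xs) {
        double m = mean(xs);
        double squares = 0d;
        for (double x : xs) {
            double d = x - m;
            squares += d * d;
        }
        return Math.sqrt(squares / (xs.length - 1));
    }

    /** Test statistic G = max |x_i - mean| / s. */
    static double statistic(double[] xs) {
        double m = mean(xs);
        double max = 0d;
        for (double x : xs) {
            max = Math.max(max, Math.abs(x - m));
        }
        return max / stdDev(xs);
    }

    /** Critical value ((n-1)/sqrt(n)) * sqrt(t^2 / (n-2+t^2)). */
    static double criticalValue(int n, double t) {
        double t2 = t * t;
        return ((n - 1) / Math.sqrt(n)) * Math.sqrt(t2 / (n - 2d + t2));
    }

    /** True if the most extreme sample exceeds the critical value. */
    static boolean hasOutlier(double[] xs, double t) {
        return statistic(xs) > criticalValue(xs.length, t);
    }
}
```

As a usage example, for the five samples 1, 2, 3, 4, 100 and a significance of 0.05, a standard t table gives roughly 5.841 for the upper 0.005 point with 3 degrees of freedom; `GrubbsSketch.hasOutlier(new double[] {1, 2, 3, 4, 100}, 5.841)` then reports an outlier. Passing the quantile in keeps the sketch testable; a real caller would obtain it from a distribution class such as the `TDistributionImpl` used above.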