Mercurial > dive4elements > river
annotate flys-artifacts/src/main/java/de/intevation/flys/artifacts/math/Outlier.java @ 4173:7d4480c0e68e
Allow users to select the current relevant discharge table in historical discharge table calculation.
In addition to this, the discharge tables in the helper panel displayed in the client are ordered in time.
author | Ingo Weinzierl <ingo.weinzierl@intevation.de> |
---|---|
date | Thu, 18 Oct 2012 12:13:48 +0200 |
parents | b136113dad53 |
children |
rev | line source |
---|---|
2645 | `package de.intevation.flys.artifacts.math;` |
2645 | |
3565 | `import java.util.List;` |
3565 | |
2646 | `import org.apache.commons.math.MathException;` |
2646 | |
3565 | `import org.apache.commons.math.distribution.TDistributionImpl;` |
3565 | |
2645 | `import org.apache.commons.math.stat.descriptive.moment.Mean;` |
2645 | `import org.apache.commons.math.stat.descriptive.moment.StandardDeviation;` |
2645 | |
2646 | `import org.apache.log4j.Logger;` |
2646 | |
2645 | `public class Outlier` |
2645 | `{` |
3564 | `    public static final double EPSILON = 1e-5;` |
3564 | |
3011 | `    public static final double DEFAULT_ALPHA = 0.05;` |
3011 | |
2646 | `    private static Logger log = Logger.getLogger(Outlier.class);` |
2646 | |
3565 | `    protected Outlier() {` |
2645 | `    }` |
2645 | |
3565 | `    public static Integer findOutlier(List<Double> values) {` |
3565 | `        return findOutlier(values, DEFAULT_ALPHA);` |
3011 | `    }` |
3011 | |
3565 | `    public static Integer findOutlier(List<Double> values, double alpha) {` |
3564 | `        boolean debug = log.isDebugEnabled();` |
3564 | |
3564 | `        if (debug) {` |
3564 | `            log.debug("outliers significance: " + alpha);` |
3564 | `        }` |
3564 | |
3564 | `        alpha = 1d - alpha;` |
3564 | |
3565 | `        int N = values.size();` |
3564 | |
3565 | `        if (debug) {` |
3565 | `            log.debug("Values to check: " + N);` |
3565 | `        }` |
2646 | |
3565 | `        if (N < 3) {` |
3565 | `            return null;` |
3565 | `        }` |
3564 | |
3565 | `        Mean mean = new Mean();` |
3565 | `        StandardDeviation std = new StandardDeviation();` |
3564 | |
3565 | `        for (Double value: values) {` |
3565 | `            double v = value.doubleValue();` |
3565 | `            mean.increment(v);` |
3565 | `            std .increment(v);` |
3565 | `        }` |
3565 | |
3565 | `        double m = mean.getResult();` |
3565 | `        double s = std.getResult();` |
3565 | |
3565 | `        if (debug) {` |
3565 | `            log.debug("mean: " + m);` |
3565 | `            log.debug("std dev: " + s);` |
3565 | `        }` |
3565 | |
3565 | `        double maxZ = -Double.MAX_VALUE;` |
3565 | `        int iv = -1;` |
3565 | `        for (int i = N-1; i >= 0; --i) {` |
3565 | `            double v = values.get(i).doubleValue();` |
3565 | `            double z = Math.abs(v - m);` |
3565 | `            if (z > maxZ) {` |
3565 | `                maxZ = z;` |
3565 | `                iv = i;` |
2646 | `            }` |
2645 | `        }` |
2645 | |
3565 | `        if (Math.abs(s) < EPSILON) {` |
3565 | `            return null;` |
3565 | `        }` |
2645 | |
3565 | `        maxZ /= s;` |
3565 | |
3565 | `        TDistributionImpl tdist = new TDistributionImpl(N-2);` |
3565 | |
3565 | `        double t;` |
3565 | |
3565 | `        try {` |
3565 | `            t = tdist.inverseCumulativeProbability(alpha/(N+N));` |
3565 | `        }` |
3565 | `        catch (MathException me) {` |
3565 | `            log.error(me);` |
3565 | `            return null;` |
3565 | `        }` |
3565 | |
3565 | `        t *= t;` |
3565 | |
3565 | `        double za = ((N-1)/Math.sqrt(N))*Math.sqrt(t/(N-2d+t));` |
3565 | |
3565 | `        if (debug) {` |
3565 | `            log.debug("max: " + maxZ + " crit: " + za);` |
3565 | `        }` |
3565 | |
3565 | `        return maxZ > za` |
3565 | `            ? Integer.valueOf(iv)` |
3565 | `            : null;` |
2645 | `    }` |
2645 | `}` |
2645 | `// vim:set ts=4 sw=4 si et sta sts=4 fenc=utf8 :` |

rev | changeset | parents | author | description |
---|---|---|---|---|
2645 | 4f7d1ea38404 | | Sascha L. Teichmann <sascha.teichmann@intevation.de> | Added simple Grubb's outlier test. |
2646 | c11da3540b70 | 2645 | Sascha L. Teichmann <sascha.teichmann@intevation.de> | Checked in out dated version of outlier test. |
3011 | ab81ffd1343e | 2646 | Sascha L. Teichmann <sascha.teichmann@intevation.de> | FixA: Reactivated rewrite of the outlier checks. |
3564 | e01b9d1bc941 | 3011 | Sascha L. Teichmann <sascha.teichmann@intevation.de> | FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken. |
3565 | b136113dad53 | 3564 | Sascha L. Teichmann <sascha.teichmann@intevation.de> | FixA: Only evict only one(!) data point as outlier before recalculating the function. |
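
For readers following the arithmetic in `findOutlier(List<Double>, double)`, the statistic and the critical value it computes can be written out as below. This is a reading of the listing, not a statement from the authors. The symbols $G$, $G_{\mathrm{crit}}$, $p$ and $F^{-1}_{N-2}$ are notation introduced here; $s$ is the result of `StandardDeviation`, which is the bias-corrected sample standard deviation by default.

```latex
G = \frac{\max_i \lvert x_i - \bar{x} \rvert}{s},
\qquad
G_{\mathrm{crit}} = \frac{N-1}{\sqrt{N}}
  \sqrt{\frac{t^{2}}{N-2+t^{2}}},
\qquad
t = F^{-1}_{N-2}(p),
\quad
p = \frac{1-\alpha}{2N}
```

Here $F^{-1}_{N-2}$ is the inverse cumulative distribution function of Student's t distribution with $N-2$ degrees of freedom, and the index of the most extreme value is returned when $G > G_{\mathrm{crit}}$. Because the code reassigns `alpha = 1d - alpha` before dividing by `N+N`, the probability passed to the quantile function is $(1-\alpha)/(2N)$. The commonly cited two-sided form of Grubbs' test takes the critical value of $t$ at significance $\alpha/(2N)$ instead, so this may be part of what the rev 3564 commit message calls "still a bit broken".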
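
The rev 3565 commit message describes evicting only one data point before recalculating. A minimal sketch of that calling pattern follows, assuming the caller simply loops until the detector returns `null`. The class `EvictionSketch`, both method names and the fixed `threshold` are hypothetical and not taken from the repository. The stand-in detector mirrors the `Integer`-or-`null` contract of `findOutlier` but compares against a fixed threshold, not Grubbs' critical value, so that the example runs without Commons Math.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration; none of these names exist in the repository.
final class EvictionSketch {

    private EvictionSketch() {
    }

    /**
     * Stand-in detector (an assumption, NOT Grubbs' test): returns the index
     * of the value farthest from the mean if its distance, in units of the
     * sample standard deviation, exceeds a fixed threshold; otherwise null.
     */
    static Integer findMostExtreme(List<Double> values, double threshold) {
        int n = values.size();
        if (n < 3) {
            return null;
        }
        double sum = 0d;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;

        double squares = 0d;
        for (double v : values) {
            squares += (v - mean) * (v - mean);
        }
        double s = Math.sqrt(squares / (n - 1)); // bias-corrected
        if (s < 1e-5) {
            return null; // values are effectively identical
        }

        double maxZ = -Double.MAX_VALUE;
        int iv = -1;
        for (int i = n - 1; i >= 0; --i) {
            double z = Math.abs(values.get(i) - mean);
            if (z > maxZ) {
                maxZ = z;
                iv = i;
            }
        }
        return maxZ / s > threshold ? Integer.valueOf(iv) : null;
    }

    /**
     * Removes one flagged point per pass and re-runs the detector on the
     * reduced list, so mean and deviation are recomputed every time.
     * The input list is copied and left unmodified.
     */
    static List<Double> evictOneAtATime(
        List<Double> input,
        Function<List<Double>, Integer> detector
    ) {
        List<Double> values = new ArrayList<Double>(input);
        Integer idx;
        while ((idx = detector.apply(values)) != null) {
            // intValue() selects remove(int index), not remove(Object).
            values.remove(idx.intValue());
        }
        return values;
    }
}
```

With the real class, the detector argument could presumably be a lambda that forwards to `Outlier.findOutlier`. Passing the detector as a `Function` keeps the loop independent of how the critical value is obtained.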