Mercurial > dive4elements > river
annotate flys-artifacts/src/main/java/de/intevation/flys/artifacts/math/Outlier.java @ 4221:480de0dbca8e
Extended location input helper.
The locationpicker has now an attribute whether the input is distance or
location to display one or two clickable columns.
Replaced the record click handler with cell click handler.
author | Raimund Renkert <rrenkert@intevation.de> |
---|---|
date | Tue, 23 Oct 2012 13:17:20 +0200 |
parents | b136113dad53 |
children |
rev | line source |
---|---|
2645
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
1 package de.intevation.flys.artifacts.math; |
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
2 |
3565
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
3 import java.util.List; |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
4 |
2646
c11da3540b70
Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2645
diff
changeset
|
5 import org.apache.commons.math.MathException; |
c11da3540b70
Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2645
diff
changeset
|
6 |
3565
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
7 import org.apache.commons.math.distribution.TDistributionImpl; |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
8 |
2645
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
9 import org.apache.commons.math.stat.descriptive.moment.Mean; |
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
10 import org.apache.commons.math.stat.descriptive.moment.StandardDeviation; |
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
11 |
2646
c11da3540b70
Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2645
diff
changeset
|
12 import org.apache.log4j.Logger; |
c11da3540b70
Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2645
diff
changeset
|
13 |
2645
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
14 public class Outlier |
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
15 { |
3564
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
16 public static final double EPSILON = 1e-5; |
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
17 |
3011
ab81ffd1343e
FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2646
diff
changeset
|
18 public static final double DEFAULT_ALPHA = 0.05; |
ab81ffd1343e
FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2646
diff
changeset
|
19 |
2646
c11da3540b70
Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2645
diff
changeset
|
20 private static Logger log = Logger.getLogger(Outlier.class); |
c11da3540b70
Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2645
diff
changeset
|
21 |
3565
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
22 protected Outlier() { |
2645
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
23 } |
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
24 |
3565
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
25 public static Integer findOutlier(List<Double> values) { |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
26 return findOutlier(values, DEFAULT_ALPHA); |
3011
ab81ffd1343e
FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2646
diff
changeset
|
27 } |
ab81ffd1343e
FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2646
diff
changeset
|
28 |
3565
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
29 public static Integer findOutlier(List<Double> values, double alpha) { |
3564
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
30 boolean debug = log.isDebugEnabled(); |
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
31 |
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
32 if (debug) { |
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
33 log.debug("outliers significance: " + alpha); |
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
34 } |
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
35 |
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
36 alpha = 1d - alpha; |
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
37 |
3565
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
38 int N = values.size(); |
3564
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
39 |
3565
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
40 if (debug) { |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
41 log.debug("Values to check: " + N); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
42 } |
2646
c11da3540b70
Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2645
diff
changeset
|
43 |
3565
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
44 if (N < 3) { |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
45 return null; |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
46 } |
3564
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
47 |
3565
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
48 Mean mean = new Mean(); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
49 StandardDeviation std = new StandardDeviation(); |
3564
e01b9d1bc941
FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3011
diff
changeset
|
50 |
3565
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
51 for (Double value: values) { |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
52 double v = value.doubleValue(); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
53 mean.increment(v); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
54 std .increment(v); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
55 } |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
56 |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
57 double m = mean.getResult(); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
58 double s = std.getResult(); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
59 |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
60 if (debug) { |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
61 log.debug("mean: " + m); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
62 log.debug("std dev: " + s); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
63 } |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
64 |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
65 double maxZ = -Double.MAX_VALUE; |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
66 int iv = -1; |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
67 for (int i = N-1; i >= 0; --i) { |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
68 double v = values.get(i).doubleValue(); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
69 double z = Math.abs(v - m); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
70 if (z > maxZ) { |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
71 maxZ = z; |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
72 iv = i; |
2646
c11da3540b70
Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
2645
diff
changeset
|
73 } |
2645
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
74 } |
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
75 |
3565
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
76 if (Math.abs(s) < EPSILON) { |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
77 return null; |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
78 } |
2645
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
79 |
3565
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
80 maxZ /= s; |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
81 |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
82 TDistributionImpl tdist = new TDistributionImpl(N-2); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
83 |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
84 double t; |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
85 |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
86 try { |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
87 t = tdist.inverseCumulativeProbability(alpha/(N+N)); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
88 } |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
89 catch (MathException me) { |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
90 log.error(me); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
91 return null; |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
92 } |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
93 |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
94 t *= t; |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
95 |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
96 double za = ((N-1)/Math.sqrt(N))*Math.sqrt(t/(N-2d+t)); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
97 |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
98 if (debug) { |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
99 log.debug("max: " + maxZ + " crit: " + za); |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
100 } |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
101 |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
102 return maxZ > za |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
103 ? Integer.valueOf(iv) |
b136113dad53
FixA: Only evict only one(!) data point as outlier before recalculating the function.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
3564
diff
changeset
|
104 : null; |
2645
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
105 } |
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
106 } |
4f7d1ea38404
Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff
changeset
|
107 // vim:set ts=4 sw=4 si et sta sts=4 fenc=utf8 : |