annotate flys-artifacts/src/main/java/de/intevation/flys/artifacts/math/Outlier.java @ 3564:e01b9d1bc941

FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken. flys-artifacts/trunk@5162 c6561f87-3c4e-4783-a992-168aeb5c3f6f
author Sascha L. Teichmann <sascha.teichmann@intevation.de>
date Tue, 31 Jul 2012 16:14:17 +0000
parents ab81ffd1343e
children b136113dad53
rev   line source
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
1 package de.intevation.flys.artifacts.math;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
2
2646
c11da3540b70 Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2645
diff changeset
3 import org.apache.commons.math.MathException;
c11da3540b70 Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2645
diff changeset
4
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
5 import org.apache.commons.math.stat.descriptive.moment.Mean;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
6 import org.apache.commons.math.stat.descriptive.moment.StandardDeviation;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
7
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
8 import org.apache.commons.math.distribution.TDistributionImpl;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
9
3011
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
10 import java.util.Collections;
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
11 import java.util.List;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
12 import java.util.ArrayList;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
13
2646
c11da3540b70 Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2645
diff changeset
14 import org.apache.log4j.Logger;
c11da3540b70 Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2645
diff changeset
15
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
16 public class Outlier
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
17 {
3564
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
18 public static final double EPSILON = 1e-5;
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
19
3011
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
20 public static final double DEFAULT_ALPHA = 0.05;
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
21
2646
c11da3540b70 Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2645
diff changeset
22 private static Logger log = Logger.getLogger(Outlier.class);
c11da3540b70 Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2645
diff changeset
23
3011
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
24 public static class IndexedValue
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
25 implements Comparable<IndexedValue>
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
26 {
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
27 protected int index;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
28 protected double value;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
29
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
30 public IndexedValue() {
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
31 }
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
32
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
33 public IndexedValue(int index, double value) {
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
34 this.index = index;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
35 this.value = value;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
36 }
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
37
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
38 public int getIndex() {
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
39 return index;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
40 }
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
41
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
42 public void setIndex(int index) {
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
43 this.index = index;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
44 }
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
45
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
46 public double getValue() {
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
47 return value;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
48 }
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
49
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
50 public void setValue(double value) {
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
51 this.value = value;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
52 }
3011
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
53
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
54 @Override
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
55 public int compareTo(IndexedValue other) {
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
56 int diff = index - other.index;
3564
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
57 if (diff < 0) return -1;
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
58 return diff > 0 ? +1 : 0;
3011
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
59 }
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
60 } // class IndexedValue
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
61
3011
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
62 public static class Outliers {
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
63
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
64 protected List<IndexedValue> retained;
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
65 protected List<IndexedValue> removed;
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
66
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
67 public Outliers() {
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
68 }
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
69
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
70 public Outliers(
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
71 List<IndexedValue> retained,
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
72 List<IndexedValue> removed
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
73 ) {
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
74 this.retained = retained;
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
75 this.removed = removed;
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
76 }
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
77
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
78 public boolean hasOutliers() {
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
79 return !removed.isEmpty();
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
80 }
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
81
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
82 public List<IndexedValue> getRetained() {
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
83 return retained;
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
84 }
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
85
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
86 public void setRetained(List<IndexedValue> retained) {
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
87 this.retained = retained;
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
88 }
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
89
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
90 public List<IndexedValue> getRemoved() {
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
91 return removed;
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
92 }
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
93
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
94 public void setRemoved(List<IndexedValue> removed) {
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
95 this.removed = removed;
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
96 }
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
97 } // class Outliers
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
98
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
99 public Outlier() {
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
100 }
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
101
3011
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
102 public static Outliers findOutliers(List<IndexedValue> inputValues) {
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
103 return findOutliers(inputValues, DEFAULT_ALPHA);
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
104 }
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
105
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
106 public static Outliers findOutliers(
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
107 List<IndexedValue> inputValues,
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
108 double alpha
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
109 ) {
3564
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
110 boolean debug = log.isDebugEnabled();
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
111
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
112 if (debug) {
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
113 log.debug("outliers significance: " + alpha);
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
114 }
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
115
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
116 alpha = 1d - alpha;
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
117
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
118 ArrayList<IndexedValue> outliers = new ArrayList<IndexedValue>();
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
119
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
120 ArrayList<IndexedValue> values =
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
121 new ArrayList<IndexedValue>(inputValues);
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
122
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
123 for (;;) {
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
124 int N = values.size();
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
125
3564
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
126 if (debug) {
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
127 log.debug("Values to check: " + N);
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
128 }
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
129
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
130 if (N < 4) {
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
131 break;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
132 }
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
133
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
134 Mean mean = new Mean();
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
135 StandardDeviation std = new StandardDeviation();
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
136
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
137 for (IndexedValue value: values) {
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
138 mean.increment(value.getValue());
3011
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
139 std .increment(value.getValue());
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
140 }
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
141
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
142 double m = mean.getResult();
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
143 double s = std.getResult();
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
144
3564
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
145 if (debug) {
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
146 log.debug("mean: " + m);
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
147 log.debug("std dev: " + s);
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
148 }
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
149
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
150 double maxZ = -Double.MAX_VALUE;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
151 int iv = -1;
2646
c11da3540b70 Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2645
diff changeset
152 for (int i = N-1; i >= 0; --i) {
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
153 IndexedValue v = values.get(i);
3564
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
154 double z = Math.abs(v.getValue()-m);
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
155 if (debug) {
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
156 log.debug("z candidate: " + z);
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
157 }
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
158 if (z > maxZ) {
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
159 maxZ = z;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
160 iv = i;
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
161 }
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
162 }
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
163
3564
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
164 if (Math.abs(s) < EPSILON) {
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
165 break;
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
166 }
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
167
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
168 maxZ /= s;
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
169
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
170 TDistributionImpl tdist = new TDistributionImpl(N-2);
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
171
3564
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
172 double t;
2646
c11da3540b70 Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2645
diff changeset
173
3564
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
174 try {
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
175 t = tdist.inverseCumulativeProbability(alpha/(N+N));
2646
c11da3540b70 Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2645
diff changeset
176 }
c11da3540b70 Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2645
diff changeset
177 catch (MathException me) {
c11da3540b70 Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2645
diff changeset
178 log.error(me);
3564
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
179 break;
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
180 }
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
181
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
182 t *= t;
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
183
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
184 double za = ((N-1)/Math.sqrt(N))*Math.sqrt(t/(N-2d+t));
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
185
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
186 if (debug) {
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
187 log.debug("max: " + maxZ + " crit: " + za);
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
188 }
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
189
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
190 if (maxZ > za) {
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
191 outliers.add(values.get(iv));
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
192 values.remove(iv);
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
193 }
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
194 else {
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
195 if (debug) {
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
196 log.debug("values left: " + N);
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
197 }
e01b9d1bc941 FixA: Corrected the formulas of Grubbs' test for outliers. Still a bit broken.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 3011
diff changeset
198 break;
2646
c11da3540b70 Checked in out dated version of outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2645
diff changeset
199 }
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
200 }
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
201
3011
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
202 Collections.sort(outliers);
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
203
3011
ab81ffd1343e FixA: Reactivated rewrite of the outlier checks.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents: 2646
diff changeset
204 return new Outliers(values, outliers);
2645
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
205 }
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
206 }
4f7d1ea38404 Added simple Grubb's outlier test.
Sascha L. Teichmann <sascha.teichmann@intevation.de>
parents:
diff changeset
207 // vim:set ts=4 sw=4 si et sta sts=4 fenc=utf8 :

http://dive4elements.wald.intevation.org