annotate artifacts/src/main/java/org/dive4elements/river/artifacts/math/StdDevOutlier.java @ 6714:b265cd6cfda5
issue748: Change StandardDeviation implementation to what BFG calls Standard Deviation
Which is actually a calculation that removes outliers based on
Standard Error
Developed and analyzed together with Tom.
author | Andre Heinecke <aheinecke@intevation.de> |
---|---|
date | Tue, 30 Jul 2013 17:32:28 +0200 |
parents | af13ceeba52a |
children | 5e38e2924c07 |
/* Copyright (C) 2011, 2012, 2013 by Bundesanstalt für Gewässerkunde
 * Software engineering by Intevation GmbH
 *
 * This file is Free Software under the GNU AGPL (>=v3)
 * and comes with ABSOLUTELY NO WARRANTY! Check out the
 * documentation coming with Dive4Elements River for details.
 */

package org.dive4elements.river.artifacts.math;

import java.util.List;

import org.apache.log4j.Logger;

/* XXX:
 * Warning: This class is called StdDevOutlier because it implements
 * what the BFG calls the Standard Deviation method for outlier removal.
 * The actual calculation used to remove the outliers, however, computes
 * the Standard Error and not the Standard Deviation! */

public class StdDevOutlier
{
    public static final double DEFAULT_FACTOR = 3;

    private static final Logger log = Logger.getLogger(StdDevOutlier.class);

    protected StdDevOutlier() {
    }

    /** Runs the outlier test with DEFAULT_FACTOR and discards the standard error. */
    public static Integer findOutlier(List<Double> values) {
        return findOutlier(values, DEFAULT_FACTOR, null);
    }

    /**
     * Finds the value with the largest absolute magnitude and reports it
     * as an outlier if it exceeds factor times the standard error.
     *
     * @param values       the values to test (summed as squared residuals).
     * @param factor       multiplier applied to the standard error.
     * @param stdErrResult optional out-parameter; if not null, element 0
     *                     receives the computed standard error.
     * @return the index of the outlier, or null if there is none or if
     *         fewer than three values are given.
     */
    public static Integer findOutlier(
        List<Double> values,
        double       factor,
        double []    stdErrResult
    ) {
        boolean debug = log.isDebugEnabled();

        if (debug) {
            log.debug("factor for std dev test (that calculates std err): " + factor);
        }

        int N = values.size();

        if (debug) {
            log.debug("Values to check: " + N);
        }

        // Too few values: returns before stdErrResult is written.
        if (N < 3) {
            return null;
        }

        double maxValue = -Double.MAX_VALUE;
        int    maxIndex = -1;

        double squareSumResiduals = 0;
        for (Double db: values) {
            squareSumResiduals += Math.pow(db, 2);
        }

        // The divisor is N - 2; the guard above guarantees it is at least 1.
        double stdErr = Math.sqrt(squareSumResiduals / (N - 2));

        // Largest absolute value that is still accepted.
        double accepted = factor * stdErr;

        // Find the largest absolute value. The loop runs backwards with a
        // strict comparison, so ties resolve to the highest index.
        for (int i = N-1; i >= 0; --i) {
            double value = Math.abs(values.get(i));
            if (value > maxValue) {
                maxValue = value;
                maxIndex = i;
            }
        }

        if (debug) {
            log.debug("std err: " + stdErr);
            log.debug("accepted: " + accepted);
            log.debug("max value: " + maxValue);
        }

        if (stdErrResult != null) {
            stdErrResult[0] = stdErr;
        }

        return maxValue > accepted ? maxIndex : null;
    }
}
// vim:set ts=4 sw=4 si et sta sts=4 fenc=utf8 :
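
The following is a minimal, self-contained sketch of how the test in `findOutlier` behaves on sample data. It is not part of the changeset: the class name `OutlierRemovalSketch`, the two-argument `findOutlier` variant (logging and the `stdErrResult` out-parameter are left out), and the `removeOutliers` loop are assumptions made for illustration, and the repository does not show how callers actually drive the test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OutlierRemovalSketch {

    /** Same threshold test as StdDevOutlier.findOutlier, minus logging. */
    public static Integer findOutlier(List<Double> residuals, double factor) {
        int n = residuals.size();
        if (n < 3) {
            return null;
        }

        double squareSum = 0;
        for (double r : residuals) {
            squareSum += r * r;
        }

        double stdErr   = Math.sqrt(squareSum / (n - 2));
        double accepted = factor * stdErr;

        double maxValue = -Double.MAX_VALUE;
        int    maxIndex = -1;
        for (int i = n - 1; i >= 0; --i) {
            double value = Math.abs(residuals.get(i));
            if (value > maxValue) {
                maxValue = value;
                maxIndex = i;
            }
        }

        return maxValue > accepted ? maxIndex : null;
    }

    /** Hypothetical caller: drops flagged values until none is flagged. */
    public static List<Double> removeOutliers(List<Double> residuals, double factor) {
        List<Double> kept = new ArrayList<>(residuals);
        Integer idx;
        while ((idx = findOutlier(kept, factor)) != null) {
            // intValue() selects remove(int index), not remove(Object).
            kept.remove(idx.intValue());
        }
        return kept;
    }

    public static void main(String[] args) {
        // Twenty small residuals plus one large negative one at index 7.
        List<Double> residuals = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            residuals.add(0.1);
        }
        residuals.add(7, -10.0);

        System.out.println(findOutlier(residuals, 3.0));           // prints 7
        System.out.println(removeOutliers(residuals, 3.0).size()); // prints 20
        System.out.println(findOutlier(Arrays.asList(1.0, 100.0), 3.0)); // prints null
    }
}
```

One consequence of the formula worth keeping in mind: the largest squared value can never exceed the sum of squares, so a value can only be flagged when N - 2 is greater than the square of the factor. With the default factor of 3, a list of fewer than 12 values therefore never yields an outlier.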