Lem provided an algorithms.py (the interface) and a custom_algo.py (where users implement their own algorithms). I added some algorithms to custom_algo.py and made a couple of suggested changes to algorithms.py. These are discussed here, among other things.
<h2>Administrative Details</h2>
...
@@ -24,16 +24,31 @@ Lem provided two algorithms to go with the interface (in algorithms.py), `sapien
<h2>Trust-weighted histogram algorithm</h2>
Eric's [trust_weighted_histogram algorithm](Dan's-proposal-for-trust-weighted-histograms) was also added. This algorithm requires an additional input, the number of bins, so a `misc_input` field was added to `AlgorithmInput` to carry it:
```python
class AlgorithmInput:
    components: list[ComponentData]
    misc_input: dict
```
The user (or algorithm developer) would then supply this when creating an `AlgorithmInput`, e.g.:
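A minimal sketch of what such a call might look like. The `ComponentData` fields, the use of `@dataclass`, and the `"num_bins"` key are assumptions made for illustration; the real definitions in algorithms.py may differ:

```python
from dataclasses import dataclass, field


@dataclass
class ComponentData:
    # Hypothetical stand-in: the real ComponentData is defined in
    # algorithms.py and its fields may differ.
    probability: float
    trust: float


@dataclass
class AlgorithmInput:
    components: list[ComponentData]
    # Free-form extras for algorithms that need more than the components.
    misc_input: dict = field(default_factory=dict)


# "num_bins" is an assumed key name for the number of histogram bins.
algo_input = AlgorithmInput(
    components=[ComponentData(0.8, 0.9), ComponentData(0.6, 0.5)],
    misc_input={"num_bins": 10},
)

# The algorithm would then read the extra input back out of the dict.
num_bins = algo_input.misc_input["num_bins"]
```

Keeping the extra input in a plain dict means algorithms that need no additional parameters can ignore the field entirely.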
<h2>straight_average and straight_average_intermediate algorithm</h2>
I added the `straight_average` algorithm to custom_algo.py. This implements the ideas in [A simple averaging technique to supplement the Bayes equation](A simple averaging technique to supplement the Bayes equation). It does not handle multiple levels, however, because the averaged output in this case is not automatically the input to the next level (as it is in Bayes). Using the averaged output as the input to the next level produces an average of averages, which is not the same as averaging all the probabilities in the population. See the discussion under 'Combining input probabilities in simple averaging' in the link above for more detail.
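The difference between the two averages is easy to see with a small made-up population. The numbers below are illustrative only and do not come from custom_algo.py:

```python
from statistics import mean

# Two sub-groups of unequal size (illustrative probabilities).
group_a = [0.9, 0.8, 0.7, 0.6]  # four sources
group_b = [0.2]                 # one source

# Averaging each group first gives the single source in group_b
# as much weight as all four sources in group_a combined.
average_of_averages = mean([mean(group_a), mean(group_b)])

# Averaging over the whole population weights every source equally.
average_of_all = mean(group_a + group_b)

print(average_of_averages, average_of_all)  # roughly 0.475 versus 0.64
```

The two only agree when every sub-group has the same number of sources.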
One proposed way to handle this issue is the `straight_average_intermediate` algorithm, which is very similar to `straight_average` but uses `intermediate_results` so that results can be sent correctly to the next level up. In essence, the `intermediate_results` are just the Sapienza trust-modified probabilities of every source in the sub-group being analyzed. The sub-group in this case is a node and its direct descendants (children). The average of the group is calculated, and the list of trust-modified probabilities becomes the result passed to the next level. Two examples of a multi-level tree were created manually to test this concept (just run custom_algo.py). In real life, transferring results between nodes will presumably be handled by the server code, so this is really just an experiment to check that things are conceptually sound.
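The idea of passing a list upward instead of a single average can be sketched as follows. This is not the code in custom_algo.py: the function names are hypothetical, the trust modification is an assumed form (pulling a probability toward 0.5 as trust falls), and lists received from children are passed through unchanged because how trust applies to them is not specified here:

```python
from statistics import mean


def trust_modify(probability: float, trust: float) -> float:
    # Assumed form of the trust modification: trust of 1 leaves the
    # probability unchanged, trust of 0 collapses it to 0.5.
    return 0.5 + trust * (probability - 0.5)


def average_with_intermediate(sources, child_lists=()):
    """Return (average, intermediate_results) for one node.

    sources: (probability, trust) pairs for this node's direct sources.
    child_lists: intermediate result lists already passed up by children.
    """
    intermediate = [trust_modify(p, t) for p, t in sources]
    for child in child_lists:
        intermediate.extend(child)  # keep individual values, not an average
    return mean(intermediate), intermediate


# Two lower-level nodes pass their lists up to a root node.
_, list_a = average_with_intermediate([(0.9, 1.0), (0.7, 1.0)])
_, list_b = average_with_intermediate([(0.2, 1.0)])
root_average, root_list = average_with_intermediate([(0.6, 1.0)], [list_a, list_b])
```

Because the root sees every individual probability, its average matches the average over the whole population rather than an average of averages.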
<h2>Example of using straight_average_intermediate</h2>
Let's start by using the case from [A simple averaging technique to supplement the Bayes equation](A simple averaging technique to supplement the Bayes equation):