Score addition of parents to each node
Score based structure learning is a heuristic powered application of optimization based search.It consists of two components: a search strategy and a scoring mechanism to assess the searchoutput’s quality (and hence inform the search path).
As a consequence the quality of a BN’soutput is governed by the quality of the heuristics it employs and hence the search strategy itexecutes.A common model is the K2 algorithm, where the following assumptions are made:•Applies only to discrete variables (this does not preclude the inclusion of continuous vari-ables, but it does force the removal of any explicit continuous and/or nominal informationwithin the data)•Assumes a maximum number of parents per node•Assumes a total order on the set of variables such that, e.g. if the node order is ‘A’, ‘B’,’C’ then: ‘A’ can’t have parents, ‘B’ may have ‘A’ as parent, ‘C’ may have ‘A’ and ‘B’ asparentsBased on these heuristics a greedy search is then performed from a start condition (e.g. anexpert defined BN structure) with the iterative addition of parents to each node until a localK2 score maximum is found.The K2 search strategy is pre-defined and rigid.
Thus, while undoubtedly effective, it is unlikelyto be universally optimal. This is especially the case where the heuristic’s explicit assumptionsdo not hold true (e.g. the maximum number of parents is prohibitively low, and thus cannotaccommodate the true network). The CNN approach detailed in this thesis could be adaptedto define a bespoke, and potentially more effective, K2 search strategy. Two modified K2approaches are envisioned, and discussed in greater depth below.
Modified K2The K2 algorithm’s greedy search approach can be modified replacing the search strategy dic-tated by node ordering with a list of edges ranked by a neural network’s prediction probability.This approach is similar to the existing Chow-Liu approach, where pairwise associations rankthe addition of edges, but instead employs a more nuanced scoring mechanism (K2) and doesnot mandate a tree structure (Chow-Liu).The replacement of the handcrafted heuristic has the potential to improve the overall searchoutcome by reflecting on the prior experience of the CNN. Machine learning offers the potentialof applicability across a range of BN structure types by defining unique heuristics based onthe training data that could obviate the need for handcrafting heuristics. The outline of thisapproach as detailed in pseudocode in algorithms 3 and 4 below: