Developer(s) | StARLinG Lab |
---|---|
Initial release | December 29, 2016 |
Stable release | 1.1.1 / August 1, 2019 |
Repository | github |
Written in | Java |
Platform | Linux, macOS, Windows |
Type | Machine Learning, Relational dependency network |
License | GPL 3.0 |
Website | starling |
Relational dependency networks (RDNs) are graphical models which extend dependency networks to account for relational data. Relational data is data organized into one or more tables, which are cross-related through standard fields. A relational database is a canonical example of a system that serves to maintain relational data. A relational dependency network can be used to characterize the knowledge contained in a database.
Relational dependency networks (RDNs) aim to capture the joint probability distribution over the variables of a dataset represented in the relational domain. They are based on dependency networks (DNs) and extend them to the relational setting. RDNs can be learned efficiently because the conditional probability distribution of each variable is estimated separately, so the parameters are learned independently. Because independent learning may introduce inconsistencies, RDNs, like DNs, use Gibbs sampling to recover the joint distribution.
Unlike dependency networks, RDNs need three graphs to be fully represented: the data graph, the model graph, and the inference graph.
In other words, the data graph guides how the model graph will be rolled out to generate the inference graph.
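The relationship between the three graphs can be illustrated with a minimal sketch. The object names, link types, and the `roll_out` helper below are hypothetical and chosen only for illustration. The sketch assumes a data graph of typed objects and links and a model graph of type-level dependencies, and treats links as undirected.

```python
# Hypothetical data graph: typed objects and typed links between them.
objects = {"p1": "Paper", "p2": "Paper", "a1": "Author"}
links = [("a1", "p1", "wrote"), ("a1", "p2", "wrote")]

# Hypothetical model graph: type-level dependencies between attributes.
# Each entry reads: (child type, attribute) depends on
# (parent type, attribute) across the given link type.
model_graph = [
    (("Paper", "topic"), ("Author", "field"), "wrote"),
]

def roll_out(objects, links, model_graph):
    """Instantiate each type-level dependency for every matching linked pair."""
    edges = set()
    for (child_type, child_attr), (parent_type, parent_attr), link_type in model_graph:
        for src, dst, kind in links:
            if kind != link_type:
                continue
            # Links are treated as undirected here, so try both orientations.
            for child, parent in ((src, dst), (dst, src)):
                if objects[child] == child_type and objects[parent] == parent_type:
                    edges.add(((parent, parent_attr), (child, child_attr)))
    return edges

# Each edge points from a parent attribute instance to a child attribute instance.
inference_graph = roll_out(objects, links, model_graph)
```

Here the single type-level dependency produces one instance-level edge per authored paper, so the shape of the inference graph is determined by the data graph.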
The learning method of an RDN is similar to that employed by DNs: the conditional probability distribution of each variable can be learned independently. However, only conditional relational learners can be used during parameter estimation for RDNs. Therefore, the learners used by DNs, such as decision trees or logistic regression, do not work for RDNs.
Neville and Jensen (2007) [1] conducted experiments comparing RDNs learned with Relational Bayesian Classifiers against RDNs learned with Relational Probability Trees. Natarajan et al. (2012) [2] used a series of regression models to represent the conditional distributions.
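One reason a relational learner is needed is that each object can be linked to a varying number of other objects, so the inputs to a conditional distribution do not have a fixed size. The sketch below is a toy illustration of that idea, not one of the learners cited above: it aggregates the labels of linked objects into a single feature and estimates one conditional distribution by counting. The data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical relational data: each paper has a binary label and a
# variable-size list of linked papers.
labels = {"p1": 1, "p2": 1, "p3": 0, "p4": 0, "p5": 1}
neighbors = {
    "p1": ["p2", "p5"],
    "p2": ["p1"],
    "p3": ["p4"],
    "p4": ["p3", "p1"],
    "p5": ["p1", "p2", "p3"],
}

def majority_label(paper):
    """Aggregate a variable-size neighborhood into one fixed-size feature."""
    votes = [labels[n] for n in neighbors[paper]]
    # Strict majority; a tie counts as 0.
    return 1 if 2 * sum(votes) > len(votes) else 0

def estimate_conditional():
    """Estimate P(label = 1 | majority label of linked papers) by counting."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for paper, label in labels.items():
        feature = majority_label(paper)
        totals[feature] += 1
        positives[feature] += label
    return {feature: positives[feature] / totals[feature] for feature in totals}

cpd = estimate_conditional()
```

In an RDN, one such conditional distribution would be estimated for each variable, each independently of the others.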
Independent learning makes the RDN efficient to train. However, it also makes RDNs susceptible to structural or numerical inconsistencies. If the method used to estimate the conditional probability distributions performs feature selection, one variable may select a dependency on another variable while the latter does not select the reverse dependency. In this case, the RDN is structurally inconsistent. In addition, if the joint distribution does not sum to one owing to the approximations caused by independent learning, the RDN is numerically inconsistent. Such inconsistencies can, however, be bypassed during the inference step.
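A structural inconsistency can be detected by checking whether the selected dependencies are symmetric. The following sketch assumes the output of feature selection is available as a mapping from each variable to the set of variables it selected. The variable names and the `structural_inconsistencies` helper are hypothetical.

```python
def structural_inconsistencies(parents):
    """Return pairs (x, y) where x selected y as a dependency but y did not select x."""
    return sorted(
        (x, y)
        for x, selected in parents.items()
        for y in selected
        if x not in parents.get(y, set())
    )

# Hypothetical result of independent feature selection for three variables.
parents = {
    "topic": {"field", "venue"},
    "field": {"topic"},
    "venue": set(),
}

asymmetric = structural_inconsistencies(parents)
```

In this example `topic` selected `venue`, but `venue` did not select `topic`, so that pair is reported as the only inconsistency.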
RDN inference begins with the creation of an inference graph through a process called roll out. In this process, the model graph is rolled out over the data graph to form the inference graph. Next, a Gibbs sampling technique can be used to recover a conditional probability distribution.
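The sampling step can be sketched on a toy problem. The example below is a simplification: instead of a rolled-out inference graph, it uses two binary variables whose conditional distributions are given separately, as if they had been learned independently. The probabilities, the function name, and the default parameters are assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical conditionals, as if learned independently of each other.
p_a_given_b = {0: 0.2, 1: 0.8}  # P(A = 1 | B = b)
p_b_given_a = {0: 0.2, 1: 0.8}  # P(B = 1 | A = a)

def gibbs_joint(p_a_given_b, p_b_given_a, n_samples=20000, burn_in=1000, seed=0):
    """Estimate the joint distribution of (A, B) from the two conditionals."""
    rng = random.Random(seed)
    a, b = 0, 0
    counts = Counter()
    for step in range(burn_in + n_samples):
        # Resample each variable in turn, conditioned on the current value of the other.
        a = 1 if rng.random() < p_a_given_b[b] else 0
        b = 1 if rng.random() < p_b_given_a[a] else 0
        if step >= burn_in:
            counts[(a, b)] += 1
    states = [(0, 0), (0, 1), (1, 0), (1, 1)]
    return {state: counts[state] / n_samples for state in states}

joint = gibbs_joint(p_a_given_b, p_b_given_a)
```

These particular conditionals are consistent with a joint distribution that places probability 0.4 on each of (0, 0) and (1, 1) and 0.1 on each of the other two states, so the estimate should be close to those values. Because the estimate is built from sample counts, it always sums to one, even when the supplied conditionals do not correspond to any single joint distribution.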
RDNs have been applied in many real-world domains. The main advantage of RDNs is their ability to use relationship information to improve the model's performance. Problems to which RDNs have been applied include diagnosis, forecasting, automated vision, sensor fusion, and manufacturing control.
Some suggestions of RDN implementations: