Thank you for your answer; this is a very interesting question. The method we will use is based on the PubMed Entrez API. Using Python code, we retrieve the number of PubMed publications that mention both the name of the disease and the name of the risk factor, along with the PubMed ID of the most relevant search result. As you already know, relations with a large number of query results are more likely to be accurate. We will then give the list of retrieved risk factors to a number of physicians for verification, and afterwards add them to Wikidata using QuickStatements. The concerns you raised are entirely valid. We can discuss them in detail after the proposal is accepted. You are also welcome to be a co-author of the work.
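To make the retrieval step concrete, here is a minimal sketch of how such a co-occurrence query could look against the public E-utilities `esearch` endpoint. It is not the actual script: the function names (`build_esearch_url`, `parse_esearch`, `cooccurrence`), the exact-phrase `AND` query, and the choice of `sort=relevance` are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(disease, risk_factor):
    """Build an esearch URL for papers mentioning both terms (assumed query form)."""
    term = f'"{disease}" AND "{risk_factor}"'
    params = {
        "db": "pubmed",
        "term": term,
        "retmode": "json",
        "retmax": 1,          # only the top hit is needed
        "sort": "relevance",  # so the top hit is the most relevant one
    }
    return ESEARCH_URL + "?" + urlencode(params)


def parse_esearch(payload):
    """Return (publication count, top PubMed ID or None) from an esearch JSON reply."""
    result = json.loads(payload)["esearchresult"]
    count = int(result["count"])  # the count is delivered as a string
    ids = result.get("idlist", [])
    return count, (ids[0] if ids else None)


def cooccurrence(disease, risk_factor):
    """Query PubMed over the network and return (count, top PubMed ID)."""
    with urlopen(build_esearch_url(disease, risk_factor), timeout=30) as resp:
        return parse_esearch(resp.read().decode("utf-8"))
```

Calling `cooccurrence("lung cancer", "smoking")` would return the number of matching publications and the PubMed ID of the top hit. A real run should also respect NCBI's request rate limits, for instance by pausing between queries.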
Interesting proposal, indeed. I left a comment on the page, which basically says we need qualifiers and references. We need the references because some claims will be controversial (even if backed up by literature), and I also see these links as likely time-bound, due to changing health policies (I can think of a few other reasons), at least for the residence examples (I hope no one gets any ideas about Q4 'risk factor' 'residence' 'war zone').

Thank you for your efforts. I invite you to support the proposal of risk factor as a Wikidata property. The property will be an excellent contribution to Wikidata, as it can make this large-scale knowledge base useful for digital epidemiology. The proposal is available in This property is a generalization of the first property proposal I developed.
