Hi!
Well, no, HDFS is a means to an end: it stores data in a form that can be cleaned with ETL processes so that it can /then/ go to the somewhere/something - which covers a lot of use cases, most prominently our dashboards and ad-hoc research tasks.
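To make the "clean with ETL, then consume" flow concrete, here is a toy sketch. The field names (`ts`, `uri`) and the tab-separated format are entirely invented for illustration - the real ETL runs on the cluster, not in a little script like this:

```python
# Toy ETL step: raw log lines -> cleaned records a dashboard could consume.
# Field names and the record format are invented for illustration only.

def clean(raw_lines):
    """Drop malformed lines and normalize the rest."""
    records = []
    for line in raw_lines:
        parts = line.strip().split("\t")
        if len(parts) != 2:
            continue  # malformed input is dropped during ETL
        ts, uri = parts
        records.append({"ts": ts, "uri": uri.lower()})
    return records

raw = [
    "2016-01-01T00:00:00\t/WIKI/Foo",
    "garbage line",
    "2016-01-01T00:00:05\t/wiki/Bar",
]
print(clean(raw))
```

The point is just that what lands in HDFS is the cleaned, analyst-consumable form, not the raw input.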
Thanks for explaining more! I think I understand your concern better now. With the renewed attention to WDQS productization, the point may be moot soon, but in case it isn't, I just wanted to explore the possibility of using the same infrastructure with different inputs - or maybe of building a bridge between HDFS and whatever we have in labs. I'm not saying this necessarily makes sense, but if it doesn't, I'd like to know why.
We shouldn't reinvent the wheel every time we build a thing. If we can't do HDFS and going to production isn't going to work, then let's talk about what the alternatives are. Until then the use case is "the data being in HDFS so that analysts can consume it", and higher-level use cases are overthinking it.
OK. Then if we go to production soon (hopefully), I assume we have an existing workflow for getting stuff into HDFS. If not, we _may_ (again, if that doesn't make sense, fine, but I'd like to hear the reasons) explore some process that would let us get data from whatever we have now (which can be rather flexible) into HDFS.
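If such a bridge turned out to make sense, the mechanical part is conceptually small: export from the current store, stage the dump, push it in. A minimal sketch of the staging step, with a local directory standing in for HDFS since I don't know what the labs side looks like - in production the last hop would presumably be something like `hdfs dfs -put` rather than a local copy:

```python
import shutil
from pathlib import Path

def stage_for_hdfs(export_file: str, staging_dir: str) -> str:
    """Copy an exported dump into a staging area.

    The staging dir here is a stand-in for HDFS; in a real bridge the
    final step would be an `hdfs dfs -put <local> <hdfs-path>` (or an
    equivalent client call) instead of a plain local copy.
    """
    dst = Path(staging_dir)
    dst.mkdir(parents=True, exist_ok=True)
    return shutil.copy(export_file, dst)

# Usage with a throwaway export file (format invented for illustration):
dump = Path("dump.tsv")
dump.write_text("q1\tlabel\n")
staged = stage_for_hdfs(str(dump), "staging")
print(staged)
```

The interesting questions are all upstream of this (what to export, how often, who owns it), which is exactly the conversation above.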