Hi!
You can seek back on EventBus events, but not permanently (by default, only up to 1 week). If you want to respond to changes in an event stream, you
1 week is not enough for this use case, but if it could be extended to, say, 1 month, that could be workable.
The reason is that the starting point for a WDQS server install is a Wikidata dump, which is made weekly. The server then catches up on the data that changed between the dump point and the current moment. However, there could be dump failures or other conditions that make the most recent dump unusable. It also takes time to load the dump itself. So the delta between the current moment and the data in a freshly deployed WDQS server could be 2 weeks or even more. We need to be able to catch up on the changes since then. We will probably never need the full month, but it's the conservative limit we're using now for how far back we can ask for data. 2 weeks would probably work too, even though it could make some scenarios more complicated to handle.
should consume the full event stream realtime and react to the events as they come in. A proper Stream Processing system (like Flink or Spark
This is not possible for the WDQS Updater. Since a WDQS server is completely independent of Wikidata, it can be started and stopped at any time. There's no way to ensure that, at every moment something changes in Wikidata, all WDQS instances interested in that change are up and running. There needs to be an intermediary system that keeps the data. So far the recent changes API has served as this system, but since it does not know about secondary data, it's no longer enough.
this stream will be relatively small, and you don’t need fancy features like time based windowing. You just need to update something based on an event, right?
Well, I need something based on an event that I can ask something like: "give me all events that happened since time point T", for T being anywhere from, say, a second ago to 2 weeks ago.
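To make that requirement concrete, here is a minimal in-memory sketch of the behaviour I'm asking for. It is not EventBus or any existing API: `RetainedEventLog`, `events_since` and the retention value are hypothetical names chosen for illustration, and it assumes events arrive in timestamp order.

```python
import bisect

WEEK = 7 * 24 * 3600  # seconds


class RetainedEventLog:
    """Hypothetical intermediary that keeps events for a fixed retention window."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._timestamps = []  # sorted, parallel to self._events
        self._events = []
        self._horizon = None   # oldest time point we can still answer for

    def append(self, timestamp, event):
        # Assumption: events are appended in non-decreasing timestamp order.
        if self._timestamps and timestamp < self._timestamps[-1]:
            raise ValueError("events must be appended in timestamp order")
        self._timestamps.append(timestamp)
        self._events.append(event)
        # Expire everything that has fallen out of the retention window.
        self._horizon = timestamp - self.retention_seconds
        drop = bisect.bisect_left(self._timestamps, self._horizon)
        del self._timestamps[:drop]
        del self._events[:drop]

    def events_since(self, t):
        """Return all retained events with timestamp >= t."""
        if self._horizon is not None and t < self._horizon:
            # The consumer is too far behind; it has to start from a newer dump.
            raise LookupError("requested time point is older than the retention window")
        start = bisect.bisect_left(self._timestamps, t)
        return list(self._events[start:])


# A freshly loaded server asks for everything since its dump point.
log = RetainedEventLog(retention_seconds=4 * WEEK)
log.append(0, "edit Q1")
log.append(2 * WEEK, "edit Q2")
log.append(3 * WEEK, "delete Q3")
catch_up = log.events_since(2 * WEEK)
```

The point of the `LookupError` branch is that the consumer must be able to tell "nothing happened since T" apart from "T is too old to answer for", otherwise a server loaded from a stale dump would silently miss changes.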
The change-propagation service that the Services team is building can help you with this. It allows you to consume events, and specify matching rules and actions to take based on those rules.
I see no mention of the ability to consume past events. Is that possible?