
I need a way to fire a window when either the element count reaches a specified limit or the data in the window reaches a specified size (byte count). I found a data-driven trigger based on count, but none based on the number of bytes; if one existed, I could build a composite trigger from the two. Is there a way to achieve this?

Sumit
Unfortunately, setting triggers based on accumulated message size is not currently supported. Please have a look at the workaround in another Stack Overflow [thread](https://stackoverflow.com/questions/46428605/dataflow-apache-beam-trigger-window-on-number-of-bytes-in-window) and let me know if it helps. The solution proposed there was to implement a custom writer, based on the linked documentation. It may get you on the right track for implementing your own solution. – aga Jun 08 '20 at 14:28

1 Answer


There is no way to achieve this with triggers.

The best option is to use state in a ParDo, which lets you track whatever you want in a persistent manner. State is scoped per key and window (so the input must be keyed), and you can buffer elements in state for that key and window until either threshold is reached, then emit the buffered batch.

See https://beam.apache.org/documentation/programming-guide/#state-and-timers for detailed information.
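The buffering logic itself can be sketched in plain Python, independent of Beam. This is only an illustration under stated assumptions: `SizeOrCountBuffer`, `max_count`, and `max_bytes` are hypothetical names, and elements are assumed to be `bytes` payloads. In a real stateful DoFn the list and the byte counter would live in Beam state cells rather than in instance attributes, and an event-time timer would probably be needed to flush whatever is still buffered when the window expires.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SizeOrCountBuffer:
    """Hypothetical helper: buffers payloads for ONE key and window."""

    max_count: int  # flush once this many elements are buffered
    max_bytes: int  # flush once this many bytes are buffered
    _items: List[bytes] = field(default_factory=list, init=False)
    _bytes: int = field(default=0, init=False)

    def add(self, payload: bytes) -> List[bytes]:
        """Buffer a payload; return the batch if a limit is hit, else []."""
        self._items.append(payload)
        self._bytes += len(payload)
        if len(self._items) >= self.max_count or self._bytes >= self.max_bytes:
            return self.flush()
        return []

    def flush(self) -> List[bytes]:
        """Return everything buffered so far and reset the counters."""
        batch, self._items, self._bytes = self._items, [], 0
        return batch


# Usage: whichever limit is reached first triggers the flush.
buf = SizeOrCountBuffer(max_count=3, max_bytes=10)
first = buf.add(b"abcd")     # 4 bytes buffered, below both limits
second = buf.add(b"efghij")  # 10 bytes buffered, byte limit reached
```

Checking the limits after appending means the element that crosses a threshold is included in the emitted batch; checking before appending would instead keep every batch strictly under the byte limit.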

danielm