I have a U-SQL script that processes a batch of CSV files. I am using virtual columns to retrieve a product id for each file. All files that are read have the same product id.

In my custom Outputter I open a connection to a database and retrieve meta information based on the product id. This works, but a call to the database is made for every file (which is the expected behaviour).

Is it possible to create a global function that runs only once and whose result is appended to the output for all the files? That would work for me, since all the files have the same product id.

reachify

1 Answer

Sorry for the late reply. It is not clear to me what you are trying to achieve.

First, your code will not run when deployed to the cluster, since user code is not allowed to cross the container boundary (see "Does U-SQL allow custom code to call external services" for details on why).

Secondly, a U-SQL script is a declarative query. Thus, there is no intermediate snapshotting of results. If your "global" function is part of that query and executes in the right place of the job graph only once, you can do so. But without knowing the details of what you want to do, it is hard to provide advice.
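As a rough illustration of keeping the lookup inside the query, here is a minimal sketch. It assumes the meta information can be exported ahead of time to a file in the store, so that no database call is needed from user code. All paths, column names, and the file layout below are hypothetical and not taken from your script.

```usql
// Assumption: the input files live under /input/<productId>/ and the
// product id is picked up as a virtual column from the path.
@data =
    EXTRACT line_id int,
            amount decimal,
            productId string            // virtual column from the file path
    FROM "/input/{productId}/{*}.csv"
    USING Extractors.Csv();

// Assumption: the product meta information has been exported beforehand
// to a reference file, one row per product id, with a header row.
@meta =
    EXTRACT productId string,
            productName string
    FROM "/reference/product_meta.csv"
    USING Extractors.Csv(skipFirstNRows : 1);

// The metadata rowset is read once as part of the job and joined in,
// instead of being fetched separately for every file.
@result =
    SELECT d.line_id,
           d.amount,
           d.productId,
           m.productName
    FROM @data AS d
         INNER JOIN @meta AS m
         ON d.productId == m.productId;

OUTPUT @result
TO "/output/result.csv"
USING Outputters.Csv(outputHeader : true);
```

Whether this fits depends on where the meta information has to end up in the output, so treat it as one possible shape rather than a recommendation.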

Michael Rys