
I want to use gRPC to share very large files (more than 6 GB) between endpoints and a server.

The project I'm currently working on requires a central server where endpoints can upload and download files. One of the constraints is that the endpoints don't know each other, but they can send messages to and receive messages from each other over a common bus.

To implement this server and its communication with the endpoints, I'm evaluating gRPC. Do you think it is the best solution for streaming files? What alternatives do I have?

Thanks in advance.

Marco Fiorillo
  • How about simply providing a download link? HTTP servers have been handling big downloads since their inception, why reinvent the wheel here? – Lasse V. Karlsen Jun 19 '20 at 12:31
  • gRPC is useful, but it doesn't solve the problem of huge bandwidth issues; I mean... sure, you could use a stream of chunks in gRPC terms, but... a regular vanilla HTTP download seems much simpler – Marc Gravell Jun 19 '20 at 12:42
  • The huge bandwidth requirement is a big problem, but I have also read that gRPC isn't optimized for files larger than 2 GB; I couldn't find anything about it other than a single forum post. Does anyone know more about this? By the way, I forgot to say that I need a secure way to transfer the file, such as SSL. – Marco Fiorillo Jun 19 '20 at 13:00
  • @MarcoFiorillo gRPC can work inside TLS; that isn't a problem - and if you really want to use gRPC, you would make it a "server streaming" method that returns multiple *segments* of the file in separate chunks, rather than one single *unary* response; is that what you're after? – Marc Gravell Jun 19 '20 at 13:25
  • To add to Marc Gravell's answer, the downloading side (the caller) will have to aggregate the streamed chunks manually... gRPC guarantees in-order delivery within a stream, so there is no need to send a chunk index. I recently proposed standardizing the file transfer proto message to both the grpc and grpc-web projects. Maybe the wider community will help make this happen. – Alexander.Furer Jun 20 '20 at 14:14

2 Answers


gRPC with client/server streaming is capable of handling file uploads and downloads. However, there's a discussion here on the performance of gRPC vs HTTP for file upload/download, which argues that HTTP will generally be faster because it only reads and writes the incoming bytes, while gRPC performs additional serialization/deserialization for each message in the stream, adding significant overhead.

There is also a blog post that benchmarks the same scenario: https://ops.tips/blog/sending-files-via-grpc/ .

If you are looking to implement something that has to handle scale, you can do some more research.

Himanshu Shekhar

If you really want to do this over gRPC, then the key thing is to make the response "server streaming", so that instead of returning 6 GiB in one chunk, it returns multiple chunks of whatever size you need, for example 128 KiB at a time; you can do this with something like:

syntax = "proto3";
message FileRequest {
    string id = 1; // or whatever
}
message FileResponse  {
    bytes chunk = 1; // some segment of the file
}

service SearchService {
  rpc GetFile(FileRequest) returns (stream FileResponse);
}

but nothing is automatic: it is now your job to write the multiple segments back.
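As a rough illustration of that manual work, here is a minimal Python sketch of the two halves: splitting a file into segments on the sending side and writing them back in arrival order on the receiving side. It deliberately leaves out the gRPC plumbing; `read_chunks`, `write_chunks` and `CHUNK_SIZE` are hypothetical names, and the assumption is that in a real service each yielded segment would be wrapped in a `FileResponse` and the receiver would iterate over the response stream.

```python
from typing import Iterable, Iterator

# 128 KiB is just the example size mentioned above; tune it for your setup.
CHUNK_SIZE = 128 * 1024


def read_chunks(path: str, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Sender side: yield the file as successive byte segments."""
    with open(path, "rb") as src:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # empty read means end of file
                break
            yield chunk


def write_chunks(chunks: Iterable[bytes], dest_path: str) -> int:
    """Receiver side: write segments in arrival order, return bytes written."""
    total = 0
    with open(dest_path, "wb") as dst:
        for chunk in chunks:
            dst.write(chunk)
            total += len(chunk)
    return total
```

Because both functions work on iterators, only one segment is held in memory at a time, which is the point of streaming a multi-gigabyte file rather than returning it in a single response.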

I suspect a vanilla http download-style response may be simpler!
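For comparison, a plain HTTP(S) download can be sketched with nothing but the Python standard library. This is only an illustration under assumptions: the `download` helper and the example URL are hypothetical, and a production client would likely add timeouts, retries and resumable (range) requests.

```python
import shutil
import urllib.request


def download(url: str, dest_path: str, buffer_size: int = 1024 * 1024) -> None:
    """Stream the response body to disk without loading it all into memory."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as dst:
        # copyfileobj reads and writes buffer_size bytes at a time
        shutil.copyfileobj(response, dst, buffer_size)


# Hypothetical usage; an https:// URL gives you TLS for the transfer:
# download("https://files.example.com/big-file.bin", "big-file.bin")
```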

Marc Gravell
  • Thanks for the answer, but if you have any other options, I would like to evaluate them. I was also evaluating a TCP client + SSL stream, and I am also evaluating the HTTP download response! Thanks. I thought there was some kind of automation to merge all the chunks together. – Marco Fiorillo Jun 19 '20 at 13:35
  • Can you give me a link or some more information about the "http download-style response"? Thanks – Marco Fiorillo Jun 19 '20 at 14:08
  • @MarcoFiorillo I mean regular http[s] - just: the same way that files have been transferred for several decades – Marc Gravell Jun 19 '20 at 14:30
  • I'm sorry, I thought you were talking about some framework based on HTTP(S). – Marco Fiorillo Jun 20 '20 at 15:04
  • @MarcoFiorillo People have been using HTTP/1.1 for decades; it opens a new TCP connection when you send data (a byte stream in your case), and opening a new TCP connection is expensive. With gRPC you can reuse the same TCP connection and stream your data, so I'm not sure how a REST implementation would be faster than gRPC for you. – DV Singh Sep 10 '22 at 20:53