
For fun I wrote this Ruby socket server, which actually works quite nicely. I'm planning on using it as the backend of an iOS app. My question for now is: where in the thread do I need a Mutex? Will I need one when accessing a shared variable such as @clients?

require 'socket'

module Server

    @server = nil
    @clients = []    # shared by every client thread, currently unsynchronized
    @sessions = nil  # not used yet

    def self.run(port = 3000)
        @server = TCPServer.new port

        while (socket = @server.accept)
            @clients << socket
            Thread.start(socket) do |sock|
                begin
                    loop do
                        msg = String.new
                        begin
                            # Read until a chunk shorter than the buffer arrives
                            while (data = sock.read_nonblock(1024))
                                msg << data
                                break if data.length < 1024
                            end
                        rescue IO::WaitReadable
                            IO.select([sock])  # wait for data instead of spinning
                            retry
                        end
                        @clients.each do |client|
                            client.write "#{sock} says: #{msg}" unless client == sock
                        end
                    end
                rescue => e
                    # EOFError lands here when the client disconnects
                    @clients.delete sock
                    sock.close
                    puts e
                    puts "Killed client #{sock}"
                end
            end
        end
    end

end

Server.run

--Edit--

According to the answer from John Bollinger, I need to synchronize any time a thread accesses a shared resource. Does this apply to database queries? Can I read from and write to a Postgres database with the ActiveRecord ORM from multiple threads at once?

OneChillDude
  • Yes; such operations are not atomic in Ruby. The GIL may cause them to appear atomic, but I wouldn't necessarily count on it. – Chris Heald Oct 21 '14 at 18:12
  • Global Interpreter Lock - MRI Ruby won't execute Rubyland code in more than one thread at a time because of it, which can give the illusion of thread safety in some circumstances. – Chris Heald Oct 21 '14 at 18:50
  • You are creating a new thread for every socket. This will work fine for a while, but at some point you will definitely run into problems as more and more clients connect. If you want to handle lots of open sockets you should look into socket.io. (Or check out eventmachine if you want to stick to ruby) – Mark Meeus Oct 21 '14 at 18:56

2 Answers

2

Any data that may be modified by one thread and read by a different one must be protected by a Mutex or a similar synchronization construct. Inasmuch as multiple threads may safely read the same data at the same time, a synchronization construct a bit more sophisticated than a single Mutex might yield better performance.

In your code, it looks like not only does @clients need to be properly synchronized, but so also do all its elements because writing to a socket is a modification.
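As a rough sketch of what that could look like, here is one way to put both the client list and the socket writes behind a single Mutex. The `ClientRegistry` class and its method names are hypothetical (they are not in the question's code), and `StringIO` objects stand in for real sockets so the snippet runs on its own.

```ruby
require 'stringio'

# Hypothetical helper: wraps the shared client list in one Mutex.
class ClientRegistry
  def initialize
    @lock = Mutex.new
    @clients = []
  end

  def add(client)
    @lock.synchronize { @clients << client }
  end

  def remove(client)
    @lock.synchronize { @clients.delete(client) }
  end

  def size
    @lock.synchronize { @clients.size }
  end

  # Writes msg to every client except the sender. The lock is held during
  # the writes too, so two broadcasts cannot interleave on one socket.
  def broadcast(sender, msg)
    @lock.synchronize do
      @clients.each { |client| client.write(msg) unless client.equal?(sender) }
    end
  end
end

# Usage: ten threads broadcast at once; StringIO stands in for a socket.
registry = ClientRegistry.new
sender   = StringIO.new
receiver = StringIO.new
registry.add(sender)
registry.add(receiver)

threads = 10.times.map do |i|
  Thread.new { registry.broadcast(sender, "msg #{i}\n") }
end
threads.each(&:join)
```

Holding one lock across every write is the simplest choice, but it serializes all broadcasts and a slow client stalls everyone. A per-socket lock would be a possible refinement.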

John Bollinger
  • So basically you're saying that any time I read or write to any memory that is shared by multiple threads I need a synchronization construct? What about writing to a database? Can I make ActiveRecord queries from multiple threads at once? – OneChillDude Oct 21 '14 at 18:36
  • Thread starts are synchronization events. You do not need to synchronize reads of data that cannot have been modified since the current thread started. You also do not need to synchronize reads of data that are modified only by the reading thread. You do need to synchronize any time you modify memory that could be read by other threads, and any time you read memory that could have been modified by another thread. – John Bollinger Oct 21 '14 at 18:51
  • ActiveRecord is thread-safe; it uses a different connection on each thread. So you shouldn't worry about that. – Mark Meeus Oct 21 '14 at 18:53
  • You can make ActiveRecord queries from multiple threads as long as each uses a separate connection. – John Bollinger Oct 21 '14 at 18:53
2

Don't use a mutex unless you really have to.

It's a pity the literature on Ruby multi-threading is so scarce; the only good book written on the topic is Working With Ruby Threads by Jesse Storimer. I've learned a lot of useful principles from it, one of which is: don't use a mutex if there are better alternatives. In your case, there are. If you use Ruby without any gems, the only thread-safe data structure is a Queue. An Array is not safe. However, with the thread_safe gem you can create one:

require 'thread_safe'

sa = ThreadSafe::Array.new # supports standard Array.new forms
sh = ThreadSafe::Hash.new # supports standard Hash.new forms
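For comparison, here is a minimal sketch of the built-in Queue used from several threads with no explicit mutex. The producer count and the payload are made up for illustration. As the comments below note, a Queue is not a drop-in replacement for the @clients list; this only shows that concurrent pushes are safe.

```ruby
# Queue (Thread::Queue) ships with Ruby and does its own locking.
queue = Queue.new

# Four hypothetical producers push 100 items each, concurrently.
producers = 4.times.map do |producer_id|
  Thread.new do
    100.times { |n| queue << [producer_id, n] }
  end
end
producers.each(&:join)

# Drain the queue from the main thread once all producers have finished.
received = []
received << queue.pop until queue.empty?
```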

Regarding your question: you need to protect a shared data structure with a mutex only if some thread MODIFIES it. If all the threads just read from it and none writes to it, no mutex is needed; see John's comment for why you do need one when one thread reads while another writes. You don't need one for accessing unchanging data. If you're using Active Record with Postgres: yes, Active Record IS thread safe. As for Postgres, you might want to follow these instructions (Behavior in Threaded Programs) to check that.

Also, be aware of race conditions, an inherent problem you should keep in mind when coding multi-threaded apps (see How to Make ActiveRecord ThreadSafe).

Avdi Grimm had one very sound piece of advice for multi-threaded apps: when testing them, make them fail loud and fast. So don't forget to add at the top:

Thread.abort_on_exception = true

so your threads don't silently fail if something goes wrong.
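To see the effect, here is a small sketch in which a worker thread raises and the main thread observes the failure. The `main_thread_sees_failure?` helper is a made-up name for illustration, and the sketch assumes Ruby 2.5 or later for `report_on_exception=`. It restores the previous setting afterwards so it doesn't change the rest of a program.

```ruby
# Hypothetical helper: returns true if a worker's exception reaches
# the main thread while abort_on_exception is enabled.
def main_thread_sees_failure?
  previous = Thread.abort_on_exception
  Thread.abort_on_exception = true
  begin
    Thread.new do
      # Silence the default stderr report so only the re-raise is observed.
      Thread.current.report_on_exception = false
      raise ArgumentError, "boom"
    end
    sleep 2   # interrupted when the worker's exception is re-raised here
    false     # reached only if nothing was re-raised
  rescue ArgumentError
    true
  ensure
    Thread.abort_on_exception = previous
  end
end

seen = main_thread_sees_failure?
```

Without the flag, the worker would simply die and the error would only surface if something later called `join` or `value` on that thread.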

Community
daremkd
  • You absolutely do need to synchronize via a mutex or other synchronization construct when you read data that could have been modified by another thread. Otherwise, the reader is not guaranteed *ever* to see the modifications, and if they do see them then they are not certain to see a consistent view of them. Synchronization is a mutual effort. It doesn't work for just one party to participate. – John Bollinger Oct 22 '14 at 13:08
  • Have you read my answer at all? My point was to use a THREAD SAFE DATA STRUCTURE before using a mutex, not to not use a mutex at all. Read before you comment. – daremkd Oct 22 '14 at 13:12
  • With that said, it is true that there are synchronization constructs other than mutexes, and that you should choose (or build) the one most appropriate for your situation. That could be a construct that is baked in to another class. In any case a `Queue` is *not* appropriate for the OP's case, though a `ThreadSafe::Array` might be. – John Bollinger Oct 22 '14 at 13:12
  • Yes, I read the answer. In particular, I read this: "it's only when you're MODIFYING a shared data structure that you'll need a mutex. You don't need it for accessing it," (emphasis in the original) which is very wrong. – John Bollinger Oct 22 '14 at 13:14
  • Thanks, I've modified my answer and added a reference to your comment. – daremkd Oct 22 '14 at 13:21
  • I've slightly edited the answer to make your intended meaning (that evidently I mistook) more clear. – John Bollinger Oct 22 '14 at 13:41