Some ideas for what they're worth...
19 Jan 2015
Chat applications have become quite popular these days. It used to be that we developers used IRC (Internet Relay Chat). We still do of course, but applications such as HipChat, Slack, and Gitter are slowly starting to replace it.
IRC historically had robust robots that would hang out in the channel answering questions and sending notifications. The problem was, IRC is kinda ugly. Not that we need anything pretty, but style is counting for a lot these days with developers who buy into the rock star image. Of course it’s also easier on the eyes for the occasional non-developer who has to use the chat.
I’ve used each of the apps I mentioned above, but how do they work? Would I do anything different? I would. I tried to do some research into what technology they were using and I don’t think any of them are using the same stack that I would choose for these types of applications.
These applications are all about events when you get down to it. A message is an event, and a commit message forwarded to the channel is an event. So is every hook and notification.
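To make the "everything is an event" idea concrete, here is a minimal sketch in Go. The type, field, and constructor names are my own assumptions for illustration; none of them come from HipChat, Slack, Gitter, or InfluxDB.

```go
package main

import "time"

// Kind labels where an event came from. These names are illustrative only.
type Kind string

const (
	KindMessage Kind = "message"
	KindCommit  Kind = "commit"
)

// Event is one timestamped entry in a room's timeline.
type Event struct {
	Time time.Time
	Room string
	Kind Kind
	Body string
}

// NewMessage wraps a chat message as an event.
func NewMessage(room, author, text string, at time.Time) Event {
	return Event{Time: at, Room: room, Kind: KindMessage, Body: author + ": " + text}
}

// NewCommit wraps a forwarded commit message as an event.
func NewCommit(room, sha, summary string, at time.Time) Event {
	return Event{Time: at, Room: room, Kind: KindCommit, Body: sha + " " + summary}
}
```

With one shape for everything, a chat message and a commit notification can sit in the same timeline and be stored, replayed, and filtered the same way.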
So to my surprise I discovered many of these applications are not using time series databases. Many seem to use basic SQL. Don’t get me wrong, support for transactions (rollbacks on network failures) can be nice…But is a SQL database or even something like MongoDB really the best choice? I don’t think so.
Remember, searching through chat history is also a feature here (presumably all handled easily by ElasticSearch). Though you can always index things no matter which database you use.
Also note that chat history becomes less important over time and for pricing plan purposes you’ll want to remove data older than a certain period of time. Even if you back it up to S3. You only need so much chat history. So now you’re talking about a cleanup process on the database (and search) too.
A Timeseries Database?
InfluxDB is a really interesting database that’s currently changing over to use BoltDB, which is an interesting storage engine in its own right. BoltDB is very fast, which is great for real-time chat applications (though I don’t suspect SQL was too slow). It also excels at sequential reads. Perfect! When you enter a room after being away, that’s precisely what happens with these applications: they replay, in sequential order, the events since you were last there.
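The replay step can be sketched in plain Go. This is not BoltDB or InfluxDB code; it assumes the room's events are already held in a slice sorted by time, and the names Event and Since are hypothetical.

```go
package main

import (
	"sort"
	"time"
)

// Event is a minimal timestamped entry (hypothetical shape).
type Event struct {
	Time time.Time
	Body string
}

// Since returns the events strictly after lastSeen, in order.
// It assumes events is already sorted by Time, oldest first.
func Since(events []Event, lastSeen time.Time) []Event {
	// Binary search for the first event newer than lastSeen.
	i := sort.Search(len(events), func(i int) bool {
		return events[i].Time.After(lastSeen)
	})
	return events[i:]
}
```

Because the data is ordered by time, finding the starting point is a quick seek and everything after it is one sequential read, which is the access pattern a time series store is built around.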
InfluxDB has the notion of series so you can also divide up the type of events quite nicely. You can even merge these series together. So you have a diverse enough range of queries to get the data back out and filter it, which is important. This allows users to find exactly what they are after or cut out all of the chat and look specifically for certain notifications with ease. Again, not that you couldn’t do this with any other database… But InfluxDB makes it much nicer and it also easily facilitates aggregations on top of it all.
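As a rough illustration of what merging and filtering series means, here is a sketch in plain Go. It does not use InfluxDB's query language; the Series field and the Merge and Filter helpers are assumptions made up for this example.

```go
package main

import "time"

// Event is a timestamped entry tagged with the series it belongs to.
type Event struct {
	Time   time.Time
	Series string // e.g. "messages" or "commits" (illustrative names)
	Body   string
}

// Merge combines two series, each sorted by Time, into one
// time-ordered timeline.
func Merge(a, b []Event) []Event {
	out := make([]Event, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		if b[j].Time.Before(a[i].Time) {
			out = append(out, b[j])
			j++
		} else {
			out = append(out, a[i])
			i++
		}
	}
	out = append(out, a[i:]...)
	return append(out, b[j:]...)
}

// Filter keeps only the events that belong to the named series.
func Filter(events []Event, series string) []Event {
	var out []Event
	for _, e := range events {
		if e.Series == series {
			out = append(out, e)
		}
	}
	return out
}
```

A room view would show the merged timeline, while a "notifications only" view is just a filter over the same data.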
You could even run your project management through one of these chat applications with InfluxDB and get a sense for progress. With the aggregation features you could quickly gather all sorts of statistics and metrics. Don’t stop with commits and issues for developers – you could also be pulling in web application metrics and server metrics (the primary focus of InfluxDB anyway).
This truly would be an event command center for your business. More so than what these applications offer today (and with greater ease and speed).
InfluxDB also has a few other special features aside from aggregations (with more on the way) that help out a chat application as well:
It’s fast. It handles search! No need now for a separate search engine. It should really be fast enough here, especially if you can filter it down by a time range. Something like ElasticSearch would be faster (especially with lots of data), but I don’t think a chat application even needs something as robust as ElasticSearch.
Note: There are pure Go based search engines (bleve) as well if we did want to start indexing things and have a more performant search.
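Here is a small sketch of why a time range makes an unindexed search tolerable: the scan only has to look at the window you ask for. This is plain Go rather than InfluxDB or bleve, and the Search function and Event type are hypothetical.

```go
package main

import (
	"strings"
	"time"
)

// Event is a minimal timestamped entry (hypothetical shape).
type Event struct {
	Time time.Time
	Body string
}

// Search returns events whose Time falls in [from, to) and whose Body
// contains term, ignoring case. It is a linear scan with no index.
func Search(events []Event, from, to time.Time, term string) []Event {
	term = strings.ToLower(term)
	var hits []Event
	for _, e := range events {
		if e.Time.Before(from) || !e.Time.Before(to) {
			continue // outside the requested window
		}
		if strings.Contains(strings.ToLower(e.Body), term) {
			hits = append(hits, e)
		}
	}
	return hits
}
```

The narrower the window, the less text there is to scan. A dedicated index would only start to pay for itself once people regularly search across very long histories.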
When you’re talking about a huge amount of data, InfluxDB’s continuous queries can really help here. It can help a system scale better by writing grouped and aggregate data to a separate series. This would result in faster speed which would provide better user experience for various reporting needs. There’d be many uses for continuous queries from an analytics point of view.
Chunked HTTP Responses
Again, for speed and efficiency, InfluxDB will send data back out as it’s ready in batches. This greatly enhances user experience because they won’t need to wait as long for data to load. It also helps the system scale by not sending huge amounts of data all at once. The system resources won’t spike as much. It’s likely already done by these applications with the other databases, but it is not something the database really supports itself. So this would also be a time saving feature for the developers of the chat application too.
Chunked HTTP responses would also be a very nice feature to have when considering mobile devices on slower connections.
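For comparison, here is roughly what sending batches yourself looks like with Go's standard net/http package. It is a sketch under assumptions: the handler name, the batch size parameter, the load callback, and the newline-delimited JSON format are all my own choices, not anything InfluxDB prescribes.

```go
package main

import (
	"encoding/json"
	"net/http"
	"time"
)

// Event is a minimal timestamped entry (hypothetical shape).
type Event struct {
	Time time.Time `json:"time"`
	Body string    `json:"body"`
}

// Batches splits events into consecutive slices of at most size items.
func Batches(events []Event, size int) [][]Event {
	if size <= 0 {
		size = 1
	}
	var out [][]Event
	for len(events) > size {
		out = append(out, events[:size])
		events = events[size:]
	}
	if len(events) > 0 {
		out = append(out, events)
	}
	return out
}

// StreamHandler writes each batch as one line of JSON and flushes it
// right away. With no Content-Length set, net/http sends an HTTP/1.1
// response like this using chunked transfer encoding.
func StreamHandler(load func() []Event, size int) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/x-ndjson")
		flusher, canFlush := w.(http.Flusher)
		enc := json.NewEncoder(w)
		for _, batch := range Batches(load(), size) {
			if err := enc.Encode(batch); err != nil {
				return // the client most likely went away
			}
			if canFlush {
				flusher.Flush()
			}
		}
	}
}
```

The client can start rendering the first batch while the rest are still on the way, which is the whole point on a slow mobile connection.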
InfluxDB allows you to specify a period of time to keep data. It’ll automatically throw out older data. This is wonderful when you’re trying to build chat as a service. It takes care of that cleanup and puts a limit on accounts based on the pricing plan.
Of course you could take the older history and move it to S3 as well. The beauty here again is that the developers building the chat application now have less work to do. It also is built into InfluxDB, unlike other databases, so the performance is likely to be better than mass deletes on other databases. It was simply built for this task and it’s a task we need in a chat application.
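To show the kind of cleanup job this saves you from writing, here is a sketch of per-plan retention in plain Go. The plan names and durations are invented for illustration, and Prune is a hypothetical helper, not an InfluxDB call.

```go
package main

import (
	"sort"
	"time"
)

// Event is a minimal timestamped entry (hypothetical shape).
type Event struct {
	Time time.Time
	Body string
}

// Retention maps a pricing plan to how long its history is kept.
// Plan names and durations are made up for this example.
var Retention = map[string]time.Duration{
	"free": 7 * 24 * time.Hour,
	"team": 90 * 24 * time.Hour,
}

// Prune returns the events still inside the plan's retention window,
// measured back from now. It assumes events is sorted by Time, oldest
// first. Unknown plans keep everything.
func Prune(events []Event, plan string, now time.Time) []Event {
	keep, ok := Retention[plan]
	if !ok {
		return events
	}
	cutoff := now.Add(-keep)
	// Find the first event that is not older than the cutoff.
	i := sort.Search(len(events), func(i int) bool {
		return !events[i].Time.Before(cutoff)
	})
	return events[i:]
}
```

Anything before the cutoff is what you would archive to S3 or simply drop. Doing this yourself means running it on a schedule for every account, which is exactly the chore a built-in retention setting removes.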
PubSub & Custom Functions
These features are coming in the future and will also obviously help a chat application. I won’t really get into great detail about the possibilities, but you could use your imagination.
Google’s Go language is becoming quite popular and InfluxDB and BoltDB both use it (so does the bleve search engine). I’d naturally build the chat application using Go.
I’d also look into the wonderful go-json-rest package for the API (my go-to). This package also provides a streaming API, which is very nice, especially for chat.
Between chunked HTTP queries and a streaming API…Wow, we really do have a wonderful real-time streaming service on our hands here. Remember to go beyond chat. We aren’t just sending text messages back and forth between people. We’ll be doing so much more and now we have an extremely strong foundation to support our robust application.
Go is extremely fast and it also happens to create wonderful binaries that can run on a variety of platforms. This makes deployment easy. It also happens to compile super fast (again making deployment easy). It now even supports Android.
When you’re talking about speed, concurrency, and pushing data around, accept no substitutes. Go is where it’s at.
This would be the back-end stack that I would use. Basically it’s 100% Go when you get down to it. From a business perspective you know you’re going to need all Go developers. You’ll likely even need fewer of them now to build the application.
You don’t need to worry about how one piece of technology in the stack works with another. All of those typical concerns when mixing technologies are a non-issue because everything is under the same language now.
It also scales really well across hardware. It should lower the hosting costs too. Technically, you now don’t need a separate database server. It can all become one single app because InfluxDB can be embedded into your application. I’d still likely keep that separate, but if you know you’re trying to offer a smaller plan on a budget - you now have less server overhead. From a business perspective this means your profit margins are higher.
From a business perspective, this is simply a far more efficient way to build an application of this nature. Keep in mind this architecture also means your system admin and devops costs will be lowered.
In fact, I would suspect most teams could get by hosting such an application for $5-$10/mo. I personally recommend Digital Ocean for affordable and performant hosting. I think a 1GB or even 512MB RAM server would do the trick here.
Oh, right, I almost forgot. Speaking of devops, you can use the same InfluxDB to monitor the health of the application server(s) with Grafana. You can even build rules into the application to automatically scale and upgrade servers based on system resources. Or use Docker or something fancier if you like, but keep in mind that InfluxDB was built for system metrics and so it has some nice features for you there as well.
You will still need front-end development of course. I might use Node.js for the front-end application to easily get it on desktops across multiple platforms. Native mobile apps are fine as well, though you could use something like AppGyver with the Ionic Framework if you were really trying to be creative with your time. So there’s still a fair amount of work left, but the back-end is super solid with the above technology stack.
It is my belief (and passion) that we need to think about more than just building an application these days. We really need to think about what technology choices mean from a business and user experience perspective. There are also plenty of creative solutions out there, given we have an almost infinite number of tools at our disposal.
This is how I would build a “chat” application, though it’d really do a whole lot more than just chat.