October 29, 2007

What is Boost.Asio, and why should we use it

Writing portable networking code in C++ is a problem with a long history, and many libraries have been developed to solve it. In my opinion, Boost.Asio is the best of the existing implementations. A number of libraries and applications are already built on Boost.Asio (libpion, libtorrent), and more are in development. I have also designed and developed, on top of Boost.Asio, a filtering subsystem for the web-filtering product SKVT "Dozor-Jet", aka WebBoss.
Boost.Asio has the following main features:
  • lets you write cross-platform networking code that works on most existing platforms: Windows, Unix-like systems, Tru64, QNX, etc.
  • supports both IPv4 and IPv6
  • supports TCP and UDP
  • supports asynchronous operations
  • provides std::iostream-compatible interfaces
  • supports SSL connections
  • supports delayed operations (timers)
For me, the main advantage of Boost.Asio (besides being cross-platform) is that on each platform it uses the most efficient strategy available (epoll on Linux 2.6, kqueue on FreeBSD/Mac OS X, overlapped I/O on MS Windows), and that the library lets you choose between different styles of work: synchronous or asynchronous socket operations, or stream-based input/output compatible with std::iostream. These styles can be mixed in one application: for example, you can accept connections in async mode and then start a thread that performs input/output in sync mode (see the test-otpc.cpp example).
To demonstrate Boost.Asio's features I wrote several examples (some parts were adapted from the Boost.Asio examples) that implement different data handling strategies (more examples can be found on the Boost.Asio home page). I used these examples to choose appropriate data handling strategies for my own applications. Each example implements a "stupid" web server that reads a request and returns a single page, regardless of the address specified in the request. The following examples are currently published (all sources include the file common.h):
  • test-mcmt.cpp - implements the Many Connections/Many Threads strategy: it runs several threads, each of which accepts connections and performs input/output in async mode;
  • test-otpc.cpp - implements the One Thread per Connection strategy: one thread accepts connections in async mode and then starts a thread in which input/output is performed in sync mode (the code for this is in the files test-otpc-conn.cpp & test-otpc-conn.hpp);
  • test-otpc-tp.cpp - almost the same as test-otpc.cpp, but instead of a dedicated thread for every connection it uses a pool of threads implemented by the threadpool library.
To build the examples (all sources are here) you need Boost 1.35, which at the moment is available only from the Boost svn trunk. I hope that version 1.35 will be released in the near future, so you will not need to fetch and compile it manually.
I think this note could become the basis for a more complete article, but that is a task for the future. For now you can ask me about Boost.Asio directly :-)

22 comments:

Sander said...

Alex, the external links in your article are not working. I'm very interested in seeing your examples, can you fix this for me?

Alex Ott said...

2sander:
I have uploaded the examples to http://alexott.googlepages.com/testpage - I have a problem with my current hosting and am looking for a new one

Sander said...

Excellent, and thank you for your fast response!

tom said...

Hello Alex,

which one do You think will be faster? which is better?

boost::asio::async_read(socket_,
    boost::asio::buffer(buffer_.data(), max_data_length),
    boost::bind(&AnldConnection::handle_read, shared_from_this(),
        boost::asio::placeholders::error));


or this one


boost::asio::async_read_until(socket_, buf, boost::regex("\r\n\r\n"),
    boost::bind(&connection::handle_read, shared_from_this(),
        ba::placeholders::error,
        ba::placeholders::bytes_transferred));

Alex Ott said...

2tom: the second example is better from a high-level point of view, but for performance it is better to use fixed-size buffers and do the parsing manually (some details here), although this requires some additional work. Please look at the code from my small article about creating an HTTP proxy with asio
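
A minimal sketch of the manual parsing mentioned above, under stated assumptions: data arrives in fixed-size chunks, and the only thing being detected is the "\r\n\r\n" that ends an HTTP header block. The class name HeaderEndFinder and its methods are hypothetical and not part of Boost.Asio; the sketch uses only the standard library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of Boost.Asio): finds the end of an HTTP
// header block ("\r\n\r\n") in data that arrives chunk by chunk.
class HeaderEndFinder {
public:
    // Feed the next chunk. Returns true once the terminator has been seen.
    bool feed(const char* data, std::size_t len) {
        assert(data != nullptr || len == 0);
        static const char kEnd[] = "\r\n\r\n";
        for (std::size_t i = 0; i < len && matched_ < 4; ++i) {
            ++consumed_;
            if (data[i] == kEnd[matched_]) {
                ++matched_;                            // one more terminator byte matched
            } else {
                matched_ = (data[i] == '\r') ? 1 : 0;  // restart, possibly on a new '\r'
            }
        }
        return matched_ == 4;
    }

    // Number of bytes up to and including the terminator.
    // Only meaningful after feed() has returned true.
    std::size_t header_size() const { return consumed_; }

private:
    std::size_t matched_ = 0;   // how many terminator bytes are currently matched
    std::size_t consumed_ = 0;  // bytes examined so far
};
```

Because the match state is kept between calls, a terminator split across two chunks is still found. With asio, each chunk would typically be whatever a read into a fixed-size array returned.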

tom said...

once again me ;)

Alex, You must be fluent in the client-server & boost.asio field, so I have a few more questions.
(I hope You will find some time to answer)

I still have a problem understanding the difference between synchronous & asynchronous methods.
I know what the benefits are if I split client-server communication and calculation (or some other job which the server is doing) into two
separate threads. simply, calculations are not delayed by client-server communication.
but I have a problem with synchronous & asynchronous in the boost.asio context.

I split the server job and communication into different threads - the server uses a single io_service and a thread pool calling io_service::run(), and in a few other threads the server performs its job.
The question is what the difference is between read & async_read. I'm already reading in a separate thread (separate from calculation). So, what is the advantage of using async_read? is this thread able to start communicating with another client while the async_read from the previous one is unfinished? or maybe something else?

another question: how is this "async" achieved? by nonblocking sockets?

tom

Alex Ott said...

2tom:
The main difference is that with a sync read your thread is blocked until the data arrives. With async_read the reading thread does not block, which lets you get by with fewer reading threads and keeps resource consumption low.
The optimal model could be the following:
- you create one or more reading/writing threads (IO threads), possibly listening on different ports, that read data asynchronously
- you create a pool of threads that perform long-running operations
- when new data arrives, you start processing it on one of the threads from the pool; when processing is finished, you call back into the IO thread and pass it the results of the calculation.
Using this approach you can keep your resource consumption low.
For sync IO you need to create a separate thread for every connection and perform all calculations in the same thread where the IO is performed. This design is much simpler to implement, but it does not scale well

tom said...

With async_read the reading thread does not block, which lets you get by with fewer reading threads

ok, but why? I would like to know the mechanism.

the thread does not block - it means that it is able to perform another job.
what is this job in the boost.asio machinery?
So my question is:
is this thread able to start communicating with another client while the async_read from the previous one is unfinished? or not?
if not, what kind of job is the thread doing during async_read?

tom

Alex Ott said...

asio implements async operations through a dispatcher object, the io_service: you only initiate an operation, and when it finishes, the io_service dispatches the completion to your handler in one of the threads that called io_service::run() (the underlying mechanism differs between platforms)
So you can initiate several async_read operations, and your code will be called only when the requested number of bytes has been received into your buffer (or an error occurs). In the meantime the thread can perform other operations
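
To make the mechanism more concrete, here is a toy, single-threaded model of the idea, written with the standard library only. It is not how asio is implemented: ToyDispatcher, bytes_arrived and the other names are hypothetical, and the "arrival" of bytes is simulated by hand. It only shows why one thread can have several unfinished reads at once: starting a read just records it, and handlers run later, when run() is called.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <utility>

// Toy model of completion-handler dispatch; NOT Asio's implementation.
class ToyDispatcher {
public:
    using Handler = std::function<void(std::size_t)>;

    // Start an "operation" that completes once `wanted` bytes have arrived.
    // Returns immediately with an operation id.
    int async_read(std::size_t wanted, Handler handler) {
        assert(wanted > 0);
        const int id = next_id_++;
        pending_.emplace(id, Op{wanted, 0, std::move(handler)});
        return id;
    }

    // Simulates the OS reporting that `n` bytes arrived for operation `id`.
    void bytes_arrived(int id, std::size_t n) {
        auto it = pending_.find(id);
        if (it == pending_.end()) return;
        it->second.received += n;
        if (it->second.received >= it->second.wanted) {
            const std::size_t got = it->second.received;
            Handler h = std::move(it->second.handler);
            ready_.push([h, got] { h(got); });  // queue the completion
            pending_.erase(it);
        }
    }

    // Plays the role of the event loop: invokes the handlers of finished
    // operations and returns how many were invoked.
    std::size_t run() {
        std::size_t count = 0;
        while (!ready_.empty()) {
            std::function<void()> job = std::move(ready_.front());
            ready_.pop();
            job();
            ++count;
        }
        return count;
    }

private:
    struct Op {
        std::size_t wanted;
        std::size_t received;
        Handler handler;
    };
    std::map<int, Op> pending_;                // operations still waiting for data
    std::queue<std::function<void()>> ready_;  // completions waiting to be dispatched
    int next_id_ = 0;
};
```

In this model, two reads for two different clients can be pending at once, and the handler of whichever read finishes first runs first.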

Alex Ott said...

2tom: you can find basic information about the asio approach by searching for "proactor pattern"
there are also links in the boost.asio design document, which you can find on asio's site

tom said...

thanks a lot Alex for your replies,

I have one more question,

after reading the "small article" I wonder what the recipe is for keeping a connection persistent for a client which wants to send more data without closing the connection. should boost.asio users remember about some conditions?

I have tried it this way (please see below), but unfortunately it doesn't work


my server is using a single io_service and a thread pool calling io_service::run().

for (std::size_t i = 0; i < thread_pool_size_; ++i) {
    // each pool thread runs the io_service event loop
    boost::shared_ptr<boost::thread> thread(new boost::thread(
        boost::bind(&boost::asio::io_service::run, &io_service_)));
    threads.push_back(thread);
}

I'm using async_read & transfer_at_least

void AnldConnection::start() {
    boost::asio::async_read(socket_,
        boost::asio::buffer(buffer_), boost::asio::transfer_at_least(at_least_),
        boost::bind(&AnldConnection::handle_read, shared_from_this(),
            boost::asio::placeholders::error));
}

my handle_read, in short, looks like:

void AnldConnection::handle_read(const boost::system::error_code& e)
{
    result = request_parser_.parse(request_, buffer_.data());
    request_handler_.handle_request(request_, reply_);

    boost::asio::async_write(socket_, reply_.to_buffers(),
        strand_.wrap(
            boost::bind(&AnldConnection::handle_write, shared_from_this(),
                boost::asio::placeholders::error)));
}

the parse method parses the message. if it finds a "connection: keep-alive" line in the message, then it sets the isPersistent_ variable to true.
the handle_request method assembles the reply based on the request parameters.

and last one handle_write

void AnldConnection::handle_write(const boost::system::error_code& e) {
    if (request_.isPersistent_) {
        start();
    } else {
        boost::system::error_code ignored_ec;
        socket_.shutdown(boost::asio::ip::tcp::socket::shutdown_both, ignored_ec);
    }
}


in the sync client, after connecting to the server, I'm doing

for (n = 0; n < nqueries; n++) {
    boost::asio::write(socket, request);
    std::cout << "write done" << std::endl;

    boost::asio::streambuf response;
    boost::asio::read_until(socket, response, "\r\n\r\n");
    std::cout << "read done" << std::endl;
}

the result is:
write done
read done
write done

so, in the second cycle of the loop the client is able to write but the server doesn't read.
when I look at my logfile I see that the start() method is invoked (in the second cycle) but async_read doesn't read, or something else.

what do You think Alex, what could be wrong?

tom

tom said...

there are also links in boost.asio design document, that you can find on asio's site

Of course I have read the asio docs, but I'm a geneticist... and it is hard to understand the details.
The guys from my team can help me with c++ but not with client-server ideology.

so, I'm disturbing You Alex. ;)
hope You don't waste too much of Your time.

tom

Alex Ott said...

2tom: no problem with time ;-)
for the persistent connection - are you sure that the server is waiting for new data?
To make a connection persistent, the request/response must meet several conditions described in RFC 2616 (HTTP/1.1): an HTTP/1.0 request must carry Connection: keep-alive, while an HTTP/1.1 connection is persistent unless Connection: close is present. The response from the server must include a Content-Length header, and so on.
In the sync case, it seems that the server has already closed the connection (you need to check the result of the read operation), and the client drops out of the loop when it tries to write to the closed socket.
But for precise details, I need to look at the code...
P.S. do you really need C++ in your software - is it performance critical? I know that many scientists use functional languages, Haskell for example, which better match mathematical thinking ;-) I prefer to use functional programming myself when possible, especially at the start of a new project, when I need to create a lot of prototypes
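
The request-side keep-alive rule from RFC 2616 mentioned above can be sketched as a small function. This is a simplified illustration, not a complete HTTP implementation: the names is_persistent and to_lower are hypothetical, the Connection header is assumed to hold a single token, and the response-side conditions (such as Content-Length) are not checked.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: lower-cases an ASCII header value for comparison.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Decides whether the connection should stay open after this request.
// `minor_version` is the N in "HTTP/1.N"; `connection` is the value of the
// Connection header, or an empty string when the header is absent.
// Assumption: the header carries a single token ("close" or "keep-alive").
bool is_persistent(int minor_version, const std::string& connection) {
    assert(minor_version == 0 || minor_version == 1);
    const std::string value = to_lower(connection);
    if (value == "close") return false;  // explicit close always wins
    if (minor_version == 1) return true; // HTTP/1.1: persistent by default
    return value == "keep-alive";        // HTTP/1.0: only on explicit request
}
```

For example, is_persistent(1, "") is true, while is_persistent(0, "") is false.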

tom said...

>are you sure, that server awaiting for new data?
>(you need to check result of read operation)

what I'm sure of now is that the server doesn't close the connection (I added a log message to the destructor of the AnldConnection object).

I'm also checking the amount of data transferred and the error code (I have added boost::system::error_code error to the client's write and read_until).

result is:

write done: 79 | error_code: 0
read done: 26 | error_code: 0
write done: 0 | error_code: 0

so in the second loop I see that the write finished with success but... did not transfer any data.!?!

>But for precise details, i need to look to code...
give me only the little sign that You have time for that Alex ;)

tom

ps.
>C++ in your software - is it performance critical? I know,

I'm a geneticist, but actually (for about a year) I have been working as a programmer. Such is life in Poland - stay here as a scientist and become poor & never stop complaining, or emigrate, or do something else. I have switched to IT ;). I'm trying to code a system daemon, so c++ and high performance are needed.

>that many scientists use functional languages
heh, I have never had enough time to learn even one such language... :( (small tasks I have done in c++, huge ones in fortran).

tom said...

AGHRHRGAAGRAHAHAAAAAAAAAAA!!#@######

streams, streams, streams! I forgot that the request to the server is a stream

boost::asio::streambuf request;
std::size_t size;
std::ostream request_stream(&request);
request_stream << "command: NDSSTATUS\r\n";
request_stream << "parameters: ALL\r\n";
request_stream << "connection: keep-alive\r\n";
request_stream << "end_of_message\r\n\r\n";

for (n = 0; n < nqueries; n++) {
    boost::asio::write(socket, request);
    .
    .
    .
}

of course after the first write (in the first cycle of the loop) the stream is empty.
So, it is possible that the client transferred 0 bytes with success...
when I moved the request declaration & preparation inside the loop everything is OK.

at last... 3 days lost... no comment

once again, many thanks Alex for Your replies.

tom
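
The effect described in this comment can be reproduced with a toy model that uses only the standard library. ConsumingBuffer, fake_write and make_request are hypothetical names; the model only imitates the relevant behaviour (the streambuf overload of boost::asio::write consumes the data it sends) and is not boost::asio::streambuf itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy stand-in for a stream buffer whose contents are consumed when written.
// It only models the behaviour discussed here; it is not boost::asio::streambuf.
class ConsumingBuffer {
public:
    void append(const std::string& s) { data_ += s; }
    std::size_t size() const { return data_.size(); }
    void consume(std::size_t n) {
        assert(n <= data_.size());
        data_.erase(0, n);
    }

private:
    std::string data_;
};

// Hypothetical "socket write": pretends to send everything in the buffer
// and consumes what it sent. Returns the number of bytes "transferred".
std::size_t fake_write(ConsumingBuffer& buf) {
    const std::size_t n = buf.size();
    buf.consume(n);
    return n;
}

// Building the request anew for each iteration gives every write a full buffer.
ConsumingBuffer make_request() {
    ConsumingBuffer request;
    request.append("command: NDSSTATUS\r\n");
    request.append("parameters: ALL\r\n");
    request.append("connection: keep-alive\r\n");
    request.append("end_of_message\r\n\r\n");
    return request;
}
```

Writing the same ConsumingBuffer twice transfers its bytes the first time and zero bytes the second time, which matches the "write done: 0" line in the log above. Calling make_request() inside the loop avoids this.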

Alex Ott said...

that's what I meant when I said that I need the whole source file ;-)
P.S. if you have problems in the future, you can reach me via jabber at alexott@gmail.com
P.P.S. I'm a physicist (nuclear and plasma physics) myself, and also switched to IT many years ago

tom said...

>P.S. if you'll have problems in future, you can reach me via jabber

cool,
thanks a lot Alex.

unfortunately I don't even know what jabber is... (should I be ashamed?)

So, the last question here on the blog (the next ones will be via jabber).

do You know if boost has something like (I don't know what it is called) a job_scheduler? maybe You know it as some combination of the words job/event/task & manager/dispatcher/scheduler, e.g. eventmanager?

The daemon which I'm currently coding has to perform some tasks periodically or when invoked by a signal. So, I need some object where I can register some type of task, and this object takes care of invoking the task at the proper time.

I have grabbed EventManager from the 'kosmos file system' project (look if You want: http://sourceforge.net/projects/kosmosfs/). But I think this piece of code is not optimal. If boost contains something like that, I prefer to use the boost solution.

tom

Alex Ott said...

if you have a gmail account, then you already have a jabber account - you just need to install the google talk client, or use another one (for example http://psi-im.org). There are a lot of jabber servers around the world, and you can select any of them.

For the event manager, you can use boost::asio's timers, which let you specify when to invoke a given function.
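
Independent of asio, the kind of task registry asked about above can be sketched with a priority queue ordered by due time. This is a minimal illustration under stated assumptions: TaskScheduler and its methods are hypothetical, time is passed in explicitly as milliseconds, and nothing here waits on a real clock. In an asio-based daemon, a timer's wait handler would be the natural place to call run_due.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical periodic-task registry driven by an explicit clock (in ms).
class TaskScheduler {
public:
    using Task = std::function<void()>;

    // Register `task` to run every `period_ms`, first at now_ms + period_ms.
    void add_periodic(long now_ms, long period_ms, Task task) {
        assert(period_ms > 0);
        queue_.push(Entry{now_ms + period_ms, period_ms, std::move(task)});
    }

    // Run every task that is due at `now_ms`, re-arm it for its next period,
    // and return how many task invocations happened.
    int run_due(long now_ms) {
        int ran = 0;
        while (!queue_.empty() && queue_.top().due_ms <= now_ms) {
            Entry e = queue_.top();
            queue_.pop();
            e.task();
            ++ran;
            e.due_ms += e.period_ms;
            queue_.push(std::move(e));
        }
        return ran;
    }

private:
    struct Entry {
        long due_ms;
        long period_ms;
        Task task;
        bool operator>(const Entry& other) const { return due_ms > other.due_ms; }
    };
    // Min-heap: the entry with the earliest due time is on top.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> queue_;
};
```

A task that fell behind by several periods is run once per missed period here; whether to catch up like this or skip missed runs is a design choice.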

sdwdd said...

Hello, Alex.

Thanks for a great article!

Tried to compile the sources you've provided.
It does compile, but `test-mcmt` is not working. It just doesn't react to requests (although it binds to a port successfully). `test-otpc` and `test-otpc-tp` work okay.

I'm under gentoo linux with boost 1.38 installed.

Alex Ott said...

2sdwdd: It's very strange - all the code was written & debugged under Linux. I'll check this after I return from vacation

Alex Ott said...

2sdwdd: I just checked test-mcmt on my Mac OS X 10.4 with boost 1.38 (from svn trunk) - everything works fine with different numbers of clients.
If test-mcmt hangs on your system, then try to attach to it with gdb and obtain a stack trace

sdwdd said...

sorry, that was a problem with my internal configuration.

everything works perfect.
thanks!