Fixed typos and articles.

This commit is contained in:
Marek Majkowski 2010-10-08 12:01:32 +01:00
parent 8bfa40e87f
commit 6cce9c05e3
3 changed files with 76 additions and 75 deletions


@@ -60,15 +60,15 @@ Introduction
 RabbitMQ is a message broker. The principal idea is pretty simple: it accepts
 and forwards messages. You can think about it as a post office: when you send
-mail to the post box and you're pretty sure that Mr. Postman will eventually
+mail to the post box, you're pretty sure that Mr. Postman will eventually
 deliver the mail to your recipient. Using this metaphor RabbitMQ is a post box,
-post office and a postman.
+a post office and a postman.
 The major difference between RabbitMQ and the post office is the fact that it
 doesn't deal with paper; instead it accepts, stores and forwards binary
 blobs of data - _messages_.
-RabbitMQ uses a weird jargon, but it's simple once you'll get it. For example:
+RabbitMQ uses a specific jargon. For example:
 * _Producing_ means nothing more than sending. A program that sends messages
 is a _producer_. We'll draw it like this, with "P":
@@ -85,7 +85,7 @@ RabbitMQ uses a weird jargon, but it's simple once you'll get it. For example:
 is not bound by any limits, it can store as many messages as you
 like - it's essentially an infinite buffer. Many _producers_ can send
 messages that go to one queue, and many _consumers_ can try to
-receive data from one _queue_. Queue'll be drawn as like that, with
+receive data from one _queue_. A queue will be drawn like this, with
 its name above it:
 {% dot -Gsize="10,0.9" -Grankdir=LR%}
@@ -98,7 +98,7 @@ RabbitMQ uses a weird jargon, but it's simple once you'll get it. For example:
 }
 {% enddot %}
-* _Consuming_ has a similar meaning to receiving. _Consumer_ is a program
+* _Consuming_ has a similar meaning to receiving. A _consumer_ is a program
 that mostly waits to receive messages. On our drawings it's shown with "C":
 {% dot -Gsize="10,0.3" -Grankdir=LR %}
@@ -136,10 +136,10 @@ messages from that queue.
 > #### RabbitMQ libraries
 >
-> RabbitMQ speaks AMQP protocol. To use Rabbit you'll need a library that
-> understands the same protocol as Rabbit. There is a choice of libraries
-> for almost every programming language. Python it's not different and there is
-> a bunch of libraries to choose from:
+> RabbitMQ speaks a protocol called AMQP. To use Rabbit you'll need a library
+> that understands the same protocol as Rabbit. There is a choice of libraries
+> for almost every programming language. Python is no different, and there
+> are a bunch of libraries to choose from:
 >
 > * [py-amqplib](http://barryp.org/software/py-amqplib/)
 > * [txAMQP](https://launchpad.net/txamqp)
@@ -202,12 +202,12 @@ the message will be delivered, let's name it _test_:
 At that point we're ready to send a message. Our first message will
-just contain a string _Hello World!_, we want to send it to our
+just contain a string _Hello World!_ and we want to send it to our
 _test_ queue.
-In RabbitMQ a message never can be send directly to the queue, it always
+In RabbitMQ a message can never be sent directly to a queue, it always
 needs to go through an _exchange_. But let's not get dragged down by the
-details - you can read more about _exchanges_ in [third part of this
+details - you can read more about _exchanges_ in [the third part of this
 tutorial]({{ python_three_url }}). All we need to know now is how to use the default exchange
 identified by an empty string. This exchange is special - it
 allows us to specify exactly to which queue the message should go.
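As a rough sketch of this idea (a toy model only, not the pika API or a real broker), the default exchange behaves as if the routing key selected the queue directly:

```python
# Toy model of the default (nameless) exchange: the routing key
# names the destination queue. Illustrative only - a real client
# such as pika talks AMQP to a broker, it does not use a dict.
queues = {'test': []}

def publish(exchange, routing_key, body):
    if exchange == '':
        # The default exchange routes straight to the queue
        # whose name equals the routing key.
        queues[routing_key].append(body)

publish(exchange='', routing_key='test', body='Hello World!')
print(queues['test'])  # ['Hello World!']
```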
@@ -246,7 +246,7 @@ them on the screen.
 Again, first we need to connect to the RabbitMQ server. The code
 responsible for connecting to Rabbit is the same as previously.
-Next step, just like before, is to make sure that the
+The next step, just like before, is to make sure that the
 queue exists. Creating a queue using `queue_declare` is idempotent - we can
 run the command as many times as we like, and only one will be created.
@@ -258,7 +258,7 @@ You may ask why to declare the queue again - we have already declared it
 in our previous code. We could have avoided that if we were sure
 that the queue already exists. For example if the `send.py` program was
 run before. But we're not yet sure which
-program to run as first. In such case it's a good practice to repeat
+program to run first. In such cases it's a good practice to repeat
 declaring the queue in both programs.
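The idempotency can be pictured with a toy model (a sketch only; on a real broker it's pika's `channel.queue_declare` that behaves this way):

```python
# Toy illustration of idempotent queue declaration: declaring the
# same queue twice still leaves exactly one queue, so both programs
# may safely declare it.
queues = set()

def queue_declare(queue):
    queues.add(queue)  # re-declaring an existing queue is a no-op

queue_declare('test')
queue_declare('test')  # repeated declaration - nothing changes
print(len(queues))  # 1
```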
 > #### Listing queues


@@ -19,19 +19,19 @@ digraph G {
 }
 {% enddot %}
-In [previous part]({{ python_two_url}}) of this tutorial we've learned how
-to create a task queue. The core assumption behind a task queue is that a task
+In the [previous part]({{ python_two_url}}) of this tutorial we created
+a task queue. The core assumption behind a task queue is that a task
 is delivered to exactly one worker. In this part we'll do something completely
 different - we'll try to deliver a message to multiple consumers. This
 pattern is known as "publish-subscribe".
-To illustrate this this tutorial, we're going to build a simple
-logging system. It will consist of two programs - first will emit log
-messages and second will receive and print them.
+To illustrate this, we're going to build a simple
+logging system. It will consist of two programs - the first will emit log
+messages and the second will receive and print them.
 In our logging system every running copy of the receiver program
-will be able to get the same messages. That way we'll be able to run one
-receiver and direct the logs to disk, in the same time we'll be able to run
+will get the same messages. That way we'll be able to run one
+receiver and direct the logs to disk; and at the same time we'll be able to run
 another receiver and see the same logs on the screen.
 Essentially, emitted log messages are going to be broadcast to all
@@ -42,14 +42,14 @@ Exchanges
 ---------
 In previous parts of the tutorial we learned how to send and
-receive messages. Now it's time to introduce the full messaging model
-in Rabbit.
+receive messages to and from a queue. Now it's time to introduce
+the full messaging model in Rabbit.
-Let's quickly remind what we've learned:
+Let's quickly cover what we've learned:
-* _Producer_ is user application that sends messages.
-* _Queue_ is a buffer that stores messages.
-* _Consumer_ is user application that receives messages.
+* A _producer_ is a user application that sends messages.
+* A _queue_ is a buffer that stores messages.
+* A _consumer_ is a user application that receives messages.
 The core idea in the messaging model in Rabbit is that the
@@ -62,7 +62,7 @@ exchange is a very simple thing. On one side it receives messages from
 producers and on the other side it pushes them to queues. The exchange
 must know exactly what to do with a received message. Should it be
 appended to a particular queue? Should it be appended to many queues?
-Or should it get silently discarded. The exact rules for that are
+Or should it get discarded? The exact rules for that are
 defined by the _exchange type_.
 {% dot -Gsize="10,1.3" -Grankdir=LR %}
@@ -107,8 +107,8 @@ queues it knows. And that's exactly what we need for our logger.
 > amq.headers headers
 > ...done.
 >
-> You can see a few `amq.` exchanges. They're created by default, but with a
-> bit of luck you'll never need to use them.
+> You can see a few `amq.` exchanges. They're created by default, but
+> chances are you'll never need to use them.
 <div></div>
@@ -123,7 +123,7 @@ queues it knows. And that's exactly what we need for our logger.
 > routing_key='test',
 > body=message)
 >
-> The _empty string_ exchange is special: message is
+> The _empty string_ exchange is special: messages are
 > routed to the queue with the name specified by `routing_key`.
@@ -131,15 +131,16 @@ queues it knows. And that's exactly what we need for our logger.
 Temporary queues
 ----------------
-As you may remember previously we were using queues which had a specified name -
-`test` in first `task_queue` in second tutorial. Being able to name a
+As you may remember, previously we were using queues which had a specified name
+(remember `test` and `task_queue`?). Being able to name a
 queue was crucial for us - we needed to point the workers to the same
-queue. Essentially, giving a queue a name is important when you don't
-want to loose any messages when the consumer disconnects.
-But that's not true for our logger. We do want to hear only about
-currently flowing log messages, we do not want to hear the old
-ones. To solve that problem we need two things.
+queue. Essentially, giving a queue a name is important when you
+want to share the queue between multiple consumers.
+But that's not true for our logger. We do want to hear about all
+currently flowing log messages, we do not want to receive only a subset
+of messages. We're also interested only in currently flowing messages,
+not in the old ones. To solve that we need two things.
 First, whenever we connect to Rabbit we need a fresh, empty queue. To do it
 we could create a queue with a random name, or, even better - let the server
@@ -179,7 +180,7 @@ digraph G {
 We've already created a fanout exchange and a queue. Now we need to
-tell the exchange to send messages to our queue. That relationship,
+tell the exchange to send messages to our queue. That relationship
 between an exchange and a queue is called a _binding_.
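Bindings plus a fanout exchange can be sketched as a toy model (illustrative only; in pika the real call is `channel.queue_bind(exchange=..., queue=...)`, and the queue names here are made up):

```python
# Toy model of a fanout exchange with bindings: every queue bound
# to the exchange receives its own copy of each published message.
bindings = {'logs': []}          # exchange name -> bound queue names
queues = {'q1': [], 'q2': []}    # queue name -> buffered messages

def queue_bind(exchange, queue):
    bindings[exchange].append(queue)

def publish(exchange, body):
    for name in bindings[exchange]:  # fanout: copy to all bound queues
        queues[name].append(body)

queue_bind('logs', 'q1')
queue_bind('logs', 'q2')
publish('logs', 'info: hello')
print(queues['q1'], queues['q2'])  # both queues got the message
```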
 {% highlight python %}
@@ -193,7 +194,7 @@ our queue.
 > #### Listing bindings
 >
 > You can list existing bindings using, you guessed it,
-> `rabbitmqctl list_bindings` command.
+> `rabbitmqctl list_bindings`.
 Putting it all together
@@ -226,9 +227,9 @@ digraph G {
 The producer program, which emits log messages, doesn't look much
 different than in the previous tutorial. The most important change is
-that, we now need to publish messages to `logs` exchange instead of
-the nameless one. We need to supply the `routing_key` parameter, but
-it's value is ignored for `fanout` exchanges. Here goes the code for
+that we now need to publish messages to the `logs` exchange instead of
+the nameless one. We need to supply a `routing_key`, but
+its value is ignored for `fanout` exchanges. Here is the code for the
 `emit_log.py` script:
 {% highlight python linenos=true %}
@@ -249,9 +250,10 @@ it's value is ignored for `fanout` exchanges. Here goes the code for
 {% endhighlight %}
 [(emit_log.py source)]({{ examples_url }}/python/emit_log.py)
-As you see, we avoided declaring exchange. If the `logs` exchange
-isn't created at the time this code is executed the message will be
-lost. That's okay for us.
+As you see, we avoided declaring the exchange here. If the `logs` exchange
+isn't created at the time this code is executed, the message will be
+lost. That's okay for us - if no consumer is listening yet (i.e.
+the exchange hasn't been created) we can discard the message.
 The code for `receive_logs.py`:


@@ -16,8 +16,8 @@ digraph G {
 {% enddot %}
-In the [first part of this tutorial]({{ python_one_url }}) we've learned how to send
-and receive messages from a named queue. In this part we'll create a
+In the [first part of this tutorial]({{ python_one_url }}) we've learned how
+to send and receive messages from a named queue. In this part we'll create a
 _Task Queue_ that will be used to distribute time-consuming work across multiple
 workers.
@@ -25,7 +25,8 @@ The main idea behind Task Queues (aka: _Work Queues_) is to avoid
 doing resource-intensive tasks immediately. Instead we schedule a task to
 be done later on. We encapsulate a _task_ as a message and save it to
 the queue. A worker process running in the background will pop the tasks
-and eventually execute the job.
+and eventually execute the job. When you run many workers, the work will
+be shared between them.
 This concept is especially useful in web applications where it's
 impossible to handle a complex task during a short HTTP request
@@ -38,16 +39,17 @@ Preparations
 In the previous part of this tutorial we were sending a message containing
 a "Hello World!" string. Now we'll be sending strings that
 stand for complex tasks. We don't have any real hard tasks, like
-image to be resized or pdf files to be rendered, so let's fake it by just
+images to be resized or pdf files to be rendered, so let's fake it by just
 pretending we're busy - by using the `time.sleep()` function. We'll take
 the number of dots in the string as its complexity; every dot will
 account for one second of "work". For example, a fake task described
-by `Hello!...` string will take three seconds.
+by the `Hello...` string will take three seconds.
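The dot-counting rule is a one-liner in Python; the sleep call is what the worker will eventually run on the result:

```python
# Number of dots in the message body determines the fake task's
# duration in seconds; the worker will call time.sleep(complexity(body)).
def complexity(body):
    return body.count('.')

print(complexity('Hello...'))      # 3 - three seconds of "work"
print(complexity('Hello World!'))  # 0 - no dots, instant task
```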
-We need to slightly modify our `send.py` code, to allow sending
-arbitrary messages from command line:
+We need to slightly modify the _send.py_ code from the previous example
+to allow sending arbitrary messages from the command line. This program
+will schedule tasks to our work queue, so let's name it `new_task.py`:
-{% highlight python %}
+{% highlight python linenos=true linenostart=12 %}
 import sys
 message = ' '.join(sys.argv[1:]) or "Hello World!"
 channel.basic_publish(exchange='', routing_key='test',
@@ -56,10 +58,11 @@ arbitrary messages from command line:
 {% endhighlight %}
-Our `receive.py` script also requires some changes: it needs to fake a
-second of work for every dot in the message body:
+Our old _receive.py_ script also requires some changes: it needs to fake a
+second of work for every dot in the message body. It will pop messages
+from the queue and do the task, so let's call it `worker.py`:
-{% highlight python %}
+{% highlight python linenos=true linenostart=13 %}
 import time
 def callback(ch, method, header, body):
@@ -76,7 +79,7 @@ One of the advantages of using Task Queue is the
 ability to easily parallelize work. If we have too much work for us to
 handle, we can just add more workers and scale easily.
-First, let's try to run two `worker.py` scripts in the same time. They
+First, let's try to run two `worker.py` scripts at the same time. They
 will both get messages from the queue, but how exactly? Let's
 see.
@@ -86,8 +89,6 @@ script. These consoles will be our two consumers - C1 and C2.
 shell1$ ./worker.py
 [*] Waiting for messages. To exit press CTRL+C
-&nbsp;
 shell2$ ./worker.py
 [*] Waiting for messages. To exit press CTRL+C
@@ -109,8 +110,6 @@ Let's see what is delivered to our workers:
 [x] Received 'Third message...'
 [x] Received 'Fifth message.....'
-&nbsp;
 shell2$ ./worker.py
 [*] Waiting for messages. To exit press CTRL+C
 [x] Received 'Second message..'
@@ -128,27 +127,27 @@ Doing our tasks can take a few seconds. You may wonder what happens if
 one of the consumers gets a hard job and dies while doing it. With
 our current code, once RabbitMQ delivers a message to the consumer it
 immediately removes it from memory. In our case if you kill a worker
-we will loose the message it was just processing. We'll also loose all
+we will lose the message it was just processing. We'll also lose all
 the messages that were dispatched to this particular worker and not
 yet handled.
-But we don't want to loose any task. If a workers dies, we'd like the task
+But we don't want to lose any task. If a worker dies, we'd like the task
 to be delivered to another worker.
 In order to make sure a message is never lost, RabbitMQ
-supports message _acknowledgments_. It's an information,
+supports message _acknowledgments_. It's a bit of data,
 sent back from the consumer, which tells Rabbit that a particular message
 has been received, fully processed and that Rabbit is free to delete
 it.
 If a consumer dies without sending an ack, Rabbit will understand that a
-message wasn't processed fully and will redispatch it to another
+message wasn't processed fully and will redeliver it to another
 consumer. That way you can be sure that no message is lost, even if
 the workers occasionally die.
-There aren't any message timeouts, Rabbit will redispatch the
-message only when the worker connection dies. It's fine if processing
-a message takes even very very long time.
+There aren't any message timeouts; Rabbit will redeliver the
+message only when the worker connection dies. It's fine even if processing
+a message takes a very, very long time.
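The ack contract can be sketched as a toy model (a sketch only, not the broker's implementation; in pika the real call is `ch.basic_ack(delivery_tag=...)`, and unacked messages are requeued when the connection drops):

```python
# Toy model of acknowledgments: a message is forgotten only after
# the consumer acks it; if the consumer dies first, the message is
# requeued for redelivery to another consumer.
queue = ['task1']
unacked = {}

def deliver(tag):
    unacked[tag] = queue.pop(0)   # handed to a consumer, not yet acked

def basic_ack(tag):
    del unacked[tag]              # now Rabbit is free to delete it

def connection_died():
    # everything the dead consumer never acked goes back on the queue
    while unacked:
        _, body = unacked.popitem()
        queue.insert(0, body)

deliver(tag=1)
connection_died()   # worker was killed before acking
print(queue)        # ['task1'] - the task was not lost
```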
 Message acknowledgments are turned on by default. Though, in previous
@@ -169,7 +168,7 @@ once we're done with a task.
 Using that code we may be sure that even if you kill a worker using
 CTRL+C while it was processing a message, nothing will be lost. Soon
-after the worker dies all unacknowledged messages will be redispatched.
+after the worker dies all unacknowledged messages will be redelivered.
 > #### Forgotten acknowledgment
 >
@@ -179,8 +178,8 @@ after the worker dies all unacknowledged messages will be redispatched.
 > Rabbit will eat more and more memory as it won't be able to release
 > any unacked messages.
 >
-> In order to debug this kind of mistakes you may use `rabbitmqctl`
-> to print `messages_unacknowledged` field:
+> In order to debug this kind of mistake you can use `rabbitmqctl`
+> to print the `messages_unacknowledged` field:
 >
 > $ sudo rabbitmqctl list_queues name messages_ready messages_unacknowledged
 > Listing queues ...
@@ -198,7 +197,7 @@ When RabbitMQ quits or crashes it will forget the queues and messages
 unless you tell it not to. Two things are required to make sure that
 messages aren't lost: we need to mark both the queue and the messages as durable.
-First, we need to make sure that Rabbit will never loose our `test`
-+First, we need to make sure that Rabbit will never lose our `test`
 queue. In order to do so, we need to declare it as _durable_:
 channel.queue_declare(queue='test', durable=True)
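The message side of durability can be pictured with a toy model (illustrative only; in pika you mark a message persistent by passing `delivery_mode=2` in `pika.BasicProperties` when publishing):

```python
# Toy model of durability: after a broker "restart", only messages
# marked persistent (delivery_mode=2) in a durable queue survive.
queue = [('task.', 2), ('task..', 1)]   # (body, delivery_mode)

def restart(messages):
    # delivery_mode 2 means "persistent"; 1 means "transient"
    return [m for m in messages if m[1] == 2]

queue = restart(queue)
print(queue)  # only the persistent message survived
```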
@@ -248,9 +247,9 @@ Rabbit doesn't know anything about that and will still dispatch
 messages evenly.
 That happens because Rabbit dispatches a message as soon as it
-enters the queue. It doesn't look in number of unacknowledged messages
-for a consumer. It just blindly dispatches every n-th message to a
-every consumer.
+enters the queue. It doesn't look at the number of unacknowledged messages
+for a consumer. It just blindly dispatches every n-th message to
+the n-th consumer.
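This blind round-robin can be sketched in a few lines (a toy model, not the broker; the consumer names are made up):

```python
# Toy model of Rabbit's default dispatch: the n-th message goes to
# the n-th consumer in turn, regardless of how busy each one is.
from itertools import cycle

consumers = {'C1': [], 'C2': []}
rr = cycle(consumers)  # round-robin over consumer names

for msg in ['First.', 'Second..', 'Third...', 'Fourth....']:
    consumers[next(rr)].append(msg)  # blind round-robin dispatch

print(consumers['C1'])  # ['First.', 'Third...']
print(consumers['C2'])  # ['Second..', 'Fourth....']
```

Note that C1 gets the odd-numbered messages even if `Third...` is a heavy task while C2 sits idle, which is exactly the problem `prefetch_count=1` addresses below.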
 {% dot -Gsize="10,1.3" -Grankdir=LR %}
 digraph G {
@@ -273,7 +272,7 @@ In order to defeat that we may use `basic.qos` method with the
 `prefetch_count=1` setting. That allows us to tell Rabbit not to give
 more than one message to a worker at a time. Or, in other words, don't
 dispatch a new message to a worker until it has processed and
-acknowledged previous one.
+acknowledged the previous one.
 {% highlight python %}
 channel.basic_qos(prefetch_count=1)