diff --git a/python/tutorial-one.mdx b/python/tutorial-one.mdx
index 11e9fec..d062901 100644
--- a/python/tutorial-one.mdx
+++ b/python/tutorial-one.mdx
@@ -60,15 +60,15 @@ Introduction
 
 RabbitMQ is a message broker. The principle idea is pretty simple: it accepts
 and forwards messages. You can think about it as a post office: when you send
-mail to the post box and you're pretty sure that Mr. Postman will eventually
+mail to the post box, you're pretty sure that Mr. Postman will eventually
 deliver the mail to your recipient. Using this metaphor RabbitMQ is a post box,
-post office and a postman.
+a post office and a postman.
 
 The major difference between RabbitMQ and the post office is the fact that it
 doesn't deal with the paper, instead it accepts, stores and forwards binary
 blobs of data - _messages_.
 
-RabbitMQ uses a weird jargon, but it's simple once you'll get it. For example:
+RabbitMQ uses a specific jargon. For example:
 
  * _Producing_ means nothing more than sending. A program that sends messages
    is a _producer_. We'll draw it like that, with "P":
@@ -85,7 +85,7 @@ RabbitMQ uses a weird jargon, but it's simple once you'll get it. For example:
    is not bound by any limits, it can store how many messages you
    like - it's essentially an infinite buffer. Many _producers_ can send
    messages that go to the one queue, many _consumers_ can try to
-   receive data from one _queue_. Queue'll be drawn as like that, with
+   receive data from one _queue_. A queue will be drawn like that, with
    its name above it:
 
 {% dot -Gsize="10,0.9" -Grankdir=LR%}
@@ -98,7 +98,7 @@ RabbitMQ uses a weird jargon, but it's simple once you'll get it. For example:
 }
 {% enddot %}
 
- * _Consuming_ has a similar meaning to receiving. _Consumer_ is a program
+ * _Consuming_ has a similar meaning to receiving. A _consumer_ is a program
    that mostly waits to receive messages. On our drawings it's shown with "C":
 
 {% dot -Gsize="10,0.3" -Grankdir=LR %}
@@ -136,10 +136,10 @@ messages from that queue.
 > #### RabbitMQ libraries
 >
-> RabbitMQ speaks AMQP protocol. To use Rabbit you'll need a library that
-> understands the same protocol as Rabbit. There is a choice of libraries
-> for almost every programming language. Python it's not different and there is
-> a bunch of libraries to choose from:
+> RabbitMQ speaks a protocol called AMQP. To use Rabbit you'll need a library
+> that understands the same protocol as Rabbit. There is a choice of libraries
+> for almost every programming language. For Python it's no different and there
+> is a bunch of libraries to choose from:
 >
 > * [py-amqplib](http://barryp.org/software/py-amqplib/)
 > * [txAMQP](https://launchpad.net/txamqp)
@@ -202,12 +202,12 @@ the message will be delivered, let's name it _test_:
 
 At that point we're ready to send a message. Our first message will
-just contain a string _Hello World!_, we want to send it to our
+just contain a string _Hello World!_ and we want to send it to our
 _test_ queue.
 
-In RabbitMQ a message never can be send directly to the queue, it always
+In RabbitMQ a message can never be sent directly to the queue, it always
 needs to go through an _exchange_. But let's not get dragged by the
-details - you can read more about _exchanges_ in [third part of this
+details - you can read more about _exchanges_ in [the third part of this
 tutorial]({{ python_three_url }}). All we need to know now is how to use
 a default exchange identified by an empty string. This exchange is
 special - it allows us to specify exactly to which queue the message
 should go.
@@ -246,7 +246,7 @@ them on the screen.
 
 Again, first we need to connect to RabbitMQ server. The code
 responsible for connecting to Rabbit is the same as previously.
 
-Next step, just like before, is to make sure that the
+The next step, just like before, is to make sure that the
 queue exists. Creating a queue using `queue_declare` is idempotent - we
 can run the command as many times you like, and only one will be created.
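The behaviour this hunk describes - the default (empty-string) exchange routes a message to the queue named by `routing_key`, and `queue_declare` is safe to repeat - can be pictured with a toy in-memory model. This is an illustrative sketch, not the pika API; the `Broker` class and its method names are made up to mirror the tutorial's calls:

```python
class Broker:
    """Toy in-memory model of RabbitMQ's default exchange (not pika)."""

    def __init__(self):
        self.queues = {}  # queue name -> list of pending messages

    def queue_declare(self, queue):
        # Idempotent: declaring an already existing queue is a no-op.
        self.queues.setdefault(queue, [])

    def basic_publish(self, exchange, routing_key, body):
        if exchange == '':
            # The default exchange delivers straight to the queue
            # whose name equals routing_key.
            if routing_key in self.queues:
                self.queues[routing_key].append(body)

broker = Broker()
broker.queue_declare('test')
broker.queue_declare('test')   # declaring twice is harmless
broker.basic_publish(exchange='', routing_key='test', body='Hello World!')
print(broker.queues['test'])   # ['Hello World!']
```

The second `queue_declare` changes nothing, which is exactly why the tutorial can declare the queue in both the sender and the receiver.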
@@ -258,7 +258,7 @@ You may ask why to declare the queue again - we have already
 declared it in our previous code. We could have avoided that
 if we were sure that the queue already exists. For example if
 `send.py` program was run before. But we're not yet sure which
-program to run as first. In such case it's a good practice to repeat
+program to run first. In such cases it's a good practice to repeat
 declaring the queue in both programs.
 
 > #### Listing queues
diff --git a/python/tutorial-three.mdx b/python/tutorial-three.mdx
index baa5d61..9a689e8 100644
--- a/python/tutorial-three.mdx
+++ b/python/tutorial-three.mdx
@@ -19,19 +19,19 @@ digraph G {
 }
 {% enddot %}
 
-In [previous part]({{ python_two_url}}) of this tutorial we've learned how
-to create a task queue. The core assumption behind a task queue is that a task
+In [the previous part]({{ python_two_url}}) of this tutorial we created
+a task queue. The core assumption behind a task queue is that a task
 is delivered to exactly one worker. In this part we'll do something
 completely different - we'll try to deliver a message to multiple
 consumers. This pattern is known as "publish-subscribe".
 
-To illustrate this this tutorial, we're going to build a simple
-logging system. It will consist of two programs - first will emit log
-messages and second will receive and print them.
+To illustrate this, we're going to build a simple
+logging system. It will consist of two programs - the first will emit log
+messages and the second will receive and print them.
 
 In our logging system every running copy of the receiver program
-will be able to get the same messages. That way we'll be able to run one
-receiver and direct the logs to disk, in the same time we'll be able to run
+will get the same messages. That way we'll be able to run one
+receiver and direct the logs to disk; at the same time we'll be able to run
 another receiver and see the same logs on the screen.
 Essentially, emitted log messages are going to be broadcasted to all
@@ -42,14 +42,14 @@
 Exchanges
 ---------
 
 In previous parts of the tutorial we've understood how to send and
-receive messages. Now it's time to introduce the full messaging model
-in Rabbit.
+receive messages to and from a queue. Now it's time to introduce
+the full messaging model in Rabbit.
 
-Let's quickly remind what we've learned:
+Let's quickly cover what we've learned:
 
- * _Producer_ is user application that sends messages.
- * _Queue_ is a buffer that stores messages.
- * _Consumer_ is user application that receives messages.
+ * A _producer_ is a user application that sends messages.
+ * A _queue_ is a buffer that stores messages.
+ * A _consumer_ is a user application that receives messages.
 
 The core idea in the messaging model in Rabbit is that the
@@ -62,7 +62,7 @@
 exchange is a very simple thing. On one side it receives messages from
 producers and the other side it pushes them to queues. The exchange
 must know exactly what to do with a received message. Should it be
 appended to a particular queue? Should it be appended to many queues?
-Or should it get silently discarded. The exact rules for that are
+Or should it get discarded? The exact rules for that are
 defined by the _exchange type_.
 
 {% dot -Gsize="10,1.3" -Grankdir=LR %}
@@ -107,8 +107,8 @@ queues it knows. And that's exactly what we need for our logger.
 >     amq.headers headers
 >     ...done.
 >
-> You can see a few `amq.` exchanges. They're created by default, but with a
-> bit of luck you'll never need to use them.
+> You can see a few `amq.` exchanges. They're created by default, but
+> chances are you'll never need to use them.
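The fanout behaviour this section builds towards - every queue bound to the exchange receives its own copy of each message - can be sketched with a toy model. The class and method names here are illustrative, not the pika API:

```python
class FanoutExchange:
    """Toy in-memory model of a fanout exchange (not real broker code)."""

    def __init__(self):
        self.bound_queues = []  # each queue is just a list of messages

    def bind(self, queue):
        self.bound_queues.append(queue)

    def publish(self, body):
        # A fanout exchange ignores the routing key and copies the
        # message to every queue bound to it.
        for queue in self.bound_queues:
            queue.append(body)

logs = FanoutExchange()
screen_queue, disk_queue = [], []
logs.bind(screen_queue)
logs.bind(disk_queue)
logs.publish('info: hello')
print(screen_queue)  # ['info: hello']
print(disk_queue)    # ['info: hello']
```

Both "receivers" see the same message, which is the publish-subscribe property the logger needs.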
@@ -123,7 +123,7 @@ queues it knows. And that's exactly what we need for our logger.
 >                           routing_key='test',
 >                           body=message)
 >
-> The _empty string_ exchange is special: message is
+> The _empty string_ exchange is special: messages are
 > routed to the queue with name specified by `routing_key`.
 
@@ -131,15 +131,16 @@ queues it knows. And that's exactly what we need for our logger.
 Temporary queues
 ----------------
 
-As you may remember previously we were using queues which had a specified name -
-`test` in first `task_queue` in second tutorial. Being able to name a
+As you may remember, previously we were using queues which had a specified name
+(remember `test` and `task_queue`?). Being able to name a
 queue was crucial for us - we needed to point the workers to the same
-queue. Essentially, giving a queue a name is important when you don't
-want to loose any messages when the consumer disconnects.
+queue. Essentially, giving a queue a name is important when you
+want to share the queue between multiple consumers.
 
-But that's not true for our logger. We do want to hear only about
-currently flowing log messages, we do not want to hear the old
-ones. To solve that problem we need two things.
+But that's not true for our logger. We want to hear about all
+currently flowing log messages, not just a subset of them. We're
+also interested only in current messages,
+not in the old ones. To solve that we need two things.
 
 First, whenever we connect to Rabbit we need a fresh, empty queue. To do it
 we could create a queue with a random name, or, even better - let server
@@ -179,7 +180,7 @@ digraph G {
 
 We've already created a fanout exchange and a queue. Now we need to
-tell the exchange to send messages to our queue. That relationship,
+tell the exchange to send messages to our queue. That relationship
 between exchange and a queue is called a _binding_.
 
 {% highlight python %}
@@ -193,7 +194,7 @@ our queue.
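The "fresh, empty queue with a server-chosen name" idea can be modelled in a few lines. This is a hypothetical sketch: the `declare_temporary_queue` helper and the `gen-` prefix are made up for illustration (the real server picks names of the form `amq.gen-...`):

```python
import uuid

def declare_temporary_queue(existing):
    """Toy model of letting the server pick a fresh queue name.

    Illustrative only - real RabbitMQ generates names like 'amq.gen-...'
    when you declare a queue without giving it a name.
    """
    name = 'gen-' + uuid.uuid4().hex
    existing[name] = []   # a fresh queue always starts empty
    return name

queues = {}
a = declare_temporary_queue(queues)
b = declare_temporary_queue(queues)
print(a != b)            # True - every receiver gets its own private queue
print(queues[a] == [])   # True - and it holds no old messages
```

Because each receiver gets its own freshly named queue, none of them can accidentally pick up another receiver's backlog of old messages.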
 > #### Listing bindings
 >
 > You can list existing bindings using, you guessed it,
-> `rabbitmqctl list_bindings` command.
+> `rabbitmqctl list_bindings`.
 
 
 Putting it all together
@@ -226,9 +227,9 @@ digraph G {
 
 The producer program, which emits log messages, doesn't look much
 different than in previous tutorial. The most important change is
-that, we now need to publish messages to `logs` exchange instead of
-the nameless one. We need to supply the `routing_key` parameter, but
-it's value is ignored for `fanout` exchanges. Here goes the code for
+that we now need to publish messages to the `logs` exchange instead of
+the nameless one. We need to supply a `routing_key`, but
+its value is ignored for `fanout` exchanges. Here goes the code for
 `emit_log.py` script:
 
 {% highlight python linenos=true %}
@@ -249,9 +250,10 @@ it's value is ignored for `fanout` exchanges. Here goes the code for
 {% endhighlight %}
 [(emit_log.py source)]({{ examples_url }}/python/emit_log.py)
 
-As you see, we avoided declaring exchange. If the `logs` exchange
+As you see, we avoided declaring the exchange here. If the `logs` exchange
 isn't created at the time this code is executed the message will be
-lost. That's okay for us - if no consumer is listening yet (i.e. the
+exchange hasn't been created) we can safely discard the message.
+lost. That's okay for us.
 
 The code for `receive_logs.py`:
 
diff --git a/python/tutorial-two.mdx b/python/tutorial-two.mdx
index c5f1257..02eb244 100644
--- a/python/tutorial-two.mdx
+++ b/python/tutorial-two.mdx
@@ -16,8 +16,8 @@ digraph G {
 {% enddot %}
 
-In the [first part of this tutorial]({{ python_one_url }}) we've learned how to send
-and receive messages from a named queue. In this part we'll create a
+In the [first part of this tutorial]({{ python_one_url }}) we've learned how
+to send and receive messages from a named queue. In this part we'll create a
 _Task Queue_ that will be used to distribute time-consuming work across
 multiple workers.
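The "message is lost if the exchange isn't there yet" behaviour described in this hunk can be modelled in miniature. A toy sketch of the tutorial's statement (the `publish` helper is hypothetical, not pika, and real brokers may instead raise a channel error):

```python
def publish(exchanges, exchange_name, body):
    """Toy model of the tutorial's claim: if the target exchange
    doesn't exist yet, the message is simply lost - which is fine
    for a logger that only cares about current messages."""
    if exchange_name in exchanges:
        exchanges[exchange_name].append(body)
        return True
    return False   # nobody declared the exchange yet: message dropped

exchanges = {}
print(publish(exchanges, 'logs', 'boot log'))   # False - dropped
exchanges['logs'] = []                          # a consumer declared it
print(publish(exchanges, 'logs', 'next log'))   # True - delivered
```

Dropping early messages is acceptable here precisely because the logger is only interested in messages flowing while someone is listening.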
@@ -25,7 +25,8 @@
 The main idea behind Task Queues (aka: _Work Queues_) is to avoid
 doing resource intensive tasks immediately. Instead we schedule a task
 to be done later on. We encapsulate a _task_ as a message and save it
 to the queue. A worker process running in a background will pop the tasks
-and eventually execute the job.
+and eventually execute the job. When you run many workers the work will
+be shared between them.
 
 This concept is especially useful in web applications where it's
 impossible to handle a complex task during a short http request
@@ -38,16 +39,17 @@ Preparations
 
 In previous part of this tutorial we were sending a message containing
 "Hello World!" string. Now we'll be sending strings that
 stand for complex tasks. We don't have any real hard tasks, like
-image to be resized or pdf files to be rendered, so let's fake it by just
+images to be resized or pdf files to be rendered, so let's fake it by just
 pretending we're busy - by using `time.sleep()` function. We'll take
 the number of dots in the string as a complexity, every dot will
 account for one second of "work". For example, a fake task described
-by `Hello!...` string will take three seconds.
+by `Hello...` string will take three seconds.
 
-We need to slightly modify our `send.py` code, to allow sending
-arbitrary messages from command line:
+We need to slightly modify the _send.py_ code from the previous example,
+to allow sending arbitrary messages from the command line. This program
+will schedule tasks to our work queue, so let's name it `new_task.py`:
 
-{% highlight python %}
+{% highlight python linenos=true linenostart=12 %}
 import sys
 
 message = ' '.join(sys.argv[1:]) or "Hello World!"
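The dots-as-work convention the hunk introduces is easy to capture in a small helper. The function name `fake_task_duration` is hypothetical, invented here for illustration:

```python
def fake_task_duration(body):
    """One second of fake 'work' per dot in the message body."""
    return body.count('.')

# In the worker we would then do: time.sleep(fake_task_duration(body))
print(fake_task_duration('Hello...'))       # 3
print(fake_task_duration('Hello World!'))   # 0
```

So `Hello...` takes three seconds of pretend work, while a dotless message finishes instantly.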
 channel.basic_publish(exchange='',
                       routing_key='test',
@@ -56,10 +58,11 @@ arbitrary messages from command line:
 {% endhighlight %}
 
-Our `receive.py` script also requires some changes: it needs to fake a
-second of work for every dot in the message body:
+Our old _receive.py_ script also requires some changes: it needs to fake a
+second of work for every dot in the message body. It will pop messages
+from the queue and do the task, so let's call it `worker.py`:
 
-{% highlight python %}
+{% highlight python linenos=true linenostart=13 %}
 import time
 
 def callback(ch, method, header, body):
@@ -76,7 +79,7 @@
 One of the advantages of using Task Queue is the ability to easily
 parallelize work. If we have too much work for us to handle, we can just
 add more workers and scale easily.
 
-First, let's try to run two `worker.py` scripts in the same time. They
+First, let's try to run two `worker.py` scripts at the same time. They
 will both get messages from the queue, but how exactly? Let's see.
 
@@ -86,8 +89,6 @@ script. These consoles will be our two consumers - C1 and C2.
 
     shell1$ ./worker.py
     [*] Waiting for messages. To exit press CTRL+C
- 
-
     shell2$ ./worker.py
     [*] Waiting for messages. To exit press CTRL+C
 
@@ -109,8 +110,6 @@ Let's see what is delivered to our workers:
     [x] Received 'Third message...'
     [x] Received 'Fifth message.....'
- 
-
     shell2$ ./worker.py
     [*] Waiting for messages. To exit press CTRL+C
     [x] Received 'Second message..'
@@ -128,27 +127,27 @@
 Doing our tasks can take a few seconds. You may wonder what happens if
 one of the consumers got hard job and has died while doing
 it. With our current code once RabbitMQ delivers message to the
 customer it immediately removes it from memory. In our case if you kill
 a worker
-we will loose the message it was just processing. We'll also loose all
+we will lose the message it was just processing. We'll also lose all
 the messages that were dispatched to this particular worker and not
 yet handled.
-But we don't want to loose any task. If a workers dies, we'd like the task
+But we don't want to lose any task. If a worker dies, we'd like the task
 to be delivered to another worker.
 
 In order to make sure a message is never lost, RabbitMQ
-supports message _acknowledgments_. It's an information,
+supports message _acknowledgments_. It's a bit of data,
 sent back from the consumer which tells Rabbit that particular
 message had been received, fully processed and that Rabbit is
 free to delete it.
 
 If consumer dies without sending ack, Rabbit will understand that a
-message wasn't processed fully and will redispatch it to another
+message wasn't processed fully and will redeliver it to another
 consumer. That way you can be sure that no message is lost, even if
 the workers occasionally die.
 
-There aren't any message timeouts, Rabbit will redispatch the
-message only when the worker connection dies. It's fine if processing
-a message takes even very very long time.
+There aren't any message timeouts; Rabbit will redeliver the
+message only when the worker connection dies. It's fine even if processing
+a message takes a very, very long time.
 
 Message acknowledgments are turned on by default. Though, in previous
@@ -169,7 +168,7 @@ once we're done with a task.
 
 Using that code we may be sure that even if you kill a worker using
 CTRL+C while it was processing a message, nothing will be lost. Soon
-after the worker dies all unacknowledged messages will be redispatched.
+after the worker dies all unacknowledged messages will be redelivered.
 
 > #### Forgotten acknowledgment
 >
@@ -179,8 +178,8 @@ after the worker dies all unacknowledged messages will be redispatched.
 > Rabbit will eat more and more memory as it won't be able to release
 > any unacked messages.
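The acknowledgment lifecycle this hunk describes - delivered messages stay in an "unacked" set until confirmed, and go back to the queue if the consumer dies - can be sketched as a toy model. The `AckQueue` class is illustrative, not real broker code:

```python
class AckQueue:
    """Toy model of message acknowledgments (not real broker code)."""

    def __init__(self, messages):
        self.ready = list(messages)
        self.unacked = {}       # delivery tag -> message body
        self.next_tag = 1

    def deliver(self):
        # The broker hands a message out but keeps it until it's acked.
        body = self.ready.pop(0)
        tag = self.next_tag
        self.next_tag += 1
        self.unacked[tag] = body
        return tag, body

    def basic_ack(self, tag):
        # Only now is the broker free to forget the message.
        del self.unacked[tag]

    def consumer_died(self):
        # Unacked messages are requeued for redelivery to another worker.
        self.ready = list(self.unacked.values()) + self.ready
        self.unacked.clear()

q = AckQueue(['task1', 'task2'])
tag, body = q.deliver()     # a worker receives 'task1' ...
q.consumer_died()           # ... and dies before acking it
print(q.deliver()[1])       # 'task1' - redelivered, nothing lost
```

It also shows the "forgotten ack" trap: if `basic_ack` is never called, the `unacked` dict only ever grows, which is the toy analogue of Rabbit eating more and more memory.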
 >
-> In order to debug this kind of mistakes you may use `rabbitmqctl`
-> to print `messages_unacknowledged` field:
+> In order to debug this kind of mistake you can use `rabbitmqctl`
+> to print the `messages_unacknowledged` field:
 >
 >     $ sudo rabbitmqctl list_queues name messages_ready messages_unacknowledged
 >     Listing queues ...
@@ -198,7 +197,7 @@
 When RabbitMQ quits or crashes it will forget the queues and messages
 unless you tell it not to. Two things are required to make sure that
 messages aren't lost: we need to mark both a queue and messages as
 durable.
 
-First, we need to make sure that Rabbit will never loose our `test`
+First, we need to make sure that Rabbit will never lose our `test`
 queue. In order to do so, we need to declare it as _durable_:
 
     channel.queue_declare(queue='test', durable=True)
@@ -248,9 +247,9 @@
 Rabbit doesn't know anything about that and will still dispatch
 messages evenly.
 
 That happens because Rabbit dispatches message just when a message
-enters the queue. It doesn't look in number of unacknowledged messages
-for a consumer. It just blindly dispatches every n-th message to a
-every consumer.
+enters the queue. It doesn't look at the number of unacknowledged messages
+for a consumer. It just blindly dispatches every n-th message to
+the n-th consumer.
 
 {% dot -Gsize="10,1.3" -Grankdir=LR %}
 digraph G {
@@ -273,7 +272,7 @@
 In order to defeat that we may use `basic.qos` method with the
 `prefetch_count=1` settings. That allows us to tell Rabbit not to give
 more than one message to a worker at a time. Or, in other words, don't
 dispatch a new message to a worker until it has processed and
-acknowledged previous one.
+acknowledged the previous one.
 
 {% highlight python %}
 channel.basic_qos(prefetch_count=1)
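The blind dispatching this hunk describes - every n-th message goes to the n-th consumer, busy or not - can be sketched as a toy function. This is an illustrative model of that one statement, not the broker's real scheduling code; `basic_qos(prefetch_count=1)` is exactly what changes this behaviour, by skipping any worker that still holds an unacked message:

```python
def round_robin_dispatch(messages, n_workers):
    """Toy model: the broker blindly hands every n-th message to the
    n-th consumer, regardless of how busy each consumer already is."""
    assigned = [[] for _ in range(n_workers)]
    for i, msg in enumerate(messages):
        assigned[i % n_workers].append(msg)
    return assigned

msgs = ['First.', 'Second..', 'Third...', 'Fourth....', 'Fifth.....']
print(round_robin_dispatch(msgs, 2))
# [['First.', 'Third...', 'Fifth.....'], ['Second..', 'Fourth....']]
```

Note that worker 1 gets all the odd messages and worker 2 all the even ones, even though (by the dot convention) the odd messages here happen to be the heavier tasks - which is precisely the uneven load that `prefetch_count=1` fixes.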