Redpanda’s HTTP Proxy, often called PandaProxy, lets you interact with your Redpanda cluster using standard HTTP requests, meaning you can produce messages to topics and consume messages from topics without needing a dedicated Kafka client library. This is super handy for quick scripting, integration with services that don’t have Kafka clients, or just for understanding what’s going on under the hood.

Let’s see it in action. Imagine you have a Redpanda cluster running, and you want to send a message to a topic named my-topic. You can use curl for this:

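curl -X POST \
  -H "Content-Type: application/vnd.kafka.json.v2+json" \
  --data '{"records": [{"value": {"message": "Hello Redpanda via HTTP!"}}]}' \
  http://localhost:8082/topics/my-topic

This command hits the Redpanda HTTP Proxy running on localhost:8082. The POST request to /topics/my-topic tells Redpanda to append each entry in the records array to that topic as a separate message. The Content-Type header declares that the record values are embedded JSON; the proxy expects this media type rather than plain application/json.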
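The same produce call can be sketched in Python with only the standard library. This is a minimal sketch, not a definitive client: the helper names build_produce_request and send are hypothetical, and the proxy address is assumed to be the localhost:8082 used above. The request wraps each value in a records array and uses the application/vnd.kafka.json.v2+json media type.

```python
import json
import urllib.request

PROXY_URL = "http://localhost:8082"  # assumed address, matching the curl example


def build_produce_request(topic, values, base_url=PROXY_URL):
    """Build (without sending) a POST that produces JSON values to a topic."""
    # Each value becomes one entry in the "records" array, i.e. one message.
    body = {"records": [{"value": value} for value in values]}
    return urllib.request.Request(
        url=f"{base_url}/topics/{topic}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
        method="POST",
    )


def send(request, timeout=5.0):
    """Send a prepared request and decode the JSON reply (needs a live proxy)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a proxy running, send(build_produce_request("my-topic", [{"message": "Hello Redpanda via HTTP!"}])) performs the same request as the curl command. Keeping the builder separate from the sender makes the request easy to inspect before anything goes over the wire.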

Now, to consume that message, you need a few more curl commands. Consuming is a bit more involved because the proxy has to hold consumer state between requests: you create a consumer instance inside a consumer group, subscribe it to one or more topics, and only then poll it for records.

First, create the consumer instance:

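curl -X POST \
  -H "Content-Type: application/vnd.kafka.v2+json" \
  --data '{"name": "my-consumer-instance", "format": "json", "auto.offset.reset": "earliest", "auto.commit.enable": "true"}' \
  http://localhost:8082/consumers/my-consumer-group

This creates a consumer named my-consumer-instance within my-consumer-group, configured to read messages in JSON format and to start from the earliest available offset. Setting auto.commit.enable to "true" asks the proxy to track your progress automatically. Note that the REST API spells this setting auto.commit.enable (not the Kafka client’s enable.auto.commit) and takes its value as a string. If the proxy rejects the request, check which values your Redpanda version accepts for these settings.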
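As a rough illustration, the creation request could be built like this in Python. The function names are hypothetical, the proxy address is assumed to be localhost:8082, and any consumer settings are passed through untouched as a dict so you can supply whichever keys and values your proxy version accepts.

```python
import json
import urllib.request


def build_create_consumer_request(group, name, fmt="json", settings=None,
                                  base_url="http://localhost:8082"):
    """Build (without sending) the POST that creates a consumer instance."""
    body = {"name": name, "format": fmt}
    # Extra settings such as "auto.offset.reset" are copied through as given.
    body.update(settings or {})
    return urllib.request.Request(
        url=f"{base_url}/consumers/{group}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/vnd.kafka.v2+json"},
        method="POST",
    )


def instance_path(group, name):
    """Path prefix for later calls that target this consumer instance."""
    return f"/consumers/{group}/instances/{name}"
```

Every later call (subscribing, polling, committing) is addressed to the path returned by instance_path, which is why the group and instance names have to stay consistent across requests.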

Now that you have a consumer instance, subscribe it to my-topic:

curl -X POST \
  "http://localhost:8082/consumers/my-consumer-group/instances/my-consumer-instance/subscription" \
  -H "Content-Type: application/vnd.kafka.v2+json" \
  --data '{"topics": ["my-topic"]}'

This command subscribes your consumer instance to my-topic. After subscribing, you can then poll for messages:

curl -X GET \
  -H "Accept: application/vnd.kafka.json.v2+json" \
  "http://localhost:8082/consumers/my-consumer-group/instances/my-consumer-instance/records?timeout=1000&max_bytes=10000"

This GET request to /records will pull messages from the topics your consumer is subscribed to. The Accept header asks for JSON-formatted records and should match the format the consumer instance was created with. The timeout parameter (in milliseconds) keeps the connection open for a bit, waiting for new messages, and max_bytes limits the size of the response.
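The subscribe-then-poll sequence can be sketched as two request builders plus a small parser. The function names are hypothetical and the proxy address is assumed. The subscribe call uses POST and the singular /subscription path, which is how the REST Proxy v2 API that the proxy follows names this endpoint. The reply shape handled by extract_values (a JSON array of objects with topic, partition, offset, and value fields) is also an assumption based on that API.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE = "http://localhost:8082"  # assumed proxy address


def build_subscribe_request(group, instance, topics, base_url=BASE):
    """Build the POST that subscribes a consumer instance to topics."""
    return urllib.request.Request(
        url=f"{base_url}/consumers/{group}/instances/{instance}/subscription",
        data=json.dumps({"topics": list(topics)}).encode("utf-8"),
        headers={"Content-Type": "application/vnd.kafka.v2+json"},
        method="POST",
    )


def build_poll_request(group, instance, timeout_ms=1000, max_bytes=10000,
                       base_url=BASE):
    """Build the GET that polls /records for the subscribed topics."""
    query = urlencode({"timeout": timeout_ms, "max_bytes": max_bytes})
    return urllib.request.Request(
        url=f"{base_url}/consumers/{group}/instances/{instance}/records?{query}",
        headers={"Accept": "application/vnd.kafka.json.v2+json"},
        method="GET",
    )


def extract_values(records):
    """Turn a decoded /records reply (assumed shape) into simple tuples."""
    return [(r["topic"], r["partition"], r["offset"], r["value"])
            for r in records]
```

A polling loop would send the subscribe request once, then repeatedly send the poll request and pass each decoded reply to extract_values. An empty list simply means nothing arrived before the timeout.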

The core problem PandaProxy solves is abstracting away the complexities of the Kafka wire protocol. Instead of dealing with TCP connections, message framing, and intricate request/response structures, you’re just making RESTful HTTP calls. This makes Redpanda accessible from virtually any programming language or tool that can make HTTP requests.

Internally, PandaProxy acts as a translation layer. When you send an HTTP POST to /topics/some-topic, PandaProxy receives it, parses the JSON payload, and then constructs a Kafka ProduceRequest to send to the Redpanda broker. For consumption, when you GET from /records, PandaProxy translates that into a Kafka FetchRequest, receives the FetchResponse from the broker, and then formats the data into a JSON response for you. The consumer group and instance management are also handled by the proxy, which maintains the necessary state for tracking offsets and group coordination behind the scenes.

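The format parameter in the consumer creation request ("format": "json") is crucial. It tells PandaProxy how to serialize and deserialize message keys and values, and it has to line up with the media type you use when producing and polling. If you produce JSON, you’ll want to consume JSON. The other commonly available option is binary, where keys and values travel as base64-encoded strings. The Confluent REST Proxy API that this interface follows also defines avro, which requires a Schema Registry to be configured and available for both producing and consuming; confirm that your Redpanda version supports it before relying on it.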
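To make the role of the format concrete, here is a small sketch of the binary format that the REST Proxy v2 API defines alongside json. In that format the record values are base64-encoded strings and the media type becomes application/vnd.kafka.binary.v2+json. The helper names are hypothetical.

```python
import base64

# Media type used in place of the JSON one when producing or polling binary data.
BINARY_MEDIA_TYPE = "application/vnd.kafka.binary.v2+json"


def encode_binary_records(payloads):
    """Wrap raw byte payloads as a produce body for the binary format."""
    return {
        "records": [
            {"value": base64.b64encode(payload).decode("ascii")}
            for payload in payloads
        ]
    }


def decode_binary_value(record):
    """Recover the raw bytes from one record of a binary-format reply."""
    return base64.b64decode(record["value"])
```

For example, encode_binary_records([b"hello"]) yields a body whose single record has the value "aGVsbG8=". A consumer created with "format": "binary" would receive that same string and decode it back to bytes.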

One aspect often overlooked is how PandaProxy handles consumer offsets and group membership. When you create a consumer instance with auto-commit enabled, PandaProxy periodically commits the offsets of the records it has delivered to you by sending Kafka OffsetCommitRequest messages to the brokers on your behalf. You don’t have to manage commits yourself, but you also can’t fine-tune the commit strategy the way you might with a native Kafka client. If you need more control, disable auto-commit when you create the instance and commit offsets yourself with a POST to /consumers/{group}/instances/{instance}/offsets.
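A manual commit could look roughly like the sketch below. It assumes the commit endpoint is the instance’s /offsets path and that the body is a partitions array of topic, partition, and offset entries, as in the REST Proxy v2 API; both helper names are hypothetical. Whether the committed offset means "last record processed" or "next record to read" is worth confirming in the API reference for your version, so the sketch passes offsets through unchanged.

```python
import json
import urllib.request


def latest_offsets(records):
    """Highest delivered offset per (topic, partition) in a /records reply."""
    latest = {}
    for record in records:
        key = (record["topic"], record["partition"])
        latest[key] = max(record["offset"], latest.get(key, -1))
    return latest


def build_commit_request(group, instance, offsets,
                         base_url="http://localhost:8082"):
    """Build the POST that commits the given offsets for a consumer instance."""
    partitions = [
        {"topic": topic, "partition": partition, "offset": offset}
        for (topic, partition), offset in sorted(offsets.items())
    ]
    return urllib.request.Request(
        url=f"{base_url}/consumers/{group}/instances/{instance}/offsets",
        data=json.dumps({"partitions": partitions}).encode("utf-8"),
        headers={"Content-Type": "application/vnd.kafka.v2+json"},
        method="POST",
    )
```

Committing only after your own processing has succeeded gives at-least-once behaviour: if the consumer dies between polling and committing, the uncommitted records are delivered again.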

The next conceptual hurdle you’ll likely encounter is understanding how to manage consumer groups and rebalancing when multiple instances are running.
