List of posts - <antirez>

Scaling HNSWs

antirez 5 days ago.

I’m taking a few weeks of pause on my HNSWs developments (now working on some other data structure, news soon). At this point, the new type I added to Redis is stable and complete enough, it’s the perfect moment to reason about what I learned about HNSWs, and turn it into a blog post. That kind of brain dump that was so common pre-AI era, and now has become, maybe, a bit more rare. Well, after almost one year of thinking and implementing HNSWs and vector similarity stuff, it is time for some writing. However this is not going to be an intro on HNSWs: too many are present already. This is the “extra mile” instead. If you know HNSWs, I want to share with you my more “advanced” findings, especially in the context of making them fast enough to allow for a “Redis” experience: you know, Redis is designed for low latency and high performance, and HNSWs are kinda resistant to that, so there were challenges to expose HNSWs as an abstract data structure.

read the full post at http://antirez.com/news/156

AI is different

▼

antirez 95 days ago.

Regardless of their flaws, AI systems continue to impress with their ability to replicate certain human skills. Even if imperfect, such systems were a few years ago science fiction. It was not even clear that we were so near to create machines that could understand the human language, write programs, and find bugs in a complex code base: bugs that escaped the code review of a competent programmer.

Since LLMs and in general deep models are poorly understood, and even the most prominent experts in the field failed miserably again and again to modulate the expectations (with incredible errors on both sides: of reducing or magnifying what was near to come), it is hard to tell what will come next. But even before the Transformer architecture, we were seeing incredible progress for many years, and so far there is no clear sign that the future will not hold more. After all, a plateau of the current systems is possible and very credible, but it would likely stimulate, at this point, massive research efforts in the next step of architectures.

read the full post at http://antirez.com/news/155

Coding with LLMs in the summer of 2025 (an update)

▼

antirez 119 days ago.

Frontier LLMs such as Gemini 2.5 PRO, with their vast understanding of many topics and their ability to grasp thousands of lines of code in a few seconds, are able to extend and amplify the programmer capabilities. If you are able to describe problems in a clear way and, if you are able to accept the back and forth needed in order to work with LLMs, you can reach incredible results such as:

1. Eliminating bugs you introduced in your code before it ever hits any user: I experienced this with Vector Sets implementation of Redis. I would end eliminating all the bugs eventually, but many were just removed immediately by Gemini / Claude code reviews.

read the full post at http://antirez.com/news/154

Human coders are still better than LLMs

▼

antirez 171 days ago.

This is a short story of how humans are still so much more capable of LLMs. Note that I'm not anti-AI or alike, you know it if you know me / follow me somewhere. I use LLMs routinely, like I did today, when I want to test my ideas, for code reviews, to understand if there are better approaches than what I had in mind, to explore stuff at the limit of my expertise, and so forth (I wrote a blog post about coding with LLMs almost two years, when it was not exactly cool: I was already using LLMs for coding and never stopped, I'll have to write an update, but that's not the topic of this post).

read the full post at http://antirez.com/news/153

What I learned during the license switch

▼

antirez 198 days ago.

Yesterday, it was a very intense day. In Italy it was 1st of May, the workers holiday, so in the morning I went for a 4h walk in the Etna with friends <3, I love walking, and I often take pauses when coding just to walk, to return later at the keyboard with a few more kilometers on my legs, and walking in the Etna is amazing (Etna is the largest active volcano in Europe, and I happen to live in Catania, that is on its slopes).

Then at 6PM I was at home to release my blog post about the AGPL license switch, and I started following the comments, feedbacks, private messages, and I learned a few things in the process.

read the full post at http://antirez.com/news/152

Redis is open source again

▼

antirez 199 days ago.

Five months ago, I rejoined Redis and quickly started to talk with my colleagues about a possible switch to the AGPL license, only to discover that there was already an ongoing discussion, a very old one, too. Many people, within the company, had the feeling that the AGPL was a better pick than SSPL, and while eventually Redis switched to the SSPL license, the internal discussion continued.

I tried to give more strength to the ongoing pro-AGPL license side. My feeling was that the SSPL, in practical terms, failed to be accepted by the community. The OSI wouldn’t accept it, nor would the software community regard the SSPL as an open license. In little time, I saw the hypothesis getting more and more traction, at all levels within the company hierarchy.

read the full post at http://antirez.com/news/151

Reproducing Hacker News writing style fingerprinting

▼

antirez 214 days ago.

About three years ago I saw a quite curious and interesting post on Hacker News. A student, Christopher Tarry, was able to use cosine similarity against a vector of top words frequencies in comments, in order to detect similar HN accounts — and, sometimes, even accounts actually controlled by the same user, that is, fake accounts used to uncover the identity of the writer.

This is the original post: https://news.ycombinator.com/item?id=33755016

I was not aware, back then, of Burrows-Delta method for style detection: it seemed kinda magical that you just needed to normalize a frequency vector of top words to reach such quite remarkable results. I read a few wikipedia pages and took mental note of it. Then, as I was working with Vectors for Redis I remembered about this post, searched the web only to discover that the original page was gone and that the author, in the original post and website, didn’t really explained very well how the data was processed, the top words extracted (and, especially, how many were used) and so forth. I thought I could reproduce the work with Vector Sets, once I was done with the main work. Now the new data type is in the release candidate, and I found some time to work on the problem. This is a report of what I did, but before to continue, the mandatory demo site: you can play with it at the following link:

read the full post at http://antirez.com/news/150

Vector Sets are part of Redis

▼

antirez 227 days ago.

Yesterday we finally merged vector sets into Redis, here you can find the README that explains in detail what you get:

https://github.com/redis/redis/blob/unstable/modules/vector-sets/README.md

The goal of the new data structure is, in short, to create a new “Set alike” data type, similar to Sorted Sets, where instead of having a scalar as a score, you have a vector, and you can add and remove elements the Redis way, without caring about anything except the properties of the abstract data structure Redis implements, ask for elements similar to a given query vector (or a vector associated to some element already in the set), and so forth. But more about that later, a bit of background, first:

read the full post at http://antirez.com/news/149

AI is useless, but it is our best bet for the future

▼

antirez 238 days ago.

I used AI with success 5 minutes ago.

Just five minutes ago, I was writing a piece of software and relied on AI for assistance. Yet, here I am, starting this blog post by telling you that artificial intelligence, so far, has proven somewhat useless. How can I make such a statement if AI was just so helpful a moment ago? Actually, there's no contradiction here if we clarify exactly what we mean.

Here’s the thing: at this very moment, artificial intelligence can support me significantly. If I'm struggling with complicated code or need to understand an advanced scientific paper on math, I can turn to AI for clarity. It can help me generate an image for a project, make a translation, clean my YouTube transcript. Clearly, it’s practical and beneficial in these everyday tasks.

read the full post at http://antirez.com/news/148

Big LLMs weights are a piece of history

▼

antirez 245 days ago.

By multiple accounts, the web is losing pieces: every year a fraction of old web pages disappear, lost forever. We should regard the Internet Archive as one of the most valuable pieces of modern history; instead, many companies and entities make the chances of the Archive to survive, and accumulate what otherwise will be lost, harder and harder. I understand that the Archive headquarters are located in what used to be a church: well, there is no better way to think of it than as a sacred place.

read the full post at http://antirez.com/news/147

Reasoning models are just LLMs

▼

antirez 280 days ago.

It’s not new, but it’s accelerating. People that used to say that LLMs were a fundamentally flawed way to reach any useful reasoning and, in general, to develop any useful tool with some degree of generality, are starting to shuffle the deck, in the hope to look less wrong. They say: “the progresses we are seeing are due to the fact that models like OpenAI o1 or DeepSeek R1 are not just LLMs”. This is false, and it is important to show their mystification as soon as possible.

First, DeepSeek R1 (don’t want to talk about o1 / o3, since it’s a private thing we don’t have access to, but it’s very likely the same) is a pure decoder only autoregressive model. It’s the same next token prediction that was so strongly criticized. There isn’t, in any place of the model, any explicit symbolic reasoning or representation.

read the full post at http://antirez.com/news/146

We are destroying software

▼

antirez 281 days ago.

We are destroying software by no longer taking complexity into account when adding features or optimizing some dimension.

We are destroying software with complex build systems.

We are destroying software with an absurd chain of dependencies, making everything bloated and fragile.

We are destroying software telling new programmers: “Don’t reinvent the wheel!”. But, reinventing the wheel is how you learn how things work, and is the first step to make new, different wheels.

We are destroying software by no longer caring about backward APIs compatibility.

read the full post at http://antirez.com/news/145

From where I left

▼

antirez 341 days ago.

I’m not the kind of person that develops a strong attachment to their own work. When I decided to leave Redis, about 1620 days ago (~ 4.44 years), I never looked at the source code, commit messages, or anything related to Redis again. From time to time, when I needed Redis, I just downloaded it and compiled it. I just typed “make” and I was very happy to see that, after many years, building Redis was still so simple.

My detachment was not the result of me hating my past work. While in the long run my creative work was less and less important and the “handling the project” activities became more and more substantial — a shift that many programmers are able to do, but that’s not my bread and butter — well, I still enjoyed doing Redis stuff when I left. However, I don’t share the vision that most people at my age (I’m 47 now) have: that they are still young. I wanted to do new stuff, especially writing. I wanted to stay more with my family and help my relatives. I definitely needed a break.

read the full post at http://antirez.com/news/144

Playing audio files in a Pi Pico without a DAC

▼

antirez 620 days ago.

The Raspberry Pico is suddenly becoming my preferred chip for embedded development. It is well made, durable hardware, with a ton of features that appear designed with smartness and passion (the state machines driving the GPIOs are a killer feature!). Its main weakness, the lack of connectivity, is now resolved by the W variant. The data sheet is excellent and documents every aspect of the chip. Moreover, it is well supported by MicroPython (which I’m using a lot), and the C SDK environment is decent, even if full of useless complexities like today fashion demands: a cmake build system that in turn generates a Makefile, files to define this and that (used libraries, debug outputs, …), and in general a huge overkill for the goal of compiling tiny programs for tiny devices. No, it’s worse than that: all this complexity to generate programs for a FIXED hardware with a fixed set of features (if not for the W / non-W variant). Enough with the rant about how much today software sucks, but it must be remembered.

read the full post at http://antirez.com/news/143

First Token Cutoff LLM sampling

▼

antirez 674 days ago.

From a theoretical standpoint, the best reply provided by an LLM is obtained by always picking the token associated with the highest probability. This approach makes the LLM output deterministic, which is not a good property for a number of applications. For this reason, in order to balance LLMs creativity while preserving adherence to the context, different sampling algorithms have been proposed in recent years.

Today one of the most used ones, more or less the default, is called top-p: it is a form of nucleus sampling where top-scoring tokens are collected up to a total probability sum of “p”, then random weighted sampling is performed.

read the full post at http://antirez.com/news/142

Translating blog posts with GPT-4, or: on hope and fear

▼

antirez 677 days ago.

My usual process for writing blog posts is more or less in two steps:

1. Think about what I want to say for weeks or months. No, I don’t spend weeks focusing on a blog post, the process is exactly reversed: I write blog posts about things that are so important to me to be in my mind for weeks.

2. Then, once enough ideas collapsed together in a decent form, I write the blog post in 30 minutes, often without caring much about the form, and I hit “publish”. This process usually works writing the titles of the sections as I initially just got the big picture of what I want to say, and then filling the empty paragraphs with text.

read the full post at http://antirez.com/news/141

LLMs and Programming in the first days of 2024

▼

antirez 684 days ago.

I'll start by saying that this article is not meant to be a retrospective on LLMs. It's clear that 2023 was a special year for artificial intelligence: to reiterate that seems rather pointless. Instead, this post aims to be a testimony from an individual programmer. Since the advent of ChatGPT, and later by using LLMs that operate locally, I have made extensive use of this new technology. The goal is to accelerate my ability to write code, but that's not the only purpose. There's also the intent to not waste mental energy on aspects of programming that are not worth the effort. Countless hours spent searching for documentation on peculiar, intellectually uninteresting aspects; the efforts to learn an overly complicated API, often without good reason; writing immediately usable programs that I would discard after a few hours. These are all things I do not want to do, especially now, with Google having become a sea of spam in which to hunt for a few useful things.

read the full post at http://antirez.com/news/140

The origins of the Idle Scan

▼

antirez 759 days ago.

The Idle scan was conceived at the end of 1998, evidenced by emails. I had moved to Milan a few months prior, having been there since September if I recall correctly, brimming with new ideas, unaware that my stay in that city would be brief. I spent the summer on the beaches of Sicily, mainly occupied with reading many books recommended by the folks at Seclab (mostly by David). However, those readings needed a catalyst: the Idle scan was an attack born from theoretical rumination, but the stream of thoughts originated from a rather practical circumstance. I had recently created Hping, a tool whose logo was borrowed from that of Nutella. I mention this to emphasize the seriousness that governed my efforts at that time — after all, I was only twenty-one and already in Northern Italy with a full-time job on my shoulders; some understanding was warranted.

read the full post at http://antirez.com/news/139

In defense of linked lists

▼

antirez 1108 days ago.

A few days ago, on Twitter (oh, dear Twitter: whatever happens I’ll be there as long as possible – if you care about people that put a lot of energy in creating it, think twice before leaving the platform). So, on Twitter, I was talking about a very bad implementation of linked lists written in Rust. From the tone of certain replies, I got the feeling that many people think linked lists are like a joke. A trivial data structure that is only good for coding interviews, otherwise totally useless. In a word: the bubble sort of data structures. I disagree, so I thought of writing this blog post full of all the things I love about linked lists.

read the full post at http://antirez.com/news/138

Scrivendo Wohpe

▼

antirez 1218 days ago.

(English translation of this post: http://antirez.com/news/136)

Dopo due anni di lavoro, finalmente, Wohpe, il mio primo libro di fantascienza, ma anche il mio primo scritto di prosa di questa lunghezza, è uscito nelle librerie fisiche italiane, su Amazon, e negli altri store digitali. Lo trovate qui: https://www.amazon.it/Wohpe-Salvatore-Sanfilippo/dp/B09XT6J3WX

Dicevo: il primo scritto di questa lunghezza. Ma posso considerarmi del tutto nuovo alla scrittura? Ho scritto per vent’anni in questo blog e in quelli passati che ho tenuto nel corso del tempo, e molto spesso ho usato Facebook per scrivere brevi racconti, frutto di fantasie o basati su fatti reali. Oltre a ciò, ho scritto di cose tecniche, specialmente riguardo la programmazione, per un tempo altrettanto lungo, e sono stato un lettore di racconti e di romanzi per tutto il corso della mia vita. E allora perché scrivere Wohpe è stato anche imparare a scrivere da zero?

read the full post at http://antirez.com/news/137

Writing Wohpe

▼

antirez 1218 days ago.

(Traduzione italiana di questo post: http://antirez.com/news/137)

[Sorry for the form of this post. For the first time I wrote a post in two languages: Italian and English. So I went for the unusual path of writing it in Italian to start, translating it with Google Translate, and later I just scanned it to fix the biggest issues. At this point GT is so good you can get away with this process.]

After two years of work, finally, Wohpe, my first science fiction book, but also my first prose writing of this length, has been released in Italian physical bookstores, on Amazon, and in other digital stores. You can find it here: https://www.amazon.it/Wohpe-Salvatore-Sanfilippo/dp/B09XT6J3WX

read the full post at http://antirez.com/news/136

Programming and Writing

▼

antirez 1647 days ago.

One year ago I paused my programming life and started writing a novel, with the illusion that my new activity was deeply different than the previous one. A river of words later, written but more often rewritten, I’m pretty sure of the contrary: programming big systems and writing novels have many common traits and similar processes.

The most obvious parallel between the two activities is that in both of them you write something. Code is not prose written in a natural language, yet it has a set of fixed rules (a grammar), certain forms that most programmers will understand as natural and others that, while formally correct, will sound hard to grasp.

read the full post at http://antirez.com/news/135

The open source paradox

▼

antirez 1870 days ago.

A new idea is insinuating in social networks and programming communities. It’s the proportionality between the money people give you for coding something, and the level of demand for quality they can claim to have about your work.

As somebody said, the best code is written when you are supposed to do something else [1]. Like a writer will do her best when writing that novel that, maybe, nobody will pay a single cent for, and not when doing copywriting work for a well known company, programmers are likely to spend more energies in their open source side projects than during office hours, while writing another piece of a project they feel stupid, boring, pointless. And, if the company is big enough, chances are it will be cancelled in six months anyway or retired one year after the big launch.

read the full post at http://antirez.com/news/134

The end of the Redis adventure

▼

antirez 1965 days ago.

When I started the Redis project more than ten years ago I was in one of the most exciting moments of my career. My co-founder and I had successfully launched two of the major web 2.0 services of the Italian web. In order to make them scalable we had to invent many new concepts, that were already known in the field most of the times, but we didn’t know, nor we cared to check. Problem? Let’s figure out a solution. We wanted to solve problems but we wanted, even more, to have fun. This was the playful environment where Redis was born.

read the full post at http://antirez.com/news/133

Redis 6.0.0 GA is out!

▼

antirez 2026 days ago.

Finally Redis 6.0.0 stable is out. This time it was a relatively short cycle between the release of the first release candidate and the final release of a stable version. It took about four months, that is not a small amount of time, but is not a lot compared to our past records :)

So the big news are the ones announced before, but with some notable changes. The old stuff are: SSL, ACLs, RESP3, Client side caching, Threaded I/O, Diskless replication on replicas, Cluster support in Redis-benchmark and improved redis-cli cluster support, Disque in beta as a module of Redis, and the Redis Cluster Proxy (now at https://github.com/RedisLabs/redis-cluster-proxy).

read the full post at http://antirez.com/news/132

Redis 6 RC1 is out today

▼

antirez 2159 days ago.

So it happened again, a new Redis version reached the release candidate status, and in a few months it will hit the shelves of most supermarkets. I guess this is the most “enterprise” Redis version to date, and it’s funny since I took quite some time in order to understand what “enterprise” ever meant. I think it’s word I genuinely dislike, yet it has some meaning. Redis is now everywhere, and it is still considerably able to “scale down”: you can still download it, compile it in 30 seconds, and run it without any configuration to start hacking. But being everywhere also means being in environments where things like encryption and ACLs are a must, so Redis, inevitably, and more than thanks to me, I would say, in spite of my extreme drive for simplicity, adapted.

read the full post at http://antirez.com/news/131

Client side caching in Redis 6

▼

antirez 2327 days ago.

[Note: this post no longer describes the client side implementation in the final implementation of Redis 6, that changed significantly, see https://redis.io/topics/client-side-caching]

The New York Redis day was over, I get up at the hotel at 5:30, still pretty in sync with the Italian time zone and immediately went walking on the streets of Manhattan, completely in love with the landscape and the wonderful feeling of being just a number among millions of other numbers. Yet I was thinking at the Redis 6 release with the feeling that, what was probably the most important feature at all, the new version of the Redis protocol (RESP3), was going to have a very slow adoption curve, and for good reasons: wise people avoid switching tools without very good reasons. After all why I wanted to improve the protocol so badly? For two reasons mainly, to provide clients with more semantical replies, and in order to open to new features that were hard to implement with the old protocol; one feature in particular was the most important to me: client side caching.

read the full post at http://antirez.com/news/130

The struggles of an open source maintainer

▼

antirez 2376 days ago.

Months ago the maintainer of an OSS project in the sphere of system software, with quite a big and active community, wrote me an email saying that he struggles to continue maintaining his project after so many years, because of how much psychologically taxing such effort is. He was looking for advices from me, I’m not sure to be in the position of giving advices, however I told him I would write a blog post about what I think about the matter. Several weeks passed, and multiple times I started writing such post and stopped, because I didn’t had the time to process the ideas for enough time. Now I think I was able to analyze myself to find answers inside my own weakness, struggles, and desire of freedom, that inevitably invades the human minds when they do some task, that also has some negative aspect, for a prolonged amount of time. Maintaining an open source project is also a lot of joy and fun and these latest ten years of my professional life are surely memorable, even if not the absolute best (I had more fun during my startup times after all). However here I’ll focus on the negative side; simply make sure you don’t get the feeling it is just that, there is also a lot of good in it.

read the full post at http://antirez.com/news/129

Redis streams as a pure data structure

▼

antirez 2431 days ago.

The new Redis data structure introduced in Redis 5 under the name of “Streams” generated quite some interest in the community. Soon or later I want to run a community survey, talking with users having production use cases, and blogging about it. Today I want to address another issue: I’m starting to suspect that many users are only thinking at Streams as a way to solve Kafka(TM)-alike use cases. Actually the data structure was designed to *also* work in the context of messaging with producers and consumers, but to think that Redis Streams are just good for that is incredibly reductive. Streaming is a terrific pattern and “mental model” that can be applied when designing systems with great success, but Redis Streams, like most Redis data structures, are more general, and can be used to model dozen of different unrelated problems. So in this blog post I’ll focus on Streams as a pure data structure, completely ignoring its blocking operations, consumer groups, and all the messaging parts.

read the full post at http://antirez.com/news/128

Gopher: a present for Redis

▼

antirez 2456 days ago.

Ten years ago Redis was announced on Hacker News, and I use this as virtual birthdate for the project, simply because it is more important when it was announced to the public than the actual date of the project first line of code (think at it conception VS actual birth in animals).  I’ll use the ten years of Redis as an excuse to release something I played a bit in the previous days, thinking to use it for the 1st April fool: but such date is far and I want to talk to you about this project now… So, happy birthday Redis! Here it’s your present: a Gopher protocol implementation.

read the full post at http://antirez.com/news/127

An update about Redis developments in 2019

▼

antirez 2461 days ago.

Yesterday a concerned Redis user wrote the following on Hacker News:

— https://news.ycombinator.com/item?id=19204436 —
I love Redis, but I'm a bit skeptical of some of the changes that are currently in development. The respv3 protocol has some features that, while they sound neat, also could significantly complicate client library code. There's also a lot of work going into a granular acl. I can't imagine why this would be necessary, or a higher priority than other changes like multi-thread support, better persistence model, data-types, etc.

read the full post at http://antirez.com/news/126

Why RESP3 will be the only protocol supported by Redis 6

▼

antirez 2564 days ago.

[EDIT! I'm reconsidering all this because Marc Gravell
 from Stack Overflow suggested that we could just switch protocol for backward compatibility per-connection, sending a command to enable RESP3. That means no longer need for a global configuration that switches the behavior of the server. Put in that way it is a lot more acceptable for me, and I'm reconsidering the essence of the blog post]

A few weeks after the release of Redis 5, I’m here starting to implement RESP3, and after a few days of work it feels very well to see this finally happening. RESP3 is the new client-server protocol that Redis will use starting from Redis 6. The specification at https://github.com/antirez/resp3 should explain in clear terms how this evolution of our old protocol, RESP2, should improve the Redis ecosystem. But let’s say that the most important thing is that RESP3 is more “semantic” than RESP2. For instance it has the concept of maps, sets (unordered lists of elements), attributes of the returned data, that may augment the reply with auxiliary information, and so forth. The final goal is to make new Redis clients have less work to do for us, that is, just deciding a set of fixed rules in order to convert every reply type from RESP3 to a given appropriate type of the client library programming language.

read the full post at http://antirez.com/news/125

Writing system software: code comments.

▼

antirez 2598 days ago.

For quite some time I’ve wanted to record a new video talking about code comments for my "writing system software" series on YouTube. However, after giving it some thought, I realized that the topic was better suited for a blog post, so here we are. In this post I analyze Redis comments, trying to categorize them.  Along the way I try to show why, in my opinion, writing comments is of paramount importance in order to produce good code, that is maintainable in the long run and understandable by others and by the authors during modifications and debugging activities.

read the full post at http://antirez.com/news/124

LOLWUT: a piece of art inside a database command

▼

antirez 2622 days ago.

The last few days have been quite intense. One of the arguments, about the dispute related to replacing or not the words used in Redis replication with different ones, was the following: is it worthwhile to do work that does not produce any technological result?

As I was changing the Redis source code to get rid of a specific word where possible, I started to think that whatever my idea was about the work I was doing, I’m the kind of person that enjoys writing code that has no measurable technological effects. Replacing words is just annoying, even if, even there, there were a few worthwhile technological challenges. But there is some other kind of code that I believe has a quality called “hack value”. It may not solve any technological problem, yet it’s worth to write. Sometimes because the process of writing the code is, itself, rewarding. Other times because very technically advanced ideas are used to solve a not useful problem. Sometimes code is just written for artistic reasons.

read the full post at http://antirez.com/news/123

On Redis master-slave terminology

▼

antirez 2628 days ago.

Today it happened again. A developer, that we’ll call Mark to avoid exposing his real name, read the Redis 5.0 RC5 change log, and was disappointed to see that Redis still uses the “master” and “slave” terminology in order to identify different roles in Redis replication.

I said that I was sorry he was disappointed about that, but at the same time, I don’t believe that terminology out of context is offensive, so if I use master-slave in the context of databases, and I’m not referring in any way to slavery. I originally copied the terms from MySQL, and now they are the way we call things in Redis, and since I do not believe in this battle (I’ll tell you later why), to change the documentation, deprecate the API and add a new one, change the INFO fields, just to make a subset of people that care about those things more happy, do not make sense to me.

read the full post at http://antirez.com/news/122

Redis is not "open core"

▼

antirez 2641 days ago.

Human beings have a strong tendency to put new facts into pre-existing categories. This is useful to mentally and culturally classify similar events under the same logical umbrella, so when two days ago I clarified that the Redis core was still released under the vanilla BSD license, and only certain Redis modules developed by Redis Labs were going to change license, from AGPL to a different non open source license, people said “Ah! Ok you are going open core”.

The simplification this time does not work if it is in your interest to capture the truth of what is happening here. An open core technology requires two things. One is that the system is modular, and the other is that parts of such system are made proprietary in order to create a product around an otherwise free software. For example providing a single node of a database into the open source, and then having the clustering logic and mechanism implemented in a different non-free layer, is an open core technology. Similarly is open core if I write a relational database with a modular storage system, but the only storage that is able to provide strong guarantees is non free. In an open core business model around an open source system it is *fundamental* that you take something useful out of the free software part.

read the full post at http://antirez.com/news/121

Redis will remain BSD licensed

▼

antirez 2643 days ago.

Today a page about the new Common Clause license in the Redis Labs web site was interpreted as if Redis itself switched license. This is not the case, Redis is, and will remain, BSD licensed. However in the era of [edit] uncontrollable spreading of information, my attempts to provide the correct information failed, and I’m still seeing everywhere “Redis is no longer open source”. The reality is that Redis remains BSD, and actually Redis Labs did the right thing supporting my effort to keep the Redis core open as usually.

read the full post at http://antirez.com/news/120

Redis Lua scripting: several security vulnerabilities fixed

▼

antirez 2713 days ago.

A bit more than one month ago I received an email from the Apple Information Security team. During an auditing the Apple team found a security issue in the Redis Lua subsystem, specifically in the cmsgpack library. The library is not part of Lua itself, it is an implementation of MessagePack I wrote myself. In the course of merging a pull request improving the feature set, a security issue was added. Later the same team found a new issue in the Lua struct library, again such library was not part of Lua itself, at least in the release of Lua we use: we just embedded the source code inside our Lua implementation in order to provide some functionality to the Lua interpreter that is available to Redis users. Then I found another issue in the same struct package, and later the Alibaba team found many other issues in cmsgpack and other code paths using the Lua API. In a short amount of time I was sitting on a pile of Lua related vulnerabilities.

read the full post at http://antirez.com/news/119

Clarifications on the Incapsula Redis security report

▼

antirez 2724 days ago.

A few days ago I started my day with my Twitter feed full of articles saying something like: “75% of Redis servers infected by malware”. The obvious misquote referred to a research by Incapsula where they found that 75% of the Redis instances left open on the internet, without any protection, on a public IP address, are infected [1].

[1] https://www.incapsula.com/blog/report-75-of-open-redis-servers-are-infected.html

Many folks don’t need any clarification about all this, because if you have some grip on computer security and how Redis works, you can contextualize all this without much efforts. However I’m writing this blog post for two reasons. The obvious one is that it can help the press and other users that are not much into security and/or Redis to understand what’s going on. The second is that the exposed Redis instances are a case study about safe defaults that should be interesting for the security circles.

read the full post at http://antirez.com/news/118

A short tale of a read overflow

▼

antirez 2839 days ago.

[This blog post is also experimentally available on Medium: https://medium.com/antirez/a-short-tale-of-a-read-overflow-b9210d339cff]

When a long running process crashes, it is pretty uncool. More so if the process happens to take a lot of state in memory. This is why I love web programming frameworks that are able, without major performance overhead, to create a new interpreter and a new state for each page view, and deallocate every resource used at the end of the page generation. It is an inherently more reliable programming paradigm, where memory leaks, descriptor leaks, and even random crashes from time to time do not constitute a serious issue. However system software like Redis is at the other side of the spectrum, a side populated by things that should never crash.

read the full post at http://antirez.com/news/117

An update on Redis Streams development

▼

antirez 2852 days ago.

I saw multiple users asking me what is happening with Streams, when they’ll be ready for production uses, and in general what’s the ETA and the plan of the feature. This post will attempt to clarify a bit what comes next.

To start, in this moment Streams are my main priority: I want to finish this work that I believe is very useful in the Redis community and immediately start with the Redis Cluster improvements plans. Actually the work on Cluster has already started, with my colleague Fabio Nicotra that is porting redis-trib, the Cluster management tool, inside the old and good redis-cli. This step involves translating the code from Ruby to C. In the meantime, a few weeks ago I finished writing the Streams core, and I deleted the “streams” feature branch, merging everything into the “unstable” branch.

read the full post at http://antirez.com/news/116

Redis PSYNC2 bug post mortem

▼

antirez 2906 days ago.

Four days ago a user posted a critical issue in the Redis Github repository. The problem was related to the new Redis 4.0 PSYNC2 replication protocol, and was very critical. PSYNC2 brings a number of good things to Redis replication, including the ability to resynchronize just exchanging the differences, and not the whole data set, after a failover, and even after a slave controlled restart. The problem was about this latter feature: with PSYNC2 the RDB file is augmented with replication information. After a slave is restarted, the replication metadata is loaded back, and the slave is able to perform a PSYNC attempt, trying to handshake with the master and receive the differences since the last disconnection.

read the full post at http://antirez.com/news/115

Streams: a new general purpose data structure in Redis.

▼

antirez 2967 days ago.

Until a few months ago, for me streams were no more than an interesting and relatively straightforward concept in the context of messaging. After Kafka popularized the concept, I mostly investigated their usefulness in the case of Disque, a message queue that is now headed to be translated into a Redis 4.2 module. Later I decided that Disque was all about AP messaging, which is, fault tolerance and guarantees of delivery without much efforts from the client, so I decided that the concept of streams was not a good match in that case.

read the full post at http://antirez.com/news/114

Doing the FizzleFade effect using a Feistel network

▼

antirez 3001 days ago.

Today I read an interesting article about how the Wolfenstein 3D game implemented a fade effect using a Linear Feedback Shift Register. Every pixel of the screen is set red in a pseudo random way, till all the screen turns red (or other colors depending on the event happening in the game). The blog post describing the implementation is here and is a nice read: http://fabiensanglard.net/fizzlefade/index.php

You  may wonder why the original code used a LFSR or why I'm proposing a different approach, instead of the vanilla setPixel(rand(),rand()): doing this with a pseudo random generator, as noted in the blog post, is slow, but is also visually very unpleasant, since the more red pixels you have on the screen already, the less likely is that you hit a new yet-not-red pixel, so the final pixels take forever to turn red (I *bet* that many readers of this blog post tried it in the old times of the Spectum, C64, or later with QBASIC or GWBasic). In the final part of the blog post the author writes:

read the full post at http://antirez.com/news/113

The mythical 10x programmer

▼

antirez 3183 days ago.

A 10x programmer is, in the mythology of programming, a programmer that can do ten times the work of another normal programmer, where for normal programmer we can imagine one good at doing its work, but without the magical abilities of the 10x programmer. Actually to better characterize the “normal programmer” it is better to say that it represents the one having the average programming output, among the programmers that are professionals in this discipline.

The programming community is extremely polarized about the existence or not of such a beast: who says there is no such a thing as the 10x programmer, who says it actually does not just exist, but there are even 100x programmers if you know where to look for.

read the full post at http://antirez.com/news/112

Redis on the Raspberry Pi: adventures in unaligned lands

▼

antirez 3187 days ago.

After 10 million of units sold, and practically an endless set of different applications and auxiliary devices, like sensors and displays, I think it’s deserved to say that the Raspberry Pi is not just a success, it also became one of the preferred platforms for programmers to experiment in the embedded space. Probably with things like the Pi zero, it is also becoming the platform in order to create hardware products, without incurring all the risks and costs of designing, building, and writing software for vertical devices.

read the full post at http://antirez.com/news/111

The first release candidate of Redis 4.0 is out

▼

antirez 3271 days ago.

It’s not yet stable but it’s soon to become, and comes with a long list of things that will make Redis more useful for we users: finally Redis 4.0 Release Candidate 1 is here, and is bold enough to call itself 4.0 instead of 3.4. For me semantic versioning is not a thing, what I like instead is try to communicate, using version numbers and jumps, what’s up with the new version, and in this specific case 4.0 means “this is the shit”.

It’s just that Redis 4.0 has a lot of things that Redis should have had since ages, in a different world where one developer can, like Ken The Warrior, duplicate itself in ten copies and start to code. But it does not matter how hard I try to learn about new vim shortcuts, still the duplicate-me thing is not in my chords.

read the full post at http://antirez.com/news/110

Random notes on improving the Redis LRU algorithm

▼

antirez 3398 days ago.

Redis is often used for caching, in a setup where a fixed maximum memory to use is specified. When new data arrives, we need to make space by removing old data. The efficiency of Redis as a cache is related to how good decisions it makes about what data to evict: deleting data that is going to be needed soon is a poor strategy, while deleting data that is unlikely to be requested again is a good one.

In other terms every cache has an hits/misses ratio, which is, in qualitative terms, just the percentage of read queries that the cache is able to serve. Accesses to the keys of a cache are not distributed evenly among the data set in most workloads. Often a small percentage of keys get a very large percentage of all the accesses. Moreover the access pattern often changes over time, which means that as time passes certain keys that were very requested may no longer be accessed often, and conversely, keys that once were not popular may turn into the most accessed keys.

read the full post at http://antirez.com/news/109

Writing an editor in less than 1000 lines of code, just for fun

▼

antirez 3416 days ago.

WARNING: Long pretty useless blog post. TLDR is that I wrote, just for fun, a text editor in less than 1000 lines of code that does not depend on ncurses and has support for syntax highlight and search feature. The code is here: http://github.com/antirez/kilo.

Screencast here: https://asciinema.org/a/90r2i9bq8po03nazhqtsifksb

For the sentimentalists, keep reading…

A couple weeks ago there was this news about the Nano editor no longer being part of the GNU project. My first reaction was, wow people still really care about an old editor which is a clone of an editor originally part of a terminal based EMAIL CLIENT. Let’s say this again, “email client”. The notion of email client itself is gone at this point, everything changed. And yet I read, on Hacker News, a number of people writing how they were often saved by the availability of nano on random systems, doing system administrator tasks, for example. Nano is also how my son wrote his first program in C. It’s an acceptable experience that does not require past experience editing files.

read the full post at http://antirez.com/news/108

Programmers are not different, they need simple UIs.

▼

antirez 3463 days ago.

I’m spending days trying to get a couple of APIs right. New APIs about modules, and a new Redis data type.
I really mean it when I say *days*, just for the API. Writing drafts, starting the implementation shaping data structures and calls, and then restarting from scratch to iterate again in a better way, to improve the design and the user facing part.

Why I do that, delaying features for weeks? Is it really so important?
Programmers are engineers, maybe they should just adapt to whatever API is better to export for the system exporting it.

read the full post at http://antirez.com/news/107

Redis Loadable Modules System

▼

antirez 3477 days ago.

It was a matter of time but it eventually happened. In the Redis 1.0 release notes, 7 years ago, I mentioned that one of the interesting features for the future was “loadable modules”. I was really interested in such a feature back then, but over the years I became more and more skeptic about the idea of adding loadable modules in Redis. And probably for good reasons.

Modules can be the most interesting feature of a system and the most problematic one at the same
time: API incompatibilities between versions, low quality modules crashing the system, a lack

read the full post at http://antirez.com/news/106

Three ideas about text messages

▼

antirez 3480 days ago.

I’m aboard of a flight bringing me to San Francisco. Eventually I purchased the slowest internet connection of my life (well at least for a good reason), but for several hours I was without internet, as usually when I fly.

I don’t mind staying disconnected for some time usually. It’s a good time to focus, write some code, or a blog post like this one. However when I’m disconnected, what makes the most difference is not Facebook or Twitter or Github, but the lack of text messages.

At this point text messages are a fundamental thing in my life. They are also probably the main source of distraction. I use messages to talk with my family, even just to communicate between different floors. I use messages with friends to organize dinners and vacations. I even use messages with the plumber or the doctor.

read the full post at http://antirez.com/news/105

Redis 3.2.0 is out!

▼

antirez 3481 days ago.

It took more than expected, but finally we have it, Redis 3.2.0 stable is out with changes that may be useful to a big number of Redis users. At this point I covered the changes multiple time, but the big ones are:

* The GEO API. Index whatever you want by latitude and longitude, and query by radius, with the same speed and easy of use of the other Redis data structures. Here you can find the API documentation: http://redis.io/commands/#geo. Thank you to Matt Stancliff for the initial implementation, that was reworked but is still at the core of the GEO API, and to the developers of ARDB for providing the geo indexing code that Matt used.

read the full post at http://antirez.com/news/104

100 more of those BITFIELDs

▼

antirez 3551 days ago.

Today Redis is 7 years old, so to commemorate the event a bit I passed the latest couple of days doing a fun coding marathon to implement a new crazy command called BITFIELD.

The essence of this command is not new, it was proposed in the past by me and others, but never in a serious way, the idea always looked a bit strange. We already have bit operations in Redis: certain users love it, it’s a good way to represent a lot of data in a compact way. However so far we handle each bit separately, setting, testing, getting bits, counting all the bits that are set in a range, and so forth.

read the full post at http://antirez.com/news/103

The binary search of distributed programming

▼

antirez 3564 days ago.

Yesterday night I was re-reading Redlock analysis Martin Kleppmann wrote (http://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html). At some point Martin wonders if there is some good way to generate monotonically increasing IDs with Redis.

This apparently simple problem can be more complex than it looks at a first glance, considering that it must ensure that, in all the conditions, there is a safety property which is always guaranteed: the ID generated is always greater than all the past IDs generated, and the same ID cannot be generated multiple times. This must hold during network partitions and other failures. The system may just become unavailable if there are less than the majority of nodes that can be reached, but never provide the wrong answer (note: as we'll see this algorithm has another liveness issue that happens during high load of requests).

read the full post at http://antirez.com/news/102

Is Redlock safe?

▼

antirez 3568 days ago.

Martin Kleppmann, a distributed systems researcher, yesterday published an analysis of Redlock (http://redis.io/topics/distlock), that you can find here: http://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html

Redlock is a client side distributed locking algorithm I designed to be used with Redis, but the algorithm orchestrates, client side, a set of nodes that implement a data store with certain capabilities, in order to create a multi-master fault tolerant, and hopefully safe, distributed lock with auto release capabilities.

read the full post at http://antirez.com/news/101

Disque 1.0 RC1 is out!

▼

antirez 3606 days ago.

Today I’m happy to announce that the first release candidate for Disque 1.0 is available.

If you don't know what Disque is, the best starting point is to read the README in the Github project page at http://github.com/antirez/disque.

Disque is a just piece of software, so it has a material value which can be zero or more, depending on its ability to make useful things for people using it. But for me there is an huge value that goes over what Disque, materially, is. It is the value of designing and doing something you care about. It’s the magic of programming: where there was nothing, now there is something that works, that other people may potentially analyze, run, use.

read the full post at http://antirez.com/news/100

Generating unique IDs: an easy and reliable way

▼

antirez 3648 days ago.

Two days ago Mike Malone published an interesting post on Medium about the V8 implementation of Math.random(), and how weak is the quality of the PRNG used: http://bit.ly/1SPDraN.

The post was one of the top news on Hacker News today. It’s pretty clear and informative from the point of view of how Math.random() is broken and how should be fixed, so I’ve nothing to add to the matter itself. But since the author discovered the weakness of the PRNG in the context of generating large probably-non-colliding IDs, I want to share with you an alternative that I used multiple times in the past, which is fast and extremely reliable.

read the full post at http://antirez.com/news/99

6 years of commit visualized

▼

antirez 3649 days ago.

Today I was curious about plotting all the Redis commits we have on Git, which are 90% of all the Redis commits. There was just an initial period where I used SVN but switched very soon.

Full size image here: http://antirez.com/misc/commitsvis.png



Each commit is a rectangle. The height is the number of affected lines (a logarithmic scale is used). The gray labels show release tags.

There are little surprises since the amount of commit remained pretty much the same over the time, however now that we no longer backport features back into 3.0 and future releases, the rate at which new patchlevel versions are released diminished.

read the full post at http://antirez.com/news/98

Recent improvements to Redis Lua scripting

▼

antirez 3650 days ago.

Lua scripting is probably the most successful Redis feature, among the ones introduced when Redis was already pretty popular: no surprise that a few of the things users really want are about scripting. The following two features were suggested multiple times over the last two years, and many people tried to focus my attention into one or the other during the Redis developers meeting, a few weeks ago.

1. A proper debugger for Redis Lua scripts.
2. Replication, and storage on the AOF, of Lua scripts as a set of write commands materializing the *effects* of the script, instead of replicating the script itself as we normally do.

read the full post at http://antirez.com/news/97

A few things about Redis security

▼

antirez 3666 days ago.

IMPORTANT EDIT: Redis 3.2 security improved by implementing protected mode. You can find the details about it here: https://www.reddit.com/r/redis/comments/3zv85m/new_security_feature_redis_protected_mode/

From time to time I get security reports about Redis. It’s good to get reports, but it’s odd that what I get is usually about things like Lua sandbox escaping, insecure temporary file creation, and similar issues, in a software which is designed (as we explain in our security page here http://redis.io/topics/security) to be totally insecure if exposed to the outside world.

read the full post at http://antirez.com/news/96

Moving the Redis community on Reddit

▼

antirez 3678 days ago.

I’m just back from the Redis Dev meeting 2015. We spent two incredible days talking about Redis internals in many different ways. However while I’m waiting to receive private notes from other attenders, in order to summarize in a blog post what happened and what were the most important ideas exposed during the meetings, I’m going to touch a different topic here. I took the non trivial decision to move the Redis mailing list, consisting of 6700 members, to Reddit.

This looks like a crazy ideas probably in some way, and “to move” is probably not the right verb, since the ML will still exist. However it will only be used in order to receive announcements of new releases, critical informations like security related ones, and from time to time, links to very important discussions that are happening on Reddit.

read the full post at http://antirez.com/news/95

Clarifications about Redis and Memcached

▼

antirez 3704 days ago.

If you know me, you know I’m not the kind of guy that considers competing products a bad thing. I actually love the users to have choices, so I rarely do anything like comparing Redis with other technologies.
However it is also true that in order to pick the right solution users must be correctly informed.

This post was triggered by reading a blog post published by Mike Perham, that you may know as the author of a popular library called Sidekiq, that happens to use Redis as backend. So I would not consider Mike a person which is “against” Redis at all. Yet in his blog post that you can find at the URL http://www.mikeperham.com/2015/09/24/storing-data-with-redis/ he states that, for caching, “you should probably use Memcached instead [of Redis]”. So Mike simply really believes Redis is not good for caching, and he arguments his thesis in this way:

read the full post at http://antirez.com/news/94

Lazy Redis is better Redis

▼

antirez 3705 days ago.

Everybody knows Redis is single threaded. The best informed ones will tell you that, actually, Redis is *kinda* single threaded, since there are threads in order to perform certain slow operations on disk. So far threaded operations were so focused on I/O that our small library to perform asynchronous tasks on a different thread was called bio.c: Background I/O, basically.

However some time ago I opened an issue where I promised a new Redis feature that many wanted, me included, called “lazy free”. The original issue is here: https://github.com/antirez/redis/issues/1748.

read the full post at http://antirez.com/news/93

About Redis Sets memory efficiency

▼

antirez 3733 days ago.

Yesterday Amplitude published an article about scaling analytics, in the context of using the Set data type. The blog post is here: https://amplitude.com/blog/2015/08/25/scaling-analytics-at-amplitude/

On Hacker News people asked why not using Redis instead: https://news.ycombinator.com/item?id=10118413 

Amplitude developers have their set of reasons for not using Redis, and in general if you have a very specific problem and want to scale it in the best possible way, it makes sense to implement your vertical solution. I’m not adverse to reinventing the wheel, you want your very specific wheel sometimes, that a general purpose system may not be able to provide. Moreover creating your solution gives you control on what you did, boosts your creativity and your confidence in what you, as a developer can do, makes you able to debug whatever bug may arise in the future without external help.

read the full post at http://antirez.com/news/92

Thanks Pivotal, Hello Redis Labs

▼

antirez 3777 days ago.

I consider myself very lucky for contributing to the open source. For me OSS software is not just a license: it means transparency in the development process, choices that are only taken in order to improve software from the point of view of the users, documentation that attempts to cover everything, and simple, understandable systems. The Redis community had the privilege of finding in Pivotal, and VMware before, a company that thinks at open source in the same way as we, the community of developers, think of it.

read the full post at http://antirez.com/news/91

Commit messages are not titles

▼

antirez 3799 days ago.

Nor subjects, for what matters. Everybody will tell you to don't add a dot at the end of the first line of a commit message. I followed the advice for some time, but I'll stop today, because I don't believe commit messages are titles or subjects. They are synopsis of the meaning of the change operated by the commit, so they are small sentences. The sentence can be later augmented with more details in the next lines of the commit message, however many times there is *no* body, there is just the first line. How many emails or articles you see with just the subject or the title? Very little, I guess. So for me it is like:

read the full post at http://antirez.com/news/90

Plans for Redis 3.2

▼

antirez 3810 days ago.

I’m back from Paris, DotScale 2015 was a very interesting conference. Before leaving I was working on Sentinel in the context of the unstable branch: the work was mainly about connection sharing. In short, it is the ability of a few Sentinels to scale, monitoring many masters. Before to leave, and now that I’m back, I tried to “secure” a set of features that will be the basis for Redis 3.2. In the next weeks I’ll be focusing developing these features, so I thought it’s worth to share the list with you ASAP.

read the full post at http://antirez.com/news/89

Adventures in message queues

▼

antirez 3899 days ago.

EDIT: In case you missed it, Disque source code is now available at http://github.com/antirez/disque

It is a few months that I spend ~ 15-20% of my time, mostly hours stolen to nights and weekends, working to a new system. It’s a message broker and it’s called Disque. I’ve an implementation of 80% of what was in the original specification, but still I don’t feel like it’s ready to be released. Since I can’t ship, I’ll at least blog… so that’s the story of how it started and a few details about what it is.

read the full post at http://antirez.com/news/88

Redis Conference 2015

▼

antirez 3904 days ago.

I’m back home, after a non easy trip, since to travel from San Francisco to Sicily is kinda NP complete: there are no solutions involving less than three flights. However it was definitely worth it, because the Redis Conference 2015 was very good, SF was wonderful as usually and I was able to meet with many interesting people. Here I’ll limit myself to writing a short account of the conference, but the trip was also an incredible experience because I discovered old and new friends, that are not just smart programmers, but also people I could imagine being my friends here in Sicily. I never felt alone while I was 10k kilometers away from my home.

read the full post at http://antirez.com/news/87

Side projects

▼

antirez 3916 days ago.

Today Redis is six years old. This is an incredible accomplishment for me, because in the past I switched to the next thing much faster. There are things that lasted six years in my past, but not like Redis, where after so much time, I still focus most of my everyday energies into.

How did I stopped doing new things to focus into an unique effort, drastically monopolizing my professional life? It was a too big sacrifice to do, for an human being with a limited life span. Fortunately I simply never did this, I never stopped doing new things.

read the full post at http://antirez.com/news/86

Why we don’t have benchmarks comparing Redis with other DBs

▼

antirez 3944 days ago.

Redis speed could be one selling point for new users, so following the trend of comparative “advertising” it should be logical to have a few comparisons at Redis.io. However there are two problems with this. One is of goals: I don’t want to convince developers to adopt Redis, we just do our best in order to provide a suitable product, and we are happy if people can get work done with it, that’s where my marketing wishes end. There is more: it is almost always impossible to compare different systems in a fair way.

read the full post at http://antirez.com/news/85

Redis latency spikes and the Linux kernel: a few more details

▼

antirez 4031 days ago.

Today I was testing Redis latency using m3.medium EC2 instances. I was able to replicate the usual latency spikes during BGSAVE, when the process forks, and the child starts saving the dataset on disk. However something was not as expected. The spike did not happened because of disk I/O, nor during the fork() call itself.

The test was performed with a 1GB of data in memory, with 150k writes per second originating from a different EC2 instance, targeting 5 million keys (evenly distributed). The pipeline was set to 4 commands. This translates to the following command line of redis-benchmark:

read the full post at http://antirez.com/news/84

Redis latency spikes and the 99th percentile

▼

antirez 4035 days ago.

One interesting thing about the Stripe blog post about Redis is that they included latency graphs obtained during their tests. In order to persist on disk Redis requires to call the fork() system call. Usually forking using physical servers, and most hypervisors, is fast even with big processes. However Xen is slow to fork, so with certain EC2 instance types (and other virtual servers providers as well), it is possible to have serious latency spikes every time the parent process forks in order to persist on disk. The Stripe graph is pretty clear in this regard.

read the full post at http://antirez.com/news/83

This is why I can’t have conversations using Twitter

▼

antirez 4036 days ago.

Yesterday Stripe engineers wrote a detailed report of why they had an issue with Redis. This is very appreciated. In the Hacker News thread I explained that because now we have diskless replication (http://antirez.com/news/81) now persistence is no longer mandatory for people having a master-slaves replicas set. This changes the design constraints: now that we can have diskless replicas synchronization, it is worth it to better support the Stripe (ex?) use case of replicas set with persistence turned down, in a more safe way. This is a work in progress effort.

read the full post at http://antirez.com/news/82

Diskless replication: a few design notes.

▼

antirez 4038 days ago.

Almost a month ago a number of people interested in Redis development met in London for the first Redis developers meeting. We identified together a number of features that are urgent (and are now listed in a Github issue here: https://github.com/antirez/redis/issues/2045), and among the identified issues, there was one that was mentioned multiple times in the course of the day: diskless replication.

The feature is not exactly a new idea, it was proposed several times, especially by EC2 users that know that sometimes it is not trivial for a master to provide good performances during slaves synchronization. However there are a number of use cases where you don’t want to touch disks, even running on physical servers, and especially when Redis is used as a cache. Redis replication was, in short, forcing users to use disk even when they don’t need or want disk durability.

read the full post at http://antirez.com/news/81

A few arguments about Redis Sentinel properties and fail scenarios.

▼

antirez 4044 days ago.

Yesterday distributed systems expert Aphyr, posted a tweet about a Redis Sentinel issue experienced by an unknown company (that wishes to remain anonymous):

“OH on Redis Sentinel "They kill -9'd the master, which caused a split brain..."
“then the old master popped up with no data and replicated the lack of data to all the other nodes. Literally had to restore from backups."

OMG we have some nasty bug I thought. However I tried to get more information from Kyle, and he replied that the users actually disabled disk persistence at all from the master process. Yep: the master was configured on purpose to restart with a wiped data set.

read the full post at http://antirez.com/news/80

Redis cluster, no longer vaporware.

▼

antirez 4056 days ago.

The first commit I can find in my git history about Redis Cluster is dated March 29 2011, but it is a “copy and commit” merge: the history of the cluster branch was destroyed since it was a total mess of work-in-progress commits, just to shape the initial idea of API and interactions with the rest of the system.

Basically it is a roughly 4 years old project. This is about two thirds the whole history of the Redis project. Yet, it is only today, that I’m releasing a Release Candidate, the first one, of Redis 3.0.0, which is the first version with Cluster support.

read the full post at http://antirez.com/news/79

Queues and databases

▼

antirez 4143 days ago.

Queues are an incredibly useful tool in modern computing, they are often used in order to perform some possibly slow computation at a latter time in web applications. Basically queues allow to split a computation in two times, the time the computation is scheduled, and the time the computation is executed. A “producer”, will put a task to be executed into a queue, and a “consumer” or “worker” will get tasks from the queue to execute them. For example once a new user completes the registration process in a web application, the web application will add a new task to the queue in order to send an email with the activation link. The actual process of sending an email, that may require retrying if there are transient network failures or other errors, is up to the worker.

read the full post at http://antirez.com/news/78

A proposal for more reliable locks using Redis

▼

antirez 4202 days ago.

-----------------
UPDATE: The algorithm is now described in the Redis documentation here => http://redis.io/topics/distlock. The article is left here in its older version, the updates will go into the Redis documentation instead.
-----------------

Many people use Redis to implement distributed locks. Many believe that this is a great use case, and that Redis worked great to solve an otherwise hard to solve problem. Others believe that this is totally broken, unsafe, and wrong use case for Redis.

read the full post at http://antirez.com/news/77

Using Heartbleed as a starting point

▼

antirez 4238 days ago.

The strong reactions about the recent OpenSSL bug are understandable: it is not fun when suddenly all the internet needs to be patched. Moreover for me personally how trivial the bug is, is disturbing. I don’t want to point the finger to the OpenSSL developers, but you just usually think at those class of issues as a bit more subtle, in the case of a software like OpenSSL. Usually you fail to do sanity checks *correctly*, as opposed to this bug where there is a total *lack* of bound checks in the memcpy() call.

read the full post at http://antirez.com/news/76

Redis new data structure: the HyperLogLog

▼

antirez 4247 days ago.

Generally speaking, I love randomized algorithms, but there is one I love particularly since even after you understand how it works, it still remains magical from a programmer point of view. It accomplishes something that is almost illogical given how little it asks for in terms of time or space. This algorithm is called HyperLogLog, and today it is introduced as a new data structure for Redis.

Counting unique things
===

Usually counting unique things, for example the number of unique IPs that connected today to your web site, or the number of unique searches that your users performed, requires to remember all the unique elements encountered so far, in order to match the next element with the set of already seen elements, and increment a counter only if the new element was never seen before.

read the full post at http://antirez.com/news/75

Fascinating little programs

▼

antirez 4266 days ago.

Yesterday and today I managed to spend some time with linenoise (http://github.com/antirez/linenoise), a minimal line-editing library designed to be a simple and small replacement for readline.
I was trying to merge a few pull requests, to fix issues, and doing some refactoring at the same time. It was some kind of nirvana I was feeling: a complete control of small, self-contained, and useful code.

There is something special in simple code. Here I’m not referring to simplicity to fight complexity or over engineering, but to simplicity per se, auto referential, without goals if not beauty, understandability and elegance.

read the full post at http://antirez.com/news/74

What is performance?

▼

antirez 4279 days ago.

The title of this blog post is an apparently trivial to answer question, however it is worth to consider a bit better what performance really means: it is easy to get confused between scalability and performance, and to decompose performance, in the specific case of database systems, in its different main components, may not be trivial. In this short blog post I’ll try to write down my current idea of what performance is in the context of database systems.

A good starting point is probably the first slide I use lately in my talks about Redis. This first slide is indeed about performance, and says that performance is mainly three different things.

read the full post at http://antirez.com/news/73

Happy birthday Redis!

▼

antirez 4281 days ago.

Today Redis is 5 years old, at least if we count starting from the initial HN announcement [1], that’s actually a good starting point. After all an open source project really exists as soon as it is public.

I’m a bit shocked I worked for five years straight to the same thing. The opportunities for learning new things I had because of the directions where Redis pushed me, and the opportunities to learn new things that I missed because I had almost consistently no time for random hacking, are huge.

read the full post at http://antirez.com/news/72

A simple distributed algorithm for small idempotent information

▼

antirez 4286 days ago.

In this blog post I’m going to describe a very simple distributed algorithm that is useful in different programming scenarios.
The algorithm is useful when you need to take some kind of information synchronized among a number of processes.
The information can be everything as long as it is composed of a small number of bytes, and as long as it is idempotent, that is, the current value of the information does not depend on the previous value, and we can just replace an old value, with the new one.

read the full post at http://antirez.com/news/71

Redis Cluster and limiting divergences.

▼

antirez 4318 days ago.

Redis Cluster is finally on its road to reach the first stable release in a short timeframe as already discussed in the Redis google group [1]. However despite a design never proposed for the implementation of Redis Cluster was analyzed and discussed at long in the past weeks (unfortunately creating some confusion: many people, including notable personalities of the NoSQL movement, confused the analyzed proposal with Redis Cluster implementation), no attempt was made to analyze or categorize Redis Cluster itself.

read the full post at http://antirez.com/news/70

Some fun with Redis Cluster testing

▼

antirez 4351 days ago.

One of the steps to reach the goal of providing a "testable" Redis Cluster experience to users within a few weeks, is some serious testing that goes over the usual "I'm running 3 nodes in my macbook, it works". Finally this is possible, since Redis Cluster entered into the "refinements" stage, and most of the system design and implementation is in its final form already.

In order to perform some testing I assembled an environment like that:

* Hardware: 6 real computers: 2 macbook pro, 2 macbook air, 1 Linux desktop, 1 Linux tiny laptop called EEEpc running with a single core at 800Mhz.

read the full post at http://antirez.com/news/69

Redis as AP system, reloaded

▼

antirez 4358 days ago.

So finally something really good happened from the Redis criticism thread.

At the end of the work day I was reading about Redis as AP and merge operations on Twitter. At the same time I was having a private email exchange with Alexis Richardson (from RabbitMQ, and, my boss). Alexis at some point proposed that perhaps a way to improve safety was to asynchronously ACK the client about what commands actually were not received so that the client could retry. This seemed a lot of efforts in the client side, but somewhat totally opened my view on the matter.

read the full post at http://antirez.com/news/68

The Redis criticism thread

▼

antirez 4360 days ago.

A few days ago I tried to do an experiment by running some kind of “call for critiques” in the Redis mailing list:

https://groups.google.com/forum/#!topic/redis-db/Oazt2k7Lzz4

The thread has reached 89 posts so far, probably one of the biggest threads in the history of the Redis google group.
The main idea was that critiques are a mix of pointless attacks, and truth, so to extract the truth from critiques can be a good exercise, it means to have some seed idea for future improvements from the part of the population that is not using or is not happy with your system.

read the full post at http://antirez.com/news/67

WAIT: synchronous replication for Redis

▼

antirez 4364 days ago.

Redis unstable has a new command called "WAIT". Such a simple name, is indeed the incarnation of a simple feature consisting of less than 200 lines of code, but providing an interesting way to change the default behavior of Redis replication.

The feature was extremely easy to implement because of previous work made. WAIT was basically a direct consequence of the new Redis replication design (that started with Redis 2.8). The feature itself is in a form that respects the design of Redis, so it is relatively different from other implementations of synchronous replication, both at API level, and from the point of view of the degree of consistency it is able to ensure.

read the full post at http://antirez.com/news/66

Blog lost and recovered in 30 minutes

▼

antirez 4367 days ago.

Yesterday I lost all my blog data in a rather funny way. When I installed this new blog engine, that is basically a Lamer News slightly modified to serve as a blog, I spinned a Redis instance manually with persistence *disabled* just to see if it was working and test it a bit.

I just started a screen instance, and run something like ./redis-server --port 10000. Since this is equivalent to an empty config file with just "port 10000" inside I was running no disk backed at all.

Since Redis very rarely crashes, guess what, after more than one year it was still running inside the screen session, and I totally forgot that it was running like that, happily writing controversial posts in my blog. Yesterday my server was under attack. This caused an higher then normal load, and Linode rebooted the instance. As a result my blog was gone.

read the full post at http://antirez.com/news/65

The fight against sexism is not a free pass

▼

antirez 4368 days ago.

Today Joyent wrote a blog post in the company blog about an issue that started with this pull request in the libuv project: https://github.com/joyent/libuv/pull/1015#issuecomment-29538615

Basically the developer Ben Noordhuis rejected a pull request involving a change in the documentation to use gender-neutral form instead of “him”. Joyent replied with this incredible post: http://www.joyent.com/blog/the-power-of-a-pronoun.

In the blog post you can read:

“But while Isaac is a Joyent employee, Ben is not—and if he had been, he wouldn't be as of this morning: to reject a pull request that eliminates a gendered pronoun on the principle that pronouns should in fact be gendered would constitute a fireable offense for me and for Joyent.”

read the full post at http://antirez.com/news/64

Finally Redis collections are iterable

▼

antirez 4403 days ago.

Redis API for data access is usually limited, but very direct and straightforward.

It is limited because it only allows to access data in a natural way, that is, in a data structure obvious way. Sorted sets are easy to access by score ranges, while hashes by field name, and so forth.
This API “way” has profound effects on what Redis is and how users organize data into it, because an API that is data-obvious means fast operations, less code and less bugs in the implementation, but especially forcing the application layer to make meaningful choices: the database as a system in which you are responsible of organizing data in a way that makes sense in your application, versus a database as a magical object where you put data inside, and then it will be able to fetch and organize data for you in any format.

read the full post at http://antirez.com/news/63

New Redis Cluster meta-data handling

▼

antirez 4434 days ago.

This blog post describes the new algorithm used in Redis Cluster in order to propagate and update metadata, that is hopefully significantly safer than the previous algorithm used. The Redis Cluster specification was not yet updated, as I'm rewriting it from scratch, so this blog post serves as a first way to share the algorithm with the community.

Let's start with the problem to solve. Redis Cluster uses a master - slave design in order to recover from nodes failures. The key space is partitioned across the different masters in the cluster, using a concept that we call "hash slots". Basically every key is hashed into a number between 0 and 16383. If a given key hashes to 15, it means it is in the hash slot number 15. These 16k hash slots are split among the different masters.

read the full post at http://antirez.com/news/62

English has been my pain for 15 years

▼

antirez 4459 days ago.

Paul Graham managed to put a very important question, the one of the English language as a requirement for IT workers, in the attention zone of news sites and software developers [1]. It was a controversial matter as he referred to "foreign accents" and the internet is full of people that are just waiting to overreact, but this is the least interesting part of the question, so I'll skip that part. The important part is, no one talks about the "English problem" usually, and I always felt a bit alone in that side, like if it was a problem only affecting me, so in this blog post I want to share my experience about English.

read the full post at http://antirez.com/news/61

Twilio incident and Redis

▼

antirez 4499 days ago.

Twilio just released a post mortem about an incident that caused issues with the billing system:

http://www.twilio.com/blog/2013/07/billing-incident-post-mortem.html

The problem was about a Redis server, since Twilio is using Redis to store the in-flight account balances, in a master-slaves setup, with multiple slaves in different data centers for obvious availability and data safety concerns.

This is a short analysis of the incident, what Twilio can do and what Redis can do to avoid this kind of issues.

read the full post at http://antirez.com/news/60

San Francisco

▼

antirez 4537 days ago.

Yesterday night I returned back home after a short trip in San Francisco. Before memory fades out and while my feelings are crisp enough, I'm writing a short report of the trip. The point of view is that of a south European programmer exposed for a few days to what is probably the most active information technology ecosystem and economy of the world.

Reaching San Francisco
===

If you want to reach San Francisco from Sicily, there are no direct flights helping you. My flight was a Lufthansa flight from Catania to Munich, and finally from Munich to San Francisco. This is a total of 15 hours flight, plus the stop in Munich waiting for the second flight.

read the full post at http://antirez.com/news/59

Exploring synchronous replication in Redis

▼

antirez 4556 days ago.

Redis uses streamed asynchronous replication, that's one of the simplest forms of replication you can imagine: a continuos stream of writes is sent to the slaves, without waiting for the slaves to process the writes in any way before replying to the client.

I always gave that almost for granted, as I always assumed Redis was not a good match for synchronous replication, that has an higher latency. However recently I tried to fix another issue with Redis replication, that is, timeouts are all up to the slave.

read the full post at http://antirez.com/news/58

Availability on planet Terah

▼

antirez 4562 days ago.

Terah is a planet far away, where networks never split. They have a single issue with their computer networks, from time to time, single hosts break in a way or the other. Sometimes is a broken power supply, other times a crashed disk, or a software issue completely blocking the system.

The inhabitants of this strange planet use two database systems. One is imported from planet Earth via the Galactic Exchange Program, and is called EDB. The other is produced by engineers from Terah, and is called TDB. The databases are functionally equivalent, but they have different semantics when a network partition happens. While the database from Earth stops accepting writes as long as it is not connected with the majority of the other database nodes, the database from Terah works as long as the majority of the clients can reach at least a database node (incidentally, the author of this story released a similar software project called Sentinel, but this is just a coincidence).

read the full post at http://antirez.com/news/57

[more]