Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

time to read 2 min | 208 words

Over time, projects get more features. And that is fine, as long as they are orthogonal features. It is when those features overlap that they are really putting the hurt on us.

For example, with the recent Changes API addition to RavenDB, one of the things that was really annoying is that in order to actually implement this feature, I had to implement it for:

  • Embedded
  • Client Server
  • Sharded
  • Silverlight

And that one is easy. We have bugs right now that are happening because people are using two or three bundles at the same time, and each of them works fine, but in conjunction, you get strange results.

What should happen when the Unique Constraints bundle creates an internal document when you have the Versioning bundle enabled? How about when we add replication to the mix?

I am not sure if I have any good ideas about the matter. Most of the things that we do are orthogonal to one another, but when used in combination, they actually have to know about their impact on other things.

My main worry is that as time goes by, we have more & more of those intersections. And that adds to the cost of maintaining and supporting the product.

time to read 3 min | 535 words

Another upcoming feature for the next RavenDB release is full encryption support. We got several requests for this feature from customers that are interested in using RavenDB to store highly critical data. Think financial records, health information, etc.

For those sorts of applications, there are often regulatory concerns about the data at rest. I’ll state upfront that I think that a lot of those regulations make absolutely no sense from a practical standpoint, but…

The end result is that we can never put any plaintext data on disk. That results in some interesting problems for our use case. To start with, it is fairly easy to go about creating encryption for documents. They are independent from one another and are usually read / written in a single chunk. In fact, there have been several articles published on exactly how to do that. The problem is with indexes, which are not read as a whole; in fact, we have to support random reads through them.

In the currently released version, you could encrypt documents using a custom bundle, but you can’t get real encryption across the board. Things like in-flight documents, partial map/reduce data and the indexes themselves will not be encrypted and will be saved in plain text, even with a custom bundle.

In the next version of RavenDB (this feature will be available for the Enterprise version), we have made sure that all of that just works. Everything is encrypted, and there is no plain text data on disk for any reason. RavenDB will transparently encrypt/decrypt the data for you when it is actually sent to disk.

By default, we use AES-128 to encrypt the data (you can change that, if you want, but there is a not insignificant performance hit if you move to AES-256, and AES-128 is just as secure, barring a quantum computer).
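The encryption itself builds on the standard .NET primitives. A minimal sketch of an AES-128 round trip (the data here is made up for illustration, and RavenDB’s actual on-disk format is more involved than this):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

class AesSketch
{
    static void Main()
    {
        using (var aes = Aes.Create())
        {
            aes.KeySize = 128; // the default choice; AES-256 costs more for no practical gain

            byte[] plaintext = Encoding.UTF8.GetBytes("{ \"Name\": \"users/1\" }");

            // Encrypt the document bytes before they hit the disk
            byte[] ciphertext;
            using (var encryptor = aes.CreateEncryptor())
                ciphertext = encryptor.TransformFinalBlock(plaintext, 0, plaintext.Length);

            // Decrypt transparently when reading them back
            byte[] roundTripped;
            using (var decryptor = aes.CreateDecryptor())
                roundTripped = decryptor.TransformFinalBlock(ciphertext, 0, ciphertext.Length);

            Console.WriteLine(Encoding.UTF8.GetString(roundTripped));
        }
    }
}
```

The hard part, as noted below, is not this round trip but managing the key that feeds it.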

The funny part (or not so funny part) is that the actual process of encrypting the data was a relatively straightforward process. We had to spend a lot more time & effort on the actual management aspect of this feature.

For example, encryption requires an encryption key, so how do you manage that?

In RavenDB, we have two types of configurations. Server wide, which is usually located in the App.config file, and database specific, which is located in the system database. For the App.config file, we provide support for encrypting the file using DPAPI, using the standard .NET config file encryption system. For database specific values, we provide our own support for encrypting the values using DPAPI.
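The DPAPI calls are exposed in .NET through the ProtectedData class. A minimal sketch of guarding a key this way (not RavenDB’s actual code, and Windows-only, since DPAPI is an OS facility):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

class DpapiSketch
{
    static void Main()
    {
        byte[] encryptionKey = Encoding.UTF8.GetBytes("my-database-encryption-key");

        // Protect the key so only this machine can unprotect it
        byte[] guarded = ProtectedData.Protect(
            encryptionKey,
            null, // optional entropy
            DataProtectionScope.LocalMachine);

        // 'guarded' is what gets persisted to the config file / system database

        byte[] recovered = ProtectedData.Unprotect(guarded, null, DataProtectionScope.LocalMachine);
        Console.WriteLine(Encoding.UTF8.GetString(recovered));
    }
}
```

The nice property is that the OS holds the master key, so the protected value on disk is useless on any other machine.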

So, the end result is:

  • Your documents and indexes are encrypted when they are on disk using strong encryption.
  • You can use a server wide or database specific key for the encryption (for that matter, you can turn encryption on/off at the database level).
  • Your encryption key is guarded using DPAPI.
  • Obviously, you should back up the encryption key, because we have no way of recovering your data without it. 
  • The data is safely encrypted on disk, and the OS guarantees that no one can access the encryption key.

And, finally: You get to tick off the “no plaintext data at rest” checkbox and move on to do actual feature development. :-)

time to read 6 min | 1106 words

A customer had an interesting problem in the mailing list. He had the following model:

public class Case
{
    public string Id { get; set; }
    public string[] CasesCitedByThisCase { get; set; }
    public string Name { get; set; }
    // other interesting properties
}

And he needed to be able to query and sort by the case’s properties, but also by its importance. A case’s importance is determined by how many other cases cite it.

This seems hard to do, because you cannot access other documents during standard indexing, so you have to use map/reduce to make this work. The problem is that a plain map/reduce index is not enough here; you need two disparate sets of information from the documents.

This is why we have multi map, and we can create the appropriate index like this:

public class Cases_Search : AbstractMultiMapIndexCreationTask<Cases_Search.Result>
{
    public class Result
    {
        public string Id { get; set; }
        public int TimesCited { get; set; }
        public string Name { get; set; }
    }

    public Cases_Search()
    {
        AddMap<Case>(cases =>
                        from c in cases
                        select new
                        {
                            c.Id,
                            c.Name,
                            TimesCited = 0
                        }
            );

        AddMap<Case>(cases =>
                        from c in cases
                        from cited in c.CasesCitedByThisCase
                        select new
                        {
                            Id = cited,
                            Name = (string)null,
                            TimesCited = 1
                        }
            );

        Reduce = results =>
                    from result in results
                    group result by result.Id
                    into g
                    select new
                    {
                        Id = g.Key,
                        Name = g.Select(x => x.Name).FirstOrDefault(x => x != null),
                        TimesCited = g.Sum(x => x.TimesCited)
                    };
    }
}

We are basically doing two passes on the cases, one to get the actual case information, the next to get the information about the cases it cites. We then take all of that information and reduce it together, resulting in the output that we wanted.

The really nice thing about this? This being a RavenDB index, all of the work is done once, and queries on this are going to be really cheap.
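With the index in place, querying by importance is just a regular query. A hypothetical session usage (assuming an IDocumentStore named store, and that TimesCited is indexed as a sortable numeric field):

```csharp
using (var session = store.OpenSession())
{
    // Top ten most cited cases, sorted by the pre-computed citation count
    var importantCases = session.Query<Cases_Search.Result, Cases_Search>()
        .OrderByDescending(x => x.TimesCited)
        .Take(10)
        .ToList();
}
```

Because the reduce already ran at indexing time, this query is a cheap sorted lookup, not an aggregation.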

time to read 2 min | 303 words

Rob Eisenberg has just told me about his latest project, which is really cool, on so many fronts.

The project is RPGWithMe, and before we get any deeper, I just want to share some screen shots with you. You can click on them to see the full size images.

But don’t go away yet, there is more stuff below.

[Screenshots: tabletop1, character sheet, tabletop2]

As I said, there are several reasons that I like this project. To start with, the sheer geeky fun of RPG is not to be underestimated. And this looks beautiful.

I am no longer an RPG addict, but I still got lost in the site for a while, just because it is such a nice experience.

And, finally, it uses RavenDB and RavenHQ! In fact, let me give it to you in Rob’s own words:

This product uses RavenDB through and through. It was a beautiful development experience. I really don't think it would have been possible without it. I'm in production on top of RavenHQ. I've also got a full stage environment running it. Everything is cloud based.

Those are the sort of emails that really make my day! And just before the weekend too, so I can spend some time playing with it and reviewing it. :-)

time to read 2 min | 218 words

The following piece of code is taken from the RavenDB’s RemoteDatabaseChanges class, which implements the client side behavior for the RavenDB Changes API:

[image]

As you can see, we are doing something really strange here, DisposeAsync().

The reason it is there is that we need to send a command to the server to tell it to disconnect the connection from its end. Sure, we can just abort the request from our end (in fact, we are doing that), but only after we have tried to get the server to do this.
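The code itself is only available as a screenshot here, but the shape of it is roughly this (hypothetical names, not the actual RavenDB implementation):

```csharp
public Task DisposeAsync()
{
    // First, politely ask the server to close the connection from its end,
    // so the HTTP exchange completes cleanly.
    return Send("command=disconnect")
        .ContinueWith(task =>
        {
            // Regardless of whether the server obliged, abort our end too.
            connection.Abort();
        });
}
```

The ordering is the whole point: the abort only happens after we have given the server a chance to close things down gracefully.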

Why are we doing this?

The answer is quite simple. We want good Fiddler support.

By sending the disconnect command, we ensure that the connection will be properly closed, and Fiddler can then show the content, like so:

Being able to look at those (even if only after the connection has been closed) is invaluable when doing diagnostics, debugging or just wanting to take a peek.

On the other hand, if we weren’t doing this, we would get 502 or 504 errors from Fiddler, which is annoying and opaque. And that isn’t a good way to create a good feel for people trying out your products.

time to read 2 min | 309 words

Nitpicker corner: If you tell me that HTTP is built on TCP I’ll agree, then point out that this is completely irrelevant to the discussion.

I got asked why RavenDB uses HTTP for transport, instead of TCP. Surely binary TCP would be more efficient to work with, right?

Well, the answer is complex, but it boils down to this:

[image]

Huh? What does Fiddler have to do with RavenDB’s transport mechanism?

Quite a lot, actually. Using HTTP enables us to do a lot of amazing things, but the most important thing it allows us to do?

It is freaking easy to debug and work with.

  • It has awesome tools like Fiddler that are easy to use and understand.
  • We can piggyback on things like IIS for hosting easily.
  • We can test out scaling with ease using off the shelf tools.
  • We can use hackable URLs to demo how things work at the wire level.

In short, HTTP is human readable.
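To make “hackable URLs” concrete: you can fetch a document straight from a browser or any HTTP tool and read the result with your own eyes. A hypothetical exchange against a local server (the exact URL shape here is illustrative):

```
GET http://localhost:8080/docs/users/ayende HTTP/1.1

HTTP/1.1 200 OK
Content-Type: application/json

{ "Name": "Ayende" }
```

Try doing that against a proprietary binary TCP protocol.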

For that matter, I just remapped the Changes API solely for the purpose of making it Fiddler friendly.

Coming back again to building professional software: it means making sure that your clients and your support team can diagnose things easily.

Compare this:

[image]

To this:

[image]

There is an entire world of difference in the quality of diagnostics you can offer between the two.

And that is why RavenDB is using HTTP, because going with TCP would mean writing a lot of our own tools and making things harder all around.
