KV.js (github.com/heyputer)
187 points by ent101 on April 18, 2023 | 66 comments



So, basically it's an in-process, single-threaded KV store, made out of a standard JS Object with a bunch of operations over keys or values, vaguely along the lines of Redis.

These operations make it stand out a bit from just typing `const kV = {}`, especially set operations, and the presence of TTL for keys.

Could be useful. For me though the usefulness of Redis comes mostly from its being a common store across many processes / nodes, so it can serve as a cache, a queue, etc. Maybe it's also useful for a single JS process in a similar capacity.


What I've wanted before is an easy in-memory NodeJS (typescript) cache that multiple processes can share. Like, we deploy on AWS EC2s with multiple vCPUs; I don't want a separate in-memory cache for each of them, but I also don't want the latency of Redis.


You can run Redis on the same machine with a Unix socket to share it between processes without the network latency


Could you use Kubernetes to solve this? Have a single pod running the Redis instance and then multiple running Node.js talking to the Redis instance via something like DAPR (https://dapr.io/)


The irony of having someone comment on the complexity of running bare redis on the same server, only for someone else suggesting a whole Kubernetes setup for essentially the same thing.

If you're already heavily invested in Kubernetes this is certainly a solution, though I feel this adds too much abstraction for the amount of problem it's solving.


If you already have a k8s setup, and the Redis is but a small addition to it, this may make sense.

Using k8s to solve just that would of course be overkill.


Yeah I should be clear, I meant this question purely as a curiosity, not a recommendation.


Wouldn't Kubernetes and/or Dapr just add even more latency before the packets reach Redis?

It seems like the parent comment's author is implying that Redis is already too slow even when running it on bare metal and connecting directly to it.

Adding additional things like service discovery and proxies in the network path will not speed it up. To speed it up you would need to bring Redis closer, so perhaps you meant using k8s tools like "pod topology spread constraints" to do so (juxtaposing with the default k8s behavior that may spread workloads out in an undesirable manner)?

Simply going from "we are not using k8s" to "we are using k8s" won't speed up latency to Redis.


I was under the assumption OP's redis instance was running on a separate machine and they wanted to be able to achieve a setup that "felt like" it was distributed (container isolation, etc) but without the network hops adding in latency.

My question was more around whether you could use K8s to effectively simulate that, but on a single beefy machine to remove the network hops.

It was also a curiosity "is this possible" type of question, not an "I'm saying this is a good idea" type of question.


Yes, but the question is should you. If you already have Kubernetes knowledge and infrastructure, why not? Otherwise, maybe not.


Yeah I should be clear, I meant this question purely as a curiosity, not a recommendation.

I'm a PM, trying to properly understand K8s, definitely not providing recommendations hahaha.


What do you mean by Redis latency? It should be pretty much minimal, I don't know any IPC setups that would be faster


I struggle to see how this is better than just {} or Map, except as a mock for Redis.


This really is just “new Map()” with a little bit of set dressing.


I don't really agree, the logic around expiring values by itself is already a beneficial feature that isn't available out of the box.


The expiring part is very simple. For just get/set/delete and scheduled garbage collection, here’s an implementation that’s almost precisely equivalent to what KV.js offers (just with sane arguments and return types, fewer bugs, and markedly better performance due to not supporting stuff you certainly don’t need and removing unnecessary abstractions):

  const store = new Map();
  const expiryTimes = new Map();

  function get(key) {
      // (This works even for non-expiring keys because undefined is not less than any number.)
      if (expiryTimes.get(key) < Date.now()) {
          store.delete(key);
      }
      return store.get(key);
  }

  function set(key, value, expiry) {
      store.set(key, value);
      if (expiry) {
          expiryTimes.set(key, expiry);
      }
  }

  function del(key) {
      store.delete(key);
      expiryTimes.delete(key);
  }

  function flushExpired() {
      const now = Date.now();
      for (const [key, time] of expiryTimes) {
          if (time < now) {
              del(key);
          }
      }
  }

  // Optional: automatically delete expired entries to free memory, once a second since that’s what KV.js does, but that’s probably waaaaay more often than you need.
  setInterval(flushExpired, 1000);

Working directly with store and expiryTimes for more advanced stuff will genuinely regularly be easier than working with the KV.js abstraction.


This comment is a good example of why "oh this is easy, I can write this myself" should not be the default approach in some cases. There's a severe bug in your set method, expiry times are not cleared when overwriting an expiring value with a value with no expiry. The library linked as the main thread does not have this bug.


True, that bug exists in what I wrote. Insert an `else { expiryTimes.delete(key); }` at the end of set to fix it. However, I note that KV.js has similarly severe bugs (some noted in my other comment in this thread: methods that use expired values, get() mangling falsy values) and its greater complexity makes it more likely to have bugs.

Very often I agree with you about the dangers of writing it yourself. But very often I also sufficiently dislike how libraries are implemented, with unnecessary bloat like excessive flexibility (all the options and such) or explicit runtime type checking rather than just letting things go wonky or blow up if given bad data.

In reality, in a case like this I would actually recommend something other than a generic container. Non-expiring keys? Use variables if the keys are a finite set of known values, or a Map if the keys are in fact arbitrary. Expiring keys? Use something vaguely like this, but probably wrapping Map<K, [number, V]>, made simpler by not supporting non-expiring keys (which mixture was what caused my bug, and which mixture I was never fond of).
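
A minimal sketch of the expiring-only container described above, assuming absolute expiry timestamps in milliseconds. The name `ExpiringCache` and the injectable `now` clock are illustrative assumptions, not anything taken from KV.js.

```javascript
// Hypothetical expiring-only cache: every entry is stored as [expiresAt, value].
class ExpiringCache {
  constructor(now = Date.now) {
    this.now = now;           // injectable clock, handy for tests
    this.entries = new Map(); // Map<key, [expiresAtMs, value]>
  }

  set(key, value, expiresAt) {
    // Overwriting replaces the old expiry too, so no stale expiry can linger.
    this.entries.set(key, [expiresAt, value]);
  }

  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry[0] <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return entry[1];
  }

  delete(key) {
    return this.entries.delete(key);
  }

  // Drop every expired entry; call it on whatever schedule suits you.
  flushExpired() {
    const now = this.now();
    for (const [key, [expiresAt]] of this.entries) {
      if (expiresAt <= now) this.entries.delete(key);
    }
  }
}
```

Because value and expiry live in one tuple, there is no second map to keep in sync, which is the class of bug discussed above.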


Yeah that’s true. But because it’s using the system hashmap it inherits those performance characteristics and maximum size. Map is hard limited to 2^24 = 16,777,216 keys in v8. Map can also have performance spikes that cause event loop delay when you start loading it heavily. If you expect higher load, using two levels of map (eg Map<k1, Map<k2, V>>) can help with both issues.

I went to look at how the library accounted for these issues and it does not. I didn’t look to see if it uses timer coalescing or intervals, I wouldn’t want O(millions) of setTimeout calls on my service either.

I think it’s fine to use in the browser or a hobby server, but due to the limitations of Map I would not use it in my production server.

Redis can do 2^32 = 4,294,967,296 keys (a few more), and you can do the “multi level map” by sharding your key space across multiple Redis or Memcached processes. Redis et al. also have the large advantage of the cache surviving application deploys. Again, not everyone needs this but to me the main benefit of a remote cache is consistently lowering latencies, versus an in-memory cache where your latency will spike after every deploy as the cache refills.
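
One possible shape for the two-level map mentioned above, as a rough sketch. How the key is split into two levels is an assumption here: string keys are hashed into a fixed number of inner maps. The class name `ShardedMap`, the shard count of 64 and the hash function are all illustrative choices.

```javascript
// Hypothetical two-level map: Map<shardIndex, Map<key, value>>.
class ShardedMap {
  constructor(shardCount = 64) {
    this.shardCount = shardCount;
    this.shards = new Map();
  }

  // Simple 32-bit string hash (FNV-1a style); any well-spread hash would do.
  shardIndex(key) {
    let hash = 0x811c9dc5;
    for (let i = 0; i < key.length; i++) {
      hash ^= key.charCodeAt(i);
      hash = Math.imul(hash, 0x01000193);
    }
    return (hash >>> 0) % this.shardCount;
  }

  set(key, value) {
    const index = this.shardIndex(key);
    let shard = this.shards.get(index);
    if (shard === undefined) {
      shard = new Map();
      this.shards.set(index, shard);
    }
    shard.set(key, value);
    return this;
  }

  get(key) {
    const shard = this.shards.get(this.shardIndex(key));
    return shard === undefined ? undefined : shard.get(key);
  }

  delete(key) {
    const shard = this.shards.get(this.shardIndex(key));
    return shard !== undefined && shard.delete(key);
  }

  // Total number of entries across all inner maps.
  get size() {
    let total = 0;
    for (const shard of this.shards.values()) total += shard.size;
    return total;
  }
}
```

Each inner Map holds only a fraction of the keys, so no single Map has to approach the per-Map limit described above.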


Yep. This is a quintessential example of “I’m going to learn by recreating the essence of a complex thing”… taken to the end. And then a logo is slapped on top. I’m all for js libs but this is not performant. Just use redis.


Can you run Redis in browser? You can’t.

Is an in-process, in-memory database faster than making a query over network? Yes it is.



There's no need for an extra layer of abstraction to get an "in memory" database in JavaScript. No need for Babel, webpack, ChatGPT, npm and 10000 libraries for this, really. Let me show you (works even in the browser):

const KV = {};

KV["credentials"] = {access_key: "XXXX", secret: "YYYY"};

console.log(KV["credentials"].secret);


Maybe? Depends on the rest of the software you’d be developing and what features and abstractions you’d need.

The point was that “just use redis” is poor and incorrect advice.


Now show me the rest of the functionality of the OP library like TTL.


Someone already replied to this in another comment.


Run Redis on the same machine. No network latency. You only have a few memcpys and context switches versus an in-process solution, and if those make a difference you shouldn't be using JS anyways


Exactly. Just an extra layer of abstraction, no idea what it tries to solve.


Maybe... the need to have a map with some additional set of accessors and/or utility functions around it ?

If you look at the code; it's literally just a map with some function around it. If you end up in a situation where you feel the need to wrap a map with some function to handle some typical needs (eg: access a random key; manage a ttl/expire for a key; ...) you might as well be using this.

The fact that the name of the functions have been taken from redis means that if/when you move to an external KV store, the transition will be trivial. It also means you don't really need to learn it if you already know redis.

I don't see it becoming the #1 library on npm, with conferences and speakers debating the intricacies of its architecture. But hey, it might solve someone's need.


Disclaimer: I'm not the author, but I have shared personal projects here as well in the past.

HackerNews is the worst place on earth to share a side project, especially in its early days.

You will get all the hate and push back you can get, even if you did it for fun, for learning purposes or because you don't need "the real professional thing" because you're using your library on a small command line application that doesn't need to ship redis with it.

- "It doesn't scale"

- "Just use Redis"

- "It's too simple"

- "I'd rather just use new Map()"

- "It's just copying the basics of a large project and putting a logo on it"

So what? Innovation in the world should stop right now because everything is already invented? Or because you would do it better? Or because this is not web scale enough? or not used in production? Maybe the author doesn't need an external project for their small scale personal project and redis is overkill? Maybe the author is a 15 year old person in a third world country which you're discouraging from getting in the industry? who knows.

IMHO, all of you with these negative opinions towards any not yet popular side project that's shared here would make the world a bit better by either leaving positive comments, constructive criticism or just shutting up.

Thanks for your downvotes.


> "HackerNews is the worst place on earth to share a side project, especially in its early days."

Depends on your expectations.

A lot of online forums are hobbyist cheerleading groups. You get a flood of likes and heart emojis, but when it subsides, you're left with very little that would actually help you improve your craft.

HN is closer to the style of brutally direct criticism you get in art school (I know because I went to one). When professionals are tearing your work apart, it definitely doesn't feel good. But artists know they need to train themselves to receive that kind of feedback and process it beyond the initial rush of defensive emotions. The other people aren't necessarily right in what they say, but they decided to tell you something you didn't want to hear even though it would have been more comfortable for everyone to stay quiet. (And it's always possible that some negative feedback is simply active meanness — but that's also a useful lesson because every profession has that kind of people. You need exposure to different feedback styles to filter out what can help you grow.)


Hacker News is a startup community. I'm making a startup now, and there are a whole lot of times where you get harsh criticism, dumb criticism, criticism that shows they plainly didn't read/understand what you said. However, because of the way these things work, you just gotta eat gravel, smile, and figure out how to present it differently to try and reduce the confusion next time. Because it's no use trying to tell the consumer that they're wrong.

I normally don't go for this "tough love" thing in everyday life at all, but in a context like this it does make sense at least.


> You will get all the hate and push back you can get, even if you did it for fun, for learning purposes

If you post code that you did for fun or learning purposes, and you didn't write an article about what you learned either, I don't see what's wrong about getting pushback. It doesn't make for a very good post.

> or because you don't need "the real professional thing" because you're using your library on a small command line application that doesn't need to ship redis with it.

Being standalone doesn't get downvotes.


It’s not HN specifically. It’s the world of JS and communities with voting mechanisms. For example I deleted my Reddit account years ago.

JS has a serious cry baby problem. If code is written without maximum consideration for easy, whatever that means, people will cry. These are large tears like an overflowing river from people who supposedly are adults wanting to be taken seriously.


Yep I just recently launched a SaaS and we got over $900k in revenue in our first 3 weeks (affiliate marketing). I'm not going to post about it because I would just get shit on for something. Welcome to the internet :(


I agree, but as creators this harsh feedback can become diamonds if polished through.

Some can be ignored sure, but I would try to read and value each of them.

I'm scared to do any Show HN because of this too, but oh well it forces you to make better shit before putting it out there.

Look, I'm gonna share something I did in 2 hours yesterday now, watch people destroy me in the comments. (Twitter Zero)


It would be better to use camelCase to be compatible with node-redis, and then it could be a drop-in replacement in the future.


That’s a great idea! Thank you :)


SharedArrayBuffer is back right? It works across different worker instances now right? It'd be fun to see a cross-thread kv library emerge.


I don't see this mentioned, but is KV.js used somewhere in production? Is there a recommended way to use it?

For example in a separate worker, in a child process or in the same process as the node.js app, or embedded in a domain-specific server. Side note, is lua support in consideration? Some redis workloads rely heavily on it.

Looks cool and exhaustive nevertheless, the author did a great job!


Redis only uses Lua because Redis is written in C, and Lua is dynamic but has a good JIT compiler.

For software written in JS, it would be natural to use another dynamic language with a good JIT compiler, namely JS again.

I could see how introducing Lua might make sense if KV.js decided to be a faithful reimplementation of Redis's interface and (parts of) semantics, only in JS. Then dropping it in as a compatible replacement might be easier if Lua were supported, like in the original Redis.


The Lua embedded in Redis doesn't have JIT. It's just the vanilla PUC-Rio Lua 5.1. Not LuaJIT


Oh! Thanks. With that, a JS-based version of KV store may have faster scripts than Redis, if written carefully enough to avoid massive GC.


"I don't see this mentioned, but is KV.js used somewhere in production? "

Hey, author here. It's used in puter.com, my startup. Still in alpha though.

"For example in a separate worker, in a child process or in the same process as the node.js app," Right now I'd say in the same process as the node.js app. Open to suggestions.

"Side note, is lua support in consideration?" Yes.

"the author did a great job!"

Thank you! Please do let me know how I could make it better :)


> "It's used in puter.com, my startup"

Great domain, congratulations :)


In the past I needed something like redis, but in-process and “programmable from the outside”.

I used Go libraries to do it then, but as I prefer JS/TS I’ll surely look at KV.js next time the need arises.

Were you driven by any specific need/requirement while building KV.js?


Here is somewhat of a technical review. I have attempted to be objective, and also to provide some useful insights in the middle of it, as well as pointing out a few bugs here and there.

Not to beat about the bush: I would strongly recommend against using (or designing) something like this unless you’re actively aiming for usage compatibility with Redis (I don’t know how actually compatible it might be; I don’t use Redis; as for whether aiming for that is worthwhile, my opinion is generally not). If you aren’t explicitly aiming for that, the code is significant bloat around very simple patterns and you would do much better to use those underlying patterns.

The original source code is 100KB, and 3023 lines long (including 1021 comment and 346 blank lines). Minified, it’s 24KB, and gzipped 5.3KB. Except it also depends on glob, so add on another 75KB minified/23KB gzipped for it and its transitive dependencies, for this functionality that you probably don’t use but which probably won’t be eliminated from your build.

The typical user of something like this should really just use a Map directly, with maybe a second Map and auxiliary function for expiring keys, if they need that. Much less code, much faster, much simpler. If you need any of the more exotic functionality, code it yourself, it’s simple enough; then you can also choose more meaningful names, too.

> fast

Well, I took a look at the first method after the constructor, set, and found stuff like this:

  for (const opt of options) {
      switch (opt.toUpperCase()) {
          case 'NX':
              nx = true;
              break;
          …
          case 'EX':
              ex = Number.parseInt(options[options.indexOf(opt) + 1], 10);

Ugh. Gratuitous and unnecessary case-insensitivity, use of an array instead of an object, unwise indexOf because you used a values iterator rather than retaining the index… I suppose quite a bit of this is bound up in a probably-misguided (in my opinion) mirroring of Redis’s API. But in general, this direction of design is bad for JavaScript.
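
To make the indexOf point concrete, here is a small sketch of the same kind of parsing done with a retained index. The function name `parseSetOptions`, the returned object shape and the strict upper-case matching are assumptions for illustration; this is not KV.js code.

```javascript
// Hypothetical parser for Redis-style SET options given as a flat array,
// e.g. ['NX', 'EX', '10']. Walking by index avoids the indexOf lookup.
function parseSetOptions(options) {
  const parsed = { nx: false, xx: false, ex: undefined };
  for (let i = 0; i < options.length; i++) {
    switch (options[i]) {
      case 'NX':
        parsed.nx = true;
        break;
      case 'XX':
        parsed.xx = true;
        break;
      case 'EX':
        // The value is the next element; consume it so it is not re-read as an option.
        parsed.ex = Number.parseInt(options[++i], 10);
        break;
      default:
        throw new TypeError(`Unknown option: ${options[i]}`);
    }
  }
  return parsed;
}
```

An API that accepted an object such as `{ nx: true, ex: 10 }` in the first place would skip the parsing step entirely.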

But to go deeper:

The thing to realise about this set method is that despite being 83 lines long, the most common case of passing no options makes it precisely equivalent to `this.store.set(key, value);`—just slower and heavier.

As time goes by I become more and more strongly opposed to options objects because they harm performance, code size and complexity, unless mitigated at build time by partial evaluation, which is not something done in JavaScript-land (Facebook tried it a few years ago with Prepack, but quickly gave up, and I’m not sure quite why).

Many of these options are either unwise variation (EX/PX/EXAT/PXAT) or to make it do materially different things (GET—I note it makes a lie of the return type declared in the JSDoc comment); my typical recommendation would be splitting things out into different methods, because tooling can cut away unused methods (… though it mostly won’t actually do so—unused functions, sure, but tooling is more leery about culling unused methods), and it helps the code to run fast. I won’t give examples because the method is messy enough in its diversity that it’d be painful.

But even retaining the current behaviour, there is one technique that I would expect to significantly improve throughput: add a little more code, a fast path for the no-options case.

  set(key, value, options) {
      if (!options) {
          // Fast path.
          this.store.set(key, value);
      } else {
          // Slow path.
          … the 81 lines of current method body …
      }
  }

—⁂—

get’s few instructions show just how simplistic and imitable the expiry feature is:

  get(key) {
      const isExpired = this._checkAndRemoveExpiredKey(key);
      if (isExpired) {
          return null;
      }
      return this.store.get(key) || null;
  }

Pretty sure that return statement is buggy, `|| null` will mangle falsy values (e.g. false, 0, "") to null.
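
A quick illustration of the difference, using a plain Map rather than KV.js itself. The helper names `getWithOr` and `getWithNullish` are made up for this sketch; the second shows one possible fix using the nullish coalescing operator.

```javascript
const store = new Map([['count', 0], ['flag', false], ['name', '']]);

// Mirrors the quoted return statement: every falsy value collapses to null.
function getWithOr(key) {
  return store.get(key) || null;
}

// Only a missing key (undefined) or a stored null becomes null.
function getWithNullish(key) {
  return store.get(key) ?? null;
}
```

With `??`, stored values such as 0, false and the empty string come back unchanged, while a missing key still yields null.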

—⁂—

Skimming through much of the rest, there are quite a few weird and/or inefficient and/or wrong things. Examples: incrby() uses expired values (looks like that might happen fairly commonly); incrby() happily mangles types; decr() has a superfluous try-catch-rethrow; decrby(key, decrement) isn’t just incrby(key, -decrement); keys() uses a .entries() iterator for no reason; also produces an array rather than an iterator (mildly more subjective); mset()… eww array rather than object; various list methods: wow these are verbose and often very bad fit for JavaScript, but I’ll limit myself to just remarking that in `list === undefined || !Array.isArray(list)` the `list === undefined ||` is superfluous; zdiffstore: clearly not tested, this.ZDIFF is undefined; blocking list methods like brpop(): um, you know JavaScript is single-threaded, right? You’re busy-waiting until the timeout for something that cannot possibly change. (And no, there’s no SharedArrayBuffer or such here which could be multi-threaded, because it checks Array.isArray.)

—⁂—

I think a lot of this would be better as free-standing functions that operate on a simple Map. It seems to me that expiry is really the only reason you can’t obviously and completely do this. For that reason, the direction I’d suggest would instead be making a new class that exposes the Map interface with only the minimum changes required to support expiry (only working with timestamps in milliseconds, no relative values or seconds, both of which are just begging for trouble), and then have all the rest of the more exotic functionality sit on top of that, perhaps then just as standalone functions.
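
As a rough sketch of what free-standing functions over a plain Map could look like (expiry left out entirely). The names `incrBy`, `setAdd` and `randomKey` are hypothetical and only loosely modelled on the Redis commands; they are not the KV.js API.

```javascript
// Hypothetical helpers that take a plain Map instead of living on a wrapper class.

// Add `amount` to a numeric value, treating a missing key as 0.
function incrBy(map, key, amount = 1) {
  const next = (map.get(key) ?? 0) + amount;
  map.set(key, next);
  return next;
}

// Add members to the Set stored at `key`, creating it on first use.
// Returns how many members were newly added.
function setAdd(map, key, ...members) {
  let set = map.get(key);
  if (set === undefined) {
    set = new Set();
    map.set(key, set);
  }
  const before = set.size;
  for (const member of members) set.add(member);
  return set.size - before;
}

// Pick a uniformly random key, or undefined for an empty map.
function randomKey(map, random = Math.random) {
  if (map.size === 0) return undefined;
  let skip = Math.floor(random() * map.size);
  for (const key of map.keys()) {
    if (skip-- === 0) return key;
  }
  return undefined;
}
```

Since these are ordinary functions, a bundler can drop the ones a project never imports, and decrementing is just `incrBy(map, key, -n)`.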


Amendment due to https://news.ycombinator.com/item?id=35619757: the fast path for set would need to also include `this.expireTimes.delete(key);`.
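
A self-contained sketch of the amended fast path. The field names `store` and `expireTimes` follow the snippets quoted in this thread; the class name `TinyStore` and the single `expiresAt` argument standing in for the options are simplifying assumptions, not the real KV.js signature.

```javascript
// Hypothetical minimal store illustrating the amended fast path.
class TinyStore {
  constructor() {
    this.store = new Map();
    this.expireTimes = new Map(); // key -> absolute expiry in ms
  }

  set(key, value, expiresAt) {
    this.store.set(key, value);
    if (expiresAt === undefined) {
      // Fast path: no options, but an older expiry for this key must not survive.
      this.expireTimes.delete(key);
    } else {
      this.expireTimes.set(key, expiresAt);
    }
  }

  get(key, now = Date.now()) {
    const expiresAt = this.expireTimes.get(key);
    if (expiresAt !== undefined && expiresAt <= now) {
      this.store.delete(key);
      this.expireTimes.delete(key);
    }
    return this.store.get(key);
  }
}
```

Without the delete in the fast path, a value written with no expiry would silently inherit the expiry of whatever it overwrote.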


Not the author here, but thank you for taking the time to do this review.


You’re welcome. It’s a trap I often fall into. I invariably end up spending longer on it than I intended to!


tl;dr ?


The expire() function should be made to work without a setTimeout. This makes it possible for the kvjs() instance to be garbage collected automatically when there is no more reference to it.

I recommend using window.requestIdleCallback() directly after each read/write and then trying to clean up old values, otherwise leave them there. That will also ensure that the cleanup never happens while the event loop is in a hot path.
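
A sketch of that idea, under some assumptions: `store` and `expiryTimes` are plain Maps, `sweepExpired` and `scheduleSweep` are made-up names, and where `requestIdleCallback` is unavailable (Node, for instance) the code falls back to a zero-delay `setTimeout`, which is a choice made here rather than part of the suggestion.

```javascript
const store = new Map();
const expiryTimes = new Map(); // key -> absolute expiry in ms

// Remove expired keys while the idle deadline still reports spare time.
// `deadline` is an IdleDeadline-like object exposing timeRemaining().
function sweepExpired(deadline, now = Date.now()) {
  for (const [key, time] of expiryTimes) {
    if (deadline.timeRemaining() <= 0) break;
    if (time < now) {
      store.delete(key);
      expiryTimes.delete(key);
    }
  }
}

let sweepScheduled = false;

// Call after each read/write. Nothing stays scheduled while the store is untouched,
// so there is no recurring timer holding on to it.
function scheduleSweep() {
  if (sweepScheduled) return;
  sweepScheduled = true;
  // Assumption: fall back to a zero-delay timeout where requestIdleCallback is missing.
  const schedule = typeof requestIdleCallback === 'function'
    ? requestIdleCallback
    : (callback) => setTimeout(() => callback({ timeRemaining: () => 1 }), 0);
  schedule((deadline) => {
    sweepScheduled = false;
    sweepExpired(deadline);
  });
}
```

A sweep that runs out of idle time simply stops; the remaining expired keys are picked up by the sweep scheduled after the next read or write.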


Does it work with ESM ? Like:

> import { kvjs } from '@heyputer/kv.js'

If it does you could add an example in the readme to make that clear. I try to avoid commonjs-only modules nowadays.

Edit: Also, is it really a single source file, or is it compiled? It'd be hard to code review this 2.6k-line file.


I’ll admit I only skimmed it for a few minutes but I found the code really nice to read through! I highly doubt it is compiled output. When the entire code base essentially just consists of a single data structure I think that a single file, even if it is long, is perfectly appropriate. What I’d suggest for reviewing it is to code fold all the methods in your editor and then expand each one as you get to it. Props to the author for such an elegant implementation.


Isn't one of the merits of ESM that it can import either? I believe this is supposed to work:

import theDefaultExport from '@heyputer/kv.js';


Browsers can not import CommonJS


This looks super cool! I have a use case where I store some stateful stuff in Redis that a worker constantly interacts w/ and updates. Redis is great, but the project is a very side thing and would love not to have to pay for bandwidth / just embed it inside of it. Going to take a look at this and see if I can make it work! Didn't see any serialization formats, so might just do that myself. Ty for building this!


I've been looking at the source code, and I can't understand why they are converting Number to String before storing it in the map.

What's the reason for it? Is it something specific to JavaScript?


No tests?


Subscriptions would be cool, allowing you to leverage this library for state management.

Add an async storage interface so you can persist and you’ll have an awesome little database.


Add decent persistence. If you can just run express+KV to get a webserver with a database on any platform then this would be cool.


I like this style... one big file with everything in there, and finding things is just a Ctrl+F away.


You could just use a good IDE or editor instead which indexes the code for you...


No performance metrics?

I've done something similar in the past but find JS just too slow


Window drag and drop doesn’t work on touch devices for puter.com


> KV.JS is a fast…

What makes it “fast”? Is there an explanation or comparison?



